[ECS] [request]: Support relative container CPU share within the task hard limit #1862

@Nevon

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request
According to the ECS documentation, the cpu value specified on a container within a task acts as a CPU access priority within that task: under contention, the available CPU time is allocated in proportion to each container's cpu value. This is implemented using cpu.shares in each container's cgroup.

However, in practice this does not behave the way you would expect from the documentation.

In my experiment, I have a task definition with 1 vCPU at the task level and two containers: A with a CPU value of 768 and B with a CPU value of 256. Both containers run a command that tries to use as much CPU as possible, to simulate contention. I deploy this task on a multi-core host and observe how much CPU time each container in the task uses. I expect container A to use the equivalent of approximately 75% of a core and B approximately 25%.
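To make the expectation concrete, here is a minimal sketch of the proportional split described above. The function name `expected_split` and the container names are just for illustration; the arithmetic is simply each container's cpu value divided by the sum, scaled by the task-level limit.

```python
def expected_split(cpu_values, task_vcpu):
    """Split a task-level CPU limit in proportion to each container's cpu value."""
    total = sum(cpu_values.values())
    return {name: task_vcpu * value / total for name, value in cpu_values.items()}


# Values from the experiment: 1 vCPU on the task, 768 and 256 on the containers.
containers = {"A": 768, "B": 256}
split = expected_split(containers, task_vcpu=1.0)

for name, cores in split.items():
    print(f"{name}: {cores:.2f} cores")
```

With the values from the experiment this gives 0.75 of a core for A and 0.25 for B, which is the split I expected to see under contention.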

According to docker stats, however, each container uses approximately the same amount of CPU time: about 50% of a core each. The combined usage of the containers does obey the hard limit on the task, but the container CPU share does not appear to have any effect on prioritization unless there is CPU contention on the host itself or the task CPU limit is equivalent to the entire host CPU.
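For anyone who wants to repeat the measurement, here is a rough sketch of how the per-container split can be read from docker stats. It assumes the docker CLI is available on the container instance and uses its `--no-stream` and `--format` options with the `.Name` and `.CPUPerc` placeholders. The helper names and the container names in the sample are hypothetical, and the sample string is illustrative rather than a captured measurement.

```python
import subprocess


def parse_cpu_percent(output):
    """Parse lines of '<name>\\t<cpu>%' into {name: cpu percent as float}."""
    usage = {}
    for line in output.strip().splitlines():
        name, cpu = line.split("\t")
        usage[name] = float(cpu.rstrip("%"))
    return usage


def relative_shares(usage):
    """Convert absolute CPU percentages into each container's fraction of the total."""
    total = sum(usage.values())
    if total == 0:
        return {name: 0.0 for name in usage}
    return {name: pct / total for name, pct in usage.items()}


def sample_cpu_percent():
    """Take a single docker stats sample for all running containers."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.Name}}\t{{.CPUPerc}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_cpu_percent(result.stdout)


# Illustrative input only, shaped like the 50/50 result described above.
sample = "container-a\t50.00%\ncontainer-b\t50.00%\n"
print(relative_shares(parse_cpu_percent(sample)))
```

On a real host, `relative_shares(sample_cpu_percent())` would give the observed split to compare against the expected 75/25. Note that it includes every running container on the host, so the result would need to be filtered down to the containers of the task in question.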

What I would like is for the CPU value on the container to behave the way the documentation describes, where it governs the allocation of CPU time in case of CPU contention. Today it does not appear to be possible to split the task CPU time in any other way than evenly across the containers in the task.

Which service(s) is this request for?
ECS (EC2)

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
I have tasks with multiple CPU-hungry containers, and I want to ensure that none of them can use an outsized amount of CPU time at the expense of the other containers. Today, even a less important auxiliary container can take CPU time away from the more important containers in the task.

Are you currently working around this issue?
It doesn't seem possible to work around this issue directly from ECS. The problem is that the CPU share takes all the cores on the host into account, even though the task itself has a hard limit. If you pin the child cgroups to a specific CPU core using cpuset.cpus, the CPU usage is indeed proportional to the value of cpu.shares. However, this cannot be done automatically since the cgroups are managed by ECS, and pinning the cgroups to specific cores obviously has issues of its own.
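The difference between the two situations can be sketched with a toy model. This is not how the kernel scheduler is implemented. It only encodes the behaviour observed above, under the assumptions that every container is single-threaded and always busy, and that the host has at least as many idle cores as the task has containers. The function names are hypothetical.

```python
def split_by_shares(shares, capacity):
    """Divide a CPU capacity (in cores) in proportion to cpu.shares."""
    total = sum(shares.values())
    return {name: capacity * s / total for name, s in shares.items()}


def model_usage(shares, task_quota_cores, pinned_to_one_core):
    """Estimate per-container CPU use (in cores) for always-busy containers."""
    if pinned_to_one_core:
        # All containers compete for the same core, so cpu.shares
        # decides how that core is divided, capped by the task limit.
        capacity = min(1.0, task_quota_cores)
        return split_by_shares(shares, capacity)
    # Unpinned on a host with spare cores: each container runs on its own
    # core, so there is no per-core contention for cpu.shares to arbitrate.
    # All containers consume the task limit at the same rate until the
    # task as a whole is throttled.
    per_container = min(1.0, task_quota_cores / len(shares))
    return {name: per_container for name in shares}


shares = {"A": 768, "B": 256}
print("unpinned:", model_usage(shares, 1.0, pinned_to_one_core=False))
print("pinned:  ", model_usage(shares, 1.0, pinned_to_one_core=True))
```

In this model the unpinned case gives 0.5 of a core to each container regardless of the share values, while the pinned case gives 0.75 and 0.25, matching what I saw in both experiments.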

Additional context
I have support case 10915082771 open regarding this issue, which has specific details around the experimentation conducted both by myself and the support agent.

Labels: Proposed, Community submitted issue
