Unlike Kubernetes, ECS only allows you to apply a CPU quota at the task (pod) level. Containers in the task are always unbounded.
For example, when cpu: 1024 (1 vCPU) is set in the task definition, the task's cgroup gets the expected quota:
cat /sys/fs/cgroup/cpu,cpuacct/ecs/1576650513ed4c5d9328a6d67a8a741b/cpu.cfs_quota_us
100000
But providing cpu: 1024 to a container inside the same task doesn't have the same effect:
cat /sys/fs/cgroup/cpu,cpuacct/ecs/1576650513ed4c5d9328a6d67a8a741b/2030454c61d157d4c38f0606fe99667bca8961ce4e0019d3667f67e625f40c12/cpu.cfs_quota_us
-1
(The container's cpu value is only used for placement and CPU shares; it doesn't set a CFS quota on the container's cgroup. See aws/containers-roadmap#1862.)
If the container is using automaxprocs, it only sees the container-level quota of -1 and defaults to using all of runtime.NumCPU, even though the task's cgroup clamps it to 1 vCPU.
(I'm using cgroups v1 as the example here, but the same is true with v2 if you happen to be using an AL2023 AMI.)
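To make the failure mode concrete, here is a rough sketch of the quota-to-GOMAXPROCS arithmetic. It is not automaxprocs' actual code: procsFromQuota is a hypothetical helper, and the 100000 µs period is an assumption (the usual CFS default; the period isn't shown in the output above).

```go
package main

import (
	"fmt"
	"runtime"
)

// procsFromQuota is a hypothetical helper (not part of automaxprocs).
// It converts a CFS quota/period pair into a GOMAXPROCS candidate.
// ok is false when no quota is set (cgroups v1 reports -1 in that case).
func procsFromQuota(quotaUs, periodUs int64) (procs int, ok bool) {
	if quotaUs <= 0 || periodUs <= 0 {
		return 0, false
	}
	n := int(quotaUs / periodUs) // integer division rounds down
	if n < 1 {
		n = 1 // never go below one P
	}
	return n, true
}

func main() {
	const periodUs = 100000 // assumed default CFS period

	// Task-level cgroup from the example: quota of 100000.
	fmt.Println(procsFromQuota(100000, periodUs)) // 1 true

	// Container-level cgroup from the example: quota of -1.
	if _, ok := procsFromQuota(-1, periodUs); !ok {
		// No quota visible, so the Go default stays in place.
		fmt.Println("no quota, falling back to NumCPU:", runtime.NumCPU())
	}
}
```

With the container's -1 there is nothing to divide, so the process keeps the host's CPU count even though the task cgroup will throttle it to 1 vCPU.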
It seems like the library could walk up the cgroup hierarchy to find a quota set on a parent, but this is suboptimal if the task has more than one container: the task quota is shared, so every container would size GOMAXPROCS to the whole task's quota.
I'm mostly writing this down to help anyone else avoid this rabbit hole.