Description
Describe the bug
The OOM metric pod_container_status_terminated_reason_oom_killed from Container Insights enhanced observability can't be found in the CloudWatch console, even when there is a confirmed OOMKilled error for pods in our EKS cluster. It is on the list of observable metrics found here.
Steps to reproduce
- Use the default EKS add-on for CloudWatch observability with no changes to the config - version number below.
- Simulate a pod OOM error.
- Observe that the metric pod_container_status_terminated_reason_oom_killed cannot be found in CloudWatch Container Insights.
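For step 2, a minimal way to force an OOMKilled status is a pod whose workload allocates more memory than its limit. The manifest below is an illustrative sketch, not the exact workload we used; the image and resource values follow the common `polinux/stress` pattern:

```yaml
# Hypothetical pod that allocates more memory than its limit allows,
# so the kernel OOM-kills the container and the pod reports OOMKilled.
apiVersion: v1
kind: Pod
metadata:
  name: oom-test
spec:
  restartPolicy: Never
  containers:
    - name: stress
      image: polinux/stress
      command: ["stress"]
      # Try to allocate ~150Mi while the limit is 100Mi.
      args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "0"]
      resources:
        requests:
          memory: "50Mi"
        limits:
          memory: "100Mi"
```

After `kubectl apply`, `kubectl get pod oom-test` should show the pod terminating with an OOMKilled reason within a few seconds.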
What did you expect to see?
pod_container_status_terminated_reason_oom_killed can be viewed and graphed in the CloudWatch Container Insights console, so that pods with OOMKilled errors can be monitored.
What did you see instead?
The metric can't be found even when an OOMKilled status is occurring for pods.
What version did you use?
- AmazonCloudWatchAgent CWAgent/1.300054.0b1074 (go1.23.7; linux; amd64)
- EKS v1.30
- EKS CloudWatch add-on version: v3.6.0-eksbuild.2
What config did you use?
config.txt
Environment
AL2
Additional context
In the CloudWatch agent logs, there is the line:
manager.go:306] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
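That log line comes from the cAdvisor code embedded in the agent, which reads kernel OOM events from /dev/kmsg; when the device is not visible inside the agent container, OOM event detection is disabled, which would explain the missing metric. One possible workaround (an assumption on my part, not verified against this add-on version, and the container/volume names below are illustrative) is to mount the host's /dev/kmsg into the agent DaemonSet:

```yaml
# Illustrative sketch only: expose the host's /dev/kmsg to the agent
# container so cAdvisor can watch kernel OOM events. The container name
# and DaemonSet layout here are assumptions, not taken from the add-on.
spec:
  template:
    spec:
      containers:
        - name: cloudwatch-agent
          volumeMounts:
            - name: kmsg
              mountPath: /dev/kmsg
              readOnly: true
      volumes:
        - name: kmsg
          hostPath:
            path: /dev/kmsg
            type: CharDevice
```

If the add-on manages the DaemonSet spec itself, this patch may need to go through the add-on's advanced configuration rather than a direct edit.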