Add exponential histograms to histograms integration test #558

dricross · 2025-07-17T17:21:50Z

Description of the issue

We are adding support for pushing exponential histograms with CloudWatch (PMD) as a destination. This PR adds integration tests for this functionality.

Note

See companion PR for new agent functionality: aws/amazon-cloudwatch-agent#1677

Description of changes

Update existing integration tests to support exponential histograms
Refactor the metric fetcher to support percentile metrics
Update the test suite framework to output the reason for each test failure
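
For context on the fetcher change: the CloudWatch GetMetricStatistics API takes the five standard statistics in its Statistics parameter and percentiles such as p90 in a separate ExtendedStatistics parameter. The sketch below shows one way a fetcher could route a requested statistic to the right parameter. The helper name splitStats and the validation pattern are assumptions for illustration, not the code in this PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// percentilePattern matches percentile statistics such as "p90" or "p99.9".
// The exact pattern is an assumption made for this sketch.
var percentilePattern = regexp.MustCompile(`^p\d{1,3}(\.\d+)?$`)

// standardStats lists the five statistics accepted by the Statistics parameter.
var standardStats = map[string]bool{
	"SampleCount": true,
	"Average":     true,
	"Sum":         true,
	"Minimum":     true,
	"Maximum":     true,
}

// splitStats is a hypothetical helper that separates standard statistics from
// percentile (extended) statistics and rejects anything it does not recognize.
func splitStats(stats []string) (standard, extended []string, err error) {
	for _, s := range stats {
		switch {
		case standardStats[s]:
			standard = append(standard, s)
		case percentilePattern.MatchString(s):
			extended = append(extended, s)
		default:
			return nil, nil, fmt.Errorf("unsupported statistic %q", s)
		}
	}
	return standard, extended, nil
}

func main() {
	standard, extended, err := splitStats([]string{"Average", "p90", "Sum"})
	fmt.Println(standard, extended, err)
}
```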

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

Ran the integration tests locally with the updated agent; all histogram tests pass.

Integration test run: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/16371811763
Histogram test: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/16371811763/job/46261959442

Starting new run here after fixing merge conflict: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/17081105626

Example of test failures w/o reasons (current behavior):

2025/07/17 12:31:33 >>>>>>>>>>>>>><<<<<<<<<<<<<<
2025/07/17 12:31:33 >>>>>>>>>>>>>>Failed<<<<<<<<<<<<<<
2025/07/17 12:31:33 ==============otlp_histograms==============
2025/07/17 12:31:33 ==============Failed==============
my.delta.histogram/Minimum                        Successful
my.delta.histogram/Maximum                        Successful
my.delta.histogram/Sum                            Failed
my.delta.histogram/Average                        Successful
my.delta.histogram/SampleCount                    Failed
my.cumulative.histogram/Minimum                   Successful
my.cumulative.histogram/Maximum                   Successful
my.cumulative.histogram/Sum                       Failed
my.cumulative.histogram/Average                   Successful
my.cumulative.histogram/SampleCount               Failed
my.delta.exponential.histogram/Minimum            Successful
my.delta.exponential.histogram/Maximum            Successful
my.delta.exponential.histogram/Sum                Failed
my.delta.exponential.histogram/Average            Successful
my.delta.exponential.histogram/SampleCount        Failed
my.cumulative.exponential.histogram/Minimum       Successful
my.cumulative.exponential.histogram/Maximum       Successful
my.cumulative.exponential.histogram/Sum           Failed
my.cumulative.exponential.histogram/Average       Successful
my.cumulative.exponential.histogram/SampleCount   Failed
2025/07/17 12:31:33 ==============================
2025/07/17 12:31:33 >>>>>>>>>>>>>>><<<<<<<<<<<<<<<

Example of test failure w/ reasons:

2025/07/17 12:31:33 >>>>>>>>>>>>>><<<<<<<<<<<<<<
2025/07/17 12:31:33 >>>>>>>>>>>>>>Failed<<<<<<<<<<<<<<
2025/07/17 12:31:33 ==============otlp_histograms==============
2025/07/17 12:31:33 ==============Failed==============
my.delta.histogram/Minimum                        Successful   <nil>
my.delta.histogram/Maximum                        Successful   <nil>
my.delta.histogram/Sum                            Failed       The average value 72.000000 for metric my.delta.histogram are not within bound [20.400000, 27.600000]
my.delta.histogram/Average                        Successful   <nil>
my.delta.histogram/SampleCount                    Failed       The average value 36.000000 for metric my.delta.histogram are not within bound [10.200000, 13.800000]
my.cumulative.histogram/Minimum                   Successful   <nil>
my.cumulative.histogram/Maximum                   Successful   <nil>
my.cumulative.histogram/Sum                       Failed       The average value 4428.000000 for metric my.cumulative.histogram are not within bound [20.400000, 27.600000]
my.cumulative.histogram/Average                   Successful   <nil>
my.cumulative.histogram/SampleCount               Failed       The average value 2214.000000 for metric my.cumulative.histogram are not within bound [10.200000, 13.800000]
my.delta.exponential.histogram/Minimum            Successful   <nil>
my.delta.exponential.histogram/Maximum            Successful   <nil>
my.delta.exponential.histogram/Sum                Failed       The average value 180.000000 for metric my.delta.exponential.histogram are not within bound [51.000000, 69.000000]
my.delta.exponential.histogram/Average            Successful   <nil>
my.delta.exponential.histogram/SampleCount        Failed       The average value 54.000000 for metric my.delta.exponential.histogram are not within bound [15.300000, 20.700000]
my.cumulative.exponential.histogram/Minimum       Successful   <nil>
my.cumulative.exponential.histogram/Maximum       Successful   <nil>
my.cumulative.exponential.histogram/Sum           Failed       The average value 2052.000000 for metric my.cumulative.exponential.histogram are not within bound [10.200000, 13.800000]
my.cumulative.exponential.histogram/Average       Successful   <nil>
my.cumulative.exponential.histogram/SampleCount   Failed       The average value 2052.000000 for metric my.cumulative.exponential.histogram are not within bound [10.200000, 13.800000]
2025/07/17 12:31:33 ==============================
2025/07/17 12:31:33 >>>>>>>>>>>>>>><<<<<<<<<<<<<<<
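
The bounds in the log above are consistent with a tolerance of ±15% around the expected value (for example, 24 gives [20.4, 27.6] and 60 gives [51, 69]). That tolerance is inferred from the log, not taken from the framework code. A minimal sketch of a check that returns a reason instead of a bare pass/fail could look like this; the names testResult, checkWithinBound, and evaluate are hypothetical.

```go
package main

import (
	"fmt"
)

// testResult pairs a status with the reason for a failure.
// Reason stays nil on success, which prints as <nil> with %v.
type testResult struct {
	Name   string
	Status string
	Reason error
}

// checkWithinBound returns nil when actual lies within expected ± tolerance,
// where tolerance is a fraction (0.15 means 15%). It assumes expected >= 0.
func checkWithinBound(metric string, actual, expected, tolerance float64) error {
	lower := expected * (1 - tolerance)
	upper := expected * (1 + tolerance)
	if actual < lower || actual > upper {
		return fmt.Errorf("value %f for metric %s is not within bound [%f, %f]",
			actual, metric, lower, upper)
	}
	return nil
}

// evaluate runs the bound check with the assumed 15% tolerance and keeps the reason.
func evaluate(name, metric string, actual, expected float64) testResult {
	if err := checkWithinBound(metric, actual, expected, 0.15); err != nil {
		return testResult{Name: name, Status: "Failed", Reason: err}
	}
	return testResult{Name: name, Status: "Successful"}
}

func main() {
	results := []testResult{
		evaluate("my.delta.histogram/Sum", "my.delta.histogram", 72, 24),
		evaluate("my.delta.histogram/SampleCount", "my.delta.histogram", 12, 12),
	}
	for _, r := range results {
		// Name, status, and reason are printed as three aligned columns.
		fmt.Printf("%-50s%-13s%v\n", r.Name, r.Status, r.Reason)
	}
}
```

Carrying the reason as an error value means a passing row naturally prints <nil>, which matches the shape of the output shown above.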

dricross · 2025-07-17T18:49:50Z

test/histograms/histograms_to_emf_test.go

 	"github.com/aws/amazon-cloudwatch-agent-test/util/common"
 )

 func TestOTLPMetrics(t *testing.T) {
+	instanceID := awsservice.GetInstanceId()


Pulling instance ID from IMDS instead of hardcoding to a dummy value so that concurrent integration tests don't interfere with each other

dricross · 2025-07-17T18:50:04Z

test/histograms/histograms_to_emf_test.go

@@ -34,7 +36,6 @@ func TestOTLPMetrics(t *testing.T) {
 		expected   []struct {
 			stat  types.Statistic
 			value float64
-			check func(t *testing.T, expected, actual float64)


This was actually completely unused

dricross · 2025-07-17T18:52:04Z

test/test_runner/base_test_runner.go

-		testGroupResult = t.TestRunner.Validate()
-	}
-	if testGroupResult.GetStatus() != status.SUCCESSFUL {
-		log.Printf("%v test group failed due to %v", testName, err)


This would often print "... test group failed due to <nil>", because err comes from the RunAgent() call while the status comes from the Validate() call. I decided to rework RunAgent() to return only an error.

lisguo · 2025-07-21T20:11:51Z

test/histograms/resources/otlp_emf_metrics.json

@@ -34,8 +34,8 @@
                "aggregationTemporality": 1,
                "dataPoints": [
                  {
-                    "startTimeUnixNano": START_TIME,
-                    "timeUnixNano": START_TIME,
+                    "startTimeUnixNano": METRIC_TIME,


I hate this weird template sed logic...it would be better to use some tool to generate these metrics programmatically like otelgen: https://github.com/krzko/otelgen

Not your fault though. I started this

dricross · 2025-07-21

I agree. I had Amazon Q write up a metric generator using the OTEL SDK. It got everything working except cumulative or delta exponential histograms. I then spent a day trying to figure out how to add exponential histograms, but I couldn't figure it out and eventually gave up. You can actually see the generator I had in the commit history.

okankoAMZ · 2025-07-23T18:38:37Z

.github/workflows/build-check.yml

        uses: actions/setup-go@v3
        with:
-          go-version: ~1.20.0
+          go-version: ~1.23.0


why are we bumping go as part of this?

dricross · 2025-07-24

I was trying to upgrade the whole package from Go 1.20 to Go 1.23, hit an issue (I forget what it was now...), and then tried to revert, but I missed this. I don't think it should hurt, as Go 1.23 is backwards compatible with our Go 1.20 go.mod file.

dricross · 2025-08-07T15:49:24Z

test/metric/stat.go

+	AVERAGE                  types.Statistic = "Average"
+	SAMPLE_COUNT             types.Statistic = "SampleCount"
+	MINIMUM                  types.Statistic = "Minimum"
+	MAXUMUM                  types.Statistic = "Maximum"
+	SUM                      types.Statistic = "Sum"


I started to remove the usage of these consts, as the types package in the SDK already defines them, but it would have made this PR even larger. I'd rather do that in a separate PR.

dricross requested a review from a team as a code owner July 17, 2025 17:21

dricross force-pushed the dricross/exphistograms branch from d559b35 to 167e846 Compare July 17, 2025 17:24

dricross mentioned this pull request Jul 17, 2025

Add exponential histogram support to CloudWatch PMD Exporter aws/amazon-cloudwatch-agent#1677

Merged

dricross commented Jul 17, 2025

lisguo previously approved these changes Jul 21, 2025

okankoAMZ previously approved these changes Jul 23, 2025

dricross dismissed stale reviews from okankoAMZ and lisguo via cbf9ede August 6, 2025 15:52

dricross commented Aug 7, 2025

dricross added 11 commits August 19, 2025 16:03

create otlp metrics generator

785d65f

exph to generator

5300829

some code cleanup

76199bb

Print reason with failures

de58b52

Update emf/cw test for exponential histograms

9fb88f0

Remove otlp generator

f75f648

Upgrade go

75f1307

Revert to go 1.20

c55f844

Refactor fetcher to work with percentile metrics

2b45ba7

add dummy p metrics to check whats returned

68fce08

Update expected p90 values

2603f8e

dricross force-pushed the dricross/exphistograms branch from 812b8e4 to 2603f8e Compare August 19, 2025 20:29

Paramadon approved these changes Aug 19, 2025

okankoAMZ approved these changes Aug 20, 2025

lisguo approved these changes Aug 20, 2025

dricross merged commit 451f0c5 into main Aug 20, 2025
6 checks passed

dricross deleted the dricross/exphistograms branch August 20, 2025 16:27
