Skip to content

Conversation

@tongoss
Copy link
Contributor

@tongoss tongoss commented Sep 10, 2025

fix #7013

References:

Implement following self-observability metrics from https://github.com/open-telemetry/semantic-conventions/blob/v1.36.0/docs/otel/sdk-metrics.md for https://pkg.go.dev/go.opentelemetry.io/otel/exporters/prometheus:

  • otel.sdk.exporter.metric_data_point.inflight
  • otel.sdk.exporter.metric_data_point.exported
  • otel.sdk.exporter.operation.duration
  • otel.sdk.metric_reader.collection.duration

Benchmarks

benchstat /tmp/bench_disabled.txt /tmp/bench_enabled.txt
goos: darwin
goarch: arm64
pkg: go.opentelemetry.io/otel/exporters/prometheus/internal/observ
cpu: Apple M1 Max
                                           │ /tmp/bench_disabled.txt │       /tmp/bench_enabled.txt       │
                                           │         sec/op          │   sec/op     vs base               │
InstrumentationExportMetrics-10                          177.5n ± 0%   177.8n ± 0%  +0.14% (p=0.039 n=20)
InstrumentationRecordOperationDuration-10                246.6n ± 0%   246.7n ± 0%       ~ (p=0.606 n=20)
InstrumentationRecordCollectionDuration-10               246.8n ± 1%   247.2n ± 0%       ~ (p=0.456 n=20)
geomean                                                  221.1n        221.3n       +0.09%

                                           │ /tmp/bench_disabled.txt │       /tmp/bench_enabled.txt        │
                                           │          B/op           │    B/op     vs base                 │
InstrumentationExportMetrics-10                           256.0 ± 0%   256.0 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordOperationDuration-10                 272.0 ± 0%   272.0 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordCollectionDuration-10                272.0 ± 0%   272.0 ± 0%       ~ (p=1.000 n=20) ¹
geomean                                                   266.6        266.6       +0.00%
¹ all samples are equal

                                           │ /tmp/bench_disabled.txt │       /tmp/bench_enabled.txt        │
                                           │        allocs/op        │ allocs/op   vs base                 │
InstrumentationExportMetrics-10                           3.000 ± 0%   3.000 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordOperationDuration-10                 3.000 ± 0%   3.000 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordCollectionDuration-10                3.000 ± 0%   3.000 ± 0%       ~ (p=1.000 n=20) ¹
geomean                                                   3.000        3.000       +0.00%
¹ all samples are equal

@codecov
Copy link

codecov bot commented Sep 10, 2025

Codecov Report

❌ Patch coverage is 87.36842% with 36 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.5%. Comparing base (d1dddde) to head (47c32da).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
exporters/prometheus/exporter.go 70.9% 27 Missing and 9 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@          Coverage Diff           @@
##            main   #7345    +/-   ##
======================================
  Coverage   85.4%   85.5%            
======================================
  Files        271     274     +3     
  Lines      24391   24620   +229     
======================================
+ Hits       20846   21060   +214     
- Misses      3167    3181    +14     
- Partials     378     379     +1     
Files with missing lines Coverage Δ
exporters/prometheus/internal/counter/counter.go 100.0% <100.0%> (ø)
...ters/prometheus/internal/observ/instrumentation.go 100.0% <100.0%> (ø)
exporters/prometheus/internal/x/x.go 100.0% <100.0%> (ø)
exporters/prometheus/exporter.go 83.5% <70.9%> (-0.6%) ⬇️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tongoss tongoss force-pushed the feature/prometheus-exporter-self-observability branch from c83ae49 to 3b11492 Compare September 10, 2025 14:20
Copy link
Member

@flc1125 flc1125 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be continued

Copy link
Member

@flc1125 flc1125 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall adherence to standards and the quality of the code itself are very high 👍

Copy link
Member

@flc1125 flc1125 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good overall, except for the blocker: #7345 (comment)

@MrAlias MrAlias added this to the v1.39.0 milestone Sep 15, 2025
Copy link
Contributor

@MrAlias MrAlias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a high-quality PR, thank you for the contribution!

The only blocking issue is the scope name. It should match the instrumentation package name.

Outside of that, can you use benchstat to show the results of the added benchmarks?

The remaining issues included in this review are cleanup tasks that will help fix the code quality of the PR.

@tongoss
Copy link
Contributor Author

tongoss commented Sep 15, 2025

Outside of that, can you use benchstat to show the results of the added benchmarks?

@MrAlias , thanks very much for the comments. For results of the added benchmarks, please find them as below:

benchstat /tmp/bench_disabled.txt /tmp/bench_enabled.txt
goos: darwin
goarch: arm64
pkg: go.opentelemetry.io/otel/exporters/prometheus/internal/observ
cpu: Apple M1 Max
                                           │ /tmp/bench_disabled.txt │       /tmp/bench_enabled.txt       │
                                           │         sec/op          │   sec/op     vs base               │
InstrumentationExportMetrics-10                          177.5n ± 0%   177.8n ± 0%  +0.14% (p=0.039 n=20)
InstrumentationRecordOperationDuration-10                246.6n ± 0%   246.7n ± 0%       ~ (p=0.606 n=20)
InstrumentationRecordCollectionDuration-10               246.8n ± 1%   247.2n ± 0%       ~ (p=0.456 n=20)
geomean                                                  221.1n        221.3n       +0.09%

                                           │ /tmp/bench_disabled.txt │       /tmp/bench_enabled.txt        │
                                           │          B/op           │    B/op     vs base                 │
InstrumentationExportMetrics-10                           256.0 ± 0%   256.0 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordOperationDuration-10                 272.0 ± 0%   272.0 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordCollectionDuration-10                272.0 ± 0%   272.0 ± 0%       ~ (p=1.000 n=20) ¹
geomean                                                   266.6        266.6       +0.00%
¹ all samples are equal

                                           │ /tmp/bench_disabled.txt │       /tmp/bench_enabled.txt        │
                                           │        allocs/op        │ allocs/op   vs base                 │
InstrumentationExportMetrics-10                           3.000 ± 0%   3.000 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordOperationDuration-10                 3.000 ± 0%   3.000 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordCollectionDuration-10                3.000 ± 0%   3.000 ± 0%       ~ (p=1.000 n=20) ¹
geomean                                                   3.000        3.000       +0.00%
¹ all samples are equal```

@pellared pellared requested a review from dashpole September 16, 2025 18:16
@flc1125
Copy link
Member

flc1125 commented Sep 17, 2025

@tongoss Please fix the conflicting parts.

@tongoss
Copy link
Contributor Author

tongoss commented Sep 17, 2025

@tongoss Please fix the conflicting parts.

@flc1125 , merge conflict has been resolved.

@MrAlias MrAlias merged commit 60f9f39 into open-telemetry:main Sep 18, 2025
33 checks passed
@mahendrabishnoi2
Copy link
Member

Outside of that, can you use benchstat to show the results of the added benchmarks?

@MrAlias , thanks very much for the comments. For results of the added benchmarks, please find them as below:

➜ benchstat /tmp/bench_disabled.txt /tmp/bench_enabled.txt
goos: darwin
goarch: arm64
pkg: go.opentelemetry.io/otel/exporters/prometheus/internal/observ
cpu: Apple M1 Max
                                           │ /tmp/bench_disabled.txt │       /tmp/bench_enabled.txt       │
                                           │         sec/op          │   sec/op     vs base               │
InstrumentationExportMetrics-10                          177.5n ± 0%   177.8n ± 0%  +0.14% (p=0.039 n=20)
InstrumentationRecordOperationDuration-10                246.6n ± 0%   246.7n ± 0%       ~ (p=0.606 n=20)
InstrumentationRecordCollectionDuration-10               246.8n ± 1%   247.2n ± 0%       ~ (p=0.456 n=20)
geomean                                                  221.1n        221.3n       +0.09%

                                           │ /tmp/bench_disabled.txt │       /tmp/bench_enabled.txt        │
                                           │          B/op           │    B/op     vs base                 │
InstrumentationExportMetrics-10                           256.0 ± 0%   256.0 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordOperationDuration-10                 272.0 ± 0%   272.0 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordCollectionDuration-10                272.0 ± 0%   272.0 ± 0%       ~ (p=1.000 n=20) ¹
geomean                                                   266.6        266.6       +0.00%
¹ all samples are equal

                                           │ /tmp/bench_disabled.txt │       /tmp/bench_enabled.txt        │
                                           │        allocs/op        │ allocs/op   vs base                 │
InstrumentationExportMetrics-10                           3.000 ± 0%   3.000 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordOperationDuration-10                 3.000 ± 0%   3.000 ± 0%       ~ (p=1.000 n=20) ¹
InstrumentationRecordCollectionDuration-10                3.000 ± 0%   3.000 ± 0%       ~ (p=1.000 n=20) ¹
geomean                                                   3.000        3.000       +0.00%
¹ all samples are equal```

Hi @tongoss, Can you please help me with the changes you made and commands you ran to get these results?
I'm trying to do the same in one of my PR but getting drastically different results.

@tongoss
Copy link
Contributor Author

tongoss commented Oct 9, 2025

Hi @mahendrabishnoi2 ,

Have you got all the help that you need?

@mahendrabishnoi2
Copy link
Member

Yes @tongoss. I have.

@MrAlias MrAlias mentioned this pull request Dec 5, 2025
MrAlias added a commit that referenced this pull request Dec 8, 2025
## Overview

### Added

- Greatly reduce the cost of recording metrics in
`go.opentelemetry.io/otel/sdk/metric` using hashing for map keys.
(#7175)
- Add `WithInstrumentationAttributeSet` option to
`go.opentelemetry.io/otel/log`, `go.opentelemetry.io/otel/metric`, and
`go.opentelemetry.io/otel/trace` packages. This provides a
concurrent-safe and performant alternative to
`WithInstrumentationAttributes` by accepting a pre-constructed
`attribute.Set`. (#7287)
- Add experimental observability for the Prometheus exporter in
`go.opentelemetry.io/otel/exporters/prometheus`. Check the
`go.opentelemetry.io/otel/exporters/prometheus/internal/x` package
documentation for more information. (#7345)
- Add experimental observability metrics in
`go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc`. (#7353)
- Add temporality selector functions `DeltaTemporalitySelector`,
`CumulativeTemporalitySelector`, `LowMemoryTemporalitySelector` to
`go.opentelemetry.io/otel/sdk/metric`. (#7434)
- Add experimental observability metrics for simple log processor in
`go.opentelemetry.io/otel/sdk/log`. (#7548)
- Add experimental observability metrics in
`go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc`.
(#7459)
- Add experimental observability metrics in
`go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp`.
(#7486)
- Add experimental observability metrics for simple span processor in
`go.opentelemetry.io/otel/sdk/trace`. (#7374)
- Add experimental observability metrics in
`go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#7512)
- Add experimental observability metrics for manual reader in
`go.opentelemetry.io/otel/sdk/metric`. (#7524)
- Add experimental observability metrics for periodic reader in
`go.opentelemetry.io/otel/sdk/metric`. (#7571)
- Support `OTEL_EXPORTER_OTLP_LOGS_INSECURE` and
`OTEL_EXPORTER_OTLP_INSECURE` environmental variables in
`go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#7608)
- Add `Enabled` method to the `Processor` interface in
`go.opentelemetry.io/otel/sdk/log`. All `Processor` implementations now
include an `Enabled` method. (#7639)
- The `go.opentelemetry.io/otel/semconv/v1.38.0` package. The package
contains semantic conventions from the `v1.38.0` version of the
OpenTelemetry Semantic Conventions. See the [migration
documentation](./semconv/v1.38.0/MIGRATION.md) for information on how to
upgrade from `go.opentelemetry.io/otel/semconv/v1.37.0.`(#7648)

### Changed

- `Distinct` in `go.opentelemetry.io/otel/attribute` is no longer
guaranteed to uniquely identify an attribute set. Collisions between
`Distinct` values for different Sets are possible with extremely high
cardinality (billions of series per instrument), but are highly
unlikely. (#7175)
- `WithInstrumentationAttributes` in `go.opentelemetry.io/otel/trace`
synchronously de-duplicates the passed attributes instead of delegating
it to the returned `TracerOption`. (#7266)
- `WithInstrumentationAttributes` in `go.opentelemetry.io/otel/meter`
synchronously de-duplicates the passed attributes instead of delegating
it to the returned `MeterOption`. (#7266)
- `WithInstrumentationAttributes` in `go.opentelemetry.io/otel/log`
synchronously de-duplicates the passed attributes instead of delegating
it to the returned `LoggerOption`. (#7266)
- Rename the `OTEL_GO_X_SELF_OBSERVABILITY` environment variable to
`OTEL_GO_X_OBSERVABILITY` in `go.opentelemetry.io/otel/sdk/trace`,
`go.opentelemetry.io/otel/sdk/log`, and
`go.opentelemetry.io/otel/exporters/stdout/stdouttrace`. (#7302)
- Improve performance of histogram `Record` in
`go.opentelemetry.io/otel/sdk/metric` when min and max are disabled
using `NoMinMax`. (#7306)
- Improve error handling for dropped data during translation by using
`prometheus.NewInvalidMetric` in
`go.opentelemetry.io/otel/exporters/prometheus`. ⚠️ **Breaking Change:**
Previously, these cases were only logged and scrapes succeeded. Now,
when translation would drop data (e.g., invalid label/value), the
exporter emits a `NewInvalidMetric`, and Prometheus scrapes **fail with
HTTP 500** by default. To preserve the prior behavior (scrapes succeed
while errors are logged), configure your Prometheus HTTP handler with:
`promhttp.HandlerOpts{ ErrorHandling: promhttp.ContinueOnError }`.
(#7363)
- Replace fnv hash with xxhash in `go.opentelemetry.io/otel/attribute`
for better performance. (#7371)
- The default `TranslationStrategy` in
`go.opentelemetry.io/exporters/prometheus` is changed from
`otlptranslator.NoUTF8EscapingWithSuffixes` to
`otlptranslator.UnderscoreEscapingWithSuffixes`. (#7421)
- Improve performance of concurrent measurements in
`go.opentelemetry.io/otel/sdk/metric`. (#7427)
- Include W3C TraceFlags (bits 0–7) in the OTLP `Span.Flags` field in
`go.opentelemetry.io/exporters/otlp/otlptrace/otlptracehttp` and
`go.opentelemetry.io/exporters/otlp/otlptrace/otlptracegrpc`. (#7438)
- The `ErrorType` function in `go.opentelemetry.io/otel/semconv/v1.37.0`
now handles custom error types.
If an error implements an `ErrorType() string` method, the return value
of that method will be used as the error type. (#7442)

### Fixed

- Fix `WithInstrumentationAttributes` options in
`go.opentelemetry.io/otel/trace`, `go.opentelemetry.io/otel/metric`, and
`go.opentelemetry.io/otel/log` to properly merge attributes when passed
multiple times instead of replacing them. Attributes with duplicate keys
will use the last value passed. (#7300)
- The equality of `attribute.Set` when using the `Equal` method is not
affected by the user overriding the empty set pointed to by
`attribute.EmptySet` in `go.opentelemetry.io/otel/attribute`. (#7357)
- Return partial OTLP export errors to the caller in
`go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc`. (#7372)
- Return partial OTLP export errors to the caller in
`go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#7372)
- Return partial OTLP export errors to the caller in
`go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc`.
(#7372)
- Return partial OTLP export errors to the caller in
`go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp`.
(#7372)
- Return partial OTLP export errors to the caller in
`go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc`.
(#7372)
- Return partial OTLP export errors to the caller in
`go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp`.
(#7372)
- Fix `AddAttributes`, `SetAttributes`, `SetBody` on `Record` in
`go.opentelemetry.io/otel/sdk/log` to not mutate input. (#7403)
- Do not double record measurements of `RecordSet` methods in
`go.opentelemetry.io/otel/semconv/v1.37.0`. (#7655)
- Do not double record measurements of `RecordSet` methods in
`go.opentelemetry.io/otel/semconv/v1.36.0`. (#7656)

### Removed

- Drop support for [Go 1.23]. (#7274)
- Remove the `FilterProcessor` interface in
`go.opentelemetry.io/otel/sdk/log`. The `Enabled` method has been added
to the `Processor` interface instead. All `Processor` implementations
must now implement the `Enabled` method. Custom processors that do not
filter records can implement `Enabled` to return `true`. (#7639)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Metrics SDK observability - prometheus exporter metrics

6 participants