Add `imports` section in semconv to support references on attribute_groups/metrics/events/entities from custom registries #769

lquerel · 2025-06-05T01:07:26Z

[Updated to reflect conclusions of the thread below]

This PR removes one of the current limitations of custom registries. Previously, when the --include-unreferenced option was not enabled, Weaver did not include signals from imported registries in the resolved output. This PR introduces the ability to explicitly import specific groups by name or wildcard from an imported registry.

A new top-level imports field, composed of four lists (attributes, metrics, events, entities), has been added to the semconv YAML format. This allows references to groups of type attribute_group, metric, event, and entity. References to spans are not yet supported, but could be added in the future if their names become stable and persistent across registry versions.

Support for overriding groups through this reference mechanism could also be introduced in a future PR.

Example of a custom registry:

groups:
  - id: shared.attributes
    type: attribute_group
    brief: Attributes shared between our signals.
    attributes:
      - id: auction.id
        type: int
        brief: >
          The id of the auction.
        stability: stable
      - id: auction.name
        type: string
        brief: >
          The name of the auction
        stability: stable
        examples: ["Fish for sale"]

  - id: metric.auction.bid.count
    type: metric
    metric_name: auction.bid.count
    stability: stable
    brief: "Count of all bids we've seen"
    instrument: counter
    unit: "{bid}"
    attributes:
      - ref: auction.id
        requirement_level: required
      - ref: auction.name
        requirement_level: recommended
      - ref: error.type
        requirement_level: required

imports:
  attributes:
    - attributes.http.*    # Match on id, as attribute_groups don’t have names.
  metrics:
    - example.*            # Match on metric_name.
  entities:
    - gcp.*                # Match on name
  events:
    - session.start.       # Match on name

lmolkova · 2025-06-06T01:43:29Z

crates/weaver_resolver/data/multi-registry/custom_registry/custom_registry.yaml

+        requirement_level: required
+
+imports:
+  - metric_ref: metric.example_counter


Another branch of the Slack discussion:
we probably want wildcard imports that cover attribute namespaces rather than groups.

E.g.

imports:
  - attribute.db.*
  - attribute.server.*
  - attribute.error.*
  - attribute.http.*
  - attribute.url.*

This syntax lets users allowlist attributes in bulk without importing everything or explicitly allowing each attribute.
Attribute groups are meaningless - they are a grouping mechanism and we move attributes between them a lot.

There are no stability guarantees for attribute groups.

A similar syntax can be useful for bulk-importing metrics, entities, events

imports:
  - metric.db.*
  - metric.http.*
  - metric.system.*
  - entity.host.*
  - ...

Importing each attribute/span/event/metric/entity one-by-one will be very verbose and hard to use. Bulk import seems to be the main use-case.

lmolkova · 2025-06-06T02:00:35Z

crates/weaver_semconv/src/group.rs

+    MetricRef {
+        /// The ID of the metric group being referenced in the imported registry.
+        metric_ref: String,
+        // Additional overridable fields may be added in the future.


related to https://github.com/open-telemetry/weaver/pull/769/files#r2131292750 (bulk import)

importing without modification makes sense, but it would override the original signal definition without any means to distinguish imported from the original one.

e.g. in service A I report http.client.request.duration with url.template when it talks to internal services B and C.
Service B reports http.client.request.duration without url.template because it's not supported or possible.

How do I document these caveats if I have a singleton http.client.request.duration definition?

the proposal:

imports allow importing without modification: original definitions are fully frozen and immutable. The original id describes only that signal and not its flavor.

Supported modifications are done through the existing extends mechanism or something similar. I.e.

- id: my.service_a.metric.http.client.request.duration
  extends: metric.http.client.request.duration
  note: this is a service A flavor of http.client.request.duration, it records url.template when talking to internal services
  attributes:
    - ref: url.template
      requirement_level:
        conditionally_required: SHOULD be recorded when calling service B or C
- id: my.service_b.metric.http.client.request.duration
  extends: metric.http.client.request.duration
  note: this is a service B flavor of http.client.request.duration, it does not include url.template attribute since we can't record it there.

I.e. each modified version needs a unique id that can be used to render docs, generate code, build dashboards, run live checks.
We'd need to change what extends does for telemetry signals. Today it just inherits attributes and common group properties, but it can become stricter and require preserving stability, name, type, etc.

From my point of view, having a modifiable but singleton definition is narrow and confusing.

What matters to me is: 1) being able to import signals, and 2) being able to extend them. I like what you’re proposing, it seems clear to me. In this PR, I’ll focus only on the import part with wildcard support.

@jsuereth and @jerbly are we all in agreement with that?

Makes sense to me.

then we don't need signal-specific refs for import - they won't have signal-specific properties.

I.e. we can have a nicer syntax like

imports:
  - metric.db.*
  - metric.http.*
  - metric.system.*
  - entity.*
  - ...

I'm on board with this direction

maybe we should rename attribute id to name?

In any case, if we do

imports:
  - attribute.db.*
  - metric.db.*

it'll be confusing for authors because attribute. prefix doesn't exist anywhere

if we do

imports:
  - db.*
  - metric.db.*

it'll be confusing to authors and readers since it's not obvious that db.* only imports attributes.

I'm trying to create a non-ambiguous and consistent way for those who read or write conventions. With

- imports:
  - attributes:
    - http.*
    - db.*
    - system.*
  - metrics:
    - http.*
    - db.*
    - system.*
  - entities:
    - host

they import by the over-the-wire identifier they use in queries/dashboards. The id becomes the internal semconv impl detail it should be.

@lquerel - What's the extent of the wildcard capability? Can we have *.db.* for example? Or, for people not using namespace dot notation, db*?

@jerbly everything supported by globset is effectively supported by this imports section.
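To make the wildcard question concrete, here is a minimal sketch of how patterns such as `*.db.*` or `db*` select names. It is a simplified stand-in, not the globset crate and not Weaver's code: only `*` (any run of characters, possibly empty) is handled, and the names `wildcard_match` and `select_matching` are hypothetical.

```rust
/// Simplified wildcard matching: `*` matches any run of characters
/// (possibly empty); every other character must match literally.
/// This is an illustrative stand-in, not the globset implementation.
fn wildcard_match(pattern: &str, candidate: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let c: Vec<char> = candidate.chars().collect();
    let (mut pi, mut ci) = (0usize, 0usize);
    // Most recent `*` position in the pattern, plus the candidate
    // position it was tried at, kept for backtracking.
    let mut star: Option<(usize, usize)> = None;
    while ci < c.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, ci));
            pi += 1;
        } else if pi < p.len() && p[pi] == c[ci] {
            pi += 1;
            ci += 1;
        } else if let Some((sp, sc)) = star {
            // Let the last `*` absorb one more character and retry.
            pi = sp + 1;
            ci = sc + 1;
            star = Some((sp, sc + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain unmatched in the pattern.
    p[pi..].iter().all(|&ch| ch == '*')
}

/// Keeps the names that match at least one import pattern,
/// preserving the order of `names`.
fn select_matching<'a>(patterns: &[&str], names: &[&'a str]) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        .filter(|name| patterns.iter().any(|p| wildcard_match(p, name)))
        .collect()
}
```

With this sketch, `db*` also selects names that do not use dot-separated namespaces, while `http.*` does not select the bare name `http`.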

@lmolkova

- imports:
  - attributes:
    - http.*
    - db.*
    - system.*
  - metrics:
    - http.*
    - db.*
    - system.*
  - entities:
    - host

Unless I’m misinterpreting something in your previous messages, I see more or less the same level of confusion with this syntax. For attributes, the list under attributes contains IDs, while the list under metrics contains names.

in my suggestion you import by the formal/public name. You never see metric.db.... unless you look into semconv repo, it's internal impl detail. The fact we use different properties to represent id inside semconv is sad.

How do you want to solve 'import attribute' problem otherwise?

jerbly · 2025-06-06T18:29:33Z

refactoring_of_group_spec.patch

Did you mean to include this file?

@jerbly What I’m saying is that the glob syntax used to match signal and attribute IDs (or names, depending on the final version) is the same as the one used in the globset crate.

I understand that... this comment above was a file level comment for the patch file you've included. Is that a mistake?

Oh yes, my bad. I need to remove this patch from the repo. That was the major refactoring I did to completely change GroupSpec from a struct to an enum. I was planning to create a separate PR with the content of this patch.

Good catch, thanks

lmolkova · 2025-06-06T18:32:01Z

crates/weaver_semconv/src/semconv.rs

+
+    /// A list of imports referencing groups defined in a dependent registry.
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub(crate) imports: Option<Vec<GroupWildcard>>,


we should document what import does:

it is a way to allowlist external specific semconv in yours (that's my understanding, not sure if it's a common one)

when importing a group, does it also import all attributes the group depends on and all entities it's associated with? or do I need to explicitly import such attribute before using it in my custom signals?

is there a way to say "I allow all otel semconv" imports: ["*"]?

what do I do with imported things after? generate docs? - we need an example :)

lmolkova · 2025-06-06T18:35:12Z

crates/weaver_semconv/src/semconv.rs

+
+    /// Returns the list of imports in the semantic convention spec.
+    #[must_use]
+    pub fn imports(&self) -> Option<&[GroupWildcard]> {


(random place)

should we warn/fail when someone tries to import a thing that's in the same registry? it doesn't make sense and people would only do this by mistake, we should probably notify them

I had thought about giving this kind of user feedback, but it's actually not so straightforward with the wildcard system. For example, I might have an import like metric.db.* that imports all the OTEL database metrics, but at the same time have a local metric declaration that matches this wildcard.
So, it seems clear that we should build something robust, except perhaps for imports without wildcards.
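As a rough illustration of the narrower check mentioned here, the sketch below flags only imports without wildcards that name something already defined locally, and leaves wildcard imports alone. The function names are hypothetical, and the set of characters treated as glob metacharacters (`*`, `?`, `[`, `{`) is an assumption made for this example.

```rust
/// Returns true when the import entry contains a character that this
/// sketch treats as a glob metacharacter (an assumption, see lead-in).
fn has_wildcard(entry: &str) -> bool {
    entry.contains(|c: char| matches!(c, '*' | '?' | '[' | '{'))
}

/// Returns the exact (wildcard-free) imports that name a signal already
/// declared in the local registry. Wildcard imports are never reported,
/// because overlapping with a local declaration can be legitimate.
fn redundant_exact_imports<'a>(imports: &[&'a str], local_names: &[&str]) -> Vec<&'a str> {
    imports
        .iter()
        .copied()
        .filter(|&entry| {
            !has_wildcard(entry) && local_names.iter().any(|&name| name == entry)
        })
        .collect()
}
```

A caller could turn each returned entry into a warning rather than a hard failure, since the import is redundant but harmless.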

I recall we discussed imports from multiple registries and that we'd need to add registry as the namespace.

E.g. if I have OTel registry and Foo registry, when I import metric.db.... (or db....) I actually implicitly import from otel:: because it's special to us.

So when a collision happens, I should be able to resolve it by adding registry foo::db.....

I don't like the idea of

- imports:
  - metrics:
    - db.*

importing all db metrics from all registries I have without any means to control it and all the collisions that come with it.

We could allow db.* only if there is no ambiguity, as with references, and when there is ambiguity, require the custom registry author to specify the registry alias, like otel:db.*. What do you think?

imho the best way to prevent ambiguity is to make people explicitly mention where they import from.

It's ok if we have one registry as a default one or if you can set it.
When I write imports: ['db.foo.bar'] and I have two registries I import from, it should be deterministic where db.foo.bar comes from - it comes from the default one. You can only import a metric defined in a non-default registry using another_registry:db.foo.bar.

So if you want to import from all registries, write imports: ['*:db.foo.bar'], weaver should not try to be too smart.

i.e. the syntax should minimize the chance of ambiguity, runtime check is the last resort
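The registry-qualified syntax discussed above could be parsed along these lines. This is only a sketch of the proposal in this thread (single-colon form such as `otel:db.*`, with `*:` meaning all registries); it is not part of this PR, and the types `RegistryScope`, `QualifiedImport`, and the function `parse_import` are hypothetical.

```rust
/// Which registry (or registries) an import entry targets.
#[derive(Debug, PartialEq)]
enum RegistryScope {
    /// No alias given: resolve against the default registry only.
    DefaultRegistry,
    /// `*:pattern`: resolve against every imported registry.
    AllRegistries,
    /// `alias:pattern`: resolve against the named registry only.
    Named(String),
}

#[derive(Debug, PartialEq)]
struct QualifiedImport {
    scope: RegistryScope,
    pattern: String,
}

/// Splits an import entry into an optional registry alias and a pattern.
/// Assumes aliases and patterns never contain `:` themselves.
fn parse_import(entry: &str) -> Result<QualifiedImport, String> {
    let (scope, pattern) = match entry.split_once(':') {
        None => (RegistryScope::DefaultRegistry, entry),
        Some(("*", pattern)) => (RegistryScope::AllRegistries, pattern),
        Some((alias, pattern)) if !alias.is_empty() => {
            (RegistryScope::Named(alias.to_string()), pattern)
        }
        Some(_) => return Err(format!("missing registry alias in `{}`", entry)),
    };
    if pattern.is_empty() {
        return Err(format!("missing pattern in `{}`", entry));
    }
    Ok(QualifiedImport {
        scope,
        pattern: pattern.to_string(),
    })
}
```

Because the scope is decided purely from the entry text, the source registry of an import is deterministic before any name lookup happens, which is the property argued for above.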

lquerel · 2025-06-18T14:35:23Z

During the semconv tooling SIG, we decided to represent the imports section with sub-sections.

- imports:
  - attributes:
    - http.*       <- name with wildcard (id in the semconv right now)
    - db.*
    - system.*
  - metrics:
    - http.*       <- metric_name with wildcard 
    - db.*
    - system.*
  - entities:
    - host        <- name
  - events:     <- name
    - db.*

Right now, the "name" of the persistent identifiers across the various group types is not fully consistent. We have a plan to rename metric_name to name.
See #785 for more details.
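The sub-section layout and the per-type identifier it matches on can be summarized in a small sketch. The `Imports` struct and `Group` enum below are hypothetical simplifications for illustration, not Weaver's actual types; they only encode what the listing above states: attribute groups are matched on their id, metrics on metric_name, and events and entities on name.

```rust
/// Hypothetical in-memory form of the `imports` section: one list of
/// wildcard patterns per sub-section.
#[derive(Debug, Default)]
struct Imports {
    attributes: Vec<String>,
    metrics: Vec<String>,
    events: Vec<String>,
    entities: Vec<String>,
}

/// Hypothetical minimal view of a group in an imported registry,
/// carrying only the field that import patterns are compared against.
enum Group {
    AttributeGroup { id: String },
    Metric { metric_name: String },
    Event { name: String },
    Entity { name: String },
}

impl Group {
    /// The identifier an import pattern is matched against for this group type.
    fn import_key(&self) -> &str {
        match self {
            Group::AttributeGroup { id } => id.as_str(),
            Group::Metric { metric_name } => metric_name.as_str(),
            Group::Event { name } => name.as_str(),
            Group::Entity { name } => name.as_str(),
        }
    }
}

impl Imports {
    /// The pattern list that applies to the given group type.
    fn patterns_for(&self, group: &Group) -> &[String] {
        match group {
            Group::AttributeGroup { .. } => &self.attributes,
            Group::Metric { .. } => &self.metrics,
            Group::Event { .. } => &self.events,
            Group::Entity { .. } => &self.entities,
        }
    }
}
```

Keeping the key selection in one place means a later rename of metric_name to name would only touch `import_key` in a design like this.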

jerbly · 2025-06-24T11:58:21Z

Are we ready to switch to the auto-generated json schema for use in editors? There is a trade-off that I'm not sure we've agreed on. The manual schema encodes some of the rules which shifts-left the warnings for a better experience for the author. For example:

With the MANUAL schema vscode shows an error in-place if I don't have the stability field.
With the MANUAL schema vscode shows an error in-place if I define a span without the span_kind field.
With the GENERATED schema vscode offers code completion for fields weaver will then produce warnings for: e.g. prefix
With the GENERATED schema vscode offers code completion for fields for the wrong signal: e.g. when defining a span it offers instrument

My concern is that this may be a backwards step for authors using the schema which is highly recommended for a faster in-line experience. Their editor will show no errors and then weaver will complain, leading to a much slower working loop.

jerbly · 2025-06-24T12:16:35Z

Should https://github.com/open-telemetry/weaver/blob/main/crates/weaver_resolver/README.md be updated?

lquerel · 2025-06-24T14:14:42Z

Are we ready to switch to the auto-generated json schema for use in editors? There is a trade-off that I'm not sure we've agreed on. The manual schema encodes some of the rules which shifts-left the warnings for a better experience for the author

I thought we had decided to switch, but I agree with your argument. I'll roll back this file.

lquerel · 2025-06-24T15:39:07Z

@jerbly I rolled back the JSON schema to the manual version and updated it with the new imports section.

lquerel · 2025-06-24T15:45:10Z

@jerbly I also made a small update to crates/weaver_resolver/README.md to mention the imports section in the resolution process.

jerbly

LGTM

codecov · 2025-06-24T20:07:04Z

Codecov Report

Attention: Patch coverage is 94.28571% with 2 lines in your changes missing coverage. Please review.

Project coverage is 76.9%. Comparing base (619ac72) to head (e591289).

Files with missing lines	Patch %	Lines
crates/weaver_resolver/src/registry.rs	92.5%	2 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##            main    #769     +/-   ##
=======================================
+ Coverage   76.8%   76.9%   +0.1%     
=======================================
  Files         69      69             
  Lines       5634    5666     +32     
=======================================
+ Hits        4328    4359     +31     
- Misses      1306    1307      +1

Add imports section in semconv

9b395f9

lquerel self-assigned this Jun 5, 2025

lquerel added this to OTel Weaver Project Jun 5, 2025

lquerel added the enhancement New feature or request label Jun 5, 2025

lquerel added 2 commits June 4, 2025 18:15

Update imports section

b7753e2

Resolve imports section

771557c

github-advanced-security bot found potential problems Jun 5, 2025

Resolve imports section

816bfd9

lmolkova reviewed Jun 5, 2025

lquerel requested review from jerbly and jsuereth June 5, 2025 21:31

lmolkova reviewed Jun 6, 2025

lquerel added 2 commits June 6, 2025 08:53

Merge branch 'main' into group-ref

0ff2884

Resolve imports section

f84da95

jerbly reviewed Jun 6, 2025

lmolkova reviewed Jun 6, 2025

lquerel added 3 commits June 22, 2025 22:40

Add imports sections

1c5f253

Merge branch 'main' into group-ref

bac4335

Update CHANGELOG.md

6dc3c38

lquerel marked this pull request as ready for review June 23, 2025 05:58

lquerel requested a review from a team as a code owner June 23, 2025 05:58

lquerel changed the title ~~[WIP] Add imports section in semconv to support references on metrics/events/entities from custom registries~~ Add imports section in semconv to support references on attribute_groups/metrics/events/entities from custom registries Jun 23, 2025

lquerel added 3 commits June 23, 2025 08:50

Fix test multi-registry

ff956b4

Remove attributes from imports section

187e181

Update semconv JSON schema

b5c037c

lquerel requested a review from a team as a code owner June 24, 2025 00:30

Remove refactoring_of_group_spec.patch from repository

f420a57

lquerel added 2 commits June 24, 2025 08:32

Roll back semconv.schema.json

526775b

Updated semconv.schema.json

ed4ad70

Updated crates/weaver_resolver/README.md

7b1eb6e

jerbly approved these changes Jun 24, 2025

Merge branch 'main' into group-ref

204e2d4

lquerel enabled auto-merge (squash) June 24, 2025 19:59

lquerel requested review from lmolkova and removed request for jsuereth June 24, 2025 20:08

lmolkova reviewed Jun 25, 2025

lquerel and others added 2 commits June 25, 2025 07:47

Update schemas/semconv.schema.json

09de7b2

Co-authored-by: Liudmila Molkova <[email protected]>

Document custom registries and imports section

e591289

lmolkova approved these changes Jun 25, 2025

Fix typo issue

c578b2f

lquerel merged commit 779c0ff into open-telemetry:main Jun 25, 2025
21 checks passed

github-project-automation bot moved this to Done in OTel Weaver Project Jun 25, 2025
