All,
Please find here an alternative proposal for how to use environment variables, and how to merge changes, with a config.yaml configuration file.
Using an issue for discussion.
Related to:
- Define OTEL_CONFIG_FILE environment variable #3805
- Define OTEL_CONFIG_FILE with environment variable merge semantics #3840
- Define OTEL_CONFIG_FILE with placeholder for new environment variable override scheme #3850
- Should file configuration merge environment variable configuration? #3752
- configuration SIG meeting, 2024-03-04
cc @jack-berg
1. Environment variable OTEL_CONFIG_FILE is defined.
2. Every other environment variable defined in the SDK as of today is ignored in the low-level SDK implementation.
3. The yaml configuration file supports env var substitution, with default values.
For example:
tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            protocol: ${OTEL_EXPORTER_OTLP_TRACES_PROTOCOL:http/protobuf}
            endpoint: ${OTEL_EXPORTER_OTLP_TRACES_ENDPOINT:http://localhost:4318/v1/traces}
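To make the substitution rule concrete, here is a minimal Python sketch of how a `${NAME:default}` placeholder could be resolved. The function name `substitute`, the regular expression, and the choice to raise an error when a variable is unset and has no default are all assumptions for illustration; the proposal itself does not specify them.

```python
import os
import re

# Matches ${NAME} or ${NAME:default}. The default may itself contain ':'
# (as in a URL), so only the first ':' separates the name from the default.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def substitute(text, env=None):
    """Replace every ${NAME:default} placeholder in `text`.

    `env` defaults to the process environment; passing a dict makes the
    function easy to test.
    """
    env = os.environ if env is None else env

    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        # Assumption: an unset variable without a default is an error.
        raise KeyError(name)

    return _PLACEHOLDER.sub(replace, text)


line = "endpoint: ${OTEL_EXPORTER_OTLP_TRACES_ENDPOINT:http://localhost:4318/v1/traces}"
unset_result = substitute(line, env={})
set_result = substitute(
    line, env={"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT": "http://collector:4318/v1/traces"}
)
```

With an empty environment the default is used; when the variable is set, its value wins.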
4. A config.yaml template is delivered.
5. For all the existing SDK environment variables defined today, sort them into two categories:
5.1. SDK variables that it makes sense to still honor from config.yaml
For example, OTEL_EXPORTER_OTLP_TRACES_PROTOCOL is in this category,
because it controls a yaml scalar node that can be used with substitution.
These SDK environment variables are henceforth referred to as "legacy".
5.2. SDK variables that it does not make sense to keep in config.yaml
For example, OTEL_TRACES_EXPORTER is in this category,
because it controls an intermediate structural node in the config, not a
leaf scalar.
Yaml is a better way to describe the children of exporter; OTEL_TRACES_EXPORTER
is not suited for that, and is abandoned.
6. Deliver a config.yaml template with all legacy environment variables used
in the applicable nodes in config.yaml.
For example:
disabled: ${OTEL_SDK_DISABLED:false}
attribute_limits:
  attribute_value_length_limit: ${OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT:4096}
  attribute_count_limit: ${OTEL_ATTRIBUTE_COUNT_LIMIT:128}
tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            protocol: ${OTEL_EXPORTER_OTLP_TRACES_PROTOCOL:http/protobuf}
            endpoint: ${OTEL_EXPORTER_OTLP_TRACES_ENDPOINT:http://localhost:4318/v1/traces}
This shows users how to convert an existing configuration, based on
legacy env vars, into a working config.yaml.
This also avoids the trap of forcefully mapping a flat list of env vars into
a tree structure:
- every flat list config can be expressed in yaml
- config.yaml can be extended beyond flat lists, by using env vars beyond
what is supported by legacy today, at the end user's option.
For example:
tracer_provider:
  processors:
    - batch:
        exporter:
          otlp:
            # Possible to express today
            protocol: ${OTEL_EXPORTER_OTLP_TRACES_PROTOCOL:http/protobuf}
            endpoint: ${OTEL_EXPORTER_OTLP_TRACES_ENDPOINT:http://localhost:4318/v1/traces}
    - batch:
        exporter:
          otlp:
            # Not possible to express today without yaml
            protocol: ${MY_SECOND_TRACES_PROTOCOL:http/protobuf}
            endpoint: ${MY_SECOND_TRACES_ENDPOINT:http://remotehost:4318/v1/traces}
Note the net effect here:
- variable OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is not used because it is hard coded in the SDK.
- It is used because it is referenced from config.yaml
As a bonus, every SIG that did not implement some config variable (see the
spec compatibility matrix) is now compliant, since all variables are supported from yaml,
without SDK changes (except for supporting yaml of course).
7. In the template config.yaml delivered, every env var substitution has a
default value defined as fallback.
This ensures the config is still valid if an environment variable is not
set, which is helpful and corresponds to the behavior today with legacy
variables.
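The "every substitution has a default" rule is easy to check mechanically. The following Python sketch scans a template for placeholders that lack a fallback; the helper name `missing_defaults` and the regular expression are assumptions for illustration, not part of the proposal.

```python
import re

# Matches ${NAME} or ${NAME:default}; group 2 is None when no default is given.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def missing_defaults(template_text):
    """Return the names of placeholders that have no default value."""
    return [
        match.group(1)
        for match in _PLACEHOLDER.finditer(template_text)
        if match.group(2) is None
    ]


template = (
    "disabled: ${OTEL_SDK_DISABLED:false}\n"
    "attribute_limits:\n"
    "  attribute_count_limit: ${OTEL_ATTRIBUTE_COUNT_LIMIT:128}\n"
)
# A hypothetical extra line with no fallback, to show what the check reports.
incomplete = template + "  attribute_value_length_limit: ${OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT}\n"

ok_report = missing_defaults(template)
bad_report = missing_defaults(incomplete)
```

A check like this could run when the template is built, so that a delivered template never depends on a variable being set.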
8. About merges
A "merge" is an operation in which a configuration overlay is applied on top of an existing configuration.
A typical use case is when:
- an application is packaged and deployed with a default config.yaml
- the end user wants to make local changes to the default, instead of making a copy of the default configuration provided and making edits to the copy.
Disclaimer:
This section is not prototyped yet, but I think we can do better and have a
syntax closer to yaml, including for expressing merges.
How to deal with arrays is not defined yet. I just wrote this 2 hours
after the configuration SIG meeting on 2024-03-04, bear with me.
The idea of multiple export PATH_TO_YAML_NODE="value" syntax is abandoned.
Instead, a configuration override on top of an existing config.yaml is
expressed in one yaml document.
8.1. Environment variable OTEL_CONFIG_MERGE contains the config to merge.
Having text with \n inside an env var seems doable (to confirm), as in:
OTEL_CONFIG_MERGE=$(
cat <<EOF
foo:
  bar:
    baz:
EOF
);
echo "$OTEL_CONFIG_MERGE";
Alternatively, point to a config_merge.yaml file.
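As a quick way to check the "to confirm" point above, the following Python sketch passes a multi-line value through OTEL_CONFIG_MERGE to a child process and reads it back. The helper name `read_in_child` is hypothetical; the sketch only shows that newlines survive in an environment variable, not how an SDK would consume it.

```python
import os
import subprocess
import sys

# Multi-line yaml text, as it would be stored in OTEL_CONFIG_MERGE.
merge_text = "foo:\n  bar:\n    baz:\n"


def read_in_child(value):
    """Start a child Python process with OTEL_CONFIG_MERGE set to `value`
    and return what the child reads from its environment."""
    env = dict(os.environ, OTEL_CONFIG_MERGE=value)
    child_code = "import os, sys; sys.stdout.write(os.environ['OTEL_CONFIG_MERGE'])"
    completed = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


round_tripped = read_in_child(merge_text)
```

Note that the shell form `$( ... )` strips trailing newlines from the captured text, which is harmless for yaml.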
8.2. The config_merge.yaml document has a different schema, to describe
changes to apply on top of config.yaml
8.3. Top level merge actions
The merge doc contains:
- remove actions, to remove a node and children
- change actions, to replace nodes with new values
- add actions, to add a new node and children
8.4. Remove actions
merge:
  - remove:
      # Remove the parent based sampler from kitchen-sink.yaml
      tracer_provider:
        sampler:
          parent_based:
Here, the full path to the node to remove is given
(tracer_provider.sampler.parent_based).
The parent_based node is removed, because it is a leaf in the remove tree.
Multiple leaf removals can be expressed in the same remove tree (not shown, similar to change below)
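The remove semantics can be sketched in Python on plain nested dicts (what a yaml parser would produce, with an empty yaml node read as `None`). The function names and the choice to silently ignore a path that does not exist are assumptions; arrays are left out, since the proposal does not define them yet.

```python
import copy


def apply_remove(config, remove_tree):
    """Return a copy of `config` without the nodes named by leaves of `remove_tree`."""
    result = copy.deepcopy(config)
    _remove_in_place(result, remove_tree)
    return result


def _remove_in_place(node, remove_tree):
    for key, subtree in remove_tree.items():
        if key not in node:
            continue  # Assumption: removing a missing node is a no-op.
        if isinstance(subtree, dict) and subtree:
            # Intermediate node of the remove tree: keep walking down.
            if isinstance(node[key], dict):
                _remove_in_place(node[key], subtree)
        else:
            # Leaf of the remove tree: drop the node and all its children.
            del node[key]


config = {
    "tracer_provider": {
        "limits": {"attribute_count_limit": 128},
        "sampler": {"parent_based": {"root": {"always_on": {}}}},
    }
}
remove_tree = {"tracer_provider": {"sampler": {"parent_based": None}}}
pruned = apply_remove(config, remove_tree)
```

Only `parent_based` and its children are dropped; `sampler` itself and the sibling `limits` node are kept, and the original `config` is not modified.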
8.5. Change actions
merge:
  - change:
      # increase from 128
      tracer_provider:
        limits:
          attribute_count_limit: 256
  - change:
      # decrease from 128
      tracer_provider:
        limits:
          event_count_limit: 64
This syntax is to be compared with:
- export OTEL_CONFIG_TRACER_PROVIDER__LIMITS__ATTRIBUTE_COUNT_LIMIT="256"
- export OTEL_CONFIG_TRACER_PROVIDER__LIMITS__EVENT_COUNT_LIMIT="64"
The two scalars attribute_count_limit and event_count_limit are changed, because they appear as leaves in the change tree.
The change tree can contain multiple changes, for example:
merge:
  - change:
      tracer_provider:
        limits:
          # increase from 128
          attribute_count_limit: 256
          # decrease from 128
          event_count_limit: 64
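A change action can be sketched the same way on nested dicts. In this Python sketch the function names are hypothetical, and raising `KeyError` when the path does not already exist is an assumption, chosen because a change is meant to set an existing leaf and not to alter the tree structure.

```python
import copy


def apply_change(config, change_tree):
    """Return a copy of `config` with every leaf of `change_tree` assigned."""
    result = copy.deepcopy(config)
    _change_in_place(result, change_tree, path=())
    return result


def _change_in_place(node, change_tree, path):
    for key, subtree in change_tree.items():
        here = path + (key,)
        if not isinstance(node, dict) or key not in node:
            # Assumption: a change never creates nodes, so a missing path is an error.
            raise KeyError(".".join(here))
        if isinstance(subtree, dict):
            # Intermediate node of the change tree: keep walking down.
            _change_in_place(node[key], subtree, here)
        else:
            # Leaf of the change tree: assign the new scalar value.
            node[key] = subtree


config = {
    "tracer_provider": {
        "limits": {"attribute_count_limit": 128, "event_count_limit": 128}
    }
}
change_tree = {
    "tracer_provider": {
        "limits": {"attribute_count_limit": 256, "event_count_limit": 64}
    }
}
changed = apply_change(config, change_tree)
```

Both leaves are updated in one pass, which mirrors the single change tree with multiple changes shown above.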
8.6. Add actions
merge:
  - add:
      # Add a trace_id_ratio_based sampler instead in kitchen-sink.yaml
      tracer_provider:
        sampler:
          trace_id_ratio_based:
            ratio: 0.07
8.7. Complete merge example
merge:
  - remove:
      # Remove the parent based sampler from kitchen-sink.yaml
      tracer_provider:
        sampler:
          parent_based:
  - change:
      # increase from 128
      tracer_provider:
        limits:
          attribute_count_limit: 256
  - change:
      # decrease from 128
      tracer_provider:
        limits:
          event_count_limit: 64
  - add:
      # Add a trace_id_ratio_based sampler instead in kitchen-sink.yaml
      tracer_provider:
        sampler:
          trace_id_ratio_based:
            ratio: 0.07
A couple of points to note:
- The main language is yaml, and the structure follows the schema of config.yaml, which is good news for users (no dual env var syntax to learn)
- This is yaml, so it can have a schema, helping users to produce valid merge docs (unlike env vars)
- Assigning a new value to a scalar is done with the change action, which does not change the tree structure but only sets a leaf node
- Changing a subtree (replacing a parent_based sampler with a trace_id_ratio_based sampler here) is done as a remove + add action
- To apply the merge, all removals are processed first, then changes, then additions (the point being, for example, to remove before add).
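The processing order in the last point can be sketched in Python as follows, working on nested dicts as a yaml parser would produce them. All function names are hypothetical, arrays are not handled, and the error behavior (missing path on change, existing node on add) is an assumption, since the proposal is not prototyped.

```python
import copy


def _remove(node, tree):
    for key, sub in tree.items():
        if key not in node:
            continue  # Assumption: removing a missing node is a no-op.
        if isinstance(sub, dict) and sub:
            if isinstance(node[key], dict):
                _remove(node[key], sub)
        else:
            del node[key]  # Leaf of the remove tree.


def _change(node, tree):
    for key, sub in tree.items():
        if not isinstance(node, dict) or key not in node:
            raise KeyError(key)  # Assumption: change never creates nodes.
        if isinstance(sub, dict):
            _change(node[key], sub)
        else:
            node[key] = sub  # Leaf of the change tree.


def _add(node, tree):
    for key, sub in tree.items():
        if key not in node:
            node[key] = copy.deepcopy(sub)  # New node and children.
        elif isinstance(node[key], dict) and isinstance(sub, dict):
            _add(node[key], sub)
        else:
            raise ValueError(f"node {key!r} already exists")  # Assumption.


def apply_merge(config, merge_doc):
    """Apply all removals, then all changes, then all additions,
    regardless of the order in which the actions are listed."""
    result = copy.deepcopy(config)
    actions = merge_doc.get("merge", [])
    for kind, handler in (("remove", _remove), ("change", _change), ("add", _add)):
        for action in actions:
            if kind in action:
                handler(result, action[kind])
    return result


config = {
    "tracer_provider": {
        "limits": {"attribute_count_limit": 128, "event_count_limit": 128},
        "sampler": {"parent_based": {"root": {"always_on": {}}}},
    }
}
merge_doc = {
    "merge": [
        {"remove": {"tracer_provider": {"sampler": {"parent_based": None}}}},
        {"change": {"tracer_provider": {"limits": {"attribute_count_limit": 256}}}},
        {"change": {"tracer_provider": {"limits": {"event_count_limit": 64}}}},
        {"add": {"tracer_provider": {"sampler": {"trace_id_ratio_based": {"ratio": 0.07}}}}},
    ]
}
merged = apply_merge(config, merge_doc)
```

Because actions are grouped by kind before being applied, listing the add before the remove in the merge document gives the same result.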
When is merge useful?
Assume an existing configuration that uses a scalar value, with no provisions for a substitution variable:
attribute_limits:
  # Hard coded, how to change this ?
  attribute_count_limit: 200
The merge can replace 200 with 300, or better, replace 200 with ${MY_ATTRIBUTE_COUNT_LIMIT:300}, so that the modified configuration file is easier to use.
Another major use case is to replace an entire subtree in the yaml configuration:
- add a new exporter
- define more metrics
- replace an existing sampler with an alternate sampler
- replace existing elements with extension points
8.8. Possible extensions
Several merges can be applied in a chain, using a collection of merge
documents.
For example:
merge:
  - remove:
      xxx:
  - add:
      yyy:
---
merge:
  - remove:
      yyy:
  - add:
      zzz:
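Chaining amounts to folding a list of merge documents over the configuration, each one applied to the result of the previous one. The Python sketch below shows only that fold: `toy_apply_merge` is a deliberately minimal, hypothetical stand-in that handles top-level remove and add actions, just enough for the xxx/yyy/zzz example.

```python
import copy
from functools import reduce


def toy_apply_merge(config, merge_doc):
    """Minimal stand-in for one merge step: top-level removals, then additions."""
    result = copy.deepcopy(config)
    actions = merge_doc.get("merge", [])
    for action in actions:
        for key in action.get("remove", {}):
            result.pop(key, None)
    for action in actions:
        for key, value in action.get("add", {}).items():
            result[key] = copy.deepcopy(value)
    return result


def apply_chain(config, merge_docs, apply_merge=toy_apply_merge):
    """Apply each merge document in order to the output of the previous one."""
    return reduce(apply_merge, merge_docs, config)


# The two documents of the example, as a yaml parser would load them
# (an empty yaml node is read as None).
merge_docs = [
    {"merge": [{"remove": {"xxx": None}}, {"add": {"yyy": None}}]},
    {"merge": [{"remove": {"yyy": None}}, {"add": {"zzz": None}}]},
]
final = apply_chain({"xxx": 1, "other": 2}, merge_docs)
```

The second document removes what the first one added, so only `zzz` and the untouched `other` node remain.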