Commit f2d4a98

aaronc, technicallyty, and aljo242 authored
feat: OpenTelemetry configuration and BaseApp instrumentation (#25516)
Co-authored-by: Tyler <[email protected]>
Co-authored-by: Alex | Cosmos Labs <[email protected]>
1 parent a15ac0b commit f2d4a98

64 files changed: +1645 -180 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -65,3 +65,5 @@ debug_container.log
6565
*.synctex.gz
6666
/x/genutil/config/priv_validator_key.json
6767
/x/genutil/data/priv_validator_state.json
68+
/.envrc
69+
/.env

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -62,6 +62,7 @@ Ref: https://keepachangelog.com/en/1.0.0/
6262
* (crypto/ledger) [#25435](https://github.com/cosmos/cosmos-sdk/pull/25435) Add SetDERConversion to reset skipDERConversion and App name for ledger.
6363
* (gRPC) [#25565](https://github.com/cosmos/cosmos-sdk/pull/25565) Support for multi gRPC query clients serve with historical binaries to serve proper historical state.
6464
* (blockstm) [#25600](https://github.com/cosmos/cosmos-sdk/pull/25600) Allow dynamic retrieval of the coin denomination from multi store at runtime.
65+
* [#25516](https://github.com/cosmos/cosmos-sdk/pull/25516) Support automatic configuration of OpenTelemetry via [OpenTelemetry declarative configuration](https://pkg.go.dev/go.opentelemetry.io/contrib/otelconf) and add OpenTelemetry instrumentation of `BaseApp`.
6566

6667
### Improvements
6768

@@ -95,6 +96,7 @@ Ref: https://keepachangelog.com/en/1.0.0/
9596
* (x/nft) [#24575](https://github.com/cosmos/cosmos-sdk/pull/24575) Deprecate the `x/nft` module in the Cosmos SDK repository. This module will not be maintained to the extent that our core modules will and will be kept in a [legacy repo](https://github.com/cosmos/cosmos-legacy).
9697
* (x/group) [#24571](https://github.com/cosmos/cosmos-sdk/pull/24571) Deprecate the `x/group` module in the Cosmos SDK repository. This module will not be maintained to the extent that our core modules will and will be kept in a [legacy repo](https://github.com/cosmos/cosmos-legacy).
9798
* (types) [#24664](https://github.com/cosmos/cosmos-sdk/pull/24664) Deprecate the `Invariant` type in the Cosmos SDK.
99+
* [#25516](https://github.com/cosmos/cosmos-sdk/pull/25516) Deprecate all existing methods and types in the `telemetry` package, usage of `github.com/hashicorp/go-metrics` and the `telemetry` configuration section. New instrumentation should use the official [OpenTelemetry Go API](https://pkg.go.dev/go.opentelemetry.io/otel) and Cosmos SDK applications can automatically expose OpenTelemetry metrics, traces and logs via [OpenTelemetry declarative configuration](https://pkg.go.dev/go.opentelemetry.io/contrib/otelconf).
98100

99101
## [v0.53.4](https://github.com/cosmos/cosmos-sdk/releases/tag/v0.53.3) - 2025-07-25
100102

UPGRADING.md

Lines changed: 14 additions & 2 deletions
@@ -4,7 +4,7 @@ This document provides a quick reference for the upgrades from `v0.53.x` to `v0.
44

55
Note, always read the **App Wiring Changes** section for more information on application wiring updates.
66

7-
### TLDR
7+
## TLDR
88

99
For a full list of changes, see the [Changelog](https://github.com/cosmos/cosmos-sdk/blob/release/v0.54.x/CHANGELOG.md).
1010

@@ -62,4 +62,16 @@ func (h MyGovHooks) AfterProposalSubmission(ctx context.Context, proposalID uint
6262
func (h MyGovHooks) AfterProposalSubmission(ctx context.Context, proposalID uint64, proposerAddr sdk.AccAddress) error {
6363
// implementation
6464
}
65-
```
65+
```
66+
67+
## Adoption of OpenTelemetry and Deprecation of `github.com/hashicorp/go-metrics`
68+
69+
Existing Cosmos SDK telemetry support is provided by `github.com/hashicorp/go-metrics`, which is under-maintained and supports only metrics instrumentation.
70+
OpenTelemetry provides an integrated solution for metrics, traces, and logging which is widely adopted and actively maintained.
71+
The existing wrapper functions in the `telemetry` package required acquiring mutex locks and map lookups for every metric operation which is sub-optimal. OpenTelemetry's API uses atomic concurrency wherever possible and should introduce less performance overhead during metric collection.
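
To make the locking argument concrete, here is a minimal sketch that uses only the Go standard library. It is neither the `telemetry` package's code nor OpenTelemetry's implementation; `lockedRegistry`, `atomicCounter`, and `hammer` are hypothetical names invented for illustration. It contrasts a counter that takes a mutex and performs a map lookup on every update with a handle that is resolved once and then updated atomically.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// lockedRegistry is a hypothetical stand-in for a name-keyed metrics wrapper:
// every increment acquires a mutex and performs a map lookup.
type lockedRegistry struct {
	mu       sync.Mutex
	counters map[string]int64
}

func newLockedRegistry() *lockedRegistry {
	return &lockedRegistry{counters: make(map[string]int64)}
}

// Incr locks the registry, looks the counter up by name, and adds delta.
func (r *lockedRegistry) Incr(name string, delta int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counters[name] += delta
}

// Value returns the current value of the named counter.
func (r *lockedRegistry) Value(name string) int64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.counters[name]
}

// atomicCounter is a hypothetical pre-resolved instrument handle: the caller
// keeps a reference to it, so an update is a single atomic add.
type atomicCounter struct {
	v atomic.Int64
}

func (c *atomicCounter) Add(delta int64) { c.v.Add(delta) }
func (c *atomicCounter) Value() int64    { return c.v.Load() }

// hammer runs `workers` goroutines that each call incr n times.
func hammer(workers, n int, incr func()) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < n; j++ {
				incr()
			}
		}()
	}
	wg.Wait()
}

func main() {
	reg := newLockedRegistry()
	hammer(8, 1000, func() { reg.Incr("query.count", 1) })

	var c atomicCounter
	hammer(8, 1000, func() { c.Add(1) })

	fmt.Println(reg.Value("query.count"), c.Value()) // prints "8000 8000"
}
```

Both variants arrive at the same total; what differs is the work done per update while goroutines contend. Any actual performance difference depends on the workload, so it is worth benchmarking before drawing conclusions.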
72+
73+
The [README.md](telemetry/README.md) in the `telemetry` package provides more details on usage, but below is a quick summary:
74+
1. Application developers should follow the official [Go OpenTelemetry](https://pkg.go.dev/go.opentelemetry.io/otel) guidelines when instrumenting their applications.
75+
2. Node operators who want to configure OpenTelemetry exporters should set the `OTEL_EXPERIMENTAL_CONFIG_FILE` environment variable to the path of a YAML file that follows the [OpenTelemetry declarative configuration format](https://pkg.go.dev/go.opentelemetry.io/contrib/otelconf). As long as the `telemetry` package has been imported somewhere (it should already be imported if you are using the SDK), OpenTelemetry will be initialized automatically based on the configuration file.
76+
77+
NOTE: The Go implementation of [otelconf](https://pkg.go.dev/go.opentelemetry.io/contrib/otelconf) is still under development, and we will update our usage of it as it matures.
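
For node operators, a small pre-flight check can catch a missing or mistyped configuration path before the node starts. The sketch below is not how the `telemetry` package loads its configuration; `resolveConfigFile` and `errNotConfigured` are hypothetical helpers that only confirm `OTEL_EXPERIMENTAL_CONFIG_FILE` is set and points at a file rather than a directory. It does not parse or validate the YAML itself.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// configEnvVar is the environment variable named in the upgrade notes.
const configEnvVar = "OTEL_EXPERIMENTAL_CONFIG_FILE"

// errNotConfigured is a hypothetical sentinel for "variable not set".
var errNotConfigured = errors.New(configEnvVar + " is not set")

// resolveConfigFile returns the configured path if the variable is set and
// refers to an existing non-directory file. getenv is injected so the
// function can be exercised without touching the real environment.
func resolveConfigFile(getenv func(string) string) (string, error) {
	path := getenv(configEnvVar)
	if path == "" {
		return "", errNotConfigured
	}
	info, err := os.Stat(path)
	if err != nil {
		return "", fmt.Errorf("checking %s=%q: %w", configEnvVar, path, err)
	}
	if info.IsDir() {
		return "", fmt.Errorf("%s=%q is a directory, expected a YAML file", configEnvVar, path)
	}
	return path, nil
}

func main() {
	path, err := resolveConfigFile(os.Getenv)
	switch {
	case errors.Is(err, errNotConfigured):
		fmt.Println("no declarative configuration:", err)
	case err != nil:
		fmt.Println("invalid telemetry configuration:", err)
	default:
		fmt.Println("declarative configuration file found at", path)
	}
}
```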

baseapp/abci.go

Lines changed: 68 additions & 9 deletions
@@ -11,6 +11,8 @@ import (
1111
abci "github.com/cometbft/cometbft/abci/types"
1212
cmtproto "github.com/cometbft/cometbft/proto/tendermint/types"
1313
"github.com/cosmos/gogoproto/proto"
14+
otelattr "go.opentelemetry.io/otel/attribute"
15+
"go.opentelemetry.io/otel/trace"
1416
"google.golang.org/grpc/codes"
1517
grpcstatus "google.golang.org/grpc/status"
1618

@@ -39,6 +41,9 @@ const (
3941
)
4042

4143
func (app *BaseApp) InitChain(req *abci.RequestInitChain) (*abci.ResponseInitChain, error) {
44+
_, span := tracer.Start(context.Background(), "InitChain")
45+
defer span.End()
46+
4247
if req.ChainId != app.chainID {
4348
return nil, fmt.Errorf("invalid chain-id on InitChain; expected: %s, got: %s", app.chainID, req.ChainId)
4449
}
@@ -152,7 +157,12 @@ func (app *BaseApp) Info(_ *abci.RequestInfo) (*abci.ResponseInfo, error) {
152157

153158
// Query implements the ABCI interface. It delegates to CommitMultiStore if it
154159
// implements Queryable.
155-
func (app *BaseApp) Query(_ context.Context, req *abci.RequestQuery) (resp *abci.ResponseQuery, err error) {
160+
func (app *BaseApp) Query(ctx context.Context, req *abci.RequestQuery) (resp *abci.ResponseQuery, err error) {
161+
ctx, span := tracer.Start(ctx, "Query",
162+
trace.WithAttributes(otelattr.String("path", req.Path)),
163+
)
164+
defer span.End()
165+
156166
// add panic recovery for all queries
157167
//
158168
// Ref: https://github.com/cosmos/cosmos-sdk/pull/8039
@@ -167,8 +177,11 @@ func (app *BaseApp) Query(_ context.Context, req *abci.RequestQuery) (resp *abci
167177
req.Height = app.LastBlockHeight()
168178
}
169179

180+
//nolint:staticcheck // TODO: switch to OpenTelemetry
170181
telemetry.IncrCounter(1, "query", "count")
182+
//nolint:staticcheck // TODO: switch to OpenTelemetry
171183
telemetry.IncrCounter(1, "query", req.Path)
184+
//nolint:staticcheck // TODO: switch to OpenTelemetry
172185
defer telemetry.MeasureSince(telemetry.Now(), req.Path)
173186

174187
if req.Path == QueryPathBroadcastTx {
@@ -178,7 +191,7 @@ func (app *BaseApp) Query(_ context.Context, req *abci.RequestQuery) (resp *abci
178191
// handle gRPC routes first rather than calling splitPath because '/' characters
179192
// are used as part of gRPC paths
180193
if grpcHandler := app.grpcQueryRouter.Route(req.Path); grpcHandler != nil {
181-
return app.handleQueryGRPC(grpcHandler, req), nil
194+
return app.handleQueryGRPC(ctx, grpcHandler, req), nil
182195
}
183196

184197
path := SplitABCIQueryPath(req.Path)
@@ -342,6 +355,9 @@ func (app *BaseApp) ApplySnapshotChunk(req *abci.RequestApplySnapshotChunk) (*ab
342355
// will contain relevant error information. Regardless of tx execution outcome,
343356
// the ResponseCheckTx will contain the relevant gas execution context.
344357
func (app *BaseApp) CheckTx(req *abci.RequestCheckTx) (*abci.ResponseCheckTx, error) {
358+
_, span := tracer.Start(context.Background(), "CheckTx", trace.WithAttributes(otelattr.String("ExecMode", req.Type.String())))
359+
defer span.End()
360+
345361
var mode sdk.ExecMode
346362

347363
switch req.Type {
@@ -454,7 +470,19 @@ func (app *BaseApp) PrepareProposal(req *abci.RequestPrepareProposal) (resp *abc
454470
}
455471
}()
456472

457-
resp, err = app.abciHandlers.PrepareProposalHandler(prepareProposalState.Context(), req)
473+
ctx := prepareProposalState.Context()
474+
ctx, span := ctx.StartSpan(
475+
tracer,
476+
"PrepareProposal",
477+
trace.WithAttributes(
478+
otelattr.Int64("height", req.Height),
479+
otelattr.String("timestamp", req.Time.String()),
480+
otelattr.Int("num_txs", len(req.Txs)),
481+
otelattr.String("proposer", sdk.ValAddress(req.ProposerAddress).String()),
482+
),
483+
)
484+
defer span.End()
485+
resp, err = app.abciHandlers.PrepareProposalHandler(ctx, req)
458486
if err != nil {
459487
app.logger.Error("failed to prepare proposal", "height", req.Height, "time", req.Time, "err", err)
460488
return &abci.ResponsePrepareProposal{Txs: req.Txs}, nil
@@ -513,7 +541,20 @@ func (app *BaseApp) ProcessProposal(req *abci.RequestProcessProposal) (resp *abc
513541
}
514542

515543
processProposalState := app.stateManager.GetState(execModeProcessProposal)
516-
processProposalState.SetContext(app.getContextForProposal(processProposalState.Context(), req.Height).
544+
ctx := processProposalState.Context()
545+
ctx, span := ctx.StartSpan(
546+
tracer,
547+
"ProcessProposal",
548+
trace.WithAttributes(
549+
otelattr.Int64("height", req.Height),
550+
otelattr.String("timestamp", req.Time.String()),
551+
otelattr.String("proposer", sdk.ValAddress(req.ProposerAddress).String()),
552+
otelattr.Int("num_txs", len(req.Txs)),
553+
otelattr.String("hash", fmt.Sprintf("%X", req.Hash)),
554+
),
555+
)
556+
defer span.End()
557+
processProposalState.SetContext(app.getContextForProposal(ctx, req.Height).
517558
WithVoteInfos(req.ProposedLastCommit.Votes). // this is a set of votes that are not finalized yet, wait for commit
518559
WithBlockHeight(req.Height).
519560
WithBlockTime(req.Time).
@@ -595,6 +636,9 @@ func (app *BaseApp) ExtendVote(_ context.Context, req *abci.RequestExtendVote) (
595636
return nil, errors.New("application ExtendVote handler not set")
596637
}
597638

639+
ctx, span := ctx.StartSpan(tracer, "ExtendVote")
640+
defer span.End()
641+
598642
// If vote extensions are not enabled, as a safety precaution, we return an
599643
// error.
600644
cp := app.GetConsensusParams(ctx)
@@ -666,6 +710,9 @@ func (app *BaseApp) VerifyVoteExtension(req *abci.RequestVerifyVoteExtension) (r
666710
ctx = sdk.NewContext(ms, emptyHeader, false, app.logger).WithStreamingManager(app.streamingManager)
667711
}
668712

713+
ctx, span := ctx.StartSpan(tracer, "VerifyVoteExtension")
714+
defer span.End()
715+
669716
// If vote extensions are not enabled, as a safety precaution, we return an
670717
// error.
671718
cp := app.GetConsensusParams(ctx)
@@ -716,7 +763,7 @@ func (app *BaseApp) VerifyVoteExtension(req *abci.RequestVerifyVoteExtension) (r
716763
// Execution flow or by the FinalizeBlock ABCI method. The context received is
717764
// only used to handle early cancellation, for anything related to state app.stateManager.GetState(execModeFinalize).Context()
718765
// must be used.
719-
func (app *BaseApp) internalFinalizeBlock(ctx context.Context, req *abci.RequestFinalizeBlock) (*abci.ResponseFinalizeBlock, error) {
766+
func (app *BaseApp) internalFinalizeBlock(goCtx context.Context, req *abci.RequestFinalizeBlock) (*abci.ResponseFinalizeBlock, error) {
720767
var events []abci.Event
721768

722769
if err := app.checkHalt(req.Height, req.Time); err != nil {
@@ -750,9 +797,12 @@ func (app *BaseApp) internalFinalizeBlock(ctx context.Context, req *abci.Request
750797
app.stateManager.SetState(execModeFinalize, app.cms, header, app.logger, app.streamingManager)
751798
finalizeState = app.stateManager.GetState(execModeFinalize)
752799
}
800+
ctx := finalizeState.Context().WithContext(goCtx)
801+
ctx, span := ctx.StartSpan(tracer, "internalFinalizeBlock")
802+
defer span.End()
753803

754804
// Context is now updated with Header information.
755-
finalizeState.SetContext(finalizeState.Context().
805+
finalizeState.SetContext(ctx.
756806
WithBlockHeader(header).
757807
WithHeaderHash(req.Hash).
758808
WithHeaderInfo(coreheader.Info{
@@ -846,7 +896,7 @@ func (app *BaseApp) internalFinalizeBlock(ctx context.Context, req *abci.Request
846896
WithBlockGasUsed(blockGasUsed).
847897
WithBlockGasWanted(blockGasWanted),
848898
)
849-
endBlock, err := app.endBlock(finalizeState.Context())
899+
endBlock, err := app.endBlock()
850900
if err != nil {
851901
return nil, err
852902
}
@@ -959,7 +1009,11 @@ func (app *BaseApp) checkHalt(height int64, time time.Time) error {
9591009
// height.
9601010
func (app *BaseApp) Commit() (*abci.ResponseCommit, error) {
9611011
finalizeState := app.stateManager.GetState(execModeFinalize)
962-
header := finalizeState.Context().BlockHeader()
1012+
ctx := finalizeState.Context()
1013+
ctx, span := ctx.StartSpan(tracer, "Commit")
1014+
defer span.End()
1015+
1016+
header := ctx.BlockHeader()
9631017
retainHeight := app.GetBlockRetentionHeight(header.Height)
9641018

9651019
if app.abciHandlers.Precommiter != nil {
@@ -1005,6 +1059,8 @@ func (app *BaseApp) Commit() (*abci.ResponseCommit, error) {
10051059
// The SnapshotIfApplicable method will create the snapshot by starting the goroutine
10061060
app.snapshotManager.SnapshotIfApplicable(header.Height)
10071061

1062+
blockCounter.Add(ctx, 1)
1063+
10081064
return resp, nil
10091065
}
10101066

@@ -1174,12 +1230,15 @@ func (app *BaseApp) getContextForProposal(ctx sdk.Context, height int64) sdk.Con
11741230
return ctx
11751231
}
11761232

1177-
func (app *BaseApp) handleQueryGRPC(handler GRPCQueryHandler, req *abci.RequestQuery) *abci.ResponseQuery {
1233+
func (app *BaseApp) handleQueryGRPC(goCtx context.Context, handler GRPCQueryHandler, req *abci.RequestQuery) *abci.ResponseQuery {
11781234
ctx, err := app.CreateQueryContext(req.Height, req.Prove)
11791235
if err != nil {
11801236
return sdkerrors.QueryResult(err, app.trace)
11811237
}
11821238

1239+
// add base context for tracing
1240+
ctx = ctx.WithContext(goCtx)
1241+
11831242
resp, err := handler(ctx, req)
11841243
if err != nil {
11851244
resp = sdkerrors.QueryResult(gRPCErrorToSDKError(err), app.trace)
