transport: Add values to the grpc.disconnect_error label for grpc.subchannel.disconnections metric (A94)#8973

Open
mbissa wants to merge 8 commits into grpc:master from mbissa:subchannel-disconnection-unknown-reason

Conversation

Contributor

@mbissa mbissa commented Mar 13, 2026

This PR implements granular grpc.disconnect_error labels for the grpc.subchannel.disconnections metric, as defined in gRFC A94.

RELEASE NOTES:

  • transport: Add disconnection reason to the grpc.disconnect_error label for grpc.subchannel.disconnections metric as defined in gRFC A94.

@mbissa mbissa added the "Type: Feature" (New features or improvements in behavior) and "Area: Observability" (Includes Stats, Tracing, Channelz, Healthz, Binlog, Reflection, Admin, GCP Observability) labels Mar 13, 2026
@mbissa mbissa added this to the 1.81 Release milestone Mar 13, 2026

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 93.75000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.08%. Comparing base (12e91dd) to head (8907a8e).

Files with missing lines Patch % Lines
clientconn.go 92.59% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8973      +/-   ##
==========================================
+ Coverage   83.04%   83.08%   +0.03%     
==========================================
  Files         411      411              
  Lines       32892    32918      +26     
==========================================
+ Hits        27316    27349      +33     
+ Misses       4181     4178       -3     
+ Partials     1395     1391       -4     
Files with missing lines Coverage Δ
internal/transport/http2_client.go 92.66% <100.00%> (-0.38%) ⬇️
clientconn.go 90.49% <92.59%> (-0.55%) ⬇️

... and 25 files with indirect coverage changes

@mbissa mbissa changed the title from "transport: Add values for grpc.disconnect_error label for grpc.subchannel.disconnections metric (A94)" to "transport: Add values to the grpc.disconnect_error label for grpc.subchannel.disconnections metric (A94)" Mar 13, 2026
@mbissa mbissa force-pushed the subchannel-disconnection-unknown-reason branch from dedb4b2 to 06fb986 Compare March 13, 2026 07:42
@mbissa mbissa force-pushed the subchannel-disconnection-unknown-reason branch from 215dc6c to de97023 Compare March 13, 2026 09:36
Contributor Author

mbissa commented Mar 13, 2026

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request successfully implements gRFC A94 by adding more granular grpc.disconnect_error labels to the grpc.subchannel.disconnections metric. The changes are well-implemented, introducing a disconnectError field in addrConn and a disconnectErrorString helper to map various error conditions to the new labels. The transport layer modifications to propagate the necessary error details are correct. The addition of comprehensive end-to-end tests is a great way to ensure the new labels are correctly reported in different disconnection scenarios. I have one minor suggestion to clean up a duplicated comment.
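The pattern summarized in this review (a label stored on the connection and reported when the disconnection is recorded) can be sketched roughly as follows. This is a minimal illustration and not the PR's code: `subConn`, `disconnectCounter`, and their methods are hypothetical stand-ins for `addrConn` and the metrics recorder, and the label strings are example values only.

```go
package main

import (
	"fmt"
	"sync"
)

// disconnectCounter is a hypothetical stand-in for the
// grpc.subchannel.disconnections metric: a counter keyed by the value of
// the grpc.disconnect_error label.
type disconnectCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newDisconnectCounter() *disconnectCounter {
	return &disconnectCounter{counts: make(map[string]int)}
}

// record increments the counter for the given label value.
func (c *disconnectCounter) record(label string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[label]++
}

// get returns how many disconnections were recorded for the label value.
func (c *disconnectCounter) get(label string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[label]
}

// subConn is a hypothetical, heavily simplified stand-in for addrConn.
type subConn struct {
	mu                   sync.Mutex
	disconnectErrorLabel string // reason captured before the transport closes
	counter              *disconnectCounter
}

// setDisconnectError remembers why the current transport is going away.
func (sc *subConn) setDisconnectError(label string) {
	sc.mu.Lock()
	defer sc.mu.Unlock()
	sc.disconnectErrorLabel = label
}

// onDisconnect reports one disconnection and clears the stored reason.
// When no reason was captured, it falls back to "unknown" (an assumed default).
func (sc *subConn) onDisconnect() {
	sc.mu.Lock()
	label := sc.disconnectErrorLabel
	sc.disconnectErrorLabel = ""
	sc.mu.Unlock()

	if label == "" {
		label = "unknown"
	}
	sc.counter.record(label)
}

func main() {
	counter := newDisconnectCounter()
	sc := &subConn{counter: counter}

	sc.setDisconnectError("GOAWAY NO_ERROR") // example label value
	sc.onDisconnect()
	sc.onDisconnect() // no reason captured this time

	fmt.Println(counter.get("GOAWAY NO_ERROR"), counter.get("unknown"))
}
```

Clearing the stored reason after each report keeps a stale label from being attributed to a later, unrelated disconnection.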

@mbissa mbissa requested a review from easwars March 13, 2026 10:41
@easwars easwars self-assigned this Mar 16, 2026
Comment on lines +1422 to +1424
t.goAwayDebugMessage = fmt.Sprintf("code: %s", f.ErrCode)
if len(f.DebugData()) > 0 {
t.goAwayDebugMessage += fmt.Sprintf(", debug data: %q", string(f.DebugData()))
Contributor

Why do we need this change?

Contributor Author

Not needed. Reverted this part.

clientconn.go Outdated

localityLabel string
backendServiceLabel string
disconnectError string
Contributor

Nit: Can we add a Label suffix to stay consistent with the existing field names?

Contributor Author

Done.

clientconn.go Outdated
}

func disconnectErrorString(r transport.GoAwayReason, goAwayCode http2.ErrCode, err error) string {
if r != transport.GoAwayInvalid {
Contributor

Can we use a switch statement here? See: https://go.dev/doc/effective_go#switch

Contributor Author

Done.

@easwars easwars assigned mbissa and unassigned easwars Mar 16, 2026
@mbissa mbissa force-pushed the subchannel-disconnection-unknown-reason branch from 1200625 to 5584fb4 Compare March 17, 2026 06:47
@mbissa mbissa force-pushed the subchannel-disconnection-unknown-reason branch from f219fe9 to 88cd619 Compare March 17, 2026 07:32
Contributor Author

mbissa commented Mar 17, 2026

Addressed the comments. One test flaked once because of the timing of how the connection was closed, so I changed the test to be more deterministic. The master branch also had new tests that were now failing, so I made a couple of minor changes for those as well.

@mbissa mbissa assigned easwars and unassigned mbissa Mar 17, 2026