
Conversation

@timvaillancourt
Contributor

@timvaillancourt timvaillancourt commented Dec 24, 2025

Description

This PR disables gRPC's too-many-pings protection when running e2e tests with test.go. This is achieved by setting GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA to 0.

Context from docs:

GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA
This channel argument controls the maximum number of pings that can be sent when there is no data/header frame to be sent. gRPC Core will not continue sending pings if we run over the limit. Setting it to 0 allows sending pings without such a restriction. (Note that this is an unfortunate setting that does not agree with A8-client-side-keepalive.md. There should ideally be no such restriction on the keepalive ping and we plan to deprecate it in the future.)

and

Why am I receiving a GOAWAY with error code ENHANCE_YOUR_CALM?
A server sends a GOAWAY with ENHANCE_YOUR_CALM if the client sends too many misbehaving pings as described in A8-client-side-keepalive.md. Some scenarios where this can happen are:

  • if a server has GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS set to false while the client has set this to true, resulting in keepalive pings being sent even when there is no call in flight.
  • if the client's GRPC_ARG_KEEPALIVE_TIME_MS setting is lower than the server's GRPC_ARG_HTTP2_MIN_RECV_PING_INTERVAL_WITHOUT_DATA_MS.

Today, many e2e tests connect/reconnect to the gRPC server too quickly and hit the GOAWAY/too_many_pings error. When this occurs, the affected test has to wait out the GRPC_ARG_HTTP2_MIN_RECV_PING_INTERVAL_WITHOUT_DATA_MS default of 300000 milliseconds (5 minutes) doing nothing 😱. This protection makes sense in a normal deployment, but in an e2e test where everything runs on localhost we don't really care about it.

When a test hits this, we see the following error and the test pauses for 5 minutes:

E1218 18:18:06.674435  32993 component.go:44] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
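
To check whether a test run was affected, its logs can be scanned for this message. The Go sketch below is one minimal way to do that, not part of this PR: the helper names `isTooManyPings` and `countTooManyPings` are hypothetical, and matching on the two substrings is an assumption about what identifies the error.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// isTooManyPings reports whether a log line looks like the GOAWAY error
// quoted above. Matching on both substrings is an assumption of this sketch.
func isTooManyPings(line string) bool {
	return strings.Contains(line, "ENHANCE_YOUR_CALM") &&
		strings.Contains(line, "too_many_pings")
}

// countTooManyPings scans log text line by line and counts matching lines.
func countTooManyPings(logs string) int {
	n := 0
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		if isTooManyPings(sc.Text()) {
			n++
		}
	}
	return n
}

func main() {
	// The first line is a placeholder; the second is the error from the
	// PR description.
	logs := "placeholder line that is not an error\n" +
		`E1218 18:18:06.674435  32993 component.go:44] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".`
	fmt.Println(countTooManyPings(logs))
}
```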

Docs: https://github.com/grpc/grpc/blob/master/doc/keepalive.md

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Deployment Notes

AI Disclosure

@github-actions github-actions bot added this to the v24.0.0 milestone Dec 24, 2025
@timvaillancourt timvaillancourt self-assigned this Dec 24, 2025
@vitess-bot vitess-bot bot added the NeedsWebsiteDocsUpdate, NeedsDescriptionUpdate, NeedsIssue, and NeedsBackportReason labels Dec 24, 2025
@vitess-bot
Contributor

vitess-bot bot commented Dec 24, 2025

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes), new features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test, enhancement and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

Signed-off-by: Tim Vaillancourt <[email protected]>
@timvaillancourt timvaillancourt removed the NeedsDescriptionUpdate, NeedsWebsiteDocsUpdate, NeedsIssue, and NeedsBackportReason labels Dec 24, 2025
@timvaillancourt timvaillancourt changed the title from "e2e: disable GOAWAY/too_many_pings errors in e2e CI" to "e2e: disable gRPC GOAWAY/too_many_pings keepalive errors in e2e CI" Dec 24, 2025
@timvaillancourt timvaillancourt marked this pull request as ready for review December 24, 2025 16:07
@timvaillancourt timvaillancourt enabled auto-merge (squash) December 24, 2025 16:14
timvaillancourt and others added 2 commits December 24, 2025 17:25
Co-authored-by: Mohamed Hamza <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
@codecov

codecov bot commented Dec 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.90%. Comparing base (7a3acd5) to head (5a41620).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #19083   +/-   ##
=======================================
  Coverage   69.90%   69.90%           
=======================================
  Files        1612     1612           
  Lines      215817   215789   -28     
=======================================
- Hits       150865   150849   -16     
+ Misses      64952    64940   -12     
