Skip to content

Better waiter errors with NIOTS #1775

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 25, 2024
Merged

Conversation

glbrntt
Copy link
Collaborator

@glbrntt glbrntt commented Jan 25, 2024

Motivation:

When establishing a connection using NIOTS, Network.framework may enter a 'waiting' state because of a transient error. This is surfaced as a user inbound event.

In many cases the connect call won't resolve until the connect timeout passes (defaults to 20 seconds). Attempts to start a call on this connection during this time will fail with a timeout at the connection pool layer (where the max wait time defaults to 5 seconds) and users will see a 'deadline exceeded' error without any more context. However, we can provide more information by passing up the transient error from Network.framework to the connection pool.

This isn't currently possible as events flow up on state changes, in this case from connecting to transient failure when the connect times out.

Modifications:

  • Catch the WaitingForConnectivity event and bubble up its error though the idle handler to the connection pool

Result:

Better errors on connect timeouts in some situations

Motivation:

When establishing a connection using NIOTS, Network.framework may enter
a 'waiting' state because of a transient error. This is surfaced as a
user inbound event.

In many cases the connect call won't resolve until the connect timeout
passes (defaults to 20 seconds). Attempts to start a call on this
connection during this time will fail with a timeout at the connection
pool layer (where the max wait time defaults to 5 seconds) and users
will see a 'deadline exceeded' error without any more context. However,
we can provide more information by passing up the transient error from
Network.framework to the connection pool.

This isn't currently possible as events flow up on state changes, in
this case from connecting to transient failure when the connect times
out.

Modifications:

- Catch the `WaitingForConnectivity` event and bubble up its error
  though the idle handler to the connection pool

Result:

Better errors on connect timeouts in some situations
@glbrntt glbrntt added the 🔨 semver/patch No public API change. label Jan 25, 2024
@glbrntt glbrntt requested a review from gjcairo January 25, 2024 13:37
Copy link
Collaborator

@gjcairo gjcairo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one nit

@glbrntt glbrntt enabled auto-merge (squash) January 25, 2024 14:55
@glbrntt glbrntt merged commit be8e15d into grpc:main Jan 25, 2024
@glbrntt glbrntt deleted the nw-waits-for-connectivity branch January 26, 2024 10:59
glbrntt added a commit to glbrntt/grpc-swift that referenced this pull request Feb 5, 2024
Motivation:

When establishing a connection using NIOTS, Network.framework may enter
a 'waiting' state because of a transient error. This is surfaced as a
user inbound event.

In many cases the connect call won't resolve until the connect timeout
passes (defaults to 20 seconds). Attempts to start a call on this
connection during this time will fail with a timeout at the connection
pool layer (where the max wait time defaults to 5 seconds) and users
will see a 'deadline exceeded' error without any more context. However,
we can provide more information by passing up the transient error from
Network.framework to the connection pool.

This isn't currently possible as events flow up on state changes, in
this case from connecting to transient failure when the connect times
out.

Modifications:

- Catch the `WaitingForConnectivity` event and bubble up its error
  though the idle handler to the connection pool

Result:

Better errors on connect timeouts in some situations
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔨 semver/patch No public API change.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants