refactor(s2n-quic-transport): update PTO timer once per transmission burst #1884
Resolved issues:
resolves #1839
Description of changes:
The probe timeout (PTO) timer is restarted each time an ack-eliciting packet is sent, as its main function is to trigger probes when tail packets are not ACKed in time.
Previously, we updated this timer for every packet sent, even when several packets were sent in one burst through aggregation mechanisms like GSO. This overhead showed up in flamegraphs, accounting for 0.1% to 1% of CPU.
With this change, we wait to update the PTO timer until the whole burst of packets has been transmitted. This doesn't change the actual timing of the PTO timer, as the timestamp used for each packet within the burst is identical.
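To illustrate the idea, here is a minimal sketch of deferring the timer update to the end of a burst. The `Recovery` type, its fields, and `finish_burst` are hypothetical names for this example and are not the s2n-quic implementation; the fixed `pto_period` is also a simplifying assumption (the real period is derived from RTT estimates).

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a recovery manager (not the s2n-quic type).
struct Recovery {
    /// Assumed fixed for this sketch; in practice it is derived from RTT.
    pto_period: Duration,
    /// When the PTO timer is currently set to fire, if armed.
    pto_deadline: Option<Instant>,
    /// Send time of the latest ack-eliciting packet in the current burst.
    pending_ack_eliciting: Option<Instant>,
    /// Counts how often the timer was re-armed, to show the saving.
    timer_updates: u32,
}

impl Recovery {
    fn new(pto_period: Duration) -> Self {
        Self {
            pto_period,
            pto_deadline: None,
            pending_ack_eliciting: None,
            timer_updates: 0,
        }
    }

    /// Called for every packet: only records state, never touches the timer.
    fn on_packet_sent(&mut self, now: Instant, ack_eliciting: bool) {
        if ack_eliciting {
            self.pending_ack_eliciting = Some(now);
        }
    }

    /// Called once after the burst: re-arms the timer at most one time.
    fn finish_burst(&mut self) {
        if let Some(sent) = self.pending_ack_eliciting.take() {
            self.pto_deadline = Some(sent + self.pto_period);
            self.timer_updates += 1;
        }
    }
}

fn main() {
    let now = Instant::now();
    let mut recovery = Recovery::new(Duration::from_millis(300));

    // Ten packets in one burst all share the same timestamp.
    for _ in 0..10 {
        recovery.on_packet_sent(now, true);
    }
    recovery.finish_burst();

    // One timer update instead of ten, with the same resulting deadline.
    assert_eq!(recovery.timer_updates, 1);
    assert_eq!(recovery.pto_deadline, Some(now + Duration::from_millis(300)));
}
```

Because every packet in the burst carries the same timestamp, arming the timer once at the end yields the same deadline as arming it after each packet.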
Call-outs:
I cleaned up some code in `on_packet_sent` and `update_pto_timer` that was checking `is_congestion_controlled` in addition to ack eliciting. Currently, every frame except for `Ack` and `Padding` is congestion controlled, and every frame except for `Ack`, `Padding`, and `ConnectionClose` is ack eliciting, so it shouldn't make a difference not checking for `is_congestion_controlled`. I believe we were checking it previously due to how the code was structured way back when the Recovery Manager was first written.
There is a slight change in behavior when we have both an active path and one or more paths that are pending path validation (for connection migration reasons). The non-active paths have their packets transmitted at the end of `on_transmit`, and since the PTO timer was updated for each individual packet, it would get updated based on the RTT of the non-active path. Now, I am deliberately only updating the PTO based on the active path's RTT, as this RTT should be more accurate as many more packets have been transmitted over the active path.
Testing:
Updated/added unit tests
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.