Skip to content

Conversation

@WesleyRosenblum
Copy link
Contributor

@WesleyRosenblum WesleyRosenblum commented Jul 19, 2023

Resolved issues:

resolves #1839

Description of changes:

The packet time out (PTO) timer is restarted each time an ack-eliciting packet is sent, as its main function is to trigger probes when tail packets are not ACKed in time:

//= https://www.rfc-editor.org/rfc/rfc9002#section-6.2.1
//# A sender SHOULD restart its PTO timer every time an ack-eliciting
//# packet is sent or acknowledged,

Previously we were updating this timer on each packet being sent, even if we send several packets in one burst due to aggregation mechanisms like GSO. This was showing up in Flamegraphs, from .1% to 1% of CPU.

With this change we will wait to update the pto timer until the burst of packets has been transmitted. This doesn't change the actual timing of the PTO timer as the timestamp used for each packet within the burst is identical.

Call-outs:

I cleaned up some code in on_packet_sent and update_pto_timer that was checking is_congestion_controlled in addition to ack eliciting. Currently, every frame except for Ack and Padding is congestion controlled, and every frame except for Ack, Padding, and ConnectionClose is ack eliciting, so it shouldn't make a difference not checking for is_congestion_controlled. I believe we were checking it previously due to how the code was structured way back when the Recovery Manager was first written.

There is a slight change in behavior when we have both an active path and one or more paths that are pending path validation (for connection migration reasons). The non-active paths have their packets transmitted at the end of on_transmit, and since the PTO timer was updated for each individual packet, it would get updated based on the RTT of the non-active path. Now, I am deliberately only updating the PTO based on the active path's RTT, as this RTT should be more accurate as many more packets have been transmitted over the active path.

Testing:

Updated/added unit tests

Is this a refactor change? If so, how have you proved that the intended behavior hasn't changed? -->

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

camshaft
camshaft previously approved these changes Jul 25, 2023
@WesleyRosenblum WesleyRosenblum merged commit 2ff55c7 into main Jul 26, 2023
@WesleyRosenblum WesleyRosenblum deleted the WesleyRosenblum/ptotimerburst branch July 26, 2023 00:31
kagarmoe pushed a commit that referenced this pull request Jul 27, 2023
…burst (#1884)

* refactor(s2n-quic-transport): update PTO timer once per transmission burst

* remove congestion_controlled check from ack_eliciting_packets_in_flight

* fix formatting

* Add debug assert

* correct comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update the PTO timer once after a transmission round

2 participants