-
Notifications
You must be signed in to change notification settings - Fork 870
Add CborReaderProxy to support process CBOR stream in chunks #3939
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CborReaderProxy to support process CBOR stream in chunks #3939
Conversation
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborReaderProxy.cs
Outdated
Show resolved
Hide resolved
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborReaderProxy.cs
Outdated
Show resolved
Hide resolved
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborReaderProxy.cs
Outdated
Show resolved
Hide resolved
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborReaderProxy.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a streaming CBOR reader to reduce memory usage when processing large payloads. Instead of loading entire response streams into memory before unmarshalling, it introduces CborReaderProxy
that processes CBOR data in chunks.
- Adds
CborReaderProxy
to wrapSystem.Formats.Cbor.CborReader
with streaming capabilities - Implements custom container handling for maps/arrays that span buffer boundaries
- Adds configuration for initial buffer size through
AWSConfigs.CborReaderInitialBufferSize
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
sdk/src/Core/Amazon.Util/Internal/RootConfig.cs |
Adds CborReaderInitialBufferSize configuration property |
sdk/src/Core/AWSConfigs.cs |
Implements configuration support and helper methods for CBOR buffer size |
extensions/test/CborProtocol.Tests/CborReaderProxyTests.cs |
Comprehensive test suite for the new streaming CBOR reader |
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/Transform/CborUnmarshallerContext.cs |
Updates context to use CborReaderProxy instead of loading full stream |
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborReaderProxy.cs |
Core implementation of streaming CBOR reader with buffer management |
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborReaderProxy.cs
Outdated
Show resolved
Hide resolved
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborReaderProxy.cs
Outdated
Show resolved
Hide resolved
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborReaderProxy.cs
Outdated
Show resolved
Hide resolved
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborReaderProxy.cs
Outdated
Show resolved
Hide resolved
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborReaderProxy.cs
Outdated
Show resolved
Hide resolved
6a80340
to
2c0ea9a
Compare
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborStreamReader.cs
Outdated
Show resolved
Hide resolved
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborStreamReader.cs
Outdated
Show resolved
Hide resolved
extensions/src/AWSSDK.Extensions.CborProtocol/Internal/CborStreamReader.cs
Outdated
Show resolved
Hide resolved
0208456
to
6da5cab
Compare
private CborReader _internalCborReader; | ||
private int _currentChunkSize; | ||
|
||
public CborStreamReader(Stream stream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this class is missing documentation
|
||
namespace Amazon.Extensions.CborProtocol.Internal | ||
{ | ||
public class CborStreamReader : IDisposable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class is missing documentation
_stream = stream ?? throw new ArgumentNullException(nameof(stream)); | ||
_buffer = ArrayPool<byte>.Shared.Rent(AWSConfigs.CborReaderInitialBufferSize); | ||
|
||
_currentChunkSize = _stream.Read(_buffer, 0, _buffer.Length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the intention is to fill the whole buffer or fill until the stream has no data left, this isn't guaranteed to happen with one Stream.read
call, since the docs state that it could be less than the bytes you requested if the stream doesn't have that data available yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
catch (CborContentException ex) | ||
{ | ||
if (_currentChunkSize == 0 && _internalCborReader.BytesRemaining == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious how we would be in a CborContentException
is the _currentChunkSize == 0 and the BytesRemaining == 0? Wouldn't this indicate that we successfully read everything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and after reading everything the code that uses the CborStreamReader
might try to read more data which means there is an issue with that code, at this case CborReader
will throw this exception and we will throw it back for the other code to handle or fail.
leftoverBytesCount = leftoverBytesCount - bytesToSkip; | ||
|
||
// If the leftover data completely fills the buffer, grow it | ||
if (leftoverBytesCount >= _buffer.Length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wondering.. is there a reason you used >= here? Isn't it impossible for the leftoverBytesCount to be greather than the buffer's length, since the reader will be instantiated with at most the data in the buffer? or am I missing something? Have you seen this case happen before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It didn't happen while testing, but just wanted to be safe incase we had an off by one error or some, at this case I would prefer expanding the buffer rather than go to the next case and lose data when copying to the start of the buffer.
6da5cab
to
8b50922
Compare
Description
The current code loads the whole response stream in memory before start the unmarshalling process. This works for small payloads but may cause issues for large response payloads.
The
System.Formats.Cbor.CborReader
doesn't accept stream and requires in memory byte array for deserialization.This PR adds
CborReaderProxy
that wrapsSystem.Formats.Cbor.CborReader
to support processing the stream in chunks without ever holding the whole payload in memory, feeding the chunks intoCborReader
and process them one by one.The default
CborReader.ReadEndMap()
/CborReader.ReadEndArray()
methods assume the full map is already present in the input buffer and cannot continue reading if the buffer ends in the middle of a container.This means the reader will fail to advance if the next token is a container break (0xFF) but the start wasn't read into the current buffer (it was read in a previous part of the stream).
To support maps/arrays across multiple refills I did the following:
ReadStartMap
,ReadEndMap
,ReadStartArray
,ReadEndArray
.0xFF
as needed._nestingStack
to track when reading a nested CBOR structure (e.g., an array of maps), and correctly match calls toReadStartMap
withReadEndMap
, andReadStartArray
withReadEndArray
.Motivation and Context
#DOTNET-8252
Testing
CborReaderProxyTests.cs
to cover the new changes.CloudWatch
andSecretsManager
operations with CBOR.DRY_RUN-4ff92fab-9e40-46cc-88d0-746023de37cf
.Screenshots (if appropriate)
Types of changes
Checklist
License