Eliminate HEAD requests during downloads, especially for faster transfers of small files #363

crowecawcaw · 2025-11-21T21:57:54Z

This PR optimizes downloads through the TransferManager by removing the upfront HEAD request. Previously, every download issued a HEAD request to determine object size before starting the GET request. This change eliminates that extra round-trip by extracting metadata from the first GET response instead.

For small files, download time is dominated by request latency, so eliminating one of the two requests cuts download time by roughly 50%. For large files, the effect is less noticeable because download time is dominated by transfer time and because there are multiple chunks to download. In both cases, we save the cost of the HEAD request.
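To make the reasoning above concrete, here is a rough back-of-the-envelope model. It is only a sketch: it assumes each request costs one fixed round-trip latency, that the body transfers at a constant bandwidth over a single stream, and the default latency and bandwidth values are illustrative, not measurements from this PR.

```python
def download_time(size_bytes, latency_s, bandwidth_bytes_per_s, requests):
    """Simplified model: fixed per-request latency plus body transfer time."""
    return requests * latency_s + size_bytes / bandwidth_bytes_per_s


def head_savings(size_bytes, latency_s=0.015, bandwidth_bytes_per_s=12.5e6):
    """Fraction of total download time saved by dropping the HEAD request.

    The defaults (15 ms latency, 12.5 MB/s) are hypothetical values
    chosen only for illustration.
    """
    before = download_time(size_bytes, latency_s, bandwidth_bytes_per_s, requests=2)
    after = download_time(size_bytes, latency_s, bandwidth_bytes_per_s, requests=1)
    return (before - after) / before


small = head_savings(1_000)          # 1 kB object: latency dominates
large = head_savings(1_000_000_000)  # 1 GB object: transfer time dominates
```

Under these assumptions the saving approaches one half for tiny objects and becomes negligible for very large ones, which matches the qualitative claim above. Real large-object downloads use parallel chunks, so this model only approximates them.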

What Changed

• Removed HEAD requests: Downloads now start immediately with a ranged GET request for the first chunk
• Dynamic size detection: Extract object size and ETag from the first GET response headers (ContentRange or ContentLength)
• Dynamic chunk scheduling: After the first chunk completes, schedule additional chunks only if the object is larger than the chunk size
• Simplified code flow: Consolidated download logic into a single path instead of branching on size upfront
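The size detection and chunk scheduling steps listed above can be sketched roughly as follows. This is not the code from this PR: the function names `total_size_from_headers` and `remaining_ranges` are hypothetical, and the 8 MiB chunk size is assumed from the flow diagrams in this description.

```python
import re

CHUNK_SIZE = 8 * 1024 * 1024  # assumed chunk size, taken from the flow diagrams


def total_size_from_headers(content_range, content_length):
    """Return the full object size given the first GET response's headers.

    A ranged response carries "bytes <start>-<end>/<total>" in Content-Range.
    Without Content-Range, the whole object was returned, so Content-Length
    is already the total size.
    """
    if content_range is None:
        return int(content_length)
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", content_range.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {content_range!r}")
    return int(match.group(3))


def remaining_ranges(total_size, chunk_size=CHUNK_SIZE):
    """Inclusive (start, end) byte ranges still to fetch after the first chunk."""
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(chunk_size, total_size, chunk_size)
    ]


size = total_size_from_headers("bytes 0-8388607/20000000", "8388608")
ranges = remaining_ranges(size)
```

An object that fits within the first chunk yields an empty list, so no further requests are scheduled. Larger objects yield one range per remaining chunk, each of which could be fetched in parallel.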

Testing

Unit, functional, and integration tests pass. I also added a new script to benchmark downloading many small files. For downloading 1000 1 kB files on my laptop, the total duration dropped 41%, from 15.0s to 8.9s.

Backward Compatibility

External API unchanged. All download methods have the same signatures.

Flow Diagrams

Before (with HEAD request)

flowchart TD
    A[HEAD Request] --> C{Size < 8MB?}
    C -->|Yes| D[GET Request]
    C -->|No| E[Multiple GET Requests]
    D --> F[Complete]
    E --> F

After (no HEAD request)

flowchart TD
    A[GET First Chunk] --> B{Size < 8MB?}
    B -->|Yes| C[Complete]
    B -->|No| D[GET Remaining Chunks]
    D --> C

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

crowecawcaw added 2 commits November 21, 2025 10:31

Avoid HEAD requests for downloads

68b0e69

Add benchmarking script for small files

c138cd0

crowecawcaw mentioned this pull request Dec 10, 2025

perf: avoid S3 HEAD requests when downloading job outputs aws-deadline/deadline-cloud#924

Closed
