Skip to content

Conversation

egibs
Copy link
Member

@egibs egibs commented May 29, 2025

In certain cases, we were encountering partial reads which caused interesting scan results. This PR handles those edge cases and fixes partial reads in our optimized loops.

I ran several local scans with these changes and they address what I was seeing.

@egibs egibs requested a review from eslerm May 29, 2025 23:18
eslerm
eslerm previously approved these changes May 29, 2025
Copy link

@eslerm eslerm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for being thorough and applying to all archive types.

@stevebeattie
Copy link
Member

Confirmed that malcontent with this fix addresses the 'fake sections header' issue and also the failure to extract archives, as seen in wolfi-dev/os#54633 .

@stevebeattie
Copy link
Member

The last commit here causes a failure when a zip archive (and possibly others) contains the same file both compressed and uncompressed. A simple zip file as attached demonstrates:

$ unzip -t /tmp/zipped.and.unzipped.zip 
Archive:  /tmp/zipped.and.unzipped.zip
    testing: zipped.and.unzipped/     OK
    testing: zipped.and.unzipped/test.css.gz   OK
    testing: zipped.and.unzipped/test.css   OK
No errors detected in compressed data of /tmp/zipped.and.unzipped.zip.

$ out/mal --min-risk=critical --min-file-risk=critical --quantity-increases-risk=true --verbose scan /tmp/zipped.and.unzipped.zip
🔎 Scanning "/tmp/zipped.and.unzipped.zip"
time=2025-05-29T17:57:27.142-07:00 level=DEBUG source=$HOME/git/chainguard-dev/malcontent/pkg/archive/archive.go:132 msg="creating temp dir" path=/tmp/zipped.and.unzipped.zip
time=2025-05-29T17:57:27.142-07:00 level=DEBUG source=$HOME/git/chainguard-dev/malcontent/pkg/archive/zip.go:39 msg="extracting zip" dir=$HOME/tmp/zipped.and.unzipped.zip1260100870 file=/tmp/zipped.and.unzipped.zip
time=2025-05-29T17:57:27.143-07:00 level=ERROR source=$HOME/git/chainguard-dev/malcontent/pkg/action/scan.go:506 msg="unable to process /tmp/zipped.and.unzipped.zip: extract to temp: failed to walk directory: failed to create extraction directory: mkdir $HOME/tmp/zipped.and.unzipped.zip1260100870/zipped.and.unzipped/test.css: not a directory"
💣 scan: extract to temp: failed to walk directory: failed to create extraction directory: mkdir $HOME/tmp/zipped.and.unzipped.zip1260100870/zipped.and.unzipped/test.css: not a directory

zipped.and.unzipped.zip

@stevebeattie stevebeattie dismissed eslerm’s stale review May 30, 2025 01:01

Further updates need reviewing.

@egibs
Copy link
Member Author

egibs commented May 30, 2025

The last commit here causes a failure when a zip archive (and possibly others) contains the same file both compressed and uncompressed. A simple zip file as attached demonstrates:

zipped.and.unzipped.zip

Thanks for pulling these out. I added a new test case that checks for this case.

Copy link
Member

@stevebeattie stevebeattie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the colliding file names issue. LGTM.

@stevebeattie stevebeattie merged commit 1971699 into chainguard-dev:main May 30, 2025
12 checks passed
@egibs egibs deleted the fix-partial-reads branch June 25, 2025 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants