-
Notifications
You must be signed in to change notification settings - Fork 52
Prevent path explosion in file traversal with complex symlinks #354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Rez Moss <[email protected]>
Signed-off-by: Rez Moss <[email protected]>
|
Thanks for the contribution @rezmoss! Is there a way to add a test illustrating the problem on a small scale and how this change fixes it? |
|
@kzantow Sure! I'll work on that to reproduce the issue with a new Docker image and share it here soon |
|
@kzantow Building a similar structure to a real docker image was really hard, but I finally built a dockerfile that reproduces almost the same behavior, my real docker image would get stuck for hours, but this one took 26 mins and 35 secs on my machine After the patch, the same process took just 1 min and 32 secs here’s the dockerfile you can build and test |
|
Hey @kzantow just following up on this, have you had a chance to look at the Dockerfile I shared? Let me know if you need any more details |
|
hey @kzantow just checking in again to see if you had a chance to look at the dockerfile i shared, it shows the path explosion issue and how the patch really helps with performance |
|
Sorry this got lost in the noise @rezmoss 🤦 I'll give this a review tomorrow! 👍 |
|
FYI @rezmoss -- I've been testing this manually and can confirm it seems to solve issues with the image you provided as well as the image here: anchore/syft#3928 (comment). I'm actively trying to get this merged, including some sort of meaningful test. It's just a bit of a challenging thing to test in a unit test scenario, as you are probably aware. Thanks very much for your work on this, it's not a trivial change at all! 😁 |
|
@kzantow thanks for testing it out and confirming, let me know if anything else is needed to move it forward ;) |
|
Hey @rezmoss sorry for the delay here. I found a set of cases that this change wasn't accounting for and couldn't account for without restoring some of the prior behavior of walking up the tree, which ended up undoing a lot of the good this did in the first place -- as you said there was a compounding of issues causing this explosion. I spent some time trying to adjust how it works to avoid the performance issues while not missing potential paths and I realized there was a different fundamental change that should happen - instead of returning all the potential paths, we really only need the first matching one from an indexed item -- we're returning "real" files here, as long as they match a requested glob. I ended up implementing an alternate change here: #411. However your work on identifying the problem and providing a way to reproduce it was invaluable to get it fixed, so I wanted to express my deepest gratitude for your help here, despite not accepting this PR directly. Thank you! |
|
thanks @kzantow totally get it and just glad the root issue’s sorted, really appreciate the thoughtful followup and kind words, happy could help in any way ;) |


Problem
When processing Node.js packages with complex symlink structures (particularly .pnpm or turf packages), the
_pathsToNodefunction would enter an exponential path growth pattern:Root Cause
The issue stems from how parent paths are processed. The combination of:
p.ConstituentPaths()creates a compounding effect where each level of symlink traversal multiplies the paths to process
Solution
The fix modifies
_pathsToNodeto:This prevents the path explosion while preserving the expected file traversal behavior