-
Notifications
You must be signed in to change notification settings - Fork 52
fix: complex symlink performance issue #411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Keith Zantow <[email protected]>
Benchmark Test ResultsBenchmark results from the latest changes vs base branch |
Signed-off-by: Keith Zantow <[email protected]>
Signed-off-by: Keith Zantow <[email protected]>
Signed-off-by: Keith Zantow <[email protected]>
Signed-off-by: Keith Zantow <[email protected]>
| } | ||
|
|
||
| refs, err := sc.referencesInTree(fileEntries) | ||
| refs, err := sc.firstMatchingReferences("**/*", fileEntries) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem with the alternate change -- it was returning too many results for different layers. I'm happy to figure that out, but this change also solves the performance issue, albeit with some unnecessary glob matching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's alright, the string glob match is fairly cheap when compared to other operations happening here
Signed-off-by: Keith Zantow <[email protected]>
This PR adjust how symlinks are matched against glob patterns such that we will now only return the first matching path for an index entry. This often requires no symlink resolution at all, for example when glob patterns match the real path. This solves a problem that is beginning to be apparent with complex webs of symlinks such as large pnpm projects and snaps that would result in an explosion of paths returned to be tested against a glob pattern causing excessive memory usage and processing time, which effectively would cause syft to hang.
A HUGE THANK YOU to @rezmoss for his work to identify this issue and provide a potential fix! You have my deepest gratitude, sir! ❤️