Description
- related: GSoC 2022 Ideas / Brainstorming thread #1379
- related: GSoC 2022: Start Here #1462
CVE Binary Tool was originally intended to work with compiled languages and binary files, but we've expanded it to scan lists of known components in a few different formats. Recently, @anthonyharrison improved our support for .jar files by reading the metadata from those files, and @BreadGenie did earlier work to support listings from Linux package repositories. We'd like to do the same for other popular package repositories: Ruby gems, npm, improved Python support, etc.
This project will probably involve doing a few things:
- adding parsers to read package data of various types
- potentially adding mapping databases to translate package names to {vendor, product} pairs (see Improve product vendor matching for component list scanning #1504)
- exploring ways to call (or prompt the users to call) language-specific tools to get additional vulnerability/security data
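To make the "adding parsers" item concrete, here is a minimal sketch of what one such parser could look like for npm's `package-lock.json`. The function name `parse_package_lock` is hypothetical and not part of CVE Binary Tool; it assumes the lockfile layout where newer files list installs under `packages` (keyed by `node_modules/...` paths) and older ones nest them under `dependencies`.

```python
import json


def parse_package_lock(text):
    """Return sorted (name, version) pairs found in a package-lock.json string.

    Hypothetical helper for illustration only, not an existing tool API.
    """
    data = json.loads(text)
    found = set()

    # Newer lockfiles: "packages" maps install paths to package info.
    for path, info in data.get("packages", {}).items():
        # Skip the root project ("") and entries without a version.
        if "node_modules/" not in path or "version" not in info:
            continue
        # The package name is whatever follows the last "node_modules/".
        name = path.rpartition("node_modules/")[2]
        found.add((name, info["version"]))

    # Older lockfiles: fall back to the nested "dependencies" tree.
    if not found:
        stack = [data.get("dependencies", {})]
        while stack:
            deps = stack.pop()
            for name, info in deps.items():
                if "version" in info:
                    found.add((name, info["version"]))
                stack.append(info.get("dependencies", {}))

    return sorted(found)


if __name__ == "__main__":
    sample = json.dumps(
        {
            "lockfileVersion": 2,
            "packages": {
                "": {"name": "app", "version": "1.0.0"},
                "node_modules/lodash": {"version": "4.17.21"},
                "node_modules/@babel/core": {"version": "7.17.0"},
            },
        }
    )
    for name, version in parse_package_lock(sample):
        print(name, version)
```

The output is only a list of package names and versions; translating those names into {vendor, product} pairs is the separate mapping problem described above.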
Some languages/package managers of potential interest:
- npm (javascript)
- ruby (ruby gems)
- go (packages could be vendored or not)
- rust
- improving python support (e.g. add mappings and magic so requirements.txt can be scanned without converting to .csv)
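For the Python item, a rough sketch of reading pinned entries straight from a `requirements.txt` might look like the following. The name `parse_requirements` is hypothetical; the sketch only handles exact `name==version` pins and ignores ranges, wildcards, URLs, and option lines such as `-r other.txt`. Names are normalized the way PEP 503 describes (lowercase, runs of `-`, `_`, `.` collapsed to `-`).

```python
import re

# Matches "name==version" with optional extras, e.g. "requests[security]==2.27.1".
# Excluding "=" from the version avoids treating "===" as an exact pin.
PINNED = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[[^\]]*\])?\s*==\s*([^\s;#,=]+)"
)


def parse_requirements(text):
    """Return (normalized_name, version) pairs for exact pins only.

    Hypothetical helper for illustration, not an existing tool API.
    """
    pairs = []
    for raw in text.splitlines():
        # Drop full-line and inline comments.
        line = re.sub(r"(^|\s)#.*$", "", raw).strip()
        # Skip blanks and pip options such as "-r other.txt" or "--index-url".
        if not line or line.startswith("-"):
            continue
        match = PINNED.match(line)
        if not match:
            continue  # not an exact pin (e.g. ">=3.2"), so no single version
        name, version = match.groups()
        if "*" in version:
            continue  # "==1.*" is a prefix match, not a concrete version
        normalized = re.sub(r"[-_.]+", "-", name).lower()
        pairs.append((normalized, version))
    return pairs


if __name__ == "__main__":
    example = "Flask==2.0.1\nDjango>=3.2\nPy_YAML == 6.0  # inline comment\n"
    print(parse_requirements(example))
```

Unpinned requirements carry no single version to look up, so a real implementation would need to decide whether to skip them, warn, or resolve them against the installed environment.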
A 175hr project could choose 2-3 package list types to support and work on that.
For a 350hr project, I'd definitely want to see some plan for a mapping database/data structure with the following:
- Mapping {package name, package source} -> {vendor, product} pair (this may be an n:n, 1:n, or n:1 mapping)
- Plans for how to allow/encourage users to contribute data back to the project (see ideas in Improve product vendor matching for component list scanning #1504)
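One possible shape for such a mapping, sketched in memory only: two indexes so that lookups work in both directions and n:n relationships are representable. All class names and the sample entries below are hypothetical placeholders, not real mapping data or an existing CVE Binary Tool structure; a real proposal would also need to pick a storage format that users can easily contribute to.

```python
from collections import defaultdict
from typing import NamedTuple


class Package(NamedTuple):
    name: str
    source: str  # e.g. "pypi", "npm", "rubygems"


class VendorProduct(NamedTuple):
    vendor: str
    product: str


class PackageMap:
    """Bidirectional n:n mapping between packages and {vendor, product} pairs."""

    def __init__(self):
        self._forward = defaultdict(set)  # Package -> {VendorProduct}
        self._reverse = defaultdict(set)  # VendorProduct -> {Package}

    def add(self, package, vendor_product):
        self._forward[package].add(vendor_product)
        self._reverse[vendor_product].add(package)

    def lookup(self, name, source):
        """All {vendor, product} pairs known for a package from a given source."""
        return sorted(self._forward.get(Package(name, source), ()))

    def packages_for(self, vendor, product):
        """All packages known to map to a {vendor, product} pair."""
        return sorted(self._reverse.get(VendorProduct(vendor, product), ()))


if __name__ == "__main__":
    mapping = PackageMap()
    # Placeholder entries for illustration only.
    mapping.add(Package("examplepkg", "pypi"), VendorProduct("example_vendor", "example_product"))
    mapping.add(Package("examplepkg", "npm"), VendorProduct("example_vendor", "example_product"))
    print(mapping.lookup("examplepkg", "pypi"))
    print(mapping.packages_for("example_vendor", "example_product"))
```

Keying on both name and source matters because the same package name can refer to unrelated projects in different repositories.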
Hours
175 or 350, scaled depending on how many package types you intend to tackle and whether you want to add the mapping database
Difficulty level
- beginner to intermediate
Recommended skills
- databases, json, experience with other package managers