5,699 changes: 5,699 additions & 0 deletions contrib/bom-1.6.schema.json

Large diffs are not rendered by default.

127 changes: 70 additions & 57 deletions contrib/depscanGPT/README.md
@@ -5,71 +5,84 @@ depscanGPT is [available](https://chatgpt.com/g/g-674f260c887c819194e465d2c65f40
## System prompt

```text
# System Prompt
# System Prompt

You are depscan, an application‑security expert in Software Composition Analysis (SCA) and supply‑chain security. Your only sources of truth are:
- JSON files the user uploads (CycloneDX VDR, SBOM, CBOM, OBOM, SaaSBOM, ML‑BOM, CSAF VEX)
- Embedded reference docs bundled with this GPT (e.g., PROJECT_TYPES.md)
JSON files the user uploads (CycloneDX VDR, SBOM, CBOM, OBOM, SaaSBOM, ML‑BOM, CSAF VEX)
Embedded reference docs bundled with this GPT (e.g., PROJECT_TYPES.md)

If data is missing, reply: “That information isn’t available in the provided materials.”

## Scope

Answer only questions about:
- CycloneDX BOM or VDR content
- OASIS CSAF VEX
- OWASP depscan, blint, or cdxgen

**BOM generation & CycloneDX authoring**

If the user’s question is about creating a BOM or general CycloneDX mechanics (rather than analysing an existing report), redirect them to cdxgenGPT:
“For BOM generation, please try the dedicated assistant here → https://chatgpt.com/g/g-673bfeb4037481919be8a2cd1bf868d2-cdxgen ”

For anything else, respond: “I’m sorry, but I can only help with BOM and VDR‑related queries.”

## Interaction flow
1. Greeting (first turn only) – “Hello, I’m OWASP depscan — how can I help with your BOM or VDR?”
2. Ask for a JSON file or a specific question.
3. Never offer to create sample BOM/VDR files.

## Analysis rules
- VDR: use vulnerabilities, severity, analysis, etc.
- SBOM/CBOM/OBOM/ML‑BOM: use components, purl, licenses, properties, etc.
- SaaSBOM: use services, endpoints, authenticated, data.classification.
- Infer ecosystem from purl (pkg:npm → npm, pkg:pypi → Python).
- If coverage is unclear, suggest regenerating with depscan `--profile research` or `--reachability-analyzer SemanticReachability`.

## Understanding depscan reports

**Input expectations**
- If the user’s question involves scan results but no report is attached, ask them to upload `depscan.html` or `depscan.txt` (console output) — whichever they have handy.
- Accept CycloneDX VDR JSON alongside the HTML/TXT when both are supplied.
- If key details (e.g., reachable flows, service endpoints, remediation notes) are missing from the uploaded depscan.html or depscan.txt, tell the user: “Please rerun depscan with the `--explain` flag and attach the regenerated report for a detailed analysis.”

**How to analyse the report (JSON, HTML or TXT)**
1. When summarizing a VDR JSON file, if an annotations array exists and any annotator.name is "owasp-depscan", prefer the text field as the primary summary. Choose the latest timestamped annotation if multiple exist.
2. In TEXT and HTML files, locate the “Dependency Scan Results (BOM)” table → extract package, CVE, severity, score and fix version.
1. Use the “Reachable / Endpoint‑Reachable / Top Priority” sections to explain exploitability and remediation order.
2. Parse the “Service Endpoints” and “Reachable Flows” tables to highlight insecure routes or code hotspots.
3. Everything you state must be quoted or paraphrased from the uploaded report; if a datum is absent, say so plainly.

**Response rules**
- Never guess, extrapolate or add external CVE intelligence.
- Keep the normal style limits (≤ 2 sentences or ≤ 3 bullets).
- When advising fixes, repeat only the fix version shown in the report; do not suggest alternative versions.

## Reference look‑ups
- For supported languages/frameworks, consult PROJECT_TYPES.md and quote it.
- If unsupported, direct the user to open a “Premium Issue” in the cdxgen GitHub repo (link on request).

## Response style
- ≤ 2 sentences (or ≤ 3 brief bullet points).
- No jokes or small talk.
- Don’t add unsolicited suggestions.

## Feedback nudge

When a user expresses satisfaction, once per session invite them to review depscanGPT on social media or donate to the OWASP Foundation.
• CycloneDX BOM or VDR content
• OASIS CSAF VEX
• OWASP depscan, blint, or cdxgen

## BOM generation & CycloneDX authoring

If the user’s question is about creating a BOM or general CycloneDX mechanics (rather than analyzing an existing report), redirect them:

“For BOM generation, please try the dedicated assistant here → https://chatgpt.com/g/g-673bfeb4037481919be8a2cd1bf868d2-cdxgen”

For any other unrelated request, respond:

“I’m sorry, but I can only help with BOM and VDR-related queries.”

## Interaction Flow
1. Greeting (first turn only): “Hello, I’m OWASP depscan — how can I help with your BOM or VDR?”. Display the ascii logo from "Optional ASCII logo" occasionally.
2. Request a JSON file or specific question.
3. Never offer to create sample BOM/VDR files.

## Analysis Rules
• VDR: Only use vulnerabilities, analysis, annotations, severity.
• SBOM/CBOM/OBOM/ML‑BOM: Only use components, purl, licenses, properties.
• SaaSBOM: Only use services, endpoints, authenticated, data.classification.
• Infer the ecosystem solely from purl fields (e.g., pkg:npm → npm).
• If coverage is unclear, suggest rerunning depscan with --profile research or --reachability-analyzer SemanticReachability.

## Understanding Depscan Reports (TXT/HTML)
• If the user provides a depscan.txt or depscan.html, accept it.
• Prefer annotations array from VDR when summarizing vulnerabilities, picking the latest timestamp if multiple exist.
• Parse and use:
• “Dependency Scan Results (BOM)” table: extract package name, CVE, severity, fix version.
• “Reachable / Endpoint-Reachable / Top Priority” sections: highlight exploitability and remediation order.
• “Service Endpoints” and “Reachable Flows” tables: highlight insecure code paths.
• “Next Steps” section: treat this as **mandatory source of truth** for recommending actions if present.
• **Never extrapolate** beyond what the reports or annotations explicitly state.

## Automatic Build Manager Command Generation

When a “Next Steps” section exists:
• If a “Fix Version” and “Package” are specified, generate a build tool command based solely on:
• the purl format (e.g., pkg:nuget, pkg:npm, pkg:maven)
• any explicitly provided project hints (e.g., .csproj paths).
• Only use standard native command syntax:
• NuGet (.NET projects):
dotnet add <path>.csproj package <package-name> --version <fix-version>
• npm projects:
npm install <package-name>@<fix-version> --save
• Maven projects:
Suggest manually updating pom.xml or using:
mvn versions:set -DnewVersion=<fix-version>
• **Do not infer missing information.**
• **Do not recommend upgrades for packages without a fix version provided.**

## Response Rules
• Never guess, extrapolate, or add external CVE intelligence.
• Responses must match exact data and structure from the uploaded depscan or VDR.
• When advising a fix, **repeat exactly** the “Fix Version” shown in the report — no alternative versions or speculations.
• If multiple “Next Steps” exist, treat them independently.

## Style
• Keep all responses ≤ 2 sentences or ≤ 3 bullets unless user asks for expanded details.
• No jokes, small talk, or promotional suggestions.
• Do not insert external links unless specifically asked.

## Feedback Nudge

When a user expresses satisfaction, invite them once per session to review depscanGPT on social media or donate to the OWASP Foundation.

## Optional ASCII logo

4 changes: 2 additions & 2 deletions contrib/vex-validate.py
@@ -23,13 +23,13 @@ def build_args():


 def vvex(vex_json):
-    schema = os.path.join(os.path.dirname(__file__), "bom-1.5.schema.json")
+    schema = os.path.join(os.path.dirname(__file__), "bom-1.6.schema.json")
     with open(schema, mode="r") as sp:
         with open(vex_json, mode="r") as vp:
             vex_obj = json.load(vp)
             try:
                 validate(instance=vex_obj, schema=json.load(sp))
-                print("VEX file is valid")
+                print("VDR/VEX file is valid")
             except ValidationError as ve:
                 print(ve)
                 sys.exit(1)
31 changes: 21 additions & 10 deletions depscan/cli.py
@@ -170,17 +170,22 @@ def vdr_analyze_summarize(
     vdr_file = os.path.join(bom_dir, DEPSCAN_DEFAULT_VDR_FILE)
     if vdr_result.success:
         pkg_vulnerabilities = vdr_result.pkg_vulnerabilities
+        cdx_vdr_data = None
         # Always create VDR files even when empty
         if pkg_vulnerabilities is not None:
             # Case 1: Single BOM file resulting in a single VDR file
             if bom_file:
-                if bom_data := json_load(bom_file, log=LOG):
-                    export_bom(bom_data, ds_version, pkg_vulnerabilities, vdr_file)
+                cdx_vdr_data = json_load(bom_file, log=LOG)
             # Case 2: Multiple BOM files in a bom directory
             elif bom_dir:
-                bom_data = create_empty_vdr(pkg_list, ds_version)
-                export_bom(bom_data, ds_version, pkg_vulnerabilities, vdr_file)
-            LOG.debug(f"The VDR file '{vdr_file}' was created successfully.")
+                cdx_vdr_data = create_empty_vdr(pkg_list, ds_version)
+            if cdx_vdr_data:
+                export_bom(cdx_vdr_data, ds_version, pkg_vulnerabilities, vdr_file)
+                LOG.debug(f"The VDR file '{vdr_file}' was created successfully.")
+            else:
+                LOG.debug(
+                    f"VDR file '{vdr_file}' was not created for the type {project_type}."
+                )
             summary = summary_stats(pkg_vulnerabilities)
         elif bom_dir or bom_file or pkg_list:
             LOG.info("No vulnerabilities found for project type '%s'!", project_type)
@@ -656,10 +661,13 @@ def run_depscan(args):
             or (vuln_analyzer == "auto" and bom_dir_mode)
         ):
             if args.reachability_analyzer == "SemanticReachability":
-                LOG.info(
-                    "Semantic Reachability analysis requested for project type '%s'. This might take a while ...",
-                    project_type,
-                )
+                if not args.bom_dir:
+                    LOG.info(
+                        "Semantic Reachability analysis requested for project type '%s'. This might take a while ...",
+                        project_type,
+                    )
+                else:
+                    LOG.info("Attempting semantic analysis based on existing data at '%s'", args.bom_dir)
             else:
                 LOG.info(
                     "Lifecycle-based vulnerability analysis requested for project type '%s'. This might take a while ...",
@@ -862,7 +870,9 @@ def run_depscan(args):
         else:
             LOG.debug("Vulnerability database loaded from %s", config.VDB_BIN_FILE)
         if len(pkg_list) > 1:
-            if args.bom:
+            if project_type == "bom":
+                LOG.info("Scanning CycloneDX xBOMs and atom slices")
+            elif args.bom:
                 LOG.info(
                     "Scanning %s with type %s",
                     args.bom,
@@ -935,6 +945,7 @@ def run_depscan(args):
                 project_type,
                 src_dir,
                 args.bom_dir or reports_dir,
+                vdr_file,
                 vdr_result,
                 args.explanation_mode,
             )
66 changes: 50 additions & 16 deletions depscan/lib/bom.py
@@ -1,6 +1,9 @@
 import os
 import shutil
 import sys
+import uuid
+from collections import defaultdict
+from datetime import datetime, timezone
 from urllib.parse import unquote_plus

 from blint.cyclonedx.spec import CycloneDX
@@ -438,8 +441,8 @@ def create_lifecycle_boms(cdxgen_lib, src_dir, options):

 def create_empty_vdr(pkg_list, ds_version):
     components = pkg_list or []
-    metadata = update_tools_metadata(None, None, ds_version)
-    return {"metadata": metadata, "components": components}
+    bom_data = update_tools_metadata(None, None, ds_version)
+    return {**bom_data, "components": components}


 def update_tools_metadata(tools, bom_data, ds_version):
Expand All @@ -451,18 +454,31 @@ def update_tools_metadata(tools, bom_data, ds_version):
:return: None
"""
if not bom_data:
bom_data = {"metadata": {}}
components = tools.get("components", []) if tools else []
ds_purl = f"pkg:pypi/owasp-depscan@{ds_version}"
components.append(
{
"type": "application",
"name": "owasp-depscan",
"version": ds_version,
"purl": ds_purl,
"bom-ref": ds_purl,
now_utc = datetime.now(timezone.utc)
bom_data = {
"bomFormat": "CycloneDX",
"specVersion": "1.6",
"serialNumber": f"urn:uuid:{uuid.uuid4()}",
"version": 1,
"metadata": {
"timestamp": now_utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
},
}
components = tools.get("components", []) if tools else []
needs_ds_component = (
len([c for c in components if c.get("name") == "owasp-depscan"]) == 0
)
if needs_ds_component:
ds_purl = f"pkg:pypi/owasp-depscan@{ds_version}"
components.append(
{
"type": "application",
"name": "owasp-depscan",
"version": ds_version,
"purl": ds_purl,
"bom-ref": ds_purl,
}
)
bom_data["metadata"]["tools"] = {"components": components}
return bom_data
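
Reviewer note: the hunk above makes `update_tools_metadata` return a full CycloneDX 1.6 envelope (format, spec version, serial number, timestamp) when no BOM is passed in, and registers the `owasp-depscan` tool component only if it is not already present. A minimal standalone sketch of that behaviour follows. The helper names `new_bom_skeleton` and `add_depscan_tool` and the version string `"0.0.0"` are hypothetical placeholders for illustration, not part of this PR.

```python
import uuid
from datetime import datetime, timezone


def new_bom_skeleton(spec_version="1.6"):
    """Return a minimal CycloneDX envelope (hypothetical helper, not in the PR)."""
    now_utc = datetime.now(timezone.utc)
    return {
        "bomFormat": "CycloneDX",
        "specVersion": spec_version,
        # A fresh random UUID per document, in URN form
        "serialNumber": f"urn:uuid:{uuid.uuid4()}",
        "version": 1,
        "metadata": {"timestamp": now_utc.strftime("%Y-%m-%dT%H:%M:%SZ")},
    }


def add_depscan_tool(bom_data, ds_version):
    """Add owasp-depscan to metadata.tools.components at most once."""
    tools = bom_data["metadata"].setdefault("tools", {})
    components = tools.setdefault("components", [])
    if not any(c.get("name") == "owasp-depscan" for c in components):
        ds_purl = f"pkg:pypi/owasp-depscan@{ds_version}"
        components.append(
            {
                "type": "application",
                "name": "owasp-depscan",
                "version": ds_version,
                "purl": ds_purl,
                "bom-ref": ds_purl,
            }
        )
    return bom_data


# "0.0.0" is a placeholder version used only for this illustration
bom = add_depscan_tool(new_bom_skeleton(), "0.0.0")
bom = add_depscan_tool(bom, "0.0.0")  # second call does not add a duplicate
```

Calling the helper twice leaves a single tool entry, which mirrors the `needs_ds_component` guard in the new code.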

@@ -505,16 +521,34 @@ def trim_vdr_bom_data(bom_data):
if metadata and metadata.get("properties"):
del metadata["properties"]
bom_data["metadata"] = metadata
new_components = []
new_components = {}
component_identities = defaultdict(list)
for comp in components:
identity_evidences = comp.get("evidence", {}).get("identity", []) or []
if isinstance(identity_evidences, dict):
identity_evidences = [identity_evidences]
for p in (
"properties",
"signature",
"url",
"vendor",
"licenses", # We need a better logic to retain licenses here
):
if comp.get(p):
if comp.get(p) is not None:
del comp[p]
new_components.append(comp)
bom_data["components"] = new_components
ref = comp.get("bom-ref") or comp.get("purl")
# This is an error condition really
if not ref:
continue
component_identities[ref] += identity_evidences
if not new_components.get(ref):
new_components[ref] = comp
vdr_components = []
for ref, comp in new_components.items():
identity_evidences = component_identities[ref]
comp["evidence"] = {"identity": identity_evidences}
vdr_components.append(comp)
bom_data["components"] = vdr_components
for p in (
"annotations",
"signature",
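
Reviewer note: the new loop in `trim_vdr_bom_data` deduplicates components by `bom-ref` (falling back to `purl`), keeps the first occurrence, and pools the identity evidence of every duplicate onto it. The sketch below isolates that idea. The function name `merge_components_by_ref` is hypothetical, and unlike the PR code it copies each component rather than mutating it and leaves the field-stripping step out.

```python
from collections import defaultdict


def merge_components_by_ref(components):
    """Keep the first component per bom-ref/purl and pool identity evidence."""
    first_seen = {}
    identities = defaultdict(list)
    for comp in components:
        evidence = (comp.get("evidence") or {}).get("identity") or []
        # identity may arrive as a single object or as a list; normalise to a list
        if isinstance(evidence, dict):
            evidence = [evidence]
        ref = comp.get("bom-ref") or comp.get("purl")
        if not ref:
            # No usable reference: skipped, as in the PR
            continue
        identities[ref].extend(evidence)
        first_seen.setdefault(ref, comp)
    # dicts preserve insertion order, so output order follows first occurrence
    return [
        {**comp, "evidence": {"identity": identities[ref]}}
        for ref, comp in first_seen.items()
    ]


sample = [
    {"bom-ref": "a", "evidence": {"identity": {"field": "purl"}}},
    {"bom-ref": "a", "evidence": {"identity": [{"field": "name"}]}},
    {"name": "no-ref"},
    {"purl": "pkg:npm/x@1"},
]
merged = merge_components_by_ref(sample)
```

One behaviour worth confirming in review: every retained component ends up with an `evidence.identity` list, even when that list is empty.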
6 changes: 5 additions & 1 deletion depscan/lib/explainer.py
@@ -16,12 +16,16 @@
from depscan.lib.logger import console, LOG


def explain(project_type, src_dir, bom_dir, vdr_result, explanation_mode):
def explain(project_type, src_dir, bom_dir, vdr_file, vdr_result, explanation_mode):
"""
Explain the analysis and findings based on the explanation mode.

:param project_type: Project type
:param src_dir: Source directory
:param bom_dir: BOM directory
:param vdr_file: VDR file
:param vdr_result: VDR Result
:param explanation_mode: Explanation mode
"""
pattern_methods = {}
has_any_explanation = False
1 change: 1 addition & 0 deletions packages/analysis-lib/src/analysis_lib/__init__.py
@@ -71,6 +71,7 @@ class VDRResult:
     reached_purls: Optional[Dict[str, int]] = None
     reached_services: Optional[Dict[str, int]] = None
     endpoint_reached_purls: Optional[Dict[str, int]] = None
+    purl_identities: Optional[Dict[str, List]] = None


class Counts: