@agronskiy agronskiy commented Dec 4, 2025

(Still under active self-review; some AI-generated remnants may remain.)

Implemented OCI layer inspection to extract framework definitions from container images without pulling entire images. The system now:

  • Extracts framework.yml files from containers using OCI layer inspection (partial_pull.py)
  • Parses framework definitions using nemo_evaluator.core.input.get_framework_evaluations()
  • Converts to structured Intermediate Representations (TaskIntermediateRepresentation, HarnessIntermediateRepresentation)
  • Serializes IRs to all_tasks_irs.yaml with mapping.toml checksum validation
  • Provides CLI commands (ls task, ls tasks) with --from flag for on-the-fly container inspection
  • Generates documentation automatically with checksum validation to ensure consistency
  • Validates in CI that mapping.toml changes are reflected in all_tasks_irs.yaml

Approximate set of changes

1. OCI Layer Inspection

  • partial_pull.py module for OCI layer inspection
  • Extracts files from Docker image layers without pulling entire images
  • Supports GitLab and nvcr.io registries
  • Supports the Docker credentials file (~/.docker/config.json) as a fallback
  • Implements caching mechanism (~/.nemo-evaluator/docker-meta/)
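
The core idea of the layer inspection above can be sketched as follows. This is a minimal illustration, not the code in partial_pull.py: it assumes a layer blob has already been downloaded as a (possibly gzip-compressed) tar archive, and it scans that archive for the first file matching a glob pattern. The helper names `find_file_in_layer` and `make_demo_layer` and the demo paths are hypothetical.

```python
import fnmatch
import io
import tarfile
from typing import Dict, Optional


def find_file_in_layer(layer_bytes: bytes, pattern: str) -> Optional[bytes]:
    """Return the contents of the first regular file whose path matches `pattern`."""
    # "r:*" lets tarfile detect the compression (none, gzip, bz2, xz) on its own.
    with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:*") as layer:
        for member in layer:
            if member.isfile() and fnmatch.fnmatch(member.name, pattern):
                extracted = layer.extractfile(member)
                return extracted.read() if extracted else None
    return None


def make_demo_layer(files: Dict[str, bytes]) -> bytes:
    """Build an in-memory gzip-compressed tar that stands in for a layer blob."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as layer:
        for path, content in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(content)
            layer.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


# Hypothetical layer contents; real images will use different paths.
demo_layer = make_demo_layer(
    {
        "opt/demo/README.md": b"demo",
        "opt/demo/framework.yml": b"framework:\n  name: demo\n",
    }
)
found = find_file_in_layer(demo_layer, "*/framework.yml")
missing = find_file_in_layer(demo_layer, "*/does_not_exist.yml")
```

Because only the layers that might contain the target file need to be fetched and scanned, this style of lookup avoids pulling the whole image.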

2. Framework Extraction Script

  • load_framework_definitions.py script
  • Extracts framework.yml from containers using OCI layer inspection
  • Uses find_file_matching_pattern_in_image_layers() for pattern-based search
  • Parses framework.yml using nemo_evaluator.core.input.get_framework_evaluations()
  • Converts to Intermediate Representations (IRs)
  • Serializes to all_tasks_irs.yaml with checksum metadata
  • Checksum validation: Calculates SHA256 checksum of mapping.toml and stores it in all_tasks_irs.yaml metadata
    • When mapping.toml changes, all_tasks_irs.yaml must be regenerated locally by running this script
    • CI runs test_packaged_mapping_toml_checksum_match() test which fails if checksums don't match
    • This ensures changes to mapping.toml are always reflected in all_tasks_irs.yaml before merging
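
The checksum step described above could look roughly like this. It is a sketch under assumptions: the metadata key `mapping_toml_checksum` and the `sha256:` prefix are hypothetical, and the PR does not show the actual layout of the metadata section.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_metadata(mapping_path: Path) -> dict:
    # Hypothetical key name; stored next to the serialized tasks.
    return {"mapping_toml_checksum": f"sha256:{sha256_of_file(mapping_path)}"}


def is_in_sync(metadata: dict, mapping_path: Path) -> bool:
    """True when the stored checksum still matches the file on disk."""
    expected = f"sha256:{sha256_of_file(mapping_path)}"
    return metadata.get("mapping_toml_checksum") == expected


with tempfile.TemporaryDirectory() as tmp:
    mapping = Path(tmp) / "mapping.toml"
    mapping.write_text('[demo]\ncontainer = "example"\n')
    metadata = build_metadata(mapping)
    in_sync_before_edit = is_in_sync(metadata, mapping)

    # Editing mapping.toml without regenerating the metadata breaks the match.
    mapping.write_text('[demo]\ncontainer = "changed"\n')
    in_sync_after_edit = is_in_sync(metadata, mapping)
```

A CI test in the spirit of test_packaged_mapping_toml_checksum_match() would then only need to assert that the stored and recomputed checksums agree.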

3. IR-Based Loading System

  • Added all_tasks_irs.yaml (single YAML document with metadata and tasks sections)
  • Uses nemo_evaluator.core.input.get_framework_evaluations() to parse framework.yml files
  • Converts parsed data to TaskIntermediateRepresentation and HarnessIntermediateRepresentation dataclasses
  • Checksum validation: mapping.toml checksum stored in all_tasks_irs.yaml metadata and validated on load
    • Validates that all_tasks_irs.yaml is in sync with mapping.toml
    • CI test test_packaged_mapping_toml_checksum_match() ensures packaged artifacts match
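
To make the shape of the IR document concrete, here is a small sketch using plain dataclasses and dicts. The class `TaskIR`, its fields, and the `mapping_toml_checksum` key are hypothetical stand-ins; the real TaskIntermediateRepresentation and HarnessIntermediateRepresentation may carry different fields. Only the top-level `metadata` and `tasks` sections are taken from the description above.

```python
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TaskIR:
    # Hypothetical fields chosen for illustration only.
    name: str
    harness: str
    description: str = ""


def to_document(tasks: List[TaskIR], mapping_checksum: str) -> dict:
    """Build the single-document structure: a metadata section plus a tasks section."""
    return {
        "metadata": {"mapping_toml_checksum": mapping_checksum},
        "tasks": [asdict(task) for task in tasks],
    }


def load_document(document: dict, current_checksum: str) -> Tuple[List[TaskIR], bool]:
    """Rebuild the IRs and report whether the stored checksum matches the current one."""
    tasks = [TaskIR(**entry) for entry in document["tasks"]]
    stored = document["metadata"].get("mapping_toml_checksum")
    return tasks, stored == current_checksum


original = [
    TaskIR(name="demo_task", harness="demo_harness", description="A demo task"),
    TaskIR(name="other_task", harness="demo_harness"),
]
document = to_document(original, mapping_checksum="sha256:abc")
loaded, mapping_verified = load_document(document, current_checksum="sha256:abc")
_, stale_verified = load_document(document, current_checksum="sha256:def")
```

Returning the verification result alongside the tasks, rather than raising, lets each caller decide whether a mismatch is a warning or a hard failure.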

4. CLI Commands

  • Added ls_task.py command (new file)
    • Loads from all_tasks_irs.yaml via load_tasks_from_tasks_file()
    • Displays warning when mapping_verified=False
    • Supports --from <container> flag for on-the-fly container inspection
  • Updated ls_tasks.py command
    • Added --from <container> flag for on-the-fly container inspection
    • Continues using mapping.toml when --from not provided
  • With --from <container>, both commands:
    • Extracts framework.yml from container using OCI layer inspection
    • Parses to IRs on-the-fly
    • Bypasses packaged resources completely
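
A flag named `--from` needs a little care in Python because `from` is a reserved keyword. The sketch below uses argparse purely as a stand-in, since the PR description does not say which CLI library the launcher uses; the program name, the `from_container` destination, and the `choose_source` helper are hypothetical.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ls-tasks-demo")
    parser.add_argument(
        "--from",
        # "from" is a Python keyword, so `args.from` would be a syntax error;
        # an explicit dest gives the value a usable attribute name.
        dest="from_container",
        metavar="CONTAINER",
        default=None,
        help="inspect this container image instead of the packaged task list",
    )
    return parser


def choose_source(argv: List[str]) -> str:
    """Decide where task definitions come from, based on the parsed flags."""
    args = build_parser().parse_args(argv)
    if args.from_container:
        # On-the-fly path: packaged resources are bypassed entirely.
        return f"container:{args.from_container}"
    return "packaged"


with_flag = choose_source(["--from", "registry.example/demo:1.0"])
without_flag = choose_source([])
```

Keeping the source decision in one small function makes it straightforward for both commands to share the same behaviour.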

5. Documentation Generation

  • Added autogen_task_yamls.py script
    • Uses load_tasks_from_tasks_file() to load IRs
    • Generates harness markdown pages (docs/task_catalog/harnesses/*.md)
    • Generates benchmarks table (docs/task_catalog/benchmarks-table.md)
    • Added checksum validation: script fails if mapping_verified=False
    • Integrated into Sphinx build process (docs/conf.py setup() function)
    • Adds horizontal separators (---) between tasks in harness pages
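
The generation step above can be pictured with a short sketch. It is not autogen_task_yamls.py itself: the function name, the task dictionary keys, and the page layout are assumptions. It shows only the two behaviours named in the list, failing when `mapping_verified` is false and separating tasks with `---`.

```python
from typing import Dict, List


def render_harness_page(
    harness: str, tasks: List[Dict[str, str]], mapping_verified: bool
) -> str:
    """Render one harness page, refusing to run on unverified task data."""
    if not mapping_verified:
        raise RuntimeError(
            "task data is out of sync with mapping.toml; regenerate it first"
        )
    sections = [f"## {task['name']}\n\n{task['description']}" for task in tasks]
    # A horizontal rule between tasks keeps long pages easy to scan.
    return f"# {harness}\n\n" + "\n\n---\n\n".join(sections) + "\n"


demo_tasks = [
    {"name": "demo_task", "description": "A demo task."},
    {"name": "other_task", "description": "Another demo task."},
]
page = render_harness_page("demo_harness", demo_tasks, mapping_verified=True)
```

Failing the build on stale data means published pages cannot silently drift from mapping.toml.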

Summary by CodeRabbit

  • New Features

    • Added auto-generated task catalog documentation with detailed task information and benchmarks table.
    • Introduced new CLI command to view individual task details with filtering and JSON output support.
    • Included task descriptions and types in task listings for better discoverability.
  • Documentation

    • Auto-generated README table with harness information, container details, and NGC links.
    • Enhanced benchmark documentation with autogenerated content blocks.
  • Tests

    • Added validation tests for task intermediate representations and mapping checksums.
    • Updated existing tests to reflect new task metadata structure.


copy-pr-bot bot commented Dec 4, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions bot added the documentation, nemo-evaluator-launcher, and tests labels Dec 4, 2025
@agronskiy agronskiy force-pushed the agronskiy/experimental/launcher/retrieve-oci-based-metadata-storage branch from d39e224 to 5063446 Compare December 4, 2025 21:24
@agronskiy
Collaborator Author

/ok to test 5063446

@agronskiy agronskiy force-pushed the agronskiy/experimental/launcher/retrieve-oci-based-metadata-storage branch 2 times, most recently from 878087e to 5d7ba78 Compare December 4, 2025 21:59
@agronskiy
Collaborator Author

/ok to test 5d7ba78

@agronskiy agronskiy force-pushed the agronskiy/experimental/launcher/retrieve-oci-based-metadata-storage branch from 79d647b to 7bfcd76 Compare December 5, 2025 10:13
@agronskiy agronskiy self-assigned this Dec 5, 2025
@agronskiy agronskiy marked this pull request as ready for review December 5, 2025 16:41
@agronskiy agronskiy requested review from a team as code owners December 5, 2025 16:41
agronskiy and others added 21 commits December 16, 2025 09:07
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Marta Stepniewska-Dziubinska <[email protected]>
Signed-off-by: Marta Stepniewska-Dziubinska <[email protected]>
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Marta Stepniewska-Dziubinska <[email protected]>
Signed-off-by: Marta Stepniewska-Dziubinska <[email protected]>
Signed-off-by: Marta Stepniewska-Dziubinska <[email protected]>
Signed-off-by: Marta Stepniewska-Dziubinska <[email protected]>
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Alex Gronskiy <[email protected]>
@agronskiy agronskiy force-pushed the agronskiy/experimental/launcher/retrieve-oci-based-metadata-storage branch from e0973bd to 80de472 Compare December 16, 2025 08:07
@agronskiy agronskiy merged commit 94d8208 into main Dec 16, 2025
47 checks passed
@agronskiy agronskiy deleted the agronskiy/experimental/launcher/retrieve-oci-based-metadata-storage branch December 16, 2025 10:19