Commit 94d8208
feat(sdk): propagate OCI layer-based metadata to the launcher (#523)
(Still under active self-review; some AI-generated remnants may be present.)
Implemented OCI layer inspection to extract framework definitions from
container images without pulling entire images. The system now:
- Extracts `framework.yml` files from containers using OCI layer
inspection (`partial_pull.py`)
- Parses framework definitions using
`nemo_evaluator.core.input.get_framework_evaluations()`
- Converts to structured Intermediate Representations
(`TaskIntermediateRepresentation`, `HarnessIntermediateRepresentation`)
- Serializes IRs to `all_tasks_irs.yaml` with `mapping.toml` checksum
validation
- Provides CLI commands (`ls task`, `ls tasks`) with a `--from` flag for
on-the-fly container inspection
- Generates documentation automatically with checksum validation to
ensure consistency
- Validates in CI that `mapping.toml` changes are reflected in
`all_tasks_irs.yaml`
## Approximate set of changes
### 1. OCI Layer Inspection
- `partial_pull.py` module for OCI layer inspection
- Extracts files from Docker image layers without pulling entire images
- Supports GitLab and nvcr.io registries, with the Docker credentials
file (`~/.docker/config.json`) as a fallback
- Implements a caching mechanism (`~/.nemo-evaluator/docker-meta/`)
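
The core idea of pattern-based search inside a layer can be sketched as follows. This is a minimal illustration, not the code in `partial_pull.py`: the function names `find_file_in_layer` and `make_demo_layer` are hypothetical, and the layer blob is built in memory instead of being fetched from a registry.

```python
import fnmatch
import io
import tarfile
from typing import Dict, Optional, Tuple


def find_file_in_layer(layer_blob: bytes, pattern: str) -> Optional[Tuple[str, bytes]]:
    """Return (path, content) of the first regular file matching pattern, or None."""
    # A layer blob is a tar archive, usually gzip-compressed; "r:*" auto-detects.
    with tarfile.open(fileobj=io.BytesIO(layer_blob), mode="r:*") as tar:
        for member in tar:
            if member.isfile() and fnmatch.fnmatch(member.name, pattern):
                extracted = tar.extractfile(member)
                if extracted is not None:
                    return member.name, extracted.read()
    return None


def make_demo_layer(files: Dict[str, bytes]) -> bytes:
    """Build an in-memory gzip-compressed tar that stands in for a layer blob."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# Hypothetical layer contents, used only for this demonstration.
demo_layer = make_demo_layer({
    "opt/harness/framework.yml": b"framework:\n  name: demo\n",
    "opt/harness/README.md": b"# demo\n",
})
match = find_file_in_layer(demo_layer, "*/framework.yml")
```

In a real registry client, each layer blob would be downloaded individually, so only the layers that need to be searched are transferred.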
### 2. Framework Extraction Script
- `load_framework_definitions.py` script
- Extracts `framework.yml` from containers using OCI layer inspection
- Uses `find_file_matching_pattern_in_image_layers()` for pattern-based
search
- Parses `framework.yml` using
`nemo_evaluator.core.input.get_framework_evaluations()`
- Converts to Intermediate Representations (IRs)
- Serializes to `all_tasks_irs.yaml` with checksum metadata
- **Checksum validation**: Calculates the SHA256 checksum of `mapping.toml`
and stores it in the `all_tasks_irs.yaml` metadata
- When `mapping.toml` changes, `all_tasks_irs.yaml` must be regenerated
locally by running this script
- CI runs the `test_packaged_mapping_toml_checksum_match()` test, which fails
if the checksums don't match
- This ensures that changes to `mapping.toml` are always reflected in
`all_tasks_irs.yaml` before merging
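
The checksum check described above could look roughly like this. It is a sketch under assumptions: the metadata key `mapping_toml_checksum` and both function names are hypothetical, and the real script may store or compare the value differently.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mapping_is_in_sync(mapping_path: Path, irs_metadata: dict) -> bool:
    """Compare the stored checksum with the current one."""
    # "mapping_toml_checksum" is an assumed key name, not taken from the PR.
    return irs_metadata.get("mapping_toml_checksum") == sha256_of_file(mapping_path)


with tempfile.TemporaryDirectory() as tmp:
    mapping = Path(tmp) / "mapping.toml"
    mapping.write_text('[demo]\ncontainer = "example"\n')
    # Metadata recorded at generation time.
    metadata = {"mapping_toml_checksum": sha256_of_file(mapping)}
    in_sync_before = mapping_is_in_sync(mapping, metadata)
    # Editing mapping.toml without regenerating makes the check fail.
    mapping.write_text('[demo]\ncontainer = "changed"\n')
    in_sync_after = mapping_is_in_sync(mapping, metadata)
```

A CI test built on the same comparison only needs to assert that the stored and recomputed digests are equal.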
### 3. IR-Based Loading System
- **Added** `all_tasks_irs.yaml` (single YAML document with `metadata`
and `tasks` sections)
- **Uses** `nemo_evaluator.core.input.get_framework_evaluations()` to
parse `framework.yml` files
- **Converts** parsed data to `TaskIntermediateRepresentation` and
`HarnessIntermediateRepresentation` dataclasses
- **Checksum validation**: `mapping.toml` checksum stored in
`all_tasks_irs.yaml` metadata and validated on load
- Validates that `all_tasks_irs.yaml` is in sync with `mapping.toml`
- CI test `test_packaged_mapping_toml_checksum_match()` ensures packaged
artifacts match
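
To make the `metadata` plus `tasks` layout concrete, here is a small sketch using plain dataclasses and dictionaries. The class `TaskIRSketch`, its fields, and the `mapping_toml_checksum` key are assumptions for illustration; they are not the actual `TaskIntermediateRepresentation` definition, and YAML serialization is left out to keep the example dependency-free.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple


@dataclass
class TaskIRSketch:
    """Hypothetical stand-in for a task IR; field names are assumed."""
    name: str
    harness: str
    description: str = ""


def build_document(tasks: List[TaskIRSketch], mapping_checksum: str) -> Dict:
    """Assemble a single document with metadata and tasks sections."""
    return {
        "metadata": {"mapping_toml_checksum": mapping_checksum, "num_tasks": len(tasks)},
        "tasks": [asdict(task) for task in tasks],
    }


def load_tasks(document: Dict, current_checksum: str) -> Tuple[List[TaskIRSketch], bool]:
    """Rebuild task IRs and report whether the stored checksum still matches."""
    mapping_verified = document["metadata"]["mapping_toml_checksum"] == current_checksum
    tasks = [TaskIRSketch(**raw) for raw in document["tasks"]]
    return tasks, mapping_verified


document = build_document(
    [TaskIRSketch(name="demo_task", harness="demo_harness", description="Example task")],
    mapping_checksum="abc123",
)
loaded_tasks, verified = load_tasks(document, current_checksum="abc123")
_, verified_after_change = load_tasks(document, current_checksum="def456")
```

Returning the verification flag alongside the tasks lets callers decide whether a mismatch is a warning or a hard failure.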
### 4. CLI Commands
- **Added** `ls_task.py` command (new file)
- Loads from `all_tasks_irs.yaml` via `load_tasks_from_tasks_file()`
- Displays warning when `mapping_verified=False`
- Supports `--from <container>` flag for on-the-fly container inspection
- **Updated** `ls_tasks.py` command
- Added `--from <container>` flag for on-the-fly container inspection
- Continues using `mapping.toml` when `--from` is not provided
- **Added** `--from <container>` flag to both commands
- Extracts `framework.yml` from container using OCI layer inspection
- Parses to IRs on-the-fly
- Bypasses packaged resources completely
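
The switch between packaged resources and on-the-fly inspection can be illustrated with a small `argparse` sketch. The launcher's real CLI may be built with a different library; the program name, `choose_source`, and the example image reference are all hypothetical.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ls-tasks-sketch")
    parser.add_argument(
        "--from",
        # "from" is a Python keyword, so the value is stored under another name.
        dest="from_container",
        metavar="CONTAINER",
        default=None,
        help="inspect this container image instead of the packaged resources",
    )
    return parser


def choose_source(argv: List[str]) -> str:
    """Return which data source a command would use for the given arguments."""
    args = build_parser().parse_args(argv)
    if args.from_container:
        # On-the-fly path: packaged resources are bypassed entirely.
        return f"container:{args.from_container}"
    return "packaged"


default_source = choose_source([])
container_source = choose_source(["--from", "registry.example.com/demo:1.0"])
```

Keeping the decision in one function makes it easy for both commands to share the same behavior.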
### 5. Documentation Generation
- **Added** `autogen_task_yamls.py` script
- Uses `load_tasks_from_tasks_file()` to load IRs
- Generates harness markdown pages (`docs/task_catalog/harnesses/*.md`)
- Generates benchmarks table (`docs/task_catalog/benchmarks-table.md`)
- **Added** checksum validation: script fails if
`mapping_verified=False`
- **Integrated** into Sphinx build process (`docs/conf.py` `setup()`
function)
- Adds horizontal separators (`---`) between tasks in harness pages
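
The page generation with separators and the checksum guard might be sketched like this. The function names, the task dictionary keys, and the error message are assumptions; the real `autogen_task_yamls.py` presumably renders far more detail per task.

```python
from typing import Dict, List


def render_harness_page(harness: str, tasks: List[Dict[str, str]]) -> str:
    """Render one harness page with a horizontal rule between task sections."""
    sections = [f"## {task['name']}\n\n{task['description']}" for task in tasks]
    return f"# {harness}\n\n" + "\n\n---\n\n".join(sections) + "\n"


def generate_docs(harness: str, tasks: List[Dict[str, str]], mapping_verified: bool) -> str:
    """Refuse to generate documentation from IRs that are out of sync."""
    if not mapping_verified:
        raise RuntimeError(
            "all_tasks_irs.yaml is out of sync with mapping.toml; regenerate it first"
        )
    return render_harness_page(harness, tasks)


# Hypothetical task entries, used only for this demonstration.
demo_tasks = [
    {"name": "task_a", "description": "First example task."},
    {"name": "task_b", "description": "Second example task."},
]
page = generate_docs("demo-harness", demo_tasks, mapping_verified=True)
```

Joining the sections with the separator places a rule only between tasks, not after the last one.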
<img width="1039" height="935" alt="image"
src="https://github.com/user-attachments/assets/cb254618-729b-423a-be95-dad2eac1d42d"
/>
<img width="1098" height="874" alt="image"
src="https://github.com/user-attachments/assets/3a62fcee-677a-4c33-83ed-2a6efab383e2"
/>
<img width="1142" height="1270" alt="image"
src="https://github.com/user-attachments/assets/40609fec-72d1-4ab1-a64d-209da328d72a"
/>
<img width="1034" height="292" alt="image"
src="https://github.com/user-attachments/assets/a5ba5df4-2af2-485e-9e7e-d2b4e3211912"
/>
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added auto-generated task catalog documentation with detailed task
information and benchmarks table.
* Introduced new CLI command to view individual task details with
filtering and JSON output support.
* Included task descriptions and types in task listings for better
discoverability.
* **Documentation**
* Auto-generated README table with harness information, container
details, and NGC links.
* Enhanced benchmark documentation with autogenerated content blocks.
* **Tests**
* Added validation tests for task intermediate representations and
mapping checksums.
* Updated existing tests to reflect new task metadata structure.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Alex Gronskiy <[email protected]>
Signed-off-by: Marta Stepniewska-Dziubinska <[email protected]>
Co-authored-by: Marta Stepniewska-Dziubinska <[email protected]>

1 parent f883a3a commit 94d8208
File tree
46 files changed, +26567 -2233 lines changed
- .github
- config
- workflows
- docs
- _extensions
- content_gating
- json_output/content
- _resources
- about
- evaluation
- benchmarks
- catalog
- packages
- nemo-evaluator-launcher
- examples
- scripts
- src/nemo_evaluator_launcher
- api
- cli
- common
- container_metadata
- resources
- tests/unit_tests
- nemo-evaluator
- scripts