patrickfleith

Hi, this PR implements dataset card generation (#113) and uploads the generated card to the Hub.

Example result: https://huggingface.co/datasets/patrickfleith/yourbench_example

Let me know if I should change something 🤗

Notable Changes

  • Template Creation: Added a Markdown template file, yourbench_card_template.md, to standardize dataset cards
  • Configuration Options: Implemented an optional upload_card flag in the YAML configurations (added to both advanced_example.yaml and simple_example.yaml)
  • Integration: Modified handler.py and dataset_engine.py to automatically generate and upload dataset cards upon pipeline completion
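
As a rough illustration of how the upload_card flag and the template fit together, here is a minimal sketch. It is not the code from this PR: the function name build_card, the template placeholders, the position of upload_card at the top level of the config, and its default of True are all assumptions made for the example.

```python
from string import Template
from typing import Optional

# Hypothetical stand-in for yourbench_card_template.md. The real template's
# placeholders are not shown in this PR description.
CARD_TEMPLATE = Template(
    "# $pretty_name\n\n"
    "This dataset was generated with YourBench.\n\n"
    "## Pipeline configuration\n\n"
    "$config_block\n"
)


def build_card(config: dict, pretty_name: str, config_text: str) -> Optional[str]:
    """Return the rendered card text, or None when card upload is disabled."""
    # Assumption: the flag lives at the top level of the config and defaults to True.
    if not config.get("upload_card", True):
        return None
    return CARD_TEMPLATE.substitute(pretty_name=pretty_name, config_block=config_text)
```

Returning None when the flag is off lets the caller skip the upload step without a separate check.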

Implementation Details

  1. Metadata Management:

    • Added extract_readme_metadata and extract_dataset_info functions to preserve dataset metadata generated during pipeline steps
    • This ensures all automatically generated dataset information is properly included in the README
  2. Core Functions:

    • _serialize_config_for_card: Serializes pipeline configuration to YAML for inclusion in the dataset card
    • _get_pipeline_subset_info: Maps each active pipeline stage to predefined descriptions
    • _generate_and_upload_dataset_card and upload_dataset_card: Handle card construction and uploading
  3. Badge Design:

    • Created a YourBench badge based on the axolotl recommendation from the original issue
    • Provided two versions:
      • SVG format for future modifications
      • PNG format (200x32) for README display
    • Note: The badge references https://raw.githubusercontent.com/huggingface/yourbench/main/docs/assets/yourbench-badge-web.png and will only render properly after this PR is merged
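
To make the metadata-preservation idea concrete, here is a minimal sketch of keeping the YAML front matter of an existing README while replacing the card body. It assumes the metadata sits in a block delimited by `---` lines at the top of README.md. The helper names split_front_matter and merge_card are hypothetical and are not the extract_readme_metadata or extract_dataset_info functions from this PR.

```python
from typing import Tuple


def split_front_matter(readme: str) -> Tuple[str, str]:
    """Split README text into (front_matter, body).

    front_matter is the raw YAML between the leading '---' lines, without
    the delimiters. It is '' when the README has no front matter.
    """
    lines = readme.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", readme
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            front = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return front, body
    # No closing delimiter found: treat the whole text as body.
    return "", readme


def merge_card(existing_readme: str, new_body: str) -> str:
    """Keep the existing front matter and replace the body with new_body."""
    front, _ = split_front_matter(existing_readme)
    if not front:
        return new_body
    return f"---\n{front}\n---\n\n{new_body}"
```

Treating the front matter as opaque text avoids re-serializing it, so whatever the pipeline steps wrote there is carried over unchanged.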

@patrickfleith
Author

Hi @sumukshashidhar @alozowski, do you need some help with the PR? I appreciate that it looks overly complex for what we are trying to achieve, but it was much trickier than I initially thought.

@sumukshashidhar
Collaborator

Hi @patrickfleith! Could you please resolve the merge conflict with the dataset engine? If you're unable to, I'd be super happy to do it!

@sumukshashidhar
Collaborator

Hi @patrickfleith! I did a preliminary merge! I'll test the functionality later today!

@sumukshashidhar
Collaborator

@patrickfleith I don't seem to have access to your repo, so just running make style and make quality should make this mergeable.

Thank you so much for your work!

@patrickfleith
Author

> @patrickfleith I don't seem to have access to your repo, so just running make style and make quality should make this mergeable.
>
> Thank you so much for your work!

Hey, thanks for the preliminary merge. I believe you also resolved the conflict during this merge, right? Because I don't see the conflict anymore. I just pushed a new commit with make style and make quality. I tested it with simple_example and it seems to work!

@sumukshashidhar
Collaborator

Thank you so much! I'll merge this into main!

@sumukshashidhar self-requested a review June 27, 2025 15:00
@sumukshashidhar merged commit 78f7d2a into huggingface:main Jun 27, 2025
6 checks passed