Save dataset locally if post summary/question generation upload fails (otherwise inference calls wasted) #66

@samefarrar

I'm not sure what caused this, but after the summarization stage finished, the dataset failed to upload, and I lost all the summaries that the inference calls had generated.

If the upload fails, yourbench should probably save the dataset locally so the summaries are not lost; otherwise many inference calls are wasted.

100%|███████████████████████████████████████████████████████████████| 2091/2091 [49:22<00:00,  1.42s/it]
2025-04-15 11:37:13.642 | SUCCESS  | yourbench.utils.inference_engine:_run_inference_async_helper:190 - Completed parallel inference for all models.
2025-04-15 11:37:13.768 | INFO     | yourbench.utils.dataset_engine:custom_save_dataset:97 - Pushing dataset to HuggingFace Hub with repo_id='samefarrar/conversation_bench'
Creating parquet from Arrow format: 100%|█████████████████████████████████| 3/3 [00:00<00:00,  5.75ba/s]
Uploading the dataset shards:   0%|                                               | 0/1 [00:27<?, ?it/s]
2025-04-15 11:37:41.873 | ERROR    | yourbench.pipeline.handler:run_pipeline:126 - Error executing pipeline stage 'summarization': 500 Server Error: Internal Server Error for url: https://huggingface.co/api/datasets/samefarrar/conversation_bench/preupload/main

Internal Error - We're working hard to fix this as soon as possible!
2025-04-15 11:37:41.873 | ERROR    | yourbench.main:run:82 - Pipeline failed: 500 Server Error: Internal Server Error for url: https://huggingface.co/api/datasets/samefarrar/conversation_bench/preupload/main
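
The fallback suggested above could look roughly like this. It is only a minimal sketch, not yourbench's actual code: the helper name `push_with_local_fallback` and the `fallback_dir` parameter are hypothetical, and it assumes the dataset object exposes `push_to_hub` and `save_to_disk` the way a Hugging Face `datasets.Dataset` does.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def push_with_local_fallback(dataset, repo_id, fallback_dir="failed_uploads"):
    """Try to push `dataset` to the Hub; on failure, save it locally and re-raise.

    Hypothetical helper, not part of yourbench. `dataset` is assumed to provide
    `push_to_hub(repo_id)` and `save_to_disk(path)`, as `datasets.Dataset` does.
    """
    try:
        dataset.push_to_hub(repo_id)
    except Exception:
        # Persist the inference results before propagating the upload error,
        # so a transient Hub failure (such as the 500 above) does not discard them.
        local_path = Path(fallback_dir) / repo_id.replace("/", "__")
        local_path.parent.mkdir(parents=True, exist_ok=True)
        dataset.save_to_disk(str(local_path))
        logger.error("Upload of %s failed; dataset saved to %s", repo_id, local_path)
        raise
```

The error is re-raised on purpose so the pipeline still reports the failure. The saved copy could later be reloaded with `datasets.load_from_disk` and pushed again without re-running inference.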
