Write a lightweight script that, given a Hugging Face dataset (for example https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M or https://huggingface.co/datasets/GeneralReasoning/GeneralThought-430K), a baseline model, and a reward function, tags each question with the reward the baseline model's answer receives, so that the questions can then be filtered by that reward.
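
One possible shape for such a script is sketched below; it is a starting point under stated assumptions, not a definitive implementation. The column names `question` and `reference_answer`, the helper names (`exact_match_reward`, `tag_rows`, `filter_rows`, `run_on_hub`), and the substring-match reward are all illustrative assumptions and would need to be adapted to the actual dataset schema and reward function. The core logic takes the model as a plain `generate` callable, so it can be tried without downloading anything.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Row = Dict[str, object]


def exact_match_reward(completion: str, reference: str) -> float:
    """Toy reward (an assumption): 1.0 if the reference appears in the completion."""
    ref = reference.strip().lower()
    return 1.0 if ref and ref in completion.lower() else 0.0


def tag_rows(
    rows: Iterable[Row],
    generate: Callable[[str], str],
    reward_fn: Callable[[str, str], float],
    question_col: str = "question",        # assumed column name
    answer_col: str = "reference_answer",  # assumed column name
) -> Iterator[Row]:
    """Yield each row with an added 'reward' field from the baseline model's answer."""
    for row in rows:
        question = str(row[question_col])
        reference = str(row.get(answer_col) or "")
        completion = generate(question)
        yield {**row, "reward": float(reward_fn(completion, reference))}


def filter_rows(
    rows: Iterable[Row],
    min_reward: Optional[float] = None,
    max_reward: Optional[float] = None,
) -> Iterator[Row]:
    """Keep only rows whose reward lies inside the optional [min, max] bounds."""
    for row in rows:
        reward = float(row["reward"])
        if min_reward is not None and reward < min_reward:
            continue
        if max_reward is not None and reward > max_reward:
            continue
        yield row


def run_on_hub(dataset_name: str, model_name: str, split: str = "train",
               limit: int = 100) -> List[Row]:
    """Tag the first `limit` rows of a Hub dataset using a text-generation pipeline."""
    # Imported lazily so the pure functions above work without these packages.
    from datasets import load_dataset
    from transformers import pipeline

    # Streaming avoids downloading the full dataset before tagging a sample.
    dataset = load_dataset(dataset_name, split=split, streaming=True)
    generator = pipeline("text-generation", model=model_name)

    def generate(question: str) -> str:
        outputs = generator(question, max_new_tokens=256, return_full_text=False)
        return outputs[0]["generated_text"]

    return list(tag_rows(dataset.take(limit), generate, exact_match_reward))
```

For example, `run_on_hub("GeneralReasoning/GeneralThought-430K", "<baseline-model-id>", limit=50)` would tag 50 streamed rows, and `filter_rows(tagged, max_reward=0.5)` would keep the questions the baseline model gets wrong. Keeping the model behind a `generate` callable means the same tagging loop can be reused with a different inference backend.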