This repository contains the data used in the experiments:
- The file `holdout_dataset.zip` includes three `.jsonl` files, one for each target type: `binary_type`, `float_type`, and `reverse_binary_type`.
- The file `train_data.zip` contains six `.jsonl` files: training (`train`) and validation (`val`) splits for each of the three target types.
- The `validation_subset` directory contains `.toml` files, one per sample.
- The `slurm_scripts_examples` directory contains example `.sh` scripts for:
  - Full fine-tuning,
  - Fine-tuning with LoRA,
  - Inference.
- The `prompts` directory includes:
  - A prompt template used to generate test-breaking inputs,
  - Descriptions of errors that were synthetically generated using an LLM.
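
The `.jsonl` files can be read straight from the archives without unpacking them. The snippet below is a minimal sketch, not part of the repository: `read_jsonl_archive` is a hypothetical helper name, and it assumes the files are UTF-8 encoded with one JSON object per line. The member names inside the archives and the fields of each record are not described here, so the code makes no assumptions about them.

```python
import io
import json
import zipfile
from typing import Dict, List


def read_jsonl_archive(archive) -> Dict[str, List[dict]]:
    """Return {member name: parsed records} for every .jsonl file in a zip archive.

    `archive` may be a filesystem path or a binary file-like object.
    Hypothetical helper; assumes UTF-8 text with one JSON object per line.
    """
    records: Dict[str, List[dict]] = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.endswith(".jsonl"):
                continue  # skip directories and any non-JSONL members
            with zf.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8")
                # Parse each non-empty line as a separate JSON document.
                records[name] = [json.loads(line) for line in text if line.strip()]
    return records
```

For example, `read_jsonl_archive("holdout_dataset.zip")` should return one entry per target type, and `read_jsonl_archive("train_data.zip")` one entry per split and target type, provided the archives are laid out as described above.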