Unable to specify **exact** input token length

Setting `"input_token_distribution": ["uniform", "x", "x+1"]` does not produce inputs of exactly x tokens, which is the length we expect.

<img width="731" alt="Screenshot 2025-01-10 at 1 04 46 PM" src="https://github.com/user-attachments/assets/9ee077d6-4769-4ad1-a1d7-57cd33ce6741" />


The function in the screenshot above returns the correct number, so the issue is most likely that the tokenizer cannot produce **exact** lengths.
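One plausible mechanism, offered as an assumption and not confirmed against this project's code: if a prompt is built by picking N token IDs and decoding them to text, re-tokenizing that text does not necessarily give N tokens back, because adjacent pieces can merge into a single longer token. The sketch below uses a made-up three-entry vocabulary and a greedy longest-match encoder (both hypothetical, not the real tokenizer) to show the effect.

```python
# Toy vocabulary, hypothetical and for illustration only.
VOCAB = ["a", "b", "ab"]


def decode(token_ids):
    """Concatenate the text of each token ID."""
    return "".join(VOCAB[i] for i in token_ids)


def encode(text):
    """Greedy longest-match tokenization over VOCAB."""
    pieces = sorted(VOCAB, key=len, reverse=True)  # try longest pieces first
    ids = []
    pos = 0
    while pos < len(text):
        for piece in pieces:
            if text.startswith(piece, pos):
                ids.append(VOCAB.index(piece))
                pos += len(piece)
                break
        else:
            raise ValueError(f"cannot tokenize {text[pos:]!r}")
    return ids


requested = [0, 1, 0, 1]    # 4 tokens requested: "a", "b", "a", "b"
prompt = decode(requested)  # the text that would be sent as the prompt
actual = encode(prompt)     # what a re-tokenization of that text yields

print(f"requested {len(requested)} tokens, re-encoded to {len(actual)}")
```

If this is the cause, a check that counts the sampled IDs would report the requested length while the served prompt is shorter, which would match the behaviour described above.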

Unable to specify exact input token length #82
