Context Rot: How Increasing Input Tokens Impacts LLM Performance

This repository contains the toolkit for replicating results from our technical report.

Motivation

Large Language Models (LLMs) are typically presumed to process context uniformly: the model should handle the 10,000th token just as reliably as the 100th. In practice, this assumption does not hold. We observe that model performance varies significantly as input length changes, even on simple tasks.

[Figure: Latest Models on Repeated Words Task]

Experiments

Our experiments are organized under the experiments/ folder:

1. NIAH Extension (`experiments/niah_extension/`)

Extends the Needle in a Haystack (NIAH) task to examine needles that match the question semantically rather than through direct lexical overlap, as well as the effects of introducing variations to the haystack content.
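As a rough illustration of the setup (not the code in `experiments/niah_extension/`), the sketch below places a needle sentence at a chosen depth in a haystack and measures word overlap between a question and the needle. The function names `insert_needle` and `lexical_overlap`, the sentence-splitting rule, and the sample texts are all assumptions made for this example.

```python
import re


def insert_needle(haystack: str, needle: str, depth: float) -> str:
    """Insert `needle` at a sentence boundary near `depth` (0.0 = start, 1.0 = end).

    Hypothetical helper: sentences are split naively on periods.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    sentences = [s.strip() for s in haystack.split(".") if s.strip()]
    index = round(depth * len(sentences))
    sentences.insert(index, needle.rstrip("."))
    return ". ".join(sentences) + "."


def lexical_overlap(question: str, needle: str) -> float:
    """Jaccard overlap of lowercase word sets; low values suggest a semantic-only match."""
    q = set(re.findall(r"[a-z0-9]+", question.lower()))
    n = set(re.findall(r"[a-z0-9]+", needle.lower()))
    if not q or not n:
        return 0.0
    return len(q & n) / len(q | n)


# Illustrative data only (not taken from the datasets in this repository).
haystack = "The sky was grey. The train was late. Coffee helped. The meeting ran long."
needle = "The best writing advice I got was to write every week."
prompt = insert_needle(haystack, needle, depth=0.5)
```

A needle with low `lexical_overlap` against the question can only be found by meaning, which is the condition this experiment is interested in.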

2. LongMemEval (`experiments/longmemeval/`)

Evaluates models on the LongMemEval task.

3. Repeated Words (`experiments/repeated_words/`)

Tests model performance on replicating a sequence of repeated words.
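A minimal sketch of how such a task could be built and scored, assuming a sequence of one repeated word with a single differing word at a chosen position. The helper names, the prompt wording, and the position-by-position metric are assumptions for illustration, not the implementation in `experiments/repeated_words/`.

```python
def build_sequence(common: str, unique: str, length: int, unique_index: int) -> list[str]:
    """Return `length` copies of `common` with `unique` placed at `unique_index`."""
    if not 0 <= unique_index < length:
        raise ValueError("unique_index must be within the sequence")
    words = [common] * length
    words[unique_index] = unique
    return words


def build_prompt(words: list[str]) -> str:
    """Hypothetical prompt wording asking the model to copy the text."""
    return "Replicate the following text exactly:\n" + " ".join(words)


def positional_accuracy(expected: list[str], output: str) -> float:
    """Fraction of positions where the model output matches the expected word.

    Dividing by the longer of the two lengths penalizes both truncated
    and over-long outputs.
    """
    got = output.split()
    total = max(len(expected), len(got))
    if total == 0:
        return 1.0
    matches = sum(1 for e, g in zip(expected, got) if e == g)
    return matches / total


words = build_sequence("apple", "apples", length=5, unique_index=2)
prompt = build_prompt(words)
```

Here `prompt` would be sent to a model and the response passed to `positional_accuracy(words, response)`. Increasing `length` grows the input while the task itself stays equally simple.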

Each experiment folder contains detailed instructions in its own README.md file.

Data

Datasets can be downloaded here.

Quick Start

1. Clone the repository.

2. Create and activate a virtual environment:

       python -m venv venv
       source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install dependencies:

       pip install -r requirements.txt

4. Set up environment variables:
   - OpenAI: `OPENAI_API_KEY`
   - Anthropic: `ANTHROPIC_API_KEY`
   - Google: `GOOGLE_APPLICATION_CREDENTIALS` and `GOOGLE_MODEL_PATH`

5. Navigate to the specific experiment folder and follow its README instructions.
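Before running an experiment, it can help to confirm that the variables listed above are set. The following is an optional sketch, not a script shipped with this repository; the `missing_env_vars` helper is hypothetical, and only the variable names come from the list above.

```python
import os

# Variable names as listed in the Quick Start; grouping by provider is for reporting only.
REQUIRED_ENV_VARS = {
    "OpenAI": ["OPENAI_API_KEY"],
    "Anthropic": ["ANTHROPIC_API_KEY"],
    "Google": ["GOOGLE_APPLICATION_CREDENTIALS", "GOOGLE_MODEL_PATH"],
}


def missing_env_vars(env=None):
    """Return {provider: [unset variable names]} for providers with gaps."""
    env = os.environ if env is None else env
    missing = {}
    for provider, names in REQUIRED_ENV_VARS.items():
        absent = [name for name in names if not env.get(name)]
        if absent:
            missing[provider] = absent
    return missing


if __name__ == "__main__":
    for provider, names in missing_env_vars().items():
        print(f"{provider}: missing {', '.join(names)}")
```

You likely only need the variables for the providers whose models you plan to run, so a missing entry for an unused provider can be ignored.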

Citation

If you find this work useful, please cite our technical report:

@techreport{hong2025context,
  title = {Context Rot: How Increasing Input Tokens Impacts LLM Performance},
  author = {Hong, Kelly and Troynikov, Anton and Huber, Jeff},
  year = {2025},
  month = {July},
  institution = {Chroma},
  url = {https://research.trychroma.com/context-rot},
}
