Improve agents-md prompt to force doc retrieval #88997

Merged
gaojude merged 1 commit into canary from gaojude/improve-agents-md-prompt
Jan 25, 2026
gaojude commented Jan 25, 2026

Summary

Updates the instruction in agents-md CLAUDE.md output to force LLMs to actually read the docs instead of relying on stale pre-training knowledge.

Before:

IMPORTANT: Prefer retrieval-led reasoning over pre-training-led reasoning for any Next.js tasks.

After:

STOP. What you remember about Next.js is WRONG for this project. Always search docs and read before any task.

Why This Matters

Through systematic testing with the next-evals-oss eval suite, we discovered that the original "prefer retrieval-led reasoning" instruction was ineffective. Agents would ignore it and use outdated pre-training knowledge (e.g., creating middleware.ts instead of the new Next.js 16 proxy.ts convention).

Testing Methodology

We created an "indirect proxy" eval that tests if agents know proxy.ts is needed for request interception in Next.js 16+, using only the prompt "Log every request to the console" (no mention of proxy/middleware).
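The PR does not show how the eval grades a run, so the following is only a minimal sketch of one plausible check, assuming the grader can see the list of files the agent created. The names `gradeIndirectProxy` and `EvalOutcome` are hypothetical and are not taken from next-evals-oss.

```typescript
// Hypothetical grader for an "indirect proxy" style eval.
// Assumption: a run passes only if the agent created proxy.ts (or proxy.js)
// and did not fall back to the older middleware.ts convention.
type EvalOutcome = { passed: boolean; reason: string };

function gradeIndirectProxy(createdFiles: string[]): EvalOutcome {
  // Compare base names only, so "src/proxy.ts" and "proxy.ts" both count.
  const baseNames = createdFiles.map((path) => path.split("/").pop() ?? path);
  const hasProxy = baseNames.some((name) => /^proxy\.(ts|js)$/.test(name));
  const hasMiddleware = baseNames.some((name) => /^middleware\.(ts|js)$/.test(name));

  if (hasProxy && !hasMiddleware) {
    return { passed: true, reason: "used the proxy convention" };
  }
  if (hasMiddleware) {
    return { passed: false, reason: "used the outdated middleware convention" };
  }
  return { passed: false, reason: "no request interception file created" };
}
```

Under these assumptions, a run that writes only `src/middleware.ts` fails, which mirrors the failure mode described above.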

| Instruction | Pass Rate |
| --- | --- |
| "Prefer retrieval-led reasoning" | 0% |
| "CRITICAL: MUST read docs" | ~20-40% |
| "What you remember is WRONG" | 100% (6/6) |
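For reference, a pass rate like "100% (6/6)" is simply passes divided by total runs. The sketch below assumes each run is recorded as a boolean; the helper name `passRate` is hypothetical. The PR states the run count only for the final row.

```typescript
// Hypothetical helper: percentage of eval runs that passed.
function passRate(results: boolean[]): number {
  if (results.length === 0) {
    throw new Error("pass rate is undefined for zero runs");
  }
  const passes = results.filter(Boolean).length;
  // Multiply before dividing to avoid avoidable floating-point error.
  return (passes * 100) / results.length;
}
```

Six passing runs out of six give `passRate([true, true, true, true, true, true]) === 100`.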

Why It Works

The phrase "What you remember is WRONG" undermines the model's confidence in its pre-training knowledge, which pushes it to actually read the docs instead of answering from memory.

Key elements:

  • "STOP" - Attention grabber, interrupts default behavior
  • "WRONG for this project" - Creates doubt without claiming universal wrongness
  • "Always search docs and read" - Actionable instruction
  • "before any task" - Applies to everything, not just code writing
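To make the four elements concrete, here is a small sketch that assembles the instruction from them. It illustrates the structure only and is not the agents-md implementation; `instructionElements` and `buildInstruction` are hypothetical names.

```typescript
// Hypothetical decomposition of the new instruction into its four elements.
const instructionElements = {
  interrupt: "STOP", // attention grabber
  doubt: "WRONG for this project", // scoped doubt, not universal wrongness
  action: "Always search docs and read", // actionable instruction
  scope: "before any task", // applies to everything, not just code writing
};

function buildInstruction(framework: string): string {
  const { interrupt, doubt, action, scope } = instructionElements;
  return `${interrupt}. What you remember about ${framework} is ${doubt}. ${action} ${scope}.`;
}
```

Calling `buildInstruction("Next.js")` reproduces the "After" text quoted in the summary.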

Full Suite Results

  • The new indirect-proxy eval passed consistently

nextjs-bot commented Jan 25, 2026

Stats from current PR

✅ No significant changes detected

📊 All Metrics
📖 Metrics Glossary

Dev Server Metrics:

  • Listen = TCP port starts accepting connections
  • First Request = HTTP server returns successful response
  • Cold = Fresh build (no cache)
  • Warm = With cached build artifacts

Build Metrics:

  • Fresh = Clean build (no .next directory)
  • Cached = With existing .next directory

Change Thresholds:

  • Time: Changes < 50ms AND < 10%, OR < 2% are insignificant
  • Size: Changes < 1KB AND < 1% are insignificant
  • All other changes are flagged to catch regressions
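The thresholds above can be read as two predicates. The sketch below is one interpretation, not the bot's actual code: it assumes the time rule groups as "(< 50ms AND < 10%) OR < 2%", that percentages are relative to the canary value, and that 1KB means 1024 bytes. All function names are hypothetical.

```typescript
// Assumption: 1KB is treated as 1024 bytes; the glossary does not say which.
const ONE_KB_BYTES = 1024;

// Absolute percentage change relative to the baseline (canary) value.
function percentChange(before: number, after: number): number {
  if (before === 0) return after === 0 ? 0 : Infinity;
  return (Math.abs(after - before) / before) * 100;
}

// Time: insignificant if (< 50ms AND < 10%) OR < 2%.
function isTimeChangeInsignificant(beforeMs: number, afterMs: number): boolean {
  const deltaMs = Math.abs(afterMs - beforeMs);
  const pct = percentChange(beforeMs, afterMs);
  return (deltaMs < 50 && pct < 10) || pct < 2;
}

// Size: insignificant only if < 1KB AND < 1%.
function isSizeChangeInsignificant(beforeBytes: number, afterBytes: number): boolean {
  const deltaBytes = Math.abs(afterBytes - beforeBytes);
  const pct = percentChange(beforeBytes, afterBytes);
  return deltaBytes < ONE_KB_BYTES && pct < 1;
}
```

Under this reading, the Cold (First Request) change from 815ms to 819ms is insignificant, while a 251 B to 254 B size change is flagged because it exceeds 1% even though it is only 3 bytes.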

⚡ Dev Server

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Cold (Listen) | 456ms | 455ms | | █▁▁▁▁ |
| Cold (Ready in log) | 442ms | 442ms | | █▁▂▁▂ |
| Cold (First Request) | 815ms | 819ms | | █▁▃▁▃ |
| Warm (Listen) | 456ms | 456ms | | ▆▁▁▁▁ |
| Warm (Ready in log) | 440ms | 440ms | | ▇▁▁▁▁ |
| Warm (First Request) | 345ms | 341ms | | ▇▁▁▁▁ |

📦 Dev Server (Webpack) (Legacy)

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Cold (Listen) | 455ms | 455ms | | ▁▁▁▁▁ |
| Cold (Ready in log) | 442ms | 441ms | | ▂▇▄▇▂ |
| Cold (First Request) | 1.836s | 1.832s | | ▂▆▂▆▁ |
| Warm (Listen) | 455ms | 457ms | | ▁▁▁▁▁ |
| Warm (Ready in log) | 441ms | 441ms | | ▁▅▁▅▁ |
| Warm (First Request) | 1.829s | 1.840s | | ▂▅▁▅▁ |

⚡ Production Builds

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Fresh Build | 4.394s | 4.397s | | █▁▁▁▁ |
| Cached Build | 4.400s | 4.423s | | █▁▁▁▁ |

📦 Production Builds (Webpack) (Legacy)

| Metric | Canary | PR | Change | Trend |
| --- | --- | --- | --- | --- |
| Fresh Build | 14.271s | 14.280s | | ▃▃▁▂▁ |
| Cached Build | 14.320s | 14.358s | | ▃▂▁▃▁ |
| node_modules Size | 460 MB | 460 MB | | ▁▁▁██ |

📦 Bundle Sizes

⚡ Turbopack

Client

Main Bundles: **432 kB** → **432 kB** ✅ -50 B

82 files with content-based hashes (individual files not comparable between builds)

Server

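Middleware

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| middleware-b..fest.js gzip | 764 B | 759 B | |
| Total | 764 B | 759 B | ✅ -5 B |

Build Details

Build Manifests

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| _buildManifest.js gzip | 450 B | 451 B | |
| Total | 450 B | 451 B | ⚠️ +1 B |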

📦 Webpack

Client

Main Bundles

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| 2086.HASH.js gzip | 169 B | N/A | - |
| 2161-HASH.js gzip | 5.47 kB | N/A | - |
| 2747-HASH.js gzip | 4.53 kB | N/A | - |
| 4322-HASH.js gzip | 52.7 kB | N/A | - |
| ec793fe8-HASH.js gzip | 62.3 kB | N/A | - |
| framework-HASH.js gzip | 59.8 kB | 59.8 kB | |
| main-app-HASH.js gzip | 251 B | 254 B | 🔴 +3 B (+1%) |
| main-HASH.js gzip | 38.7 kB | 39.1 kB | |
| webpack-HASH.js gzip | 1.68 kB | 1.68 kB | |
| 1596.HASH.js gzip | N/A | 169 B | - |
| 2658-HASH.js gzip | N/A | 52.4 kB | - |
| 6349-HASH.js gzip | N/A | 4.52 kB | - |
| 7019-HASH.js gzip | N/A | 5.49 kB | - |
| b17a3386-HASH.js gzip | N/A | 62.3 kB | - |
| Total | 226 kB | 226 kB | ⚠️ +56 B |

Polyfills

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| polyfills-HASH.js gzip | 39.4 kB | 39.4 kB | |
| Total | 39.4 kB | 39.4 kB | |

Pages

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| _app-HASH.js gzip | 194 B | 193 B | |
| _error-HASH.js gzip | 182 B | 182 B | |
| css-HASH.js gzip | 336 B | 335 B | |
| dynamic-HASH.js gzip | 1.8 kB | 1.8 kB | |
| edge-ssr-HASH.js gzip | 256 B | 256 B | |
| head-HASH.js gzip | 352 B | 349 B | |
| hooks-HASH.js gzip | 385 B | 384 B | |
| image-HASH.js gzip | 580 B | 580 B | |
| index-HASH.js gzip | 259 B | 258 B | |
| link-HASH.js gzip | 2.5 kB | 2.51 kB | |
| routerDirect..HASH.js gzip | 319 B | 317 B | |
| script-HASH.js gzip | 385 B | 387 B | |
| withRouter-HASH.js gzip | 316 B | 315 B | |
| 1afbb74e6ecf..834.css gzip | 106 B | 106 B | |
| Total | 7.97 kB | 7.96 kB | ✅ -8 B |

Server

Edge SSR

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| edge-ssr.js gzip | 126 kB | 126 kB | |
| page.js gzip | 244 kB | 240 kB | 🟢 4.71 kB (-2%) |
| Total | 370 kB | 366 kB | ✅ -4.58 kB |

Middleware

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| middleware-b..fest.js gzip | 616 B | 616 B | |
| middleware-r..fest.js gzip | 155 B | 156 B | |
| middleware.js gzip | 33 kB | 33.2 kB | |
| edge-runtime..pack.js gzip | 842 B | 842 B | |
| Total | 34.7 kB | 34.9 kB | ⚠️ +192 B |

Build Details

Build Manifests

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| _buildManifest.js gzip | 736 B | 738 B | |
| Total | 736 B | 738 B | ⚠️ +2 B |

Build Cache

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| 0.pack gzip | 3.71 MB | 3.72 MB | 🔴 +8.23 kB (+0%) |
| index.pack gzip | 102 kB | 102 kB | |
| index.pack.old gzip | 102 kB | 101 kB | 🟢 1.03 kB (-1%) |
| Total | 3.91 MB | 3.92 MB | ⚠️ +7.13 kB |

🔄 Shared (bundler-independent)

Runtimes

| File | Canary | PR | Change |
| --- | --- | --- | --- |
| app-page-exp...dev.js gzip | 306 kB | 306 kB | |
| app-page-exp..prod.js gzip | 163 kB | 163 kB | |
| app-page-tur...dev.js gzip | 306 kB | 306 kB | |
| app-page-tur..prod.js gzip | 163 kB | 163 kB | |
| app-page-tur...dev.js gzip | 302 kB | 302 kB | |
| app-page-tur..prod.js gzip | 161 kB | 161 kB | |
| app-page.run...dev.js gzip | 303 kB | 303 kB | |
| app-page.run..prod.js gzip | 161 kB | 161 kB | |
| app-route-ex...dev.js gzip | 69.4 kB | 69.4 kB | |
| app-route-ex..prod.js gzip | 48.2 kB | 48.2 kB | |
| app-route-tu...dev.js gzip | 69.4 kB | 69.4 kB | |
| app-route-tu..prod.js gzip | 48.2 kB | 48.2 kB | |
| app-route-tu...dev.js gzip | 69 kB | 69 kB | |
| app-route-tu..prod.js gzip | 47.9 kB | 47.9 kB | |
| app-route.ru...dev.js gzip | 68.9 kB | 68.9 kB | |
| app-route.ru..prod.js gzip | 47.9 kB | 47.9 kB | |
| dist_client_...dev.js gzip | 324 B | 324 B | |
| dist_client_...dev.js gzip | 326 B | 326 B | |
| dist_client_...dev.js gzip | 318 B | 318 B | |
| dist_client_...dev.js gzip | 317 B | 317 B | |
| pages-api-tu...dev.js gzip | 42.4 kB | 42.4 kB | |
| pages-api-tu..prod.js gzip | 32.2 kB | 32.2 kB | |
| pages-api.ru...dev.js gzip | 42.4 kB | 42.4 kB | |
| pages-api.ru..prod.js gzip | 32.2 kB | 32.2 kB | |
| pages-turbo....dev.js gzip | 51.7 kB | 51.7 kB | |
| pages-turbo...prod.js gzip | 38.8 kB | 38.8 kB | |
| pages.runtim...dev.js gzip | 51.7 kB | 51.7 kB | |
| pages.runtim..prod.js gzip | 38.8 kB | 38.8 kB | |
| server.runti..prod.js gzip | 62.4 kB | 62.4 kB | |
| Total | 2.73 MB | 2.73 MB | ✅ -3 B |

gaojude requested a review from timneutkens January 25, 2026 02:27

gaojude force-pushed the gaojude/improve-agents-md-prompt branch from 1883267 to 748727c January 25, 2026 04:21

The previous instruction "Prefer retrieval-led reasoning" was too polite
and agents would ignore it, defaulting to stale pre-training knowledge.

The new instruction creates doubt by stating the agent's memory is WRONG,
which psychologically forces the agent to actually read the docs.

gaojude force-pushed the gaojude/improve-agents-md-prompt branch from 748727c to aaa4f9e January 25, 2026 04:23

gaojude merged commit 39ce012 into canary Jan 25, 2026
286 of 291 checks passed

gaojude deleted the gaojude/improve-agents-md-prompt branch January 25, 2026 11:22
3 participants