These vectors were captured from the official DeepSeek V4 Flash API using
deepseek-v4-flash, greedy decoding, thinking disabled, and
top_logprobs=20. The hosted API does not expose full logits, so these files
store the best logprob slice the API provides.
Files:
prompts/*.txt: exact user prompts.official/*.official.json: official API continuations and top-logprobs.official.vec: compact C-test fixture generated from the official JSON.
Regenerate official vectors:
DEEPSEEK_API_KEY=... ./tests/test-vectors/fetch_official_vectors.pyRunning the fetcher without --only also regenerates official.vec.
The C runner consumes official.vec directly:
./ds4_test --logprob-vectorsofficial.vec is intentionally trivial to parse from C: each case points to a
prompt file and each expected token is hex-encoded by bytes. The official JSON
files remain in the tree so the compact fixture can be audited against the raw
API response.
To inspect a local top-logprob dump manually:
./ds4 --metal --nothink -sys "" --temp 0 -n 4 --ctx 16384 \
--prompt-file tests/test-vectors/prompts/long_code_audit.txt \
--dump-logprobs /tmp/long_code_audit.ds4.json \
--logprobs-top-k 20