Commit f4cf9fa

Update README.md
1 parent 7ac8bde commit f4cf9fa

1 file changed: +4 -6 lines changed
README.md

Lines changed: 4 additions & 6 deletions
@@ -37,18 +37,16 @@ Text-to-Speech Models (TTS):
 
 ## Leaderboard
 
-Snapshot below, click it to jump to the latest spreadsheet.
-[![Screenshot 2024-03-05 at 4 08 20 PM](https://github.com/fixie-ai/ai-benchmarks/assets/1821693/97651011-fc8e-4481-bac9-cba0927aa485)](https://docs.google.com/spreadsheets/d/e/2PACX-1vTPttBIJ676Ke5eKXh8EoOe9XrMZ1kgVh-hvuO-LP41GTNIbsHwx1bcb_SsoB3BTDZLNeMspqLQMXSS/pubhtml?gid=0&single=true)
+See [https://thefastest.ai](thefastest.ai) for the current leaderboard.
 
 ### Test methodology
 
-- Tests are run from a Google Cloud console in us-west1.
-- Input requests are short, typically a single message (~20 tokens), and typically ask for a brief output response.
-- Max output tokens is set to 100, to avoid distortion of TPS values from long outputs.
+- Tests are run from a set of distributed benchmark runners.
+- Input requests are relatively brief, typically about 1000 tokens, and ask for a brief output response.
+- Max output tokens is set to 20, to avoid distortion of TPS values from long outputs.
 - A warmup connection is made to remove any connection setup latency.
 - The TTFT clock starts when the HTTP request is made and stops when the first token result is received in the response stream.
 - For each provider, three separate inferences are done, and the best result is kept (to remove any outliers due to queuing etc).
-- A best result is selected on 3 different days, and the median of these values is displayed.
 
 ## Initial setup
 
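The methodology bullets on the new side of this diff (warmup, TTFT clock, best of three inferences) can be illustrated with a minimal Python sketch. This is not the repository's benchmark code: `measure_ttft`, `best_of`, and `fake_provider` are hypothetical names, the provider is a local stand-in rather than a real HTTP stream, and treating the warmup as one discarded inference is an assumption.

```python
import time
from typing import Callable, Iterable, Iterator


def measure_ttft(start_stream: Callable[[], Iterable[str]],
                 clock: Callable[[], float] = time.perf_counter) -> float:
    """Return seconds from issuing the request to receiving the first token."""
    start = clock()                # clock starts when the request is made
    stream = iter(start_stream())  # hypothetical: opens the streaming response
    next(stream, None)             # blocks until the first token arrives
    return clock() - start         # clock stops at the first token


def best_of(start_stream: Callable[[], Iterable[str]],
            runs: int = 3,
            clock: Callable[[], float] = time.perf_counter) -> float:
    """Warm up once, then keep the lowest TTFT across `runs` inferences."""
    measure_ttft(start_stream, clock)  # warmup (assumed form); result discarded
    return min(measure_ttft(start_stream, clock) for _ in range(runs))


def fake_provider(delay_s: float = 0.01) -> Iterator[str]:
    """Stand-in for a streaming provider: waits, then yields tokens."""
    time.sleep(delay_s)
    yield "Hello"
    yield " world"


if __name__ == "__main__":
    print(f"best TTFT over 3 runs: {best_of(fake_provider):.4f} s")
```

Taking the minimum rather than the mean matches the stated goal of discarding outliers caused by queuing; a real runner would replace `fake_provider` with a function that issues the HTTP request and iterates the response stream.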
