1 file changed: +4 -6 lines changed
@@ -37,18 +37,16 @@ Text-to-Speech Models (TTS):
## Leaderboard
- Snapshot below, click it to jump to the latest spreadsheet.
- [![Screenshot 2024-03-05 at 4 08 20 PM](https://github.com/fixie-ai/ai-benchmarks/assets/1821693/97651011-fc8e-4481-bac9-cba0927aa485)](https://docs.google.com/spreadsheets/d/e/2PACX-1vTPttBIJ676Ke5eKXh8EoOe9XrMZ1kgVh-hvuO-LP41GTNIbsHwx1bcb_SsoB3BTDZLNeMspqLQMXSS/pubhtml?gid=0&single=true)
+ See [https://thefastest.ai](thefastest.ai) for the current leaderboard.
### Test methodology
- - Tests are run from a Google Cloud console in us-west1.
- - Input requests are short, typically a single message (~20 tokens), and typically ask for a brief output response.
- - Max output tokens is set to 100, to avoid distortion of TPS values from long outputs.
+ - Tests are run from a set of distributed benchmark runners.
+ - Input requests are relatively brief, typically about 1000 tokens, and ask for a brief output response.
+ - Max output tokens is set to 20, to avoid distortion of TPS values from long outputs.
- A warmup connection is made to remove any connection setup latency.
- The TTFT clock starts when the HTTP request is made and stops when the first token result is received in the response stream.
- For each provider, three separate inferences are done, and the best result is kept (to remove any outliers due to queuing etc).
- - A best result is selected on 3 different days, and the median of these values is displayed.
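
The timing rules in the methodology list (warmup first, TTFT measured from request to first streamed token, best of three inferences kept) can be sketched in Python. This is a minimal illustration under stated assumptions, not the benchmark's actual code: `measure_ttft`, `best_of_runs`, `open_stream`, and `warmup` are hypothetical names, and `open_stream` stands in for whatever client call issues the HTTP request and returns an iterable of streamed tokens.

```python
import time
from typing import Callable, Iterable, Optional


def measure_ttft(open_stream: Callable[[], Iterable[str]],
                 clock: Callable[[], float] = time.perf_counter) -> float:
    """Return seconds from issuing the request to receiving the first token.

    `open_stream` is a hypothetical callable that makes the request and
    returns an iterable of response tokens.
    """
    start = clock()                  # clock starts when the request is made
    stream = iter(open_stream())
    try:
        next(stream)                 # blocks until the first token arrives
    except StopIteration:
        raise ValueError("stream produced no tokens") from None
    return clock() - start           # clock stops at the first token


def best_of_runs(open_stream: Callable[[], Iterable[str]],
                 warmup: Optional[Callable[[], None]] = None,
                 runs: int = 3,
                 clock: Callable[[], float] = time.perf_counter) -> float:
    """Run `runs` separate inferences and keep the lowest TTFT."""
    if warmup is not None:
        warmup()                     # absorb connection setup latency up front
    return min(measure_ttft(open_stream, clock) for _ in range(runs))
```

Taking the minimum rather than the mean is what discards slow outliers caused by queuing. The `clock` parameter exists only so the timing logic can be exercised with a fake clock instead of real network calls.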
## Initial setup