Feature Request: Add --no-warmup to llama-bench

### Prerequisites

- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the [README.md](https://github.com/ggml-org/llama.cpp/blob/master/README.md).
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the [Discussions](https://github.com/ggml-org/llama.cpp/discussions), and have a new and useful enhancement to share.

### Feature Description

**Proposed Enhancement**

Add a new command-line flag **--no-warmup** to disable the internal warm-up in llama-bench.

When the flag is set, llama-bench should skip the prompt and generation warm-up runs and go straight to the timed trials.

This would eliminate redundant work in automated benchmarking pipelines and save time for users running long optimization loops.

### Motivation

Currently, llama-bench automatically performs an internal warm-up before each benchmark test case.

However, in multi-stage tuning workflows like **[llama-optimus](https://pypi.org/project/llama-optimus/)**, this leads to redundant warm-ups: when searching for the best llama.cpp flags, a tool such as llama-optimus can already perform a heavy warm-up phase at the start.

In optimization loops that call llama-bench many times, repeating the warm-up for every benchmark trial wastes time and resources.

It would also make it possible to skip the warm-up during debugging and development testing.

### Possible Implementation

llama.cpp already supports a --no-warmup option, but only in llama-cli and llama-server. The flag was introduced in [PR #8712](https://github.com/ggml-org/llama.cpp/pull/8712) to bypass the internal llama_decode warm-up call during CLI/server invocation.

The codebase therefore already has the argument parsing and internal logic to disable warm-up in related components, which llama-bench could mirror.
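To make the request concrete, here is a minimal, self-contained sketch of how such a flag could gate the warm-up run. It is not llama-bench's actual code: `bench_options`, `parse_args`, and `run_test_case` are hypothetical names made up for illustration, and the repetition count default is an assumption. The callback stands in for whatever prompt/generation run the real tool performs.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical options struct for this sketch (not llama-bench's real parameter struct).
struct bench_options {
    bool no_warmup   = false; // set by the proposed --no-warmup flag
    int  repetitions = 5;     // number of timed runs (assumed default)
};

// Parse only the two flags relevant to this sketch; other arguments are ignored.
bench_options parse_args(const std::vector<std::string> & args) {
    bench_options opts;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "--no-warmup") {
            opts.no_warmup = true;
        } else if ((args[i] == "-r" || args[i] == "--repetitions") && i + 1 < args.size()) {
            opts.repetitions = std::stoi(args[++i]);
        }
    }
    return opts;
}

// Run one test case: an optional untimed warm-up run, then the timed repetitions.
// Returns the total number of times run_once was invoked.
int run_test_case(const bench_options & opts, const std::function<void()> & run_once) {
    int calls = 0;
    if (!opts.no_warmup) {
        run_once(); // warm-up run, result discarded
        ++calls;
    }
    for (int i = 0; i < opts.repetitions; ++i) {
        run_once(); // timed run (timing omitted in this sketch)
        ++calls;
    }
    return calls;
}
```

With `--no-warmup -r 3`, `run_test_case` invokes the callback three times; without the flag and with the assumed default of five repetitions, it invokes it six times (one warm-up plus five timed runs). Keeping warm-up enabled by default would preserve current behavior for existing users.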

Feature Request: Add --no-warmup to llama-bench #14224

Prerequisites

Feature Description

Motivation

Possible Implementation

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Feature Request: Add --no-warmup to llama-bench #14224

Description

Prerequisites

Feature Description

Motivation

Possible Implementation

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions