feat(clis,filler): Add trace types to allow trace analysis and gas optimizations #1979

marioevz · 2025-07-31T21:20:54Z

🗒️ Description

Introduces a new flag to the test filler that allows re-running the specified tests with lower gas limits to find the lowest value which still produces the same test output, result, and the exact same traces.

- Trace Types (`efdf63a`)

Introduces trace types that use pydantic to parse the traces from the transition tool.

The types accept the formats used in evmone and execution-specs, and account for their slight differences.

They also implement "equivalence" check methods that compare two different traces:

- The trace lines are compared and are always expected to be equal.
- For each trace line, values including the program counter, the gas cost, the memory size, the stack, depth, refund, opcode name and errors are compared and expected to be exactly equal.
- The remaining gas value is ignored, because it depends on the gas limit: changing the gas limit even slightly would otherwise result in two traces that are impossible to compare.
- A "post-processing" mode can be enabled in which the top stack element is ignored if the previous opcode is `Op.GAS`. This allows more flexibility, because most tests only use `Op.GAS` to pass its result to an `Op.CALL*`.

Downsides:

If the gas value pushed onto the stack by `Op.GAS` is not consumed by the very next operation, the value propagates and the post-processing is ineffective. This should not be a big issue, since most tests that use the gas opcode consume the value immediately in the following `CALL*` opcode.
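
The equivalence check and the post-processing mode described above can be sketched roughly as follows. This is only an illustration, not the types added in this PR: `TraceLine`, `comparable` and `traces_equivalent` are hypothetical names, plain dataclasses stand in for the pydantic models, and the stack is assumed to be a list whose last element is the top.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TraceLine:
    """One step of an EVM trace (hypothetical field names)."""

    pc: int
    op_name: str
    gas: int  # remaining gas: deliberately excluded from the comparison
    gas_cost: int
    mem_size: int
    stack: List[int]  # assumption: last element is the top of the stack
    depth: int
    refund: int
    error: Optional[str] = None

    def comparable(self, ignore_stack_top: bool = False) -> Tuple:
        """Return the fields that must match, leaving out the remaining gas."""
        stack = self.stack
        if ignore_stack_top and stack:
            stack = stack[:-1]
        return (
            self.pc,
            self.op_name,
            self.gas_cost,
            self.mem_size,
            tuple(stack),
            self.depth,
            self.refund,
            self.error,
        )


def traces_equivalent(
    a: List[TraceLine], b: List[TraceLine], post_processing: bool = False
) -> bool:
    """Compare two traces line by line, ignoring the remaining gas."""
    if len(a) != len(b):
        return False
    previous_op = None
    for line_a, line_b in zip(a, b):
        # In post-processing mode, the value pushed by GAS is ignored on the
        # line that immediately follows the GAS opcode.
        ignore_top = post_processing and previous_op == "GAS"
        if line_a.comparable(ignore_top) != line_b.comparable(ignore_top):
            return False
        previous_op = line_a.op_name
    return True
```

In this sketch the value pushed by `GAS` is only ignored on the line right after it, so a value that stays on the stack for longer still makes the traces differ, which matches the downside noted above.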

- Gas optimization in state tests (`7b1d0bd`)

Implements a procedure to find the minimum gas-limit value where the test still works and yields the same result and traces.

The algorithm works as follows:

1. Execute the test with the gas limit currently specified in the test.
2. Reduce the gas limit by one and re-execute the test: if the runs are still equivalent, proceed; otherwise abort and deem it impossible to lower the test's gas limit.
3. Perform a binary search by lowering the gas limit by half and re-executing the test: if the test fails, set the new minimum to the last gas limit value plus 1; if the test passes, set the new maximum to the last gas limit value.
4. Iterate until the difference between the minimum and maximum is zero, or the minimum is higher than 16,777,216.
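
A minimal sketch of this search, under a few assumptions that the description does not spell out: `find_minimum_gas_limit` and `is_equivalent` are hypothetical names, `is_equivalent(gas_limit)` stands for "re-run the test with this gas limit and compare result and traces against the original run", the search starts with a minimum of 0, and the lowest passing value found so far is returned if the search stops early.

```python
from typing import Callable, Optional

# Stop threshold taken from the description above (16,777,216 == 2**24).
GIVE_UP_ABOVE = 16_777_216


def find_minimum_gas_limit(
    original_gas_limit: int,
    is_equivalent: Callable[[int], bool],
    give_up_above: int = GIVE_UP_ABOVE,
) -> Optional[int]:
    """Return the lowest gas limit found to be equivalent, or None."""
    # Step 2: if lowering the limit by one already changes the run,
    # the test is considered impossible to optimize.
    if not is_equivalent(original_gas_limit - 1):
        return None

    # Assumption: start from 0; `maximum` is always a known-good value.
    minimum, maximum = 0, original_gas_limit - 1

    # Steps 3 and 4: halve the interval until it is empty, or until the
    # minimum exceeds the threshold.
    while maximum - minimum > 0 and minimum <= give_up_above:
        candidate = (minimum + maximum) // 2
        if is_equivalent(candidate):
            maximum = candidate
        else:
            minimum = candidate + 1
    return maximum
```

For example, `find_minimum_gas_limit(100_000, lambda gas: gas >= 21_000)` converges to `21_000`. The search relies on the check being monotonic, that is, on every gas limit above the true minimum also producing an equivalent run.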

- Fill command's flags to enable and configure optimize-gas feature (`d178bdb`)

The flags `--optimize-gas`, `--optimize-gas-output` and `--optimize-gas-post-processing` are added to the fill command to enable and configure this new feature.

- modify_static_test_gas_limits.py command (`dc4c5fc`)

A simple command that takes the output of fill's optimize-gas mode and applies it to static tests.

The command has a dry-run mode, and it aborts if the static test contains more than one gas value so that it does not overwrite the wrong value.
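
The behaviour described above could look roughly like the sketch below. It is not the code of `modify_static_test_gas_limits.py`: `apply_gas_limit` is a hypothetical helper, and the filler layout (a `transaction` object whose `gasLimit` is a list of strings) is an assumption about the static test format.

```python
import copy
from typing import Any, Dict


def apply_gas_limit(
    filler: Dict[str, Any], new_gas_limit: int, dry_run: bool = False
) -> Dict[str, Any]:
    """Return a copy of the filler with its single gas limit replaced."""
    # Assumption: gas limits live in a list under transaction.gasLimit.
    gas_limits = filler["transaction"]["gasLimit"]

    # With several gas values it is unclear which one the optimized value
    # belongs to, so abort instead of overwriting the wrong one.
    if len(gas_limits) != 1:
        raise ValueError(
            f"expected exactly one gas limit, found {len(gas_limits)}"
        )

    if dry_run:
        # Only report what would change; leave the filler untouched.
        print(f"would change gasLimit {gas_limits[0]} -> {new_gas_limit}")
        return filler

    updated = copy.deepcopy(filler)
    updated["transaction"]["gasLimit"] = [str(new_gas_limit)]
    return updated
```

Returning a modified copy rather than mutating the input keeps the dry-run and real paths easy to compare in this sketch.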

🔗 Related Issues or PRs

N/A.

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx --with=tox-uv tox -e lint,typecheck,spellcheck,markdownlint
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered adding an entry to CHANGELOG.md.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

src/ethereum_test_specs/state.py

src/cli/modify_static_test_gas_limits.py

src/ethereum_clis/types.py

felix314159 · 2025-08-06T09:23:52Z

docs/CHANGELOG.md

@@ -78,6 +78,7 @@ Users can select any of the artifacts depending on their testing needs for their
 - 🔀 Disabled writing debugging information to the EVM "dump directory" to improve performance. To obtain debug output, the `--evm-dump-dir` flag must now be explicitly set. As a consequence, the now redundant `--skip-evm-dump` option was removed ([#1874](https://github.com/ethereum/execution-spec-tests/pull/1874)).
 - ✨ Generate unique addresses with Python for compatible static tests, instead of using hard-coded addresses from legacy static test fillers ([#1781](https://github.com/ethereum/execution-spec-tests/pull/1781)).
 - ✨ Added support for the `--benchmark-gas-values` flag in the `fill` command, allowing a single genesis file to be used across different gas limit settings when generating fixtures. ([#1895](https://github.com/ethereum/execution-spec-tests/pull/1895)).
+- ✨ Added `--optimize-gas` flag that allows to binary search the minimum gas limit value for a transaction in a test that still yields the same test result ([#1979](https://github.com/ethereum/execution-spec-tests/pull/1979)).


In the PR you wrote that the flags `--optimize-gas`, `--optimize-gas-output` and `--optimize-gas-post-processing` have been added, so they should all be mentioned in the changelog.

felix314159 · 2025-08-06T09:29:44Z

docs/writing_tests/gas_optimization.md

+
+## Post-Processing Mode
+
+Enable post-processing to handle opcodes that put the current gas in the stack (like `GAS` opcode):


"opcodes that put the current gas in the stack (like GAS opcode)"

It would be better to list all of the opcodes that are relevant here. It is also unclear to me what exactly this flag does (it "handles opcodes", but what does that mean?). It's also unclear whether this is mandatory or not, because in a later paragraph you refer to it as "optional post-processing". Is it even optional when the GAS opcode is used? I feel like this flag maybe does not have to exist: can you not scan the code for opcodes and then toggle this dynamically when certain opcodes are found? Fewer flags would make this gas optimization feature easier to use.

felix314159 · 2025-08-06T10:02:53Z

Can you make it log to the terminal which optimize-gas-output file has been created? E.g. if I run `uv run fill --optimize-gas ./tests/static/state_tests/stRandom/randomStatetest137Filler.json --clean --fill-static-tests --evm-bin=evmone-t8n` I get no indication in the terminal that `--optimize-gas` has done anything. It would be better if, above or below the "Generated html report" line, it showed something similar, like "Generated optimize-gas-output.json (or whatever name was specified): <path to file>".

felix314159 · 2025-08-06T10:09:46Z

docs/writing_tests/gas_optimization.md

+## Limitations
+
+- Only works with state tests (not blockchain tests)
+- Requires trace collection to be enabled


When reading this I thought that the command would fail if I don't provide --trace but that does not seem to be the case, so it seems to add that automatically?

felix314159 · 2025-08-06T10:14:25Z

Maybe I missed something, but from the resulting optimize-gas-output.json file I can't tell whether the test is currently using the optimal gas limit or not. Is this intended? You probably also have a separate script (not in this PR) that runs over this JSON file, compares it with the gas limit currently used in the test, and then updates the test to use the determined minimal value? Maybe this question will be answered when I'm done with this PR and read through #1980.

felix314159 · 2025-08-06T10:23:52Z

When I run `uv run fill --optimize-gas ./tests/static/state_tests/stRandom/randomStatetest135Filler.json --clean --fill-static-tests --evm-bin=evmone-t8n` I get 2 failed with `Exception: Impossible to compare`. From the error message alone it is not clear what caused this problem. We should adjust the exception message so that it is clear that the error only occurred because `--optimize-gas` was toggled.

felix314159

Great feature with a few rough edges, ty for taking the time to make this and sorry for the email spam :)

marioevz added 6 commits July 31, 2025 19:55

- feat(clis): Trace types (`efdf63a`)
- feat(specs): Add gas optimization to state tests (`7b1d0bd`)
- feat(filler): Add gas optimization flags (`d178bdb`)
- feat(command): src/cli/modify_static_test_gas_limits.py (`dc4c5fc`)
- docs: Document new feature (`9299ce9`)
- docs: Changelog (`2f5a8d1`)

marioevz added the type:feat (type: Feature) and scope:fill (Scope: fill command) labels Jul 31, 2025

marioevz requested a review from spencer-tb July 31, 2025 21:23

marioevz mentioned this pull request Jul 31, 2025

fix(tests/static): Fix all static tests for Osaka fork #1980

Merged

Merge branch 'main' into fill-gas-optimization-feature

d561a39