ethereum · marioevz · Jul 14, 2025 · Jul 3, 2025
diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
@@ -37,6 +37,7 @@ Users can select any of the artifacts depending on their testing needs for their
 - ✨ Add the `ported_from` test marker to track Python test cases that were converted from static fillers in [ethereum/tests](https://github.com/ethereum/tests) repository ([#1590](https://github.com/ethereum/execution-spec-tests/pull/1590)).
 - ✨ Add a new pytest plugin, `ported_tests`, that lists the static fillers and PRs from `ported_from` markers for use in the coverage Github Workflow ([#1634](https://github.com/ethereum/execution-spec-tests/pull/1634)).
 - ✨ Enable two-phase filling of fixtures with pre-allocation groups and add a `BlockchainEngineXFixture` format ([#1706](https://github.com/ethereum/execution-spec-tests/pull/1706), [#1760](https://github.com/ethereum/execution-spec-tests/pull/1760)).
+- ✨ Add a `--generate-all-formats` flag that generates all fixture formats, including `BlockchainEngineXFixture`, in a single command; `--generate-all-formats` is enabled automatically for tarball output, e.g., `--output=fixtures.tar.gz` ([#1855](https://github.com/ethereum/execution-spec-tests/pull/1855)).
 - 🔀 Refactor: Encapsulate `fill`'s fixture output options (`--output`, `--flat-output`, `--single-fixture-per-file`) into a `FixtureOutput` class ([#1471](https://github.com/ethereum/execution-spec-tests/pull/1471),[#1612](https://github.com/ethereum/execution-spec-tests/pull/1612)).
 - ✨ Don't warn about a "high Transaction gas_limit" for `zkevm` tests ([#1598](https://github.com/ethereum/execution-spec-tests/pull/1598)).
 - 🐞 `fill` no longer writes generated fixtures into an existing, non-empty output directory; it must now be empty or `--clean` must be used to delete it first ([#1608](https://github.com/ethereum/execution-spec-tests/pull/1608)).

diff --git a/docs/filling_tests/filling_tests_command_line.md b/docs/filling_tests/filling_tests_command_line.md
@@ -88,6 +88,37 @@ uv run fill tests/shanghai/eip3651_warm_coinbase/test_warm_coinbase.py::test_war
 
     See: [Filling Tests for Features under Development](./filling_tests_dev_fork.md).
 
+## Generating All Fixture Formats
+
+The `--generate-all-formats` flag generates all fixture formats, including the optimized `BlockchainEngineXFixture`, in a single command:
+
+```console
+uv run fill --generate-all-formats tests/shanghai/
+```
+
+This flag automatically performs a two-phase execution:
+
+1. **Phase 1**: Generates pre-allocation groups for optimization.
+2. **Phase 2**: Generates all supported fixture formats (`StateFixture`, `BlockchainFixture`, `BlockchainEngineFixture`, `BlockchainEngineXFixture`, etc.).
+
+!!! tip "Automatic enabling with tarball output"
+    When using tarball output (`.tar.gz` files), the `--generate-all-formats` flag is automatically enabled:
+    ```console
+    # Automatically enables --generate-all-formats due to .tar.gz output
+    uv run fill --output=fixtures.tar.gz tests/shanghai/
+
+    # Equivalent to:
+    uv run fill --generate-all-formats --output=fixtures.tar.gz tests/shanghai/
+    ```
+
+!!! note "Alternative approach"
+    You can still use the legacy approach, but this will only generate the `BlockchainEngineXFixture` format:
+    ```console
+    # Single command that automatically does 2-phase execution
+    # but only generates BlockchainEngineXFixture
+    uv run fill --generate-pre-alloc-groups tests/shanghai/
+    ```
+
 ## Debugging the `t8n` Command
 
 The `--evm-dump-dir` flag can be used to dump the inputs and outputs of every call made to the `t8n` command for debugging purposes, see [Debugging Transition Tools](./debugging_t8n_tools.md).

diff --git a/docs/running_tests/releases.md b/docs/running_tests/releases.md
@@ -37,6 +37,14 @@ For standard releases, two tarballs are available:
 
 I.e., `fixtures_develop` are a superset of `fixtures_stable`.
 
+!!! tip "Generating tarballs directly via `--output` includes all fixture formats"
+    When generating fixtures for release, specifying tarball output automatically enables all fixture formats:
+    ```console
+    # Automatically enables --generate-all-formats due to .tar.gz output
+    uv run fill --output=fixtures_stable.tar.gz tests/
+    ```
+    This ensures that all fixture formats are included in the tarball release.
+
 ### Pre-Release and Devnet Releases
 
 Intermediate releases that target specific subsets of features or tests under active development are published at @ethereum/execution-spec-tests [releases](https://github.com/ethereum/execution-spec-tests/releases).

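One way to confirm that a generated tarball contains every expected format is to list the format directories inside it. The following is a minimal sketch, not part of this PR: the helper name `fixture_format_dirs` and the assumption that the archive keeps its fixtures under a top-level `fixtures/` directory are both hypothetical, so adjust `root` to match the actual layout.

```python
import tarfile
from pathlib import PurePosixPath
from typing import Set


def fixture_format_dirs(tarball_path: str, root: str = "fixtures") -> Set[str]:
    """
    Return the names directly below `root` that contain at least one entry.

    `root` is an assumed archive layout, not something this PR specifies.
    """
    found: Set[str] = set()
    with tarfile.open(tarball_path, "r:gz") as tar:
        for name in tar.getnames():
            parts = PurePosixPath(name).parts
            # Require root/<format_dir>/<something> so plain files in root are skipped.
            if len(parts) >= 3 and parts[0] == root:
                found.add(parts[1])
    return found
```

For example, `"blockchain_tests_engine_x" in fixture_format_dirs("fixtures_stable.tar.gz")` would check for the Engine X fixtures, using the subdirectory name given in the Blockchain Engine X format documentation.
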
diff --git a/docs/running_tests/test_formats/blockchain_test_engine_x.md b/docs/running_tests/test_formats/blockchain_test_engine_x.md
@@ -2,7 +2,7 @@
 
 The Blockchain Engine X Test fixture format tests are included in the fixtures subdirectory `blockchain_tests_engine_x`, and use Engine API directives with optimized pre-allocation groups for improved execution performance.
 
-These are produced by the `StateTest` and `BlockchainTest` test specs when using the `--generate-pre-alloc-groups` and `--use-pre-alloc-groups` flags.
+These are produced by the `StateTest` and `BlockchainTest` test specs when using the `--generate-pre-alloc-groups` and `--use-pre-alloc-groups` flags, or when using the `--generate-all-formats` flag, which generates all fixture formats, including `BlockchainEngineXFixture`, in a single command.
 
 ## Description
 
@@ -138,7 +138,9 @@ Engine API payload structure identical to the one defined in [Blockchain Engine
 
 ## Usage Notes
 
-- This format is only generated when using `--generate-pre-alloc-groups` and `--use-pre-alloc-groups` flags
+- This format is generated when using:
+    - `--generate-pre-alloc-groups` flag (automatically triggers 2-phase execution, generates only `BlockchainEngineXFixture`)
+    - `--generate-all-formats` flag (automatically triggers 2-phase execution, generates all fixture formats)
 - The `pre_alloc` folder is essential and must be distributed with the test fixtures
 - Tests are grouped by identical (fork + environment + pre-allocation) combinations
 - The format is optimized for Engine API testing (post-Paris forks)
diff --git a/src/cli/pytest_commands/fill.py b/src/cli/pytest_commands/fill.py
@@ -26,12 +26,13 @@ def create_executions(self, pytest_args: List[str]) -> List[PytestExecution]:
         Create execution plan that supports two-phase pre-allocation group generation.
 
         Returns single execution for normal filling, or two-phase execution
-        when --generate-pre-alloc-groups is specified.
+        when --generate-pre-alloc-groups or --generate-all-formats is specified.
         """
         processed_args = self.process_arguments(pytest_args)
 
         # Check if we need two-phase execution
-        if "--generate-pre-alloc-groups" in processed_args:
+        if self._should_use_two_phase_execution(processed_args):
+            processed_args = self._ensure_generate_all_formats_for_tarball(processed_args)
             return self._create_two_phase_executions(processed_args)
         elif "--use-pre-alloc-groups" in processed_args:
             # Only phase 2: using existing pre-allocation groups
@@ -109,6 +110,7 @@ def _remove_unwanted_phase1_args(self, args: List[str]) -> List[str]:
             # Pre-allocation group flags (we'll add our own)
             "--generate-pre-alloc-groups",
             "--use-pre-alloc-groups",
+            "--generate-all-formats",
         }
 
         filtered_args = []
@@ -133,7 +135,7 @@ def _remove_unwanted_phase1_args(self, args: List[str]) -> List[str]:
         return filtered_args
 
     def _remove_generate_pre_alloc_groups_flag(self, args: List[str]) -> List[str]:
-        """Remove --generate-pre-alloc-groups flag from argument list."""
+        """Remove --generate-pre-alloc-groups flag but keep --generate-all-formats for phase 2."""
         return [arg for arg in args if arg != "--generate-pre-alloc-groups"]
 
     def _remove_clean_flag(self, args: List[str]) -> List[str]:
@@ -144,6 +146,33 @@ def _add_use_pre_alloc_groups_flag(self, args: List[str]) -> List[str]:
         """Add --use-pre-alloc-groups flag to argument list."""
         return args + ["--use-pre-alloc-groups"]
 
+    def _should_use_two_phase_execution(self, args: List[str]) -> bool:
+        """Determine if two-phase execution is needed."""
+        return (
+            "--generate-pre-alloc-groups" in args
+            or "--generate-all-formats" in args
+            or self._is_tarball_output(args)
+        )
+
+    def _ensure_generate_all_formats_for_tarball(self, args: List[str]) -> List[str]:
+        """Auto-add --generate-all-formats for tarball output."""
+        if self._is_tarball_output(args) and "--generate-all-formats" not in args:
+            return args + ["--generate-all-formats"]
+        return args
+
+    def _is_tarball_output(self, args: List[str]) -> bool:
+        """Check if output argument specifies a tarball (.tar.gz) path."""
+        from pathlib import Path
+
+        for i, arg in enumerate(args):
+            if arg.startswith("--output="):
+                output_path = Path(arg.split("=", 1)[1])
+                return str(output_path).endswith(".tar.gz")
+            elif arg == "--output" and i + 1 < len(args):
+                output_path = Path(args[i + 1])
+                return str(output_path).endswith(".tar.gz")
+        return False
+
 
 class PhilCommand(FillCommand):
     """Friendly fill command with emoji reporting."""

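The decision logic added in the hunk above can be tried in isolation. This is a minimal standalone sketch that mirrors the three new helpers as free functions instead of `FillCommand` methods; the function names `is_tarball_output`, `needs_two_phase`, and `ensure_generate_all_formats` are hypothetical stand-ins for the private methods in the diff.

```python
from pathlib import Path
from typing import List


def is_tarball_output(args: List[str]) -> bool:
    """Return True if the first --output argument names a .tar.gz path."""
    for i, arg in enumerate(args):
        if arg.startswith("--output="):
            # Form: --output=path
            return str(Path(arg.split("=", 1)[1])).endswith(".tar.gz")
        if arg == "--output" and i + 1 < len(args):
            # Form: --output path
            return str(Path(args[i + 1])).endswith(".tar.gz")
    return False


def needs_two_phase(args: List[str]) -> bool:
    """Two-phase execution is needed for either flag or for tarball output."""
    return (
        "--generate-pre-alloc-groups" in args
        or "--generate-all-formats" in args
        or is_tarball_output(args)
    )


def ensure_generate_all_formats(args: List[str]) -> List[str]:
    """Append --generate-all-formats for tarball output unless already present."""
    if is_tarball_output(args) and "--generate-all-formats" not in args:
        return args + ["--generate-all-formats"]
    return args
```

Note that only the first `--output` occurrence is inspected, and that a plain directory output leaves the argument list unchanged, so single-phase filling remains the default.
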
diff --git a/src/cli/tests/test_generate_all_formats.py b/src/cli/tests/test_generate_all_formats.py
@@ -0,0 +1,187 @@
+"""Test the --generate-all-formats CLI flag functionality."""
+
+from unittest.mock import patch
+
+from cli.pytest_commands.fill import FillCommand
+
+
+def test_generate_all_formats_creates_two_phase_execution():
+    """Test that --generate-all-formats triggers two-phase execution."""
+    command = FillCommand()
+
+    # Mock the argument processing to bypass click context requirements
+    with patch.object(command, "process_arguments", side_effect=lambda x: x):
+        # Test that --generate-all-formats triggers two-phase execution
+        pytest_args = ["--generate-all-formats", "tests/somedir/"]
+        executions = command.create_executions(pytest_args)
+
+    assert len(executions) == 2, "Expected two-phase execution"
+
+    # Phase 1: Should have --generate-pre-alloc-groups
+    phase1_args = executions[0].args
+    assert "--generate-pre-alloc-groups" in phase1_args
+    assert "--generate-all-formats" not in phase1_args
+
+    # Phase 2: Should have --use-pre-alloc-groups and --generate-all-formats
+    phase2_args = executions[1].args
+    assert "--use-pre-alloc-groups" in phase2_args
+    assert "--generate-all-formats" in phase2_args
+    assert "--generate-pre-alloc-groups" not in phase2_args
+
+
+def test_generate_all_formats_preserves_other_args():
+    """Test that --generate-all-formats preserves other command line arguments."""
+    command = FillCommand()
+
+    with patch.object(command, "process_arguments", side_effect=lambda x: x):
+        pytest_args = [
+            "--generate-all-formats",
+            "--output=custom-output",
+            "--fork=Paris",
+            "-v",
+            "tests/somedir/",
+        ]
+        executions = command.create_executions(pytest_args)
+
+    assert len(executions) == 2
+
+    # Both phases should preserve most args
+    for execution in executions:
+        assert "--output=custom-output" in execution.args
+        assert "--fork=Paris" in execution.args
+        assert "-v" in execution.args
+        assert "tests/somedir/" in execution.args
+
+
+def test_generate_all_formats_removes_clean_from_phase2():
+    """Test that --clean is removed from phase 2."""
+    command = FillCommand()
+
+    with patch.object(command, "process_arguments", side_effect=lambda x: x):
+        pytest_args = ["--generate-all-formats", "--clean", "tests/somedir/"]
+        executions = command.create_executions(pytest_args)
+
+    assert len(executions) == 2
+
+    # Phase 1 is not asserted here: it keeps --clean, which is needed to clean
+    # the output directory before phase 1 runs.
+
+    # Phase 2: Should not have --clean (gets removed)
+    phase2_args = executions[1].args
+    assert "--clean" not in phase2_args
+
+
+def test_legacy_generate_pre_alloc_groups_still_works():
+    """Test that the legacy --generate-pre-alloc-groups flag still works."""
+    command = FillCommand()
+
+    with patch.object(command, "process_arguments", side_effect=lambda x: x):
+        pytest_args = ["--generate-pre-alloc-groups", "tests/somedir/"]
+        executions = command.create_executions(pytest_args)
+
+    assert len(executions) == 2
+
+    # Phase 1: Should have --generate-pre-alloc-groups
+    phase1_args = executions[0].args
+    assert "--generate-pre-alloc-groups" in phase1_args
+
+    # Phase 2: Should have --use-pre-alloc-groups but NOT --generate-all-formats
+    phase2_args = executions[1].args
+    assert "--use-pre-alloc-groups" in phase2_args
+    assert "--generate-all-formats" not in phase2_args
+    assert "--generate-pre-alloc-groups" not in phase2_args
+
+
+def test_single_phase_without_flags():
+    """Test that normal execution without flags creates single phase."""
+    command = FillCommand()
+
+    with patch.object(command, "process_arguments", side_effect=lambda x: x):
+        pytest_args = ["tests/somedir/"]
+        executions = command.create_executions(pytest_args)
+
+    assert len(executions) == 1
+    execution = executions[0]
+
+    assert "--generate-pre-alloc-groups" not in execution.args
+    assert "--use-pre-alloc-groups" not in execution.args
+    assert "--generate-all-formats" not in execution.args
+
+
+def test_tarball_output_auto_enables_generate_all_formats():
+    """Test that tarball output automatically enables --generate-all-formats."""
+    command = FillCommand()
+
+    with patch.object(command, "process_arguments", side_effect=lambda x: x):
+        pytest_args = ["--output=fixtures.tar.gz", "tests/somedir/"]
+        executions = command.create_executions(pytest_args)
+
+    # Should trigger two-phase execution due to tarball output
+    assert len(executions) == 2
+
+    # Phase 1: Should have --generate-pre-alloc-groups
+    phase1_args = executions[0].args
+    assert "--generate-pre-alloc-groups" in phase1_args
+
+    # Phase 2: Should have --generate-all-formats (auto-added) and --use-pre-alloc-groups
+    phase2_args = executions[1].args
+    assert "--generate-all-formats" in phase2_args
+    assert "--use-pre-alloc-groups" in phase2_args
+    assert "--output=fixtures.tar.gz" in phase2_args
+
+
+def test_tarball_output_with_explicit_generate_all_formats():
+    """Test that explicit --generate-all-formats with tarball output works correctly."""
+    command = FillCommand()
+
+    with patch.object(command, "process_arguments", side_effect=lambda x: x):
+        pytest_args = ["--output=fixtures.tar.gz", "--generate-all-formats", "tests/somedir/"]
+        executions = command.create_executions(pytest_args)
+
+    # Should trigger two-phase execution
+    assert len(executions) == 2
+
+    # Phase 2: Should have --generate-all-formats (explicit, not duplicated)
+    phase2_args = executions[1].args
+    assert "--generate-all-formats" in phase2_args
+    # Ensure no duplicate flags
+    assert phase2_args.count("--generate-all-formats") == 1
+
+
+def test_regular_output_does_not_auto_trigger_two_phase():
+    """Test that regular directory output doesn't auto-trigger two-phase execution."""
+    command = FillCommand()
+
+    with patch.object(command, "process_arguments", side_effect=lambda x: x):
+        pytest_args = ["--output=fixtures/", "tests/somedir/"]
+        executions = command.create_executions(pytest_args)
+
+    # Should remain single-phase execution
+    assert len(executions) == 1
+    execution = executions[0]
+
+    assert "--generate-pre-alloc-groups" not in execution.args
+    assert "--use-pre-alloc-groups" not in execution.args
+    assert "--generate-all-formats" not in execution.args
+
+
+def test_tarball_output_detection_various_formats():
+    """Test tarball output detection with various argument formats."""
+    command = FillCommand()
+
+    # Test --output=file.tar.gz format
+    args1 = ["--output=test.tar.gz", "tests/somedir/"]
+    assert command._is_tarball_output(args1) is True
+
+    # Test --output file.tar.gz format
+    args2 = ["--output", "test.tar.gz", "tests/somedir/"]
+    assert command._is_tarball_output(args2) is True
+
+    # Test regular directory
+    args3 = ["--output=test/", "tests/somedir/"]
+    assert command._is_tarball_output(args3) is False
+
+    # Test no output argument
+    args4 = ["tests/somedir/"]
+    assert command._is_tarball_output(args4) is False
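
Taken together, these tests describe how one argument list is split into two phases. The following simplified model captures only the behavior the tests assert. It is a sketch under assumptions: `plan_phases` is a hypothetical name, the ordering of flags is chosen for illustration, and the real `FillCommand` performs additional filtering that is not shown in this diff.

```python
from typing import List, Tuple

# Flags dropped from phase 1 (per the `_remove_unwanted_phase1_args` hunk).
PHASE1_DROPPED = {
    "--generate-pre-alloc-groups",
    "--use-pre-alloc-groups",
    "--generate-all-formats",
}
# Flags dropped from phase 2 (per the tests above).
PHASE2_DROPPED = {"--generate-pre-alloc-groups", "--clean"}


def plan_phases(args: List[str]) -> Tuple[List[str], List[str]]:
    """Return (phase1_args, phase2_args) for a two-phase fill."""
    # Phase 1 generates pre-allocation groups only.
    phase1 = ["--generate-pre-alloc-groups"] + [a for a in args if a not in PHASE1_DROPPED]
    # Phase 2 reuses those groups and keeps --generate-all-formats if present.
    phase2 = [a for a in args if a not in PHASE2_DROPPED] + ["--use-pre-alloc-groups"]
    return phase1, phase2
```

With `["--generate-all-formats", "--clean", "tests/somedir/"]` as input, phase 1 carries `--generate-pre-alloc-groups` without `--generate-all-formats`, while phase 2 carries `--generate-all-formats` and `--use-pre-alloc-groups` without `--clean`.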