
Forge: resume failed runs from continuation markers#8492

Open
kimeta wants to merge 2 commits into oracle:master from kimeta:rhei/issue-8486-forge-resume-failed-runs-from-the-phase-that-failed-run-continuation

Conversation

kimeta commented Jun 16, 2026

Collaborator

Closes #8486

Forge failed-run preservation already keeps the work branch available for human intervention, but the next automated run had no durable machine state for deciding which phase could be skipped. This adds a schema-backed .continuation-marker.json and uses it as the resume contract for setup, fix, explore, finalization, and publication, matching the run-continuation phase model in §FS-forge-run-continuation.

What this does

  • Creates and validates a Forge-local continuation marker that records the issue, strategy, coordinates, preserved branch, first non-terminal phase, and phase-local state.
  • Threads --continuation-marker-path through the dispatcher and relevant workflow drivers so resumed runs can skip deterministic setup or agent phases that already completed.
  • Records fix/explore/finalization/publication transitions from the workflow strategies, including dynamic-access exhausted classes so resumed exploration does not retry already exhausted classes.
  • Stores the marker and pending metrics on failed-run preservation branches even though they are ignored during normal Forge runs.
  • Updates publication resume handling to either skip an already-pushed PR branch or replay the non-preservation commits from a failed preservation branch onto a clean PR branch. Both paths keep the originally recorded publication branch namespace and respect the PR eligibility boundary in §GIT-pr-eligibility.
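To illustrate how a marker could act as the resume contract described above, here is a minimal sketch and not the code in this PR. The field names (`issue`, `strategy`, `coordinates`, `preserved_branch`, `first_non_terminal_phase`, `phase_state`), the uppercase phase identifiers, and the helpers `load_marker` and `phases_to_run` are all hypothetical. The phase order is assumed from the order in which the description lists the phases.

```python
import json
from pathlib import Path

# Assumed phase order, taken from the order the PR description lists them in.
PHASES = ("SETUP", "FIX", "EXPLORE", "FINALIZATION", "PUBLICATION")

# Hypothetical required fields and their JSON types; the real schema may differ.
REQUIRED_FIELDS = {
    "issue": int,
    "strategy": str,
    "coordinates": str,
    "preserved_branch": str,
    "first_non_terminal_phase": str,
    "phase_state": dict,
}


def load_marker(path):
    """Return the parsed marker, or None if it is missing or fails validation."""
    marker_path = Path(path)
    if not marker_path.is_file():
        return None
    try:
        marker = json.loads(marker_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None
    if not isinstance(marker, dict):
        return None
    for name, expected_type in REQUIRED_FIELDS.items():
        value = marker.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            return None
    if marker["first_non_terminal_phase"] not in PHASES:
        return None
    return marker


def phases_to_run(marker):
    """Return the phases a run still has to execute, in order."""
    if marker is None:
        # No usable marker: fall back to a full run.
        return list(PHASES)
    start = PHASES.index(marker["first_non_terminal_phase"])
    return list(PHASES[start:])
```

In this sketch an unreadable or invalid marker degrades to a full run rather than an error, so a corrupt file can cost time but cannot cause a phase to be skipped incorrectly. Whether the actual implementation makes the same choice is not stated in this description.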

What passed

The verification artifact is runtime/issues/work/issue-8486-verification.md.

  • git diff --check passed for the full Forge diff before commit.
  • python3 -m compileall passed for the changed Forge Python files.
  • Hermetic fixture E2E passed with python3 forge_metadata.py --fixture-testing --issue-number 9101 --strategy-name dynamic_access_main_sources_pi_gpt-5.5 --reachability-metadata-path .. --keep-tests-without-dynamic-access.
  • The fixture run reached 17/17 dynamic-access coverage, passed native verification, passed post-generation tests for the configured GraalVM matrix, passed checkMetadataFiles, repaired checkstyle through the built-in Pi fix, generated metrics, and completed fixture dry-run publication.
  • A review follow-up check confirmed DynamicAccessIterativeStrategy reads marker-backed EXPLORE exhausted classes, de-duplicates them, and ignores invalid entries.
  • A publication-resume simulation confirmed an unpushed resumed publication reuses the marker-recorded branch, leaves .continuation-marker.json out of the tracked PR branch, and records the branch as pushed after the push step.
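The review follow-up check on exhausted classes can be pictured with a small sketch. It is not the code in `DynamicAccessIterativeStrategy`: the marker layout (`phase_state` → `EXPLORE` → `exhausted_classes`) and the function name `exhausted_classes` are assumptions made for illustration only.

```python
def exhausted_classes(marker):
    """Return de-duplicated exhausted class names from a marker's EXPLORE state.

    The key names used here are hypothetical; only the behavior (read,
    de-duplicate, ignore invalid entries) comes from the PR description.
    """
    if not isinstance(marker, dict):
        return []
    phase_state = marker.get("phase_state")
    explore = phase_state.get("EXPLORE") if isinstance(phase_state, dict) else None
    raw = explore.get("exhausted_classes") if isinstance(explore, dict) else None
    if not isinstance(raw, list):
        return []

    seen = set()
    result = []
    for entry in raw:
        # Ignore anything that is not a non-empty string.
        if not isinstance(entry, str):
            continue
        name = entry.strip()
        if not name or name in seen:
            continue
        seen.add(name)
        result.append(name)  # keep first-seen order
    return result
```

Keeping first-seen order makes the result deterministic across resumes, which is convenient for comparing runs. Treating a malformed section as "nothing exhausted" means the worst case is a redundant retry rather than a class skipped by mistake.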

Live GitHub Forge E2E was not run because the repository docs require an explicit user request for live mutation. The fixture E2E verified a marker-enabled successful run; preserved-branch resume paths other than the publication branch-reuse simulation were reviewed locally rather than exercised against live GitHub state.

kimeta added the rhei label Jun 16, 2026
kimeta force-pushed the rhei/issue-8486-forge-resume-failed-runs-from-the-phase-that-failed-run-continuation branch from 67a0d97 to 7fec8bc June 17, 2026 08:28

Development

Successfully merging this pull request may close these issues.

Forge: resume failed runs from the phase that failed (run continuation)
