Skip to content

Implement controlling destroy functionality within Terraform Test #37359

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

liamcervante
Copy link
Member

This PR merges the changes within f-controlling-destroys with the changes from main, and is then merging those changes into main.

For now, I haven't added any changelog entries but I will do before we merge this. I just want to make sure my merge hasn't undone any changes.

There's still changes to the docs in here from before - these need to be ported over to the new repo before this is merged.

SarahFrench and others added 19 commits March 6, 2025 10:01
…ks, validate configuration (#36541)

* Add ability to parse backend blocks from a run block

* Add validation to avoid multiple backend blocks across run blocks that use the same internal state file. Update tests.

* Add validation to avoid multiple backend blocks within a single run block. Update tests.

* Remove use of quotes in diagnostic messages

* Add validation to avoid backend blocks being used in plan run blocks. Update tests.

* Correct local backend blocks in new test fixtures

* Add test to show that different test files can use same backend block for same state key.

* Add validation to enforce state-storage backend types are used

* Remove TODO comment

We only need to consider one file at a time when checking if a state_key already has a backend associated with it; parallelism in `terraform test` is scoped down to individual files.

* Add validation to assert that the backend block must be in the first apply command for an internal state

* Consolidate backend block validation inside a single if statement

* Add initial version of validation that ensures a backend isn't re-used within a file

* Explicitly set the state_key at the point of parsing the config

TODO: What should be done with method (moduletest.Run).GetStateKey?

* Update test fixture now that reusing backend configs has been made invalid

* Add automated test showing validation of reused configuration blocks

* Skip test due to flakiness, minor change to test config naming

* Update test so it tolerates non-deterministic order run blocks are evaluated in

* Remove unnecessary value assignment to r.StateKey

* Replace use of GetStateKey() with accessing the state key that's now set during test config parsing

* Fix bug so that run blocks using child modules get the correct state key set at parsing time

* Update acceptance test to also cover scenario where root and child module state keys are in use

* Update test name

* Add newline to regex

* Ensure consistent place where repeat backend error is raised from
… the run that defines a backend to write state to the backend (#36646)

* Allow use of backend block to set initial state for a state key

* Note about alternative place to keep 'backend factories'

* Allow the run block defining the backend to write state to it

* Fix rebase

* Change to accessing backend init functions via ContextOpts

* Add tests demonstrating how runs containing backend blocks use and update persisted state

* Fix test fixture

* Address test failure due to trouble opening the state file

This problem doesn't happen on MacOS, so I assume is due to the Linux environment of GitHub runners.

* Fix issue with paths properly

I hope

* Fix defect in test assertion

* Pivot back to approach introduced in 4afc3d7

* Let failing tests write to persistent state, add test case covering that.

I split the acceptance tests into happy/unhappy paths for this, which required some of the helper functions' declarations to be raised up to package-level.

* Change how we update internal state files, so that information about the associated backend is never lost

* Fix UpdateStateFile

* Ensure that the states map set by TestStateTransformer associates a backend with the correct run.

* Misc spelling fixes in comments and a log
* Replace state get/set functions with existing helpers

* Compare to string representation of state

* Compare to string representation of state
…36848)

* Fix nil pointer error, update test to not be table-driven

* Make using a backend block implicitly set skip_cleanup to true

* Stop state artefacts being created when a backend is in use and no cleanup errors have occurred

* Return diagnostics so calling code knows if cleanup experienced issues or not

* Update tests to show that when cleanup fails a state artefact is created

* Add comment about why diag not returned

* Bug fix - actually pull in the state from the state manager!

* Split and simplify (?) tests to show the backend block can create and/or reuse prior state

* Update test to use new fixtures, assert about state artefact. Fix nil pointer

* Update test fixture in use, add guardrail for flakiness of forced error during cleanup

* Refactor so resource ID set in only one place
* Add backend as a documented block in a run block

* Add documentation about backend blocks in run blocks.

* Make the relationship between backends and state keys more clear, other improvements
…om:hashicorp/terraform into liamcervante/f-controlling-destroys
Copy link
Contributor

Changelog Warning

Currently this PR would target a v1.14 release. Please add a changelog entry for in the .changes/v1.14 folder, or discuss which release you'd like to target with your reviewer. If you believe this change does not need a changelog entry, please add the 'no-changelog-needed' label.

@liamcervante liamcervante changed the title Liamcervante/f controlling destroys Implement controlling destroy functionality within Terraform Test Jul 23, 2025
dsa0x
dsa0x previously approved these changes Jul 28, 2025
Copy link
Member

@SarahFrench SarahFrench left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for dusting off this feature branch!

Given that it's been several months since I worked on it, I took a look with mostly fresh eyes and I spotted some potential issues that I've commented on the code.

Edit to add: there are some check failures in internal/moduletest/graph/test_graph_builder.go

Comment on lines 415 to 426
if run.SkipCleanup && run.SkipCleanupSet {
if _, found := skipCleanups[run.StateKey]; found {
diags = append(diags, &hcl.Diagnostic{
Severity: hcl.DiagWarning,
Summary: "Duplicate \"skip_cleanup\" block",
Detail: fmt.Sprintf("The run %q has a skip_cleanup attribute set, but shares state with a later run %q that also has skip_cleanup set. The later run takes precedence, and this attribute is ignored for the earlier run.", run.Name, skipCleanups[run.StateKey]),
Subject: block.DefRange.Ptr(),
})
} else {
skipCleanups[run.StateKey] = run.Name
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realised that this warning diagnostic can be confusing for users if they combine use of a backend block and skip_cleanup, like below:

run "setup" {
  backend "local" {} # This implies skip_cleanup = true
}

run "test" {
  skip_cleanup = true # This triggers a warning to let users know that 
}

Ideally the diagnostic would be able to differentiate whether the prior skip cleanup instruction was supplied via an explicit skip_cleanup attr or through use of a backend block, as the skip_cleanup attr in the later run will impact the state stored in the backend's state file.

I think making this distinction would be possible by updating this code to also checking if there is an entry for tf.BackendConfigs[run.StateKey]; if theres an entry in skipCleanups` and also an entry in BackendConfigs for that state key then the run block currently being parsed contains a duplicate skip_cleanup=true instruction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #37385 should address this 🤞

diags = append(diags, &hcl.Diagnostic{
Severity: hcl.DiagWarning,
Summary: "Duplicate \"skip_cleanup\" block",
Detail: fmt.Sprintf("The run %q has a skip_cleanup attribute set, but shares state with a later run %q that also has skip_cleanup set. The later run takes precedence, and this attribute is ignored for the earlier run.", run.Name, skipCleanups[run.StateKey]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a problem here, and to illustrate it here's a manual test I did using this branch:

Config:

run "setup" {
  skip_cleanup = true
}

run "test" {
  skip_cleanup = true
}

When I run terraform test I get this error:

│ Warning: Duplicate "skip_cleanup" block
│ 
│   on main.tftest.hcl line 5:
│    5: run "test" {
│ 
│ The run "test" has a skip_cleanup attribute set, but shares state with a later run "setup"
│ that also has skip_cleanup set. The later run takes precedence, and this attribute is
│ ignored for the earlier run.

I think the logic is currently 'the wrong way round'; the diagnostic is saying that "setup" is a later run after "test", which is incorrect, and also it appears that any instance of skip_cleanup=true after the first instance raises this warning and is ignored. I think that the warning should continue to be raised on each instance, but each time the code should also update the entry in skipCleanups for the current state key. This would address this part of the RFC:

restore the state to the last executed run block with the attribute set as true

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #37385 should address this 🤞

// Testing whether a state artifact is made for a run block with a backend or not
//
// Artifacts are made when the cleanup operation errors.
func TestTest_UseOfBackends_whenStateArtifactsAreMade(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I observed that during in test case # 1, where no artefacts are made, a non-empty manifest file is created. When I debug the test and check the temp dir's files, the .terraform/test/ directory only contains manifest.json (no state files) and the manifest refers to a non-existent file:

{
  "version": 0,
  "files": {
    "main.tftest.hcl": {
      "states": {
        "": {
          "path": ".terraform/test/Qwlngn0M.tfstate",
          "reason": ""
        }
      }
    }
  }
}

Does this need to be fixed? This could be confusing when debugging issues with the test command, though arguably you could say that this is an implementation detail and users shouldn't be inspecting the file.

* Improve diagnostics around skip_cleanup conflicts

* remove unused dynamic node
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants