Summary
Since v1.21.1 (which bumped the vendored terragrunt library from 0.54.1 to 0.72.5 in #377), generate resolves real dependency outputs while parsing HCL. For each dependency block it fetches the target stack's remote state and recursively re-parses the target's config, including its includes and its dependency blocks. Nothing is cached across recursion levels, so cost grows exponentially with dependency-chain depth.
On a production monorepo, a single generate --filter <one-dir> against a stack whose included config has 4–6 dependency blocks (whose targets have deep chains of their own) ran for 17+ hours at full CPU without completing. Shallow-chain stacks in the same repo complete in under a second. We run this as an Atlantis pre-workflow hook, so the hang also wedges the Atlantis working-dir lock.
This may be the same underlying problem as #419 (v1.21.1, generate never completing on a large monorepo) — different manifestation (we see a CPU-bound spin; #419 reports an errgroup/singleflight deadlock crash), same parse layer. #354 is an older adjacent symptom of parsing dragging in dependency resolution.
Environment
- terragrunt-atlantis-config v1.21.1 (release binary)
- Flags:
--ignore-dependency-blocks=true --cascade-dependencies=false --autoplan=false --parallel=true --num-executors=5 --create-workspace=true --preserve-workflows=true --preserve-projects=true --output atlantis.yaml --filter <single dir>
- v1.20.0 on the identical tree does not hang (it fails differently on these stacks, with a single fast terraform output shell-out that errors, but it terminates in ~3s)
Evidence
SIGQUIT goroutine dump of the hung process shows the spin (runnable goroutine, repeatedly):
config.dependencyBlocksToCtyValue
→ config.(*Dependency).setRenderedOutputs
→ config.getTerragruntOutputIfAppliedElseConfiguredDefault
→ config.getTerragruntOutput → getOutputJSONWithCaching → getTerragruntOutputJSON
→ config.PartialParseConfigFile (of the dependency target)
→ config.handleInclude → partialParseIncludedConfig → ...
→ (recursion into the target's own dependencies, plus ~50KB state JSON decode per fetch)
and, in a second mode, pure parse recursion: parseIncludedConfig → ParseConfigFile (full parse) nested ~9 levels deep across dependency targets, dominated by EvaluateLocalsBlock/FindInParentFolders filesystem walks.
Root cause (three parts)
- TerragruntOptions.SkipOutput is never set. generate only ever needs dependency paths for when_modified/depends_on; output values never appear in the generated atlantis.yaml. Terragrunt 0.72.5 already has the gate: shouldGetOutputs() checks !ctx.TerragruntOptions.SkipOutput (config/dependency.go:132) and degrades gracefully to mock outputs. However, none of the five options.NewTerragruntOptionsWithConfigPath call sites in cmd/generate.go set it.
- parseLocals passes a context with an empty PartialParseDecodeList, which triggers FULL parses of included configs. Terragrunt's parseIncludedConfig (config/include.go) falls back to ParseConfigFile (a full parse: all blocks, dependency outputs included) whenever the included config contains a dependency block and the decode list is empty. Any child config that includes a shared/envcommon parent containing dependency blocks hits this. A partial parse still evaluates the included config's locals, which is all parseLocals needs.
- --ignore-dependency-blocks filters after the expensive decode. getDependencies always puts config.DependencyBlock in its decode list and only discards the parsed result afterwards (cmd/generate.go:179), so the recursive dependency decode runs even when the user asked for dependency blocks to be ignored.
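To make the interaction of the three parts easier to follow, here is a minimal, self-contained Go model of the decision logic as described above. It is not terragrunt's code: blockType, parseRequest, and the three functions are hypothetical stand-ins, and the real checks in config/include.go and config/dependency.go may be structured differently.

```go
package main

import "fmt"

// blockType is a hypothetical stand-in for terragrunt's partial-decode
// section identifiers (config.DependencyBlock and friends).
type blockType string

const (
	dependenciesBlock blockType = "dependencies"
	dependencyBlock   blockType = "dependency"
	terraformBlock    blockType = "terraform"
)

// parseRequest is a hypothetical summary of one config being parsed.
type parseRequest struct {
	hasDependency bool        // the config (or one it includes) declares a dependency block
	decodeList    []blockType // models PartialParseDecodeList
	skipOutput    bool        // models TerragruntOptions.SkipOutput
}

// fullParse models part 2: a dependency block plus an empty decode list
// falls back to a full parse.
func fullParse(r parseRequest) bool {
	return r.hasDependency && len(r.decodeList) == 0
}

// decodesDependencies models part 3: dependency blocks are decoded either
// through a full parse or because the caller listed them explicitly.
func decodesDependencies(r parseRequest) bool {
	if !r.hasDependency {
		return false
	}
	if fullParse(r) {
		return true
	}
	for _, b := range r.decodeList {
		if b == dependencyBlock {
			return true
		}
	}
	return false
}

// fetchesOutputs models part 1: decoded dependency blocks resolve real
// outputs unless SkipOutput is set.
func fetchesOutputs(r parseRequest) bool {
	return decodesDependencies(r) && !r.skipOutput
}

func main() {
	// parseLocals as described: empty decode list, SkipOutput unset.
	locals := parseRequest{hasDependency: true}
	// getDependencies as described: DependencyBlock is always listed.
	deps := parseRequest{
		hasDependency: true,
		decodeList:    []blockType{dependenciesBlock, dependencyBlock, terraformBlock},
	}
	fmt.Println("parseLocals full parse:", fullParse(locals), "fetches outputs:", fetchesOutputs(locals))
	fmt.Println("getDependencies fetches outputs:", fetchesOutputs(deps))
}
```

In this model, either setting skipOutput or removing dependencyBlock from a non-empty decode list is enough to stop the output fetch for a given call, which is why the three parts are listed separately.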
Minimal repro shape
# child/terragrunt.hcl
include "envcommon" {
path = "${dirname(find_in_parent_folders("root.hcl"))}/_envcommon/foo/terragrunt.hcl"
expose = true
}
# _envcommon/foo/terragrunt.hcl — contains dependency blocks whose targets
# themselves include configs with further dependency blocks (3+ levels)
dependency "a" { config_path = "..." }
dependency "b" { config_path = "..." }
Running generate --filter child full-parses the envcommon config (root cause 2) and resolves outputs for a and b (root cause 1), which in turn full-parses their configs and includes, and so on. With applied state behind each target, every level adds remote-state fetches; either way the re-parsing alone is exponential.
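The growth can be illustrated with a small counting model. This is only a sketch under assumptions: the layered graph, the fan-out of 4, and the depths are hypothetical and do not reproduce the affected monorepo, and the cached variant is shown purely for comparison (the suggested fix avoids output resolution rather than adding a cache).

```go
package main

import "fmt"

// graph maps a stack to the stacks its dependency blocks point at.
type graph map[string][]string

// layered builds a hypothetical graph with `depth` layers below "root";
// every stack depends on all `fanout` stacks of the next layer.
func layered(depth, fanout int) graph {
	g := graph{}
	prev := []string{"root"}
	for level := 1; level <= depth; level++ {
		cur := make([]string, 0, fanout)
		for i := 0; i < fanout; i++ {
			cur = append(cur, fmt.Sprintf("l%d-s%d", level, i))
		}
		for _, p := range prev {
			g[p] = cur
		}
		prev = cur
	}
	return g
}

// parseUncached counts config parses when every dependency target is
// re-parsed each time it is reached.
func parseUncached(g graph, stack string) int {
	n := 1
	for _, dep := range g[stack] {
		n += parseUncached(g, dep)
	}
	return n
}

// parseCached counts parses when each stack is parsed at most once.
func parseCached(g graph, stack string, seen map[string]bool) int {
	if seen[stack] {
		return 0
	}
	seen[stack] = true
	n := 1
	for _, dep := range g[stack] {
		n += parseCached(g, dep, seen)
	}
	return n
}

func main() {
	for _, depth := range []int{3, 6, 9} {
		g := layered(depth, 4)
		fmt.Printf("depth=%d uncached=%d cached=%d\n",
			depth, parseUncached(g, "root"), parseCached(g, "root", map[string]bool{}))
	}
	// depth=3 uncached=85 cached=13
	// depth=6 uncached=5461 cached=25
	// depth=9 uncached=349525 cached=37
}
```

With a fan-out of f and depth d, the uncached count is 1 + f + f^2 + ... + f^d, while the number of distinct stacks is only 1 + f*d.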
Suggested fix
All three changes are small, and we have validated them together against the affected monorepo (hung stacks complete in <1s with byte-identical project output, including extra_atlantis_dependencies; the existing test suite passes unchanged, including the dependency-discovery golden tests):
- Set opts.SkipOutput = true at every NewTerragruntOptionsWithConfigPath site in cmd/generate.go.
- In parseLocals, parse with a non-empty decode list that excludes DependencyBlock (e.g. ctx.WithDecodeList(config.DependenciesBlock, config.TerraformBlock)).
- In getDependencies, only include config.DependencyBlock in the decode list when --ignore-dependency-blocks is not set.
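The shape of the three changes can be sketched in self-contained Go. Everything here is a stand-in so the example compiles without terragrunt: blockType and options are hypothetical local types, the helper names are invented for illustration, and the base entries assumed for the getDependencies decode list are a guess. The actual patch would set the field and build the lists on terragrunt's own types in cmd/generate.go.

```go
package main

import "fmt"

// blockType is a hypothetical stand-in for terragrunt's partial-decode
// section identifiers.
type blockType string

const (
	dependenciesBlock blockType = "dependencies"
	dependencyBlock   blockType = "dependency"
	terraformBlock    blockType = "terraform"
)

// options is a stand-in carrying only the one field this sketch touches.
type options struct {
	SkipOutput bool
}

// newGenerateOptions models fix 1: every options value built for generate
// has SkipOutput set, since output values never reach atlantis.yaml.
func newGenerateOptions() options {
	return options{SkipOutput: true}
}

// localsDecodeList models fix 2: a non-empty list without dependencyBlock,
// so included configs are partially parsed.
func localsDecodeList() []blockType {
	return []blockType{dependenciesBlock, terraformBlock}
}

// dependencyDecodeList models fix 3: dependencyBlock is decoded only when
// the user has not asked to ignore dependency blocks.
// The two base entries are an assumption made for this sketch.
func dependencyDecodeList(ignoreDependencyBlocks bool) []blockType {
	list := []blockType{dependenciesBlock, terraformBlock}
	if !ignoreDependencyBlocks {
		list = append(list, dependencyBlock)
	}
	return list
}

func main() {
	fmt.Println("SkipOutput:", newGenerateOptions().SkipOutput)
	fmt.Println("parseLocals decode list:", localsDecodeList())
	fmt.Println("getDependencies, ignoring dependency blocks:", dependencyDecodeList(true))
	fmt.Println("getDependencies, default:", dependencyDecodeList(false))
}
```

Building the list before the parse, rather than filtering the result afterwards, is what keeps the recursive dependency decode from running when --ignore-dependency-blocks is set.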
PR to follow.