Skip to content

Conversation

dbnicholson
Copy link
Contributor

This adds an OS error handler that optionally sets the exit code to EXIT_FAILURE. The option can be enabled with the --fail-on-error CLI flag, but it's also set when exporting a project. The error handling is a little funky to make sure that a script calling get_tree().quit() doesn't inadvertently set the exit code back to EXIT_SUCCESS. I'm also not really sure about the public is_error_occurred/set_error_occurred handlers, but they were necessary for testing.

Fixes: #94957

@dbnicholson dbnicholson requested review from a team as code owners November 14, 2024 22:34
@dbnicholson dbnicholson changed the title Add --fail-on-error option Add --fail-on-error option and fail exports on errors Nov 14, 2024
@AThousandShips AThousandShips added this to the 4.4 milestone Nov 15, 2024
core/os/os.cpp Outdated
Comment on lines 218 to 222
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this behavior only apply with --fail-on-error? This may break some user expectations otherwise, as the exit code can be specified as an optional parameter to get_tree().quit().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sorta does. Currently _error_occurred is only set by the error handler when _fail_on_error is also true. So, implicitly _error_occurred also means _fail_on_error. Originally _error_occurred was just a private boolean, but I later added the accessors when writing the test so I could check and restore the value without failing the whole process. So, _error_occurred no longer guarantees _fail_on_error since some other part of Godot could call set_error_occurred(true). So, maybe the conditional should be expanded to check both flags to ensure this only happens with --fail-on-error and that an error was detected.

The whole point of this part is about how users call get_tree().quit(). Since Godot will carry on in the face of script errors, you could have a reasonable quit() or quit(0) in an unrelated location that would be executed and revert the exit code to 0. Here's a contrived example:

extends SceneTree


func _init():
	var a = Node.new()
	a.free()
	var b: Node = a


func _process(_delta):
	quit()

Without the above block, this script would exit 0 even with --fail-on-error. You may be wondering why _error_occurred is needed at all. If this only checked _fail_on_error, then that once quit(1) was called, it couldn't be changed back with quit(0). In other words, _error_occurred is being used as a proxy for "the exit code was changed to a failure because an error was handled". It's trying to maintain flexibility for users calling quit() while preventing them from inadvertently changing it back to 0 since they have no way to know an error occurred.

It would be nice if there was a way to stop the main loop without also setting the exit code. For example, if the quit() default exit code was -1, then users could call a bare quit() and not influence the exit code. Then it would be less likely for this block to be needed since someone would have to explicitly be calling quit(0), and that seems less likely than quit(). Would likely still break someone's usage and it would change the API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that was a big tangent about quit(). I agree that guarding set_exit_code should only happen when fail on error is enabled and pushed a fixup with that change. I can squash it into the first commit if you agree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had one more thought here. Since fail_on_error and error_occurred are independent, then the error handler should always set error_occurred and set_error_occurred should check whether fail_on_error is set before changing the exit code. It's a minor change, but that way if something calls OS::set_error_occurred(true), it doesn't fail the process unless fail_on_error is true.

Similarly, if the error handler always sets error_occurred, then you can reliably query it from other places. I could see in the future where the flags are in the bound interface and it would be really useful to check OS.error_occurred in a script to know if there was an error somewhere.

Does that make sense? I pushed a fixup commit with that change.

Copy link
Member

@Calinou Calinou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally on Linux, it works as expected. Code looks good to me.

However, this is always printed when --fail-on-error is used when SceneTree.quit() is called (since its default value is 0):

WARNING: Cannot set exit code to 0 after an error has occurred.
     at: set_exit_code (./core/os/os.cpp:221)

Code used:

func _ready() -> void:
	push_error("Error")
	get_tree().quit()

This does not occur if the engine is exited via other methods, e.g. --quit.

I agree we should readd the OS.exit_code property and add SceneTree.quit(-1) to allow not overriding the exit code, but this should be done in a separate PR since it has a greater impact on the codebase (and potential compatibility quirks).

@dbnicholson
Copy link
Contributor Author

However, this is always printed when --fail-on-error is used when SceneTree.quit() is called (since its default value is 0):

WARNING: Cannot set exit code to 0 after an error has occurred.
     at: set_exit_code (./core/os/os.cpp:221)

Code used:

func _ready() -> void:
	push_error("Error")
	get_tree().quit()

Should I drop the warning? It was my minor attempt at highlighting to users why quit() may not be doing what they expect, but I'm not sure it's adding any value. I could also change it to a WARN_VERBOSE or put it under DEV_ENABLED or something.

This does not occur if the engine is exited via other methods, e.g. --quit.

It still exit's unsuccessfully with --fail-on-error --quit, though, right?

@Calinou
Copy link
Member

Calinou commented Nov 18, 2024

Should I drop the warning?

Yes, I'd move it to WARN_VERBOSE().

It still exit's unsuccessfully with --fail-on-error --quit, though, right?

Yes, exit code is still 1 as evidenced by calling echo $? afterwards.

@dbnicholson
Copy link
Contributor Author

I think all the comments are addressed. If it looks good to you, I'll squash the fixups.

@dbnicholson
Copy link
Contributor Author

@Calinou ping. Anything I can do to help get this merged?

Copy link
Member

@Calinou Calinou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, it works great now 🙂

dbnicholson and others added 3 commits December 21, 2024 16:53
Currently the exit code will always be EXIT_SUCCESS so long as nothing
explicitly sets it otherwise. That's generally good as you want the
engine to report success in the face of non-fatal errors. However, when
you want to fix errors, you want the process to exit with a failure when
there are errors.

Add a boolean `fail_on_error` OS setting that defaults to false to match
the current semantics. When set to true, an error handler will set the
exit code to EXIT_FAILURE. In order to keep an unknowing component from
setting the exit code back to EXIT_SUCCESS, an `error_occurred` flag is
also set from the error handler.
This allows setting the OS `fail_on_error` flag in all builds. The
primary use case is running it with scripts in CI so that error can be
caught without scraping the process output.

Co-authored-by: Hugo Locurcio <[email protected]>
Exporting a project with errors should fail the process to help prevent
users from publishing broken games.
@dbnicholson
Copy link
Contributor Author

Tested locally, it works great now 🙂

Thanks for testing! I squashed all the fixups now, but there's no change.

@akien-mga akien-mga changed the title Add --fail-on-error option and fail exports on errors Add --fail-on-error option and fail exports on errors Jan 6, 2025
@akien-mga akien-mga self-requested a review January 6, 2025 20:58
@akien-mga
Copy link
Member

Status update: I want to find some time to review this myself, as the error propagation in Main is pretty tricky and has been the source of multiple bugs in the past. We're entering the beta stage and feature freeze for 4.4 so this would be merged for 4.5 early on most likely, once I get to approve it.

@akien-mga akien-mga added enhancement and removed bug labels Jan 14, 2025
@akien-mga akien-mga modified the milestones: 4.4, 4.5 Jan 14, 2025
@Repiteo Repiteo modified the milestones: 4.5, 4.6 Aug 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exports from CLI return exit code 0 when failed to build C# project.
5 participants