Skip to content

Parser: improve error message handling #290

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

chqrlie
Copy link
Contributor

@chqrlie chqrlie commented Jun 11, 2025

  • use single on_error handler with error level and message arguments
  • remove Warning token type, never returned anyway.
  • improve #error and #warning message parsing consistency
  • make num_error messages non fatal
  • fix #warning behavior

@bvdberg
Copy link
Member

bvdberg commented Jun 13, 2025

It seems to make parsing slower... What was it that you were trying to improve?

@chqrlie
Copy link
Contributor Author

chqrlie commented Jun 13, 2025

It seems to make parsing slower...

This is surprising, the changes should not affect the inner loops, only the error cases.

What was it that you were trying to improve?

I was just trying to improve the consistency:

  • both warnings and errors should be produced synchronously,
  • #warning used to output an error with on_warning, itself using diag.error instead of diag.warn
  • num_error are errors, not warnings, but non fatal.
  • #error returned an error token without showing the error message.

char[constants.MaxErrorMsgLen+1] msg;
string.memcpy(msg, start, len);
msg[len] = 0;
msg.size(); // ensure null terminator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debug leftover?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More anticipation of future feature :)

String buffers currently null terminate the array for each individual call. I suggest only adding the null terminator when the pointer or the size is actually needed. This would reduce the number of redundant writes and potentially improve the inlining of single character additions.

return true;

// parse pptokens instead of raw text
string_buffer.Buf* msg = string_buffer.create_static(elemsof(t.error_msg), false, t.error_msg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string_buffer.create_static does a malloc (surprised me as well), but old code seemed to be much simpler. If that used the t.error_msg buffer instead of a stack... The string buffer is now barely used (it does a memcopy internally). Also why parse to Eof instead of newline?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string_buffer.create_static does a malloc (surprised me as well),

The string_buffer.Buf object would not need to be allocated in this case, as well as many other ones, if we had separate string_buffer.Buf.init(Buf* b, const char *buf, u32 size, bool reallocate) function called by both string_buffer.create and string_buffer.create_static.

but old code seemed to be much simpler.

The #error and #warning features should use preprocessed tokens like #if for multiple reasons:

  • for consistency with the C parser.
  • to simplify language support in editors and code colorizers (simpler if we follow the already supported C semantics)
  • so comments are parsed as such on these lines and thus can be used by the tester, or the programmer.

If that used the t.error_msg buffer instead of a stack... The string buffer is now barely used (it does a memcpy internally).

It does a bunch of strcpy and strcat with truncation. We could indeed write and use strlcpy-like functions instead.

Also why parse toEof instead of newline?

Because t.lex_preproc(result) returns Eof on both the actual end of file and the first newline encountered. lex_preproc is a wrapper for lex_internal that sets the stop_at_eol for this very purpose. I shall add a comment to clarify this side-effect.

@chqrlie chqrlie force-pushed the errors branch 8 times, most recently from c5a8f3e to 0a3ae91 Compare June 22, 2025 17:22
@chqrlie chqrlie force-pushed the errors branch 6 times, most recently from 51f67ec to aae549e Compare June 29, 2025 14:09
@chqrlie chqrlie force-pushed the errors branch 4 times, most recently from af29fa7 to 4a72fd3 Compare July 5, 2025 09:17
* use single `on_error` handler with error level and message arguments
* remove `Warning` token type, never returned anyway.
* improve `#error` and `#warning` message parsing consistency
* make `num_error` messages non fatal
* fix `#warning` behavior
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants