RUST-2028 Fix Decimal128 panic when parsing strings w/o a char boundary at idx 34 #496

arthurprs · 2024-09-03T17:56:03Z

Fix Decimal128 panic when parsing strings w/o a char boundary at idx 34 (Coefficient::MAX_DIGITS).
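
For context, a minimal sketch of the failure mode, assuming the parser slices the input at byte index 34. Slicing a `&str` at an index that falls inside a multi-byte UTF-8 character panics, while `str::is_char_boundary` lets the caller bail out first. `MAX_DIGITS` and `split_at_max_digits` are hypothetical names used for illustration, not the crate's actual code.

```rust
/// Hypothetical stand-in for Coefficient::MAX_DIGITS (34, per the PR description).
const MAX_DIGITS: usize = 34;

/// Splits `s` at byte index `MAX_DIGITS` without panicking.
///
/// Returns `None` when the index is past the end of the string or lands
/// inside a multi-byte character. An unchecked `&s[..MAX_DIGITS]` would
/// panic in both of those cases.
fn split_at_max_digits(s: &str) -> Option<(&str, &str)> {
    // `is_char_boundary` is false for indices beyond `s.len()` and for
    // indices that point at a UTF-8 continuation byte.
    if !s.is_char_boundary(MAX_DIGITS) {
        return None;
    }
    Some(s.split_at(MAX_DIGITS))
}
```

An input such as 33 ASCII digits followed by "❤" puts byte 34 in the middle of the three-byte heart character, which is the kind of string the title describes.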

abr-egn · 2024-09-03T18:51:38Z

src/decimal128.rs

+    if !s.is_char_boundary(precision) {
+        // a non-digit (all single byte utf8) would trigger a ParseIntError::InvalidDigit,
+        // so here we generate a ParseIntError::InvalidDigit kind of error too.
+        return Err(ParseError::InvalidCoefficient(


I think I'd rather have a new enum variant for ParseError (something like ParseError::Unparseable) than synthesize an error this way; that way it could also carry the source string to make debugging easier.

arthurprs · 2024-09-03

I'm not saying I disagree, but no other error variant carries more information than this one.

InvalidDigit is arguably correct, since that is the error a character-based split (instead of a byte-based one) would produce. The only reason I didn't do that is that it takes more code and is slower (linear vs. constant time).
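
The trade-off described here can be sketched as follows. This is an illustration under assumptions, not the PR's code: `split_bytes` and `split_chars` are hypothetical helpers. The byte-based version does a constant-time boundary check, while the character-based version has to walk the string to find the byte offset of the n-th character.

```rust
/// Byte-based split: constant-time check, fails when `idx` is not a
/// character boundary (or is past the end of the string).
fn split_bytes(s: &str, idx: usize) -> Option<(&str, &str)> {
    if s.is_char_boundary(idx) {
        Some(s.split_at(idx))
    } else {
        None
    }
}

/// Character-based split: linear-time scan for the byte offset of the
/// `n`-th character. Never fails; if the string has fewer than `n`
/// characters, everything ends up in the first half.
fn split_chars(s: &str, n: usize) -> (&str, &str) {
    let byte_idx = s.char_indices().nth(n).map_or(s.len(), |(i, _)| i);
    s.split_at(byte_idx)
}
```

With the character-based split, a multi-byte character stays intact inside the first half, so a later integer parse of that half rejects it as a non-digit instead of the slice panicking.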

abr-egn · 2024-09-04

If we could directly construct an InvalidDigit I'd have less concern. My worry is that by synthesizing it this way it will probably, but not certainly, continue to be InvalidDigit; the internals of ParseIntError are private, so there are no guarantees of stability. If the error changes in the future to include the offending text, for example, it'll be very confusing for downstream users of this crate to see "❤" that doesn't actually appear in their data.

Adding a new variant to ParseError avoids that issue, and if we're doing that it may as well carry the text.
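
A short sketch of what "synthesizing" means here, assuming the error is obtained by parsing a placeholder string (the "❤" literal is taken from the comment above; the helper name is hypothetical). `ParseIntError` has no public constructor, so the only way to get one is to make a real parse fail.

```rust
use std::num::ParseIntError;

/// Produces a `ParseIntError` by parsing a string that is known not to be
/// a number. With current standard-library behavior the error kind is
/// `IntErrorKind::InvalidDigit`, but the placeholder text is unrelated to
/// whatever input the caller actually supplied.
fn synthesized_invalid_digit() -> ParseIntError {
    "❤".parse::<u128>().unwrap_err()
}
```

The error value is tied to the placeholder rather than to the user's data, which is the source of the confusion the comment anticipates if `ParseIntError` ever starts reporting the offending text.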

arthurprs · 2024-09-05

I'll add a variant. But I'd also like to point out that InvalidDigit (and any variant of ParseIntError, for that matter) does not carry information about the input, such as the offending character.
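
A hedged sketch of the dedicated-variant approach, not the merged code: the enum below is a cut-down, hypothetical `ParseError` with only two variants, and `coefficient_prefix` is an illustrative helper. `Unparseable` is payload-free here; the thread leaves open whether it should carry the source text.

```rust
/// Hypothetical, reduced error type for illustration only.
#[derive(Debug)]
enum ParseError {
    /// Wraps a real integer-parse failure of the coefficient digits.
    InvalidCoefficient(std::num::ParseIntError),
    /// The input cannot be split at the precision limit, for example
    /// because that byte index falls inside a multi-byte character.
    Unparseable,
}

/// Returns at most `precision` leading bytes of `s`, or `Unparseable` if
/// the cut would land inside a multi-byte character.
fn coefficient_prefix(s: &str, precision: usize) -> Result<&str, ParseError> {
    if s.len() <= precision {
        return Ok(s); // nothing to truncate
    }
    if !s.is_char_boundary(precision) {
        return Err(ParseError::Unparseable);
    }
    Ok(&s[..precision])
}
```

Reporting the boundary failure through its own variant keeps `InvalidCoefficient` reserved for errors that really came from an integer parse.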

Fix Decimal128 panic when parsing strings w/o a char boundary at idx 34

3e0dd2f

arthurprs force-pushed the decimal128-parse-panic branch from d50873b to 3e0dd2f Compare September 3, 2024 17:57

abr-egn changed the title ~~Fix Decimal128 panic when parsing strings w/o a char boundary at idx 34~~ RUST-2028 Fix Decimal128 panic when parsing strings w/o a char boundary at idx 34 Sep 3, 2024

abr-egn self-requested a review September 3, 2024 18:19

abr-egn reviewed Sep 3, 2024

Add Unparseable error variant

d0c2c71

abr-egn approved these changes Sep 5, 2024

abr-egn requested a review from isabelatkinson September 5, 2024 14:24

isabelatkinson approved these changes Sep 9, 2024

isabelatkinson merged commit 692cd75 into mongodb:main Sep 9, 2024
11 checks passed
