[Fix] encode brackets in key content to fix round-trip parsing#547
[Fix] encode brackets in key content to fix round-trip parsing#547veeceey wants to merge 4 commits intoljharb:mainfrom
Conversation
lib/stringify.js
Outdated
| // (%5B/%5D) since the encoder will double-encode the % to %25, producing | ||
| // %255B/%255D. If the key will NOT be further encoded (encodeValuesOnly), | ||
| // use double encoding (%255B/%255D) directly. | ||
| if (!(/\[|\]/).test(key)) { |
There was a problem hiding this comment.
all regexes should be module-level vars
There was a problem hiding this comment.
Good call — moved the bracket test regex and all the replacement regexes to module-level vars. Also added a hasBracketOrEncoded regex that includes the %5B/%5D patterns (more on why below). All regexes are now allocated once at module load.
lib/stringify.js
Outdated
| continue; | ||
| } | ||
| // Encode brackets in top-level key content to prevent parser confusion (#513) | ||
| var encodedTopKey = options.encode |
There was a problem hiding this comment.
the nested path uses if (encoder) as the guard, this uses options.encode. they should be the same.
There was a problem hiding this comment.
Yep, you're right — I extracted var encoder = options.encode ? options.encoder : null before the loop, and now both the bracket-encoding guard and the argument passed to the inner stringify call use if (encoder) consistently, same as the nested path does.
lib/parse.js
Outdated
| obj = options.plainObjects ? { __proto__: null } : {}; | ||
| var cleanRoot = root.charAt(0) === '[' && root.charAt(root.length - 1) === ']' ? root.slice(1, -1) : root; | ||
| var decodedRoot = options.decodeDotInKeys ? cleanRoot.replace(/%2E/g, '.') : cleanRoot; | ||
| decodedRoot = decodedRoot.replace(/%5B/gi, '[').replace(/%5D/gi, ']'); |
There was a problem hiding this comment.
this is unconditional, but the stringify side only encodes actual [/] - a key containing the literal text %5B gets mangled on round-trip: { 'a%5Bb': 'c' } → stringify → parse → { 'a[b': 'c' }. this shouldn't happen by default, unless we can work around it
There was a problem hiding this comment.
This was the big one — you're absolutely right that the unconditional decode breaks round-trip for keys with literal %5B text. For example { 'a%5Bb': 'c' } would get mangled to { 'a[b': 'c' } on the way back.
Rather than adding a new option (which felt heavy for this), I fixed it by handling it on both sides:
Stringify side: encBracketsInContent now also pre-encodes any existing %5B/%5D text sequences in the key (before encoding the actual bracket characters). This means literal %5B in a key gets an extra level of encoding, so it doesn't collide with the %5B that represents actual brackets.
Parse side: After the %5B -> [ replacement, there's now a second step that decodes %255B -> %5B (and likewise for %5D). This restores any literal percent-encoded bracket text that was in the original key.
The net effect is that both { a: { 'b[c]': 'd' } } and { 'a%5Bb': 'c' } round-trip correctly, without any new options and without breaking backwards compatibility — the parser still decodes %5B to [ for structural brackets as before.
lib/parse.js
Outdated
| obj = options.plainObjects ? { __proto__: null } : {}; | ||
| var cleanRoot = root.charAt(0) === '[' && root.charAt(root.length - 1) === ']' ? root.slice(1, -1) : root; | ||
| var decodedRoot = options.decodeDotInKeys ? cleanRoot.replace(/%2E/g, '.') : cleanRoot; | ||
| decodedRoot = decodedRoot.replace(/%5B/gi, '[').replace(/%5D/gi, ']'); |
There was a problem hiding this comment.
these regexes should be cached in module-level vars too
There was a problem hiding this comment.
Done — added encodedBracketOpen, encodedBracketClose, escapedEncodedBracketOpen, and escapedEncodedBracketClose as module-level vars at the top of parse.js. The inline regexes on this line are now just references to those cached vars.
| 'does not encode brackets in key content when encode is false' | ||
| ); | ||
|
|
||
| st.end(); |
There was a problem hiding this comment.
needs a test for a key containing the literal text %5B - that case breaks the round-trip with this approach
There was a problem hiding this comment.
Added four tests for this:
- Flat key with literal
%5B:{ 'a%5Bb': 'c' }round-trips correctly - Nested key with literal
%5B:{ a: { 'b%5Bc': 'd' } }round-trips correctly - Nested key with both
%5Band%5D:{ a: { 'b%5Bc%5D': 'd' } }round-trips correctly - Nested key with literal
%5B+encodeValuesOnly: also round-trips correctly
All four pass. The key insight was that the stringify side needs to pre-encode existing %5B/%5D text so it doesn't collide with the structural bracket encoding, and the parse side then undoes that extra level as a second step after the structural bracket decode.
|
Hi @ljharb, I've addressed all the feedback — module-level regex vars, consistent encoder guard, and the %5B round-trip fix with proper double-encoding. Added tests for flat/nested keys with literal %5B/%5D text. CI seems to have a failing check though, will take a look at that. |
When object keys contain literal `[` or `]` characters, stringify now pre-encodes them so the parser does not confuse them with the structural brackets used for nesting (e.g. `a[b]=c`). This fixes the round-trip failure described in ljharb#513, where `qs.parse(qs.stringify(obj))` would not return the original object when keys contained brackets (such as JSON-like strings). Fixes ljharb#513
…stent guard - Move all inline regexes to module-level vars in both stringify.js and parse.js - Use `if (encoder)` guard instead of `options.encode` for consistency with the nested path - Fix critical round-trip bug for keys containing literal %5B/%5D text by pre-encoding those sequences in encBracketsInContent and adding a corresponding unescape step in parseObject - Add tests for keys containing literal %5B text to verify round-trip
1c1d003 to
c0caccc
Compare
|
Hey @ljharb, I've addressed all 5 review points and CI is green now. Could you take another look when you get a chance? Thanks! |
Replace inline /%5B/gi and /%5D/gi in parseValues with the module-level encodedBracketOpen/encodedBracketClose vars that were already defined at the top of the file. Regenerate dist.
0fd0230 to
e3f6b8d
Compare
|
Hey @ljharb, just checking in again -- all 5 review points have been addressed and the tests are passing. Happy to make any further adjustments if needed. Thanks! |
When object keys contain literal
[or]characters (e.g. JSON-like strings or other data embedded in property names),qs.stringifywould produce output thatqs.parsecould not correctly parse back, because the parser confused content brackets with the structural brackets used for nesting (a[b]=c).This patch pre-encodes
[and]inside key content before wrapping it with structural brackets during stringification. On the parse side, the encoded brackets are decoded back after structural splitting is done.Before:
After:
Round-trips work with default encoding,
encodeValuesOnly, andallowDots. Whenencode: false, no encoding is applied (brackets pass through as-is, matching existing behavior).Fixes #513