[pyflakes] Avoid removing the format call when it would change behavior (F523)#25320
Conversation
|
ntBre
left a comment
There was a problem hiding this comment.
I think we should try to implement the ideal fix described in #15557 (comment) rather than just marking the fix unsafe and still breaking the code.
…as no remaining replacement fields
When dropping every positional argument leaves a format string with only `{{` / `}}` brace escapes, the rewrite now removes the `.format(...)` call and reduces the escapes to single braces so the result matches what `.format()` would have produced. When the string still has replacement fields (auto, indexed, or named), the empty `.format()` call is kept so the runtime behaviour, including any `KeyError`, is preserved. Both branches are safe fixes again.
99d4014 to
0c706a0
Compare
|
Implemented the ideal fix in 0c706a0. When dropping the unused arguments leaves a format string with no replacement fields, the rule now removes the .format() call and reduces any {{ and }} escapes to single braces. When the string still has replacement fields, the empty .format() call is kept so the KeyError raised by an unset named placeholder is preserved. Both branches are safe fixes again. Added test cases covering "{{".format("!"), "{x}".format("!"), "{{}}".format("!"), "{{0}}".format("!"), and "{x}".format("!", x=1). |
ntBre
left a comment
There was a problem hiding this comment.
Thanks for looking into this. After seeing the complexity introduced by the "ideal" fix, I think I'd actually prefer just to avoid removing the .format call if the string contains braces.
| if string_expr.value.is_implicit_concatenated() { | ||
| return None; | ||
| } |
There was a problem hiding this comment.
I think this will be bail out unnecessarily in a case like this:
("Hello" "world").format("!")so I think we need to separate the string-processing safety from the number of arguments safety.
There was a problem hiding this comment.
Thanks, that simpler approach is much nicer. I dropped the brace-rewriting helper entirely. Now I only remove the .format(...) call when the receiver string has no braces, where it is a genuine no-op, and whenever the string contains any { or } I keep the now-argumentless .format() call so the runtime behaviour, including any KeyError, is preserved. I also pulled the brace check out of the argument-count handling so they are independent, which means ("Hello" "world").format("!") no longer bails and just becomes "Hello" "world". And I removed the duplicated comments so the explanation only lives in the function docs now.
| // `.format()` would not have changed anything in the resulting | ||
| // string, so reuse the original source as-is. | ||
| return Some(locator.slice(string_expr).to_string()); | ||
| } | ||
|
|
||
| let content_range = literal.content_range(); | ||
| let source_content = locator.slice(content_range); | ||
|
|
||
| // The textual `{{` -> `{` rewrite below is only safe when the source | ||
| // content matches the decoded value: otherwise a Python escape sequence | ||
| // such as `\x7b` could combine with a neighbouring literal brace and | ||
| // change the resulting string. |
There was a problem hiding this comment.
I think these comments are already covered by the function docs. I actually kind of prefer them here I think, but we should delete them either here or in the function docs to avoid duplication.
Instead of rewriting brace escapes when dropping every positional
argument, only drop the .format(...) call when the receiver string has
no braces, where .format() is a genuine no-op. When the string contains
any { or }, keep the now-argumentless .format() call so the runtime
behaviour, including any KeyError, is preserved.
The brace safety check is now separate from the argument-count handling,
so an implicit-concatenated receiver such as ("Hello" "world").format("!")
no longer bails the whole fix.
the additional summary fields are passed via brackets, so the simple `contains` check takes care of this too
ntBre
left a comment
There was a problem hiding this comment.
Thanks. I pushed a few commits converting the tests to use our new mdtest framework, trim down some of the commentary, and simplify out the summary argument again. This should be redundant with the new contains check on the braces since the only way to create those summary fields is by including braces in the format string. I also trimmed down the test suite to just the examples from the issue because the other cases also seemed straightforward now that there's no tricky brace manipulation going on.
Just waiting to see a clean ecosystem check after my changes, and then I'll merge!
Merging this PR will improve performance by 8.92%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ⚡ | Memory | parser[numpy/globals.py] |
18 KB | 16 KB | +12.28% |
| ⚡ | Memory | parser[pydantic/types.py] |
403.2 KB | 368.5 KB | +9.41% |
| ⚡ | Memory | parser[large/dataset.py] |
862.2 KB | 816.8 KB | +5.56% |
| ⚡ | Memory | parser[unicode/pypinyin.py] |
55.5 KB | 52.6 KB | +5.45% |
| ⚡ | Memory | parser[numpy/ctypeslib.py] |
186.7 KB | 166.5 KB | +12.13% |
Tip
Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.
Comparing adityasingh2400:fix-15557 (10746c6) with main (572e4b5)2
Footnotes
-
65 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
-
No successful run was found on
main(1518f1d) during the generation of this report, so 572e4b5 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report. ↩
pyflakes] Mark F523 fix as unsafe when it removes a meaningful .format(...) callpyflakes] Avoid removing the format call when it would change behavior (F523)
Summary
When every positional argument is unused,
F523removes the entire.format(...)call. That changes behavior whenever the format string relies on the call:"{{".format("!")returns"{", but"{{"is two characters."{x}".format("!")raisesKeyError, but"{x}"quietly succeeds.This detects these cases and avoids removing the entire
formatcall. We also tried the "ideal" fix described on the issue of processing the string to handle escapes ourselves, but just preserving the call seemed like the better tradeoff in terms of complexity.Test plan
New mdtests based on the examples from the issue
Closes #15557