Skip to content

Conversation

soundsonacid
Copy link
Contributor

Fixes #19175

@ntBre ntBre added bug Something isn't working fixes Related to suggested fixes for violations labels Jul 8, 2025
@ntBre
Copy link
Contributor

ntBre commented Jul 8, 2025

Thanks for working on this! Could you add a regression test based on the example in the issue?

I'm slightly concerned that is_whitespace is too permissive, but I'm also not quite sure why it was only limited to spaces and tabs before.

Copy link
Contributor

github-actions bot commented Jul 8, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@soundsonacid
Copy link
Contributor Author

Thanks for working on this! Could you add a regression test based on the example in the issue?

I'm slightly concerned that is_whitespace is too permissive, but I'm also not quite sure why it was only limited to spaces and tabs before.

No problem! I had a similar thought about the is_whitespace check, but I couldn't find (with admittedly not a ton of time searching) a comprehensive list of all valid whitespace Python characters. If you have one handy, I'm happy to make the change to that -- in the meantime I'll keep looking for one myself.

@dscorbett
Copy link

Section 2.1.9 “Whitespace between tokens” says:

Except at the beginning of a logical line or in string literals, the whitespace characters space, tab and formfeed can be used interchangeably to separate tokens. Whitespace is needed between two tokens only if their concatenation could otherwise be interpreted as a different token (e.g., ab is one token, but a b is two tokens).

Those are the only three characters Python treats as white space within a line.

@soundsonacid soundsonacid changed the title (fix): detect continuations with is_whitespace instead of hardcoded chars (fix): treat unicode formfeed as valid whitespace within match_continuation Jul 8, 2025
Copy link
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks great!

And thanks @dscorbett for the report and for the whitespace reference.

I just had a couple of very small suggestions, but this is good to go otherwise.

Comment on lines 309 to 312
// space, tab, & formfeed (respectively) are the only three valid whitespace characters
// within a line:
// https://docs.python.org/3/reference/lexical_analysis.html#whitespace-between-tokens
' ' | '\t' | '\u{000C}' => continue,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use

pub const fn is_python_whitespace(c: char) -> bool {
matches!(
c,
// Space, tab, or form-feed
' ' | '\t' | '\x0C'
)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for lmk about this, updated to use this fn

Copy link
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@ntBre
Copy link
Contributor

ntBre commented Jul 9, 2025

Looks like we need to update the snapshot for the new comment

@ntBre ntBre changed the title (fix): treat unicode formfeed as valid whitespace within match_continuation Treat form feed as valid whitespace before a line continuation Jul 9, 2025
@soundsonacid
Copy link
Contributor Author

Weird, i ran cargo insta review and it said no changes, also ran cargo test on my local and it was fine

Happy to do whatever it takes here just let me know what to do, not super familiar with how the tests work yet

@MichaReiser
Copy link
Member

Can you verify that you have no uncommited changes (git status, git diff)

@ntBre
Copy link
Contributor

ntBre commented Jul 10, 2025

I checked out the branch locally and see the same test failures, so it might be something with git as Micha said.

I can push a commit if you'd like but wanted to give you the solo credit for the PR if possible :)

@soundsonacid
Copy link
Contributor Author

yep, nothing unstaged, no diffs, no snapshots to review

@MichaReiser MichaReiser enabled auto-merge (squash) July 10, 2025 14:07
@MichaReiser MichaReiser merged commit 83b5bbf into astral-sh:main Jul 10, 2025
34 checks passed
UnboundVariable pushed a commit to UnboundVariable/ruff that referenced this pull request Jul 11, 2025
…re_help

* 'main' of https://github.com/astral-sh/ruff:
  Add simple integration tests for all output formats (astral-sh#19265)
  [`flake8-return`] Fix false-positive for variables used inside nested functions in `RET504`  (astral-sh#18433)
  [ty] Add a `--quiet` mode (astral-sh#19233)
  Treat form feed as valid whitespace before a line continuation (astral-sh#19220)
  [ty] Make `check_file` a salsa query (astral-sh#19255)
  [ty] Consolidate submodule resolving code between `types.rs` and `ide_support.rs` (astral-sh#19256)
  [ty] Remove countme from salsa-structs (astral-sh#19257)
  [ty] Improve and document equivalence for module-literal types (astral-sh#19243)
  [ty] Optimize protocol subtyping by removing expensive and unnecessary equivalence check from the top of `Type::has_relation_to()` (astral-sh#19230)
  [ty] Ecosystem analyzer: parallelize, fix race condition (astral-sh#19252)
  [ty] Add completion kind to playground (astral-sh#19251)
  [ty] Deploy ecosystem diff to Cloudflare pages (astral-sh#19234)
  [ty] Add semantic token provider to playground (astral-sh#19232)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working fixes Related to suggested fixes for violations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

match_continuation breaks on form feed, making TC004 introduce a syntax error
4 participants