Skip to content

Conversation

MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Mar 5, 2025

Summary

This PR implements the first part of #16440. It ensures that Red Knot's module resolver is case sensitive on all systems.

This PR combines a few approaches:

  1. It uses canonicalize on non-case-sensitive systems to get the real casing of a path. This works for as long as no symlinks or mapped network drives (the windows E:\ is mapped to \\server\share thingy). This is the same as what Pyright does
  2. If 1. fails, fall back to recursively list the parent directory and test if the path's file name matches the casing exactly as listed in by list dir. This is the same approach as CPython takes in its module resolver. The main downside is that it requires more syscalls because, unlike CPython, we Red Knot needs to invalidate its caches if a file name gets renamed (CPython assumes that the folders are immutable).

It's worth noting that the file watching test that I added that renames lib.py to Lib.py currently doesn't pass on case-insensitive systems. Making it pass requires some more involved changes to Files. I plan to work on this next. There's the argument that landing this PR on its own isn't worth it without this issue being addressed. I think it's still a good step in the right direction even when some of the details on how and where the path case sensitive comparison is implemented.

Test plan

I added multiple integration tests (including a failing one). I tested that the case-sensitivity detection works as expected on Windows, MacOS and Linux and that the fast-paths are taken accordingly.

@MichaReiser MichaReiser added the ty Multi-file analysis & type inference label Mar 5, 2025
Copy link
Contributor

github-actions bot commented Mar 5, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@MichaReiser MichaReiser force-pushed the micha/mdtest-os-system branch from edcb9e6 to 38e8267 Compare March 6, 2025 09:33
Base automatically changed from micha/mdtest-os-system to main March 6, 2025 09:41
@MichaReiser MichaReiser force-pushed the micha/case-sensitive-module-resolver branch from dbde770 to 35d4ec7 Compare March 6, 2025 09:43
@MichaReiser MichaReiser marked this pull request as ready for review March 7, 2025 14:26
@MichaReiser MichaReiser force-pushed the micha/case-sensitive-module-resolver branch from 499881a to 2ef2590 Compare March 7, 2025 14:29
Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a headache! Thank you for taking this on.


db.use_system(OsSystem::new(&root));

db.write_file(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to rely on creating a parent directory, so why isn't it write_file_all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #16518 (comment)

Happy to change the naming if you prefer the consistency of having the _all prefix everywhere.


let upper_case = SystemPathBuf::from(path.as_str().to_uppercase());
if &*upper_case == path {
return CaseSensitivity::Unknown;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that if someone on a case-sensitive file system happens to run red-knot from an all-caps CWD, they may observe a significant slowdown?

That's not likely to ever happen... but it's pretty weird/unpleasant behavior if it does happen.

Would an alternative be to try lower-casing in that scenario? Either all-lower or all-upper should be different from the the original path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a headache!

I hope you referred to file systems and not my PR ;)

Yes. Another example is if you run Red Knot from a directory named /1234 where the upper and lower-case names are identical. The reason why I think this is extremely unlikely is because the code lower-cases the entire path and not just the last component. So /Users/micha/1234 would be converted to /USERS/MICHA/1234 and the test would still succeed.

We could extend the logic to also try current_exe for which the casing should differ unless someone renames the red_knot binary to 1234... 🤷 The only downside this has is that Red Knot might be installed in your home directory (let's say that's case sensitive) and your project runs on a case-insensitive network drive. However, I'm not sure how much we should worry about this case because the best we can do here ultimately is an approximation (or make everything slower than it has to be in most projects).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope you referred to file systems and not my PR ;)

Yes, file systems!

We could extend the logic to also try current_exe for which the casing should differ unless someone renames the red_knot binary to 1234... 🤷 The only downside this has is that Red Knot might be installed in your home directory (let's say that's case sensitive) and your project runs on a case-insensitive network drive.

The same issue could occur if you run knot from a cwd on one filesystem and use --project to check files on another one. In theory you could even have your project on one filesystem and your venv/site-packages on a different one.

Ultimately I am fine with however you want to handle this.

Copy link
Member Author

@MichaReiser MichaReiser Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave it as is for now and plan to revisit it as part of the file watching on case insensitive systems work because, depending on which approach we take, falling back to Unknown might no longer be an option (in which case a more precise inference is important).

@MichaReiser MichaReiser force-pushed the micha/case-sensitive-module-resolver branch from 2ef2590 to a708fa1 Compare March 14, 2025 19:13
@MichaReiser MichaReiser enabled auto-merge (squash) March 14, 2025 19:14
@MichaReiser MichaReiser merged commit a467e7c into main Mar 14, 2025
21 checks passed
@MichaReiser MichaReiser deleted the micha/case-sensitive-module-resolver branch March 14, 2025 19:16
Copy link
Contributor

mypy_primer results

No ecosystem changes detected ✅

dcreager added a commit that referenced this pull request Mar 14, 2025
* main:
  [red-knot] Use `try_call_dunder` for augmented assignment (#16717)
  [red-knot] Document current state of attribute assignment diagnostics (#16746)
  [red-knot] Case sensitive module resolver (#16521)
  [red-knot] Very minor simplification of the render tests (#16759)
  [syntax-errors] Unparenthesized assignment expressions in sets and indexes (#16404)
  ruff_db: add a new diagnostic renderer
  ruff_db: add `context` configuration
  red_knot: plumb through `DiagnosticFormat` to the CLI
  ruff_db: add concise diagnostic mode
  [syntax-errors] Star annotations before Python 3.11 (#16545)
  [syntax-errors] Star expression in index before Python 3.11 (#16544)
  Ruff 0.11.0 (#16723)
  [red-knot] Preliminary tests for typing.Final (#15917)
  [red-knot] fix: improve type inference for binary ops on tuples (#16725)
  [red-knot] Assignments to attributes (#16705)
  [`pygrep-hooks`]: Detect file-level suppressions comments without rul… (#16720)
  Fallback to requires-python in certain cases when target-version is not found (#16721)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ty Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants