Skip to content

Conversation

ntBre
Copy link
Contributor

@ntBre ntBre commented May 20, 2025

Summary

This PR adds a macro-generated method to retrieve the Rule associated with a given Violation struct, which makes it substantially cheaper than parsing from the rule name. The rule is then converted to a NoqaCode for storage on the Message (and eventually on the new diagnostic type). The ViolationMetadata::rule_name method was now unused, so the rule method replaces it.

Several types had to be moved from the ruff_diagnostics crate to the ruff_linter crate to make this work, namely the Violation traits and the old Diagnostic type, which had a constructor generic over a Violation.

It's actually a fairly small PR, minus the hundreds of import changes. The main changes are in these files:

Test Plan

Existing tests

Copy link

codspeed-hq bot commented May 20, 2025

CodSpeed Performance Report

Merging #18234 will improve performances by 21.11%

Comparing brent/violation-rules (daa0d26) with main (a3ee6bb)

Summary

⚡ 8 improvements
✅ 26 untouched benchmarks

Benchmarks breakdown

Benchmark BASE HEAD Change
linter/all-rules[numpy/ctypeslib.py] 4.2 ms 4 ms +5.03%
linter/all-rules[numpy/globals.py] 738.4 µs 684.4 µs +7.9%
linter/all-rules[pydantic/types.py] 8.8 ms 8.2 ms +7.04%
linter/all-rules[unicode/pypinyin.py] 2.2 ms 1.8 ms +21.11%
linter/all-with-preview-rules[numpy/ctypeslib.py] 5.1 ms 4.8 ms +6.13%
linter/all-with-preview-rules[numpy/globals.py] 842.5 µs 794 µs +6.11%
linter/all-with-preview-rules[pydantic/types.py] 10.5 ms 9.9 ms +6.49%
linter/all-with-preview-rules[unicode/pypinyin.py] 2.5 ms 2.1 ms +20.21%

Copy link
Contributor

github-actions bot commented May 20, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@ntBre
Copy link
Contributor Author

ntBre commented May 20, 2025

Those benchmarks are pretty promising, maybe this is worth pursuing.

@MichaReiser
Copy link
Member

MichaReiser commented May 21, 2025

The speedups look promising. However, I don't think we want or even can store a Rule on the new ruff_db::Diagnostic. How would this work in that new model?

I think my overall preference would be to reduce the need to access Rule after report_diagnostic has been called (and instead, use NoqaCode directly)

ntBre added 5 commits May 26, 2025 20:07
Summary
--

This is a rough draft/experiment to associate a `Rule` with a `Violation` type
for cheaper conversions between the two than the current approach of matching on
the stringified names. It also moves `Violation` and `Diagnostic` and some
related types out of `ruff_diagnostics` and into the linter to make the macro
work.

This could definitely be a bad idea, but it seemed fun to try. Some of the
module moves may actually be needed for the diagnostic refactor anyway. I'm
mostly hoping to see a significant performance gain on codspeed.

Test Plan
--

Existing tests
@ntBre ntBre force-pushed the brent/violation-rules branch from 31a13d9 to bc8389a Compare May 27, 2025 00:21
Copy link
Contributor

github-actions bot commented May 27, 2025

mypy_primer results

No ecosystem changes detected ✅

@ntBre
Copy link
Contributor Author

ntBre commented May 27, 2025

I think this is ready for review. I tried to regroup all of the imports I added to keep the crate blocks separate from the others.

I also changed Message to store a NoqaCode instead of a Rule directly, fortunately without losing much, if any, of the performance improvement. Some of the smaller benchmarks actually sped up a little, I think.

I also moved the RULE constant to a ViolationMetadata::rule method, replacing ViolationMetadata::rule_name, which was now unused.

@ntBre ntBre added diagnostics Related to reporting of diagnostics. internal An internal refactor or improvement labels May 27, 2025
@ntBre ntBre changed the title [test] Try adding a Violation::RULE constant Add a ViolationMetadata::rule method May 27, 2025
@@ -14,12 +14,17 @@ pub use rule_selector::RuleSelector;
pub use rule_selector::clap_completion::RuleSelectorParser;
pub use rules::pycodestyle::rules::IOError;

pub use diagnostic::Diagnostic;
pub(crate) use ruff_diagnostics::{Applicability, Edit, Fix, IsolationLevel, SourceMap};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not strictly necessary, it just made rewriting some of the imports automatically a little easier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need all the re-exports. E.g. SourceMap is probably less common

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely not, I just got a bit tired/lazy updating imports and wanted to replace all ruff_diagnostics::* imports with crate::* in this crate. It should be easy enough to remove the less common ones like IsolationLevel and SourceMap, though.

@ntBre ntBre marked this pull request as ready for review May 27, 2025 00:55
@ntBre ntBre requested a review from AlexWaygood as a code owner May 27, 2025 00:55
@ntBre ntBre requested review from MichaReiser and removed request for AlexWaygood May 27, 2025 00:55
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

The only thing that's unclear to me and I would like to get a short summary on what the next steps are is how you plan on removing rule (and AsRule) from Diagnostic because we won't be able to add it to ruff_db::Diagnostic. IMO, I think that's the next important refactor we shoul do

}

impl Diagnostic {
#[expect(clippy::needless_pass_by_value)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we allow needless_pass_by_value because passing by reference would require changing all call sites? It might be worth adding a note why we allow this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's right. I was a bit confused why it wasn't already needless and thought further refactoring (like combining this with Checker::report_diagnostic in most cases) might make it not needless again. I'll add a note!

/// The message body to display to the user, to explain the diagnostic.
pub body: String,
/// The message to display to the user, to explain the suggested fix.
pub suggestion: Option<String>,
pub range: TextRange,
pub fix: Option<Fix>,
pub parent: Option<TextSize>,

pub(crate) rule: Rule,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our plan is to remove this Diagnostic type with ruff_db::Diagnostic. What's your plan on how to remove rule from here because we can't add it to ruff_db::Diagnostic.

@@ -14,12 +14,17 @@ pub use rule_selector::RuleSelector;
pub use rule_selector::clap_completion::RuleSelectorParser;
pub use rules::pycodestyle::rules::IOError;

pub use diagnostic::Diagnostic;
pub(crate) use ruff_diagnostics::{Applicability, Edit, Fix, IsolationLevel, SourceMap};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need all the re-exports. E.g. SourceMap is probably less common

match self { #rule_fixable_match_arms }
}
}

impl AsRule for ruff_diagnostics::Diagnostic {
impl AsRule for crate::Diagnostic {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one of the next important refactors is to remove rule (and AsRule) from Diagnostic because that won't work long-term

@ntBre
Copy link
Contributor Author

ntBre commented May 27, 2025

The only thing that's unclear to me and I would like to get a short summary on what the next steps are is how you plan on removing rule (and AsRule) from Diagnostic because we won't be able to add it to ruff_db::Diagnostic. IMO, I think that's the next important refactor we shoul do

My plan was to get rid of this old Diagnostic kind entirely soon. I was picturing Message as its successor, which is why I only updated it to use the NoqaCode instead of also updating Diagnostic. I was thinking that would take care of most uses, but the rule_enabled checks might be slightly trickier. We could still implement AsRule via FromStr as long as we have the rule name stored, or we can possibly move that enabled check into Checker::report_diagnostic like we discussed a bit on #18232.

In short, I think most of the need for this field goes away when we combine the diagnostic types, and we should be able to work around the rest if needed. Does that sound right to you?

@MichaReiser
Copy link
Member

MichaReiser commented May 27, 2025

I was thinking that would take care of most uses, but the rule_enabled checks might be slightly trickier.

Do you have a link to this check. I think that should be fine as long as you have a ViolationMetadata.

In short, I think most of the need for this field goes away when we combine the diagnostic types, and we should be able to work around the rest if needed. Does that sound right to you?

It does sound right in that this is what I would hope to be possible. I don't have a clear enough picture to say whether it will be possible.

(and one unrelated import order refactor)
@ntBre
Copy link
Contributor Author

ntBre commented May 27, 2025

Do you have a link to this check. I think that should be fine as long as you have a ViolationMetadata.

Oops, I meant Checker::enabled:

#[inline]
pub(crate) const fn enabled(&self, rule: Rule) -> bool {
self.settings.rules.enabled(rule)
}

I think we can work around the uses that don't pass a Rule directly easily enough with a ViolationMetadata, like you said.

It does sound right in that this is what I would hope to be possible. I don't have a clear enough picture to say whether it will be possible.

Sounds good. It's not entirely clear to me either, but it feels like it should work!

@MichaReiser
Copy link
Member

Why would we have to change checker.enabled? Most call sites should have access to the rule or are there call sites where we check if the diagnostic's rule is enabled?

@ntBre
Copy link
Contributor Author

ntBre commented May 27, 2025

Yes, there are call sites where diagnostic.rule() is passed. Some of them are already replaced in #18232, but there are still a few left.

@ntBre
Copy link
Contributor Author

ntBre commented May 27, 2025

Did I really break the mypy primer? I didn't think I touched any ty files, but everything should be up to date and it looks okay on main. I also tried rerunning once. It looks like this is the root cause:

fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/function/execute.rs:209:25 when checking `/tmp/mypy_primer/projects/arviz/arviz/data/inference_data.py`: `infer_definition_types(Id(c2a15)): execute: too many cycle iterations`

@MichaReiser
Copy link
Member

Did I really break the mypy primer? I didn't think I touched any ty files, but everything should be up to date and it looks okay on main. I also tried rerunning once. It looks like this is the root cause:

seems unlikely.

@AlexWaygood
Copy link
Member

I think it's possible that a recent commit on arviz (arviz-devs/arviz@3205b82? it landed a few hours ago) means that ty now crashes on that project when we didn't previously. There's the same crash being reported on #18333

@AlexWaygood
Copy link
Member

if so we should just move the project from good.txt to bad.txt (these files [here
(https://github.com/astral-sh/ruff/tree/main/crates/ty_python_semantic/resources/primer)) but would be good to confirm that that's the cause first

@AlexWaygood
Copy link
Member

Confirmed locally that ty @ Ruff's main branch crashes on arviz @ arviz-devs/arviz@3205b82 but not on the prior commit

@AlexWaygood
Copy link
Member

filed #18336 to fix

@AlexWaygood
Copy link
Member

filed #18336 to fix

Which has now landed -- the primer panic should go away if you rebase!

@ntBre
Copy link
Contributor Author

ntBre commented May 27, 2025

Amazing, thank you @AlexWaygood!

@ntBre
Copy link
Contributor Author

ntBre commented May 27, 2025

Hmm, I got a new cascade of errors on this one.

fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/views.rs:114:9 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/build.py`: `No downcaster registered for type `dyn ty_python_semantic::db::Db` in `Views``
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/__main__.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/parser.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/grammar_parser.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/grammar.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/sccutils.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/first_sets.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/__init__.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/tokenizer.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/utils.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/python_generator.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/parser_generator.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/grammar_visualizer.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/4818b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/web.py`: `index `23` is uninitialized`
fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/[481](https://github.com/astral-sh/ruff/actions/runs/15283313638/job/42987393320?pr=18234#step:6:482)8b15/src/zalsa.rs:210:32 when checking `/tmp/mypy_primer/projects/pegen/src/pegen/validator.py`: `index `23` is uninitialized

I'll try rerunning once to see if that helps.

ntBre added a commit that referenced this pull request May 27, 2025
I avoided this initially, but it will be cheap enough once #18234 lands
@ntBre ntBre merged commit 9925910 into main May 28, 2025
35 checks passed
@ntBre ntBre deleted the brent/violation-rules branch May 28, 2025 13:27
dcreager added a commit that referenced this pull request May 28, 2025
* main:
  [ty] Support ephemeral uv virtual environments (#18335)
  Add a `ViolationMetadata::rule` method (#18234)
  Return `DiagnosticGuard` from `Checker::report_diagnostic` (#18232)
  [flake8_use_pathlib]: Replace os.symlink with Path.symlink_to (PTH211) (#18337)
  [ty] Support cancellation and retry in the server (#18273)
  [ty] Synthetic function-like callables (#18242)
  [ty] Support publishing diagnostics in the server (#18309)
  Add Autofix for ISC003 (#18256)
  [`pyupgrade`]: new rule UP050 (`useless-class-metaclass-type`) (#18334)
  [pycodestyle] Make `E712` suggestion not assume a context (#18328)
carljm added a commit to MatthewMckee4/ruff that referenced this pull request May 28, 2025
* main: (246 commits)
  [ty] Simplify signature types, use them in `CallableType` (astral-sh#18344)
  [ty] Support ephemeral uv virtual environments (astral-sh#18335)
  Add a `ViolationMetadata::rule` method (astral-sh#18234)
  Return `DiagnosticGuard` from `Checker::report_diagnostic` (astral-sh#18232)
  [flake8_use_pathlib]: Replace os.symlink with Path.symlink_to (PTH211) (astral-sh#18337)
  [ty] Support cancellation and retry in the server (astral-sh#18273)
  [ty] Synthetic function-like callables (astral-sh#18242)
  [ty] Support publishing diagnostics in the server (astral-sh#18309)
  Add Autofix for ISC003 (astral-sh#18256)
  [`pyupgrade`]: new rule UP050 (`useless-class-metaclass-type`) (astral-sh#18334)
  [pycodestyle] Make `E712` suggestion not assume a context (astral-sh#18328)
  put similar dunder-call tests next to each other (astral-sh#18343)
  [ty] Derive `PartialOrd, Ord` for `KnownInstanceType` (astral-sh#18340)
  [ty] Simplify `Type::try_bool()` (astral-sh#18342)
  [ty] Simplify `Type::normalized` slightly (astral-sh#18339)
  [ty] Move arviz off the list of selected primer projects (astral-sh#18336)
  [ty] Add --config-file CLI arg (astral-sh#18083)
  [ty] Tell the user why we inferred a certain Python version when reporting version-specific syntax errors (astral-sh#18295)
  [ty] Implement implicit inheritance from `Generic[]` for PEP-695 generic classes (astral-sh#18283)
  [ty] Add hint if async context manager is used in non-async with statement (astral-sh#18299)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
diagnostics Related to reporting of diagnostics. internal An internal refactor or improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants