gh-127443: add tool for linting Doc/data/refcounts.dat
#127476
Draft: picnixz wants to merge 32 commits into python:main from picnixz:tools/refcounts/lint-127443
Changes from 3 commits
Commits
183388f Add tool for linting `Doc/data/refcounts.dat` (picnixz)
9c2a109 use single-quotes (picnixz)
6c1a19e add default paths (picnixz)
419197b use NamedTuple whenever possible (picnixz)
58ffbeb address Peter's review (picnixz)
9a46f10 fix typos (picnixz)
8372de6 address Alex's review (picnixz)
4bcf719 format using a mix of black/ruff/picnixz format (picnixz)
841a4b1 rename STEALS -> STEAL (picnixz)
970cbc7 detect more ref. count manageable objects (picnixz)
4558483 add `lineno` to RETURN values and use kw-only (picnixz)
b8f6090 use helper for C identifier detection (picnixz)
db9b6e6 `RefType` -> `RefEffect` (picnixz)
480f500 improve `ParserReporter` (picnixz)
9814dd7 disallow stolen ref in return values (picnixz)
82766b3 add doc (picnixz)
e7a7a10 fix workflow (picnixz)
f64a23d update pre-commit hook (picnixz)
2eb541f fix some typos (picnixz)
cf42e03 restrict the ruff rules (picnixz)
eb893d0 add ruff docstrings rules (picnixz)
dbe29a6 address Peter's review (picnixz)
658e332 update some variable names (picnixz)
5660ffe add TODO messages (picnixz)
afceff0 RefEffect -> Effect (picnixz)
d173d7a extract checking logic into smaller functions (picnixz)
0edd489 add --strict errors mode (picnixz)
a3becf0 additional stealing effects (picnixz)
3ac1ff1 Merge remote-tracking branch 'upstream/main' into tools/refcounts/lin… (picnixz)
baf2474 address Hugo's review (picnixz)
549ba49 remove TODO symbols for now (we don't want yet to change the Sphinx e… (picnixz)
c708a4c address Alex's review (picnixz)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
target-version = "py312" | ||
line-length = 79 | ||
fix = true | ||
|
||
[format] | ||
quote-style = "single" | ||
|
||
[lint] | ||
select = [ | ||
"ALL", | ||
|
||
"TCH", | ||
] | ||
ignore = [ | ||
# isort | ||
'I001', # unsorted or unformatted import | ||
# mccabe | ||
"C901", | ||
# pydocstyle | ||
"D", | ||
# flake8-quotes (Q) | ||
'Q000', # double quotes found but single quotes preferred | ||
'Q001', # single quote docstring found but double quotes preferred | ||
# flake8-print (T20) | ||
'T201', # print found | ||
# pylint (PL) | ||
'PLR0912', # too many branches | ||
'PLR2004', # avoid magic values | ||
] | ||
|
||
[lint.isort] | ||
required-imports = ["from __future__ import annotations"] | ||
|
||
[lint.pycodestyle] | ||
max-doc-length = 79 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,301 @@ | ||
"""Lint Doc/data/refcounts.dat.""" | ||
|
||
from __future__ import annotations | ||
|
||
import itertools | ||
import re | ||
import tomllib | ||
from argparse import ArgumentParser, RawDescriptionHelpFormatter | ||
from dataclasses import dataclass, field | ||
from enum import auto as _auto, Enum | ||
from pathlib import Path | ||
from typing import TYPE_CHECKING, LiteralString, NamedTuple | ||
|
||
if TYPE_CHECKING: | ||
from collections.abc import Callable, Iterable, Mapping | ||
|
||
ROOT = Path(__file__).parent.parent.parent.resolve() | ||
DEFAULT_REFCOUNT_DAT_PATH: str = str(ROOT / 'Doc/data/refcounts.dat') | ||
DEFAULT_STABLE_ABI_TOML_PATH: str = str(ROOT / 'Misc/stable_abi.toml') | ||
|
||
|
||
C_ELLIPSIS: LiteralString = '...' | ||
|
||
MATCH_TODO: Callable[[str], re.Match | None] | ||
MATCH_TODO = re.compile(r'^#\s*TODO:\s*(\w+)$').match | ||
|
||
OBJECT_TYPES: frozenset[str] = frozenset() | ||
|
||
|
||
for qualifier, object_type, suffix in itertools.product( | ||
('const ', ''), | ||
( | ||
'PyObject', | ||
'PyLongObject', 'PyTypeObject', | ||
'PyCodeObject', 'PyFrameObject', | ||
'PyModuleObject', 'PyVarObject', | ||
), | ||
('*', '**', '* const *', '* const*'), | ||
): | ||
OBJECT_TYPES |= { | ||
f'{qualifier}{object_type}{suffix}', | ||
f'{qualifier}{object_type} {suffix}', | ||
} | ||
|
||
del suffix, object_type, qualifier | ||
|
||
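The loop above enumerates every spelling of a reference-counted pointer type that the linter will recognize. A minimal standalone sketch of the same construction follows, using a shortened type list; `base_types` and `object_types` are illustrative names, not part of the PR.

```python
import itertools

# Shortened, illustrative type list; the tuple in the PR is longer.
base_types = ('PyObject', 'PyTypeObject')

object_types: frozenset[str] = frozenset()
for qualifier, base, suffix in itertools.product(
    ('const ', ''),
    base_types,
    ('*', '**', '* const *', '* const*'),
):
    # `|=` on a frozenset rebinds the name to a new frozenset.
    object_types |= {
        f'{qualifier}{base}{suffix}',   # e.g. 'PyObject*'
        f'{qualifier}{base} {suffix}',  # e.g. 'PyObject *'
    }

print(len(object_types))
print('const PyObject * const *' in object_types)
print('PyObject*const*' in object_types)
```

Only the exact spellings generated by the product are recognized, so a type written with different spacing (such as the last lookup) would not be treated as reference-counted.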
IGNORE_LIST: frozenset[str] = frozenset(( | ||
# part of the stable ABI but should not be used at all | ||
'PyUnicode_GetSize', | ||
# part of the stable ABI but completely removed | ||
'_PyState_AddModule', | ||
)) | ||
|
||
def flno_(lineno: int) -> str: | ||
# Format the line number so that users can copy/paste it from the | ||
# terminal and jump to it in their editor using Ctrl+G. | ||
return f'{lineno:>5} ' | ||
|
||
class RefType(Enum): | ||
UNKNOWN = _auto() | ||
UNUSED = _auto() | ||
DECREF = _auto() | ||
BORROW = _auto() | ||
INCREF = _auto() | ||
STEALS = _auto() | ||
NULL = _auto() # for return values only | ||
|
||
class LineInfo(NamedTuple): | ||
|
||
func: str | ||
ctype: str | None | ||
name: str | None | ||
reftype: RefType | None | ||
comment: str | ||
|
||
raw_func: str | ||
raw_ctype: str | ||
raw_name: str | ||
raw_reftype: str | ||
|
||
strip_func: bool | ||
strip_ctype: bool | ||
strip_name: bool | ||
strip_reftype: bool | ||
|
||
@dataclass(slots=True) | ||
class Return: | ||
ctype: str | None | ||
reftype: RefType | None | ||
comment: str | ||
|
||
@dataclass(slots=True) | ||
class Param: | ||
name: str | ||
lineno: int | ||
|
||
ctype: str | None | ||
reftype: RefType | None | ||
comment: str | ||
|
||
@dataclass(slots=True) | ||
class Signature: | ||
name: str | ||
lineno: int | ||
rparam: Return | ||
params: dict[str, Param] = field(default_factory=dict) | ||
|
||
class FileView(NamedTuple): | ||
signatures: Mapping[str, Signature] | ||
incomplete: frozenset[str] | ||
|
||
def parse_line(line: str) -> LineInfo | None: | ||
parts = line.split(':', maxsplit=4) | ||
if len(parts) != 5: | ||
return None | ||
|
||
raw_func, raw_ctype, raw_name, raw_reftype, comment = parts | ||
|
||
func = raw_func.strip() | ||
strip_func = func != raw_func | ||
if not func: | ||
return None | ||
|
||
clean_ctype = raw_ctype.strip() | ||
ctype = clean_ctype or None | ||
strip_ctype = clean_ctype != raw_ctype | ||
|
||
clean_name = raw_name.strip() | ||
name = clean_name or None | ||
strip_name = clean_name != raw_name | ||
|
||
clean_reftype = raw_reftype.strip() | ||
strip_reftype = clean_reftype != raw_reftype | ||
|
||
if clean_reftype == '-1': | ||
reftype = RefType.DECREF | ||
elif clean_reftype == '0': | ||
reftype = RefType.BORROW | ||
elif clean_reftype == '+1': | ||
reftype = RefType.INCREF | ||
elif clean_reftype == '$': | ||
reftype = RefType.STEALS | ||
elif clean_reftype == 'null': | ||
reftype = RefType.NULL | ||
elif not clean_reftype: | ||
reftype = RefType.UNUSED | ||
else: | ||
reftype = RefType.UNKNOWN | ||
|
||
|
||
comment = comment.strip() | ||
return LineInfo(func, ctype, name, reftype, comment, | ||
raw_func, raw_ctype, raw_name, raw_reftype, | ||
strip_func, strip_ctype, strip_name, strip_reftype) | ||
|
||
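`parse_line` relies on each entry having exactly five colon-separated fields, with `maxsplit=4` so that colons inside the trailing comment are preserved. A small sketch of that splitting step is below; `split_entry` is a hypothetical helper and the sample entries are illustrative, not quoted from the real data file.

```python
from __future__ import annotations


def split_entry(line: str) -> list[str] | None:
    """Split an entry into its five raw fields (hypothetical helper)."""
    parts = line.split(':', maxsplit=4)
    # Fewer than four colons means the entry is malformed.
    return parts if len(parts) == 5 else None


# Return-value entry: empty name field, '+1' in the refcount field.
print(split_entry('PyList_New:PyObject*::+1:'))
# Parameter entry whose comment itself contains a colon.
print(split_entry('PyList_New:Py_ssize_t:len::note: size hint'))
# Malformed entry with only three fields.
print(split_entry('PyList_New:PyObject*:+1'))
```

Because the split stops after the fourth colon, the comment field can contain arbitrary text without confusing the parser.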
class ParserReporter: | ||
def __init__(self) -> None: | ||
self.count = 0 | ||
|
||
def info(self, lineno: int, message: str) -> None: | ||
self.count += 1 | ||
print(f'{flno_(lineno)} {message}') | ||
|
||
warn = error = info | ||
|
||
def parse(lines: Iterable[str]) -> FileView: | ||
signatures: dict[str, Signature] = {} | ||
incomplete: set[str] = set() | ||
|
||
w = ParserReporter() | ||
|
||
for lineno, line in enumerate(map(str.strip, lines), 1): | ||
if not line: | ||
continue | ||
if line.startswith('#'): | ||
if match := MATCH_TODO(line): | ||
incomplete.add(match.group(1)) | ||
continue | ||
|
||
e = parse_line(line) | ||
if e is None: | ||
w.error(lineno, f'cannot parse: {line!r}') | ||
continue | ||
|
||
if e.strip_func: | ||
w.warn(lineno, f'[func] whitespaces around {e.raw_func!r}') | ||
if e.strip_ctype: | ||
w.warn(lineno, f'[type] whitespaces around {e.raw_ctype!r}') | ||
if e.strip_name: | ||
w.warn(lineno, f'[name] whitespaces around {e.raw_name!r}') | ||
if e.strip_reftype: | ||
w.warn(lineno, f'[ref] whitespaces around {e.raw_reftype!r}') | ||
|
||
func, name = e.func, e.name | ||
ctype, reftype = e.ctype, e.reftype | ||
comment = e.comment | ||
|
||
if func not in signatures: | ||
# process return value | ||
if name is not None: | ||
w.warn(lineno, f'named return value in {line!r}') | ||
ret_param = Return(ctype, reftype, comment) | ||
signatures[func] = Signature(func, lineno, ret_param) | ||
else: | ||
# process parameter | ||
if name is None: | ||
w.error(lineno, f'missing parameter name in {line!r}') | ||
continue | ||
sig: Signature = signatures[func] | ||
if name in sig.params: | ||
w.error(lineno, f'duplicated parameter name in {line!r}') | ||
continue | ||
sig.params[name] = Param(name, lineno, ctype, reftype, comment) | ||
|
||
if w.count: | ||
print() | ||
print(f'Found {w.count} issue(s)') | ||
|
||
return FileView(signatures, frozenset(incomplete)) | ||
|
||
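In `parse`, the first entry seen for a function is taken as its return value and every later entry with the same function name is a parameter. The sketch below shows that grouping rule in isolation; it uses plain lists and tuples instead of the PR's `Signature`, `Return` and `Param` classes, and the entries are illustrative.

```python
from __future__ import annotations

# Illustrative entries in the five-field format.
lines = [
    'PyList_Append:int:::',
    'PyList_Append:PyObject*:list:0:',
    'PyList_Append:PyObject*:item:+1:',
]

# Maps function name -> (return entry, {parameter name: entry}).
signatures: dict[str, tuple[list[str], dict[str, list[str]]]] = {}
for line in lines:
    fields = line.split(':', maxsplit=4)
    func, name = fields[0], fields[2]
    if func not in signatures:
        # The first line seen for a function describes its return value.
        signatures[func] = (fields, {})
    else:
        # Later lines describe parameters, keyed by parameter name.
        signatures[func][1][name] = fields

ret, params = signatures['PyList_Append']
print(ret[1], list(params))
```

One consequence of this rule is that entries for the same function must be contiguous in meaning: a parameter line that appears before the return line would be read as the return value.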
class CheckerWarnings: | ||
def __init__(self) -> None: | ||
self.count = 0 | ||
|
||
|
||
def block(self, sig: Signature, message: str) -> None: | ||
self.count += 1 | ||
print(f'{flno_(sig.lineno)} {sig.name:50} {message}') | ||
|
||
def param(self, sig: Signature, param: Param, message: str) -> None: | ||
self.count += 1 | ||
fullname = f'{sig.name}[{param.name}]' | ||
print(f'{flno_(param.lineno)} {fullname:50} {message}') | ||
|
||
def check(view: FileView) -> None: | ||
w = CheckerWarnings() | ||
|
||
for sig in view.signatures.values(): # type: Signature | ||
# check the return value | ||
rparam = sig.rparam | ||
if not rparam.ctype: | ||
w.block(sig, 'missing return value type') | ||
if rparam.reftype is RefType.UNKNOWN: | ||
w.block(sig, 'unknown reference count effect for return value') | ||
# check the parameters | ||
for name, param in sig.params.items(): # type: (str, Param) | ||
ctype, reftype = param.ctype, param.reftype | ||
if ctype in OBJECT_TYPES and reftype is RefType.UNUSED: | ||
w.param(sig, param, 'missing reference count management') | ||
if ctype not in OBJECT_TYPES and reftype is not RefType.UNUSED: | ||
w.param(sig, param, 'unused reference count management') | ||
if name != C_ELLIPSIS and not name.isidentifier(): | ||
# approximation: str.isidentifier() also accepts non-ASCII names | ||
w.param(sig, param, 'invalid parameter name') | ||
|
||
if w.count: | ||
print() | ||
print(f'Found {w.count} issue(s)') | ||
names = view.signatures.keys() | ||
if sorted(names) != list(names): | ||
print('Entries are not sorted') | ||
|
||
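The sortedness check at the end of `check` works because a `dict` preserves insertion order, so the key order reflects the order of entries in the file. A tiny sketch with made-up data:

```python
# Insertion order mirrors file order; the values are placeholders.
signatures = {'PyList_New': 1, 'PyList_Append': 2}

names = signatures.keys()
# Compare the file order against the lexicographically sorted order.
is_sorted = sorted(names) == list(names)
print(is_sorted)
```

Here `PyList_Append` sorts before `PyList_New`, so the comparison reports that the entries are out of order.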
def check_structure(view: FileView, stable_abi_file: str) -> None: | ||
print(f"Stable ABI file: {stable_abi_file}") | ||
print() | ||
stable_abi_str = Path(stable_abi_file).read_text() | ||
|
||
stable_abi = tomllib.loads(stable_abi_str) | ||
expect = stable_abi['function'].keys() | ||
# check if there are missing entries (those marked as "TODO" are ignored) | ||
actual = IGNORE_LIST | view.incomplete | view.signatures.keys() | ||
if missing := (expect - actual): | ||
print(f'Missing {len(missing)} stable ABI entries:') | ||
for name in sorted(missing): | ||
print(name) | ||
|
||
STABLE_ABI_FILE_SENTINEL = object() | ||
|
||
def _create_parser() -> ArgumentParser: | ||
parser = ArgumentParser( | ||
prog='lint.py', | ||
formatter_class=RawDescriptionHelpFormatter, | ||
description='Lint the refcounts.dat file.\n\n' | ||
'Use --abi or --abi=FILE to check against the stable ABI.', | ||
) | ||
parser.add_argument('file', nargs='?', default=DEFAULT_REFCOUNT_DAT_PATH, | ||
help='the refcounts.dat file to check ' | ||
'(default: %(default)s)') | ||
parser.add_argument('--abi', nargs='?', default=STABLE_ABI_FILE_SENTINEL, | ||
help='check against the given stable_abi.toml file ' | ||
'(default: %s)' % DEFAULT_STABLE_ABI_TOML_PATH) | ||
return parser | ||
|
||
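The `--abi` option combines `nargs='?'` with a sentinel default so that three cases can be told apart: option absent, option given bare, and option given with a file. A minimal sketch of that pattern follows; `SENTINEL`, `DEFAULT_PATH` and `resolve` are illustrative names rather than the PR's.

```python
from __future__ import annotations

from argparse import ArgumentParser

SENTINEL = object()  # marks "option not given at all"
DEFAULT_PATH = 'Misc/stable_abi.toml'  # illustrative default

parser = ArgumentParser()
parser.add_argument('--abi', nargs='?', default=SENTINEL)


def resolve(argv: list[str]) -> str | None:
    """Return the ABI file to check, or None when --abi is absent."""
    args = parser.parse_args(argv)
    if args.abi is SENTINEL:
        return None
    # A bare `--abi` stores `const`, which is None by default.
    return args.abi or DEFAULT_PATH


print(resolve([]))                     # option absent
print(resolve(['--abi']))              # bare option
print(resolve(['--abi=custom.toml']))  # explicit file
```

A plain `default=None` could not distinguish the absent case from the bare case, which is why a unique sentinel object is compared by identity.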
def main() -> None: | ||
parser = _create_parser() | ||
args = parser.parse_args() | ||
lines = Path(args.file).read_text().splitlines() | ||
|
||
print(' PARSING '.center(80, '-')) | ||
view = parse(lines) | ||
print(' CHECKING '.center(80, '-')) | ||
check(view) | ||
if args.abi is not STABLE_ABI_FILE_SENTINEL: | ||
abi = args.abi or DEFAULT_STABLE_ABI_TOML_PATH | ||
print(' CHECKING STABLE ABI '.center(80, '-')) | ||
check_structure(view, abi) | ||
|
||
if __name__ == '__main__': | ||
main() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
[mypy] | ||
files = Tools/refcounts/lint.py | ||
pretty = True | ||
show_traceback = True | ||
python_version = 3.12 | ||
|
||
strict = True | ||
warn_unreachable = True | ||
enable_error_code = all | ||
|
||
warn_return_any = False |