Filter out Python files with invalid module names during directory scanning #403

Copilot · 2025-10-08T06:22:23Z

Problem

Vulture processed temporary and backup files created by text editors when scanning directories. For example, Emacs creates temporary files like .#filename.py for unsaved buffers, and Vulture analyzed them even though their names are not valid Python identifiers, so they can never be imported as modules.

Solution

This PR adds filtering so that only Python files with valid module names are processed when scanning directories. A module name must be a valid Python identifier to be importable with the import statement; PEP 8 additionally recommends short, all-lowercase names.

What's changed:

Added _is_valid_module_name() helper function that checks if a filename is a valid Python identifier (must start with a letter or underscore, followed by letters, digits, or underscores)
Updated get_modules() to filter out files with invalid names when using rglob() on directories
Files explicitly specified on the command line are still processed regardless of name (respecting user intent)
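
The behavior described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `is_valid_module_name()` and `collect_modules()` are hypothetical stand-ins for `_is_valid_module_name()` and `get_modules()`, and the use of `str.isidentifier()` on the file stem is an assumption about how the check is implemented.

```python
from pathlib import Path


def is_valid_module_name(path: Path) -> bool:
    """Return True if the file name without its extension is a valid identifier.

    Hypothetical stand-in for the PR's _is_valid_module_name() helper.
    """
    return path.stem.isidentifier()


def collect_modules(paths):
    """Collect Python files, filtering by name only during directory scans.

    Hypothetical stand-in for get_modules(); the real function may differ.
    """
    modules = []
    for path in map(Path, paths):
        if path.is_dir():
            # Directory scan: skip files whose stems are not valid identifiers.
            modules.extend(
                p for p in sorted(path.rglob("*.py")) if is_valid_module_name(p)
            )
        else:
            # Explicitly named file: keep it regardless of its name.
            modules.append(path)
    return modules
```

In this sketch, passing a directory skips a file such as .#filename.py, while passing that same file by its explicit path keeps it.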

Examples of filtered files:

.#filename.py - Emacs temporary files
.dotfile.py - Hidden files starting with dots
2module.py - Files starting with numbers
my-module.py - Files containing dashes
~backup.py - Backup files

Examples of files still processed:

module.py - Valid lowercase names
_private.py - Names starting with underscore
__init__.py - Double underscore names
module_123.py - Names containing numbers (but not starting with them)
café.py - Unicode identifiers (valid per PEP 3131)
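
The two lists above can be checked against Python's built-in identifier test. The snippet below is a sketch that assumes the filter is equivalent to calling `str.isidentifier()` on the file stem; `stem_is_identifier()` is a hypothetical name used only for illustration.

```python
from pathlib import Path

# File names taken from the two example lists above.
FILTERED = [".#filename.py", ".dotfile.py", "2module.py", "my-module.py", "~backup.py"]
PROCESSED = ["module.py", "_private.py", "__init__.py", "module_123.py", "café.py"]


def stem_is_identifier(filename: str) -> bool:
    """Return True if the name without its final extension is a valid identifier."""
    return Path(filename).stem.isidentifier()


if __name__ == "__main__":
    for name in FILTERED + PROCESSED:
        status = "processed" if stem_is_identifier(name) else "filtered"
        print(f"{name}: {status}")
```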

Testing:

Added comprehensive test coverage including:

10 tests for valid/invalid module name detection
3 tests for directory scanning behavior
Verified explicit file specification still works for invalid names

Fixes the issue where users had to manually exclude editor temporary files using patterns like "*/.#*.py" in their configuration.

Original prompt

This section details on the original issue you should resolve

<issue_title>Source file filtering is presently too broad</issue_title>
<issue_description>I use emacs for text editing, and when you have an unsaved buffer, emacs creates a temporary file ".#{original_file_name}". I'm able to exclude these in my configs via "*/.#*.py", however I think it's worth bringing this up, as no valid python source file would have such characters.

I'd suggest that by default, vulture only look for python files with valid names.

Cheers</issue_description>

<agent_instructions>Add test for this at a place that fits well into the test suite.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

Fixes #402


- Add `_is_valid_module_name()` helper to check if a filename is a valid Python identifier
- Update `get_modules()` to filter out files with invalid names when scanning directories
- Files explicitly specified on command line are still processed regardless of name
- Add comprehensive tests for valid/invalid module name detection
- Fixes issue with emacs temporary files (.#filename.py) being processed

Co-authored-by: jendrikseipp <[email protected]>

Initial plan

6d86bb1

Copilot AI assigned Copilot and jendrikseipp Oct 8, 2025

Copilot started work on behalf of jendrikseipp October 8, 2025 06:23 View session

Copilot AI changed the title ~~[WIP] Refactor file filtering to exclude temporary emacs files~~ Filter out Python files with invalid module names during directory scanning Oct 8, 2025

Copilot AI requested a review from jendrikseipp October 8, 2025 06:33

Copilot finished work on behalf of jendrikseipp October 8, 2025 06:33
