jamesbraza
Collaborator

If a code file is empty (e.g. a py.typed file), the Text-creation loop inside chunk_code_text will not be entered, so we hit an UnboundLocalError:

UnboundLocalError: cannot access local variable 'i' where it is not associated with a value
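
For readers unfamiliar with this failure mode, here is a minimal sketch of the underlying Python behavior. It is not the paperqa code: `last_index_buggy` is a hypothetical stand-in that only mirrors the pattern described above, where a name bound solely by a `for` loop is read after the loop.

```python
def last_index_buggy(lines: list[str]) -> int:
    """Hypothetical stand-in for the pattern above, not the paperqa implementation."""
    for i, _line in enumerate(lines):
        pass  # a real chunker would build its text chunks here
    # 'i' is only bound by the loop, so it does not exist if the loop never ran
    return i


caught = None
try:
    # An empty file (e.g. a py.typed marker) has no lines to iterate over
    last_index_buggy("".splitlines())
except UnboundLocalError as exc:
    caught = exc

print(type(caught).__name__)
```

The exact wording of the exception message varies across Python versions, so the sketch only checks the exception type.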

This PR:

  • Fixes that crash, with test coverage
  • Improves variable names and documents the purpose of conditional logic
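
As a rough illustration of the kind of fix involved, the sketch below binds the loop indexes before the loop so they exist even when there are no lines. It is an assumption-laden simplification, not the merged change: `chunk_lines`, its `chunk_chars` parameter, and the `(label, text)` return shape are hypothetical, and only the names `line_i`, `last_line_i`, and `text_buffer` are borrowed from this PR's review comments.

```python
def chunk_lines(text: str, chunk_chars: int) -> list[tuple[str, str]]:
    """Hypothetical line-based chunker returning (line-range label, chunk text) pairs."""
    chunks: list[tuple[str, str]] = []
    text_buffer = ""
    # Bind both indexes up front so they exist even if the loop body never runs
    line_i = last_line_i = 0
    for line_i, line in enumerate(text.splitlines(keepends=True)):
        text_buffer += line
        if len(text_buffer) >= chunk_chars:
            # Buffer is full: emit a chunk and start the next one on the following line
            chunks.append((f"lines {last_line_i}-{line_i}", text_buffer))
            text_buffer = ""
            last_line_i = line_i + 1
    if text_buffer:
        # Flush any remainder; for empty input the buffer is empty, so nothing is emitted
        chunks.append((f"lines {last_line_i}-{line_i}", text_buffer))
    return chunks


print(chunk_lines("", chunk_chars=10))
print(chunk_lines("ab\ncd\nef\n", chunk_chars=3))
```

Whether the real `chunk_code_text` returns an empty list or something else for an empty file is not stated in this conversation; the sketch simply returns no chunks.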

@jamesbraza jamesbraza self-assigned this Jun 11, 2025
@Copilot Copilot AI review requested due to automatic review settings June 11, 2025 18:35
@jamesbraza jamesbraza added the bug Something isn't working label Jun 11, 2025
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Jun 11, 2025
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes an UnboundLocalError in chunk_code_text when processing empty files and improves the code’s variable naming.

  • Fixes a bug where an empty file (e.g. py.typed) would trigger an UnboundLocalError.
  • Enhances variable names and updates the logic in the chunking loop to improve clarity.
  • Adds test coverage to verify the handling of empty file content.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File | Description
tests/test_paperqa.py | Added a test case for a py.typed file to cover empty content scenarios.
paperqa/readers.py | Fixed the empty-file crash in the chunking logic and renamed variables for clarity.
Comments suppressed due to low confidence (2)

paperqa/readers.py:225

  • [nitpick] Consider renaming 'last_line_i' to 'start_line_index' or a similarly clear name to better indicate its purpose in marking the start of a chunk.
line_i = last_line_i = 0

paperqa/readers.py:222

  • [nitpick] Consider updating the function docstring to mention the use of the 'text_buffer' variable and the revised variable names for better clarity.
'''Parse a document into chunks, based on line numbers (for code).'''

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jun 12, 2025
@jamesbraza jamesbraza merged commit 1993745 into main Jun 12, 2025
17 of 20 checks passed
@jamesbraza jamesbraza deleted the working-with-py-typed branch June 12, 2025 22:21
4 participants