Fix: Add PyCryptodome dependency for encrypted PDF processing #2289

Merged
danielaskdd merged 1 commit into HKUDS:main from danielaskdd:fix-pycrptodome-missing
Oct 30, 2025

@danielaskdd
Collaborator

Fix: Add PyCryptodome dependency for encrypted PDF processing

Problem

When processing encrypted PDF files, the system would fail with the error:

"error_type": "file_extraction_error"
Failed to extract text from PDF: PyCryptodome is required for AES algorithm

This occurred because PyPDF2 requires the pycryptodome package to decrypt and extract text from password-protected or AES-encrypted PDF files, but this dependency was not installed.

Solution

Added pycryptodome>=3.0.0,<4.0.0 as a dependency across all installation methods to ensure encrypted PDFs can be processed successfully.
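The range `>=3.0.0,<4.0.0` admits any 3.x release and excludes the next major version. As a rough illustration, the sketch below checks an installed version against that range. It is a simplification that looks only at the major version number, and the helper names `installed_version` and `in_supported_range` are hypothetical, not part of LightRAG.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "pycryptodome") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def in_supported_range(version: str) -> bool:
    """Approximate '>=3.0.0,<4.0.0' by checking that the major version is 3.

    Assumption: pre-release edge cases (for example '3.0.0rc1') are ignored.
    """
    major = int(version.split(".")[0])
    return major == 3
```

A full specifier check would normally be left to pip or uv at install time; this only shows what the pin allows.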

Changes

1. pyproject.toml

  • Added pycryptodome>=3.0.0,<4.0.0 to the offline-docs optional dependencies
  • Maintains alphabetical ordering within the dependency list
  • Ensures pip install lightrag-hku[offline-docs] includes pycryptodome

2. requirements-offline-docs.txt

  • Added pycryptodome>=3.0.0,<4.0.0 to document processing dependencies
  • Ensures offline document processing installations include the package

3. requirements-offline.txt

  • Added pycryptodome>=3.0.0,<4.0.0 to complete offline dependencies
  • Maintains consistency across all offline installation methods

4. lightrag/api/routers/document_routes.py

  • Added a runtime check and auto-installation for pycryptodome (lines 1086-1087)
  • Automatically installs pycryptodome if missing during PDF processing
  • Mirrors the existing pattern for pypdf2 installation
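The check-then-install pattern described for document_routes.py can be sketched with the standard library alone. This is an illustrative assumption, not the code in the PR: the helper names `is_importable` and `ensure_package` are hypothetical, and the project may use a different installer helper. One detail worth noting is that the pycryptodome distribution is imported under the module name `Crypto`, so the importability check and the pip requirement use different names.

```python
import importlib
import importlib.util
import subprocess
import sys


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def ensure_package(module_name: str, pip_spec: str) -> bool:
    """Install pip_spec when module_name is missing.

    Returns True if an installation was attempted, False if the module
    was already available.
    """
    if is_importable(module_name):
        return False
    # Use the running interpreter so the package lands in the same environment.
    subprocess.check_call([sys.executable, "-m", "pip", "install", pip_spec])
    # Let the import system notice the newly installed package.
    importlib.invalidate_caches()
    return True


# Hypothetical call site before PDF extraction:
# ensure_package("Crypto", "pycryptodome>=3.0.0,<4.0.0")
```

Installing at runtime is a fallback only; declaring the dependency in the packaging files, as the other changes do, keeps offline and locked-down environments working.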

5. uv.lock

  • Automatically updated by the package manager to include pycryptodome and all its platform-specific wheels

Benefits

  • ✅ Supports both regular and encrypted PDF files
  • ✅ Eliminates "PyCryptodome is required for AES algorithm" error
  • ✅ Automatic installation fallback if dependency is missing
  • ✅ Consistent across all installation methods (pip, offline, requirements files)

Testing

To verify the fix works:

  1. Install the updated dependencies
  2. Upload an encrypted PDF file through the API
  3. Verify successful text extraction without errors
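Step 3 can also be checked locally, outside the API. The sketch below is a hedged example rather than the project's extraction code: `extract_pdf_text` is a hypothetical helper that accepts any reader object exposing `is_encrypted`, `decrypt()`, and `pages`, which is the interface PyPDF2's `PdfReader` provides. The empty default password is an assumption that suits PDFs encrypted with only an owner password.

```python
def extract_pdf_text(reader, password: str = "") -> str:
    """Extract text from a PdfReader-like object, decrypting it first if needed.

    Assumption: `reader` behaves like PyPDF2.PdfReader (is_encrypted,
    decrypt(password), pages, page.extract_text()).
    """
    if reader.is_encrypted:
        # decrypt() returns a falsy value when the password is rejected.
        if not reader.decrypt(password):
            raise ValueError("PDF is encrypted and the supplied password was rejected")
    # extract_text() may return None for pages without a text layer.
    return "\n".join((page.extract_text() or "") for page in reader.pages)
```

With PyPDF2 installed, this could be called as `extract_pdf_text(PdfReader("encrypted.pdf"))` after `from PyPDF2 import PdfReader`, where the file name is a placeholder. If pycryptodome is missing and the file uses AES, the "PyCryptodome is required for AES algorithm" error should surface during this call.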

@danielaskdd danielaskdd merged commit 3b48cf1 into HKUDS:main Oct 30, 2025
1 check passed
@danielaskdd danielaskdd deleted the fix-pycrptodome-missing branch October 30, 2025 18:49

Development

Successfully merging this pull request may close these issues.

[Bug]:PyCryptodome is required for AES algorithm
