
feat: add rapidocr service for local PDF reading #2229

Open: Paperlz wants to merge 1 commit into the-open-agent:master from Paperlz:feat/support_rapidocr

Paperlz (Contributor) commented May 6, 2026

Summary

  • Add a local OCR service for local_file scanned PDF reading.
  • Start a local FastAPI/RapidOCR service from OpenAgent when no OCR endpoint is already running.
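The "start only when no OCR endpoint is already running" behavior might look roughly like the sketch below. It is not code from this PR: `endpointAlive`, `ensureRunning`, the `/health` path, and the address are hypothetical names and values chosen for illustration, and the real `internal/localocr` package may probe and launch the service differently.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// endpointAlive reports whether a GET to healthURL answers with a 2xx status
// within the timeout. The health path is an assumption, not taken from the PR.
func endpointAlive(healthURL string, timeout time.Duration) bool {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(healthURL)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode >= 200 && resp.StatusCode < 300
}

// ensureRunning calls start only when probe reports that nothing is serving yet.
// It returns true when it had to start the service itself.
func ensureRunning(probe func() bool, start func() error) (bool, error) {
	if probe() {
		return false, nil // an OCR endpoint is already running; reuse it
	}
	if err := start(); err != nil {
		return false, fmt.Errorf("start local OCR service: %w", err)
	}
	return true, nil
}

func main() {
	// Hypothetical local address; the actual host and port are not stated in the PR.
	healthURL := "http://127.0.0.1:8765/health"

	started, err := ensureRunning(
		func() bool { return endpointAlive(healthURL, 500*time.Millisecond) },
		func() error {
			// Placeholder: a real manager would launch the Python service here.
			fmt.Println("would start the local OCR service")
			return nil
		},
	)
	fmt.Println("started:", started, "err:", err)
}
```

Passing the probe and the start step as functions keeps the decision logic testable without a running service.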

Changes

  • Add internal/localocr to manage the local OCR Python service lifecycle.
  • Add deploy/ocr-service FastAPI wrapper using pypdfium2 + RapidOCR.
  • Make local_pdf_ocr_read use the managed local OCR endpoint when Provider URL is empty.
  • Expose Provider URL for local_file as an optional external OCR override.
  • Include OCR service files in GoReleaser archives.
  • Keep Docker OCR deployment available via docker-compose.ocr.yml.
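As a rough illustration of the Provider URL fallback described in the list above, endpoint selection could be sketched as follows. The function name `resolveOCREndpoint` and both URLs are hypothetical and not taken from the PR; the sketch only assumes that an empty Provider URL means "use the managed local service".

```go
package main

import (
	"fmt"
	"strings"
)

// resolveOCREndpoint returns the external Provider URL when one is configured,
// and otherwise falls back to the managed local endpoint.
func resolveOCREndpoint(providerURL, managedURL string) string {
	if u := strings.TrimSpace(providerURL); u != "" {
		// An explicit Provider URL acts as an external OCR override.
		return strings.TrimRight(u, "/")
	}
	return managedURL
}

func main() {
	// Hypothetical address of the managed local service.
	const managed = "http://127.0.0.1:8765"

	fmt.Println(resolveOCREndpoint("", managed))
	fmt.Println(resolveOCREndpoint("https://ocr.example.com/", managed))
}
```

Treating a whitespace-only value as empty avoids sending requests to a blank URL when the field is cleared in a settings form.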

