Image object not recognized #3336

flange-ipb · 2025-06-27T11:02:23Z

I'm extracting images from scientific papers. For this PDF, I'm having trouble extracting Fig. 3 on page 10: this image object is not included in PageObject.images.

I have the same issue in PyMuPDF, see PyMuPDF#4577.

Environment

$ python -m platform
Linux-6.12.32-amd64-x86_64-with-glibc2.41

$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==5.6.1, crypt_provider=('local_crypt_fallback', '0.0.0'), PIL=11.2.1

Python version: 3.13.3

Code + PDF

Extracts all images from the given document:

from pypdf import PdfReader

with PdfReader("s44372-024-00085-0.pdf") as reader:
    for page_number, page in enumerate(reader.pages, start=1):
        for image in page.images:
            with open(f"pypdf-{page_number}-{image.name}", "wb") as fp:
                fp.write(image.data)

The PDF in question can be found here. I am not the author of this document. It is published under CC-BY 4.0 and the license terms are included in the document. This license is not viral, so I think it's legal to include it in your test dataset.

Traceback

No traceback

stefan6419846 · 2025-06-27T11:20:03Z

Maintainer

This specific page only references one actual image. Figure 3 is included with plain drawing commands. To extract it, you would have to render the page as an image, but this is out of scope for pypdf.
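
One way to check this for a given page is to count the operators in its content stream. The following is a minimal sketch, not part of pypdf itself: `summarize_operations` and `summarize_page` are hypothetical helper names, and the counts only cover the page's top-level content stream (drawing inside form XObjects is not followed). It assumes that `page.get_contents()` returns a content stream whose `operations` attribute is a list of `(operands, operator)` pairs with byte-string operators.

```python
from collections import Counter

# Path-painting operators from the PDF content-stream operator set.
PATH_PAINT_OPS = {b"S", b"s", b"f", b"F", b"f*", b"B", b"B*", b"b", b"b*"}
# "Do" invokes an XObject (an image OR a form XObject); pypdf reports
# inline images under the pseudo-operator b"INLINE IMAGE".
XOBJECT_OPS = {b"Do", b"INLINE IMAGE"}


def summarize_operations(operations):
    """Count painted paths vs. XObject / inline-image invocations."""
    counts = Counter()
    for _operands, operator in operations:
        if operator in PATH_PAINT_OPS:
            counts["painted_paths"] += 1
        elif operator in XOBJECT_OPS:
            counts["xobject_or_inline_image"] += 1
    return counts


def summarize_page(pdf_path, page_number):
    """Hypothetical helper: summarize one page (1-based) of a PDF."""
    # Imported lazily so summarize_operations() can be used on its own.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    page = reader.pages[page_number - 1]
    contents = page.get_contents()  # None if the page has no content stream
    if contents is None:
        return Counter()
    return summarize_operations(contents.operations)
```

A page whose figure is built from vector graphics would typically show many painted paths and few XObject invocations, for example when calling `summarize_page("s44372-024-00085-0.pdf", 10)`. The exact numbers depend on the file and are not reproduced here.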

flange-ipb Jul 3, 2025
Author

Alright, thanks for providing an honest opinion on the limitations of pypdf. 👍
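
For readers with the same problem, the rendering route mentioned in the answer can be sketched with a separate library. The example below assumes PyMuPDF is installed and importable as `pymupdf` (older releases use `import fitz`). `render_page_to_png` and `dpi_to_zoom` are hypothetical helper names, and the clip rectangle is something you would have to determine yourself, since the position of Fig. 3 on the page is not given in this thread.

```python
def dpi_to_zoom(dpi):
    """Default PDF user space is 72 units per inch, so zoom = dpi / 72."""
    return dpi / 72.0


def render_page_to_png(pdf_path, page_number, out_path, dpi=200, clip=None):
    """Render one page (1-based) to a PNG file.

    `clip` is an optional (x0, y0, x1, y1) tuple in PyMuPDF page
    coordinates (points, origin at the top-left corner of the page).
    Returns the (width, height) of the rendered bitmap in pixels.
    """
    # Imported lazily; PyMuPDF is a separate package, not part of pypdf.
    import pymupdf

    with pymupdf.open(pdf_path) as doc:
        page = doc[page_number - 1]
        zoom = dpi_to_zoom(dpi)
        pix = page.get_pixmap(
            matrix=pymupdf.Matrix(zoom, zoom),
            clip=pymupdf.Rect(*clip) if clip else None,
        )
        pix.save(out_path)  # output format is inferred from the extension
        size = (pix.width, pix.height)
    return size
```

Calling `render_page_to_png("s44372-024-00085-0.pdf", 10, "page-10.png")` would rasterize the whole page. Passing a `clip` tuple restricts the output to the figure area once its coordinates are known.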
