Commit 7129e69

ENH: Enhance XMP metadata handling with creation and setter methods
1 parent 2a91bd4 commit 7129e69

3 files changed (+671, -0 lines)

docs/user/metadata.md

Lines changed: 107 additions & 0 deletions
@@ -121,6 +121,113 @@ if meta:
```python
    print(meta.xmp_create_date)
```

## Creating XMP metadata

You can create XMP metadata easily using the `XmpInformation.create()` method:

```python
from pypdf import PdfWriter
from pypdf.xmp import XmpInformation

# Create a new XMP metadata object
xmp = XmpInformation.create()

# Set metadata fields
xmp.set_dc_title({"x-default": "My Document Title"})
xmp.set_dc_creator(["Author One", "Author Two"])
xmp.set_dc_description({"x-default": "Document description"})
xmp.set_dc_subject(["keyword1", "keyword2", "keyword3"])
xmp.set_pdf_producer("pypdf")

# Create a writer and add the metadata
writer = PdfWriter()
writer.add_blank_page(612, 792)  # Add a page
writer.xmp_metadata = xmp
writer.write("output.pdf")
```

## Setting XMP metadata fields

The `XmpInformation` class provides setter methods for all supported metadata fields:

### Dublin Core fields

```python
from datetime import datetime
from pypdf.xmp import XmpInformation

xmp = XmpInformation.create()

# Single value fields
xmp.set_dc_coverage("Global coverage")
xmp.set_dc_format("application/pdf")
xmp.set_dc_identifier("unique-id-123")
xmp.set_dc_source("Original Source")

# Array fields (bags - unordered)
xmp.set_dc_contributor(["Contributor One", "Contributor Two"])
xmp.set_dc_language(["en", "fr", "de"])
xmp.set_dc_publisher(["Publisher One"])
xmp.set_dc_relation(["Related Doc 1", "Related Doc 2"])
xmp.set_dc_subject(["keyword1", "keyword2"])
xmp.set_dc_type(["Document", "Text"])

# Sequence fields (ordered arrays)
xmp.set_dc_creator(["Primary Author", "Secondary Author"])
xmp.set_dc_date([datetime.now()])

# Language alternative fields
xmp.set_dc_title({"x-default": "Title", "en": "English Title", "fr": "Titre français"})
xmp.set_dc_description({"x-default": "Description", "en": "English Description"})
xmp.set_dc_rights({"x-default": "All rights reserved"})
```
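
The language alternative fields above take a dictionary keyed by language code, with `x-default` as the default entry. As a rough sketch of how such a dictionary might be consumed, the hypothetical helper `pick_language` below (not part of pypdf) returns the entry for a requested language and falls back to `x-default`:

```python
from typing import Optional


def pick_language(alternatives: dict[str, str], lang: str) -> Optional[str]:
    """Hypothetical helper, not part of pypdf: choose one language alternative."""
    # Prefer the requested language, then the "x-default" entry, else None.
    return alternatives.get(lang, alternatives.get("x-default"))


title = {"x-default": "Title", "en": "English Title", "fr": "Titre français"}
french_title = pick_language(title, "fr")
german_title = pick_language(title, "de")  # no "de" entry, so x-default is used
```

Keeping an `x-default` entry in every dictionary means a lookup like this always has something to return.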

### XMP fields

```python
from datetime import datetime

# Date fields accept both datetime objects and strings
xmp.set_xmp_create_date(datetime.now())
xmp.set_xmp_modify_date("2023-12-25T10:30:45Z")
xmp.set_xmp_metadata_date(datetime.now())

# Text field
xmp.set_xmp_creator_tool("pypdf")
```
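
The string passed to `set_xmp_modify_date()` above is an ISO 8601 timestamp in UTC. A minimal sketch, using only the standard library, of producing a string in that same shape from a timezone-aware `datetime`. The variable names are illustrative, and how pypdf itself serializes `datetime` objects is not shown in this diff:

```python
from datetime import datetime, timezone

# A timezone-aware moment; astimezone() normalizes it to UTC before formatting
moment = datetime(2023, 12, 25, 10, 30, 45, tzinfo=timezone.utc)
modify_date = moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

The resulting `modify_date` string has the same layout as the literal used in the example above.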

### PDF fields

```python
xmp.set_pdf_keywords("keyword1, keyword2, keyword3")
xmp.set_pdf_pdfversion("1.4")
xmp.set_pdf_producer("pypdf")
```
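
In these examples `set_pdf_keywords()` receives a single string, while `set_dc_subject()` receives a list. If both should carry the same keywords, one option (a sketch, not something this commit prescribes) is to keep the list as the source of truth and join it for the PDF field:

```python
keywords = ["keyword1", "keyword2", "keyword3"]

# Comma-separated form, matching the string used in the example above
keywords_text = ", ".join(keywords)
```

Here `keywords` would go to `set_dc_subject()` and `keywords_text` to `set_pdf_keywords()`.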

### XMP Media Management fields

```python
xmp.set_xmpmm_document_id("uuid:12345678-1234-1234-1234-123456789abc")
xmp.set_xmpmm_instance_id("uuid:87654321-4321-4321-4321-cba987654321")
```
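
The identifiers above are UUIDs with a `uuid:` prefix. A small sketch of generating fresh ones with the standard library instead of hard-coding them (the helper name `new_xmp_id` is made up for illustration and is not part of pypdf):

```python
import uuid


def new_xmp_id() -> str:
    """Hypothetical helper: return a random UUID in the "uuid:..." form used above."""
    return f"uuid:{uuid.uuid4()}"


document_id = new_xmp_id()
instance_id = new_xmp_id()
```

Each call yields a new random value, so the document and instance IDs differ unless you deliberately reuse one.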

### PDF/A fields

```python
xmp.set_pdfaid_part("1")
xmp.set_pdfaid_conformance("B")
```
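
The two values combine into the usual conformance label: part `1` with conformance `B` corresponds to PDF/A-1b. A tiny illustrative sketch (the function name is hypothetical, and it only formats the label without validating anything):

```python
def pdfa_label(part: str, conformance: str) -> str:
    """Hypothetical helper: format a PDF/A label such as "PDF/A-1b"."""
    return f"PDF/A-{part}{conformance.lower()}"


label = pdfa_label("1", "B")
```

Setting these fields only declares conformance in the metadata. It does not make the file itself PDF/A compliant.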

### Clearing metadata fields

You can clear any field by passing `None`:

```python
xmp.set_dc_title(None)
xmp.set_dc_creator(None)
xmp.set_pdf_producer(None)
```

## Modifying XMP metadata

Modifying XMP metadata is a bit more complicated.
