
XSD gYear literals are represented as dates #1379

@wxwilcke


When creating or reading literals with XSD's gYear as datatype, these get stored as Python's datetime.date. While I can understand the choice for this approach, it nevertheless changes the semantics of the literals and incorrectly increases their precision by specifying a month and day.

To reproduce:

> import rdflib
RDFLib Version: 5.0.0
> gyear = rdflib.Literal("1970", datatype=rdflib.XSD.gYear)
> gyear.toPython()
datetime.date(1970, 1, 1)
> str(gyear)
'1970-01-01'

The above representation causes two problems. Firstly, when decoupling the value from the Literal class, e.g. using gyear.toPython(), str(gyear), or gyear.value, it is no longer possible to discern that the value is of datatype gYear and that the month and day are meaningless. Secondly, when serializing the graph, the month and day are included in the literal, which therefore violates the definition of the value space:

gYear uses the date/timeSevenPropertyModel, with ·month·, ·day·, ·hour·, ·minute·, and ·second· required to be absent [1].
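To make the violation concrete, here is a minimal sketch that checks a string against the gYear lexical pattern as I read it from the spec cited in [1]. The helper name `is_valid_gyear` is hypothetical and not part of rdflib, and the regular expression is my transcription of the spec's pattern, so treat it as an assumption.

```python
import re

# gYear lexical pattern, transcribed from the XSD 1.1 datatypes spec
# (assumption: copied by hand, verify against [1] before relying on it).
GYEAR_PATTERN = re.compile(
    r"-?([1-9][0-9]{3,}|0[0-9]{3})"
    r"(Z|(\+|-)((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?"
)


def is_valid_gyear(lexical: str) -> bool:
    """Return True if the whole string matches the gYear lexical form."""
    return GYEAR_PATTERN.fullmatch(lexical) is not None


print(is_valid_gyear("1970"))        # the form the issue expects
print(is_valid_gyear("1970-01-01"))  # the form produced by str(gyear)
```

Under this pattern the bare year is accepted, while the date-like string that str(gyear) currently returns is rejected.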

I assume this serialization problem is an oversight; it can easily be remedied by returning only the year representation when the datatype is gYear. The other issue is a bit more challenging, since I assume you want to keep the convenience of Python's datetime module. Perhaps it would be an idea to only return a datetime.date object if object.toPython() is called, and to return just the year when str(object) or object.value are used.
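The suggested behaviour could look roughly like the sketch below. It is standalone and does not touch rdflib: the class `GYearValue` and its method `to_python` are hypothetical names used only to illustrate the idea, and it assumes years in the range 1 to 9999 without a timezone.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class GYearValue:
    """Hypothetical stand-in for a gYear literal; not rdflib's Literal."""

    year: int  # assumed to be between 1 and 9999

    def __str__(self) -> str:
        # The string form carries only the year, zero-padded to four digits.
        return f"{self.year:04d}"

    def to_python(self) -> date:
        # Convenience conversion: month and day are placeholders, not data.
        return date(self.year, 1, 1)


gyear = GYearValue(1970)
print(str(gyear))         # year only, no month or day
print(gyear.to_python())  # datetime.date for callers who ask for it
```

The point of the split is that only an explicit conversion widens the value to a full date; the string form, which is what ends up in a serialized graph, stays within the gYear value space.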

[1] https://www.w3.org/TR/2012/REC-xmlschema11-2-20120405/datatypes.html#gYear
