Fix java.util.Date variant conversion losing precision for Timestamp and Time logical types #2

Open
brandonstanleyappfolio wants to merge 1 commit into seokyun-ha-toss:support-variant-for-sink-connector from brandonstanleyappfolio:variant-date-fix

brandonstanleyappfolio commented Apr 1, 2026

Fix java.util.Date variant conversion losing precision for Timestamp and Time logical types

Kafka Connect represents three distinct temporal logical types using the same
Java class (java.util.Date):

  • org.apache.kafka.connect.data.Timestamp (milliseconds since epoch)
  • org.apache.kafka.connect.data.Time (milliseconds since midnight)
  • org.apache.kafka.connect.data.Date (days since epoch)

The existing primitiveToVariantValue method only checked instanceof Date
with no schema context, converting all java.util.Date values to Iceberg DATE
(days since epoch). This silently discarded the time component from Timestamp
fields (e.g. 2025-12-09T14:30:45.123Z became 2025-12-09).
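
To make the precision loss concrete, here is a minimal, self-contained sketch that uses only the JDK. The class and method names (`DatePrecisionLossSketch`, `toDaysSinceEpoch`, `toMicrosSinceEpoch`) are hypothetical and are not taken from the connector; they only reproduce the arithmetic described above.

```java
import java.time.Instant;
import java.util.Date;
import java.util.concurrent.TimeUnit;

final class DatePrecisionLossSketch {
    // Approximates the old behavior: every java.util.Date is reduced to whole days since epoch.
    static int toDaysSinceEpoch(Date value) {
        return (int) Math.floorDiv(value.getTime(), TimeUnit.DAYS.toMillis(1));
    }

    // What a Timestamp field needs instead: microseconds since epoch, time of day preserved.
    static long toMicrosSinceEpoch(Date value) {
        return Math.multiplyExact(value.getTime(), 1000L);
    }

    public static void main(String[] args) {
        Date timestamp = Date.from(Instant.parse("2025-12-09T14:30:45.123Z"));
        Date midnight = Date.from(Instant.parse("2025-12-09T00:00:00Z"));

        // Both values collapse to the same day number, so the time component is gone.
        System.out.println(toDaysSinceEpoch(timestamp) == toDaysSinceEpoch(midnight)); // true

        // The microsecond representation keeps the two values distinct.
        System.out.println(toMicrosSinceEpoch(timestamp) - toMicrosSinceEpoch(midnight));
    }
}
```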

Fix: thread the Kafka Connect schema through objectToVariantValue and
primitiveToVariantValue so the Date branch can inspect the schema's logical
type name and convert to the correct Iceberg variant type:

  • Timestamp (logical name: org.apache.kafka.connect.data.Timestamp)
    -> Variants.ofTimestamptz (microseconds since epoch)
  • Time (logical name: org.apache.kafka.connect.data.Time)
    -> Variants.ofTime (microseconds since midnight)
  • Date (logical name: org.apache.kafka.connect.data.Date)
    -> Variants.ofDate (days since epoch)
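
The mapping above can be sketched as a dispatch on the schema's logical type name. This is an illustration under assumptions, not the code in this PR: `TemporalDispatchSketch` and `toVariantNumber` are hypothetical, the method returns the raw number that would be handed to the matching `Variants` factory rather than calling Iceberg, and rejecting an unrecognized logical name is an assumption of the sketch.

```java
import java.util.Date;
import java.util.concurrent.TimeUnit;

final class TemporalDispatchSketch {
    static final String TIMESTAMP = "org.apache.kafka.connect.data.Timestamp";
    static final String TIME = "org.apache.kafka.connect.data.Time";
    static final String DATE = "org.apache.kafka.connect.data.Date";

    // Hypothetical helper: picks the unit conversion from the logical type name.
    static long toVariantNumber(String logicalName, Date value) {
        if (logicalName == null) {
            throw new IllegalArgumentException(
                "Cannot convert java.util.Date without a schema logical type");
        }
        long millis = value.getTime();
        switch (logicalName) {
            case TIMESTAMP:
                // milliseconds since epoch -> microseconds since epoch
                return Math.multiplyExact(millis, 1000L);
            case TIME:
                // milliseconds since midnight -> microseconds since midnight
                return Math.multiplyExact(millis, 1000L);
            case DATE:
                // milliseconds since epoch -> whole days since epoch
                return Math.floorDiv(millis, TimeUnit.DAYS.toMillis(1));
            default:
                // Assumption of this sketch: unknown logical names are rejected, not guessed.
                throw new IllegalArgumentException(
                    "Unsupported logical type for java.util.Date: " + logicalName);
        }
    }
}
```

Keeping the logical name as the only dispatch key means the Java class of the value never decides the output type, which is the property the original `instanceof Date` check lacked.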

The schema is propagated from Struct fields (field.schema()), Collection
elements (schema.valueSchema()), and Map values (schema.valueSchema()).
When no schema is available, an IllegalArgumentException is thrown to
prevent silent data loss.
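
The propagation rule can be illustrated with a small recursive walk. Everything here is a stand-in chosen so the sketch runs on the JDK alone: `MiniSchema` is a hypothetical substitute for the few parts of a Connect schema the walk needs, a struct is modeled as a `Map` whose schema carries per-field schemas, and `walk` only records which logical name reaches each `java.util.Date` instead of building a variant.

```java
import java.util.Date;
import java.util.List;
import java.util.Map;

final class SchemaPropagationSketch {
    // Hypothetical stand-in for the parts of a Connect schema used below.
    static final class MiniSchema {
        final String name;                     // logical type name, or null
        final MiniSchema valueSchema;          // element schema (arrays) or value schema (maps)
        final Map<String, MiniSchema> fields;  // per-field schemas when modeling a struct

        MiniSchema(String name, MiniSchema valueSchema, Map<String, MiniSchema> fields) {
            this.name = name;
            this.valueSchema = valueSchema;
            this.fields = fields;
        }
    }

    // Records, in traversal order, the logical name that reaches each Date value.
    static void walk(Object value, MiniSchema schema, List<String> seen) {
        if (value instanceof Date) {
            if (schema == null || schema.name == null) {
                // Fail loudly instead of guessing a temporal type.
                throw new IllegalArgumentException("No schema available for java.util.Date value");
            }
            seen.add(schema.name);
        } else if (value instanceof List) {
            MiniSchema element = schema == null ? null : schema.valueSchema;
            for (Object item : (List<?>) value) {
                walk(item, element, seen);
            }
        } else if (value instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                MiniSchema child = null;
                if (schema != null && schema.fields != null) {
                    child = schema.fields.get(entry.getKey()); // struct: schema of that field
                } else if (schema != null) {
                    child = schema.valueSchema;                // map: shared value schema
                }
                walk(entry.getValue(), child, seen);
            }
        }
    }
}
```

Passing the child schema down at every level is what lets a `Date` nested inside a list or map still be resolved to the correct temporal type.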
