Skip to content

Bug Report: VReplication loses JSON type information during binary-to-text roundtrip #19880

@arthurschreiber

Description

@arthurschreiber

Overview of the Issue

MySQL's binary JSON format stores richer type information than standard text JSON. Values like CAST('2024-01-15' as DATE), TIME, DATETIME, DECIMAL, BIT, and BLOB all have distinct type tags in the binary representation, but are indistinguishable from plain strings or numbers in text JSON.

During VReplication, the vstreamer decodes binary JSON from the binlog into a json.Value tree (which preserves these types), but then immediately serializes it to text JSON via MarshalTo() before sending it over gRPC (go/mysql/binlog/rbr.go:688-694). This discards the type tags. On the replicator side, marshalJSONForSQL() re-parses that text, but can only recover standard JSON types. The tree-based encoder's code paths for TypeDate, TypeTime, TypeDateTime, TypeBlob, and TypeBit are effectively dead code in the VReplication flow.

As a result, after VReplication copies a table, JSON documents with MySQL-specific typed values will have those values stored as their text equivalents (strings/numbers) on the target. JSON_TYPE() returns different results on source vs. target, and any behavior depending on internal type tags (sorting, comparison) may differ.

Reproduction Steps

  1. On the source, create a table with a JSON column and insert a value with a typed JSON element:
    CREATE TABLE t1 (id INT PRIMARY KEY, doc JSON);
    INSERT INTO t1 VALUES (1, JSON_OBJECT('created', CAST('2024-01-15' as DATE)));
  2. Verify the type on the source:
    SELECT JSON_TYPE(JSON_EXTRACT(doc, '$.created')) FROM t1;
    -- Returns: DATE
  3. VReplicate the table to a target.
  4. Check the type on the target:
    SELECT JSON_TYPE(JSON_EXTRACT(doc, '$.created')) FROM t1;
    -- Returns: STRING

Binary Version

All versions (the code path has been present since at least the introduction of MarshalTo in rbr.go).

Operating System and Environment details

N/A — this is a logic issue in the vstreamer's JSON serialization, not environment-specific.

Log Fragments

N/A — no errors are produced. The type degradation is silent.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status

    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions