
A data loss problem with batch INSERT in MySQL #2119

@jackjoesh

Description


Hi @osheroff:
Excuse me, I've found a data loss problem with batch inserts in MySQL.
For example, if we insert 5 rows at once with this SQL pattern: insert into t1 values (data1),(data2),(data3),(data4),(data5);
then all of the rows' binlog events share the same position in MySQL, as in the following picture:
(screenshot: the binlog events for the five rows, all showing the same position)

Since Maxwell's Kafka callback relies on MySQL's raw binlog position, there is a data loss risk: if sending 2 rows (data1, data2) to Kafka fails while sending the other 3 rows (data3, data4, data5) succeeds, the callbacks for the 3 successful rows mark that shared position as complete, and the 2 failed rows never get a chance to be re-sent.
(screenshot illustrating the scenario above)
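To make the failure mode concrete, here is a minimal sketch of what I mean (not Maxwell's actual code; PositionStore and markComplete are hypothetical names): an async send whose callback persists the raw binlog position on success.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

class BatchInsertSketch {

    // Hypothetical persisted offset store, standing in for however the
    // replicator records "everything up to this binlog position is done".
    interface PositionStore {
        void markComplete(long binlogPosition);
    }

    static void sendRow(KafkaProducer<String, String> producer,
                        PositionStore store,
                        String rowJson,
                        long binlogPosition) {
        producer.send(new ProducerRecord<>("maxwell", rowJson), (metadata, exception) -> {
            if (exception == null) {
                // All five rows of the multi-row INSERT carry the same
                // binlogPosition, so a success for data3 can persist the
                // position even though data1 and data2 failed; a restart
                // from that position will skip the failed rows entirely.
                store.markComplete(binlogPosition);
            }
        });
    }
}
```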

Then I found this open PR; it seems you add a sub offset within the same transaction. Can this PR fix the problem? I'm also curious why it hasn't been merged to trunk yet. Does it need more testing?
#2035
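For reference, this is how I understand the sub-offset idea (my own sketch; the PR's actual implementation may differ): each row within one binlog event carries an extra index, so acknowledgements can be tracked per row instead of per shared position.

```java
import java.util.Objects;

// Position key that disambiguates rows sharing one binlog position.
final class SubPosition implements Comparable<SubPosition> {
    final long binlogPosition;   // shared by every row of the multi-row INSERT
    final int subOffset;         // 0..4 for the five rows in the example

    SubPosition(long binlogPosition, int subOffset) {
        this.binlogPosition = binlogPosition;
        this.subOffset = subOffset;
    }

    @Override
    public int compareTo(SubPosition o) {
        int c = Long.compare(binlogPosition, o.binlogPosition);
        return c != 0 ? c : Integer.compare(subOffset, o.subOffset);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SubPosition)) return false;
        SubPosition s = (SubPosition) o;
        return binlogPosition == s.binlogPosition && subOffset == s.subOffset;
    }

    @Override
    public int hashCode() {
        return Objects.hash(binlogPosition, subOffset);
    }
}
```

With a key like this, the position store would only advance past a binlog position once every sub offset at that position has been acknowledged, so a failure on sub offset 0 would block the position from being persisted.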

We also ran into another data loss problem, where rows of a single transaction flow to different Kafka partitions.
You fixed that in c20370c, but then reverted it because you considered it a dumb implementation.
Can this older problem also be fixed by your new open PR?
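For context, here is a rough sketch of how I understand the partitioning side of this (an assumption on my part, not Maxwell's actual partitioner): if rows of one transaction are keyed per row, Kafka's default murmur2-based partitioner can spread them across partitions that succeed or fail independently, while a transaction-scoped key keeps them together.

```java
import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.utils.Utils;

class PartitioningSketch {

    // Same murmur2-based choice Kafka's default partitioner makes for keyed records.
    static int partitionFor(String key, int numPartitions) {
        return Utils.toPositive(Utils.murmur2(key.getBytes(StandardCharsets.UTF_8))) % numPartitions;
    }

    public static void main(String[] args) {
        int partitions = 6;

        // Per-row keys: rows of one transaction can land on different partitions,
        // each with its own independent success/failure.
        for (String pk : new String[]{"1", "2", "3", "4", "5"}) {
            System.out.println("row pk=" + pk + " -> partition "
                    + partitionFor("t1:" + pk, partitions));
        }

        // Transaction-scoped key: every row of the transaction lands on one partition.
        System.out.println("txn key -> partition " + partitionFor("xid:42", partitions));
    }
}
```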

These two problems have caused some data loss in our production environment, so we are looking forward to your help. Thank you very much!
