Skip to content

[MongoDB Replication] Fix resumeTokens going back in time on busy change streams #301

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 17, 2025

Conversation

rkistner
Copy link
Contributor

Due to a bug in the NodeJS driver, in some cases, the change stream may go back in time. In most cases, it would only go back a short period, and quickly catch up again, resulting in almost no symptoms. But if the change stream is never idle for more than 10 seconds at a time, this it could go back multiple hours, resulting in significant replication lag while it attempts to catch up.

The symptoms are (1) replication lag increasing significantly despite no recent sync rule deploys, often in a big jump and (2) repeatedly logging Re-applied transaction ... - skipping checkpoint until replication has caught up.

The issue is triggered by a "ResumableChangeStreamError". It could be from a replication fail-over event, a temporary network issue, or some other scenarios, in which case the driver restarts the change stream with the wrong resume token. See upstream driver issue for details: https://jira.mongodb.org/browse/NODE-7042

The workaround here is to detect when the issue happens, by comparing the LSNs. When the issue is detected, we just throw an error, causing replication to be restarted with a new change stream, from starting at the last good resume token.

To reproduce the issue, see the upstream bug report above. To reproduce within PowerSync, run the same script, while running powersync replication from the same source database. With this fix in place, it now results in logs like this:

error: [powersync_72] Replication error [PSYNC_S1101] Change resumeToken 826877BC42000000032B042C0100296E5A10041831DD5EEE2B4D6495A610E5430872B6463C6F7065726174696F6E54797065003C696E736572740046646F63756D656E744B65790046645F696400646877BC424E5904A4B153BD70000004 (2025-07-16T14:50:42.000Z) is less than last checkpoint LSN 6877bc4600000003|3wAAAANyZXN1bWVUb2tlbgDNAAAAAl9kYXRhAL0AAAA4MjY4NzdCQzQ2MDAwMDAwMDMyQjA0MkMwMTAwMjk2RTVBMTAwNDE4MzFERDVFRUUyQjRENjQ5NUE2MTBFNTQzMDg3MkI2NDYzQzZGNzA2NTcyNjE3NDY5NkY2RTU0Nzk3MDY1MDAzQzc1NzA2NDYxNzQ2NTAwNDY2NDZGNjM3NTZENjU2RTc0NEI2NTc5MDA0NjY0NUY2OTY0MDA2NDY4NzdCQzQyNEU1OTA0QTRCMTUzQkQ3MDAwMDAwNAAAAA==. Restarting replication.
info: [powersync_72] Initial replication already done
info: [powersync_72] Resume streaming at new Timestamp({ t: 1752677143, i: 3 }) / 6877bb1700000003|3wAAAANyZXN1bWVUb2tlbgDNAAAAAl9kYXRhAL0AAAA4MjY4NzdCQjE3MDAwMDAwMDMyQjA0MkMwMTAwMjk2RTVBMTAwNDE4MzFERDVFRUUyQjRENjQ5NUE2MTBFNTQzMDg3MkI2NDYzQzZGNzA2NTcyNjE3NDY5NkY2RTU0Nzk3MDY1MDAzQzc1NzA2NDYxNzQ2NTAwNDY2NDZGNjM3NTZENjU2RTc0NEI2NTc5MDA0NjY0NUY2OTY0MDA2NDY4NzdCQjBEMURDNkRGMjk1OEU5QzI4RTAwMDAwNAAAAA==  | Token age: 9s

@rkistner rkistner requested a review from stevensJourney July 16, 2025 15:00
Copy link

changeset-bot bot commented Jul 16, 2025

🦋 Changeset detected

Latest commit: 097bfb6

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 12 packages
Name Type
@powersync/service-module-mongodb Patch
@powersync/lib-service-mongodb Patch
@powersync/service-core Patch
@powersync/service-image Patch
@powersync/service-schema Patch
@powersync/service-module-mongodb-storage Patch
@powersync/service-core-tests Patch
@powersync/service-module-core Patch
@powersync/service-module-mysql Patch
@powersync/service-module-postgres-storage Patch
@powersync/service-module-postgres Patch
test-client Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@rkistner rkistner merged commit 60bf5f9 into main Jul 17, 2025
49 of 51 checks passed
@rkistner rkistner deleted the fix-resumetoken-timetravel branch July 17, 2025 06:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants