Skip to content

Conversation

@git-hulk
Copy link
Member

@git-hulk git-hulk commented Apr 4, 2024

This closes #2214

The namespace mechanism is NOT allowed in cluster mode, so it's
unnecessary to rewrite while the cluster mode is enabled. This
config rewrite behavior will cause the replication issue which was
mentioned in #2214 when starting the cluster node.

The root cause is that the server will try to rewrite the namespace
into the rocksdb if the option repl-namespace-enabled is enabled.
So it will increase the server's rocksdb sequence and replication will
start with the wrong offset. We have checked if the role is a slave
before rewriting, but the cluster replication is NOT set at that time(master-replica is good).

The good news is it only affects the cluster users who enabled
the option repl-namespace-enabled, so I guess almost no user
will do this since the namespace replication is meaningless to the cluster mode.

Please refer to the first commit in this PR for the reproduced code.
The replica will lose one update before applying this patch:

=== RUN   TestClusterReplication/Cluster_replication_should_work_normally_after_restart
    replication_test.go:88: 
        	Error Trace:	/Users/hulk/code/cxx/kvrocks/tests/gocase/integration/replication/replication_test.go:88
        	Error:      	Not equal: 
        	            	expected: "v1"
        	            	actual  : "v0"

And it works well after this patch.

git-hulk added 3 commits April 4, 2024 19:08
The namespaace mechanism is NOT allowed in cluster mode, so it's
unnecessary to rewrite while the cluster mode was enabled. By the way,
Kvrocks depends on the rocksdb's sequence while replicating, any writes
to rocksdb should be forbidden includes the internal behaviors.
Copy link
Contributor

@torwig torwig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sonarqubecloud
Copy link

sonarqubecloud bot commented Apr 4, 2024

@git-hulk git-hulk merged commit 7e1b797 into apache:unstable Apr 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Slave does not sync key updates during downtime after restart

3 participants