-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[CASSANDRA-20804][trunk] Optimize DataPlacement lookup by ReplicationParams #4282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
00bcffd
to
ab95a68
Compare
// as we have as keys in metadata.placements to have a fast map lookup | ||
// ReplicationParams are immutable, so it is a safe optimization | ||
KeyspaceParams keyspaceParams = keyspaceMetadata.params; | ||
ReplicationParams replicationParams = metadata.placements.deduplicateReplicationParams(keyspaceParams.replication); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this logic really necessary? What we started to do is withSwapped
which is creating new object new KeyspaceParams
and new KeyspaceMetadata
etc ...
So one step forward but also some step back as we allocate ...
What if "the deduplication" was done directly upon KeyspaceMetadata.create
? We do attrs.asNewKeyspaceParams
just so we do further transformations / deduplications on that. Why would not we put KeyspaceParams
into KeyspaceMetadata.create
deduplicated already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would not we put KeyspaceParams into KeyspaceMetadata.create deduplicated already?
Yes, agree, it will be more clear and will reduce amount of juggling with the objects (so, the logic will be more readable).
So one step forward but also some step back as we allocate ...
Just to clarify: this method is invoked when we apply schema changes or when we load schema on startup (from TCM log), so this method is not on a hot path to be an optimization target itself and I would not worry a lot about allocation of several extra objects here.
The goal of this deduplication is not to reduce memory usage but to make lookup from metadata.placements (metadata.placements.get(ks.params.replication)
) more efficient within a hot path during a plain write.
ReplicationParams is a key for metadata.placements map, so when we do a lookup from it we have to compare ReplicationParams object provided as a key to the get operation and a ReplicationParams object stored in the map. Deduplication utilises a fast path within this equals via == instead of full comparison of inner structures (which is much more expensive).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing is that this logic is not applied when we deserialize a ClusterMetadata
snapshot, which can happen during replay at startup or when catching up from a peer.
In that case no deduplication is done, so there will still be multiple equivalent ReplicationParams
instances in schema. Similarly, the map keys in DataPlacements
will be distinct from those instances.
I don't believe this affects the intent of this patch, as the deduplication of read/writeReplicaGroups
in the DataPlacement
construction is still done (i.e. if reads.equals(writes)
so the new reference equality check in ReplicaLayout
is still valid.
Of course, the memoized hashcode in ReplicationParams
still works as expected so despite the KeyspaceParams
and DataPlacements
having pointers to different instances, the map lookup is still fine.
So I think the issue is that this deduplication may lead to some confusion if it isn't applied consistently. Deduping the replication params instances during deserialization may be more trouble than it's worth, especially as it's actually the hashcode memoization & this ReplicaGroups
deduplication that actually provide the benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @beobal, thank you for the checking the change. Yes, I see, I have checked that the deduplication is applied during an initial creation or altering keyspace, as well as on startup (but only a recent one, so it was a log replay path, not a snapshot path).
Let me re-measure to see the remaining cost of the equals after other changes applied.
Regarding the snapshot logic, am I right that this path is going through: org.apache.cassandra.tcm.ClusterMetadata.Serializer#deserialize ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @netudima. Yes, that's right about the deserialize code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the deduplication for the snapshot loading case as well.
The cost of the remaining equals logic is about 0.3% of total CPU (the test load is CPU-bound, so it corresponds to the real 0.3% of CPU).
While it may look as a small amount the problem is that the majority of our hot write path is composed of such small fragments, each one take a little, but in sum they consume a lot of CPU, so if we want to improve overall performance we have to deal with such small things.
So, my suggestion is to add the deduplication, the logic is straightforward and I do not see particular risks here to make it less stable.
Avoid double lookup of the same DataPlacement in forNonLocalStrategyTokenRead and forNonLocalStrategyTokenWrite methods Memorize hashCode value Deduplicate ReplicationParams to use the same objects in DataPlacements and KeyspaceMetadata to use the fast == path in the equals Patch by Dmitry Konstantinov; reviewed by TBD for CASSANDRA-20804
… identify pending endpoints) Patch by Dmitry Konstantinov; reviewed by TBD for CASSANDRA-20804
…ping is not needed Patch by Dmitry Konstantinov; reviewed by Štefan Miklošovič for CASSANDRA-20804
Patch by Dmitry Konstantinov; reviewed by Štefan Miklošovič, Sam Tunnicliffe for CASSANDRA-20804
f7a354c
to
30b6d7d
Compare
Avoid double lookup of the same DataPlacement in forNonLocalStrategyTokenRead and forNonLocalStrategyTokenWrite methods Memorize hashCode value
Deduplicate ReplicationParams to use the same objects in DataPlacements and KeyspaceMetadata to use the fast == path in the equals
Patch by Dmitry Konstantinov; reviewed by TBD for CASSANDRA-20804