feat(storage): support for sideloading SSTs #2845
|
@ltagliamonte-dd @PragmaTwice Do you think it's a good idea to provide a command for this purpose? For example: So that it would be easy to ingest SSTs online. |
|
@PragmaTwice @git-hulk I'm not married to the use of a signal. |
Better to be |
I proposed |
|
But the command name will be
|
Makes sense. |
|
@git-hulk @PragmaTwice re-worked as new CMD, thanks for the feedback |
|
I feel that this is not a production-usable feature without considering replication. This scenario is too limiting. |
Thank you for the feedback! I completely understand your concern. Please know that this initial version of SST load doesn't rule out the possibility of adding replication in the future. We're committed to ensuring that nothing we implement is backwards incompatible. |
Yes, we could allow this command only if it's a standalone instance without any replicas. |
addressed via 68ef7f8 |
|
@git-hulk is there an existing method I can use to check whether the node has replicas?
…On Wed, Apr 16, 2025, 4:40 AM hulk ***@***.***> wrote:
***@***.**** approved this pull request.
------------------------------
In src/commands/cmd_server.cc
<#2845 (comment)>:
> + } else {
+ return {Status::RedisParseErr, "movefiles value must be 'yes' or 'no'"};
+ }
+ } else {
+ return {Status::RedisParseErr, "unknown option: " + parser.TakeStr().GetValue()};
+ }
+ }
+ return Commander::Parse(args);
+ }
+
+ Status Execute([[maybe_unused]] engine::Context &ctx, Server *srv, [[maybe_unused]] Connection *conn,
+ std::string *output) override {
+ if (srv->GetConfig()->cluster_enabled) {
+ return {Status::NotOK, "The SST command is not supported in cluster mode."};
+ }
+ if (srv->IsSlave()) {
Need to disallow if it has any replicas.
|
You could add a method in
size_t GetReplicaCount() {
  // The guard releases the mutex on every exit path, including exceptions.
  std::lock_guard guard(slave_threads_mu_);
  return slave_threads_.size();
}
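A minimal sketch of how the suggested `GetReplicaCount()` could back the "disallow if it has any replicas" check. `MockServer`, `AddReplica`, and `CheckSstLoadAllowed` are hypothetical names used only for illustration. Only the cluster-mode error message is taken from the diff quoted above; the other two messages are placeholders.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the server class: it models only the replica
// bookkeeping, not real replication threads.
class MockServer {
 public:
  void AddReplica(int id) {
    std::lock_guard<std::mutex> guard(mu_);
    replica_ids_.push_back(id);
  }

  // Same shape as the suggested method: read the size while holding the lock.
  std::size_t GetReplicaCount() {
    std::lock_guard<std::mutex> guard(mu_);
    return replica_ids_.size();
  }

 private:
  std::mutex mu_;
  std::vector<int> replica_ids_;
};

// Returns an empty string when the load may proceed, otherwise an error message.
// The second and third messages are illustrative, not taken from the PR.
std::string CheckSstLoadAllowed(bool cluster_enabled, bool is_replica, MockServer &srv) {
  if (cluster_enabled) return "The SST command is not supported in cluster mode.";
  if (is_replica) return "illustrative: not supported on a replica";
  if (srv.GetReplicaCount() > 0) return "illustrative: not supported while replicas are attached";
  return "";
}
```

The check only reflects the replica count at the moment it runs; a replica that attaches afterwards would not be caught by this sketch.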
|
|
@ltagliamonte Thanks for your contribution and patience. |
thank you @git-hulk. |
src/storage/storage.cc
Outdated
  }
  for (const auto &cf_name : cf_names) {
    cf_files[cf_name] = std::vector<std::string>();
    cf_files[cf.Name()] = std::vector<std::string>();
What @PragmaTwice means is that we don't need to initialize the empty vector for the map?
Yeah we can just remove this line.
Thanks!
@PragmaTwice can you please approve?
|
@caipengbo @PragmaTwice Let's see whether you have further comments; otherwise we could merge after the CI becomes green. |
caipengbo
left a comment
LGTM
|


Dear community,
We're working on leveraging RocksDB sideloading to ship updates to our servers instead of loading data through the Redis interface.
Sideloading in RocksDB is more efficient than direct writes because it bypasses the standard write path, reducing write amplification, CPU overhead, and compaction costs. By ingesting pre-built SST files via IngestExternalFile(), it minimizes WAL writes and MemTable pressure, making bulk inserts faster and more efficient.
Since we don't use replication in our setup, we can ship bulk updates directly to each Kvrocks server and load the data efficiently.
We’d love to contribute our sideloading work back to the community and are excited to share our approach. Looking forward to your thoughts!
Our approach:
The current approach introduces a new command:
SST LOAD path, which detects the files to load from a folder, and sideloads the data into RocksDB.