-
Notifications
You must be signed in to change notification settings - Fork 592
feat(tdigest): Add support for QUANTILE command #2849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(tdigest): Add support for QUANTILE command #2849
Conversation
|
@LindaSummer I have added the basic structure for the I tried modifying the quantile function as mentioned in the issue and the previous PRs but the issue persists. Any help is appreciated. |
|
Hi @SharonIV0x86 , Thanks very much for effort!😊 I'll try to test it in local in recent days and will sync if there is any new updates. Best Regards, |
Works, till then ill try to fiddle around. Cheers. |
|
Hi @SharonIV0x86 , Sorry for delay of updating. 😊 In recent I have some personal affairs which occupied my schedule so maybe a little delay in response. I'm now working on this ticket and find that there may be some bugs inside it. I will double confirm it and create a PR to fix it if bug exists. Best Regards, |
No issues, meanwhile i tried looking into it, no matter what i did there seems to be a mismatch between the actual information and the tdigest metadata now one thing i can deduce is that is not in the Quantile function but this seems to be the issue with Or this is related to locks as you suggested. There is no hurry from my side in fixing this issue but i am also planning to implement the TDIGEST.CDF and TDIGEST.RANK in upcoming weeks and they most likely will depend in this issue. |
|
Hi @SharonIV0x86 , I have tested the command line successfully with removing the lock and announcing the command "write" for quick validation after fixing #2878 . I will try to find a way to improve the performance with less critical section. Best Regards, |
Okay, thanks for looking into it. So, how should i proceed then? should i wait for #2878 to be merged? |
Hi @SharonIV0x86 , Maybe we could wait for it to be merged before this PR. If you don't mind, I will try to solve the lock issue since I find that it may not be so easy to just add a lock key in current connection lock management. Best Regards, |
Absolutely no issues whatsoever, there is no hurry i am happy to wait 😄 |
|
@LindaSummer Hi, i have tested the quantile command after the merging of the #2878 and it works as expected. However, there is a bit of difference in quantile values when compared to the original implementation of redis I have also added a go test case so if any changes are required pls lmk. 👍🏼 |
LindaSummer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/commands/cmd_tdigest.cc
Outdated
| return {Status::RedisExecErr, s.ToString()}; | ||
| } | ||
| if (values_.empty()) { | ||
| return {Status::RedisExecErr, "invalid quantile or empty tdigest"}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @SharonIV0x86 ,
Sorry for my unclear description for this behavior. 😄
Currently, redis stack returns nan as response for none centroids.
We'd better follow this behavior.
Adding the logic to tdigest ranther than command may be better to keep the command module clean.
After modifying the tdigest, the cpp unit test should also be updated.
Best Regards,
Edward
| infoAfterEmptyReset := toTdigestInfo(t, rsp.Val()) | ||
| require.EqualValues(t, 100, infoAfterEmptyReset.Compression) | ||
| }) | ||
| t.Run("tdigest.quantile with different arguments", func(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @SharonIV0x86 ,
We could also add an unordered sequence and an empty tdigest as cases. 😊
Best Regards,
Edward
|
@LindaSummer Would you like to have a look? |
Hi @PragmaTwice and @SharonIV0x86 , Of course!😊 I will go through this PR later today. Best Regards, |
LindaSummer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/commands/cmd_tdigest.cc
Outdated
| quantile_strings.push_back(std::to_string(q)); | ||
| if (!result.has_centroids) { | ||
| for (size_t i = 0; i < values_.size(); ++i) { | ||
| quantile_strings.emplace_back("nan"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @SharonIV0x86 ,
Maybe we could make "nan" a constexpr literal string in this file. 😊
It would also be used in other commands.
Best Regards,
Edward
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, will do this.
src/types/redis_tdigest.cc
Outdated
| if (auto status = dumpCentroids(ctx, ns_key, metadata, ¢roids); !status.ok()) { | ||
| return status; | ||
| } | ||
| if (centroids.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @SharonIV0x86 ,
At the first time we load the metadata, we should know this tdigest has no data inside it.
This could be checked before the centroids retrieving.
And it would be better to add cpp unit test for this behavior.
Best Regards,
Edward
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So just to be clear, after we load the metadata here i add a check to do something like this.
if(metadata.merged_nodes == 0 && metadata.unmerged_nodes == 0){
//tdigest does not contain any data. so set the flag
}Correct me if im wrong.
And it would be better to add cpp unit test for this behavior.
I'll try working on this as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So just to be clear, after we load the metadata here i add a check to do something like this.
if(metadata.merged_nodes == 0 && metadata.unmerged_nodes == 0){ //tdigest does not contain any data. so set the flag }Correct me if im wrong.
And it would be better to add cpp unit test for this behavior.
I'll try working on this as well.
Hi @SharonIV0x86 ,
We could directly use total_observations to check the tdigest's elements. 😊
Best Regards,
Edward
|
@LindaSummer I have added the cpp test case, i believe it needs some modifications because in the cpp test case its currently not possible to directly compare the values returned by the quantile function with the "nan" string as the The closest i could do was to add a assert for the value of |
Hi @SharonIV0x86 , You could directly test the flag you have added. 😊 And the nan string test could be in integration test. Best Regards, |
LindaSummer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @SharonIV0x86 ,
Thanks very much for your contribution and huge effort! 😄
I have made some small changes on current code.
LGTM.
Best Regards,
Edward
|
Thanks for the changes you made, the code is much better now with ranges and views. |



Closes #2794
This PR adds support for
TDIGEST.QUANTILEcommand.