-
Notifications
You must be signed in to change notification settings - Fork 592
feat(tdigest): add TDIGEST.MAX/MIN command #2811
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fa1c04c to
d4d2c8f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Overview
This PR adds tests for the new TDIGEST.MAX and TDIGEST.MIN commands to validate their behavior, including error handling and correct value computation.
- Introduces the error message constant errValueDoesNotExist for cases where a tdigest exists but contains no data.
- Adds subtests for both TDIGEST.MAX and TDIGEST.MIN covering invalid arguments, empty digests, and digests with one or multiple values.
Reviewed Changes
| File | Description |
|---|---|
| tests/gocase/unit/type/tdigest/tdigest_test.go | Added tests to cover TDIGEST.MAX and TDIGEST.MIN functionality |
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
tests/gocase/unit/type/tdigest/tdigest_test.go:217
- [nitpick] Consider reinitializing the tdigest or using a fresh key when transitioning from testing a single value to multiple values. This ensures that previous state does not affect the aggregation and makes each test case more isolated.
require.NoError(t, rdb.Do(ctx, "TDIGEST.CREATE", key, "compression", "100").Err())
tests/gocase/unit/type/tdigest/tdigest_test.go:212
- [nitpick] Consider expanding the invalid argument tests by including cases with extra arguments for both TDIGEST.MAX and TDIGEST.MIN to further improve test coverage.
require.ErrorContains(t, rdb.Do(ctx, "TDIGEST.MAX").Err(), errMsgWrongNumberArg)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Overview
This PR adds tests for the new TDIGEST.MAX and TDIGEST.MIN commands to ensure their correct behavior with various arguments.
- Added a new error constant for missing values.
- Introduced test cases for TDIGEST.MAX and TDIGEST.MIN covering invalid arguments, empty digests, and digests with one or multiple values.
Reviewed Changes
| File | Description |
|---|---|
| tests/gocase/unit/type/tdigest/tdigest_test.go | Added tests for TDIGEST.MAX and TDIGEST.MIN commands and a new error constant |
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (1)
tests/gocase/unit/type/tdigest/tdigest_test.go:39
- [nitpick] Consider renaming 'errValueDoesNotExist' to 'errMsgValueDoesNotExist' for consistency with the other error constants defined in this file.
errValueDoesNotExist = "no value exists for key"
mapleFU
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code general LGTM, but suddenly I realized that TDigest.Add nan is invalid: https://github.com/RedisBloom/RedisBloom/blob/f68a72e463b84c2298990295c6a3db4719dc5c7a/src/rm_tdigest.c#L187
Would you mind also change that? ( Can be in another pr )
I will fix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/types/redis_tdigest.cc
Outdated
|
|
||
| metadata.total_observations += inputs.size(); | ||
| metadata.total_weight += inputs.size(); | ||
| metadata.maximum = std::max(metadata.maximum, *std::max_element(inputs.cbegin(), inputs.cend())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @wyxxxcat ,
The metadata.minimum and metadata.maximum should represent the centroids min and max.
It will be updated in merge action and updated by dumping after merging.
Maybe we could just compute inplace or create a new property in metadata to keep the consistency of metadata and centroids. 😊
If we try to do this in buffer appending, we may need to refactor the merge logic to make sure these max min properties are maintained by just one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this matters since DumpCentroids() is always called after Merge currently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @mapleFU ,
Thanks for your update! 😊
Currently the min and max are updated after Merge with DumpCentroids().
But Add will not always trigger the Merge action until the buffer is full.
So, if we try to update it in Add maybe we'd better remove the updating of min&max after DumpCentroids().
Since when we try to merge, the buffer will always be locked, it is safe to update the value either in Add or Merge.
Best Regards,
Edward
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To change the word here, what's the main purpose to set min max after merge🤔? Since update min max is free when calling Add if any io really happens. And Quantile will always merge before doing any real things?
Without updating min max in add, the tdigest.min/max will traverse the unmerged data buffer
Both way is ok to me here but I just don't know the point or trade-off here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @mapleFU ,
I agree on your view. 😊
Frankly this probably just because I implement the Merge function ahead of Add. 😄
So, I think move it to the Add function is reasonable. But we'd better remove the updating after Merge if we decide to maintain it in Add.
Update it in two places may not be a good practice.
Best Regards,
Edward
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm so I'm wondering if we should submit a PR to make these changes before this PR is merged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @PragmaTwice ,
Got it!
I'll create a patch today to refactor the min max updating logic to Add.😊
Best Regards,
Edward
|
I would approve and merge after #2811 (comment) is applied |
PragmaTwice
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. To see if @LindaSummer has any comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @wyxxxcat ,
Thanks for your effort!😊
Generally LGTM.
Left some minor nitpick.
Current code is also good to me.
If you don't mind, you could update it in next commit.
Best Regards,
Edward
|
Thanks all, merged! By the way, meaning full commit message and not using same commit is appreciated |
|



This closes #2793.