@wyxxxcat
Contributor

@wyxxxcat wyxxxcat commented Feb 28, 2025

This closes #2793.

@git-hulk git-hulk requested a review from Copilot February 28, 2025 03:03

@wyxxxcat wyxxxcat force-pushed the min_max branch 2 times, most recently from fa1c04c to d4d2c8f Compare February 28, 2025 03:08
@git-hulk git-hulk requested a review from Copilot February 28, 2025 03:08
Contributor

Copilot AI left a comment

PR Overview

This PR adds tests for the new TDIGEST.MAX and TDIGEST.MIN commands to validate their behavior, including error handling and correct value computation.

  • Introduces the error message constant errValueDoesNotExist for cases where a tdigest exists but contains no data.
  • Adds subtests for both TDIGEST.MAX and TDIGEST.MIN covering invalid arguments, empty digests, and digests with one or multiple values.

Reviewed Changes

File: tests/gocase/unit/type/tdigest/tdigest_test.go
Description: Added tests to cover TDIGEST.MAX and TDIGEST.MIN functionality

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

tests/gocase/unit/type/tdigest/tdigest_test.go:217

  • [nitpick] Consider reinitializing the tdigest or using a fresh key when transitioning from testing a single value to multiple values. This ensures that previous state does not affect the aggregation and makes each test case more isolated.
require.NoError(t, rdb.Do(ctx, "TDIGEST.CREATE", key, "compression", "100").Err())

tests/gocase/unit/type/tdigest/tdigest_test.go:212

  • [nitpick] Consider expanding the invalid argument tests by including cases with extra arguments for both TDIGEST.MAX and TDIGEST.MIN to further improve test coverage.
require.ErrorContains(t, rdb.Do(ctx, "TDIGEST.MAX").Err(), errMsgWrongNumberArg)

@git-hulk git-hulk requested a review from Copilot February 28, 2025 03:10
Contributor

Copilot AI left a comment

PR Overview

This PR adds tests for the new TDIGEST.MAX and TDIGEST.MIN commands to ensure their correct behavior with various arguments.

  • Added a new error constant for missing values.
  • Introduced test cases for TDIGEST.MAX and TDIGEST.MIN covering invalid arguments, empty digests, and digests with one or multiple values.

Reviewed Changes

File: tests/gocase/unit/type/tdigest/tdigest_test.go
Description: Added tests for TDIGEST.MAX and TDIGEST.MIN commands and a new error constant

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (1)

tests/gocase/unit/type/tdigest/tdigest_test.go:39

  • [nitpick] Consider renaming 'errValueDoesNotExist' to 'errMsgValueDoesNotExist' for consistency with the other error constants defined in this file.
errValueDoesNotExist   = "no value exists for key"

Member

@mapleFU mapleFU left a comment

The code generally LGTM, but I just realized that TDigest.Add with NaN is invalid: https://github.com/RedisBloom/RedisBloom/blob/f68a72e463b84c2298990295c6a3db4719dc5c7a/src/rm_tdigest.c#L187

Would you mind changing that as well? (It can be in another PR.)
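
For illustration, a minimal sketch of the kind of input check requested above. The helper name `ParseNonNaN` is hypothetical and is not taken from Kvrocks or RedisBloom; it only assumes that the value arrives as a string argument and that NaN should be rejected before it reaches the digest.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not Kvrocks' or RedisBloom's API): parses a command
// argument as a double and rejects unparsable input and NaN.
bool ParseNonNaN(const std::string &arg, double *out) {
  try {
    std::size_t consumed = 0;
    double value = std::stod(arg, &consumed);
    if (consumed != arg.size()) return false;  // trailing characters, e.g. "1.5x"
    if (std::isnan(value)) return false;       // "nan" parses successfully but is rejected
    *out = value;
    return true;
  } catch (const std::exception &) {
    return false;  // std::stod throws on non-numeric or out-of-range input
  }
}
```

The explicit `std::isnan` check matters because `std::stod` accepts the literal `nan`, so a plain "did it parse" test would let it through.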

@wyxxxcat
Contributor Author

I will fix it.

mapleFU previously approved these changes Feb 28, 2025
Member

@LindaSummer LindaSummer left a comment

Hi @wyxxxcat ,

I left one comment; the other parts generally LGTM. 😊

Best Regards,
Edward


metadata.total_observations += inputs.size();
metadata.total_weight += inputs.size();
metadata.maximum = std::max(metadata.maximum, *std::max_element(inputs.cbegin(), inputs.cend()));
Member

@LindaSummer LindaSummer Mar 3, 2025

Hi @wyxxxcat ,

metadata.minimum and metadata.maximum should represent the min and max of the centroids.
They are updated in the merge action and written back when the centroids are dumped after merging.

Maybe we could just compute them in place, or add a new property to the metadata, to keep the metadata consistent with the centroids. 😊

If we do this when appending to the buffer, we may need to refactor the merge logic so that these min/max properties are maintained in just one place.

Member

I don't think this matters since DumpCentroids() is always called after Merge currently?

Member

Hi @mapleFU ,

Thanks for your update! 😊

Currently, the min and max are updated after Merge, with DumpCentroids().

But Add does not trigger the Merge action until the buffer is full.

So, if we update them in Add, we'd better remove the min/max update after DumpCentroids().

Since the buffer is always locked when we merge, it is safe to update the values in either Add or Merge.

Best Regards,
Edward
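
The point about Add not always triggering Merge can be sketched with a toy model. This is a simplified stand-in, not the Kvrocks implementation: `MergeOnlyDigest`, its fields, and the fixed `capacity` are all hypothetical, and the "merge" merely moves values instead of building centroids. It shows that when min/max are maintained only in the merge step, they stay stale until the buffer fills.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Toy model with hypothetical names: min/max are updated only inside Merge().
struct MergeOnlyDigest {
  std::size_t capacity;        // buffer size that triggers a merge
  std::vector<double> buffer;  // unmerged inputs
  std::vector<double> merged;  // stands in for the centroids
  double minimum = std::numeric_limits<double>::infinity();
  double maximum = -std::numeric_limits<double>::infinity();

  explicit MergeOnlyDigest(std::size_t cap) : capacity(cap) {}

  void Add(double v) {
    buffer.push_back(v);
    // Merge happens only once the buffer is full.
    if (buffer.size() >= capacity) Merge();
  }

  void Merge() {
    for (double v : buffer) {
      merged.push_back(v);
      minimum = std::min(minimum, v);
      maximum = std::max(maximum, v);
    }
    buffer.clear();
  }
};
```

With a capacity of 3, the recorded maximum is still the initial sentinel after two calls to Add, and only reflects the inputs after the third call forces a merge.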

Member

To put it another way: what is the main purpose of setting min/max after the merge? 🤔 Updating min/max is essentially free when calling Add if any I/O really happens, and Quantile always merges before doing any real work, right?

Without updating min/max in Add, TDIGEST.MIN/MAX would have to traverse the unmerged data buffer.

Either way is OK with me; I just don't understand the point or the trade-off here.
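
The two options being weighed can be sketched side by side. The `Metadata` struct and both function names below are hypothetical simplifications, not the actual Kvrocks types; the sketch only assumes that inputs arrive as a vector of doubles, as in the snippet under review.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical, simplified metadata record.
struct Metadata {
  double minimum = std::numeric_limits<double>::infinity();
  double maximum = -std::numeric_limits<double>::infinity();
  std::uint64_t total_observations = 0;
};

// Option 1: fold new inputs into min/max at Add time, so reads are O(1).
void UpdateOnAdd(Metadata &metadata, const std::vector<double> &inputs) {
  // Dereferencing the result of minmax_element on an empty range is invalid.
  assert(!inputs.empty());
  auto range = std::minmax_element(inputs.cbegin(), inputs.cend());
  metadata.minimum = std::min(metadata.minimum, *range.first);
  metadata.maximum = std::max(metadata.maximum, *range.second);
  metadata.total_observations += inputs.size();
}

// Option 2: leave the metadata untouched in Add and scan the unmerged
// buffer on every read, combining it with the last merged maximum.
double MaxWithBufferScan(const Metadata &metadata, const std::vector<double> &buffer) {
  double result = metadata.maximum;
  for (double v : buffer) result = std::max(result, v);
  return result;
}
```

Option 1 pays a small cost on each write that is already touching the metadata; Option 2 keeps the write path unchanged but makes each read proportional to the buffer size.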

Member

Hi @mapleFU ,

I agree with your view. 😊
Frankly, this is probably just because I implemented the Merge function before Add. 😄

So I think moving it to the Add function is reasonable. But we'd better remove the update after Merge if we decide to maintain it in Add.

Updating it in two places is probably not good practice.

Best Regards,
Edward

Member

@PragmaTwice PragmaTwice Mar 4, 2025

Hmm so I'm wondering if we should submit a PR to make these changes before this PR is merged?

Member

Hi @PragmaTwice ,

Got it!
I'll create a patch today to move the min/max update logic into Add. 😊

Best Regards,
Edward

@mapleFU
Member

mapleFU commented Mar 5, 2025

I would approve and merge after #2811 (comment) is applied

Member

@PragmaTwice PragmaTwice left a comment

Looks good to me. Let's see if @LindaSummer has any comments.

Member

@LindaSummer LindaSummer left a comment

Hi @wyxxxcat ,

Thanks for your effort! 😊
Generally LGTM.

I left some minor nitpicks.
The current code is also fine with me.

If you don't mind, you could address them in the next commit.

Best Regards,
Edward

@mapleFU mapleFU merged commit 231b093 into apache:unstable Mar 7, 2025
34 checks passed
@mapleFU
Member

mapleFU commented Mar 7, 2025

Thanks all, merged!

By the way, meaningful commit messages and not reusing the same commit would be appreciated.

Development

Successfully merging this pull request may close these issues.

TDigest: Implement MIN MAX command for TDigest Algorithm

4 participants