Skip to content

Entropy from counts #133

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Entropy from counts #133

wants to merge 4 commits into from

Conversation

ms609
Copy link
Owner

@ms609 ms609 commented Jul 8, 2025

rather than probabilities.
To optimize calls from TreeSearch.

@ms609 ms609 requested a review from Copilot July 8, 2025 17:46
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for computing entropy directly from counts (integers) via a new entropy_int C++ function, exposes it in R as Ntropy(), and updates documentation and tests to cover this new capability.

  • Introduce a log2_table lookup for fast integer entropy computations and register entropy_int via Rcpp.
  • Add Ntropy() in R (with roxygen and NAMESPACE export) as an alias for entropy_int.
  • Update tests and Rd documentation to verify both Entropy() and Ntropy() outputs.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/testthat/test-information.R Adjust test expectations order and add tests for entropy_int/Ntropy.
src/information.h Define log2_table, compute it at load time, and implement entropy_int.
src/RcppExports.cpp Register the new _TreeDist_entropy_int entry for R interface.
man/Entropy.Rd Document Ntropy() alias and update argument/value descriptions.
R/tree_distance_utilities.R Add Ntropy() wrapper with roxygen comments.
R/RcppExports.R Expose entropy_int() in R for internal use.
NAMESPACE Export Ntropy.
Comments suppressed due to low confidence (2)

tests/testthat/test-information.R:3

  • [nitpick] The test description still refers only to Entropy; consider updating it to mention entropy_int/Ntropy so it's clear what behavior is being validated.
test_that("Entropy is calculated correctly", {

R/RcppExports.R:16

  • [nitpick] Since entropy_int() is internal, you might prefix it with a dot (e.g. .entropy_int) or add a roxygen @export tag if it's meant for end users to avoid confusion.
entropy_int <- function(n) {

@@ -4,11 +4,19 @@
#include <cmath> /* for log2() */
#include <TreeTools/ClusterTable.h> /* for CT_MAX_LEAVES */

#include <Rcpp.h>
Copy link
Preview

Copilot AI Jul 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider moving #include <Rcpp.h> from the header into the implementation file (.cpp) to reduce header dependencies and compile-time coupling.

Copilot uses AI. Check for mistakes.

FACT_MAX = CT_MAX_LEAVES + CT_MAX_LEAVES + 5 + 1
;
constexpr int_fast32_t LOG_MAX = 2048;
double log2_table[LOG_MAX];
Copy link
Preview

Copilot AI Jul 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] It may be safer to limit the linkage of log2_table by placing it in an anonymous namespace or marking it static to avoid potential symbol conflicts in large projects.

Suggested change
double log2_table[LOG_MAX];
static double log2_table[LOG_MAX];

Copilot uses AI. Check for mistakes.

Copy link

codecov bot commented Jul 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.09%. Comparing base (bc7bc10) to head (0891e4e).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #133   +/-   ##
=======================================
  Coverage   99.08%   99.09%           
=======================================
  Files          37       37           
  Lines        2625     2638   +13     
=======================================
+ Hits         2601     2614   +13     
  Misses         24       24           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant