Bug: ValueError: all times must be within follow-up time of test data (re-open of #86) #557

@yhoogstrate

Describe the bug

This revisits the previously closed issue #86.

I think the behaviour is caused by the check at:

https://github.com/sebp/scikit-survival/blob/master/sksurv/metrics.py#L66

if times.max() >= test_time.max() or times.min() < test_time.min():

whereas for use in brier_score you would expect the upper bound to be inclusive:

if times.max() > test_time.max() or times.min() < test_time.min():

As I understand it, estimating the Brier score currently raises this error whenever the evaluation times include the very last step of the StepFunction.

If this is intended behaviour rather than a bug, it would make sense to rewrite the error message so that the bounds are explicit:

   f"all times must be within follow-up time of test data: [... >= {test_time.min()}; ... < {test_time.max()}]"

Steps/Code to Reproduce

import numpy as np
from sksurv.metrics import brier_score

# Structured arrays holding an event indicator and an observed time per sample.
y_train_ = np.array([(bool(e), t) for e, t in zip([True, True, True], [15, 100, 200])], dtype=[("event", "bool"), ("time", "float")])
y_test_ = np.array([(bool(e), t) for e, t in zip([True, True, True], [50, 125, 151])], dtype=[("event", "bool"), ("time", "float")])

# One row per test sample, one column per evaluation time.
surv_preds_ = [[0.9, 0.5, 0.2], [0.9, 0.5, 0.2], [0.9, 0.5, 0.2]]

times_pass = [50, 100.25, 150.5]
times_fail = [50, 100.25, 151]  # this fails, although 151 falls within the range of the survival functions
times_should_fail = [50, 100.25, 152]

print(y_train_)
print(y_test_)

brier_score(y_train_, y_test_, surv_preds_, times_pass)
brier_score(y_train_, y_test_, surv_preds_, times_fail)  # raises ValueError; comment out to reach the next call
brier_score(y_train_, y_test_, surv_preds_, times_should_fail)
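
The three calls differ only in their largest evaluation time, so the boundary behaviour can be looked at in isolation. The following is a minimal sketch that restates the quoted condition in plain Python, next to an assumed alternative with an inclusive upper bound, which is what the expected results in this report imply. It does not import scikit-survival, and both function names are hypothetical.

```python
# Standalone restatement of the range check; neither function exists in
# scikit-survival, both names are made up for illustration.

def out_of_range_current(times, test_time):
    """Condition quoted from sksurv/metrics.py: valid times lie in [min, max)."""
    return max(times) >= max(test_time) or min(times) < min(test_time)


def out_of_range_inclusive(times, test_time):
    """Assumed alternative: valid times lie in [min, max]."""
    return max(times) > max(test_time) or min(times) < min(test_time)


# Observed times of the test data and the three grids from the reproduction code.
test_time = [50.0, 125.0, 151.0]
cases = {
    "times_pass": [50, 100.25, 150.5],
    "times_fail": [50, 100.25, 151],
    "times_should_fail": [50, 100.25, 152],
}

# Map each grid to (rejected by current check, rejected by inclusive check).
results = {
    name: (out_of_range_current(t, test_time), out_of_range_inclusive(t, test_time))
    for name, t in cases.items()
}
for name, (current, inclusive) in results.items():
    print(f"{name}: rejected now={current}, rejected with inclusive bound={inclusive}")
```

Only the middle grid changes outcome between the two variants: it is rejected by the current condition and accepted by the inclusive one, while the first grid is accepted and the last grid rejected in both cases.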

Actual Results

run 1 (times_pass): returns a Brier score
run 2 (times_fail): ValueError: all times must be within follow-up time of test data: [50.0; 151.0[
run 3 (times_should_fail): ValueError: all times must be within follow-up time of test data: [50.0; 151.0[

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[112], line 16
     14 brier_score(y_train_, y_test_, surv_preds_, times_pass)
     15 #brier_score(y_train_, y_test_, surv_preds_, times_fail)
---> 16 brier_score(y_train_, y_test_, surv_preds_, times_should_fail)

File ~/projects/cognition/.venv/lib/python3.12/site-packages/sksurv/metrics.py:646, in brier_score(survival_train, survival_test, estimate, times)
    530 r"""The time-dependent Brier score for right-censored data.
    531 
    532 The time-dependent Brier score measures the inaccuracy of
   (...)    643        Statistics in Medicine, vol. 18, no. 17-18, pp. 2529–2545, 1999.
    644 """
    645 test_event, test_time = check_y_survival(survival_test)
--> 646 estimate, times = _check_estimate_2d(estimate, test_time, times, estimator="brier_score")
    647 if estimate.ndim == 1 and times.shape[0] == 1:
    648     estimate = estimate.reshape(-1, 1)

File ~/projects/cognition/.venv/lib/python3.12/site-packages/sksurv/metrics.py:76, in _check_estimate_2d(estimate, test_time, time_points, estimator)
     74 def _check_estimate_2d(estimate, test_time, time_points, estimator):
     75     estimate = check_array(estimate, ensure_2d=False, allow_nd=False, input_name="estimate", estimator=estimator)
---> 76     time_points = _check_times(test_time, time_points)
     77     check_consistent_length(test_time, estimate)
     79     if estimate.ndim == 2 and estimate.shape[1] != time_points.shape[0]:

File ~/projects/cognition/.venv/lib/python3.12/site-packages/sksurv/metrics.py:67, in _check_times(test_time, times)
     64 times = np.unique(times)
     66 if times.max() >= test_time.max() or times.min() < test_time.min():
---> 67     raise ValueError(
     68         f"all times must be within follow-up time of test data: [{test_time.min()}; {test_time.max()}["
     69     )
     71 return times

ValueError: all times must be within follow-up time of test data: [50.0; 151.0[

Expected Results

run 1 (times_pass): returns a Brier score
run 2 (times_fail): returns a Brier score
run 3 (times_should_fail): ValueError: all times must be within follow-up time of test data: [50.0; 151.0[
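
Until the boundary handling is settled, one possible workaround is to drop evaluation times that fall outside the half-open interval reported in the error message, together with the matching columns of the survival estimates. A minimal sketch in plain Python, where `restrict_to_follow_up` is a hypothetical helper and not part of scikit-survival:

```python
# Hypothetical helper, not part of scikit-survival: keep only the evaluation
# times inside [min(test_time), max(test_time)) and drop the corresponding
# columns of the per-sample survival estimates.

def restrict_to_follow_up(times, estimates, test_time):
    lower, upper = min(test_time), max(test_time)
    # Column indices whose time point satisfies lower <= t < upper.
    keep = [i for i, t in enumerate(times) if lower <= t < upper]
    kept_times = [times[i] for i in keep]
    kept_estimates = [[row[i] for i in keep] for row in estimates]
    return kept_times, kept_estimates


test_time = [50.0, 125.0, 151.0]
surv_preds = [[0.9, 0.5, 0.2], [0.9, 0.5, 0.2], [0.9, 0.5, 0.2]]
times = [50, 100.25, 151]  # the grid called times_fail in the reproduction code

kept_times, kept_estimates = restrict_to_follow_up(times, surv_preds, test_time)
print(kept_times)
print(kept_estimates)
```

The filtered times and estimates could then be passed to brier_score in place of the originals. This sidesteps the question rather than answering it, since the score at the last follow-up time is no longer evaluated.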

Installed Versions

SYSTEM
------
Platform          : Linux-6.8.0-79-generic-x86_64-with-glibc2.39
Python version    : CPython 3.12.3
Python interpreter: /home/youri/projects/cognition/.venv/bin/python

DEPENDENCIES
------------
scikit-survival   : 0.25.0
scikit-learn      : 1.7.1
numpy             : 2.3.0
scipy             : 1.15.3
pandas            : 2.3.0
numexpr           : 2.11.0
ecos              : 2.0.14
osqp              : 0.6.7.post3
joblib            : 1.5.1
matplotlib        : 3.10.5
pytest            : 8.4.0
sphinx            : 8.1.3
Cython            : None
pip               : 24.0
setuptools        : 80.9.0
