Bug: ValueError: all times must be within follow-up time of test data (re-open of #86) #557
Description
Describe the bug
This re-opens the previously closed #86.
I think the behaviour is caused by this check:
https://github.com/sebp/scikit-survival/blob/master/sksurv/metrics.py#L66
if times.max() >= test_time.max() or times.min() < test_time.min():
whereas for use in brier_score you would expect the upper bound to be inclusive:
if times.max() > test_time.max() or times.min() < test_time.min():
As I understand it, brier_score currently raises this error whenever the evaluation times include the very last step of the StepFunction, i.e. a time equal to the largest test time.
If the current behaviour is intended rather than a bug, it would make sense to reword the error message so that the bounds are explicit:
f"all times must be within follow-up time of test data: [... >= {test_time.min()}; ...< {test_time.max()}]"
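To make the boundary case concrete, here is a minimal sketch in plain Python that evaluates the condition quoted from `sksurv/metrics.py` next to a variant with an inclusive upper bound. The function names `rejected_current` and `rejected_inclusive` are hypothetical and not part of scikit-survival; the inclusive variant is only an illustration of the expected behaviour, not the library's code.

```python
def rejected_current(test_time, times):
    """Condition as quoted from sksurv/metrics.py; True means a ValueError would be raised."""
    return max(times) >= max(test_time) or min(times) < min(test_time)


def rejected_inclusive(test_time, times):
    """Hypothetical variant with an inclusive upper bound (assumption, not library code)."""
    return max(times) > max(test_time) or min(times) < min(test_time)


test_time = [50.0, 125.0, 151.0]  # follow-up times of the test data
grids = {
    "times_pass": [50, 100.25, 150.5],
    "times_fail": [50, 100.25, 151],
    "times_should_fail": [50, 100.25, 152],
}

for name, times in grids.items():
    # Print whether each variant would reject the grid.
    print(name, rejected_current(test_time, times), rejected_inclusive(test_time, times))
```

Only `times_fail` is treated differently by the two variants, which is the case this issue is about.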
code to reproduce:
Steps/Code to Reproduce
import numpy as np
from sksurv.metrics import brier_score

# Structured arrays with an event indicator and the observed time
y_train_ = np.array([(bool(e), t) for e, t in zip([True, True, True], [15, 100, 200])], dtype=[("event", "bool"), ("time", "float")])
y_test_ = np.array([(bool(e), t) for e, t in zip([True, True, True], [50, 125, 151])], dtype=[("event", "bool"), ("time", "float")])
# One row of survival probabilities per test sample, one column per time point
surv_preds_ = [[0.9, 0.5, 0.2], [0.9, 0.5, 0.2], [0.9, 0.5, 0.2]]
times_pass = [50, 100.25, 150.5]
times_fail = [50, 100.25, 151]  # this fails but falls in range of the survival functions
times_should_fail = [50, 100.25, 152]
print(y_train_)
print(y_test_)
brier_score(y_train_, y_test_, surv_preds_, times_pass)  # run 1
brier_score(y_train_, y_test_, surv_preds_, times_fail)  # run 2 (comment out to reach run 3)
brier_score(y_train_, y_test_, surv_preds_, times_should_fail)  # run 3
Actual Results
run 1: gives a Brier score
run 2: ValueError: all times must be within follow-up time of test data: [50.0; 151.0[
run 3: ValueError: all times must be within follow-up time of test data: [50.0; 151.0[
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[112], line 16
14 brier_score(y_train_, y_test_, surv_preds_, times_pass)
15 #brier_score(y_train_, y_test_, surv_preds_, times_fail)
---> 16 brier_score(y_train_, y_test_, surv_preds_, times_should_fail)
File ~/projects/cognition/.venv/lib/python3.12/site-packages/sksurv/metrics.py:646, in brier_score(survival_train, survival_test, estimate, times)
530 r"""The time-dependent Brier score for right-censored data.
531
532 The time-dependent Brier score measures the inaccuracy of
(...) 643 Statistics in Medicine, vol. 18, no. 17-18, pp. 2529–2545, 1999.
644 """
645 test_event, test_time = check_y_survival(survival_test)
--> 646 estimate, times = _check_estimate_2d(estimate, test_time, times, estimator="brier_score")
647 if estimate.ndim == 1 and times.shape[0] == 1:
648 estimate = estimate.reshape(-1, 1)
File ~/projects/cognition/.venv/lib/python3.12/site-packages/sksurv/metrics.py:76, in _check_estimate_2d(estimate, test_time, time_points, estimator)
74 def _check_estimate_2d(estimate, test_time, time_points, estimator):
75 estimate = check_array(estimate, ensure_2d=False, allow_nd=False, input_name="estimate", estimator=estimator)
---> 76 time_points = _check_times(test_time, time_points)
77 check_consistent_length(test_time, estimate)
79 if estimate.ndim == 2 and estimate.shape[1] != time_points.shape[0]:
File ~/projects/cognition/.venv/lib/python3.12/site-packages/sksurv/metrics.py:67, in _check_times(test_time, times)
64 times = np.unique(times)
66 if times.max() >= test_time.max() or times.min() < test_time.min():
---> 67 raise ValueError(
68 f"all times must be within follow-up time of test data: [{test_time.min()}; {test_time.max()}["
69 )
71 return times
ValueError: all times must be within follow-up time of test data: [50.0; 151.0[
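Since the traceback shows the rejection happens in `_check_times` before any score is computed, one possible caller-side workaround is to restrict the evaluation grid to the half-open interval reported in the error message and drop the matching columns of the estimate. This is a rough sketch under the assumption that discarding the offending time points is acceptable; the helper `clip_to_follow_up` is hypothetical and not part of scikit-survival.

```python
def clip_to_follow_up(test_time, times):
    """Return (indices, values) of the time points inside [min(test_time), max(test_time))."""
    lo, hi = min(test_time), max(test_time)
    kept = [(i, t) for i, t in enumerate(times) if lo <= t < hi]
    return [i for i, _ in kept], [t for _, t in kept]


test_time = [50.0, 125.0, 151.0]   # follow-up times of the test data
times_fail = [50, 100.25, 151]     # last point equals the largest test time
surv_preds = [[0.9, 0.5, 0.2], [0.9, 0.5, 0.2], [0.9, 0.5, 0.2]]

idx, times_ok = clip_to_follow_up(test_time, times_fail)
# Keep only the estimate columns that belong to the retained time points.
preds_ok = [[row[i] for i in idx] for row in surv_preds]

print(times_ok)
print(preds_ok)
```

The filtered `times_ok` and `preds_ok` could then be passed to `brier_score` in place of the original grid and estimate, at the cost of losing the evaluation at the final time point.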
Expected Results
run 1: gives a Brier score
run 2: gives a Brier score
run 3: ValueError: all times must be within follow-up time of test data: [50.0; 151.0[
Installed Versions
SYSTEM
------
Platform : Linux-6.8.0-79-generic-x86_64-with-glibc2.39
Python version : CPython 3.12.3
Python interpreter: /home/youri/projects/cognition/.venv/bin/python
DEPENDENCIES
------------
scikit-survival : 0.25.0
scikit-learn : 1.7.1
numpy : 2.3.0
scipy : 1.15.3
pandas : 2.3.0
numexpr : 2.11.0
ecos : 2.0.14
osqp : 0.6.7.post3
joblib : 1.5.1
matplotlib : 3.10.5
pytest : 8.4.0
sphinx : 8.1.3
Cython : None
pip : 24.0
setuptools : 80.9.0