
BUG & FEAT Fix convergence issues with Pinball loss on large datasets (Issue #276) #306

Conversation

@floriankozikowski (Contributor) commented May 6, 2025

Context of the PR

As reported in Issue #276, the PDCD_WS solver exhibits convergence problems with the Pinball loss on larger datasets, where it appears to stall at a saddle point. The issue can be reproduced with:

import numpy as np
from skglm import GeneralizedLinearEstimator
from skglm.experimental.pdcd_ws import PDCD_WS
from skglm.experimental.quantile_regression import Pinball
from skglm.penalties import L1
from sklearn.datasets import make_regression
from sklearn.preprocessing import StandardScaler

# Generate data with 1000 samples (issue happens at n > 1000)
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
X = StandardScaler().fit_transform(X)

# Set up problem with Pinball loss and PDCD_WS solver
datafit = Pinball(0.5)  # Median regression
penalty = L1(alpha=0.1)
solver = PDCD_WS(max_iter=500, max_epochs=500, tol=1e-2, verbose=True)

# This fit will not converge well
estimator = GeneralizedLinearEstimator(datafit=datafit, penalty=penalty, solver=solver)
estimator.fit(X, y)

Contributions of the PR

  1. Created a new QuantileHuber class - a smoothed version of the Pinball loss that replaces the non-differentiable point with a quadratic region, similar to how the Huber loss smooths the absolute loss (a sketch of the smoothed loss follows this list).
  2. Implemented SmoothQuantileRegressor - a progressive smoothing solver that:
  • Gradually decreases the smoothing parameter (delta)
  • Solves each smoothed problem with FISTA
  • Tracks the best solution (lowest quantile error) across all smoothing levels
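
For concreteness, here is a minimal sketch of the smoothed per-residual loss described in point 1, following the usual Huberization of the pinball loss; the actual QuantileHuber class in this PR may differ in interface and details:

import numpy as np

def quantile_huber_value(r, quantile=0.5, delta=1.0):
    """Smoothed pinball loss of residuals r = y - X @ w.

    Outside |r| <= delta the loss is linear, like the pinball loss
    (up to a constant shift of delta / 2); inside, the kink at r = 0
    is replaced by a quadratic piece, as in the Huber loss.
    """
    r = np.asarray(r, dtype=float)
    tau = quantile
    linear_pos = tau * (r - delta / 2)           # branch for r > delta
    linear_neg = (1 - tau) * (-r - delta / 2)    # branch for r < -delta
    quadratic = np.where(r >= 0, tau, 1 - tau) * r ** 2 / (2 * delta)
    loss = np.where(r > delta, linear_pos,
                    np.where(r < -delta, linear_neg, quadratic))
    return np.mean(loss)

# example: median regression (tau = 0.5) on a few residuals
print(quantile_huber_value(np.array([-2.0, -0.2, 0.3, 1.5]), 0.5, 0.5))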

Users can now solve quantile regression problems on larger datasets with:

from skglm.experimental.smooth_quantile_regressor import SmoothQuantileRegressor
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import make_regression

# Generate data
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1)
X = StandardScaler().fit_transform(X)

# Create and fit regressor
sqr = SmoothQuantileRegressor(
    quantile=0.5,  # for median regression (0.5) or any other quantile
    alpha=0.1,     # L1 regularization strength
    verbose=False
)
sqr.fit(X, y)

# Make predictions
y_pred = sqr.predict(X)

Limitations and Need for Further Refinement:

Currently, this is an approximation approach and NOT an exact solution.
Speed has not been benchmarked, and the unit test currently only covers the scenario from Issue #276.
The initial idea was a dual-solver approach with PDCD_WS as the final solver, but this did not work.

Potential next steps are path-following methods, or reviewing the code from earlier commits for mistakes (e.g. warm start or intercept handling).

Checks before merging PR

  • added documentation for any new feature
  • added unit tests
  • edited the what's new (if applicable)

@@ -0,0 +1,156 @@
"""
@floriankozikowski (Contributor Author):

Some tests so far:

  • τ ≠ 0.5 (e.g. 0.8): SmoothQuantileRegressor reduces loss by >50% vs QuantileRegressor.
  • Large n (≥10 000): SmoothQuantileRegressor is 1.3×–2× faster and more accurate.
  • Median τ = 0.5 & n ≈ 1 000: scikit-learn's QuantileRegressor remains the best choice.

These are anecdotal results; your mileage may vary. Tune the smoothing sequence and inner-solver settings accordingly. There is still room for improvement; a rough comparison setup is sketched below.
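
For reference, a comparison along these lines can be set up roughly as follows; this is only a sketch using the SmoothQuantileRegressor interface from the usage example above, and the figures quoted in the list came from separate, informal runs:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import QuantileRegressor
from sklearn.metrics import mean_pinball_loss
from skglm.experimental.smooth_quantile_regressor import SmoothQuantileRegressor

tau = 0.8
X, y = make_regression(n_samples=10_000, n_features=10, noise=0.1, random_state=0)

# scikit-learn's exact, LP-based quantile regressor
skl = QuantileRegressor(quantile=tau, alpha=0.1).fit(X, y)

# progressive-smoothing estimator from this PR
sqr = SmoothQuantileRegressor(quantile=tau, alpha=0.1)
sqr.fit(X, y)

print("sklearn pinball loss :", mean_pinball_loss(y, skl.predict(X), alpha=tau))
print("smoothed pinball loss:", mean_pinball_loss(y, sqr.predict(X), alpha=tau))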

Collaborator:

@floriankozikowski to avoid having too many files, this should be merged with smooth_quantile.py since they contain very related code

from skglm.experimental.solver_strategies import StageBasedSolverStrategy


@jit(nopython=True)
Collaborator:

a priori there is no for loop, everything is vectorized so no need to JIT compile it



@jit(nopython=True, cache=True)
def max_subgrad_gap(residuals, delta, quantile):
Collaborator:

what is the purpose of this compared to existing skglm code?

self.quantile = float(quantile)
if not 0 < self.quantile < 1:
    raise ValueError("quantile must be between 0 and 1")
self.alpha = float(alpha)
Collaborator:

look at other parts of the code, we do not cast stuff as float (a priori it's not needed, Python will do it itself if need be)

self.solver_strategy = StageBasedSolverStrategy(self.solver_params)

from skglm.experimental.quantile_huber import QuantileHuber
self._quantile_huber_cls = QuantileHuber
Collaborator:

what's the point of this if it has no attribute?

if not hasattr(self.smooth_solver, 'warm_start'):
    self.smooth_solver.warm_start = False

def _initialize_pdcd(self, X, y):
Collaborator:

for a v0, use only smooth problems, so use AndersonCDWS solver instead. It should not have any params


return w, dual

def _get_solver_for_stage(self, delta, stage, n_features):
Collaborator:

this seems too complex, not needed at this stage. Just solve a sequence of smoothed problems with decreasing smoothing parameters.
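
For illustration, a minimal sketch of such a continuation loop, using the AndersonCD solver suggested above; the QuantileHuber constructor signature and the delta schedule are assumptions here, not the PR's actual code:

import numpy as np
from skglm import GeneralizedLinearEstimator
from skglm.solvers import AndersonCD
from skglm.penalties import L1
from skglm.experimental.quantile_huber import QuantileHuber  # class added in this PR

def fit_by_continuation(X, y, quantile=0.5, alpha=0.1,
                        deltas=(1.0, 0.3, 0.1, 0.03)):
    """Solve a sequence of smoothed quantile problems with decreasing delta."""
    est = GeneralizedLinearEstimator(
        datafit=QuantileHuber(quantile, deltas[0]),   # assumed signature
        penalty=L1(alpha=alpha),
        solver=AndersonCD(warm_start=True, fit_intercept=False),
    )
    for delta in deltas:
        # tighten the smoothing and refit, warm-started from the previous stage
        est.datafit = QuantileHuber(quantile, delta)  # assumed signature
        est.fit(X, y)
    return est.coef_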


# Center data to handle intercept manually
y_mean = np.mean(y)
if is_sparse:
Collaborator:

this is to be avoided at all cost; all our solvers support sparse X! Also, no need to center.

else:
    deltas = [self.initial_delta]

# Build L1‐continuation schedule
Collaborator:

why is there a sequence of alpha being used?

@floriankozikowski (Contributor Author):

Too large; a new, shorter version is in PR #312.
