survivalpredict.estimators.CoxProportionalHazard
- class survivalpredict.estimators.CoxProportionalHazard(*, alpha=0.0, max_iter=100, ties='breslow', tol=1e-09)
Cox Proportional Hazards.
The ‘Cox Proportional Hazards’ model is a linear, semi-parametric relative risk model and a staple of survival analysis. Cox estimates its coefficients by, in effect, learning to rank observations by relative risk. After training, the ‘Breslow estimator’ is run on the relative risk and the events over time to build the base hazard. The product of the relative risk and the base hazard at each point in time is then used to build the survival curves.
Cox is called ‘semi-parametric’ because it does not estimate the hazard directly, only the relative hazard. Hence, what Cox maximizes is the ‘partial likelihood’.
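To make the ‘partial likelihood’ idea concrete, here is a minimal pure-Python sketch of a negative log partial likelihood with Breslow-style risk sets. It is an illustration under assumptions, not this library's implementation: the function name neg_log_partial_likelihood and the toy data are hypothetical, and the L2 penalty and strata are left out.

```python
import math


def neg_log_partial_likelihood(X, times, events, coef):
    """Illustrative negative log partial likelihood (hypothetical helper)."""
    # Linear predictor x . beta for every observation.
    eta = [sum(x_j * b_j for x_j, b_j in zip(x, coef)) for x in X]
    nll = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            # Censored rows contribute only through other rows' risk sets.
            continue
        # Risk set: every observation still under observation at time t_i.
        denom = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        # Only the risk of row i relative to its risk set enters; no base hazard.
        nll -= eta[i] - math.log(denom)
    return nll


# Toy data: one feature, four observations, one of them censored.
X = [[0.5], [1.0], [-0.3], [0.2]]
times = [2, 3, 5, 7]
events = [True, True, False, True]

print(neg_log_partial_likelihood(X, times, events, coef=[0.0]))
print(neg_log_partial_likelihood(X, times, events, coef=[0.4]))
```

The base hazard never appears in the sum, which is why it has to be recovered in a separate step after the coefficients are fitted.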
- Parameters:
alpha (float, default=0.0) – Constant that multiplies the L2 penalty term, used to penalize the coefficients during training.
max_iter (Optional[int], default=100) – The maximum number of iterations.
ties ({"breslow", "efron"}, default='breslow') – The method used to handle ‘tied’ event times. Cox’s coefficients are intended to represent the relative risk of observations in proportion to each other, independent of time. Tied, or concurrent, failures muddy the interpretability of those coefficients. ‘Breslow ties’ ignores the issue and performs best on predictions. ‘Efron ties’ shaves off some of the influence of tied data on the likelihood, in the hope of solving the problem, at the price of prediction performance. Use Breslow if prediction performance is your primary concern, and use Efron for inference.
tol (float, default=1e-9) – The tolerance for the optimization: if the updates are smaller than or equal to tol, the optimization code checks the dual gap for optimality and continues until it is smaller than or equal to tol.
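The difference between the two ties options can be sketched with the textbook Breslow and Efron approximations for a single tied event time. This is a hedged illustration only: tied_log_denominator is a hypothetical helper, the scores are made-up values of exp(x · coef), and the library's internals may be organized differently.

```python
import math


def tied_log_denominator(risk_set_scores, tied_event_scores, ties="breslow"):
    """Log-denominator term of the partial likelihood at one tied event time.

    risk_set_scores: exp(x . coef) for everyone at risk (tied events included).
    tied_event_scores: exp(x . coef) for the observations failing at this time.
    """
    total = sum(risk_set_scores)
    tied_total = sum(tied_event_scores)
    d = len(tied_event_scores)
    if ties == "breslow":
        # Every tied failure is compared against the full risk set.
        return d * math.log(total)
    if ties == "efron":
        # The k-th tied failure sees the risk set with a fraction k/d of the
        # tied scores removed, as if the failures happened one after another.
        return sum(math.log(total - (k / d) * tied_total) for k in range(d))
    raise ValueError("ties must be 'breslow' or 'efron'")


risk_set = [1.0, 2.0, 3.0, 4.0]  # hypothetical scores of the four rows at risk
tied = [1.0, 2.0]                # two of those rows fail at the same time

print(tied_log_denominator(risk_set, tied, ties="breslow"))
print(tied_log_denominator(risk_set, tied, ties="efron"))
```

With a single failure at a given time the two terms coincide, so the choice only matters when the data contain ties.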
- coef_
Coefficients of the model.
- Type:
ndarray of shape (n_features,)
- n_log_likelihood
Negative log likelihood of the model at the point of convergence.
- Type:
float
- _breslow_base_hazard
Base hazard generated from training data, used for predicting survival curves.
- Type:
ndarray of shape (max_time_seen,) or shape (n_strata, max_time_seen)
- _breslow_base_survival
Base survival generated from training data, used for predicting survival curves.
- Type:
ndarray of shape (max_time_seen,) or shape (n_strata, max_time_seen)
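As a rough illustration of where the two base attributes could come from, the sketch below computes a Breslow-style base hazard and base survival on integer times, without strata. Everything in it is an assumption made for the example: breslow_baseline is a hypothetical helper, index 0 is taken to correspond to time 1, and the base survival is taken to be the exponential of the negative cumulative base hazard.

```python
import math


def breslow_baseline(times, events, risk):
    """Breslow-style base hazard and base survival for times 1..max(times)."""
    max_time = max(times)
    hazard = []
    for t in range(1, max_time + 1):
        # Number of events observed exactly at time t.
        deaths = sum(1 for t_i, d_i in zip(times, events) if d_i and t_i == t)
        # Summed relative risk of everyone still under observation at time t.
        at_risk = sum(r for t_i, r in zip(times, risk) if t_i >= t)
        hazard.append(deaths / at_risk if at_risk > 0 else 0.0)

    # Base survival as exp(-cumulative base hazard).
    cumulative = 0.0
    survival = []
    for h in hazard:
        cumulative += h
        survival.append(math.exp(-cumulative))
    return hazard, survival


times = [1, 2, 2, 4]
events = [True, True, False, True]
risk = [1.0, 1.0, 1.0, 1.0]  # relative risk of 1 for every row, for readability

hazard, survival = breslow_baseline(times, events, risk)
print(hazard)
print(survival)
```

Both returned lists have length max(times), which mirrors the (max_time_seen,) shape documented above for the unstratified case.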
Methods
fit(X, times, events[, strata, check_input, ...]) – Fit model.
fit_predict(*args, **kwargs) – Fit model and build survival curves.
predict(X[, strata, max_time]) – Build survival curves on an array of vectors X.
predict_risk(X) – Build relative risk on an array of vectors X.
- __init__(*, alpha=0.0, max_iter=100, ties='breslow', tol=1e-09)
- Parameters:
alpha (float)
max_iter (int | None)
ties (Literal['breslow', 'efron'] | None)
tol (float)
Methods
__init__(*[, alpha, max_iter, ties, tol])
fit(X, times, events[, strata, check_input, ...]) – Fit model.
fit_predict(*args, **kwargs) – Fit model and build survival curves.
get_metadata_routing() – Get metadata routing of this object.
get_params([deep]) – Get parameters for this estimator.
predict(X[, strata, max_time]) – Build survival curves on an array of vectors X.
predict_risk(X) – Build relative risk on an array of vectors X.
set_fit_request(*[, check_input, events, ...]) – Configure whether metadata should be requested to be passed to the fit method.
set_params(**params) – Set the parameters of this estimator.
set_predict_request(*[, max_time, strata]) – Configure whether metadata should be requested to be passed to the predict method.
- fit(X, times, events, strata=None, check_input=True, times_start=None)
Fit model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
times (array-like of shape (n_samples,), dtype=np.int64) – Point in time at which each observation was last observed.
events (array-like of shape (n_samples,), dtype=np.bool_) – Whether each observation experienced the event.
strata (array-like of shape (n_samples,), dtype=np.int64, default=None) – If passed in, the stratum associated with each observation.
check_input (bool, default=True) – If True, validates and casts inputs.
times_start (array-like of shape (n_samples,), dtype=np.int64, default=None) – Starting point of each observation. If not passed in, all start times are assumed to be 0.
- Returns:
Fitted Estimator.
- Return type:
object
- fit_predict(*args, **kwargs)
Fit model and build survival curves.
- predict(X, strata=None, max_time=None)
Build survival curves on an array of vectors X.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Predicting data.
strata (array-like of shape (n_samples,), dtype=np.int64, default=None) – If passed in, the stratum associated with each observation.
max_time (int, default=None) – Maximum time of the built survival curves. If None, the maximum time seen in the training data is used.
- Returns:
The estimated survival curves. The left-most column is the probability of survival at time 1, and the right-most column is the probability of survival at max_time.
- Return type:
ndarray of shape (n_samples, max_time), dtype=np.float64
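The layout of the returned array can be pictured with a small pure-Python sketch that combines a base hazard with per-row relative risk. This is a sketch under assumptions, not the estimator's code: survival_curves is a hypothetical helper, the base hazard and risk values are made up, and survival is taken to be the exponential of the negative cumulative product of base hazard and relative risk.

```python
import math


def survival_curves(base_hazard, risk):
    """One survival curve per row; column k holds survival at time k + 1."""
    curves = []
    for r in risk:
        cumulative = 0.0
        row = []
        for h in base_hazard:
            # Scale the base hazard by the row's relative risk and accumulate.
            cumulative += h * r
            row.append(math.exp(-cumulative))
        curves.append(row)
    return curves


base_hazard = [0.1, 0.2, 0.0, 0.3]  # hypothetical base hazard for times 1..4
risk = [0.5, 1.0, 2.0]              # hypothetical relative risk of three rows

curves = survival_curves(base_hazard, risk)
for row in curves:
    print([round(p, 3) for p in row])
```

Each row is non-increasing from left to right, and rows with a higher relative risk sit below rows with a lower one at every time.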
- predict_risk(X)
Build relative risk on an array of vectors X.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Predicting data.
- Returns:
The relative risk of X, used under the hood for building survival curves. Relative risk is what the ‘Concordance Index’ examines.
- Return type:
ndarray of shape (n_samples,), dtype=np.float64
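To show how relative risk feeds a ‘Concordance Index’, here is a minimal pure-Python sketch of Harrell's concordance index. The function concordance_index below is a hypothetical helper written for this illustration, not an API of this package, and it uses the simple quadratic pairwise definition.

```python
def concordance_index(times, events, risk):
    """Fraction of comparable pairs that the relative risk ranks correctly."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is comparable only if the earlier time is an observed event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0  # earlier failure has the higher risk
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


times = [2, 4, 6, 8]
events = [True, True, False, True]
risk = [3.0, 2.5, 1.0, 0.5]  # hypothetical output of predict_risk

print(concordance_index(times, events, risk))
```

Only the ordering of the risk values matters here, which is why the index can be computed from relative risk alone, without the base hazard.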