survivalpredict.strata_preprocessing.StrataBuilderDiscretizer

class survivalpredict.strata_preprocessing.StrataBuilderDiscretizer(n_bins=5, strategy='quantile', splits=None)

Builds strata keys from numeric data.

If predefined ‘splits’ are given, strata are built from those bins, and ‘n_bins’ and ‘strategy’ are ignored. Otherwise, ‘n_bins’ and ‘strategy’ are used to generate the bins. Largely inspired by scikit-learn’s KBinsDiscretizer. If existing strata are passed in, the new strata are added onto them.

Parameters:
  • n_bins (int, default=5) –

    The number of bins to produce. Raises ValueError if n_bins < 2.

    ‘n_bins’ is ignored if ‘splits’ is not None.

  • strategy ({'uniform','quantile','kmeans'}, default='quantile') –

    Strategy used to define the widths of the bins.

    • ‘uniform’: All bins in each feature have identical widths.

    • ‘quantile’: All bins in each feature have the same number of points.

    • ‘kmeans’: Values in each bin have the same nearest center of a 1D k-means cluster.

    ‘strategy’ is ignored if ‘splits’ is not None.

  • splits (numeric array-like, default=None) – Predefined splits used to build the bins. If ‘splits’ is not None, ‘strategy’ and ‘n_bins’ are ignored.
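As a rough illustration of how the ‘uniform’ and ‘quantile’ strategies differ, the sketch below computes bin edges for a single feature with NumPy alone. It is not the library's implementation: bin_edges is a hypothetical helper, and the sample ages are made up for the example.

```python
import numpy as np


def bin_edges(x, n_bins=5, strategy="quantile"):
    """Return n_bins + 1 edges for a 1D numeric array (illustrative only)."""
    x = np.asarray(x, dtype=float)
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2")
    if strategy == "uniform":
        # Equal-width bins spanning the observed range.
        return np.linspace(x.min(), x.max(), n_bins + 1)
    if strategy == "quantile":
        # Equal-frequency bins: edges sit at evenly spaced quantiles.
        return np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    raise ValueError(f"unsupported strategy: {strategy!r}")


# Made-up sample with one large outlier.
ages = np.array([21, 25, 30, 34, 41, 47, 52, 58, 90])

uniform_edges = bin_edges(ages, n_bins=3, strategy="uniform")
quantile_edges = bin_edges(ages, n_bins=3, strategy="quantile")

# The interior edges act as split points; np.digitize maps values to bin indices.
uniform_bins = np.digitize(ages, uniform_edges[1:-1])
quantile_bins = np.digitize(ages, quantile_edges[1:-1])

print(np.bincount(uniform_bins))   # bin sizes under equal-width binning
print(np.bincount(quantile_bins))  # bin sizes under equal-frequency binning
```

With a skewed feature such as this one, equal-width bins leave the outlier alone in its own bin, while quantile bins keep the strata balanced. That is one plausible reason for ‘quantile’ being the default.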

_splits

Splits used to generate bins.

Type:

ndarray of ndarray of shape (n_features,)

_uses_strata

True if fitted on preexisting strata, False otherwise.

Type:

bool
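For the ‘kmeans’ strategy, the splits can be pictured with a small 1D k-means sketch. It follows the approach of scikit-learn’s KBinsDiscretizer, which starts the centers at the midpoints of uniform bins and places edges halfway between the sorted centers. Whether StrataBuilderDiscretizer derives _splits in exactly this way is an assumption; kmeans_edges and its n_iter argument are hypothetical names.

```python
import numpy as np


def kmeans_edges(x, n_bins, n_iter=50):
    """Bin edges from a plain 1D Lloyd's k-means (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Start the centers at the midpoints of equal-width bins.
    uniform = np.linspace(x.min(), x.max(), n_bins + 1)
    centers = (uniform[1:] + uniform[:-1]) / 2.0
    for _ in range(n_iter):
        # Assign every value to its nearest center.
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        updated = centers.copy()
        for k in range(n_bins):
            members = x[labels == k]
            if members.size:  # keep the old center if a cluster is empty
                updated[k] = members.mean()
        if np.allclose(updated, centers):
            break
        centers = updated
    centers = np.sort(centers)
    # Edges lie halfway between neighbouring centers.
    interior = (centers[1:] + centers[:-1]) / 2.0
    return np.concatenate(([x.min()], interior, [x.max()]))


# Made-up feature with three well-separated groups.
values = np.array([1, 2, 3, 20, 21, 22, 50, 51, 52])
edges = kmeans_edges(values, n_bins=3)
bins = np.digitize(values, edges[1:-1])
```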

__init__(n_bins=5, strategy='quantile', splits=None)
Parameters:
  • n_bins (int | None)

  • strategy (Literal['uniform', 'quantile', 'kmeans'])

  • splits (list[float | int] | list[list[float | int]] | ndarray[tuple[int] | tuple[int, int], dtype[floating | integer]] | None)

Methods

__init__([n_bins, strategy, splits])

fit(X[, times, events, strata, check_input])

Learn the strata.

fit_transform(X[, times, events, strata])

Fit and build strata.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

set_fit_request(*[, check_input, events, ...])

Configure whether metadata should be requested to be passed to the fit method.

set_output(*[, transform])

Set output container.

set_params(**params)

Set the parameters of this estimator.

set_transform_request(*[, events, strata, times])

Configure whether metadata should be requested to be passed to the transform method.

transform(X[, times, events, strata])

Discretize numerical data to build strata.

fit(X, times=None, events=None, strata=None, check_input=True)

Learn the strata.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Data to be discretized.

  • times (array-like of shape (n_samples,), default=None) – Ignored.

  • events (array-like of shape (n_samples,), default=None) – Ignored.

  • strata (array-like of shape (n_samples,), default=None) – Preexisting strata. The strata built are added onto the preexisting strata.

  • check_input (bool, default=True) – If True, runs checks and casting on the data to ensure it is valid.

Returns:

Returns the instance itself.

Return type:

object

fit_transform(X, times=None, events=None, strata=None)

Fit and build strata.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Data to be discretized.

  • times (array-like of shape (n_samples,), default=None) – Ignored.

  • events (array-like of shape (n_samples,), default=None) – Ignored.

  • strata (array-like of shape (n_samples,), default=None) – Preexisting strata. The strata built are added onto the preexisting strata.

Returns:

Built strata.

Return type:

ndarray of shape (n_samples,), dtype=np.int64
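One way to picture how new strata "add onto" preexisting ones is to treat each distinct combination of existing stratum and bin index as its own stratum key. The sketch below does this with NumPy. The encoding actually used by StrataBuilderDiscretizer is not documented here, so combine_strata and the key numbering are assumptions made for illustration.

```python
import numpy as np


def combine_strata(binned, strata=None):
    """Map each distinct row of (existing strata, bin indices) to one int64 key."""
    columns = np.asarray(binned).reshape(len(binned), -1)
    if strata is not None:
        # Put the preexisting strata in front of the per-feature bin indices.
        columns = np.column_stack([np.asarray(strata), columns])
    # Rows that agree on every column share a key; keys run from 0 to n_groups - 1.
    _, keys = np.unique(columns, axis=0, return_inverse=True)
    return keys.reshape(-1).astype(np.int64)


binned = np.array([[0], [0], [1], [1], [0]])  # bin index of one discretized feature
site = np.array([0, 1, 0, 1, 0])              # made-up preexisting strata

keys = combine_strata(binned, strata=site)
keys_without_strata = combine_strata(binned)
```

Samples share a key only when they match on both the existing stratum and every bin index, so passing strata can only refine the grouping, never merge groups.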

set_output(*, transform=None)

Set output container.

See scikit-learn’s “Introducing the set_output API” example for how to use the API.

Parameters:

transform ({"default", "pandas", "polars"}, default=None) –

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:

self – Estimator instance.

Return type:

estimator instance

set_transform_request(*, events='$UNCHANGED$', strata='$UNCHANGED$', times='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • events (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for events parameter in transform.

  • strata (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for strata parameter in transform.

  • times (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for times parameter in transform.

Returns:

self – The updated object.

Return type:

object

transform(X, times=None, events=None, strata=None)

Discretize numerical data to build strata.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Data to be discretized.

  • times (array-like of shape (n_samples,), default=None) – Ignored.

  • events (array-like of shape (n_samples,), default=None) – Ignored.

  • strata (array-like of shape (n_samples,), default=None) – Preexisting strata. The strata built are added onto the preexisting strata.

Returns:

Built strata.

Return type:

array-like of shape (n_samples,), dtype=np.int64
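The discretization step with predefined splits can be sketched with np.digitize. This assumes that ‘splits’ holds interior cut points, one sequence per feature; the page does not spell out the exact convention. The cut points and sample rows are hypothetical.

```python
import numpy as np

# Hypothetical interior cut points, one list per feature (age, BMI).
splits = [[40.0, 65.0], [18.5, 25.0, 30.0]]

X_new = np.array([
    [35.0, 22.0],
    [40.0, 31.5],
    [72.0, 17.0],
])

# Each column is binned against its own cut points. A value equal to a cut
# point falls into the upper bin because np.digitize uses left-closed intervals.
binned = np.column_stack(
    [np.digitize(X_new[:, j], cuts) for j, cuts in enumerate(splits)]
)
```

Values below the first cut point land in bin 0 and values at or above the last cut point land in the top bin, so unseen data outside the training range still receives a valid bin index.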