package sklearn

type tag = [ `StackingClassifier ]
type t = [ `BaseEstimator | `ClassifierMixin | `MetaEstimatorMixin | `Object | `StackingClassifier | `TransformerMixin ] Obj.t
val of_pyobject : Py.Object.t -> t
val to_pyobject : [> tag ] Obj.t -> Py.Object.t
val as_classifier : t -> [ `ClassifierMixin ] Obj.t
val as_meta_estimator : t -> [ `MetaEstimatorMixin ] Obj.t
val as_transformer : t -> [ `TransformerMixin ] Obj.t
val as_estimator : t -> [ `BaseEstimator ] Obj.t
val create : ?final_estimator:[> `BaseEstimator ] Np.Obj.t -> ?cv: [ `BaseCrossValidator of [> `BaseCrossValidator ] Np.Obj.t | `Arr of [> `ArrayLike ] Np.Obj.t | `I of int ] -> ?stack_method:[ `Auto | `Predict_proba | `Decision_function | `Predict ] -> ?n_jobs:int -> ?passthrough:bool -> ?verbose:int -> estimators:(string * [> `BaseEstimator ] Np.Obj.t) list -> unit -> t

Stack of estimators with a final classifier.

Stacked generalization consists of stacking the outputs of individual estimators and using a classifier to compute the final prediction. Stacking exploits the strength of each individual estimator by using its output as the input of a final estimator.

Note that `estimators_` are fitted on the full `X` while `final_estimator_` is trained using cross-validated predictions of the base estimators using `cross_val_predict`.
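The training scheme described above implies a predictable number of model fits per call to `fit`. The following is a minimal sketch of that arithmetic, not part of the library: `count_fits` is a hypothetical helper, and it assumes an integer `cv` and that no estimator has been set to 'drop'.

```python
def count_fits(n_estimators: int, cv: int = 5) -> dict:
    """Count the model fits triggered by one stacking `fit` call (sketch)."""
    # Each base estimator is fitted once on the full X (kept as estimators_).
    full_fits = n_estimators
    # Each base estimator is also refitted once per fold to produce the
    # cross-validated predictions that train the final estimator.
    cv_fits = n_estimators * cv
    # The final estimator is fitted once on those predictions.
    final_fits = 1
    return {
        "full": full_fits,
        "cv": cv_fits,
        "final": final_fits,
        "total": full_fits + cv_fits + final_fits,
    }


# Two base estimators with the default 5-fold cross-validation.
print(count_fits(n_estimators=2, cv=5)["total"])  # 13
```

Under these assumptions, raising `cv` increases only the `cv` term, which is why more folds mainly cost training time.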

.. versionadded:: 0.22

Read more in the :ref:`User Guide <stacking>`.

Parameters
----------
estimators : list of (str, estimator)
    Base estimators which will be stacked together. Each element of the list is defined as a tuple of a string (i.e. name) and an estimator instance. An estimator can be set to 'drop' using `set_params`.

final_estimator : estimator, default=None
    A classifier which will be used to combine the base estimators. The default classifier is a `LogisticRegression`.

cv : int, cross-validation generator or an iterable, default=None
    Determines the cross-validation splitting strategy used in `cross_val_predict` to train `final_estimator`. Possible inputs for cv are:

    * None, to use the default 5-fold cross validation,
    * integer, to specify the number of folds in a (Stratified) KFold,
    * An object to be used as a cross-validation generator,
    * An iterable yielding train, test splits.

    For integer/None inputs, if the estimator is a classifier and y is either binary or multiclass, `StratifiedKFold` is used. In all other cases, `KFold` is used.

    Refer to the :ref:`User Guide <cross_validation>` for the various cross-validation strategies that can be used here.

    .. note::
       A larger number of splits provides no benefit if the number of training samples is large enough; it only increases the training time. ``cv`` is not used for model evaluation but for prediction.

stack_method : 'auto', 'predict_proba', 'decision_function', 'predict', default='auto'
    Methods called for each base estimator. It can be:

    * if 'auto', it will try to invoke, for each estimator, `'predict_proba'`, `'decision_function'` or `'predict'` in that order.
    * otherwise, one of `'predict_proba'`, `'decision_function'` or `'predict'`. If the method is not implemented by the estimator, it will raise an error.

n_jobs : int, default=None
    The number of jobs to run in parallel for the `fit` of all `estimators`. `None` means 1 unless in a `joblib.parallel_backend` context. -1 means using all processors. See Glossary for more details.

passthrough : bool, default=False
    When False, only the predictions of estimators will be used as training data for `final_estimator`. When True, the `final_estimator` is trained on the predictions as well as the original training data.

verbose : int, default=0
    Verbosity level.
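The lookup order documented for `stack_method` can be illustrated with a small standard-library sketch. `resolve_stack_method`, `ProbaModel` and `MarginModel` are hypothetical names introduced only for this illustration; the library's actual implementation and error message may differ.

```python
def resolve_stack_method(estimator, stack_method="auto"):
    """Return the name of the method a base estimator would be queried with."""
    candidates = ("predict_proba", "decision_function", "predict")
    if stack_method == "auto":
        # Try the candidates in the documented order and keep the first hit.
        for name in candidates:
            if callable(getattr(estimator, name, None)):
                return name
        raise ValueError(
            f"{type(estimator).__name__} implements none of {candidates}"
        )
    # An explicit method must exist on the estimator, otherwise it is an error.
    if not callable(getattr(estimator, stack_method, None)):
        raise ValueError(
            f"{type(estimator).__name__} does not implement {stack_method!r}"
        )
    return stack_method


class ProbaModel:
    """Toy estimator exposing probabilities and labels."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]

    def predict(self, X):
        return [0 for _ in X]


class MarginModel:
    """Toy estimator exposing decision scores and labels."""

    def decision_function(self, X):
        return [0.0 for _ in X]

    def predict(self, X):
        return [0 for _ in X]


print(resolve_stack_method(ProbaModel()))   # predict_proba
print(resolve_stack_method(MarginModel()))  # decision_function
```

Requesting `'predict_proba'` explicitly for `MarginModel` raises `ValueError` in this sketch, mirroring the documented behaviour of raising an error when the method is not implemented.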

Attributes
----------
classes_ : ndarray of shape (n_classes,)
    Class labels.

estimators_ : list of estimators
    The elements of the estimators parameter, having been fitted on the training data. If an estimator has been set to `'drop'`, it will not appear in `estimators_`.

named_estimators_ : :class:`~sklearn.utils.Bunch`
    Attribute to access any fitted sub-estimators by name.

final_estimator_ : estimator
    The classifier which predicts given the output of `estimators_`.

stack_method_ : list of str
    The method used by each base estimator.

Notes
-----
When `predict_proba` is used by each estimator (i.e. most of the time for `stack_method='auto'` or specifically for `stack_method='predict_proba'`), the first column predicted by each estimator will be dropped in the case of a binary classification problem. Indeed, the two columns would be perfectly collinear.
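The collinearity mentioned in this note follows from the two class probabilities summing to one. A minimal sketch with made-up probabilities (the values below are illustrative, not produced by any real estimator):

```python
# Hypothetical predict_proba output of one base estimator on four samples
# of a binary problem: the columns are P(class 0) and P(class 1).
proba = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
    [0.25, 0.75],
]

# The first column is fully determined by the second (p0 = 1 - p1),
# so it carries no additional information for the final estimator.
assert all(abs(p0 - (1.0 - p1)) < 1e-12 for p0, p1 in proba)

# Dropping the first column leaves one feature per sample and estimator.
meta_feature = [row[1:] for row in proba]
print(meta_feature)  # [[0.1], [0.8], [0.5], [0.75]]
```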

References
----------
.. [1] Wolpert, David H. 'Stacked generalization.' Neural networks 5.2 (1992): 241-259.

Examples
--------
>>> from sklearn.datasets import load_iris
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.svm import LinearSVC
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.ensemble import StackingClassifier
>>> X, y = load_iris(return_X_y=True)
>>> estimators = [
...     ('rf', RandomForestClassifier(n_estimators=10, random_state=42)),
...     ('svr', make_pipeline(StandardScaler(),
...                           LinearSVC(random_state=42)))
... ]
>>> clf = StackingClassifier(
...     estimators=estimators, final_estimator=LogisticRegression()
... )
>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, stratify=y, random_state=42
... )
>>> clf.fit(X_train, y_train).score(X_test, y_test)
0.9...

val decision_function : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Predict decision function for samples in X using `final_estimator_.decision_function`.

Parameters
----------
X : array-like, sparse matrix of shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and n_features is the number of features.

Returns
-------
decisions : ndarray of shape (n_samples,), (n_samples, n_classes), or (n_samples, n_classes * (n_classes-1) / 2)
    The decision function computed by the final estimator.

val fit : ?sample_weight:[> `ArrayLike ] Np.Obj.t -> x:[> `ArrayLike ] Np.Obj.t -> y:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> t

Fit the estimators.

Parameters
----------
X : array-like, sparse matrix of shape (n_samples, n_features)
    Training vectors, where `n_samples` is the number of samples and `n_features` is the number of features.

y : array-like of shape (n_samples,)
    Target values.

sample_weight : array-like of shape (n_samples,), default=None
    Sample weights. If None, then samples are equally weighted. Note that this is supported only if all underlying estimators support sample weights.

Returns
-------
self : object

val fit_transform : ?y:[> `ArrayLike ] Np.Obj.t -> ?fit_params:(string * Py.Object.t) list -> x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
----------
X : array-like, sparse matrix, dataframe of shape (n_samples, n_features)

y : ndarray of shape (n_samples,), default=None
    Target values.

**fit_params : dict
    Additional fit parameters.

Returns
-------
X_new : ndarray of shape (n_samples, n_features_new)
    Transformed array.

val get_params : ?deep:bool -> [> tag ] Obj.t -> Py.Object.t

Get the parameters of an estimator from the ensemble.

Parameters
----------
deep : bool, default=True
    Setting it to True gets the various classifiers and the parameters of the classifiers as well.

val predict : ?predict_params:(string * Py.Object.t) list -> x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Predict target for X.

Parameters
----------
X : array-like, sparse matrix of shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and n_features is the number of features.

**predict_params : dict of str -> obj
    Parameters to the `predict` called by the `final_estimator`. Note that this may be used to return uncertainties from some estimators with `return_std` or `return_cov`. Be aware that it only accounts for uncertainty in the final estimator.

Returns
-------
y_pred : ndarray of shape (n_samples,) or (n_samples, n_output)
    Predicted targets.

val predict_proba : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Predict class probabilities for X using `final_estimator_.predict_proba`.

Parameters
----------
X : array-like, sparse matrix of shape (n_samples, n_features)
    Training vectors, where n_samples is the number of samples and n_features is the number of features.

Returns
-------
probabilities : ndarray of shape (n_samples, n_classes) or list of ndarray of shape (n_output,)
    The class probabilities of the input samples.

val score : ?sample_weight:[> `ArrayLike ] Np.Obj.t -> x:[> `ArrayLike ] Np.Obj.t -> y:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> float

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters
----------
X : array-like of shape (n_samples, n_features)
    Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)
    True labels for X.

sample_weight : array-like of shape (n_samples,), default=None
    Sample weights.

Returns
-------
score : float
    Mean accuracy of self.predict(X) with respect to y.
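To see why the subset accuracy mentioned for `score` is a harsh metric, it can be compared with a per-label accuracy on a toy multi-label example. This is a standard-library sketch; `subset_accuracy` and `per_label_accuracy` are hypothetical helpers and the label matrices are made up.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose whole label set is predicted exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def per_label_accuracy(y_true, y_pred):
    """Fraction of individual labels predicted correctly, over all samples."""
    total = sum(len(t) for t in y_true)
    correct = sum(
        a == b for t, p in zip(y_true, y_pred) for a, b in zip(t, p)
    )
    return correct / total


# Four samples with three binary labels each (illustrative values).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [1, 0, 1]]

print(subset_accuracy(y_true, y_pred))     # 0.5
print(per_label_accuracy(y_true, y_pred))  # 10 of 12 labels are correct
```

A single wrong label makes the whole sample count as an error under subset accuracy, which is why it is lower here than the per-label figure.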

val set_params : ?params:(string * Py.Object.t) list -> [> tag ] Obj.t -> t

Set the parameters of an estimator from the ensemble.

Valid parameter keys can be listed with `get_params()`.

Parameters
----------
**params : keyword arguments
    Specific parameters using e.g. `set_params(parameter_name=new_value)`. In addition to setting the parameters of the stacking estimator, the individual estimators of the stacking estimator can also be set, or can be removed by setting them to 'drop'.

val transform : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Return class labels or probabilities for X for each estimator.

Parameters
----------
X : array-like, sparse matrix of shape (n_samples, n_features)
    Training vectors, where `n_samples` is the number of samples and `n_features` is the number of features.

Returns
-------
y_preds : ndarray of shape (n_samples, n_estimators) or (n_samples, n_classes * n_estimators)
    Prediction outputs for each estimator.
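The two documented output widths can be related to the stack method and the number of classes. The sketch below is only an approximation under stated assumptions: `columns_for` and `prediction_width` are hypothetical helpers, `decision_function` is assumed to yield one column per class in the multiclass case and a single column in the binary case, and `passthrough` is ignored.

```python
def columns_for(stack_method: str, n_classes: int) -> int:
    """Columns one base estimator contributes (illustrative assumption)."""
    if stack_method == "predict":
        return 1  # one predicted label per sample
    if stack_method in ("predict_proba", "decision_function"):
        # Binary problems are assumed to contribute a single column,
        # multiclass problems one column per class.
        return 1 if n_classes == 2 else n_classes
    raise ValueError(f"unknown stack method: {stack_method!r}")


def prediction_width(stack_methods, n_classes: int) -> int:
    """Total number of prediction columns over all base estimators."""
    return sum(columns_for(m, n_classes) for m in stack_methods)


methods = ["predict_proba", "decision_function"]
# Two base estimators on a 3-class problem: n_classes * n_estimators columns.
print(prediction_width(methods, n_classes=3))  # 6
# The same two estimators on a binary problem: n_estimators columns.
print(prediction_width(methods, n_classes=2))  # 2
```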

val classes_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute classes_: get value or raise Not_found if None.

val classes_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute classes_: get value as an option.

val estimators_ : t -> [ `BaseEstimator | `Object ] Np.Obj.t list

Attribute estimators_: get value or raise Not_found if None.

val estimators_opt : t -> [ `BaseEstimator | `Object ] Np.Obj.t list option

Attribute estimators_: get value as an option.

val named_estimators_ : t -> Dict.t

Attribute named_estimators_: get value or raise Not_found if None.

val named_estimators_opt : t -> Dict.t option

Attribute named_estimators_: get value as an option.

val final_estimator_ : t -> [ `BaseEstimator | `Object ] Np.Obj.t

Attribute final_estimator_: get value or raise Not_found if None.

val final_estimator_opt : t -> [ `BaseEstimator | `Object ] Np.Obj.t option

Attribute final_estimator_: get value as an option.

val stack_method_ : t -> string list

Attribute stack_method_: get value or raise Not_found if None.

val stack_method_opt : t -> string list option

Attribute stack_method_: get value as an option.

val to_string : t -> string

Print the object to a human-readable representation.

val show : t -> string

Print the object to a human-readable representation.

val pp : Stdlib.Format.formatter -> t -> unit

Pretty-print the object to a formatter.