Probability calibration with isotonic regression or sigmoid.
See glossary entry for :term:`cross-validation estimator`.
With this class, the base_estimator is fit on the train set of the cross-validation generator and the test set is used for calibration. The probabilities for each of the folds are then averaged for prediction. In case that cv="prefit" is passed to __init__, it is assumed that base_estimator has been fitted already and all data is used for calibration. Note that data for fitting the classifier and for calibrating it must be disjoint.
Read more in the :ref:`User Guide <calibration>`.
Parameters ---------- base_estimator : instance BaseEstimator The classifier whose output decision function needs to be calibrated to offer more accurate predict_proba outputs. If cv=prefit, the classifier must have been fit already on data.
method : 'sigmoid' or 'isotonic' The method to use for calibration. Can be 'sigmoid' which corresponds to Platt's method or 'isotonic' which is a non-parametric approach. It is not advised to use isotonic calibration with too few calibration samples ``(<<1000)`` since it tends to overfit. Use sigmoids (Platt's calibration) in this case.
cv : integer, cross-validation generator, iterable or "prefit", optional Determines the cross-validation splitting strategy. Possible inputs for cv are:
- None, to use the default 5-fold cross-validation,
- integer, to specify the number of folds.
- :term:`CV splitter`,
- An iterable yielding (train, test) splits as arrays of indices.
For integer/None inputs, if ``y`` is binary or multiclass, :class:`sklearn.model_selection.StratifiedKFold` is used. If ``y`` is neither binary nor multiclass, :class:`sklearn.model_selection.KFold` is used.
Refer :ref:`User Guide <cross_validation>` for the various cross-validation strategies that can be used here.
If "prefit" is passed, it is assumed that base_estimator has been fitted already and all data is used for calibration.
.. versionchanged:: 0.22 ``cv`` default value if None changed from 3-fold to 5-fold.
Attributes ---------- classes_ : array, shape (n_classes) The class labels.
calibrated_classifiers_ : list (len() equal to cv or 1 if cv == "prefit") The list of calibrated classifiers, one for each crossvalidation fold, which has been fitted on all but the validation fold and calibrated on the validation fold.
References ---------- .. 1
Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers, B. Zadrozny & C. Elkan, ICML 2001
.. 2
Transforming Classifier Scores into Accurate Multiclass Probability Estimates, B. Zadrozny & C. Elkan, (KDD 2002)
.. 3
Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods, J. Platt, (1999)
.. 4
Predicting Good Probabilities with Supervised Learning, A. Niculescu-Mizil & R. Caruana, ICML 2005