package sklearn

type tag = [ `RANSACRegressor ]
type t = [ `BaseEstimator | `MetaEstimatorMixin | `MultiOutputMixin | `Object | `RANSACRegressor | `RegressorMixin ] Obj.t
val of_pyobject : Py.Object.t -> t
val to_pyobject : [> tag ] Obj.t -> Py.Object.t
val as_estimator : t -> [ `BaseEstimator ] Obj.t
val as_meta_estimator : t -> [ `MetaEstimatorMixin ] Obj.t
val as_regressor : t -> [ `RegressorMixin ] Obj.t
val as_multi_output : t -> [ `MultiOutputMixin ] Obj.t
val create : ?base_estimator:[> `BaseEstimator ] Np.Obj.t -> ?min_samples:[ `I of int | `Float_0_1_ of Py.Object.t ] -> ?residual_threshold:float -> ?is_data_valid:Py.Object.t -> ?is_model_valid:Py.Object.t -> ?max_trials:int -> ?max_skips:int -> ?stop_n_inliers:int -> ?stop_score:float -> ?stop_probability:float -> ?loss:[ `S of string | `Callable of Py.Object.t ] -> ?random_state:int -> unit -> t

RANSAC (RANdom SAmple Consensus) algorithm.

RANSAC is an iterative algorithm for the robust estimation of parameters from a subset of inliers from the complete data set.

Read more in the :ref:`User Guide <ransac_regression>`.
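The iteration described above can be sketched in plain Python for the special case of fitting a line `y = a*x + b`. This is a simplified illustration, not the scikit-learn implementation: the helper names `fit_line` and `ransac_line` are hypothetical, the base estimator is hard-coded as an ordinary least-squares line, and only the `max_trials` stop criterion is modelled.

```python
import random


def fit_line(points):
    """Least-squares fit of y = a*x + b to a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac_line(points, min_samples=2, residual_threshold=1.0,
                max_trials=100, seed=0):
    """Return ((a, b), inliers) for the largest consensus set found."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(max_trials):
        # 1. Draw a random minimal sub-sample.
        subset = rng.sample(points, min_samples)
        if len({x for x, _ in subset}) < 2:
            continue  # degenerate sub-sample: a line cannot be fitted
        # 2. Fit the model to the sub-sample only.
        a, b = fit_line(subset)
        # 3. Classify every point by its absolute residual.
        inliers = [(x, y) for x, y in points
                   if abs(y - (a * x + b)) <= residual_threshold]
        # 4. Keep the largest consensus set seen so far.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no valid consensus set could be found")
    # 5. Refit on all inliers of the best consensus set.
    return fit_line(best_inliers), best_inliers
```

The final refit on the whole consensus set is what makes the estimate robust: outliers influence which sub-samples are tried, but not the returned model.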

Parameters
----------

base_estimator : object, optional Base estimator object which implements the following methods:

* `fit(X, y)`: Fit model to given training data and target values.
* `score(X, y)`: Returns the mean accuracy on the given test data, which is used for the stop criterion defined by `stop_score`. Additionally, the score is used to decide which of two equally large consensus sets is chosen as the better one.
* `predict(X)`: Returns predicted values using the linear model, which is used to compute residual error using loss function.

If `base_estimator` is None, then ``base_estimator=sklearn.linear_model.LinearRegression()`` is used for target values of dtype float.

Note that the current implementation only supports regression estimators.

min_samples : int (>= 1) or float (0, 1), optional Minimum number of samples chosen randomly from original data. Treated as an absolute number of samples for `min_samples >= 1`, treated as a relative number `ceil(min_samples * X.shape[0])` for `min_samples < 1`. This is typically chosen as the minimal number of samples necessary to estimate the given `base_estimator`. By default a ``sklearn.linear_model.LinearRegression()`` estimator is assumed and `min_samples` is chosen as ``X.shape[1] + 1``.
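The three cases for `min_samples` can be written out as a small function. This is a sketch of the rule as stated above, not the library's code; the name `resolve_min_samples` is hypothetical, and `n_samples` and `n_features` stand for `X.shape[0]` and `X.shape[1]`.

```python
import math


def resolve_min_samples(min_samples, n_samples, n_features):
    """Translate a `min_samples` setting into an absolute sample count."""
    if min_samples is None:
        # Default: minimal sample size for a linear model with intercept.
        return n_features + 1
    if 0 < min_samples < 1:
        # Relative setting: a fraction of the data set, rounded up.
        return math.ceil(min_samples * n_samples)
    if min_samples >= 1:
        # Absolute setting: must be a whole number.
        if min_samples % 1 != 0:
            raise ValueError("absolute min_samples must be an integer")
        return int(min_samples)
    raise ValueError("min_samples must be positive")
```

For a data set with 200 samples and 2 features, the default resolves to 3 samples per trial, while `min_samples=0.25` resolves to 50.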

residual_threshold : float, optional Maximum residual for a data sample to be classified as an inlier. By default the threshold is chosen as the MAD (median absolute deviation) of the target values `y`.
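As an illustration of the default threshold, the median absolute deviation of the targets can be computed with the standard library alone. The function name `median_absolute_deviation` is an assumption made for this sketch.

```python
import statistics


def median_absolute_deviation(y):
    """Median of the absolute deviations of `y` from its median."""
    center = statistics.median(y)
    return statistics.median([abs(value - center) for value in y])
```

For `y = [1.0, 2.0, 3.0, 4.0, 100.0]` the median is 3.0 and the absolute deviations are `[2.0, 1.0, 0.0, 1.0, 97.0]`, so the MAD is 1.0. The single extreme value barely affects it, which is why the MAD is a reasonable default scale for separating inliers from outliers.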

is_data_valid : callable, optional This function is called with the randomly selected data before the model is fitted to it: `is_data_valid(X, y)`. If its return value is False the current randomly chosen sub-sample is skipped.

is_model_valid : callable, optional This function is called with the estimated model and the randomly selected data: `is_model_valid(model, X, y)`. If its return value is False the current randomly chosen sub-sample is skipped. Rejecting samples with this function is computationally costlier than with `is_data_valid`. `is_model_valid` should therefore only be used if the estimated model is needed for making the rejection decision.
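A sketch of what such callbacks might look like follows. Both criteria are invented for illustration: the data check rejects sub-samples whose first feature has no spread, and the model check assumes the fitted model exposes a one-dimensional `coef_` attribute (as a single-target linear model does) and accepts only non-negative coefficients.

```python
def is_data_valid(X, y):
    """Cheap check on the sub-sample, run before any model is fitted."""
    first_column = [row[0] for row in X]
    return max(first_column) > min(first_column)


def is_model_valid(model, X, y):
    """Costlier check, run after the model was fitted to the sub-sample.

    Assumes `model.coef_` is a flat sequence of coefficients.
    """
    return all(coefficient >= 0 for coefficient in model.coef_)
```

Callbacks with these signatures would be passed through the `is_data_valid` and `is_model_valid` parameters. Because the data check runs before fitting, anything that can be decided from `X` and `y` alone belongs there.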

max_trials : int, optional Maximum number of iterations for random sample selection.

max_skips : int, optional Maximum number of iterations that can be skipped due to finding zero inliers or invalid data defined by ``is_data_valid`` or invalid models defined by ``is_model_valid``.

.. versionadded:: 0.19

stop_n_inliers : int, optional Stop iteration if at least this number of inliers are found.

stop_score : float, optional Stop iteration if score is greater than or equal to this threshold.
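The two early-stop criteria, `stop_n_inliers` and `stop_score`, can be combined as in the sketch below. The function name `should_stop` is hypothetical, and the infinite defaults are an assumption chosen so that a criterion left unset never triggers.

```python
def should_stop(n_inliers, score,
                stop_n_inliers=float("inf"), stop_score=float("inf")):
    """Return True once either early-stop criterion is satisfied."""
    return n_inliers >= stop_n_inliers or score >= stop_score
```

With both thresholds left at infinity, iteration continues until another limit, such as `max_trials`, ends it.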

stop_probability : float in range [0, 1], optional RANSAC iteration stops if at least one outlier-free set of the training data is sampled in RANSAC. This requires generating at least N samples (iterations)::

N >= log(1 - probability) / log(1 - e**m)

where the probability (confidence) is typically set to a high value such as 0.99 (the default), e is the current fraction of inliers w.r.t. the total number of samples, and m is the number of samples drawn per trial (`min_samples`).
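The bound on N can be evaluated directly. The sketch below assumes that `m` in the formula is the number of samples drawn per trial and restricts itself to the non-degenerate case where both the inlier fraction and the probability lie strictly between 0 and 1; the name `required_trials` is hypothetical.

```python
import math


def required_trials(inlier_fraction, min_samples, probability=0.99):
    """Smallest integer N with N >= log(1 - p) / log(1 - e**m)."""
    if not 0 < inlier_fraction < 1:
        raise ValueError("inlier_fraction must lie strictly between 0 and 1")
    if not 0 < probability < 1:
        raise ValueError("probability must lie strictly between 0 and 1")
    # Probability that one random sub-sample contains at least one outlier.
    contaminated = 1 - inlier_fraction ** min_samples
    return math.ceil(math.log(1 - probability) / math.log(contaminated))
```

With half of the data being inliers and two samples per trial, `required_trials(0.5, 2)` gives 17. With 90% inliers and three samples per trial, `required_trials(0.9, 3)` gives 4. The required number of trials grows quickly as the inlier fraction falls or `min_samples` rises.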

loss : string, callable, optional, default 'absolute_loss' The string inputs 'absolute_loss' and 'squared_loss' are supported; they compute the absolute loss and the squared loss per sample, respectively.

If ``loss`` is a callable, then it should be a function that takes two arrays as inputs, the true and predicted value, and returns a 1-D array with the i-th value of the array corresponding to the loss on ``X[i]``.

If the loss on a sample is greater than the ``residual_threshold``, then this sample is classified as an outlier.
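The per-sample losses and the resulting inlier classification can be sketched for single-target data as follows. The function names are chosen for this illustration, and plain lists stand in for arrays.

```python
def absolute_loss(y_true, y_pred):
    """Per-sample absolute error."""
    return [abs(t - p) for t, p in zip(y_true, y_pred)]


def squared_loss(y_true, y_pred):
    """Per-sample squared error."""
    return [(t - p) ** 2 for t, p in zip(y_true, y_pred)]


def classify_inliers(y_true, y_pred, residual_threshold, loss=absolute_loss):
    """A sample is an outlier when its loss exceeds the threshold."""
    return [value <= residual_threshold for value in loss(y_true, y_pred)]
```

The choice of loss changes what a given threshold means. A residual of 1.5 counts as an inlier under the absolute loss with `residual_threshold=2.0`, but as an outlier under the squared loss, because 1.5 squared is 2.25.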

.. versionadded:: 0.18

random_state : int, RandomState instance, default=None The generator used to draw the random sub-samples. Pass an int for reproducible output across multiple function calls. See :term:`Glossary <random_state>`.

Attributes
----------

estimator_ : object Best fitted model (copy of the `base_estimator` object).

n_trials_ : int Number of random selection trials until one of the stop criteria is met. It is always ``<= max_trials``.

inlier_mask_ : bool array of shape [n_samples] Boolean mask of inliers classified as ``True``.

n_skips_no_inliers_ : int Number of iterations skipped due to finding zero inliers.

.. versionadded:: 0.19

n_skips_invalid_data_ : int Number of iterations skipped due to invalid data defined by ``is_data_valid``.

.. versionadded:: 0.19

n_skips_invalid_model_ : int Number of iterations skipped due to an invalid model defined by ``is_model_valid``.

.. versionadded:: 0.19

Examples
--------
>>> from sklearn.linear_model import RANSACRegressor
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(
...     n_samples=200, n_features=2, noise=4.0, random_state=0)
>>> reg = RANSACRegressor(random_state=0).fit(X, y)
>>> reg.score(X, y)
0.9885...
>>> reg.predict(X[:1,])
array([-31.9417...])

References
----------
.. [1] https://en.wikipedia.org/wiki/RANSAC
.. [2] https://www.sri.com/sites/default/files/publications/ransac-publication.pdf
.. [3] http://www.bmva.org/bmvc/2009/Papers/Paper355/Paper355.pdf

val fit : ?sample_weight:[> `ArrayLike ] Np.Obj.t -> x:[> `ArrayLike ] Np.Obj.t -> y:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> t

Fit estimator using RANSAC algorithm.

Parameters
----------

X : array-like or sparse matrix, shape [n_samples, n_features] Training data.

y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values.

sample_weight : array-like of shape (n_samples,), default=None Individual weights for each sample. An error is raised if `sample_weight` is passed and the `fit` method of `base_estimator` does not support it.

.. versionadded:: 0.18

Raises
------

ValueError If no valid consensus set could be found. This occurs if `is_data_valid` and `is_model_valid` return False for all `max_trials` randomly chosen sub-samples.

val get_params : ?deep:bool -> [> tag ] Obj.t -> Dict.t

Get parameters for this estimator.

Parameters
----------

deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
-------

params : mapping of string to any Parameter names mapped to their values.

val predict : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Predict using the estimated model.

This is a wrapper for `estimator_.predict(X)`.

Parameters
----------

X : numpy array of shape [n_samples, n_features]

Returns
-------

y : array, shape = [n_samples] or [n_samples, n_targets] Returns predicted values.

val score : x:[> `ArrayLike ] Np.Obj.t -> y:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> float

Returns the score of the prediction.

This is a wrapper for `estimator_.score(X, y)`.

Parameters
----------

X : numpy array or sparse matrix of shape [n_samples, n_features] Training data.

y : array, shape = [n_samples] or [n_samples, n_targets] Target values.

Returns
-------

z : float Score of the prediction.

val set_params : ?params:(string * Py.Object.t) list -> [> tag ] Obj.t -> t

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form ``<component>__<parameter>`` so that it's possible to update each component of a nested object.
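The ``<component>__<parameter>`` convention amounts to splitting each key at its first double underscore. The sketch below shows only that routing step; the function name `split_nested_params` is hypothetical, and the example keys assume a RANSAC estimator whose `base_estimator` has a `fit_intercept` parameter.

```python
def split_nested_params(params):
    """Separate top-level parameters from those aimed at a component."""
    own, nested = {}, {}
    for key, value in params.items():
        name, delimiter, sub_key = key.partition("__")
        if delimiter:
            # "component__parameter": route to the named component.
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested
```

For the input `{"max_trials": 50, "base_estimator__fit_intercept": False}`, the first dictionary holds `max_trials` and the second maps `base_estimator` to `{"fit_intercept": False}`. Because the split happens at the first delimiter only, a deeper key such as `a__b__c` is handed to component `a` as `b__c`, which lets the convention recurse through nested objects.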

Parameters
----------

**params : dict Estimator parameters.

Returns
-------

self : object Estimator instance.

val estimator_ : t -> Py.Object.t

Attribute estimator_: get value or raise Not_found if None.

val estimator_opt : t -> Py.Object.t option

Attribute estimator_: get value as an option.

val n_trials_ : t -> int

Attribute n_trials_: get value or raise Not_found if None.

val n_trials_opt : t -> int option

Attribute n_trials_: get value as an option.

val inlier_mask_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute inlier_mask_: get value or raise Not_found if None.

val inlier_mask_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute inlier_mask_: get value as an option.

val n_skips_no_inliers_ : t -> int

Attribute n_skips_no_inliers_: get value or raise Not_found if None.

val n_skips_no_inliers_opt : t -> int option

Attribute n_skips_no_inliers_: get value as an option.

val n_skips_invalid_data_ : t -> int

Attribute n_skips_invalid_data_: get value or raise Not_found if None.

val n_skips_invalid_data_opt : t -> int option

Attribute n_skips_invalid_data_: get value as an option.

val n_skips_invalid_model_ : t -> int

Attribute n_skips_invalid_model_: get value or raise Not_found if None.

val n_skips_invalid_model_opt : t -> int option

Attribute n_skips_invalid_model_: get value as an option.

val to_string : t -> string

Print the object to a human-readable representation.

val show : t -> string

Print the object to a human-readable representation.

val pp : Stdlib.Format.formatter -> t -> unit

Pretty-print the object to a formatter.