package sklearn

type t
val of_pyobject : Py.Object.t -> t
val to_pyobject : t -> Py.Object.t
val create : ?missing_values:[ `String of string | `PyObject of Py.Object.t ] -> ?strategy:string -> ?fill_value:[ `String of string | `PyObject of Py.Object.t ] -> ?verbose:int -> ?copy:bool -> ?add_indicator:bool -> unit -> t

Imputation transformer for completing missing values.

Read more in the :ref:`User Guide <impute>`.

Parameters
----------
missing_values : number, string, np.nan (default) or None
    The placeholder for the missing values. All occurrences of `missing_values` will be imputed.

strategy : string, default='mean'
    The imputation strategy.

  • If "mean", then replace missing values using the mean along each column. Can only be used with numeric data.
  • If "median", then replace missing values using the median along each column. Can only be used with numeric data.
  • If "most_frequent", then replace missing using the most frequent value along each column. Can be used with strings or numeric data.
  • If "constant", then replace missing values with fill_value. Can be used with strings or numeric data.

.. versionadded:: 0.20 strategy="constant" for fixed value imputation.

fill_value : string or numerical value, default=None
    When strategy == "constant", fill_value is used to replace all occurrences of missing_values. If left to the default, fill_value will be 0 when imputing numerical data and "missing_value" for strings or object data types.
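The four strategies and the role of fill_value can be illustrated with a small standard-library sketch. This is not the library's implementation: the helper names `is_missing`, `fit_statistics` and `apply_statistics` are hypothetical, rows are plain Python lists, and None or NaN is assumed to be the missing-value placeholder.

```python
import math
from collections import Counter
from statistics import mean, median


def is_missing(value):
    # Assumption for this sketch: None and float NaN both mark a missing entry.
    return value is None or (isinstance(value, float) and math.isnan(value))


def fit_statistics(rows, strategy="mean", fill_value=None):
    """Compute one fill value per column from the training rows."""
    statistics = []
    for column in zip(*rows):
        observed = [v for v in column if not is_missing(v)]
        if strategy == "constant":
            statistics.append(fill_value)
        elif not observed:
            statistics.append(float("nan"))  # nothing to compute from
        elif strategy == "mean":
            statistics.append(mean(observed))
        elif strategy == "median":
            statistics.append(median(observed))
        elif strategy == "most_frequent":
            # Ties resolve to the first value seen, which may differ from the library.
            statistics.append(Counter(observed).most_common(1)[0][0])
        else:
            raise ValueError(f"unknown strategy: {strategy!r}")
    return statistics


def apply_statistics(rows, statistics):
    """Replace each missing entry with the fill value of its column."""
    return [
        [statistics[j] if is_missing(v) else v for j, v in enumerate(row)]
        for row in rows
    ]
```

Keeping the fit step separate from the apply step mirrors the fit/transform split: the fill values come from the training rows only and are then reused on any later data.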

verbose : integer, default=0
    Controls the verbosity of the imputer.

copy : boolean, default=True
    If True, a copy of X will be created. If False, imputation will be done in-place whenever possible. Note that, in the following cases, a new copy will always be made, even if `copy=False`:

  • If X is not an array of floating values;
  • If X is encoded as a CSR matrix;
  • If add_indicator=True.

add_indicator : boolean, default=False
    If True, a :class:`MissingIndicator` transform will stack onto output of the imputer's transform. This allows a predictive estimator to account for missingness despite imputation. If a feature has no missing values at fit/train time, the feature won't appear on the missing indicator even if there are missing values at transform/test time.
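The fit-time versus transform-time behaviour of the indicator can be sketched with the standard library alone. The helpers `fit_indicator_features` and `append_indicator` are hypothetical names, NaN or None is assumed to be the placeholder, and the 0/1 encoding is a choice made for this sketch rather than a statement about the library's output dtype.

```python
import math


def is_missing(value):
    # Assumption for this sketch: None and float NaN both mark a missing entry.
    return value is None or (isinstance(value, float) and math.isnan(value))


def fit_indicator_features(rows):
    """Indices of the columns that contain at least one missing value at fit time."""
    n_features = len(rows[0])
    return [j for j in range(n_features) if any(is_missing(r[j]) for r in rows)]


def append_indicator(imputed_rows, original_rows, features):
    """Stack one 0/1 column per fitted feature onto the imputed rows."""
    return [
        list(imputed) + [1 if is_missing(original[j]) else 0 for j in features]
        for imputed, original in zip(imputed_rows, original_rows)
    ]
```

If column 0 has no missing values in the training rows, it gets no indicator column, so a missing value that appears there later is imputed but not flagged.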

Attributes
----------
statistics_ : array of shape (n_features,)
    The imputation fill value for each feature. Computing statistics can result in `np.nan` values. During :meth:`transform`, features corresponding to `np.nan` statistics will be discarded.

indicator_ : :class:`sklearn.impute.MissingIndicator`
    Indicator used to add binary indicators for missing values. ``None`` if add_indicator is False.

See also
--------
IterativeImputer : Multivariate imputation of missing values.

Examples
--------
>>> import numpy as np
>>> from sklearn.impute import SimpleImputer
>>> imp_mean = SimpleImputer(missing_values=np.nan, strategy='mean')
>>> imp_mean.fit([[7, 2, 3], [4, np.nan, 6], [10, 5, 9]])
SimpleImputer()
>>> X = [[np.nan, 2, 3], [4, np.nan, 6], [10, np.nan, 9]]
>>> print(imp_mean.transform(X))
[[ 7.   2.   3. ]
 [ 4.   3.5  6. ]
 [10.   3.5  9. ]]

Notes
-----
Columns which only contained missing values at :meth:`fit` are discarded upon :meth:`transform` if strategy is not "constant".
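The column-dropping rule in the note above can be sketched as follows for the "mean" strategy. This is an illustration under stated assumptions, not the library's code: `fit_column_means` and `transform_dropping_empty` are hypothetical helpers, and NaN is assumed to be the placeholder.

```python
import math


def is_nan(value):
    return isinstance(value, float) and math.isnan(value)


def fit_column_means(rows):
    """Mean of the observed entries per column; NaN when a column has none."""
    means = []
    for column in zip(*rows):
        observed = [v for v in column if not is_nan(v)]
        means.append(sum(observed) / len(observed) if observed else float("nan"))
    return means


def transform_dropping_empty(rows, means):
    """Impute with the fitted means and drop columns whose mean is NaN."""
    kept = [j for j, m in enumerate(means) if not is_nan(m)]
    return [[means[j] if is_nan(row[j]) else row[j] for j in kept] for row in rows]
```

A column that was entirely missing at fit time has no usable statistic, so the output has fewer columns than the input even when that column holds real values at transform time.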

val fit : ?y:Py.Object.t -> x:[ `Ndarray of Ndarray.t | `SparseMatrix of Csr_matrix.t ] -> t -> t

Fit the imputer on X.

Parameters
----------
X : array-like, sparse matrix, shape (n_samples, n_features)
    Input data, where ``n_samples`` is the number of samples and ``n_features`` is the number of features.

Returns
-------
self : SimpleImputer

val fit_transform : ?y:Ndarray.t -> ?fit_params:(string * Py.Object.t) list -> x:Ndarray.t -> t -> Ndarray.t

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
----------
X : numpy array of shape [n_samples, n_features]
    Training set.

y : numpy array of shape [n_samples]
    Target values.

**fit_params : dict
    Additional fit parameters.

Returns
-------
X_new : numpy array of shape [n_samples, n_features_new]
    Transformed array.

val get_params : ?deep:bool -> t -> Py.Object.t

Get parameters for this estimator.

Parameters
----------
deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
-------
params : mapping of string to any
    Parameter names mapped to their values.

val set_params : ?params:(string * Py.Object.t) list -> t -> t

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form ``<component>__<parameter>`` so that it's possible to update each component of a nested object.
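The ``<component>__<parameter>`` naming convention can be illustrated by splitting each key on its first double underscore. The helper `split_nested_params` is hypothetical and only shows how such names decompose; it does not set anything on an estimator.

```python
def split_nested_params(params):
    """Separate an estimator's own parameters from those addressed to components.

    Keys of the form ``<component>__<parameter>`` are grouped under their
    component name; the remainder after the first ``__`` is kept intact so a
    deeper level can split it again.
    """
    own = {}
    nested = {}
    for key, value in params.items():
        name, delimiter, sub_key = key.partition("__")
        if delimiter:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested
```

For example, the key "imputer__strategy" addresses the "strategy" parameter of a component named "imputer", while a plain "strategy" key belongs to the object itself.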

Parameters
----------
**params : dict
    Estimator parameters.

Returns
-------
self : object
    Estimator instance.

val transform : x:[ `Ndarray of Ndarray.t | `SparseMatrix of Csr_matrix.t ] -> t -> Ndarray.t

Impute all missing values in X.

Parameters
----------
X : array-like, sparse matrix, shape (n_samples, n_features)
    The input data to complete.

val statistics_ : t -> Ndarray.t

Attribute statistics_: see constructor for documentation

val indicator_ : t -> Py.Object.t

Attribute indicator_: see constructor for documentation

val to_string : t -> string

Return a human-readable string representation of the object.

val show : t -> string

Return a human-readable string representation of the object.

val pp : Format.formatter -> t -> unit

Pretty-print the object to a formatter.
