package sklearn

  1. Overview
  2. Docs
Legend:
Library
Module
Module type
Parameter
Class
Class type
type tag = [
  1. | `PowerTransformer
]
type t = [ `BaseEstimator | `Object | `PowerTransformer | `TransformerMixin ] Obj.t
val of_pyobject : Py.Object.t -> t
val to_pyobject : [> tag ] Obj.t -> Py.Object.t
val as_transformer : t -> [ `TransformerMixin ] Obj.t
val as_estimator : t -> [ `BaseEstimator ] Obj.t
val create : ?method_:string -> ?standardize:bool -> ?copy:bool -> unit -> t

Apply a power transform featurewise to make data more Gaussian-like.

Power transforms are a family of parametric, monotonic transformations that are applied to make data more Gaussian-like. This is useful for modeling issues related to heteroscedasticity (non-constant variance), or other situations where normality is desired.

Currently, PowerTransformer supports the Box-Cox transform and the Yeo-Johnson transform. The optimal parameter for stabilizing variance and minimizing skewness is estimated through maximum likelihood.

Box-Cox requires input data to be strictly positive, while Yeo-Johnson supports both positive or negative data.

By default, zero-mean, unit-variance normalization is applied to the transformed data.

Read more in the :ref:`User Guide <preprocessing_transformer>`.

.. versionadded:: 0.20

Parameters ---------- method : str, (default='yeo-johnson') The power transform method. Available methods are:

  • 'yeo-johnson' 1_, works with positive and negative values
  • 'box-cox' 2_, only works with strictly positive values

standardize : boolean, default=True Set to True to apply zero-mean, unit-variance normalization to the transformed output.

copy : boolean, optional, default=True Set to False to perform inplace computation during transformation.

Attributes ---------- lambdas_ : array of float, shape (n_features,) The parameters of the power transformation for the selected features.

Examples -------- >>> import numpy as np >>> from sklearn.preprocessing import PowerTransformer >>> pt = PowerTransformer() >>> data = [1, 2], [3, 2], [4, 5] >>> print(pt.fit(data)) PowerTransformer() >>> print(pt.lambdas_) 1.386... -3.100... >>> print(pt.transform(data)) [-1.316... -0.707...] [ 0.209... -0.707...] [ 1.106... 1.414...]

See also -------- power_transform : Equivalent function without the estimator API.

QuantileTransformer : Maps data to a standard normal distribution with the parameter `output_distribution='normal'`.

Notes ----- NaNs are treated as missing values: disregarded in ``fit``, and maintained in ``transform``.

For a comparison of the different scalers, transformers, and normalizers, see :ref:`examples/preprocessing/plot_all_scaling.py <sphx_glr_auto_examples_preprocessing_plot_all_scaling.py>`.

References ----------

.. 1 I.K. Yeo and R.A. Johnson, 'A new family of power transformations to improve normality or symmetry.' Biometrika, 87(4), pp.954-959, (2000).

.. 2 G.E.P. Box and D.R. Cox, 'An Analysis of Transformations', Journal of the Royal Statistical Society B, 26, 211-252 (1964).

val fit : ?y:Py.Object.t -> x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> t

Estimate the optimal parameter lambda for each feature.

The optimal lambda parameter for minimizing skewness is estimated on each feature independently using maximum likelihood.

Parameters ---------- X : array-like, shape (n_samples, n_features) The data used to estimate the optimal transformation parameters.

y : Ignored

Returns ------- self : object

val fit_transform : ?y:Py.Object.t -> x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters ---------- X : array-like, sparse matrix, dataframe of shape (n_samples, n_features)

y : ndarray of shape (n_samples,), default=None Target values.

**fit_params : dict Additional fit parameters.

Returns ------- X_new : ndarray array of shape (n_samples, n_features_new) Transformed array.

val get_params : ?deep:bool -> [> tag ] Obj.t -> Dict.t

Get parameters for this estimator.

Parameters ---------- deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns ------- params : mapping of string to any Parameter names mapped to their values.

val inverse_transform : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Apply the inverse power transformation using the fitted lambdas.

The inverse of the Box-Cox transformation is given by::

if lambda_ == 0: X = exp(X_trans) else: X = (X_trans * lambda_ + 1) ** (1 / lambda_)

The inverse of the Yeo-Johnson transformation is given by::

if X >= 0 and lambda_ == 0: X = exp(X_trans) - 1 elif X >= 0 and lambda_ != 0: X = (X_trans * lambda_ + 1) ** (1 / lambda_) - 1 elif X < 0 and lambda_ != 2: X = 1 - (-(2 - lambda_) * X_trans + 1) ** (1 / (2 - lambda_)) elif X < 0 and lambda_ == 2: X = 1 - exp(-X_trans)

Parameters ---------- X : array-like, shape (n_samples, n_features) The transformed data.

Returns ------- X : array-like, shape (n_samples, n_features) The original data

val set_params : ?params:(string * Py.Object.t) list -> [> tag ] Obj.t -> t

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form ``<component>__<parameter>`` so that it's possible to update each component of a nested object.

Parameters ---------- **params : dict Estimator parameters.

Returns ------- self : object Estimator instance.

val transform : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Apply the power transform to each feature using the fitted lambdas.

Parameters ---------- X : array-like, shape (n_samples, n_features) The data to be transformed using a power transformation.

Returns ------- X_trans : array-like, shape (n_samples, n_features) The transformed data.

val lambdas_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute lambdas_: get value or raise Not_found if None.

val lambdas_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute lambdas_: get value as an option.

val to_string : t -> string

Print the object to a human-readable representation.

val show : t -> string

Print the object to a human-readable representation.

val pp : Stdlib.Format.formatter -> t -> unit

Pretty-print the object to a formatter.