Module Sklearn.Covariance
val get_py : string -> Py.Object.t
Get an attribute of this module as a Py.Object.t. This is useful to pass a Python function to another function.
val empirical_covariance :
?assume_centered:bool ->
x:[> `ArrayLike ] Np.Obj.t ->
unit ->
[> `ArrayLike ] Np.Obj.t
Computes the maximum likelihood covariance estimator.

Parameters
----------
X : ndarray of shape (n_samples, n_features)
    Data from which to compute the covariance estimate.

assume_centered : bool, default=False
    If True, data will not be centered before computation. Useful when working with data whose mean is almost, but not exactly zero. If False, data will be centered before computation.

Returns
-------
covariance : ndarray of shape (n_features, n_features)
    Empirical covariance (Maximum Likelihood Estimator).

Examples
--------
>>> from sklearn.covariance import empirical_covariance
>>> X = [[1,1,1],[1,1,1],[1,1,1],
...      [0,0,0],[0,0,0],[0,0,0]]
>>> empirical_covariance(X)
array([[0.25, 0.25, 0.25],
       [0.25, 0.25, 0.25],
       [0.25, 0.25, 0.25]])
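A hedged OCaml sketch of the same call through this binding. Only the empirical_covariance signature above is taken from this page; Np.matrixf is an assumed float-matrix constructor in the companion Np module (used throughout the sketches below), and pyml initialization is assumed to happen at module load.

let () =
  (* Np.matrixf (assumed constructor) builds an [`ArrayLike] Np.Obj.t
     from a float array array; it mirrors the Python doctest's X. *)
  let x = Np.matrixf [| [|1.; 1.; 1.|]; [|1.; 1.; 1.|]; [|1.; 1.; 1.|];
                        [|0.; 0.; 0.|]; [|0.; 0.; 0.|]; [|0.; 0.; 0.|] |] in
  (* Per the doctest above, the result is a 3x3 matrix filled with 0.25. *)
  let cov = Sklearn.Covariance.empirical_covariance ~x () in
  ignore cov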
val fast_mcd :
?support_fraction:float ->
?cov_computation_method:Py.Object.t ->
?random_state:int ->
x:[> `ArrayLike ] Np.Obj.t ->
unit ->
[> `ArrayLike ] Np.Obj.t * [> `ArrayLike ] Np.Obj.t * Py.Object.t
Estimates the Minimum Covariance Determinant matrix.

Read more in the :ref:`User Guide <robust_covariance>`.

Parameters
----------
X : array-like of shape (n_samples, n_features)
    The data matrix, with p features and n samples.

support_fraction : float, default=None
    The proportion of points to be included in the support of the raw MCD estimate. Default is `None`, which implies that the minimum value of `support_fraction` will be used within the algorithm: `(n_samples + n_features + 1) / 2`. This parameter must be in the range (0, 1).

cov_computation_method : callable, default=:func:`sklearn.covariance.empirical_covariance`
    The function which will be used to compute the covariance. Must return an array of shape (n_features, n_features).

random_state : int or RandomState instance, default=None
    Determines the pseudo random number generator for shuffling the data. Pass an int for reproducible results across multiple function calls. See the :term:`Glossary <random_state>`.

Returns
-------
location : ndarray of shape (n_features,)
    Robust location of the data.

covariance : ndarray of shape (n_features, n_features)
    Robust covariance of the features.

support : ndarray of shape (n_samples,), dtype=bool
    A mask of the observations that have been used to compute the robust location and covariance estimates of the data set.

Notes
-----
The FastMCD algorithm was introduced by Rousseeuw and Van Driessen in 'A Fast Algorithm for the Minimum Covariance Determinant Estimator, 1999, American Statistical Association and the American Society for Quality, TECHNOMETRICS'. The principle is to compute robust estimates on random subsets before pooling them into larger subsets, and finally into the full data set. Depending on the size of the initial sample, there are one, two or three such computation levels.

Note that only raw estimates are returned. If one is interested in the correction and reweighting steps described in [RouseeuwVan]_, see the MinCovDet object.

References
----------
.. [RouseeuwVan] A Fast Algorithm for the Minimum Covariance Determinant Estimator, 1999, American Statistical Association and the American Society for Quality, TECHNOMETRICS.

.. [Butler1993] R. W. Butler, P. L. Davies and M. Jhun, Asymptotics For The Minimum Covariance Determinant Estimator, The Annals of Statistics, 1993, Vol. 21, No. 3, 1385-1400.
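A hedged OCaml sketch of calling fast_mcd through this binding. The three-way tuple follows the signature above; the Np.matrixf constructor is an assumption.

let () =
  (* Seven tight inliers plus one gross outlier in 2-D; Np.matrixf is
     the assumed float-matrix constructor used throughout these sketches. *)
  let x = Np.matrixf [| [|0.0; 0.1|]; [|0.1; 0.0|]; [| -0.1; 0.1|];
                        [|0.2; -0.1|]; [|0.0; 0.0|]; [|0.1; 0.2|];
                        [| -0.2; 0.0|]; [|9.0; 9.0|] |] in
  (* The robust location/covariance are expected to be driven by the
     inliers; support is a boolean mask over the 8 observations. *)
  let location, covariance, support =
    Sklearn.Covariance.fast_mcd ~random_state:0 ~x ()
  in
  ignore (location, covariance, support)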
val graphical_lasso :
?cov_init:[> `ArrayLike ] Np.Obj.t ->
?mode:[ `Cd | `Lars ] ->
?tol:float ->
?enet_tol:float ->
?max_iter:int ->
?verbose:int ->
?return_costs:bool ->
?eps:float ->
?return_n_iter:bool ->
emp_cov:[> `ArrayLike ] Np.Obj.t ->
alpha:float ->
unit ->
[> `ArrayLike ] Np.Obj.t * [> `ArrayLike ] Np.Obj.t * Py.Object.t * int
L1-penalized covariance estimator.

Read more in the :ref:`User Guide <sparse_inverse_covariance>`.

.. versionchanged:: v0.20
   graph_lasso has been renamed to graphical_lasso.

Parameters
----------
emp_cov : ndarray of shape (n_features, n_features)
    Empirical covariance from which to compute the covariance estimate.

alpha : float
    The regularization parameter: the higher alpha, the more regularization, the sparser the inverse covariance. Range is (0, inf].

cov_init : array of shape (n_features, n_features), default=None
    The initial guess for the covariance.

mode : {'cd', 'lars'}, default='cd'
    The Lasso solver to use: coordinate descent or LARS. Use LARS for very sparse underlying graphs, where p > n. Elsewhere prefer cd, which is more numerically stable.

tol : float, default=1e-4
    The tolerance to declare convergence: if the dual gap goes below this value, iterations are stopped. Range is (0, inf].

enet_tol : float, default=1e-4
    The tolerance for the elastic net solver used to calculate the descent direction. This parameter controls the accuracy of the search direction for a given column update, not of the overall parameter estimate. Only used for mode='cd'. Range is (0, inf].

max_iter : int, default=100
    The maximum number of iterations.

verbose : bool, default=False
    If verbose is True, the objective function and dual gap are printed at each iteration.

return_costs : bool, default=False
    If return_costs is True, the objective function and dual gap at each iteration are returned.

eps : float, default=eps
    The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. Default is `np.finfo(np.float64).eps`.

return_n_iter : bool, default=False
    Whether or not to return the number of iterations.

Returns
-------
covariance : ndarray of shape (n_features, n_features)
    The estimated covariance matrix.

precision : ndarray of shape (n_features, n_features)
    The estimated (sparse) precision matrix.

costs : list of (objective, dual_gap) pairs
    The list of values of the objective function and the dual gap at each iteration. Returned only if return_costs is True.

n_iter : int
    Number of iterations. Returned only if `return_n_iter` is set to True.

See Also
--------
GraphicalLasso, GraphicalLassoCV

Notes
-----
The algorithm employed to solve this problem is the GLasso algorithm, from the Friedman 2008 Biostatistics paper. It is the same algorithm as in the R `glasso` package.

One possible difference from the `glasso` R package is that the diagonal coefficients are not penalized.
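A hedged OCaml sketch composing two functions documented on this page: the empirical covariance of some data is fed to graphical_lasso. The 4-tuple destructuring follows the signature above; Np.matrixf is the assumed constructor, and passing ~return_costs:true ~return_n_iter:true is an assumption about how the binding maps the optional flags onto the fixed 4-tuple result.

let () =
  let x = Np.matrixf [| [|0.9; 1.0; 2.0|]; [|1.1; 0.8; 1.9|];
                        [|2.0; 2.1; 0.9|]; [|1.9; 2.2; 1.1|];
                        [|3.0; 2.9; 2.0|] |] in
  let emp_cov = Sklearn.Covariance.empirical_covariance ~x () in
  (* Higher alpha -> sparser estimated precision matrix. *)
  let covariance, precision, _costs, n_iter =
    Sklearn.Covariance.graphical_lasso
      ~return_costs:true ~return_n_iter:true ~emp_cov ~alpha:0.5 ()
  in
  Printf.printf "converged in %d iterations\n" n_iter;
  ignore (covariance, precision)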
val ledoit_wolf :
?assume_centered:bool ->
?block_size:int ->
x:[> `ArrayLike ] Np.Obj.t ->
unit ->
[> `ArrayLike ] Np.Obj.t * float
Estimates the shrunk Ledoit-Wolf covariance matrix.

Read more in the :ref:`User Guide <shrunk_covariance>`.

Parameters
----------
X : array-like of shape (n_samples, n_features)
    Data from which to compute the covariance estimate.

assume_centered : bool, default=False
    If True, data will not be centered before computation. Useful to work with data whose mean is approximately, but not exactly, zero. If False, data will be centered before computation.

block_size : int, default=1000
    Size of the blocks into which the covariance matrix will be split. This is purely a memory optimization and does not affect results.

Returns
-------
shrunk_cov : ndarray of shape (n_features, n_features)
    Shrunk covariance.

shrinkage : float
    Coefficient in the convex combination used for the computation of the shrunk estimate.

Notes
-----
The regularized (shrunk) covariance is:

    (1 - shrinkage) * cov + shrinkage * mu * np.identity(n_features)

where mu = trace(cov) / n_features.
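A hedged OCaml sketch (same assumed Np.matrixf constructor); the returned pair matches the signature above.

let () =
  let x = Np.matrixf [| [|0.9; 1.0|]; [|1.1; 0.8|]; [|2.0; 2.1|];
                        [|1.9; 2.2|]; [|3.0; 2.9|] |] in
  let shrunk_cov, shrinkage = Sklearn.Covariance.ledoit_wolf ~x () in
  (* shrinkage is the convex-combination weight from the Notes above. *)
  Printf.printf "ledoit-wolf shrinkage = %g\n" shrinkage;
  ignore shrunk_cov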
val ledoit_wolf_shrinkage :
?assume_centered:bool ->
?block_size:int ->
x:[> `ArrayLike ] Np.Obj.t ->
unit ->
float
Estimates the shrinkage coefficient of the Ledoit-Wolf shrunk covariance matrix.

Read more in the :ref:`User Guide <shrunk_covariance>`.

Parameters
----------
X : array-like of shape (n_samples, n_features)
    Data from which to compute the Ledoit-Wolf shrunk covariance shrinkage.

assume_centered : bool, default=False
    If True, data will not be centered before computation. Useful to work with data whose mean is approximately, but not exactly, zero. If False, data will be centered before computation.

block_size : int, default=1000
    Size of the blocks into which the covariance matrix will be split.

Returns
-------
shrinkage : float
    Coefficient in the convex combination used for the computation of the shrunk estimate.

Notes
-----
The regularized (shrunk) covariance is:

    (1 - shrinkage) * cov + shrinkage * mu * np.identity(n_features)

where mu = trace(cov) / n_features.
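A hedged OCaml sketch relating this coefficient to the other functions on this page: feeding it, together with the empirical covariance, to shrunk_covariance reproduces the convex combination from the Notes (again assuming the hypothetical Np.matrixf).

let () =
  let x = Np.matrixf [| [|0.9; 1.0|]; [|1.1; 0.8|]; [|2.0; 2.1|];
                        [|1.9; 2.2|]; [|3.0; 2.9|] |] in
  let shrinkage = Sklearn.Covariance.ledoit_wolf_shrinkage ~x () in
  let emp_cov = Sklearn.Covariance.empirical_covariance ~x () in
  (* (1 - shrinkage) * emp_cov + shrinkage * mu * I, per the Notes. *)
  let shrunk = Sklearn.Covariance.shrunk_covariance ~shrinkage ~emp_cov () in
  Printf.printf "shrinkage = %g\n" shrinkage;
  ignore shrunk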
val log_likelihood :
emp_cov:[> `ArrayLike ] Np.Obj.t ->
precision:[> `ArrayLike ] Np.Obj.t ->
unit ->
float
Computes the sample mean of the log-likelihood under a covariance model.

Computes the empirical expected log-likelihood (accounting for the normalization terms and scaling), allowing for universal comparison (beyond this software package).

Parameters
----------
emp_cov : ndarray of shape (n_features, n_features)
    Maximum Likelihood Estimator of covariance.

precision : ndarray of shape (n_features, n_features)
    The precision matrix of the covariance model to be tested.

Returns
-------
log_likelihood_ : float
    Sample mean of the log-likelihood.
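A hedged OCaml sketch: scoring an identity-covariance model, whose precision matrix is also the identity, against an empirical covariance (Np.matrixf is the assumed constructor).

let () =
  let emp_cov = Np.matrixf [| [|1.0; 0.2|]; [|0.2; 1.0|] |] in
  (* The model is passed via its precision (inverse covariance) matrix. *)
  let precision = Np.matrixf [| [|1.0; 0.0|]; [|0.0; 1.0|] |] in
  let ll = Sklearn.Covariance.log_likelihood ~emp_cov ~precision () in
  Printf.printf "mean log-likelihood = %g\n" ll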
val oas :
?assume_centered:bool ->
x:[> `ArrayLike ] Np.Obj.t ->
unit ->
[> `ArrayLike ] Np.Obj.t * float
Estimate covariance with the Oracle Approximating Shrinkage algorithm.

Parameters
----------
X : array-like of shape (n_samples, n_features)
    Data from which to compute the covariance estimate.

assume_centered : bool, default=False
    If True, data will not be centered before computation. Useful to work with data whose mean is approximately, but not exactly, zero. If False, data will be centered before computation.

Returns
-------
shrunk_cov : array-like of shape (n_features, n_features)
    Shrunk covariance.

shrinkage : float
    Coefficient in the convex combination used for the computation of the shrunk estimate.

Notes
-----
The regularized (shrunk) covariance is:

    (1 - shrinkage) * cov + shrinkage * mu * np.identity(n_features)

where mu = trace(cov) / n_features.

The formula we used to implement the OAS is slightly modified compared to the one given in the article. See :class:`OAS` for more details.
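A hedged OCaml sketch, structurally identical to the ledoit_wolf one above (assumed Np.matrixf constructor).

let () =
  let x = Np.matrixf [| [|0.9; 1.0|]; [|1.1; 0.8|]; [|2.0; 2.1|];
                        [|1.9; 2.2|]; [|3.0; 2.9|] |] in
  let shrunk_cov, shrinkage = Sklearn.Covariance.oas ~x () in
  Printf.printf "oas shrinkage = %g\n" shrinkage;
  ignore shrunk_cov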
val shrunk_covariance :
?shrinkage:float ->
emp_cov:[> `ArrayLike ] Np.Obj.t ->
unit ->
[> `ArrayLike ] Np.Obj.t
Calculates a covariance matrix shrunk on the diagonal.

Read more in the :ref:`User Guide <shrunk_covariance>`.

Parameters
----------
emp_cov : array-like of shape (n_features, n_features)
    Covariance matrix to be shrunk.

shrinkage : float, default=0.1
    Coefficient in the convex combination used for the computation of the shrunk estimate. Range is [0, 1].

Returns
-------
shrunk_cov : ndarray of shape (n_features, n_features)
    Shrunk covariance.

Notes
-----
The regularized (shrunk) covariance is given by:

    (1 - shrinkage) * cov + shrinkage * mu * np.identity(n_features)

where mu = trace(cov) / n_features.
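A hedged OCaml sketch with the arithmetic worked out from the formula above (assumed Np.matrixf constructor).

let () =
  let emp_cov = Np.matrixf [| [|1.0; 0.5|]; [|0.5; 2.0|] |] in
  (* mu = trace / n_features = (1.0 + 2.0) / 2 = 1.5. With the default
     shrinkage of 0.1: diagonal -> 0.9 * diag + 0.15, off-diagonal -> 0.45,
     i.e. rows [|1.05; 0.45|] and [|0.45; 1.95|]. *)
  let shrunk = Sklearn.Covariance.shrunk_covariance ~emp_cov () in
  ignore shrunk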