PLS regression
PLSRegression implements the two-block PLS regression known as PLS2, or PLS1 when the response is one-dimensional. This class inherits from _PLS with mode='A', deflation_mode='regression', norm_y_weights=False and algorithm='nipals'.
Read more in the :ref:`User Guide <cross_decomposition>`.
.. versionadded:: 0.8
Parameters
----------
n_components : int, (default 2)
    Number of components to keep.

scale : boolean, (default True)
    Whether to scale the data.

max_iter : int, (default 500)
    The maximum number of iterations of the NIPALS inner loop (used
    only if algorithm='nipals').

tol : non-negative real, (default 1e-06)
    Tolerance used in the iterative algorithm.

copy : boolean, (default True)
    Whether the deflation should be done on a copy. Leave the default
    value set to True unless you don't care about side effects.
Attributes
----------
x_weights_ : array, [p, n_components]
    X block weights vectors.

y_weights_ : array, [q, n_components]
    Y block weights vectors.

x_loadings_ : array, [p, n_components]
    X block loadings vectors.

y_loadings_ : array, [q, n_components]
    Y block loadings vectors.

x_scores_ : array, [n_samples, n_components]
    X scores.

y_scores_ : array, [n_samples, n_components]
    Y scores.

x_rotations_ : array, [p, n_components]
    X block to latents rotations.

y_rotations_ : array, [q, n_components]
    Y block to latents rotations.

coef_ : array, [p, q]
    The coefficients of the linear model: ``Y = X coef_ + Err``

n_iter_ : array-like
    Number of iterations of the NIPALS inner loop for each component.
Notes
-----
Matrices::

    T: x_scores_
    U: y_scores_
    W: x_weights_
    C: y_weights_
    P: x_loadings_
    Q: y_loadings_

Are computed such that::

    X = T P.T + Err and Y = U Q.T + Err
    T[:, k] = Xk W[:, k] for k in range(n_components)
    U[:, k] = Yk C[:, k] for k in range(n_components)
    x_rotations_ = W (P.T W)^(-1)
    y_rotations_ = C (Q.T C)^(-1)

where Xk and Yk are residual matrices at iteration k.
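The relations above can be checked numerically with a small NumPy sketch. This is an illustration under stated assumptions, not the library's implementation: the helper name ``pls2_sketch`` is hypothetical, the data are only centred (no scaling), and the X weights are taken from an SVD of ``Xk.T @ Yk`` instead of running the iterative NIPALS inner loop.

```python
import numpy as np

def pls2_sketch(X, Y, n_components):
    """Hypothetical helper: PLS2 with regression-mode deflation, NumPy only."""
    Xk = X - X.mean(axis=0)  # centred working copies (scaling omitted)
    Yk = Y - Y.mean(axis=0)
    W, P, Q, T = [], [], [], []
    for _ in range(n_components):
        # Unit-norm X weights: dominant left singular vector of Xk.T @ Yk,
        # used here in place of the iterative NIPALS inner loop (assumption).
        w = np.linalg.svd(Xk.T @ Yk, full_matrices=False)[0][:, 0]
        t = Xk @ w                  # X score: T[:, k] = Xk W[:, k]
        p = Xk.T @ t / (t @ t)      # X loadings
        q = Yk.T @ t / (t @ t)      # Y loadings (Y regressed on the X score)
        Xk = Xk - np.outer(t, p)    # deflate both blocks on the X score
        Yk = Yk - np.outer(t, q)
        W.append(w)
        P.append(p)
        Q.append(q)
        T.append(t)
    W, P, Q, T = (np.column_stack(m) for m in (W, P, Q, T))
    return W, P, Q, T

X = np.array([[0., 0., 1.], [1., 0., 0.], [2., 2., 2.], [2., 5., 4.]])
Y = np.array([[0.1, -0.2], [0.9, 1.1], [6.2, 5.9], [11.9, 12.3]])
W, P, Q, T = pls2_sketch(X, Y, n_components=2)

# W (P.T W)^(-1) maps the centred X directly to the scores T,
# without going through the residual matrices Xk.
x_rotations = W @ np.linalg.inv(P.T @ W)
Xc = X - X.mean(axis=0)
print(np.allclose(Xc @ x_rotations, T))
```

The rotation matrix is what makes it possible to project new samples onto the latent space in one matrix product, since the residual matrices Xk exist only during fitting.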
`Slides explaining PLS <http://www.eigenvector.com/Docs/Wise_pls_properties.pdf>`_
For each component k, find weights u, v that optimize: ``max corr(Xk u, Yk v) * std(Xk u) std(Yk v)``, such that ``|u| = 1``
Note that it maximizes both the correlations between the scores and the intra-block variances.
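As a rough numerical illustration of this criterion, the sketch below evaluates ``corr * std * std`` for a pair of weight vectors and compares it with the covariance of the two scores. Everything in it is an assumption made for the example: the random data, the helper name ``objective``, and in particular the choice to constrain both u and v to unit norm so that the maximum is well defined.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Y = X[:, :2] @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(50, 3))
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
n = Xc.shape[0]

def objective(u, v):
    """corr(Xc u, Yc v) * std(Xc u) * std(Yc v) for given weights."""
    a, b = Xc @ u, Yc @ v
    return np.corrcoef(a, b)[0, 1] * a.std() * b.std()

# Candidate maximiser: leading singular vectors of the cross-product matrix.
U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
u_best, v_best = U[:, 0], Vt[0]

# The criterion is the covariance of the two scores, here s[0] / n.
print(np.isclose(objective(u_best, v_best), s[0] / n))

# Random unit-norm weights should not score higher than the SVD pair.
trials = []
for _ in range(200):
    u = rng.normal(size=4)
    u /= np.linalg.norm(u)
    v = rng.normal(size=3)
    v /= np.linalg.norm(v)
    trials.append(objective(u, v))
print(max(trials) <= objective(u_best, v_best))
```

Because the product of a correlation and two standard deviations is a covariance, a direction with high correlation but negligible variance is not favoured.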
The residual matrix of the X block (Xk+1) is obtained by deflation on the current X score: x_score.
The residual matrix of the Y block (Yk+1) is also obtained by deflation on the current X score. This performs the PLS regression known as PLS2. This mode is prediction oriented.
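A single deflation step can be sketched as follows. This is a minimal NumPy illustration, assuming centred but unscaled data and an SVD-based weight vector in place of the NIPALS inner loop; the variable names are chosen for the example only.

```python
import numpy as np

X = np.array([[0., 0., 1.], [1., 0., 0.], [2., 2., 2.], [2., 5., 4.]])
Y = np.array([[0.1, -0.2], [0.9, 1.1], [6.2, 5.9], [11.9, 12.3]])
Xk = X - X.mean(axis=0)
Yk = Y - Y.mean(axis=0)

# Current X score from a unit-norm weight vector (SVD used as a shortcut).
w = np.linalg.svd(Xk.T @ Yk, full_matrices=False)[0][:, 0]
x_score = Xk @ w

# Regress each block on the X score, then subtract the fitted part.
x_loadings = Xk.T @ x_score / (x_score @ x_score)
y_loadings = Yk.T @ x_score / (x_score @ x_score)
X_next = Xk - np.outer(x_score, x_loadings)
Y_next = Yk - np.outer(x_score, y_loadings)

# Both residual matrices are orthogonal to the score they were deflated on.
print(np.allclose(X_next.T @ x_score, 0.0))
print(np.allclose(Y_next.T @ x_score, 0.0))
```

Deflating Y on the X score removes exactly the part of Y that the current X component predicts by least squares, which is why this mode is described as prediction oriented.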
This implementation provides the same results as three PLS packages available in the R language (R-project):
- 'mixOmics' with function pls(X, Y, mode = 'regression')
- 'plspm' with function plsreg2(X, Y)
- 'pls' with function oscorespls.fit(X, Y)
Examples
--------
>>> from sklearn.cross_decomposition import PLSRegression
>>> X = [[0., 0., 1.], [1., 0., 0.], [2., 2., 2.], [2., 5., 4.]]
>>> Y = [[0.1, -0.2], [0.9, 1.1], [6.2, 5.9], [11.9, 12.3]]
>>> pls2 = PLSRegression(n_components=2)
>>> pls2.fit(X, Y)
PLSRegression()
>>> Y_pred = pls2.predict(X)
References
----------
Jacob A. Wegelin. A survey of Partial Least Squares (PLS) methods, with emphasis on the two-block case. Technical Report 371, Department of Statistics, University of Washington, Seattle, 2000.
In French but still a reference: Tenenhaus, M. (1998). La regression PLS: theorie et pratique. Paris: Editions Technip.