Bayesian ARD regression.
Fit the weights of a regression model using an ARD (automatic relevance determination) prior. Each weight is assumed to follow a Gaussian distribution with its own precision. The parameters lambda (precisions of the distributions of the weights) and alpha (precision of the distribution of the noise) are estimated as well. The estimation is done by an iterative procedure (evidence maximization).
Read more in the :ref:`User Guide <bayesian_regression>`.
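In symbols, the model described above can be sketched as follows. The notation is an assumption made for illustration: N denotes a Gaussian density, I the identity matrix, and j indexes the features; alpha, lambda, alpha_1, alpha_2, lambda_1 and lambda_2 follow the names used in this docstring, with the Gamma distributions parameterized by shape and rate.

```latex
\begin{aligned}
p(y \mid X, w, \alpha) &= \mathcal{N}\!\left(y \mid Xw,\ \alpha^{-1} I\right), \\
p(w \mid \lambda) &= \prod_{j=1}^{n_{\text{features}}} \mathcal{N}\!\left(w_j \mid 0,\ \lambda_j^{-1}\right), \\
p(\alpha) &= \operatorname{Gamma}(\alpha \mid \alpha_1, \alpha_2), \qquad
p(\lambda_j) = \operatorname{Gamma}(\lambda_j \mid \lambda_1, \lambda_2).
\end{aligned}
```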
Parameters
----------
n_iter : int, default=300
    Maximum number of iterations.

tol : float, default=1e-3
    Stop the algorithm once the weights w have converged to within this tolerance.

alpha_1 : float, default=1e-6
    Hyper-parameter: shape parameter for the Gamma distribution prior over the alpha parameter.

alpha_2 : float, default=1e-6
    Hyper-parameter: inverse scale parameter (rate parameter) for the Gamma distribution prior over the alpha parameter.

lambda_1 : float, default=1e-6
    Hyper-parameter: shape parameter for the Gamma distribution prior over the lambda parameter.

lambda_2 : float, default=1e-6
    Hyper-parameter: inverse scale parameter (rate parameter) for the Gamma distribution prior over the lambda parameter.

compute_score : bool, default=False
    If True, compute the objective function at each iteration of the fit.

threshold_lambda : float, default=1e4
    Threshold for removing (pruning) weights with high precision from the computation.

fit_intercept : bool, default=True
    Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).

normalize : bool, default=False
    This parameter is ignored when ``fit_intercept`` is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:`sklearn.preprocessing.StandardScaler` before calling ``fit`` on an estimator with ``normalize=False``.

copy_X : bool, default=True
    If True, X will be copied; else, it may be overwritten.

verbose : bool, default=False
    Verbose mode when fitting the model.
Attributes
----------
coef_ : array-like of shape (n_features,)
    Coefficients of the regression model (mean of the weight distribution).

alpha_ : float
    Estimated precision of the noise.

lambda_ : array-like of shape (n_features,)
    Estimated precisions of the weights.

sigma_ : array-like of shape (n_features, n_features)
    Estimated variance-covariance matrix of the weights.

scores_ : float
    If computed, value of the objective function (to be maximized).

intercept_ : float
    Independent term in the decision function. Set to 0.0 if ``fit_intercept = False``.
Examples
--------
>>> from sklearn import linear_model
>>> clf = linear_model.ARDRegression()
>>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
ARDRegression()
>>> clf.predict([[1, 1]])
array([1.])
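A slightly larger illustration of the fitted attributes may help. The sketch below assumes synthetic data in which only the first of five features influences the target; the data, the seed and the variable names are made up for this example. It uses only the default constructor and the documented attributes, plus ``predict`` with ``return_std=True``.

```python
import numpy as np
from sklearn.linear_model import ARDRegression

# Synthetic data (an assumption for illustration): only feature 0 matters.
rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = 3.0 * X[:, 0] + 0.1 * rng.randn(200)

reg = ARDRegression().fit(X, y)

print(reg.coef_)    # the weight of feature 0 should dominate the others
print(reg.lambda_)  # irrelevant features should receive much larger precisions
print(reg.alpha_)   # estimated precision of the noise

# Predictive mean and standard deviation for a few samples.
y_mean, y_std = reg.predict(X[:3], return_std=True)
print(y_mean, y_std)
```

Comparing ``lambda_`` across features is one way to see which inputs the prior has effectively switched off.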
Notes
-----
For an example, see :ref:`examples/linear_model/plot_ard.py <sphx_glr_auto_examples_linear_model_plot_ard.py>`.
References
----------
D. J. C. MacKay, Bayesian nonlinear modeling for the prediction competition, ASHRAE Transactions, 1994.

R. Salakhutdinov, Lecture notes on Statistical Machine Learning, http://www.utstat.toronto.edu/~rsalakhu/sta4273/notes/Lecture2.pdf#page=15
Their beta is our ``self.alpha_``.
Their alpha is our ``self.lambda_``.
ARD differs slightly from the slide: only dimensions/features for which ``self.lambda_ < self.threshold_lambda`` are kept and the rest are discarded.
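The iterative evidence maximization and the pruning rule mentioned above can be sketched in plain NumPy. This is an illustrative sketch under stated assumptions, not the estimator's actual code: the function name ``ard_fit_sketch`` is hypothetical, no intercept or normalization is handled, and the fixed-point updates are the MacKay-style re-estimation formulas for the precisions.

```python
import numpy as np


def ard_fit_sketch(X, y, n_iter=300, tol=1e-3, alpha_1=1e-6, alpha_2=1e-6,
                   lambda_1=1e-6, lambda_2=1e-6, threshold_lambda=1e4):
    """Hypothetical helper: ARD fit without intercept handling."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_samples, n_features = X.shape
    alpha = 1.0 / (np.var(y) + np.finfo(float).eps)  # noise precision
    lambda_ = np.ones(n_features)                    # weight precisions
    keep = np.ones(n_features, dtype=bool)           # features not yet pruned
    coef = np.zeros(n_features)
    for _ in range(n_iter):
        if not keep.any():
            break  # every feature has been pruned
        coef_old = coef.copy()
        Xk = X[:, keep]
        # Posterior covariance and mean of the weights that are still kept.
        sigma = np.linalg.inv(np.diag(lambda_[keep]) + alpha * Xk.T @ Xk)
        coef = np.zeros(n_features)
        coef[keep] = alpha * sigma @ Xk.T @ y
        # Re-estimate the precisions from the current posterior.
        gamma = 1.0 - lambda_[keep] * np.diag(sigma)
        lambda_[keep] = (gamma + 2.0 * lambda_1) / (coef[keep] ** 2 + 2.0 * lambda_2)
        sse = np.sum((y - X @ coef) ** 2)
        alpha = (n_samples - gamma.sum() + 2.0 * alpha_1) / (sse + 2.0 * alpha_2)
        # Prune: keep only features with lambda_ < threshold_lambda.
        keep = lambda_ < threshold_lambda
        coef[~keep] = 0.0
        # Stop once the weights have stopped changing.
        if np.sum(np.abs(coef_old - coef)) < tol:
            break
    return coef, alpha, lambda_


# Toy data (an assumption for illustration): only feature 0 is relevant.
rng = np.random.RandomState(0)
X_demo = rng.randn(100, 4)
y_demo = 2.0 * X_demo[:, 0] + 0.1 * rng.randn(100)
coef, alpha, lambda_ = ard_fit_sketch(X_demo, y_demo)
print(coef, alpha, lambda_)
```

Once a feature's precision reaches ``threshold_lambda`` it is dropped from the linear system, so later iterations invert a smaller matrix and its weight stays at zero.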