DummyClassifier is a classifier that makes predictions using simple rules.
This classifier is useful as a simple baseline to compare with other (real) classifiers. Do not use it for real problems.
Read more in the :ref:`User Guide <dummy_estimators>`.

.. versionadded:: 0.13

Parameters
----------
strategy : str, default="stratified"
    Strategy to use to generate predictions.

    * "stratified": generates predictions by respecting the training
      set's class distribution.
    * "most_frequent": always predicts the most frequent label in the
      training set.
    * "prior": always predicts the class that maximizes the class prior
      (like "most_frequent") and ``predict_proba`` returns the class prior.
    * "uniform": generates predictions uniformly at random.
    * "constant": always predicts a constant label that is provided by
      the user. This is useful for metrics that evaluate a non-majority
      class.

    .. versionchanged:: 0.22
        The default value of `strategy` will change to "prior" in version
        0.24. Starting from version 0.22, a warning will be raised if
        `strategy` is not explicitly set.

    .. versionadded:: 0.17
       Dummy Classifier now supports prior fitting strategy using
       parameter *prior*.

random_state : int, RandomState instance or None, default=None
    If int, random_state is the seed used by the random number generator.
    If RandomState instance, random_state is the random number generator.
    If None, the random number generator is the RandomState instance used
    by `np.random`.

constant : int or str or array-like of shape (n_outputs,)
    The explicit constant as predicted by the "constant" strategy. This
    parameter is useful only for the "constant" strategy.

Attributes
----------
classes_ : array or list of array of shape (n_classes,)
    Class labels for each output.

n_classes_ : array or list of array of shape (n_classes,)
    Number of labels for each output.

class_prior_ : array or list of array of shape (n_classes,)
    Probability of each class for each output.

n_outputs_ : int
    Number of outputs.

sparse_output_ : bool
    True if the array returned from predict is to be in sparse CSC format.
    It is automatically set to True if the input y is passed in sparse
    format.

Examples
--------
>>> import numpy as np
>>> from sklearn.dummy import DummyClassifier
>>> X = np.array([-1, 1, 1, 1])
>>> y = np.array([0, 1, 1, 1])
>>> dummy_clf = DummyClassifier(strategy="most_frequent")
>>> dummy_clf.fit(X, y)
DummyClassifier(strategy='most_frequent')
>>> dummy_clf.predict(X)
array([1, 1, 1, 1])
>>> dummy_clf.score(X, y)
0.75
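
The strategies described above differ mainly in what ``predict`` and ``predict_proba`` return. The following is a minimal sketch, not part of the original docstring, that compares "most_frequent", "prior" and "constant" on the same toy labels and checks that a fixed ``random_state`` makes "stratified" reproducible. The all-zero feature matrix and the variable names are illustrative assumptions, and the sketch assumes a recent scikit-learn release.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Toy data (assumption for illustration): the feature values are placeholders,
# only the label distribution (one 0, three 1s) matters for these strategies.
X = np.zeros((4, 1))
y = np.array([0, 1, 1, 1])

most_frequent = DummyClassifier(strategy="most_frequent").fit(X, y)
prior = DummyClassifier(strategy="prior").fit(X, y)
constant = DummyClassifier(strategy="constant", constant=0).fit(X, y)

# "most_frequent" and "prior" predict the same label ...
same_labels = np.array_equal(most_frequent.predict(X), prior.predict(X))

# ... but differ in predict_proba: a one-hot row versus the class prior.
mf_proba = most_frequent.predict_proba(X[:1])
prior_proba = prior.predict_proba(X[:1])

# "constant" always returns the user-supplied label, here the minority class.
constant_pred = constant.predict(X)

# With a fixed seed, two "stratified" models draw identical predictions.
strat_a = DummyClassifier(strategy="stratified", random_state=0).fit(X, y).predict(X)
strat_b = DummyClassifier(strategy="stratified", random_state=0).fit(X, y).predict(X)

print(same_labels)
print(mf_proba, prior_proba)
print(constant_pred)
print(np.array_equal(strat_a, strat_b))
```

Choosing "prior" over "most_frequent" matters when a probabilistic metric such as log loss is part of the baseline comparison, since only "prior" reports the empirical class frequencies.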