The Complement Naive Bayes classifier described in Rennie et al. (2003).
The Complement Naive Bayes classifier was designed to correct the 'severe assumptions' made by the standard Multinomial Naive Bayes classifier. It is particularly suited for imbalanced data sets.
Read more in the :ref:`User Guide <complement_naive_bayes>`.
Parameters
----------
alpha : float, optional (default=1.0)
    Additive (Laplace/Lidstone) smoothing parameter (0 for no smoothing).

fit_prior : boolean, optional (default=True)
    Only used in edge case with a single class in the training set.

class_prior : array-like, size (n_classes,), optional (default=None)
    Prior probabilities of the classes. Not used.

norm : boolean, optional (default=False)
    Whether or not a second normalization of the weights is performed.
    The default behavior mirrors the implementations found in Mahout and
    Weka, which do not follow the full algorithm described in Table 9 of
    the paper.
Attributes
----------
class_count_ : array, shape (n_classes,)
    Number of samples encountered for each class during fitting. This
    value is weighted by the sample weight when provided.

class_log_prior_ : array, shape (n_classes,)
    Smoothed empirical log probability for each class. Only used in edge
    case with a single class in the training set.

classes_ : array, shape (n_classes,)
    Class labels known to the classifier.

feature_all_ : array, shape (n_features,)
    Number of samples encountered for each feature during fitting. This
    value is weighted by the sample weight when provided.

feature_count_ : array, shape (n_classes, n_features)
    Number of samples encountered for each (class, feature) during
    fitting. This value is weighted by the sample weight when provided.

feature_log_prob_ : array, shape (n_classes, n_features)
    Empirical weights for class complements.

n_features_ : int
    Number of features of each sample.
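As a minimal sketch of how the fitted attributes above relate to one another, the snippet below fits the classifier on a small made-up count matrix and reads back the documented arrays. The data and the local variable names are illustrative assumptions, and the snippet assumes an installed scikit-learn that provides ``ComplementNB``.

```python
import numpy as np
from sklearn.naive_bayes import ComplementNB

# Hypothetical toy counts: 4 samples, 4 features, 3 classes.
X = np.array([[2, 0, 1, 0],
              [1, 0, 2, 0],
              [0, 3, 0, 1],
              [0, 0, 1, 4]])
y = np.array([0, 0, 1, 2])

clf = ComplementNB(alpha=1.0).fit(X, y)

# Per-class sample counts, shape (n_classes,).
class_count = clf.class_count_
# Per-(class, feature) totals, shape (n_classes, n_features).
feature_count = clf.feature_count_
# Per-feature totals over all classes, shape (n_features,).
feature_all = clf.feature_all_
# Complement-class weights, shape (n_classes, n_features).
weights = clf.feature_log_prob_
```

In this sketch ``feature_all`` equals the column sums of ``feature_count``, which is why the complement counts for a class can be obtained by subtracting that class's row from the per-feature totals.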
Examples
--------
>>> import numpy as np
>>> rng = np.random.RandomState(1)
>>> X = rng.randint(5, size=(6, 100))
>>> y = np.array([1, 2, 3, 4, 5, 6])
>>> from sklearn.naive_bayes import ComplementNB
>>> clf = ComplementNB()
>>> clf.fit(X, y)
ComplementNB()
>>> print(clf.predict(X[2:3]))
[3]
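To illustrate what "weights for class complements" and the optional second normalization (``norm``) mean, here is a rough NumPy-only sketch. It is an assumption-laden approximation of the approach described by Rennie et al. (2003), not the library's actual code; the helper names ``complement_weights`` and ``predict_sketch`` and the toy data are hypothetical.

```python
import numpy as np


def complement_weights(X, y, alpha=1.0, norm=False):
    """Sketch of complement-class weights (hypothetical helper)."""
    classes = np.unique(y)
    # Per-class feature totals, shape (n_classes, n_features).
    feature_count = np.vstack([X[y == c].sum(axis=0) for c in classes])
    # Per-feature totals over all classes, shape (n_features,).
    feature_all = feature_count.sum(axis=0)
    # Smoothed count of each feature in every class except c.
    comp_count = feature_all + alpha - feature_count
    # Log of the complement feature probabilities (each row sums to 1
    # before taking the log).
    log_theta = np.log(comp_count / comp_count.sum(axis=1, keepdims=True))
    if norm:
        # Second normalization: rescale each class's weights to sum to 1.
        weights = log_theta / log_theta.sum(axis=1, keepdims=True)
    else:
        # Negate so that a feature rare in the complement scores high.
        weights = -log_theta
    return classes, weights


def predict_sketch(X, classes, weights):
    """Pick the class with the highest weighted feature score."""
    return classes[np.argmax(X @ weights.T, axis=1)]


# Hypothetical toy data: each class uses exactly one feature.
X_toy = np.array([[3, 0], [0, 3]])
y_toy = np.array([0, 1])

classes, w = complement_weights(X_toy, y_toy)
_, w_norm = complement_weights(X_toy, y_toy, norm=True)
labels = predict_sketch(X_toy, classes, w)
```

With these toy counts, feature 0 is rare in the complement of class 0, so it receives the larger weight for class 0, and the two training rows are assigned to classes 0 and 1 respectively. With ``norm=True`` each row of weights sums to 1 in this sketch.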
References
----------
Rennie, J. D., Shih, L., Teevan, J., & Karger, D. R. (2003).
Tackling the poor assumptions of naive bayes text classifiers. In ICML
(Vol. 3, pp. 616-623).
https://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf