package sklearn


type tag = [ `SpectralCoclustering ]

type t =
  [ `BaseEstimator
  | `BaseSpectral
  | `BiclusterMixin
  | `Object
  | `SpectralCoclustering ]
    Obj.t

val of_pyobject : Py.Object.t -> t

val to_pyobject : [> tag ] Obj.t -> Py.Object.t

val as_estimator : t -> [ `BaseEstimator ] Obj.t

val as_bicluster : t -> [ `BiclusterMixin ] Obj.t

val as_spectral : t -> [ `BaseSpectral ] Obj.t

val create : 
  ?n_clusters:int ->
  ?svd_method:[ `Randomized | `Arpack ] ->
  ?n_svd_vecs:int ->
  ?mini_batch:bool ->
  ?init:
    [ `Random | `Arr of [> `ArrayLike ] Np.Obj.t | `T_k_means_ of Py.Object.t ] ->
  ?n_init:int ->
  ?n_jobs:int ->
  ?random_state:int ->
  unit ->
  t

Spectral Co-Clustering algorithm (Dhillon, 2001).

Clusters rows and columns of an array `X` to solve the relaxed normalized cut of the bipartite graph created from `X` as follows: the edge between row vertex `i` and column vertex `j` has weight `X[i, j]`.

The resulting bicluster structure is block-diagonal, since each row and each column belongs to exactly one bicluster.
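To see why this structure is block-diagonal, the sketch below reorders a small matrix by row and column labels. The matrix `X` and both label lists are hand-picked, hypothetical values rather than estimator output; only the reordering step is illustrated.

```python
# Hypothetical 4x4 data and labels, chosen by hand for illustration.
X = [
    [5, 0, 4, 0],
    [0, 3, 0, 2],
    [6, 0, 7, 0],
    [0, 1, 0, 8],
]
row_labels = [0, 1, 0, 1]     # bicluster label of each row
column_labels = [0, 1, 0, 1]  # bicluster label of each column

# Sorting indices by label places rows (and columns) of the same
# bicluster next to each other. sorted() is stable, so the original
# order is kept within each bicluster.
row_order = sorted(range(len(row_labels)), key=lambda r: row_labels[r])
col_order = sorted(range(len(column_labels)), key=lambda c: column_labels[c])

reordered = [[X[r][c] for c in col_order] for r in row_order]
for row in reordered:
    print(row)
```

Because each row and each column carries exactly one label, the nonzero entries of the reordered matrix collect into blocks along the diagonal.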

Supports sparse matrices, as long as they are nonnegative.

Read more in the :ref:`User Guide <spectral_coclustering>`.

Parameters
----------
n_clusters : int, default=3
    The number of biclusters to find.

svd_method : {'randomized', 'arpack'}, default='randomized'
    Selects the algorithm for finding singular vectors. If 'randomized', uses :func:`sklearn.utils.extmath.randomized_svd`, which may be faster for large matrices. If 'arpack', uses :func:`scipy.sparse.linalg.svds`, which is more accurate but possibly slower in some cases.

n_svd_vecs : int, default=None
    Number of vectors to use in calculating the SVD. Corresponds to `ncv` when `svd_method='arpack'` and to `n_oversamples` when `svd_method='randomized'`.

mini_batch : bool, default=False
    Whether to use mini-batch k-means, which is faster but may give different results.

init : {'k-means++', 'random'} or ndarray of shape (n_clusters, n_features), default='k-means++'
    Method for initialization of the k-means algorithm.

n_init : int, default=10
    Number of random initializations that are tried with the k-means algorithm.

    If mini-batch k-means is used, the best initialization is chosen and the algorithm runs once. Otherwise, the algorithm is run for each initialization and the best solution is chosen.

n_jobs : int, default=None
    The number of jobs to use for the computation. This works by breaking down the pairwise matrix into n_jobs even slices and computing them in parallel.

    ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context. ``-1`` means using all processors. See :term:`Glossary <n_jobs>` for more details.

    .. deprecated:: 0.23
       ``n_jobs`` was deprecated in version 0.23 and will be removed in 0.25.

random_state : int, RandomState instance, default=None
    Used for randomizing the singular value decomposition and the k-means initialization. Use an int to make the randomness deterministic. See :term:`Glossary <random_state>`.

Attributes
----------
rows_ : array-like of shape (n_row_clusters, n_rows)
    Results of the clustering. `rows_[i, r]` is True if cluster `i` contains row `r`. Available only after calling ``fit``.

columns_ : array-like of shape (n_column_clusters, n_columns)
    Results of the clustering, like `rows_`.

row_labels_ : array-like of shape (n_rows,)
    The bicluster label of each row.

column_labels_ : array-like of shape (n_cols,)
    The bicluster label of each column.
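The attribute definitions above imply a simple relationship between the per-row labels and the boolean indicator layout: entry `[i][r]` is True exactly when row `r` carries label `i`. A minimal pure-Python sketch of that relationship, assuming hand-picked labels and plain lists in place of arrays:

```python
# Hypothetical labels for six rows and two biclusters.
row_labels = [0, 1, 1, 0, 0, 0]
n_clusters = 2

# One boolean indicator row per bicluster: rows[i][r] is True
# when row r belongs to bicluster i.
rows = [[label == i for label in row_labels] for i in range(n_clusters)]

for i, indicator in enumerate(rows):
    print(i, indicator)
```

The same construction applies to columns, using the column labels instead.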

Examples
--------
>>> from sklearn.cluster import SpectralCoclustering
>>> import numpy as np
>>> X = np.array([[1, 1], [2, 1], [1, 0],
...               [4, 7], [3, 5], [3, 6]])
>>> clustering = SpectralCoclustering(n_clusters=2, random_state=0).fit(X)
>>> clustering.row_labels_ #doctest: +SKIP
array([0, 1, 1, 0, 0, 0], dtype=int32)
>>> clustering.column_labels_ #doctest: +SKIP
array([0, 0], dtype=int32)
>>> clustering
SpectralCoclustering(n_clusters=2, random_state=0)

References
----------

* Dhillon, Inderjit S, 2001. `Co-clustering documents and words using bipartite spectral graph partitioning <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.140.3011>`__.

val fit : ?y:Py.Object.t -> x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> t

Creates a biclustering for X.

Parameters
----------
X : array-like of shape (n_samples, n_features)

y : Ignored

val get_indices : i:int -> [> tag ] Obj.t -> Py.Object.t * Py.Object.t

Row and column indices of the i'th bicluster.

Only works if ``rows_`` and ``columns_`` attributes exist.

Parameters
----------
i : int
    The index of the cluster.

Returns
-------
row_ind : ndarray, dtype=np.intp
    Indices of rows in the dataset that belong to the bicluster.
col_ind : ndarray, dtype=np.intp
    Indices of columns in the dataset that belong to the bicluster.
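As a rough illustration of what these index arrays contain, the sketch below reads them off boolean indicator rows shaped like ``rows_`` and ``columns_``. The helper name `indices_from_indicators` and the sample indicators are hypothetical, and plain lists stand in for arrays; this is not the library's implementation.

```python
def indices_from_indicators(rows, columns, i):
    """Return (row_ind, col_ind) for bicluster i.

    rows[i][r] / columns[i][c] are booleans marking membership,
    mirroring the layout described for rows_ and columns_.
    """
    row_ind = [r for r, member in enumerate(rows[i]) if member]
    col_ind = [c for c, member in enumerate(columns[i]) if member]
    return row_ind, col_ind


# Hypothetical indicators: 2 biclusters over 4 rows and 3 columns.
rows = [[True, False, True, False],
        [False, True, False, True]]
columns = [[True, True, False],
           [False, False, True]]

row_ind, col_ind = indices_from_indicators(rows, columns, 0)
print(row_ind, col_ind)
```

The lengths of the two index lists give the number of rows and columns in the bicluster.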

val get_params : ?deep:bool -> [> tag ] Obj.t -> Dict.t

Get parameters for this estimator.

Parameters
----------
deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
-------
params : mapping of string to any
    Parameter names mapped to their values.

val get_shape : i:int -> [> tag ] Obj.t -> Py.Object.t

Shape of the i'th bicluster.

Parameters
----------
i : int
    The index of the cluster.

Returns
-------
shape : tuple (int, int)
    Number of rows and columns (resp.) in the bicluster.

val get_submatrix : 
  i:int ->
  data:[> `ArrayLike ] Np.Obj.t ->
  [> tag ] Obj.t ->
  [> `ArrayLike ] Np.Obj.t

Return the submatrix corresponding to bicluster `i`.

Parameters
----------
i : int
    The index of the cluster.
data : array-like
    The data.

Returns
-------
submatrix : ndarray
    The submatrix corresponding to bicluster i.

Notes
-----
Works with sparse matrices. Only works if ``rows_`` and ``columns_`` attributes exist.
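Conceptually, the submatrix keeps only the entries at the intersection of the bicluster's rows and columns. The following sketch shows that selection on nested lists; the helper `submatrix_sketch`, the data, and the index lists are hypothetical and only approximate the behaviour described here.

```python
def submatrix_sketch(data, row_ind, col_ind):
    """Select data[r][c] for every r in row_ind and c in col_ind."""
    return [[data[r][c] for c in col_ind] for r in row_ind]


# Hypothetical 3x3 data and bicluster membership.
data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
row_ind = [0, 2]  # rows belonging to the bicluster
col_ind = [1, 2]  # columns belonging to the bicluster

sub = submatrix_sketch(data, row_ind, col_ind)
shape = (len(row_ind), len(col_ind))
print(sub, shape)
```

The dimensions of the result equal the bicluster's shape, that is, the number of member rows and member columns.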

val set_params : ?params:(string * Py.Object.t) list -> [> tag ] Obj.t -> t

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form ``<component>__<parameter>`` so that it's possible to update each component of a nested object.

Parameters
----------
**params : dict
    Estimator parameters.

Returns
-------
self : object
    Estimator instance.
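The ``<component>__<parameter>`` convention can be illustrated by splitting a key at its first double underscore. This is a minimal sketch under that assumption; the helper `split_param_key` and the key names `clf__C` and `pipe__clf__C` are hypothetical examples, not part of this module.

```python
def split_param_key(key):
    """Split 'component__parameter' at the first double underscore.

    Returns (name, sub_key); sub_key is None for a plain parameter.
    """
    name, delim, sub_key = key.partition("__")
    return (name, sub_key) if delim else (name, None)


print(split_param_key("n_clusters"))    # plain parameter of the estimator
print(split_param_key("clf__C"))        # parameter C of component clf
print(split_param_key("pipe__clf__C"))  # remainder is passed further down
```

A single underscore, as in `n_clusters`, is not a separator, so ordinary parameter names pass through unchanged.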

val rows_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute rows_: get value or raise Not_found if None.

val rows_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute rows_: get value as an option.

val columns_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute columns_: get value or raise Not_found if None.

val columns_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute columns_: get value as an option.

val row_labels_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute row_labels_: get value or raise Not_found if None.

val row_labels_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute row_labels_: get value as an option.

val column_labels_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute column_labels_: get value or raise Not_found if None.

val column_labels_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute column_labels_: get value as an option.

val to_string : t -> string

Return a human-readable string representation of the object.

val show : t -> string

Return a human-readable string representation of the object.

val pp : Format.formatter -> t -> unit

Pretty-print the object to a formatter.