package sklearn

Module `Cluster.SpectralBiclustering`

type tag = [ `SpectralBiclustering ]

type t =
  [ `BaseEstimator
  | `BaseSpectral
  | `BiclusterMixin
  | `Object
  | `SpectralBiclustering ]
    Obj.t

val of_pyobject : Py.Object.t -> t

val to_pyobject : [> tag ] Obj.t -> Py.Object.t

val as_estimator : t -> [ `BaseEstimator ] Obj.t

val as_spectral : t -> [ `BaseSpectral ] Obj.t

val as_bicluster : t -> [ `BiclusterMixin ] Obj.t

val create : 
  ?n_clusters:[ `Tuple of Py.Object.t | `I of int ] ->
  ?method_:[ `Bistochastic | `Scale | `Log ] ->
  ?n_components:int ->
  ?n_best:int ->
  ?svd_method:[ `Randomized | `Arpack ] ->
  ?n_svd_vecs:int ->
  ?mini_batch:bool ->
  ?init:[ `Arr of [> `ArrayLike ] Np.Obj.t | `Random | `K_means_ ] ->
  ?n_init:int ->
  ?n_jobs:int ->
  ?random_state:int ->
  unit ->
  t

Spectral biclustering (Kluger, 2003).

Partitions rows and columns under the assumption that the data has an underlying checkerboard structure. For instance, if there are two row partitions and three column partitions, each row will belong to three biclusters, and each column will belong to two biclusters. The outer product of the corresponding row and column label vectors gives this checkerboard structure.
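The checkerboard structure can be illustrated with a small, dependency-free sketch. It is not the library's code: the helper `checkerboard` and the numbering of biclusters (row label times the number of column clusters, plus column label) are assumptions made for this example only.

```python
def checkerboard(row_labels, column_labels, n_column_clusters):
    """Give each cell (r, c) the index of the bicluster it falls in.

    Hypothetical helper: pairs every row label with every column label.
    """
    return [
        [r_lab * n_column_clusters + c_lab for c_lab in column_labels]
        for r_lab in row_labels
    ]


row_labels = [0, 0, 1, 1]        # two row partitions
column_labels = [0, 1, 1, 2, 2]  # three column partitions
board = checkerboard(row_labels, column_labels, n_column_clusters=3)
for line in board:
    print(line)

# Count the distinct biclusters met by each row and by each column.
biclusters_per_row = [len(set(line)) for line in board]
biclusters_per_column = [
    len({line[c] for line in board}) for c in range(len(column_labels))
]
print(biclusters_per_row, biclusters_per_column)
```

With two row partitions and three column partitions, every row meets three biclusters and every column meets two, as described above.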

Read more in the :ref:`User Guide <spectral_biclustering>`.

Parameters
----------
n_clusters : int or tuple (n_row_clusters, n_column_clusters), default=3
    The number of row and column clusters in the checkerboard structure.

method : 'bistochastic', 'scale', 'log', default='bistochastic'
    Method of normalizing and converting singular vectors into biclusters. May be one of 'scale', 'bistochastic', or 'log'. The authors recommend using 'log'. If the data is sparse, however, log normalization will not work, which is why the default is 'bistochastic'.

    .. warning:: if `method='log'`, the data must not be sparse.
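To see why log normalization and zero-heavy (sparse) data do not mix, here is a minimal sketch of a log normalization step in plain Python. It assumes the usual formulation (take logarithms, subtract row and column means, add back the overall mean); the helper `log_normalize` is hypothetical and is not the library's implementation.

```python
import math


def log_normalize(matrix):
    """Log-transform, subtract row and column means, add the overall mean."""
    logs = [[math.log(v) for v in row] for row in matrix]
    n_rows, n_cols = len(logs), len(logs[0])
    row_mean = [sum(row) / n_cols for row in logs]
    col_mean = [
        sum(logs[r][c] for r in range(n_rows)) / n_rows for c in range(n_cols)
    ]
    overall = sum(row_mean) / n_rows
    return [
        [logs[r][c] - row_mean[r] - col_mean[c] + overall for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Strictly positive data: the transform is well defined.
dense = [[1.0, 2.0], [4.0, 8.0]]
normalized = log_normalize(dense)

# A single zero entry makes the logarithm undefined.
with_zero = [[0.0, 2.0], [4.0, 8.0]]
try:
    log_normalize(with_zero)
    zero_ok = True
except ValueError:
    zero_ok = False
print(zero_ok)
```

A sparse matrix is mostly zeros, so the logarithm is undefined for most entries, which is the reason the default method is 'bistochastic'.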

n_components : int, default=6
    Number of singular vectors to check.

n_best : int, default=3
    Number of best singular vectors to which to project the data for clustering.

svd_method : 'randomized', 'arpack', default='randomized'
    Selects the algorithm for finding singular vectors. May be 'randomized' or 'arpack'. If 'randomized', uses :func:`~sklearn.utils.extmath.randomized_svd`, which may be faster for large matrices. If 'arpack', uses `scipy.sparse.linalg.svds`, which is more accurate, but possibly slower in some cases.

n_svd_vecs : int, default=None
    Number of vectors to use in calculating the SVD. Corresponds to `ncv` when `svd_method='arpack'` and `n_oversamples` when `svd_method='randomized'`.

mini_batch : bool, default=False
    Whether to use mini-batch k-means, which is faster but may get different results.

init : 'k-means++', 'random' or ndarray of shape (n_clusters, n_features), default='k-means++'
    Method for initialization of the k-means algorithm.

n_init : int, default=10
    Number of random initializations that are tried with the k-means algorithm.

    If mini-batch k-means is used, the best initialization is chosen and the algorithm runs once. Otherwise, the algorithm is run for each initialization and the best solution chosen.

n_jobs : int, default=None
    The number of jobs to use for the computation. This works by breaking down the pairwise matrix into n_jobs even slices and computing them in parallel.

    ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context. ``-1`` means using all processors. See :term:`Glossary <n_jobs>` for more details.

random_state : int, RandomState instance, default=None
    Used for randomizing the singular value decomposition and the k-means initialization. Use an int to make the randomness deterministic. See :term:`Glossary <random_state>`.

Attributes
----------
rows_ : array-like of shape (n_row_clusters, n_rows)
    Results of the clustering. `rows[i, r]` is True if cluster `i` contains row `r`. Available only after calling ``fit``.

columns_ : array-like of shape (n_column_clusters, n_columns)
    Results of the clustering, like `rows_`.

row_labels_ : array-like of shape (n_rows,)
    Row partition labels.

column_labels_ : array-like of shape (n_cols,)
    Column partition labels.
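The relationship between the label vectors and the boolean indicator attributes can be sketched as follows. This is an illustration only: the helper `indicators` is hypothetical, and it assumes one indicator per (row cluster, column cluster) pair, enumerated row cluster first. The ordering the library actually uses may differ.

```python
def indicators(row_labels, column_labels, n_row_clusters, n_column_clusters):
    """Build boolean membership lists from partition labels (assumed layout)."""
    rows = [
        [label == r for label in row_labels]
        for r in range(n_row_clusters)
        for _ in range(n_column_clusters)
    ]
    columns = [
        [label == c for label in column_labels]
        for _ in range(n_row_clusters)
        for c in range(n_column_clusters)
    ]
    return rows, columns


# Label vectors in the style of the fitted example's row_labels_ and column_labels_.
rows, columns = indicators([1, 1, 1, 0, 0, 0], [0, 1], 2, 2)
for i, (row_mask, column_mask) in enumerate(zip(rows, columns)):
    print(i, row_mask, column_mask)
```

Under this assumed layout, `rows[i][r]` is True exactly when bicluster `i` contains row `r`, matching the description of `rows_`.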

Examples
--------
>>> from sklearn.cluster import SpectralBiclustering
>>> import numpy as np
>>> X = np.array([[1, 1], [2, 1], [1, 0],
...               [4, 7], [3, 5], [3, 6]])
>>> clustering = SpectralBiclustering(n_clusters=2, random_state=0).fit(X)
>>> clustering.row_labels_
array([1, 1, 1, 0, 0, 0], dtype=int32)
>>> clustering.column_labels_
array([0, 1], dtype=int32)
>>> clustering
SpectralBiclustering(n_clusters=2, random_state=0)

References
----------

* Kluger, Yuval, et al., 2003. `Spectral biclustering of microarray data: coclustering genes and conditions <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.135.1608>`__.

val fit : ?y:Py.Object.t -> x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> t

Creates a biclustering for X.

Parameters
----------
X : array-like, shape (n_samples, n_features)

y : Ignored

val get_indices : i:int -> [> tag ] Obj.t -> Py.Object.t * Py.Object.t

Row and column indices of the i'th bicluster.

Only works if ``rows_`` and ``columns_`` attributes exist.

Parameters
----------
i : int
    The index of the cluster.

Returns
-------
row_ind : np.array, dtype=np.intp
    Indices of rows in the dataset that belong to the bicluster.

col_ind : np.array, dtype=np.intp
    Indices of columns in the dataset that belong to the bicluster.
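Conceptually, the indices are the positions at which the i'th boolean indicator is True. The sketch below shows that idea on plain Python lists. The helper `indices_of` and the sample masks are made up for illustration and are not part of the bindings.

```python
def indices_of(rows, columns, i):
    """Positions of the True entries in the i'th row and column indicators."""
    row_ind = [r for r, member in enumerate(rows[i]) if member]
    col_ind = [c for c, member in enumerate(columns[i]) if member]
    return row_ind, col_ind


# Hypothetical indicators for two biclusters over 4 rows and 3 columns.
rows = [[False, False, True, True], [True, True, False, False]]
columns = [[True, False, True], [False, True, False]]

row_ind, col_ind = indices_of(rows, columns, 0)
shape = (len(row_ind), len(col_ind))  # rows and columns in bicluster 0
print(row_ind, col_ind, shape)
```

The pair of lengths computed at the end is the same quantity that a shape query for the bicluster reports.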

val get_params : ?deep:bool -> [> tag ] Obj.t -> Dict.t

Get parameters for this estimator.

Parameters
----------
deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
-------
params : mapping of string to any
    Parameter names mapped to their values.

val get_shape : i:int -> [> tag ] Obj.t -> int * int

Shape of the i'th bicluster.

Parameters
----------
i : int
    The index of the cluster.

Returns
-------
shape : (int, int)
    Number of rows and columns (resp.) in the bicluster.

val get_submatrix : 
  i:int ->
  data:[> `ArrayLike ] Np.Obj.t ->
  [> tag ] Obj.t ->
  [> `ArrayLike ] Np.Obj.t

Return the submatrix corresponding to bicluster `i`.

Parameters
----------
i : int
    The index of the cluster.

data : array
    The data.

Returns
-------
submatrix : array
    The submatrix corresponding to bicluster i.

Notes
-----
Works with sparse matrices. Only works if ``rows_`` and ``columns_`` attributes exist.
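Extracting a bicluster's submatrix amounts to keeping the rows and columns whose indicators are True. A minimal sketch on nested lists follows. The helper `submatrix` and the sample masks are assumptions for illustration, not the library's implementation.

```python
def submatrix(data, row_mask, column_mask):
    """Keep only the rows and columns flagged True in the two masks."""
    return [
        [value for value, keep_col in zip(row, column_mask) if keep_col]
        for row, keep_row in zip(data, row_mask)
        if keep_row
    ]


data = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# Hypothetical indicators: rows 0 and 2, columns 1 and 2.
block = submatrix(data, [True, False, True], [False, True, True])
print(block)
```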

val set_params : ?params:(string * Py.Object.t) list -> [> tag ] Obj.t -> t

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form ``<component>__<parameter>`` so that it's possible to update each component of a nested object.

Parameters
----------
**params : dict
    Estimator parameters.

Returns
-------
self : object
    Estimator instance.
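The ``<component>__<parameter>`` naming convention can be illustrated by splitting a key at its first double underscore. The helper `split_param` and the component name `step` are hypothetical, and the sketch shows only the naming scheme, not how parameters are applied.

```python
def split_param(name):
    """Split 'component__parameter' at the first double underscore.

    Returns (None, name) for a plain parameter of the estimator itself.
    """
    component, separator, parameter = name.partition("__")
    if not separator:
        return None, name
    return component, parameter


print(split_param("n_clusters"))          # parameter of the estimator itself
print(split_param("step__n_init"))        # parameter of a nested component
print(split_param("outer__inner__init"))  # remainder is passed down a level
```

Splitting only at the first separator lets each nesting level strip its own prefix and hand the remainder to the component it names.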

val rows_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute rows_: get value or raise Not_found if None.

val rows_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute rows_: get value as an option.

val columns_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute columns_: get value or raise Not_found if None.

val columns_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute columns_: get value as an option.

val row_labels_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute row_labels_: get value or raise Not_found if None.

val row_labels_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute row_labels_: get value as an option.

val column_labels_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute column_labels_: get value or raise Not_found if None.

val column_labels_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute column_labels_: get value as an option.

val to_string : t -> string

Return a human-readable string representation of the object.

val show : t -> string

Return a human-readable string representation of the object.

val pp : Format.formatter -> t -> unit

Pretty-print the object to a formatter.
