package sklearn

  1. Overview
  2. Docs
Legend:
Library
Module
Module type
Parameter
Class
Class type
val get_py : string -> Py.Object.t

Get an attribute of this module as a Py.Object.t. This is useful to pass a Python function to another function.

module PatchExtractor : sig ... end
module Product : sig ... end
val as_strided : ?shape:int list -> ?strides:Py.Object.t -> ?subok:bool -> ?writeable:bool -> x:[> `ArrayLike ] Np.Obj.t -> unit -> [> `ArrayLike ] Np.Obj.t

Create a view into the array with the given shape and strides.

.. warning:: This function has to be used with extreme care, see notes.

Parameters ---------- x : ndarray Array to create a new. shape : sequence of int, optional The shape of the new array. Defaults to ``x.shape``. strides : sequence of int, optional The strides of the new array. Defaults to ``x.strides``. subok : bool, optional .. versionadded:: 1.10

If True, subclasses are preserved. writeable : bool, optional .. versionadded:: 1.12

If set to False, the returned array will always be readonly. Otherwise it will be writable if the original array was. It is advisable to set this to False if possible (see Notes).

Returns ------- view : ndarray

See also -------- broadcast_to: broadcast an array to a given shape. reshape : reshape an array.

Notes ----- ``as_strided`` creates a view into the array given the exact strides and shape. This means it manipulates the internal data structure of ndarray and, if done incorrectly, the array elements can point to invalid memory and can corrupt results or crash your program. It is advisable to always use the original ``x.strides`` when calculating new strides to avoid reliance on a contiguous memory layout.

Furthermore, arrays created with this function often contain self overlapping memory, so that two elements are identical. Vectorized write operations on such arrays will typically be unpredictable. They may even give different results for small, large, or transposed arrays. Since writing to these arrays has to be tested and done with great care, you may want to use ``writeable=False`` to avoid accidental write operations.

For these reasons it is advisable to avoid ``as_strided`` when possible.

val check_array : ?accept_sparse:[ `S of string | `StringList of string list | `Bool of bool ] -> ?accept_large_sparse:bool -> ?dtype: [ `S of string | `Dtype of Np.Dtype.t | `Dtypes of Np.Dtype.t list | `None ] -> ?order:[ `C | `F ] -> ?copy:bool -> ?force_all_finite:[ `Allow_nan | `Bool of bool ] -> ?ensure_2d:bool -> ?allow_nd:bool -> ?ensure_min_samples:int -> ?ensure_min_features:int -> ?estimator:[> `BaseEstimator ] Np.Obj.t -> array:Py.Object.t -> unit -> Py.Object.t

Input validation on an array, list, sparse matrix or similar.

By default, the input is checked to be a non-empty 2D array containing only finite values. If the dtype of the array is object, attempt converting to float, raising on failure.

Parameters ---------- array : object Input object to check / convert.

accept_sparse : string, boolean or list/tuple of strings (default=False) Strings representing allowed sparse matrix formats, such as 'csc', 'csr', etc. If the input is sparse but not in the allowed format, it will be converted to the first listed format. True allows the input to be any format. False means that a sparse matrix input will raise an error.

accept_large_sparse : bool (default=True) If a CSR, CSC, COO or BSR sparse matrix is supplied and accepted by accept_sparse, accept_large_sparse=False will cause it to be accepted only if its indices are stored with a 32-bit dtype.

.. versionadded:: 0.20

dtype : string, type, list of types or None (default='numeric') Data type of result. If None, the dtype of the input is preserved. If 'numeric', dtype is preserved unless array.dtype is object. If dtype is a list of types, conversion on the first type is only performed if the dtype of the input is not in the list.

order : 'F', 'C' or None (default=None) Whether an array will be forced to be fortran or c-style. When order is None (default), then if copy=False, nothing is ensured about the memory layout of the output array; otherwise (copy=True) the memory layout of the returned array is kept as close as possible to the original array.

copy : boolean (default=False) Whether a forced copy will be triggered. If copy=False, a copy might be triggered by a conversion.

force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in array. The possibilities are:

  • True: Force all values of array to be finite.
  • False: accepts np.inf, np.nan, pd.NA in array.
  • 'allow-nan': accepts only np.nan and pd.NA values in array. Values cannot be infinite.

.. versionadded:: 0.20 ``force_all_finite`` accepts the string ``'allow-nan'``.

.. versionchanged:: 0.23 Accepts `pd.NA` and converts it into `np.nan`

ensure_2d : boolean (default=True) Whether to raise a value error if array is not 2D.

allow_nd : boolean (default=False) Whether to allow array.ndim > 2.

ensure_min_samples : int (default=1) Make sure that the array has a minimum number of samples in its first axis (rows for a 2D array). Setting to 0 disables this check.

ensure_min_features : int (default=1) Make sure that the 2D array has some minimum number of features (columns). The default value of 1 rejects empty datasets. This check is only enforced when the input data has effectively 2 dimensions or is originally 1D and ``ensure_2d`` is True. Setting to 0 disables this check.

estimator : str or estimator instance (default=None) If passed, include the name of the estimator in warning messages.

Returns ------- array_converted : object The converted and validated array.

val check_random_state : [ `Optional of [ `I of int | `None ] | `RandomState of Py.Object.t ] -> Py.Object.t

Turn seed into a np.random.RandomState instance

Parameters ---------- seed : None | int | instance of RandomState If seed is None, return the RandomState singleton used by np.random. If seed is an int, return a new RandomState instance seeded with seed. If seed is already a RandomState instance, return it. Otherwise raise ValueError.

val extract_patches : ?patch_shape:[ `Tuple of Py.Object.t | `I of int ] -> ?extraction_step:[ `Tuple of Py.Object.t | `I of int ] -> arr:[> `ArrayLike ] Np.Obj.t -> unit -> Py.Object.t

DEPRECATED: The function feature_extraction.image.extract_patches has been deprecated in 0.22 and will be removed in 0.24.

Extracts patches of any n-dimensional array in place using strides.

Given an n-dimensional array it will return a 2n-dimensional array with the first n dimensions indexing patch position and the last n indexing the patch content. This operation is immediate (O(1)). A reshape performed on the first n dimensions will cause numpy to copy data, leading to a list of extracted patches.

Read more in the :ref:`User Guide <image_feature_extraction>`.

Parameters ---------- arr : ndarray n-dimensional array of which patches are to be extracted

patch_shape : int or tuple of length arr.ndim, default=8 Indicates the shape of the patches to be extracted. If an integer is given, the shape will be a hypercube of sidelength given by its value.

extraction_step : int or tuple of length arr.ndim, default=1 Indicates step size at which extraction shall be performed. If integer is given, then the step is uniform in all dimensions.

Returns ------- patches : strided ndarray 2n-dimensional array indexing patches on first n dimensions and containing patches on the last n dimensions. These dimensions are fake, but this way no data is copied. A simple reshape invokes a copying operation to obtain a list of patches: result.reshape(-1 + list(patch_shape))

val extract_patches_2d : ?max_patches:[ `I of int | `F of float ] -> ?random_state:int -> image:[> `ArrayLike ] Np.Obj.t -> patch_size:Py.Object.t -> unit -> [> `ArrayLike ] Np.Obj.t

Reshape a 2D image into a collection of patches

The resulting patches are allocated in a dedicated array.

Read more in the :ref:`User Guide <image_feature_extraction>`.

Parameters ---------- image : ndarray of shape (image_height, image_width) or (image_height, image_width, n_channels) The original image data. For color images, the last dimension specifies the channel: a RGB image would have `n_channels=3`.

patch_size : tuple of int (patch_height, patch_width) The dimensions of one patch.

max_patches : int or float, default=None The maximum number of patches to extract. If `max_patches` is a float between 0 and 1, it is taken to be a proportion of the total number of patches.

random_state : int, RandomState instance, default=None Determines the random number generator used for random sampling when `max_patches` is not None. Use an int to make the randomness deterministic. See :term:`Glossary <random_state>`.

Returns ------- patches : array of shape (n_patches, patch_height, patch_width) or (n_patches, patch_height, patch_width, n_channels) The collection of patches extracted from the image, where `n_patches` is either `max_patches` or the total number of patches that can be extracted.

Examples -------- >>> from sklearn.datasets import load_sample_image >>> from sklearn.feature_extraction import image >>> # Use the array data from the first image in this dataset: >>> one_image = load_sample_image('china.jpg') >>> print('Image shape: {

}

'.format(one_image.shape)) Image shape: (427, 640, 3) >>> patches = image.extract_patches_2d(one_image, (2, 2)) >>> print('Patches shape: {

}

'.format(patches.shape)) Patches shape: (272214, 2, 2, 3) >>> # Here are just two of these patches: >>> print(patches1) [[174 201 231] [174 201 231]] [[173 200 230] [173 200 230]] >>> print(patches800) [[187 214 243] [188 215 244]] [[187 214 243] [188 215 244]]

val grid_to_graph : ?n_z:int -> ?mask:[ `Dtype_bool of Py.Object.t | `Arr of [> `ArrayLike ] Np.Obj.t ] -> ?return_as:Py.Object.t -> ?dtype:Np.Dtype.t -> n_x:int -> n_y:int -> unit -> Py.Object.t

Graph of the pixel-to-pixel connections

Edges exist if 2 voxels are connected.

Parameters ---------- n_x : int Dimension in x axis n_y : int Dimension in y axis n_z : int, default=1 Dimension in z axis mask : ndarray of shape (n_x, n_y, n_z), dtype=bool, default=None An optional mask of the image, to consider only part of the pixels. return_as : np.ndarray or a sparse matrix class, default=sparse.coo_matrix The class to use to build the returned adjacency matrix. dtype : dtype, default=int The data of the returned sparse matrix. By default it is int

Notes ----- For scikit-learn versions 0.14.1 and prior, return_as=np.ndarray was handled by returning a dense np.matrix instance. Going forward, np.ndarray returns an np.ndarray, as expected.

For compatibility, user code relying on this method should wrap its calls in ``np.asarray`` to avoid type issues.

val img_to_graph : ?mask:[ `Dtype_bool of Py.Object.t | `Arr of [> `ArrayLike ] Np.Obj.t ] -> ?return_as:Py.Object.t -> ?dtype:Np.Dtype.t -> img:[> `ArrayLike ] Np.Obj.t -> unit -> Py.Object.t

Graph of the pixel-to-pixel gradient connections

Edges are weighted with the gradient values.

Read more in the :ref:`User Guide <image_feature_extraction>`.

Parameters ---------- img : ndarray of shape (height, width) or (height, width, channel) 2D or 3D image. mask : ndarray of shape (height, width) or (height, width, channel), dtype=bool, default=None An optional mask of the image, to consider only part of the pixels. return_as : np.ndarray or a sparse matrix class, default=sparse.coo_matrix The class to use to build the returned adjacency matrix. dtype : dtype, default=None The data of the returned sparse matrix. By default it is the dtype of img

Notes ----- For scikit-learn versions 0.14.1 and prior, return_as=np.ndarray was handled by returning a dense np.matrix instance. Going forward, np.ndarray returns an np.ndarray, as expected.

For compatibility, user code relying on this method should wrap its calls in ``np.asarray`` to avoid type issues.

val reconstruct_from_patches_2d : patches:[> `ArrayLike ] Np.Obj.t -> image_size:Py.Object.t -> unit -> Py.Object.t

Reconstruct the image from all of its patches.

Patches are assumed to overlap and the image is constructed by filling in the patches from left to right, top to bottom, averaging the overlapping regions.

Read more in the :ref:`User Guide <image_feature_extraction>`.

Parameters ---------- patches : ndarray of shape (n_patches, patch_height, patch_width) or (n_patches, patch_height, patch_width, n_channels) The complete set of patches. If the patches contain colour information, channels are indexed along the last dimension: RGB patches would have `n_channels=3`.

image_size : tuple of int (image_height, image_width) or (image_height, image_width, n_channels) The size of the image that will be reconstructed.

Returns ------- image : ndarray of shape image_size The reconstructed image.