package sklearn


Module `Model_selection.TimeSeriesSplit`

type tag = [ `TimeSeriesSplit ]

type t = [ `BaseCrossValidator | `Object | `TimeSeriesSplit ] Obj.t

val of_pyobject : Py.Object.t -> t

val to_pyobject : [> tag ] Obj.t -> Py.Object.t

val as_cross_validator : t -> [ `BaseCrossValidator ] Obj.t

val create : ?n_splits:int -> ?max_train_size:int -> unit -> t

Time Series cross-validator

Provides train/test indices to split time series data samples, observed at fixed time intervals, into train/test sets. In each split, the test indices must be higher than those of the previous split, so shuffling in the cross-validator is inappropriate.

This cross-validation object is a variation of :class:`KFold`. In the kth split, it returns the first k folds as the train set and the (k+1)th fold as the test set.

Note that unlike standard cross-validation methods, successive training sets are supersets of those that come before them.
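The expanding-window behaviour described above can be sketched in plain Python. This is an illustrative re-derivation from the description and the Notes below, not the library's own code; `expanding_window_splits` is a hypothetical helper name, and the placement of the remainder samples in the first training set is an assumption taken from the size formula in the Notes.

```python
def expanding_window_splits(n_samples, n_splits=5):
    """Yield (train, test) index lists for an expanding-window split.

    Hypothetical helper for illustration only; not part of sklearn.
    """
    n_folds = n_splits + 1
    if n_folds > n_samples:
        raise ValueError("need at least n_splits + 1 samples")
    test_size = n_samples // n_folds
    # Assumption: leftover samples (the remainder) go into the first train set.
    first_test_start = test_size + n_samples % n_folds
    for test_start in range(first_test_start, n_samples, test_size):
        train = list(range(test_start))  # everything before the test fold
        test = list(range(test_start, test_start + test_size))
        yield train, test


splits = list(expanding_window_splits(6, n_splits=5))
for train, test in splits:
    print("TRAIN:", train, "TEST:", test)
# TRAIN: [0] TEST: [1]
# TRAIN: [0, 1] TEST: [2]
# TRAIN: [0, 1, 2] TEST: [3]
# TRAIN: [0, 1, 2, 3] TEST: [4]
# TRAIN: [0, 1, 2, 3, 4] TEST: [5]
```

Each training list is a strict superset of the previous one, and every test index is larger than every training index in the same split.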

Read more in the :ref:`User Guide <cross_validation>`.

Parameters
----------
n_splits : int, default=5
    Number of splits. Must be at least 2.

    .. versionchanged:: 0.22
       ``n_splits`` default value changed from 3 to 5.

max_train_size : int, optional
    Maximum size for a single training set.
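As a rough illustration of what a cap on the training set could look like, the sketch below keeps only the most recent `max_train_size` indices before the test fold. Both the helper name `capped_train_indices` and the choice to drop the oldest samples first are assumptions made for this example; the text above only states that the parameter bounds the size of a single training set.

```python
def capped_train_indices(test_start, max_train_size=None):
    """Return training indices that end just before `test_start`.

    Hypothetical helper: when a cap is given and the uncapped set would be
    larger, keep only the most recent `max_train_size` indices (assumption).
    """
    start = 0
    if max_train_size is not None and max_train_size < test_start:
        start = test_start - max_train_size
    return list(range(start, test_start))


print(capped_train_indices(5, max_train_size=3))  # [2, 3, 4]
print(capped_train_indices(2, max_train_size=3))  # [0, 1]
print(capped_train_indices(4))                    # [0, 1, 2, 3]
```

With a cap in place, the training window slides forward instead of growing without bound.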

Examples
--------
>>> import numpy as np
>>> from sklearn.model_selection import TimeSeriesSplit
>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([1, 2, 3, 4, 5, 6])
>>> tscv = TimeSeriesSplit()
>>> print(tscv)
TimeSeriesSplit(max_train_size=None, n_splits=5)
>>> for train_index, test_index in tscv.split(X):
...     print('TRAIN:', train_index, 'TEST:', test_index)
...     X_train, X_test = X[train_index], X[test_index]
...     y_train, y_test = y[train_index], y[test_index]
TRAIN: [0] TEST: [1]
TRAIN: [0 1] TEST: [2]
TRAIN: [0 1 2] TEST: [3]
TRAIN: [0 1 2 3] TEST: [4]
TRAIN: [0 1 2 3 4] TEST: [5]

Notes
-----
The training set has size ``i * n_samples // (n_splits + 1) + n_samples % (n_splits + 1)`` in the ``i``th split, with a test set of size ``n_samples//(n_splits + 1)``, where ``n_samples`` is the number of samples.
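The size formula can be checked with a few lines of arithmetic. The sketch below is only a worked example; `split_sizes` is a hypothetical helper, and it assumes the first term is read as `i` times the per-fold size `n_samples // (n_splits + 1)`, which is the reading consistent with the Examples section.

```python
def split_sizes(n_samples, n_splits=5):
    """Return [(train_size, test_size), ...] for splits i = 1..n_splits.

    Hypothetical helper; assumes the per-fold size is computed first and
    then multiplied by the split number i.
    """
    fold = n_samples // (n_splits + 1)       # test set size
    remainder = n_samples % (n_splits + 1)   # extra samples in every train set
    return [(i * fold + remainder, fold) for i in range(1, n_splits + 1)]


print(split_sizes(6, n_splits=5))   # [(1, 1), (2, 1), (3, 1), (4, 1), (5, 1)]
print(split_sizes(10, n_splits=3))  # [(4, 2), (6, 2), (8, 2)]
```

In the second call, the last split uses 8 training samples and 2 test samples, so all 10 samples are accounted for.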


val get_n_splits : 
  ?x:Py.Object.t ->
  ?y:Py.Object.t ->
  ?groups:Py.Object.t ->
  [> tag ] Obj.t ->
  int

Returns the number of splitting iterations in the cross-validator.

Parameters
----------
X : object
    Always ignored, exists for compatibility.

y : object
    Always ignored, exists for compatibility.

groups : object
    Always ignored, exists for compatibility.

Returns
-------
n_splits : int
    Returns the number of splitting iterations in the cross-validator.


val split : 
  ?y:[> `ArrayLike ] Np.Obj.t ->
  ?groups:[> `ArrayLike ] Np.Obj.t ->
  x:[> `ArrayLike ] Np.Obj.t ->
  [> tag ] Obj.t ->
  ([> `ArrayLike ] Np.Obj.t * [> `ArrayLike ] Np.Obj.t) Seq.t

Generate indices to split data into training and test sets.

Parameters
----------
X : array-like, shape (n_samples, n_features)
    Training data, where n_samples is the number of samples and n_features is the number of features.

y : array-like, shape (n_samples,)
    Always ignored, exists for compatibility.

groups : array-like, with shape (n_samples,)
    Always ignored, exists for compatibility.

Yields
------
train : ndarray
    The training set indices for that split.

test : ndarray
    The testing set indices for that split.

val to_string : t -> string

Return a human-readable string representation of the object.

val show : t -> string

Return a human-readable string representation of the object.

val pp : Format.formatter -> t -> unit

Pretty-print the object to a formatter.
