package sek

  1. Overview
  2. Docs

The submodule Iter offers an implementation of iterators on ephemeral sequences.

Overview

Iterators offer a way of navigating through the elements of a sequence.

An iterator can be thought of as an integer index, which designates an element of the sequence. For convenience, we think of the sequence as surrounded by two sentinel elements, one at the beginning and one at the end. Thus, if the sequence has n elements, then an iterator can be thought of as an integer index that lies in the closed interval [-1, n]. The special indices -1 and n designate the two sentinel elements, while the ordinary indices in the semi-open interval [0, n) designate the actual elements of the sequence.

There are two canonical ways of writing a loop that iterates over all elements, one by one. (In the following discussion, we consider iteration in the forward direction. Iteration in the backward direction is symmetric.)

The first way is to create an iterator that points to the first element, using create, and to move the iterator after fetching an element, using get_and_move. The loop must continue until the iterator reaches the back sentinel, a condition that is tested using finished. Such a loop looks as follows:

let it = create forward s in
while not (finished it) do
  let x = get_and_move forward it in
  ...
done

This is analogous to the following way of iterating over an array a of length n:

let i = ref 0 in
while !i < n do
  let x = a.(postincrement i) in
  ...
done

The second way relies on the fact that get raises the exception End when the iterator points to a sentinel. Thanks to this feature, one can also express the above loop as follows:

let it = create forward s in
try
  while true do
    let x = get_and_move forward it in
    ...
  done
with End -> ()

An iterator also allows retrieving elements in batches, one segment at a time. The typical iteration pattern in this case is as follows:

let it = create forward s in
while not (finished i) do
  let a, j, k = get_segment_and_jump forward it in
  for j = j to j + k - 1 do
    let x = a.(j) in
    ...
  done
done

The function get_segment_and_jump returns an an array segment, described by an array a, a start index j, and a length k. The user is allowed to read this array segment, which contains the k next elements of the sequence. In the above example, the elements of the array segment a, j, k are processed one at a time using a for loop.

When iterating backward, the segment must be traversed in reverse order. The typical iteration pattern in that case is as follows:

let it = create backward s in
while not (finished it) do
  let a, j, k = get_segment_and_jump backward it in
  for j = j + k - 1 downto j do
    let x = a.(j) in
    ...
  done
done

The auxiliary function Segment.iter can be used to iterate over a segment, in either direction, without risk of error.

After a call to get_segment_and_jump, the iterator points to the first element (in the direction of iteration) that follows the segment that has just been returned. Thus, it is permitted to intermix calls to get_and_move and get_segment_and_jump while obtaining the desired effect: every element of the sequence is examined once and only once.

get_segment_and_jump always returns a nonempty array segment. If the iterator has reached the final sentinel, which implies that there are no more elements ahead of the iterator, then get_segment_and_jump raises the exception End, just like get_and_move. Conversely, if finished it is false, then get_segment_and_jump direction it cannot raise End.

Types

type 'a iter

'a iter is the type of an iterator. If the sequence has n elements, then an iterator can be thought of as an integer index that lies in the closed interval [-1, n]. The special indices -1 and n designate the two sentinel elements, while the ordinary indices in the semi-open interval [0, n) designate the actual elements of the sequence.

Creation Operations

val create : direction -> 'a t -> 'a iter

create forward s creates an iterator that points to index 0. This means that the iterator points to the first element of the sequence, if the sequence is nonempty, and points to the back sentinel if the sequence is empty. Symmetrically, create backward s creates an iterator that points to index n-1, where n is the length of the sequence. This means that the iterator points to the last element of the sequence, if the sequence is nonempty, and points to the front sentinel if the sequence is empty.

Time complexity: O(1).

val reset : direction -> 'a iter -> unit

reset dir it resets the iterator it to the same initial state that is established when a new iterator is created by create dir (sequence it). Thus, reset forward it is equivalent to reach it 0, and reset backward it is equivalent to reach it (length it - 1).

Time complexity: O(1).

val copy : 'a iter -> 'a iter

copy it creates a new iterator on the sequence sequence it, at the same position as the iterator it. Thus, if copy it returns it', then the equality index it = index it' holds.

Time complexity: O(log n).

Accessors

val sequence : 'a iter -> 'a t

sequence it is the sequence that the iterator it navigates. Throughout its lifetime, an iterator remains attached to the same sequence.

Time complexity: O(1).

val length : 'a iter -> int

length it is the length of the sequence that the iterator it navigates. It is a short-hand for length (sequence it).

Time complexity: O(1).

val index : 'a iter -> int

index it is the index that the iterator it currently points to. It lies in the closed interval [-1, length it]. The special indices -1 and n designate the two sentinel elements, while the ordinary indices in the semi-open interval [0, n) designate the actual elements of the sequence.

Time complexity: O(1).

val finished : 'a iter -> bool

finished it tests whether the iterator it currently points to the front or back sentinel. Thus, finished it is equivalent to index it = -1 || index it = length it. Its use is recommended, as it is both more readable and more efficient than testing a condition based on index it. The condition not (finished it) corresponds to it.hasNext() in Java's iterator API.

Time complexity: O(1).

Read Operations

val get : 'a iter -> 'a

If finished it is false, then get it reads and returns the element that the iterator it currently points to, that is, the element at index index it. In that case, get it is equivalent to accessing the sequence via get (sequence it) (index it), yet is much cheaper. If finished it is true, which means that the iterator points to a sentinel, then get it raises the exception End.

Time complexity: O(1).

val get_opt : 'a iter -> 'a option

get_opt it is analogous to get it, but returns an option instead of possibly raising the exception End. It is equivalent to if finished it then None else Some (get it).

Time complexity: O(1).

val get_segment : direction -> 'a iter -> 'a array * int * int

If finished it is false, then get_segment dir it returns a nonempty array segment, which contains a range of sequence elements. An array segment is a triple (a, j, k), where a is an array, j is the start index of the segment in the array a, and k is the length of the segment. k must be positive. The segment covers the array indices in the semi-open interval [j, j + k). You are allowed to read this array segment. You are not allowed to modify the array a in any way.

  • If dir is forward, then the array element a.(j) is the current element, that is, the element that would be returned by get it. It is the first element of the segment. The last element of the segment is a.(j + k - 1). A loop of the form for j = j to j + k - 1 can be used to enumerate these elements in the forward direction.
  • If dir is backward, then the array element a.(j + k - 1) is the current element, that is, the element that would be returned by get it. It is the first element of the segment. The last element of the segment is a.(j). A loop of the form for j = j + k - 1 downto j can be used to enumerate these elements in the backward direction.

If finished it is true, which means that the iterator points to a sentinel, then get_segment dir it raises the exception End.

Time complexity: O(1).

val get_segment_opt : direction -> 'a iter -> ('a array * int * int) option

get_segment_opt dir it is analogous to get_segment dir it, but returns an option instead of possibly raising the exception End. It is equivalent to if finished it then None else Some (get_segment dir it).

Time complexity: O(1).

Move Operations

val move : direction -> 'a iter -> unit

move dir it moves the iterator it by one element in the direction dir. An attempt to move the iterator forward when it already points to the back sentinel, or to move it backward when it already points to the front sentinel, is forbidden: in such a situation, move must not be called. In other words, the new index index it + sign dir must lie in the closed interval [-1, length it].

Time complexity: O(log n) in the worst case. However, the amortized time complexity of move is only O(1), in the following sense: the cost of iterating over a sequence using n successive move operations is O(n).

val jump : direction -> 'a iter -> int -> unit

jump dir it k moves the iterator it by k elements in the direction dir. The value of k may be positive, null, or negative. The new index index it + sign dir * k must lie in the closed interval [-1, length it]. When k is positive, jump dir it k is equivalent to a series of k calls to move dir it. When k is negative, jump dir it k is equivalent to a series of k calls to move (opposite dir) it. The operation jump dir it 0 has no effect.

Time complexity: O(log n). In the particular where abs k = 1, jump is optimized using move.

val reach : 'a iter -> int -> unit

reach it i moves the iterator it so as to point to the element at index i. Thus, after this operation, index it is i. The index i must lie in the closed interval [-1, length it].

Time complexity: O(log n). In the particular case where i = -1 or i = length it, reach is optimized and its complexity is O(1). In the particular case where one moves to the next or previous item, reach is optimized using move.

Read-and-Move Operations

val get_and_move : direction -> 'a iter -> 'a

get_and_move combines get and move. get_and_move dir it is equivalent to let x = get it in move dir it; x. Therefore, it raises the exception End if (before the move) the iterator points to a sentinel. It corresponds to it.next() in Java's iterator API.

Time complexity: same as move.

val get_and_move_opt : direction -> 'a iter -> 'a option

get_and_move_opt dir it is analogous to get_and_move dir it, but returns an option instead of possibly raising the exception End. It is equivalent to if finished it then None else Some (get_and_move it).

Time complexity: same as move.

val get_segment_and_jump : direction -> 'a iter -> 'a array * int * int

get_segment_and_jump combines get_segment and a jump of the length of the segment. get_segment_and_jump dir it is equivalent to let (_, _, k) as seg = get_segment dir it in jump dir it k; seg. Therefore, it raises the exception End if (before the move) the iterator points to a sentinel.

Time complexity: same as jump. Furthermore, the total cost of iterating over a sequence of n elements using get_segment_and_jump is O(n/K).

val get_segment_and_jump_opt : direction -> 'a iter -> ('a array * int * int) option

get_segment_and_jump_opt dir it is analogous to get_segment_and_jump dir it, but returns an option instead of possibly raising the exception End. It is equivalent to if finished it then None else Some (get_segment_and_jump dir it).

Time complexity: same as get_segment_and_jump.

Miscellaneous Operations

val check : 'a iter -> unit

In a release build, check it does nothing. In a development build, it checks that the iterator's internal invariant is satisfied.

val format : Stdlib.Format.formatter -> int iter -> unit

format is a printer for iterators over sequences of integers. It can be installed in the OCaml toplevel loop by #install_printer format. It is intended to be used only while debugging the library.

Iterator Invalidation

An iterator on an ephemeral sequence is invalidated when this sequence is modified (with a few exceptions, described below). Thereafter, this iterator can no longer be used (with a few exceptions, described below).

Operations that Invalidate Iterators

As a general rule, every operation that modifies an ephemeral sequence invalidates all iterators associated with this sequence.

We do not give an exhaustive list of these operations. Among the core operations, the following operations are in this category:

As an exception, E.assign s s has no effect, therefore does not invalidate any iterators.

As a more interesting exception, the operation E.Iter.set it invalidates all iterators on the sequence E.Iter.sequence it, except the iterator it itself, which remains valid.

If finished it is true, then E.Iter.set it raises End and does not invalidate any iterators.

The operations E.Iter.get_writable_segment, E.Iter.get_writable_segment_opt, E.Iter.set_and_move, E.Iter.get_writable_segment_and_jump, E.Iter.get_writable_segment_and_jump_opt, behave in the same way as E.Iter.set.

Operations that Can Be Applied to an Invalid Iterator

As a general rule, no operations can be applied to an invalid iterator.

As an exception to this rule, the following operations can be applied to an invalid iterator:

Runtime Detection of Illegal Operations on Iterators

An attempt to apply an operation other than those listed above to an invalid iterator is illegal.

If the flag check_iterator_validity is enabled (which it is, by default), then such an attempt is detected at runtime and gives rise to a runtime error.

For the same reason, if the flag check_iterator_validity is enabled, then an attempt to modify an ephemeral sequence while a higher-order iteration function (such as E.iter, E.fold_left, etc.) is running is detected at runtime and gives rise to a runtime error.

val is_valid : 'a iter -> bool

When the flag check_iterator_validity is enabled (which it is, by default), it is possible to test, at runtime, whether an iterator is currently valid. When the flag check_iterator_validity is false, it is impossible to determine at runtime whether an iterator is valid; in that case, is_valid always returns true.

Time complexity: O(1).

Write Operations

val set : 'a iter -> 'a -> unit

If finished it is false, then set it x replaces the element currently pointed to by the iterator it with the element x. In this case, set it x is equivalent to accessing the sequence via set (sequence it) (index it) x, yet is much cheaper. If finished it is true, then the exception End is raised.

Time complexity: if the set operation hits a chunk that is marked as potentially shared with other sequences, then its complexity is O(K + log n), and this chunk is replaced in the process with one that is not shared with any other sequence. Otherwise, the complexity is O(1).

val get_writable_segment : direction -> 'a iter -> 'a array * int * int

If finished it is false, then get_writable_segment dir it returns a nonempty array segment (a, j, k), which contains a range of sequence elements. The iterator must not point to a sentinel; that is, finished it must be false. Please see the documentation of get_segment for a detailed description of the array segment. In the case of get_writable_segment, you are allowed to read and write this array segment. You are not allowed to modify the array a outside of the semi-open interval [j, j + k).

If finished it is true, which means that the iterator points to a sentinel, then get_writable_segment dir it raises the exception End.

Time complexity: same as set.

val get_writable_segment_opt : direction -> 'a iter -> ('a array * int * int) option

get_writable_segment_opt dir it is analogous to get_writable_segment dir it, but returns an option instead of possibly raising the exception End. It is equivalent to if finished it then None else Some (get_writable_segment dir it).

Time complexity: same as set.

Write-and-Move Operations

val set_and_move : direction -> 'a iter -> 'a -> unit

set_and_move direction it x is equivalent to set it x; move direction it.

Time complexity: same as set followed with move.

val get_writable_segment_and_jump : direction -> 'a iter -> 'a array * int * int

get_writable_segment_and_jump combines get_writable_segment and a jump of the length of the segment. get_writable_segment_and_jump dir it is equivalent to let (_, _, k) as seg = get_writable_segment dir it in jump dir it k; seg. The iterator must not point to a sentinel; that is, finished it must be false.

Time complexity: same as set followed with jump. The total cost of iterating over a sequence of n elements using get_writable_segment_and_jump is O(n) in the worst case, and O(n/K) if no shared chunk is encountered.

val get_writable_segment_and_jump_opt : direction -> 'a iter -> ('a array * int * int) option

get_writable_segment_and_jump_opt dir it is analogous to get_writable_segment_and_jump dir it, but returns an option instead of possibly raising the exception End. It is equivalent to if finished it then None else Some (get_writable_segment_and_jump dir it).

Time complexity: same as get_writable_segment_and_jump.