The submodule `Iter`

offers an implementation of iterators on persistent sequences.

### Overview

Iterators offer a way of navigating through the elements of a sequence.

An iterator can be thought of as an integer index, which designates an element of the sequence. For convenience, we think of the sequence as surrounded by two *sentinel* elements, one at the beginning and one at the end. Thus, if the sequence has *n* elements, then an iterator can be thought of as an integer index that lies in the closed interval `[-1, n]`

. The special indices `-1`

and `n`

designate the two sentinel elements, while the ordinary indices in the semi-open interval `[0, n)`

designate the actual elements of the sequence.

There are two canonical ways of writing a loop that iterates over all elements, one by one. (In the following discussion, we consider iteration in the forward direction. Iteration in the backward direction is symmetric.)

The first way is to create an iterator that points to the first element, using `create`

, and to move the iterator after fetching an element, using `get_and_move`

. The loop must continue until the iterator reaches the back sentinel, a condition that is tested using `finished`

. Such a loop looks as follows:

```
let it = create forward s in
while not (finished it) do
let x = get_and_move forward it in
...
done
```

This is analogous to the following way of iterating over an array `a`

of length `n`

:

```
let i = ref 0 in
while !i < n do
let x = a.(postincrement i) in
...
done
```

The second way relies on the fact that `get`

raises the exception `End`

when the iterator points to a sentinel. Thanks to this feature, one can also express the above loop as follows:

```
let it = create forward s in
try
while true do
let x = get_and_move forward it in
...
done
with End -> ()
```

An iterator also allows retrieving elements in batches, one segment at a time. The typical iteration pattern in this case is as follows:

```
let it = create forward s in
while not (finished i) do
let a, j, k = get_segment_and_jump forward it in
for j = j to j + k - 1 do
let x = a.(j) in
...
done
done
```

The function `get_segment_and_jump`

returns an an array segment, described by an array `a`

, a start index `j`

, and a length `k`

. The user is allowed to *read* this array segment, which contains the `k`

next elements of the sequence. In the above example, the elements of the array segment `a, j, k`

are processed one at a time using a `for`

loop.

When iterating backward, the segment must be traversed in reverse order. The typical iteration pattern in that case is as follows:

```
let it = create backward s in
while not (finished it) do
let a, j, k = get_segment_and_jump backward it in
for j = j + k - 1 downto j do
let x = a.(j) in
...
done
done
```

The auxiliary function `Segment.iter`

can be used to iterate over a segment, in either direction, without risk of error.

After a call to `get_segment_and_jump`

, the iterator points to the first element (in the direction of iteration) that follows the segment that has just been returned. Thus, it is permitted to intermix calls to `get_and_move`

and `get_segment_and_jump`

while obtaining the desired effect: every element of the sequence is examined once and only once.

`get_segment_and_jump`

always returns a nonempty array segment. If the iterator has reached the final sentinel, which implies that there are no more elements ahead of the iterator, then `get_segment_and_jump`

raises the exception `End`

, just like `get_and_move`

. Conversely, if `finished it`

is `false`

, then `get_segment_and_jump direction it`

cannot raise `End`

.

### Types

`'a iter`

is the type of an iterator. If the sequence has *n* elements, then an iterator can be thought of as an integer index that lies in the closed interval `[-1, n]`

. The special indices `-1`

and `n`

designate the two sentinel elements, while the ordinary indices in the semi-open interval `[0, n)`

designate the actual elements of the sequence.

### Creation Operations

`create forward s`

creates an iterator that points to index `0`

. This means that the iterator points to the first element of the sequence, if the sequence is nonempty, and points to the back sentinel if the sequence is empty. Symmetrically, `create backward s`

creates an iterator that points to index `n-1`

, where `n`

is the length of the sequence. This means that the iterator points to the last element of the sequence, if the sequence is nonempty, and points to the front sentinel if the sequence is empty.

Time complexity: *O(1)*.

`reset dir it`

resets the iterator `it`

to the same initial state that is established when a new iterator is created by ```
create dir (sequence
it)
```

. Thus, `reset forward it`

is equivalent to `reach it 0`

, and ```
reset
backward it
```

is equivalent to `reach it (length it - 1)`

.

Time complexity: *O(1)*.

`copy it`

creates a new iterator on the sequence `sequence it`

, at the same position as the iterator `it`

. Thus, if `copy it`

returns `it'`

, then the equality `index it = index it'`

holds.

Time complexity: *O(log n)*.

### Accessors

`val sequence : 'a iter -> 'a t`

`sequence it`

is the sequence that the iterator `it`

navigates. Throughout its lifetime, an iterator remains attached to the same sequence.

Time complexity: *O(1)*.

`length it`

is the length of the sequence that the iterator `it`

navigates. It is a short-hand for `length (sequence it)`

.

Time complexity: *O(1)*.

`index it`

is the index that the iterator `it`

currently points to. It lies in the closed interval `[-1, length it]`

. The special indices `-1`

and `n`

designate the two sentinel elements, while the ordinary indices in the semi-open interval `[0, n)`

designate the actual elements of the sequence.

Time complexity: *O(1)*.

`val finished : 'a iter -> bool`

`finished it`

tests whether the iterator `it`

currently points to the front or back sentinel. Thus, `finished it`

is equivalent to `index it = -1 || index it = length it`

. Its use is recommended, as it is both more readable and more efficient than testing a condition based on `index it`

. The condition `not (finished it)`

corresponds to `it.hasNext()`

in Java's iterator API.

Time complexity: *O(1)*.

### Read Operations

If `finished it`

is `false`

, then `get it`

reads and returns the element that the iterator `it`

currently points to, that is, the element at index `index it`

. In that case, `get it`

is equivalent to accessing the sequence via `get (sequence it) (index it)`

, yet is much cheaper. If `finished it`

is `true`

, which means that the iterator points to a sentinel, then `get it`

raises the exception `End`

.

Time complexity: *O(1)*.

`val get_opt : 'a iter -> 'a option`

`get_opt it`

is analogous to `get it`

, but returns an option instead of possibly raising the exception `End`

. It is equivalent to ```
if finished
it then None else Some (get it)
```

.

Time complexity: *O(1)*.

If `finished it`

is `false`

, then `get_segment dir it`

returns a nonempty array segment, which contains a range of sequence elements. An array segment is a triple `(a, j, k)`

, where `a`

is an array, `j`

is the start index of the segment in the array `a`

, and `k`

is the length of the segment. `k`

must be positive. The segment covers the array indices in the semi-open interval `[j, j + k)`

. You are allowed to **read** this array segment. **You are not allowed to modify the array **`a`

in any way.

- If
`dir`

is `forward`

, then the array element `a.(j)`

is the current element, that is, the element that would be returned by `get it`

. It is the first element of the segment. The last element of the segment is `a.(j + k - 1)`

. A loop of the form ```
for j = j
to j + k - 1
```

can be used to enumerate these elements in the forward direction.

- If
`dir`

is `backward`

, then the array element `a.(j + k - 1)`

is the current element, that is, the element that would be returned by `get it`

. It is the first element of the segment. The last element of the segment is `a.(j)`

. A loop of the form `for j = j + k - 1 downto j`

can be used to enumerate these elements in the backward direction.

If `finished it`

is `true`

, which means that the iterator points to a sentinel, then `get_segment dir it`

raises the exception `End`

.

Time complexity: *O(1)*.

`get_segment_opt dir it`

is analogous to `get_segment dir it`

, but returns an option instead of possibly raising the exception `End`

. It is equivalent to `if finished it then None else Some (get_segment dir it)`

.

Time complexity: *O(1)*.

### Move Operations

`move dir it`

moves the iterator `it`

by one element in the direction `dir`

. An attempt to move the iterator forward when it already points to the back sentinel, or to move it backward when it already points to the front sentinel, is forbidden: in such a situation, `move`

must not be called. In other words, the new index `index it + sign dir`

must lie in the closed interval `[-1, length it]`

.

Time complexity: *O(log n)* in the worst case. However, the amortized time complexity of `move`

is only *O(1)*, in the following sense: the cost of iterating over a sequence using *n* successive `move`

operations is *O(n)*.

`jump dir it k`

moves the iterator `it`

by `k`

elements in the direction `dir`

. The value of `k`

may be positive, null, or negative. The new index `index it + sign dir * k`

must lie in the closed interval ```
[-1, length
it]
```

. When `k`

is positive, `jump dir it k`

is equivalent to a series of `k`

calls to `move dir it`

. When `k`

is negative, `jump dir it k`

is equivalent to a series of `k`

calls to `move (opposite dir) it`

. The operation `jump dir it 0`

has no effect.

Time complexity: *O(log n)*. In the particular where `abs k = 1`

, `jump`

is optimized using `move`

.

`reach it i`

moves the iterator `it`

so as to point to the element at index `i`

. Thus, after this operation, `index it`

is `i`

. The index `i`

must lie in the closed interval `[-1, length it]`

.

Time complexity: *O(log n)*. In the particular case where `i = -1`

or `i = length it`

, `reach`

is optimized and its complexity is *O(1)*. In the particular case where one moves to the next or previous item, `reach`

is optimized using `move`

.

### Read-and-Move Operations

`get_and_move`

combines `get`

and `move`

. `get_and_move dir it`

is equivalent to `let x = get it in move dir it; x`

. Therefore, it raises the exception `End`

if (before the move) the iterator points to a sentinel. It corresponds to `it.next()`

in Java's iterator API.

Time complexity: same as `move`

.

`get_and_move_opt dir it`

is analogous to `get_and_move dir it`

, but returns an option instead of possibly raising the exception `End`

. It is equivalent to `if finished it then None else Some (get_and_move it)`

.

Time complexity: same as `move`

.

`get_segment_and_jump`

combines `get_segment`

and a `jump`

of the length of the segment. `get_segment_and_jump dir it`

is equivalent to ```
let (_,
_, k) as seg = get_segment dir it in jump dir it k; seg
```

. Therefore, it raises the exception `End`

if (before the move) the iterator points to a sentinel.

Time complexity: same as `jump`

. Furthermore, the total cost of iterating over a sequence of *n* elements using `get_segment_and_jump`

is *O(n/K)*.

`get_segment_and_jump_opt dir it`

is analogous to ```
get_segment_and_jump
dir it
```

, but returns an option instead of possibly raising the exception `End`

. It is equivalent to ```
if finished it then None else Some
(get_segment_and_jump dir it)
```

.

Time complexity: same as `get_segment_and_jump`

.

### Miscellaneous Operations

`val check : 'a iter -> unit`

In a release build, `check it`

does nothing. In a development build, it checks that the iterator's internal invariant is satisfied.

`format`

is a printer for iterators over sequences of integers. It can be installed in the OCaml toplevel loop by `#install_printer format`

. It is intended to be used only while debugging the library.