The submodule Iter
offers an implementation of iterators on persistent sequences.
Overview
Iterators offer a way of navigating through the elements of a sequence.
An iterator can be thought of as an integer index, which designates an element of the sequence. For convenience, we think of the sequence as surrounded by two sentinel elements, one at the beginning and one at the end. Thus, if the sequence has n elements, then an iterator can be thought of as an integer index that lies in the closed interval [-1, n]
. The special indices -1
and n
designate the two sentinel elements, while the ordinary indices in the semi-open interval [0, n)
designate the actual elements of the sequence.
There are two canonical ways of writing a loop that iterates over all elements, one by one. (In the following discussion, we consider iteration in the forward direction. Iteration in the backward direction is symmetric.)
The first way is to create an iterator that points to the first element, using create
, and to move the iterator after fetching an element, using get_and_move
. The loop must continue until the iterator reaches the back sentinel, a condition that is tested using finished
. Such a loop looks as follows:
let it = create forward s in
while not (finished it) do
let x = get_and_move forward it in
...
done
This is analogous to the following way of iterating over an array a
of length n
:
let i = ref 0 in
while !i < n do
let x = a.(postincrement i) in
...
done
The second way relies on the fact that get
raises the exception End
when the iterator points to a sentinel. Thanks to this feature, one can also express the above loop as follows:
let it = create forward s in
try
while true do
let x = get_and_move forward it in
...
done
with End -> ()
An iterator also allows retrieving elements in batches, one segment at a time. The typical iteration pattern in this case is as follows:
let it = create forward s in
while not (finished i) do
let a, j, k = get_segment_and_jump forward it in
for j = j to j + k - 1 do
let x = a.(j) in
...
done
done
The function get_segment_and_jump
returns an an array segment, described by an array a
, a start index j
, and a length k
. The user is allowed to read this array segment, which contains the k
next elements of the sequence. In the above example, the elements of the array segment a, j, k
are processed one at a time using a for
loop.
When iterating backward, the segment must be traversed in reverse order. The typical iteration pattern in that case is as follows:
let it = create backward s in
while not (finished it) do
let a, j, k = get_segment_and_jump backward it in
for j = j + k - 1 downto j do
let x = a.(j) in
...
done
done
The auxiliary function Segment.iter
can be used to iterate over a segment, in either direction, without risk of error.
After a call to get_segment_and_jump
, the iterator points to the first element (in the direction of iteration) that follows the segment that has just been returned. Thus, it is permitted to intermix calls to get_and_move
and get_segment_and_jump
while obtaining the desired effect: every element of the sequence is examined once and only once.
get_segment_and_jump
always returns a nonempty array segment. If the iterator has reached the final sentinel, which implies that there are no more elements ahead of the iterator, then get_segment_and_jump
raises the exception End
, just like get_and_move
. Conversely, if finished it
is false
, then get_segment_and_jump direction it
cannot raise End
.
Types
'a iter
is the type of an iterator. If the sequence has n elements, then an iterator can be thought of as an integer index that lies in the closed interval [-1, n]
. The special indices -1
and n
designate the two sentinel elements, while the ordinary indices in the semi-open interval [0, n)
designate the actual elements of the sequence.
Creation Operations
create forward s
creates an iterator that points to index 0
. This means that the iterator points to the first element of the sequence, if the sequence is nonempty, and points to the back sentinel if the sequence is empty. Symmetrically, create backward s
creates an iterator that points to index n-1
, where n
is the length of the sequence. This means that the iterator points to the last element of the sequence, if the sequence is nonempty, and points to the front sentinel if the sequence is empty.
Time complexity: O(1).
reset dir it
resets the iterator it
to the same initial state that is established when a new iterator is created by create dir (sequence
it)
. Thus, reset forward it
is equivalent to reach it 0
, and reset
backward it
is equivalent to reach it (length it - 1)
.
Time complexity: O(1).
copy it
creates a new iterator on the sequence sequence it
, at the same position as the iterator it
. Thus, if copy it
returns it'
, then the equality index it = index it'
holds.
Time complexity: O(log n).
Accessors
val sequence : 'a iter -> 'a t
sequence it
is the sequence that the iterator it
navigates. Throughout its lifetime, an iterator remains attached to the same sequence.
Time complexity: O(1).
length it
is the length of the sequence that the iterator it
navigates. It is a short-hand for length (sequence it)
.
Time complexity: O(1).
index it
is the index that the iterator it
currently points to. It lies in the closed interval [-1, length it]
. The special indices -1
and n
designate the two sentinel elements, while the ordinary indices in the semi-open interval [0, n)
designate the actual elements of the sequence.
Time complexity: O(1).
val finished : 'a iter -> bool
finished it
tests whether the iterator it
currently points to the front or back sentinel. Thus, finished it
is equivalent to index it = -1 || index it = length it
. Its use is recommended, as it is both more readable and more efficient than testing a condition based on index it
. The condition not (finished it)
corresponds to it.hasNext()
in Java's iterator API.
Time complexity: O(1).
Read Operations
If finished it
is false
, then get it
reads and returns the element that the iterator it
currently points to, that is, the element at index index it
. In that case, get it
is equivalent to accessing the sequence via get (sequence it) (index it)
, yet is much cheaper. If finished it
is true
, which means that the iterator points to a sentinel, then get it
raises the exception End
.
Time complexity: O(1).
val get_opt : 'a iter -> 'a option
get_opt it
is analogous to get it
, but returns an option instead of possibly raising the exception End
. It is equivalent to if finished
it then None else Some (get it)
.
Time complexity: O(1).
If finished it
is false
, then get_segment dir it
returns a nonempty array segment, which contains a range of sequence elements. An array segment is a triple (a, j, k)
, where a
is an array, j
is the start index of the segment in the array a
, and k
is the length of the segment. k
must be positive. The segment covers the array indices in the semi-open interval [j, j + k)
. You are allowed to read this array segment. You are not allowed to modify the array a
in any way.
- If
dir
is forward
, then the array element a.(j)
is the current element, that is, the element that would be returned by get it
. It is the first element of the segment. The last element of the segment is a.(j + k - 1)
. A loop of the form for j = j
to j + k - 1
can be used to enumerate these elements in the forward direction.
- If
dir
is backward
, then the array element a.(j + k - 1)
is the current element, that is, the element that would be returned by get it
. It is the first element of the segment. The last element of the segment is a.(j)
. A loop of the form for j = j + k - 1 downto j
can be used to enumerate these elements in the backward direction.
If finished it
is true
, which means that the iterator points to a sentinel, then get_segment dir it
raises the exception End
.
Time complexity: O(1).
get_segment_opt dir it
is analogous to get_segment dir it
, but returns an option instead of possibly raising the exception End
. It is equivalent to if finished it then None else Some (get_segment dir it)
.
Time complexity: O(1).
Move Operations
move dir it
moves the iterator it
by one element in the direction dir
. An attempt to move the iterator forward when it already points to the back sentinel, or to move it backward when it already points to the front sentinel, is forbidden: in such a situation, move
must not be called. In other words, the new index index it + sign dir
must lie in the closed interval [-1, length it]
.
Time complexity: O(log n) in the worst case. However, the amortized time complexity of move
is only O(1), in the following sense: the cost of iterating over a sequence using n successive move
operations is O(n).
jump dir it k
moves the iterator it
by k
elements in the direction dir
. The value of k
may be positive, null, or negative. The new index index it + sign dir * k
must lie in the closed interval [-1, length
it]
. When k
is positive, jump dir it k
is equivalent to a series of k
calls to move dir it
. When k
is negative, jump dir it k
is equivalent to a series of k
calls to move (opposite dir) it
. The operation jump dir it 0
has no effect.
Time complexity: O(log n). In the particular where abs k = 1
, jump
is optimized using move
.
reach it i
moves the iterator it
so as to point to the element at index i
. Thus, after this operation, index it
is i
. The index i
must lie in the closed interval [-1, length it]
.
Time complexity: O(log n). In the particular case where i = -1
or i = length it
, reach
is optimized and its complexity is O(1). In the particular case where one moves to the next or previous item, reach
is optimized using move
.
Read-and-Move Operations
get_and_move
combines get
and move
. get_and_move dir it
is equivalent to let x = get it in move dir it; x
. Therefore, it raises the exception End
if (before the move) the iterator points to a sentinel. It corresponds to it.next()
in Java's iterator API.
Time complexity: same as move
.
get_and_move_opt dir it
is analogous to get_and_move dir it
, but returns an option instead of possibly raising the exception End
. It is equivalent to if finished it then None else Some (get_and_move it)
.
Time complexity: same as move
.
get_segment_and_jump
combines get_segment
and a jump
of the length of the segment. get_segment_and_jump dir it
is equivalent to let (_,
_, k) as seg = get_segment dir it in jump dir it k; seg
. Therefore, it raises the exception End
if (before the move) the iterator points to a sentinel.
Time complexity: same as jump
. Furthermore, the total cost of iterating over a sequence of n elements using get_segment_and_jump
is O(n/K).
get_segment_and_jump_opt dir it
is analogous to get_segment_and_jump
dir it
, but returns an option instead of possibly raising the exception End
. It is equivalent to if finished it then None else Some
(get_segment_and_jump dir it)
.
Time complexity: same as get_segment_and_jump
.
Miscellaneous Operations
val check : 'a iter -> unit
In a release build, check it
does nothing. In a development build, it checks that the iterator's internal invariant is satisfied.
format
is a printer for iterators over sequences of integers. It can be installed in the OCaml toplevel loop by #install_printer format
. It is intended to be used only while debugging the library.