package regenerate

Regenerate.Make(W)(S)(A) implements sample generation for words whose representation is given by the module W, over the alphabet A. S selects the data structure used internally to enumerate words.

For casual use of the library, consider using arbitrary instead.

Parameters

module Word : Word.S
module Segment : Segments.S with type elt = Word.t
module Sigma : SIGMA with type t = Segment.t

Signature

type node =
  | Nothing
  | Everything
  | Cons of Segment.t * lang
and lang = unit -> node

A language is a lazy stream of words segmented by growing length.
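To make the shape of this type concrete, here is a minimal stand-alone sketch that does not use the library. It assumes a simplified segment type (a plain list of strings instead of Segment.t), and the names all_words and take_segments are hypothetical helpers introduced only for illustration. The reading of Everything as "all remaining words" is an assumption, since the page does not spell it out.

```ocaml
(* Simplified stand-in for the documented types: a segment is a plain list
   of strings here, whereas the library uses the abstract type Segment.t. *)
type node =
  | Nothing                     (* no more words *)
  | Everything                  (* assumed meaning: all remaining words *)
  | Cons of string list * lang  (* words of the current length, then the rest *)
and lang = unit -> node

(* All words over the alphabet {a, b}, one segment per length 0, 1, 2, ...
   The next segment is only computed when the tail is forced. *)
let all_words : lang =
  let extend seg = List.concat (List.map (fun w -> [w ^ "a"; w ^ "b"]) seg) in
  let rec from seg () = Cons (seg, fun () -> from (extend seg) ()) in
  from [""]

(* Force the first [n] segments of a language. As a simplification,
   Everything is treated like Nothing because no alphabet is available. *)
let rec take_segments n (l : lang) =
  if n <= 0 then []
  else
    match l () with
    | Nothing | Everything -> []
    | Cons (seg, rest) -> seg :: take_segments (n - 1) rest
```

Because lang is a thunk, take_segments 3 all_words only ever builds the segments for lengths 0, 1 and 2, even though the language is infinite.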

val pp : Stdlib.Format.formatter -> lang -> unit

pp fmt lang pretty-prints the language lang.

val gen : Word.char Regex.t -> lang

gen regex returns the language recognized by regex.

Sampling

type res =
  | Done
  | Finite
  | GaveUp
exception ExitSample
val sample : ?st:Stdlib.Random.State.t -> ?n:int -> ?firsts:int -> skip:int -> lang -> (Segment.elt -> unit) -> res

sample ~skip ~n lang returns a sequence of n elements on average. lang is consumed only as needed.

We sample one element every k elements, where k follows a power law with mean skip. Furthermore, if we consume more than sqrt k empty segments, we assume that all remaining segments are empty and stop.

If firsts is provided, the first firsts elements are always output.
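The skipping scheme can be illustrated with a small stand-alone sketch. This is not the library's implementation: sample_every is a hypothetical function, it works on a flat string Seq.t rather than on a lang, it treats n as a hard cap instead of an average, and it omits the empty-segment cutoff. The skip length is supplied by a caller-provided draw function, so the power-law distribution is left as an assumption outside the sketch.

```ocaml
(* [sample_every ~draw ~firsts ~n words] keeps the first [firsts] words,
   then repeatedly skips [draw ()] words and keeps the next one, until
   [n] words have been kept or the input ends.
   [draw] stands in for the random skip length; the library draws it from
   a power law, which is not reproduced here. *)
let sample_every ~draw ~firsts ~n (words : string Seq.t) : string list =
  let rec go kept count to_skip seq =
    if count >= n then List.rev kept
    else
      match seq () with
      | Seq.Nil -> List.rev kept
      | Seq.Cons (w, rest) ->
        if count < firsts then go (w :: kept) (count + 1) to_skip rest
        else if to_skip > 0 then go kept count (to_skip - 1) rest
        else go (w :: kept) (count + 1) (draw ()) rest
  in
  go [] 0 (draw ()) words

(* Usage with a constant skip of 2, for reproducibility. *)
let example =
  let words =
    List.to_seq ["w0"; "w1"; "w2"; "w3"; "w4"; "w5"; "w6"; "w7"; "w8"; "w9"]
  in
  sample_every ~draw:(fun () -> 2) ~firsts:2 ~n:4 words
```

With a constant draw, the selection is deterministic: the first two words are kept unconditionally, then every third word after that. Replacing draw with a random generator gives the irregular spacing described above.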

Operations on languages

val flatten : lang -> Segment.elt Iter.t

flatten lang returns the words of lang as a single sequence, obtained by flattening its segments in order.
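As an illustration of what flattening a segmented stream means, here is a stand-alone sketch under simplifying assumptions: segments are plain string lists, the result is a standard-library Seq.t rather than an Iter.t, and Everything is treated as the end of the stream because no alphabet is available to enumerate it. The value finite is a hypothetical example language.

```ocaml
(* Simplified model of the documented types; segments are string lists. *)
type node =
  | Nothing
  | Everything
  | Cons of string list * lang
and lang = unit -> node

(* Lazily emit the words of each segment in turn. Empty segments are
   skipped transparently. *)
let rec flatten (l : lang) : string Seq.t = fun () ->
  match l () with
  | Nothing -> Seq.Nil
  | Everything -> Seq.Nil (* simplification: not enumerated in this sketch *)
  | Cons (seg, rest) -> append_list seg (flatten rest) ()

(* Emit the words of [seg], then continue with [tail]. *)
and append_list seg tail () =
  match seg with
  | [] -> tail ()
  | w :: ws -> Seq.Cons (w, append_list ws tail)

(* A finite language with an empty segment for length 1. *)
let finite : lang =
  fun () -> Cons ([""],
  fun () -> Cons ([],
  fun () -> Cons (["ab"; "ba"],
  fun () -> Nothing)))

let words = List.of_seq (flatten finite)
```

Since each segment holds words of a single length, the flattened sequence lists words by non-decreasing length.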

Regular operations

val union : lang -> lang -> lang
val inter : lang -> lang -> lang
val difference : lang -> lang -> lang
val compl : lang -> lang
val concatenate : lang -> lang -> lang
val star : (unit -> node) -> unit -> node
val rep : int -> int option -> lang -> lang
val charset : bool -> Word.char list -> unit -> node
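Because each segment groups words of one length, set operations such as union can plausibly be computed segment by segment. The sketch below illustrates that idea on a simplified model (string-list segments); it is an assumption about the general approach, not the library's code, and merge, cons, nothing and segments are hypothetical helpers.

```ocaml
(* Simplified model of the documented types; segments are string lists. *)
type node =
  | Nothing
  | Everything
  | Cons of string list * lang
and lang = unit -> node

(* Sorted, duplicate-free union of two segments. *)
let merge s1 s2 = List.sort_uniq compare (s1 @ s2)

(* Union proceeds in lockstep: the i-th segments of both languages hold
   words of the same length, so they can be merged pairwise. *)
let rec union (l1 : lang) (l2 : lang) : lang = fun () ->
  match l1 (), l2 () with
  | Everything, _ | _, Everything -> Everything
  | Nothing, n | n, Nothing -> n
  | Cons (s1, t1), Cons (s2, t2) -> Cons (merge s1 s2, union t1 t2)

(* Small helpers to build and inspect finite languages. *)
let nothing : lang = fun () -> Nothing
let cons seg rest : lang = fun () -> Cons (seg, rest)

let rec segments (l : lang) =
  match l () with
  | Nothing | Everything -> []
  | Cons (s, rest) -> s :: segments rest

(* Words of even length and words of odd length over {a}, up to length 2. *)
let evens = cons [""] (cons [] (cons ["aa"] nothing))
let odds = cons [] (cons ["a"] nothing)
let both = segments (union evens odds)
```

The same lockstep traversal would apply to inter and difference, with the per-segment merge replaced by the corresponding set operation.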