module Training_algorithm : sig ... end
val train : ?dict_size:int -> ?training_algorithm:Training_algorithm.t -> string array -> 'a Output.t -> 'a Core.Or_error.t
train ?dict_size ?training_algorithm samples output trains a dictionary from an array of samples and returns it in the form selected by output.
dict_size defaults to 100 KB, which is a reasonable dictionary size. In general it's recommended to provide a few thousand samples (though this can vary a lot), and that the total size of the samples be around 100 times the target dictionary size.
If dictionary training fails (train returns an error), you either provided too few samples or a dictionary would not be effective for your data.
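The sizing guidance above can be checked before calling train. The following is a minimal sketch using only the OCaml standard library; the helper names (default_dict_size, total_sample_bytes, enough_sample_bytes, suggested_dict_size) are hypothetical and not part of this library, and 100 KB is assumed to mean 100 * 1024 bytes, which may differ from the library's actual default.

```ocaml
(* Hypothetical helpers, not part of the library: they only apply the
   "total sample size of around 100 times the dictionary size" rule of thumb. *)

(* Assumption: 100 KB is taken as 100 * 1024 bytes. *)
let default_dict_size = 100 * 1024

(* Total number of bytes across all samples. *)
let total_sample_bytes (samples : string array) : int =
  Array.fold_left (fun acc s -> acc + String.length s) 0 samples

(* [true] when the samples total at least 100 times the target dictionary size. *)
let enough_sample_bytes ?(dict_size = default_dict_size) (samples : string array) : bool =
  total_sample_bytes samples >= 100 * dict_size

(* Largest dictionary size the samples support under the same rule of thumb. *)
let suggested_dict_size (samples : string array) : int =
  total_sample_bytes samples / 100

let () =
  (* Placeholder data: 2000 samples of 512 bytes each. Real samples should be
     representative of the data to compress, not identical strings. *)
  let samples = Array.make 2000 (String.make 512 'x') in
  Printf.printf "total bytes: %d\n" (total_sample_bytes samples);
  Printf.printf "enough for default dict_size: %b\n" (enough_sample_bytes samples);
  Printf.printf "suggested dict_size: %d\n" (suggested_dict_size samples)
```

If the check fails, one option is to pass a smaller ?dict_size (for example the value from suggested_dict_size) rather than training with too little data. This is only a heuristic: train can still fail, or succeed, regardless of what the check reports.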