package saga
Text processing and NLP extensions for Nx
Sources
raven-1.0.0.alpha1.tbz
sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867
sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c
Module Saga_tokenizers.Trainers
Training module for tokenization models.
Main trainer type
Training Configurations
val bpe :
?vocab_size:int ->
?min_frequency:int ->
?special_tokens:string list ->
?limit_alphabet:int ->
?initial_alphabet:string list ->
?continuing_subword_prefix:string ->
?end_of_word_suffix:string ->
?show_progress:bool ->
?max_token_length:int ->
unit ->
t
Create a BPE trainer.
val wordpiece :
?vocab_size:int ->
?min_frequency:int ->
?special_tokens:string list ->
?limit_alphabet:int ->
?initial_alphabet:string list ->
?continuing_subword_prefix:string ->
?end_of_word_suffix:string ->
?unk_token:string ->
?show_progress:bool ->
unit ->
t
Create a WordPiece trainer.
val word_level :
?vocab_size:int ->
?min_frequency:int ->
?special_tokens:string list ->
?show_progress:bool ->
unit ->
t
Create a WordLevel trainer.
val unigram :
?vocab_size:int ->
?n_sub_iterations:int ->
?shrinking_factor:float ->
?unk_token:string ->
?special_tokens:string list ->
?show_progress:bool ->
?initial_alphabet:string list ->
?max_piece_length:int ->
unit ->
t
Create a Unigram trainer.
val chars :
?min_frequency:int ->
?special_tokens:string list ->
?show_progress:bool ->
unit ->
t
Create a character-level trainer.
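All five constructors above share one calling convention: every setting is an optional labelled argument, and the trailing unit argument completes the call. The sketch below illustrates that convention with a hypothetical stand-in module, Stub, that mirrors only the shape of the word_level signature so the code compiles without the library installed. The record fields, the empty default for special_tokens, and the token strings are assumptions made for illustration; the real type t is abstract and this page does not state the library's defaults.

```ocaml
(* Hypothetical stand-in with the same argument shape as the documented
   [word_level]. The record and the [] default are illustrative only. *)
module Stub = struct
  type t = {
    vocab_size : int option;
    min_frequency : int option;
    special_tokens : string list;
    show_progress : bool option;
  }

  let word_level ?vocab_size ?min_frequency ?(special_tokens = [])
      ?show_progress () =
    { vocab_size; min_frequency; special_tokens; show_progress }
end

(* Every optional argument omitted: the trailing [()] completes the call. *)
let default_cfg = Stub.word_level ()

(* Labelled arguments may be supplied in any order. *)
let custom_cfg =
  Stub.word_level ~special_tokens:[ "[UNK]"; "[PAD]" ] ~vocab_size:30_000 ()

(* An [int option] held by the caller can be forwarded with [?label:]. *)
let with_limit (limit : int option) = Stub.word_level ?vocab_size:limit ()
```

With the library linked, the same call shape would read Saga_tokenizers.Trainers.word_level ~vocab_size:30_000 (), going by the signature listed above.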
Training Operations
Train a model on the given files.
val train_from_iterator :
t ->
iterator:(unit -> string option) ->
?model:Models.t ->
unit ->
training_result
Train a model from an iterator.
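The iterator argument has type unit -> string option, so any closure that hands out one string per call fits. The sketch below builds two such closures using only the standard library. It assumes that None marks the end of the stream, which is the usual reading of this type but is not stated on this page. The helper names iterator_of_list, iterator_of_channel, and drain are hypothetical and not part of the library.

```ocaml
(* Iterator over an in-memory list: each call returns the next element,
   and [None] once the list is exhausted. *)
let iterator_of_list (items : string list) : unit -> string option =
  let remaining = ref items in
  fun () ->
    match !remaining with
    | [] -> None
    | x :: rest ->
        remaining := rest;
        Some x

(* Iterator over an input channel: one line per call, [None] at end of file. *)
let iterator_of_channel (ic : in_channel) : unit -> string option =
 fun () -> try Some (input_line ic) with End_of_file -> None

(* Consume an iterator into a list, in the order the items were produced.
   Only used here to check the helpers above. *)
let drain (next : unit -> string option) : string list =
  let rec go acc =
    match next () with
    | None -> List.rev acc
    | Some s -> go (s :: acc)
  in
  go []
```

Going by the signature above, a call would then take the form train_from_iterator trainer ~iterator:(iterator_of_list corpus) (), where trainer is a value built by one of the configuration functions and corpus is a string list supplied by the caller.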
Serialization