package saga
Text processing and NLP extensions for Nx
Module Saga_models.Ngram
N-gram language models (unigram, bigram, trigram)
N-gram language models for text generation
Types
type t
An n-gram model.
type stats
Statistics about the trained model.
type smoothing
N-gram smoothing strategies (a hedged sketch of the variant follows this list):
Add_k k : classic add-k (Laplace) smoothing
Stupid_backoff alpha : back off to lower orders, scaled by alpha
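The declaration of smoothing itself is not shown on this page; a minimal sketch of how the variant could look, assuming (not confirmed by any signature here) that both constructors carry a float payload:

(* Hedged sketch: the payload types are assumed to be floats; only the
   constructor names and their meanings come from the docs above. *)
type smoothing =
  | Add_k of float           (* add k to every count (Laplace when k = 1) *)
  | Stupid_backoff of float  (* back off to lower orders, scaled by alpha *)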
create ~n ?smoothing ?cache_capacity tokens
builds a model with configurable smoothing and an optional logits cache.
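A usage sketch: the token-id array and the Stupid_backoff payload type are assumptions (the int array representation is suggested by generate's signature below); only the call shape of create comes from this page.

open Saga_models

(* Toy corpus of token ids; a real corpus would come from a tokenizer. *)
let tokens = [| 0; 1; 2; 1; 0; 2; 1; 1; 0 |]

(* Trigram model with stupid backoff (alpha assumed to be a float). *)
let model = Ngram.create ~n:3 ~smoothing:(Ngram.Stupid_backoff 0.4) tokens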
logits model ~context
returns log probabilities for the next token given the context. Context length should be n-1 for an n-gram model.
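Continuing the sketch above, assuming the context is an int array of token ids like start in generate; the concrete return type (a float array versus an Nx tensor) is not shown on this page, so the result is only bound:

(* Context length is n - 1, so two token ids for the trigram model. *)
let next_token_logits = Ngram.logits model ~context:[| 0; 1 |]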
perplexity model tokens
computes the perplexity of the model on the test tokens.
log_prob model tokens
returns the sum of log-probabilities of the observed tokens under the model.
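The two accessors are conventionally related by perplexity = exp (-log_prob / N); a hedged check, assuming natural logarithms and that the library averages over all N test tokens rather than only the predicted positions:

let approx_perplexity model test_tokens =
  let n = float_of_int (Array.length test_tokens) in
  let lp = Ngram.log_prob model test_tokens in
  (* Should approximate Ngram.perplexity model test_tokens. *)
  exp (-. lp /. n)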
val generate :
t ->
?max_tokens:int ->
?temperature:float ->
?seed:int ->
?start:int array ->
unit ->
int array
generate model ?max_tokens ?temperature ?seed ?start ()
generates a sequence of token ids by sampling from the model; start supplies the initial context, temperature rescales the sampling distribution, and seed makes generation reproducible.
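A sampling sketch that exercises every optional argument of the signature above, continuing with the trigram model from the create sketch:

(* Sample at most 50 tokens deterministically (fixed seed), with a
   slightly sharpened distribution, starting from two context ids. *)
let continuation =
  Ngram.generate model ~max_tokens:50 ~temperature:0.8 ~seed:42
    ~start:[| 0; 1 |] ()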
stats model
returns statistics about the highest-order n-grams.
save_text model filename
serializes the model to a text file.
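Persisting the sketch model; the filename is illustrative, the text format is not documented on this page, and save_text is assumed to return unit:

(* Write the model to an illustrative path in the current directory. *)
let () = Ngram.save_text model "trigram.ngram"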