package saga
Text processing and NLP extensions for Nx
Sources
raven-1.0.0.alpha1.tbz
sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867
sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c
Module Saga_tokenizers.Encoding
The Encoding module represents the output of a tokenizer.
The main encoding type. It is abstract to users.
val create :
  ids:int array ->
  type_ids:int array ->
  tokens:string array ->
  words:int option array ->
  offsets:(int * int) array ->
  special_tokens_mask:int array ->
  attention_mask:int array ->
  overflowing:t list ->
  sequence_ranges:(int, int * int) Hashtbl.t ->
  t
Create a new encoding. Intended for internal use.
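To make the shape of these arguments concrete, here is a minimal, self-contained sketch. It does not use the library: the record below is a hypothetical stand-in for the abstract type t, and the length check and the reading of sequence_ranges (sequence index mapped to a half-open token range) are assumptions, not documented behaviour.

```ocaml
(* Hypothetical stand-in for the abstract type t. The fields mirror the
   labelled arguments of create; the real representation is hidden. *)
type t = {
  ids : int array;
  type_ids : int array;
  tokens : string array;
  words : int option array;
  offsets : (int * int) array;
  special_tokens_mask : int array;
  attention_mask : int array;
  overflowing : t list;
  sequence_ranges : (int, int * int) Hashtbl.t;
}

let create ~ids ~type_ids ~tokens ~words ~offsets ~special_tokens_mask
    ~attention_mask ~overflowing ~sequence_ranges =
  let n = Array.length ids in
  (* Assumption: every per-token array holds exactly one entry per token. *)
  if Array.length type_ids <> n
     || Array.length tokens <> n
     || Array.length words <> n
     || Array.length offsets <> n
     || Array.length special_tokens_mask <> n
     || Array.length attention_mask <> n
  then invalid_arg "create: per-token arrays differ in length";
  { ids; type_ids; tokens; words; offsets; special_tokens_mask;
    attention_mask; overflowing; sequence_ranges }

(* Two made-up tokens for the input "hello world". *)
let example =
  let sequence_ranges = Hashtbl.create 1 in
  (* Assumed meaning: sequence 0 covers tokens [0, 2). *)
  Hashtbl.add sequence_ranges 0 (0, 2);
  create
    ~ids:[| 1; 2 |]
    ~type_ids:[| 0; 0 |]
    ~tokens:[| "hello"; "world" |]
    ~words:[| Some 0; Some 1 |]
    ~offsets:[| (0, 5); (6, 11) |]
    ~special_tokens_mask:[| 0; 0 |]
    ~attention_mask:[| 1; 1 |]
    ~overflowing:[]
    ~sequence_ranges
```

The point of the sketch is only that the per-token arrays run in parallel: index i of ids, tokens, words, offsets and the two masks all describe the same token.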
Create encoding from tokens
Accessors
Token/Word/Char mappings
Get the sequence index containing the given token
Get the character offsets of the given token
Get the tokens corresponding to the given word
Get the character offsets of the given word
Get the token containing the given character position
Get the word containing the given character position
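The character-based lookups can be pictured as a scan over the offsets array. The sketch below is not the library's implementation; char_to_token and char_to_word are hypothetical helper names, and it assumes each offset is a half-open (start, end) character range.

```ocaml
(* Hypothetical helper: index of the first token whose assumed half-open
   range [start, stop) contains the character position pos. *)
let char_to_token (offsets : (int * int) array) (pos : int) : int option =
  let n = Array.length offsets in
  let rec go i =
    if i >= n then None
    else
      let start, stop = offsets.(i) in
      if start <= pos && pos < stop then Some i else go (i + 1)
  in
  go 0

(* Hypothetical helper: word index of the token covering pos, if any.
   Tokens with no word (for example special tokens) yield None. *)
let char_to_word (offsets : (int * int) array) (words : int option array)
    (pos : int) : int option =
  match char_to_token offsets pos with
  | Some i -> words.(i)
  | None -> None
```

Under these assumptions, a position that falls between two tokens, such as the space in "hello world", maps to no token at all.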
Operations
Truncation direction
Truncate the encoding
Padding direction
val pad :
  t ->
  target_length:int ->
  pad_id:int ->
  pad_type_id:int ->
  pad_token:string ->
  direction:padding_direction ->
  t
Pad the encoding
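As an illustration of what padding might do to the per-token arrays, here is a self-contained sketch on a reduced stand-in record. Several things here are assumptions rather than documented facts: the constructors Left and Right of padding_direction, the record type enc, the choice of 0 for padded attention_mask entries and 1 for padded special_tokens_mask entries, and the decision to return the input unchanged when it is already long enough.

```ocaml
(* Assumed constructors; the documentation only names the type. *)
type padding_direction = Left | Right

(* Hypothetical, reduced stand-in for the abstract encoding type. *)
type enc = {
  ids : int array;
  type_ids : int array;
  tokens : string array;
  special_tokens_mask : int array;
  attention_mask : int array;
}

let pad enc ~target_length ~pad_id ~pad_type_id ~pad_token ~direction =
  let missing = target_length - Array.length enc.ids in
  if missing <= 0 then enc (* already long enough: nothing to add *)
  else
    (* Add [missing] copies of [fill] on the side chosen by [direction]. *)
    let extend xs fill =
      let filler = Array.make missing fill in
      match direction with
      | Left -> Array.append filler xs
      | Right -> Array.append xs filler
    in
    {
      ids = extend enc.ids pad_id;
      type_ids = extend enc.type_ids pad_type_id;
      tokens = extend enc.tokens pad_token;
      (* Assumption: padding counts as special and is not attended to. *)
      special_tokens_mask = extend enc.special_tokens_mask 1;
      attention_mask = extend enc.attention_mask 0;
    }
```

Extending every array by the same amount on the same side keeps index i referring to the same token across all fields.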