package saga
Text processing and NLP extensions for Nx
Sources
raven-1.0.0.alpha1.tbz
sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867
sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c
Module Saga_tokenizers.Processors
Post-processing module for tokenization output.
Post-processors handle special tokens and formatting after tokenization, such as adding CLS and SEP tokens for BERT, or handling sentence pairs.
type encoding = {
  ids : int array;
  type_ids : int array;
  tokens : string array;
  offsets : (int * int) array;
  special_tokens_mask : int array;
  attention_mask : int array;
  overflowing : encoding list;
  sequence_ranges : (int * int * int) list;
}
Type representing an encoding to be processed.
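To make the record fields concrete, here is a minimal, self-contained sketch of what a BERT-style post-processing step does to a single-sequence encoding. It is not Saga's implementation: the record is a local copy of the documented type so the example compiles on its own, `add_cls_sep` is a hypothetical helper, and the token ids are illustrative values. The sketch leaves `overflowing` and `sequence_ranges` untouched because this page does not describe their semantics.

```ocaml
(* Local mirror of the documented record, so the sketch compiles without
   the library. *)
type encoding = {
  ids : int array;
  type_ids : int array;
  tokens : string array;
  offsets : (int * int) array;
  special_tokens_mask : int array;
  attention_mask : int array;
  overflowing : encoding list;
  sequence_ranges : (int * int * int) list;
}

(* Hypothetical helper (not part of Saga): wrap one sequence as
   [CLS] ... [SEP], keeping every per-token array the same length. *)
let add_cls_sep ~cls:(cls_tok, cls_id) ~sep:(sep_tok, sep_id) e =
  let wrap first last mid = Array.concat [ [| first |]; mid; [| last |] ] in
  {
    e with
    ids = wrap cls_id sep_id e.ids;
    type_ids = wrap 0 0 e.type_ids;
    tokens = wrap cls_tok sep_tok e.tokens;
    (* Special tokens do not map to any span of the input text. *)
    offsets = wrap (0, 0) (0, 0) e.offsets;
    (* 1 marks a special token, 0 a token from the input. *)
    special_tokens_mask = wrap 1 1 e.special_tokens_mask;
    attention_mask = wrap 1 1 e.attention_mask;
  }

(* "hello world" as a tokenizer might produce it; ids are illustrative. *)
let hello =
  {
    ids = [| 7592; 2088 |];
    type_ids = [| 0; 0 |];
    tokens = [| "hello"; "world" |];
    offsets = [| (0, 5); (6, 11) |];
    special_tokens_mask = [| 0; 0 |];
    attention_mask = [| 1; 1 |];
    overflowing = [];
    sequence_ranges = [];
  }

let wrapped = add_cls_sep ~cls:("[CLS]", 101) ~sep:("[SEP]", 102) hello

let () = print_endline (String.concat " " (Array.to_list wrapped.tokens))
```

Every per-token array grows by the same two entries, which is the invariant a post-processor has to preserve.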
Main post-processor type
Constructors
val roberta :
  sep:(string * int) ->
  cls:(string * int) ->
  ?trim_offsets:bool ->
  ?add_prefix_space:bool ->
  unit ->
  t
Create a RoBERTa post-processor.
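Going by the signature, a call would look something like `roberta ~sep:("</s>", 2) ~cls:("<s>", 0) ()`, where 0 and 2 are the ids these tokens have in the roberta-base vocabulary and should be checked against your own. The sketch below illustrates the layout RoBERTa models expect: `<s> A </s>` for a single sequence and `<s> A </s></s> B </s>` for a pair. `roberta_layout` is a hypothetical stand-alone helper working on token strings only; it does not call Saga and may differ from what the library does with offsets and prefix spaces.

```ocaml
(* Hypothetical helper (not part of Saga): arrange tokens in the RoBERTa
   layout. A pair is separated by two consecutive [sep] tokens. *)
let roberta_layout ~cls ~sep ?pair single =
  match pair with
  | None -> [ cls ] @ single @ [ sep ]
  | Some second -> [ cls ] @ single @ [ sep; sep ] @ second @ [ sep ]

let single = roberta_layout ~cls:"<s>" ~sep:"</s>" [ "Hello"; "world" ]

let paired =
  roberta_layout ~cls:"<s>" ~sep:"</s>" ~pair:[ "Bye" ] [ "Hello"; "world" ]

(* Number of special tokens the layout adds in each case. *)
let added_single = List.length single - 2
let added_pair = List.length paired - 3

let () =
  print_endline (String.concat " " single);
  print_endline (String.concat " " paired);
  Printf.printf "added: %d (single), %d (pair)\n" added_single added_pair
```

In this layout a single sequence gains two tokens and a pair gains four, which is the kind of count a caller needs when reserving room under a maximum sequence length.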
val template :
  single:string ->
  ?pair:string ->
  ?special_tokens:(string * int) list ->
  unit ->
  t
Create a template post-processor.
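This page does not document the template syntax. The sketch below assumes a convention similar to the one used by Hugging Face tokenizers: space-separated pieces where `$A` and `$B` stand for the first and second sequence, and any other piece names a special token that must appear in the `special_tokens` list. Under that assumption, template expansion could look roughly like this. `expand` is a hypothetical helper, not a Saga function, and it ignores type ids for brevity.

```ocaml
(* Assumed template syntax (not confirmed by this page): "$A" and "$B" are
   placeholders for the two sequences; every other piece is a special token
   looked up by name. Tokens are represented as (string, id) pairs. *)
let expand ~template ~special_tokens ~a ~b =
  String.split_on_char ' ' template
  |> List.filter (fun piece -> piece <> "")
  |> List.concat_map (function
       | "$A" -> a
       | "$B" -> b
       | name ->
           (* Raises Not_found if the template names an undeclared token. *)
           [ (name, List.assoc name special_tokens) ])

(* Illustrative ids; use the ones from your own vocabulary. *)
let special_tokens = [ ("[CLS]", 101); ("[SEP]", 102) ]

let expanded =
  expand ~template:"[CLS] $A [SEP] $B [SEP]" ~special_tokens
    ~a:[ ("hello", 7592) ]
    ~b:[ ("world", 2088) ]

let () =
  List.iter (fun (tok, id) -> Printf.printf "%s:%d " tok id) expanded;
  print_newline ()
```

Requiring each special token to be declared with its id mirrors the `?special_tokens:(string * int) list` argument in the signature above: the template names the token, and the list supplies the id.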
Operations
Process encodings with the post-processor.
Get the number of tokens added by this post-processor.
Serialization
Convert the post-processor to its JSON representation.
Create a post-processor from its JSON representation.