package brot

  1. Overview
  2. No Docs
Tokenization for OCaml

Install

dune-project
 Dependency

Authors

Maintainers

Sources

raven-1.0.0.alpha3.tbz
sha256=96d35ce03dfbebd2313657273e24c2e2d20f9e6c7825b8518b69bd1d6ed5870f
sha512=90c5053731d4108f37c19430e45456063e872b04b8a1bbad064c356e1b18e69222de8bfcf4ec14757e71f18164ec6e4630ba770dbcb1291665de5418827d1465

Description

Fast, HuggingFace-compatible tokenization for language models. BPE, WordPiece, Unigram, word-level, and character-level algorithms with composable pipelines and training from scratch.

Dependencies (7)

  1. uucp
  2. uunf >= "15.1.0"
  3. bytesrw
  4. jsont
  5. re
  6. dune >= "3.21"
  7. ocaml >= "5.2.0"

Dev Dependencies (3)

  1. odoc with-doc
  2. mdx with-test
  3. windtrap with-test

Used by (1)

  1. raven >= "1.0.0~alpha3"

Conflicts

None