package wseg

A word identification system

Sources

0.3.0.tar.gz
md5=e56244d34e92bda9c7a9fdea99734748

wseg

Usage

The test directory contains two plain-text dictionary files, one for characters and one for words. It also contains a test.ml program that shows how to build a dictionary and an index with wseg, and how to apply several rules to identify the words in a sentence. Run make runtest to try the demo.
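To give a feel for what dictionary-based word identification involves, here is a minimal, self-contained sketch of greedy forward maximum matching in plain OCaml. It is not the wseg API: `make_dict`, `segment`, `take`, `drop`, and the `max_len` parameter are hypothetical names invented for this illustration, and the four-word dictionary is toy data rather than the contents of word.dic. See test.ml for the library's actual calls.

```ocaml
(* Illustrative sketch only: greedy forward maximum matching over a toy
   dictionary. None of these names come from wseg. *)

(* Build a lookup table from a list of known words. *)
let make_dict (words : string list) : (string, unit) Hashtbl.t =
  let tbl = Hashtbl.create 16 in
  List.iter (fun w -> Hashtbl.replace tbl w ()) words;
  tbl

(* Take the first [n] elements of a list (fewer if the list is shorter). *)
let rec take n = function
  | x :: rest when n > 0 -> x :: take (n - 1) rest
  | _ -> []

(* Drop the first [n] elements of a list. *)
let rec drop n = function
  | _ :: rest when n > 0 -> drop (n - 1) rest
  | l -> l

(* Segment [chars] (one string per character) by repeatedly taking the
   longest dictionary word that starts at the current position. A
   character that starts no known word becomes a one-character word. *)
let segment dict ~max_len chars =
  let rec longest n rest =
    if n <= 1 then 1
    else
      let candidate = String.concat "" (take n rest) in
      if Hashtbl.mem dict candidate then n else longest (n - 1) rest
  in
  let rec go acc = function
    | [] -> List.rev acc
    | rest ->
      let n = longest (min max_len (List.length rest)) rest in
      go (String.concat "" (take n rest) :: acc) (drop n rest)
  in
  go [] chars

(* Toy usage: the greedy choice picks "研究生" first and strands "命". *)
let () =
  let dict = make_dict ["研究"; "研究生"; "生命"; "起源"] in
  let words = segment dict ~max_len:3 ["研"; "究"; "生"; "命"; "起"; "源"] in
  print_endline (String.concat " / " words)
  (* prints: 研究生 / 命 / 起源 *)
```

The output shows the weakness of a purely greedy strategy: the reading 研究 / 生命 / 起源 is also consistent with the dictionary but is never considered. Ambiguities of this kind are one reason a segmenter applies several rules rather than a single longest-match pass.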

char.dic contains 12640 Chinese characters and word.dic contains 157202 words. You can extend these dictionaries or adapt the demo for general use.