package higlo
Module Higlo.Lang
Syntax highlighting
UTF-8 text and its length, or a negative number if the length was not computed.
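The "negative length means not computed" convention can be illustrated with a small sketch. It assumes that token_text is a (text, length) pair, which the description suggests but this page does not show; utf8_length and length_of are hypothetical helper names, not part of Higlo.

```ocaml
(* Assumption: token_text is a pair of the text and its length in codepoints. *)
type token_text = string * int

(* Count codepoints by counting the bytes that are not UTF-8 continuation
   bytes (those of the form 10xxxxxx). Assumes the input is valid UTF-8. *)
let utf8_length (s : string) : int =
  let n = ref 0 in
  String.iter (fun c -> if Char.code c land 0xC0 <> 0x80 then incr n) s;
  !n

(* Return the stored length, computing it only when it is negative. *)
let length_of ((s, len) : token_text) : int =
  if len < 0 then utf8_length s else len
```

A consumer that needs column positions could call length_of on each token and pay the counting cost only for tokens whose length was left uncomputed.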
type token =
  | Bcomment of token_text (* block comment *)
  | Constant of token_text
  | Directive of token_text
  | Escape of token_text (* Escape sequence like \123 *)
  | Id of token_text
  | Keyword of int * token_text
  | Lcomment of token_text (* one line comment *)
  | Numeric of token_text
  | String of token_text
  | Symbol of int * token_text
  | Text of token_text (* Used for everything else *)
  | Title of int * token_text
Tokens read in the given code, with the corresponding text and the length of that text (in number of codepoints). The names are inspired by the highlight tool. Keyword and Symbol are parametrized by an integer so that different families of keywords and symbols can be distinguished, like kwa, kwb, ... in highlight.
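As an illustration of how the integer parameter separates keyword and symbol families, here is a minimal sketch that maps each token to a class name. The types are a local copy of the ones described on this page so that the code compiles on its own; token_text is assumed to be a (text, length) pair, and class_of_token and text_of_token are hypothetical helpers, not Higlo functions.

```ocaml
(* Local copy of the documented types. Assumption: token_text = string * int. *)
type token_text = string * int

type token =
  | Bcomment of token_text
  | Constant of token_text
  | Directive of token_text
  | Escape of token_text
  | Id of token_text
  | Keyword of int * token_text
  | Lcomment of token_text
  | Numeric of token_text
  | String of token_text
  | Symbol of int * token_text
  | Text of token_text
  | Title of int * token_text

(* Hypothetical helper: a class name per token, with the family index
   appended for the parametrized constructors. *)
let class_of_token = function
  | Bcomment _ -> "bcomment"
  | Constant _ -> "constant"
  | Directive _ -> "directive"
  | Escape _ -> "escape"
  | Id _ -> "id"
  | Keyword (n, _) -> Printf.sprintf "kw%d" n
  | Lcomment _ -> "lcomment"
  | Numeric _ -> "numeric"
  | String _ -> "string"
  | Symbol (n, _) -> Printf.sprintf "sym%d" n
  | Text _ -> "text"
  | Title (n, _) -> Printf.sprintf "title%d" n

(* Hypothetical helper: extract the text carried by any token. *)
let text_of_token = function
  | Bcomment (s, _) | Constant (s, _) | Directive (s, _) | Escape (s, _)
  | Id (s, _) | Lcomment (s, _) | Numeric (s, _) | String (s, _)
  | Text (s, _)
  | Keyword (_, (s, _)) | Symbol (_, (s, _)) | Title (_, (s, _)) -> s
```

With this shape, a printer can style Keyword (0, _) and Keyword (1, _) differently without the token type needing one constructor per family.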
type error =
  | Unknown_lang of string (* when the required language is not found. *)
  | Lex_error of Location.t * string
Lexers are based on Sedlex. A lexer returns a list of tokens, in the same order they appear in the read string. Text tokens are merged by the parse function.
get_lexer lang returns the lexer registered for the given language lang or raises Unknown_lang if no such language was registered.
registered_langs returns the list of registered pairs (name, lexer).
If a lexer was already registered for the same language, the previous lexer is no longer available.
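The registration behaviour described above (lookup by name, failure on an unknown language, replacement on re-registration) can be sketched with a hash table. This is not Higlo's implementation: the lexer type is simplified to a function over strings rather than a Sedlex buffer, the token type is reduced to a single constructor, and the Error exception and the function bodies are assumptions made for the sketch.

```ocaml
(* Reduced types for the sketch; the real token type has more constructors. *)
type token = Text of string * int

(* Hypothetical lexer type: the library's lexers are based on Sedlex. *)
type lexer = string -> token list

type error = Unknown_lang of string

(* Assumption: errors are carried by an exception wrapping the error type. *)
exception Error of error

let lexers : (string, lexer) Hashtbl.t = Hashtbl.create 16

(* Hashtbl.replace drops any previous binding, so an earlier lexer for
   the same language is no longer reachable. *)
let register_lang (name : string) (lexer : lexer) : unit =
  Hashtbl.replace lexers name lexer

let get_lexer (lang : string) : lexer =
  match Hashtbl.find_opt lexers lang with
  | Some lexer -> lexer
  | None -> raise (Error (Unknown_lang lang))

let registered_langs () : (string * lexer) list =
  Hashtbl.fold (fun name lexer acc -> (name, lexer) :: acc) lexers []
```

Using Hashtbl.replace rather than Hashtbl.add is the design choice that matches the text: Hashtbl.add would only shadow the old binding, which would then reappear in registered_langs.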
parse ?raise_exn ~lang code gets the lexer associated to lang and uses it to build a list of tokens. Consecutive Text tokens are merged. If no lexer is associated to the given language, then the function returns [Text code].
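The merging of consecutive Text tokens and the fallback to a single Text token can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the token type is reduced to two constructors, token_text is assumed to be a (text, length) pair, the lexer lookup is passed in as a hypothetical find_lexer function, the fallback length is left uncomputed (-1), and the ?raise_exn argument is not modelled.

```ocaml
(* Assumption: token_text = string * int, with a negative length meaning
   "not computed". *)
type token_text = string * int

(* Reduced token type for the sketch. *)
type token =
  | Keyword of int * token_text
  | Text of token_text

(* Merge runs of adjacent Text tokens; other tokens are kept as they are.
   If either length is unknown, the merged length is unknown too. *)
let merge_text (tokens : token list) : token list =
  let rec go acc = function
    | [] -> List.rev acc
    | Text (s2, l2) :: rest ->
      (match acc with
       | Text (s1, l1) :: acc' ->
         let len = if l1 < 0 || l2 < 0 then -1 else l1 + l2 in
         go (Text (s1 ^ s2, len) :: acc') rest
       | _ -> go (Text (s2, l2) :: acc) rest)
    | t :: rest -> go (t :: acc) rest
  in
  go [] tokens

(* Hypothetical parse: find_lexer returns None for unknown languages, in
   which case the whole input becomes one Text token. *)
let parse ~find_lexer ~lang code =
  match find_lexer lang with
  | None -> [Text (code, -1)]
  | Some lexer -> merge_text (lexer code)
```

For example, merge_text applied to two Text tokens followed by a Keyword yields one Text token holding the concatenated text, then the Keyword unchanged.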