package search


A functor for building a TF-IDF (term frequency-inverse document frequency) search index over a single type of document.
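To make the scoring idea concrete, here is a minimal, self-contained sketch of TF-IDF over pre-tokenised documents. It assumes the common weighting tf × log(N / df); the exact formula this package uses is not stated on this page, and the names `term_frequency`, `inverse_document_frequency` and `tf_idf` are hypothetical helpers, not part of the functor's signature.

```ocaml
(* Illustrative only: not the package's implementation. *)

(* Number of occurrences of [term] in a tokenised document. *)
let term_frequency (term : string) (doc : string list) : int =
  List.length (List.filter (String.equal term) doc)

(* Assumed weighting: idf = log (N / df), where N is the number of
   documents and df the number of documents containing [term].
   Returns 0 when no document contains the term. *)
let inverse_document_frequency (term : string) (docs : string list list) : float =
  let n = List.length docs in
  let df = List.length (List.filter (List.mem term) docs) in
  if df = 0 then 0.0 else log (float_of_int n /. float_of_int df)

(* Score of [term] for one document within the collection [docs]. *)
let tf_idf (term : string) (doc : string list) (docs : string list list) : float =
  float_of_int (term_frequency term doc) *. inverse_document_frequency term docs

let () =
  let docs = [ [ "ocaml"; "search" ]; [ "ocaml"; "functor" ]; [ "index" ] ] in
  (* Print the score of "search" for each document in turn. *)
  List.iter (fun doc -> Printf.printf "%.3f\n" (tf_idf "search" doc docs)) docs
```

Under this weighting, a term that appears in only one document ("search") scores higher in that document than a term shared by several documents ("ocaml"), and documents that do not contain the term score zero.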

Parameters

module Uid : Uid
module Doc : sig ... end

Signature

type doc = Doc.t

The type of documents we will search over

type uid = Uid.t

The type of unique identifiers we will use to identify distinct documents

type t

The search index

val index : t -> uid:uid -> token:string -> doc -> unit

index t ~uid ~token doc indexes the document doc in t under token, with the unique identifier uid.

val add_document : t -> uid -> doc -> unit

Adds a new document to the index.

val add_index : t -> (doc -> string) -> unit

Adds a new index and re-indexes everything.

val add_indexes : t -> (doc -> string) list -> unit

Same as add_index but allows you to add multiple indexes at a time before re-indexing occurs.
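The relationship between documents, indexes and re-indexing can be sketched with a toy model. This is an assumption-laden illustration, not the package's implementation: the record type `toy`, the integer uids, the `doc` record with `title` and `body` fields, the whitespace tokenisation, and the helper `lookup` are all hypothetical. It only shows why adding several indexes at once is cheaper than adding them one by one: the re-indexing pass runs once.

```ocaml
(* Toy model only: every name here is hypothetical. *)
type doc = { title : string; body : string }

type toy = {
  mutable docs : (int * doc) list;         (* (uid, document) pairs *)
  mutable indexes : (doc -> string) list;  (* field extractors *)
  table : (string, int) Hashtbl.t;         (* token -> uid *)
}

let create () = { docs = []; indexes = []; table = Hashtbl.create 16 }

(* Index one document under every registered extractor. *)
let index_doc t (uid, doc) =
  List.iter
    (fun extract ->
      String.split_on_char ' ' (String.lowercase_ascii (extract doc))
      |> List.iter (fun tok -> if tok <> "" then Hashtbl.add t.table tok uid))
    t.indexes

(* Throw away the table and index every known document again. *)
let reindex t =
  Hashtbl.reset t.table;
  List.iter (index_doc t) t.docs

(* Register several extractors, then re-index once. *)
let add_indexes t extractors =
  t.indexes <- t.indexes @ extractors;
  reindex t

let add_index t extractor = add_indexes t [ extractor ]

let add_document t uid doc =
  t.docs <- (uid, doc) :: t.docs;
  index_doc t (uid, doc)

(* Sorted, de-duplicated uids of documents containing [tok]. *)
let lookup t tok = List.sort_uniq compare (Hashtbl.find_all t.table tok)

let () =
  let t = create () in
  add_document t 1 { title = "Functors"; body = "modules in ocaml" };
  add_document t 2 { title = "Search"; body = "an index for ocaml" };
  (* One re-indexing pass covers both new indexes. *)
  add_indexes t [ (fun d -> d.title); (fun d -> d.body) ];
  lookup t "ocaml" |> List.iter (Printf.printf "%d\n")
```

In this sketch, documents added before any index exists are stored but not searchable until an index is added and the re-indexing pass runs.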

search t k searches the index t using the key k.

val empty : ?santiser:(string -> string) -> ?strategy:(string -> string list) -> ?tokeniser:(string -> string list) -> unit -> t

Create a new empty search index.

  • parameter sanitiser

    Run on each token to normalise it. By default this is String.lowercase_ascii.

  • parameter strategy

    The indexing strategy. By default this is a prefixing strategy, such that abc is indexed under a, ab and abc.

  • parameter tokeniser

    Turns your documents into tokens.
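As a rough illustration of the three optional arguments, the sketch below writes one function of each shape: a tokeniser (`string -> string list`), a sanitiser (`string -> string`) and a prefixing strategy (`string -> string list`). The names `prefixes`, `whitespace_tokeniser` and `index_keys` are hypothetical, and the order in which the pieces are composed (tokenise, then sanitise, then expand) is an assumption rather than something this page specifies.

```ocaml
(* Illustrative only: not the package's default implementations. *)

(* Prefixing strategy: every non-empty prefix of a token, shortest first. *)
let prefixes (token : string) : string list =
  List.init (String.length token) (fun i -> String.sub token 0 (i + 1))

(* A simple tokeniser: split on single spaces and drop empty tokens. *)
let whitespace_tokeniser (s : string) : string list =
  String.split_on_char ' ' s |> List.filter (fun tok -> tok <> "")

(* Assumed pipeline: tokenise, sanitise each token, expand with the strategy. *)
let index_keys (s : string) : string list =
  whitespace_tokeniser s
  |> List.map String.lowercase_ascii
  |> List.concat_map prefixes

let () =
  (* prints "a ab abc d de" *)
  index_keys "Abc de" |> String.concat " " |> print_endline
```

Functions of these shapes could presumably be passed as the optional arguments of empty, for example to swap the prefixing strategy for exact-match indexing with `fun tok -> [ tok ]`.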

val pp : Stdlib.Format.formatter -> t -> unit

Dumps the index, mainly for debugging or testing.
