package bio_io

A record type for Btab (Blast-tab) files

Overview

Each record represents a "hit", or one query-target pair as returned by the homology search tool.

Records of this kind are output by many homology search tools. The format follows BLAST's `outfmt 6`, but the field names match those used by MMseqs2.

Additionally, it handles cases in which query length and target length are included (in the style of mmseqs easy-search --format-mode 2).

API

type t
val sexp_of_t : t -> Sexplib0.Sexp.t

Creating records

val of_string : Base.string -> t

of_string s creates a new t from the tab-delimited string s. Raises an exception if s is malformed.

Valid strings are tab-delimited with either 12 or 14 fields.

The fields of the 12-field variant should be query, target, pident, alnlen, mismatch, gapopen, qstart, qend, tstart, tend, evalue, and bits, in that order. This is the style of `outfmt 6` from BLAST and related tools.

The 14-field variant should have the same 12 fields followed by qlen and tlen. It is like the output of mmseqs with --format-mode 2.
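The field-count rule above can be illustrated with a small standalone sketch. It does not call this library and is not its implementation: `layout`, `classify_line`, and the example lines are hypothetical names and made-up data, and the sketch uses only the OCaml standard library. It checks the number of tab-separated fields and nothing else.

```ocaml
(* Hypothetical stand-in, not the library's parser: it only checks whether a
   line has 12 or 14 tab-separated fields, as described above. *)
type layout = Twelve_fields | Fourteen_fields

let classify_line (line : string) : layout option =
  match List.length (String.split_on_char '\t' line) with
  | 12 -> Some Twelve_fields
  | 14 -> Some Fourteen_fields
  | _ -> None

(* Made-up example data in the documented field order:
   query, target, pident, alnlen, mismatch, gapopen,
   qstart, qend, tstart, tend, evalue, bits. *)
let example_12 =
  String.concat "\t"
    [ "q1"; "t1"; "98.0"; "100"; "2"; "0"; "1"; "100"; "5"; "104"; "1e-50"; "180.0" ]

(* The 14-field variant appends qlen and tlen. *)
let example_14 = example_12 ^ "\t100\t250"
```

A line that `classify_line` maps to `None` is the kind of input for which of_string would be expected to raise.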

val to_string : t -> Base.string

to_string t creates a new "ready to print" tab-delimited string representation of t.

Accessing fields

val query : t -> Base.string

query t returns the query sequence of the search record.

val target : t -> Base.string

target t returns the target (a.k.a. subject) sequence of the search record.

val pident : t -> Base.float

pident t returns the percent identity of the hit.

Note that no processing is done. So if your software returns a percentage from 0 to 100, this value will be from 0 to 100. If your software returns a fraction from 0 to 1, then this value will range from 0 to 1.
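Because the value is passed through unchanged, downstream code may want to bring it onto a single scale. The following is a minimal sketch under the assumption that the caller knows which scale the search tool used; `pident_scale` and `pident_as_fraction` are hypothetical helpers and are not part of this module.

```ocaml
(* Hypothetical helper: the caller states which scale the tool reported. *)
type pident_scale = Percent | Fraction

(* Convert a percent identity value to a fraction between 0 and 1. *)
let pident_as_fraction ~(scale : pident_scale) (pident : float) : float =
  match scale with
  | Percent -> pident /. 100.
  | Fraction -> pident
```

Passing the scale explicitly avoids guessing it from the value itself, which would be ambiguous for small numbers such as 0.9.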

val alnlen : t -> Base.int

alnlen t returns the length of the alignment.

val mismatch : t -> Base.int

mismatch t returns the number of mismatches.

val gapopen : t -> Base.int

gapopen t returns the number of gap openings.

val qstart : t -> Base.int

qstart t returns the start position of the alignment in the query sequence.

Note that no processing is done. If the input file has 1-based coordinates, then this will also have 1-based coordinates. This also holds for qend, tstart, and tend.
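If a different coordinate convention is needed, the conversion has to happen in user code. The sketch below assumes the input uses 1-based, inclusive coordinates and converts them to a 0-based, half-open interval; `zero_based_half_open` is a hypothetical helper, not a function of this module. It also orders the two positions first, since some tools report a start greater than the end for reverse-strand hits.

```ocaml
(* Hypothetical helper. Assumes 1-based, inclusive input coordinates and
   returns a 0-based, half-open interval (lo, hi). *)
let zero_based_half_open ~(start : int) ~(stop : int) : int * int =
  let lo = min start stop in
  let hi = max start stop in
  (lo - 1, hi)
```

With this convention the length of the aligned span is simply `hi - lo`.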

val qend : t -> Base.int

qend t returns the end position of the alignment in the query sequence.

val tstart : t -> Base.int

tstart t returns the start position of the alignment in the target sequence.

val tend : t -> Base.int

tend t returns the end position of the alignment in the target sequence.

val evalue : t -> Base.float

evalue t returns the expect value of the hit.

val bits : t -> Base.float

bits t returns the bit-score of the hit.

val qlen : t -> Base.int Base.option

qlen t returns the length of the query sequence if it was present in the input file.

val tlen : t -> Base.int Base.option

tlen t returns the length of the target sequence if it was present in the input file.
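Since qlen and tlen are optional, any value derived from them is naturally optional too. As a rough illustration, the sketch below computes alignment coverage from a start, an end, and an optional sequence length; `coverage` is a hypothetical helper that assumes 1-based, inclusive coordinates and is not part of this module.

```ocaml
(* Hypothetical helper. Returns the fraction of the sequence covered by the
   alignment, or None when the length is missing or not positive.
   Assumes 1-based, inclusive coordinates. *)
let coverage ~(start : int) ~(stop : int) ~(len : int option) : float option =
  match len with
  | Some n when n > 0 ->
    let aligned = abs (stop - start) + 1 in
    Some (float_of_int aligned /. float_of_int n)
  | Some _ | None -> None
```

For a 12-field record, where no length is available, a helper like this returns None rather than a misleading number.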

Parsed records

module Parsed : sig ... end

A fully parsed Btab record.

val parse : t -> Parsed.t

parse t parses the Btab.Record.t into Btab.Record.Parsed.t.
