package bio_io


Module Btab.Record

A record type for Btab (Blast-tab) files

Overview

Each record represents a "hit", or one query-target pair as returned by the homology search tool.

These records are output by many homology search tools. The format follows BLAST's `outfmt 6`, but the field names match those used by MMseqs2.

Additionally, it handles cases in which query length and target length are included (in the style of mmseqs easy-search --format-mode 2).

API

type t
val sexp_of_t : t -> Sexplib0.Sexp.t

Creating records

val of_string : Base.string -> t

of_string s creates a new t from the tab-delimited string s. Raises if the format of s is bad.

Valid strings are tab-delimited with either 12 or 14 fields.

The fields of the 12-field variant should be query, target, pident, alnlen, mismatch, gapopen, qstart, qend, tstart, tend, evalue, and bits, in that order. This is the style of `outfmt 6` from BLAST and related tools.

The 14-field variant should have the same 12 fields followed by qlen and tlen. It is like the output of mmseqs with --format-mode 2.
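The field-count rule above can be illustrated with a small standalone sketch. It uses only the OCaml standard library and is not the library's own implementation; the names layout, layout_of_line, example_12, and example_14 are hypothetical, and the sample values are made up for illustration.

```ocaml
(* Hypothetical helper, not part of bio_io: classify a Btab line by its
   number of tab-separated fields. *)
type layout = Twelve_fields | Fourteen_fields

let layout_of_line (line : string) : (layout, string) result =
  match List.length (String.split_on_char '\t' line) with
  | 12 -> Ok Twelve_fields
  | 14 -> Ok Fourteen_fields
  | n -> Error (Printf.sprintf "expected 12 or 14 fields, got %d" n)

(* query, target, pident, alnlen, mismatch, gapopen,
   qstart, qend, tstart, tend, evalue, bits *)
let example_12 = "q1\tt1\t95.5\t100\t4\t1\t1\t100\t5\t104\t1e-50\t180.0"

(* The same line with qlen and tlen appended. *)
let example_14 = example_12 ^ "\t120\t150"
```

Unlike of_string, which raises on bad input, this sketch returns a result so that the caller decides how to handle malformed lines.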

val to_string : t -> Base.string

to_string t creates a new "ready to print" tab-delimited string representation of t.

Accessing fields

val query : t -> Base.string

query t returns the query sequence of the search record.

val target : t -> Base.string

target t returns the target (a.k.a. subject) sequence of the search record.

val pident : t -> Base.float

pident t returns the percent identity of the hit.

Note that no processing is done. So if your software returns a percentage from 0 to 100, this value will be from 0 to 100. If your software returns a fraction from 0 to 1, then this value will range from 0 to 1.
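Because the value is passed through unchanged, downstream code has to know which scale the search tool used. One possible way to make that explicit is sketched below; pident_scale and pident_to_fraction are hypothetical names, not part of bio_io, and the caller is assumed to know the scale from the tool's settings.

```ocaml
(* Hypothetical helper, not part of bio_io: the caller states which scale
   the search tool used, and the value is converted to a 0-1 fraction. *)
type pident_scale = Percent | Fraction

let pident_to_fraction ~(scale : pident_scale) (pident : float) : float =
  match scale with
  | Percent -> pident /. 100.0
  | Fraction -> pident
```

Passing the scale explicitly avoids guessing from the value itself, which would be ambiguous for hits at or below 1% identity.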

val alnlen : t -> Base.int

alnlen t returns the length of the alignment.

val mismatch : t -> Base.int

mismatch t returns the number of mismatches.

val gapopen : t -> Base.int

gapopen t returns the number of gap openings.

val qstart : t -> Base.int

qstart t returns the start position of the alignment in the query sequence.

Note that no processing is done. If the input file has 1-based coordinates, then this will also have 1-based coordinates. This also holds for qend, tstart, and tend.
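If the input uses 1-based, inclusive coordinates, a caller that needs 0-based, half-open intervals has to convert them itself. A minimal sketch follows, assuming that convention; zero_based_half_open is a hypothetical helper, not part of bio_io. It also orders the two ends, on the assumption that some tools report start greater than end for reverse-strand hits.

```ocaml
(* Hypothetical helper, not part of bio_io. Assumes the inputs are
   1-based, inclusive coordinates. Returns a 0-based, half-open interval
   (lo, hi), with the two ends ordered so that lo <= hi. *)
let zero_based_half_open ~(start : int) ~(stop : int) : int * int =
  let lo = min start stop and hi = max start stop in
  (lo - 1, hi)
```

With this representation, the number of covered positions is simply hi - lo.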

val qend : t -> Base.int

qend t returns the end position of the alignment in the query sequence.

val tstart : t -> Base.int

tstart t returns the start position of the alignment in the target sequence.

val tend : t -> Base.int

tend t returns the end position of the alignment in the target sequence.

val evalue : t -> Base.float

evalue t returns the expect value of the hit.

val bits : t -> Base.float

bits t returns the bit-score of the hit.

qlen t returns the length of the query sequence if it was present in the input file.

tlen t returns the length of the target sequence if it was present in the input file.
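Since the lengths are only available for 14-field input, anything derived from them has to handle their absence. The sketch below computes query coverage under the assumption that the length is exposed as an int option and that coordinates are 1-based and inclusive; query_coverage is a hypothetical helper, not part of bio_io.

```ocaml
(* Hypothetical helper, not part of bio_io. Assumes qlen is an int option
   (None when the input had only 12 fields) and that qstart and qend are
   1-based, inclusive coordinates. Returns the fraction of the query
   covered by the alignment, or None when the length is unavailable. *)
let query_coverage ~(qstart : int) ~(qend : int) ~(qlen : int option)
    : float option =
  match qlen with
  | Some len when len > 0 ->
    let aligned = abs (qend - qstart) + 1 in
    Some (float_of_int aligned /. float_of_int len)
  | _ -> None
```

The same calculation applies to target coverage with tstart, tend, and tlen.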

Parsed records

module Parsed : sig ... end

A fully parsed Btab record.

val parse : t -> Parsed.t

parse t parses the Btab.Record.t into Btab.Record.Parsed.t.
