FgenesV
Trained Pattern/Markov chain-based viral gene prediction
The FgenesV algorithm is based on pattern recognition of
different types of signals and on Markov chain models of coding regions.
The optimal combination of these features is then found by dynamic programming,
and a set of gene models is constructed along the given sequence.
FgenesV is the fastest ab initio viral gene prediction program
available.
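The dynamic programming step can be illustrated with a minimal sketch. It is not the FgenesV implementation: it assumes each candidate gene has already been given a single combined score, and that genes on one strand must not overlap (real viral genomes often contain overlapping genes). The function name `best_gene_set` and the tuple layout are hypothetical.

```python
from bisect import bisect_right

def best_gene_set(candidates):
    """Pick the highest-scoring set of non-overlapping candidates.

    candidates: list of (start, end, score), coordinates inclusive.
    Returns (total_score, chosen candidates in genome order).
    """
    cands = sorted(candidates, key=lambda c: c[1])  # order by end coordinate
    ends = [c[1] for c in cands]
    n = len(cands)
    best = [0.0] * (n + 1)   # best[i]: best total using the first i candidates
    take = [False] * (n + 1) # take[i]: whether candidate i is used in best[i]
    prev = [0] * (n + 1)     # prev[i]: how many earlier candidates end before i starts
    for i, (start, end, score) in enumerate(cands, 1):
        # Earlier candidates whose end lies strictly before this start.
        p = bisect_right(ends, start - 1, 0, i - 1)
        prev[i] = p
        if best[p] + score > best[i - 1]:
            best[i] = best[p] + score
            take[i] = True
        else:
            best[i] = best[i - 1]
    # Trace back through the table to recover the chosen candidates.
    chosen = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(cands[i - 1])
            i = prev[i]
        else:
            i -= 1
    return best[n], chosen[::-1]

# Hypothetical candidates: (start, end, combined score).
example = [(1, 300, 5.0), (250, 600, 8.0), (610, 900, 4.0), (320, 580, 2.0)]
total, genes = best_gene_set(example)
```

In this example the candidate at 250..600 is preferred over the pair 1..300 and 320..580 because its score alone (8.0) exceeds their sum (7.0). Candidates with a negative score are never selected.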
We developed a new FgenesV-Annotator script
that finds similar proteins in public databases and annotates the predicted
genes. This script can also identify low-scoring genes if they have
a known homologous protein.
As an example of using FgenesV, the annotation of
the SARS coronavirus TOR2 genome is presented:
Annotation of the
complete genome of the SARS-associated coronavirus by the FgenesV-Annotator script.
There are two variants of the viral gene prediction program. FgenesV0, which is suited for small (<10 kb) genomes, uses generic parameters of coding regions, while FgenesV learns genome-specific parameters using the viral genome sequence as input.
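How FgenesV estimates its parameters is not described here. As a rough illustration of what "parameters of coding regions" can mean for a Markov chain model, the sketch below counts base transitions in a set of training sequences and scores a candidate region by its log-odds against a background model. The function names, the model order, the 3-periodic option, and the add-one smoothing are all assumptions made for this example; input is assumed to be uppercase A/C/G/T.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"

def train_markov(seqs, order=2, periodic=True):
    """Count base occurrences given the previous `order` bases.

    With periodic=True the counts are kept separately for each of the
    three codon positions, a common choice for coding-region models.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in seqs:
        for i in range(order, len(seq)):
            phase = i % 3 if periodic else 0
            counts[(phase, seq[i - order:i])][seq[i]] += 1
    return {"order": order, "periodic": periodic, "counts": counts}

def log_prob(seq, model):
    """Log-probability of seq under the model, with add-one smoothing."""
    order, counts = model["order"], model["counts"]
    total = 0.0
    for i in range(order, len(seq)):
        phase = i % 3 if model["periodic"] else 0
        ctx = counts.get((phase, seq[i - order:i]), {})
        seen = sum(ctx.values())
        total += math.log((ctx.get(seq[i], 0) + 1) / (seen + len(ALPHABET)))
    return total

def coding_score(seq, coding_model, background_model):
    """Positive values mean seq looks more like the coding training set."""
    return log_prob(seq, coding_model) - log_prob(seq, background_model)

# Toy training data, purely illustrative.
coding = train_markov(["ATG" + "GCT" * 20 + "TAA"], order=2, periodic=True)
background = train_markov(["ACGT" * 20], order=2, periodic=False)
score = coding_score("GCT" * 5, coding, background)
```

Under this sketch, "generic" parameters would correspond to a model trained once on many genomes, while "genome-specific" parameters would correspond to a model trained on regions taken from the input genome itself.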
FgenesV predicts all intronless viral genes. To find the small group of genes that contain introns (normally alternative structures of intronless variants), standard eukaryotic gene-finding programs, such as Fgenesh, can be used in addition to FgenesV.
As additional parameters, you can choose the Linear or Circular form of your virus and select an alternative genetic code (the Standard code is the default): the Bacterial and Plant Plastid Code (transl_table=11), or the Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code (transl_table=4).
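To see why these two options matter, consider a simple open reading frame scan. The sketch below is not part of FgenesV; `find_orfs` and its arguments are hypothetical. It makes two simplifying assumptions: only ATG is treated as a start codon (the alternative start codons of tables 4 and 11 are ignored), and only the forward strand is scanned. Tables 1 and 11 share the stop codons TAA, TAG and TGA, while in table 4 TGA encodes tryptophan, so ORFs can extend through it. For a circular genome an ORF may run across the sequence origin.

```python
STOP_CODONS = {
    1: {"TAA", "TAG", "TGA"},   # Standard code
    11: {"TAA", "TAG", "TGA"},  # Bacterial and Plant Plastid Code
    4: {"TAA", "TAG"},          # table 4: TGA encodes Trp, not stop
}

def find_orfs(genome, table=1, circular=False, min_codons=30):
    """Return sorted (start, end) pairs of forward-strand ATG..stop ORFs.

    start is 0-based and end is exclusive. For a circular genome, end may
    exceed len(genome) when the ORF wraps around the origin. For each stop
    codon only the longest ORF is kept.
    """
    genome = genome.upper()
    n = len(genome)
    stops = STOP_CODONS[table]
    # Doubling the sequence lets a scan continue past the origin.
    text = genome * 2 if circular else genome
    longest = {}  # stop position (mod n) -> (start, end)
    for start in range(n):
        if text[start:start + 3] != "ATG":
            continue
        # An ORF may not be longer than one full turn of the genome.
        limit = start + n if circular else n
        for pos in range(start + 3, limit - 2, 3):
            if text[pos:pos + 3] in stops:
                end = pos + 3
                codons = (end - start) // 3 - 1  # stop codon not counted
                key = pos % n
                old = longest.get(key)
                if codons >= min_codons and (old is None or end - start > old[1] - old[0]):
                    longest[key] = (start, end)
                break
    return sorted(longest.values())

# An in-frame TGA ends the ORF under table 1 but not under table 4.
seq = "ATG" + "GCT" * 3 + "TGA" + "GCT" * 3 + "TAA"
orfs_standard = find_orfs(seq, table=1, min_codons=2)
orfs_table4 = find_orfs(seq, table=4, min_codons=2)
```

With a circular genome such as `"GCTGCTTAACCATGGCTGCT"`, the same function called with `circular=True` reports an ORF that starts near the end of the sequence and finishes after the origin, which a linear scan would miss.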