TandemRep


Input
  Sequences set   Source file with nucleotide sequences in multiFASTA format. Maximum file size is 1 GB.
  Base   Select one of the configuration files:
    Normal - default configuration
    Sensitive - more sensitive configuration, resulting in a higher masking percentage
    Rough - coarser configuration, resulting in a lower masking percentage
Output
  Result   Name of the output file
  Format   Result presentation mode. Examples:
  • Output list of tandem repeat regions
    
    >c20
    Masked regions:
    p1:  96       p2: 127      l: 31        chain(+) [Tandem Repeat]
    p1: 240       p2: 262      l: 22        chain(+) [Tandem Repeat]
    p1: 277       p2: 322      l: 45        chain(+) [Tandem Repeat]
    
    p1: - start position of the tandem region
    p2: - end position of the tandem region
    l: - length of the tandem region
    chain(+) - chain direction
  • Output sequence, masked letters replaced with N
    
    >c20
    CGGTGGCGGCAGCCGGCTCAAGCCCGGGCCGCAGCTGCCTGGCCGCGGGGGCCGCCGAGCAGCGGGAGGGCCTTTGGGGG
    GCGGGGCGGCGGCGCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTGCTGGAAGTGCGCCTGGTCGAGACCCCGGGG
    CGGGAGCTGTGGAGGATGGTCCCGGCGGGACGGGCCGCTCGGGGACAAGCGGAGCGCGCCCAAGGGCCGTCGGGCGAGGG
    NNNNNNNNNNNNNNNNNNNNNNTCCCCGACACCGTCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    NTGAGGAAGGTGAAGAGGAGACGGTGGCGTCGGGGGAGGAGTCGCTGGGCTTTCTGTCCAGGCTGCCCCCTGGCCCGGCC
    
  • Output sequence, masked letters in upper case
    
    >c20
    cggtggcggcagccggctcaagcccgggccgcagctgcctggccgcgggggccgccgagcagcgggagggcctttggggg
    gcggggcggcggcgccCGAGGACGACGATGAAGACGACGACGAGGAGctgctggaagtgcgcctggtcgagaccccgggg
    cgggagctgtggaggatggtcccggcgggacgggccgctcggggacaagcggagcgcgcccaagggccgtcgggcgaggg
    GGCGGCCGCCGCCGCCGCCGCCtccccgacaccgtcGGAGGACGAGGAGCCGGAGGAAGAGGAGGAGGAGGCGGCAGCGG
    Ctgaggaaggtgaagaggagacggtggcgtcgggggaggagtcgctgggctttctgtccaggctgccccctggcccggcc
    
  • Output repeats during calculation (regions may overlap)
    
    >seq:1  beg:96  len:31
    CGAGGACGACGATGAAGACGACGACGAGGAG
    >seq:1  beg:240  len:22
    GGCGGCCGCCGCCGCCGCCGCC
    >seq:1  beg:277  len:45
    GGAGGACGAGGAGCCGGAGGAAGAGGAGGAGGAGGCGGCAGCGGC
    
    seq:1 - sequence number in input file
    beg: - start position of the Tandem Repeat
    len: - length of the Tandem Repeat
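
The region list and the case-masked sequence shown above are both easy to post-process. The following Python sketch is not part of TandemRep; it only illustrates one way to read the region list into records and to turn the upper-case masking into N masking. The line pattern is inferred from the sample listing alone, and the names Region, parse_region_list and case_mask_to_n are hypothetical. The values of p1 and p2 are kept exactly as printed, with no assumption about whether positions are counted from 0 or from 1.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    start: int   # p1, as printed
    end: int     # p2, as printed
    length: int  # l, as printed
    chain: str   # "+" or "-"
    label: str   # text inside the square brackets


# Assumed line shape, inferred from the sample listing only.
REGION_RE = re.compile(
    r"p1:\s*(\d+)\s+p2:\s*(\d+)\s+l:\s*(\d+)\s+chain\(([+-])\)\s*\[([^\]]*)\]"
)


def parse_region_list(text: str) -> dict[str, list[Region]]:
    """Group region lines under the '>name' header that precedes them."""
    regions: dict[str, list[Region]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(">"):
            name = line[1:].strip()
            current = name.split()[0] if name else ""
            regions[current] = []
            continue
        match = REGION_RE.search(line)
        # Legend lines such as "p1: - start position ..." carry no digits
        # after "p1:" and therefore do not match.
        if match and current is not None:
            p1, p2, length, chain, label = match.groups()
            regions[current].append(
                Region(int(p1), int(p2), int(length), chain, label)
            )
    return regions


def case_mask_to_n(sequence: str) -> str:
    """Convert case masking (masked letters upper case, the rest lower
    case) into N masking (masked letters N, the rest upper case)."""
    return "".join("N" if ch.isupper() else ch.upper() for ch in sequence)


if __name__ == "__main__":
    listing = """>c20
Masked regions:
p1:  96       p2: 127      l: 31        chain(+) [Tandem Repeat]
"""
    for name, found in parse_region_list(listing).items():
        for region in found:
            print(name, region.start, region.end, region.length)
    print(case_mask_to_n("gcgccCGAGGAGctg"))
```

Keeping the parsed values untouched makes it possible to check each record afterwards, for example that l equals p2 minus p1, as it does in every row of the sample.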
Options
  Minimal length   Lowest acceptable tandem region length
  Maximum diplet distance   Maximum acceptable difference in diplet composition between two windows in the tested region (from 0 to 200)
  Maximum unit size   Maximum acceptable tandem unit size
  Smith-Waterman identity   Minimum allowed identity between repeated units in the Smith-Waterman alignment
  Strict extending   Extend tandem regions under stricter conditions for shorter units and for regions of low monoplet complexity
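
This page does not define how the diplet distance is computed. As an illustration only, the Python sketch below shows one plausible reading that happens to produce values in the stated range of 0 to 200: the sum of absolute differences between the diplet (dinucleotide) percentage frequencies of two windows. This is an assumption and not TandemRep's documented formula, and the function names are hypothetical.

```python
from collections import Counter


def diplet_percentages(window: str) -> dict[str, float]:
    """Percentage frequency of each overlapping diplet in a window."""
    window = window.upper()
    pairs = [window[i:i + 2] for i in range(len(window) - 1)]
    if not pairs:
        return {}
    counts = Counter(pairs)
    return {pair: 100.0 * n / len(pairs) for pair, n in counts.items()}


def diplet_distance(window_a: str, window_b: str) -> float:
    """Assumed metric: sum of absolute percentage differences.

    Identical compositions give 0; windows that share no diplet give 200.
    """
    pa = diplet_percentages(window_a)
    pb = diplet_percentages(window_b)
    return sum(abs(pa.get(d, 0.0) - pb.get(d, 0.0)) for d in set(pa) | set(pb))


if __name__ == "__main__":
    print(diplet_distance("GACGACGACGAC", "GACGACGACGAC"))
    print(diplet_distance("AAAA", "CCCC"))
```

Under this reading, a low maximum admits only windows with nearly the same diplet composition, while a value near 200 accepts almost any pair of windows.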