SeqMatchSW-P
The program implements the Smith-Waterman algorithm for local sequence alignment, which finds similar regions between two protein sequences. The approach is described in T.F. Smith and M.S. Waterman, "Identification of Common Molecular Subsequences", Journal of Molecular Biology, 147:195-197, 1981. The algorithm is a variation of the Needleman-Wunsch dynamic programming algorithm. It is guaranteed to find the optimal local alignment with respect to the scoring system being used (which includes the substitution matrix and the gap-scoring scheme).
The program is provided with a viewer.
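The dynamic programming idea behind the Smith-Waterman algorithm can be illustrated with a minimal Python sketch. It is not the SeqMatchSW-P implementation: the `match`, `mismatch` and `gap` values are illustrative assumptions that stand in for a real substitution matrix and gap-scoring scheme, and the function name `smith_waterman` is hypothetical.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment.

    Scoring is an assumption for illustration: a flat match/mismatch score
    and a linear gap penalty instead of a substitution matrix.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 option lets an alignment restart anywhere (local alignment).
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("ZZGLSDZZ", "QQGLSDQQ"))
```

Resetting negative scores to zero is what distinguishes this from the global Needleman-Wunsch recurrence: the alignment may begin and end anywhere in either sequence.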
L:153 Sequence MYOGLOBIN MAP TURTLE vs. 19 Base sequences [C:\Documents and Settings\My Documents\MolQuestWorkSpace\example_data\SeqMatchSW-P\seq1.set.fa].
Total 19 sequences produce 19 significant alignment(s).
[DD] 7, S: 28.714, L: 153 MYOGLOBIN CHICKEN
[DD] 17, S: 27.56, L: 153 MYOGLOBIN HUMAN
[DD] 9, S: 27.482, L: 153 MYOGLOBIN N.AMERICAN OPOSSUM
[DD] 5, S: 26.354, L: 153 MYOGLOBIN SADDLEBACK DOLPHIN
[DD] 8, S: 12.825, L: 146 HEMOGLOBIN BETA CHICKEN
[DD] 13, S: 12.564, L: 141 HEMOGLOBIN ALPHA NILE CROCODILE
[DD] 6, S: 12.323, L: 140 HEMOGLOBIN BETA EDIBLE FROG
[DD] 10, S: 12.259, L: 146 HEMOGLOBIN BETA N.AMERICAN OPOSSUM
[DD] 19, S: 12.226, L: 146 HEMOGLOBIN BETA HUMAN
[DD] 11, S: 11.865, L: 141 HEMOGLOBIN ALPHA BULLFROG
[DD] 14, S: 11.713, L: 141 HEMOGLOBIN ALPHA OSTRICH
[DD] 15, S: 11.353, L: 141 HEMOGLOBIN ALPHA EASTERN GRAY KANGAROO
[DD] 18, S: 11.235, L: 141 HEMOGLOBIN ALPHA HUMAN
[DD] 16, S: 10.87, L: 142 HEMOGLOBIN ALPHA ABYSSINIAN HYRAX
[DD] 12, S: 10.849, L: 146 HEMOGLOBIN BETA NILE CROCODILE
[DD] 2, S: 8.2676, L: 161 HEMOGLOBIN I.PARASPONIA ANDERSONII
[DD] 1, S: 7.6599, L: 146 HEMOGLOBIN VITREOSCILLA SP.
[DD] 3, S: 6.1534, L: 153 LEGHEMOGLOBIN I. YELLOW LUPIN
[DD] 4, S: 5.4138, L: 143 LEGHEMOGLOBIN I.BROAD BEAN .
****************************************************************************
[DD] Sequence: 7( 1), S: 28.714, L: 153 MYOGLOBIN CHICKEN
Summ of block lengths: 153, Alignment bounds:
On first sequence: start 1, end 153, length 153
On second sequence: start 1, end 153, length 153
Block of alignment: 1
1 P: 1 1 L: 153, G: 84.27, W: 874000, S:28.7142
  1 GLSDDEWHHVLGIWAKVEPDLSAHGQEVIIRLFQVHPETQERFAKFKNLKTIDELRSSEE
    ||||2||44||0||2|||1|552||4||55|||40||||05||0|||1|||05|662||5
  1 GLSDQEWQQVLTIWGKVEADIAGHGHEVLMRLFHDHPETLDRFDKFKGLKTPNEMKGSED
 61 VKKHGTTVLTALGRILKLKNNHEPELKPLAESHATKHKIPVKYLEFICEIIVKVIAEKHP
    4||||2||||1||6|||0|12||15|||||65|||||||||||||||1|7|7|||||||1
 61 LKKHGATVLTQLGKILKQKGQHESDLKPLAQTHATKHKIPVKYLEFISEVIIKVIAEKHA
121 SDFGADSQAAMRKALELFRNDMASKYKEFGFQG
    5||||||||||6|||||||||||||||||||||
121 ADFGADSQAAMKKALELFRNDMASKYKEFGFQG
[DD] Sequence: 17( 1), S: 27.56, L: 153 MYOGLOBIN HUMAN
Summ of block lengths: 153, Alignment bounds:
On first sequence: start 1, end 153, length 153
On second sequence: start 1, end 153, length 153
Block of alignment: 1
1 P: 1 1 L: 153, G: 81.13, W: 830000, S:27.5604
  1 GLSDDEWHHVLGIWAKVEPDLSAHGQEVIIRLFQVHPETQERFAKFKNLKTIDELRSSEE
    ||||0||40||17|2|||1|512|||||5||||50||||0|6|0|||4||50||665||5
  1 GLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASED
 61 VKKHGTTVLTALGRILKLKNNHEPELKPLAESHATKHKIPVKYLEFICEIIVKVIAEKHP
    4||||2|||||||0|||0|14||1|5||||6||||||||||||||||1|0|75|512|||
 61 LKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKHP
121 SDFGADSQAAMRKALELFRNDMASKYKEFGFQG
    2|||||5|2||1|||||||2||||2|||4||||
121 GDFGADAQGAMNKALELFRKDMASNYKELGFQG
....
[DD] Sequence: 7( 1), S: 28.714, L: 153 MYOGLOBIN CHICKEN
[DD] | Has no meaning here; it is kept for output compatibility with nucleotide sequence alignment. |
Sequence: 7( 1) | Ordinal number of the sequence in the query set submitted for alignment. The number in parentheses is the ordinal number of this alignment for that sequence (used when the sequence produced more than one alignment). For example, 4( 5) denotes the fifth alignment of the fourth sequence in the set. |
S | Score of this alignment. |
L | Length of this query sequence. |
MYOGLOBIN CHICKEN | Name of this query sequence. |
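The header fields described above can be pulled out of a result line with a short script. The following Python sketch assumes the header always has exactly the layout shown in the example; the names `HEADER_RE` and `parse_alignment_header` are hypothetical and are not part of SeqMatchSW-P.

```python
import re

# Pattern for a line such as:
# [DD] Sequence: 7( 1), S: 28.714, L: 153 MYOGLOBIN CHICKEN
# Assumption: the layout never deviates from the documented example.
HEADER_RE = re.compile(
    r"\[DD\]\s+Sequence:\s*(\d+)\(\s*(\d+)\),"   # sequence number, alignment number
    r"\s*S:\s*([\d.]+),"                         # score of the alignment
    r"\s*L:\s*(\d+)\s+(.*)"                      # sequence length, sequence name
)


def parse_alignment_header(line):
    """Return the header fields as a dict, or None if the line does not match."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        return None
    seq_no, aln_no, score, length, name = m.groups()
    return {
        "sequence": int(seq_no),
        "alignment": int(aln_no),
        "score": float(score),
        "length": int(length),
        "name": name.strip(),
    }


if __name__ == "__main__":
    print(parse_alignment_header(
        "[DD] Sequence: 7( 1), S: 28.714, L: 153 MYOGLOBIN CHICKEN"))
```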
Summ of block lengths: 153, Alignment bounds:
On first sequence: start 1, end 153, length 153
On second sequence: start 1, end 153, length 153
length | Length of the region covered by the alignment in the target and query sequences, respectively. |
Block of alignment: 1
1 P: 1 1 L: 153, G: 84.27, W: 874000, S:28.7142
Block of alignment: 1 - the number of blocks. Each line below it corresponds to one block:
1 P: 1 1 L: 153, G: 84.27, W: 874000, S:28.7142
1 | Block number. |
P: 1 1 | Start positions of the similarity block in the target and query sequences, respectively. In this case the block starts at the first position of both sequences. |
L: 153 | Length of this similarity block. |
G: 84.27 | Homology of this similarity block. |
W: 874000 | Weight of this similarity block (the arithmetic sum of the symbol similarity values taken from the given similarity matrix). |
S:28.7142 | Score of this similarity block. |
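A block line can be split into the fields listed above in the same way. This Python sketch assumes the line always follows the layout of the example; `BLOCK_RE`, `Block` and `parse_block_line` are hypothetical names introduced for illustration only.

```python
import re
from typing import NamedTuple, Optional


class Block(NamedTuple):
    number: int        # block number
    target_start: int  # start position in the target sequence
    query_start: int   # start position in the query sequence
    length: int        # L: length of the similarity block
    homology: float    # G: homology of the block
    weight: int        # W: weight of the block
    score: float       # S: score of the block


# Pattern for a line such as:
# 1 P: 1 1 L: 153, G: 84.27, W: 874000, S:28.7142
# Assumption: W is always an integer and the field order is fixed.
BLOCK_RE = re.compile(
    r"(\d+)\s+P:\s*(\d+)\s+(\d+)\s+L:\s*(\d+),"
    r"\s*G:\s*([\d.]+),\s*W:\s*(\d+),\s*S:\s*([\d.]+)"
)


def parse_block_line(line: str) -> Optional[Block]:
    """Return the parsed block, or None if the line does not match."""
    m = BLOCK_RE.match(line.strip())
    if m is None:
        return None
    n, p_target, p_query, length, g, w, s = m.groups()
    return Block(int(n), int(p_target), int(p_query), int(length),
                 float(g), int(w), float(s))


if __name__ == "__main__":
    print(parse_block_line("1 P: 1 1 L: 153, G: 84.27, W: 874000, S:28.7142"))
```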
1 GLSDDEWHHVLGIWAKVEPDLSAHGQEVIIRLFQVHPETQERFAKFKNLKTIDELRSSEE
  ||||2||44||0||2|||1|552||4||55|||40||||05||0|||1|||05|662||5
1 GLSDQEWQQVLTIWGKVEADIAGHGHEVLMRLFHDHPETLDRFDKFKGLKTPNEMKGSED
Line 1 - The target sequence itself. Capital letters correspond to blocks of similarity, lower-case letters to regions that are not aligned.
Line 2 - Separator line. Separator line symbols: "|" - an exact match between the two symbols.
Digits indicate the degree of similarity between the two symbols. They range from 0 to 9: 0 - no similarity, 9 - maximal similarity.
Line 3 - The query sequence itself. Capital letters correspond to blocks of similarity, lower-case letters to regions that are not aligned.
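The three-line layout described above can be read column by column. The Python sketch below counts the exact matches and lists the positions where the separator carries a similarity digit. It assumes the position numbers have already been stripped, so that the three strings have equal length; the function name `summarize_separator` is hypothetical.

```python
def summarize_separator(target, separator, query):
    """Summarize one row of the alignment.

    Returns (identities, similarities), where identities is the number of
    "|" columns and similarities is a list of
    (position, target_symbol, query_symbol, degree) for every digit column.
    Positions are 1-based within the row.
    """
    if not (len(target) == len(separator) == len(query)):
        raise ValueError("the three lines must have equal length")

    identities = 0
    similarities = []
    for pos, (t, s, q) in enumerate(zip(target, separator, query), start=1):
        if s == "|":
            identities += 1          # exact match between the two symbols
        elif s.isdigit():
            similarities.append((pos, t, q, int(s)))  # degree 0..9
    return identities, similarities


if __name__ == "__main__":
    # First ten columns of the example alignment above.
    ident, sims = summarize_separator("GLSDDEWHHV", "||||2||44|", "GLSDQEWQQV")
    print(ident)  # number of "|" columns
    print(sims)   # non-identical columns with their similarity digit
```

For the first ten columns of the example this reports seven exact matches and three substituted positions (D/Q with degree 2 and H/Q with degree 4 twice).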