CysRec
The program predicts the SS-bonding states of cysteines and locates disulphide bridges in proteins.
Methodology
Procedure: The sequence is processed in steps.
Input Format
A FASTA-formatted sequence, split into lines of at most 80 positions, is accepted.
A specially prepared alignment with no gaps in the first sequence is also accepted.
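As an illustration of the line-length requirement, a sequence can be wrapped into FASTA lines of at most 80 positions before submission. This is a minimal sketch, not part of CysRec; the function name `to_fasta` and its parameters are hypothetical.

```python
def to_fasta(name: str, sequence: str, width: int = 80) -> str:
    """Return a FASTA record whose sequence lines are at most `width` long.

    `to_fasta` is a hypothetical helper for illustration only.
    """
    if not 1 <= width <= 80:
        raise ValueError("width must be between 1 and 80")
    # Drop any whitespace or line breaks already present in the sequence.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"
```

For a 170-residue sequence this gives a header line followed by lines of 80, 80 and 10 positions.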
Example of alignment:
T0129 5 182
MLISHSDLNQQLKSAGIGFNATELHGFLSGLLCGGLKDQSWLPLLYQFSN
---SYSDFSQQLKTAGIALSAAELHGFLTGLICGGIHDQSWQPLLFQFTN
-LPTYPSLALALSQQAVALTPAEMHGLISGMLCGGSKDNGWQTLVHDLTN
----YDEMNRFLNQQGAGLTPAEMHGLISGMICGGNNDSSWQPLLHDLTN
----YNEMNQYLNQQGTGLTPAEMHGLISGMICGGNDDSSWLPLLHDLTN
DNHAYPTGLVQPVTELYEQISQTLSDVEGFTFELGLTEDENVFTQADSLS
ENHAYPTALLQEVTQIQQHISKKLADIDGFDFELWLPENEDVFTRADALS
EGVAFPQALSLPLQQLHEATQEALEN-EGFMFQLLIPEGEDVFDRADALS
EGLAFGHELAQALRKMHAATSDALED-DGFLFQLYLPEDVSVFDRADALA
EGMAFGHELAQALRKMHSATSDALQD-DGFLFQLYLPDDVSVFDRADALA
DWANQFLLGIGLAQPELAKEKGEIGEAVDDLQDICQLGYDEDDNEEELAE
EWTNHFLLGLGLAQPKLDKEKGDIGEAIDDLHDICQLGYDESDDKEELSE
GWVNHFLLGLGMLQPKLAQVKDEVGEAIDDLRNIAQLGYDEDEDQEELAQ
GWVNHFLLGLGVTQPKLDKVTGETGEAIDDLRNIAQLGYDESEDQEELEM
GWVNHFLLGLGVTQPKLDKVTGETGEAIDDLRNIAQLGYDEDEDQEELEM
ALEEIIEYVRTIAMLFYSHFNEGEIESKPVLH
ALEEIIEYVRTLACLLFTHFQPQLPEQKPVLH
SLEEVVEYVRVAAILCHIEFTQQKPTAKPTLH
SLEEIIEYVRVAALLCHDTFTRQQPTAKPTLH
SLEEIIEYVRVAALLCHDTFTHPQPTAKPTLH
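The layout of the example above can be checked before submission. The sketch below assumes, based only on that example, that the first three fields are the alignment name, the number of sequences and the alignment length, and that the remaining chunks are interleaved (one chunk per sequence, repeated block after block). The function name `parse_alignment` is hypothetical and not part of CysRec.

```python
from typing import List, Tuple


def parse_alignment(text: str) -> Tuple[str, List[str]]:
    """Parse an interleaved alignment and return (name, aligned rows).

    Assumed layout (inferred from the example, not from a specification):
    "<name> <n_sequences> <length>" followed by whitespace-separated chunks,
    cycling through the sequences in order.
    """
    tokens = text.split()
    if len(tokens) < 4:
        raise ValueError("header and at least one chunk are required")
    name, n_seqs, length = tokens[0], int(tokens[1]), int(tokens[2])
    chunks = tokens[3:]
    if n_seqs <= 0 or len(chunks) % n_seqs != 0:
        raise ValueError("chunk count does not match the number of sequences")
    rows = [""] * n_seqs
    for i, chunk in enumerate(chunks):
        rows[i % n_seqs] += chunk  # chunk i belongs to sequence i mod n
    for row in rows:
        if len(row) != length:
            raise ValueError("row length differs from the declared length")
    if "-" in rows[0]:
        raise ValueError("the first sequence must not contain gaps")
    return name, rows
```

Applied to the example, this would return the name `T0129` and five rows of 182 positions each (three blocks of 50 plus one of 32).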
Output Format
Query sequence
Positions of cysteines which are predicted to form disulphide bonds, matrix of pair scores, results of SS-bonding state predictions, and the most probable pattern of pairs.
Example of output:
CYS_REC Version 2. Recognition of SS-bounded cysteines
>1AC5_ length=483
LPSSEEYKVAYELLPGLSEVPDPSNIPQMHAGHIPLRSEDADEQDSSDLEYFFWKFTNNDSNGNVDRPLIIWLNGGPGCSS
MDGALVESGPFRVNSDGKLYLNEGSWISKGDLLFIDQPTGTGFSVEQNKDEGKIDKNKFDEDLEDVTKHFMDFLENYFKIF
PEDLTRKIILSGESYAGQYIPFFANAILNHNKFSKIDGDTYDLKALLIGNGWIDPNTQSLSYLPFAMEKKLIDESNPNFKH
LTNAHENCQNLINSASTDEAAHFSYQECENILNLLLSYTRESSQKGTADCLNMYNFNLKDSYPSCGMNWPKDISFVSKFFS
TPGVIDSLHLDSDKIDHWKECTNSVGTKLSNPISKPSIHLLPGLLESGIEIVLFNGDKDLICNNKGVLDTIDNLKWGGIKG
FSDDAVSFDWIHKSKSTDDSEEFSGYVKYDRNLTFVSVYNASHMVPFDKSLVSRGIVDIYSNDVMIIDNNGKNVMITT
7 cysteines are found in positions: 79 251 271 293 308 345 386
Matrix of pair scores
POS:    79   251   271   293   308   345
 79:  -999   -21    -4     8    18   143
251:   -21  -999   155     7    -3   -12
271:    -4   155  -999    13   -20   -15
293:     8     7    13  -999   133    -8
308:    18    -3   -20   133  -999    -7
345:   143   -12   -15    -8    -7  -999
CYS 79 is SS-bounded Score= 56.7
CYS 251 is SS-bounded Score= 53.2
CYS 271 is SS-bounded Score= 47.0
CYS 293 is SS-bounded Score= 68.1
CYS 308 is SS-bounded Score= 63.9
CYS 345 is SS-bounded Score= 60.7
CYS 386 is not SS-bounded Score= -70.7
The most probable pattern of pairs: 79-345, 251-271, 293-308,
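This page does not state how CysRec derives the pattern of pairs from the matrix. One plausible reading, shown here purely as an assumption, is that the pattern is the set of disjoint pairs with the highest total pair score. The sketch below searches all pairings exhaustively (the function name `best_pairing` is hypothetical) and uses the matrix from the example output; under that assumption it reproduces the reported pattern.

```python
from typing import List, Sequence, Tuple

# Positions and pair scores copied from the example output above.
POSITIONS = [79, 251, 271, 293, 308, 345]
MATRIX = [
    [-999, -21, -4, 8, 18, 143],
    [-21, -999, 155, 7, -3, -12],
    [-4, 155, -999, 13, -20, -15],
    [8, 7, 13, -999, 133, -8],
    [18, -3, -20, 133, -999, -7],
    [143, -12, -15, -8, -7, -999],
]


def best_pairing(
    indices: List[int], matrix: Sequence[Sequence[float]]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Exhaustively find the complete pairing with the highest total score.

    Assumption for illustration: this is not confirmed to be CysRec's method.
    """
    if len(indices) % 2:
        raise ValueError("an even number of cysteines is required")
    if not indices:
        return 0, []
    first, rest = indices[0], indices[1:]
    best_total, best_pairs = None, []
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        sub_total, sub_pairs = best_pairing(remaining, matrix)
        total = matrix[first][partner] + sub_total
        if best_total is None or total > best_total:
            best_total, best_pairs = total, [(first, partner)] + sub_pairs
    return best_total, best_pairs


total, pairs = best_pairing(list(range(len(POSITIONS))), MATRIX)
pattern = sorted((POSITIONS[i], POSITIONS[j]) for i, j in pairs)
print(", ".join(f"{a}-{b}" for a, b in pattern))  # 79-345, 251-271, 293-308
```

Exhaustive search is practical only for a handful of cysteines, since the number of pairings grows very quickly; for larger inputs a maximum-weight matching algorithm would be the usual substitute.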
Performance: 3000 positive and 3000 negative examples (i.e. fragments of ±10 positions surrounding bonded and non-bonded cysteines) were prepared from PDB sequences that were not used in training. On this control set, the accuracy of SS-bonding state recognition by the combined function was ~90%.