SplM
Prediction of splice sites in human DNA sequences
The program was developed by Salamov A. and Solovyev V. It locates potential splice site positions using 5 weight matrices for donor sites and, for acceptor splice sites, a model that combines dinucleotide composition with a weight matrix. The program also predicts potential GC donor sites and non-standard splice sites such as AT-AC.
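As a rough illustration of how a weight matrix can score a candidate site, the sketch below converts per-position nucleotide frequencies into log-odds weights and sums them over a sliding window. The frequencies in `TOY_FREQS`, the uniform background, and the function names are all hypothetical and chosen only for illustration; they are not the matrices or the scoring scale used by SplM.

```python
import math

# Hypothetical per-position nucleotide frequencies for a 4-base window
# around a GT donor dinucleotide. These are NOT the SplM matrices.
TOY_FREQS = [
    {"A": 0.30, "C": 0.20, "G": 0.40, "T": 0.10},
    {"A": 0.01, "C": 0.01, "G": 0.97, "T": 0.01},  # G of the GT dinucleotide
    {"A": 0.01, "C": 0.01, "G": 0.01, "T": 0.97},  # T of the GT dinucleotide
    {"A": 0.50, "C": 0.10, "G": 0.30, "T": 0.10},
]
BACKGROUND = 0.25  # assumed uniform background frequency


def log_odds_matrix(freqs, background=BACKGROUND):
    """Convert per-position frequencies into log2 odds weights."""
    return [{base: math.log2(p / background) for base, p in column.items()}
            for column in freqs]


def score_window(window, weights):
    """Sum the weight of the observed base at each matrix position."""
    if len(window) != len(weights):
        raise ValueError("window and matrix lengths differ")
    return sum(column[base] for column, base in zip(weights, window))


def scan(sequence, weights):
    """Return (offset, score) for every full-length window in the sequence."""
    width = len(weights)
    return [(i, score_window(sequence[i:i + width], weights))
            for i in range(len(sequence) - width + 1)]
```

Calling `scan("CCAGGTAAGT", log_odds_matrix(TOY_FREQS))` scores each of the seven 4-base windows; with these toy frequencies the window containing the first GT gets the highest score. Bases other than A, C, G and T (for example N) are not handled in this sketch and raise a `KeyError`.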
The program does not exclude splice sites that lie close to sites predicted with higher scores, nor sites on different chains. Users can post-process the predictions based on the reported scores. It is designed to be useful for analyzing alternative splice variants and non-canonical splice sites. The program reports a much higher number of overpredicted sites than the Spl program.
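One possible form of such score-based post-processing is sketched below: sites are visited from the highest score down, and a site is dropped when it falls within a fixed distance of an already kept site. The `Site` type, the `suppress_neighbours` function, the 10-base window, and the option to compare only sites on the same chain are assumptions made for this example, not part of SplM.

```python
from typing import NamedTuple


class Site(NamedTuple):
    position: int
    score: int
    chain: str  # "+" or "-"
    kind: str   # site type as reported, e.g. "GT", "GC", "AG", "AC"


def suppress_neighbours(sites, window=10, same_chain_only=True):
    """Greedily keep the best-scoring sites, dropping weaker ones nearby."""
    kept = []
    # Visit sites from highest to lowest score; ties are broken by position.
    for site in sorted(sites, key=lambda s: (-s.score, s.position)):
        clash = any(
            abs(site.position - k.position) <= window
            and (not same_chain_only or site.chain == k.chain)
            for k in kept
        )
        if not clash:
            kept.append(site)
    # Return the surviving sites in sequence order.
    return sorted(kept, key=lambda s: s.position)


# Four donor rows taken from the example output on this page.
donors = [
    Site(1077, 48, "+", "GT"),
    Site(1081, 18, "+", "GT"),
    Site(2930, 7, "-", "GT"),
    Site(2937, 83, "+", "GT"),
]
filtered = suppress_neighbours(donors, window=10)
```

With these rows, the site at 1081 is dropped in favour of the stronger site at 1077 on the same chain, while 2930 survives because it lies on the opposite chain from 2937. Passing `same_chain_only=False` would drop 2930 as well.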
For a description of this program, see:
Solovyev V.V. (2001) Statistical approaches in eukaryotic gene prediction. In Handbook of Statistical Genetics (eds. Balding D. et al.), John Wiley & Sons, Ltd., p. 83-127.
Example of output:
Splm: Matrix-based prediction of splice sites in Human sequences
-------------------------------------------------------------------
Parameters: -d 95 -a 95 -dGC 95 -nc 1 (non-st. consensus AT-AC)
Length of sequence 4500

Number of Donor sites: 33 Threshold: 95
Number  Position  Score  Chain  Type
     1       167     33      -    GT
     2       184     43      -    GC
     3       460     25      -    GT
     4       486     21      -    GC
     5       710     97      +    GT
     6      1077     48      +    GT
     7      1081     18      +    GT
     8      1181     75      -    GT
     9      1508      6      -    GT
    10      1567      7      +    GT
    11      1920     24      +    GT
    12      1925      9      +    GC
    13      1954      6      -    GT
    14      2179     36      -    GC
    15      2662      9      -    GC
    16      2691     45      +    GT
    17      2745     43      -    GC
    18      2906     18      +    GT
    19      2930      7      -    GT
    20      2937     83      +    GT
    21      3000      7      -    GT
    22      3006     14      -    GT
    23      3023     90      -    GT
    24      3041     29      -    GT
    25      3107     11      -    GT
    26      3174     46      +    GT
    27      3280      9      +    GC
    28      3290     12      -    GT
    29      3636      7      +    GC
    30      3999      8      -    GT
    31      4107     11      -    GT
    32      4156     51      -    GT
    33      4308     22      +    GT

Number of Acceptor sites: 80 Threshold: 95
     1        14      7      +    AG
     2       106      6      +    AG
     3       110     24      -    AG
     4       194      9      +    AG
     5       234      7      -    AG
     6       384      7      +    AG
     7       395      6      +    AG
     8       457     10      -    AG
     9       498     12      +    AG
    10       676      6      -    AG
    11       680     15      +    AG
    12       702     18      -    AG
    13       733      8      -    AG
    14       738     19      +    AG
    15       780     27      -    AG
    16       861     49      +    AG
    17       865     10      +    AG
    18       912     34      -    AG
    19       928      8      -    AG
    20       982      6      -    AG
    21      1033     24      +    AG
    22      1063      6      -    AG
    23      1080      7      +    AG
    24      1104      9      -    AG
    25      1384      8      -    AC
    26      1399     16      +    AG
    27      1432      7      -    AG
    28      1514      6      -    AG
    29      1751      9      +    AG
    30      1780     11      -    AG
    31      1809     14      -    AG
    32      2040      6      +    AG
    33      2072     13      +    AG
    34      2083     11      -    AG
    35      2120     29      -    AG
    36      2212     61      +    AG
    37      2219      8      +    AG
    38      2238     24      -    AG
    39      2258     18      -    AG
    40      2359     10      -    AG
    41      2404      7      -    AG
    42      2430      6      -    AG
    43      2453      8      -    AC
    44      2474     12      -    AG
    45      2508      9      -    AC
    46      2537      9      -    AG
    47      2576     94      +    AG
    48      2674      7      +    AG
    49      2691      9      -    AC
    50      2750      8      -    AG
    51      2755     33      +    AG
    52      2841     41      -    AG
    53      2902      6      +    AG
    54      2990      8      +    AG
    55      3045      8      +    AC
    56      3050      8      +    AG
    57      3085     10      -    AG
    58      3108     27      -    AG
    59      3185     14      -    AG
    60      3241     39      +    AG
    61      3267     23      -    AG
    62      3388      9      -    AG
    63      3451      8      -    AG
    64      3480      8      +    AG
    65      3677      6      -    AG
    66      3776     25      +    AG
    67      3825     13      -    AG
    68      3885      8      +    AC
    69      3996      8      -    AG
    70      4005      7      +    AG
    71      4125      9      +    AG
    72      4200     12      +    AG
    73      4252     29      +    AG
    74      4258      6      -    AG
    75      4280      6      -    AG
    76      4290     18      -    AG
    77      4334      9      +    AC
    78      4388     13      +    AG
    79      4449      8      +    AG
    80      4498     10      -    AG
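For further processing, a report like the one above can be read into plain Python tuples. The sketch below is one way to do this, assuming each site is reported as five whitespace-separated fields (number, position, score, chain, type) and that the acceptor section starts at the text "Number of Acceptor sites". The names `ROW`, `parse_rows` and `parse_report` are hypothetical helpers, not part of SplM. Because it matches field patterns rather than lines, it does not depend on how the report is wrapped.

```python
import re

# number, position, score, chain (+ or -), two-letter site type
ROW = re.compile(r"(?<!\d)(\d+)\s+(\d+)\s+(\d+)\s+([+-])\s+([ACGT]{2})\b")
ACCEPTOR_HEADER = "Number of Acceptor sites"  # marks the start of the acceptor list


def parse_rows(text):
    """Return (number, position, score, chain, type) tuples found in text."""
    return [(int(n), int(pos), int(score), chain, kind)
            for n, pos, score, chain, kind in ROW.findall(text)]


def parse_report(text):
    """Split a report into donor rows and acceptor rows."""
    donor_part, _, acceptor_part = text.partition(ACCEPTOR_HEADER)
    return {"donor": parse_rows(donor_part),
            "acceptor": parse_rows(acceptor_part)}


sample = (
    "Number of Donor sites: 2 Threshold: 95 "
    "Number Position Score Chain Type "
    "1 167 33 - GT 2 184 43 - GC "
    "Number of Acceptor sites: 2 Threshold: 95 "
    "1 14 7 + AG 2 106 6 + AG"
)
report = parse_report(sample)
```

Here `report["donor"]` holds the two donor rows and `report["acceptor"]` the two acceptor rows from the shortened sample.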