SplM

Prediction of splice sites in human DNA sequences

The program was developed by A. Salamov and V. Solovyev. It locates potential splice site positions using 5 weight matrices for donor sites and, for acceptor sites, a model that combines dinucleotide composition with a weight matrix. The program also predicts potential GC donor sites and non-standard splice sites such as AT-AC.

The program does not exclude splice sites that lie close to sites predicted with higher scores, nor sites on different chains; users can post-process the results based on the reported scores. It is designed for analyzing alternative splice variants and non-canonical splice sites. The program overpredicts far more sites than the Spl program.
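
The post-processing left to the user could look something like the following. This is a minimal illustration in Python, not part of SplM: the `Site` record, the `filter_sites` function, and the `min_score` and `window` parameters are hypothetical names chosen here. The sketch keeps sites on one chain whose score is at or above a cutoff, then drops any site that lies within `window` bases of an already kept site with an equal or higher score.

```python
from typing import List, NamedTuple


class Site(NamedTuple):
    position: int
    score: int
    chain: str   # "+" or "-"
    kind: str    # e.g. "GT", "GC", "AG", "AC"


def filter_sites(sites: List[Site], chain: str, min_score: int, window: int) -> List[Site]:
    """Keep sites on `chain` scoring >= min_score, suppressing any site
    within `window` bases of an already kept site with an equal or higher score."""
    candidates = [s for s in sites if s.chain == chain and s.score >= min_score]
    kept: List[Site] = []
    # Visit the best-scoring sites first so they win over close neighbours.
    for site in sorted(candidates, key=lambda s: s.score, reverse=True):
        if all(abs(site.position - k.position) > window for k in kept):
            kept.append(site)
    return sorted(kept, key=lambda s: s.position)


# Four donor sites copied from the example output in this document.
donors = [
    Site(710, 97, "+", "GT"),
    Site(1077, 48, "+", "GT"),
    Site(1081, 18, "+", "GT"),
    Site(1181, 75, "-", "GT"),
]
selected = filter_sites(donors, chain="+", min_score=15, window=10)
print([s.position for s in selected])
```

The cutoff and window values here are arbitrary; suitable values depend on the analysis and are not prescribed by the program.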

For a description of this program, see:

Solovyev V.V. (2001) Statistical approaches in eukaryotic gene prediction. In Handbook of Statistical Genetics (eds. Balding D. et al.), John Wiley & Sons, Ltd., p. 83-127.

Example of output:


 Splm: Matrix-based prediction of splice sites in Human sequences
-------------------------------------------------------------------
Parameters: -d 95 -a 95 -dGC 95 -nc 1 (non-st. consensus AT-AC)
 Length of sequence   4500
Number of Donor    sites:     33 Threshold:   95
Number Position Score Chain Type
    1     167      33   -   GT
    2     184      43   -   GC
    3     460      25   -   GT
    4     486      21   -   GC
    5     710      97   +   GT
    6    1077      48   +   GT
    7    1081      18   +   GT
    8    1181      75   -   GT
    9    1508       6   -   GT
   10    1567       7   +   GT
   11    1920      24   +   GT
   12    1925       9   +   GC
   13    1954       6   -   GT
   14    2179      36   -   GC
   15    2662       9   -   GC
   16    2691      45   +   GT
   17    2745      43   -   GC
   18    2906      18   +   GT
   19    2930       7   -   GT
   20    2937      83   +   GT
   21    3000       7   -   GT
   22    3006      14   -   GT
   23    3023      90   -   GT
   24    3041      29   -   GT
   25    3107      11   -   GT
   26    3174      46   +   GT
   27    3280       9   +   GC
   28    3290      12   -   GT
   29    3636       7   +   GC
   30    3999       8   -   GT
   31    4107      11   -   GT
   32    4156      51   -   GT
   33    4308      22   +   GT
Number of Acceptor sites:     80 Threshold:   95
    1      14       7   +   AG
    2     106       6   +   AG
    3     110      24   -   AG
    4     194       9   +   AG
    5     234       7   -   AG
    6     384       7   +   AG
    7     395       6   +   AG
    8     457      10   -   AG
    9     498      12   +   AG
   10     676       6   -   AG
   11     680      15   +   AG
   12     702      18   -   AG
   13     733       8   -   AG
   14     738      19   +   AG
   15     780      27   -   AG
   16     861      49   +   AG
   17     865      10   +   AG
   18     912      34   -   AG
   19     928       8   -   AG
   20     982       6   -   AG
   21    1033      24   +   AG
   22    1063       6   -   AG
   23    1080       7   +   AG
   24    1104       9   -   AG
   25    1384       8   -   AC
   26    1399      16   +   AG
   27    1432       7   -   AG
   28    1514       6   -   AG
   29    1751       9   +   AG
   30    1780      11   -   AG
   31    1809      14   -   AG
   32    2040       6   +   AG
   33    2072      13   +   AG
   34    2083      11   -   AG
   35    2120      29   -   AG
   36    2212      61   +   AG
   37    2219       8   +   AG
   38    2238      24   -   AG
   39    2258      18   -   AG
   40    2359      10   -   AG
   41    2404       7   -   AG
   42    2430       6   -   AG
   43    2453       8   -   AC
   44    2474      12   -   AG
   45    2508       9   -   AC
   46    2537       9   -   AG
   47    2576      94   +   AG
   48    2674       7   +   AG
   49    2691       9   -   AC
   50    2750       8   -   AG
   51    2755      33   +   AG
   52    2841      41   -   AG
   53    2902       6   +   AG
   54    2990       8   +   AG
   55    3045       8   +   AC
   56    3050       8   +   AG
   57    3085      10   -   AG
   58    3108      27   -   AG
   59    3185      14   -   AG
   60    3241      39   +   AG
   61    3267      23   -   AG
   62    3388       9   -   AG
   63    3451       8   -   AG
   64    3480       8   +   AG
   65    3677       6   -   AG
   66    3776      25   +   AG
   67    3825      13   -   AG
   68    3885       8   +   AC
   69    3996       8   -   AG
   70    4005       7   +   AG
   71    4125       9   +   AG
   72    4200      12   +   AG
   73    4252      29   +   AG
   74    4258       6   -   AG
   75    4280       6   -   AG
   76    4290      18   -   AG
   77    4334       9   +   AC
   78    4388      13   +   AG
   79    4449       8   +   AG
   80    4498      10   -   AG
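
A report in this layout can be read into records with a short script. The sketch below is an illustration under stated assumptions rather than a tool distributed with SplM: the function name `parse_splm` and the dictionary keys are hypothetical, and the parser assumes that each site row has exactly five whitespace-separated fields (number, position, score, chain, type) and that section headers begin with "Number of Donor" or "Number of Acceptor", as in the example above.

```python
import re
from typing import Dict, List

# A site row: running number, position, score, chain (+/-), two-letter type.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+([+-])\s+([A-Z]{2})\s*$")


def parse_splm(text: str) -> Dict[str, List[dict]]:
    """Split a SplM-style report into donor and acceptor site records."""
    sites: Dict[str, List[dict]] = {"donor": [], "acceptor": []}
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Number of Donor"):
            section = "donor"
        elif stripped.startswith("Number of Acceptor"):
            section = "acceptor"
        else:
            match = ROW.match(line)
            # Rows seen before any section header are ignored.
            if match and section is not None:
                _, position, score, chain, kind = match.groups()
                sites[section].append({
                    "position": int(position),
                    "score": int(score),
                    "chain": chain,
                    "type": kind,
                })
    return sites


# A few lines copied from the example output above.
sample = """\
Number of Donor    sites:     33 Threshold:   95
Number Position Score Chain Type
    1     167      33   -   GT
    5     710      97   +   GT
Number of Acceptor sites:     80 Threshold:   95
    1      14       7   +   AG
   25    1384       8   -   AC
"""
report = parse_splm(sample)
print(len(report["donor"]), len(report["acceptor"]))
```

Column-header and banner lines do not match the row pattern and are skipped, so the same function can be applied to the full report text.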