PromH-AN
Search for animal promoters using 2 homologous 5'-regions
Method description
To further improve the promoter identification accuracy achieved by the TSSG program, we developed a
new program, PromH(G), by extending the TSSG feature set. PromH uses linear
discriminant functions that take into account, in addition to the features implemented in TSSG,
the conservation of major promoter functional components, such as transcription start
points, TATA-boxes and regulatory motifs, in pairs of orthologous genes aligned by
the SeqMatch-N program.
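As a rough illustration of the idea described above, the following sketch combines a single-sequence score with conservation features in one linear discriminant function. The feature names, weights, threshold and toy sequences are hypothetical and are not the values used by PromH. Measuring conservation as simple percent identity over an aligned window is likewise an assumption.

```python
def percent_identity(aln_a: str, aln_b: str, start: int, end: int) -> float:
    """Percent of identical, non-gap columns in the aligned window [start, end)."""
    window = list(zip(aln_a[start:end], aln_b[start:end]))
    if not window:
        return 0.0
    matches = sum(1 for a, b in window if a == b and a != "-")
    return 100.0 * matches / len(window)


def linear_discriminant(features: dict, weights: dict, bias: float = 0.0) -> float:
    """Weighted sum of feature values (all weights here are hypothetical)."""
    return bias + sum(weights[name] * value for name, value in features.items())


# Toy aligned 5'-regions of two orthologous genes (made-up sequences).
seq_a = "GGCTATAAAAGGCC-ACGT"
seq_b = "GGCTATAAATGGCCTACGA"

features = {
    "single_seq_score": 1.2,  # stand-in for a TSSG-style single-sequence score
    "tata_conservation": percent_identity(seq_a, seq_b, 3, 11) / 100.0,
    "region_conservation": percent_identity(seq_a, seq_b, 0, len(seq_a)) / 100.0,
}
weights = {
    "single_seq_score": 1.0,
    "tata_conservation": 2.0,
    "region_conservation": 1.5,
}

score = linear_discriminant(features, weights)
is_promoter = score > 3.0  # hypothetical decision threshold
```

In this toy setup, a candidate site with a strongly conserved TATA-box and flanking region scores higher than the same site judged from one sequence alone, which is the intuition behind adding conservation features.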
PromH(G) output
The output file begins with a description of the program's purpose, the abbreviations used and
the search parameters (lines 1-10). The next two lines give the name and length of the first
query sequence and the number of predicted promoter regions. Then the positions of the predicted sites,
their "weights" and the TATA-box position (for TATA promoters) are given. After that, functional
motifs are listed for every predicted region; (+) and (-) denote the direct or complementary
strand; S... is the identifier of a particular motif in the Transcription Factors Database,
TFD (Ghosh, Nucleic Acids Res., 1993, 21, 3117-3118). The same information is then given
for the second query sequence.
Program promHG (Softberry Inc.)
Search for TATA+/TATA- promoters in 2 aligned DNA sequences
NOTE: PHa - Homology Level of Aligned Sequences in LOCAL Search Area (-100,TSS+40)
PHs - Homology Level of Aligned Sequences around TSS
PHss - Homology Level of Aligned Sequences to Right from TSS
PHt - Homology Level of TATA-boxes in Aligned Sequences
PHr - Mean Homology Level of Regulatory Elements in LOCAL Search Area
Initial / Final Thresholds - 2.00 / 6.00
======================================================================
>H-NPPA/AL021155/[33199:35843/
Length of sequence- 2645
1 promoter(s) have been predicted
Promoter Pos: 2549 (Weight - 16.00) TATA box at: 2517 (Weight - 218.33)
PHa - 78% PHs - 100% PHss - 74% PHt - 100% PHr - 80%
Transcription factor binding sites:
for promoter at position - 2549
2462 (+) S01152 AAGTGA
2378 (+) S00922 AGAGG
2525 (+) S00922 AGAGG
2306 (-) S00922 AGAGG
2499 (-) S00395 CACGCW
..............
--------------------------------------------------
>R-NPPA/J03267/[1638:3722]/-2000:+85/CDS: 3723, premRNA: 3638
Length of sequence- 2087
2 promoter(s) have been predicted
Promoter Pos: 2000 (Weight - 15.59) TATA box at: 1970 (Weight - 217.73)
PHa - 78% PHs - 100% PHss - 77% PHt - 100% PHr - 89%
Promoter Pos: 1662 (Weight: 6.37)
PHa - 76% PHs - 88% PHss - 72% PHr - 74%
Transcription factor binding sites:
for promoter at position - 2000
1915 (+) S01152 AAGTGA
1773 (-) S00922 AGAGG
1716 (+) S00392 AGGAAG
1999 (-) S02113 CCAGCTG
1713 (+) S01003 CCCAG
...........
for promoter at position - 1662
1504 (+) S01090 AATGA
1610 (+) S01013 ACAGCTG
1484 (+) S00922 AGAGG
1505 (+) S01444 ATGAATCAG
...........
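For downstream processing, the promoter records in a report like the one above can be pulled out with a few regular expressions. The sketch below is a minimal example, not part of PromH. The function name `parse_promoters` and the patterns are assumptions inferred only from the sample output shown here, and real reports may differ in spacing or wording.

```python
import re

# Patterns inferred from the sample report only; treat them as assumptions.
PROMOTER_RE = re.compile(
    r"Promoter Pos:\s*(\d+)\s*\(Weight\s*[-:]\s*([\d.]+)\)"
    r"(?:\s*TATA box at:\s*(\d+))?"
)
PH_RE = re.compile(r"\b(PHa|PHss|PHs|PHt|PHr)\s*-\s*(\d+)%")


def parse_promoters(report: str) -> list:
    """Return one dict per predicted promoter found in the report text."""
    results = []
    matches = list(PROMOTER_RE.finditer(report))
    for i, m in enumerate(matches):
        # Homology values for a promoter sit between its "Promoter Pos" entry
        # and the next one, before the binding-site list begins.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(report)
        segment = report[m.end():end].split("Transcription factor", 1)[0]
        results.append({
            "pos": int(m.group(1)),
            "weight": float(m.group(2)),
            "tata": int(m.group(3)) if m.group(3) else None,
            "homology": {name: int(val) for name, val in PH_RE.findall(segment)},
        })
    return results


# Fragment copied from the second sequence of the sample report.
sample = """Promoter Pos: 2000 (Weight - 15.59) TATA box at: 1970 (Weight - 217.73)
PHa - 78% PHs - 100% PHss - 77% PHt - 100% PHr - 89%
Promoter Pos: 1662 (Weight: 6.37)
PHa - 76% PHs - 88% PHss - 72% PHr - 74%
Transcription factor binding sites:
for promoter at position - 2000
1915 (+) S01152 AAGTGA
"""

promoters = parse_promoters(sample)
```

Because the patterns tolerate arbitrary whitespace, the same function works whether the report is read line by line or as a single flattened string. A TATA-less promoter, such as the one at position 1662, is returned with `tata` set to `None` and no `PHt` entry.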