Mdg1

Sequence
Full description of functional site motifs with mapping onto the sequence

(a) Distribution of the identified functional site motifs along the DNA sequence of Mdg1, a Drosophila LTR-containing retrotransposon of the gypsy group, in the individual samples.
Designations. Abscissa: position along the DNA sequence of the mobile element, in bp; ordinate: numbers assigned to the regulatory sites (their names are abbreviated in Table 1). Group 1: sites of replication and of transcription initiation and termination; group 2: enhancers and silencers of chromosomal, viral, and other genes; group 3: sites recognized by cellular protein transcription and translation factors; group 4: sites recognized by protein receptors for inductive signals. Arrows mark the locations of regulatory sites found on the left-directed (leftward arrows) and right-directed (rightward arrows) DNA strands.
(b) Schematic structure of the DNA sequence of the retrotransposon Mdg1.
Designations. LTR, long terminal repeats; sORF1 and sORF2, small open reading frames; ORF1 and ORF2, large open reading frames; P, amino acid sequence motif of the protease domain; RT, of the reverse transcriptase domain; RH, of the RNase H domain; and I, of the integrase domain.
(c) Consolidated distribution of the identified functional site motifs along the Mdg1 DNA sequence.
Designations. Abscissa: number of the genome segment, each segment being 1/15 of the length of the given MGE; ordinate: total number of nucleotides that are contained in functional site motifs and fall within a scanning window of 75 bp. Upper horizontal line: 95% level of nonrandomness of the motif clusters; lower horizontal line: the distribution of functional site motifs averaged over 50 random sequences of the same length and nucleotide composition as the Mdg1 DNA.
Clustering of the motifs correlates with the potential regulatory regions in the LTRs and in the vicinity of the ends (start and end) of the small and large ORFs and of the ORF2 domains.
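The window statistic in panel (c) can be illustrated with a short Python sketch. It is not the authors' program. The function names, the `scan` callable (any function that returns `(start, end)` motif hits for a sequence), the step of 1 bp, the choice to count a nucleotide once even when motifs overlap, and the use of an empirical 95th percentile over shuffled sequences as the "95% level" are all assumptions made for illustration.

```python
import random

def covered_positions(length, sites):
    """Mark each position that lies inside at least one motif hit.

    sites: iterable of (start, end), 0-based, end exclusive.
    Assumption: overlapping motifs count a nucleotide only once.
    """
    covered = [0] * length
    for start, end in sites:
        for i in range(max(start, 0), min(end, length)):
            covered[i] = 1
    return covered

def window_counts(covered, window=75):
    """Motif-covered nucleotides in every window of the given size (step 1 bp)."""
    if len(covered) < window:
        return []
    total = sum(covered[:window])
    counts = [total]
    for i in range(window, len(covered)):
        total += covered[i] - covered[i - window]  # slide the window by one position
        counts.append(total)
    return counts

def shuffled_baseline(seq, scan, n_random=50, window=75, seed=0):
    """Mean and empirical 95th percentile of window counts over shuffled copies.

    Shuffling keeps the length and nucleotide composition of seq.
    scan is a hypothetical callable: sequence -> list of (start, end) hits.
    """
    if len(seq) < window:
        raise ValueError("sequence is shorter than the scanning window")
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_random):
        letters = list(seq)
        rng.shuffle(letters)
        shuffled = "".join(letters)
        covered = covered_positions(len(shuffled), scan(shuffled))
        pooled.extend(window_counts(covered, window))
    pooled.sort()
    mean = sum(pooled) / len(pooled)
    p95 = pooled[int(0.95 * (len(pooled) - 1))]
    return mean, p95
```

Under these assumptions, windows of the real sequence whose count exceeds `p95` would correspond to the motif clusters above the upper line, and `mean` would play the role of the lower line.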
Regulatory site motifs (corresponding to numbers on Y axis in fig. a)
1st group
1 tATAaa POLYMERASE II
2 NTCAKTY Inr (ripl. POLYMERASE II)
3 gtataaaatag "TATA box"; one of a number of homology blocks at the 5'-ends of histone genes
4 ggaaa found upstream from the promoter of the pheasant conalbumin gene
5 ctggaatnttctag consensus sequence of Drosophila heat shock gene promoters
6 AATAAA POLY A SINGLE BLOCK
7 AATAAA-ygttgttyy* poly-A, 2 boxes
8 tatgt One of two related variants of a presumed transcription termination signal in yeast
9 TGGngsntygncyggga PRIMER REVERSE TRANSCRIPTION
10 w ORI HUM
11 taaatttagt AD ORI
12 atatatatat HSV ORI
13 ataatatacc ARS Y XE DM (consensus sequence found in the autonomously replicating elements (ARS) of S.cerevisiae)
14 wtttatrtttw ARS yeast, xenopus
15 wttttatrtttw SAR A BOX OF DROSOPHILA
16 aataaayaa SAR T BOX OF DROSOPHILA
17 TtwTwTTWtt initiation of translation in genes of animals
18 saannatgg / accatgg optimal sequence for initiation of translation by eukaryotic ribosomes
2nd group
19 sgcgwaa SECOND MOTIF OF Ad5 E1a
20 AtGCAAAtna DECAMER FROM IMMUNOGLOBULIN GENES IG octa
21 acc-ggt E2 PROTEIN OF PAPILLOMAVIRUS
22 ccgtc PROTEIN BINDING SITE IN c-FOS ENHANCER
23 taatgarattc HSV-1/Herpesvirus saimiri
24 cAgnTGGc E MOTIF Ig ENHANCER
25 gaacag LVa MOTIF MoMuLV ENHANCER
26 cctgc LVc MOTIF MoMuLV ENHANCER
27 atnnAgtaaaa LYSOZYME SILENCER 4
28 aawanngaaaggr enhancer of human beta-interferon genes
29 gctgtgkttttgca enhancer region of virus PV
30 tctgtggtaaag enhancer region of virus MSV
31 aagtctgcanagtctgca enhancer region of virus LPV
32 agcagctggc conserved elements of Ig enhancer (E1,E2,E3,E4,OCTA)
33 gccatctggc conserved elements of Ig enhancer (B,E1,E2,E3)
34 cwwwccac potential core sequence of enhancers of various human viruses
35 tGTGgwww enhancer core SV40
3rd group
36 tgg-gccaa NF1-FACTOR
37 gggcggr SP1-FACTOR
38 yCAGCtgygG AP4-FACTOR
39 ccaat NFY-FACTOR (consensus sequence of the -80 of eukaryotic promoters)
40 tnnnCCatnnC CRF-FACTOR
4th group
41 ttcnngaa HEAT-SHOCK RESPONSIVE ELEMENT
42 ctcgaatgTTcgcgaaa heat-shock HSP70
43 cnngaanttcnng heat-shock induction site A
44 tgCrcyc HEAVY METAL RESPONSIVE ELEMENT
45 ycgcccgg site of induction by heavy metal ions
46 aagggaaaag SRE ELEMENT OF HSP70
47 gnnhCh-tGttCt GLUCOCORTICOID RESPONSIVE ELEMENT
48 gttct consensus sequence of the recognition sites for the glucocorticoid receptor; recognition sequence for glucocorticoid and progesterone receptors
49 snkrgctggg site of "drug" induction of P-450c genes
50 agaagnmag INTERFERON GAMMA RESPONSIVE ELEMENT
51 tgacgtca cAMP-regulated element CREB
52 wtStgGgAw ACUTE REACTANT RESPONSIVE ELEMENT
53 taatgarat consensus of putative activator sequence of immediate early genes of Herpes Simplex Virus
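The motifs in the table above appear to use IUPAC degenerate nucleotide codes (n, w, s, r, y, k, m, h). The following Python sketch shows one way such a motif could be located on both DNA strands. It is not the procedure used to produce the figures. The function names are hypothetical, letter case is ignored (the table's mixed case may carry meaning that is not stated here), and motifs containing "-" or "*" are not handled because the meaning of those symbols is not given.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def motif_to_regex(motif):
    """Compile a degenerate motif; the lookahead reports overlapping matches."""
    body = "".join(IUPAC[ch] for ch in motif.upper())
    return re.compile("(?=(" + body + "))")

def find_sites(seq, motif):
    """Return sorted (start, end, strand) hits.

    Coordinates are 0-based, end exclusive, and always refer to the
    forward strand; strand is "+" or "-".
    """
    seq = seq.upper()
    pattern = motif_to_regex(motif)
    n = len(seq)
    hits = []
    for m in pattern.finditer(seq):
        hits.append((m.start(), m.start() + len(m.group(1)), "+"))
    # Scan the reverse complement and map hits back to forward coordinates.
    for m in pattern.finditer(reverse_complement(seq)):
        length = len(m.group(1))
        start = n - m.start() - length
        hits.append((start, start + length, "-"))
    return sorted(hits)
```

For example, `find_sites("AATTCAGGAATT", "ttcnngaa")` reports motif 41 on both strands, because that motif is its own reverse complement at the two fixed ends.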