User guide
PART 1: Structure calculation for short sequences
-
Type or paste your sequence into the input textbox.
The sequence must be shorter than 250 nt for RNA structure calculation. Use the alphabet ATGC or
AUGC (any other symbol causes an error). Gaps and line breaks are ignored.
-
Set the 'length of minimal helix' (a continuous double-stranded region).
A larger value makes the calculation faster but less precise.
The recommended value is 2.
-
Set the 'Stop criterion' D, which lies in the range 0 < D < 1. The smaller D is, the faster
the calculation and the lower the prediction accuracy (the default value, 0.9, gives the most
accurate prediction). Low stop criterion values (about 0.5) may be used to search for
alternative structures and/or kinetic intermediates of a particular RNA.
-
Set the 'randomization parameter'.
This number defines the initial population of the algorithm trajectory. Different numbers can
give different suboptimal RNA foldings. However, for short sequences and for well-defined
optima, repeated runs may give identical results. This parameter may be used, for example, to
assess the robustness of a prediction.
- To simulate antisense oligonucleotide binding, define the coordinates of its target.
You can also use this mode to calculate RNA folding with a predefined continuous
single-stranded region. By default, the values 'from'=0, 'to'=0 are recommended.
These values mean that all nucleotides in the analysed sequence may form duplexes
(since numbering in the input sequence begins with 1). For calculations with non-zero
'from' and 'to' values, no Z-score is given.
- Press "execute".
- The calculation results are:
- one of the suboptimal foldings in mfold format.
- the folding energy.
- the folding Z-score.
- a visual representation of the secondary structure.
The Z-score measures the relative deviation of structure stability.
All existing computational methods still make errors when predicting
RNA structure. Although you cannot be completely sure of any particular
structure, a negative Z-score for the sequence under investigation indicates
that it has a stable folding (for example, the typical Z-score of a tRNA or 5S
rRNA sequence is about -2).
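This guide does not give the formula behind the Z-score (the paper cited at the end of the guide describes the actual calculation). As a rough sketch only, assuming the common definition in which the folding energy of a sequence is compared with the folding energies of shuffled copies of that sequence, the arithmetic could look like the Python below. The function name `z_score` and the example energies are hypothetical, and the definition used by the server may differ.

```python
from statistics import mean, pstdev


def z_score(energy, shuffled_energies):
    """Deviation of `energy` from the shuffled energies, in standard deviations.

    Assumption: Z = (E - mean(E_shuffled)) / std(E_shuffled). This is a common
    definition and is not confirmed by this guide.
    """
    if len(shuffled_energies) < 2:
        raise ValueError("need at least two shuffled energies")
    spread = pstdev(shuffled_energies)
    if spread == 0:
        raise ValueError("shuffled energies are all identical")
    return (energy - mean(shuffled_energies)) / spread


# Made-up folding energies in kcal/mol, for illustration only.
native = -30.0
shuffled = [-20.0, -22.0, -18.0, -20.0]
print(round(z_score(native, shuffled), 2))
```

Under this assumed definition, a negative result means the sequence folds into a more stable structure than its shuffled counterparts do on average.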
-
Type or paste the RNA sequence to be analysed into the text box.
The sequence should be 250 to 100000 nt long. Use the alphabet
ATGC or AUGC (any other symbol causes an error). Gaps and line breaks are ignored.
-
Set the window length for calculation of the E-score. The window length should correspond to the length of the RNA molecule being searched for (the default window length is 80 nt, approximately the length of a tRNA molecule).
-
Click the button 'Execute'.
-
The plot at the top is the 'E-score profile'. It shows the distribution of the
E-score along the analysed sequence. The dotted line marks the mean E-score typical of
structural RNAs (tRNA, rRNA, etc.). The solid lines above and below the dotted line
mark the limits of E-score variation within the class of structural RNAs. The parts of the
profile that lie between these limits are shown in red. These are the regions
that potentially encode structural RNA genes.
-
The second plot is the 'E-score matrix'. It resembles a dot matrix and indicates the regions
that have the potential to form stable secondary structures (shown in red).
-
For an exact calculation of the secondary structure formed by two regions
of the analysed sequence, extract the corresponding region and treat it as a short sequence (see PART 1).
-
If the region is not continuous, then insert an arbitrary fragment (at least 3 nt long)
into the break and forbid it from taking part in complementary interactions (as if it were
screened by an antisense oligonucleotide).
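The spacer procedure for a non-continuous region can be sketched in Python. This is a minimal illustration, not the server's code: the function name `join_regions`, the choice of `AAA` as the spacer, and the example sequence are all hypothetical. Region coordinates are 1-based and inclusive, matching the numbering used for 'from' and 'to' in PART 1.

```python
def join_regions(sequence, region_a, region_b, spacer="AAA"):
    """Join two regions with a spacer and report where the spacer lies.

    Regions are (start, end) pairs, 1-based and inclusive. Returns the joined
    sequence plus the 1-based 'from' and 'to' coordinates of the spacer, which
    is the part that should be forbidden from pairing.
    """
    if len(spacer) < 3:
        raise ValueError("spacer must be at least 3 nt long")
    (a_start, a_end), (b_start, b_end) = sorted([region_a, region_b])
    if a_start < 1 or b_end > len(sequence) or a_end < a_start or b_end < b_start:
        raise ValueError("region lies outside the sequence")
    if a_end + 1 >= b_start:
        raise ValueError("regions overlap or touch; extract them as one region")
    first = sequence[a_start - 1:a_end]
    second = sequence[b_start - 1:b_end]
    joined = first + spacer + second
    spacer_from = len(first) + 1
    spacer_to = len(first) + len(spacer)
    return joined, spacer_from, spacer_to


# Hypothetical 18-nt example: keep positions 1-5 and 14-18, drop the middle.
example = "GGGGCAUAUAUAUGCCCC"
joined, spacer_from, spacer_to = join_regions(example, (1, 5), (14, 18))
print(joined, spacer_from, spacer_to)
```

The returned coordinates would then be entered as 'from' and 'to' when the joined sequence is submitted as a short sequence. As noted in PART 1, calculations with non-zero 'from' and 'to' values do not report a z-score.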
-
Type or paste the RNA sequence to be analysed into the text box.
The sequence should be 250 to 100000 nt long. Use the alphabet
ATGC or AUGC (any other symbol causes an error). Gaps and line breaks are ignored.
-
Set the window length for calculation of the profile. The window length should correspond to the length of the RNA molecule being searched for (the default window length is 80 nt, approximately the length of a tRNA molecule).
-
Click the button 'Execute'.
-
The plot is a profile of the value F (0 < F < 1), which shows the potential of the sequence to form RNA with a stable secondary structure. High values of F mean that the sequence fragment has both
a nucleotide composition close to the typical composition of structural RNAs and a Z-score indicating high stability of the potential secondary structure.
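The input rules and the windowing described in the steps above can be illustrated with a short Python sketch. It is not the server's implementation: the function names, the window step of 1 nt, the acceptance of lower-case letters, and the use of '-' as the gap symbol are assumptions made for illustration. The scoring of each window (E-score or F) is left out because this guide does not define it.

```python
ALLOWED = set("ACGTU")


def clean_sequence(text, min_len=250, max_len=100000):
    """Drop whitespace and gaps, then check the alphabet and the length limits."""
    # Assumptions: '-' is the gap symbol and lower-case input is accepted.
    seq = "".join(ch for ch in text.upper() if not ch.isspace() and ch != "-")
    bad = set(seq) - ALLOWED
    if bad:
        raise ValueError(f"unexpected symbols: {sorted(bad)}")
    if not min_len <= len(seq) <= max_len:
        raise ValueError(f"length {len(seq)} is outside {min_len}-{max_len} nt")
    return seq


def windows(seq, window=80, step=1):
    """Yield (start, end, fragment) with 1-based inclusive coordinates."""
    # Assumption: the window slides one nucleotide at a time.
    for i in range(0, len(seq) - window + 1, step):
        yield i + 1, i + window, seq[i:i + window]


# Hypothetical 400-nt input split over several lines.
seq = clean_sequence("acgu\n" * 100)
all_windows = list(windows(seq))
print(len(seq), len(all_windows), all_windows[0][:2], all_windows[-1][:2])
```

Each fragment produced by `windows` is what a per-window score would be computed on; plotting that score against the window start gives a profile of the kind described above.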
-
Type or paste the RNA sequence alignment to be analysed into the text box.
The alignment should be in ClustalW output format. An example of this format is available at the 'example' link.
The alignment should be 5 to 250 nt long. Use the alphabet
ATGC or AUGC (any other symbol causes an error).
-
Click the button 'Execute'.
-
The output window shows plots of the secondary structures for all RNA sequences in the alignment.
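As an aid to preparing the input, the Python sketch below reads a ClustalW-format alignment and checks it against the limits stated above. It is an illustration under stated assumptions, not the server's parser: the function name `read_clustal` is hypothetical, the 5 to 250 nt limit is assumed to apply to the number of alignment columns, '-' is assumed to be an accepted gap symbol, and the sample alignment is invented.

```python
def read_clustal(text, min_len=5, max_len=250):
    """Return {name: aligned sequence} parsed from ClustalW output text."""
    rows = {}
    for line in text.splitlines():
        # Skip blank lines, conservation lines (they start with a space),
        # and the "CLUSTAL ..." header line.
        if not line.strip() or line[0].isspace() or line.startswith("CLUSTAL"):
            continue
        parts = line.split()
        if len(parts) < 2:
            raise ValueError(f"cannot parse line: {line!r}")
        name, segment = parts[0], parts[1]
        rows[name] = rows.get(name, "") + segment.upper()
    if not rows:
        raise ValueError("no sequences found")
    lengths = {len(s) for s in rows.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences differ in length")
    length = lengths.pop()
    if not min_len <= length <= max_len:
        raise ValueError(f"alignment length {length} is outside {min_len}-{max_len}")
    for name, s in rows.items():
        # Assumption: '-' marks a gap and is allowed inside an alignment.
        bad = set(s) - set("ACGTU-")
        if bad:
            raise ValueError(f"{name}: unexpected symbols {sorted(bad)}")
    return rows


# Invented two-sequence alignment, for illustration only.
sample = """CLUSTAL W (1.83) multiple sequence alignment

seq1    GGGAAAUCC-
seq2    GGGAAGUCCA
        ***** ***
"""
print(read_clustal(sample))
```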
- The GArna program, based on a genetic algorithm for structure calculation
(Titov I.I., Ivanisenko V.A., Kolchanov N.A. (2000)
FITness - a WWW-resource for RNA folding simulation based on genetic algorithm
with local minimization. Comput. Techn. V. 5 (2), pp. 48-56).
GArna has been proven to find optimal or suboptimal structures for sequences
up to 250 nt long (Titov I.I., Vorobiev D.G., Kolchanov N.A. (2000)
Mass analysis of RNA secondary structures
using a genetic algorithm. Proc. 2nd Int. Conf. on Bioinformatics of Genome
Regulation and Structure, Novosibirsk, Russia, 2, pp. 138-141).
- Thermodynamic model for simulation of antisense oligonucleotide binding.
- Thermodynamic parameters: see Jaeger A.H., Turner D.H., Zuker M. (1989) Improved predictions of secondary structures for RNA. Proc. Natl. Acad. Sci. USA, 86, pp. 7706-7710.
- Calculation of relative deviation of structure stability: see Titov I.I., Vorobiev D.G., Kolchanov N.A. (2000) Mass analysis of RNA secondary structures using a genetic algorithm. Proc. 2nd Int. Conf. on Bioinformatics of Genome Regulation and Structure, Novosibirsk, Russia, 2, pp. 138-141.
- The following paper also describes a sequence measure introduced to locate regions with potentially stable secondary structure; it is far superior to the usual G+C content. Vorobiev D.G., Titov I.I., Kochetov A.V., Kolchanov N.A. (2000) Structural features of mRNA 5'UTRs of eukaryotic genes expressed at high and low levels. Proc. 2nd Int. Conf. on Bioinformatics of Genome Regulation and Structure, Novosibirsk, Russia, 2, pp. 135-137.