User guide

PART 1: Structure calculation for short sequences

Type or paste your sequence into the input text box. For RNA structure calculation the sequence must be shorter than 250 nt. Use the alphabet ATGC or AUGC (any other symbol causes an error). Gaps and line breaks are ignored.
Set the 'length of minimal helix' (the shortest continuous double-stranded region). A higher value makes the calculation faster but less precise. The recommended value is 2.
Set the 'Stop criterion' D, which lies in the range 0 < D < 1. The smaller D is, the faster the calculation and the lower the prediction accuracy (the default value, 0.9, gives the most accurate prediction). Low stop criterion values (about 0.5) can be used to search for alternative structures and/or kinetic intermediates of a particular RNA.
Set the 'randomization parameter'. This number defines the initial population of the algorithm trajectory. Different numbers can give different suboptimal RNA foldings, although for short sequences and well-defined optima the same result may be obtained repeatedly. This parameter can be used, for example, to assess the robustness of a prediction.
To simulate antisense oligonucleotide binding, define the coordinates of its target.
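
The input rules for the sequence can be checked before submission. The sketch below is only an illustration and is not part of the server: the function name clean_sequence is hypothetical, "gaps" are assumed to mean whitespace, and rejecting a mix of T and U is an assumption about how the two alphabets are applied.

```python
def clean_sequence(raw: str, limit: int = 250) -> str:
    """Normalise a pasted sequence and check it against the PART 1 input rules."""
    # Remove spaces, tabs and line breaks (assumed to be what the guide calls gaps).
    seq = "".join(raw.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    letters = set(seq)
    # Accept either the ATGC alphabet or the AUGC alphabet (assumption: not a mix).
    if not (letters <= set("ATGC") or letters <= set("AUGC")):
        raise ValueError("sequence must use the alphabet ATGC or AUGC")
    # The guide asks for fewer than 250 nt in this mode.
    if len(seq) >= limit:
        raise ValueError(f"sequence has {len(seq)} nt; it must be shorter than {limit} nt")
    return seq
```

For example, clean_sequence("acgu\nACGU") returns "ACGUACGU", while a sequence containing N raises ValueError.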

You can also use this mode to calculate RNA folding with a predefined continuous stretch of single-stranded positions. By default, the values 'from'=0, 'to'=0 are recommended. Because numbering of the input sequence begins with 1, these values mean that every nucleotide in the analysed sequence may form duplexes. For calculations with non-zero 'from' and 'to' values, the z-score is not reported.

Press "execute".
The calculation results are:
1. one of the suboptimal foldings, in mfold format.
2. the folding energy.
3. the folding z-score.
4. a visual representation of the secondary structure.
The Z-score measures the relative deviation of structure stability. All existing computational methods still make errors in predicting RNA structure. Although you cannot be completely sure of a particular structure, a negative Z-score indicates that the sequence under investigation has a stable folding (for example, the typical Z-score of a tRNA or 5S rRNA sequence is about -2).
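
This guide does not give the formula for the Z-score; the exact definition is in the reference cited in "The methods". As a rough illustration only, the sketch below uses the common definition in which the folding energy of the sequence is compared with the folding energies of control sequences (for instance, shuffled copies of it). The function name z_score and the choice of control set are assumptions, and the energies must come from a folding program.

```python
from statistics import mean, stdev


def z_score(native_energy: float, control_energies: list[float]) -> float:
    """Deviation of a folding energy from a control set, in standard deviations."""
    if len(control_energies) < 2:
        raise ValueError("at least two control energies are needed")
    mu = mean(control_energies)
    sigma = stdev(control_energies)  # sample standard deviation
    if sigma == 0:
        raise ValueError("control energies are all identical")
    # A negative value means the sequence folds more stably than its controls.
    return (native_energy - mu) / sigma
```

With this definition, a sequence whose folding energy is lower than the average of its controls gets a negative Z-score, in line with the interpretation given above.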

PART 2: Search for possible structures in long sequences and their calculation

Type or paste the RNA sequence to be analysed into the text box. The sequence should be 250 to 100000 nt long. Use the alphabet ATGC or AUGC (any other symbol causes an error). Gaps and line breaks are ignored.
Set the window length for calculating the E-score. The window length should correspond to the length of the RNA molecule being searched for (the default window length is 80 nt, approximately the length of a tRNA molecule).
Click the button 'Execute'.
The plot at the top is the 'E-score profile'. It shows the distribution of the E-score along the analysed sequence. The dotted line marks the mean E-score typical of structural RNAs (tRNA, rRNA, etc.). The solid lines above and below the dotted line mark the limits of E-score variation within the class of structural RNAs. Parts of the profile that lie between these limits are marked in red. These regions potentially encode structural RNA genes.
The second plot is the 'E-score matrix'. It resembles a dot matrix and indicates the regions that have the potential to form stable secondary structures (shown in red).
For an exact calculation of the secondary structure formed by two regions of the analysed sequence, extract the region and treat it as a short sequence (see PART 1).
If the region is not continuous, insert an arbitrary fragment (not shorter than 3 nt) into the break and forbid it from participating in complementary interactions (as if it were screened by an antisense oligonucleotide).
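
Joining two separated regions for PART 1 can be sketched as follows. This is an illustration under stated assumptions: the function name join_with_linker and the default linker "AAA" are hypothetical, and the returned coordinates assume that 'from' and 'to' are 1-based and inclusive.

```python
def join_with_linker(left: str, right: str, linker: str = "AAA") -> tuple[str, int, int]:
    """Join two regions with a linker and return the linker's 1-based coordinates."""
    if len(linker) < 3:
        raise ValueError("the inserted fragment must not be shorter than 3 nt")
    joined = left + linker + right
    start = len(left) + 1           # first linker position ('from')
    end = len(left) + len(linker)   # last linker position ('to'), assumed inclusive
    return joined, start, end
```

For example, join_with_linker("GGGG", "CCCC") returns ("GGGGAAACCCC", 5, 7); the sequence is then entered in PART 1 with 'from'=5 and 'to'=7 so that the linker stays single-stranded.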

PART 3: Search for sequences with a high potential to form structural RNA in long sequences

Type or paste the RNA sequence to be analysed into the text box. The sequence should be 250 to 100000 nt long. Use the alphabet ATGC or AUGC (any other symbol causes an error). Gaps and line breaks are ignored.
Set the window length for calculating the profile. The window length should correspond to the length of the RNA molecule being searched for (the default window length is 80 nt, approximately the length of a tRNA molecule).
Click the button 'Execute'.
The plot is a profile of the value F (0 < F < 1), which shows the potential of the sequence to form RNA with a stable secondary structure. High values of F mean that the sequence fragment has both a nucleotide composition close to that typical of structural RNAs and a Z-score indicating high stability of the potential secondary structure.
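
The window-based profiles of PART 2 and PART 3 can be pictured as a scan along the sequence. The sketch below only generates the windows; it does not compute the E-score or F, whose definitions are not given in this guide. The function name windows and the step parameter are assumptions, since the guide does not say how far the window moves between points of the profile.

```python
from typing import Iterator


def windows(seq: str, window: int = 80, step: int = 1) -> Iterator[tuple[int, str]]:
    """Yield (1-based start position, fragment) for each full-length window."""
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    # Stop at the last position where a complete window still fits.
    for start in range(0, len(seq) - window + 1, step):
        yield start + 1, seq[start:start + window]
```

Each fragment would then be scored separately, and the scores plotted against the start positions to give the profile.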

PART 4: Search for evolutionarily conserved structure in an aligned set of sequences

Type or paste the RNA sequence alignment to be analysed into the text box. The alignment should be in ClustalW output format; an example of this format is available at the 'example' link. The alignment should be 5 to 250 nt long. Use the alphabet ATGC or AUGC (any other symbol causes an error).
Click the button 'Execute'.
The output window shows plots of the secondary structures for all RNA sequences in the alignment.
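
To check an alignment before submitting it, a ClustalW file can be read with a few lines of code. The sketch below is a minimal reader, not the parser used by the server: the function name parse_clustal is hypothetical, and it assumes the usual layout of a header line beginning with "CLUSTAL", followed by blocks of "name sequence" lines and conservation lines that start with whitespace.

```python
def parse_clustal(text: str) -> dict[str, str]:
    """Read a ClustalW alignment into a mapping of sequence name to aligned row."""
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("missing CLUSTAL header line")
    alignment: dict[str, str] = {}
    for line in lines[1:]:
        # Skip blank lines and conservation lines, which begin with whitespace.
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        # The first field is the name, the second the aligned chunk;
        # an optional trailing residue count is ignored.
        alignment[fields[0]] = alignment.get(fields[0], "") + fields[1]
    if len({len(row) for row in alignment.values()}) > 1:
        raise ValueError("aligned rows have different lengths")
    return alignment
```

The length of any row in the result is the alignment length, which can be compared with the 5 to 250 nt range given above.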

The methods

The GArna program is based on a genetic algorithm for structure calculation (Titov I.I., Ivanisenko V.A., Kolchanov N.A. (2000) FITness - a WWW-resource for RNA folding simulation based on genetic algorithm with local minimization. Comput. Techn. V. 5 (2), pp. 48-56). GArna has been shown to find optimal or suboptimal structures for sequences up to 250 nt long (Titov I.I., Vorobiev D.G., Kolchanov N.A. (2000) Mass analysis of RNA secondary structures using a genetic algorithm. Proc. 2nd Int. Conf. on Bioinformatics of Genome Regulation and Structure, Novosibirsk, Russia, 2, pp. 138-141).
Thermodynamic model for simulation of antisense oligonucleotide binding.
Thermodynamic parameters: see Jaeger J.A., Turner D.H., Zuker M. (1989) Improved predictions of secondary structures for RNA. Proc. Natl. Acad. Sci. USA, 86, pp. 7706-7710.
Calculation of relative deviation of structure stability: see Titov I.I., Vorobiev D.G., Kolchanov N.A. (2000) Mass analysis of RNA secondary structures using a genetic algorithm. Proc. 2nd Int. Conf. on Bioinformatics of Genome Regulation and Structure, Novosibirsk, Russia, 2, pp. 138-141.
The paper below also describes a sequence measure introduced to locate regions with a possibly stable secondary structure. It is far superior to the usual G+C content. Vorobiev D.G., Titov I.I., Kochetov A.V., Kolchanov N.A. (2000) Structural features of mRNA 5'UTRs of eukaryotic genes expressed at high and low levels. Proc. 2nd Int. Conf. on Bioinformatics of Genome Regulation and Structure, Novosibirsk, Russia, 2, pp. 135-137.
