Prediction of RNA secondary structure using genetic algorithm with steepest descent
I.I. Titov, V.A. Ivanisenko, N.A. Kolchanov
Institute of Cytology and Genetics, Siberian Division of the Russian Academy of Sciences.
Genetic algorithm (GA) is one of the powerful "kinetic" (i.e. based on a limited search) optimization methods. GA employs the evolutionary analogies through iteration of mutations and recombinations in the individual genotypes with phenotypic (fitness) selection in each calculation cycle. Every RNA secondary structure is unambiguously defined by its own set of stems; hence, it is natural to choose the latter as genes for GA-based calculations. Although such a coarse-grained representation is computationally convenient, it leads to a formal paradox: according to the FDC criterion (fitness-distance correlation, Jones and Forrest, 1995), the RNA secondary structure prediction is GA-hard, since the number of structures lacking common stems with the minimal-energy structure is exponentially great. On the other hand, the additivity of energetic rules for RNA secondary structure is a good basis for validity (important for GA efficiency) of building blocks hypothesis; in addition, a number of examples demonstrates successful GA application for secondary structure energy minimization.
We modified GA to provide the maximal distinction of its dynamics from the routes in statistical FDC-metry. The major peculiarities of our algorithm are:
The described algorithm possesses a higher efficiency compared to the GA with conventional rules. On IBM-486, it takes our algorithm less than a minute to select the global minimum of RNA free energy in a space of 108 states. Thus, the framing of GA quality criterion in FDC terms in the space of its own operators is essential. However, some complications connected with dynamic character of operators’ action may arise. For example, the absence of any convergence is observed when recombination frequency is considerable ("error catastrophe").