Partial Recognition Score

For an oligonucleotide frequency matrix F_{L-m+1,k}={f_{ij}}
of length L and an arbitrary DNA sequence S=s_{1}...s_{i}...s_{L}
of the same length L, the simplest recognition Score is calculated as follows:

Formula (2) calculates the Score_{#} value, whose range widens as
the site length L grows and narrows as the size k of the oligonucleotide
alphabet {E_{1}, ..., E_{j}, ..., E_{k}} grows. Following Zadeh's fuzzy
sets (Zadeh, 1965), the Score values calculated by formula (2) were
transformed into the normalized partial recognition scale:

where the partial recognition rule is as follows:

__IF__ {Score_{0}(S)>0} __THEN__
{S is __this site__}, __OTHERWISE__ {S is not __this site__}.

Formula (3) gives the normalized Score_{0}(S), whose mean value over
all the known sequences of the site under study equals 1, whereas its mean
value over random DNA sequences equals -1. This Score_{0} scale is common
to all the oligonucleotide frequency matrices compiled in the MATRIX database,
for any functional DNA site expressed in any alphabet.
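
The scoring and normalization described above can be illustrated with a minimal Python sketch. It is not the MATRIX database implementation, and it rests on two assumptions that the text does not confirm: that the raw score is the sum of the matrix weights of the oligonucleotides found at each position, and that the normalization is an affine rescaling chosen so that known sites average 1 and random sequences average -1. The names `raw_score`, `make_normalizer`, `is_site`, the list-of-dicts matrix layout, and the toy weights are all hypothetical.

```python
from statistics import mean


def raw_score(freq, seq, m):
    """Raw score (ASSUMED form): sum of matrix weights over all windows of length m.

    freq[i] maps an oligonucleotide of length m to its weight f_ij at position i;
    oligonucleotides missing from the matrix contribute 0.
    """
    return sum(freq[i].get(seq[i:i + m], 0.0) for i in range(len(seq) - m + 1))


def make_normalizer(site_scores, random_scores):
    """Build an affine map (ASSUMED form) sending the site mean to 1 and the random mean to -1."""
    mu_site, mu_rand = mean(site_scores), mean(random_scores)

    def normalize(score):
        return (2 * score - mu_site - mu_rand) / (mu_site - mu_rand)

    return normalize


def is_site(score0):
    """Partial recognition rule from the text: S is the site if Score_0(S) > 0."""
    return score0 > 0


# Toy matrix for dinucleotides (m = 2) on a site of length L = 3, so L - m + 1 = 2 rows.
freq = [{"AC": 0.5, "GT": 0.25}, {"CG": 0.5, "TA": 0.25}]
known_sites = ["ACG", "GTA"]      # hypothetical training sites
random_seqs = ["TTT", "ACT"]      # hypothetical background sequences

site_scores = [raw_score(freq, s, 2) for s in known_sites]
random_scores = [raw_score(freq, s, 2) for s in random_seqs]
normalize = make_normalizer(site_scores, random_scores)

for s in known_sites + random_seqs:
    score0 = normalize(raw_score(freq, s, 2))
    print(s, score0, is_site(score0))
```

Because the rescaling depends only on the two mean values, matrices built for different site lengths and alphabets end up on the same scale, which is the property the text attributes to Score_{0}.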