The MATRIX database accumulates oligonucleotide frequency matrices of transcription factor binding sites whose natural sequences are contained in the SAMPLES database.
Each MATRIX entry compiles the oligonucleotide frequency matrices of a site. These matrices are calculated from the multiply aligned site sequences (see ALIGNED database) using 26 oligonucleotide alphabets and the standard Gibbs potential method (Lawrence C., 1994, Comput. Chem., 18, 255-258). Only a 50% subset of the multiply aligned site sequences was used to calculate the matrices. Each calculated frequency matrix was transformed into a C program that recognizes the site by the Partial Recognition Score; this program is stored in the C-CODE field of the entry. The C program for Mean Frequency Recognition is also included in each entry.
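To make the idea of a frequency matrix and a frequency-based score concrete, here is a minimal sketch in C. It assumes the simplest case, an alphabet of single nucleotides, and scores a window by the mean of the matrix frequencies of the bases it contains. The site length SITE_LEN and the names base_index, build_frequency_matrix and mean_frequency_score are hypothetical and chosen for illustration only. The formula of the Partial Recognition Score is not given here, so this sketch should not be read as the code stored in the C-CODE field.

```c
#include <assert.h>
#include <stddef.h>

#define SITE_LEN 4  /* hypothetical site length, for illustration only */

/* Map a nucleotide to a row index 0..3; return -1 for any other character. */
static int base_index(char c)
{
    switch (c) {
    case 'A': case 'a': return 0;
    case 'C': case 'c': return 1;
    case 'G': case 'g': return 2;
    case 'T': case 't': return 3;
    default:            return -1;
    }
}

/* Fill freq[b][p] with the fraction of the n aligned sequences that have
   base b at position p. Every sequence must hold at least SITE_LEN characters. */
static void build_frequency_matrix(const char *const *aligned, size_t n,
                                   double freq[4][SITE_LEN])
{
    assert(aligned != NULL && n > 0);
    for (int b = 0; b < 4; ++b)
        for (int p = 0; p < SITE_LEN; ++p)
            freq[b][p] = 0.0;

    /* Count occurrences first, then divide once by the number of sequences. */
    for (size_t s = 0; s < n; ++s)
        for (int p = 0; p < SITE_LEN; ++p) {
            int b = base_index(aligned[s][p]);
            if (b >= 0)
                freq[b][p] += 1.0;
        }
    for (int b = 0; b < 4; ++b)
        for (int p = 0; p < SITE_LEN; ++p)
            freq[b][p] /= (double)n;
}

/* Score a window of SITE_LEN characters as the mean, over positions, of the
   matrix frequency of the observed base. Unknown characters contribute 0. */
static double mean_frequency_score(double freq[4][SITE_LEN], const char *window)
{
    double sum = 0.0;
    for (int p = 0; p < SITE_LEN; ++p) {
        int b = base_index(window[p]);
        if (b >= 0)
            sum += freq[b][p];
    }
    return sum / SITE_LEN;
}
```

A caller would build the matrix once from the aligned sequences and then pass any candidate window of SITE_LEN characters to mean_frequency_score; a higher score means the window looks more like the aligned sites.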
The remaining 50% of the multiply aligned site sequences was used as the CONTROL set. The results of testing each program are stored in the fields Means, Standard Deviation, False Negatives for Control Sequences; Means, Standard Deviation, False Positives for Random Sequences; and Graphical Representation of Test Results.
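The control statistics named above can be illustrated with a short C sketch. It assumes that each control sequence has already been given a recognition score and that a sequence counts as a false negative when its score falls below a decision threshold. The threshold, the control_stats type and the summarize_control function are hypothetical, and the use of the sample (n - 1) variance is also an assumption, since the exact estimator is not specified here. The standard deviation is the square root of the returned variance.

```c
#include <assert.h>
#include <stddef.h>

/* Summary of recognition scores over a set of control sequences. */
typedef struct {
    double mean;            /* arithmetic mean of the scores */
    double variance;        /* sample variance; standard deviation = sqrt(variance) */
    double false_negatives; /* fraction of scores below the threshold */
} control_stats;

/* Summarize n control scores against a decision threshold (n must exceed 1). */
static control_stats summarize_control(const double *scores, size_t n,
                                       double threshold)
{
    assert(scores != NULL && n > 1);
    control_stats st = {0.0, 0.0, 0.0};
    size_t missed = 0;

    /* First pass: mean and number of true sites scored below the threshold. */
    for (size_t i = 0; i < n; ++i) {
        st.mean += scores[i];
        if (scores[i] < threshold)
            ++missed;
    }
    st.mean /= (double)n;

    /* Second pass: sum of squared deviations from the mean. */
    for (size_t i = 0; i < n; ++i) {
        double d = scores[i] - st.mean;
        st.variance += d * d;
    }
    st.variance /= (double)(n - 1);

    st.false_negatives = (double)missed / (double)n;
    return st;
}
```

The same routine could be applied to scores of random sequences by counting scores at or above the threshold instead, which would give a false positive fraction.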
Each entry contains a link to the Recognition Tools (see the field Web-link to Recognition Tools), which implement each C program documented in the entry so that the site can be recognized within an arbitrary sequence.
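Recognizing a site within an arbitrary sequence can be pictured as sliding a window of the site length along the sequence and scoring each position. The C sketch below shows that loop under stated assumptions: the score_fn type, the scan_sequence function, the threshold and the toy scorer toy_tata_score are all hypothetical and do not reproduce the interface of the Recognition Tools or of the programs stored in the entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scorer type: returns a recognition score for the window of
   len characters starting at window. */
typedef double (*score_fn)(const char *window, size_t len);

/* Toy scorer for illustration only: fraction of positions that match the
   4-letter pattern "TATA". A real scorer would consult a frequency matrix. */
static double toy_tata_score(const char *window, size_t len)
{
    static const char pattern[] = "TATA";
    size_t matches = 0;
    if (len != 4)
        return 0.0;
    for (size_t i = 0; i < len; ++i)
        if (window[i] == pattern[i])
            ++matches;
    return (double)matches / (double)len;
}

/* Slide a window of site_len characters over seq and record the start
   offsets whose score reaches the threshold. Returns the total number of
   hits; only the first max_hits offsets are written to hits. */
static size_t scan_sequence(const char *seq, size_t site_len, score_fn score,
                            double threshold, size_t *hits, size_t max_hits)
{
    assert(seq != NULL && score != NULL);
    size_t n = strlen(seq);
    size_t found = 0;

    if (site_len == 0 || n < site_len)
        return 0;

    for (size_t i = 0; i + site_len <= n; ++i) {
        if (score(seq + i, site_len) >= threshold) {
            if (found < max_hits)
                hits[found] = i;
            ++found;
        }
    }
    return found;
}
```

Passing the scorer as a function pointer keeps the scanning loop independent of how the score is computed, so the same loop could drive any of the recognition programs.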
How to use Recognition Tools?
Web-link to Recognition Tools
Oligonucleotide Alphabet Length
Means, Standard Deviation, False Negatives for Control Sequences
Means, Standard Deviation, False Positives for Random Sequences
Graphical Representation of Test Results