Help. Variable memory Markov (VMM) model for nucleosome formation site prediction

Return to the main menu

 

The program is intended for nucleosome formation site prediction in genomic DNA. 

The program outputs an estimate of the probability that a local DNA region is part of a nucleosome, i.e. in contact with the histone octamer.

Input sequences should be in FASTA format. To obtain an averaged profile, the sequences are assumed to be phased (of equal length).

The program accepts even relatively short sequences (at least 10 bp). The maximum sequence length is 1 Mb.
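The input constraints above can be checked before submitting a job. The following is a minimal sketch, not the program's own parser: the function names `parse_fasta` and `check_input` are hypothetical, and the limits are simply the 10 bp and 1 Mb values stated in this help text.

```python
from typing import Dict, List

MIN_LEN = 10         # minimum sequence length stated in the help text, bp
MAX_LEN = 1_000_000  # upper limit stated in the help text (1 Mb)


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}; plain text becomes one record."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: start a new record (fall back to a generated name).
            name = line[1:].strip() or f"seq{len(records) + 1}"
            records[name] = []
        else:
            if name is None:  # plain sequence without a header line
                name = "seq1"
                records[name] = []
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def check_input(seqs: Dict[str, str], phased: bool) -> List[str]:
    """Return a list of human-readable problems (empty if the input looks fine)."""
    problems = []
    for name, seq in seqs.items():
        if not MIN_LEN <= len(seq) <= MAX_LEN:
            problems.append(f"{name}: length {len(seq)} bp is outside {MIN_LEN}..{MAX_LEN}")
    # Phased sets are averaged position by position, so lengths must agree.
    if phased and len({len(s) for s in seqs.values()}) > 1:
        problems.append("phased set: sequences differ in length")
    return problems
```

An empty list from `check_input` means the input satisfies the length limits and, for a phased set, the equal-length requirement.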

The user can also input a single sequence (even in plain format). In this case the "SINGLE SEQUENCE" button should be selected to avoid redundant output statistics (such as averaging over one sequence).

The output is a profile in text format (a row of numbers).

 

The user can paste the sequence into the input window (cut and paste).

Input sequence(s) here (FASTA format or plain text) (cut & paste)

 

Standard file upload is also available as an alternative to pasting.

or From file

 

The user can select the sliding window size for probability estimation. The nucleosome size, 146 bp, is recommended.

A window covering only part of the site, such as 50 bp, is also acceptable, but the window must be at least 10 bp.

A profile step other than 1 bp can also be set to avoid large output for long sequences.
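To see how window size and profile step affect the amount of output, the window positions can be enumerated as below. This is a sketch under assumptions: the function name `window_starts` is hypothetical, positions are 0-based, and only windows that fit entirely inside the sequence are counted. How the program itself treats sequence ends is not stated here.

```python
from typing import List


def window_starts(seq_len: int, window: int = 146, step: int = 1) -> List[int]:
    """0-based start positions of all full windows that fit in the sequence."""
    if window < 10:
        raise ValueError("window must be at least 10 bp")
    if step < 1:
        raise ValueError("step must be a positive integer")
    if seq_len < window:
        return []  # the sequence is shorter than one window
    return list(range(0, seq_len - window + 1, step))
```

For a 1000 bp sequence and a 146 bp window, a step of 1 bp gives 855 profile points, while a step of 10 bp gives 86.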

Profile parameters:

Sliding window size, bp (>10)       Profile step (shift of sliding window)

 

 

 

use pre-defined models (nucleosome):


or input pre-calculated VMM model in text format here (calculated by TreeComplexity program)  

(cut & paste)

or from File:

 

The program can use a pre-calculated model built from a set of DNA sequences.

Please construct such a model with the related program: Complexity by context tree source

The output of that program (a text file of oligonucleotides and their frequencies) can be used as a VMM model for prediction: paste it into the form and select the "user-defined model" option in the corresponding drop-down menu instead of "Nucleosome formation sites".
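To illustrate how a table of oligonucleotides and frequencies can act as a variable memory Markov model, here is a toy sketch. It is not the program's implementation and does not read the TreeComplexity output format, which is not described in this help text. The dictionary layout, the name `log_probability` and all numbers in `TOY_MODEL` are assumptions made up for the example. The idea shown is the general one: each nucleotide is scored using the longest preceding context found in the model.

```python
import math
from typing import Dict

Model = Dict[str, Dict[str, float]]

# Hypothetical toy model: context -> P(next nucleotide | context).
# The empty context "" is the fallback when no longer context matches.
TOY_MODEL: Model = {
    "": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "A": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "CG": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def log_probability(window: str, model: Model) -> float:
    """Natural-log probability of a window under a variable memory Markov model.

    Assumes the model contains the empty context "" as a fallback.
    """
    max_ctx = max(len(ctx) for ctx in model)
    total = 0.0
    for i, base in enumerate(window):
        # Try the longest available context first, then shorten it.
        for k in range(min(max_ctx, i), -1, -1):
            ctx = window[i - k:i]
            if ctx in model:
                total += math.log(model[ctx][base])
                break
    return total
```

With this toy model, `log_probability("CGC", TOY_MODEL)` scores the first two nucleotides with the empty context and the third with the context `CG`.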

The bottom group of parameters defines whether a single sequence or a set of phased sequences is analysed.

The user can select the format of the program output.

Output details:   Single sequence

Prediction profile only         Detailed profile (position and logarithm probability)

 

Set of phased sequences (averaged profile)

Mean profile only  

Detailed report (mean, standard deviation + profiles for every seq. in the set)
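The averaged profile for a phased set can be pictured as a per-position mean and standard deviation over the individual profiles. The sketch below assumes this reading. The name `average_profiles` is hypothetical, and the use of the sample standard deviation is an assumption, since the help text does not say which estimator the program reports.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def average_profiles(profiles: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-position mean and sample standard deviation over phased profiles."""
    if len({len(p) for p in profiles}) != 1:
        raise ValueError("phased profiles must be non-empty and of equal length")
    columns = list(zip(*profiles))  # one column per profile position
    means = [mean(col) for col in columns]
    # stdev() needs at least two values; report 0.0 for a single profile.
    sds = [stdev(col) if len(col) > 1 else 0.0 for col in columns]
    return means, sds
```

A single-profile input returns the profile itself as the mean with zero deviation, which is the redundant statistic that the single sequence mode avoids.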

 

 

The user can set supplementary options that define how sequence positions are marked in the output.

 

Supplementary output options:

Shift position (Profile 0 position)  bp

Centering position (Profile 0 position in the center of the sequence) Yes  No
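One plausible reading of these two options is that they only relabel the profile positions: either a chosen position becomes 0, or the middle of the sequence does. The sketch below follows that reading. It is an assumption, not documented behaviour, and the name `position_labels` and its parameters are hypothetical.

```python
from typing import List, Sequence


def position_labels(starts: Sequence[int], seq_len: int,
                    shift: int = 0, center: bool = False) -> List[int]:
    """Relabel profile positions so that the chosen position becomes 0."""
    # With centering, the zero point is the middle of the sequence;
    # otherwise it is the user-supplied shift position.
    zero = seq_len // 2 if center else shift
    return [s - zero for s in starts]
```

For example, positions 0, 10 and 20 of a 20 bp sequence become -10, 0 and 10 with centering enabled.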

 

The default value of the logarithm of probability (CompareLevel in the detailed report output) is calculated for a sequence with equal nucleotide frequencies.
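As a rough illustration of such a baseline, a window in which every nucleotide has probability 0.25 has a log probability proportional to the window size. The sketch below assumes natural logarithms and no normalisation. Neither is stated in this help text, so the actual CompareLevel value may be scaled differently. The function name is hypothetical.

```python
import math


def uniform_log_probability(window: int = 146) -> float:
    """Log probability of a window when A, C, G and T are equally likely."""
    # Each position contributes log(1/4), independent of its context.
    return window * math.log(0.25)
```

Profile values above this level would then indicate windows that the model finds more probable than a sequence with equal nucleotide frequencies.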

 

The user can also obtain a graphical plot of the profile using the corresponding option:

 

  Graphic mode 

 

Please note that this option is OFF by default.

The graphical output is interactive, i.e. the user can change the borders of the profile, the dot size, the color, etc.

To obtain a smoother line, try changing the profile step to 1 bp (the default is 10 bp).

 

 

The "Execute" button starts the program run with the selected parameters.

The "Reset form" button resets all parameters to their default values.

                


 


 

The Institute of Cytology and Genetics (Russia)

Authors: Yu.L.Orlov, V.G.Levitsky

Contributors: S.V.Lavryushev, D.A.Grigorovich, S.A.Poplavsky

Leader: N.A.Kolchanov

The research was partially supported by the Russian Foundation for Basic Research (RFBR) and Siberian Branch of the Russian Academy of Sciences (Integration project No. 119).