File formats


Several file formats are supported. The format of an input file is detected automatically; nothing further is needed for recognition.
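
How the detection works internally is not described here. As a rough illustration only, a format could be guessed from the first non-blank line of a file. The sketch below is an assumption, not the program's actual logic; the function name detect_format and the returned labels are hypothetical.

```python
def detect_format(text):
    """Guess the sequence file format from the first non-blank line.

    Hypothetical helper: the rules below are assumptions based on the
    examples in this document, not the program's real detection code.
    """
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith("ID "):
            return "SWISS-PROT"
        if stripped.startswith(">"):
            # NBRF/PIR headers look like ">P1;CODE": two characters, then ';'
            if len(stripped) > 3 and stripped[3] == ";":
                return "NBRF/PIR"
            return "PEARSON"
        return "UNKNOWN"
    return "UNKNOWN"
```

A Pearson name that happens to have ';' as its fourth character would be misread as NBRF/PIR by this sketch, so a real detector would need a stricter check.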

1. NBRF/PIR(SSQ) Format looks like:

>P1;IVHUI6
interferon alpha-I-6 precursor - human
MALPFALLMALVVLSCKSSCSLDCDLPQTH
SLGHRRTMMLLAQMRRISLFSCLKDRHDFR
FPQEEFDGNQFQKAEAISVLHEVIQQTFNL
FSTKDSSVAWDERLLDKLYTELYQQLNDLE
ACVMQEVWVGGTPLMNEDSILAVRKYFQRI
TLYLTEKKYSPCAWEVVRAEIMRSFSSSRN
LQERLRRKE*
N;Alternate names: HuIFN-alpha-I-6; LeIF K; type I interferon
C;Species: Homo sapiens (man)
F;24-122,52-162/Disulfide bonds: #status predicted
>F1;IVHUA7
interferon alpha-7 - human (fragment)
HDFGFPQEEFDGNQFQKAQAISVLHEMIQQ
TFNLFSTKDSSATWDETLLDKFYTELYQQL
NDLEACMMQEVGVEDTPLMNVDSILTVRKY
FQRITLYLTEKKYSPCAWEVVRAEIMRSFS
LSANLQERLRRKE*
C;Species: Homo sapiens (man)
C;Date: 01-Sep-1981 #sequence_revision 01-Sep-1981 #text_change 06-Sep-1996
C;Accession: A01833
R;Goeddel, D.V.; Leung, D.W.; Dull, T.J.; Gross, M.; Lawn, R.M.; McCandliss,
R.; Seeburg, P.H.; Ullrich, A.; Yelverton, E.; Gray, P.W.

i.e.
>P1;Name1
comment
seq1-line1..........
seq1-line2..........
.
.
seq1-line_n..........*
comment
>P1;Name2
seq2..........
.
.
A sequence may contain any number of spaces; lines of no more than 80 characters are preferred.
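
The layout above can be read with a few lines of code. The following is a minimal sketch, assuming that every entry starts with a '>' header line, is followed by one comment line, and that the sequence ends at the first '*'. The function name parse_pir is hypothetical, and annotation lines between entries (such as "C;Species: ...") are simply skipped.

```python
def parse_pir(text):
    """Return a list of (name, comment, sequence) tuples.

    Sketch only: assumes a '>' header, one comment line, then sequence
    lines terminated by '*'. Anything between entries is ignored.
    """
    entries = []
    lines = iter(text.splitlines())
    for line in lines:
        if not line.startswith(">"):
            continue  # annotation lines between entries are skipped
        # The name follows the ';' in a header such as ">P1;IVHUI6"
        name = line.split(";", 1)[1].strip() if ";" in line else line[1:].strip()
        comment = next(lines, "").strip()
        chunks = []
        for seq_line in lines:
            cleaned = "".join(seq_line.split())  # drop spaces inside the sequence
            if cleaned.endswith("*"):
                chunks.append(cleaned[:-1])  # '*' marks the end of the sequence
                break
            chunks.append(cleaned)
        entries.append((name, comment, "".join(chunks)))
    return entries
```

Called on the first example entry above, this would return the name IVHUI6, the comment line, and the seven sequence lines joined into one string without the trailing '*'.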


2. PEARSON(SSQ) Format looks like:

> LUTOSIN
    CCDAATCKLR PGAQCADGLC CDQCRFIKKG TVCRV
    ARGDWNDDTC TGQSADCPRNG
> VIRIDIN
CCDAATCKLRPGAQCADGLCCDQCRFIKKGKICRRARGDNPDDRCTGQSADCPRNR
> VENOM METALLOPROTEINASE
CCDAATCKLIPGAQCGEGLCCDQCSFIEEGTVCRIARGDDLDDYCNGRSAGCPRNP
> RHODOSTOMIN
CCDAATCKLRPGAQCGEGLCCEQCKFSRAGKICRIPRGDMPDDRCTGQSADCPRYH

i.e.
>Name1
seq1-line1..........
seq1-line2..........
.
>Name2
seq2..........
.
.
A sequence may contain any number of spaces; lines of no more than 80 characters are preferred.
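
Reading this format amounts to collecting every line up to the next '>' and removing the spaces. A minimal sketch, assuming each entry starts with a '>' line that holds only the name; the function name parse_pearson is hypothetical.

```python
def parse_pearson(text):
    """Return a list of (name, sequence) tuples.

    Sketch only: a line starting with '>' opens a new entry, and all
    following lines up to the next '>' are joined with spaces removed.
    """
    entries = []
    name, chunks = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if name is not None:
                entries.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # split() with no arguments also discards leading indentation
            chunks.append("".join(line.split()))
    if name is not None:
        entries.append((name, "".join(chunks)))  # flush the last entry
    return entries
```

With the example above, the indented, space-separated LUTOSIN lines and the single-line VIRIDIN entry would both come out as plain sequence strings.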


3. SWISS-PROT Format looks like:

ID   POLN_HEVBU     STANDARD;      PRT;  1693 AA.
AC   P29324;
DT   01-DEC-1992 (REL. 24, CREATED)
DT   01-DEC-1992 (REL. 24, LAST SEQUENCE UPDATE)
DT   01-JUL-1993 (REL. 26, LAST ANNOTATION UPDATE)
DE   NON-STRUCTURAL POLYPROTEIN (CONTAINS: RNA-DIRECTED RNA POLYMERASE
DE   (EC 2.7.7.48); HELICASE).
OS   HEPATITIS E VIRUS (STRAIN BURMA) (HEV).
OC   VIRIDAE; SS-RNA NONENVELOPED VIRUSES; CALICIVIRIDAE.
RN   [1]
RP   SEQUENCE FROM N.A.
RM   92024067
RA   TAM A.W., SMITH M.M., GUERRA M.E., HUANG C.-C., BRADLEY D.W.,
RA   FRY K.E., REYES G.R.;
RL   VIROLOGY 185:120-131(1991).
CC   -!- HEPATITIS E VIRUS IS THE MAJOR CAUSATIVE AGENT OF ENTERICALLY
CC       TRANSMITTED NON-A, NON-B HEPATITIS (ET-NANBH).
DR   EMBL; M73218; HPESVP.
DR   PIR; A40778; MNWWHE.
KW   POLYPROTEIN; RNA-DIRECTED RNA POLYMERASE; HELICASE; ATP-BINDING.
FT   NP_BIND     975    982       ATP (POTENTIAL).
SQ   SEQUENCE   1693 AA;  185191 MW;  13792804 CN;
     MEAHQFIKAP GITTAIEQAA LAAANSALAN AVVVRPFLSH QQIEILINLM QPRQLVFRPE
     VFWNHPIQRV IHNELELYCR ARSGRCLEIG AHPRSINDNP NVVHRCFLRP VGRDVQRWYT
//
.
.
.
.
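
Each line of an entry starts with a two-letter code, the sequence follows the SQ line, and "//" closes the entry. A minimal sketch of a reader built on those three observations; the function name parse_swissprot is hypothetical, and only the ID, the first AC value and the sequence are kept.

```python
def parse_swissprot(text):
    """Return a list of (entry_id, accession, sequence) tuples.

    Sketch only: keeps the ID name, the first accession number and the
    sequence; all other line codes (DT, DE, OS, ...) are ignored.
    """
    entries = []
    entry_id, accession, chunks, in_sequence = None, None, [], False
    for line in text.splitlines():
        code = line[:2]
        if code == "//":
            # End of entry: store it and reset the state
            if entry_id is not None:
                entries.append((entry_id, accession, "".join(chunks)))
            entry_id, accession, chunks, in_sequence = None, None, [], False
        elif code == "ID":
            fields = line[2:].split()
            entry_id = fields[0] if fields else ""
        elif code == "AC" and accession is None:
            accession = line[2:].split(";")[0].strip()
        elif code == "SQ":
            in_sequence = True  # the following indented lines hold the sequence
        elif in_sequence and code == "  ":
            chunks.append("".join(line.split()))
    return entries
```

For the entry shown above this would give the ID POLN_HEVBU, the accession P29324, and the two sequence lines joined without spaces.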