Several file formats are supported. The format of a file is detected automatically; no additional input is needed for recognition.
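How the detection is actually implemented is not described here. As a rough sketch only, the three formats listed in this section can be told apart by the first non-blank line of the file. The function name `detect_format` and the returned labels are hypothetical, and the rules are an assumption based on the examples in this section, not the program's real logic.

```python
def detect_format(text):
    """Guess the sequence file format from the first non-blank line.

    Assumed rules (inferred from the examples in this section):
      '>' + two-character type code + ';'  -> NBRF/PIR
      any other line starting with '>'     -> PEARSON
      a line starting with 'ID '           -> SWISS-PROT
    """
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip leading blank lines
        if line.startswith(">") and len(line) > 3 and line[3] == ";":
            return "NBRF/PIR"
        if line.startswith(">"):
            return "PEARSON"
        if line.startswith("ID "):
            return "SWISS-PROT"
        return "unknown"
    return "unknown"
```

A Pearson header whose fourth character happens to be ';' would be misread as NBRF/PIR by this sketch, so a real detector would need a stricter check.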
1. NBRF/PIR(SSQ) Format looks like:
>P1;IVHUI6
interferon alpha-I-6 precursor - human
MALPFALLMALVVLSCKSSCSLDCDLPQTH SLGHRRTMMLLAQMRRISLFSCLKDRHDFR
FPQEEFDGNQFQKAEAISVLHEVIQQTFNL FSTKDSSVAWDERLLDKLYTELYQQLNDLE
ACVMQEVWVGGTPLMNEDSILAVRKYFQRI TLYLTEKKYSPCAWEVVRAEIMRSFSSSRN
LQERLRRKE*
N;Alternate names: HuIFN-alpha-I-6; LeIF K; type I interferon
C;Species: Homo sapiens (man)
F;24-122,52-162/Disulfide bonds: #status predicted
>F1;IVHUA7
interferon alpha-7 - human (fragment)
HDFGFPQEEFDGNQFQKAQAISVLHEMIQQ TFNLFSTKDSSATWDETLLDKFYTELYQQL
NDLEACMMQEVGVEDTPLMNVDSILTVRKY FQRITLYLTEKKYSPCAWEVVRAEIMRSFS
LSANLQERLRRKE*
C;Species: Homo sapiens (man)
C;Date: 01-Sep-1981 #sequence_revision 01-Sep-1981 #text_change 06-Sep-1996
C;Accession: A01833
R;Goeddel, D.V.; Leung, D.W.; Dull, T.J.; Gross, M.; Lawn, R.M.; McCandliss, R.; Seeburg, P.H.; Ullrich, A.; Yelverton, E.; Gray, P.W.
i.e.
>P1;Name1
comment
seq1-line1..........
seq1-line2..........
.
.
seq1-line_n..........*
comment
>P1;Name2
seq2..........
.
.
The sequence may contain spaces anywhere. Lines of no more than 80 characters are preferred.
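For illustration, the layout above can be read with a few lines of Python. This is a minimal sketch, not the program's own reader: the function name `parse_pir` is hypothetical, and it assumes that every header line is followed by exactly one comment line and that the sequence ends at the line carrying the `*` terminator.

```python
def parse_pir(text):
    """Return a list of (name, comment, sequence) tuples from NBRF/PIR text."""
    entries = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        i += 1
        if not line.startswith(">"):
            continue  # annotation lines (N;, C;, F;, R;) between entries are skipped
        # Header is ">P1;Name": keep the part after the first ';'
        name = line.split(";", 1)[1].strip() if ";" in line else line[1:].strip()
        # Assumption: the line right after the header is the comment line
        comment = lines[i].strip() if i < len(lines) else ""
        i += 1
        residues = []
        while i < len(lines):
            chunk = lines[i].strip()
            if chunk.startswith(">"):
                break  # missing '*' terminator: leave the next header for the outer loop
            i += 1
            # Spaces inside the sequence carry no meaning and are removed
            residues.append(chunk.replace(" ", "").rstrip("*"))
            if chunk.endswith("*"):
                break
        entries.append((name, comment, "".join(residues)))
    return entries
```

Each tuple holds the entry name, its comment line, and the sequence with spaces and the trailing `*` removed.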
2. PEARSON(SSQ) Format looks like:
> LUTOSIN
CCDAATCKLR PGAQCADGLC CDQCRFIKKG TVCRV ARGDWNDDTC TGQSADCPRNG
> VIRIDIN
CCDAATCKLRPGAQCADGLCCDQCRFIKKGKICRRARGDNPDDRCTGQSADCPRNR
> VENOM METALLOPROTEINASE
CCDAATCKLIPGAQCGEGLCCDQCSFIEEGTVCRIARGDDLDDYCNGRSAGCPRNP
> RHODOSTOMIN
CCDAATCKLRPGAQCGEGLCCEQCKFSRAGKICRIPRGDMPDDRCTGQSADCPRYH
i.e.
>Name1
seq1-line1..........
seq1-line2..........
.
>Name2
seq2..........
.
.
The sequence may contain spaces anywhere. Lines of no more than 80 characters are preferred.
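A reader for this layout can be sketched briefly in Python. The function name `parse_pearson` is hypothetical and the code is only an illustration of the rules stated above (a `>` line starts an entry, all following lines up to the next `>` belong to its sequence, and spaces are ignored).

```python
def parse_pearson(text):
    """Return a list of (name, sequence) tuples from Pearson-format text."""
    entries = []
    name, parts = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one
            if name is not None:
                entries.append((name, "".join(parts)))
            name, parts = line[1:].strip(), []
        elif name is not None and line:
            # Drop all whitespace inside the sequence line
            parts.append("".join(line.split()))
    if name is not None:
        entries.append((name, "".join(parts)))
    return entries
```

The whole header line after `>` is kept as the name, so a name with spaces such as `VENOM METALLOPROTEINASE` stays intact.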
3. SWISS-PROT Format looks like:
ID   POLN_HEVBU     STANDARD;      PRT;  1693 AA.
AC   P29324;
DT   01-DEC-1992 (REL. 24, CREATED)
DT   01-DEC-1992 (REL. 24, LAST SEQUENCE UPDATE)
DT   01-JUL-1993 (REL. 26, LAST ANNOTATION UPDATE)
DE   NON-STRUCTURAL POLYPROTEIN (CONTAINS: RNA-DIRECTED RNA POLYMERASE
DE   (EC 2.7.7.48); HELICASE).
OS   HEPATITIS E VIRUS (STRAIN BURMA) (HEV).
OC   VIRIDAE; SS-RNA NONENVELOPED VIRUSES; CALICIVIRIDAE.
RN   [1]
RP   SEQUENCE FROM N.A.
RM   92024067
RA   TAM A.W., SMITH M.M., GUERRA M.E., HUANG C.-C., BRADLEY D.W.,
RA   FRY K.E., REYES G.R.;
RL   VIROLOGY 185:120-131(1991).
CC   -!- HEPATITIS E VIRUS IS THE MAJOR CAUSATIVE AGENT OF ENTERICALLY
CC       TRANSMITTED NON-A, NON-B HEPATITIS (ET-NANBH).
DR   EMBL; M73218; HPESVP.
DR   PIR; A40778; MNWWHE.
KW   POLYPROTEIN; RNA-DIRECTED RNA POLYMERASE; HELICASE; ATP-BINDING.
FT   NP_BIND     975    982       ATP (POTENTIAL).
SQ   SEQUENCE  1693 AA;  185191 MW;  13792804 CN;
     MEAHQFIKAP GITTAIEQAA LAAANSALAN AVVVRPFLSH QQIEILINLM QPRQLVFRPE
     VFWNHPIQRV IHNELELYCR ARSGRCLEIG AHPRSINDNP NVVHRCFLRP VGRDVQRWYT
//
.
.
.
.
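As an illustration of how such an entry can be read, the sketch below pulls the entry name, the first accession number, and the sequence out of SWISS-PROT text. The function name `parse_swissprot` is hypothetical. It assumes the usual layout in which each line starts with a two-letter code (`ID`, `AC`, `SQ`, ...), the sequence lines after `SQ` are indented, and `//` ends the entry.

```python
def parse_swissprot(text):
    """Return a list of (entry_name, accession, sequence) tuples."""
    entries = []
    entry_name, accession, parts, in_seq = None, None, [], False
    for raw in text.splitlines():
        code = raw[:2]
        fields = raw[2:].split()
        if code == "//":
            # End of entry: store it and reset the state
            if entry_name is not None:
                entries.append((entry_name, accession, "".join(parts)))
            entry_name, accession, parts, in_seq = None, None, [], False
        elif code == "ID" and fields:
            entry_name = fields[0]
        elif code == "AC" and fields and accession is None:
            accession = fields[0].rstrip(";")  # keep only the first accession
        elif code == "SQ":
            in_seq = True  # the indented lines that follow hold the residues
        elif in_seq and raw.startswith(" "):
            parts.append("".join(raw.split()))  # drop the spaces between blocks
    return entries
```

All other line codes (`DT`, `DE`, `RA`, `CC`, and so on) are ignored by this sketch.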