Brief manual on the database ENPDB

Enter the database ENPDB here. It works under the SRS system.

Introduction

Experimentally determined information on the 3-dimensional structures of proteins, DNA and RNA is accumulated in the Protein Data Bank (PDB). This databank serves as the single worldwide official source of scientific information on the 3-dimensional structures of macromolecules. The PDB stores atomic coordinates, bibliographic citations, primary and secondary structure information, as well as crystallographic structure factors and NMR experimental data.

The EnPDB database is made by reformatting PDB in a way that allows extended search possibilities by means of SRS. To support an effective search for 3-dimensional structures, extra fields were introduced in EnPDB. First, there are fields comprising information on structural features of the molecules: the number and lengths of chains, the number of helices, the number of beta-sheets, the presence of DNA/RNA molecules in complex with protein, the number of protein molecules in the complex, and the number of heteroatoms. Second, complex PDB fields were split into simple ones. The SOURCE field was separated into the Gene, MolSource, Source and Synthesis fields. The BioUnit field was singled out from COMPND. The Heterogen field was derived from HETNAM and HETSYN. Coordinates of atoms are not included in EnPDB. A detailed description of the fields is given below.

Complete list of the fields

ID, Header, Date, Title, Compound, Molecule, Synonym, EC, BioUnit, Gene, MolSource, Source, Synthesis, Keyword, Technique, Author, Jrnl, JrnlAuthor, JrnlTitle, JrnlRef, JrnlVolume, JrnlYear, Remark_1, Resolution, ChainAmount, ChainSizes, HelixAmount, SheetAmount, DnaRnaAmount, ProteinAmount, HetAmount, Heterogen, LinkEmbl, LinkPir, LinkSwissProt, LinkTransfac, LinkTrrd4.

Fields description

ID: PDB ID code.
This identifier is unique within PDB.
Header: PDB classification for the entry.
Date: Deposition date, that is, the date when the coordinates were received by PDB. In most cases the field contains the date when the entry was created. It is always stored in the index as an eight-digit number in the format "yyyymmdd" (y = year, m = month, d = day), e.g., "19940117". The date to be searched can also be typed in a different, more intuitive format: "dd-mmm-yy" or "dd-mmm-yyyy", e.g., "1-jan-97" or "01-jan-1997".
Title: Contains the title of the experiment or analysis described in the entry. The field content corresponds to that in PDB.
Compound: Describes the macromolecules contained in the entry. Each macromolecule of the entry is defined by a set of token:value pairs and is referred to as a component of the COMPOUND field. For each macromolecular component, the molecule name, synonyms, the number assigned by the Enzyme Commission (EC), and other relevant details are specified. The field content corresponds to that in PDB.
Molecule: This specialized field is not present in the entry as a separate line. It contains the names of macromolecules from the COMPND of PDB and is designed for searching entries by macromolecule name.
Synonym: This specialized field is not present in the entry as a separate line. It contains the synonymous names of macromolecules from the COMPND of PDB and is designed for searching entries by macromolecule synonym.
EC: Contains the Enzyme Commission number associated with the molecule. If there is more than one EC number, they are presented as a comma-separated list.
BioUnit: If a MOLECULE functions as part of a larger biological unit, the entire functional unit may be described. The field content is selected from the COMPND of PDB using the token BIOLOGICAL_UNIT.
Gene: Identifies the gene through the gene names taken from the SOURCE field of PDB.
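The two input formats accepted for the Date field, and the eight-digit form stored in the index, can be illustrated with a small conversion routine. This is only a sketch in Python, not the SRS implementation: the function name to_index_date and the rule for two-digit years (70 to 99 are read as 19xx, 00 to 69 as 20xx) are assumptions made for illustration.

```python
import datetime
import re

# English three-letter month abbreviations, matched case-insensitively.
MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}

DATE_PATTERN = re.compile(r"(\d{1,2})-([A-Za-z]{3})-(\d{2}|\d{4})")


def to_index_date(text, pivot=70):
    """Convert "dd-mmm-yy" or "dd-mmm-yyyy" to the "yyyymmdd" index form.

    `pivot` is a hypothetical cut-off for two-digit years: values at or
    above it are read as 19xx, values below it as 20xx.
    """
    match = DATE_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized date: {text!r}")
    day_text, month_text, year_text = match.groups()
    month = MONTHS.get(month_text.lower())
    if month is None:
        raise ValueError(f"unknown month in date: {text!r}")
    year = int(year_text)
    if len(year_text) == 2:
        year += 1900 if year >= pivot else 2000
    # datetime.date raises ValueError for impossible dates such as 31-feb.
    date = datetime.date(year, month, int(day_text))
    return f"{date.year:04d}{date.month:02d}{date.day:02d}"


if __name__ == "__main__":
    for sample in ("1-jan-97", "01-jan-1997", "17-jan-1994"):
        print(sample, "->", to_index_date(sample))
```

Both example spellings from the Date description, "1-jan-97" and "01-jan-1997", map to the same index value under these assumptions.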
MolSource: Specifies the biological and/or chemical sources of all the biological molecules in the entry. There are three values: BIOLOGICAL, SYNTHETIC, and MIXED. SYNTHETIC means that all the molecules in the entry were chemically synthesized; BIOLOGICAL means that none of the molecules were synthesized; MIXED means that some molecules were synthesized, whereas the rest are natural.
Source: This field specifies the biological source of each biological molecule in the entry. Sources are described by both the common and scientific names, e.g., genus and species. Strain and/or cell line for immortalized cells are given when they help to uniquely identify the biological entity studied. Note that the content of this field is not a replica of the SOURCE of PDB. The original PDB field is divided into two parts: all the information concerning the biological source is retained in this field, while all the data related to synthesis are placed in the new field Synthesis. We believe that this division allows the user to specify the region to be searched more precisely.
Synthesis: This field specifies the data on expression systems, e.g., strain, variant, cell line, etc. The content originates from the SOURCE of PDB. See also the description of the field Source.
Keyword: Contains keywords describing the macromolecule. The content corresponds to that of KEYWDS of PDB.
Technique: Identifies the experimental technique used. This may refer to the type of radiation and sample, or include the spectroscopic or simulation technique. The content originates from the EXPDTA of PDB.
Author: Indicates the names of the experts responsible for the contents of the entry and corresponds to the AUTHOR field in PDB.
Jrnl: Indicates the reference to the original publication that describes the experiment and defines the coordinate set. Its content originates from the JRNL field of PDB.
JrnlAuthor: Contains the list of authors of the cited paper or of the contribution to a larger work. Its content originates from the JRNL field of PDB.
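The division of the PDB SOURCE field into Gene, Source and Synthesis, together with the three MolSource values, can be sketched as follows. The actual rules used to build EnPDB are not given in this manual, so the routing below is an assumption: tokens named GENE go to Gene, tokens starting with EXPRESSION_SYSTEM go to Synthesis, a "SYNTHETIC: YES" token marks a chemically synthesized molecule, and everything else stays in Source. The function names and the sample record are invented for illustration.

```python
def parse_source(text):
    """Split 'TOKEN: value; TOKEN: value' text into one dict per MOL_ID."""
    molecules = []
    for part in text.split(";"):
        if ":" not in part:
            continue  # skip empty or malformed pieces
        token, value = part.split(":", 1)
        token, value = token.strip().upper(), value.strip()
        # A MOL_ID token (or the very first token) starts a new molecule.
        if token == "MOL_ID" or not molecules:
            molecules.append({})
        molecules[-1][token] = value
    return molecules


def mol_source(synthetic_flags):
    """Return BIOLOGICAL, SYNTHETIC or MIXED for per-molecule flags."""
    flags = list(synthetic_flags)
    if not flags:
        raise ValueError("no molecules given")
    if all(flags):
        return "SYNTHETIC"
    if not any(flags):
        return "BIOLOGICAL"
    return "MIXED"


def split_source(text):
    """Route SOURCE tokens into Gene, Source, Synthesis and MolSource.

    The routing rules are assumptions, not the documented EnPDB procedure.
    """
    fields = {"Gene": [], "Source": [], "Synthesis": []}
    flags = []
    for molecule in parse_source(text):
        flags.append(molecule.get("SYNTHETIC", "").upper().startswith("YES"))
        for token, value in molecule.items():
            if token in ("MOL_ID", "SYNTHETIC"):
                continue
            if token == "GENE":
                fields["Gene"].append(value)
            elif token.startswith("EXPRESSION_SYSTEM"):
                fields["Synthesis"].append(f"{token}: {value}")
            else:
                fields["Source"].append(f"{token}: {value}")
    fields["MolSource"] = mol_source(flags)
    return fields


if __name__ == "__main__":
    # Invented sample record, not taken from a real PDB entry.
    sample = (
        "MOL_ID: 1; ORGANISM_SCIENTIFIC: HOMO SAPIENS; ORGANISM_COMMON: HUMAN; "
        "GENE: GENE_X; EXPRESSION_SYSTEM: ESCHERICHIA COLI; "
        "MOL_ID: 2; SYNTHETIC: YES"
    )
    print(split_source(sample))
```

In the sample, one molecule comes from an organism and the other is marked as synthesized, so the sketch reports MolSource as MIXED.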
JrnlTitle: Specifies the title of the reference, i.e., the title of a journal article, chapter, or part of a book. Its content originates from the JRNL field of PDB.
JrnlRef: Contains the name of the publication. Its content originates from the JRNL field of PDB.
JrnlVolume: Contains the volume of the publication. Its content originates from the JRNL field of PDB.
JrnlYear: Indicates the year of the publication. Its content originates from the JRNL field of PDB.
Remark_1: Lists important publications related to the structure described in the entry. These citations are chosen by the depositor. The content originates from the REMARK 1 of PDB.
Resolution: Indicates the highest resolution, in Angstroms, used in building the model. Derived from REMARK 2 in the PDB file. No resolution is given for NMR structures and models.
ChainAmount: Indicates the number of chains in the entry, calculated from the SEQRES field of PDB.
ChainSizes: Specifies the lengths of the chains in the entry, calculated from the data contained in the SEQRES of PDB.
HelixAmount: Indicates the number of helices in the entry and is derived from the MASTER field of PDB.
SheetAmount: Indicates the number of beta-sheet structures in the entry and is derived from the MASTER field of PDB.
DnaRnaAmount: Specifies the number of DNA/RNA strands in the entry, calculated from the SEQRES field of PDB.
ProteinAmount: Indicates the number of protein chains, calculated from the SEQRES field of PDB.
HetAmount: Indicates the number of unusual residues, such as prosthetic groups, inhibitors, solvent molecules, and ions, for which coordinates are supplied in PDB. The data are calculated from the HET field of PDB.
Heterogen: Gives the chemical names and synonyms of unusual residues, such as prosthetic groups, inhibitors, solvent molecules, and ions, for which coordinates are supplied in PDB. The data are derived from the HETNAM and HETSYN fields of PDB.
LinkEmbl: Links to EMBL Data Bank through SWISS-PROT.
For example, if a SWISS-PROT entry contains references to both a PDB entry and an EMBL entry, those PDB and EMBL entries are considered linked.
LinkPir: Links to PIR Data Bank through SWISS-PROT.
LinkSwissProt: Links to SWISS-PROT Data Bank, whose entries contain references to PDB.
LinkTransfac: Links to Transfac Data Bank through SWISS-PROT. For example, if a SWISS-PROT entry contains references to both a PDB entry and a TRANSFAC entry, those PDB and TRANSFAC entries are considered linked.
LinkTrrd4: Links to TRRD Data Bank.
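The indirect linking through SWISS-PROT described for LinkEmbl and LinkTransfac amounts to joining two cross-reference lists on the SWISS-PROT entry that carries both. A minimal sketch in Python, assuming the cross-references are already available as a plain dictionary; the function name derive_links, the data layout and all identifiers in the sample are hypothetical.

```python
from collections import defaultdict


def derive_links(swissprot_xrefs, target_db):
    """Link PDB entries to `target_db` entries via shared SWISS-PROT entries.

    `swissprot_xrefs` maps a SWISS-PROT entry name to a dict of
    database name -> list of referenced identifiers (assumed layout).
    """
    links = defaultdict(set)
    for xrefs in swissprot_xrefs.values():
        targets = xrefs.get(target_db, ())
        for pdb_id in xrefs.get("PDB", ()):
            # Every target referenced by the same SWISS-PROT entry
            # is considered linked to this PDB entry.
            links[pdb_id].update(targets)
    # Drop PDB entries that ended up with no link in the target database.
    return {pdb_id: sorted(ids) for pdb_id, ids in links.items() if ids}


if __name__ == "__main__":
    # Invented identifiers, for illustration only.
    xrefs = {
        "SP_A": {"PDB": ["1AAA", "1BBB"], "EMBL": ["EM_1"]},
        "SP_B": {"PDB": ["1AAA"], "EMBL": ["EM_2"], "TRANSFAC": ["TF_1"]},
        "SP_C": {"EMBL": ["EM_3"]},
    }
    print(derive_links(xrefs, "EMBL"))
    print(derive_links(xrefs, "TRANSFAC"))
```

A SWISS-PROT entry that references no PDB entry (SP_C in the sample) contributes no links, which matches the rule that both references must appear in the same SWISS-PROT entry.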