
Brief manual on the EnPDB database

The EnPDB database runs under the SRS system.

Introduction

Experimentally determined information on the 3-dimensional structures of proteins, DNA, and RNA is accumulated in the Protein Data Bank (PDB). This databank serves as the single worldwide official source of scientific information on the 3-dimensional structures of macromolecules. The PDB stores atomic coordinates, bibliographic citations, primary and secondary structure information, as well as crystallographic structure factors and NMR experimental data. The EnPDB database is made by reformatting PDB in a way that allows extended search possibilities by means of SRS.

To support effective searches for 3-dimensional structures, extra fields were introduced in EnPDB. First, there are fields holding information on structural features of the molecule: the number and lengths of chains, the number of helices, the number of beta-sheets, the presence of DNA/RNA molecules in complex with protein, the number of protein molecules in the complex, and the number of heteroatoms. Second, complex PDB fields were split into simple ones. The SOURCE field was separated into the Gene, MolSource, Source, and Synthesis fields. The BioUnit field was singled out from COMPND. The Heterogen field was derived from HETNAM and HETSYN.

Atomic coordinates are not included in EnPDB. A detailed description of the fields is given below.

Complete list of the fields

ID, Header, Date, Title, Compound, Molecule, Synonym, EC, BioUnit, Gene, MolSource, Source, Synthesis, Keyword, Technique, Author, Jrnl, JrnlAuthor, JrnlTitle, JrnlRef, JrnlVolume, JrnlYear, Remark_1, Resolution, ChainAmount, ChainSizes, HelixAmount, SheetAmount, DnaRnaAmount, ProteinAmount, HetAmount, Heterogen, LinkEmbl, LinkPir, LinkSwissProt, LinkTransfac, LinkTrrd4.

Field descriptions

ID

PDB ID code. This identifier is unique within PDB.

Header

PDB classification for the entry

Date

The deposition date is the date on which the coordinates were received by PDB.

In most cases the Date field contains the date when the entry was created. It is always stored in the index as an eight-digit number in the format "yyyymmdd" (y = year, m = month, d = day), e.g., "19940117". A date to be searched can also be typed in a different, more intuitive format: "dd-mmm-yy" or "dd-mmm-yyyy", e.g., "1-jan-97" or "01-jan-1997".
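The relation between the typed formats and the indexed form can be illustrated with a minimal Python sketch. It is not part of SRS: the function name `to_index_date` is hypothetical, English month abbreviations are assumed, and the pivot used to expand two-digit years is an assumption, since this manual does not say how SRS resolves them.

```python
import re

# English three-letter month abbreviations (assumed).
_MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}


def to_index_date(text, pivot=50):
    """Convert 'dd-mmm-yy', 'dd-mmm-yyyy' or 'yyyymmdd' to 'yyyymmdd'.

    Two-digit years >= pivot are read as 19yy, the rest as 20yy.
    This pivot rule is an assumption made for the sketch.
    """
    text = text.strip().lower()
    if re.fullmatch(r"\d{8}", text):
        return text  # already in the indexed form
    match = re.fullmatch(r"(\d{1,2})-([a-z]{3})-(\d{2}|\d{4})", text)
    if not match or match.group(2) not in _MONTHS:
        raise ValueError(f"unrecognized date: {text!r}")
    day = int(match.group(1))
    month = _MONTHS[match.group(2)]
    year_text = match.group(3)
    if len(year_text) == 2:
        short_year = int(year_text)
        year = 1900 + short_year if short_year >= pivot else 2000 + short_year
    else:
        year = int(year_text)
    return f"{year:04d}{month:02d}{day:02d}"
```

With these assumptions, both "1-jan-97" and "01-jan-1997" map to "19970101".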

Title

Contains the title of the experiment or analysis described in the entry. The field content corresponds to that in PDB.

Compound

Describes the macromolecules contained in the entry. Each macromolecule of the entry is defined with a set of token:value pairs, and is referred to as a component of the COMPOUND field. The field content corresponds to that in PDB. For each macromolecular component, the molecule name, synonyms, number assigned by the Enzyme Commission (EC), and other relevant details are specified.
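As an illustration of the token:value structure, the following Python sketch splits a COMPND-style string into one dictionary per component. It assumes that pairs are separated by semicolons and that each component starts with a MOL_ID token; the function names are hypothetical, and the sketch does not handle values that themselves contain semicolons or colons inside names.

```python
def parse_compound(text):
    """Split 'TOKEN: value; TOKEN: value; ...' into a list of components.

    A new component is started at every MOL_ID token (assumed layout).
    """
    components = []
    for item in text.split(";"):
        if ":" not in item:
            continue  # skip empty or malformed fragments
        token, value = item.split(":", 1)
        token, value = token.strip().upper(), value.strip()
        if token == "MOL_ID" or not components:
            components.append({})
        components[-1][token] = value
    return components


def ec_numbers(component):
    """Return the EC numbers of one component as a list (comma-separated in the source)."""
    return [ec.strip() for ec in component.get("EC", "").split(",") if ec.strip()]
```

For an illustrative string (not taken from a real entry) such as "MOL_ID: 1; MOLECULE: LYSOZYME; EC: 3.2.1.17; MOL_ID: 2; MOLECULE: ANTIBODY FAB FRAGMENT; CHAIN: H, L", `parse_compound` yields two components, and the MOLECULE, SYNONYM, EC, and BIOLOGICAL_UNIT values can then be looked up by token.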

Molecule

This specialized field is not present in the entry as a separate line. It contains the names of macromolecules from the COMPND field of PDB and is designed for searching entries by macromolecule name.

Synonym

This specialized field is not present in the entry as a separate line. It contains the synonyms of macromolecule names from the COMPND field of PDB and is designed for searching entries by these synonyms.

EC

Contains the Enzyme Commission number associated with the molecule. If there is more than one EC number, they are presented as a comma-separated list.

BioUnit

If a MOLECULE functions as a part of a larger biological unit, the entire functional unit may be described. The field content is selected from the COMPND of PDB using the token BIOLOGICAL_UNIT.

Gene

Identifies the gene through the gene names taken from the SOURCE field of PDB.

MolSource

Specifies biological and/or chemical sources of all the biological molecules in the entry.

There are three possible values: BIOLOGICAL, SYNTHETIC, and MIXED. SYNTHETIC means that all the molecules in the entry were chemically synthesized; BIOLOGICAL means that none of the molecules were synthesized; and MIXED means that some molecules were synthesized, whereas the rest are natural.
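The rule for the three values can be written as a short Python sketch. The function name `mol_source` and its input, one boolean per molecule saying whether that molecule was chemically synthesized, are hypothetical; how EnPDB actually determines this from the SOURCE field is not described here.

```python
def mol_source(synthesized_flags):
    """Classify an entry from per-molecule 'was synthesized' flags."""
    flags = list(synthesized_flags)
    if not flags:
        raise ValueError("an entry needs at least one molecule")
    if all(flags):
        return "SYNTHETIC"   # every molecule was chemically synthesized
    if not any(flags):
        return "BIOLOGICAL"  # no molecule was synthesized
    return "MIXED"           # some synthesized, the rest natural
```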

Source

This field specifies the biological source of each biological molecule in the entry. Sources are described by both the common and scientific names, e.g., genus and species. Strain and/or cell line for immortalized cells are given when they help to uniquely identify the biological entity studied.

Note that the content of this field is not a replica of the SOURCE field of PDB. The original PDB field is divided into two parts: all the information concerning the biological source is retained in this field, while all the data related to synthesis are placed in the new field Synthesis. We believe that this division allows the user to specify the region to be searched more precisely.

Synthesis

This field specifies the data on expression systems, e.g., strain, variant, cell line, etc. The content originates from the SOURCE field of PDB. See also the description of the Source field.

Keyword

Contains keywords describing the macromolecule. The content corresponds to that of KEYWDS of PDB.

Technique

Identifies the experimental technique used. This may refer to the type of radiation and sample, or include the spectroscopic or simulation technique. The content originates from the EXPDTA of PDB.

Author

Indicates the names of the experts responsible for the contents of the entry and corresponds to the AUTHOR field in PDB.

Jrnl

Gives the reference to the original publication that describes the experiment and defines the coordinate set. Its content originates from the JRNL field of PDB.

JrnlAuthor

Contains the list of authors of the cited paper or of the contribution to a larger work. Its content originates from the JRNL field of PDB.

JrnlTitle

Specifies the title of the reference and is used for the title of a journal article, chapter, or part of a book. Its content originates from the JRNL field of PDB.

JrnlRef

Contains the name of the publication. Its content originates from the JRNL field of PDB.

JrnlVolume

Contains the volume of the publication. Its content originates from the JRNL field of PDB.

JrnlYear

Indicates the year of the publication. Its content originates from the JRNL field of PDB.

Remark_1

Lists important publications related to the structure described in the entry. These citations are chosen by the depositor. The content originates from the REMARK 1 of PDB.

Resolution

Indicates the highest resolution, in Angstroms, used in building the model. The value is derived from REMARK 2 of the PDB file. No resolution is given for NMR structures and models.

ChainAmount

Indicates the number of chains in the entry, calculated from the SEQRES field of PDB.

ChainSizes

Specifies the lengths of the chains in the entry, calculated from the data contained in the SEQRES of PDB.

HelixAmount

Indicates the number of helices in the entry and is derived from the MASTER field of PDB.

SheetAmount

Indicates the number of beta-sheet structures in the entry and is derived from the MASTER field of PDB.
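As a rough sketch of how such counts can be read, the Python function below extracts two counters from a MASTER line. It assumes the standard fixed-column layout of the PDB format, in which columns 26-30 hold the number of HELIX records and columns 31-35 the number of SHEET records. The function name is hypothetical, and whether EnPDB applies any further processing to these counters is not stated in this manual.

```python
def master_counts(line):
    """Read the HELIX and SHEET record counts from a PDB MASTER line."""
    if not line.startswith("MASTER"):
        raise ValueError("not a MASTER record")
    return {
        "HelixAmount": int(line[25:30]),  # columns 26-30: numHelix
        "SheetAmount": int(line[30:35]),  # columns 31-35: numSheet
    }
```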

DnaRnaAmount

Specifies the number of DNA/RNA strands in the entry, calculated from the SEQRES field of PDB.

ProteinAmount

Indicates the number of protein chains, calculated from the SEQRES field of PDB.
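The four SEQRES-derived fields (ChainAmount, ChainSizes, DnaRnaAmount, ProteinAmount) could be computed along the following lines. This Python sketch assumes the standard fixed-column SEQRES layout (chain identifier in column 12, chain length in columns 14-17, residue names from column 20). The rule used to tell nucleic acid chains from protein chains, namely that every residue name belongs to a small nucleotide set, is an assumption made for illustration and not the documented EnPDB procedure; `NUCLEOTIDES` and `seqres_summary` are hypothetical names.

```python
# Residue names treated as nucleotides (assumed set, for illustration only).
NUCLEOTIDES = {"A", "C", "G", "T", "U", "DA", "DC", "DG", "DT", "DU"}


def seqres_summary(lines):
    """Summarize chains from the SEQRES records of one PDB entry."""
    lengths = {}   # chain ID -> declared number of residues
    residues = {}  # chain ID -> residue names collected from all lines
    for line in lines:
        if not line.startswith("SEQRES"):
            continue
        chain = line[11]                      # column 12
        lengths[chain] = int(line[13:17])     # columns 14-17
        residues.setdefault(chain, []).extend(line[19:70].split())
    nucleic = sum(
        1 for names in residues.values()
        if names and all(name in NUCLEOTIDES for name in names)
    )
    return {
        "ChainAmount": len(lengths),
        "ChainSizes": list(lengths.values()),
        "DnaRnaAmount": nucleic,
        "ProteinAmount": len(lengths) - nucleic,
    }
```

Chains are reported in the order in which they first appear in the SEQRES records.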

HetAmount

Indicates the number of non-standard residues, such as prosthetic groups, inhibitors, solvent molecules, and ions, for which coordinates are supplied in PDB. The data are calculated from the HET field of PDB.

Heterogen

Gives the chemical names and synonyms of non-standard residues, such as prosthetic groups, inhibitors, solvent molecules, and ions, for which coordinates are supplied in PDB. The data are taken from the HETNAM and HETSYN fields of PDB.

LinkEmbl

Links to the EMBL Data Bank through SWISS-PROT.

For example, if a SWISS-PROT entry contains references to both PDB and EMBL entries, the PDB and EMBL entries are considered linked.
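This kind of indirect link can be sketched in Python as a join over SWISS-PROT cross-references. The input structure (a mapping from SWISS-PROT entry names to their cross-references, grouped by database name) and the function name `transitive_links` are hypothetical; the identifiers in the usage note are placeholders, not real entries.

```python
from collections import defaultdict


def transitive_links(swissprot_xrefs, target):
    """Link PDB IDs to entries of `target` that share a SWISS-PROT entry.

    swissprot_xrefs: {swissprot_entry: {"PDB": [...], "EMBL": [...], ...}}
    target: name of the other database, e.g. "EMBL".
    """
    links = defaultdict(set)
    for xrefs in swissprot_xrefs.values():
        for pdb_id in xrefs.get("PDB", ()):
            links[pdb_id].update(xrefs.get(target, ()))
    # Keep only PDB entries that actually gained a link.
    return {pdb_id: sorted(ids) for pdb_id, ids in links.items() if ids}
```

For instance, with `{"SP_1": {"PDB": ["1ABC"], "EMBL": ["X00001"]}}` and `target="EMBL"`, the PDB entry 1ABC is linked to X00001. The same join applies to any other database referenced from SWISS-PROT by changing `target`.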

LinkPir

Links to the PIR Data Bank through SWISS-PROT.

LinkSwissProt

Links to the SWISS-PROT Data Bank, whose entries contain references to PDB.

LinkTransfac

Links to the TRANSFAC Data Bank through SWISS-PROT.

For example, if a SWISS-PROT entry contains references to both PDB and TRANSFAC entries, the PDB and TRANSFAC entries are considered linked.

LinkTrrd4

Links to the TRRD Data Bank.
