3D-MatchDB description

3D-MatchDB

3D-MatchDB compares one protein 3D structure against a database of structures. Searching a database of 3D structures requires high-performance algorithms, so 3D-MatchDB implements a rapid algorithm that aligns structures by their secondary structure elements (helices and beta-sheets). To speed up the search further, 3D-MatchDB uses a preprocessed PDB database in which the secondary structure elements have been calculated in advance. The current version of the database contains 12834 protein molecules from PDB whose primary structure homology is below 98%.
Using the rapid algorithm, 3D-MatchDB performs a pairwise structural alignment of the query protein with every protein in the preprocessed database and calculates RMSD, Zscore, Aligned Size and the number of Gaps for each alignment. The result is presented to the user as a sorted list of proteins that are structurally homologous to the query protein. The list contains only those proteins whose alignment with the query protein has an RMSD below 5 angstroms and a Zscore above 3.2.
To obtain the atomic coordinates of a pairwise structural alignment, select the protein of interest from the list of structural homologs. The atomic coordinates of the alignment between the query protein and the selected protein are then calculated with the 3DMatch software.
Note that the RMSD, Zscore, Aligned Size and number of Gaps shown in the list of structural homologs are calculated for alignments produced by the rapid algorithm and may differ slightly from the values calculated by the 3DMatch software.
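The filtering and sorting of the hit list can be illustrated with a minimal Python sketch. It is not the 3D-MatchDB implementation: the Hit record, the select_homologs function and the two made-up entries XXXX and YYYY are hypothetical, and the sort directions (highest Zscore first, lowest RMSD first) are an assumption. Only the two thresholds come from the description above.

```python
from dataclasses import dataclass

# Thresholds quoted in the description above.
MAX_RMSD = 5.0    # angstroms
MIN_ZSCORE = 3.2


@dataclass
class Hit:
    """One row of the hit list (hypothetical record, for illustration only)."""
    pdb_id: str
    chain: str
    zscore: float
    rmsd: float
    aligned: int
    gaps: int


def select_homologs(hits, sort_by="zscore"):
    """Keep hits that pass both thresholds, then sort them.

    Assumed ordering: highest Zscore first, or lowest RMSD first.
    """
    kept = [h for h in hits if h.rmsd < MAX_RMSD and h.zscore > MIN_ZSCORE]
    if sort_by == "zscore":
        return sorted(kept, key=lambda h: h.zscore, reverse=True)
    if sort_by == "rmsd":
        return sorted(kept, key=lambda h: h.rmsd)
    raise ValueError(f"unknown sort key: {sort_by!r}")


hits = [
    Hit("1BAN", "A", 6.6, 0.31, 108, 0),
    Hit("XXXX", "A", 2.9, 1.10, 60, 4),   # made-up entry: Zscore too low
    Hit("YYYY", "B", 4.1, 5.60, 75, 9),   # made-up entry: RMSD too high
    Hit("1BSB", "A", 6.6, 0.17, 108, 0),
]

for h in select_homologs(hits, sort_by="rmsd"):
    print(h.pdb_id, h.chain, h.zscore, h.rmsd)
```

Both made-up entries are dropped because each fails one of the two thresholds; the remaining hits are printed in order of increasing RMSD.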

Input data.

The input to the program is a PDB file and the identifier of a polypeptide chain of the query protein. If no chain identifier is provided, the alignment uses the first polypeptide chain found in the protein.
To sort the structure comparison results by Zscore or by RMSD, check the corresponding box, "Sort by Zscore" or "Sort by RMSD".

Output data.

The result of a structure database search is presented as a list (table) of structural homologs. For each protein it contains the PDB identifier, the chain identifier, the molecule description taken from the COMPND field of the PDB entry, and the RMSD, Zscore, Aligned Size and Gaps values.
To obtain a protein structure alignment, select the corresponding row in the list (table) of structural homologs and then choose "Get structure alignment as text" or "View structure alignment using 3D-Explorer". The structure alignment of the query protein and the protein selected in the table is created with the 3DMatch software.
The structural alignment is presented in PDB format, with the two aligned structures given different chain IDs. The REMARK field contains the RMSD and Zscore values and the structure-based sequence alignment.
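For users who post-process the text alignment, the summary values can be read from the REMARK lines with a few regular expressions. The sketch below is only an illustration: the function name parse_alignment_remarks and the dictionary keys are hypothetical, and the patterns assume the REMARK labels look exactly as in the example of data output in this description.

```python
import re

# Label patterns assumed from the example output in this description.
PATTERNS = {
    "rmsd": (r"RMSD on Ca-atoms\s*:\s*(-?[\d.]+)", float),
    "zscore": (r"Zscore\s*:\s*(-?[\d.]+)", float),
    "aligned": (r"Aligned positions\s*:\s*(\d+)", int),
    "gaps": (r"Gap positions\s*:\s*(\d+)", int),
    "identity": (r"Sequence identity\s*:\s*([\d.]+)", float),
}


def parse_alignment_remarks(pdb_text):
    """Collect summary values from the REMARK lines of an alignment file."""
    values = {}
    for line in pdb_text.splitlines():
        if not line.startswith("REMARK"):
            continue  # ATOM and other records carry no summary values
        for key, (pattern, cast) in PATTERNS.items():
            match = re.search(pattern, line)
            if match:
                values[key] = cast(match.group(1))
    return values


sample = """\
REMARK   1 RMSD on Ca-atoms :  0.313 angstrom
REMARK   1 Zscore           :  6.580
REMARK   1 Aligned positions:    108
REMARK   1 Gap positions    :      0
REMARK   1 Sequence identity:   99.1 (%)
ATOM      1  N   VAL A   3     -12.310  -8.243   5.307  1.00 47.79           N
"""

print(parse_alignment_remarks(sample))
```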

Rapid comparison of 3D structures.

The rapid comparison of 3D structures is based on a slightly modified version of the algorithm for alignment by secondary structure elements (SSE) implemented in 3DMatch. A detailed description of that algorithm is given in the 3DMatch description. The modification consists of checking the alignment quality at every principal step of the algorithm: execution proceeds to the next step only if the quality satisfies defined requirements.
The first check occurs after the alignment core is built. If the core has an RMSD above the user-defined threshold or contains fewer SSEs than the user-defined threshold, the structure is skipped.
The second check occurs after the alignment by SSE is transformed into an alignment by Ca atoms.

Note that the absence of structural homology between a pair of proteins is almost completely characterized by the alignment core: without structural homology, the core will either have too large an RMSD or be too short. Thus most PDB structures that have no structural homology with the query structure are eliminated from further calculation immediately after the alignment core is built. The algorithm for building the alignment core is very fast, which makes it possible to scan the PDB in real time.
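The control flow of this two-stage screening can be sketched in Python as follows. This is a schematic illustration under stated assumptions, not the 3DMatch code: screen_database, build_core and refine_to_ca are hypothetical names, the alignment steps are passed in as callables, and the second check is assumed to reuse the RMSD and Zscore limits of the final hit list, which the description does not state explicitly.

```python
def screen_database(query, database, build_core, refine_to_ca,
                    max_core_rmsd, min_core_sse,
                    max_rmsd=5.0, min_zscore=3.2):
    """Scan the database, rejecting entries as early as possible.

    build_core(query, entry) returns {"rmsd": ..., "n_sse": ...} for the
    alignment core; refine_to_ca(query, entry, core) returns
    {"rmsd": ..., "zscore": ...} for the Ca-atom alignment.
    Both callables are placeholders for the real alignment steps.
    """
    hits = []
    for entry in database:
        core = build_core(query, entry)               # fast SSE-level step
        if core["rmsd"] > max_core_rmsd or core["n_sse"] < min_core_sse:
            continue                                  # first check failed
        full = refine_to_ca(query, entry, core)       # slower Ca-level step
        if full["rmsd"] < max_rmsd and full["zscore"] > min_zscore:
            hits.append((entry, full))                # second check passed
    return hits


# Toy stand-ins so the sketch runs; real steps would compare structures.
toy_cores = {
    "good": {"rmsd": 1.0, "n_sse": 5},
    "short": {"rmsd": 1.0, "n_sse": 1},   # core too short
    "loose": {"rmsd": 9.0, "n_sse": 5},   # core RMSD too large
}
refined = []  # records which entries reached the expensive step


def toy_build_core(query, entry):
    return toy_cores[entry]


def toy_refine_to_ca(query, entry, core):
    refined.append(entry)
    return {"rmsd": 0.5, "zscore": 6.0}


found = screen_database("query", ["good", "short", "loose"],
                        toy_build_core, toy_refine_to_ca,
                        max_core_rmsd=4.0, min_core_sse=3)
print([entry for entry, _ in found], refined)
```

In the toy run only one of the three entries reaches the Ca-level step, which is the point of the early rejection: the expensive refinement is spent only on candidates whose core already looks plausible.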


Example of data output.
STRUCTURE DATABASE SEARCHING.

1BAN:A ZScore= 6.6 RMSD=  0.31 Aligned=108 Size=108 Gaps=0 Name=BARNASE 
	(G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT  WITH SER 91 REPLACED BY ALA (S91A)
2RBI:A ZScore= 6.6 RMSD=  0.37 Aligned=108 Size=108 Gaps=0 Name=MOL_ID: 
	1;  MOLECULE: RIBONUCLEASE;  CHAIN: A, B;  SYNONYM: BINASE, EXTRACELLULAR 
	RIBONUCLEASE FROM BACILLUS  INTERMEDIUS;  EC: 3.1.27.-;  ENGINEERED: YES;  MUTATION: H101N
1A2P:A ZScore= 6.6 RMSD=  0.00 Aligned=108 Size=108 Gaps=0 Name=MOL_ID: 
	1;  MOLECULE: BARNASE;  CHAIN: A, B, C;  EC: 3.1.27.-;  ENGINEERED: YES
1BSB:A ZScore= 6.6 RMSD=  0.17 Aligned=108 Size=108 Gaps=0 Name=BARNASE 	
	(G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT  WITH ILE 76 REPLACED BY VAL (I76V)
1BNS:A ZScore= 6.6 RMSD=  0.27 Aligned=108 Size=108 Gaps=0 Name=BARNASE 
	(G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT  WITH THR 26 REPLACED BY ALA (T26A)
1BNG:A ZScore= 6.6 RMSD=  0.22 Aligned=108 Size=108 Gaps=0 Name=BARNASE 
	(E.C.3.1.27.-) DISULFIDE MUTANT WITH SER 85  REPLACED BY CYS AND HIS 102 REPLACED BY CYS (S85C,H102C)
1BAO:A ZScore= 6.6 RMSD=  0.20 Aligned=108 Size=108 Gaps=0 Name=BARNASE 
	(G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT  WITH TYR 78 REPLACED BY PHE (Y78F)
1BRI:A ZScore= 6.6 RMSD=  0.23 Aligned=107 Size=107 Gaps=1 Name=BARNASE 
	(E.C.3.1.27.-) MUTANT WITH ILE 76 REPLACED BY ALA  (I76A)
1BRG:A ZScore= 6.6 RMSD=  0.26 Aligned=108 Size=108 Gaps=0 Name=BARNASE 
	(G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT  WITH PHE 7 REPLACED BY LEU (F7L)
1B20:A ZScore= 6.6 RMSD=  0.30 Aligned=108 Size=109 Gaps=1 Name=MOL_ID: 
	1;  MOLECULE: BARNASE;  CHAIN: A, B, C;  EC: 3.1.27.3;  ENGINEERED: YES;  MUTATION: YES
1BRK:A ZScore= 6.6 RMSD=  0.29 Aligned=108 Size=108 Gaps=0 Name=BARNASE 
	(E.C.3.1.27.-) MUTANT WITH ILE 96 REPLACED BY ALA  (I96A)
1BSC:A ZScore= 6.6 RMSD=  0.18 Aligned=108 Size=108 Gaps=0 Name=BARNASE 
	(G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT  WITH ILE 88 REPLACED BY VAL (I88V)
1BNE:A ZScore= 6.6 RMSD=  0.32 Aligned=107 Size=107 Gaps=1 Name=BARNASE 
	(E.C.3.1.27.-) DISULFIDE MUTANT WITH ALA 43  REPLACED BY CYS AND SER 80 REPLACED BY CYS (A43C,S80C)
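
The hit list above can be read back into records with a short Python sketch. It assumes the layout shown in this example (one "ID:chain key=value ..." line per hit, followed by indented continuation lines of the molecule name); the names HIT_RE and parse_hits are hypothetical and are not part of 3D-MatchDB.

```python
import re

# Pattern for the first line of each hit, as laid out in the example above.
HIT_RE = re.compile(
    r"^(?P<pdb_id>\w{4}):(?P<chain>\S)\s+"
    r"ZScore=\s*(?P<zscore>-?[\d.]+)\s+"
    r"RMSD=\s*(?P<rmsd>[\d.]+)\s+"
    r"Aligned=\s*(?P<aligned>\d+)\s+"
    r"Size=\s*(?P<size>\d+)\s+"
    r"Gaps=\s*(?P<gaps>\d+)\s+"
    r"Name=(?P<name>.*)$"
)


def parse_hits(text):
    """Parse hit lines; indented lines extend the previous hit's name."""
    hits = []
    for line in text.splitlines():
        match = HIT_RE.match(line)
        if match:
            hit = match.groupdict()
            hit["zscore"] = float(hit["zscore"])
            hit["rmsd"] = float(hit["rmsd"])
            for key in ("aligned", "size", "gaps"):
                hit[key] = int(hit[key])
            hit["name"] = " ".join(hit["name"].split())
            hits.append(hit)
        elif hits and line[:1] in (" ", "\t") and line.strip():
            # Continuation of the molecule description; collapse whitespace.
            joined = hits[-1]["name"] + " " + line
            hits[-1]["name"] = " ".join(joined.split())
    return hits


sample = (
    "1BAN:A ZScore= 6.6 RMSD=  0.31 Aligned=108 Size=108 Gaps=0 Name=BARNASE \n"
    "\t(G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT  WITH SER 91 REPLACED BY ALA (S91A)\n"
    "1BRI:A ZScore= 6.6 RMSD=  0.23 Aligned=107 Size=107 Gaps=1 Name=BARNASE \n"
    "\t(E.C.3.1.27.-) MUTANT WITH ILE 76 REPLACED BY ALA  (I76A)\n"
)

for hit in parse_hits(sample):
    print(hit["pdb_id"], hit["chain"], hit["rmsd"], hit["name"])
```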


PROTEIN STRUCTURE ALIGNMENT.

HEADER    PROTEIN STRUCTURE ALIGNMENT	
COMPND    (A) 1A2P chain A (B) 1BAN chain A
REMARK   1
REMARK   1 RMSD on Ca-atoms :  0.313 angstrom
REMARK   1 Zscore           :  6.580 
REMARK   1 Aligned positions:    108
REMARK   1 Gap positions    :      0
REMARK   1 Sequence identity:   99.1 (%)
REMARK   1      
REMARK   1 Structure based sequence alignment
REMARK   1      
REMARK   1 3     VINTFDGVADYLQTYHKLPDNYITKSEAQALGWVASKGNLADVAPGKSIGGDIFSNREGK
REMARK   1 3     VINTFDGVADYLQTYHKLPDNYITKSEAQALGWVASKGNLADVAPGKSIGGDIFSNREGK
REMARK   1      
REMARK   1 63    LPGKSGRTWREADINYTSGFRNSDRILYSSDWLIYKTTDHYQTFTKIR
REMARK   1 63    LPGKSGRTWREADINYTSGFRNSDRILYASDWLIYKTTDHYQTFTKIR
REMARK   1      
ATOM      1  N   VAL A   3     -12.310  -8.243   5.307  1.00 47.79           N
ATOM      2  CA  VAL A   3     -11.179  -7.573   4.634  1.00 41.49           C
ATOM      3  C   VAL A   3     -11.019  -6.157   5.156  1.00 34.47           C
ATOM      4  O   VAL A   3     -11.979  -5.382   5.128  1.00 34.84           O
ATOM      5  CB  VAL A   3     -11.383  -7.546   3.117  1.00 42.12           C
ATOM      6  CG1 VAL A   3     -10.536  -6.536   2.420  1.00 38.29           C
ATOM      7  CG2 VAL A   3     -11.154  -8.948   2.527  1.00 45.14           C
ATOM      8  N   ILE A   4      -9.810  -5.789   5.545  1.00 27.18           N
ATOM      9  CA  ILE A   4      -9.587  -4.366   5.973  1.00 24.08           C
ATOM     10  C   ILE A   4      -8.788  -3.683   4.864  1.00 21.31           C
ATOM     11  O   ILE A   4      -7.656  -4.064   4.576  1.00 21.63           O
ATOM     12  CB  ILE A   4      -8.731  -4.385   7.264  1.00 24.83           C
ATOM     13  CG1 ILE A   4      -9.399  -5.210   8.386  1.00 27.01           C
ATOM     14  CG2 ILE A   4      -8.372  -2.999   7.701  1.00 24.93           C
ATOM     15  CD1 ILE A   4      -8.582  -5.279   9.651  1.00 33.25           C
ATOM     16  N   ASN A   5      -9.456  -2.797   4.122  1.00 20.12           N
ATOM     17  CA  ASN A   5      -8.814  -2.164   2.982  1.00 19.67           C
ATOM     18  C   ASN A   5      -9.183  -0.706   2.810  1.00 17.24           C
ATOM     19  O   ASN A   5      -8.956  -0.171   1.716  1.00 17.10           O
ATOM     20  CB  ASN A   5      -9.048  -2.927   1.678  1.00 20.04           C
ATOM     21  CG  ASN A   5     -10.495  -2.771   1.189  1.00 20.89           C
ATOM     22  OD1 ASN A   5     -11.360  -2.364   1.950  1.00 21.76           O
ATOM     23  ND2 ASN A   5     -10.710  -3.053  -0.084  1.00 22.93           N
ATOM     24  N   THR A   6      -9.605  -0.043   3.868  1.00 15.82           N
ATOM     25  CA  THR A   6      -9.917   1.401   3.801  1.00 16.81           C
ATOM     26  C   THR A   6      -8.791   2.237   4.362  1.00 14.04           C
ATOM     27  O   THR A   6      -7.944   1.762   5.098  1.00 14.38           O
ATOM     28  CB  THR A   6     -11.207   1.679   4.628  1.00 17.16           C
ATOM     29  OG1 THR A   6     -11.008   1.226   5.948  1.00 23.19           O
ATOM     30  CG2 THR A   6     -12.404   0.966   4.043  1.00 22.55           C
ATOM     31  N   PHE A   7      -8.801   3.561   4.057  1.00 14.44           N
ATOM     32  CA  PHE A   7      -7.792   4.422   4.634  1.00 14.94           C