3D-MatchDB
3D-MatchDB compares one protein 3D structure against the structures in a database. Searching a database of 3D structures requires high-performance algorithms, so 3D-MatchDB implements a rapid algorithm for structural alignment by secondary structure elements (helix, beta-sheet). To speed up the search, 3D-MatchDB uses a preprocessed PDB database in which the secondary structure elements have been calculated in advance. The current version of the database contains 12834 protein molecules from PDB whose primary structure homology is less than 98%.
Using the rapid algorithm, 3D-MatchDB performs a pairwise structural alignment of the query protein with every protein in the preprocessed database and then calculates RMSD, Zscore, Aligned Size and number of Gaps for each alignment. The result is presented to the user as a sorted list of the proteins found to be structurally homologous to the query protein. This list contains only those proteins whose alignment with the query protein has an RMSD less than 5 angstroms and a Zscore greater than 3.2. To obtain the atomic coordinates of a pairwise structural alignment, select the protein of interest from the list of structural homologs. The atomic coordinates of the alignment of the query protein with the selected protein are then calculated by the 3DMatch software.
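The inclusion rule for the list of structural homologs can be illustrated with a short Python sketch. Only the two cutoffs (RMSD less than 5 angstroms, Zscore greater than 3.2) come from the text; the `Hit` record, the function name and the four sample hits with placeholder identifiers are invented for illustration and are not part of 3D-MatchDB.

```python
from dataclasses import dataclass

# Cutoffs quoted in the text.
MAX_RMSD = 5.0     # angstroms
MIN_ZSCORE = 3.2


@dataclass
class Hit:
    """One database protein aligned to the query (hypothetical record type)."""
    pdb_id: str
    chain: str
    rmsd: float
    zscore: float
    aligned: int
    gaps: int


def structural_homologs(hits):
    """Keep only hits with RMSD < 5 angstroms and Zscore > 3.2."""
    return [h for h in hits if h.rmsd < MAX_RMSD and h.zscore > MIN_ZSCORE]


# Invented sample data with placeholder identifiers.
hits = [
    Hit("AAAA", "A", rmsd=0.31, zscore=6.6, aligned=108, gaps=0),
    Hit("BBBB", "A", rmsd=5.80, zscore=4.1, aligned=60, gaps=7),   # RMSD too high
    Hit("CCCC", "B", rmsd=2.40, zscore=2.9, aligned=45, gaps=3),   # Zscore too low
    Hit("DDDD", "A", rmsd=1.90, zscore=7.2, aligned=101, gaps=2),
]
print([h.pdb_id for h in structural_homologs(hits)])  # ['AAAA', 'DDDD']
```

Both conditions must hold at once, so a hit with a good Zscore is still dropped if its RMSD is 5 angstroms or more.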
Note that the RMSD, Zscore, Aligned Size and number of Gaps shown in the list of structural homologs are calculated from alignments produced by the rapid algorithm and may differ slightly from the values calculated by the 3DMatch software.
The input to the program is a PDB file and the identifier of a polypeptide chain of the query protein. If no chain identifier is provided, the alignment uses the first polypeptide chain found in the protein.
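The chain-selection rule can be sketched in Python as follows. This is an illustration only, not the program's own code: the function name `pick_chain` is hypothetical, and the sketch assumes standard fixed-column PDB records, in which the chain identifier is the 22nd character of an ATOM line. It also simplifies "first polypeptide chain" to "chain of the first ATOM record".

```python
def pick_chain(pdb_lines, chain_id=None):
    """Return chain_id if given, otherwise the chain of the first ATOM record.

    Assumes fixed-column PDB records: the chain identifier is the
    22nd character (index 21) of an ATOM line.
    """
    if chain_id:
        return chain_id
    for line in pdb_lines:
        if line.startswith("ATOM") and len(line) > 21:
            return line[21]
    raise ValueError("no ATOM records found")


# Two sample records; the second was given chain B purely for illustration.
sample = [
    "HEADER    EXAMPLE",
    "ATOM      1  N   VAL A   3     -12.310  -8.243   5.307  1.00 47.79           N",
    "ATOM      2  CA  VAL B   3     -11.179  -7.573   4.634  1.00 41.49           C",
]
print(pick_chain(sample))        # A
print(pick_chain(sample, "B"))   # B
```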
To sort the results of the structure comparison by Zscore or by RMSD, check the corresponding box, "Sort by Zscore" or "Sort by RMSD".
The result of a structure database search is presented as a list (table) of structural homologs. For each protein found, it gives the PDB identifier, the chain identifier, the molecule description taken from the COMPND field of the PDB entry, and the RMSD, Zscore, Aligned Size and Gaps values.
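For readers who want to post-process the text form of this list, the following Python sketch parses result lines and sorts them. It assumes each hit occupies one line with the field layout used in the example of data output on this page; the regular expression, the function names and the sort directions (highest Zscore first, lowest RMSD first) are assumptions, not part of 3D-MatchDB.

```python
import re

# Field layout assumed from the example of data output on this page.
LINE_RE = re.compile(
    r"^(?P<pdb_id>\w{4}):(?P<chain>\S)\s+"
    r"ZScore=\s*(?P<zscore>[\d.]+)\s+"
    r"RMSD=\s*(?P<rmsd>[\d.]+)\s+"
    r"Aligned=\s*(?P<aligned>\d+)\s+"
    r"Size=\s*(?P<size>\d+)\s+"
    r"Gaps=\s*(?P<gaps>\d+)\s+"
    r"Name=(?P<name>.*)$"
)


def parse_hit(line):
    """Turn one result line into a dict with typed fields."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognized result line: %r" % line)
    d = m.groupdict()
    return {
        "pdb_id": d["pdb_id"],
        "chain": d["chain"],
        "zscore": float(d["zscore"]),
        "rmsd": float(d["rmsd"]),
        "aligned": int(d["aligned"]),
        "size": int(d["size"]),
        "gaps": int(d["gaps"]),
        "name": d["name"],
    }


def sort_hits(hits, by="zscore"):
    """Sort by Zscore (highest first) or by RMSD (lowest first)."""
    if by == "zscore":
        return sorted(hits, key=lambda h: h["zscore"], reverse=True)
    if by == "rmsd":
        return sorted(hits, key=lambda h: h["rmsd"])
    raise ValueError("by must be 'zscore' or 'rmsd'")


lines = [
    "1BAN:A ZScore= 6.6 RMSD= 0.31 Aligned=108 Size=108 Gaps=0 Name=BARNASE (G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT WITH SER 91 REPLACED BY ALA (S91A)",
    "1A2P:A ZScore= 6.6 RMSD= 0.00 Aligned=108 Size=108 Gaps=0 Name=MOL_ID: 1; MOLECULE: BARNASE; CHAIN: A, B, C; EC: 3.1.27.-; ENGINEERED: YES",
    "1BSB:A ZScore= 6.6 RMSD= 0.17 Aligned=108 Size=108 Gaps=0 Name=BARNASE (G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT WITH ILE 76 REPLACED BY VAL (I76V)",
]
hits = [parse_hit(line) for line in lines]
print([h["pdb_id"] for h in sort_hits(hits, by="rmsd")])  # ['1A2P', '1BSB', '1BAN']
```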
To obtain a protein structure alignment, select the corresponding row in the list (table) of structural homologs and then choose "Get structure alignment as text" or "View structure alignment using 3D-Explorer". The structure alignment of the query protein with the protein selected in the table is created by the 3DMatch software.
The structural alignment is output in PDB format, with the two aligned structures given different chain IDs. The REMARK field contains the RMSD and Zscore values and the structure-based sequence alignment.
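The numeric values in the REMARK field can be read back with a few lines of Python. The sketch below is a hedged illustration: it assumes the REMARK lines have the "REMARK 1 label : value" form shown in the example of data output on this page, and the function name `alignment_summary` is hypothetical.

```python
import re

# "REMARK 1 <label> : <number>"; lines without a colon are ignored.
REMARK_RE = re.compile(r"REMARK\s+1\s+(.+?)\s*:\s*(-?\d+(?:\.\d+)?)")


def alignment_summary(pdb_text):
    """Collect the labelled numeric values from the REMARK 1 lines."""
    summary = {}
    for line in pdb_text.splitlines():
        m = REMARK_RE.match(line)
        if m:
            summary[m.group(1)] = float(m.group(2))
    return summary


text = """\
HEADER PROTEIN STRUCTURE ALIGNMENT
REMARK 1
REMARK 1 RMSD on Ca-atoms : 0.313 angstrom
REMARK 1 Zscore : 6.580
REMARK 1 Aligned positions: 108
REMARK 1 Gap positions : 0
REMARK 1 Sequence identity: 99.1 (%)
REMARK 1 Structure based sequence alignment
"""
summary = alignment_summary(text)
print(summary["RMSD on Ca-atoms"], summary["Zscore"])  # 0.313 6.58
```

The sequence alignment rows themselves contain no colon, so this pattern skips them; extracting them would need a separate rule.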
The rapid comparison of 3D structures is based on a slightly modified version of the algorithm for alignment by secondary structure elements (SSE) implemented in 3DMatch. A detailed description of this algorithm is given in the 3DMatch description. The modifications consist of checking the alignment quality at every principal step of the algorithm; the algorithm proceeds to the next step only if the quality satisfies the defined requirements.
The first check occurs after the alignment core is built. If the core has an RMSD above the user-defined threshold or contains fewer SSEs than the user-defined threshold, the structure is skipped.
The second check occurs after the alignment by SSE has been transformed into an alignment by Ca atoms.
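As a reminder of what an RMSD on Ca atoms measures, here is a minimal Python sketch of the standard definition: the square root of the mean squared distance between paired atoms. It assumes the two coordinate lists are already paired position by position and already superposed; how 3DMatch finds the superposition is not described here, and the function name `ca_rmsd` is hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) Ca coordinates.

    Assumes the structures are already superposed and that position i
    in coords_a is aligned with position i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Two invented atom pairs, each displaced by exactly 1 angstrom along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(ca_rmsd(a, b))  # 1.0
```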
Note that a lack of structural homology between a pair of proteins is almost completely characterized by the alignment core: the core will either have too large an RMSD or be too short. Thus most PDB structures that have no structural homology with the query structure are eliminated from further calculation immediately after the alignment core is built. The algorithm for building the alignment core is very fast, which makes it possible to scan the PDB in real time.
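The two-stage screening described above can be sketched in Python as a loop with an early exit. This is a schematic under stated assumptions, not the 3D-MatchDB implementation: `build_core` and `refine` are hypothetical callables standing in for the core-building and Ca-refinement steps, the core thresholds in the demo are invented, and the second check is assumed to use the RMSD and Zscore cutoffs that define the final list.

```python
def scan_database(entries, build_core, refine, max_core_rmsd, min_core_sse,
                  max_rmsd=5.0, min_zscore=3.2):
    """Screen database entries in two stages, skipping the costly step early.

    build_core(entry) -> (core_rmsd, core_sse_count)   # cheap, hypothetical
    refine(entry)     -> (rmsd, zscore)                # costly, hypothetical
    """
    hits = []
    for entry in entries:
        core_rmsd, core_sse = build_core(entry)
        # First check: skip the structure if the core is too loose or too short.
        if core_rmsd > max_core_rmsd or core_sse < min_core_sse:
            continue
        rmsd, zscore = refine(entry)
        # Second check (assumed criteria): the cutoffs used for the final list.
        if rmsd < max_rmsd and zscore > min_zscore:
            hits.append((entry, rmsd, zscore))
    return hits


# Invented lookup tables standing in for real alignment results.
cores = {"P1": (1.0, 5), "P2": (6.0, 5), "P3": (1.5, 1), "P4": (2.0, 4)}
refined = {"P1": (0.8, 6.0), "P4": (5.5, 2.0)}
refine_calls = []


def refine(entry):
    refine_calls.append(entry)   # record which entries reach the costly step
    return refined[entry]


hits = scan_database(list(cores), cores.get, refine,
                     max_core_rmsd=3.0, min_core_sse=3)
print(hits)          # [('P1', 0.8, 6.0)]
print(refine_calls)  # ['P1', 'P4']
```

In the demo, P2 and P3 are rejected on the core alone, so the costly refinement runs for only two of the four entries.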
Example of data output.

STRUCTURE DATABASE SEARCHING.

1BAN:A ZScore= 6.6 RMSD= 0.31 Aligned=108 Size=108 Gaps=0 Name=BARNASE (G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT WITH SER 91 REPLACED BY ALA (S91A)
2RBI:A ZScore= 6.6 RMSD= 0.37 Aligned=108 Size=108 Gaps=0 Name=MOL_ID: 1; MOLECULE: RIBONUCLEASE; CHAIN: A, B; SYNONYM: BINASE, EXTRACELLULAR RIBONUCLEASE FROM BACILLUS INTERMEDIUS; EC: 3.1.27.-; ENGINEERED: YES; MUTATION: H101N
1A2P:A ZScore= 6.6 RMSD= 0.00 Aligned=108 Size=108 Gaps=0 Name=MOL_ID: 1; MOLECULE: BARNASE; CHAIN: A, B, C; EC: 3.1.27.-; ENGINEERED: YES
1BSB:A ZScore= 6.6 RMSD= 0.17 Aligned=108 Size=108 Gaps=0 Name=BARNASE (G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT WITH ILE 76 REPLACED BY VAL (I76V)
1BNS:A ZScore= 6.6 RMSD= 0.27 Aligned=108 Size=108 Gaps=0 Name=BARNASE (G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT WITH THR 26 REPLACED BY ALA (T26A)
1BNG:A ZScore= 6.6 RMSD= 0.22 Aligned=108 Size=108 Gaps=0 Name=BARNASE (E.C.3.1.27.-) DISULFIDE MUTANT WITH SER 85 REPLACED BY CYS AND HIS 102 REPLACED BY CYS (S85C,H102C)
1BAO:A ZScore= 6.6 RMSD= 0.20 Aligned=108 Size=108 Gaps=0 Name=BARNASE (G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT WITH TYR 78 REPLACED BY PHE (Y78F)
1BRI:A ZScore= 6.6 RMSD= 0.23 Aligned=107 Size=107 Gaps=1 Name=BARNASE (E.C.3.1.27.-) MUTANT WITH ILE 76 REPLACED BY ALA (I76A)
1BRG:A ZScore= 6.6 RMSD= 0.26 Aligned=108 Size=108 Gaps=0 Name=BARNASE (G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT WITH PHE 7 REPLACED BY LEU (F7L)
1B20:A ZScore= 6.6 RMSD= 0.30 Aligned=108 Size=109 Gaps=1 Name=MOL_ID: 1; MOLECULE: BARNASE; CHAIN: A, B, C; EC: 3.1.27.3; ENGINEERED: YES; MUTATION: YES
1BRK:A ZScore= 6.6 RMSD= 0.29 Aligned=108 Size=108 Gaps=0 Name=BARNASE (E.C.3.1.27.-) MUTANT WITH ILE 96 REPLACED BY ALA (I96A)
1BSC:A ZScore= 6.6 RMSD= 0.18 Aligned=108 Size=108 Gaps=0 Name=BARNASE (G SPECIFIC ENDONUCLEASE) (E.C.3.1.27.-) MUTANT WITH ILE 88 REPLACED BY VAL (I88V)
1BNE:A ZScore= 6.6 RMSD= 0.32 Aligned=107 Size=107 Gaps=1 Name=BARNASE (E.C.3.1.27.-) DISULFIDE MUTANT WITH ALA 43 REPLACED BY CYS AND SER 80 REPLACED BY CYS (A43C,S80C)

PROTEIN STRUCTURE ALIGNMENT.

HEADER PROTEIN STRUCTURE ALIGNMENT
COMPND (A) 1A2P chain A (B) 1BAN chain A
REMARK 1
REMARK 1 RMSD on Ca-atoms : 0.313 angstrom
REMARK 1 Zscore : 6.580
REMARK 1 Aligned positions: 108
REMARK 1 Gap positions : 0
REMARK 1 Sequence identity: 99.1 (%)
REMARK 1
REMARK 1 Structure based sequence alignment
REMARK 1
REMARK 1 3 VINTFDGVADYLQTYHKLPDNYITKSEAQALGWVASKGNLADVAPGKSIGGDIFSNREGK
REMARK 1 3 VINTFDGVADYLQTYHKLPDNYITKSEAQALGWVASKGNLADVAPGKSIGGDIFSNREGK
REMARK 1
REMARK 1 63 LPGKSGRTWREADINYTSGFRNSDRILYSSDWLIYKTTDHYQTFTKIR
REMARK 1 63 LPGKSGRTWREADINYTSGFRNSDRILYASDWLIYKTTDHYQTFTKIR
REMARK 1
ATOM      1  N   VAL A   3     -12.310  -8.243   5.307  1.00 47.79           N
ATOM      2  CA  VAL A   3     -11.179  -7.573   4.634  1.00 41.49           C
ATOM      3  C   VAL A   3     -11.019  -6.157   5.156  1.00 34.47           C
ATOM      4  O   VAL A   3     -11.979  -5.382   5.128  1.00 34.84           O
ATOM      5  CB  VAL A   3     -11.383  -7.546   3.117  1.00 42.12           C
ATOM      6  CG1 VAL A   3     -10.536  -6.536   2.420  1.00 38.29           C
ATOM      7  CG2 VAL A   3     -11.154  -8.948   2.527  1.00 45.14           C
ATOM      8  N   ILE A   4      -9.810  -5.789   5.545  1.00 27.18           N
ATOM      9  CA  ILE A   4      -9.587  -4.366   5.973  1.00 24.08           C
ATOM     10  C   ILE A   4      -8.788  -3.683   4.864  1.00 21.31           C
ATOM     11  O   ILE A   4      -7.656  -4.064   4.576  1.00 21.63           O
ATOM     12  CB  ILE A   4      -8.731  -4.385   7.264  1.00 24.83           C
ATOM     13  CG1 ILE A   4      -9.399  -5.210   8.386  1.00 27.01           C
ATOM     14  CG2 ILE A   4      -8.372  -2.999   7.701  1.00 24.93           C
ATOM     15  CD1 ILE A   4      -8.582  -5.279   9.651  1.00 33.25           C
ATOM     16  N   ASN A   5      -9.456  -2.797   4.122  1.00 20.12           N
ATOM     17  CA  ASN A   5      -8.814  -2.164   2.982  1.00 19.67           C
ATOM     18  C   ASN A   5      -9.183  -0.706   2.810  1.00 17.24           C
ATOM     19  O   ASN A   5      -8.956  -0.171   1.716  1.00 17.10           O
ATOM     20  CB  ASN A   5      -9.048  -2.927   1.678  1.00 20.04           C
ATOM     21  CG  ASN A   5     -10.495  -2.771   1.189  1.00 20.89           C
ATOM     22  OD1 ASN A   5     -11.360  -2.364   1.950  1.00 21.76           O
ATOM     23  ND2 ASN A   5     -10.710  -3.053  -0.084  1.00 22.93           N
ATOM     24  N   THR A   6      -9.605  -0.043   3.868  1.00 15.82           N
ATOM     25  CA  THR A   6      -9.917   1.401   3.801  1.00 16.81           C
ATOM     26  C   THR A   6      -8.791   2.237   4.362  1.00 14.04           C
ATOM     27  O   THR A   6      -7.944   1.762   5.098  1.00 14.38           O
ATOM     28  CB  THR A   6     -11.207   1.679   4.628  1.00 17.16           C
ATOM     29  OG1 THR A   6     -11.008   1.226   5.948  1.00 23.19           O
ATOM     30  CG2 THR A   6     -12.404   0.966   4.043  1.00 22.55           C
ATOM     31  N   PHE A   7      -8.801   3.561   4.057  1.00 14.44           N
ATOM     32  CA  PHE A   7      -7.792   4.422   4.634  1.00 14.94           C