STING_DB Quality Assessment
STING_DB is composed of structural, sequence, function and stability parameters/descriptors for protein analysis. The database combines publicly available data (PDB, HSSP, Prosite) with its own data (contacts, interface contacts, surface accessibility). STING_DB is one of the best-known databases of structural parameters reported on a per-residue basis, with over 300 of them compiled at a single site.
Considering its relevance for researchers interested in protein analysis, the module Sting_DB Quality Assessment (QA) was designed to measure the quality of the data deposited in the Sting_DB. On a weekly basis, a checklist procedure is performed to identify the parameters/files that are missing or empty for the new PDB files added to the database. The main goal of this procedure is to guarantee that the quality of the data does not degrade as updates take place. When the checklist procedure identifies a group of parameters that are missing or empty, a report is automatically sent to the Sting_DB administrator, who runs a set of scripts to update the parameters for the new PDB files and then repeats the checklist procedure to evaluate the quality of the updated data.
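The checklist step described above can be sketched as follows. This is only a minimal illustration, not the actual Sting_DB QA implementation: the directory layout (`<root>/<parameter>/<pdb_id>.dat`), the `.dat` extension, and the function names are all assumptions made for the example.

```python
from pathlib import Path


def classify_parameter_file(path: Path) -> str:
    """Return 'missing', 'empty', or 'ok' for a single parameter file."""
    if not path.is_file():
        return "missing"
    if path.stat().st_size == 0:
        return "empty"
    return "ok"


def run_checklist(root: Path, pdb_ids, parameters):
    """Collect (pdb_id, parameter, status) for every file that is not 'ok'.

    Assumed (hypothetical) layout: <root>/<parameter>/<pdb_id>.dat
    """
    problems = []
    for pdb_id in pdb_ids:
        for parameter in parameters:
            status = classify_parameter_file(root / parameter / f"{pdb_id}.dat")
            if status != "ok":
                problems.append((pdb_id, parameter, status))
    return problems
```

The returned list could serve as the body of the report sent to the administrator. Running the same function again after the update scripts corresponds to the second checklist pass.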
Table 1 shows the parameters analyzed by the module Sting_DB QA and their corresponding eligible PDB files.
Parameter Name | Eligible PDB Files |
Accessibility_and_Interface_Residue | All PDB files with at least one proteic chain |
Cavity_Complex | All PDB files |
Cavity_Isolation | All proteic chains in PDB files |
Contact_Energy_Density_Intrachain | All proteins in PDB files |
Contact_Energy_Density_Interface | All PDB files with at least 2 proteic chains |
Contacts | All proteins in PDB files |
Cross_Link | All proteins in PDB files |
Cross_Presence | All proteins in PDB files |
Curvature_Complex | All PDB files |
Curvature_Isolation | All PDB files |
Density_Sponge | All proteins in PDB files |
Distances | All proteins in PDB files |
Eletrostatic_Potential | All PDB files |
Entropy_Density_Interface | All proteins in PDB files with at least 2 chains, one being a protein with HSSP |
Entropy_Density_Intrachain | All proteins in PDB files with HSSP |
Evolutionary_Pressure | All proteins in PDB files with HSSP-MSA |
HSSP | All proteins in PDB files with HSSP |
HSSP_MSA | All proteins in PDB files with HSSP |
HSSP_MSA_Full | All proteins in PDB files with HSSP |
Hydro_Patches | All proteins in PDB files |
Ligand_Pocket_Residue | All proteins in PDB files with ligands |
My_Evolutionary_Pressure | All proteins in PDB files |
My_HSSP | All proteins in PDB files |
My_HSSP_100 | All proteins in PDB files |
My_Phylogenetic_Tree | All proteins in PDB files |
Phylogenetic_Tree | All proteins in PDB files |
Protein_Ligand_Contacts | All proteins in PDB files with ligands |
Ramachandran | All proteins in PDB files |
Rotamer | All proteins in PDB files |
Space_Clash | All proteins in PDB files |
Stride | All proteins in PDB files |
Unused_Contacts | All proteins in PDB files |
Water_Contact_Residue | All proteins in PDB files with HOH |
Table 1. Parameters analyzed by Sting_DB QA and their corresponding eligible PDB files.
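The eligibility rules in Table 1 can be read as simple predicates over a summary of each PDB entry. The sketch below encodes a few rows in that way. The `EntrySummary` fields and the reading of "All proteins in PDB files" as "at least one protein chain" are assumptions for illustration only; the names are hypothetical and not part of Sting_DB.

```python
from dataclasses import dataclass


@dataclass
class EntrySummary:
    """Hypothetical per-entry summary used to decide eligibility."""
    protein_chains: int
    has_hssp: bool
    has_ligand: bool
    has_water: bool


# A subset of Table 1, expressed as predicates (assumed interpretation).
ELIGIBILITY = {
    "Cavity_Complex": lambda e: True,  # all PDB files
    "Contacts": lambda e: e.protein_chains >= 1,
    "Contact_Energy_Density_Interface": lambda e: e.protein_chains >= 2,
    "HSSP": lambda e: e.protein_chains >= 1 and e.has_hssp,
    "Ligand_Pocket_Residue": lambda e: e.protein_chains >= 1 and e.has_ligand,
    "Water_Contact_Residue": lambda e: e.protein_chains >= 1 and e.has_water,
}


def expected_parameters(entry: EntrySummary):
    """Return the sorted names of the parameters this entry should have."""
    return sorted(name for name, rule in ELIGIBILITY.items() if rule(entry))
```

Restricting the check to eligible entries matters because a parameter that does not apply to an entry (for example, Water_Contact_Residue for a file without HOH) should not be counted as missing.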
When a Sting user selects a PDB file for analysis and one or more parameters of that PDB file are not available in STING_DB, the user can search for the PDB name in Sting_DB QA to check whether those parameters exist. For each parameter, there is a list of the PDB files for which that parameter is missing or empty. However, this situation is unusual, since we keep the percentage of missing and empty parameters below 3% in almost all PDB files. In addition, we are working diligently to reduce that percentage to 1% or less.
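The quality indicator mentioned above is a percentage of missing and empty parameters. A minimal sketch of that calculation follows. It assumes the percentage is taken over the number of eligible files for a parameter; the function names and the strict "below threshold" comparison are illustrative choices, not the module's actual code.

```python
def missing_or_empty_percentage(problem_count: int, eligible_count: int) -> float:
    """Percentage of eligible files that are missing or empty."""
    if eligible_count == 0:
        return 0.0  # nothing is eligible, so nothing can be missing
    return 100.0 * problem_count / eligible_count


def within_target(problem_count: int, eligible_count: int,
                  threshold_percent: float = 3.0) -> bool:
    """True when the percentage is strictly below the threshold."""
    return missing_or_empty_percentage(problem_count, eligible_count) < threshold_percent
```

For example, 25 missing or empty files out of 1000 eligible ones gives 2.5%, which is within the 3% target but not yet within the 1% goal.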
This effort makes Sting_DB unique in terms of quality assessment when compared with its counterparts in the bioinformatics domain. To the best of our knowledge, Sting is the only software for protein analysis that provides its users with a data quality indicator.
Parameters not included in Sting_DB QA: Prosite and Protherm
We did not include the parameters Prosite and Protherm in Sting_DB QA because it is difficult to establish a correlation between the number of PDB files that should have a Prosite motif and the number of PDB files that we actually identify as having one.