3dMind Home
Man Tree-Top

PDB Ligands Summary

The "PDB ligands" link gives, at a glance, a quick way to determine whether compounds in the selected SOM cluster have any "known" protein targets and whether any homologous human proteins exist. The PDB ligands link on each cluster page connects to structural data deposited in the Protein Data Bank (PDB). Using a simple similarity measure, the link provides the set of PDB ligand molecules that have some structural similarity to the NSC compounds of that particular cluster. For each NSC compound, if any, that has structural analogs among the PDB ligands, a link is provided that displays the underlying NSC compound and a graphical view of the matching PDB ligands. That page links further to the PDBsum web page, which summarizes information about the protein to which the ligand is bound, and to homologous protein sequences, if any, found in Homo sapiens. If the "PDB ligands" link is grayed out, there are no analogous PDB ligands in that cluster.

Details

The compounds tested in the NCI screen are characterized by their cytotoxicity, measured as GI50 values, across a number of cancer cell lines comprising panels of leukemia, lung, colon, central nervous system, melanoma, ovarian, renal, prostate and breast cancer. Compounds that cluster together in our anti-cancer map therefore share similar tumor cell response patterns. Similarity in response does not necessarily imply chemical similarity, although this is often the case; rather, it is the total molecular pharmacophore that dictates the behavior of the molecule in the cell. The observation that similar cellular response patterns often correspond to chemically and structurally similar compounds supports the similarity principle, i.e., similar molecules elicit similar cellular responses. A more accurate statement would be that the same three-dimensional pharmacophore elicits the same cellular response.

However, for the bulk of compounds tested, only putative target assignments are possible. To increase confidence in target assignment, we have linked a structural component to our SOM-based analysis of cellular responses for screened compounds. We use the set of structurally characterized small-molecule ligands that have been co-deposited as complexes in the Protein Data Bank to provide a direct structural link between ligand and target. PDB ligands that are similar to screened compounds are collected. Each screened compound can then be assigned a structural target based on its similarity to the ligand in the original ligand/enzyme complex in the PDB.

As a first step in our analysis, the PDB was scanned for HETATM records, and any fully or partially present ligands were collated as PDB ligands. This search includes ligands associated only with nucleic acid targets as well as covalently attached modified protein residues. The purpose of this collation is to generate the set of structurally available ligand/protein and ligand/DNA interactions. Small ions and unsuitable metal ligands were removed from this list, leaving a total of 1919 small-molecule ligands for analysis. The ligand coordinates were then extracted and converted to other file formats using BABEL for further processing.
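
The scanning step can be illustrated with a minimal Python sketch. It assumes the fixed-column PDB text format (record name in columns 1-6, residue name in columns 18-20, chain identifier in column 22, residue number in columns 23-26). The function name and the exclusion list are hypothetical; the actual de-selection criteria used for this page are not given here.

```python
# Hypothetical exclusion list: water plus a few common ions. The criteria
# actually used to de-select ions and metals are not stated in the text.
EXCLUDED = {"HOH", "NA", "CL", "K", "MG", "CA", "ZN"}


def collect_het_groups(pdb_text):
    """Return the set of (residue name, chain, residue number) HET groups."""
    groups = set()
    for line in pdb_text.splitlines():
        # Only HETATM records describe non-standard residues and ligands.
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()   # columns 18-20
        chain_id = line[21:22].strip()   # column 22
        res_seq = line[22:26].strip()    # columns 23-26
        if res_name in EXCLUDED:
            continue
        groups.add((res_name, chain_id, res_seq))
    return groups
```

Each tuple returned identifies one candidate ligand instance; extracting its coordinates and converting file formats would be separate steps.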

To describe the chemical similarity of each PDB ligand to compounds in the NCI database, we use a bit-vector method. This is a computationally convenient way of describing a compound according to its structural features, and it also provides a means of comparison with other compounds. In this bit-vector description the compound is examined for properties that are each coded as an on/off bit, e.g., the presence or absence of aromatic fragments, carboxyl groups, etc. We have used the properties defined by the regular E-screen bit-vectors, which encode 431 bits. The extended version of 631 bits was also tested but was not found to yield more information in the applications performed here. A detailed analysis by Voigt et al. finds that the E-screen bit-vectors rank among the better classes of descriptors for relating chemical structures.
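
As a toy illustration of the on/off encoding, the following Python sketch packs a handful of structural features into an integer bit-vector. The five feature names are hypothetical placeholders chosen for illustration; they are not the actual E-screen definitions, which cover 431 bits.

```python
# Hypothetical feature list for illustration only; the real E-screen
# property definitions are not reproduced here.
FEATURES = ["aromatic_ring", "carboxyl", "hydroxyl", "amine", "halogen"]


def encode(present_features):
    """Pack a collection of feature names into an integer bit-vector."""
    bits = 0
    for index, name in enumerate(FEATURES):
        if name in present_features:
            bits |= 1 << index  # switch on the bit assigned to this feature
    return bits
```

For example, a compound with an aromatic ring and a carboxyl group sets the two lowest bits, so `encode({"aromatic_ring", "carboxyl"})` returns 3 (binary 00011).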

Based on their bit-vector assignments, structurally similar compounds can be grouped according to different measures. The Tanimoto coefficient identifies compounds containing similar chemical elements or fragments using any class of bit-vector assignments: it is the number of bits the two compounds have in common divided by the total number of assigned bits. It thus measures the fraction of substructures shared by two molecules, as described by their bit-vector masks. As noted above, the E-screen bit-vector mask was generated for each compound and used to measure similarity between NSC compounds and PDB ligands. Although high similarity does not guarantee that two compounds will behave the same in a biological screen, high structural similarity can be used to identify structural binding motifs of similar compounds bound to a protein target.
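
A minimal Python sketch of the Tanimoto coefficient follows. It assumes bit-vectors are stored as Python integers and reads "total number of assigned bits" in the usual way, as the number of bits set in either compound. Returning 0.0 for two empty vectors is a convention chosen here, not something specified by the text.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two integer bit-vectors."""
    common = bin(a & b).count("1")  # bits set in both compounds
    total = bin(a | b).count("1")   # bits set in at least one compound
    # Assumption: two empty vectors are treated as having no similarity.
    return common / total if total else 0.0
```

With `a = 0b1011` and `b = 0b0011`, two bits are shared out of three assigned in total, so the coefficient is 2/3; identical non-empty vectors give 1.0.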

Using a cutoff of 0.85 on the Tanimoto coefficient, it is possible to assign relationships between 1161 NSC compounds and their corresponding analogs in the PDB, with an average of 2.5 PDB structures per NSC compound. While the number of unique associations is small compared to the total number of tested NSC compounds, each association can potentially lead to a wealth of structural and genomic information for the analysis of alternative classes of compounds and their putative targets.
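
The assignment step can be sketched in Python as a simple all-against-all comparison. The function names and identifiers below are hypothetical, and treating the cutoff as inclusive (greater than or equal to 0.85) is an assumption; the text does not say how ties at the cutoff were handled.

```python
def find_analogs(nsc_vectors, pdb_vectors, cutoff=0.85):
    """Map each NSC id to the PDB ligand ids that meet the similarity cutoff."""

    def tanimoto(a, b):
        common = bin(a & b).count("1")
        total = bin(a | b).count("1")
        return common / total if total else 0.0

    analogs = {}
    for nsc_id, nsc_bits in nsc_vectors.items():
        hits = [
            pdb_id
            for pdb_id, pdb_bits in pdb_vectors.items()
            if tanimoto(nsc_bits, pdb_bits) >= cutoff  # inclusive: an assumption
        ]
        if hits:  # compounds without any analog are left out
            analogs[nsc_id] = hits
    return analogs


def mean_analogs(analogs):
    """Average number of PDB ligands per NSC compound that has any analog."""
    if not analogs:
        return 0.0
    return sum(len(hits) for hits in analogs.values()) / len(analogs)
```

Applied to the full data set, a mapping of this kind is what yields summary figures such as the number of NSC compounds with analogs and the average number of structures per compound.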

Protein sequence alignments were performed using FASTA, version 3.3, with standard gap parameters and the BLOSUM50 similarity matrix. We aligned the protein sequence extracted from each PDB entry with the assembled Homo sapiens sequences, cataloged according to their UniGene annotation. This procedure provides a quick search of each target protein for sequence homologs and a facile means of connecting target proteins to human genes. The connection to the human genome further facilitates the discovery of possible human targets for candidate ligands and may provide clues about the function of these molecular targets.
