PDB Ligands
Summary
The purpose of the "PDB ligands" link is to show, at a glance, whether compounds in the selected SOM cluster have any known protein targets and whether any homologous human proteins exist. The PDB ligands link on each cluster page connects to structural data deposited in the Protein Data Bank (PDB). Using a simple similarity measure, the link provides the set of PDB ligand molecules that have some structural similarity to the NSC compounds of that cluster. For each NSC compound, if any, that has structural analogs among the PDB ligands, a link is provided that displays the underlying NSC compound and a graphical view of the PDB ligands. That page links in turn to the PDBsum web page, which summarizes information about the protein to which the ligand is bound, and to homologous protein sequences, if any, found in Homo sapiens. If the "PDB ligands" link is grayed out, there are no analogous PDB ligands in that cluster.
Details
The compounds tested in the NCI screen are characterized by their cytotoxicity, as measured by their GI50 values, in a number of cancer cell lines comprising panels of leukemia, lung, colon, central nervous system, melanoma, ovarian, renal, prostate and breast cancer. Clustered responses in our anti-cancer map therefore indicate similar tumor cell responses. This response similarity does not necessarily translate into chemical similarity, although it often does; rather, it is the total molecular pharmacophore that dictates the behavior of the molecule in the cell. The observation that similar cellular response patterns often correspond to chemically and structurally similar compounds supports the similarity principle, i.e. similar molecules elicit similar cellular responses. A more accurate statement would be that the same three-dimensional pharmacophore elicits the same cellular response.
However, for the bulk of compounds tested, only putative assignments
of targets are possible. In an effort to increase the confidence of
target assignment, we have linked a structural component to our
SOM-based analysis of cellular responses for screened compounds. We
use the set of structurally characterized small-molecule ligands that
have been co-deposited as complexes in the Protein Data
Bank to provide a direct structural link between
ligand and target. PDB ligands that are similar to screened compounds
are collected. The screened compounds can then be assigned a structural
target based on their similarity to the ligand in the original
ligand/enzyme complex in the PDB.
As a first step in our analysis, the PDB was scanned for HETATM
records, and any fully or partially present ligands were collated as
PDB ligands. This search includes ligands that are associated
only with nucleic acid targets as well as modified protein residues that
are covalently attached. The purpose of this collation is to generate
the set of structurally available ligand/protein and ligand/DNA
interactions. Small ions and unsuitable metal ligands were de-selected
from this list, leaving a total of 1919 small molecule ligands for analysis. The
ligand coordinates were then extracted and converted to other file
formats using BABEL for further processing.
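The scan for HETATM records described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used: the function name `collect_het_groups` and the exclusion list `EXCLUDED` are assumptions made for the example (the text does not say which ions and metal ligands were de-selected), and the code relies only on the fixed-column layout of PDB coordinate records.

```python
from collections import Counter

# Hypothetical exclusion list: water plus a few common ions. The text
# does not state which residue names were actually de-selected.
EXCLUDED = {"HOH", "NA", "CL", "K", "MG", "CA", "ZN"}


def collect_het_groups(pdb_text, excluded=EXCLUDED):
    """Count HETATM atoms per (residue name, chain ID, residue number)."""
    groups = Counter()
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()  # PDB columns 18-20
        chain_id = line[21:22].strip()  # PDB column 22
        res_seq = line[22:26].strip()   # PDB columns 23-26
        if res_name in excluded:
            continue
        groups[(res_name, chain_id, res_seq)] += 1
    return dict(groups)
```

Each key of the returned dictionary identifies one candidate ligand instance, and the atom count gives a crude indication of whether the ligand is fully or only partially present in the entry.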
To describe the chemical similarity of each PDB ligand to compounds in
the NCI database we use a bit-vector method. This is a computationally
convenient way of describing a compound according to its structural
features, and it also provides a means of comparison with
other compounds. In this bit-vector description each compound is
examined for properties that are each coded as an on/off bit,
e.g. the presence or absence of aromatic fragments, carboxyl groups,
etc. We have used the properties defined by the regular E-screen
bit-vectors, which encode 431 bits. The extended
version of 631 bits was also tested but was not found to yield more
information in the applications performed here. A detailed analysis by
Voigt et al.
finds that the E-screen bit-vectors rank
among the better classes of descriptors for relating chemical
structures.
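The on/off encoding described above can be illustrated with a toy example. The feature names in `FEATURE_KEY` and the function `encode` are hypothetical and chosen only for illustration; they are not the E-screen definitions, which cover 431 properties.

```python
# Hypothetical feature key: illustrative names only, NOT the E-screen set.
FEATURE_KEY = ["aromatic_ring", "carboxyl", "hydroxyl", "amine"]


def encode(features, key=FEATURE_KEY):
    """Pack the presence/absence of each keyed feature into an integer.

    Bit i of the result is 1 when key[i] is present in `features`.
    Features that are not in the key are ignored.
    """
    bits = 0
    for position, name in enumerate(key):
        if name in features:
            bits |= 1 << position
    return bits
```

Storing the vector as a plain Python integer keeps the sketch short and makes bitwise comparison between two compounds straightforward.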
Based on their bit-vector assignments, structurally similar compounds
can be grouped according to different measures. The
Tanimoto coefficient identifies compounds containing similar chemical
elements or fragments using any class of bit-vector assignments, based
on the number of bits in common divided by the total number of
assigned bits.
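As a minimal sketch, the Tanimoto coefficient of two bit-vectors can be computed as shown here. The sketch assumes the usual definition, in which "the total number of assigned bits" is read as the number of bits set in at least one of the two vectors; the function name `tanimoto` and the use of Python integers as bit-vectors are choices made for the example.

```python
def tanimoto(bits_a: int, bits_b: int) -> float:
    """Tanimoto coefficient of two bit-vectors stored as integers.

    common = number of bits set in both vectors
    total  = number of bits set in at least one vector
    """
    common = bin(bits_a & bits_b).count("1")
    total = bin(bits_a | bits_b).count("1")
    if total == 0:
        # Neither vector has any bit set; return 0.0 by convention here.
        return 0.0
    return common / total
```

For example, `tanimoto(0b1101, 0b1011)` has two bits in common out of four set in either vector, giving 0.5, while identical non-empty vectors give 1.0.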
The Tanimoto coefficient measures the number of common substructures
shared by two molecules as described by their bit-vector mask. As
noted above, the E-screen bit-vector mask was generated for each
compound and used to compute the similarity between NSC compounds and
PDB ligands. Although a
high similarity does not guarantee that two compounds will behave the
same in a biological screen, high structural similarity can be used to
identify structural binding motifs of similar compounds bound to a protein
target.
Using a Tanimoto coefficient cutoff of 0.85,
it is possible to assign relationships between 1161 NSC compounds and
their corresponding analogs in the PDB, with an average of 2.5
structures per NSC compound. While the number of unique associations
is small compared to the total number of tested NSCs, each association
can potentially lead to a wealth of structural and genomic information
for the analysis of alternative classes of compounds and their putative
targets.
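The assignment step can be sketched as a simple threshold filter over all NSC/PDB ligand pairs. This is an illustration under stated assumptions: the names `assign_analogs` and `mean_analogs`, the identifiers used for compounds and ligands, and the choice to treat the cutoff as inclusive (similarity >= 0.85) are not taken from the text.

```python
def tanimoto(bits_a, bits_b):
    """Bits set in both vectors divided by bits set in either vector."""
    total = bin(bits_a | bits_b).count("1")
    return bin(bits_a & bits_b).count("1") / total if total else 0.0


def assign_analogs(nsc_bits, pdb_bits, cutoff=0.85):
    """Map each NSC id to the sorted PDB ligand ids at or above the cutoff.

    NSC compounds without any analog are left out of the result.
    Treating the cutoff as inclusive is an assumption of this sketch.
    """
    analogs = {}
    for nsc_id, nsc_vec in nsc_bits.items():
        hits = sorted(
            lig_id
            for lig_id, lig_vec in pdb_bits.items()
            if tanimoto(nsc_vec, lig_vec) >= cutoff
        )
        if hits:
            analogs[nsc_id] = hits
    return analogs


def mean_analogs(analogs):
    """Average number of PDB analogs per NSC compound that has any."""
    if not analogs:
        return 0.0
    return sum(len(hits) for hits in analogs.values()) / len(analogs)
```

Applied to the full data set, `len(analogs)` and `mean_analogs(analogs)` would correspond to the kind of summary figures quoted above (the number of NSC compounds with analogs and the average number of structures per compound).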
Protein sequence alignments were performed using FASTA, version 3.3,
with standard gap parameters and the BLOSUM50 similarity
matrix. We aligned the protein sequences extracted from
the PDB with the assembled Homo sapiens sequences, cataloged by
their UniGene annotation. This procedure provides a
quick search of each target protein for sequence homologs and
a facile means to connect target proteins to human genes. This
connection to the human genome facilitates the
discovery of possible human targets for candidate ligands and may
provide clues about the function of these molecular targets.