The CC-chemokine receptor 5 (CCR5), a membrane protein belonging to the G-protein coupled receptor super-family, has been identified as an essential co-receptor for HIV entry into cells, and small molecules that inhibit HIV entry by targeting CCR5 have been in fast development as antiviral agents. This review focuses on computational studies of predicting the CCR5 structure and its interactions with known small molecule inhibitors and discusses how the recently solved GPCR structures would provide new insights into the modeling of CCR5-inhibitor binding. In addition, this review pays particular attention to the design of inhibitors that specifically interrupt the viral entry co-receptor activity of CCR5 while preserving its normal chemokine receptor function to minimize side effects and toxicity.
Link:
http://www.ncbi.nlm.nih.gov/pubmed/19519482
Computational molecular docking has achieved increasing success over the past two decades. However, molecular conformational flexibility remains a major challenge. This review outlines recent progress in flexible docking and focuses on molecular dynamics simulation techniques.
Link:
http://d.wanfangdata.com.cn/Periodical_jsjyyyhx200701018.aspx
The three-dimensional structures of proteins are being solved apace, yet this information is often underused in quantitative structure-activity relationship (QSAR) studies. Here, we describe and compare methods for exploiting protein structures to derive 3D-QSARs. These methods can facilitate molecular design and lead optimization and should increasingly become a standard component of the drug designer's repertoire.
Three-dimensional (3D) searching of molecular databases based on 3D characteristics has become a fast and effective approach to discovering lead compounds. This review describes its principles, development and application in drug design.
Antibody single-chain variable fragments (scFvs) offer particular advantages over the full size antibodies, including easy expression, efficient local concentration and fast body clearance. However, scFvs typically show low thermal stability that limits their biomedical and biotechnological applications. In this study, we examined the thermal stability of the human and murine vascular endothelial growth factor (VEGF) antibody scFv fragment by molecular dynamics simulations. A consistent observation was the dissociation of the light chain (VL) and heavy chain (VH) domains and loss of the native structures of both domains in the simulations at the elevated temperatures. The stability-limiting structural elements in the protein were revealed from the detailed analyses on the native contacts. We found that dissociation of the VL-VH domains was the first event leading to the unfolding of the native structure of the protein and the disruption of the VL-VH interface was largely due to the break of the interfacial hydrophobic and aromatic interactions while the hydrogen bonding interaction between Gln38 in VL and Gln39 in VH remained. Within the beta-barrel structure of the VL and VH domains, beta-strands b6, b2 and b11 appeared to be the least stable. In addition, we found that the VH domain was more thermally resistant than the VL domain. Based on these findings, we discussed potential strategies to improve the stability of this therapeutically important scFv fragment.
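The native-contact analysis mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis code: the one-point-per-residue representation, the 6.0 Å cutoff, the sequence-separation filter and the function names are all assumptions chosen for illustration.

```python
import math

def contact_pairs(coords, cutoff=6.0, min_seq_sep=3):
    """Return the set of residue index pairs (i, j) closer than `cutoff`.

    `coords` holds one (x, y, z) point per residue (for example C-alpha atoms).
    Pairs closer than `min_seq_sep` in sequence are skipped. Both defaults
    are illustrative assumptions, not values taken from the study.
    """
    pairs = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_seq_sep, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                pairs.add((i, j))
    return pairs

def fraction_native_contacts(native, frame, cutoff=6.0, min_seq_sep=3):
    """Fraction of the contacts in `native` that are still present in `frame`."""
    native_pairs = contact_pairs(native, cutoff, min_seq_sep)
    if not native_pairs:
        return 0.0
    frame_pairs = contact_pairs(frame, cutoff, min_seq_sep)
    return len(native_pairs & frame_pairs) / len(native_pairs)
```

Tracking this fraction along a trajectory, separately for contacts within each domain and across the VL-VH interface, is one simple way to see which structural elements are lost first.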
The crystal structures of opsin in the ligand-free and the G-protein-interacting states showed two inter-helical openings between transmembrane (TM) helices TM1 and TM7 and between TM5 and TM6 near the extracellular side that were thought to serve as the retinal uptake and release gates. However, it is unclear which opening is for 11-cis-retinal uptake or all-trans-retinal release although speculations have been proposed based on the structural features of opsin and retinal. In this work, we simulated the exit process of all-trans-retinal from the ligand-free opsin structure by the classical molecular dynamics (MD) and random acceleration molecular dynamics (RAMD). In the 64 ns classical MD simulation, retinal remained in the receptor but moved significantly toward the TM5-TM6 opening and almost inserted into the opening after 50 ns. Complete exit was observed in 114 out of 160 RAMD trajectories with the TM5-TM6 opening being the predominant egress gate while egress from the TM1-TM7 opening was observed in only a few trajectories when relatively large acceleration was applied and large structural alteration of the protein resulted. These results suggest that photolyzed all-trans-retinal is likely released through the TM5-TM6 opening. Based on the unidirectional mechanism of retinal exchange suggested by experiment, we speculate that the TM1-TM7 opening serves as the 11-cis-retinal uptake gate. The spatial occupancy maps of retinal computed from the 160 RAMD trajectories further indicated that retinal experienced significant interactions with the receptor during the exit process. The implications of these findings for disease mechanisms of rhodopsin mutants are discussed.
A major challenge in drug design is to obtain compounds that bind selectively to their target receptors and do not cause side-effects by binding to other similar receptors. Here, we investigate strategies for applying COMBINE (COMparative BINding Energy) analysis, in conjunction with PIPSA (Protein Interaction Property Similarity Analysis) and ligand docking methods, to address this problem. We evaluate these approaches by application to diverse sets of inhibitors of three structurally related serine proteases of medical relevance: thrombin, trypsin and urokinase-type plasminogen activator (uPA). We generated target-specific scoring functions (COMBINE models) for the three targets using training sets of ligands with known inhibition constants and structures of their receptor-ligand complexes. These COMBINE models were compared with the PIPSA results and experimental data on receptor selectivity. These scoring functions highlight the ligand-receptor interactions that are particularly important for binding specificity for the different targets. To predict target selectivity in virtual screening, compounds were docked into the three protein binding sites using the program GOLD and the docking solutions were re-ranked with the target-specific scoring functions and computed electrostatic binding free energies. Limits in the accuracy of some of the docking solutions and difficulties in scoring them adversely affected the predictive ability of the target specific scoring functions. Nevertheless, the target-specific scoring functions enabled the selectivity of ligands to thrombin versus trypsin and uPA to be predicted.
The recently determined crystal structure of the human β2-adrenergic (β2AR) G-protein coupled receptor provides an excellent structural basis for exploring the β2AR-ligand binding and dissociation processes. Based on this crystal structure, we simulated ligand exit from the β2AR receptor by applying the random acceleration molecular dynamics (RAMD) simulation method. The simulation results showed that the extracellular opening on the receptor surface was the most frequently observed egress point (referred to as pathway A) and a few other pathways through inter-helical clefts were also observed with significantly lower frequencies. In the egress trajectories along pathway A, the D192-K305 salt bridge between the extracellular loop 2 (ECL2) and the apex of the transmembrane helix 7 (TM7) was exclusively broken. The spatial occupancy maps of the ligand computed from the 100 RAMD simulation trajectories indicated that the receptor-ligand interactions that restrained the ligand in the binding pocket were the major resistance encountered by the ligand during exit and no second barrier was notable. We next performed RAMD simulations by using a putative ligand-free conformation of the receptor as input structure. This conformation was obtained in a standard MD simulation in the absence of the ligand and it differed from the ligand-bound conformation in a hydrophobic patch bridging ECL2 and TM7 due to the rotation of F193 of ECL2. Results from the RAMD simulations with this putative ligand-free conformation suggest that the cleft formed by the hydrophobic bridge, TM2, TM3 and TM7 on the extracellular surface likely serves as a more specific ligand-entry site and the ECL2-TM7 hydrophobic junction can be partially interrupted upon the entry of ligand that pushes F193 to rotate, resulting in a conformation as observed in the ligand-bound crystal structure.
These results may help design β2AR-targeting drugs with improved efficacy as well as understand the receptor subtype-selectivity of ligand binding in the β family of the adrenergic receptors that share almost identical ligand-binding pockets but show notable amino acid sequence divergence in the putative ligand-entry site, including ECL2 and the extracellular end of TM7.
Link:
http://dx.doi.org/10.1016/j.jmb.2009.07.093 (NIHMS138089)
G-protein coupled receptors (GPCR) represent a class of important therapeutic targets. Seeking novel ligands as potential drugs targeting GPCRs and identifying natural ligands for orphan GPCRs have been long-standing efforts of academic and pharmaceutical industrial research. To accelerate this effort, there is a critical need for methods capable of predicting GPCR-ligand interactions on a large scale. Such methods also may help to reveal cross-pharmacology of different GPCRs in order to alleviate side effects and toxicity of potential drugs. Here we report a support vector machine (SVM)-based method for predicting GPCR-ligand interactions on a chemo-genomic scale. In this method, GPCRs were characterized by the sequence information of the transmembrane segments and ligands were represented by their chemical structural information. The application of the method to a set of known GPCR-ligand interacting pairs that included GPCRs from 28 subfamilies of the A family led to a model of GPCR-ligand interaction network. The model was able to distinguish interacting pairs from non-interacting pairs with an average 86.9% true-positive rate and 99.97% true-negative rate. Moreover, the model correctly predicted the interactions of a number of new ligands and orphan GPCRs that were chemically and phylogenetically novel to the training data set. This method is expected to be applicable to in silico high-throughput GPCR-targeting drug discovery and ligand identification at the GPCRs with unknown functions.
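The true-positive and true-negative rates quoted in this abstract are standard confusion-matrix quantities. The following minimal sketch shows how they are computed from binary labels; the function name and the toy data in the usage note are illustrative assumptions and are unrelated to the published model or data set.

```python
def classification_rates(y_true, y_pred):
    """Return (true-positive rate, true-negative rate) for binary labels.

    Labels are truthy for "interacting" and falsy for "non-interacting".
    A rate is NaN when the corresponding class is absent from `y_true`.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1
        elif not truth and not pred:
            tn += 1
        else:
            fp += 1
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr
```

For example, `classification_rates([1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1])` gives a true-positive rate of 2/3 and a true-negative rate of 3/4. Because non-interacting pairs usually far outnumber interacting ones, reporting both rates is more informative than a single accuracy figure.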
Since the CC-chemokine receptor 5 (CCR5) was identified as a major co-receptor for human immunodeficiency virus type 1 (HIV-1) entry into a host cell, CCR5-targeting HIV entry inhibitors have been developed and some of them are currently in clinical trials. Most of these inhibitors also inhibit the physiological chemokine reaction function of CCR5, which is so far considered to be safe for patients based on the observation that individuals who naturally lack CCR5 do not show apparent health problems. Nevertheless, to minimize the toxicity and side effects, it would be ideal to preserve the chemokine receptor activity. In this work, we simulated the flexible docking of two small molecule inhibitors to CCR5 in a solvated phospholipid bilayer environment. One of the inhibitors, aplaviroc, has the unique feature of preserving the binding of two of the natural chemokine ligands to CCR5 and the subsequent activation, whereas the other, SCH-C, fully blocks chemokine-CCR5 interactions. Our results revealed significantly different binding modes of these two inhibitors although both established extensive interaction networks with CCR5. Comparison of the different binding modes suggests that avoiding the deep insertion of inhibitors into the transmembrane helix bundle may be able to preserve chemokine-CCR5 interactions. These results could help design HIV co-receptor activity-specific inhibitors.
Link:
http://dx.doi.org/10.1016/j.jmgm.2007.12.003 (NIHMS51075)
We present a computational procedure for modeling protein-protein association and predicting the structures of protein-protein complexes. The initial sampling stage is based on an efficient Brownian dynamics algorithm that mimics the physical process of diffusional association. Relevant biochemical data can be directly incorporated as distance constraints at this stage. The docked configurations are then grouped with a hierarchical clustering algorithm into ensembles that represent potential protein-protein encounter complexes. Flexible refinement of selected representative structures is done by molecular dynamics simulation. The protein-protein docking procedure was thoroughly tested on 10 structurally and functionally diverse protein-protein complexes. Starting from X-ray crystal structures of the unbound proteins, in 9 out of 10 cases it yields structures of protein-protein complexes close to those determined experimentally, with the percentage of correct contacts >30% and interface backbone RMSD <4 Å. Detailed examination of all the docking cases gives insights into important determinants of the performance of the computational approach in modeling protein-protein association and predicting protein-protein complex structures.
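The two quality criteria quoted in this abstract (more than 30% correct contacts and an interface backbone RMSD below 4 Å) can be sketched as follows. This is an illustration only: it assumes the docked and experimental interfaces have already been superposed, that contacts are supplied as pairs of residue identifiers, and all function names are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally sized, already superposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def fraction_correct_contacts(native_contacts, docked_contacts):
    """Fraction of native intermolecular contacts reproduced by the docked model."""
    native = set(native_contacts)
    if not native:
        return 0.0
    return len(native & set(docked_contacts)) / len(native)

def is_near_native(fraction, interface_rmsd, min_fraction=0.30, max_rmsd=4.0):
    """Apply the two thresholds quoted in the abstract (>30% contacts, <4 A RMSD)."""
    return fraction > min_fraction and interface_rmsd < max_rmsd
```

A real evaluation would first superpose the interface backbone atoms (for example with the Kabsch algorithm), which this sketch deliberately leaves out.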
Rhodopsin serves as the prototype for studies of the G-protein coupled receptor (GPCR) proteins as it is the only GPCR protein with known crystal structures, and its structure has been used as the template to model a large number of GPCR proteins, including many used as drug targets. Understanding ligand entrance routes is important for designing drugs with improved efficacy. Here we simulated the egress of the retinal chromophore from the protein by applying the random acceleration molecular dynamics (RAMD) method. The interhelical clefts near the extracellular side were identified to be the predominant egress, while the movement of retinal deep into the cytoplasmic side was also observed. These results suggest possible routes for ligands to enter into the binding pockets of GPCR proteins. In addition, the RAMD simulation results revealed the high stability of the interactions between helix 3 and other helices.
In protein unfolding simulations, elevated temperature, significantly exceeding the melting temperature Tm, provides an important means to accelerate unfolding to a computationally accessible time range. This procedure is based on the assumption that protein thermal unfolding has Arrhenius behavior and therefore that increasing temperature does not alter the protein unfolding pathways. However, in nature, proteins can show non-Arrhenius behavior and, in practice, overly fast unfolding in high-temperature simulations can result in difficulties in identifying unfolding intermediates and distinguishing their relative stabilities. In this paper, we describe simulations of two WW domains, small protein domains that have a three-stranded β-sheet structure. Simulations were carried out at several temperatures ranging from 300 K to 500 K, starting from folded structures. The results demonstrate the temperature dependence of the unfolding pathways, showing that to obtain unfolding pathways corresponding to those observed in experiments, the elevation of the simulation temperature has to be controlled. Based on trajectory analysis, we proposed a qualitative criterion for judging when an elevated temperature is acceptable or not, namely, that the temperature must be such that the native folded state is sampled substantially before protein unfolding begins. While depending on force field parameters and protein fold complexity, this criterion can be quantified to obtain the upper bound of an "acceptable elevated temperature", which was observed to be dependent on the thermostabilities of the two WW domain proteins.
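The Arrhenius assumption discussed in this abstract can be made concrete with a short sketch. Under the Arrhenius law k = A exp(-Ea / (R T)), raising the temperature multiplies the rate by exp[(Ea / R)(1/T_low - 1/T_high)] without changing the mechanism. The activation energy and prefactor are hypothetical inputs chosen by the user, not values from the study.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ / (mol K)

def arrhenius_rate(prefactor, activation_energy, temperature):
    """Rate constant k = A * exp(-Ea / (R * T)), with Ea in kJ/mol and T in K."""
    return prefactor * math.exp(-activation_energy / (GAS_CONSTANT * temperature))

def arrhenius_speedup(activation_energy, t_low, t_high):
    """Factor by which the rate grows when heating from t_low to t_high.

    Valid only if the process really is Arrhenius-like, which is exactly
    the assumption the abstract calls into question.
    """
    exponent = (activation_energy / GAS_CONSTANT) * (1.0 / t_low - 1.0 / t_high)
    return math.exp(exponent)
```

Calling `arrhenius_speedup(ea, 300.0, 500.0)` with an assumed barrier `ea` shows how strongly the expected acceleration depends on the barrier height, which is one reason very high simulation temperatures can blur the distinction between intermediates of different stability.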
The small guanosine triphosphate (GTP)-binding proteins of the Ras family are involved in many cellular pathways leading to cell growth, differentiation, and apoptosis. Understanding the interaction of Ras with other proteins is of importance not only for studying signalling mechanisms but also, because of their medical relevance as targets, for anticancer therapy. To study their selectivity and specificity, which are essential to their signal transfer function, we performed COMparative BINding Energy (COMBINE) analysis for 122 different wild-type and mutant complexes between the Ras proteins, Ras and Rap, and their effectors, Raf and RalGDS. The COMBINE models highlighted the amino acid residues responsible for subtle differences in binding of the same effector to the two different Ras proteins, as well as more significant differences in the binding of the two different effectors (RalGDS and Raf) to Ras. The study revealed that E37, D38, and D57 in Ras are nonspecific hot spots at its effector interface, important for stabilization of both the RalGDS-Ras and Raf-Ras complexes. The electrostatic interaction between a GTP analogue and the effector, either Raf or RalGDS, also stabilizes these complexes. The Raf-Ras complexes are specifically stabilized by S39, Y40, and D54, and RalGDS-Ras complexes by E31 and D33. Binding of a small molecule in the vicinity of one of these groups of amino acid residues could increase discrimination between the Raf-Ras and RalGDS-Ras complexes. Despite the different size of the RalGDS-Ras and Raf-Ras complexes, we succeeded in building COMBINE models for one type of complex that were also predictive for the other type of protein complex. Further, using system-specific models trained with only five complexes selected according to the results of principal component analysis, we were able to predict binding affinities for the other mutants of the particular Ras-effector complex.
As the COMBINE analysis method is able to explicitly reveal the amino acid residues that have most influence on binding affinity, it is a valuable aid for protein design.
The secondary structure propensities observed in protein simulations depend heavily on the force field parameters used. The existing empirical force fields often have difficulty in balancing the relative stabilities of helical and extended conformations. The resultant secondary structure bias may not be apparent in short simulations at room temperature starting from the native folded states. However, it can manifest itself dramatically at high temperatures and lead to large deviations from experimentally observed secondary structure propensities. Motivated by thermal unfolding simulations of several WW domains, which have a three-stranded β-sheet structure, we chose the FBP28 WW domain as a well-characterized system to investigate several AMBER force fields as well as parametrization of the NPSA (Neutralized, Polarized ionizable side chains with a solvent-accessible Surface Area-dependent term) implicit solvent model. The ff94 force field and two variants with altered parameters for the backbone torsion term were found to convert the native β-sheet structure directly to a single helix at high temperatures, whereas the ff96 force field produced significant non-native β-sheet content at high temperatures. The ff03 force field was able to reproduce the β-sheet-coil transition and experimentally observed unfolding pathways with both an explicit water solvent and the NPSA implicit solvent model at relatively low temperatures. However, the protein domain became predominantly helical after unfolding. Modification of the solvation parameter in the NPSA implicit solvent model was not sufficient to remedy this problem. The results imply that the intrinsic secondary structure bias in a force field cannot easily be solved by modifying a single parameter such as backbone torsion potential or a solvation parameter of a solvent model.
Nevertheless, the results show that the AMBER ff03 force field together with an explicit solvent model or the NPSA implicit solvent model is a useful tool for studying the unfolding of both α- and β-sheet structure protein domains, and an integrative consideration of all force field parameters is likely to be necessary for a complete solution.
The extracellular ribonuclease barnase and its intracellular inhibitor barstar bind fast and with high affinity. Although extensive experimental and theoretical studies have been carried out on this system, it is unclear what the relative importance of different contributions to the high affinity is and whether binding can be improved through point mutations. In this work, we first applied Poisson-Boltzmann electrostatic calculations to 65 barnase-barstar complexes with mutations in both barnase and barstar. The continuum electrostatic calculations with a van der Waals surface dielectric boundary definition result in the electrostatic interaction free energy providing the dominant contribution favoring barnase-barstar binding. The results show that the computed electrostatic binding free energy can be improved through mutations at W44/barstar and E73/barnase. Furthermore, the determinants of binding affinity were quantified by applying COMparative BINding Energy (COMBINE) analysis to derive quantitative structure-activity relationships (QSARs) for the 65 complexes. The COMBINE QSAR model highlights ~20 interfacial residue pairs as responsible for most of the differences in binding affinity between the mutant complexes, mainly due to electrostatic interactions. Based on the COMBINE model, together with Brownian dynamics simulations to compute diffusional association rate constants, several mutants were designed to have higher binding affinities than the wild-type proteins.
WW domains are small globular protein interaction modules found in a wide spectrum of proteins. They recognize their target proteins by binding specifically to short linear peptide motifs that are often proline-rich. To infer the determinants of the ligand binding propensities of WW domains, we analyzed 42 WW domains. We built models of the 3D structures of the WW domains and their peptide complexes by comparative modeling supplemented with experimental data from peptide library screens. The models provide new insights into the orientation and position of the peptide in structures of WW domain-peptide complexes that have not yet been determined experimentally. From a protein interaction property similarity analysis (PIPSA) of the WW domain structures, we show that electrostatic potential is a distinguishing feature of WW domains and we propose a structure-based classification of WW domains that expands the existing ligand-based classification scheme. Application of the comparative molecular field analysis (CoMFA), GRID/GOLPE and comparative binding energy (COMBINE) analysis methods permitted the derivation of quantitative structure-activity relationships (QSARs) that aid in identifying the specificity-determining residues within WW domains and their ligand-recognition motifs. Using these QSARs, a new group-specific sequence feature of WW domains that target arginine-containing peptides was identified. Finally, the QSAR models were applied to the design of a peptide to bind with greater affinity than the known binding peptide sequences of the yRSP5-1 WW domain. The prediction was verified experimentally, providing validation of the QSAR models and demonstrating the possibility of rationally improving peptide affinity for WW domains. The QSAR models may also be applied to the prediction of the specificity of WW domains with uncharacterized ligand-binding properties.
The suitability of three implicit solvent models for flexible protein-protein docking by procedures using molecular dynamics simulation is investigated. The three models are (i) the generalized Born (GB) model implemented in the program AMBER6.0; (ii) a distance-dependent dielectric (DDD) model; and (iii) a surface area-dependent model that we have parameterized and call the NPSA model. This is a distance-dependent dielectric model modified by neutralizing the ionizable sidechains and adding a surface area-dependent solvation term. These solvent models were first tested in molecular dynamics simulations at 300 K of the native structures of barnase, barstar, segment B1 of protein G, and three WW domains. These protein structures display a range of secondary structure contents and stabilities. Then, to investigate the performance of the implicit solvent models in protein docking, molecular dynamics simulations of barnase/barstar complexation, as well as PIN1 WW domain/peptide complexation, were conducted, starting from separated unbound structures. The simulations show that the NPSA model has significant advantages over the DDD and GB models in maintaining the native structures of the proteins and providing more accurate docked complexes.
We describe a Database of Simulated Molecular Motions (DSMM). This database is designed to serve as a single searchable site for locating movies and animations from simulations of biomolecules. DSMM is accessible via a webserver at: http://projects.villabosch.de/mcm/database/dsmm.
Link:
http://nar.oxfordjournals.org/cgi/content/full/31/1/456
The periplasmic oligopeptide binding component (OppA) of the oligopeptide permease found in Gram-negative bacteria acts as a receptor for peptide transport across the cell membrane and is a potential target for antibacterial drug design. OppA exhibits broad specificity, binding to diverse peptides of 2-5 amino acid residues in length. Crystallographic and calorimetric measurements have been carried out by Tame et al. of the binding of 28 peptides of sequence K-X-K to OppA, where X is a natural or nonnatural amino acid. Despite this extensive experimental characterization, a clear relationship between structural and thermodynamic parameters could not be readily identified, with a complicating factor being the observation of varying numbers of water molecules at the binding interface in the different complexes. Consequently, we have applied COMparative BINding Energy (COMBINE) analysis to derive quantitative structure-activity relationships (QSARs) for these 28 OppA-tripeptide complexes. This is the first application of COMBINE analysis to predict binding enthalpies and entropies, and predictive QSAR models were obtained for these quantities as well as for binding free energies. These QSAR models highlight several protein residues and bound water molecules in the binding site, as well as the electrostatic desolvation energies of the protein and the peptides, as responsible for most of the differences in binding thermodynamics between the peptides studied. The QSAR models aid rationalization of the determinants of binding affinity of the OppA:peptide complexes and provide guides for further ligand design. This study also points to the general applicability of COMBINE analysis to estimating thermodynamic parameters for protein-peptide complexes.
Neuraminidase is a surface glycoprotein of influenza viruses that cleaves terminal sialic acids from carbohydrates. It is critical for viral release from infected cells and facilitates viral spread in the respiratory tract. The catalytic active site of neuraminidase is highly conserved in all type A and B influenza viruses, making it an excellent target for antiinfluenza drug design. Indeed, neuraminidase inhibitors have recently become available in the clinic for the treatment of influenza. Here, we describe the use of 3D structures of neuraminidase-inhibitor complexes to derive quantitative structure-activity relationships (QSARs) to aid understanding of the mechanism of inhibition and the discovery of new inhibitors. Crystal structures of neuraminidase-inhibitor complexes were used alongside modeled complexes to derive QSAR models by COMparative BINding Energy (COMBINE) analysis (Ortiz, A. R.; Pisabarro, M. T.; Gago, F.; Wade, R. C. J. Med. Chem. 1995, 38, 2681-2691). The neuraminidase proteins studied include type A subtypes N2 and N9 (which have ca. 50% sequence identity) and an active site mutant of the N9 subtype. The inhibitors include sialic acid and benzoic acid analogues with diverse frameworks and substitution groups. By considering the contributions of the protein residues and a key water molecule to the electrostatic and van der Waals intermolecular interaction energies, a predictive and robust QSAR model for binding to type A neuraminidase was obtained. In this QSAR model, 12 protein residues and 1 bound water molecule are highlighted as particularly important for inhibitory activity. This QSAR model provides guidelines for structural modification of current inhibitors and the design of novel inhibitors in order to optimize inhibitory activity.
3DFS is a 3D flexible searching system for lead discovery. Version 1.0 of 3DFS was published recently (Wang, T.; Zhou, J. J. Chem. Inf. Comput. Sci., 1998, 38, 71-77). Here version 1.2 represents a substantial improvement over version 1.0. There are six major changes in version 1.2 compared to version 1.0:
1. A new rule of aromatic ring recognition.
2. The inclusion of multiple-type atoms and chains in queries.
3. The inclusion of more spatial constraints, especially the directions of lone pairs.
4. The improvement of the query file format.
5. The addition of genetic search for flexible search.
6. An output option for generating MOLfiles of hits.
Besides the above, this paper supplies:
1. More query examples.
2. A comparison between genetic search and Powell optimization.
3. More detailed comparison between 3DFS and Chem-X.
4. A preliminary application of 3DFS to K+ channel opener studies.
This paper describes a new 3D flexible searching system, 3DFS, which supports two types of query definitions: a simple atom-based definition and a generalized function-based definition. The simple and practical definitions of hydrogen bond acceptors/donors, charge centers, aromatic ring centers, and a rapid hydrophobe recognition algorithm are described in detail. 3DFS adopts a four-step searching strategy: a rapid ID screening, an exact 2D substructure search using the GMA algorithm, a rigid 3D search, and a conformationally flexible 3D search using the POWELL method. The utility of 3DFS is illustrated by several typical searching examples.
The efficiency and the effectiveness of a 3D searching system can be dramatically influenced by the conformational searching algorithm used in the system. The genetic algorithm and the POWELL method are compared for their conformational search utility in 3DFS, a 3D searching system developed in our laboratory. The search results of five typical queries suggest that the choice of algorithm depends on the requirements, because each of the two algorithms shows an advantage in terms of either speed or number of hits.
This paper describes a new method, EMCSS, for Maximal Common Substructure (MCS) search. It uses a substructure searching algorithm, Xu's GMA algorithm, to convert the MCS search space into a much smaller space, the connection table space of the query graph (QG), and adopts an evolutionary strategy to search for the optimum solution. The principle of the EMCSS method and its implementation are described in detail. Some highly complex examples, even a hyperstructure pair, are tested. The investigation demonstrates that the EMCSS method is robust and efficient.
This paper describes an evolutionary algorithm for variable selection in Quantitative Structure-Activity Relationship (QSAR) studies. Its application to two data sets, larvicides and sulfonylurea herbicides, demonstrates that the evolutionary algorithm is an effective tool for variable selection and for building multiple QSAR models. An appropriate fitness function for evaluating the models is key to obtaining high-quality models.
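The general idea of evolutionary variable selection can be sketched as below. This is a generic illustration, not the algorithm of the paper: the truncation selection, one-point crossover, bit-flip mutation, elitism, parameter defaults and the function name `evolve_subset` are all assumptions. The fitness function is supplied by the caller; in a QSAR setting it would typically score a model built from the selected descriptors.

```python
import random

def evolve_subset(n_vars, fitness, pop_size=30, generations=50,
                  mutation_rate=0.1, seed=0):
    """Search for a good variable subset encoded as a list of booleans.

    `fitness` maps a mask (list of bool, length n_vars) to a score to maximize.
    Requires n_vars >= 2 and pop_size >= 4. Returns (best_mask, history),
    where history[g] is the best score in generation g.
    """
    rng = random.Random(seed)
    population = [[rng.random() < 0.5 for _ in range(n_vars)]
                  for _ in range(pop_size)]
    history = [max(fitness(mask) for mask in population)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]          # truncation selection
        children = []
        while len(children) < pop_size - 1:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)         # one-point crossover
            child = mother[:cut] + father[cut:]
            # Flip each bit with probability `mutation_rate`.
            child = [(not gene) if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = [ranked[0]] + children        # elitism keeps the best mask
        history.append(max(fitness(mask) for mask in population))
    best = max(population, key=fitness)
    return best, history
```

Because the best mask is always carried over, the best score never decreases from one generation to the next. A toy fitness such as "number of informative variables selected minus a small penalty per selected variable" is enough to exercise the routine; a real study would substitute a cross-validated model statistic.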