Use of neural networks to predict hydration sites in proteins

Rebecca C. Wade, Henrik Bohr and Peter G. Wolynes

European Molecular Biology Laboratory, Meyerhofstr. 1, 6900 Heidelberg, Germany
Noyes Laboratory, School of Chemical Sciences, University of Illinois, 505 St. Mathews Avenue, Urbana, IL 61801, U.S.A.


The ability to predict ligand binding sites on biological macromolecules is fundamental to the design of drugs that bind to particular receptors. Because water has an important influence on ligand-protein interactions, we focus here on the prediction of hydration sites in proteins. These sites were predicted from amino acid sequence and secondary structure, using multilayered feed-forward neural networks trained on crystallographic data.


An input window of 17 residues was used. Each residue was represented by 12 input neurons specifying its physical properties and secondary structure class [1]. Protein structure analysis [2,3] indicates that hydration sites in proteins depend on secondary structure; e.g., waters are often located at specific positions along α-helices. Neural networks may themselves be used to predict secondary structure, although at present their accuracy is about 65% (see e.g. [4,5]). Thus, in principle, it is not necessary to know a test protein's tertiary structure in order to supply secondary structure as input. Two independent networks were run to predict (1) whether each residue in a test protein has a water ligand (1 binary output neuron/residue) and (2) whether each atom (except C and H) in the protein makes a close (3.5 Å) contact with a water molecule (binary output neurons/atom).
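
The input encoding described above can be illustrated with a minimal Python sketch. Only the window length (17 residues) and the number of input neurons per residue (12) come from the text. Everything else is an assumption made for illustration: the split of the 12 values into 9 property values plus a 3-way one-hot secondary structure class, the zero padding at chain ends, the function names, and the single hidden layer without bias terms. It is not the authors' implementation.

```python
import math
from typing import List, Sequence

WINDOW = 17          # residues per input window (from the text)
FEATURES = 12        # input neurons per residue (from the text)
HALF = WINDOW // 2   # 8 flanking residues on each side of the central one

# ASSUMPTION: three secondary structure classes (helix, strand, coil),
# one-hot encoded; the source does not give the actual encoding.
SS_CLASSES = "HEC"


def encode_residue(props: Sequence[float], ss: str) -> List[float]:
    """Return 12 values: 9 hypothetical property values + one-hot SS class."""
    if len(props) != FEATURES - len(SS_CLASSES):
        raise ValueError("expected 9 property values per residue")
    one_hot = [1.0 if ss == c else 0.0 for c in SS_CLASSES]
    return list(props) + one_hot


def window_inputs(residues: List[List[float]], centre: int) -> List[float]:
    """Flatten the 17-residue window centred on `centre` into 204 inputs.

    ASSUMPTION: window positions beyond the chain ends are zero padded.
    """
    blank = [0.0] * FEATURES
    inputs: List[float] = []
    for i in range(centre - HALF, centre + HALF + 1):
        inputs.extend(residues[i] if 0 <= i < len(residues) else blank)
    return inputs


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def forward(x: Sequence[float],
            w_hidden: Sequence[Sequence[float]],
            w_out: Sequence[float]) -> float:
    """One hidden layer, one output unit, no bias terms (a simplification)."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
```

Thresholding the output of `forward` (for example at 0.5) would give the binary per-residue prediction of the first network; the per-atom network would differ only in its output layer.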

In ``Trends in QSAR and Molecular Modelling '92''
Ed. Wermuth,C.G.
ESCOM, Leiden (1993), pp. 396-397.