

DeepBindRG: a deep learning based method for estimating effective protein–ligand affinity

Cite this article: Zhang H, Liao L, Saravanan KM, Yin P, Wei Y.

Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
DOI: 10.7717/peerj.7362
Academic Editor: Ben Corry
Subject Areas: Bioinformatics, Computational Biology, Molecular Biology, Data Mining and Machine Learning
Keywords: Protein–ligand binding affinity, ResNet, Deep neural network, Native-like protein–ligand complex, Drug design
Copyright © 2019 Zhang et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Proteins interact with small molecules to modulate several important cellular functions. Many acute diseases have been cured by small molecules that bind in the active site of a protein, acting either by inhibition or by activation. Currently, several docking programs estimate the binding position and binding orientation of a protein–ligand complex, and many scoring functions have been developed to estimate the binding strength and predict effective protein–ligand binding. However, the accuracy of current scoring functions is limited in several respects: the solvent effect, entropy effect, and multibody effect are largely ignored in traditional machine learning methods. In this paper, we propose a new deep neural network-based model named DeepBindRG to predict the binding affinity of a protein–ligand complex. The model learns all of these effects, the binding mode, and the specificity implicitly from protein–ligand interface contact information in a large protein–ligand dataset. During the initial data processing step, the critical interface information is preserved to ensure that the input is suitable for the proposed deep learning model. When validated on three independent datasets, DeepBindRG achieves a root mean squared error (RMSE) of about 1.6–1.8 in pKa (−logK d or −logK i) and an R value of around 0.5–0.6, which is better than AutoDock Vina, whose RMSE is about 2.2–2.4 and whose R value is 0.42–0.57. We also explore the detailed reasons for the performance of DeepBindRG, especially for several cases in which Vina failed. Furthermore, DeepBindRG performed better on four challenging datasets from the DUD.E database that have no experimental protein–ligand complexes. The better performance of DeepBindRG relative to AutoDock Vina in predicting protein–ligand binding affinity indicates that a deep learning approach can greatly help the drug discovery process. We also compare the performance of DeepBindRG with a 4D based deep learning method, “pafnucy”; the advantages and limitations of both methods provide clues for improving deep learning based protein–ligand prediction models in the future.

Many complex diseases still prevail because effective therapeutic drugs are lacking, for instance many types of cancer, dengue viral disease, Human Immunodeficiency Virus, hypertension, diabetes, and Alzheimer’s disease (Iyengar, 2013; Zahreddine & Borden, 2013). As the mechanisms and targets of these complex diseases are gradually being explored, developing effective drugs that block disease-related pathways through protein–ligand interaction becomes possible (Copeland, Pompliano & Meek, 2006). In the post-genomics era, although some novel therapeutic methods, such as immunotherapy, have progressed tremendously, small molecule drug design is still a dominant way to combat diseases (Anusuya et al., 2018). About 70% of the approved drugs in the DrugBank database belong to the small molecule category (Wishart et al., 2008). Currently, drug development is a long and costly process, costing billions of US dollars and taking several years to bring a single drug to market (Politis et al., 2017). To resolve the paradox of a growing demand for new drugs and the low efficiency of drug development, much research has focused on developing computational methods to aid drug discovery (Heifetz et al., 2018).
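
The two evaluation metrics quoted in the abstract, RMSE and the R value computed on −log K affinities, can be made concrete with a short sketch. It is a minimal illustration and not the authors' evaluation code. It assumes that R denotes the Pearson correlation coefficient and that K is given in mol/L; the function names and the three example affinities are hypothetical.

```python
import math


def pk_from_k(k_molar):
    """Convert a binding constant (Kd or Ki, assumed to be in mol/L) to -log10(K)."""
    if k_molar <= 0:
        raise ValueError("binding constant must be positive")
    return -math.log10(k_molar)


def rmse(y_true, y_pred):
    """Root mean squared error between experimental and predicted affinities."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(x, y):
    """Pearson correlation coefficient (assumed meaning of the reported R value)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical example: three experimental constants and three model predictions.
experimental = [pk_from_k(k) for k in (1e-9, 1e-6, 1e-4)]
predicted = [8.0, 6.5, 5.0]

print("RMSE:", rmse(experimental, predicted))
print("R:", pearson_r(experimental, predicted))
```

A lower RMSE means predictions lie closer to the measured values in absolute terms, while a higher R means the predicted ranking tracks the experimental one more closely. The two can disagree, which is why both are reported.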
