Biomolecular Entropy Estimation

The concepts of entropy and mutual information (MI) are fundamental to statistical mechanics. However, whilst entropy is a key driver of chemical and biological processes, it remains difficult to model and is consequently poorly understood. Various entropy estimation techniques have been developed, the most direct being based on information-theoretic approaches. In a classical treatment, the total entropy of a system (S) with n degrees of freedom has separate contributions from the particles’ momenta and configurations. The configurational contribution can be expressed in terms of the first-order entropies and the sum of all the MI terms (I) up to order n. These terms can in turn be expressed in terms of multi-particle correlation functions of increasing order.
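Written out, one common form of this expansion is sketched below. The symbols S_1, S_2, I_2 and I_3 are notation assumed here for illustration, and sign conventions for the higher-order MI terms differ between authors.

```latex
S_{\mathrm{config}} = \sum_{i} S_1(i) \;-\; \sum_{i<j} I_2(i,j) \;+\; \sum_{i<j<k} I_3(i,j,k) \;-\; \cdots \;+\; (-1)^{n-1} I_n(1,\ldots,n)

% Pairwise MI in terms of first-order and joint entropies
I_2(i,j) = S_1(i) + S_1(j) - S_2(i,j)
```

Here S_1(i) is the entropy of degree of freedom i taken alone and S_2(i,j) is the joint entropy of the pair (i, j).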

In the language of information theory, this generates a mutual information expansion (MIE). The higher-order MI terms are increasingly difficult to calculate from molecular simulations because the number of terms grows combinatorially with order. However, the MIE can be truncated at any desired order to reach a computationally tractable expression. For example, inhomogeneous fluid solvation theory applies the two-particle entropy approximation, considering only pairwise correlations. In the context of molecular simulation, entropy and MI can be estimated by studying the available degrees of freedom in the system and how these are frustrated by the introduction of a potential energy function. We have previously employed the k-nearest neighbours (KNN) algorithm to estimate entropy and MI from molecular simulation, demonstrating its accuracy in the context of randomly generated data with known entropy, hydration entropy of simple solutes, and water molecules in protein cavities.
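As an illustration of how a KNN estimate of entropy and pairwise MI can be obtained from sampled data, the following minimal sketch uses the Kozachenko-Leonenko nearest-neighbour estimator with Euclidean distances. It is an assumption made for illustration and not necessarily the exact estimator or implementation used in the work described here. The function names `knn_entropy` and `knn_mutual_information` are hypothetical, and the brute-force neighbour search is only practical for small sample sizes.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_entropy(samples, k=4):
    """Kozachenko-Leonenko entropy estimate in nats.

    samples: list of equal-length tuples of floats (one tuple per sample).
    Assumes no duplicate points, since a zero neighbour distance has no log.
    """
    n = len(samples)
    d = len(samples[0])
    # Log volume of the d-dimensional unit ball: pi^(d/2) / Gamma(d/2 + 1).
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    log_r_sum = 0.0
    for i, x in enumerate(samples):
        # Brute-force distances from sample i to every other sample.
        dists = sorted(math.dist(x, y) for j, y in enumerate(samples) if j != i)
        log_r_sum += math.log(dists[k - 1])  # distance to the k-th neighbour
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_r_sum / n


def knn_mutual_information(xs, ys, k=4):
    """Pairwise MI from three entropy estimates: I = S(x) + S(y) - S(x, y)."""
    joint = [x + y for x, y in zip(xs, ys)]  # concatenate coordinate tuples
    return knn_entropy(xs, k) + knn_entropy(ys, k) - knn_entropy(joint, k)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(rng.gauss(0.0, 1.0),) for _ in range(500)]
    exact = 0.5 * math.log(2.0 * math.pi * math.e)
    print("estimated:", knn_entropy(data), "exact:", exact)
```

Randomly generated data with known entropy offer a simple check: samples from a unit-variance Gaussian have an exact entropy of ½ ln(2πe), approximately 1.419 nats, against which the estimate can be compared. For realistic simulation data, a spatial index such as a k-d tree would normally replace the brute-force search.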

We have developed a new statistical mechanical framework for calculating free energies by truncating the MIE at a tractable order and using the KNN algorithm to make quantitative predictions. Entropy estimates are then combined with long-timescale energy averages to yield absolute and relative free energy changes for molecular systems and processes.
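The combination of entropy and energy can be sketched with the standard thermodynamic relation at constant temperature T. The second-order truncation shown here is an illustrative assumption, since the truncation order is not specified above, and the labels A and B for the two states are notation introduced for this example.

```latex
% Configurational entropy from a mutual information expansion truncated at second order
S_{\mathrm{config}} \approx \sum_{i} S_1(i) - \sum_{i<j} I_2(i,j)

% Absolute free energy from the average energy and the total entropy S
F = \langle E \rangle - T S

% Relative free energy between two states A and B
\Delta F = F_B - F_A = \left( \langle E \rangle_B - \langle E \rangle_A \right) - T \left( S_B - S_A \right)
```

In this sketch the angle brackets denote simulation averages of the energy, and S includes the momentum contribution alongside the configurational estimate.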



A Large Scale Study of Hydration Environments Through Hydration Sites

Benedict W. J. Irwin, Sinisa Vukovic, Michael C. Payne, and David John Huggins

The Journal of Physical Chemistry B

Biomolecular simulations: From dynamics and mechanisms to computational assays of biological activity

David J. Huggins, Philip C. Biggin, Marc A. Dämgen, Jonathan W. Essex, Sarah A. Harris, Richard H. Henchman, Syma Khalid, Antonija Kuzmanic, Charles A. Laughton, Julien Michel, Adrian J. Mulholland, Edina Rosta, Mark S. P. Sansom, Marc W. van der Kamp

Wiley Interdisciplinary Reviews: Computational Molecular Science

© 2020 Theory of Condensed Matter Group, Cavendish Laboratory, JJ Thomson Avenue, Cambridge, CB3 0HE, United Kingdom