Free energy simulations: theory and applications

Free energy simulations: theory and applications O. Michielin(1,2,4) (1) Ludwig Institute for Cancer Research Epalinges, Switzerland (2) Swiss Insti...
Author: Paul Hutchinson
3 downloads 0 Views 7MB Size
Free energy simulations: theory and applications

O. Michielin(1,2,4)

(1) Ludwig Institute for Cancer Research Epalinges, Switzerland (2) Swiss Institute for Bioinformatics Epalinges, Switzerland (4) Centre Pluridiscipinaire d'oncologie CHUV, Lausanne, Switzerland

Dynamical aspects of molecular recognition

Free energy: classical definition

+ The free energy is the energy left for once you paid the tax to entropy:

!G = !H " T!S Enthalpic ! ! ! !

Hydrogen bonds Polar interactions Van der Waals interactions ...

Entropic ! ! ! !

Loss of degrees of freedom Gain of vibrational modes Loss of solvent/protein structure ...

Theoretical Predictions: ! Approximate: empirical formula for all contributions ! Exact: using statistical physics definition of G G = -KBT ln(Z)

Examples of factors determining the binding free energy Electrostatic interactions - Strength depends on microscopic environment (!) - Case of hydrogen bonds Neutral :

-1.2 ± 0.6 kcal/mol

Charge assisted :

-2.4 to -4.8 kcal/mol

EH-bond (solv.) - EH-bond (comp.) determines if H-bonds contributes to affinity or not

H H

H

O

O H O

H H O

H O

H O

H H

H O H

O

S

H

solvent

complex

Unpaired polar groups upon binding are detrimental Strong directional nature Specificity of molecular recognition

Free energy: statistical mechanics definition

G = !k BT ln(Z )

where

Z = # i e! " Ei

is the partition function Free energy differences between 2 states (bound/unbound, …) are, therefore, ratios of partition functions

# ZA & !G = GA " GB = "kBT ln % ( $Z ' B

Free energy simulations aim at computing this ratio using various techniques.

Relation with chemical equilibrium A’B’

A'B'] [ KA = Kb = [ A ] [ B]

A’B’ # " A+B

A ] [ B] [ KD = Ki = [ A'B']

A+B # "

Kb : binding constant, Kd : dissociation constant, Ki : inhibition constant

!G binding = "RTln K A = RTln K D = !H " T!S "Gbinding (kcal/mol) Weak asso.

KD (mol/l)

-2

-4

-6 10 -3

-8

-10 10 -6

-12

-14

10 -9

-16

Strong asso.

10 -12

Connection micro/macroscopic: thermodynamics and kinetics

e - RT!G Free Energy

Microscopic Structure

=

Absolute binding free energies: !G " KA

KA Association Constant

Biological function

Relative binding free energies: !!G " KA’ / KA Binding free energy profiles: !G(#) " KA, Kon, Koff

The free energy is the main function behind all process KA =

A) Chemical equilibrium

!G binding = RTln K A

+ A

B) Conformational changes

!G conf = kB Tln

PConf 1 PConf 2

C) Ligand binding

!G binding = kB Tln D) …

PUnbound PBound

R = kB N A

[ AB] [ A ] [ B]

B KD = 1 / KA

AB

Free energy: computational approaches # ZA & !G = GA " GB = "kBT ln % ( $Z ' B

Free energy simulations techniques aim at computing ratios of partition functions using various techniques.

Z = # i e! " Ei Sampling of important microstates of the system (MD, MC, GA, …)

Computation of energy of each microstate (force fields, QM, CP)

Connection micro/macroscopic: intuitive view E1, P1 ~ e-$E1

Expectation value

O =

1 Oie! "Ei # Z i

#

E2, P2 ~ e-$E2

E3, P3 ~ e-$E3

! "Ei

Where Z = e i is the partition function

E4, P4 ~ e-$E4

E5, P5 ~ e-$E5

Central Role of the Partition Function

Z = # i e! " Ei

O =

1 ! " Ei O e # i Z i

...

Expectation Value

E =

! ln(Z) = U !"

Internal Energy

" ! ln(Z ) % p = k BT $ # !V '& N ,T Pressure

G = -kBT ln(Z) Gibbs free energy

Molecular Modeling Principles 1) Modeling of molecular interactions Electrostatics

Van der Waals

Covalent bonds

Solvent

Free energy landscape

2) Simulation of time evolution (Newton) 3) Computation of average values O = < O >Ensemble = < O >Temps (Ergodicity) Macroscopic value

Average simulation value

Connection microscopic/ macroscopic

The CHARMM Force Field V=

!K

2 (b " b ) + b 0

Bonds

"K

+

!K

2 ( # " # ) # 0

Angles

2 ( ! # ! ) ! 0

Impropers

+

" K [1# cos(n ! # $ )] !

!

!

Dihedrals

qiq j 1 i> j 4 !" ri, j

+#

+$ 4!ij [(" ij /rij )12 # (" ij /rij ) 6 ] i> j

Ergodic Hypothesis MD Trajectory E

$ Dialanine

O

Ensemble

#

NVT simulation

NVE simulation 3N Spatial coordinates

Protein

? 1& 1 = % O(! ," )e# $ E (! ," )d! d" = % O(t)dt = O Z & 0

Time

Approx.

Free Energy Perturbation (FEP)

Thermodynamical Integration (TI)

Non Equilibrium Statistical Mechanics (Jarzynski)

Linear Interaction Energy (LIE)

Molecular Mechanics/PoissonBoltzmann/Surface area (MM-PBSA)

Quantitative Structure Activity Relationship (QSAR)

G k0

kiXi

!G = F(X)

(X is a descriptor)

CPU Time

Sampling, Approx.

Sampling, Exact

Free energy calculation: Main approaches

Approx.

Free Energy Perturbation (FEP)

Thermodynamical Integration (TI)

Non Equilibrium Statistical Mechanics (Jarzynski)

Linear Interaction Energy (LIE)

Molecular Mechanics/PoissonBoltzmann/Surface area (MM-PBSA)

Quantitative Structure Activity Relationship (QSAR)

G k0

kiXi

!G = F(X)

(X is a descriptor)

CPU Time

Sampling, Approx.

Sampling, Exact

Free energy calculation: Main approaches

Theoretical approaches for the estimation of binding affinities Without 3D structure of the complex - 2D - QSAR - 3D - QSAR

With 3D structure of the complex - Knowledge based approaches • regression based methods • potential of mean force - Free energy simulation - Linear interaction energy - Binding free energy decomposition (MM-PBSA, MM-GBSA)

Quantitative Structure Activity Relationships (QSAR) (Drug design) Assumption Chemical similarity of ligands

Similarity of biological response

Affinity is a function of the ligand physico-chemical properties

Advantage No need for structural information about the target

Requirements (drawbacks) Known affinities for a series of ligands Structurally related ligands or similar binding modes

2D - QSAR n structurally related molecules O R4

R1

N

O

N

R2

R3

Quantitative description

Measured activities

Descriptors (Xi): - Substituents: surface, volume, electrostatics (%), hydrophobicity (&), partial charges, etc... - Molecule : volume, MR, HOMO, dipole moment, etc...

X 1

"G bind = k 0 + ! k i X i !G bind = neural network (X i )

C. Hansch and T. Fujita, JACS, 1964, 86, 1616 S.S. So and M. Karplus, J. Med. Chem., 1996, 39, 1521 S.S. So and M. Karplus, J. Med. Chem., 1996, 39, 5246

2D - QSAR Limitations Limited to structurally related molecules Needs the experimental activity of a series ligands Not for ab initio studies Overfitting - Method for selecting the descriptor (genetic algorithm) - Estimation of the predictive ability (test set, randomization test, Leave-One-Out method, ...) Use limited to the descriptor’s range in the training set : If only hydrophobic groups at R1 in the training set Influence of a hydrophilic group at R1 ? If only methyl, ethyl, propyl, butyl at R1 in the training set Contribution of pentyl, hexyl, etc... ?

O R4 O

R1

N N R3

R2

3D - QSAR Example : Comparative molecular field analysis (CoMFA) R.D. Cramer et al., JACS, 1988, 110, 5959

(x1,y1,z1)

(x2,y2,z2)

S

S

1

2

1

(x1,y1,z1)

. . .

Ligands mutually aligned Common 3D lattice

(x2,y2,z2)

E

E

1

2

. . .

2

" G b i n d

3 . . .n

Steric field

PLS

Electrostatic field

"G bind = k 0 + ! $ iSi + ! # i E i

(xi,yi,zi) (xj,yj,zj)

3D - QSAR Limitations Needs the experimental activity of a series of ligands Not limited to structurally related molecules Alignment of the molecules in their bioactive conformation. Use of: - known structure of a complex (QSAR?) - conformationally rigid example in the dataset - functional groups in agreement with a pharmacophore hypothesis Others : CoMSIA, HASL, Compass, APEX-3D, YAK, ...

Knowledge based approaches. Regression based methods Example : LUDI score H.J. Bohm, J. Comput.-Aided Mol. Des., 1994, 8, 623 H.J. Bohm, J. Comput.-Aided Mol. Des., 1998, 12, 309

!G bind = !G 0 + !G polar + !G apolar + !G solv + !G flexi Trained using a 82 protein-ligand complexes dataset with known experimental "Gbind

Polar interactions

Desolvation effect

(

)

!G polar = !G hb $ f ( !R, !" ) # f N neighb # fpcs hb

(

)

+!G ion $ f ( !R, !" ) # f N neighb # fpcs ion

+!G esrep N esrep

Active site filled with water molecules

!G solv = !G lipo wat

MD

Unbound water molecules

" unbound water

"Glipo water = -0.33 kcal/mol

"Ghb = -0.81, "Gion = -1.41 and "Gesrep = +0.10 kcal/mol

Ligand flexibility

Apolar interactions !G apolar = !G lipo A lipo + !G aro " f (R) aro

"Glipo = -0.81 and "Garo = -0.62 kcal/mol

!G flex = !G rot N rot "Grot = +0.26 kcal/mol Nrot : number of rotable bonds (acyclic sp3-sp3, sp3-sp2)

Knowledge based approaches. Regression based methods Example : LUDI score 82 complexes of the training set SD ~2 kcal/mol

Advantages :

Drawbacks :

- Allows identification of high affinity ligands

- Somewhat large errors

- Rapid estimation of the affinities

- Method biased: • certain type of proteins • only good complementarity protein/ligand - Some interactions ignored (cation – &)

- Structurally different ligands - “Universal” (different proteins)

Others : ChemScore, VALIDATE

Knowledge based approaches. Potential of mean force Example : PMF score I. Muegge et al., J. Med. Chem., 1999, 42, 791 I. Muegge et al., Persp. In Drug Disc. And Des., 2000, 20, 99

Trained using 697 complexes from the PDB. No need for experimental "Gbind atom type i for protein and j for ligand

# j " ij (r) & A ij ( r ) = !k b Tln % fvol_corr (r) ij ( "bulk ' $

%ij(r) : number density of atom pairs of type ij at distance r %ijbulk(r) : number density of atom pairs of type ij in a sphere with radius R f jvolv_corr : ligand volume correction

16 protein atom types, 34 ligand atom types PMF (kcal/mol)

NC positively charged nitrogen OC negatively charged oxygen ND nitrogen as hydrogen bond donor OA oxygen as hydrogen bond acceptor OD oxygen as hydrogen bond donnor Atom pair distance (Å)

Atom pair distance (Å)

PMF score = !

Atom pair distance (Å)

!

ij kl of type ij r