Per-residue energy decompositions

From Rizzo_Lab
Jump to: navigation, search

The Molecular Mechanics energy function is a pair-wise additive function which can be broken into per-residue components. This is useful for measuring the contribution of each protein residue towards ligand binding. An analogous experimental technique would be alanine scanning. We call these plots molecular interaction footprints.

This tutorial is for analyzing molecular dynamics (MD) simulations of a receptor-ligand complex.

In the example, we show how to analyze a trajectory generated with AMBER. However, you can also use NAMD to generate your trajectories.

I would suggest that you perform your own simulation and then use this tutorial as a guide to performing your analysis.

Here are some tutorials on the Rizzo lab wiki for running MD:

Here is also some information on NAMD from the rizzo lab wiki:

Calculate Decompositions Using Namd

The following script uses vmd to create pdb files. Each atom is labeled with a 0 (excluded), 1(species #1) or 2(species #2) in beta column. namd will compute the interaction energy between species #1 and #2. A separate pdb file is created for each residue (species #2). The ligand is always species #1.

We are using amber inputs in this example. The namd input file loops over each pdb file to compute the interaction energy for every timestep.

run.005.vmd.namd.energy.decomp_sw.csh:

#! /bin/csh -fe

set mountdir        = "/nfs/user03/chembio/AMBER_Project"

set start           = 1
set stop            = 35
set offset          = 1

set workdir         = "${mountdir}/006.FP.VMD.NAMD"

cd        ${workdir}

###########################################

set res_start = ${start}
set res_stop  = ${stop}
@ loop_stop = ${res_stop} + 1

cat << EOF > ptraj.in
trajin ${mountdir}/004.ptraj/1qp6.rec.trj.stripfit
trajout 1qp6.rec.stripfit.dcd charmm
EOF

ptraj ${mountdir}/002.TLEAP/1qp6.rec.gas.leap.prm7 ptraj.in > ptraj.out


###########################################
cat <<EOFK >pair.maker.tcl
#opens the file that has the atoms that are to be set-up for pair-interaction calculations
mol load parm7 ${mountdir}/002.TLEAP/1qp6.rec.gas.leap.prm7 rst7 ${mountdir}/002.TLEAP/1qp6.rec.gas.leap.rst7

for {set resid_num $res_start} {\$resid_num < $loop_stop} {incr resid_num} {
set all [atomselect top all]
\$all set beta 0
set case1 [atomselect top "protein and resid 36 to 70"]
\$case1 set beta 1
set case2 [atomselect top "protein and resid \$resid_num"]
\$case2 set beta 2
\$all writepdb 1qp6.rec.gas.pair.\$resid_num.pdb
}
quit
EOFK
###########################################
# copy DCD files required in the calculations
# to the current directory

# make a special PDB files defining what the two species will be for pair interaciton calcs
  /nfs/user05/lingling/zzz.programs/vmd-1.9.1/bin/vmd -dispdev text -e pair.maker.tcl         >> zzz.pair.maker.out

echo "MAKE INPUT FILES"
date

set res_start = ${start}
set res_stop  = ${stop}
@ loop_stop = ${res_stop} + 1

set start_residue = $res_start
set loop_stop  = $loop_stop
set count01 = $start_residue

###########################################
###########################################
while ( $count01 < $loop_stop )
###########################################
cat <<EOFL >07_to_08pr.$count01.in
# initial config
temperature     0

# output params
outputname      1qp6.rec.gas.$count01
binaryoutput    no

# integrator params
timestep        1.0

# force field params
amber           on
parmfile        ${mountdir}/002.TLEAP/1qp6.rec.gas.leap.prm7
ambercoor       ${mountdir}/002.TLEAP/1qp6.rec.gas.leap.rst7
readexclusions  yes  # exclusions are taken from parmfile
exclude         scaled1-4
1-4scaling      0.833333  # =1/1.2, default is 1.0
scnb            2.0  # Default
switching       off
cutoff          999.0

# Atoms in group 1 have a 1 in the B column; group 2 has a 2
pairInteraction         on
pairInteractionFile     1qp6.rec.gas.pair.$count01.pdb
pairInteractionCol      B
pairInteractionGroup1   1
pairInteractionGroup2   2

# First frame saved was frame 1000.
set ts 1000

coorfile open dcd 1qp6.rec.stripfit.dcd

# Read all frames until nonzero is returned.
while { ![coorfile read] } {
  # Set firstTimestep so our energy output has the correct TS.
  firstTimestep \$ts
  # Compute energies and forces, but don't try to move the atoms.
  run 0
  incr ts 1000
}
coorfile close
EOFL
###########################################


cat << EOF > fp.namd.${count01}.csh
#! /bin/csh
#PBS -l walltime=24:00:00
#PBS -V

cd ${workdir}

hostname
date
echo "START CALCS"

/nfs/user03/fochtman/zzz.programs/NAMD_2.9_Source/Linux-x86_64-g++/namd2 07_to_08pr.${count01}.in > 07_to_08pr.${count01}.out

EOF

qsub fp.namd.${count01}.csh


@ count01 = $count01 + 1
###########################################
###########################################
end

exit

Energy Decomposition Analysis

This script using awk to calculate the mean and standard deviation.

run.006.vmd.namd.energy.decomp_sw_analysis.csh:

#! /bin/csh -fe

set mountdir        = "/nfs/user03/chembio/AMBER_Project"

set start           = 1
set stop            = 35
set offset          = 1

set workdir         = "${mountdir}/006.FP.VMD.NAMD"

cd        ${workdir}

set res_start = ${start}
set res_stop  = ${stop}
@ loop_stop = ${res_stop} + 1

set start_residue = $res_start
set loop_stop  = $loop_stop
set count01 = $start_residue

echo "resid,mean,std" > vdw.fp.output.csv
echo "resid,mean,std" > es.fp.output.csv

###########################################
while ( $count01 < $loop_stop )

## Gets VDW
awk ' BEGIN{count=0;sum=0;sum2=0;} /^ENERGY:/{count=count+1;sum=sum+$8;sum2 = sum2 + $8^2} END{mean=sum/count; std= ((sum2/count)- (mean^2))^(1/2); print "'${count01}'," mean "," std } ' 07_to_08pr.${count01}.out >> vdw.fp.output.csv

##Get ES
awk ' BEGIN{count=0;sum=0;sum2=0;} /^ENERGY:/{count=count+1;sum=sum+$7;sum2 = sum2 + $7^2} END{mean=sum/count; std= ((sum2/count)- (mean^2))^(1/2); print "'${count01}'," mean "," std } ' 07_to_08pr.${count01}.out >> es.fp.output.csv

@ count01 = $count01 + 1

end # while

exit

This above script produces 2 (vdw and coul are in different files) csv files that may be plotted in you favorite graphing software (matlab, R, xmgrace, origin, excel, etc). The first column contains the residue numbers, the second column the mean values averaged over all snapshots of the per-residue energies, and the third column contains the standard deviation.