Per-residue energy decompositions
The Molecular Mechanics energy function is a pair-wise additive function which can be broken into per-residue components. This is useful for measuring the contribution of each protein residue towards ligand binding. An analogous experimental technique would be alanine scanning. We call these plots molecular interaction footprints.
This tutorial is for analyzing molecular dynamics (MD) simulations of a receptor-ligand complex.
In the example, we show how to analyze a trajectory generated with AMBER. However, you can also use NAMD to generate your trajectories.
I would suggest that you perform your own simulation and then use this tutorial as a guide to performing your analysis.
Here are some tutorials on the Rizzo lab wiki for running MD:
- http://ringo.ams.sunysb.edu/index.php/2012_AMBER_Tutorial_with_Biotin_and_Streptavidin
- http://ringo.ams.sunysb.edu/index.php/2013_AMBER_Tutorial_with_UMP_and_OMP
Here is also some information on NAMD from the rizzo lab wiki:
- http://ringo.ams.sunysb.edu/index.php/NAMD_tutorial
- http://ringo.ams.sunysb.edu/index.php/NAMD_on_Seawulf
Calculate Decompositions Using Namd
The following script uses vmd to create pdb files. Each atom is labeled with a 0 (excluded), 1(species #1) or 2(species #2) in beta column. namd will compute the interaction energy between species #1 and #2. A separate pdb file is created for each residue (species #2). The ligand is always species #1.
We are using amber inputs in this example. The namd input file loops over each pdb file to compute the interaction energy for every timestep.
run.005.vmd.namd.energy.decomp_sw.csh:
#! /bin/csh -fe set mountdir = "/nfs/user03/chembio/AMBER_Project" set start = 1 set stop = 35 set offset = 1 set workdir = "${mountdir}/006.FP.VMD.NAMD" cd ${workdir} ########################################### set res_start = ${start} set res_stop = ${stop} @ loop_stop = ${res_stop} + 1 cat << EOF > ptraj.in trajin ${mountdir}/004.ptraj/1qp6.rec.trj.stripfit trajout 1qp6.rec.stripfit.dcd charmm EOF ptraj ${mountdir}/002.TLEAP/1qp6.rec.gas.leap.prm7 ptraj.in > ptraj.out ########################################### cat <<EOFK >pair.maker.tcl #opens the file that has the atoms that are to be set-up for pair-interaction calculations mol load parm7 ${mountdir}/002.TLEAP/1qp6.rec.gas.leap.prm7 rst7 ${mountdir}/002.TLEAP/1qp6.rec.gas.leap.rst7 for {set resid_num $res_start} {\$resid_num < $loop_stop} {incr resid_num} { set all [atomselect top all] \$all set beta 0 set case1 [atomselect top "protein and resid 36 to 70"] \$case1 set beta 1 set case2 [atomselect top "protein and resid \$resid_num"] \$case2 set beta 2 \$all writepdb 1qp6.rec.gas.pair.\$resid_num.pdb } quit EOFK ########################################### # copy DCD files required in the calculations # to the current directory # make a special PDB files defining what the two species will be for pair interaciton calcs /nfs/user05/lingling/zzz.programs/vmd-1.9.1/bin/vmd -dispdev text -e pair.maker.tcl >> zzz.pair.maker.out echo "MAKE INPUT FILES" date set res_start = ${start} set res_stop = ${stop} @ loop_stop = ${res_stop} + 1 set start_residue = $res_start set loop_stop = $loop_stop set count01 = $start_residue ########################################### ########################################### while ( $count01 < $loop_stop ) ########################################### cat <<EOFL >07_to_08pr.$count01.in # initial config temperature 0 # output params outputname 1qp6.rec.gas.$count01 binaryoutput no # integrator params timestep 1.0 # force field params amber on parmfile ${mountdir}/002.TLEAP/1qp6.rec.gas.leap.prm7 ambercoor ${mountdir}/002.TLEAP/1qp6.rec.gas.leap.rst7 readexclusions yes # exclusions are taken from parmfile exclude scaled1-4 1-4scaling 0.833333 # =1/1.2, default is 1.0 scnb 2.0 # Default switching off cutoff 999.0 # Atoms in group 1 have a 1 in the B column; group 2 has a 2 pairInteraction on pairInteractionFile 1qp6.rec.gas.pair.$count01.pdb pairInteractionCol B pairInteractionGroup1 1 pairInteractionGroup2 2 # First frame saved was frame 1000. set ts 1000 coorfile open dcd 1qp6.rec.stripfit.dcd # Read all frames until nonzero is returned. while { ![coorfile read] } { # Set firstTimestep so our energy output has the correct TS. firstTimestep \$ts # Compute energies and forces, but don't try to move the atoms. run 0 incr ts 1000 } coorfile close EOFL ########################################### cat << EOF > fp.namd.${count01}.csh #! /bin/csh #PBS -l walltime=24:00:00 #PBS -V cd ${workdir} hostname date echo "START CALCS" /nfs/user03/fochtman/zzz.programs/NAMD_2.9_Source/Linux-x86_64-g++/namd2 07_to_08pr.${count01}.in > 07_to_08pr.${count01}.out EOF qsub fp.namd.${count01}.csh @ count01 = $count01 + 1 ########################################### ########################################### end exit
Energy Decomposition Analysis
This script using awk to calculate the mean and standard deviation.
run.006.vmd.namd.energy.decomp_sw_analysis.csh:
#! /bin/csh -fe set mountdir = "/nfs/user03/chembio/AMBER_Project" set start = 1 set stop = 35 set offset = 1 set workdir = "${mountdir}/006.FP.VMD.NAMD" cd ${workdir} set res_start = ${start} set res_stop = ${stop} @ loop_stop = ${res_stop} + 1 set start_residue = $res_start set loop_stop = $loop_stop set count01 = $start_residue echo "resid,mean,std" > vdw.fp.output.csv echo "resid,mean,std" > es.fp.output.csv ########################################### while ( $count01 < $loop_stop ) ## Gets VDW awk ' BEGIN{count=0;sum=0;sum2=0;} /^ENERGY:/{count=count+1;sum=sum+$8;sum2 = sum2 + $8^2} END{mean=sum/count; std= ((sum2/count)- (mean^2))^(1/2); print "'${count01}'," mean "," std } ' 07_to_08pr.${count01}.out >> vdw.fp.output.csv ##Get ES awk ' BEGIN{count=0;sum=0;sum2=0;} /^ENERGY:/{count=count+1;sum=sum+$7;sum2 = sum2 + $7^2} END{mean=sum/count; std= ((sum2/count)- (mean^2))^(1/2); print "'${count01}'," mean "," std } ' 07_to_08pr.${count01}.out >> es.fp.output.csv @ count01 = $count01 + 1 end # while exit
This above script produces 2 (vdw and coul are in different files) csv files that may be plotted in you favorite graphing software (matlab, R, xmgrace, origin, excel, etc). The first column contains the residue numbers, the second column the mean values averaged over all snapshots of the per-residue energies, and the third column contains the standard deviation.