2014 AMBER tutorial with HIV Protease
For additional Rizzo Lab tutorials see AMBER Tutorials.
In this tutorial, we will learn how to run a molecular dynamics simulation of a protein-ligand complex. We will then post-process that simulation by calculating structural fluctuations (with RMSD) and free energies of binding (MM-GBSA).
Contents
I. Introduction
AMBER
HIV Protease
II. Structural Preparation
Lu, Yan and Yao
Preparation in Chimera
Downloading the PDB file
Go to PDB homepage (http://www.rcsb.org/pdb/home/home.do ) enter the protein ID (1HVR) in the search bar, click Download Files in the top-right of the webpage, then select PDB File (text). In the new window, save the file.
Preparing the ligand and receptor in Chimera
Generating the final files
antechamber
Ton begin with build a 002.ANTE.TLEAP directory under Amber_tutorial directory. To make sure these three programs are available, (antechamber, parmchk and tleap) and we are using the correct version of amber, we can use the which command, type command:
which antechamber which parmchk which tleap
The result should look like this:
/nfs/user03/wjallen/local/amber12/bin/antechamber /nfs/user03/wjallen/local/amber12/bin/parmchk /nfs/user03/wjallen/local/amber12/bin/tleap
parmchk
tleaP
Visualization in VMD
III. Simulation using sander
Junjie, Tianao and Kai
Minimization
Before running the equilibration and production, we should first run the energy minimization. The purpose of this step is to remove structural artifacts resulting from the model-building process. Because model building often creates unwanted structural artifacts that must be removed before a molecular dynamics simulation is performed.
An example of a typical input file “01mi.in” for minimization is listed blow:
01mi.in: minimization &cntrl imin = 1, maxcyc = 1000, ntmin = 2, ntx = 1, ntc = 1, ntf = 1, ntb = 1, ntp = 0, ntwx = 1000, ntwe = 0, ntpr = 1000, cut = 8.0, ntr = 1, restraintmask = ':1-199 & !@H=', restraint_wt = 5.0, /
The following parameters are defined as follows: imin:specifies minimization not molecular dynamics maxcyc:total number of minimization cycles to be performed ntmin:how many cycles will use a deepest decent method, the remaining cycles use an approximation of this called the conjugate gradient method. ntx:only coordinates and not velocities are to be read from previous step ntc:indicates level of constraint on bonds. if =1, SHAKE algorithm is off so no bonds are constrained. If =2, constrains any bonds with H atoms. If =3, constrains all bonds. ntf:=1, all parts of the potential must be evaluated ntb:periodic boundary to keep system at constant volume ntp:=0, NO constant pressure applied The frequencies at which the program records data are in controlled by the paramenters ntwx, ntwe, and ntpr. ntwx:=1000, atom coordinates saved into .trj file every 1000 cycles ntwe:=0, no .en file is generated ntpr:=1000, energy readins are written as .out and .info files every 1000 steps ntr:=1, positional restraint method applied restraintmask= ':1-119 & !@H : position of atom within residues 1-119 that is not a H atom is being restrained restraint_wt: restraint weight indicating how strong the restraint on the atoms is
Equilibration
Production
Running jobs on the queue
IV. Simulation Analysis
Arkin, Jess and Mosavverul
Ptraj
RMSD Plots
Measuring h-bond distances
MM-GBSA Energy Calculation
MM/GBSA is the acronym for Molecular Mechanics/Generalized Born Surface Area. This part of AMBER combines molecular mechanics (MM) with both the electrostatic (PB) and nonpolar (SA) contribution to solvation . Topology files are needed for the receptor, ligand, and receptor-ligand complex. The trajectory files generate the coordinates. Therefore, molecular dynamics is used to generate a set of snapshots taken at fixed intervals from the trajectories. These snapshots are processed to remove solvent and generate the free energy of binding.
In the AMBER_Tutorial directory, create a new directory:
mkdir 005.MMGBSA
In this new directory, create the file gb.rescore.in containing:
Single point GB energy calc &cntrl ntf = 1, ntb = 0, ntc = 2, idecomp= 0, igb = 5, saltcon= 0.00, gbsa = 2, surften= 1.0, offset = 0.09, extdiel= 78.5, cut = 99999.0, nsnb = 99999, imin = 5, maxcyc = 1, ncyc = 0, /
Then create a csh script, run.sander.rescore.csh, that contains the following lines of command:
#! /bin/tcsh #PBS -l nodes=1:ppn=2 #PBS -l walltime=24:00:00 #PBS -o zzz.qsub.out #PBS -e zzz.qsub.err #PBS -V set workdir = nfs/user03/username/AMS536/AMBER_Tutorial/005.MMGBSA cd /nfs/user03/username/AMS536/AMBER_Tutorial/005.MMGBSA mpirun -n 2 sander.MPI -O -i gb.rescore.in \ -o gb.rescore.out.com \ -p ../002.TLEAP/1HVR.com.gas.leap.prm7 \ -c ../002.TLEAP/1HVR.com.gas.leap.rst7 \ -y ../004.PTRAJ/1HVR.com.trj.stripfit \ -r restrt.com \ -ref ../002.TLEAP/1HVR.com.gas.leap.rst7 \ -x mdcrd.com \ -inf mdinfo.com mpirun -n 2 sander.MPI -O -i gb.rescore.in \ -o gb.rescore.out.lig \ -p ../002.TLEAP/1HVR.lig.gas.leap.prm7 \ -c ../002.TLEAP/1HVR.lig.gas.leap.rst7 \ -y ../004.PTRAJ/1HVR.lig.trj.stripfit \ -r restrt.lig \ -ref ../002.TLEAP/1HVR.lig.gas.leap.rst7 \ -x mdcrd.lig \ -inf mdinfo.lig mpirun -n 2 sander.MPI -O -i gb.rescore.in \ -o gb.rescore.out.test.rec \ -p ../002.TLEAP/1HVR.rec.gas.leap.prm7 \ -c ../002.TLEAP/1HVR.rec.gas.leap.rst7 \ -y ../004.PTRAJ/1HVR.rec.trj.stripfit \ -r restrt.rec \ -ref ../002.TLEAP/1HVR.rec.gas.leap.rst7 \ -x mdcrd.rec \ -inf mdinfo.rec exit
~ ~ ~ ~
Then this script should be sent to the queue, i.e., qsub the script using the commands:
qsub run.sander.rescore.csh
You can monitor your progress by typing
qstat -u username
When the job is complete, you will obtain the following output files: gb.rescore.out.com, gb.rescore.out.lig, and gb.rescore.out.rec In these files, the single point energy calculation results will be written for each individual frame. It will be found in the results section and the output file will have an infrastrucutre that is similar to the following:
FINAL RESULTS NSTEP ENERGY RMS GMAX NAME NUMBER 1 6.0197E+03 1.9231E+01 1.0478E+02 CA 950 BOND = 613.6427 ANGLE = 1575.2750 DIHED = 2052.4780 VDWAALS = -1544.4789 EEL = -13857.3791 EGB = -2026.3402 1-4 VDW = 697.7785 1-4 EEL = 8382.6220 RESTRAINT = 0.0000 ESURF = 10126.1041 minimization completed, ENE= 0.60197022E+04 RMS= 0.192312E+02 TRAJENE: Trajectory file ended TRAJENE: Trajene complete.
In the command line, type:
grep VDWAALS gb.rescore.out.com > vdw.com.txt. grep ESURF gb.rescore.out.com > surf.com.txt.
You can take these text files, import them into Excel, and do the rest of your modifications there.
Equations for analysis
Remember that to obtain the Gvdw term, you need to take the SASA (which is ESURF) and input into equation 1:
ΔGnonpolar = SASA*0.00542 + 0.92
Also, the mmgbsa of a given system can be determined by equation 2:
ΔGmmgbsa = ΔGvdw + ΔGcoul + ΔGpolar + ΔGnonpolar
From the output file:
VDWAALS = ΔGvdw
EELS = ΔGcoul
EGB = ΔGpolar
You can then easily calculate the ΔΔGbind by using equation 3:
ΔΔGbind = ΔGmmgbsa,complex – (ΔGmmgbsa,lig + ΔGmmgbsa,rec) You will want to careful when doing your analysis that the results from frame 1 for the receptor and ligand are subtracted from the results from frame 1 for your complex. By doing this in excel, you should have 2000 frames for each, and the values should cleanly line up. Finally, you will want to plot your ΔΔGbind and examine if you see corresponding changes in the ligand position and the ΔΔGbind. Also, you should determine the mean and standard deviation for your ΔΔGbind.
Plotting Energy
When your rescoring calculation finishes, you have the following three output files: "gb.rescore.out.com", "gb.rescore.out.lig", and "gb.rescore.out.rec".
Use the following script, entitled get.mmgbsa.bash, to extract data and calculate MMGBSA energy for each snap shot.
#! /bin/bash # by Haoquan echo com lig rec > namelist LIST=`cat namelist` for i in $LIST ; do grep VDWAALS gb.rescore.out.$i | awk '{print $3}' > $i.vdw grep EGB gb.rescore.out.$i | awk '{print $9}' > $i.polar grep EELS gb.rescore.out.$i | awk '{print $6}' > $i.coul grep ESURF gb.rescore.out.$i | awk '{print $3 * 0.00542 + 0.92}' > $i.surf paste -d " " $i.vdw $i.polar $i.surf $i.coul | awk '{print $1 + $2 + $3 + $4}' > data.$i rm $i.* done paste -d " " data.com data.lig data.rec | awk '{print $1 - $2 - $3}' > data.all for ((j=1; j<=`wc -l data.all | awk '{print $1}'`; j+=1)) do echo $j , >> time done paste -d " " time data.all > MMGBSA_vs_time.dat rm namelist time data.*
To run this script do:
bash get.mmgbsa.sh
This will create a text file called MMGBSA_vs_time.dat with x and y values separated by a space and comma. These values can be imported to Excel or Origin or to XMGRACE if you are using Linux:
xmgrace MMGBSA_vs_time.dat