2014 AMBER tutorial with HIV Protease
For additional Rizzo Lab tutorials see AMBER Tutorials.
In this tutorial, we will learn how to run a molecular dynamics simulation of a protein-ligand complex. We will then post-process that simulation by calculating structural fluctuations (with RMSD) and free energies of binding (MM-GBSA).
Contents
I. Introduction
AMBER
Amber - Assisted Model Building with Energy Refinement - is a multi-program suite for macromolecular simulations. Amber12 is the most recent version of the software and it includes new force fields such as ff12SB and Lipid11, expanded options for Poisson-Boltzmann solvation calculations, accelerated molecular dynamics, additional features in sander pmemd code, and expanded methods for free energy calculations. Our lab is set up with Amber12 and the latest update of AmberTools13 which contains the programs such as antechamber and tleap to set up your simulation.
The Amber 12 Manual is available to get started with using Amber12 as well as a troubleshooting reference. (Tip: You can search the document for keywords if you use Adobe Acrobat to view the file which can save time.) Additionally, Amber Tools User's Manual is another reference for the probrgrams available under Amber tools.
Here some of programs availabe in Amber and AmberTools
- LEaP: is a preparatory program for constructing new or modified systems in Amber. It contains the functions: prep, link, edit, and parm from earlier version of Amber.
- ANTECHAMBER: is the primary program used to prepare input files excluding the original nucleic acid and protein information taken from the PDB.
- SANDER: is 'a basic energy minimizer and molecular dynamics program' that is used to minimize, equilibrate and sample molecular conformations. This is the program used to generate trajectory files of the system.
- PMEMD: is an improved version of SANDER optimized for periodic, PME simulations, and for GB simulations. It is faster and scales better on parallel machines than SANDER.
- PTRAJ: is an analysis program for processing trajectory files. ptraj can be used to rotate and translate the structures, evaluate geometrical features, calculate RMSDs among other analyses.
There is a mailing list available as an additional resource.
HIV Protease
Human Immunodeficiency Virus (HIV) is a retrovirus which causes Acquired Immune Deficiency Syndrome (AIDS). HIV protease is a aspartyl protease which is a dimer, each monomer consisting of 99 amino acids. This protein is essential for the lifecycle of the virus as it cleaves immature polyproteins into mature proteins which can then be used for proper virion assembly. HIV protease has been validated as a drug target for the treatment for HIV as inhibition of this protein can slow HIV progression. However, retroviruses have high mutation rates and drug-resistance is a constant problem which needs to be addressed. There are a number of structure of HIV protease in the protein databank, we will use 1HVR for this tutorial.
Organizing Directories
The following directory structure and naming scheme is a convienient way to organize your files for this tutorial. Create these directories first before doing anything else.
~username/AMS536/AMBER-Tutorial/001.MOL.PREP/ 002.ANTE.TLEAP/ 003.PMEMD/ 004.PTRAJ/ 005.MMGBSA/
II. Structural Preparation
Downloading the PDB file
Go to PDB homepage (http://www.rcsb.org/pdb/home/home.do ) enter the protein ID (1HVR) in the search bar, click Download Files in the top-right of the webpage, then select PDB File (text). In the new window, save the file.
Preparing the ligand and receptor in Chimera
In this section, three new files would be created in the 001.CHIMERA.MOL.PREP/ folder:
1HVR.dockprep.mol2 (complete system with hydrogens and charges) 1HVR.receptor.noH.pdb (the receptor alone, without hydrogens) 1HVR.ligand.mol2 (the ligand alone)
To prepare these files, first copy the original PDB file into 001.MOL.PREP folder and open it with VIM ($ vim 1HVR.pdb). Because the residue name of the ligand (U) will give us some problems when assigning charges, change the residue name "U" to "LIG". Here is an example command that will change all instances of " U" into "LIG", while preserving the correct spacing (Note: there are two spaces before U):
:%s/ U/LIG/gc
For this command, g is short for global and c is short for check with the user before making the change.
Next, open up the PDB file (1HVR.pdb) in Chimera. To generate 1HVR.dockprep.mol2 files, delete the water molecules; delete the original hydrogen atoms; add the charge and add the hydrogen atoms.
To delete water molecules and other ligands, click Select -> Residue -> HOH, then go to Actions -> Atoms/Bonds -> Delete. Hydrogen atoms can be added manually by choosing Tools -> Structure Editing -> Add H. To add charges to the ligand and receptor, go to Select -> Residue -> LIG, then go to Tools -> Structure Editing -> Add Charge. Choose AMBER ff99SB as the charge model, click Okay, and when prompted chose AM1-BCC charges for the ligand, and make sure the Net Charge is set to 0. (The chemistry of the ligand should be considered when assigning a charge state).
Alternatively, you can do all of the above by clicking Tools -> Structure Editing -> Dock Prep. Note when adding the charge to the ligand, you can choose AMBER ff99SB as the charge model and chose gasteiger as the charge method. In this 1HVR case, we set Net Charge to 0.
Finally, save this file as 1HVR.dockprep.mol2.
Generating the final files
To create the receptor file with no hydrogen atoms: Open 1HVR.dockprep.mol2, click Select -> Chemistry -> Element -> H, then chose Actions -> Atoms/Bonds -> Delete'. Save the file as 1HVR.receptor.noH.pdb.
To create the ligand file: Open 1HVR.dockprep.mol2, click Select -> Residue -> LIG, then click Select -> Invert, then chose Actions -> Atoms/Bonds -> Delete. Save the file as 1HVR.ligand.mol2.
antechamber
An antichamber input file need all the atom names to be unique and only the first three characters will be used. For example, H301 and H302 can not be distinguished with each other. So first we need to manually rename the atoms which have the same first three characters.
The file after renaming:
To run antichamber, we need make sure these three programs are available, (antechamber, parmchk and tleap) and we are using the correct version of amber, we can use the which command, type command:
which antechamber which parmchk which tleap
The result should look like this:
/nfs/user03/wjallen/local/amber12/bin/antechamber /nfs/user03/wjallen/local/amber12/bin/parmchk /nfs/user03/wjallen/local/amber12/bin/tleap
Copy parameters of ions to your working directory from the following resource:
scp -r ~yuchzhou/AMS536/AMBER_Tutorial/002.ANTE.TLEAP/rizzo_amber7.ionparms ./
Then we use antechamber to convert our input mol2 file into files ready for LEaP.Type command:
antechamber -i ../001.MOL.PREP/1HVR.lig.mol2 -fi mol2 -o 1HVR.lig.ante.pdb -fo pdb
Here, -i means input file name; -fi means input file format; -o means output file name; -fo output file format. You will have an output file:1HVR.lig.ante.pdb Similarly, we can use antechamber to change the format of 1HVR.lig.mol2 file to prep file:
antechamber -i ../001.MOL.PREP/1HVR.lig.mol2 -fi mol2 -o 1HVR.lig.ante.prep -fo prepi
You will get a set of output files:
ANTECHAMBER_AC.AC ANTECHAMBER_AC.AC0 ANTECHAMBER_BOND_TYPE.AC ANTECHAMBER_BOND_TYPE.AC0 ANTECHAMBER_PREP.AC ANTECHAMBER_PREP.AC0 ATOMTYPE.INF NEWPDB.PDB PREP.INF 1HVR.lig.ante.prep
parmchk
After running antechamber, we run parmchk to check the parameters. Parmchk is another program in Antechamber. If there are missing parameters or mistake after antechamber is finished, the frcmod template generated by parmchk will help us in generating the needed parameters:
parmchk -i 1HVR.lig.ante.prep -f prepi -o 1HVR.lig.ante.frcmod
In the command, -i input file name,-f input file format (prepi, prepc, ac, mol2),-o frcmod file name. When the programming is done, you should have a new file in the directory.
1HVR.lig.ante.frcmod
tleap
Then, three input files are needed to run TLEAP, they are “tleap.lig.in”, “tleap.rec.in” and “tleap.com.in”, for the ligand, receptor, and complex respectively. These input files will be used by tleaP to create parameter/topology files and initial coordinate files. Copy parameters to your working directory from the following resource:
scp ~yuchzhou/AMS536/AMBER_Tutorial/002.ANTE.TLEAP/tleap.lig.in scp ~yuchzhou/AMS536/AMBER_Tutorial/002.ANTE.TLEAP/tleap.rec.in scp ~yuchzhou/AMS536/AMBER_Tutorial/002.ANTE.TLEAP/tleap.com.in
here are what the files should look like.
- tleap.lig.in
set default PBradii mbondi2 #set default PBradii source leaprc.ff99SB #load a force field loadoff rizzo_amber7.ionparms/ions.lib #Loading Rob Rizzo's ion parameters loadamberparams rizzo_amber7.ionparms/ions.frcmod #load an AMBER format parameter file source leaprc.gaff loadamberparams rizzo_amber7.ionparms/gaff.dat.rizzo loadamberparams 1HVR.lig.ante.frcmod loadamberprep 1HVR.lig.ante.prep #load an AMBER PREP file lig = loadpdb 1HVR.lig.ante.pdb #load the ligand pdb file saveamberparm lig 1HVR.lig.gas.leap.prm7 1HVR.lig.gas.leap.rst7 #save the ligand gas phase AMBER topology and coordinate files solvateBox lig TIP3PBOX 10.0 #solvate the receptor using TIP3P, solvent box radii 10 angstroms saveamberparm lig 1HVR.lig.wat.leap.prm7 1HVR.lig.wat.leap.rst7 #save the ligand water phase AMBER topology and coordinate files charge lig #this writes out the charge of the ligand quit
Coments (text after the '#' symbol) are for edification and should not apper in the file.
- tleap.rec.in
set default PBradii mbondi2 source leaprc.ff99SB loadoff rizzo_amber7.ionparms/ions.lib loadamberparams rizzo_amber7.ionparms/ions.frcmod REC = loadpdb ../001.MOL.PREP/1HVR.rec.noh.pdb saveamberparm REC 1HVR.rec.gas.leap.prm7 1HVR.rec.gas.leap.rst7 solvateBox REC TIP3PBOX 10.0 saveamberparm REC 1HVR.rec.wat.leap.prm7 1HVR.rec.wat.leap.rst7 charge REC quit
- tleap.com.in
set default PBradii mbondi2 source leaprc.ff99SB loadoff rizzo_amber7.ionparms/ions.lib loadamberparams rizzo_amber7.ionparms/ions.frcmod source leaprc.gaff loadamberparams rizzo_amber7.ionparms/gaff.dat.rizzo REC = loadpdb ../001.MOL.PREP/1HVR.rec.noh.pdb loadamberparams 1HVR.lig.ante.frcmod loadamberprep 1HVR.lig.ante.prep LIG = loadpdb 1HVR.lig.ante.pdb COM = combine {REC LIG} saveamberparm COM 1HVR.com.gas.leap.prm7 1HVR.com.gas.leap.rst7 solvateBox COM TIP3PBOX 10.0 saveamberparm COM 1HVR.com.wat.leap.prm7 1HVR.com.wat.leap.rst7 charge COM quit
Use leap command:
tleap -s -f tleap.lig.in > tleap.lig.out tleap -s -f tleap.rec.in > tleap.rec.out tleap -s -f tleap.com.in > tleap.com.out
The following output files will be generated:
1HVR.com.gas.leap.prm7 1HVR.com.wat.leap.prm7 1HVR.lig.ante.frcmod 1HVR.lig.ante.prep 1HVR.lig.gas.leap.rst7 1HVR.lig.wat.leap.rst7 1HVR.rec.gas.leap.rst7 1HVR.rec.wat.leap.rst7 1HVR.com.gas.leap.rst7 1HVR.com.wat.leap.rst7 1HVR.lig.ante.pdb 1HVR.lig.gas.leap.prm7 1HVR.lig.wat.leap.prm7 1HVR.rec.gas.leap.prm7 1HVR.rec.wat.leap.prm7 tleap.lig.out tleap.rec.out tleap.com.out leap.log
Visualization in VMD
Visualization is an important step in AMBER molecular dynamics simulation as it allows for the viewing of molecules and molecule movements within a specified field of view. Several files prepared in the previous two steps will be required for the visualization of the ligand and its movement in the protein binding pocket. The first step that must be completed is the copying of all necessary information from Seawulf to Herbie.
scp -r ~/AMS536/AMBER_Tutorial/002.ANTE.TLEAP username@herbie.mathlab.sunysb.edu:~/AMS536/AMBER_Tutorial/
We can now visualize several files in herbie by VMD: the protein-ligand complex in the gas or water phase, and the ligand in the gas or water phase.In the "VMD Main" window, click File > New Molecule. Now in the new window titled "Molecule File Browser" a few files must be selected and loaded.Viewing the protein-ligand complex in the gas phase we select 1LOQ.com.gas.leap.prm7 and the file type AMBER7 Parm. This would allow you to load the parameter information of the milecule, but nothing will shoe up in the VMD visualization window because coordinates has not been loaded yet.Next you need to select and load the file 1LOQ.com.gas.leap.rst7 and the file type AMBER7 Restart. This can be done for the complex or ligand for either the gas or the water phase by selecting and loading the corresponding .parm7 files and .rst7 files.
Next, you can choose to edit the structure shown for better visualization. By clicking Graphics > Representations... a new window "Graphical Representations" will open, in which you can create new representations of the protein-ligand complex. It is possible to choose the entire complex or simply the protein or ligand, to color the structure in various ways, and to choose how it is best represented (e.g. lines, thicker bonds, full surface).
III. Simulation using sander
Junjie, Tianao and Kai
Minimization
Before running any equilibration and production, we have to minimize the energy of system. This process can remove any unfavorable interaction between atoms due to the low resolution of biomolecule. If an unfavorable interaction is present when doing equilibration and production, atoms may "fly apart" and "crash" into other part of the biomolecule.
An example of a typical input file “01mi.in” for minimization is listed blow:
01mi.in: minimization &cntrl imin = 1, maxcyc = 1000, ntmin = 2, ntx = 1, ntc = 1, ntf = 1, ntb = 1, ntp = 0, ntwx = 1000, ntwe = 0, ntpr = 1000, cut = 8.0, ntr = 1, restraintmask = ':1-199 & !@H=', restraint_wt = 5.0, /
You may have to minimize the system several times. Each time you release the constraint on heavy atoms slowly.
The meaning of each flag is:
imin = 1: do minimization only.
maxcyc: cycles of minimization will be performed.
ntmin: specify the number of cycles that use a deepest decent method, after that minimization will use the conjugate gradient method.
ntx = 1: read coordinates only from rst file. There is no need to read velocity before minimization.
ntc = 1: Minimization is performed on every bond.
ntf = 1: Typically, ntf = ntc.
ntb: periodic boundary is applied.
ntp = 0: NO constant pressure is applied.
ntwx = 1000: saving snapshot to a trajectory file every ntwx steps.
ntwe = 0: human readable energy file is appended every ntwe steps.
ntpr = 1000: newest calculated energy is stored in .info file every ntpr steps.
cut: cutoff used in energy evaluation.
ntr = 1: positional restraint method is applied.
restraintmask = ':1-199 & !@H : atoms of residue 1-199 except H is restrained.
restraint_wt: restraint strength in unit of kcal/(mol*Angstrom^2).
Equilibration
System need to be heat up in a every careful way. An abrupt heating up may cause unwanted perturbation.
An example of the content of "02md.in" is listed below:
02md.in: equilibration &cntrl imin = 0, ntx = 1, irest = 0, nstlim = 50000, temp0 = 298.15, tempi = 298.15, ig = 71287, ntc = 2, ntf = 1, ntt = 1, dt = 0.001, ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0, ntwx = 1000, ntwe = 0, ntwr = 1000, ntpr = 1000, cut = 8.0, iwrap = 1, ntr = 1, nscm = 100, restraintmask = ':1-199 & !@H=', restraint_wt = 5.0, /
The parameters are described as below:
imin = 0: now md simulation is performed instead of minimization.
ntc = 2: SHAKE is applied to constraint bonds involve H.
ntb = 2 and ntp = 1: constant pressure is applied.
irest = 0: we are not reading in any velocities so there is no need to use restart.
ntt = 1: weak-coupling is applied to keep the temperature constant during equilibration.
tautp = 1.0 and taup = 1.0: time constant in unit of ps and the pressure relaxation time in unit of ps.
iwrap = 1: all molecules are in the "primary box" when rst and traj files are written. Molecule may move to a neighboring box during simulation especially those molecules are close to the boundary, which is very annoying when we visualize the system.
nscm = 100: net translational and rotational velocity is cancelled every 100 steps. During a long simulation, the whole system may start moving and rotating which give an unwanted kinetic energy.
Production
Before the production runs, first we need to generate the input files 10md.in (as shown below).
10md.in: production &cntrl imin = 0, ntx = 5, irest = 1, nstlim = 500000, temp0 = 298.15, tempi = 298.15, ig = 71287, ntc = 2, ntf = 1, ntt = 1, dt = 0.002, ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0, ntwx = 500, ntwe = 0, ntwr = 500, ntpr = 500, cut = 8.0, iwrap = 1, ntr = 1, nscm = 100, restraintmask = ':1-198@CA,C,N', restraint_wt = 0.1,
imin = 0
ntx = 5: read coordinates and velocities
irest = 1: now we are restarting. restarting requires reading in velocities.
nstlim = 500000: 500000 steps.
ig = 71287: random number seeds.
dt = 0.002: stepsize = 2fs.
restraintmask = ':1-198@CA,C,N': ligand (resid 199) is free to move around.
Running jobs on the queue
To perform minimization, equilibration and production, create the following script and submit through qsub:
#!/bin/tcsh #PBS -l nodes=2:ppn=2 #PBS -l walltime=10:00:00 #PBS -N 1HVR.vs #PBS -o 1HVR.output #PBS -j oe #PBS -V cd /nfs/user03/<<your username>>/CHE536/AMBER-Tutorial/003.PMEMD mpirun -n 4 pmemd.MPI -O -i 01mi.in -o 01mi.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c ../002.ANTE.TLEAP/1HVR.com.wat.leap.rst7 -ref ../002.ANTE.TLEAP/1HVR.com.wat.leap.rst7 \ -x 01mi.trj -inf 01mi.info -r 01mi.rst7 mpirun -n 4 pmemd.MPI -O -i 02md.in -o 02md.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c 01mi.rst7 -ref 01mi.rst7 -x 02md.trj -inf 02md.info -r 02md.rst7 mpirun -n 4 pmemd.MPI -O -i 03mi.in -o 03mi.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c 02md.rst7 -ref 02md.rst7 -x 03mi.trj -inf 03mi.info -r 03mi.rst7 mpirun -n 4 pmemd.MPI -O -i 04mi.in -o 04mi.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c 03mi.rst7 -ref 03mi.rst7 -x 04mi.trj -inf 04mi.info -r 04mi.rst7 mpirun -n 4 pmemd.MPI -O -i 05mi.in -o 05mi.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c 04mi.rst7 -ref 04mi.rst7 -x 05mi.trj -inf 05mi.info -r 05mi.rst7 mpirun -n 4 pmemd.MPI -O -i 06md.in -o 06md.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c 05mi.rst7 -ref 05mi.rst7 -x 06md.trj -inf 06md.info -r 06md.rst7 mpirun -n 4 pmemd.MPI -O -i 07md.in -o 07md.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c 06md.rst7 -ref 05mi.rst7 -x 07md.trj -inf 07md.info -r 07md.rst7 mpirun -n 4 pmemd.MPI -O -i 08md.in -o 08md.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c 07md.rst7 -ref 05mi.rst7 -x 08md.trj -inf 08md.info -r 08md.rst7 mpirun -n 4 pmemd.MPI -O -i 09md.in -o 09md.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c 08md.rst7 -ref 05mi.rst7 -x 09md.trj -inf 09md.info -r 09md.rst7 mpirun -n 4 pmemd.MPI -O -i 10md.in -o 10md.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c 09md.rst7 -ref 05mi.rst7 -x 10md.trj -inf 10md.info -r 10md.rst7 mpirun -n 4 pmemd.MPI -O -i 11md.in -o 11md.out -p ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 \ -c 10md.rst7 -ref 05mi.rst7 -x 11md.trj -inf 11md.info -r 11md.rst7
You can also separate each run into several scripts.
mpirun -n 4: run in parallel with 4 processors.
pmemd.MPI: parallel version of pmemd.
-i: input file.
-o: output file.
-p: topology and parameter of the system.
-c: read coordinates and velocities from the restart file generated by last run.
-x: trajectories.
-inf: info file (with most recent energy).
-r: restart file.
IV. Simulation Analysis
Arkin, Jess and Mosavverul
Checking Output Files
Ptraj
"Ptraj" stands for Process Trajectory. The frames in a trajectory can be thought of as snapshots in the MD simulation process. Each snapshot is taken at a frequency designated in the input.
For this step we created another work directory (004.PTRAJ, for example).
Create the file ptraj.strip.wat.in containing the following input:
trajin /nfs/user03/yuchzhou/AMS536/zzz.test/003.PMEMD/10md.trj 1 1000 1 trajin /nfs/user03/yuchzhou/AMS536/zzz.test/003.PMEMD/11md.trj 1 1000 1 trajout 1HVR.trj.strip nobox strip :WAT
The numbers 1 1000 1 give the input information about which frames are being used for Ptraj. The first two numbers 1 and 1000 specify the starting and ending snapshots from your trajectory file. The last number 1 specifies the frequency of the snapshot saved, in this case, every frame of the trajectory file.
The "strip :WAT" will strip off the waters, and the output will be a new file called 1HVR.trj.strip. And the last line of the input file will remove the water molecules.
To execute the first Ptraj analysis, run the following command specifying the input file ptraj.strip.wat.in just created and the designated output file ptraj.strip.wat.log:
ptraj ../002.ANTE.TLEAP/1HVR.com.wat.leap.prm7 ptraj.strip.wat.in > ptraj.strip.wat.log
The output file 1HVR.trj.strip will contain the 2000 snapshots of the trajectory.
The second input file for ptraj analysis was called ptraj.rec.rmsd.in and contained the following input:
trajin 1HVR.trj.strip 1 2000 1 trajout 1HVR.com.trj.stripfit reference ../002.ANTE.TLEAP/1HVR.com.gas.leap.rst7 rms reference out 1HVR.rmsd.CA.dat :1-198@CA
After the first step the 1HVR.com.trj.strip file is created which contains 2000 snapshots of the trajectory. This file will be compared to the reference file 1HVR.com.gas.leap.rst7 specified in the third line. All the water molecules have been removed in the first step, and it is ready for comparison to the gas phase complex trajectory. The last line says calculates the rmsd for alpha carbon number 1 to 198.
To execute the the analysis, run the following line.
ptraj ../002.ANTE.TLEAP/1HVR.com.gas.leap.prm7 ptraj.rec.rmsd.in > ptraj.rec.rmsd.log
This will give a txt file with two columns in it. The first column contains the number of frames being compared and the second column is the rmsd.
Then a similar rmsd file for the ligand is generated using the following input file ptraj.lig.rmsd.in:
trajin 1HVR.com.trj.stripfit 1 2000 1 reference ../002.ANTE.TLEAP/1HVR.com.gas.leap.rst7 rms reference out 1HVR.lig.rmsd.dat :199@C*,N*,O*,S* nofit
Similarly, this time the trajectory file will be compared to the reference file 1HVR.com.gas.leap.rst7, only this time the ligand is being compared instead of the receptor. The last line states that the rmsd values for carbon, nitrogen, oxygen and sulfur in residue 199 are being calculated. The <nofit> in the last line specifies that the rmsd of the ligand to the crystal structure is being calculated without any fitting.
To execute the the analysis, run the following line.
ptraj ../002.ANTE.TLEAP/1HVR.com.gas.leap.prm7 ptraj.lig.rmsd.in > ptraj.lig.rmsd.log
RMSD Plots
Measuring h-bond distances
To measure H-bond distances, first create a script to identify the bonds of interest:
trajin 1HVR.com.trj.stripfit #The name of the complex file distance 124OD1_199O4 :124@OD1 :199@O4 out 124OD1_199O4.dat #Bonds of interest (Multiple bonds can be listed here) -Residue@atom number : Residue@atom number
It is important to know which atoms to put in the script.
Next, enter the following command in your current directory to get an output(.dat) file.
ptraj ../Parameter File Input File > Name.log
The .dat file you end up with can be opened using the program "xmgrace".
To do this, enter the command:
xmgrace Name.dat
Xmgrace allows you to open the .dat file and customize labels and axes.
MM-GBSA Energy Calculation
MM/GBSA is the acronym for Molecular Mechanics/Generalized Born Surface Area. This part of AMBER combines molecular mechanics (MM) with both the electrostatic (PB) and nonpolar (SA) contribution to solvation . Topology files are needed for the receptor, ligand, and receptor-ligand complex. The trajectory files generate the coordinates. Therefore, molecular dynamics is used to generate a set of snapshots taken at fixed intervals from the trajectories. These snapshots are processed to remove solvent and generate the free energy of binding.
In the AMBER_Tutorial directory, create a new directory:
mkdir 005.MMGBSA
In this new directory, create the file gb.rescore.in containing:
Single point GB energy calc &cntrl ntf = 1, ntb = 0, ntc = 2, idecomp= 0, igb = 5, saltcon= 0.00, gbsa = 2, surften= 1.0, offset = 0.09, extdiel= 78.5, cut = 99999.0, nsnb = 99999, imin = 5, maxcyc = 1, ncyc = 0, /
Then create a csh script, run.sander.rescore.csh, that contains the following lines of command:
#! /bin/tcsh #PBS -l nodes=1:ppn=2 #PBS -l walltime=24:00:00 #PBS -o zzz.qsub.out #PBS -e zzz.qsub.err #PBS -V set workdir = nfs/user03/username/AMS536/AMBER_Tutorial/005.MMGBSA cd /nfs/user03/username/AMS536/AMBER_Tutorial/005.MMGBSA mpirun -n 2 sander.MPI -O -i gb.rescore.in \ -o gb.rescore.out.com \ -p ../002.TLEAP/1HVR.com.gas.leap.prm7 \ -c ../002.TLEAP/1HVR.com.gas.leap.rst7 \ -y ../004.PTRAJ/1HVR.com.trj.stripfit \ -r restrt.com \ -ref ../002.TLEAP/1HVR.com.gas.leap.rst7 \ -x mdcrd.com \ -inf mdinfo.com mpirun -n 2 sander.MPI -O -i gb.rescore.in \ -o gb.rescore.out.lig \ -p ../002.TLEAP/1HVR.lig.gas.leap.prm7 \ -c ../002.TLEAP/1HVR.lig.gas.leap.rst7 \ -y ../004.PTRAJ/1HVR.lig.trj.stripfit \ -r restrt.lig \ -ref ../002.TLEAP/1HVR.lig.gas.leap.rst7 \ -x mdcrd.lig \ -inf mdinfo.lig mpirun -n 2 sander.MPI -O -i gb.rescore.in \ -o gb.rescore.out.test.rec \ -p ../002.TLEAP/1HVR.rec.gas.leap.prm7 \ -c ../002.TLEAP/1HVR.rec.gas.leap.rst7 \ -y ../004.PTRAJ/1HVR.rec.trj.stripfit \ -r restrt.rec \ -ref ../002.TLEAP/1HVR.rec.gas.leap.rst7 \ -x mdcrd.rec \ -inf mdinfo.rec exit
~ ~ ~ ~
Then this script should be sent to the queue, i.e., qsub the script using the commands:
qsub run.sander.rescore.csh
You can monitor your progress by typing
qstat -u username
When the job is complete, you will obtain the following output files: gb.rescore.out.com, gb.rescore.out.lig, and gb.rescore.out.rec In these files, the single point energy calculation results will be written for each individual frame. It will be found in the results section and the output file will have an infrastrucutre that is similar to the following:
FINAL RESULTS NSTEP ENERGY RMS GMAX NAME NUMBER 1 6.0197E+03 1.9231E+01 1.0478E+02 CA 950 BOND = 613.6427 ANGLE = 1575.2750 DIHED = 2052.4780 VDWAALS = -1544.4789 EEL = -13857.3791 EGB = -2026.3402 1-4 VDW = 697.7785 1-4 EEL = 8382.6220 RESTRAINT = 0.0000 ESURF = 10126.1041 minimization completed, ENE= 0.60197022E+04 RMS= 0.192312E+02 TRAJENE: Trajectory file ended TRAJENE: Trajene complete.
In the command line, type:
grep VDWAALS gb.rescore.out.com > vdw.com.txt. grep ESURF gb.rescore.out.com > surf.com.txt.
You can take these text files, import them into Excel, and do the rest of your modifications there.
Equations for analysis
Remember that to obtain the Gvdw term, you need to take the SASA (which is ESURF) and input into equation 1:
ΔGnonpolar = SASA*0.00542 + 0.92
Also, the mmgbsa of a given system can be determined by equation 2:
ΔGmmgbsa = ΔGvdw + ΔGcoul + ΔGpolar + ΔGnonpolar
From the output file:
VDWAALS = ΔGvdw
EELS = ΔGcoul
EGB = ΔGpolar
You can then easily calculate the ΔΔGbind by using equation 3:
ΔΔGbind = ΔGmmgbsa,complex – (ΔGmmgbsa,lig + ΔGmmgbsa,rec) You will want to careful when doing your analysis that the results from frame 1 for the receptor and ligand are subtracted from the results from frame 1 for your complex. By doing this in excel, you should have 2000 frames for each, and the values should cleanly line up. Finally, you will want to plot your ΔΔGbind and examine if you see corresponding changes in the ligand position and the ΔΔGbind. Also, you should determine the mean and standard deviation for your ΔΔGbind.
Plotting Energy
When your rescoring calculation finishes, you have the following three output files: "gb.rescore.out.com", "gb.rescore.out.lig", and "gb.rescore.out.rec".
Use the following script, entitled get.mmgbsa.bash, to extract data and calculate MMGBSA energy for each snap shot.
#! /bin/bash # by Haoquan echo com lig rec > namelist LIST=`cat namelist` for i in $LIST ; do grep VDWAALS gb.rescore.out.$i | awk '{print $3}' > $i.vdw grep EGB gb.rescore.out.$i | awk '{print $9}' > $i.polar grep EELS gb.rescore.out.$i | awk '{print $6}' > $i.coul grep ESURF gb.rescore.out.$i | awk '{print $3 * 0.00542 + 0.92}' > $i.surf paste -d " " $i.vdw $i.polar $i.surf $i.coul | awk '{print $1 + $2 + $3 + $4}' > data.$i rm $i.* done paste -d " " data.com data.lig data.rec | awk '{print $1 - $2 - $3}' > data.all for ((j=1; j<=`wc -l data.all | awk '{print $1}'`; j+=1)) do echo $j , >> time done paste -d " " time data.all > MMGBSA_vs_time.dat rm namelist time data.*
To run this script do:
bash get.mmgbsa.sh
This will create a text file called MMGBSA_vs_time.dat with x and y values separated by a space and comma. These values can be imported to Excel or Origin or to XMGRACE if you are using Linux:
xmgrace MMGBSA_vs_time.dat
V. Frequently Encountered Problems
Mike
It is very important that you know which filetype you need. For example, for minimizations be sure that you have the correct input .mol2 file. The wrond mol2 file will result in file output with no information.
Ivan
Double and triple check your syntax before you submit a job to seawulf! It is incredibly disappointing to come back the next morning and realize your job failed simply because of a misplaced period or dash.
Lu
Yan
Pay attention to the names of your directories and files. Mostly, the reason why your commands fail to run is just a minor mirror in typing.
Yao
Junjie
Before you start minimization, equilibration and production, if the queue is not busy, you can type "qsub -I" in seawulf and "cd" to where you want to run sander. Then copy your sander command in your .csh file that you are going to submit and run it by simply paste to the command line and hit "return". By doing this, you are running your job interactively. If you can tell the job is running, your script is likely to be correct. It is easy to debug your .csh script in this way, since error is printed out right after you run commands instead of stored in some .er or .out files that have no context.