Difference between revisions of "2013 AMBER Tutorial with UMP and OMP"

From Rizzo_Lab
Jump to: navigation, search
(Running jobs on the queue)
(Generating the final files)
 
(67 intermediate revisions by 3 users not shown)
Line 39: Line 39:
 
In this AMBER tutorial, we will use the same system with previous DOCK part.  To begin with, we need three files under directory 001.CHIMERA.MOL.PREP.  
 
In this AMBER tutorial, we will use the same system with previous DOCK part.  To begin with, we need three files under directory 001.CHIMERA.MOL.PREP.  
  
===Downloading the PDB file (1LOQ)===
+
====Downloading the PDB file (1LOQ)====
  
 
Since we need to edit the PDB before we use it in Chimera we should do this manually. Go to PDB homepage (http://www.rcsb.org/pdb/home/home.do ) enter the protein ID (1LOQ) in the search bar, click Download Files in the top-right of the webpage, then select PDB File (text). In the new window, save the file in Downloads.  
 
Since we need to edit the PDB before we use it in Chimera we should do this manually. Go to PDB homepage (http://www.rcsb.org/pdb/home/home.do ) enter the protein ID (1LOQ) in the search bar, click Download Files in the top-right of the webpage, then select PDB File (text). In the new window, save the file in Downloads.  
  
===Preparing the ligand and receptor in Chimera===
+
====Preparing the ligand and receptor in Chimera====
  
 
In this section, we will create three new files and save them in the 001.CHIMERA.MOL.PREP/ folder:
 
In this section, we will create three new files and save them in the 001.CHIMERA.MOL.PREP/ folder:
  
 
   1LOQ.dockprep.mol2      (complete system with hydrogens and charges)
 
   1LOQ.dockprep.mol2      (complete system with hydrogens and charges)
   1LOQ.receptor.noH.mol2 (the receptor alone, without hydrogens)
+
   1LOQ.receptor.noH.pdb (the receptor alone, without hydrogens)
 
   1LOQ.ligand.mol2        (the ligand alone)
 
   1LOQ.ligand.mol2        (the ligand alone)
  
 
To prepare these files, first copy the original PDB file into the 001.CHIMERA.MOL.PREP/ folder and open it with VIM ($ vim 1LOQ.pdb). Because the residue name of the ligand (U) will give us some problems when assigning charges, change the residue name "U" to "LIG" starting at line 2082. Here is an example command that will change all instances of "  U" into "LIG", while preserving the correct spacing:
 
To prepare these files, first copy the original PDB file into the 001.CHIMERA.MOL.PREP/ folder and open it with VIM ($ vim 1LOQ.pdb). Because the residue name of the ligand (U) will give us some problems when assigning charges, change the residue name "U" to "LIG" starting at line 2082. Here is an example command that will change all instances of "  U" into "LIG", while preserving the correct spacing:
  
   %s/ U/LIG/gc
+
   :%s/U/LIG/gc
  
 
For this command, '''g'''  is short for global and '''c''' is short for check with the user before making the change.
 
For this command, '''g'''  is short for global and '''c''' is short for check with the user before making the change.
Line 63: Line 63:
 
Finally, save this file as '''1LOQ.dockprep.mol2'''.
 
Finally, save this file as '''1LOQ.dockprep.mol2'''.
  
===Generating the final files===
+
====Generating the final files====
  
To create the receptor file with no hydrogen atoms: Open 1LOQ.dockprep.mol2, click ''Select'' -> ''Chemistry'' -> ''Element'' -> ''H'', then chose ''Actions'' -> ''Atoms/Bonds'' -> ''Delete''. Save the file as '''1LOQ.receptor.noH.mol2'''.
+
To create the receptor file with no hydrogen atoms: Open 1LOQ.dockprep.mol2, click ''Select'' -> ''Chemistry'' -> ''Element'' -> ''H'', then chose ''Actions'' -> ''Atoms/Bonds'' -> ''Delete''. Save the file as '''1LOQ.receptor.noH.pdb'''.
  
 
To create the ligand file: Open 1LOQ.dockprep.mol2, click ''Select'' -> ''Residue'' -> ''LIG'', then click ''Select'' -> ''Invert'', then chose ''Actions'' -> ''Atoms/Bonds'' -> ''Delete''. Save the file as '''1LOQ.ligand.mol2'''.
 
To create the ligand file: Open 1LOQ.dockprep.mol2, click ''Select'' -> ''Residue'' -> ''LIG'', then click ''Select'' -> ''Invert'', then chose ''Actions'' -> ''Atoms/Bonds'' -> ''Delete''. Save the file as '''1LOQ.ligand.mol2'''.
Line 71: Line 71:
 
===antechamber===
 
===antechamber===
 
   
 
   
The antechamber program itself is the main program of Antechamber package. In most of cases, one should use this program instead of a series of separated programs to do molecular format convertion, atom type assignment and charge generation etc. An antechamber input file requires all the atom names to be unique. So if we use 1LOQ.ligand.mol2 as the input file, it will cause errors. The program can only recognize atom names of 3 characters ( In this case, H5' and H5'' cannot be distinguished from each other. )
+
The antechamber program itself is the main program of Antechamber package. In most of cases, one should use this program instead of a series of separated programs to do molecular format convertion, atom type assignment and charge generation etc. An antechamber input file requires all the atom names to be unique. So if we use 1LOQ.ligand.mol2 as the input file, it will cause errors. The program can only recognize atom names of 3 characters ( In this case, H5' and <nowiki>H5'' </nowiki> cannot be distinguished from each other. )
  
 
     22 H5'        40.0697  36.0506  37.6716 H        1 LIG210      0.0761
 
     22 H5'        40.0697  36.0506  37.6716 H        1 LIG210      0.0761
Line 143: Line 143:
 
===parmchk===
 
===parmchk===
  
Parmchk is another program included in the Antechamber package. After running antechamber, we run parmchk to check the parameters. If there are missing parameters after antechamber is finished, the frcmod template generated by parmchk will help us in generating the needed parameters:
+
Parmchk is another program in Antechamber. After running antechamber, we run parmchk to check the parameters. If there are missing parameters after antechamber is finished, the frcmod template generated by parmchk will help us in generating the needed parameters:
  
 
  parmchk -i 1LOQ.lig.ante.prep -f prepi -o 1LOQ.lig.ante.frcmod
 
  parmchk -i 1LOQ.lig.ante.prep -f prepi -o 1LOQ.lig.ante.frcmod
Line 184: Line 184:
  
 
===Visualization in VMD===
 
===Visualization in VMD===
 +
 +
Visualization is an important step in AMBER molecular dynamics simulation as it allows for the viewing of molecules and molecule movements within a specified field of view. Several files prepared in the previous two steps will be required for the visualization of the ligand and its movement in the protein binding pocket.  The first step that must be completed is the copying of all necessary information from Seawulf to Herbie.
 +
 +
Before exiting Seawulf, you can type:
 +
scp -r ~/AMS536/AMBER_Tutorial/002.ANTE.TLEAP username@herbie.mathlab.sunysb.edu:~/AMS536/AMBER_Tutorial/
 +
 +
Then, when in Herbie, type the command '''vmd''' in any directory and the program VMD will open on the computer.  In the "VMD Main" window, click '''File > New Molecule'''.  Now in the new window titled "Molecule File Browser" a few files must be selected and loaded. To select and load a file follow these steps:
 +
1. Click '''Browse...''' and select the file that you desire.
 +
2. Click the down button in the "Determine file type:" field and select the proper file type.
 +
3. Click '''Load''' and view the molecule/system in the original "VMD 1.8.5 OpenGL Display" window.
 +
 +
We can now visualize several files: the protein-ligand complex in the gas or water phase, and the ligand in the gas or water phase. Viewing the protein-ligand complex in the gas phase we select '''1LOQ.com.gas.leap.prm7''' and the file type '''AMBER7 Parm'''. Next you need to select and load the file '''1LOQ.com.gas.leap.rst7''' and the file type '''AMBER7 Restart'''.  This can be done for the complex or ligand for either the gas or the water phase by selecting and loading the corresponding '''.parm7''' files and '''.rst7''' files.
 +
 +
Next, you can choose to edit the structure shown for better visualization.  By clicking '''Graphics > Representations...''' a new window "Graphical Representations" will open, in which you can create new representations of the protein-ligand complex. It is possible to choose the entire complex or simply the protein or ligand, to color the structure in various ways, and to choose how it is best represented (e.g. lines, thicker bonds, full surface).
  
 
==III. Simulation using sander==
 
==III. Simulation using sander==
  
 
=== Minimization ===
 
=== Minimization ===
 +
Energy minimization is first performed on the stucture before the equilibration and production runs may be performed. Model building often creates unwanted structural artifacts that must be removed before a molecular dynamics simulation is performed.
 +
 +
To begin, create four files for minimization steps 1,3,4 and 5.
 +
 +
vim 01mi.in
 +
 +
and follow the naming according to minimization run number, i.e. 03mi.in
 +
 +
All "#mi.in" file content will be identical except the last parameter, the restraint weight (restraint_wt). This restraint will decrease with increasing minimization runs. Run number 1,3,4,and 5 has restraint weight 5,2,0.1,and 0.05 respectively.
 +
 +
 +
An example of minimization code for 01mi.in:
 +
 +
01mi.in: equilibration
 +
&cntrl
 +
  imin = 1, maxcyc = 1000, ntmin = 2,
 +
  ntx = 1, ntc = 1, ntf = 1,
 +
  ntb = 1, ntp = 0,
 +
  ntwx = 1000, ntwe = 0, ntpr = 1000,
 +
  ntr = 1,
 +
  restraintmask = ':1-210 & !@H=',
 +
  restraint_wt=5.0,
 +
 +
 +
The following parameters are defined as follows:
 +
 +
'''imin''':specifies minimization not molecular dynamics
 +
 +
'''maxcyc''':total number of minimization cycles to be performed
 +
 +
'''ntmin''':how many cycles will use a deepest decent method, the remaining cycles use an approximation of this called the conjugate gradient method.
 +
 +
'''ntx''':only coordinates and not velocities are to be read from previous step
 +
 +
'''ntc''':indicates level of constraint on bonds. if =1, SHAKE algorithm is off so no bonds are constrained. If =2, constrains any bonds with H atoms. If =3, constrains all bonds.
 +
 +
'''ntf''':=1, all parts of the potential must be evaluated
 +
 +
'''ntb''':periodic boundary to keep system at constant volume
 +
 +
'''ntp''':=0, NO constant pressure applied
 +
 +
The frequencies at which the program records data are in controlled by the paramenters ntwx, ntwe, and ntpr.
 +
 +
'''ntwx''':=1000, atom coordinates saved into .trj file every 1000 cycles
 +
 +
'''ntwe''':=0, no .en file is generated
 +
 +
'''ntpr''':=1000, energy readins are written as .out and .info files every 1000 steps
 +
 +
'''ntr''':=1, positional restraint method applied
 +
 +
'''restraintmask'''= ':1-119 & !@H : position of atom within residues 1-119 that is not a H atom is being restrained
 +
 +
'''restraint_wt''': restraint weight indicating how strong the restraint on the atoms is
  
 
=== Equilibration ===
 
=== Equilibration ===
 +
To further prepare our complex for the molecular dynamic simulations, the subsequent step of energy minimization is equilibrate the system at some certain temperatures. We repeat the process of minimization and equilibration for twice in our case, of course with varied parameters and restraints put on our system.
 +
 +
Right after the first 1000 steps of minimization, we perform a 50000 steps('''nstlim = 50000''') X the step length of 1 fs('''dt = 0.001''') (that is 50 ps in total) equilibration at the temperature 298.15K('''temp0 = 298.15, tempi = 298.15'''), contining putting on the weight (in kcal/mol) of 5.0 for the positional restraints on all non hydrogen atoms('''restraintmask = ':1-210 & !@H=', restraint_wt = 5.0''').
 +
 +
02md.in: equilibration (50ps)
 +
&cntrl
 +
  imin = 0, ntx = 1, irest = 0, nstlim = 50000,
 +
  temp0 = 298.15, tempi = 298.15,
 +
  ntc = 2, ntf = 1, ntt = 1, dt = 0.001,
 +
  ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0,
 +
  ntwx = 1000, ntwe = 0, ntwr = 1000, ntpr = 1000,
 +
  ntr = 1, nscm = 100, iwrap = 1, cut = 8.0,
 +
  restraintmask = ':1-210 & !@H=', restraint_wt = 5.0,
 +
/
 +
 +
*Notice that some flags used in minimization have been changed:
 +
 +
'''imin = 0''' indicates that we are doing an MD simulation instead of doing energy minimization.
 +
 +
'''ntc = 2 ''' indicates that all bonds involving H bonds are constrained by the SHAKE algorithm to eliminate high frequency oscillations in the system.
 +
 +
'''ntb = 2 '''and '''ntp = 1 '''indicate that constant pressure instead of constant volume is applied.
 +
 +
*Some flags (including those explained above) are newly added;
 +
 +
'''ntx = 1''' means coordinates, but no velocities, will be read.
 +
 +
'''irest = 0''' irest is a flag to restart a simulation, we always start from running a new simulation until the production part, so it is 0 from 02md.in, 06md.in to 09md.in.
 +
 +
'''ntt = 1''' switch the temperature scaling, 1 means constant temperature, using the weak-coupling algorithm.
 +
 +
'''tautp = 1.0''' and '''taup = 1.0''' defines the time constant (in ps) and the pressure relaxation time (in ps), respectively.
 +
 +
'''iwrap = 1''' indicate that the coordinates written to the .rst7 and .trj files will be wrapped into a primary box. This reduces the amount of coordinate output, thus preventing overflowing and making visualization more convenient.
 +
 +
'''cut = 8.0''' set the the non-bonded cutoff distance at 8.0 Angstroms.
 +
 +
Continuing 4 simulations are carried out after the fourth minimization, there are only little differents in these input files:
 +
 +
06md.in: equilibration (50ps)
 +
  &cntrl
 +
    imin = 0, ntx = 1, irest = 0, nstlim = 50000,
 +
    temp0 = 298.15, tempi = 298.15, '''ig = 71287''',
 +
    ntc = 2, ntf = 1, ntt = 1, dt = 0.001,
 +
    ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0,
 +
    ntwx = 1000, ntwe = 0, ntwr = 1000, ntpr = 1000,
 +
    ntr = 1, nscm = 100, iwrap = 1, cut = 8.0,
 +
    restraintmask = ':1-210 & !@H=', '''restraint_wt = 1.0''',
 +
/
 +
 +
*As we can see in 06md.in file, the only changes are setting '''ig''' as 71287 and reducing '''restraint_wt''' value from 5.0 to 1.0.
 +
 +
'''ig''' is the seed for the pseudo-random number generator which the MD starting velocity depend on.
 +
 +
To further reduce the restraint put on the whole system, we vary '''restraint_wt''' value to be '''0.05, 0.1, 0.1''', respectively in 07md, 08md and 09md input files.
 +
 +
*Note that starting from 08md to 09md simulations, the '''restraintmask = ':1-210 & !@H='''' has been changed to '''restraintmask = ':1-209 & !@H='''' which means not at all restrains are put on the ligand atoms.
  
 
=== Production ===
 
=== Production ===
 +
Before the production runs (10 and 11), first we need to generate the input files 10md.in and 11md.in (as shown below).
 +
 +
10md.in: production (500000 = 1ns)
 +
&cntrl
 +
  imin = 0, ntx = 5, irest = 1, nstlim = 500000,
 +
  temp0 = 298.15, tempi = 298.15, ig = 71287,
 +
  ntc = 2, ntf = 1, ntt = 1, dt = 0.002,
 +
  ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0,
 +
  ntwx = 500, ntwe = 0, ntwr = 500, ntpr = 500,
 +
  ntr = 1, nscm = 100, iwrap = 1, cut = 8.0,
 +
  restraintmask = ':1-209@CA,C,N', restraint_wt = 0.1,
 +
/
 +
 +
11md.in: production (500000 = 1ns)
 +
&cntrl
 +
  imin = 0, ntx = 5, irest = 1, nstlim = 500000,
 +
  temp0 = 298.15, tempi = 298.15, ig = 71287,
 +
  ntc = 2, ntf = 1, ntt = 1, dt = 0.002,
 +
  ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0,
 +
  ntwx = 500, ntwe = 0, ntwr = 500, ntpr = 500,
 +
  ntr = 1, nscm = 100, iwrap = 1, cut = 8.0,
 +
  restraintmask = ':1-209@CA,C,N', restraint_wt = 0.1,
 +
/
 +
 +
Explanation for some of the parameter values:
 +
ntx = 5, irest = 1 #using velocity information from the previous restart file#
 +
nstlim = 500000    #running for 500,000 steps#
 +
temp0 = 298.15    #the target temperature is 298.15 K (the default is 300 K)#
 +
tempi = 298.15    #the initial temperature is 298.15 K#
 +
dt = 0.002        #the time step dt is 0.002 ps (2 fs)#
 +
ntwr = 500        #to print energy output every 500 steps# 
 +
ntb = 2            #the constant pressure is switched on#
 +
 +
To run the production, we need to create a c-shell file named "run.production.csh" of which the contents are shown below:
 +
#! /bin/tcsh
 +
#-l nodes=4:ppn=2
 +
#PBS -l walltime=720:00:00
 +
#PBS -o zzz.qsub.prod.out
 +
#PBS -e zzz.qsub.prod.err
 +
#PBS -V
 +
 +
set workdir = "~<username>/AMS536/amber-tutorial/03.sander"
 +
cd ${workdir}
 +
cat $PBS_NODEFILE | sort | uniq
 +
 +
mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 10md.in -o 10md.out -p ../02.ante.tleap/1LOQ.com.wat.leap.prm7 \
 +
-c 09md.rst7 -ref 09md.rst7 -x 10md.trj -inf 10md.info -r 10md.rst7
 +
mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 11md.in -o 11md.out -p ../02.ante.tleap/1LOQ.com.wat.leap.prm7 \
 +
-c 10md.rst7 -ref 10md.rst7 -x 11md.trj -inf 11md.info -r 11md.rst7
 +
exit
 +
 +
The last step of a production run is to submit the c-shell file to the queue:
 +
qsub zzz.production.csh
 +
It will take almost 24 hour to finish the production run. To check the status of your job, type:
 +
qstat -u <username>
 +
After the production run, 10 new files should be created:
 +
10md.info
 +
10md.out
 +
10md.rst7
 +
10md.trj
 +
11md.info
 +
11md.out
 +
11md.rst7
 +
11md.trj
 +
zzz.qsub.prod.err (should be empty if nothing is wrong)
 +
zzz.qsub.prod.out
  
 
=== Running jobs on the queue ===
 
=== Running jobs on the queue ===
Line 235: Line 427:
 
Note in the above script that for each run, the .rst7, .trj and .info files generated from the previous run provides the initial state to start from.  The .prm7 file generated for the hydrated complex by TLEAP provides the force parameters.
 
Note in the above script that for each run, the .rst7, .trj and .info files generated from the previous run provides the initial state to start from.  The .prm7 file generated for the hydrated complex by TLEAP provides the force parameters.
  
Submit the file to the que and monitor progress.
+
Submit the file to the queue and monitor progress.
  
 
==IV. Simulation Analysis==
 
==IV. Simulation Analysis==
  
 
===Ptraj===
 
===Ptraj===
 +
 +
You should create another work directory for this step (004.PTRAJ, for example), if you don't have one.
 +
There will be five runs of analysis, each of which will require different input files.
 +
 +
'''1.''' At first we want to concatenate the two 1ns trajectories together, stripping off the waters, and creating a .strip-file as an output. Below is the input file which will allow us to do so.
 +
 +
'''ptraj.1.in'''
 +
trajin ../003.SANDER/10md.trj 1 1000 1
 +
trajin ../003.SANDER/11md.trj 1 1000 1
 +
trajout 1LOQ.trj.strip nobox
 +
strip :WAT
 +
 +
The two sets of numbers ''1 1000 1'' give the input information about which frames are used for the Ptraj. The first two numbers ''1'' and ''1000'' specify the starting and ending snapshots from the trajectory file. The ending number of the snapshot doesn't need to be accurate because if you actually don't have enough snapshots in your trajectory file, Ptraj will read up to the last one you have. The last number ''1'' specifies the frequency of the snapshot saved, in this case, we are saving every frame of the trajectory file. And the last line of the input file will take away all the water molecules.
 +
 +
As the input file is prepared we can launch the first analysis as follows:
 +
ptraj  ../002.ANTE.TLEAP/1LOQ.com.wat.leap.parm  ptraj.1.in  > ptraj.1.log
 +
 +
As the output you will obtain '''1LOQ.trj.strip''' file which contains ''2000'' snapshots of the trajectory.
 +
 +
'''2.''' Later on we want to compare the output file just obtained to our reference file '''1LOQ.com.gas.leap.rst7''', using the following input file.
 +
 +
'''ptraj.2.in'''
 +
trajin 1LOQ.trj.strip  1 2000 1
 +
trajout 1LOQ.com.trj.stripfit
 +
reference ../002.ANTE.TLEAP/1LOQ.com.gas.leap.rst7
 +
rms reference out 1LOQ.rmsd.CA.txt :1-209@CA
 +
 +
Since we have just concatenated the two trajectories, we will have ''2000'' snapshots in '''1LOQ.trj.strip'''. The third line in the input specifies the reference file, we have taken away all the water molecules during the first step, hence we are working here with the gas phase complex. The last line says we are calculating the rmsd for alpha carbon number ''1'' to ''209''.
 +
 +
And as we have this file filled out, we can run this step:
 +
ptraj  ../002.ANTE.TLEAP/1LOQ.com.gas.leap.parm  ptraj.2.in  > ptraj.2.log
 +
 +
The output file '''1LOQ.rmsd.CA.txt''' wil contain two columns, the first one is the number of frame and the second one stands for rmsd value.
 +
 +
'''3.''' Afterwards we will generate a similar file for ligand, using the following input file. 
 +
 +
'''ptraj.3.in'''
 +
trajin 1LOQ.com.trj.stripfit 1 2000 1
 +
reference ../002.ANTE.TLEAP/1LOQ.com.gas.leap.rst7
 +
rms reference out 1LOQ.lig.rmsd.txt :210@C*,N*,O*,S* nofit
 +
 +
And then run this step:
 +
ptraj  ../002.ANTE.TLEAP/1LOQ.com.gas.leap.parm  ptraj.3.in  > ptraj.3.log
 +
 +
By doing this we will compare the trajectory file to '''1LOQ.com.gas.leap.rst7''' as well, but working with the ligand instead of the receptor (we specified that by pointing that we want to calculate the rmsd for carbon, nitrogen, oxygen and sulfur in residue ''210''.
 +
 +
The last two steps are to obtain energetic information about the system. To do this we take a trajectory file of the gas phase complex '''1LOQ.com.trj.stripfit''', and want to create two more trajectory files containing the information on only receptor and only ligand correspondingly.
 +
 +
'''4.''' At this step we consider receptor only. The input file is provided below:
 +
 +
'''ptraj.4.in'''
 +
trajin 1LOQ.com.trj.stripfit 1 2000 1
 +
trajout 1LOQ.rec.trj.stripfit
 +
strip :210
 +
 +
And then run this step:
 +
ptraj  ../002.ANTE.TLEAP/1LOQ.com.gas.leap.parm  ptraj.4.in  > ptraj.4.log
 +
 +
'''5.''' And the same procedure for the ligand, with the following input file:
 +
 +
'''ptraj.5.in'''
 +
trajin 1LOQ.com.trj.stripfit 1 2000 1
 +
trajout 1LOQ.lig.trj.stripfit
 +
strip :1-209
 +
 +
And then run this step:
 +
ptraj  ../002.ANTE.TLEAP/1LOQ.com.gas.leap.parm  ptraj.5.in  > ptraj.5.log
 +
 +
'''6.''' ''(optional)'' Visualization
 +
 +
As we've gone through all these steps, the analysis is done. If you want to visualize the trajectories, you first need to copy the trajectory files to Herbie like this, for example (being a level above '''004.PTRAJ''' directory):
 +
scp 004.PTRAJ/ your_username@herbie.mathlab.sunysb.edu:~AMS536/AMBER_tutorial/004.PTRAJ
 +
 +
Now, launch VMD, then open one of the .prm7 files in 002.ANTE.TLEAP. If you want to visualize the whole complex in gas state, you can open 1LOQ.com.gas.leap.prm7 with AMBER7 Parm from 002.ANTE.TLEAP and then 1LOQ.com.trj.stripfit with AMBER coordinates from 004.PTRAJ. With these files, you can look at the real-time movement of the complex in the gas state. You can repeat this procedure to observe the real-time movement of the complex in the water state. Just open 1LOQ.com.wat.leap.prm7 instead of 1LOQ.com.gas.leap.prm7.
  
 
====RMSD Plots====
 
====RMSD Plots====
 +
 +
Once having finished running Ptraj, you should find two RMSD files in your directory: '''1LOQ.lig.rmsd.txt''' and '''1LOQ.rmsd.CA.txt'''
 +
The data in these files can be plotted to visualize the structural drift (ie change in RMSD) over the course of your simulation. Large fluctuations in RMSD may be an indication that the simulation was unstable.
 +
 +
Here is a plot for ligand RMSD:
 +
 +
[[Image:AMBER_RMSD_plot.jpg|500px|center]]
  
 
====Measuring h-bond distances====
 
====Measuring h-bond distances====
 
  
 
===MM-GBSA Energy Calculation===
 
===MM-GBSA Energy Calculation===
Line 261: Line 533:
 
   cut    = 99999.0,  nsnb  = 99999,
 
   cut    = 99999.0,  nsnb  = 99999,
 
   imin  = 5,        maxcyc = 1,        ncyc  = 0,
 
   imin  = 5,        maxcyc = 1,        ncyc  = 0,
/
+
  /
  
 
Then create a csh script, run.sander.rescore.csh, that contains the following lines of command:  
 
Then create a csh script, run.sander.rescore.csh, that contains the following lines of command:  
Line 270: Line 542:
 
   #PBS -e zzz.qsub.err
 
   #PBS -e zzz.qsub.err
 
   #PBS -V
 
   #PBS -V
 
 
   set workdir = "/nfs/user03/yechen1/AMBER-Tutorial/005.MMGBSA"
 
   set workdir = "/nfs/user03/yechen1/AMBER-Tutorial/005.MMGBSA"
 
   set sander  = "/nfs/user03/wjallen/local/amber12/bin/sander"
 
   set sander  = "/nfs/user03/wjallen/local/amber12/bin/sander"
 
 
   cd ${workdir}
 
   cd ${workdir}
 
 
   $sander -O -i gb.rescore.in -o gb.rescore.out.com -p ../002.ANTE.TLEAP  /1LOQ.com.gas.leap.prm7 \
 
   $sander -O -i gb.rescore.in -o gb.rescore.out.com -p ../002.ANTE.TLEAP  /1LOQ.com.gas.leap.prm7 \
 
   -c ../002.ANTE.TLEAP/1LOQ.com.gas.leap.rst7 -y ../004.PTRAJ/1LOQ.com.trj.stripfit \
 
   -c ../002.ANTE.TLEAP/1LOQ.com.gas.leap.rst7 -y ../004.PTRAJ/1LOQ.com.trj.stripfit \
 
   -r restrt.com -ref ../002.ANTE.TLEAP/1LOQ.com.gas.leap.rst7 -x mdcrd.com -inf mdinfo.com
 
   -r restrt.com -ref ../002.ANTE.TLEAP/1LOQ.com.gas.leap.rst7 -x mdcrd.com -inf mdinfo.com
 
 
   $sander -O -i gb.rescore.in -o gb.rescore.out.lig -p ../002.ANTE.TLEAP/1LOQ.lig.gas.leap.prm7 \
 
   $sander -O -i gb.rescore.in -o gb.rescore.out.lig -p ../002.ANTE.TLEAP/1LOQ.lig.gas.leap.prm7 \
 
   -c ../002.ANTE.TLEAP/1LOQ.lig.gas.leap.rst7 -y ../004.PTRAJ/1LOQ.lig.trj.stripfit \
 
   -c ../002.ANTE.TLEAP/1LOQ.lig.gas.leap.rst7 -y ../004.PTRAJ/1LOQ.lig.trj.stripfit \
 
   -r restrt.lig -ref ../002.ANTE.TLEAP/1LOQ.lig.gas.leap.rst7 -x mdcrd.lig -inf mdinfo.lig
 
   -r restrt.lig -ref ../002.ANTE.TLEAP/1LOQ.lig.gas.leap.rst7 -x mdcrd.lig -inf mdinfo.lig
 
 
   $sander -O -i gb.rescore.in -o gb.rescore.out.rec -p ../002.ANTE.TLEAP/1LOQ.rec.gas.leap.prm7 \
 
   $sander -O -i gb.rescore.in -o gb.rescore.out.rec -p ../002.ANTE.TLEAP/1LOQ.rec.gas.leap.prm7 \
 
   -c ../002.ANTE.TLEAP/1LOQ.rec.gas.leap.rst7 -y ../004.PTRAJ/1LOQ.rec.trj.stripfit \
 
   -c ../002.ANTE.TLEAP/1LOQ.rec.gas.leap.rst7 -y ../004.PTRAJ/1LOQ.rec.trj.stripfit \
 
   -r restrt.rec -ref ../002.ANTE.TLEAP/1LOQ.rec.gas.leap.rst7 -x mdcrd.rec -inf mdinfo.rec
 
   -r restrt.rec -ref ../002.ANTE.TLEAP/1LOQ.rec.gas.leap.rst7 -x mdcrd.rec -inf mdinfo.rec
 
 
   exit
 
   exit
 
~                                                                                                                                                   
 
~                                                                                                                                                   
Line 293: Line 559:
 
~                                                                                                                                                   
 
~                                                                                                                                                   
 
~                                                                                                                                                   
 
~                                                                                                                                                   
~                                                                                                                                                 
+
                                                                                                                                   
~                                                                                                                                                 
 
~                                                                                                                                                 
 
 
 
Then this script should be sent to the queue, i.e., qsub the script using the commands:  
 
Then this script should be sent to the queue, i.e., qsub the script using the commands:  
 
   qsub run.sander.rescore.csh
 
   qsub run.sander.rescore.csh
 
You can monitor your progress by typing  
 
You can monitor your progress by typing  
 
   qstat -u username
 
   qstat -u username
When the job is complete, you will obtain the following output files: ""gb.rescore.out.com, gb.rescore.out.lig, and gb.rescore.out.rec""               
+
When the job is complete, you will obtain the following output files: <font color="#666666">'''gb.rescore.out.com''', '''gb.rescore.out.lig''', and '''gb.rescore.out.rec'''</font> 
 +
In these files, the single point energy calculation results will be written for each individual frame. It will be found in the results section and the output file will have an infrastrucutre that is similar to the following:
 +
      FINAL RESULTS
 +
    NSTEP      ENERGY          RMS            GMAX        NAME    NUMBER
 +
      1      3.6269E+03    1.8737E+01    1.0472E+02    CB        585
 +
  BOND    =      580.2786  ANGLE  =    1563.7704  DIHED      =    2161.5659
 +
  VDWAALS =    -1684.2762  EEL    =  -13809.8494  EGB        =    -2953.6681
 +
  1-4 VDW =      756.7767  1-4 EEL =    7260.2823  RESTRAINT  =        0.0000
 +
  ESURF  =    9752.0291
 +
  minimization completed, ENE= 0.36269092E+04 RMS= 0.187371E+02
 +
  TRAJENE: Trajectory file ended
 +
  TRAJENE: Trajene complete.
 +
 
 +
In the command line, type:
 +
 
 +
  grep VDWAALS gb.rescore.out.com > vdw.com.txt.
 +
  grep ESURF gb.rescore.out.com > surf.com.txt. 
 +
 
 +
You can take these text files, import them into Excel, and do the rest of your modifications there.           
 +
 
 
====Equations for analysis====
 
====Equations for analysis====
 
+
Remember that to obtain the Gvdw term, you need to take the SASA (which is ESURF) and input into equation 1:
 +
 
 +
ΔGnonpolar = SASA*0.00542 + 0.92
 +
 
 +
Also, the mmgbsa of a given system can be determined by equation 2:
 +
 
 +
ΔGmmgbsa = ΔGvdw + ΔGcoul + ΔGpolar + ΔGnonpolar
 +
 
 +
From the output file:
 +
 
 +
VDWAALS = ΔGvdw
 +
 
 +
EELS = ΔGcoul
 +
 
 +
EGB = ΔGpolar
 +
 
 +
You can then easily calculate the ΔΔGbind by using equation 3:
 +
 
 +
ΔΔGbind = ΔGmmgbsa,complex – (ΔGmmgbsa,lig + ΔGmmgbsa,rec)
 +
You will want to careful when doing your analysis that the results from frame 1 for the receptor and ligand are subtracted from the results from frame 1 for your complex. By doing this in excel, you should have 2000 frames for each, and the values should cleanly line up. Finally, you will want to plot your ΔΔGbind and examine if you see corresponding changes in the ligand position and the ΔΔGbind. Also, you should determine the mean and standard deviation for your ΔΔGbind.
 
====Plotting Energy====
 
====Plotting Energy====
 +
When your rescoring calculation finishes, you have the following three output files: "gb.rescore.out.com", "gb.rescore.out.lig", and "gb.rescore.out.rec".
 +
 +
Use the following script, entitled get.mmgbsa.bash, to extract data and calculate MMGBSA energy for each snap shot.
 +
  #! /bin/bash
 +
  # by Haoquan
 +
  echo com lig rec > namelist
 +
  LIST=`cat namelist`
 +
  for i in $LIST ; do
 +
  grep VDWAALS gb.rescore.out.$i | awk '{print $3}' > $i.vdw
 +
  grep EGB    gb.rescore.out.$i | awk '{print $9}' > $i.polar
 +
  grep EELS    gb.rescore.out.$i | awk '{print $6}' > $i.coul
 +
  grep ESURF  gb.rescore.out.$i | awk '{print $3 * 0.00542 + 0.92}' > $i.surf
 +
  paste -d " " $i.vdw $i.polar $i.surf $i.coul | awk '{print $1 + $2 + $3 + $4}' > data.$i
 +
  rm $i.*
 +
  done
 +
  paste -d " " data.com data.lig data.rec | awk '{print $1 - $2 - $3}' > data.all
 +
  for ((j=1; j<=`wc -l data.all | awk '{print $1}'`; j+=1)) do
 +
  echo $j , >> time
 +
  done
 +
  paste -d " " time data.all > MMGBSA_vs_time.dat 
 +
  rm namelist time data.*
 +
 +
To run this script do:
 +
 +
  bash get.mmgbsa.sh
 +
 +
This will create a text file called MMGBSA_vs_time.dat with x and y values separated by a space and comma. These values can be imported to Excel or Origin or to XMGRACE if you are using Linux:
 +
 +
  xmgrace MMGBSA_vs_time.dat
 +
 +
2013 AMBER MMGBSA plot.jpg
 +
[[File:Example.png]]
  
 
==V. Frequently Encountered Problems==
 
==V. Frequently Encountered Problems==

Latest revision as of 15:54, 6 January 2015

For additional Rizzo Lab tutorials see AMBER Tutorials.

In this tutorial, we will learn how to run a molecular dynamics simulation of a protein-ligand complex. We will then post-process that simulation by calculating structural fluctuations (with RMSD) and free energies of binding (MM-GBSA).

I. Introduction

AMBER

Amber - Assisted Model Building with Energy Refinement - is a suite of about multiple programs for perform macromolecular simulations. Amber11, the current version of Amber, includes newly released functionality such as PMEMD, particle mesh Ewald MD and soft-core Thermodynamics Integration MD. For the tutorial, we are using the newest version AMBER12.

The Amber 12 Manual is the primary resource to get started with Amber12. (Tip: Using Adobe Acrobat to view the file, you can simply search the document for keywords such as the name of a simulation parameter, which saves much time.) In addition, Amber Tools User's Manual serves as another reference while using Amber tools.

Here are some programs in Amber

  1. LEaP: an preparing program for constructing new or modified systems in Amber. It consists of the functions of prep, link, edit, and parm for earlier version of Amber.
  2. ANTECHAMBER: in additional to LEap, this main Antechamber suite program is for preparing input files other than standard nucleic acids and proteins.
  3. SANDER: according to the Amber 12 manual, it is 'a basic energy minimizer and molecular dynamics program' that can be used to minimize, equilibrate and sample molecular conformations. And this is the program we mainly use in this tutorial to generate trajectory files of the molecular system.
  4. PMEMD: version of SANDER that has improved parallel scaling property and optimized speed.
  5. PTRAJ: an analysis program for processing trajectory files. One can use ptraj to rotate, translate the structures, evaluate geometrical features and so on.

There is a mailing list you could sign-up for, as an additional resource.

UMP and OMP

For information of the UMP-OMP system, see 2013 DOCK tutorial with Orotodine Monophosphate Decarboxylase.

Organizing Directories

While performing MD simulations, it is convenient to adopt a standard directory structure / naming scheme, so that files are easy to find / identify. For this tutorial, we will use something similar to the following:

~username/AMS536/AMBER-Tutorial/001.CHIMERA.MOL.PREP/  
                                002.ANTE.TLEAP/ 
                                003.SANDER/       
                                004.PTRAJ/
                                005.MMGBSA/

II. Structural Preparation

Preparation in Chimera

In this AMBER tutorial, we will use the same system with previous DOCK part. To begin with, we need three files under directory 001.CHIMERA.MOL.PREP.

Downloading the PDB file (1LOQ)

Since we need to edit the PDB before we use it in Chimera we should do this manually. Go to PDB homepage (http://www.rcsb.org/pdb/home/home.do ) enter the protein ID (1LOQ) in the search bar, click Download Files in the top-right of the webpage, then select PDB File (text). In the new window, save the file in Downloads.

Preparing the ligand and receptor in Chimera

In this section, we will create three new files and save them in the 001.CHIMERA.MOL.PREP/ folder:

 1LOQ.dockprep.mol2      (complete system with hydrogens and charges)
 1LOQ.receptor.noH.pdb  (the receptor alone, without hydrogens)
 1LOQ.ligand.mol2        (the ligand alone)

To prepare these files, first copy the original PDB file into the 001.CHIMERA.MOL.PREP/ folder and open it with VIM ($ vim 1LOQ.pdb). Because the residue name of the ligand (U) will give us some problems when assigning charges, change the residue name "U" to "LIG" starting at line 2082. Here is an example command that will change all instances of " U" into "LIG", while preserving the correct spacing:

 :%s/U/LIG/gc

For this command, g is short for global and c is short for check with the user before making the change.

Next, open up the PDB file (1LOQ.pdb) in Chimera. To delete water molecules and other ligands, click Tools -> Structure Editing -> Dock Prep. Check all boxes and click Okay to the end. Alternatively, the waters can be deleted manually by choosing Select -> Residue -> HOH, then go to Actions -> Atoms/Bonds -> Delete. Hydrogen atoms can be added manually by choosing Tools -> Structure Editing -> Add H.

Next, to add charges to the ligand and receptor, go to Select -> Residue -> LIG, then go to Tools -> Structure Editing -> Add Charge. Choose AMBER ff99SB as the charge model, click Okay, and when prompted chose AM1-BCC charges for the ligand, and make sure the Net Charge is set to -1. (You must consider the chemistry of the ligand when assigning a charge state).

Finally, save this file as 1LOQ.dockprep.mol2.

Generating the final files

To create the receptor file with no hydrogen atoms: Open 1LOQ.dockprep.mol2, click Select -> Chemistry -> Element -> H, then chose Actions -> Atoms/Bonds -> Delete. Save the file as 1LOQ.receptor.noH.pdb.

To create the ligand file: Open 1LOQ.dockprep.mol2, click Select -> Residue -> LIG, then click Select -> Invert, then chose Actions -> Atoms/Bonds -> Delete. Save the file as 1LOQ.ligand.mol2.

antechamber

The antechamber program itself is the main program of Antechamber package. In most of cases, one should use this program instead of a series of separated programs to do molecular format convertion, atom type assignment and charge generation etc. An antechamber input file requires all the atom names to be unique. So if we use 1LOQ.ligand.mol2 as the input file, it will cause errors. The program can only recognize atom names of 3 characters ( In this case, H5' and H5'' cannot be distinguished from each other. )

    22 H5'        40.0697   36.0506   37.6716 H         1 LIG210      0.0761
    23 H5         40.6060   37.0416   36.2883 H         1 LIG210      0.0349
    24 H4'        38.2510   37.5673   38.1082 H         1 LIG210      0.0967
    25 H3'        38.2723   38.5564   35.5613 H         1 LIG210      0.1052
    26 H2'        40.3056   39.7322   35.6920 H         1 LIG210      0.1188
    27 H1'        39.7587   40.5927   38.4795 H         1 LIG210      0.0753
    28 H6         41.3847   41.7815   39.3543 H         1 LIG210      0.0322
    29 H5         43.6843   42.3632   39.6188 H         1 LIG210      0.1693
    30 H3         44.6515   39.3570   36.8437 H         1 LIG210      0.3481
    31 HO3'       36.5607   38.9743   36.9843 H         1 LIG210      0.4373
    32 HP3        39.3989   32.7036   35.9199 H         1 LIG210      0.4245
    33 HO2'       39.5501   41.7087   35.4449 H         1 LIG210      0.4228

We need to manually rename the atoms.

    22 H1         40.0697   36.0506   37.6716 H         1 LIG210      0.0761
    23 H2         40.6060   37.0416   36.2883 H         1 LIG210      0.0349
    24 H3         38.2510   37.5673   38.1082 H         1 LIG210      0.0967
    25 H4         38.2723   38.5564   35.5613 H         1 LIG210      0.1052
    26 H5         40.3056   39.7322   35.6920 H         1 LIG210      0.1188
    27 H6         39.7587   40.5927   38.4795 H         1 LIG210      0.0753
    28 H7         41.3847   41.7815   39.3543 H         1 LIG210      0.0322
    29 H8         43.6843   42.3632   39.6188 H         1 LIG210      0.1693
    30 H9         44.6515   39.3570   36.8437 H         1 LIG210      0.3481
    31 HO1        36.5607   38.9743   36.9843 H         1 LIG210      0.4373
    32 HP3        39.3989   32.7036   35.9199 H         1 LIG210      0.4245
    33 HO2        39.5501   41.7087   35.4449 H         1 LIG210      0.4228

To begin with, go to 002.ANTE.TLEAP directory. To make sure we have access to the three programs that we want to run (antechamber, parmchk and tleap) and we are using the correct version of amber, we can use the which command, type:

which antechamber
which parmchk
which tleap

Your results should be similar to this:

/home/wjallen/AMS536/local/amber12/bin/antechamber
/home/wjallen/AMS536/local/amber12/bin/parmchk
/home/wjallen/AMS536/local/amber12/bin/tleap

Copy parameters of ions to your working directory from the following resource:

cp -r ~lingling/AMS536/AMBER_Tutorial/002.ANTE.TLEAP/rizzo_amber7.ionparms

Then we use antechamber to convert our input mol2 file into files ready for LEaP.Type command:

antechamber -i ../001.CHIMERA.MOL.PREP/1LOQ.lig.mol2 -fi mol2 -o 1LOQ.lig.ante.pdb -fo pdb

Here, -i input file name; -fi input file format; -o output file name; -fo output file format. You will have an output file:1LOQ.lig.ante.pdb

Similarly, we can use antechamber to change the fomat of 1LOQ.lig.mol2 file to prep file:

antechamber -i ../001.CHIMERA.MOL.PREP/1LOQ.lig.mol2 -fi mol2 -o 1LOQ.lig.ante.prep -fo prepi

You will get a set of output files:

ANTECHAMBER_AC.AC  
ANTECHAMBER_AC.AC0  
ANTECHAMBER_BOND_TYPE.AC  
ANTECHAMBER_BOND_TYPE.AC0  
ANTECHAMBER_PREP.AC  
ANTECHAMBER_PREP.AC0
ATOMTYPE.INF 
NEWPDB.PDB 
PREP.INF 
1LOQ.lig.ante.prep

parmchk

Parmchk is another program in Antechamber. After running antechamber, we run parmchk to check the parameters. If there are missing parameters after antechamber is finished, the frcmod template generated by parmchk will help us in generating the needed parameters:

parmchk -i 1LOQ.lig.ante.prep -f prepi -o 1LOQ.lig.ante.frcmod

tleaP

Next, we need 3 new input files – “tleap.lig.in”, “tleap.rec.in” and “tleap.com.in”, for the ligand, the receptor, and the complex, respectively. These input files will be used by LEaP to create parameter/topology files and initial coordinate files.

Copy parameters of ions to your working directory from the following resource:

cp  ~lingling/AMS536/AMBER_Tutorial/002.ANTE.TLEAP/tleap.lig.in
cp  ~lingling/AMS536/AMBER_Tutorial/002.ANTE.TLEAP/tleap.rec.in
cp  ~lingling/AMS536/AMBER_Tutorial/002.ANTE.TLEAP/tleap.com.in

Use leap command:

tleap -s -f tleap.lig.in > tleap.lig.out
tleap -s -f tleap.rec.in > tleap.rec.out
tleap -s -f tleap.com.in > tleap.com.out

You will see the following output files: 1LOQ.com.gas.leap.prm7 1LOQ.com.wat.leap.prm7 1LOQ.lig.ante.frcmod 1LOQ.lig.ante.prep 1LOQ.lig.gas.leap.rst7 1LOQ.lig.wat.leap.rst7 1LOQ.rec.gas.leap.rst7 1LOQ.rec.wat.leap.rst7 1LOQ.com.gas.leap.rst7 1LOQ.com.wat.leap.rst7 1LOQ.lig.ante.pdb 1LOQ.lig.gas.leap.prm7 1LOQ.lig.wat.leap.prm7 1LOQ.rec.gas.leap.prm7 1LOQ.rec.wat.leap.prm7 tleap.lig.out tleap.rec.out tleap.com.out leap.log

Visualization in VMD

Visualization is an important step in AMBER molecular dynamics simulation as it allows for the viewing of molecules and molecule movements within a specified field of view. Several files prepared in the previous two steps will be required for the visualization of the ligand and its movement in the protein binding pocket. The first step that must be completed is the copying of all necessary information from Seawulf to Herbie.

Before exiting Seawulf, you can type:

scp -r ~/AMS536/AMBER_Tutorial/002.ANTE.TLEAP username@herbie.mathlab.sunysb.edu:~/AMS536/AMBER_Tutorial/

Then, when in Herbie, type the command vmd in any directory and the program VMD will open on the computer. In the "VMD Main" window, click File > New Molecule. Now in the new window titled "Molecule File Browser" a few files must be selected and loaded. To select and load a file follow these steps:

1. Click Browse... and select the file that you desire.
2. Click the down button in the "Determine file type:" field and select the proper file type.
3. Click Load and view the molecule/system in the original "VMD 1.8.5 OpenGL Display" window.

We can now visualize several files: the protein-ligand complex in the gas or water phase, and the ligand in the gas or water phase. Viewing the protein-ligand complex in the gas phase we select 1LOQ.com.gas.leap.prm7 and the file type AMBER7 Parm. Next you need to select and load the file 1LOQ.com.gas.leap.rst7 and the file type AMBER7 Restart. This can be done for the complex or ligand for either the gas or the water phase by selecting and loading the corresponding .parm7 files and .rst7 files.

Next, you can choose to edit the structure shown for better visualization. By clicking Graphics > Representations... a new window "Graphical Representations" will open, in which you can create new representations of the protein-ligand complex. It is possible to choose the entire complex or simply the protein or ligand, to color the structure in various ways, and to choose how it is best represented (e.g. lines, thicker bonds, full surface).

III. Simulation using sander

Minimization

Energy minimization is first performed on the stucture before the equilibration and production runs may be performed. Model building often creates unwanted structural artifacts that must be removed before a molecular dynamics simulation is performed.

To begin, create four files for minimization steps 1,3,4 and 5.

vim 01mi.in

and follow the naming according to minimization run number, i.e. 03mi.in

All "#mi.in" file content will be identical except the last parameter, the restraint weight (restraint_wt). This restraint will decrease with increasing minimization runs. Run number 1,3,4,and 5 has restraint weight 5,2,0.1,and 0.05 respectively.


An example of minimization code for 01mi.in:

01mi.in: equilibration
&cntrl
 imin = 1, maxcyc = 1000, ntmin = 2,
 ntx = 1, ntc = 1, ntf = 1,
 ntb = 1, ntp = 0,
 ntwx = 1000, ntwe = 0, ntpr = 1000,
 ntr = 1,
 restraintmask = ':1-210 & !@H=',
 restraint_wt=5.0,


The following parameters are defined as follows:

imin:specifies minimization not molecular dynamics

maxcyc:total number of minimization cycles to be performed

ntmin:how many cycles will use a deepest decent method, the remaining cycles use an approximation of this called the conjugate gradient method.

ntx:only coordinates and not velocities are to be read from previous step

ntc:indicates level of constraint on bonds. if =1, SHAKE algorithm is off so no bonds are constrained. If =2, constrains any bonds with H atoms. If =3, constrains all bonds.

ntf:=1, all parts of the potential must be evaluated

ntb:periodic boundary to keep system at constant volume

ntp:=0, NO constant pressure applied

The frequencies at which the program records data are in controlled by the paramenters ntwx, ntwe, and ntpr.

ntwx:=1000, atom coordinates saved into .trj file every 1000 cycles

ntwe:=0, no .en file is generated

ntpr:=1000, energy readins are written as .out and .info files every 1000 steps

ntr:=1, positional restraint method applied

restraintmask= ':1-119 & !@H : position of atom within residues 1-119 that is not a H atom is being restrained

restraint_wt: restraint weight indicating how strong the restraint on the atoms is

Equilibration

To further prepare our complex for the molecular dynamic simulations, the subsequent step of energy minimization is equilibrate the system at some certain temperatures. We repeat the process of minimization and equilibration for twice in our case, of course with varied parameters and restraints put on our system.

Right after the first 1000 steps of minimization, we perform a 50000 steps(nstlim = 50000) X the step length of 1 fs(dt = 0.001) (that is 50 ps in total) equilibration at the temperature 298.15K(temp0 = 298.15, tempi = 298.15), contining putting on the weight (in kcal/mol) of 5.0 for the positional restraints on all non hydrogen atoms(restraintmask = ':1-210 & !@H=', restraint_wt = 5.0).

02md.in: equilibration (50ps)
&cntrl
  imin = 0, ntx = 1, irest = 0, nstlim = 50000,
  temp0 = 298.15, tempi = 298.15,
  ntc = 2, ntf = 1, ntt = 1, dt = 0.001,
  ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0,
  ntwx = 1000, ntwe = 0, ntwr = 1000, ntpr = 1000,
  ntr = 1, nscm = 100, iwrap = 1, cut = 8.0,
  restraintmask = ':1-210 & !@H=', restraint_wt = 5.0,
/
  • Notice that some flags used in minimization have been changed:

imin = 0 indicates that we are doing an MD simulation instead of doing energy minimization.

ntc = 2 indicates that all bonds involving H bonds are constrained by the SHAKE algorithm to eliminate high frequency oscillations in the system.

ntb = 2 and ntp = 1 indicate that constant pressure instead of constant volume is applied.

  • Some flags (including those explained above) are newly added;

ntx = 1 means coordinates, but no velocities, will be read.

irest = 0 irest is a flag to restart a simulation, we always start from running a new simulation until the production part, so it is 0 from 02md.in, 06md.in to 09md.in.

ntt = 1 switch the temperature scaling, 1 means constant temperature, using the weak-coupling algorithm.

tautp = 1.0 and taup = 1.0 defines the time constant (in ps) and the pressure relaxation time (in ps), respectively.

iwrap = 1 indicate that the coordinates written to the .rst7 and .trj files will be wrapped into a primary box. This reduces the amount of coordinate output, thus preventing overflowing and making visualization more convenient.

cut = 8.0 set the the non-bonded cutoff distance at 8.0 Angstroms.

Continuing 4 simulations are carried out after the fourth minimization, there are only little differents in these input files:

06md.in: equilibration (50ps)
 &cntrl
   imin = 0, ntx = 1, irest = 0, nstlim = 50000,
   temp0 = 298.15, tempi = 298.15, ig = 71287,
   ntc = 2, ntf = 1, ntt = 1, dt = 0.001,
   ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0,
   ntwx = 1000, ntwe = 0, ntwr = 1000, ntpr = 1000,
   ntr = 1, nscm = 100, iwrap = 1, cut = 8.0,
   restraintmask = ':1-210 & !@H=', restraint_wt = 1.0,
/
  • As we can see in 06md.in file, the only changes are setting ig as 71287 and reducing restraint_wt value from 5.0 to 1.0.

ig is the seed for the pseudo-random number generator which the MD starting velocity depend on.

To further reduce the restraint put on the whole system, we vary restraint_wt value to be 0.05, 0.1, 0.1, respectively in 07md, 08md and 09md input files.

  • Note that starting from 08md to 09md simulations, the restraintmask = ':1-210 & !@H=' has been changed to restraintmask = ':1-209 & !@H=' which means not at all restrains are put on the ligand atoms.

Production

Before the production runs (10 and 11), first we need to generate the input files 10md.in and 11md.in (as shown below).

10md.in: production (500000 = 1ns)
&cntrl
  imin = 0, ntx = 5, irest = 1, nstlim = 500000,
  temp0 = 298.15, tempi = 298.15, ig = 71287,
  ntc = 2, ntf = 1, ntt = 1, dt = 0.002,
  ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0,
  ntwx = 500, ntwe = 0, ntwr = 500, ntpr = 500,
  ntr = 1, nscm = 100, iwrap = 1, cut = 8.0,
  restraintmask = ':1-209@CA,C,N', restraint_wt = 0.1,
/
11md.in: production (500000 = 1ns)
&cntrl
  imin = 0, ntx = 5, irest = 1, nstlim = 500000,
  temp0 = 298.15, tempi = 298.15, ig = 71287,
  ntc = 2, ntf = 1, ntt = 1, dt = 0.002,
  ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0,
  ntwx = 500, ntwe = 0, ntwr = 500, ntpr = 500,
  ntr = 1, nscm = 100, iwrap = 1, cut = 8.0,
  restraintmask = ':1-209@CA,C,N', restraint_wt = 0.1,
/

Explanation for some of the parameter values:

ntx = 5, irest = 1 #using velocity information from the previous restart file#
nstlim = 500000    #running for 500,000 steps#
temp0 = 298.15     #the target temperature is 298.15 K (the default is 300 K)#
tempi = 298.15     #the initial temperature is 298.15 K#
dt = 0.002         #the time step dt is 0.002 ps (2 fs)#
ntwr = 500         #to print energy output every 500 steps#  
ntb = 2            #the constant pressure is switched on#

To run the production, we need to create a c-shell file named "run.production.csh" of which the contents are shown below:

#! /bin/tcsh
#-l nodes=4:ppn=2
#PBS -l walltime=720:00:00
#PBS -o zzz.qsub.prod.out
#PBS -e zzz.qsub.prod.err
#PBS -V

set workdir = "~<username>/AMS536/amber-tutorial/03.sander"
cd ${workdir}
cat $PBS_NODEFILE | sort | uniq

mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 10md.in -o 10md.out -p ../02.ante.tleap/1LOQ.com.wat.leap.prm7 \
-c 09md.rst7 -ref 09md.rst7 -x 10md.trj -inf 10md.info -r 10md.rst7
mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 11md.in -o 11md.out -p ../02.ante.tleap/1LOQ.com.wat.leap.prm7 \
-c 10md.rst7 -ref 10md.rst7 -x 11md.trj -inf 11md.info -r 11md.rst7
exit

The last step of a production run is to submit the c-shell file to the queue:

qsub zzz.production.csh

It will take almost 24 hour to finish the production run. To check the status of your job, type:

qstat -u <username>

After the production run, 10 new files should be created:

10md.info
10md.out
10md.rst7
10md.trj
11md.info
11md.out
11md.rst7
11md.trj
zzz.qsub.prod.err (should be empty if nothing is wrong)
zzz.qsub.prod.out

Running jobs on the queue

To run equilibration and production, Write the tcsh file equil.produc.qsub.csh:

 #! /bin/tcsh
 #PBS -l nodes=4:ppn=2
 #PBS -l walltime=720:00:00
 #PBS -o zzz.qsub.out
 #PBS -e zzz.qsub.err
 #PBS -V
 
 set workdir = "~lingling/AMS536/AMBER_Tutorial/003.SANDER"
 cd ${workdir}
 cat $PBS_NODEFILE | sort | uniq
 
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 01mi.in -o 01mi.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c ../002.ANTE.TLEAP/1LOQ.com.wat.leap.rst7 -ref ../002.ANTE.TLEAP/1LOQ.com.wat.leap.rst7 \
  -x 01mi.trj -inf 01mi.info -r 01mi.rst7
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 02md.in -o 02md.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c 01mi.rst7 -ref 01mi.rst7 -x 02md.trj -inf 02md.info -r 02md.rst7
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 03mi.in -o 03mi.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c 02md.rst7 -ref 02md.rst7 -x 03mi.trj -inf 03mi.info -r 03mi.rst7
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 04mi.in -o 04mi.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c 03mi.rst7 -ref 03mi.rst7 -x 04mi.trj -inf 04mi.info -r 04mi.rst7
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 05mi.in -o 05mi.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c 04mi.rst7 -ref 04mi.rst7 -x 05mi.trj -inf 05mi.info -r 05mi.rst7
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 06md.in -o 06md.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c 05mi.rst7 -ref 05mi.rst7 -x 06md.trj -inf 06md.info -r 06md.rst7
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 07md.in -o 07md.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c 06md.rst7 -ref 06md.rst7 -x 07md.trj -inf 07md.info -r 07md.rst7
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 08md.in -o 08md.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c 07md.rst7 -ref 07md.rst7 -x 08md.trj -inf 08md.info -r 08md.rst7
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 09md.in -o 09md.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c 08md.rst7 -ref 08md.rst7 -x 09md.trj -inf 09md.info -r 09md.rst7
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 10md.in -o 10md.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c 09md.rst7 -ref 09md.rst7 -x 10md.trj -inf 10md.info -r 10md.rst7
 mpirun -n 8 /nfs/user03/wjallen/local/amber12/bin/sander.MPI -O -i 11md.in -o 11md.out -p ../002.ANTE.TLEAP/1LOQ.com.wat.leap.prm7 \
  -c 10md.rst7 -ref 09md.rst7 -x 11md.trj -inf 11md.info -r 11md.rst7
 exit
  

Note in the above script that for each run, the .rst7, .trj and .info files generated from the previous run provides the initial state to start from. The .prm7 file generated for the hydrated complex by TLEAP provides the force parameters.

Submit the file to the queue and monitor progress.

IV. Simulation Analysis

Ptraj

You should create another work directory for this step (004.PTRAJ, for example), if you don't have one. There will be five runs of analysis, each of which will require different input files.

1. At first we want to concatenate the two 1ns trajectories together, stripping off the waters, and creating a .strip-file as an output. Below is the input file which will allow us to do so.

ptraj.1.in

trajin ../003.SANDER/10md.trj 1 1000 1
trajin ../003.SANDER/11md.trj 1 1000 1
trajout 1LOQ.trj.strip nobox
strip :WAT

The two sets of numbers 1 1000 1 give the input information about which frames are used for the Ptraj. The first two numbers 1 and 1000 specify the starting and ending snapshots from the trajectory file. The ending number of the snapshot doesn't need to be accurate because if you actually don't have enough snapshots in your trajectory file, Ptraj will read up to the last one you have. The last number 1 specifies the frequency of the snapshot saved, in this case, we are saving every frame of the trajectory file. And the last line of the input file will take away all the water molecules.

As the input file is prepared we can launch the first analysis as follows:

ptraj  ../002.ANTE.TLEAP/1LOQ.com.wat.leap.parm  ptraj.1.in  > ptraj.1.log

As the output you will obtain 1LOQ.trj.strip file which contains 2000 snapshots of the trajectory.

2. Later on we want to compare the output file just obtained to our reference file 1LOQ.com.gas.leap.rst7, using the following input file.

ptraj.2.in

trajin 1LOQ.trj.strip  1 2000 1
trajout 1LOQ.com.trj.stripfit
reference ../002.ANTE.TLEAP/1LOQ.com.gas.leap.rst7
rms reference out 1LOQ.rmsd.CA.txt :1-209@CA

Since we have just concatenated the two trajectories, we will have 2000 snapshots in 1LOQ.trj.strip. The third line in the input specifies the reference file, we have taken away all the water molecules during the first step, hence we are working here with the gas phase complex. The last line says we are calculating the rmsd for alpha carbon number 1 to 209.

And as we have this file filled out, we can run this step:

ptraj  ../002.ANTE.TLEAP/1LOQ.com.gas.leap.parm  ptraj.2.in  > ptraj.2.log

The output file 1LOQ.rmsd.CA.txt wil contain two columns, the first one is the number of frame and the second one stands for rmsd value.

3. Afterwards we will generate a similar file for ligand, using the following input file.

ptraj.3.in

trajin 1LOQ.com.trj.stripfit 1 2000 1
reference ../002.ANTE.TLEAP/1LOQ.com.gas.leap.rst7
rms reference out 1LOQ.lig.rmsd.txt :210@C*,N*,O*,S* nofit

And then run this step:

ptraj  ../002.ANTE.TLEAP/1LOQ.com.gas.leap.parm  ptraj.3.in  > ptraj.3.log

By doing this we will compare the trajectory file to 1LOQ.com.gas.leap.rst7 as well, but working with the ligand instead of the receptor (we specified that by pointing that we want to calculate the rmsd for carbon, nitrogen, oxygen and sulfur in residue 210.

The last two steps are to obtain energetic information about the system. To do this we take a trajectory file of the gas phase complex 1LOQ.com.trj.stripfit, and want to create two more trajectory files containing the information on only receptor and only ligand correspondingly.

4. At this step we consider receptor only. The input file is provided below:

ptraj.4.in

trajin 1LOQ.com.trj.stripfit 1 2000 1
trajout 1LOQ.rec.trj.stripfit
strip :210

And then run this step:

ptraj  ../002.ANTE.TLEAP/1LOQ.com.gas.leap.parm  ptraj.4.in  > ptraj.4.log

5. And the same procedure for the ligand, with the following input file:

ptraj.5.in

trajin 1LOQ.com.trj.stripfit 1 2000 1
trajout 1LOQ.lig.trj.stripfit
strip :1-209

And then run this step:

ptraj  ../002.ANTE.TLEAP/1LOQ.com.gas.leap.parm  ptraj.5.in  > ptraj.5.log

6. (optional) Visualization

As we've gone through all these steps, the analysis is done. If you want to visualize the trajectories, you first need to copy the trajectory files to Herbie like this, for example (being a level above 004.PTRAJ directory):

scp 004.PTRAJ/ your_username@herbie.mathlab.sunysb.edu:~AMS536/AMBER_tutorial/004.PTRAJ 

Now, launch VMD, then open one of the .prm7 files in 002.ANTE.TLEAP. If you want to visualize the whole complex in gas state, you can open 1LOQ.com.gas.leap.prm7 with AMBER7 Parm from 002.ANTE.TLEAP and then 1LOQ.com.trj.stripfit with AMBER coordinates from 004.PTRAJ. With these files, you can look at the real-time movement of the complex in the gas state. You can repeat this procedure to observe the real-time movement of the complex in the water state. Just open 1LOQ.com.wat.leap.prm7 instead of 1LOQ.com.gas.leap.prm7.

RMSD Plots

Once having finished running Ptraj, you should find two RMSD files in your directory: 1LOQ.lig.rmsd.txt and 1LOQ.rmsd.CA.txt The data in these files can be plotted to visualize the structural drift (ie change in RMSD) over the course of your simulation. Large fluctuations in RMSD may be an indication that the simulation was unstable.

Here is a plot for ligand RMSD:

AMBER RMSD plot.jpg

Measuring h-bond distances

MM-GBSA Energy Calculation

MM/GBSA is the acronym for Molecular Mechanics/Generalized Born Surface Area. This part of AMBER combines molecular mechanics (MM) with both the electrostatic (PB) and nonpolar (SA) contribution to solvation . Topology files are needed for the receptor, ligand, and receptor-ligand complex. The trajectory files generate the coordinates. Therefore, molecular dynamics is used to generate a set of snapshots taken at fixed intervals from the trajectories. These snapshots are processed to remove solvent and generate the free energy of binding.

In the AMBER_Tutorial directory, create a new directory:

 mkdir 005.MMGBSA

In this new directory, create the file gb.rescore.in containing:

 Single point GB energy calc
&cntrl
 ntf    = 1,        ntb    = 0,        ntc    = 2,
 idecomp= 0,
 igb    = 5,        saltcon= 0.00,
 gbsa   = 2,        surften= 1.0,
 offset = 0.09,     extdiel= 78.5,
 cut    = 99999.0,  nsnb   = 99999,
 imin   = 5,        maxcyc = 1,        ncyc   = 0,
 /

Then create a csh script, run.sander.rescore.csh, that contains the following lines of command:

 #! /bin/csh
 #PBS -l nodes=1:ppn=2
 #PBS -l walltime=24:00:00
 #PBS -o zzz.qsub.out
 #PBS -e zzz.qsub.err
 #PBS -V
 set workdir = "/nfs/user03/yechen1/AMBER-Tutorial/005.MMGBSA"
 set sander  = "/nfs/user03/wjallen/local/amber12/bin/sander"
 cd ${workdir}
 $sander -O -i gb.rescore.in -o gb.rescore.out.com -p ../002.ANTE.TLEAP  /1LOQ.com.gas.leap.prm7 \
 -c ../002.ANTE.TLEAP/1LOQ.com.gas.leap.rst7 -y ../004.PTRAJ/1LOQ.com.trj.stripfit \
 -r restrt.com -ref ../002.ANTE.TLEAP/1LOQ.com.gas.leap.rst7 -x mdcrd.com -inf mdinfo.com
 $sander -O -i gb.rescore.in -o gb.rescore.out.lig -p ../002.ANTE.TLEAP/1LOQ.lig.gas.leap.prm7 \
 -c ../002.ANTE.TLEAP/1LOQ.lig.gas.leap.rst7 -y ../004.PTRAJ/1LOQ.lig.trj.stripfit \
 -r restrt.lig -ref ../002.ANTE.TLEAP/1LOQ.lig.gas.leap.rst7 -x mdcrd.lig -inf mdinfo.lig
 $sander -O -i gb.rescore.in -o gb.rescore.out.rec -p ../002.ANTE.TLEAP/1LOQ.rec.gas.leap.prm7 \
 -c ../002.ANTE.TLEAP/1LOQ.rec.gas.leap.rst7 -y ../004.PTRAJ/1LOQ.rec.trj.stripfit \
 -r restrt.rec -ref ../002.ANTE.TLEAP/1LOQ.rec.gas.leap.rst7 -x mdcrd.rec -inf mdinfo.rec
 exit

~ ~ ~ ~

Then this script should be sent to the queue, i.e., qsub the script using the commands:

 qsub run.sander.rescore.csh

You can monitor your progress by typing

 qstat -u username

When the job is complete, you will obtain the following output files: gb.rescore.out.com, gb.rescore.out.lig, and gb.rescore.out.rec In these files, the single point energy calculation results will be written for each individual frame. It will be found in the results section and the output file will have an infrastrucutre that is similar to the following:

     FINAL RESULTS
   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
     1       3.6269E+03     1.8737E+01     1.0472E+02     CB        585
  BOND    =      580.2786  ANGLE   =     1563.7704  DIHED      =     2161.5659
  VDWAALS =    -1684.2762  EEL     =   -13809.8494  EGB        =    -2953.6681
  1-4 VDW =      756.7767  1-4 EEL =     7260.2823  RESTRAINT  =        0.0000
  ESURF   =     9752.0291
 minimization completed, ENE= 0.36269092E+04 RMS= 0.187371E+02
 TRAJENE: Trajectory file ended
 TRAJENE: Trajene complete.

In the command line, type:

 grep VDWAALS gb.rescore.out.com > vdw.com.txt.
 grep ESURF gb.rescore.out.com > surf.com.txt.   

You can take these text files, import them into Excel, and do the rest of your modifications there.

Equations for analysis

Remember that to obtain the Gvdw term, you need to take the SASA (which is ESURF) and input into equation 1:

ΔGnonpolar = SASA*0.00542 + 0.92

Also, the mmgbsa of a given system can be determined by equation 2:

ΔGmmgbsa = ΔGvdw + ΔGcoul + ΔGpolar + ΔGnonpolar

From the output file:

VDWAALS = ΔGvdw

EELS = ΔGcoul

EGB = ΔGpolar

You can then easily calculate the ΔΔGbind by using equation 3:

ΔΔGbind = ΔGmmgbsa,complex – (ΔGmmgbsa,lig + ΔGmmgbsa,rec) You will want to careful when doing your analysis that the results from frame 1 for the receptor and ligand are subtracted from the results from frame 1 for your complex. By doing this in excel, you should have 2000 frames for each, and the values should cleanly line up. Finally, you will want to plot your ΔΔGbind and examine if you see corresponding changes in the ligand position and the ΔΔGbind. Also, you should determine the mean and standard deviation for your ΔΔGbind.

Plotting Energy

When your rescoring calculation finishes, you have the following three output files: "gb.rescore.out.com", "gb.rescore.out.lig", and "gb.rescore.out.rec".

Use the following script, entitled get.mmgbsa.bash, to extract data and calculate MMGBSA energy for each snap shot.

 #! /bin/bash
 # by Haoquan
 echo com lig rec > namelist
 LIST=`cat namelist`
 for i in $LIST ; do
 grep VDWAALS gb.rescore.out.$i | awk '{print $3}' > $i.vdw
 grep EGB     gb.rescore.out.$i | awk '{print $9}' > $i.polar
 grep EELS    gb.rescore.out.$i | awk '{print $6}' > $i.coul
 grep ESURF   gb.rescore.out.$i | awk '{print $3 * 0.00542 + 0.92}' > $i.surf
 paste -d " " $i.vdw $i.polar $i.surf $i.coul | awk '{print $1 + $2 + $3 + $4}' > data.$i
 rm $i.*
 done
 paste -d " " data.com data.lig data.rec | awk '{print $1 - $2 - $3}' > data.all 
 for ((j=1; j<=`wc -l data.all | awk '{print $1}'`; j+=1)) do
 echo $j , >> time
 done
 paste -d " " time data.all > MMGBSA_vs_time.dat  
 rm namelist time data.*

To run this script do:

 bash get.mmgbsa.sh

This will create a text file called MMGBSA_vs_time.dat with x and y values separated by a space and comma. These values can be imported to Excel or Origin or to XMGRACE if you are using Linux:

 xmgrace MMGBSA_vs_time.dat

2013 AMBER MMGBSA plot.jpg Example.png

V. Frequently Encountered Problems

Artem

Brian

He

Jiahui

Jiaye

Koushik

Natalie

Nikolay

Weiliang

Ye

Yuan

Zach