2014 AMBER tutorial with HIV Protease

From Rizzo_Lab
Revision as of 14:13, 26 March 2014 by Stonybrook (talk | contribs) (I. Introduction)
Jump to: navigation, search

For additional Rizzo Lab tutorials see AMBER Tutorials.

In this tutorial, we will learn how to run a molecular dynamics simulation of a protein-ligand complex. We will then post-process that simulation by calculating structural fluctuations (with RMSD) and free energies of binding (MM-GBSA).

For additional Rizzo Lab tutorials see AMBER Tutorials.

In this tutorial, we will learn how to run a molecular dynamics simulation of a protein-ligand complex. We will then post-process that simulation by calculating structural fluctuations (with RMSD) and free energies of binding (MM-GBSA).

I. Introduction

AMBER

Amber - Assisted Model Building with Energy Refinement - is a multi-program suite for macromolecular simulations. Amber12 is the most recent version of the software and it includes new force fields such as ff12SB and Lipid11, expanded options for Poisson-Boltzmann solvation calculations, accelerated molecular dynamics, additional features in sander pmemd code, and expanded methods for free energy calculations. Our lab is set up with Amber12 and the latest update of AmberTools13 which contains the programs such as antechamber and tleap to set up your simulation.

The Amber 12 Manual is available to get started with using Amber12 as well as a troubleshooting reference. (Tip: You can search the document for keywords if you use Adobe Acrobat to view the file which can save time.) Additionally, Amber Tools User's Manual is another reference for the probrgrams available under Amber tools.

Here some of programs availabe in Amber and AmberTools

  1. LEaP: is a preparatory program for constructing new or modified systems in Amber. It contains the functions: prep, link, edit, and parm from earlier version of Amber.
  2. ANTECHAMBER: is the primary program used to prepare input files excluding the original nucleic acid and protein information taken from the PDB.
  3. SANDER: is 'a basic energy minimizer and molecular dynamics program' that is used to minimize, equilibrate and sample molecular conformations. This is the program used to generate trajectory files of the system.
  4. PMEMD: is an improved version of SANDER optimized for periodic, PME simulations, and for GB simulations. It is faster and scales better on parallel machines than SANDER.
  5. PTRAJ: is an analysis program for processing trajectory files. ptraj can be used to rotate and translate the structures, evaluate geometrical features, calculate RMSDs among other analyses.

There is a mailing list available as an additional resource.

II. Structural Preparation

Lu, Yan and Yao

Preparation in Chimera

Downloading the PDB file

Go to PDB homepage (http://www.rcsb.org/pdb/home/home.do ) enter the protein ID (1HVR) in the search bar, click Download Files in the top-right of the webpage, then select PDB File (text). In the new window, save the file.

Preparing the ligand and receptor in Chimera

Generating the final files

antechamber

Ton begin with, build a 002.ANTE.TLEAP directory under Amber_tutorial directory. To make sure these three programs are available, (antechamber, parmchk and tleap) and we are using the correct version of amber, we can use the which command, type command:

  which antechamber
  which parmchk
  which tleap

The result should look like this:

  /nfs/user03/wjallen/local/amber12/bin/antechamber
  /nfs/user03/wjallen/local/amber12/bin/parmchk
  /nfs/user03/wjallen/local/amber12/bin/tleap

Copy parameters of ions to your working directory from the following resource:

  cp -r ~yuchzhou/AMS536/AMBER_Tutorial/002.ANTE.TLEAP/rizzo_amber7.ionparms

Then we use antechamber to convert our input mol2 file into files ready for LEaP.Type command:

  antechamber -i ../001.MOL.PREP/1HVR.lig.mol2 -fi mol2 -o 1HVR.lig.ante.pdb -fo pdb

Here, -i means input file name; -fi means input file format; -o means output file name; -fo output file format. You will have an output file:1HVR.lig.ante.pdb Similarly, we can use antechamber to change the format of 1LOQ.lig.mol2 file to prep file:

  antechamber -i ../001.MOL.PREP/1HVR.lig.mol2 -fi mol2 -o 1HVR.lig.ante.prep -fo prepi

You will get a set of output files:

  ANTECHAMBER_AC.AC  
  ANTECHAMBER_AC.AC0  
  ANTECHAMBER_BOND_TYPE.AC  
  ANTECHAMBER_BOND_TYPE.AC0  
  ANTECHAMBER_PREP.AC  
  ANTECHAMBER_PREP.AC0
  ATOMTYPE.INF 
  NEWPDB.PDB 
  PREP.INF 
  1HVR.lig.ante.prep

parmchk

After running antechamber, we run parmchk to check the parameters. Parmchk is another program in Antechamber. If there are missing parameters or mistake after antechamber is finished, the frcmod template generated by parmchk will help us in generating the needed parameters:

  parmchk -i 1LOQ.lig.ante.prep -f prepi -o 1LOQ.lig.ante.frcmod

In the command, -i input file name,-f input file format (prepi, prepc, ac, mol2),-o frcmod file name. When the programming is done, you should have a new file in the directory.

  1HVR.lig.ante.frcmod

tleaP

Next, three input files are needed to run TLEAP

Visualization in VMD

III. Simulation using sander

Junjie, Tianao and Kai

Minimization

Before running the equilibration and production, we should first run the energy minimization. The purpose of this step is to remove structural artifacts resulting from the model-building process. Because model building often creates unwanted structural artifacts that must be removed before a molecular dynamics simulation is performed.

An example of a typical input file “01mi.in” for minimization is listed blow:

  01mi.in: minimization
  &cntrl
  imin = 1, maxcyc = 1000, ntmin = 2,
  ntx = 1, ntc = 1, ntf = 1,
  ntb = 1, ntp = 0,
  ntwx = 1000, ntwe = 0, ntpr = 1000,
  cut = 8.0,
  ntr = 1,
  restraintmask = ':1-199 & !@H=',
  restraint_wt = 5.0,
  /

You also have to create four files for minimization which is "03mi.in", "04mi.in" and "05mi.in". The contents of these files are similar to each other.

The parameters are defined as follows:

imin: specifies minimization not molecular dynamics.

maxcyc: total number of minimization cycles to be performed.

ntmin: how many cycles will use a deepest decent method, the remaining cycles use an approximation of this called the conjugate gradient method.

ntx: only coordinates and not velocities are to be read from previous step.

ntc: indicates level of constraint on bonds. if =1, SHAKE algorithm is off so no bonds are constrained. If =2, constrains any bonds with H atoms. If =3, constrains all bonds.

ntf=1: all parts of the potential must be evaluated.

ntb: periodic boundary to keep system at constant volume.

ntp=0: NO constant pressure applied.

The frequencies at which the program records data are in controlled by the paramenters ntwx, ntwe, and ntpr.

ntwx=1000: atom coordinates saved into .trj file every 1000 cycles.

ntwe=0: no .en file is generated.

ntpr=1000: energy readins are written as .out and .info files every 1000 steps.

ntr=1: positional restraint method applied.

restraintmask= ':1-119 & !@H : position of atom within residues 1-119 that is not a H atom is being restrained.

restraint_wt: restraint weight indicating how strong the restraint on the atoms is.

Equilibration

Production

Running jobs on the queue

IV. Simulation Analysis

Arkin, Jess and Mosavverul

Checking Output Files

Ptraj

For this step we created another work directory (004.PTRAJ, for example).

Create the file ptraj.strip.wat.in containing the following input:

    trajin /nfs/user03/yuchzhou/AMS536/zzz.test/003.PMEMD/10md.trj 1 1000 1
    trajin /nfs/user03/yuchzhou/AMS536/zzz.test/003.PMEMD/11md.trj 1 1000 1
    trajout 1HVR.trj.strip nobox
    strip :WAT

This will concatenate your two 1ns trajectories together, strip off the waters, and output it as a new file called 1DF8.trj.strip. The two sets of numbers 1 1000 1 give you the input information about which frames you are using for the Ptraj. The first two numbers 1 and 1000 specify the starting and ending snapshots from your trajectory file. The ending number of the snapshot doesn't need to be accurate because if you actually don't have enough snapshots in your trajectory file, Ptraj will read up to the last one you have. The last number 1 specifies the frequency of the snapshot saved, in this case, we are saving every frame of the trajectory file. And the last line of the input file will take away all the water molecules.


RMSD Plots

Measuring h-bond distances

MM-GBSA Energy Calculation

MM/GBSA is the acronym for Molecular Mechanics/Generalized Born Surface Area. This part of AMBER combines molecular mechanics (MM) with both the electrostatic (PB) and nonpolar (SA) contribution to solvation . Topology files are needed for the receptor, ligand, and receptor-ligand complex. The trajectory files generate the coordinates. Therefore, molecular dynamics is used to generate a set of snapshots taken at fixed intervals from the trajectories. These snapshots are processed to remove solvent and generate the free energy of binding.

In the AMBER_Tutorial directory, create a new directory:

 mkdir 005.MMGBSA

In this new directory, create the file gb.rescore.in containing:

 Single point GB energy calc
&cntrl
 ntf    = 1,        ntb    = 0,        ntc    = 2,
 idecomp= 0,
 igb    = 5,        saltcon= 0.00,
 gbsa   = 2,        surften= 1.0,
 offset = 0.09,     extdiel= 78.5,
 cut    = 99999.0,  nsnb   = 99999,
 imin   = 5,        maxcyc = 1,        ncyc   = 0,
 /

Then create a csh script, run.sander.rescore.csh, that contains the following lines of command:

 #! /bin/tcsh
 #PBS -l nodes=1:ppn=2
 #PBS -l walltime=24:00:00
 #PBS -o zzz.qsub.out
 #PBS -e zzz.qsub.err
 #PBS -V
 
 set workdir = nfs/user03/username/AMS536/AMBER_Tutorial/005.MMGBSA
 cd /nfs/user03/username/AMS536/AMBER_Tutorial/005.MMGBSA
 
 mpirun -n 2 sander.MPI -O -i gb.rescore.in \
 -o gb.rescore.out.com \
 -p ../002.TLEAP/1HVR.com.gas.leap.prm7 \
 -c ../002.TLEAP/1HVR.com.gas.leap.rst7 \
 -y ../004.PTRAJ/1HVR.com.trj.stripfit \
 -r restrt.com \
 -ref ../002.TLEAP/1HVR.com.gas.leap.rst7 \
 -x mdcrd.com \
 -inf mdinfo.com
 
 mpirun -n 2 sander.MPI -O -i gb.rescore.in \
 -o gb.rescore.out.lig \
 -p ../002.TLEAP/1HVR.lig.gas.leap.prm7 \
 -c ../002.TLEAP/1HVR.lig.gas.leap.rst7 \
 -y ../004.PTRAJ/1HVR.lig.trj.stripfit \
 -r restrt.lig \
 -ref ../002.TLEAP/1HVR.lig.gas.leap.rst7 \
 -x mdcrd.lig \
 -inf mdinfo.lig
 
 mpirun -n 2 sander.MPI -O -i gb.rescore.in \
 -o gb.rescore.out.test.rec \
 -p ../002.TLEAP/1HVR.rec.gas.leap.prm7 \
 -c ../002.TLEAP/1HVR.rec.gas.leap.rst7 \
 -y ../004.PTRAJ/1HVR.rec.trj.stripfit \
 -r restrt.rec \
 -ref ../002.TLEAP/1HVR.rec.gas.leap.rst7 \
 -x mdcrd.rec \
 -inf mdinfo.rec
 
 exit

~ ~ ~ ~

Then this script should be sent to the queue, i.e., qsub the script using the commands:

 qsub run.sander.rescore.csh

You can monitor your progress by typing

 qstat -u username

When the job is complete, you will obtain the following output files: gb.rescore.out.com, gb.rescore.out.lig, and gb.rescore.out.rec In these files, the single point energy calculation results will be written for each individual frame. It will be found in the results section and the output file will have an infrastrucutre that is similar to the following:

                     FINAL RESULTS
 
 
 
    NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER  
       1       6.0197E+03     1.9231E+01     1.0478E+02     CA        950
 
  BOND    =      613.6427  ANGLE   =     1575.2750  DIHED      =     2052.4780
  VDWAALS =    -1544.4789  EEL     =   -13857.3791  EGB        =    -2026.3402
  1-4 VDW =      697.7785  1-4 EEL =     8382.6220  RESTRAINT  =        0.0000
  ESURF   =    10126.1041
 minimization completed, ENE= 0.60197022E+04 RMS= 0.192312E+02
 TRAJENE: Trajectory file ended
 TRAJENE: Trajene complete.

In the command line, type:

 grep VDWAALS gb.rescore.out.com > vdw.com.txt.
 grep ESURF gb.rescore.out.com > surf.com.txt.   

You can take these text files, import them into Excel, and do the rest of your modifications there.

Equations for analysis

Remember that to obtain the Gvdw term, you need to take the SASA (which is ESURF) and input into equation 1:

ΔGnonpolar = SASA*0.00542 + 0.92

Also, the mmgbsa of a given system can be determined by equation 2:

ΔGmmgbsa = ΔGvdw + ΔGcoul + ΔGpolar + ΔGnonpolar

From the output file:

VDWAALS = ΔGvdw

EELS = ΔGcoul

EGB = ΔGpolar

You can then easily calculate the ΔΔGbind by using equation 3:

ΔΔGbind = ΔGmmgbsa,complex – (ΔGmmgbsa,lig + ΔGmmgbsa,rec) You will want to careful when doing your analysis that the results from frame 1 for the receptor and ligand are subtracted from the results from frame 1 for your complex. By doing this in excel, you should have 2000 frames for each, and the values should cleanly line up. Finally, you will want to plot your ΔΔGbind and examine if you see corresponding changes in the ligand position and the ΔΔGbind. Also, you should determine the mean and standard deviation for your ΔΔGbind.

Plotting Energy

When your rescoring calculation finishes, you have the following three output files: "gb.rescore.out.com", "gb.rescore.out.lig", and "gb.rescore.out.rec".

Use the following script, entitled get.mmgbsa.bash, to extract data and calculate MMGBSA energy for each snap shot.

 #! /bin/bash
 # by Haoquan
 echo com lig rec > namelist
 LIST=`cat namelist`
 for i in $LIST ; do
 grep VDWAALS gb.rescore.out.$i | awk '{print $3}' > $i.vdw
 grep EGB     gb.rescore.out.$i | awk '{print $9}' > $i.polar
 grep EELS    gb.rescore.out.$i | awk '{print $6}' > $i.coul
 grep ESURF   gb.rescore.out.$i | awk '{print $3 * 0.00542 + 0.92}' > $i.surf
 paste -d " " $i.vdw $i.polar $i.surf $i.coul | awk '{print $1 + $2 + $3 + $4}' > data.$i
 rm $i.*
 done
 paste -d " " data.com data.lig data.rec | awk '{print $1 - $2 - $3}' > data.all 
 for ((j=1; j<=`wc -l data.all | awk '{print $1}'`; j+=1)) do
 echo $j , >> time
 done
 paste -d " " time data.all > MMGBSA_vs_time.dat  
 rm namelist time data.*

To run this script do:

 bash get.mmgbsa.sh

This will create a text file called MMGBSA_vs_time.dat with x and y values separated by a space and comma. These values can be imported to Excel or Origin or to XMGRACE if you are using Linux:

 xmgrace MMGBSA_vs_time.dat

2013 AMBER MMGBSA plot.jpg MMGBSA.png

V. Frequently Encountered Problems

Mike

Ivan

Lu

Yan

Yao

Junjie

Tianao

Kai

Arkin

Jess

Mosavverul