Difference between revisions of "2015 AMBER tutorial with PARP"
Stonybrook (talk | contribs) (→AMBER) |
(→Antechamber, Parmchk, tLeap) |
||
(42 intermediate revisions by one other user not shown) | |||
Line 24: | Line 24: | ||
===PARP=== | ===PARP=== | ||
− | Poly (ADP-ribose) polymerase (PARP) is a family of proteins involved in a number of cellular processes involving mainly DNA repair and programmed cell death. PARP is composed of four domains of interest: a DNA-binding domain, a caspase-cleaved domain (see below), an auto-modification domain, and a catalytic domain. The DNA-binding domain is composed of two zinc finger motifs. The PARP family comprises 17 members (10 putative). They have all very different structures and functions in the cell.We're going to use PARP5b in this tutorial. | + | Poly (ADP-ribose) polymerase (PARP) is a family of proteins involved in a number of cellular processes involving mainly DNA repair and programmed cell death. PARP is composed of four domains of interest: a DNA-binding domain, a caspase-cleaved domain (see below), an auto-modification domain, and a catalytic domain. The DNA-binding domain is composed of two zinc finger motifs. The PARP family comprises 17 members (10 putative). They have all very different structures and functions in the cell. We're going to use PARP5b in this tutorial. |
===Organizing Directories=== | ===Organizing Directories=== | ||
Line 30: | Line 30: | ||
It makes things easier to organize your files in a clean and logical way. The following directory structure and naming scheme is a convenient way to organize your files. We could make these directories first before doing anything further. | It makes things easier to organize your files in a clean and logical way. The following directory structure and naming scheme is a convenient way to organize your files. We could make these directories first before doing anything further. | ||
− | ~username/AMS536/ | + | ~username/AMS536/Amber-Tutorial/001.system.prep/ |
002.tleap/ | 002.tleap/ | ||
003.pmemd/ | 003.pmemd/ | ||
Line 40: | Line 40: | ||
====Antechamber, Parmchk, tLeap==== | ====Antechamber, Parmchk, tLeap==== | ||
+ | |||
+ | Before beginning the Molecular Dynamics protocol using AMBER, you must first set up your files. | ||
+ | In your 001.chimera folder, you will add the following files: | ||
+ | |||
+ | 4TKG.lig.mol2 | ||
+ | 4TKG.rec.noH.pdb | ||
+ | 4TKG.rec.noH.ZIN.pdb | ||
+ | |||
+ | To prepare the first two files, please see the 2025 DOCK tutorial at the following link: [http://ringo.ams.sunysb.edu/index.php/2015_DOCK_tutorial_with_Poly(ADP-ribose)_polymerase_(PARP)] | ||
+ | |||
+ | Note: delete any headers before the atoms/helix information. | ||
+ | |||
+ | In order to prepare 4TKG.rec.noH.ZIN.pdb, first open 4TKG.rec.noH.pdb and use the command "SHIFT + g" to reach the bottom of the pdb. The last lines of your files should look like this: | ||
+ | |||
+ | ATOM 1648 OE1 GLU B 205 23.839 -23.190 57.747 1.00 0.00 O | ||
+ | TER 1649 GLU B 205 | ||
+ | HETATM 1650 ZN ZN B 206 28.130 -3.467 55.482 1.00 0.00 Zn | ||
+ | END | ||
+ | |||
+ | To make 4TKG.rec.noH.ZIN.pdb, you will need to change the "ZN" atom ID to "ZIN" so that AMBER can read the atom type. | ||
+ | |||
+ | ATOM 1648 OE1 GLU B 205 23.839 -23.190 57.747 1.00 0.00 O | ||
+ | TER 1649 GLU B 205 | ||
+ | ATOM 1650 ZIN ZIN B 206 28.130 -3.467 55.482 1.00 0.00 Zn | ||
+ | END | ||
+ | |||
+ | |||
+ | To run a tleap, first generate a 002.tleap file, with the command: | ||
+ | mkdir 002.tleap | ||
+ | |||
+ | cp ~/../../tleap.lig.in | ||
+ | |||
+ | *tleap.lig.in | ||
+ | source leaprc.ff14SB #Load a force field | ||
+ | source leaprc.gaff | ||
+ | LIG = loadmol2 4TKG.lig.mol2 #Conformation file | ||
+ | saveamberparm LIG 4TKG.lig.gas.leap.prm7 4TKG.lig.gas.leap.rst7 #Save the ligand gas phase AMBER topology and coordinate file | ||
+ | solvatebox LIG TIP3PBOX 10.0 #Solvate the lig using TIP3P, solvent box radii 10 angstroms | ||
+ | saveamberparm LIG 4TKG.lig.wat.leap.prm7 4TKG.lig.wat.leap.rst7 #Save the ligand water phase AMBER topology and coordinate files | ||
+ | quit | ||
+ | |||
+ | tleap –s –f tleap.lig.in > tleap.lig.out | ||
+ | The atom types which used in '4TKG.lig.mol2' file are not recognized. | ||
+ | [[Image:Screen Shot 2015-05-21 at 13.54.08.png|thumb|center|600px]] | ||
+ | So the name of the atoms has to be redefined based on the gaff force field, '''antechamber''' can be used to solve the error. | ||
+ | antechamber –i 4TKG.lig.mol2 –fi mol2 –o 4TKG.lig.gaff.mol2 –fo mol2 | ||
+ | Change the content of tleap.lig.in to | ||
+ | (gaff file) | ||
+ | |||
+ | Re-run tleap by: | ||
+ | tleap –s –f tleap.lig.in > tleap.lig.out | ||
+ | [[Image:error2.png|thumb|center|600px]] | ||
+ | |||
+ | So now run parmchk to fix the missing parameters | ||
+ | parmchk –i 4TKG.lig.gaff.mol2 –f mol2 –o 4TKG.lig.ante.frcmod | ||
+ | Change the content of tleap.lig.in to | ||
+ | (load amber params) | ||
+ | run tleap.lig.in, and all the errors are fixed. | ||
+ | [[Image:error1&error2 fixed.png|thumb|center|600px]] | ||
+ | cp ~/../../tleap.rec.in | ||
+ | *tleap.rec.in | ||
+ | source leaprc.ff14SB #Load a force field | ||
+ | REC = loadpdb ../001.chimera/4TKG.rec.noH.ZIN.pdb #Conformation file for receptor | ||
+ | saveamberparm REC 4TKG.rec.gas.leap.prm7 4TKG.rec.gas.leap.rst7 #Save the receptor gas phase AMBER topology and coordinate file | ||
+ | solvateBox REC TIP3PBOX 10.0 #Solvate the lig using TIP3P, solvent box radii 10 angstroms | ||
+ | saveamberparm REC 4TKG.rec.wat.leap.prm7 4TKG.rec.wat.leap.rst7 #Save the receptor water phase AMBER topology and coordinate file | ||
+ | quit | ||
+ | |||
+ | tleap –s –f tleap.rec.in > tleap.rec.out | ||
+ | [[Image:rec.png|thumb|center|600px]] | ||
+ | |||
+ | For the missing parameters, cp the two ionparam from ../000.amberfiles/ | ||
+ | cp [https://ringo.ams.stonybrook.edu/export/home/lprentis/ions.lib ions.lib] | ||
+ | cp ../000.amberfiles/ions.frcmod | ||
+ | Change the content of the tleap.rec.in to | ||
+ | (load param) | ||
+ | [[Image:rec_ion.png|thumb|center|600px]] | ||
+ | run tleap.rec.in, and all the errors are fixed. | ||
+ | (explain) | ||
+ | |||
+ | cp ~/../../tleap.com.in | ||
+ | tleap –s –f tleap.com.in > tleap.com.out | ||
+ | |||
+ | |||
+ | This whole process can be run on a cluster by using the following script: | ||
+ | [[Image:script.png|thumb|center|600px]] | ||
==III. Simulation using pmemd== | ==III. Simulation using pmemd== | ||
+ | |||
+ | '''PMEMD''' | ||
+ | |||
+ | In this part of the MD simulation, you begin with the starting coordinates for the system (typically the protein and the small molecule), and you perform a series of minimizations and short MD simulation runs to equilibrate the system. This is to remove any steric clashes and to allow the system to settle down into a natural state. Otherwise, any large steric clashes from the starting coordinates (which could be from the protein data bank or from a dock calculation) can cause the system to behave unnaturally or even come apart. Once the system is properly equilibrated, then you can begin an MD production run in which you record the trajectory. | ||
+ | |||
+ | There is no specific way in which equilibrations should be performed, but this is an example from the class: | ||
+ | |||
+ | 1) The first input file is called 01mi.in and contains the following lines of code. | ||
+ | |||
+ | 01mi.in: equilibration | ||
+ | &cntrl | ||
+ | imin = 1, maxcyc = 1000, ntmin = 2, | ||
+ | ntx = 1, ntc = 1, ntf = 1, | ||
+ | ntb = 1, ntp = 0, | ||
+ | ntwx = 1000, ntwe = 0, ntpr = 1000, | ||
+ | cut = 8.0, | ||
+ | ntr = 1, | ||
+ | restraintmask = ':1-207 & !@H=', | ||
+ | restraint_wt=5.0, | ||
+ | / | ||
+ | |||
+ | This describes a minimization in which we put a large 5.0 kcal/molA^2 on all the heavy atoms except the hydrogens. Thus, the hydrogens are allowed to relax and we remove any steric clashes involving hydrogens. | ||
+ | |||
+ | 2) The second input file is called 02md.in and contains the following lines of code. | ||
+ | |||
+ | 02md.in: equilibration (1,000,000 = 1ns) | ||
+ | &cntrl | ||
+ | imin = 0, ntx = 1, irest = 0, nstlim = 50000, | ||
+ | temp0 = 298.15, tempi = 298.15, ig = 71287, | ||
+ | ntc = 2, ntf = 1, ntt = 1, dt = 0.001, | ||
+ | ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0, | ||
+ | ntwx = 1000, ntwe = 0, ntwr = 1000, ntpr = 1000, | ||
+ | cut = 8.0, iwrap = 1, | ||
+ | ntr = 1, nscm = 100, | ||
+ | restraintmask = ':1-207 & !@H=', restraint_wt = 5.0, | ||
+ | / | ||
+ | |||
+ | This describes an 'equilibration' or short MD simulation that is intended to bring the system into a more natural (low energy) state. Here, the same restraint is applied on all atoms except the hydrogen atoms, but this time we do an MD simulation, not a minimization. | ||
+ | |||
+ | For the next few steps, we perform 3 minimizations using exactly the same code as in 01mi.in, but we slowly decrease the restraint_wt from 5.0 to 2.0, to 1.0, to 0.05 kcal/molA^2. | ||
+ | |||
+ | After those 3 minimizations, we do 2 short MD simulations. At a high-level, they are as follows: | ||
+ | |||
+ | 1) Perform a short MD simulation (almost the same code as in 02md.in) with the restraint_wt = 1.0 | ||
+ | |||
+ | 2) Perform another short MD simulation with same code as in 02md.in, but with restraintmask = ':1-206@ZIN, CA, C, N', restraint_wt = 0.1. This removes the restraint from the ligand and allows the ligand to equilibrate. Note: the protein residues are numbered 1-205, the zinc ion is residue 206, and the ligand is residue 207. | ||
+ | |||
+ | The length of these simulations can be variable, and in the class step 1) was set for 50 picoseconds and step 2) was set for 100 picoseconds. You can change the length of these simulations as necessary by editing the nstlim variable in the code. | ||
+ | |||
+ | Finally, to do a production run, you would use a mimic of 02md.in (perhaps modified if necessary) with nstlim set to the right number for the right length of simulation. dt is the time step, so the product of nstlim times dt gives the length of the simulation in picoseconds. In 02md.in, a time step of 1 femtosecond was used, which corresponds to 0.001 picoseconds. | ||
+ | |||
+ | To submit this job on seawulf, you can use the following script: | ||
+ | |||
+ | Note: in this submission script, there are calls to mi.in and md.in files which were not explicity mentioned above, but only described at a high level. But these input files, and the submission script, are easily modifiable. You may want to use a different number of minimization/equilibration steps, in which case modifications to these files must be done. To do the modific<nowiki>Insert non-formatted text here</nowiki>ations, you would essentially concentrate on deleting/adding lines of code focusing on the -i XXmi.in part of the lines of code. | ||
+ | |||
+ | replace user_name with your own username. | ||
+ | |||
+ | |||
+ | #! /bin/tcsh | ||
+ | #PBS -l nodes=4:ppn=2 | ||
+ | #PBS -l walltime=720:00:00 | ||
+ | #PBS -o zzz.qsub.out | ||
+ | #PBS -e zzz.qsub.err | ||
+ | #PBS -V | ||
+ | #PBS -N pmemd | ||
+ | set workdir = "/nfs/user03/fochtman/amber_tutorial/003.pmemd" | ||
+ | cd ${workdir} | ||
+ | cat \$PBS_NODEFILE | sort | uniq | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 01mi.in -o 01mi.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c ../002.tleap/4TKG.com.wat.leap.rst7 -ref ../002.tleap/4TKG.com.wat.leap.rst7 \ | ||
+ | -x 01mi.trj -inf 01mi.info -r 01mi.rst7 | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 02md.in -o 02md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c 01mi.rst7 -ref 01mi.rst7 -x 02md.trj -inf 02md.info -r 02md.rst7 | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 03mi.in -o 03mi.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c 02md.rst7 -ref 02md.rst7 -x 03mi.trj -inf 03mi.info -r 03mi.rst7 | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 04mi.in -o 04mi.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c 03mi.rst7 -ref 03mi.rst7 -x 04mi.trj -inf 04mi.info -r 04mi.rst7 | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 05mi.in -o 05mi.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c 04mi.rst7 -ref 04mi.rst7 -x 05mi.trj -inf 05mi.info -r 05mi.rst7 | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 06md.in -o 06md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c 05mi.rst7 -ref 05mi.rst7 -x 06md.trj -inf 06md.info -r 06md.rst7 | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 07md.in -o 07md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c 06md.rst7 -ref 05mi.rst7 -x 07md.trj -inf 07md.info -r 07md.rst7 | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 08md.in -o 08md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c 07md.rst7 -ref 05mi.rst7 -x 08md.trj -inf 08md.info -r 08md.rst7 | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 09md.in -o 09md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c 08md.rst7 -ref 05mi.rst7 -x 09md.trj -inf 09md.info -r 09md.rst7 | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 10md.in -o 10md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c 09md.rst7 -ref 05mi.rst7 -x 10md.trj -inf 10md.info -r 10md.rst7 | ||
+ | mpirun -n 8 pmemd.MPI -O -O -i 11md.in -o 11md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ | ||
+ | -c 10md.rst7 -ref 05mi.rst7 -x 11md.trj -inf 11md.info -r 11md.rst7 | ||
+ | exit | ||
==IV. Simulation Analysis== | ==IV. Simulation Analysis== | ||
Line 47: | Line 225: | ||
===Ptraj=== | ===Ptraj=== | ||
+ | You should create another work directory for this step (004.ptraj, for example). | ||
+ | '''1.''' At first we want to concatenate the two 1ns trajectories together, stripping off the waters, and creating a .strip-file as an output. Below is the input file which will allow us to do so. | ||
+ | |||
+ | '''ptraj.1.in''' | ||
+ | trajin ../003.pmemd/10md.trj 1 1000 1 | ||
+ | trajin ../003.pmemd/11md.trj 1 1000 1 | ||
+ | trajout 4TKG.trj.strip nobox | ||
+ | strip :WAT | ||
+ | |||
+ | The two sets of numbers ''1 1000 1'' give the input information about which frames are used for the Ptraj. The first two numbers ''1'' and ''1000'' specify the starting and ending snapshots from the trajectory file. The ending number of the snapshot doesn't need to be accurate because if you actually don't have enough snapshots in your trajectory file, Ptraj will read up to the last one you have. The last number ''1'' specifies the frequency of the snapshot saved, in this case, we are saving every frame of the trajectory file. And the last line of the input file will take away all the water molecules. | ||
+ | |||
+ | As the input file is prepared we can launch the first analysis as follows: | ||
+ | ptraj ../002.tleap/4TKG.com.wat.leap.parm ptraj.1.in > ptraj.1.log | ||
+ | |||
+ | As the output you will obtain '''4TKG.trj.strip''' file which contains ''2000'' snapshots of the trajectory. | ||
+ | |||
+ | '''2.''' Later on we want to compare the output file just obtained to our reference file '''4TKG.com.gas.leap.rst7''', using the following input file. | ||
+ | |||
+ | '''ptraj.2.in''' | ||
+ | trajin 4TKG.trj.strip 1 2000 1 | ||
+ | trajout 4TKG.com.trj.stripfit | ||
+ | reference ../002.tleap/4TKG.com.gas.leap.rst7 | ||
+ | rms reference out 4TKG.rmsd.CA.txt :1-206@CA | ||
+ | |||
+ | Since we have just concatenated the two trajectories, we will have ''2000'' snapshots in '''4TKG.trj.strip'''. The third line in the input specifies the reference file, we have taken away all the water molecules during the first step, hence we are working here with the gas phase complex. The last line says we are calculating the rmsd for alpha carbon number ''1'' to ''206''. | ||
+ | |||
+ | And as we have this file filled out, we can run this step: | ||
+ | ptraj ../002.tleap/4TKG.com.gas.leap.parm ptraj.2.in > ptraj.2.log | ||
+ | |||
+ | The output file '''4TKG.rmsd.CA.txt''' wil contain two columns, the first one is the number of frame and the second one stands for rmsd value. | ||
+ | |||
+ | '''3.''' Afterwards we will generate a similar file for ligand, using the following input file. | ||
+ | |||
+ | '''ptraj.3.in''' | ||
+ | trajin 4TKG.com.trj.stripfit 1 2000 1 | ||
+ | reference ../002.tleap/4TKG.com.gas.leap.rst7 | ||
+ | rms reference out 4TKG.lig.rmsd.txt :207@C*,N*,O*,S* nofit | ||
+ | |||
+ | And then run this step: | ||
+ | ptraj ../002.tleap/4TKG.com.gas.leap.parm ptraj.3.in > ptraj.3.log | ||
+ | |||
+ | By doing this we will compare the trajectory file to '''4TKG.com.gas.leap.rst7''' as well, but working with the ligand instead of the receptor (we specified that by pointing that we want to calculate the rmsd for carbon, nitrogen, oxygen and sulfur in residue ''207''. | ||
+ | |||
+ | The last two steps are to obtain energetic information about the system. To do this we take a trajectory file of the gas phase complex '''4TKG.com.trj.stripfit''', and want to create two more trajectory files containing the information on only receptor and only ligand correspondingly. | ||
+ | |||
+ | '''4.''' At this step we consider receptor only. The input file is provided below: | ||
+ | |||
+ | '''ptraj.4.in''' | ||
+ | trajin 4TKG.com.trj.stripfit 1 2000 1 | ||
+ | trajout 4TKG.rec.trj.stripfit | ||
+ | strip :207 | ||
+ | |||
+ | And then run this step: | ||
+ | ptraj ../002.tleap/4TKG.com.gas.leap.parm ptraj.4.in > ptraj.4.log | ||
+ | |||
+ | '''5.''' And the same procedure for the ligand, with the following input file: | ||
+ | |||
+ | '''ptraj.5.in''' | ||
+ | trajin 4TKG.com.trj.stripfit 1 2000 1 | ||
+ | trajout 4TKG.lig.trj.stripfit | ||
+ | strip :1-206 | ||
+ | |||
+ | And then run this step: | ||
+ | ptraj ../002.tleap/4TKG.com.gas.leap.parm ptraj.5.in > ptraj.5.log | ||
+ | |||
+ | '''6.''' ''(optional)'' Visualization | ||
+ | |||
+ | As we've gone through all these steps, the analysis is done. If you want to visualize the trajectories, you first need to copy the trajectory files to Herbie like this, for example (being a level above '''004.PTRAJ''' directory): | ||
+ | scp 004.ptraj/ your_username@herbie.mathlab.sunysb.edu:~AMS536/amber_tutorial/004.ptraj | ||
+ | |||
+ | Now, launch VMD, then open one of the .prm7 files in 002.tleap. If you want to visualize the whole complex in gas state, you can open 4TKG.com.gas.leap.prm7 with AMBER7 Parm from 002.tleap and then 4TKG.com.trj.stripfit with AMBER coordinates from 004.ptraj. With these files, you can look at the real-time movement of the complex in the gas state. You can repeat this procedure to observe the real-time movement of the complex in the water state. Just open 4TKG.com.wat.leap.prm7 instead of 4TKG.com.gas.leap.prm7. | ||
===MM-GBSA Energy Calculation=== | ===MM-GBSA Energy Calculation=== | ||
+ | Molecular Mechanics-Generalized Born Surface Area (MM-GBSA) is a great method to calculate or estimate relative binding affinity of a ligand(s) to a receptor. The binding energy calculated from this method are also known as free energies of binding, where the more negative values indicate stronger binding. For this section, the topology files for the ligand, receptor and complex are needed. | ||
+ | |||
+ | Create a new directory: | ||
+ | mkdir 005.MMGBSA | ||
+ | Create an input file name | ||
+ | vim gb.rescore.in | ||
+ | Enter the following into the input file: | ||
+ | Single point GB energy calc | ||
+ | &cntrl | ||
+ | ntf = 1, ntb = 0, ntc = 2, | ||
+ | idecomp= 0, | ||
+ | igb = 5, saltcon= 0.00, | ||
+ | gbsa = 2, surften= 1.0, | ||
+ | offset = 0.09, extdiel= 78.5, | ||
+ | cut = 99999.0, nsnb = 99999, | ||
+ | imin = 5, maxcyc = 1, ncyc = 0, | ||
+ | / | ||
+ | Create a tcsh/bash/csh script (run.sander.rescore.csh) with the following information: | ||
+ | #! /bin/tcsh | ||
+ | #PBS -l nodes=1:ppn=1 | ||
+ | #PBS -l walltime=48:00:00 | ||
+ | #PBS -o zzz.qsub.out | ||
+ | #PBS -e zzz.qsub.err | ||
+ | #PBS -V | ||
+ | #PBS -N mmgbsa | ||
+ | |||
+ | set workdir = /nfs/user03/kbelfon/amber_tutorial/005.mmgbsa | ||
+ | |||
+ | cd $workdir | ||
+ | |||
+ | sander -O -i gb.rescore.in \ | ||
+ | -o gb.rescore.out.com \ | ||
+ | -p ../002.tleap/4TKG.com.gas.leap.prm7 \ | ||
+ | -c ../002.tleap/4TKG.com.gas.leap.rst7 \ | ||
+ | -y ../004.ptraj/4TKG.com.trj.stripfit \ | ||
+ | -r restrt.com \ | ||
+ | -ref ../002.tleap/4TKG.com.gas.leap.rst7 \ | ||
+ | -x mdcrd.com \ | ||
+ | -inf mdinfo.com | ||
+ | |||
+ | sander -O -i gb.rescore.in \ | ||
+ | -o gb.rescore.out.lig \ | ||
+ | -p ../002.tleap/4TKG.lig.gas.leap.prm7 \ | ||
+ | -c ../002.tleap/4TKG.lig.gas.leap.rst7 \ | ||
+ | -y ../004.ptraj/4TKG.lig.trj.stripfit \ | ||
+ | -r restrt.lig \ | ||
+ | -ref ../002.tleap/4TKG.lig.gas.leap.rst7 \ | ||
+ | -x mdcrd.lig \ | ||
+ | -inf mdinfo.lig | ||
+ | |||
+ | sander -O -i gb.rescore.in \ | ||
+ | -o gb.rescore.out.test.rec \ | ||
+ | -p ../002.tleap/4TKG.rec.gas.leap.prm7 \ | ||
+ | -c ../002.tleap/4TKG.rec.gas.leap.rst7 \ | ||
+ | -y ../004.ptraj/4TKG.rec.trj.stripfit \ | ||
+ | -r restrt.rec \ | ||
+ | -ref ../002.tleap/4TKG.rec.gas.leap.rst7 \ | ||
+ | -x mdcrd.rec \ | ||
+ | -inf mdinfo.rec | ||
+ | |||
+ | exit | ||
+ | |||
+ | Execute this script on the seawulf cluster or machine(s) of your preference | ||
+ | qsub run.sander.rescore.csh | ||
+ | Three output files will be generated once the job is completed: | ||
+ | '''gb.rescore.out.com''', '''gb.rescore.out.lig''', and '''gb.rescore.out.rec''' | ||
+ | These files represent the single point energy calculation results for the complex (.com), the ligand (.lig) and the receptor (.rec). The energy will be output by the program Sander for each frame specified in the input file. The final results for one frame in one of the three files should look as the following: | ||
+ | |||
+ | FINAL RESULTS | ||
+ | |||
+ | |||
+ | |||
+ | NSTEP ENERGY RMS GMAX NAME NUMBER | ||
+ | 1 5.9132E+03 2.0005E+01 1.2640E+02 C 159 | ||
+ | |||
+ | BOND = 661.8980 ANGLE = 1751.7992 DIHED = 2581.7692 | ||
+ | VDWAALS = -1696.6585 EEL = -13958.9335 EGB = -3125.9524 | ||
+ | 1-4 VDW = 747.0185 1-4 EEL = 7750.8118 RESTRAINT = 0.0000 | ||
+ | ESURF = 11201.4791 | ||
+ | minimization completed, ENE= 0.59132314E+04 RMS= 0.200047E+02 | ||
+ | '''Extracting Data from MM-GBSA calculation and calculating Free energy of Binding''' | ||
+ | From the output files above: | ||
+ | |||
+ | VDWAALS = ΔGvdw | ||
+ | |||
+ | EELS = ΔGcoul | ||
+ | |||
+ | EGB = ΔGpolar | ||
+ | |||
+ | SASA = ESURF | ||
+ | |||
+ | With this information ΔGnonpolar can be solved using equation(1): | ||
+ | |||
+ | ΔGnonpolar = SASA*0.00542 + 0.92 (1) | ||
+ | |||
+ | Once ΔGnonpolar is solved then ΔGmmgbsa can be determined by equation(2): | ||
+ | |||
+ | ΔGmmgbsa = ΔGvdw + ΔGcoul + ΔGpolar + ΔGnonpolar (2) | ||
+ | |||
+ | Solve equation 2 and 3 using the extracted information from all three output files. So therefore you should have ΔGmmgbsa for the complex, ligand and receptor | ||
+ | |||
+ | Finally ΔΔGbind can be calculated using equation (3): | ||
+ | |||
+ | ΔΔGbind = ΔGmmgbsa,complex – (ΔGmmgbsa,lig + ΔGmmgbsa,rec) (3) | ||
+ | |||
+ | Plot your ΔΔGbind and examine the plot for changes in the ligand position and the ΔΔGbind. Also, you should calculate the mean and standard deviation for your ΔΔGbind. | ||
+ | |||
+ | The following script (get.mmgbsa.sh) can be used to extract the energy from the three output files obtained above and calculate ΔΔGbind: | ||
+ | |||
+ | #! /bin/bash | ||
+ | # by Haoquan | ||
+ | echo com lig rec > namelist | ||
+ | LIST=`cat namelist` | ||
+ | for i in $LIST ; do | ||
+ | grep VDWAALS gb.rescore.out.$i | awk '{print $3}' > $i.vdw | ||
+ | grep EGB gb.rescore.out.$i | awk '{print $9}' > $i.polar | ||
+ | grep EELS gb.rescore.out.$i | awk '{print $6}' > $i.coul | ||
+ | grep ESURF gb.rescore.out.$i | awk '{print $3 * 0.00542 + 0.92}' > $i.surf | ||
+ | paste -d " " $i.vdw $i.polar $i.surf $i.coul | awk '{print $1 + $2 + $3 + $4}' > data.$i | ||
+ | rm $i.* | ||
+ | done | ||
+ | paste -d " " data.com data.lig data.rec | awk '{print $1 - $2 - $3}' > data.all | ||
+ | for ((j=1; j<=`wc -l data.all | awk '{print $1}'`; j+=1)) do | ||
+ | echo $j , >> time | ||
+ | done | ||
+ | paste -d " " time data.all > MMGBSA_vs_time.dat | ||
+ | rm namelist time data.* | ||
+ | |||
+ | Execute this script: | ||
+ | bash get.mmgbsa.sh | ||
+ | |||
+ | A text file called MMGBSA_vs_time.dat with x and y values separated by a space and comma should be created. Use XMGRACE to plot this dat file using the following command in Linux: | ||
+ | |||
+ | xmgrace MMGBSA_vs_time.dat | ||
==V. Frequently Encountered Problems== | ==V. Frequently Encountered Problems== |
Latest revision as of 11:53, 26 June 2019
For additional Rizzo Lab tutorials see AMBER Tutorials.
In this tutorial, we will learn how to run a molecular dynamics simulation of a protein-ligand complex. We will then post-process that simulation by calculating structural fluctuations (with RMSD) and free energies of binding (MM-GBSA).
Contents
I. Introduction
AMBER
Amber - Assisted Model Building with Energy Refinement - is a multi-program suite for macromolecular simulations. Amber14 is the most recent version of the software and it includes new force fields such as ff14SB. In addition, in this release, more features from sander have been added to pmemd for both CPU and GPU platforms, including performance improvements, and support for extra points, multi-dimension replica exchange, a Monte Carlo barostat, ScaledMD, Jarzynski sampling, explicit solvent constant pH, GBSA, and hydrogen mass repartitioning. Support is also included for the latest Kepler, Titan and GTX7xx GPUs expanded options for Poisson-Boltzmann solvation calculations, accelerated molecular dynamics, additional features in sander pmemd code, and expanded methods for free energy calculations. Our lab is set up with Amber14 and the latest update of AmberTools14 which contains the programs such as antechamber and tleap to set up your simulation.
The Amber 14 Manual is available to get started with using Amber14. You can search the document for keywords such as "tleap" if you use Adobe Acrobat to view the file. Additionally, AmberTools Reference Manual is another reference for the programs available under Amber tools.
Here below are some of the programs available in both Amber and AmberTools:
- LEaP: LEaP is an X-windows-based program that provides for basic model building and Amber coordinate and parameter/topology input file creation. It includes a molecular editor which allows for building residues and manipulating molecules.
- ANTECHAMBER: This program suite automates the process of developing force field descriptors for most organic molecules. It starts with structures (usually in PDB format), and generates files that can be read into LEaP for use in molecular modeling. The force field description that is generated is designed to be compatible with the usual Amber force fields for proteins and nucleic acids.
- SANDER: Sander is short for Simulated annealing with NMR-derived energy restraints. This allows for NMR refinement based on NOE-derived distance restraints, torsion angle restraints, and penalty functions based on chemical shifts and NOESY volumes. Sander is also the "main" program used for molecular dynamics simulations, and is also used for replica-exchange, thermodynamic integration, and potential of mean force (PMF) calculations. Sander also includes QM/MM capability.
- PMEMD: This is an extensively-modified version (originally by Bob Duke) of the sander program, optimized for periodic, PME simulations, and for GB simulations. It is faster than sander and scales better on parallel machines.
- PTRAJ and CPPTRAJ: These are used to analyze MD trajectories, computing a variety of things, like RMS deviation from a reference structure, hydrogen bonding analysis, time-correlation functions, diffusional behavior, and so on.
- MM_PBSA and MM_PBSA.py: These are scripts that automate post-processing of MD trajectories, to analyze energetics using continuum solvent ideas. It can be used to break energies energies into "pieces" arising from different residues, and to estimate free energy differences between conformational basins.
There is also a mailing list available as an additional resource. What you can do with it is: you document your questions and sent to this mail address, some specialists of Amber will be assigned to reply your email and help you.
PARP
Poly (ADP-ribose) polymerase (PARP) is a family of proteins involved in a number of cellular processes involving mainly DNA repair and programmed cell death. PARP is composed of four domains of interest: a DNA-binding domain, a caspase-cleaved domain (see below), an auto-modification domain, and a catalytic domain. The DNA-binding domain is composed of two zinc finger motifs. The PARP family comprises 17 members (10 putative). They have all very different structures and functions in the cell. We're going to use PARP5b in this tutorial.
Organizing Directories
It makes things easier to organize your files in a clean and logical way. The following directory structure and naming scheme is a convenient way to organize your files. We could make these directories first before doing anything further.
~username/AMS536/Amber-Tutorial/001.system.prep/ 002.tleap/ 003.pmemd/ 004.ptraj/ 005.mmgbsa/
II. Structural Preparation
Antechamber, Parmchk, tLeap
Before beginning the Molecular Dynamics protocol using AMBER, you must first set up your files. In your 001.chimera folder, you will add the following files:
4TKG.lig.mol2 4TKG.rec.noH.pdb 4TKG.rec.noH.ZIN.pdb
To prepare the first two files, please see the 2025 DOCK tutorial at the following link: [1]
Note: delete any headers before the atoms/helix information.
In order to prepare 4TKG.rec.noH.ZIN.pdb, first open 4TKG.rec.noH.pdb and use the command "SHIFT + g" to reach the bottom of the pdb. The last lines of your files should look like this:
ATOM 1648 OE1 GLU B 205 23.839 -23.190 57.747 1.00 0.00 O TER 1649 GLU B 205 HETATM 1650 ZN ZN B 206 28.130 -3.467 55.482 1.00 0.00 Zn END
To make 4TKG.rec.noH.ZIN.pdb, you will need to change the "ZN" atom ID to "ZIN" so that AMBER can read the atom type.
ATOM 1648 OE1 GLU B 205 23.839 -23.190 57.747 1.00 0.00 O TER 1649 GLU B 205 ATOM 1650 ZIN ZIN B 206 28.130 -3.467 55.482 1.00 0.00 Zn END
To run a tleap, first generate a 002.tleap file, with the command:
mkdir 002.tleap
cp ~/../../tleap.lig.in
- tleap.lig.in
source leaprc.ff14SB #Load a force field source leaprc.gaff LIG = loadmol2 4TKG.lig.mol2 #Conformation file saveamberparm LIG 4TKG.lig.gas.leap.prm7 4TKG.lig.gas.leap.rst7 #Save the ligand gas phase AMBER topology and coordinate file solvatebox LIG TIP3PBOX 10.0 #Solvate the lig using TIP3P, solvent box radii 10 angstroms saveamberparm LIG 4TKG.lig.wat.leap.prm7 4TKG.lig.wat.leap.rst7 #Save the ligand water phase AMBER topology and coordinate files quit
tleap –s –f tleap.lig.in > tleap.lig.out
The atom types which used in '4TKG.lig.mol2' file are not recognized.
So the name of the atoms has to be redefined based on the gaff force field, antechamber can be used to solve the error.
antechamber –i 4TKG.lig.mol2 –fi mol2 –o 4TKG.lig.gaff.mol2 –fo mol2
Change the content of tleap.lig.in to (gaff file)
Re-run tleap by:
tleap –s –f tleap.lig.in > tleap.lig.out
So now run parmchk to fix the missing parameters
parmchk –i 4TKG.lig.gaff.mol2 –f mol2 –o 4TKG.lig.ante.frcmod
Change the content of tleap.lig.in to (load amber params) run tleap.lig.in, and all the errors are fixed.
cp ~/../../tleap.rec.in
- tleap.rec.in
source leaprc.ff14SB #Load a force field REC = loadpdb ../001.chimera/4TKG.rec.noH.ZIN.pdb #Conformation file for receptor saveamberparm REC 4TKG.rec.gas.leap.prm7 4TKG.rec.gas.leap.rst7 #Save the receptor gas phase AMBER topology and coordinate file solvateBox REC TIP3PBOX 10.0 #Solvate the lig using TIP3P, solvent box radii 10 angstroms saveamberparm REC 4TKG.rec.wat.leap.prm7 4TKG.rec.wat.leap.rst7 #Save the receptor water phase AMBER topology and coordinate file quit
tleap –s –f tleap.rec.in > tleap.rec.out
For the missing parameters, cp the two ionparam from ../000.amberfiles/
cp ions.lib cp ../000.amberfiles/ions.frcmod
Change the content of the tleap.rec.in to (load param)
run tleap.rec.in, and all the errors are fixed. (explain)
cp ~/../../tleap.com.in tleap –s –f tleap.com.in > tleap.com.out
This whole process can be run on a cluster by using the following script:
III. Simulation using pmemd
PMEMD
In this part of the MD simulation, you begin with the starting coordinates for the system (typically the protein and the small molecule), and you perform a series of minimizations and short MD simulation runs to equilibrate the system. This is to remove any steric clashes and to allow the system to settle down into a natural state. Otherwise, any large steric clashes from the starting coordinates (which could be from the protein data bank or from a dock calculation) can cause the system to behave unnaturally or even come apart. Once the system is properly equilibrated, then you can begin an MD production run in which you record the trajectory.
There is no specific way in which equilibrations should be performed, but this is an example from the class:
1) The first input file is called 01mi.in and contains the following lines of code.
01mi.in: equilibration &cntrl
imin = 1, maxcyc = 1000, ntmin = 2, ntx = 1, ntc = 1, ntf = 1, ntb = 1, ntp = 0, ntwx = 1000, ntwe = 0, ntpr = 1000, cut = 8.0, ntr = 1, restraintmask = ':1-207 & !@H=', restraint_wt=5.0,
/
This describes a minimization in which we put a large 5.0 kcal/molA^2 on all the heavy atoms except the hydrogens. Thus, the hydrogens are allowed to relax and we remove any steric clashes involving hydrogens.
2) The second input file is called 02md.in and contains the following lines of code.
02md.in: equilibration (1,000,000 = 1ns)
&cntrl imin = 0, ntx = 1, irest = 0, nstlim = 50000, temp0 = 298.15, tempi = 298.15, ig = 71287, ntc = 2, ntf = 1, ntt = 1, dt = 0.001, ntb = 2, ntp = 1, tautp = 1.0, taup = 1.0, ntwx = 1000, ntwe = 0, ntwr = 1000, ntpr = 1000, cut = 8.0, iwrap = 1, ntr = 1, nscm = 100, restraintmask = ':1-207 & !@H=', restraint_wt = 5.0, /
This describes an 'equilibration' or short MD simulation that is intended to bring the system into a more natural (low energy) state. Here, the same restraint is applied on all atoms except the hydrogen atoms, but this time we do an MD simulation, not a minimization.
For the next few steps, we perform 3 minimizations using exactly the same code as in 01mi.in, but we slowly decrease the restraint_wt from 5.0 to 2.0, to 1.0, to 0.05 kcal/molA^2.
After those 3 minimizations, we do 2 short MD simulations. At a high-level, they are as follows:
1) Perform a short MD simulation (almost the same code as in 02md.in) with the restraint_wt = 1.0
2) Perform another short MD simulation with same code as in 02md.in, but with restraintmask = ':1-206@ZIN, CA, C, N', restraint_wt = 0.1. This removes the restraint from the ligand and allows the ligand to equilibrate. Note: the protein residues are numbered 1-205, the zinc ion is residue 206, and the ligand is residue 207.
The length of these simulations can be variable, and in the class step 1) was set for 50 picoseconds and step 2) was set for 100 picoseconds. You can change the length of these simulations as necessary by editing the nstlim variable in the code.
Finally, to do a production run, you would use a mimic of 02md.in (perhaps modified if necessary) with nstlim set to the right number for the right length of simulation. dt is the time step, so the product of nstlim times dt gives the length of the simulation in picoseconds. In 02md.in, a time step of 1 femtosecond was used, which corresponds to 0.001 picoseconds.
To submit this job on seawulf, you can use the following script:
Note: in this submission script, there are calls to mi.in and md.in files which were not explicity mentioned above, but only described at a high level. But these input files, and the submission script, are easily modifiable. You may want to use a different number of minimization/equilibration steps, in which case modifications to these files must be done. To do the modificInsert non-formatted text hereations, you would essentially concentrate on deleting/adding lines of code focusing on the -i XXmi.in part of the lines of code.
replace user_name with your own username.
#! /bin/tcsh #PBS -l nodes=4:ppn=2 #PBS -l walltime=720:00:00 #PBS -o zzz.qsub.out #PBS -e zzz.qsub.err #PBS -V #PBS -N pmemd set workdir = "/nfs/user03/fochtman/amber_tutorial/003.pmemd" cd ${workdir} cat \$PBS_NODEFILE | sort | uniq mpirun -n 8 pmemd.MPI -O -O -i 01mi.in -o 01mi.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c ../002.tleap/4TKG.com.wat.leap.rst7 -ref ../002.tleap/4TKG.com.wat.leap.rst7 \ -x 01mi.trj -inf 01mi.info -r 01mi.rst7 mpirun -n 8 pmemd.MPI -O -O -i 02md.in -o 02md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c 01mi.rst7 -ref 01mi.rst7 -x 02md.trj -inf 02md.info -r 02md.rst7 mpirun -n 8 pmemd.MPI -O -O -i 03mi.in -o 03mi.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c 02md.rst7 -ref 02md.rst7 -x 03mi.trj -inf 03mi.info -r 03mi.rst7 mpirun -n 8 pmemd.MPI -O -O -i 04mi.in -o 04mi.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c 03mi.rst7 -ref 03mi.rst7 -x 04mi.trj -inf 04mi.info -r 04mi.rst7 mpirun -n 8 pmemd.MPI -O -O -i 05mi.in -o 05mi.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c 04mi.rst7 -ref 04mi.rst7 -x 05mi.trj -inf 05mi.info -r 05mi.rst7 mpirun -n 8 pmemd.MPI -O -O -i 06md.in -o 06md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c 05mi.rst7 -ref 05mi.rst7 -x 06md.trj -inf 06md.info -r 06md.rst7 mpirun -n 8 pmemd.MPI -O -O -i 07md.in -o 07md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c 06md.rst7 -ref 05mi.rst7 -x 07md.trj -inf 07md.info -r 07md.rst7 mpirun -n 8 pmemd.MPI -O -O -i 08md.in -o 08md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c 07md.rst7 -ref 05mi.rst7 -x 08md.trj -inf 08md.info -r 08md.rst7 mpirun -n 8 pmemd.MPI -O -O -i 09md.in -o 09md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c 08md.rst7 -ref 05mi.rst7 -x 09md.trj -inf 09md.info -r 09md.rst7 mpirun -n 8 pmemd.MPI -O -O -i 10md.in -o 10md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c 09md.rst7 -ref 05mi.rst7 -x 10md.trj -inf 10md.info -r 10md.rst7 mpirun -n 8 pmemd.MPI -O -O -i 11md.in -o 11md.out -p ../002.tleap/4TKG.com.wat.leap.prm7 \ -c 10md.rst7 -ref 05mi.rst7 -x 11md.trj -inf 11md.info -r 11md.rst7 exit
IV. Simulation Analysis
Ptraj
You should create another work directory for this step (004.ptraj, for example). 1. At first we want to concatenate the two 1ns trajectories together, stripping off the waters, and creating a .strip-file as an output. Below is the input file which will allow us to do so.
ptraj.1.in
trajin ../003.pmemd/10md.trj 1 1000 1 trajin ../003.pmemd/11md.trj 1 1000 1 trajout 4TKG.trj.strip nobox strip :WAT
The two sets of numbers 1 1000 1 give the input information about which frames are used for the Ptraj. The first two numbers 1 and 1000 specify the starting and ending snapshots from the trajectory file. The ending number of the snapshot doesn't need to be accurate because if you actually don't have enough snapshots in your trajectory file, Ptraj will read up to the last one you have. The last number 1 specifies the frequency of the snapshot saved, in this case, we are saving every frame of the trajectory file. And the last line of the input file will take away all the water molecules.
As the input file is prepared we can launch the first analysis as follows:
ptraj ../002.tleap/4TKG.com.wat.leap.parm ptraj.1.in > ptraj.1.log
As the output you will obtain 4TKG.trj.strip file which contains 2000 snapshots of the trajectory.
2. Later on we want to compare the output file just obtained to our reference file 4TKG.com.gas.leap.rst7, using the following input file.
ptraj.2.in
trajin 4TKG.trj.strip 1 2000 1 trajout 4TKG.com.trj.stripfit reference ../002.tleap/4TKG.com.gas.leap.rst7 rms reference out 4TKG.rmsd.CA.txt :1-206@CA
Since we have just concatenated the two trajectories, we will have 2000 snapshots in 4TKG.trj.strip. The third line in the input specifies the reference file, we have taken away all the water molecules during the first step, hence we are working here with the gas phase complex. The last line says we are calculating the rmsd for alpha carbon number 1 to 206.
And as we have this file filled out, we can run this step:
ptraj ../002.tleap/4TKG.com.gas.leap.parm ptraj.2.in > ptraj.2.log
The output file 4TKG.rmsd.CA.txt wil contain two columns, the first one is the number of frame and the second one stands for rmsd value.
3. Afterwards we will generate a similar file for ligand, using the following input file.
ptraj.3.in
trajin 4TKG.com.trj.stripfit 1 2000 1 reference ../002.tleap/4TKG.com.gas.leap.rst7 rms reference out 4TKG.lig.rmsd.txt :207@C*,N*,O*,S* nofit
And then run this step:
ptraj ../002.tleap/4TKG.com.gas.leap.parm ptraj.3.in > ptraj.3.log
By doing this we will compare the trajectory file to 4TKG.com.gas.leap.rst7 as well, but working with the ligand instead of the receptor (we specified that by pointing that we want to calculate the rmsd for carbon, nitrogen, oxygen and sulfur in residue 207.
The last two steps are to obtain energetic information about the system. To do this we take a trajectory file of the gas phase complex 4TKG.com.trj.stripfit, and want to create two more trajectory files containing the information on only receptor and only ligand correspondingly.
4. At this step we consider receptor only. The input file is provided below:
ptraj.4.in
trajin 4TKG.com.trj.stripfit 1 2000 1 trajout 4TKG.rec.trj.stripfit strip :207
And then run this step:
ptraj ../002.tleap/4TKG.com.gas.leap.parm ptraj.4.in > ptraj.4.log
5. And the same procedure for the ligand, with the following input file:
ptraj.5.in
trajin 4TKG.com.trj.stripfit 1 2000 1 trajout 4TKG.lig.trj.stripfit strip :1-206
And then run this step:
ptraj ../002.tleap/4TKG.com.gas.leap.parm ptraj.5.in > ptraj.5.log
6. (optional) Visualization
As we've gone through all these steps, the analysis is done. If you want to visualize the trajectories, you first need to copy the trajectory files to Herbie like this, for example (being a level above 004.PTRAJ directory):
scp 004.ptraj/ your_username@herbie.mathlab.sunysb.edu:~AMS536/amber_tutorial/004.ptraj
Now, launch VMD, then open one of the .prm7 files in 002.tleap. If you want to visualize the whole complex in gas state, you can open 4TKG.com.gas.leap.prm7 with AMBER7 Parm from 002.tleap and then 4TKG.com.trj.stripfit with AMBER coordinates from 004.ptraj. With these files, you can look at the real-time movement of the complex in the gas state. You can repeat this procedure to observe the real-time movement of the complex in the water state. Just open 4TKG.com.wat.leap.prm7 instead of 4TKG.com.gas.leap.prm7.
MM-GBSA Energy Calculation
Molecular Mechanics-Generalized Born Surface Area (MM-GBSA) is a great method to calculate or estimate relative binding affinity of a ligand(s) to a receptor. The binding energy calculated from this method are also known as free energies of binding, where the more negative values indicate stronger binding. For this section, the topology files for the ligand, receptor and complex are needed.
Create a new directory:
mkdir 005.MMGBSA
Create an input file name
vim gb.rescore.in
Enter the following into the input file:
Single point GB energy calc &cntrl ntf = 1, ntb = 0, ntc = 2, idecomp= 0, igb = 5, saltcon= 0.00, gbsa = 2, surften= 1.0, offset = 0.09, extdiel= 78.5, cut = 99999.0, nsnb = 99999, imin = 5, maxcyc = 1, ncyc = 0, /
Create a tcsh/bash/csh script (run.sander.rescore.csh) with the following information:
#! /bin/tcsh #PBS -l nodes=1:ppn=1 #PBS -l walltime=48:00:00 #PBS -o zzz.qsub.out #PBS -e zzz.qsub.err #PBS -V #PBS -N mmgbsa set workdir = /nfs/user03/kbelfon/amber_tutorial/005.mmgbsa cd $workdir sander -O -i gb.rescore.in \ -o gb.rescore.out.com \ -p ../002.tleap/4TKG.com.gas.leap.prm7 \ -c ../002.tleap/4TKG.com.gas.leap.rst7 \ -y ../004.ptraj/4TKG.com.trj.stripfit \ -r restrt.com \ -ref ../002.tleap/4TKG.com.gas.leap.rst7 \ -x mdcrd.com \ -inf mdinfo.com sander -O -i gb.rescore.in \ -o gb.rescore.out.lig \ -p ../002.tleap/4TKG.lig.gas.leap.prm7 \ -c ../002.tleap/4TKG.lig.gas.leap.rst7 \ -y ../004.ptraj/4TKG.lig.trj.stripfit \ -r restrt.lig \ -ref ../002.tleap/4TKG.lig.gas.leap.rst7 \ -x mdcrd.lig \ -inf mdinfo.lig sander -O -i gb.rescore.in \ -o gb.rescore.out.test.rec \ -p ../002.tleap/4TKG.rec.gas.leap.prm7 \ -c ../002.tleap/4TKG.rec.gas.leap.rst7 \ -y ../004.ptraj/4TKG.rec.trj.stripfit \ -r restrt.rec \ -ref ../002.tleap/4TKG.rec.gas.leap.rst7 \ -x mdcrd.rec \ -inf mdinfo.rec exit
Execute this script on the seawulf cluster or machine(s) of your preference
qsub run.sander.rescore.csh
Three output files will be generated once the job is completed: gb.rescore.out.com, gb.rescore.out.lig, and gb.rescore.out.rec These files represent the single point energy calculation results for the complex (.com), the ligand (.lig) and the receptor (.rec). The energy will be output by the program Sander for each frame specified in the input file. The final results for one frame in one of the three files should look as the following:
FINAL RESULTS NSTEP ENERGY RMS GMAX NAME NUMBER 1 5.9132E+03 2.0005E+01 1.2640E+02 C 159 BOND = 661.8980 ANGLE = 1751.7992 DIHED = 2581.7692 VDWAALS = -1696.6585 EEL = -13958.9335 EGB = -3125.9524 1-4 VDW = 747.0185 1-4 EEL = 7750.8118 RESTRAINT = 0.0000 ESURF = 11201.4791 minimization completed, ENE= 0.59132314E+04 RMS= 0.200047E+02
Extracting Data from MM-GBSA calculation and calculating Free energy of Binding From the output files above:
VDWAALS = ΔGvdw
EELS = ΔGcoul
EGB = ΔGpolar
SASA = ESURF
With this information ΔGnonpolar can be solved using equation(1):
ΔGnonpolar = SASA*0.00542 + 0.92 (1)
Once ΔGnonpolar is solved then ΔGmmgbsa can be determined by equation(2):
ΔGmmgbsa = ΔGvdw + ΔGcoul + ΔGpolar + ΔGnonpolar (2)
Solve equation 2 and 3 using the extracted information from all three output files. So therefore you should have ΔGmmgbsa for the complex, ligand and receptor
Finally ΔΔGbind can be calculated using equation (3):
ΔΔGbind = ΔGmmgbsa,complex – (ΔGmmgbsa,lig + ΔGmmgbsa,rec) (3)
Plot your ΔΔGbind and examine the plot for changes in the ligand position and the ΔΔGbind. Also, you should calculate the mean and standard deviation for your ΔΔGbind.
The following script (get.mmgbsa.sh) can be used to extract the energy from the three output files obtained above and calculate ΔΔGbind:
#! /bin/bash # by Haoquan echo com lig rec > namelist LIST=`cat namelist` for i in $LIST ; do grep VDWAALS gb.rescore.out.$i | awk '{print $3}' > $i.vdw grep EGB gb.rescore.out.$i | awk '{print $9}' > $i.polar grep EELS gb.rescore.out.$i | awk '{print $6}' > $i.coul grep ESURF gb.rescore.out.$i | awk '{print $3 * 0.00542 + 0.92}' > $i.surf paste -d " " $i.vdw $i.polar $i.surf $i.coul | awk '{print $1 + $2 + $3 + $4}' > data.$i rm $i.* done paste -d " " data.com data.lig data.rec | awk '{print $1 - $2 - $3}' > data.all for ((j=1; j<=`wc -l data.all | awk '{print $1}'`; j+=1)) do echo $j , >> time done paste -d " " time data.all > MMGBSA_vs_time.dat rm namelist time data.*
Execute this script:
bash get.mmgbsa.sh
A text file called MMGBSA_vs_time.dat with x and y values separated by a space and comma should be created. Use XMGRACE to plot this dat file using the following command in Linux:
xmgrace MMGBSA_vs_time.dat