Difference between revisions of "2023 AMBER tutorial 1 with PDBID 4S0V"
Stonybrook (talk | contribs) (→Complex Equilibration) |
Stonybrook (talk | contribs) (→MM-GBSA Analysis) |
||
(64 intermediate revisions by the same user not shown) | |||
Line 4: | Line 4: | ||
==Setting Up Your Environment== | ==Setting Up Your Environment== | ||
Just as with DOCK you should set up for directory structure at this point to keep everything organized and easy to find. We will be creating a new structure which looks like: | Just as with DOCK you should set up for directory structure at this point to keep everything organized and easy to find. We will be creating a new structure which looks like: | ||
+ | [[File: directoryStructureAMBER2.png|center]] | ||
=Structure= | =Structure= | ||
− | Before starting the analysis it's best to download a new protein/ligand complex from the PDB and isolate both the protein and ligand structures. | + | Before starting the analysis it's best to download a new protein/ligand complex from the PDB and isolate both the protein and ligand structures. This time we will be using the command line function in Chimera. To open the command line go to: |
Tools -> General Controls -> Command Line | Tools -> General Controls -> Command Line | ||
Line 12: | Line 13: | ||
on the command line, type: | on the command line, type: | ||
− | open | + | open 4s0v |
− | + | Once the file is opened, we will move back to using the drop down menus. Choose: | |
Select -> Structure -> protein | Select -> Structure -> protein | ||
− | which will select the | + | which will select the protein. Choose: |
Select -> Invert (all models) | Select -> Invert (all models) | ||
− | to | + | This will change your selection to everything other than the protein. To delete the current selection, choose: |
Actions -> Atoms/Bonds -> delete | Actions -> Atoms/Bonds -> delete | ||
− | + | This will delete everything from the session except the protein structure. Save this file under a new name, '''4s0v_protein_only_for_AMBER.pdb'''. | |
For the protein file, it’s important to model any missing loops before running AMBER on the complex. You might not have done this for the docking tutorial because the loops were far enough from the binding site to not matter but for this step it needs to be done. After adding in any missing loops with | For the protein file, it’s important to model any missing loops before running AMBER on the complex. You might not have done this for the docking tutorial because the loops were far enough from the binding site to not matter but for this step it needs to be done. After adding in any missing loops with | ||
Line 36: | Line 37: | ||
File -> Save Mol2... | File -> Save Mol2... | ||
− | and save as ''' | + | and save the file under a new name, such as '''4s0v_protein_only_no_charges.mol2'''. |
− | + | You then need to follow the same steps above but this time isolating the ligand and not the protein. Once you have the ligand in its own file, '''4s0v_ligand_only_for_AMBER.pdb''', you need to make whatever protonation changes you made when preparing the ligand for docking. See *[http://ringo.ams.stonybrook.edu/index.php/2023_DOCK_tutorial_1_with_PDBID_4S0V] for these steps. | |
− | + | For our purposes in this tutorial, we will leave the ligand as a .pdb for parametrization, since defining the connectivity of the oxygens where we removed hydrogens can be tricky. However, if you are following this for your own ligands, you can explicitly define the charges if you are confident. | |
− | + | This file can now be saved as '''4s0v_ligand_only_charges_for_AMBER.pdb''' | |
− | |||
− | |||
Once these two files have been generated, scp them over to the 001.structure directory on Seawulf. | Once these two files have been generated, scp them over to the 001.structure directory on Seawulf. | ||
=Force Field Parameters= | =Force Field Parameters= | ||
− | AMBER needs force field parameters to run correctly. This only needs to be done for the ligand. | + | AMBER needs force field parameters to run correctly. This only needs to be done for the ligand. These commands will be done on the command line again, please cd into your 002.forceFieldParameters directory. To generate ligand parameters execute the following slurm script: |
− | |||
− | |||
− | |||
− | To generate ligand parameters execute the following slurm script: | ||
#!/bin/bash | #!/bin/bash | ||
Line 97: | Line 92: | ||
=TLeap Implemenation= | =TLeap Implemenation= | ||
− | The goal of molecular dynamics simulations is to model the behavior of molecular systems and observe the behavior and interactions between constituents in those systems. Here, we investigate the interactions between our ligand and our receptor. Our MD analysis using AMBER will allow is to experimentally model this interactions over a given timeframe, and from these simulations we can estimate the affinity of the receptor for this ligand. | + | The goal of molecular dynamics simulations is to model the behavior of molecular systems and observe the behavior and interactions between constituents in those systems. Here, we investigate the interactions between our ligand and our receptor. Our MD analysis using AMBER will allow is to experimentally model this interactions over a given timeframe, and from these simulations we can estimate the affinity of the receptor for this ligand. We will be working on the command line so please cd into your 003.tleap directory. |
− | |||
− | |||
We will first generate the gas-phase and solvated systems, and neutralize both systems. | We will first generate the gas-phase and solvated systems, and neutralize both systems. | ||
To create the input script (which should be run on the cluster, not on a login node, see below): | To create the input script (which should be run on the cluster, not on a login node, see below): | ||
− | + | vi 4s0v_tleap.in | |
And then add: | And then add: | ||
Line 126: | Line 119: | ||
rec=loadpdb ../001.structure/4s0v_built.pdb | rec=loadpdb ../001.structure/4s0v_built.pdb | ||
+ | #THIS IS WHERE YOU WOULD DEFINE DISULFIDE BONDS | ||
+ | #NUMBERING SHOULD MATCH INPUT PDB FILE | ||
#bond rec.649.SG rec.762.SG | #bond rec.649.SG rec.762.SG | ||
#bond rec.454.SG rec.472.SG | #bond rec.454.SG rec.472.SG | ||
Line 175: | Line 170: | ||
#SBATCH --nodes=6 | #SBATCH --nodes=6 | ||
#SBATCH --time=48:00:00 | #SBATCH --time=48:00:00 | ||
− | #SBATCH -p long- | + | #SBATCH -p long-24core |
module load amber/16 | module load amber/16 | ||
tleap -f 4s0v_tleap.in | tleap -f 4s0v_tleap.in | ||
Line 216: | Line 211: | ||
=Complex Equilibration= | =Complex Equilibration= | ||
− | Before we can run our structure through the AMBER program we need to minimize it and make sure it's at equilibrium. | + | Before we can run our structure through the AMBER program we need to minimize it and make sure it's at equilibrium. These commands will again be done on the command line so please cd into your 004.equil directory. |
+ | |||
+ | Nine input files are required for system equilibration and AMBER will process them in order. You can imagine these steps as 'heat-treating' your complex - you apply heat to the system and then let it relax - and repeat until the system is at rest. Each of these nine files is one step in this process. For each step there are two files: | ||
+ | |||
+ | min.mdin - will minimize only hydrogen atoms | ||
+ | equil.mdin - will minimize system including non-hydrogen atoms, depending upon the restraint mask for atoms involved | ||
+ | |||
+ | These input files will also direct AMBER on which atoms to restrain during equilibration. There a specific line in each file: | ||
+ | restraintmask | ||
+ | |||
+ | after that command, if you see a "!", that is telling AMBER to exclude any atoms that follow; if you see "@H=", that is telling AMBER to restrain all the hydrogen atoms. In the first 7 input files we want to restrain all residues in our complex, including the ligand. For the last two steps we will only restrain the protein and not the ligand. | ||
+ | |||
+ | There is also a line: | ||
+ | restraint_wt | ||
+ | |||
+ | What are restraint weights? Check the AMBER handbook for a better description, but in brief, this is a constant similar to a spring constant used in Hooke's law. For the first files we use a larger restraint_wt and a lower constant for the final files. Before we can create any input files we need to know how many residues are in our complex. Do this by looking at at the 4s0v.gas.complex.pdb file created in the previous step and looking at the last line: | ||
− | + | [[File: complexpdb.png|center]] | |
− | + | We see that there are 490 residues in our complex. 489 in the protein and one for the ligand. | |
Create 01.min.mdin | Create 01.min.mdin | ||
− | + | vi 01.min.mdin | |
Copy the following lines: | Copy the following lines: | ||
Line 238: | Line 248: | ||
cut= 8.0, ! Nonbonded cutoff in Angstroms | cut= 8.0, ! Nonbonded cutoff in Angstroms | ||
ntr=1, ! Turn on restraints | ntr=1, ! Turn on restraints | ||
− | restraintmask="!@H=", ! atoms to be restrained | + | restraintmask=":1-490 & !@H=", ! atoms to be restrained |
restraint_wt=5.0, ! force constant for restraint | restraint_wt=5.0, ! force constant for restraint | ||
ntxo=1, ! Write coordinate file in ASCII format | ntxo=1, ! Write coordinate file in ASCII format | ||
ioutfm=0, ! Write trajectory file in ASCII format | ioutfm=0, ! Write trajectory file in ASCII format | ||
− | + | / | |
+ | |||
+ | In the above input file you'll see a line: | ||
+ | restraintmask=":1-490@CA,C,N", ! atoms to be restrained. | ||
+ | |||
+ | This is where the number of residues for your system needs to be updated. This change needs to be done for the following input files as well. | ||
Create 02.equil.mdin | Create 02.equil.mdin | ||
− | + | vi 02.equil.mdin | |
Copy the following lines: | Copy the following lines: | ||
Line 268: | Line 283: | ||
cut= 8.0, ! Nonbonded cutoff in Angstroms | cut= 8.0, ! Nonbonded cutoff in Angstroms | ||
ntr=1, ! Turn on restraints | ntr=1, ! Turn on restraints | ||
− | restraintmask="!@H= | + | restraintmask=":1-490 & !@H=", ! atoms to be restrained |
restraint_wt=5.0, ! force constant for restraint | restraint_wt=5.0, ! force constant for restraint | ||
ntxo=1, ! Write coordinate file in ASCII format | ntxo=1, ! Write coordinate file in ASCII format | ||
ioutfm=0, ! Write trajectory file in ASCII format | ioutfm=0, ! Write trajectory file in ASCII format | ||
iwrap=1, ! iwrap is turned on | iwrap=1, ! iwrap is turned on | ||
− | + | / | |
Create 03.min.mdin | Create 03.min.mdin | ||
− | + | vi 03.min.mdin | |
Copy the following lines: | Copy the following lines: | ||
Line 291: | Line 306: | ||
cut= 8.0, ! Nonbonded cutoff in Angstroms | cut= 8.0, ! Nonbonded cutoff in Angstroms | ||
ntr=1, ! Turn on restraints | ntr=1, ! Turn on restraints | ||
− | restraintmask="!@H=", ! atoms to be restrained | + | restraintmask=":1-490 & !@H=", ! atoms to be restrained |
restraint_wt=2.0, ! force constant for restraint | restraint_wt=2.0, ! force constant for restraint | ||
ntxo=1, ! Write coordinate file in ASCII format | ntxo=1, ! Write coordinate file in ASCII format | ||
ioutfm=0, ! Write trajectory file in ASCII format | ioutfm=0, ! Write trajectory file in ASCII format | ||
− | + | / | |
Create 04.min.mdin | Create 04.min.mdin | ||
− | + | vi 04.min.mdin | |
Copy the following lines: | Copy the following lines: | ||
Line 313: | Line 328: | ||
cut= 8.0, ! Nonbonded cutoff in Angstroms | cut= 8.0, ! Nonbonded cutoff in Angstroms | ||
ntr=1, ! Turn on restraints | ntr=1, ! Turn on restraints | ||
− | restraintmask="!@H=", ! atoms to be restrained | + | restraintmask=":1-490 & !@H=", ! atoms to be restrained |
restraint_wt=0.1, ! force constant for restraint | restraint_wt=0.1, ! force constant for restraint | ||
ntxo=1, ! Write coordinate file in ASCII format | ntxo=1, ! Write coordinate file in ASCII format | ||
ioutfm=0, ! Write trajectory file in ASCII format | ioutfm=0, ! Write trajectory file in ASCII format | ||
− | + | / | |
Create 05.min.mdin | Create 05.min.mdin | ||
− | + | vi 05.min.mdin | |
Copy the following lines: | Copy the following lines: | ||
Line 335: | Line 350: | ||
cut= 8.0, ! Nonbonded cutoff in Angstroms | cut= 8.0, ! Nonbonded cutoff in Angstroms | ||
ntr=1, ! Turn on restraints | ntr=1, ! Turn on restraints | ||
− | restraintmask="!@H=", ! atoms to be restrained | + | restraintmask=":1-490 & !@H=", ! atoms to be restrained |
restraint_wt=0.05, ! force constant for restraint | restraint_wt=0.05, ! force constant for restraint | ||
ntxo=1, ! Write coordinate file in ASCII format | ntxo=1, ! Write coordinate file in ASCII format | ||
ioutfm=0, ! Write trajectory file in ASCII format | ioutfm=0, ! Write trajectory file in ASCII format | ||
− | + | / | |
Create 06.equil.mdin | Create 06.equil.mdin | ||
− | + | vi 06.equil.mdin | |
Copy the following lines: | Copy the following lines: | ||
Line 365: | Line 380: | ||
cut= 8.0, ! Nonbonded cutoff in Angstroms | cut= 8.0, ! Nonbonded cutoff in Angstroms | ||
ntr=1, ! Turn on restraints | ntr=1, ! Turn on restraints | ||
− | restraintmask="!@H=", ! atoms to be restrained | + | restraintmask=":1-490 & !@H=", ! atoms to be restrained |
restraint_wt=1.0, ! force constant for restraint | restraint_wt=1.0, ! force constant for restraint | ||
ntxo=1, ! Write coordinate file in ASCII format | ntxo=1, ! Write coordinate file in ASCII format | ||
ioutfm=0, ! Write trajectory file in ASCII format | ioutfm=0, ! Write trajectory file in ASCII format | ||
iwrap=1, ! iwrap is turned on | iwrap=1, ! iwrap is turned on | ||
− | + | / | |
Create 07.equil.mdin | Create 07.equil.mdin | ||
− | + | vi 07.equil.mdin | |
Copy the following lines: | Copy the following lines: | ||
Line 398: | Line 413: | ||
cut= 8.0, ! Nonbonded cutoff in Angstroms | cut= 8.0, ! Nonbonded cutoff in Angstroms | ||
ntr=1, ! Turn on restraints | ntr=1, ! Turn on restraints | ||
− | restraintmask="!@H=", ! atoms to be restrained | + | restraintmask=":1-490 & !@H=", ! atoms to be restrained |
restraint_wt=0.5, ! force constant for restraint | restraint_wt=0.5, ! force constant for restraint | ||
ntxo=1, ! Write coordinate file in ASCII format | ntxo=1, ! Write coordinate file in ASCII format | ||
ioutfm=0, ! Write trajectory file in ASCII format | ioutfm=0, ! Write trajectory file in ASCII format | ||
iwrap=1, ! iwrap is turned on | iwrap=1, ! iwrap is turned on | ||
− | + | / | |
Create 08.equil.mdin | Create 08.equil.mdin | ||
− | + | vi 08.equil.mdin | |
Copy the following lines: | Copy the following lines: | ||
Line 431: | Line 446: | ||
cut= 8.0, ! Nonbonded cutoff in Angstroms | cut= 8.0, ! Nonbonded cutoff in Angstroms | ||
ntr=1, ! Turn on restraints | ntr=1, ! Turn on restraints | ||
− | restraintmask=":1- | + | restraintmask=":1-489@CA.C.N", ! atoms to be restrained, only the backbone |
restraint_wt=0.1, ! force constant for restraint | restraint_wt=0.1, ! force constant for restraint | ||
ntxo=1, ! Write coordinate file in ASCII format | ntxo=1, ! Write coordinate file in ASCII format | ||
ioutfm=0, ! Write trajectory file in ASCII format | ioutfm=0, ! Write trajectory file in ASCII format | ||
iwrap=1, ! iwrap is turned on | iwrap=1, ! iwrap is turned on | ||
− | + | / | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
+ | NOTE: for the above input file we only want to restrain the protein and not the ligand so the '490' needs to be change to '489'. | ||
Create 09.equil.mdin. | Create 09.equil.mdin. | ||
− | + | vi 09.equil.mdin | |
Copy the following lines: | Copy the following lines: | ||
Line 474: | Line 481: | ||
cut= 8.0, ! Nonbonded cutoff in Angstroms | cut= 8.0, ! Nonbonded cutoff in Angstroms | ||
ntr=1, ! Turn on restraints | ntr=1, ! Turn on restraints | ||
− | restraintmask=":1-489@CA | + | restraintmask=":1-489@CA.C.N", ! atoms to be restrained |
restraint_wt=0.1, ! force constant for restraint | restraint_wt=0.1, ! force constant for restraint | ||
ntxo=1, ! Write coordinate file in ASCII format | ntxo=1, ! Write coordinate file in ASCII format | ||
ioutfm=0, ! Write trajectory file in ASCII format | ioutfm=0, ! Write trajectory file in ASCII format | ||
iwrap=1, ! iwrap is turned on | iwrap=1, ! iwrap is turned on | ||
− | + | / | |
− | + | NOTE: for the above input file we only want to restrain the protein and not the ligand so the '490' needs to be change to '489'. | |
− | |||
− | |||
− | |||
Once you have all nine files in your directory we need to create a slurm script, 4s0v.equil.slurm, to run them: | Once you have all nine files in your directory we need to create a slurm script, 4s0v.equil.slurm, to run them: | ||
Line 498: | Line 502: | ||
module load amber/16 | module load amber/16 | ||
− | do_parallel="mpirun pmemd.MPI" | + | do_parallel="mpirun -np 80 pmemd.MPI" |
prmtop="../003.tleap/4s0v.wet.complex.prmtop" | prmtop="../003.tleap/4s0v.wet.complex.prmtop" | ||
Line 511: | Line 515: | ||
done | done | ||
− | Submit the above file to | + | Submit the above file to Seawulf by typing: |
sbatch 4s0v.equil.slurm | sbatch 4s0v.equil.slurm | ||
− | Once the program finishes you will see | + | Once the program finishes you will see multiple new files including a file named logfile. Check to file for errors. If there are none, you can move onto the next step. You should also check the .trj files from each step to make sure that the system appears rational. Below is the equillibrated system following 09.equil.mdin |
+ | |||
+ | [[File: 4S0V_MIN.png|center||500px]] | ||
=Production Run= | =Production Run= | ||
− | Now that our system is set up properly we can move onto an MD production run of the structure. This will again be completed on the command line, please cd into your 005. | + | Now that our system is set up properly we can move onto an MD production run of the structure. This will again be completed on the command line, please cd into 005.production |
+ | |||
+ | Create the input file, 10.prod.min | ||
+ | nano 10.prod.mdin | ||
+ | |||
+ | And copy the following lines: | ||
+ | MD simulations | ||
+ | &cntrl | ||
+ | imin=0, ! Perform MD | ||
+ | nstlim=5000000, ! Number of MD steps | ||
+ | ntx=5, ! Positions and velocities read formatted | ||
+ | irest=1, ! Restart calculation | ||
+ | ntc=2, ! SHAKE on for bonds with hydrogen | ||
+ | dt=0.002, ! Timestep (ps) | ||
+ | ntb=2, ! Constant Pressure | ||
+ | ntp=1, ! Isotropic pressure scaling | ||
+ | barostat=1 ! Berendsen | ||
+ | taup=0.5 ! Pressure relaxtion time (ps) | ||
+ | ntf=2, ! No force evaluation for bonds with hydrogen | ||
+ | ntt=3, ! Langevin thermostat | ||
+ | gamma_ln=2.0 ! Collision Frequency for thermostat | ||
+ | ig=-1, ! Random seed for thermostat | ||
+ | temp0=298.15 ! Simulation temperature (K) | ||
+ | ntwx= 2500, ! Write to trajectory file every ntwx steps | ||
+ | ntpr= 2500, ! Print to mdout every ntpr steps | ||
+ | ntwr= 5000000, ! Write a restart file every ntwr steps | ||
+ | cut=8.0, ! Nonbonded cutoff in Angstroms | ||
+ | ntr=1, ! Turn on restraints | ||
+ | restraintmask=":1-489@CA,C,N", ! atoms to be restrained | ||
+ | restraint_wt=0.1, ! force constant for restraint | ||
+ | ntxo=1, ! Write coordinate file in ASCII format | ||
+ | ioutfm=0, ! Write trajectory file in ASCII format | ||
+ | iwrap=1, ! iwrap is turned on | ||
+ | $end | ||
+ | |||
+ | Note for the restraintmask line, use the same number as for the 09.equil.mdin file in the previous step. In our case, this number is 489. | ||
+ | |||
+ | nano production.sh | ||
+ | |||
+ | This input file can now be run on Seawulf using the following slurm script: | ||
+ | #!/bin/bash | ||
+ | # | ||
+ | #SBATCH --job-name=4s0v_production | ||
+ | #SBATCH --output=4s0v_production.txt | ||
+ | #SBATCH --ntasks-per-node=28 | ||
+ | #SBATCH --nodes=4 | ||
+ | #SBATCH --time=48:00:00 | ||
+ | #SBATCH -p long-28core | ||
+ | |||
+ | module load amber/16 | ||
+ | |||
+ | do_parallel="mpirun pmemd.MPI" | ||
+ | |||
+ | prmtop="../003.tleap/4s0v.wet.complex.prmtop" | ||
+ | coords="../004.equil/09.equil" | ||
+ | |||
+ | MDINPUTS=(10.prod) | ||
+ | |||
+ | for input in ${MDINPUTS[@]}; do | ||
+ | $do_parallel -O -i ${input}.mdin -o ${input}.mdout -p $prmtop -c ${coords}.rst7 -ref ${coords}.rst7 -x ${input}.trj -inf ${input}.info -r ${input}.rst7 | ||
+ | coords=$input | ||
+ | done | ||
+ | |||
+ | =Analysis= | ||
+ | The output of the production run performed in the previous step is to generate and MD trajectory of the ligand/protein complex. Once this is done we need to analyze the results. We will analysis three different aspects of the complex: | ||
+ | *RMDS (the distance the ligand and protein have moved from their initial positions) | ||
+ | *H-bonds (what hydrogren bonds have been formed between the ligand and protein) | ||
+ | *MM-GBSA (estimation of the free binding energy between the ligand and protein) | ||
+ | |||
+ | We will again be working on the command line, please cd into your 006a.RMSD directory | ||
+ | |||
+ | ==RMSD== | ||
+ | |||
+ | To perform the RMSD calculations, first create an input file which will delete the water and charges from the production trajectory. | ||
+ | |||
+ | nano cpp_strip_water.in | ||
+ | |||
+ | with the following inputs. Note that this should cover all CA atoms (including the ligand) in your complex | ||
+ | |||
+ | #!/usr/bin/sh | ||
+ | parm ../003.tleap/4s0v.wet.complex.prmtop | ||
+ | #read in trajectory | ||
+ | trajin ../005.production/10.prod.trj | ||
+ | #read in reference | ||
+ | reference ../003.tleap/4s0v.wet.complex.rst7 | ||
+ | #compute RMSD + align CA to crystal structure | ||
+ | rmsd rms1 reference :1-490@CA | ||
+ | #strip solvent | ||
+ | strip :WAT:Na+:Cl- | ||
+ | #create gas-phase trajectory | ||
+ | trajout 4s0v_stripfit_restrained_gas.trj nobox | ||
+ | cpp_strip_water.in (END) | ||
+ | |||
+ | The reference mask should contain BOTH the ligand and receptor in its value. | ||
+ | |||
+ | Now run this file using the following command: | ||
+ | |||
+ | cpptraj -i cpp_strip_water.in | ||
+ | |||
+ | Next, create another input file which will be used to calculate the RMSD of the ligand by typing | ||
+ | |||
+ | nano cpp_rmsd_lig.in | ||
+ | |||
+ | with inputs | ||
+ | |||
+ | #!/usr/bin/sh | ||
+ | #trajin the trajectory | ||
+ | trajin 4s0v_stripfit_restrained_gas.trj | ||
+ | #read in reference | ||
+ | reference ../003.tleap/4s0v.gas.complex.rst7 | ||
+ | #compute RMSD (do not fit internal geometris first, included rigid body motions, convert frames to ns (framenum*.005) | ||
+ | rmsd rms1 ":490&!(@H=)" nofit mass out 4s0v_lig_restrained_rmsd_nofit.dat time .005 | ||
+ | #histogram the nofit rmsd | ||
+ | histogram rms1,*,*,.1,* norm out 4s0v_lig_restrained_rmsd_nofit_histogram.dat | ||
+ | |||
+ | And check that the reference mask only factors in the ligand (490) in this file. Run this input file using the following command: | ||
+ | |||
+ | cpptraj -p ../003.tleap/4s0v.complex.parm7 -i cpp_rmsd_lig.in | ||
+ | |||
+ | Next, the receptor RMSD can be calculated by creating the following input file: | ||
+ | |||
+ | nano cpp_rmsd_receptor.in | ||
+ | |||
+ | with inputs | ||
+ | |||
+ | #!/usr/bin/sh | ||
+ | #trajin the trajectory | ||
+ | trajin 4s0v_stripfit_restrained_gas.trj | ||
+ | #read in reference | ||
+ | reference ../003.tleap/4s0v.gas.complex.rst7 | ||
+ | #compute RMSD (do not fit internal geometries first, included rigid body motions, convert frames to ns (framenum*.005) | ||
+ | rmsd rms1 ":1-489&!(@H=)" nofit mass out 4s0v_receptor_restrained_rmsd_nofit.dat time .005 | ||
+ | #histogram the nofit rmsd | ||
+ | histogram rms1,*,*,.1,* norm out 4s0v_receptor_restrained_rmsd_nofit_histogram.dat | ||
+ | |||
+ | And check that the reference mask only takes into account the receptor (1-489). Now run this file using the following command: | ||
+ | |||
+ | cpptraj -p ../003.tleap/4s0v.complex.parm7 -i cpp_rmsd_receptor.in | ||
+ | |||
+ | The output files from these analyses will be: | ||
+ | |||
+ | 4s0v_lig_restrained_rmsd_nofit.dat | ||
+ | 4s0v_receptor_restrained_rmsd_nofit_histogram.dat | ||
+ | 4s0v_lig_restrained_rmsd_nofit_histogram.dat | ||
+ | 4s0v_stripfit_restrained_gas.trj | ||
+ | 4s0v_receptor_restrained_rmsd_nofit.dat | ||
+ | |||
+ | Examine the histogram.dat files, or plot if desired, to note the change in the RMSD of the ligand (or the receptor) over the course of the simulation. Since this was docked previously, we would hope that it would stay under 2 angstroms over the course of the simulation. There may be other simulations were movement is acceptable (or even desired) but this is not that case. | ||
+ | |||
+ | [[FILE:Screenshot 2023-03-29 at 3.25.46 PM.png]] | ||
+ | |||
+ | ==H-bonds== | ||
+ | |||
+ | Next, the number of hydrogen bonds will be analyzed, including what these bonds are. Create the following input file: | ||
+ | |||
+ | nano cpp_hbonds.in | ||
+ | |||
+ | with the following: | ||
+ | |||
+ | !/usr/bin/sh | ||
+ | #read in trajectory | ||
+ | trajin ../004.production/10_prod.trj | ||
+ | #wrap everything in one periodic cell - could cause problems, may comment out #autoimage if problems later | ||
+ | autoimage | ||
+ | #compute intra + water-mediated H-bonds | ||
+ | hbond hb1 :1-490 out 4s0v_hbond.out avgout 4s0v_hbond_avg.dat solventdonor :WAT solventacceptor :WAT@-O | ||
+ | nointramol brid\ | ||
+ | geout 4s0v_bridge-water.dat dist 3.0 angle 140 | ||
+ | |||
+ | Be sure that the hbobnd hb1 parameter corresponds to the residue numbers of the relevant ligand AND receptor (490 in our case). Run with: | ||
+ | |||
+ | cpptraj -p ../003.tleap/4s0v.wet.complex.prmtop -i cpp_hbonds.in | ||
+ | |||
+ | ==MM-GBSA Analysis== | ||
+ | |||
+ | Finally, we will calculate free energies of binding in this system with MM-GBSA. | ||
+ | Create the following input file: | ||
+ | |||
+ | nano mmgbsa_4s0v.in | ||
+ | |||
+ | This file should contain the following: | ||
+ | |||
+ | mmgbsa 4s0v analysis | ||
+ | &general | ||
+ | interval=1, netcdf=1, | ||
+ | keep_files=0, | ||
+ | / | ||
+ | &gb | ||
+ | igb=8 | ||
+ | saltcon=0.0, surften=0.0072, | ||
+ | surfoff=0.0, molsurf=0, | ||
+ | / | ||
+ | &nmode | ||
+ | drms=0.001, maxcyc=10000, | ||
+ | nminterval=250, nmendframe=2000, | ||
+ | nmode_igb=1, | ||
+ | / | ||
+ | |||
+ | Since this may take a while to run we should execute the script using Seawulf. Generate the following script: | ||
+ | |||
+ | 4s0v_mmgbsa.sh | ||
+ | |||
+ | With the following as contents: | ||
+ | |||
+ | #bin/bash | ||
+ | #SBATCH --time=2-00:00:00 | ||
+ | #SBATCH --nodes=2 | ||
+ | #SBATCH --ntasks=28 | ||
+ | #SBATCH --job-name=4s0v_MMGBSA | ||
+ | #SBATCH --output=4s0v_MMGBSA.log | ||
+ | #SBATCH -p extended-28core | ||
+ | #Define topology files | ||
+ | solv_parm="../003.tleap/4s0v.wet.complex.prmtop" | ||
+ | complex_parm="../003.tleap/4s0v.complex.parm7" | ||
+ | receptor_parm="../003.tleap/4s0v.gas.receptor.parm7" | ||
+ | lig_parm="../003.tleap/4s0v.gas.ligand.parm7" | ||
+ | trajectory="../005.production/10_prod.trj" | ||
+ | MMPBSA.py -O -i mmgbsa.in \ | ||
+ | -o 4s0v_mmgbsa_results.dat \ | ||
+ | -eo 4s0v_mmgbsa_perframe.dat \ | ||
+ | -sp ${solv_parm} \ | ||
+ | -cp ${complex_parm} \ | ||
+ | -rp ${receptor_parm} \ | ||
+ | -lp ${lig_parm} \ | ||
+ | -y ${trajectory} | ||
+ | |||
+ | Execute this with: | ||
+ | |||
+ | sbatch 4s0v_mmgbsa.sh |
Latest revision as of 14:57, 29 March 2023
Contents
Introduction
AMBER is a molecular dynamics program that can be run on your protein/ligand complex to ensure that the interactions between the two structures are stable. DOCK shows us how the two interact with each other at one point in time. AMBER looks at those interactions over time to ensure that forces will not occur which will push the ligand out of the binding site as the complex naturally moves. This tutorial will again be working with PDB #4s0v
Setting Up Your Environment
Just as with DOCK you should set up for directory structure at this point to keep everything organized and easy to find. We will be creating a new structure which looks like:
Structure
Before starting the analysis it's best to download a new protein/ligand complex from the PDB and isolate both the protein and ligand structures. This time we will be using the command line function in Chimera. To open the command line go to:
Tools -> General Controls -> Command Line
on the command line, type:
open 4s0v
Once the file is opened, we will move back to using the drop down menus. Choose:
Select -> Structure -> protein
which will select the protein. Choose:
Select -> Invert (all models)
This will change your selection to everything other than the protein. To delete the current selection, choose:
Actions -> Atoms/Bonds -> delete
This will delete everything from the session except the protein structure. Save this file under a new name, 4s0v_protein_only_for_AMBER.pdb.
For the protein file, it’s important to model any missing loops before running AMBER on the complex. You might not have done this for the docking tutorial because the loops were far enough from the binding site to not matter but for this step it needs to be done. After adding in any missing loops with
Tools -> Structure Editing -> Model/Refine Loops
Go to
File -> Save Mol2...
and save the file under a new name, such as 4s0v_protein_only_no_charges.mol2.
You then need to follow the same steps above but this time isolating the ligand and not the protein. Once you have the ligand in its own file, 4s0v_ligand_only_for_AMBER.pdb, you need to make whatever protonation changes you made when preparing the ligand for docking. See *[1] for these steps.
For our purposes in this tutorial, we will leave the ligand as a .pdb for parametrization, since defining the connectivity of the oxygens where we removed hydrogens can be tricky. However, if you are following this for your own ligands, you can explicitly define the charges if you are confident.
This file can now be saved as 4s0v_ligand_only_charges_for_AMBER.pdb
Once these two files have been generated, scp them over to the 001.structure directory on Seawulf.
Force Field Parameters
AMBER needs force field parameters to run correctly. This only needs to be done for the ligand. These commands will be done on the command line again, please cd into your 002.forceFieldParameters directory. To generate ligand parameters execute the following slurm script:
#!/bin/bash # #SBATCH --job-name=4s0v_AMBER_parameters #SBATCH --output=parameters_output.txt #SBATCH --ntasks-per-node=24 #SBATCH --nodes=6 #SBATCH --time=48:00:00 #SBATCH -p long-24core module load amber/16 antechamber -i ../001.structure/4s0v_ligand_only.pdb -fi pdb -o 4s0v_ligand_antechamber.mol2 -fo mol2 -at gaff2 -c bcc -rn LIG -nc 0
The important parameter in the above command is the -nc option at the end. This is telling antechamber what the total charge is on your ligand. This number needs to be the same as the number used in the previous step when you added charges to the structure in Chimera. In our case, the total charge on the ligand should be 0.
Once this has completed running you will see multiple new files in your directory including, specifically:
4s0v_ligand_antechamber.mol2 ANTECHAMBER_AC.AC0 ANTECHAMBER_AM1BCC_PRE.AC ANTECHAMBER_BOND_TYPE.AC0 sqm.in sqm.pdb ANTECHAMBER_AC.AC ANTECHAMBER_AM1BCC.AC ANTECHAMBER_BOND_TYPE.AC ATOMTYPE.INF sqm.out
4s0v_ligand_antechamber.mol2 is the file with the parameters we just generated.
Now that we have defined parameters for the ligand, we need to modify our force field parameters slightly. Run the command:
parmchk2 -i 4s0v_ligand_antechamber.mol2 -f mol2 -o 4s0v_ligand.am1bcc.frcmod
If you get an error when running this line, try typing:
module load amber/16
and then running the command again. Once it's done running you will see the 4s0v_ligand.am1bcc.frcmod file in your directory.
TLeap Implemenation
The goal of molecular dynamics simulations is to model the behavior of molecular systems and observe the behavior and interactions between constituents in those systems. Here, we investigate the interactions between our ligand and our receptor. Our MD analysis using AMBER will allow is to experimentally model this interactions over a given timeframe, and from these simulations we can estimate the affinity of the receptor for this ligand. We will be working on the command line so please cd into your 003.tleap directory.
We will first generate the gas-phase and solvated systems, and neutralize both systems.
To create the input script (which should be run on the cluster, not on a login node, see below):
vi 4s0v_tleap.in
And then add:
#!/usr/bin/sh ###load protein force field source leaprc.protein.ff14SB ###load GAFF force field (for our ligand) source leaprc.gaff ###load TIP3P (water) force field source leaprc.water.tip3p ###load ions frcmod for the tip3p model loadamberparams frcmod.ionsjc_tip3p ###needed so we can use igb=8 model set default PBradii mbondi3 ###load protein pdb file rec=loadpdb ../001.structure/4s0v_built.pdb #THIS IS WHERE YOU WOULD DEFINE DISULFIDE BONDS #NUMBERING SHOULD MATCH INPUT PDB FILE #bond rec.649.SG rec.762.SG #bond rec.454.SG rec.472.SG #bond rec.444.SG rec.447.SG #bond rec.328.SG rec.339.SG #bond rec.385.SG rec.394.SG ###load ligand frcmod/mol2 loadamberparams ../002.parameters/4s0v_ligand.am1bcc.frcmod lig=loadmol2 ../002.parameters/4s0v_ligand_antechamber.mol2 ###create gase-phase complex gascomplex= combine {rec lig} ###write gas-phase pdb savepdb gascomplex 4s0v.gas.complex.pdb ###write gase-phase toplogy and coord files for MMGBSA calc saveamberparm gascomplex 4s0v.complex.parm7 4s0v.gas.complex.rst7 saveamberparm rec 4s0v.gas.receptor.parm7 4s0v.gas.receptor.rst7 saveamberparm lig 4s0v.gas.ligand.parm7 4s0v.gas.ligand.rst7 ###create solvated complex (albeit redundant) solvcomplex= combine {rec lig} ###solvate the system solvateoct solvcomplex TIP3PBOX 12.0 ###Neutralize system addions solvcomplex Cl- 0 addions solvcomplex Na+ 0 #write solvated pdb file savepdb solvcomplex 4s0v.wet.complex.pdb ###check the system charge solvcomplex check solvcomplex ###write solvated toplogy and coordinate file saveamberparm solvcomplex 4s0v.wet.complex.prmtop 4s0v.wet.complex.rst7 quit
This input file should be executed on the cluster, not in a login node. To do so, create and run the following script: 4s0v_tleap.slurm.
#!/bin/bash # #SBATCH --job-name=4s0v_tleap #SBATCH --output=tleap_output.txt #SBATCH --ntasks-per-node=24 #SBATCH --nodes=6 #SBATCH --time=48:00:00 #SBATCH -p long-24core module load amber/16 tleap -f 4s0v_tleap.in
Execute this script with:
sbatch 4s0v_tleap.slurm
When this is done running you will see multiple files in your directory. Specifically:
4s0v.complex.parm7 4s0v.gas.complex.rst7 4s0v.gas.ligand.rst7 4s0v.gas.receptor.rst7 4s0v_tleap.slurm 4s0v.wet.complex.prmtop leap.log 4s0v.gas.complex.pdb 4s0v.gas.ligand.parm7 4s0v.gas.receptor.parm7 4s0v_tleap.in 4s0v.wet.complex.pdb 4s0v.wet.complex.prmtop 4s0v.wet.complex.rst7 tleap_output.txt
Scp 4s0v.wet.complex.rst7 and 4s0v.wet.complex.prmtop to your local computer.
In Chimera, open using:
Tools → MD/Ensemble Analysis → MD Movie
In the prmtop section, select 4s0v.wet.complex.prmtop, and then hit "Add" and select the 4s0v.wet.complex.rst7 file. Your receptor should now open in Chimera.
Everything looks good so we can move onto the next step.
Complex Equilibration
Before we can run our structure through the AMBER program we need to minimize it and make sure it's at equilibrium. These commands will again be done on the command line so please cd into your 004.equil directory.
Nine input files are required for system equilibration and AMBER will process them in order. You can imagine these steps as 'heat-treating' your complex - you apply heat to the system and then let it relax - and repeat until the system is at rest. Each of these nine files is one step in this process. For each step there are two files:
min.mdin - will minimize only hydrogen atoms equil.mdin - will minimize system including non-hydrogen atoms, depending upon the restraint mask for atoms involved
These input files will also direct AMBER on which atoms to restrain during equilibration. There a specific line in each file:
restraintmask
after that command, if you see a "!", that is telling AMBER to exclude any atoms that follow; if you see "@H=", that is telling AMBER to restrain all the hydrogen atoms. In the first 7 input files we want to restrain all residues in our complex, including the ligand. For the last two steps we will only restrain the protein and not the ligand.
There is also a line:
restraint_wt
What are restraint weights? Check the AMBER handbook for a better description, but in brief, this is a constant similar to a spring constant used in Hooke's law. For the first files we use a larger restraint_wt and a lower constant for the final files. Before we can create any input files we need to know how many residues are in our complex. Do this by looking at at the 4s0v.gas.complex.pdb file created in the previous step and looking at the last line:
We see that there are 490 residues in our complex. 489 in the protein and one for the ligand.
Create 01.min.mdin
vi 01.min.mdin
Copy the following lines:
Minmize all the hydrogens &cntrl imin=1, ! Minimize the initial structure maxcyc=5000, ! Maximum number of cycles for minimization ntb=1, ! Constant volume ntp=0, ! No pressure scaling ntf=1, ! Complete force evaluation ntwx= 1000, ! Write to trajectory file every ntwx steps ntpr= 1000, ! Print to mdout every ntpr steps ntwr= 1000, ! Write a restart file every ntwr steps cut= 8.0, ! Nonbonded cutoff in Angstroms ntr=1, ! Turn on restraints restraintmask=":1-490 & !@H=", ! atoms to be restrained restraint_wt=5.0, ! force constant for restraint ntxo=1, ! Write coordinate file in ASCII format ioutfm=0, ! Write trajectory file in ASCII format /
In the above input file you'll see a line:
restraintmask=":1-490@CA,C,N", ! atoms to be restrained.
This is where the number of residues for your system needs to be updated. This change needs to be done for the following input files as well.
Create 02.equil.mdin
vi 02.equil.mdin
Copy the following lines:
MD simualation &cntrl imin=0, ! Perform MD nstlim=50000 ! Number of MD steps ntb=2, ! Constant Pressure ntc=1, ! No SHAKE on bonds between hydrogens dt=0.001, ! Timestep (ps) ntp=1, ! Isotropic pressure scaling barostat=1 ! Berendsen taup=0.5 ! Pressure relaxtion time (ps) ntf=1, ! Complete force evaluation ntt=3, ! Langevin thermostat gamma_ln=2.0 ! Collision Frequency for thermostat ig=-1, ! Random seed for thermostat temp0=298.15 ! Simulation temperature (K) ntwx= 1000, ! Write to trajectory file every ntwx steps ntpr= 1000, ! Print to mdout every ntpr steps ntwr= 1000, ! Write a restart file every ntwr steps cut= 8.0, ! Nonbonded cutoff in Angstroms ntr=1, ! Turn on restraints restraintmask=":1-490 & !@H=", ! atoms to be restrained restraint_wt=5.0, ! force constant for restraint ntxo=1, ! Write coordinate file in ASCII format ioutfm=0, ! Write trajectory file in ASCII format iwrap=1, ! iwrap is turned on /
Create 03.min.mdin
vi 03.min.mdin
Copy the following lines:
Minmize all the hydrogens &cntrl imin=1, ! Minimize the initial structure maxcyc=1000, ! Maximum number of cycles for minimization ntb=1, ! Constant volume ntp=0, ! No pressure scaling ntf=1, ! Complete force evaluation ntwx= 1000, ! Write to trajectory file every ntwx steps ntpr= 1000, ! Print to mdout every ntpr steps ntwr= 1000, ! Write a restart file every ntwr steps cut= 8.0, ! Nonbonded cutoff in Angstroms ntr=1, ! Turn on restraints restraintmask=":1-490 & !@H=", ! atoms to be restrained restraint_wt=2.0, ! force constant for restraint ntxo=1, ! Write coordinate file in ASCII format ioutfm=0, ! Write trajectory file in ASCII format /
Create 04.min.mdin
vi 04.min.mdin
Copy the following lines:
Minmize all the hydrogens &cntrl imin=1, ! Minimize the initial structure maxcyc=1000, ! Maximum number of cycles for minimization ntb=1, ! Constant volume ntp=0, ! No pressure scaling ntf=1, ! Complete force evaluation ntwx= 1000, ! Write to trajectory file every ntwx steps ntpr= 1000, ! Print to mdout every ntpr steps ntwr= 1000, ! Write a restart file every ntwr steps cut= 8.0, ! Nonbonded cutoff in Angstroms ntr=1, ! Turn on restraints restraintmask=":1-490 & !@H=", ! atoms to be restrained restraint_wt=0.1, ! force constant for restraint ntxo=1, ! Write coordinate file in ASCII format ioutfm=0, ! Write trajectory file in ASCII format /
Create 05.min.mdin
vi 05.min.mdin
Copy the following lines:
Minmize all the hydrogens &cntrl imin=1, ! Minimize the initial structure maxcyc=1000, ! Maximum number of cycles for minimization ntb=1, ! Constant volume ntp=0, ! No pressure scaling ntf=1, ! Complete force evaluation ntwx= 1000, ! Write to trajectory file every ntwx steps ntpr= 1000, ! Print to mdout every ntpr steps ntwr= 1000, ! Write a restart file every ntwr steps cut= 8.0, ! Nonbonded cutoff in Angstroms ntr=1, ! Turn on restraints restraintmask=":1-490 & !@H=", ! atoms to be restrained restraint_wt=0.05, ! force constant for restraint ntxo=1, ! Write coordinate file in ASCII format ioutfm=0, ! Write trajectory file in ASCII format /
Create 06.equil.mdin
vi 06.equil.mdin
Copy the following lines:
MD simualation &cntrl imin=0, ! Perform MD nstlim=50000 ! Number of MD steps ntb=2, ! Constant Pressure ntc=1, ! No SHAKE on bonds between hydrogens dt=0.001, ! Timestep (ps) ntp=1, ! Isotropic pressure scaling barostat=1 ! Berendsen taup=0.5 ! Pressure relaxtion time (ps) ntf=1, ! Complete force evaluation ntt=3, ! Langevin thermostat gamma_ln=2.0 ! Collision Frequency for thermostat ig=-1, ! Random seed for thermostat temp0=298.15 ! Simulation temperature (K) ntwx= 1000, ! Write to trajectory file every ntwx steps ntpr= 1000, ! Print to mdout every ntpr steps ntwr= 1000, ! Write a restart file every ntwr steps cut= 8.0, ! Nonbonded cutoff in Angstroms ntr=1, ! Turn on restraints restraintmask=":1-490 & !@H=", ! atoms to be restrained restraint_wt=1.0, ! force constant for restraint ntxo=1, ! Write coordinate file in ASCII format ioutfm=0, ! Write trajectory file in ASCII format iwrap=1, ! iwrap is turned on /
Create 07.equil.mdin
vi 07.equil.mdin
Copy the following lines:
MD simulation &cntrl imin=0, ! Perform MD nstlim=50000 ! Number of MD steps ntx=5, ! Positions and velocities read formatted irest=1, ! Restart calculation ntc=1, ! No SHAKE on for bonds with hydrogen dt=0.001, ! Timestep (ps) ntb=2, ! Constant Pressure ntp=1, ! Isotropic pressure scaling barostat=1 ! Berendsen taup=0.5 ! Pressure relaxtion time (ps) ntf=1, ! Complete force evaluation ntt=3, ! Langevin thermostat gamma_ln=2.0 ! Collision Frequency for thermostat ig=-1, ! Random seed for thermostat temp0=298.15 ! Simulation temperature (K) ntwx= 1000, ! Write to trajectory file every ntwx steps ntpr= 1000, ! Print to mdout every ntpr steps ntwr= 1000, ! Write a restart file every ntwr steps cut= 8.0, ! Nonbonded cutoff in Angstroms ntr=1, ! Turn on restraints restraintmask=":1-490 & !@H=", ! atoms to be restrained restraint_wt=0.5, ! force constant for restraint ntxo=1, ! Write coordinate file in ASCII format ioutfm=0, ! Write trajectory file in ASCII format iwrap=1, ! iwrap is turned on /
Create 08.equil.mdin
vi 08.equil.mdin
Copy the following lines:
MD simulations &cntrl imin=0, ! Perform MD nstlim=50000 ! Number of MD steps ntx=5, ! Positions and velocities read formatted irest=1, ! Restart calculation ntc=1, ! No SHAKE on for bonds with hydrogen dt=0.001, ! Timestep (ps) ntb=2, ! Constant Pressure ntp=1, ! Isotropic pressure scaling barostat=1 ! Berendsen taup=0.5 ! Pressure relaxtion time (ps) ntf=1, ! Complete force evaluation ntt=3, ! Langevin thermostat gamma_ln=2.0 ! Collision Frequency for thermostat ig=-1, ! Random seed for thermostat temp0=298.15 ! Simulation temperature (K) ntwx= 1000, ! Write to trajectory file every ntwx steps ntpr= 1000, ! Print to mdout every ntpr steps ntwr= 1000, ! Write a restart file every ntwr steps cut= 8.0, ! Nonbonded cutoff in Angstroms ntr=1, ! Turn on restraints restraintmask=":1-489@CA.C.N", ! atoms to be restrained, only the backbone restraint_wt=0.1, ! force constant for restraint ntxo=1, ! Write coordinate file in ASCII format ioutfm=0, ! Write trajectory file in ASCII format iwrap=1, ! iwrap is turned on /
NOTE: for the above input file we only want to restrain the protein and not the ligand so the '490' needs to be change to '489'.
Create 09.equil.mdin.
vi 09.equil.mdin
Copy the following lines:
MD simulations &cntrl imin=0, ! Perform MD nstlim=50000 ! Number of MD steps ntx=5, ! Positions and velocities read formatted irest=1, ! Restart calculation ntc=1, ! No SHAKE on for bonds with hydrogen dt=0.001, ! Timestep (ps) ntb=2, ! Constant Pressure ntp=1, ! Isotropic pressure scaling barostat=1 ! Berendsen taup=0.5 ! Pressure relaxtion time (ps) ntf=1, ! Complete force evaluation ntt=3, ! Langevin thermostat gamma_ln=2.0 ! Collision Frequency for thermostat ig=-1, ! Random seed for thermostat temp0=298.15 ! Simulation temperature (K) ntwx= 1000, ! Write to trajectory file every ntwx steps ntpr= 1000, ! Print to mdout every ntpr steps ntwr= 1000, ! Write a restart file every ntwr steps cut= 8.0, ! Nonbonded cutoff in Angstroms ntr=1, ! Turn on restraints restraintmask=":1-489@CA.C.N", ! atoms to be restrained restraint_wt=0.1, ! force constant for restraint ntxo=1, ! Write coordinate file in ASCII format ioutfm=0, ! Write trajectory file in ASCII format iwrap=1, ! iwrap is turned on /
NOTE: for the above input file we only want to restrain the protein and not the ligand so the '490' needs to be change to '489'.
Once you have all nine files in your directory we need to create a slurm script, 4s0v.equil.slurm, to run them:
#!/bin/bash # #SBATCH --job-name=4s0v_equilibration #SBATCH --output=equilibration_output.txt #SBATCH --ntasks-per-node=24 #SBATCH --nodes=1 #SBATCH --time=48:00:00 #SBATCH -p long-24core module load amber/16 do_parallel="mpirun -np 80 pmemd.MPI" prmtop="../003.tleap/4s0v.wet.complex.prmtop" coords="../003.tleap/4s0v.wet.complex" MDINPUTS=(01.min 02.equil 03.min 04.min 05.min 06.equil 07.equil 08.equil 09.equil) for input in ${MDINPUTS[@]}; do $do_parallel -O -i ${input}.mdin -o ${input}.mdout -p $prmtop -c ${coords}.rst7 -ref ${coords}.rst7 -x ${input}.trj -inf ${input}.info -r ${input}.rst7 coords=$input done
Submit the above file to Seawulf by typing:
sbatch 4s0v.equil.slurm
Once the program finishes you will see multiple new files including a file named logfile. Check to file for errors. If there are none, you can move onto the next step. You should also check the .trj files from each step to make sure that the system appears rational. Below is the equillibrated system following 09.equil.mdin
Production Run
Now that our system is set up properly we can move onto an MD production run of the structure. This will again be completed on the command line, please cd into 005.production
Create the input file, 10.prod.min
nano 10.prod.mdin
And copy the following lines:
MD simulations &cntrl imin=0, ! Perform MD nstlim=5000000, ! Number of MD steps ntx=5, ! Positions and velocities read formatted irest=1, ! Restart calculation ntc=2, ! SHAKE on for bonds with hydrogen dt=0.002, ! Timestep (ps) ntb=2, ! Constant Pressure ntp=1, ! Isotropic pressure scaling barostat=1 ! Berendsen taup=0.5 ! Pressure relaxtion time (ps) ntf=2, ! No force evaluation for bonds with hydrogen ntt=3, ! Langevin thermostat gamma_ln=2.0 ! Collision Frequency for thermostat ig=-1, ! Random seed for thermostat temp0=298.15 ! Simulation temperature (K) ntwx= 2500, ! Write to trajectory file every ntwx steps ntpr= 2500, ! Print to mdout every ntpr steps ntwr= 5000000, ! Write a restart file every ntwr steps cut=8.0, ! Nonbonded cutoff in Angstroms ntr=1, ! Turn on restraints restraintmask=":1-489@CA,C,N", ! atoms to be restrained restraint_wt=0.1, ! force constant for restraint ntxo=1, ! Write coordinate file in ASCII format ioutfm=0, ! Write trajectory file in ASCII format iwrap=1, ! iwrap is turned on $end
Note for the restraintmask line, use the same number as for the 09.equil.mdin file in the previous step. In our case, this number is 489.
nano production.sh
This input file can now be run on Seawulf using the following slurm script:
#!/bin/bash # #SBATCH --job-name=4s0v_production #SBATCH --output=4s0v_production.txt #SBATCH --ntasks-per-node=28 #SBATCH --nodes=4 #SBATCH --time=48:00:00 #SBATCH -p long-28core module load amber/16 do_parallel="mpirun pmemd.MPI" prmtop="../003.tleap/4s0v.wet.complex.prmtop" coords="../004.equil/09.equil" MDINPUTS=(10.prod) for input in ${MDINPUTS[@]}; do $do_parallel -O -i ${input}.mdin -o ${input}.mdout -p $prmtop -c ${coords}.rst7 -ref ${coords}.rst7 -x ${input}.trj -inf ${input}.info -r ${input}.rst7 coords=$input done
Analysis
The output of the production run performed in the previous step is to generate and MD trajectory of the ligand/protein complex. Once this is done we need to analyze the results. We will analysis three different aspects of the complex:
- RMDS (the distance the ligand and protein have moved from their initial positions)
- H-bonds (what hydrogren bonds have been formed between the ligand and protein)
- MM-GBSA (estimation of the free binding energy between the ligand and protein)
We will again be working on the command line, please cd into your 006a.RMSD directory
RMSD
To perform the RMSD calculations, first create an input file which will delete the water and charges from the production trajectory.
nano cpp_strip_water.in
with the following inputs. Note that this should cover all CA atoms (including the ligand) in your complex
#!/usr/bin/sh parm ../003.tleap/4s0v.wet.complex.prmtop #read in trajectory trajin ../005.production/10.prod.trj #read in reference reference ../003.tleap/4s0v.wet.complex.rst7 #compute RMSD + align CA to crystal structure rmsd rms1 reference :1-490@CA #strip solvent strip :WAT:Na+:Cl- #create gas-phase trajectory trajout 4s0v_stripfit_restrained_gas.trj nobox cpp_strip_water.in (END)
The reference mask should contain BOTH the ligand and receptor in its value.
Now run this file using the following command:
cpptraj -i cpp_strip_water.in
Next, create another input file which will be used to calculate the RMSD of the ligand by typing
nano cpp_rmsd_lig.in
with inputs
#!/usr/bin/sh #trajin the trajectory trajin 4s0v_stripfit_restrained_gas.trj #read in reference reference ../003.tleap/4s0v.gas.complex.rst7 #compute RMSD (do not fit internal geometris first, included rigid body motions, convert frames to ns (framenum*.005) rmsd rms1 ":490&!(@H=)" nofit mass out 4s0v_lig_restrained_rmsd_nofit.dat time .005 #histogram the nofit rmsd histogram rms1,*,*,.1,* norm out 4s0v_lig_restrained_rmsd_nofit_histogram.dat
And check that the reference mask only factors in the ligand (490) in this file. Run this input file using the following command:
cpptraj -p ../003.tleap/4s0v.complex.parm7 -i cpp_rmsd_lig.in
Next, the receptor RMSD can be calculated by creating the following input file:
nano cpp_rmsd_receptor.in
with inputs
#!/usr/bin/sh #trajin the trajectory trajin 4s0v_stripfit_restrained_gas.trj #read in reference reference ../003.tleap/4s0v.gas.complex.rst7 #compute RMSD (do not fit internal geometries first, included rigid body motions, convert frames to ns (framenum*.005) rmsd rms1 ":1-489&!(@H=)" nofit mass out 4s0v_receptor_restrained_rmsd_nofit.dat time .005 #histogram the nofit rmsd histogram rms1,*,*,.1,* norm out 4s0v_receptor_restrained_rmsd_nofit_histogram.dat
And check that the reference mask only takes into account the receptor (1-489). Now run this file using the following command:
cpptraj -p ../003.tleap/4s0v.complex.parm7 -i cpp_rmsd_receptor.in
The output files from these analyses will be:
4s0v_lig_restrained_rmsd_nofit.dat 4s0v_receptor_restrained_rmsd_nofit_histogram.dat 4s0v_lig_restrained_rmsd_nofit_histogram.dat 4s0v_stripfit_restrained_gas.trj 4s0v_receptor_restrained_rmsd_nofit.dat
Examine the histogram.dat files, or plot if desired, to note the change in the RMSD of the ligand (or the receptor) over the course of the simulation. Since this was docked previously, we would hope that it would stay under 2 angstroms over the course of the simulation. There may be other simulations were movement is acceptable (or even desired) but this is not that case.
H-bonds
Next, the number of hydrogen bonds will be analyzed, including what these bonds are. Create the following input file:
nano cpp_hbonds.in
with the following:
!/usr/bin/sh #read in trajectory trajin ../004.production/10_prod.trj #wrap everything in one periodic cell - could cause problems, may comment out #autoimage if problems later autoimage #compute intra + water-mediated H-bonds hbond hb1 :1-490 out 4s0v_hbond.out avgout 4s0v_hbond_avg.dat solventdonor :WAT solventacceptor :WAT@-O nointramol brid\ geout 4s0v_bridge-water.dat dist 3.0 angle 140
Be sure that the hbobnd hb1 parameter corresponds to the residue numbers of the relevant ligand AND receptor (490 in our case). Run with:
cpptraj -p ../003.tleap/4s0v.wet.complex.prmtop -i cpp_hbonds.in
MM-GBSA Analysis
Finally, we will calculate free energies of binding in this system with MM-GBSA. Create the following input file:
nano mmgbsa_4s0v.in
This file should contain the following:
mmgbsa 4s0v analysis &general interval=1, netcdf=1, keep_files=0, / &gb igb=8 saltcon=0.0, surften=0.0072, surfoff=0.0, molsurf=0, / &nmode drms=0.001, maxcyc=10000, nminterval=250, nmendframe=2000, nmode_igb=1, /
Since this may take a while to run we should execute the script using Seawulf. Generate the following script:
4s0v_mmgbsa.sh
With the following as contents:
#bin/bash #SBATCH --time=2-00:00:00 #SBATCH --nodes=2 #SBATCH --ntasks=28 #SBATCH --job-name=4s0v_MMGBSA #SBATCH --output=4s0v_MMGBSA.log #SBATCH -p extended-28core #Define topology files solv_parm="../003.tleap/4s0v.wet.complex.prmtop" complex_parm="../003.tleap/4s0v.complex.parm7" receptor_parm="../003.tleap/4s0v.gas.receptor.parm7" lig_parm="../003.tleap/4s0v.gas.ligand.parm7" trajectory="../005.production/10_prod.trj" MMPBSA.py -O -i mmgbsa.in \ -o 4s0v_mmgbsa_results.dat \ -eo 4s0v_mmgbsa_perframe.dat \ -sp ${solv_parm} \ -cp ${complex_parm} \ -rp ${receptor_parm} \ -lp ${lig_parm} \ -y ${trajectory}
Execute this with:
sbatch 4s0v_mmgbsa.sh