Nonstandard residues prep
This tutorial is created by and for AMS536 class at Stony Brook University and made available for the use of the whole modeling community.
The following are examples of a work flow of how to assign parameters for non-standard residues compatible with the amber program. This procedure will give you reasonable parameters; however, the charge calculation should use higher levels of theory and the other parameters should be optimized Further perhaps fit to QM or optimizing to reproduce experimental observables. Parameters assigned from gaff.
Fluorophenylalanine
To prepare the mutant structure for Molecular Dynamics, a copy of the phenylalanine residue from the raw pdb file for 1QP6 was made and subsequently modified using a text editor to include the name FPH, rather than PHE. Furthermore, the five hydrogen atoms on the phenylalanine residue named ‘H’ for hydrogen, were replaced with five F’s, for fluorine. Using the program, Chimera, hydrogens were added to this modified residue and the C-F bond length was defined as 1.35 Angstroms. Also using Chimera, which has a built-in antechamber module, charges were assigned using the charge model, AM1-BCC. The .mol2 file generated was used as an input to create a new pdb file and a prep file. Using Parmchk, the prep file, which establishes connectivity between atoms, was used as an input to create the frcmod file, which stores additional parameters. The prep file was manually modified to contain AMBER parameters from the PHE.prep file in the AMBER library for the backbone atoms of the mutant residue. The resulting structure is a hybrid of the two force fields(FF99SB for the backbone, whose parameters were taken from a PHE backbone, and GAFF for the side-chain), and for this reason, additional parameters, including the bond lengths, angles, and dihedral angles had to be specified and stored in the frcmod file. A Table of Additional Parameters (bond angles)stored in the frcmod file is shown below.
Angle | Kθ | θ | Source |
c3-CT-C | 63.8 | 110.53 | GAFF: c-c3-c3 |
H1-CT-c3 | 46.4 | 110.05 | GAFF: c3-c3-ha |
CT-c3-hc | 46.4 | 110.05 | GAFF: c3-c3-ha |
CT-c3-ca | 62.5 | 114.61 | GAFF: c3-c3-ca |
N-CT-c3 | 80.0 | 109.70 | PARM99: CT-CT-N |
Below is the prep file created:
FLUOROPHENYLALANINE molecule.res FPH INT 0 CORR OMIT DU BEG 0.00000 1 DUMM DU M 0 -1 -2 0.000 0.000 0.000 0.00000 2 DUMM DU M 1 0 -1 1.449 0.000 0.000 0.00000 3 DUMM DU M 2 1 0 1.522 111.100 0.000 0.00000 4 N N M 3 2 1 1.335 116.600 180.000 -0.41570 5 H H E 4 3 2 1.010 -10.200 0.000 0.27190 6 CA CT M 4 3 2 1.449 121.900 180.000 -0.00240 7 HA H1 E 6 4 3 1.090 -109.50 300.000 0.09780 8 CB c3 3 6 4 3 1.533 110.759 -133.834 -0.01010 9 HB2 hc E 8 6 4 1.090 112.115 -149.730 0.08300 10 HB3 hc E 8 6 4 1.090 111.978 -26.211 0.08300 11 CG ca S 8 6 4 1.505 109.715 91.984 -0.10820 12 CD1 ca B 11 8 6 1.410 119.856 69.106 0.12160 13 FD1 f E 12 11 8 1.350 119.921 -0.168 -0.10880 14 CE1 ca B 12 11 8 1.410 120.090 179.880 0.07260 15 FE1 f E 14 12 11 1.350 120.094 -179.943 -0.10190 16 CZ ca B 14 12 11 1.410 119.913 0.059 0.09970 17 FZ f E 16 14 12 1.350 119.912 179.922 -0.09840 18 CE2 ca B 16 14 12 1.414 120.110 -0.071 0.07280 19 FE2 f E 18 16 14 1.350 120.153 -179.940 -0.10140 20 CD2 ca S 18 16 14 1.412 119.865 0.062 0.12490 21 FD2 f E 20 18 16 1.350 119.987 179.997 -0.10980 22 C C M 6 4 3 1.522 111.100 180.000 0.59730 23 O O E 22 6 4 1.229 120.500 0.000 -0.56790 LOOP CG CD2 IMPROPER -M CA N H CA +M C O CG CE2 CD2 FD2 CD2 CZ CE2 FE2 CE1 CE2 CZ FZ CD1 CZ CE1 FE1 CG CE1 CD1 FD1 CD1 CD2 CG CB DONE STOP
non-starandard Nucleic acids with modified backbone
The following procedure is a method of preparation for running MD simulations on a non-standard nucleic acid in AMBER. Although this was initially meant for application on a nucleic acid analog, in principle it can work with all types of molecules which are too large for the usual Antechamber parameterization.
1.With a PDB structure of the desired molecule at hand, you will have to load it into Chimera, and assign it gasteiger charges. This charge model is used because for a system larger than 60 heavy atoms, an AM1-BCC charge model would not be compatible.
2.With newest version of Chimera, save the file as a mol2 molecule, with AMBER/Gaff atom types. Having this option on is very important, because by default Chimera will give a mol2 file sybyl atom types, and TLEAP which will be used later will not be able to recognize it. Older versions of Chimera will not have this option of AMBER/Gaff atom types.
3.Now that we have all the connection/topology and charge information contained in the mol2 file, we only need one more element before we load the molecule into TLEAP, that is a frcmod file. For unknown dihedral angles types, we need to assign it force constants, and with the command
parmchk -i <nucleicacid>.mol2 -f mol2 -o CFNA.frcmod
a frcmod file with all the force constants for the unknown dihedral angle types will be generated. Note that a message may show up saying maximum number of atoms reached, however this is ok, parmchk will relocate memory, and will be successful in generating the complete frcmod file.
4.With the mol2 file and frcmod file at hand, using the key command loadmol2, a set of amber parameters will be saved. A specific example of how to write the commands as a script is as follows
set default PBradii mbondi2 source leaprc.ff99SB source leaprc.gaff loadamberparams gaff.dat.rizzo loadamberparams <nucleicacid>.frcmod file = loadmol2 <nucleicacid>.mol2 saveamberparm file <nucleicacid>.gas.prm7 <nucleicacid>.gas.rst7 charge file quit
5.So the generated prm7 would be the amber parameter file, which one can use for the MD simulations.
An alternative method would be a residue-wise parameterization approach, given that a large molecule consists of repeating units, such as a DNA analog in our case. However this is tricky, as since this method employs prep files, which uses a z-matrix for defining atom connections. A general description is as follows.
1.Build each standalone residue structure, with hydrogens capped at the connection points.
2.Now assign each residue with AM1-BCC charges in Chimera, save it as a mol2 file.
3.Use Antechamber to generate prep files for every residue. The actual command will be
antechamber -i <residue>.mol2 -fi mol2 -o <residue>.prep -fo prepi
4.Now we have at hand the prep files of both hydrogen capped and heavy atom capped residues, open up the hydrogen capped preps. Delete the whole line of the capping hydrogens. Try to redistribute the resulting deleted charges over to other atoms, so that we have a neutral molecule(or whatever integral charge it had initially).
5.Now here's the tricky part. Notice the beginning of the prep file, there are some dummy atoms, and then comes your first actual atom. In an example of a prep file for standard DNA, the phosphorous wants to connect to atom number 3, with bond length 1.60 angstroms, bond angle of 119.04 with 4-3-2, dihedral angle of 200.00 with 4-3-2-1. See table.
D-ADENOSINE - with 5' - phosphate group and 3' - O(minus) group CORRECT OMIT DU BEG 0.0 1 DUMM DU M 0 -1 -2 0.00 0.00 0.00 0.0000 2 DUMM DU M 1 0 -1 1.00 0.00 0.00 0.0000 3 DUMM DU M 2 1 0 1.00 90.00 0.00 0.0000 4 P P M 3 2 1 1.60 119.04 200.00 1.1662 5 O1P O2 E 4 3 2 1.48 109.61 150.00 -0.7760 6 O2P O2 E 4 3 2 1.48 109.58 20.00 -0.7760 7 O5' OS M 4 3 2 1.60 101.43 -98.89 -0.4989 8 C5' CT M 7 4 3 1.44 119.00 -39.22 0.0558 9 H5'1 H1 E 8 7 4 1.09 109.50 60.00 0.0679 10 H5'2 H1 E 8 7 4 1.09 109.50 -60.00 0.0679 11 C4' CT M 8 7 4 1.52 110.00 180.00 0.1065 12 H4' H1 E 11 8 7 1.09 109.50 -200.00 0.1174 13 O4' OS S 11 8 7 1.46 108.86 -86.31 -0.3548 14 C1' CT B 13 11 8 1.42 110.04 105.60 0.0394 15 H1' H2 E 14 13 11 1.09 109.50 -240.00 0.2007 16 N9 N* S 14 13 11 1.52 109.59 -127.70 -0.0251 17 C8 CK B 16 14 13 1.37 131.20 81.59 0.2006 18 H8 H5 E 17 16 14 1.08 120.00 0.00 0.1553 19 N7 NB S 17 16 14 1.30 113.93 177.00 -0.6073 20 C5 CB S 19 17 16 1.39 104.00 0.00 0.0515 21 C6 CA B 20 19 17 1.40 132.42 180.00 0.7009 22 N6 N2 B 21 20 19 1.34 123.50 0.00 -0.9019 23 H61 H E 22 21 20 1.01 120.00 180.00 0.4115 24 H62 H E 22 21 20 1.01 120.00 0.00 0.4115 25 N1 NC S 21 20 19 1.34 117.43 180.00 -0.7615 26 C2 CQ B 25 21 20 1.33 118.80 0.00 0.5875 27 H2 H5 E 26 25 21 1.08 120.00 180.00 0.0473 28 N3 NC S 26 25 21 1.32 129.17 0.00 -0.6997 29 C4 CB E 28 26 25 1.35 110.80 0.00 0.3053 30 C3' CT M 11 8 7 1.53 115.78 -329.11 0.2022 31 H3' H1 E 30 11 8 1.09 109.50 30.00 0.0615 32 C2' CT B 30 11 8 1.53 102.80 -86.30 0.0670 33 H2'1 H1 E 32 30 11 1.09 109.50 120.00 0.0972 34 O2' OH S 32 30 11 1.43 109.50 240.00 -0.6139 35 HO'2 HO E 34 32 30 0.96 107.00 180.00 0.4186 36 O3' OS M 30 11 8 1.42 116.52 -203.47 -0.5246
So what this means is that, when these prep files get loaded into TLEAP, TLEAP will according to your given residue sequence(of the pdb file of your complete molecule you have loaded in also, say ATCGGTCA) try to build a structure of the same sequence, according to the preps it has, and verify the built structure with your pdb structure. And if it matches, we are happy, and the MD is ready to run. It is suppose to match, so the connection points are critical. Notice the dummy atoms have an "M" mark after "DU", therefore dummy atom 3, 2, 1, actually corresponds to the first, second and third atom also with the "M" mark, in the neighboring residue(to make things simple let's think of it as also being the same adenosine), counting from bottom up. So with an adenosine-adenosine connection, see how O3', C3' and C4' actually fits the criteria of both being the first 3 atoms bearing "M" in the bottom up count order, and being the actual connection atoms, which define the bond length, bond angle and dihedral angle in DNA. Compare table and figure and you will see.
In our prep file for our own residues, we will need to get the connection information correct, just like DNA in the AMBER library. Antechamber generates a prep file with atoms in a arbitrary order, we will have to tweak the four connection atoms, to make atom number 4 our terminal atom(just like phosphorous in DNA), the other 3 connection defining atoms in the correct relative order(just like how O3', C3' and C4' is relatively sequenced and marked with "M"), and all marking them with "M". Change any interfering "M"s to E. Bond length, angle and dihedral angle numbers will have to be obtained manually from visualization software like MOE, and input into atom row 4. Fiddling with the z-matrix is not easy, and one is very likely to make errors, so a facile approach would be the first method mentioned. However this second method may be slightly more accurate in terms of charge model, and once we have parameterized all the residues for a new species of nucleic acid analog or some residue-based structure, we can build a molecule of its species in any arbitrary residue sequence.
6.Finally, run command
parmchk -i <residue>.prep -f prepi -o <residue>.frcmod
and we have both the prep and frcmod file at hand.
7.An example of a script of commands to be used in TLEAP to generate the AMBER parameters as follows:
set default PBradii mbondi2 source leaprc.ff99SB source leaprc.gaff loadamberparams <residue>.frcmod (...repeat for every residue) loadamberprep <residue>.prep (...repeat for every residue) file = loadpdb <nucleicacid>.pdb saveamberparm file <nucleicacid>.gas.prm7 <nucleicacid>.gas.rst7 charge lig quit