Before running this tutorial, you should be familiar with running constant pH simulations in implicit solvent (see http://ambermd.org/tutorials/advanced/tutorial18/ for an overview). Running constant pH molecular dynamics (CpHMD) in explicit solvent is very similar to running CpHMD in implicit solvent with a few small differences.
Likewise, if you wish to run pH-REMD in explicit solvent, see the pH-REMD tutorial (click here) for information about pH-REMD.
You must have at least Amber and AmberTools version 14 to run CpHMD in explicit solvent. You can use either sander or pmemd for the constant pH calculations. Please note that, while CpHMD simulations in explicit solvent will work in Amber 14's version of pmemd.cuda, you need at least Amber 16 to achieve acceptable performance on the GPU.
Creating the Input Files
Generating the structure and topology
For the purposes of this tutorial, you will need a PDB file with the starting structure of your protein. For the purposes of this demonstration, I will be using the 3LZT crystal structure of the hen egg-white lysozyme (HEWL) (click here). I modify the PDB file by eliminating everything except the 129 protein residues. Then, you need to modify some of the residue names for those residues that you want to titrate. Specifically, you need to make sure the following residues have the correct names for their "titratable" residues:
- ASP -> AS4 (Aspartate)
- GLU -> GL4 (Glutamate)
- HIS -> HIP (Histidine)
- LYS -> LYS (Lysine)
- TYR -> TYR (Tyrosine)
- CYS -> CYS (Cysteine)
This step is handled by tleap. The leaprc.constph file contains the necessary residue libraries and force field definitions to set a system up for CpHMD simulations. The LEaP script below will create 3LZT.solv10.parm7 and 3LZT.solv10.rst7 — the topology file and coordinate file, respectively — complete with the appropriate disulfide bonds and titratable residue names.
tleap.in:
source leaprc.constph
# Load the PDBs
3lzt = loadPDB 3LZT_Fixed.pdb
# Load ion force field params
loadAmberParams frcmod.ionsjc_tip3p
# Make the disulfide bonds
bond 3lzt.6.SG 3lzt.127.SG
bond 3lzt.30.SG 3lzt.115.SG
bond 3lzt.64.SG 3lzt.80.SG
bond 3lzt.76.SG 3lzt.94.SG
# Solvate
solvateOct 3lzt TIP3PBOX 10.0
# Add ions
addIonsRand 3lzt Cl- 23 Na+ 14
# Save topology files
saveAmberParm 3lzt 3LZT.solv10.parm7 3LZT.solv10.rst7
# Quit
quit
You can create the topology and coordinate files using the command
tleap -f tleap.in
Creating the cpin file
Creating this file is slightly more involved than creating the cpin file for an implicit solvent CpHMD simulation owing to the need to fix the implicit solvent radii for the carboxylate oxygens of any AS4 or GL4 residues you wish to titrate. Because carboxylate residues are defined with 4 hydrogens (one on the syn- and anti- positions of both carboxylate oxygens), the effective Born radius is artificially increased as a result of the extra hydrogen atoms. To counter this effect, cpinutil.py was modified with an option to create a new topology file with the 'correct' radii.
If you plan to titrate any aspartate or glutamate residues, you must create a new prmtop file with cpinutil.py and use the new intrinsic radii, as the reference energies have been defined for that set of intrinsic radii. Also, only the igb=2 GB model has been parametrized for explicit solvent simulations, so we will use that one. The following command will produce the 3LZT.solv10.cpin and 3LZT.solv10.modO.parm7 (we will just titrate the acid-range residues AS4, GL4, and HIP):
bash $ cpinutil.py -p 3LZT.solv10.parm7 -igb 2 -o 3LZT.solv10.cpin -op 3LZT.solv10.modO.parm7 -resnames AS4 GL4 HIP
CPIN generation complete!
For explicit solvent simulations, the cpin file will have a couple extra variables than are typically present in implicit solvent CpHMD simulations (see below):
&CNSTPH
CHRGDAT=-0.4157,0.2719,0.0145,0.0779,-0.0398,-0.0173,-0.0173,0.0136,-0.0425,
-0.0425,0.8054,-0.8188,-0.8188,0.0,0.5973,-0.5679,0.0,0.0,0.0,-0.4157,0.2719,
0.0145,0.0779,-0.0071,0.0256,0.0256,-0.0174,0.043,0.043,0.6801,-0.5838,-0.6511,
0.4641,0.5973,-0.5679,0.0,0.0,0.0,-0.4157,0.2719,0.0145,0.0779,-0.0071,0.0256,
0.0256,-0.0174,0.043,0.043,0.6801,-0.5838,-0.6511,0.0,0.5973,-0.5679,0.4641,
0.0,0.0,-0.4157,0.2719,0.0145,0.0779,-0.0071,0.0256,0.0256,-0.0174,0.043,0.043,
0.6801,-0.6511,-0.5838,0.0,0.5973,-0.5679,0.0,0.4641,0.0,-0.4157,0.2719,0.0145,
0.0779,-0.0071,0.0256,0.0256,-0.0174,0.043,0.043,0.6801,-0.6511,-0.5838,0.0,
0.5973,-0.5679,0.0,0.0,0.4641,-0.3479,0.2747,-0.1354,0.1212,-0.0414,0.081,
0.081,-0.0012,-0.1513,0.3866,-0.017,0.2681,-0.1718,0.3911,-0.1141,0.2317,
0.7341,-0.5894,-0.3479,0.2747,-0.1354,0.1212,-0.111,0.0402,0.0402,-0.0266,
-0.3811,0.3649,0.2057,0.1392,-0.5727,0.0,0.1292,0.1147,0.7341,-0.5894,-0.3479,
0.2747,-0.1354,0.1212,-0.1012,0.0367,0.0367,0.1868,-0.5432,0.0,0.1635,0.1435,
-0.2795,0.3339,-0.2207,0.1862,0.7341,-0.5894,-0.4157,0.2719,0.0341,0.0864,
-0.1783,-0.0122,-0.0122,0.7994,-0.8014,-0.8014,0.0,0.5973,-0.5679,0.0,0.0,0.0,
-0.4157,0.2719,0.0341,0.0864,-0.0316,0.0488,0.0488,0.6462,-0.5554,-0.6376,
0.4747,0.5973,-0.5679,0.0,0.0,0.0,-0.4157,0.2719,0.0341,0.0864,-0.0316,0.0488,
0.0488,0.6462,-0.5554,-0.6376,0.0,0.5973,-0.5679,0.4747,0.0,0.0,-0.4157,0.2719,
0.0341,0.0864,-0.0316,0.0488,0.0488,0.6462,-0.6376,-0.5554,0.0,0.5973,-0.5679,
0.0,0.4747,0.0,-0.4157,0.2719,0.0341,0.0864,-0.0316,0.0488,0.0488,0.6462,
-0.6376,-0.5554,0.0,0.5973,-0.5679,0.0,0.0,0.4747,
PROTCNT=0,1,1,1,1,2,1,1,0,1,1,1,1,
RESNAME='System: Unknown','Residue: GL4 7','Residue: HIP 15','Residue: AS4 18',
'Residue: GL4 35','Residue: AS4 48','Residue: AS4 52','Residue: AS4 66',
'Residue: AS4 87','Residue: AS4 101','Residue: AS4 119',
RESSTATE=0,0,0,0,0,0,0,0,0,0,
STATEINF(0)%FIRST_ATOM=102, STATEINF(0)%FIRST_CHARGE=0,
STATEINF(0)%FIRST_STATE=0, STATEINF(0)%NUM_ATOMS=19, STATEINF(0)%NUM_STATES=5,
STATEINF(1)%FIRST_ATOM=233, STATEINF(1)%FIRST_CHARGE=95,
STATEINF(1)%FIRST_STATE=5, STATEINF(1)%NUM_ATOMS=18, STATEINF(1)%NUM_STATES=3,
STATEINF(2)%FIRST_ATOM=277, STATEINF(2)%FIRST_CHARGE=149,
STATEINF(2)%FIRST_STATE=8, STATEINF(2)%NUM_ATOMS=16, STATEINF(2)%NUM_STATES=5,
STATEINF(3)%FIRST_ATOM=543, STATEINF(3)%FIRST_CHARGE=0,
STATEINF(3)%FIRST_STATE=0, STATEINF(3)%NUM_ATOMS=19, STATEINF(3)%NUM_STATES=5,
STATEINF(4)%FIRST_ATOM=742, STATEINF(4)%FIRST_CHARGE=149,
STATEINF(4)%FIRST_STATE=8, STATEINF(4)%NUM_ATOMS=16, STATEINF(4)%NUM_STATES=5,
STATEINF(5)%FIRST_ATOM=790, STATEINF(5)%FIRST_CHARGE=149,
STATEINF(5)%FIRST_STATE=8, STATEINF(5)%NUM_ATOMS=16, STATEINF(5)%NUM_STATES=5,
STATEINF(6)%FIRST_ATOM=1029, STATEINF(6)%FIRST_CHARGE=149,
STATEINF(6)%FIRST_STATE=8, STATEINF(6)%NUM_ATOMS=16, STATEINF(6)%NUM_STATES=5,
STATEINF(7)%FIRST_ATOM=1327, STATEINF(7)%FIRST_CHARGE=149,
STATEINF(7)%FIRST_STATE=8, STATEINF(7)%NUM_ATOMS=16, STATEINF(7)%NUM_STATES=5,
STATEINF(8)%FIRST_ATOM=1537, STATEINF(8)%FIRST_CHARGE=149,
STATEINF(8)%FIRST_STATE=8, STATEINF(8)%NUM_ATOMS=16, STATEINF(8)%NUM_STATES=5,
STATEINF(9)%FIRST_ATOM=1811, STATEINF(9)%FIRST_CHARGE=149,
STATEINF(9)%FIRST_STATE=8, STATEINF(9)%NUM_ATOMS=16, STATEINF(9)%NUM_STATES=5,
STATENE=0,21.2486237123,21.2486237123,21.2486237123,21.2486237123,0,
-11.7115914534,-16.2435974337,0,38.7598760021,38.7598760021,38.7598760021,
38.7598760021,
TRESCNT=10,CPHFIRST_SOL=1998, CPH_IGB=2, CPH_INTDIEL=1.0,
/
The CPHFIRST_SOL is the first atom that will be treated as solvent, and all atoms from CPHFIRST_SOL to the last will be ignored when evaluating protonation state changes in GB. If you need to include specific structural water molecules or ions, make sure you appropriately adjust CPHFIRST_SOL to reflect the atoms you wish to include. Note that structural waters and/or ions should NOT exchange with other solvent or ion residues, otherwise the wrong ones will be included in the protonation state change attempts.
The CPH_IGB is the GB model you wish to use, and the CPH_INTDIEL is the dielectric constant that will be used. You should not modify either of these variables.
Setting up the simulation
Minimization
In this step, we will minimize the structure, putting restraints on the backbone with the following input file:
Minimization to relax initial bad contacts, explicit solvent
&cntrl
imin=1,
ncyc=1000,
maxcyc=5000,
ntpr=50,
cut=8,
restraintmask='!:WAT&@CA,C,O,N'
restraint_wt=10.0,
/
This can be run with pmemd, sander, or pmemd.cuda.
Heating
The next step is to heat the structure. The following input file will heat slowly over 400 ps at constant volume from 0 to 300 K.
Explicit solvent initial heating mdin
&cntrl
imin=0, irest=0, ntx=1,
ntpr=1000, ntwx=1000, nstlim=200000,
dt=0.002, ntt=3, gamma_ln=5.0, ig=-1,
ntc=2, ntf=2, cut=8, ntb=1,
iwrap=1, ioutfm=1, nmropt=1,
/
&wt
TYPE='TEMP0', ISTEP1=0, ISTEP2=150000,
VALUE1=10.0, VALUE2=300.0,
/
&wt TYPE='END' /
This step can be run with either sander, pmemd, or pmemd.cuda.
Equilibration (Relaxation)
Our next step is to equilibrate the structure and stabilize the density. In this step, we will run in the NPT ensemble for 4 ns.
Explicit solvent molecular dynamics constant pressure MD
&cntrl
imin=0, irest=1, ntx=5,
ntpr=1000, ntwx=1000, nstlim=2000000,
dt=0.002, ntt=3, tempi=300,
temp0=300, gamma_ln=1.0, ig=-1,
ntp=1, ntc=2, ntf=2, cut=8,
ntb=2, iwrap=1, ioutfm=1,
/
This step can be run using sander, pmemd, or pmemd.cuda. You can also apply backbone restraints on this step if you want to, and extend the equilibration with NVT dynamics. This step can be run with sander, pmemd, or pmemd.cuda.
Production
To actually run the simulation, you need to add the following variables to your standard, explicit solvent input files:
- icnstph: This should be set to 2 to toggle explicit solvent CpHMD
- solvph: This is the solution pH
- ntcnstph: This is the number of steps between attempted protonation state changes.
- ntrelax: This is the number of steps of relaxation dynamics you wish to run. (This is unique to running in explicit solvent)
- saltcon: This is the GB salt concentration (and only used in the protonation state change attempts). Since the reference energies were defined using a salt concentration of 0.1 M, this should be set to 0.1.
If you are familiar with CpHMD in implicit solvent, you know that one protonation state change is attempted every ntcnstph steps, so values of 5-10 are typically used in systems that have ca. 10 titratable residues. In explicit solvent, however, we attempt to change the protonation state of every titratable residue (in random order), since each successful exchange is accompanied by a (expensive) solvent relaxation step. As a result, you should set ntcnstph to a much larger value. For example, I have seen good results with ntcnstph set to 100 to 500.
The relaxation dynamics, too, should be carefully chosen. In my studies, I use 200 fs (100 steps using a 2 fs time step). The input file at pH 4.5 would then look like this:
Explicit solvent constant pH MD
&cntrl
imin=0, irest=1, ntx=5, ntxo=2,
ntpr=1000, ntwx=1000, nstlim=1000000,
dt=0.002, ntt=3, tempi=300,
temp0=300, gamma_ln=5.0, ig=-1,
ntc=2, ntf=2, cut=8, iwrap=1,
ioutfm=1, icnstph=2, ntcnstph=100,
solvph=-3.0, ntrelax=100, saltcon=0.1,
/
Analyzing the results
The constant pH output files printed here are identical to those printed for implicit solvent (for both pH-REMD and standard CpHMD). Please see the respective tutorials (CpHMD or pH REMD) for instructions on analyzing the results.