Python utility that converts AMBER .prmtop/PDB files to LAMMPS data/parameter formats through CLI or Python API, featuring validation-first error handling. Can handle multiple AMBER topology files for mixed molecular systems (e.g., drug + solvent mixtures).
- 🔧 Multi-Topology Conversion: Convert mixed systems using multiple AMBER .prmtop files in a single run
- 🧪 Mixed & Replicated Systems: Handle mixtures (e.g., drug + solvent) and multiple copies of one molecule via -c
- ✅ Validation-First: Clear errors for missing files, mismatched list lengths, and PDB atom-count/order issues
- 🔄 CLI + Python API: Same conversion engine from the command line or amber2lammps(...) in Python
- 📦 Debug Artifacts (--keep-temp): Optionally write pairs.txt, bonds.txt, angles.txt, dihedrals.txt grouped by topology
Run molecular dynamics simulations in LAMMPS using systems prepared in AMBER format. Supports both single-molecule and mixed molecular systems with multiple topology files.
Typical workflow:
- Start with a molecular structure (PDB file or SMILES string).
- Generate AMBER topology files (.prmtop) with AmberTools (and optionally .crd for other tools/validation).
- Convert (.prmtop + PDB coordinates) to LAMMPS format with this tool.
- Run your simulation in LAMMPS.
Input → Output Mapping
- .prmtop: Molecular topology (bonds, angles, atom types, force-field parameters) → LAMMPS data and parameter files. Supports multiple topologies for mixed systems.
- PDB: Atomic coordinates (single molecule or combined from PACKMOL) → LAMMPS coordinates with proper atom indexing.
- Outputs: data.lammps (coordinates, box, topology) and parm.lammps (bonded and nonbonded parameters).
- Open Source
- Citation
- Platform Compatibility
- What You Need
- Installation
- SMILES to PDB Workflow
- Command Reference
- How It Works
- Tutorial
- Validation with InterMol
- Contributing
- Acknowledgements
This is an open-source project. The source code is freely available for use, modification, and distribution under the MIT License.
If you use this software in your research, please cite it as:
Tested and validated on:
- Linux (Ubuntu, CentOS, Red Hat, Debian)
- macOS (Intel and Apple Silicon)
- Windows 10/11 (native Python or WSL2; WSL2 recommended)
- Structure input: A PDB file of the molecular structure (or a SMILES string that you convert to PDB; see workflow below). For mixed systems, use a combined PDB file from PACKMOL containing all molecules.
- AMBER prep: AmberTools (antechamber, tleap) to generate .prmtop. For mixed systems, generate separate .prmtop files for each molecule type.
- Python: Python 3.8+ with parmed and numpy.
- LAMMPS: A build with the MOLECULE, KSPACE, and EXTRA-MOLECULE packages, available on your PATH (run lmp -h to confirm).
Note: If your PDB file was generated from SMILES or another source, it is recommended that you process it with antechamber using the -dr yes option.

PDB format note: The converter uses fixed-column parsing of ATOM/HETATM records. PACKMOL-style PDBs work well; non-standard PDBs (e.g., unusual formatting, multi-model files) may fail or require cleanup.
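As a rough illustration of what fixed-column parsing implies, the sketch below reads coordinates from ATOM/HETATM records using the standard PDB column layout (atom name in columns 13-16; x, y, z in columns 31-38, 39-46, 47-54). The names read_pdb_coords and make_record are hypothetical helpers written for this example; they are not the converter's actual parser.

```python
def make_record(serial, name, resname, resseq, x, y, z):
    """Build a HETATM line that follows the standard PDB column layout."""
    return (
        f"HETATM{serial:5d} {name:<4s} {resname:>3s} A{resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


def read_pdb_coords(lines):
    """Return (atom_name, x, y, z) tuples from ATOM/HETATM records."""
    atoms = []
    for line in lines:
        # Other record types (REMARK, TER, END, ...) are skipped.
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()   # columns 13-16
        x = float(line[30:38])       # columns 31-38
        y = float(line[38:46])       # columns 39-46
        z = float(line[46:54])       # columns 47-54
        atoms.append((name, x, y, z))
    return atoms


sample = [
    "REMARK example",
    make_record(1, "C1", "ETH", 1, 1.0, 2.5, -0.75),
    make_record(2, "O1", "ETH", 1, 2.4, 2.5, -0.75),
    "END",
]
atoms = read_pdb_coords(sample)
```

Because the fields are sliced by position rather than split on whitespace, a file whose columns are shifted by even one character yields wrong or unparsable coordinates, which is why non-standard PDBs may need cleanup first.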
PACKMOL is required for LAMMPS simulations with multiple copies of a molecule or with mixed molecular systems. Install from: https://m3g.github.io/packmol/
PACKMOL creates combined PDB files with multiple molecules positioned in a simulation box.
Verify installation by running:
which packmol

Install AmberTools (e.g., AmberTools23) from https://ambermd.org/GetAmber.php#ambertools and activate the environment:

conda activate AmberTools23 # or your AmberTools version

# Recommended
conda install -c conda-forge parmed numpy
# Alternative
pip install parmed numpy

Download LAMMPS (https://lammps.org/) and build a recent stable version with packages MOLECULE, KSPACE, and EXTRA-MOLECULE enabled. Verify your installation with lmp -h or which lmp.
Use Open Babel (obabel) to generate a 3D PDB from SMILES.
# macOS
brew install open-babel
# conda
conda install -c openbabel openbabel
# Ubuntu/Debian
sudo apt-get install openbabel

# Basic conversion
obabel -:CCO -opdb -O ethanol.pdb --gen3d
# With explicit hydrogens (recommended)
obabel -:CCO -h -opdb -O ethanol.pdb --gen3d
# Other examples
obabel -:c1ccccc1 -opdb -O benzene.pdb --gen3d # Benzene
obabel -:"CC(=O)OC1=CC=CC=C1C(=O)O" -opdb -O aspirin.pdb --gen3d # Aspirin

Outputs: a LAMMPS data file (<data_file>, e.g., data.lammps) and a separate parameter file (<param_file>, e.g., parm.lammps).
| Argument | Required | Description |
|---|---|---|
| data_file | yes | Output LAMMPS data file name |
| param_file | yes | Output LAMMPS parameter file name |
| pdb_file | yes | Combined PDB containing all molecules (e.g., from PACKMOL) |
| -t, --topologies | yes | One or more AMBER topology files (.prmtop) |
| -c, --counts | yes | Molecule counts for each topology (same order/length as --topologies) |
| --charges | yes | Target net charge per topology (list matching --topologies; use 0 0 ... for neutrals) |
| -b, --buffer | optional | Vacuum padding (Å) for the simulation box. Default: 3.8. |
| --verbose | optional | Print step-by-step progress, counts, and box size. Default: False. |
| --keep-temp | optional | Keep temporary files (pairs.txt, bonds.txt, angles.txt, dihedrals.txt) grouped by topology. Default: False. |
| -h, --help | optional | Show help message. |
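To make the effect of -b/--buffer concrete, here is a minimal sketch of a box built from coordinate extents plus padding. It assumes the buffer is applied on both the low and high side of every axis; the function name box_bounds is hypothetical and the converter's own implementation may differ in detail.

```python
def box_bounds(coords, buffer=3.8):
    """Return ((xlo, xhi), (ylo, yhi), (zlo, zhi)) padded by `buffer` (Å)."""
    xs, ys, zs = zip(*coords)
    # Assumption: the same padding is added below the minimum and
    # above the maximum along each axis.
    return tuple((min(axis) - buffer, max(axis) + buffer) for axis in (xs, ys, zs))


coords = [(0.0, 0.0, 0.0), (10.0, 5.0, 2.0), (4.0, -1.0, 1.0)]
bounds = box_bounds(coords, buffer=2.0)
```

Under this assumption, a molecule spanning 10 Å along x with a 2 Å buffer gets a 14 Å box edge, so a larger buffer directly increases the vacuum around the system.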
- Input validation: validate_files checks that the topology files and pdb_file exist and that topologies and counts have matching lengths. The CLI does this automatically; with the Python API you call it yourself.
- Load topology and parameters: ParmEd reads atoms, bonds, angles, dihedrals, masses, and LJ parameters directly from each .prmtop (supports multiple topologies).
- Atom types and masses: Atom types are namespaced on conflict across topologies; masses and LJ params are captured per canonical type.
- Coordinates and box: Coordinates come from the combined PDB; the box is min/max of those coordinates expanded by buffer.
- Charge normalization: For each topology, charges are uniformly shifted to hit the user-provided --charges target (tolerance 1e-6). Totals are checked across all molecules.
- Nonbonded coefficients: Like–like pair_coeff lines are emitted per atom type; cross terms are left to LAMMPS mixing rules.
- Bonded coefficients (two-pass, deduplicated): A first pass iterates each topology once to build global type registries for bonds, angles, and dihedrals, keyed by rounded parameter values so that identical force constants across different topologies or molecule copies share one type ID. Dihedrals are grouped by atom-index tuple first to correctly merge multi-term Fourier companions into a single dihedral_coeff line, then deduplicated by parameter signature across symmetry-equivalent instances. A second pass writes the connectivity rows (Bonds/Angles/Dihedrals sections) for all replicas, referencing the pre-computed type IDs. This means bond/angle/dihedral type counts stay constant regardless of how many copies (-c N) are used.
- Export and cleanup: Data/parameter files are written; debug files (pairs.txt, bonds.txt, angles.txt, dihedrals.txt) are kept if --keep-temp is set.
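The deduplication idea in the bonded-coefficients step can be sketched as a registry keyed by rounded parameter tuples. This is an illustration only: the function names, the rounding to 6 decimals, and the (force constant, equilibrium length) layout are assumptions, not the converter's actual code.

```python
def param_key(params, ndigits=6):
    """Round parameters so numerically identical terms share one key."""
    return tuple(round(p, ndigits) for p in params)


def build_type_registry(param_sets, ndigits=6):
    """First pass: assign a 1-based type ID to each distinct parameter set."""
    registry = {}
    for params in param_sets:
        key = param_key(params, ndigits)
        if key not in registry:
            registry[key] = len(registry) + 1
    return registry


# Hypothetical bond parameters (k, r0) gathered from two topologies.
bond_params = [
    (340.0, 1.09),
    (340.0000000001, 1.09),  # same bond type up to round-off
    (300.0, 1.526),
]
registry = build_type_registry(bond_params)

# Second pass: every replica looks up the pre-computed ID, so ten copies
# of a molecule add connectivity rows but no new bond types.
replica_type_ids = [registry[param_key(bond_params[0])] for _ in range(10)]
```

Keying on parameter values rather than on per-molecule indices is what keeps the type count independent of the -c counts.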
AMBER Charge Schemes: AMBER uses various charge methods including:
- RESP (Restrained Electrostatic Potential): Derived from quantum mechanical electrostatic potential
- AM1-BCC: Semi-empirical charges with bond charge corrections
- CM5 or CM1A: Charge models that correct population-analysis charges (CM5 starts from Hirshfeld charges, CM1A from AM1 Mulliken charges)
Charge Normalization in AMBER2LAMMPS:
- Systems should generally be neutral for PME convergence; AM1-BCC charge calculations often leave small residuals (±0.003).
- For neutral molecules, run with --charges 0 so AMBER2LAMMPS applies a uniform offset that removes the residual charge; this prevents the error from scaling up when the system is replicated in LAMMPS or when you have multiple copies of the same molecule.
- For intentionally charged species (e.g., protonated or deprotonated), add counterions in tleap/packmol before conversion. AMBER2LAMMPS never adds ions; it only shifts existing charges to your requested total.
- How the flag works:
  - --charges 0: adds a uniform offset so total charge is 0 within 1e-6.
  - --charges +1 (or any integer): adds a uniform offset so the total matches that integer within 1e-6.
  - The same constant is added to every atom, so relative charge differences are preserved.
Example: If your system has net charge +0.003 and you specify --charges 0, each atom's charge will be reduced by (0.003 ÷ number_of_atoms) to reach neutrality.
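The arithmetic in this example can be written out as a short sketch. The function name shift_charges and the three-atom charge list are hypothetical; only the uniform-offset rule and the 1e-6 tolerance come from the description above.

```python
def shift_charges(charges, target=0.0, tol=1e-6):
    """Add one constant to every charge so the total equals `target`."""
    offset = (target - sum(charges)) / len(charges)
    shifted = [q + offset for q in charges]
    if abs(sum(shifted) - target) > tol:
        raise ValueError("charge normalization did not reach the target")
    return shifted


# Made-up charges with a residual net charge of +0.003.
charges = [-0.399, 0.201, 0.201]
shifted = shift_charges(charges, target=0.0)
# Each atom moves by -0.003 / 3 = -0.001, so differences between
# atoms are unchanged.
```

Because the offset is the same for every atom, the shift changes each charge by only residual/N, which is small for typical AM1-BCC residuals.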
The tutorial assumes you are running from an AMBER2LAMMPS checkout or otherwise have access to
amber_to_lammps.py.
Activate the AmberTools environment before running:
import subprocess
pdb_file = "epon.pdb" # Replace with your PDB filename
base_name = pdb_file.replace(".pdb", "")
# Generate MOL2 file with charges from PDB file
cmd1 = f"antechamber -j 4 -at gaff2 -dr yes -fi pdb -fo mol2 -i {pdb_file} -o {base_name}.mol2 -c bcc"
subprocess.run(cmd1, shell=True)
# -c flag specifies the charge method. bcc is used for AM1-BCC charges
# -nc flag specifies net charge (default: 0.0). Use for charged molecules (e.g., -nc 1 for +1 charge)
# Create tleap input file
with open("tleap.in", "w") as f:
    f.write("source leaprc.gaff2\n") # Load the GAFF2 force field
    f.write(f"MOLECULE = loadmol2 {base_name}.mol2\n") # Load the molecule
    f.write("check MOLECULE\n") # Check the molecule
    f.write(f"saveamberparm MOLECULE {base_name}.prmtop {base_name}.crd\n") # Save the AMBER files
    f.write("quit")
# Run tleap to generate AMBER files
cmd3 = "tleap -f tleap.in"
subprocess.run(cmd3, shell=True)
# Output files generated: epon.prmtop, epon.crd, epon.mol2
# Note: amber_to_lammps.py uses epon.prmtop + epon.pdb; the .crd is optional (kept for validation/other tools).

python3 amber_to_lammps.py data.lammps parm.lammps epon.pdb \
  -t epon.prmtop -c 1 --charges 0 --verbose -b 4.5

Use the provided example_lammps_input.lmp:

lmp < example_lammps_input.lmp

Note: The LAMMPS input file example_lammps_input.lmp includes parm.lammps (force-field parameters) and uses read_data to load data.lammps (coordinates). Keeping parameters separate makes it easy to swap parameter sets without regenerating coordinates.
# Custom buffer and verbose logging (adds 5 Å padding)
python3 amber_to_lammps.py my_data.lammps my_params.lammps ethanol.pdb \
-t ethanol.prmtop -c 1 --charges 0 --verbose -b 5.0
# Custom output names
python3 amber_to_lammps.py system.data system.parm system.pdb \
-t system.prmtop -c 1 --charges 0
# Minimal output without verbose logging
python3 amber_to_lammps.py data.lammps parm.lammps ethanol.pdb \
-t ethanol.prmtop -c 1 --charges 0 -b 3.0
# Using absolute paths with custom buffer
python3 amber_to_lammps.py /home/user/lammps/output/data.lammps /home/user/lammps/output/param.lammps /home/user/amber/packmol/combined.pdb \
-t /home/user/amber/topology.prmtop -c 1 --charges 0 -b 4.5
# Single topology with multiple copies (e.g., 10 ethanol molecules)
python3 amber_to_lammps.py multi_ethanol_data.lammps multi_ethanol_parm.lammps multi_ethanol.pdb \
  -t ethanol.prmtop -c 10 --charges 0 --verbose

If you include --keep-temp, the converter will also write pairs.txt, bonds.txt, angles.txt, and dihedrals.txt, each grouped by topology (handy for debugging mixed systems).
example_lammps_input.lmp includes parm.lammps and reads data.lammps; adjust those paths and the file names in example_lammps_input.lmp if you rename outputs.
from amber_to_lammps import amber2lammps, validate_files
import subprocess
data_file = 'data.lammps'
param_file = 'parm.lammps'
pdb_file = 'epon.pdb'
topologies = ['epon.prmtop']
counts = [1]
charges = [0]
# Optional validation
validate_files(topologies, counts, pdb_file)
# Convert
amber2lammps(
data_file=data_file,
param_file=param_file,
topologies=topologies,
molecule_counts=counts,
pdb_file=pdb_file,
charges_target=charges,
buffer=3.8,
verbose=True,
keep_temp=False
)
# Run LAMMPS
subprocess.run("lmp < example_lammps_input.lmp", shell=True)
print("Completed conversion")

# Example 1: Custom buffer and keep temporary files
from amber_to_lammps import amber2lammps
amber2lammps(
data_file='epon.data',
param_file='epon.parm',
topologies=['epon.prmtop'],
molecule_counts=[1],
pdb_file='epon.pdb',
charges_target=[0], # Target neutral charge
buffer=5.0,
verbose=True,
keep_temp=True # Keep temporary files for inspection
)
# Example 2: Batch processing multiple molecules
from amber_to_lammps import amber2lammps, validate_files
molecules = ['ethanol', 'benzene', 'aspirin']
for mol in molecules:
    # Check inputs for this molecule before converting
    validate_files([f'{mol}.prmtop'], [1], f'{mol}.pdb')
    amber2lammps(
        data_file=f'{mol}.data',
        param_file=f'{mol}.parm',
        topologies=[f'{mol}.prmtop'],
        molecule_counts=[1],
        pdb_file=f'{mol}.pdb',
        charges_target=[0],
        verbose=True,
        keep_temp=False
    )

For systems containing multiple molecule types (e.g., drug + solvent mixtures), use the following workflow.
# Generate MOL2 files with proper bond orders (Open Babel + Antechamber)
obabel aspirin.pdb -opdb -O aspirin_obabel.pdb -h --gen3d
obabel aspirin_obabel.pdb -omol2 -O aspirin_obabel.mol2 -h
antechamber -fi mol2 -fo mol2 -i aspirin_obabel.mol2 -o aspirin_final.mol2 -c bcc
# Repeat for benzene and ethanol
obabel benzene.pdb -opdb -O benzene_obabel.pdb -h --gen3d
obabel benzene_obabel.pdb -omol2 -O benzene_obabel.mol2 -h
antechamber -fi mol2 -fo mol2 -i benzene_obabel.mol2 -o benzene_final.mol2 -c bcc
obabel ethanol.pdb -opdb -O ethanol_obabel.pdb -h --gen3d
obabel ethanol_obabel.pdb -omol2 -O ethanol_obabel.mol2 -h
antechamber -fi mol2 -fo mol2 -i ethanol_obabel.mol2 -o ethanol_final.mol2 -c bcc

# Create tleap input files and run for each molecule
tleap -f tleap_aspirin.in
tleap -f tleap_benzene.in
tleap -f tleap_ethanol.in

# Create PACKMOL input file
cat > packmol_mix.in << 'EOF'
tolerance 2.0
filetype pdb
output mixed_system.pdb
# Aspirin (1 molecule)
structure aspirin.pdb
number 1
inside box 0.0 0.0 0.0 10.0 10.0 10.0
end structure
# Benzene (2 molecules)
structure benzene.pdb
number 2
inside box 0.0 0.0 0.0 10.0 10.0 10.0
end structure
# Ethanol (5 molecules)
structure ethanol.pdb
number 5
inside box 0.0 0.0 0.0 10.0 10.0 10.0
end structure
EOF
# Run PackMol
packmol < packmol_mix.in
# The resulting mixed_system.pdb is a single combined PDB containing all molecules
# in the same order and counts you will pass to -t/-c. Keep residue/atom ordering intact.

# Convert the mixed molecular system to LAMMPS format
python amber_to_lammps.py mixed_data.lammps mixed_parm.lammps mixed_system.pdb \
  -t aspirin.prmtop benzene.prmtop ethanol.prmtop -c 1 2 5 --charges 0 0 0 --verbose

from amber_to_lammps import amber2lammps, validate_files
import subprocess
# Validate all topology files
validate_files(['aspirin.prmtop', 'benzene.prmtop', 'ethanol.prmtop'], [1, 2, 5], 'mixed_system.pdb')
# Convert mixed system with multiple molecule types
amber2lammps(
data_file='mixed_data.lammps',
param_file='mixed_parm.lammps',
topologies=['aspirin.prmtop', 'benzene.prmtop', 'ethanol.prmtop'],
molecule_counts=[1, 2, 5], # 1 aspirin, 2 benzene, 5 ethanol
pdb_file='mixed_system.pdb',
charges_target=[0, 0, 0], # All neutral molecules
verbose=True,
keep_temp=False
)
# Run LAMMPS with mixed system
subprocess.run("lmp < test_mixed_system.in", shell=True)

# Generate MOL2 files with proper bond orders (Open Babel + Antechamber)
obabel ethanol.pdb -opdb -O ethanol_obabel.pdb -h --gen3d
obabel ethanol_obabel.pdb -omol2 -O ethanol_obabel.mol2 -h
antechamber -fi mol2 -fo mol2 -i ethanol_obabel.mol2 -o ethanol_final.mol2 -c bcc

# Create tleap input files and run for the molecule
tleap -f tleap_ethanol.in

# Create PACKMOL input file for multiple copies (packmol_multi_ethanol.in)
cat > packmol_multi_ethanol.in << 'EOF'
tolerance 2.0
filetype pdb
output multi_ethanol.pdb
# Ethanol (10 molecules)
structure ethanol.pdb
number 10
inside box 0.0 0.0 0.0 15.0 15.0 15.0
end structure
EOF
# Run PackMol
packmol < packmol_multi_ethanol.in
# The resulting multi_ethanol.pdb is a single combined PDB containing all molecules
# in the same order and counts you will pass to -t/-c. Keep residue/atom ordering intact.

# Convert the multi-copy molecular system to LAMMPS format
python amber_to_lammps.py multi_ethanol_data.lammps multi_ethanol_parm.lammps multi_ethanol.pdb \
  -t ethanol.prmtop -c 10 --charges 0 --verbose

from amber_to_lammps import amber2lammps, validate_files
# Validate single topology with multiple copies
validate_files(['ethanol.prmtop'], [10], 'multi_ethanol.pdb')
# Convert multiple copies of the same molecule
amber2lammps(
data_file='multi_ethanol_data.lammps',
param_file='multi_ethanol_parm.lammps',
topologies=['ethanol.prmtop'],
molecule_counts=[10], # 10 ethanol molecules
pdb_file='multi_ethanol.pdb',
charges_target=[0],
verbose=True,
keep_temp=False
)

# Use a LAMMPS input file appropriate for your system
import subprocess
subprocess.run("lmp < test_mixed_system.in", shell=True)

Conversion results have been cross-checked with InterMol.
Install InterMol: https://github.com/shirtsgroup/InterMol
Usage (InterMol):
python convert.py --amb_in epon.prmtop epon.crd --lammps --odir . --oname epon_converted

InterMol expects an AMBER coordinate file (.crd); amber_to_lammps.py uses the PDB for coordinates.
Committed validation assets:
- epon_converted.input / epon_converted.lmp (InterMol reference)
- epon_validation.input / epon_validation.data / epon_validation.parm (AMBER2LAMMPS validation snapshot)
epon_validation.data is an AMBER2LAMMPS-generated snapshot with box bounds aligned to the
InterMol cell so the comparison isolates force-field conversion rather than box construction.
lmp < epon_converted.input
lmp < epon_validation.input

Energy Comparison Results

Output from the committed InterMol and AMBER2LAMMPS validation assets above:
| Energy Component | InterMol | AMBER2LAMMPS | Difference |
|---|---|---|---|
| E_bond | 2.3161274 | 2.3161274 | 0.0000 |
| E_angle | 6.0940384 | 6.0940126 | 0.0000258 |
| E_dihed | 12.475809 | 12.475827 | -0.0000180 |
| E_impro | 0.0000 | 0.0000 | 0.0000 |
| E_pair | -8.8739005 | -8.8427535 | -0.0311470 |
| E_vdwl | 10.824738 | 10.824738 | 0.0000 |
| E_coul | 97.869973 | 97.927222 | -0.0572490 |
| E_long | -117.56861 | -117.59471 | 0.02610 |
| E_tail | -0.0044166818 | -0.0044166819 | 0.0000000001 |
| PotEng | 12.012074 | 12.043214 | -0.03114 |
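One way to read the table is to recompute the Difference column (InterMol minus AMBER2LAMMPS) and see which terms carry the residual. The sketch below copies the values from the table; the dictionary layout and variable names are just for illustration.

```python
# (InterMol, AMBER2LAMMPS) values copied from the table above.
energies = {
    "E_bond": (2.3161274, 2.3161274),
    "E_angle": (6.0940384, 6.0940126),
    "E_dihed": (12.475809, 12.475827),
    "E_pair": (-8.8739005, -8.8427535),
    "E_vdwl": (10.824738, 10.824738),
    "E_coul": (97.869973, 97.927222),
    "E_long": (-117.56861, -117.59471),
    "PotEng": (12.012074, 12.043214),
}

diffs = {name: intermol - a2l for name, (intermol, a2l) in energies.items()}

# Electrostatic part of the pair energy (real-space plus long-range).
electrostatic_diff = diffs["E_coul"] + diffs["E_long"]

for name, diff in diffs.items():
    print(f"{name:8s} {diff:+.7f}")
```

Read this way, the bonded and van der Waals terms agree to within about 3e-5, and the E_pair and PotEng differences of roughly 0.031 match the combined E_coul + E_long difference, which suggests the remaining discrepancy sits in the electrostatic terms.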
When making changes:
- Create a branch: git checkout -b feature-name
- Implement the change
- Test with varied inputs
- Commit and push
Thanks to Dr. Axel Kohlmeyer and Dr. Germain Clavier for tutorial feedback, and to Dr. Andrew Jewett (author of moltemplate) for discussions via the mailing list and email.