Commit 5f1dec8
feat: add support for multiple LAMMPS atom styles with automatic detection (#867)
This PR adds support for LAMMPS atom styles beyond the previously
supported "atomic" style. The implementation now supports 8 common
LAMMPS atom styles with **automatic detection** and charge extraction,
while keeping existing calls backward compatible.
## Supported Atom Styles
- **atomic**: atom-ID atom-type x y z (default fallback)
- **full**: atom-ID molecule-ID atom-type q x y z (includes charges and
molecule IDs)
- **charge**: atom-ID atom-type q x y z (includes charges)
- **bond**: atom-ID molecule-ID atom-type x y z (includes molecule IDs)
- **angle**: atom-ID molecule-ID atom-type x y z (includes molecule IDs)
- **molecular**: atom-ID molecule-ID atom-type x y z (includes molecule IDs)
- **dipole**: atom-ID atom-type q x y z mux muy muz (includes charges)
- **sphere**: atom-ID atom-type diameter density x y z
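The column layouts listed above can be captured in a small lookup table. The following is a minimal sketch of that idea, not the code from this commit: `ATOM_STYLE_COLUMNS` and `split_atom_line` are hypothetical names, and the indices are simply the zero-based positions read off the list above.

```python
# Hypothetical layout table (illustrative only, not dpdata's internal API).
# Each entry gives zero-based column indices for the atom type, the charge
# (None when the style has no charge column), and the first coordinate.
ATOM_STYLE_COLUMNS = {
    "atomic": {"type": 1, "charge": None, "xyz": 2},
    "charge": {"type": 1, "charge": 2, "xyz": 3},
    "dipole": {"type": 1, "charge": 2, "xyz": 3},
    "sphere": {"type": 1, "charge": None, "xyz": 4},
    "bond": {"type": 2, "charge": None, "xyz": 3},
    "angle": {"type": 2, "charge": None, "xyz": 3},
    "molecular": {"type": 2, "charge": None, "xyz": 3},
    "full": {"type": 2, "charge": 3, "xyz": 4},
}


def split_atom_line(line, atom_style):
    """Return (atom_type, charge_or_None, (x, y, z)) for one Atoms-section line."""
    layout = ATOM_STYLE_COLUMNS[atom_style]
    fields = line.split()
    atom_type = int(fields[layout["type"]])
    start = layout["xyz"]
    position = tuple(float(value) for value in fields[start:start + 3])
    charge_col = layout["charge"]
    charge = None if charge_col is None else float(fields[charge_col])
    return atom_type, charge, position
```

For example, `split_atom_line("1 1 2 -0.8476 0.0 1.0 2.0", "full")` reads the type from the third column and the charge from the fourth, while the same call with `"atomic"` on a five-column line returns `None` for the charge.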
## Key Features
- **Automatic atom style detection**: Parses LAMMPS data file headers
and comments (e.g., `Atoms # full`); when no style comment is present,
falls back to a heuristic based on column count and content
- **Automatic charge extraction and registration**: For atom styles that
include charges (full, charge, dipole), charges are automatically
extracted, stored, and properly registered as a DataType
- **Smart defaults**: `atom_style="auto"` is now the default,
eliminating the need for manual specification in most cases
- **Backward compatibility**: Existing code continues to work without
any changes
- **Robust error handling**: Clear error messages for unsupported atom
styles with graceful fallbacks
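Comment-based detection, as described in the first feature above, can be illustrated with a short regular expression. This is a sketch under assumptions rather than the code added by this commit: `detect_style_from_comment` and `KNOWN_STYLES` are hypothetical names, and the sketch returns `None` for unknown or missing style comments so that a caller could fall back to another strategy.

```python
import re

# The eight styles named in this commit message.
KNOWN_STYLES = {
    "atomic", "full", "charge", "bond",
    "angle", "molecular", "dipole", "sphere",
}

# Matches section headers such as "Atoms # full" or "Atoms  #charge".
_ATOMS_HEADER = re.compile(r"^\s*Atoms\s*#\s*(\w+)")


def detect_style_from_comment(lines):
    """Return the style named after the Atoms header, or None if absent/unknown."""
    for line in lines:
        match = _ATOMS_HEADER.match(line)
        if match:
            style = match.group(1).lower()
            return style if style in KNOWN_STYLES else None
    return None
```

Returning `None` instead of raising keeps the decision about fallbacks (heuristics or the "atomic" default) in the caller.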
## Usage
```python
import dpdata

# Automatic detection (new default behavior): the atom style is read from
# the data file, e.g. "full" when the header says `Atoms # full`
system = dpdata.System("data.lmp", type_map=["O", "H"])

# Charges are extracted when the detected style carries them (full, charge, dipole)
print(system["charges"])

# Explicit styles are still supported for edge cases
system = dpdata.System("data.lmp", type_map=["O", "H"], atom_style="charge")
```
## Implementation Details
The solution adds intelligent atom style detection that:
1. Parses header comments after "Atoms" sections for explicit style
declarations
2. Uses heuristic analysis of column count and content patterns as
fallback
3. Maintains the existing configurable atom style parameter for explicit
control
4. Automatically registers charge DataType when charge data is present
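The commit message does not spell out the heuristic used in step 2, so the following is only one plausible sketch of a column-count fallback. `guess_style_from_columns` is a hypothetical name, the sketch ignores optional trailing image flags, and it cannot tell bond, angle, and molecular apart because they share one layout.

```python
def _is_int(token):
    """True if the token parses as an integer (e.g. an atom type or molecule ID)."""
    try:
        int(token)
    except ValueError:
        return False
    return True


def guess_style_from_columns(first_atom_line):
    """Guess an atom style from the first line of the Atoms section.

    Assumption: no image flags are appended, so the column count alone
    separates most styles; the third column breaks the remaining ties.
    """
    fields = first_atom_line.split()
    count = len(fields)
    if count == 6:
        # id mol type x y z  vs.  id type q x y z
        return "bond" if _is_int(fields[2]) else "charge"
    if count == 7:
        # id mol type q x y z  vs.  id type diameter density x y z
        return "full" if _is_int(fields[2]) else "sphere"
    if count == 9:
        return "dipole"
    # Default fallback, matching the "atomic" default named above.
    return "atomic"
```

A heuristic like this is ambiguous when a charge or diameter happens to be written as a bare integer, which is one reason an explicit `atom_style` argument remains useful.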
All parsing functions (`get_atype`, `get_posi`, `get_charges`) were
updated to handle different column arrangements with full type hints.
Comprehensive tests cover both comment-based and heuristic detection
scenarios.
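To make the idea of a type-hinted, style-aware parsing function concrete, here is a self-contained sketch of charge extraction from the lines of a data file. It is not the `get_charges` from this commit: `extract_charges` and `CHARGE_COLUMN` are hypothetical, and ordering the result by atom ID is a choice made for this example.

```python
from __future__ import annotations

# Zero-based charge column for the styles that carry charges.
CHARGE_COLUMN = {"charge": 2, "dipole": 2, "full": 3}


def extract_charges(lines: list[str], atom_style: str) -> list[float] | None:
    """Return per-atom charges ordered by atom ID, or None for chargeless styles."""
    column = CHARGE_COLUMN.get(atom_style)
    if column is None:
        return None

    # Locate the "Atoms" header, ignoring any trailing "# style" comment.
    start = next(
        (i for i, line in enumerate(lines) if line.split("#")[0].strip() == "Atoms"),
        None,
    )
    if start is None:
        raise ValueError("no Atoms section found")

    charges_by_id: dict[int, float] = {}
    for line in lines[start + 1:]:
        body = line.split("#")[0].strip()
        if not body:
            if charges_by_id:
                break  # blank line after the atom rows ends the section
            continue  # blank line between the header and the first row
        fields = body.split()
        if not fields[0].isdigit():
            break  # reached the next section header
        charges_by_id[int(fields[0])] = float(fields[column])
    return [charges_by_id[atom_id] for atom_id in sorted(charges_by_id)]
```

Called on a file whose Atoms section lists atom 2 before atom 1, the function still returns the charges in atom-ID order; for `atom_style="atomic"` it returns `None`.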
Fixes #853.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent f727ada commit 5f1dec8
3 files changed: +624 -18 lines changed
File tree:
- dpdata
- lammps
- plugins
- tests