High-performance 3D molecular conformer generation and quantum-chemistry-inspired property computation, written in pure Rust.
Generates chemically valid 3D coordinates from SMILES strings with RDKit-quality accuracy (ETKDGv2), and provides a full suite of computational chemistry tools: Extended Hückel Theory, electrostatic potentials, density of states, population analysis, molecular alignment, force field energy evaluation (UFF + MMFF94), partial charges, SASA, and materials framework assembly.
Native bindings for Rust, Python, TypeScript/JavaScript (WASM), and a cross-platform CLI.
- ETKDGv2 Distance Geometry — CSD torsion preferences (846 SMARTS patterns), matches RDKit accuracy
- High Accuracy — 0.00% of GDB-20 molecules exceed 0.5 Å heavy-atom RMSD vs RDKit (2000 molecules, ensemble comparison)
- Fast — 60+ molecules/second in Rust, parallel batch processing via rayon
- Full chemical coverage — organics, stereocenters, macrocycles, fused rings, metals, halogens (He→Bi)
- Extended Hückel Theory — Wolfsberg-Helmholtz Hamiltonian, Löwdin orthogonalization, HOMO/LUMO gaps
- Mulliken & Löwdin population analysis — atomic orbital contributions per atom
- Molecular dipole moment — bond dipoles + lone-pair contributions in Debye
- Volumetric orbital grids — STO-3G basis, chunked evaluation for large molecules, marching cubes isosurfaces
- Density of States — total DOS + per-atom PDOS with Gaussian smearing, JSON export, MSE metric
- Electrostatic Potential (ESP) — Coulomb grid from Mulliken charges, color mapping (red/white/blue), .cube export
- Parallel ESP — rayon-accelerated grid computation (parallel feature)
- Solvent-Accessible Surface Area (SASA) — Shrake-Rupley algorithm, Fibonacci sphere, Bondi radii, parallelized per-atom evaluation
- Gasteiger-Marsili partial charges — 6-iteration electronegativity equalization
- Automatic rayon thread pool — all compute functions (DOS, ESP, SASA, population, dipole, etc.) parallelized with a work-stealing queue
- Zero-configuration — no API changes needed; parallelization enabled by default via the parallel feature
- Intra-library parallelism — grid point loops, per-atom evaluation, and force field term accumulation all use par_iter()
- CPU-aware workload balancing — handles small molecules (sequential) and large molecules (parallel) automatically
- UFF — Universal Force Field for 50+ element types (including transition metals Ti–Zn, Pd, Pt, Au)
- MMFF94 — Merck force field (stretch, bend, torsion, van der Waals via buffered 14-7 potential, electrostatics)
- BFGS / L-BFGS minimizers — dense BFGS for small molecules, L-BFGS for large systems
- Kabsch alignment — SVD-based optimal rotation, minimizes RMSD
- Quaternion alignment — Coutsias 2004 4×4 eigenproblem (faster for large molecules)
- RMSD computation — after optimal superposition
- Periodic unit cells — lattice parameters (a, b, c, α, β, γ) to Cartesian tensor
- Secondary Building Units (SBUs) — node/linker topology with coordination sites
- Framework assembly — MOF-type crystal structure generation from SBUs + topology
- Multi-platform — Rust lib, Python (PyO3), TypeScript/JS (WASM), CLI (Linux/macOS/Windows)
- Zero runtime dependencies — pure Rust, no C++ toolchain needed
- SMILES + SMARTS — full parser and substructure matcher; 60+ bracket elements
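
The SASA entry in the list above names the Shrake-Rupley algorithm with a Fibonacci sphere. As a rough illustration of that technique (not sci-form's actual implementation), the sketch below places test points on each atom's probe-expanded sphere and counts those not buried inside a neighbour. The function names `fibonacci_sphere` and `shrake_rupley` are hypothetical, and the radii and probe radius are supplied by the caller.

```rust
use std::f64::consts::PI;

/// Place `n` near-uniform points on the unit sphere (Fibonacci lattice).
fn fibonacci_sphere(n: usize) -> Vec<[f64; 3]> {
    let golden_angle = PI * (3.0 - 5.0_f64.sqrt());
    (0..n)
        .map(|i| {
            // z steps evenly from just below +1 to just above -1.
            let z = 1.0 - (2.0 * i as f64 + 1.0) / n as f64;
            let r = (1.0 - z * z).sqrt();
            let phi = golden_angle * i as f64;
            [r * phi.cos(), r * phi.sin(), z]
        })
        .collect()
}

/// Per-atom solvent-accessible area (Å²) for spheres with the given radii.
/// Hypothetical helper for illustration; not part of the sci-form API.
fn shrake_rupley(centers: &[[f64; 3]], radii: &[f64], probe: f64, n_points: usize) -> Vec<f64> {
    let sphere = fibonacci_sphere(n_points);
    (0..centers.len())
        .map(|i| {
            let ri = radii[i] + probe;
            // Count test points on atom i's expanded sphere that no other atom buries.
            let exposed = sphere
                .iter()
                .filter(|p| {
                    let q = [
                        centers[i][0] + ri * p[0],
                        centers[i][1] + ri * p[1],
                        centers[i][2] + ri * p[2],
                    ];
                    (0..centers.len()).filter(|&j| j != i).all(|j| {
                        let d2: f64 = (0..3).map(|k| (q[k] - centers[j][k]).powi(2)).sum();
                        d2 >= (radii[j] + probe).powi(2)
                    })
                })
                .count();
            // Exposed fraction times the area of the expanded sphere.
            4.0 * PI * ri * ri * exposed as f64 / n_points as f64
        })
        .collect()
}
```

An isolated atom reports the full area of its expanded sphere; overlapping atoms report less. Accuracy improves with more test points, at a cost that is linear in points per atom and quadratic in atoms for this naive neighbour loop.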
[dependencies]
sci-form = "0.2"
# Or, for parallel batch + ESP (use instead of the line above):
sci-form = { version = "0.2", features = ["parallel"] }

use sci_form::{embed, compute_charges, compute_esp, compute_dos, compute_population};
// 3D conformer
let result = sci_form::embed("CCO", 42);
println!("Atoms: {}, Time: {:.1}ms", result.num_atoms, result.time_ms);
// Gasteiger–Marsili charges
let charges = sci_form::compute_charges("CCO").unwrap();
println!("Charges: {:?}", charges.charges);
// EHT → population analysis
let mol = sci_form::parse("CCO").unwrap();
// Atomic numbers, assumed to follow SMILES atom order: C, C, O, then six H
let elements: Vec<u8> = /* ... */ vec![6, 6, 8, 1, 1, 1, 1, 1, 1];
let positions: Vec<[f64; 3]> = /* ... from result.coords */ vec![];
let pop = sci_form::compute_population(&elements, &positions).unwrap();
println!("HOMO: {:.3} eV", pop.homo_energy);
// ESP grid
let esp = sci_form::compute_esp(&elements, &positions, 0.5, 3.0).unwrap();
// DOS / PDOS
let dos = sci_form::compute_dos(&elements, &positions, 0.2, -20.0, 5.0, 200).unwrap();
println!("HOMO–LUMO gap: {:.3} eV", dos.homo_lumo_gap.unwrap_or(0.0));

→ Full Rust API reference · Guide
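
The Rust example above leaves `positions` as a placeholder to be filled from `result.coords`. A minimal sketch of that conversion follows. It assumes the coordinates arrive as a flat `[x0, y0, z0, x1, ...]` buffer, as the typed-array WASM call suggests; that layout is an assumption, not something this README confirms for the Rust API. `to_positions` is a hypothetical helper name.

```rust
/// Group a flat `[x0, y0, z0, x1, ...]` buffer into per-atom positions.
/// Returns `None` when the length is not a multiple of three.
/// Hypothetical helper; the flat layout is an assumption about `result.coords`.
fn to_positions(flat: &[f64]) -> Option<Vec<[f64; 3]>> {
    if flat.len() % 3 != 0 {
        return None;
    }
    Some(flat.chunks_exact(3).map(|c| [c[0], c[1], c[2]]).collect())
}
```

Under that assumption, the result can be passed wherever the example expects a `Vec<[f64; 3]>`. The `elements` list must be in the same atom order as the coordinates.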
pip install sciforma

import sci_form
# 3D conformer
result = sci_form.embed("CCO")
print(f"Atoms: {result.num_atoms}, Time: {result.time_ms:.1f}ms")
# Batch
results = sci_form.embed_batch(["CCO", "c1ccccc1", "CC(=O)O"])
# Gasteiger charges
charges = sci_form.compute_charges("CCO")
print(charges["charges"]) # per-atom charges
# EHT + population analysis
elements = [6, 6, 8, 1, 1, 1, 1, 1, 1]  # assumed SMILES atom order: C, C, O, then six H
positions = [[0.0, 0.0, 0.0], ...] # from result.coords
pop = sci_form.compute_population(elements, positions)
print(f"HOMO: {pop['homo_energy']:.3f} eV")
# DOS / PDOS
dos = sci_form.compute_dos(elements, positions, sigma=0.2, e_min=-20.0, e_max=5.0, n_points=200)
print(f"Gap: {dos['homo_lumo_gap']:.3f} eV")

→ Full Python API reference · Guide
npm install sci-form-wasm

import init, {
embed, embed_coords_typed,
compute_esp_grid_typed, compute_esp_grid_info,
eht_calculate, eht_orbital_mesh, eht_orbital_grid_typed,
compute_charges, compute_dos
} from 'sci-form-wasm';
await init();
// Conformer as JSON
const result = JSON.parse(embed('CCO', 42));
console.log(result.num_atoms);
// Typed-array conformer (faster, no JSON overhead)
const coords: Float64Array = embed_coords_typed('CCO', 42);
// EHT calculation (atomic numbers assumed to follow SMILES atom order: C, C, O, then six H)
const eht = JSON.parse(eht_calculate('[6,6,8,1,1,1,1,1,1]', coords.toString(), 1.75));
console.log(`HOMO: ${eht.homo_energy} eV, LUMO: ${eht.lumo_energy} eV`);
// Orbital isosurface mesh
const mesh = JSON.parse(eht_orbital_mesh('[6,6,8,1,1,1,1,1,1]', coords.toString(), 1.75, 0, 0.02));
// mesh.vertices, mesh.normals, mesh.indices
// ESP grid (typed array)
const espData: Float64Array = compute_esp_grid_typed('CCO', 42, 0.5, 3.0);
const espInfo = JSON.parse(compute_esp_grid_info('CCO', 42, 0.5, 3.0));
console.log(`Grid: ${espInfo.nx}×${espInfo.ny}×${espInfo.nz}`);
// DOS
const dos = JSON.parse(compute_dos('[6,6,8,1,1,1,1,1,1]', coords.toString(), 0.2, -20.0, 5.0, 200));

→ Full TypeScript API reference · Guide
cargo install sci-form-cli

# Single molecule
sci-form embed "CCO" --format xyz
# Batch processing
sci-form batch -i molecules.smi -o output.sdf --format sdf --threads 8
# Parse only (no 3D)
sci-form parse "c1ccccc1"
# Gasteiger charges
sci-form charges "CCO"
# UFF energy
sci-form energy "CCO" --coords coords.json
# Version / features
sci-form info

Prebuilt binaries are available at GitHub Releases:
| Platform | File |
|---|---|
| Linux x86_64 | sci-form-linux-x86_64 |
| Linux aarch64 | sci-form-linux-aarch64 |
| macOS x86_64 | sci-form-macos-x86_64 |
| macOS Apple Silicon | sci-form-macos-aarch64 |
| Windows x86_64 | sci-form-windows-x86_64.exe |

Diverse molecules benchmark:

| Metric | Value |
|---|---|
| Parse success | 100% |
| Embed success | 97.7% |
| Geometry quality | 97.7% |
| Throughput | 60 mol/s |
| Metric | Value |
|---|---|
| Average RMSD | 0.064 Å |
| Median RMSD | 0.011 Å |
| < 0.5 Å | 98.4% |
| < 0.3 Å | 94.4% |

GDB-20 ensemble comparison vs RDKit (2000 molecules):

| Metric | All-atom | Heavy-atom |
|---|---|---|
| Avg RMSD | 0.035 Å | 0.018 Å |
| > 0.5 Å | 0.95% | 0.00% |
| Module | Description |
|---|---|
| sci_form::embed | ETKDGv2 3D conformer generation from SMILES |
| sci_form::embed_batch | Parallel batch conformer generation (rayon) |
| sci_form::parse | SMILES → molecular graph |
| sci_form::compute_charges | Gasteiger-Marsili partial charges |
| sci_form::compute_sasa | Solvent-accessible surface area (SASA) |
| sci_form::compute_population | Mulliken & Löwdin population analysis (EHT) |
| sci_form::compute_dipole | Molecular dipole moment in Debye (EHT) |
| sci_form::compute_esp | Electrostatic potential grid (Mulliken charges) |
| sci_form::compute_dos | Density of states + PDOS (EHT orbital energies) |
| sci_form::compute_rmsd | RMSD after Kabsch alignment |
| sci_form::compute_uff_energy | UFF force field energy evaluation |
| sci_form::create_unit_cell | Periodic unit cell from lattice parameters |
| sci_form::assemble_framework | MOF-type framework assembly from SBUs |
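
The `create_unit_cell` row above turns lattice parameters (a, b, c, α, β, γ) into a Cartesian cell. A minimal sketch of the common crystallographic convention follows, with a along x and b in the xy-plane. sci-form may use a different axis convention or return type. `lattice_vectors`, `cell_volume` and `frac_to_cart` are hypothetical names used only for illustration.

```rust
/// Build Cartesian lattice vectors (as rows) from cell lengths (Å) and angles (degrees).
/// Convention assumed here: a along x, b in the xy-plane.
fn lattice_vectors(a: f64, b: f64, c: f64, alpha: f64, beta: f64, gamma: f64) -> [[f64; 3]; 3] {
    let cos_a = alpha.to_radians().cos();
    let cos_b = beta.to_radians().cos();
    let (sin_g, cos_g) = gamma.to_radians().sin_cos();
    // Components of the unit vector along c.
    let cx = cos_b;
    let cy = (cos_a - cos_b * cos_g) / sin_g;
    let cz = (1.0 - cx * cx - cy * cy).max(0.0).sqrt();
    [
        [a, 0.0, 0.0],
        [b * cos_g, b * sin_g, 0.0],
        [c * cx, c * cy, c * cz],
    ]
}

/// Cell volume as the determinant of the lattice matrix.
fn cell_volume(m: &[[f64; 3]; 3]) -> f64 {
    m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
}

/// Convert fractional coordinates (u, v, w) to Cartesian: r = u*a + v*b + w*c.
fn frac_to_cart(m: &[[f64; 3]; 3], frac: [f64; 3]) -> [f64; 3] {
    let mut r = [0.0; 3];
    for i in 0..3 {
        for k in 0..3 {
            r[k] += frac[i] * m[i][k];
        }
    }
    r
}
```

A cubic cell gives a diagonal matrix, and for a hexagonal cell the volume reduces to a²·c·sin(120°). Both make quick sanity checks for whichever convention is in use.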
| Module Path | Key API |
|---|---|
| sci_form::eht | solve_eht(), EhtResult, evaluate_orbital_on_grid_chunked(), marching_cubes() |
| sci_form::esp | compute_esp_grid_parallel(), esp_color_map(), esp_grid_to_colors(), export_cube(), read_cube() |
| sci_form::dos | compute_dos(), compute_pdos(), dos_mse(), export_dos_json() |
| sci_form::alignment | align_coordinates(), align_quaternion(), compute_rmsd() |
| sci_form::forcefield | build_uff_force_field(), Mmff94Builder::build() |
| sci_form::charges::gasteiger | gasteiger_marsili_charges() |
| sci_form::surface::sasa | compute_sasa() |
| sci_form::materials | UnitCell, Sbu, Topology, assemble_framework() |
| sci_form::transport | pack_batch_arrow(), ChunkedIterator, WorkerTask |
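
The `sci_form::dos` row above covers a Gaussian-smeared density of states and an MSE metric. The sketch below shows the general idea under simple assumptions: each orbital energy contributes a unit-area Gaussian of width sigma, sampled on a uniform grid from e_min to e_max. The library's normalization and return types may differ. `gaussian_dos` and `curve_mse` are hypothetical names, not the functions listed in the table.

```rust
use std::f64::consts::PI;

/// Broaden discrete orbital energies (eV) into a DOS curve on a uniform grid.
/// Returns `(energy, density)` pairs; each level contributes a unit-area Gaussian.
fn gaussian_dos(levels: &[f64], sigma: f64, e_min: f64, e_max: f64, n_points: usize) -> Vec<(f64, f64)> {
    let norm = 1.0 / (sigma * (2.0 * PI).sqrt());
    let step = if n_points > 1 {
        (e_max - e_min) / (n_points - 1) as f64
    } else {
        0.0
    };
    (0..n_points)
        .map(|k| {
            let e = e_min + step * k as f64;
            let density: f64 = levels
                .iter()
                .map(|&ei| norm * (-0.5 * ((e - ei) / sigma).powi(2)).exp())
                .sum();
            (e, density)
        })
        .collect()
}

/// Mean squared error between two curves sampled on the same grid.
fn curve_mse(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>() / a.len() as f64
}
```

With this normalization, the integral of the curve over the energy window approximates the number of levels inside it, which is a useful check when comparing against another code's DOS.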
sci-form implements ETKDGv2 (Experimental Torsion Knowledge Distance Geometry):
- SMILES Parsing → Molecular graph with atoms, bonds, hybridization
- Bounds Matrix → 1-2, 1-3, 1-4, and VdW distance bounds from topology
- Triangle Smoothing → Floyd-Warshall triangle inequality enforcement
- Distance Picking → Random distances from smoothed bounds (MinstdRand)
- Metric Matrix Embedding → Eigendecomposition → 4D coordinates
- Bounds Force Field → BFGS minimization in 4D to satisfy distance constraints
- Projection to 3D → Drop lowest-variance dimension
- ETKDG 3D Refinement → Force field with CSD torsion preferences (846 patterns)
- Validation → Tetrahedral centers, planarity, double-bond geometry
See the algorithm documentation for mathematical derivations and diagrams.
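
To make the triangle smoothing step in the pipeline above concrete, here is a small Floyd-Warshall-style sketch over dense bounds matrices. It illustrates the general technique only. sci-form's data layout and update order are not documented in this README, and `triangle_smooth` is a hypothetical name. The matrices are assumed to be symmetric with zero diagonals.

```rust
/// Tighten pairwise distance bounds in place using the triangle inequality.
/// `lower` and `upper` are symmetric n×n matrices with zero diagonals.
/// Returns `false` as soon as a lower bound exceeds its upper bound.
fn triangle_smooth(lower: &mut [Vec<f64>], upper: &mut [Vec<f64>]) -> bool {
    let n = upper.len();
    for k in 0..n {
        for i in 0..n {
            for j in 0..n {
                if i == j {
                    continue;
                }
                // Upper bound: d_ij <= u_ik + u_kj.
                let via = upper[i][k] + upper[k][j];
                if via < upper[i][j] {
                    upper[i][j] = via;
                }
                // Lower bound: d_ij >= l_ik - u_kj and d_ij >= l_kj - u_ik.
                let lo = (lower[i][k] - upper[k][j]).max(lower[k][j] - upper[i][k]);
                if lo > lower[i][j] {
                    lower[i][j] = lo;
                }
                if lower[i][j] > upper[i][j] {
                    return false; // bounds are geometrically inconsistent
                }
            }
        }
    }
    true
}
```

For a three-atom chain with fixed 0-1 and 1-2 distances, the pass caps the 0-2 upper bound at the sum of the two and raises its lower bound to their difference. The distance-picking step then samples between the smoothed bounds.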
# Library + CLI
cargo build --release
# Python bindings
cd crates/python && pip install maturin && maturin develop --release
# WASM bindings
cd crates/wasm && wasm-pack build --target web --release
# With parallel feature
cargo build --release --features parallel

# All unit tests
cargo test --lib
# Integration — diverse molecules (97.7% pass rate)
cargo test --release --test test_diverse_molecules -- --nocapture
# Integration — gradient correctness
cargo test --release --test test_gradient_check -- --nocapture
# Lint & format
cargo fmt --all -- --check
cargo clippy --all-targets -- -D warnings

Use the provided bump script. It updates all version strings, commits, tags, and pushes:
# Auto-increment patch (0.1.7 → 0.1.8)
./scripts/bump_version.sh
# Set a specific version
./scripts/bump_version.sh 0.2.0

This updates versions in:
- Cargo.toml (root lib)
- crates/cli/Cargo.toml
- crates/python/Cargo.toml
- crates/wasm/Cargo.toml
- crates/python/pyproject.toml
- crates/wasm/pkg/package.json
- pkg/package.json & pkg-node/package.json

The script then creates a vX.Y.Z git tag, which triggers the release workflow to publish to crates.io, PyPI, and npm automatically.
Required repository secrets:
| Secret | Used for |
|---|---|
| CARGO_REGISTRY_TOKEN | Publishing to crates.io |
| PYPI_API_TOKEN | Publishing to PyPI (sciforma) |
| NPM_TOKEN | Publishing to npm (sci-form-wasm); must be a Granular Automation token |
MIT