A comprehensive Python library for Near-Infrared Spectroscopy data analysis
Documentation • Installation • Quick Start • Examples • Contributing
NIRS4ALL bridges the gap between spectroscopic data and machine learning by providing a unified framework for data loading, preprocessing, model training, and evaluation. Built for researchers and practitioners working with Near-Infrared Spectroscopy data.
- NIRS-Specific Preprocessing — SNV, MSC, Savitzky-Golay, Norris-Williams, wavelet denoise, OSC/EPO, and 30+ spectral transforms
- Advanced PLS Models — AOM-PLS, POP-PLS, OPLS, DiPLS, MBPLS, and 15+ PLS variants with automatic operator selection
- Multi-Backend ML — Seamless integration with scikit-learn, TensorFlow, PyTorch, and JAX
- Declarative Pipelines — Define complex workflows with simple, readable syntax
- Parallel Execution — Multi-core pipeline variant execution via joblib
- Hyperparameter Tuning — Built-in Optuna integration for automated optimization
- Rich Visualizations — Performance heatmaps, candlestick plots, SHAP explanations
- Model Deployment — Export trained pipelines as portable `.n4a` bundles
- sklearn Compatible — `NIRSPipeline` wrapper for SHAP, cross-validation, and more
```bash
pip install nirs4all
```

This installs the core library with scikit-learn support. Deep learning frameworks are optional.

```bash
# TensorFlow
pip install nirs4all[tensorflow]

# PyTorch
pip install nirs4all[torch]

# JAX
pip install nirs4all[jax]

# All frameworks
pip install nirs4all[all]

# All frameworks with GPU support
pip install nirs4all[all-gpu]
```

Coming soon! We're working with conda-forge to make NIRS4ALL available through conda. In the meantime, use `pip install nirs4all` or Docker.

```bash
# Available soon:
# conda install -c conda-forge nirs4all
```

```bash
docker pull ghcr.io/gbeurier/nirs4all:latest
docker run -v $(pwd):/workspace ghcr.io/gbeurier/nirs4all python my_script.py
```

```bash
git clone https://github.com/GBeurier/nirs4all.git
cd nirs4all
pip install -e ".[dev]"
```

```bash
nirs4all --test-install      # Check dependencies
nirs4all --test-integration  # Run integration tests
nirs4all --version           # Check version
```

```python
import nirs4all
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import ShuffleSplit
from sklearn.cross_decomposition import PLSRegression

# Define your pipeline
pipeline = [
    MinMaxScaler(),
    {"y_processing": MinMaxScaler()},
    ShuffleSplit(n_splits=3, test_size=0.25),
    {"model": PLSRegression(n_components=10)}
]

# Train and evaluate
result = nirs4all.run(
    pipeline=pipeline,
    dataset="path/to/your/data",
    name="MyPipeline",
    verbose=1
)

# Access results
print(f"Best RMSE: {result.best_rmse:.4f}")
print(f"Best R²: {result.best_r2:.4f}")

# Export for deployment
result.export("exports/best_model.n4a")
```

```python
import nirs4all
from sklearn.preprocessing import MinMaxScaler
from sklearn.cross_decomposition import PLSRegression
from sklearn.ensemble import RandomForestRegressor

with nirs4all.session(verbose=1, save_artifacts=True) as s:
    # Compare models with shared configuration
    pls_result = nirs4all.run(
        pipeline=[MinMaxScaler(), PLSRegression(n_components=10)],
        dataset="data/wheat.csv",
        name="PLS",
        session=s
    )
    rf_result = nirs4all.run(
        pipeline=[MinMaxScaler(), RandomForestRegressor(n_estimators=100)],
        dataset="data/wheat.csv",
        name="RandomForest",
        session=s
    )

print(f"PLS: {pls_result.best_rmse:.4f} | RF: {rf_result.best_rmse:.4f}")
```

```python
import nirs4all
from nirs4all.sklearn import NIRSPipeline
import shap

# Train with nirs4all
result = nirs4all.run(pipeline, dataset)

# Wrap for sklearn compatibility
pipe = NIRSPipeline.from_result(result)

# Use with SHAP
explainer = shap.Explainer(pipe.predict, X_background)
shap_values = explainer(X_test)
shap.summary_plot(shap_values)
```

NIRS4ALL uses a declarative syntax for defining pipelines:

```python
from nirs4all.operators.transforms import SNV, SavitzkyGolay, FirstDerivative

pipeline = [
    # Preprocessing
    MinMaxScaler(),
    SNV(),
    SavitzkyGolay(window_length=11, polyorder=2),

    # Target scaling
    {"y_processing": MinMaxScaler()},

    # Cross-validation
    ShuffleSplit(n_splits=5, test_size=0.2),

    # Models to compare
    {"model": PLSRegression(n_components=10)},
    {"model": RandomForestRegressor(n_estimators=100)},

    # Neural network with training parameters
    {
        "model": nicon,
        "name": "NICON-CNN",
        "train_params": {"epochs": 100, "patience": 20}
    }
]
```

```python
# Feature augmentation - generate preprocessing combinations
{
    "feature_augmentation": {
        "_or_": [SNV, FirstDerivative, SavitzkyGolay],
        "size": [1, (1, 2)],
        "count": 5
    }
}

# Hyperparameter optimization
{
    "model": PLSRegression(),
    "finetune_params": {
        "n_trials": 50,
        "model_params": {"n_components": ("int", 1, 30)}
    }
}

# Branching for parallel preprocessing paths
{
    "branch": [
        [SNV(), PLSRegression(n_components=10)],
        [MSC(), RandomForestRegressor()]
    ]
}

# Merge branch outputs (stacking)
{"merge": "predictions"}
```

| Transform | Description |
|---|---|
| `SNV` / `StandardNormalVariate` | Standard Normal Variate normalization |
| `RNV` / `RobustStandardNormalVariate` | Robust Normal Variate (outlier-resistant) |
| `MSC` / `MultiplicativeScatterCorrection` | Multiplicative Scatter Correction |
| `SavitzkyGolay` | Smoothing and derivative computation |
| `FirstDerivative` / `SecondDerivative` | Spectral derivatives |
| `NorrisWilliams` | Gap derivative with segment smoothing |
| `WaveletDenoise` | Multi-level wavelet denoising with thresholding |
| `OSC` | Orthogonal Signal Correction (DOSC) |
| `EPO` | External Parameter Orthogonalization |
| `Detrend` | Remove linear/polynomial trends |
| `Gaussian` | Gaussian smoothing |
| `Haar` | Haar wavelet decomposition |
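SNV itself is simple: each spectrum is centered by its own mean and scaled by its own standard deviation, which removes per-sample multiplicative scatter effects. A minimal NumPy sketch of the idea (illustrative only, not nirs4all's implementation):

```python
import numpy as np

def snv(spectra):
    """Standard Normal Variate: center and scale each spectrum individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std

X = np.array([[1.0, 2.0, 3.0],
              [10.0, 20.0, 30.0]])
Xs = snv(X)
# Each row of Xs now has mean 0 and unit standard deviation
```

Note the normalization is row-wise (per spectrum), unlike sklearn's `StandardScaler`, which normalizes column-wise (per wavelength).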
| Transform | Description |
|---|---|
| `Baseline` | Baseline correction (ALS, AirPLS, ArPLS, IModPoly, SNIP, etc.) |
| `ReflectanceToAbsorbance` | Convert reflectance to absorbance using Beer-Lambert |
| `ToAbsorbance` / `FromAbsorbance` | Signal type conversion |
| `KubelkaMunk` | Kubelka-Munk transform |
| `Resampler` | Wavelength interpolation |
| `CARS` / `MCUVE` | Feature selection methods |
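The reflectance-to-absorbance conversion follows the Beer-Lambert relation, A = log10(1/R). A quick NumPy sketch of the underlying math (illustrative, not the library's code):

```python
import numpy as np

def reflectance_to_absorbance(R):
    """Apparent absorbance from reflectance: A = log10(1/R)."""
    R = np.asarray(R, dtype=float)
    return np.log10(1.0 / R)

A = reflectance_to_absorbance([1.0, 0.1, 0.01])
# → [0.0, 1.0, 2.0]
```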
| Model | Description |
|---|---|
| `AOMPLSRegressor` / `AOMPLSClassifier` | Adaptive Operator-Mixture PLS — auto-selects best preprocessing |
| `POPPLSRegressor` / `POPPLSClassifier` | Per-Operator-Per-Component PLS via PRESS |
| `PLSDA` | PLS Discriminant Analysis |
| `OPLS` / `OPLSDA` | Orthogonal PLS |
| `MBPLS` | Multi-Block PLS |
| `DiPLS` | Domain-Invariant PLS |
| `IKPLS` | Improved Kernel PLS |
| `FCKPLS` | Fractional Convolution Kernel PLS |
| Splitter | Description |
|---|---|
| `KennardStoneSplitter` | Kennard-Stone algorithm |
| `SPXYSplitter` | Sample set Partitioning based on X and Y |
| `SPXYFold` / `SPXYGFold` | SPXY-based K-Fold cross-validation (with group support) |
| `KMeansSplitter` | K-means clustering based split |
| `KBinsStratifiedSplitter` | Binned stratification for continuous targets |
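Kennard-Stone picks a representative subset by starting from the two most distant samples and repeatedly adding the sample farthest from the already-selected set. An illustrative NumPy sketch of the algorithm (O(n²) in memory; not nirs4all's splitter):

```python
import numpy as np

def kennard_stone(X, n_select):
    """Select n_select representative sample indices via Kennard-Stone."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Start with the two most distant samples
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    while len(selected) < n_select:
        remaining = [k for k in range(len(X)) if k not in selected]
        # Distance from each remaining sample to its nearest selected sample
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        # Add the remaining sample farthest from the selected set
        selected.append(remaining[int(np.argmax(min_d))])
    return selected

train_idx = kennard_stone(np.array([[0.0], [1.0], [2.0], [10.0]]), 3)
```

The selected indices form the training set; the rest go to the test set, ensuring the training data spans the spectral space.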
See Preprocessing Guide for complete reference.
The examples/ directory is organized by topic:
| Category | Examples |
|---|---|
| Getting Started | Hello world, basic regression, classification, visualization |
| Data Handling | Multi-source, data loading, metadata |
| Preprocessing | SNV, MSC, derivatives, custom transforms |
| Models | Multi-model, hyperparameter tuning, stacking, PLS variants |
| Cross-Validation | KFold, group splits, nested CV |
| Deployment | Export, prediction, workspace management |
| Explainability | SHAP basics, sklearn integration, feature selection |
Complete syntax reference and advanced pipeline patterns.
Run examples:
```bash
cd examples
./run.sh            # Run all
./run.sh -i 1       # Run by index
./run.sh -n "U01*"  # Run by pattern
```

| Section | Description |
|---|---|
| User Guide | Preprocessing, API migration, augmentation |
| API Reference | Module-level API, sklearn integration, data handling |
| Specifications | Pipeline syntax, config format, metrics |
| Explanations | SHAP, resampling, SNV theory |
Full documentation: nirs4all.readthedocs.io
NIRS4ALL has been used in published research:
Houngbo, M. E., et al. (2024). Convolutional neural network allows amylose content prediction in yam (Dioscorea alata L.) flour using near infrared spectroscopy. Journal of the Science of Food and Agriculture, 104(8), 4915-4921. John Wiley & Sons, Ltd.
If you use NIRS4ALL in your research, please cite:
```bibtex
@software{beurier2025nirs4all,
  author  = {Gregory Beurier and Denis Cornet and Lauriane Rouan},
  title   = {NIRS4ALL: Open spectroscopy for everyone},
  url     = {https://github.com/GBeurier/nirs4all},
  version = {0.7.1},
  year    = {2026},
}
```

We welcome contributions! Please see CONTRIBUTING.md for guidelines.
This project is licensed under the CeCILL-2.1 License — a French free software license compatible with GPL.
- CIRAD for supporting this research
- The open-source scientific Python community
Made for the spectroscopy community



