linfa-preprocessing provides machine-learning-specific data preprocessing algorithms in the vein of Python's sklearn.preprocessing.
Once this project matures, it will become part of the linfa Rust machine learning toolkit.
linfa-preprocessing consists of a single trait, Preprocess<A>. All of the preprocessing functions we provide are defined in Preprocess<A>. It is generic over a single parameter, A, representing the element type of the data being operated on.
There is one implementation of this trait that is compatible with 2-dimensional, float-based, owned-memory ndarray arrays. In this implementation, the preprocessing functions consume the array and produce a new owned array. This allows for ergonomic chaining of preprocessing methods.
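The consume-and-return design described above can be sketched without any dependencies. The following is a minimal illustration, not the crate's actual code: a plain Vec<Vec<f64>> stands in for an owned 2-dimensional ndarray array, the trait is a cut-down stand-in with only two methods, and the method bodies and the chain_example helper are assumptions made for this sketch.

```rust
// Hypothetical sketch: the names mirror this README, but the bodies are
// illustrative assumptions and a Vec<Vec<f64>> replaces the ndarray array.

/// Preprocessing functions, generic over the element type `A`.
pub trait Preprocess<A>: Sized {
    /// Rescale each column to the range [0, 1].
    fn min_max_scale(self) -> Self;
    /// Map values strictly greater than `threshold` to 1 and all others to 0.
    fn binarize(self, threshold: A) -> Self;
}

// Single implementation for owned, float-based, row-major 2-D data.
// Rows are assumed to have equal length.
impl Preprocess<f64> for Vec<Vec<f64>> {
    fn min_max_scale(mut self) -> Self {
        let n_cols = self.first().map_or(0, |row| row.len());
        for j in 0..n_cols {
            let min = self.iter().map(|r| r[j]).fold(f64::INFINITY, f64::min);
            let max = self.iter().map(|r| r[j]).fold(f64::NEG_INFINITY, f64::max);
            let range = max - min;
            for row in self.iter_mut() {
                // A constant column has no spread; map it to 0 to avoid dividing by zero.
                row[j] = if range == 0.0 { 0.0 } else { (row[j] - min) / range };
            }
        }
        // The value was taken by ownership, so it can be handed back to the caller.
        self
    }

    fn binarize(mut self, threshold: f64) -> Self {
        for row in self.iter_mut() {
            for x in row.iter_mut() {
                *x = if *x > threshold { 1.0 } else { 0.0 };
            }
        }
        self
    }
}

/// Because every method takes `self` and returns `Self`, calls chain directly.
pub fn chain_example() -> Vec<Vec<f64>> {
    let data = vec![vec![-1.0, 2.0], vec![0.0, 10.0], vec![1.0, 18.0]];
    data.min_max_scale().binarize(0.25)
}
```

Taking self by value means no intermediate result needs a name, and the input cannot be used by accident after it has been transformed.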
- StandardScale
- MinMaxScale
- CustomScale
- Binarize
- RobustScale
- Normalize
- PowerTransform
- QuantileTransform
- FunctionTransform
- LabelBinarize
- MultiLabelBinarize
- PolynomialFeatures
- LabelEncode
use linfa_preprocessing::Preprocess;
use ndarray::array;

// Four samples with two features each.
let data = array![[-1., 2.],
                  [-0.5, 6.],
                  [0., 10.],
                  [1., 18.]];

// Each method consumes the array and returns a new one, so the calls chain.
let processed = data.min_max_scale()
                    .standard_scale()
                    .binarize(0.);

let expected = array![[0., 0.],
                      [0., 0.],
                      [1., 1.],
                      [1., 1.]];

assert_eq!(processed, expected);
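
To see why the example produces this result, the first column can be traced by hand. The sketch below assumes the conventional definitions: min-max scaling to [0, 1], standardization to zero mean and unit variance (population standard deviation), and binarization that maps values strictly greater than the threshold to 1. The crate's exact definitions are not stated here, and trace_first_column is a hypothetical helper written only for this walkthrough.

```rust
/// Traces the first column of the example through the three steps.
/// Returns the column after min-max scaling, after standardization,
/// and after binarization at a threshold of 0.
pub fn trace_first_column() -> (Vec<f64>, Vec<f64>, Vec<f64>) {
    let column = [-1.0_f64, -0.5, 0.0, 1.0];
    let n = column.len() as f64;

    // Step 1: min-max scaling, (x - min) / (max - min).
    let min = column.iter().copied().fold(f64::INFINITY, f64::min);
    let max = column.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let scaled: Vec<f64> = column.iter().map(|x| (x - min) / (max - min)).collect();

    // Step 2: standardization, (x - mean) / std.
    // The population standard deviation is an assumption; the choice does not
    // change the sign of any value, so it does not affect the final result.
    let mean = scaled.iter().sum::<f64>() / n;
    let var = scaled.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    let standardized: Vec<f64> = scaled.iter().map(|x| (x - mean) / var.sqrt()).collect();

    // Step 3: binarization at 0, so only values above the column mean become 1.
    let binarized: Vec<f64> = standardized
        .iter()
        .map(|&x| if x > 0.0 { 1.0 } else { 0.0 })
        .collect();

    (scaled, standardized, binarized)
}
```

Under these assumptions the first column scales to [0, 0.25, 0.5, 1] with mean 0.4375, so the first two entries fall below the mean and the last two above it. The second column, [2, 6, 10, 18], scales to the same values, which is why both columns of the expected array read [0, 0, 1, 1].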