Bump version
bytesnake committed Mar 11, 2021
1 parent f0eca13 commit 0822fa4
Showing 18 changed files with 136 additions and 41 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -35,3 +35,4 @@ poetry.lock

# Generated artifacts of website (with Zola)
docs/website/public/*
docs/website/static/rustdocs/
43 changes: 43 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,46 @@
Version 0.3.1 - 2021-03-11
========================

This release of Linfa extends the documentation, adds new examples and improves the functionality of datasets. No new algorithms were added.

The meta-issue [#82](https://github.com/rust-ml/linfa/issues/82) gives a good overview of the necessary documentation improvements. Tests, documentation and examples were considerably extended in this release.

New functionality was added to datasets, and multi-target datasets were introduced. Bootstrapping is now possible for features and samples, and you can cross-validate your model with k-folding. We polished various bits in the kernel machines and simplified the interface there.

The trait structure of regression metrics is simplified, and the silhouette score is introduced for easier testing of K-Means and other algorithms.

Changes
-----------
* improve documentation in all algorithms, various commits
* add a website to the infrastructure (c8acc785b)
* add k-folding with and without copying (b0af80546f8)
* add feature naming and Pearson's cross correlation (71989627f)
* improve ergonomics when handling kernels (1a7982b973)
* improve TikZ generator in `linfa-trees` (9d71f603bbe)
* introduce multi-target datasets (b231118629)
* simplify regression metrics and add cluster metrics (d0363a1fa8ef)

Version 0.3.0 - 2021-01-21
=========================

New Algorithms
-----------

* Approximated DBSCAN has been added to `linfa-clustering` by [@Sauro98]
* Gaussian Naive Bayes has been added to `linfa-bayes` by [@VasanthakumarV]
* Elastic Net linear regression has been added to `linfa-elasticnet` by [@paulkoerbitz] and [@bytesnake]

Changes
----------

* Added benchmark to Gaussian mixture models (a3eede55)
* Fixed bugs in linear decision trees, added generator for TikZ trees (bfa5aebe7)
* Implemented serde for all crates behind feature flag (4f0b63bb)
* Implemented new backend features (7296c9ec4)
* Introduced `linfa-datasets` for easier testing (3cec12b4f)
* Rename `Dataset` to `DatasetBase` and introduce `Dataset` and `DatasetView` (21dd579cf)
* Improve kernel tests and documentation (8e81a6d)

Version 0.2.0 - 2020-11-26
==========================

4 changes: 2 additions & 2 deletions Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa"
-version = "0.3.0"
+version = "0.3.1"
authors = [
"Luca Palmieri <rust@lpalmieri.com>",
"Lorenz Schmidt <bytesnake@mailbox.org>",
@@ -60,7 +60,7 @@ features = ["cblas"]
ndarray-rand = "0.11"
approx = { version = "0.3", default-features = false, features = ["std"] }

-linfa-datasets = { version = "0.3.0", path = "datasets", features = ["winequality", "iris", "diabetes", "linnerud"] }
+linfa-datasets = { path = "datasets", features = ["winequality", "iris", "diabetes"] }

[workspace]
members = [
6 changes: 3 additions & 3 deletions algorithms/linfa-bayes/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-bayes"
-version = "0.3.0"
+version = "0.3.1"
authors = ["VasanthakumarV <vasanth260m12@gmail.com>"]
description = "Collection of Naive Bayes Algorithms"
edition = "2018"
@@ -15,8 +15,8 @@ ndarray = { version = "0.13" , features = ["blas", "approx"]}
ndarray-stats = "0.3"
thiserror = "1"

-linfa = { version = "0.3.0", path = "../.." }
+linfa = { version = "0.3.1", path = "../.." }

[dev-dependencies]
approx = "0.3"
-linfa-datasets = { version = "0.3.0", path = "../../datasets", features = ["winequality"] }
+linfa-datasets = { version = "0.3.1", path = "../../datasets", features = ["winequality"] }
4 changes: 2 additions & 2 deletions algorithms/linfa-clustering/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-clustering"
-version = "0.3.0"
+version = "0.3.1"
edition = "2018"
authors = [
"Luca Palmieri <rust@lpalmieri.com>",
@@ -36,7 +36,7 @@ sprs = "0.7"
num-traits = "0.1.32"
rand_isaac = "0.2.0"

-linfa = { version = "0.3.0", path = "../.." }
+linfa = { version = "0.3.1", path = "../.." }
partitions = "0.2.4"

[dev-dependencies]
6 changes: 3 additions & 3 deletions algorithms/linfa-elasticnet/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-elasticnet"
-version = "0.3.0"
+version = "0.3.1"
authors = [
"Paul Körbitz / Google <koerbitz@google.com>",
"Lorenz Schmidt <bytesnake@mailbox.org>"
@@ -35,9 +35,9 @@ num-traits = "0.2"
approx = "0.3.2"
thiserror = "1"

-linfa = { version = "0.3.0", path = "../.." }
+linfa = { version = "0.3.1", path = "../.." }

[dev-dependencies]
-linfa-datasets = { version = "0.3.0", path = "../../datasets", features = ["diabetes"] }
+linfa-datasets = { version = "0.3.1", path = "../../datasets", features = ["diabetes"] }
ndarray-rand = "0.11"
rand_isaac = "0.2"
8 changes: 4 additions & 4 deletions algorithms/linfa-hierarchical/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-hierarchical"
-version = "0.3.0"
+version = "0.3.1"
authors = ["Lorenz Schmidt <lorenz.schmidt@mailbox.org>"]
edition = "2018"

@@ -17,10 +17,10 @@ categories = ["algorithms", "mathematics", "science"]
ndarray = { version = "0.13", default-features = false }
kodama = "0.2"

-linfa = { version = "0.3.0", path = "../.." }
-linfa-kernel = { version = "0.3.0", path = "../linfa-kernel" }
+linfa = { version = "0.3.1", path = "../.." }
+linfa-kernel = { version = "0.3.1", path = "../linfa-kernel" }

[dev-dependencies]
rand = "0.7"
ndarray-rand = "0.11"
-linfa-datasets = { version = "0.3.0", path = "../../datasets", features = ["iris"] }
+linfa-datasets = { version = "0.3.1", path = "../../datasets", features = ["iris"] }
4 changes: 2 additions & 2 deletions algorithms/linfa-ica/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-ica"
-version = "0.3.0"
+version = "0.3.1"
authors = ["VasanthakumarV <vasanth260m12@gmail.com>"]
description = "A collection of Independent Component Analysis (ICA) algorithms"
edition = "2018"
@@ -31,7 +31,7 @@ ndarray-stats = "0.3"
num-traits = "0.2"
rand_isaac = "0.2.0"

-linfa = { version = "0.3.0", path = "../.." }
+linfa = { version = "0.3.1", path = "../.." }

[dev-dependencies]
ndarray-npy = { version = "0.5", default-features = false }
4 changes: 2 additions & 2 deletions algorithms/linfa-kernel/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-kernel"
-version = "0.3.0"
+version = "0.3.1"
authors = ["Lorenz Schmidt <bytesnake@mailbox.org>"]
description = "Kernel methods for non-linear algorithms"
edition = "2018"
@@ -29,4 +29,4 @@ sprs = { version = "0.9.3", default-features = false }
hnsw = "0.6"
space = "0.10"

-linfa = { version = "0.3.0", path = "../.." }
+linfa = { version = "0.3.1", path = "../.." }
6 changes: 3 additions & 3 deletions algorithms/linfa-linear/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-linear"
-version = "0.3.0"
+version = "0.3.1"
authors = [
"Paul Körbitz / Google <koerbitz@google.com>",
"VasanthakumarV <vasanth260m12@gmail.com>"
@@ -25,8 +25,8 @@ argmin = {version="0.3.1", features=["ndarrayl"]}
serde = { version = "1.0", default-features = false, features = ["derive"] }
thiserror = "1"

-linfa = { version = "0.3.0", path = "../.." }
+linfa = { version = "0.3.1", path = "../.." }

[dev-dependencies]
-linfa-datasets = { version = "0.3.0", path = "../../datasets", features = ["diabetes"] }
+linfa-datasets = { version = "0.3.1", path = "../../datasets", features = ["diabetes"] }
approx = "0.3.2"
6 changes: 3 additions & 3 deletions algorithms/linfa-logistic/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-logistic"
-version = "0.3.0"
+version = "0.3.1"
authors = ["Paul Körbitz / Google <koerbitz@google.com>"]

description = "A Machine Learning framework for Rust"
@@ -20,8 +20,8 @@ num-traits = "0.2"
argmin = {version="0.3.1", features=["ndarrayl"]}
serde = "1.0"

-linfa = { version = "0.3.0", path = "../.." }
+linfa = { version = "0.3.1", path = "../.." }

[dev-dependencies]
approx = "0.3.2"
-linfa-datasets = { version = "0.3.0", path = "../../datasets", features = ["winequality"] }
+linfa-datasets = { version = "0.3.1", path = "../../datasets", features = ["winequality"] }
8 changes: 4 additions & 4 deletions algorithms/linfa-reduction/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-reduction"
-version = "0.3.0"
+version = "0.3.1"
authors = ["Lorenz Schmidt <bytesnake@mailbox.org>"]
description = "A collection of dimensionality reduction techniques"
edition = "2018"
@@ -30,11 +30,11 @@ ndarray-linalg = "0.12"
ndarray-rand = "0.11"
num-traits = "0.2"

-linfa = { version = "0.3.0", path = "../.." }
-linfa-kernel = { version = "0.3.0", path = "../linfa-kernel" }
+linfa = { version = "0.3.1", path = "../.." }
+linfa-kernel = { version = "0.3.1", path = "../linfa-kernel" }

[dev-dependencies]
rand = { version = "0.7", features = ["small_rng"] }
ndarray-npy = { version = "0.5", default-features = false }
-linfa-datasets = { version = "0.3.0", path = "../../datasets", features = ["iris"] }
+linfa-datasets = { version = "0.3.1", path = "../../datasets", features = ["iris"] }
approx = { version = "0.3", default-features = false, features = ["std"] }
8 changes: 4 additions & 4 deletions algorithms/linfa-svm/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-svm"
-version = "0.3.0"
+version = "0.3.1"
edition = "2018"
authors = ["Lorenz Schmidt <lorenz.schmidt@mailbox.org>"]
description = "Support Vector Machines"
@@ -29,9 +29,9 @@ ndarray-rand = "0.11"
num-traits = "0.1.32"
thiserror = "1"

-linfa = { version = "0.3.0", path = "../.." }
-linfa-kernel = { version = "0.3.0", path = "../linfa-kernel" }
+linfa = { version = "0.3.1", path = "../.." }
+linfa-kernel = { version = "0.3.1", path = "../linfa-kernel" }

[dev-dependencies]
-linfa-datasets = { version = "0.3.0", path = "../../datasets", features = ["winequality", "diabetes"] }
+linfa-datasets = { version = "0.3.1", path = "../../datasets", features = ["winequality", "diabetes"] }
rand_isaac = "0.2"
6 changes: 3 additions & 3 deletions algorithms/linfa-trees/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "linfa-trees"
-version = "0.3.0"
+version = "0.3.1"
edition = "2018"
authors = ["Moss Ebeling <moss@banay.me>"]
description = "A collection of tree-based algorithms"
@@ -27,14 +27,14 @@ features = ["std", "derive"]
ndarray = { version = "0.13" , features = ["rayon", "approx"]}
ndarray-rand = "0.11"

-linfa = { version = "0.3.0", path = "../.." }
+linfa = { version = "0.3.1", path = "../.." }

[dev-dependencies]
rand = { version = "0.7", features = ["small_rng"] }
criterion = "0.3"
approx = "0.3"

-linfa-datasets = { version = "0.3.0", path = "../../datasets/", features = ["iris"] }
+linfa-datasets = { version = "0.3.1", path = "../../datasets/", features = ["iris"] }

[[bench]]
name = "decision_tree"
4 changes: 2 additions & 2 deletions datasets/Cargo.toml
@@ -1,14 +1,14 @@
[package]
name = "linfa-datasets"
-version = "0.3.0"
+version = "0.3.1"
authors = ["Lorenz Schmidt <bytesnake@mailbox.org>"]
description = "Collection of small datasets for Linfa"
edition = "2018"
license = "MIT/Apache-2.0"
repository = "https://github.com/rust-ml/linfa"

[dependencies]
-linfa = { version = "0.3.0", path = ".." }
+linfa = { version = "0.3.1", path = ".." }
ndarray = { version = "0.13", default-features = false }
ndarray-csv = "0.4"
csv = "1.1"
4 changes: 2 additions & 2 deletions datasets/README.md
@@ -18,13 +18,13 @@ Currently the following datasets are provided:
| linnerud | The linnerud dataset contains samples from 20 middle-aged men in a fitness club. Their physical capability, as well as biological measures are related. | 20, 3, 3 | Regression | [here](https://core.ac.uk/download/pdf/20641325.pdf) |

The purpose of this crate is to facilitate dataset loading and make it as simple as possible. Loaded datasets are returned as a
-[`linfa::Dataset`](https://docs.rs/linfa/0.3.0/linfa/dataset/type.Dataset.html) structure with named features.
+[`linfa::Dataset`](https://docs.rs/linfa/latest/linfa/dataset/type.Dataset.html) structure with named features.

## Using a dataset

To use one of the provided datasets in your project add the `linfa-datasets` crate to your `Cargo.toml` and enable the corresponding feature:
```
-linfa-datasets = { version = "0.3.0", features = ["winequality"] }
+linfa-datasets = { version = "0.3.1", features = ["winequality"] }
```
You can then use the dataset in your working code:
```rust
4 changes: 2 additions & 2 deletions datasets/src/lib.rs
@@ -18,13 +18,13 @@
//! | linnerud | The linnerud dataset contains samples from 20 middle-aged men in a fitness club. Their physical capability, as well as biological measures are related. | 20, 3, 3 | Regression | [here](https://core.ac.uk/download/pdf/20641325.pdf) |
//!
//! The purpose of this crate is to facilitate dataset loading and make it as simple as possible. Loaded datasets are returned as a
-//! [`linfa::Dataset`](https://docs.rs/linfa/0.3.0/linfa/dataset/type.Dataset.html) structure with named features.
+//! [linfa::Dataset] structure with named features.
//!
//! ## Using a dataset
//!
//! To use one of the provided datasets in your project add the `linfa-datasets` crate to your `Cargo.toml` and enable the corresponding feature:
//! ```ignore
-//! linfa-datasets = { version = "0.3.0", features = ["winequality"] }
+//! linfa-datasets = { version = "0.3.1", features = ["winequality"] }
//! ```
//!
//! You can then use the dataset in your working code:
51 changes: 51 additions & 0 deletions docs/website/content/news/release_031.md
@@ -0,0 +1,51 @@
+++
title = "Release 0.3.1"
date = "2021-03-11"
+++

This release of Linfa extends the documentation, adds new examples and improves the functionality of datasets. No new algorithms were added.

<!-- more -->

The meta-issue [#82](https://github.com/rust-ml/linfa/issues/82) gives a good overview of the necessary documentation improvements. Tests, documentation and examples were considerably extended in this release.

New functionality was added to datasets, and multi-target datasets were introduced. Bootstrapping is now possible for features and samples, and you can cross-validate your model with k-folding. We polished various bits in the kernel machines and simplified the interface there.

The trait structure of regression metrics is simplified, and the silhouette score is introduced for easier testing of K-Means and other algorithms.


# Changes

* improve documentation in all algorithms, various commits
* add a website to the infrastructure (c8acc785b)
* add k-folding with and without copying (b0af80546f8)
* add feature naming and Pearson's cross correlation (71989627f)
* improve ergonomics when handling kernels (1a7982b973)
* improve TikZ generator in `linfa-trees` (9d71f603bbe)
* introduce multi-target datasets (b231118629)
* simplify regression metrics and add cluster metrics (d0363a1fa8ef)

# Example

You can now perform cross-validation with k-folding. @Sauro98 implemented two versions: one copies the dataset into k folds, and the other avoids excessive memory operations by copying only the validation dataset around. For example, to test a model with 8-fold cross-validation:

```rust
// perform cross-validation with the F1 score
let f1_runs = dataset
    .iter_fold(8, |v| params.fit(&v).unwrap())
    .map(|(model, valid)| {
        let cm = model
            .predict(&valid)
            .mapv(|x| x > Pr::even())
            .confusion_matrix(&valid).unwrap();

        cm.f1_score()
    })
    .collect::<Array1<_>>();

// calculate mean and standard deviation
println!("F1 score: {}±{}",
    f1_runs.mean().unwrap(),
    f1_runs.std_axis(Axis(0), 0.0),
);
```
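
As a rough, library-independent illustration of what k-folding does, the sketch below splits sample indices into k contiguous folds, holds out one fold per run, and aggregates per-fold scores into a mean and standard deviation. The names `k_fold_indices` and `mean_and_std` are hypothetical and not part of linfa, the scores in `main` are made up, and dropping the remainder samples is an assumption of this sketch rather than a statement about how linfa handles them.

```rust
/// Split the sample indices `0..n_samples` into `k` contiguous folds and
/// return one `(train, valid)` index pair per fold.
/// Assumption of this sketch: samples beyond `k * (n_samples / k)` are dropped.
fn k_fold_indices(n_samples: usize, k: usize) -> Vec<(Vec<usize>, Vec<usize>)> {
    assert!(k > 0 && k <= n_samples, "k must be in 1..=n_samples");
    let fold_size = n_samples / k;
    let used = fold_size * k;

    (0..k)
        .map(|i| {
            let (start, end) = (i * fold_size, (i + 1) * fold_size);
            // fold `i` is held out for validation ...
            let valid: Vec<usize> = (start..end).collect();
            // ... and the remaining folds form the training set
            let train: Vec<usize> = (0..start).chain(end..used).collect();
            (train, valid)
        })
        .collect()
}

/// Mean and population standard deviation (zero delta degrees of freedom)
/// of the per-fold scores.
fn mean_and_std(scores: &[f64]) -> (f64, f64) {
    let n = scores.len() as f64;
    let mean = scores.iter().sum::<f64>() / n;
    let variance = scores.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n;
    (mean, variance.sqrt())
}

fn main() {
    // 16 samples in 8 folds: every run validates on 2 samples, trains on 14
    for (i, (train, valid)) in k_fold_indices(16, 8).iter().enumerate() {
        println!(
            "fold {}: {} training samples, validation indices {:?}",
            i,
            train.len(),
            valid
        );
    }

    // hypothetical per-fold scores, only to show the aggregation step
    let (mean, std) = mean_and_std(&[0.5, 0.75, 1.0]);
    println!("score: {}±{}", mean, std);
}
```

In a real run, each `(train, valid)` pair would select rows of the dataset, a model would be fitted on the training rows, and its score on the validation rows would be pushed into the slice passed to `mean_and_std`.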
