Merge pull request #18 from orxfun/major-refactoring-v2.0
major refactoring
orxfun authored Feb 27, 2024
2 parents f56b055 + a26dce8 commit 3792902
Showing 35 changed files with 3,232 additions and 3,849 deletions.
12 changes: 6 additions & 6 deletions Cargo.toml
@@ -1,24 +1,24 @@
[package]
name = "orx-linked-list"
version = "1.0.0"
version = "2.0.0"
edition = "2021"
authors = ["orxfun <orx.ugur.arikan@gmail.com>"]
description = "An efficient doubly linked list using regular & references with a focus on better cache locality avoiding heap allocations by smart pointers."
description = "An efficient and recursive singly and doubly linked list implementation."
license = "MIT"
repository = "https://github.com/orxfun/orx-linked-list/"
keywords = ["linked", "list", "vec", "array", "pinned"]
categories = ["data-structures", "rust-patterns"]

[dependencies]
orx-imp-vec = "1.0"
orx-split-vec = "1.2"

orx-selfref-col = "1.0"
orx-split-vec = "2.0"

[dev-dependencies]
rand = "0.8"
rand_chacha = "0.3"
criterion = { version = "0.5", features = ["html_reports"] }
test-case = "3.3"

[[bench]]
name = "mutation_ends"
name = "append"
harness = false
148 changes: 64 additions & 84 deletions README.md
@@ -1,117 +1,97 @@
# orx-linked-list

An efficient doubly linked list using regular `&` references with a focus to avoid smart pointers and improve cache locality.
An efficient and recursive singly and doubly linked list implementation.

## A. Motivation
## Variants and Time Complexity of Methods

Self referential, often recursive, collections contain an important set of useful data structures including linked lists. However, building these structures with references `&` is not possible in safe rust.
* `type SinglyLinkedList<'a, T> = List<'a, Singly, T>;`
* `type DoublyLinkedList<'a, T> = List<'a, Doubly, T>;`
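
The two aliases above select a list variant through a marker type parameter. A minimal, std-only sketch of that pattern follows. It is not the crate's implementation: the `Variant` trait, the `HAS_PREV_LINK` constant and the `Vec` storage are hypothetical, and the `'a` lifetime parameter of the real `List` is omitted for brevity.

```rust
use std::marker::PhantomData;

// Hypothetical marker types standing in for the crate's `Singly` and `Doubly`.
pub struct Singly;
pub struct Doubly;

/// Compile-time description of a list variant (illustrative trait, an assumption).
pub trait Variant {
    const HAS_PREV_LINK: bool;
}
impl Variant for Singly {
    const HAS_PREV_LINK: bool = false;
}
impl Variant for Doubly {
    const HAS_PREV_LINK: bool = true;
}

/// A toy list generic over its variant; a plain `Vec` replaces real nodes.
pub struct List<V: Variant, T> {
    items: Vec<T>,
    variant: PhantomData<V>,
}

// Methods shared by both variants.
impl<V: Variant, T> List<V, T> {
    pub fn new() -> Self {
        Self { items: Vec::new(), variant: PhantomData }
    }
    pub fn push_front(&mut self, value: T) {
        self.items.insert(0, value);
    }
    pub fn front(&self) -> Option<&T> {
        self.items.first()
    }
    pub fn supports_backward_iteration(&self) -> bool {
        V::HAS_PREV_LINK
    }
}

// Methods that need a previous link exist only on the doubly linked variant.
impl<T> List<Doubly, T> {
    pub fn push_back(&mut self, value: T) {
        self.items.push(value);
    }
    pub fn back(&self) -> Option<&T> {
        self.items.last()
    }
}

pub type SinglyLinkedList<T> = List<Singly, T>;
pub type DoublyLinkedList<T> = List<Doubly, T>;
```

With this layout, calling `push_back` on a `SinglyLinkedList` is a compile error rather than a runtime failure, which matches the "`Doubly` only" restrictions listed for the list methods.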

Alternatively, these collections can be built using reference counted smart pointers such as `std::rc::Rc` and independent heap allocations. However, independent heap allocations are a drawback as the elements do not live close to each other, leading to poor cache locality. Further, reference counted pointers have a runtime overhead. This crate makes use of [`orx_imp_vec::ImpVec`](https://crates.io/crates/orx-imp-vec) as the underlying storage. ImpVec specializes in enabling self referential collections by relying on the pinned memory location guarantee provided by the [`orx_pinned_vec::PinnedVec`](https://crates.io/crates/orx-pinned-vec).
## Time Complexity of Methods

### Standard LinkedList
| Method | Time Complexity |
| -------- | ------- |
| access to front and back of the list | **O(1)** |
| push to front and back (`Doubly` only) of the list | **O(1)** |
| pop from front and back (`Doubly` only) of the list | **O(1)** |
| insert at an arbitrary position | O(n) |
| remove from an arbitrary position | O(n) |
| append another list to the front or back of the list | **O(1)** |
| retain elements by a predicate | O(n) |
| retain and collect removed elements | O(n) |
| iteration forwards or backwards (only `Doubly`) | O(n) |
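
To make the difference between the O(1) and O(n) rows concrete, here is a small sketch that uses `std::collections::LinkedList` as a stand-in. It does not use this crate's API: the standard list calls the operation `append` rather than `append_back`, and the `insert_at` helper below is a hypothetical function written only for this illustration.

```rust
use std::collections::LinkedList;

/// Applies constant-time end operations and an append, then returns the content.
pub fn end_operations() -> Vec<u32> {
    let mut list = LinkedList::from([3, 4, 5]);

    // O(1): only the front or the back node is touched.
    list.push_front(2);
    list.push_back(6);
    assert_eq!(list.pop_front(), Some(2));
    assert_eq!(list.pop_back(), Some(6));

    // O(1): appending relinks the tail of `list` to the head of `other`.
    let mut other = LinkedList::from([6, 7]);
    list.append(&mut other);

    list.into_iter().collect()
}

/// O(n): reaching an arbitrary position requires walking the links first.
/// Panics if `index` is greater than the length of the list.
pub fn insert_at(list: &mut LinkedList<u32>, index: usize, value: u32) {
    let mut tail = list.split_off(index); // walks to `index`
    list.push_back(value);
    list.append(&mut tail);
}
```

Operations at the ends and the append only rewire a fixed number of links, while inserting at an arbitrary position pays for the walk to that position.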

Standard `std::collections::LinkedList` implementation avoids reference counted pointers and uses `NonNull` instead, most likely to avoid this overhead. However, this leads to a risky and difficult implementation that feels more low level than it should. You may see the implementation [here](https://doc.rust-lang.org/src/alloc/collections/linked_list.rs.html). The `unsafe` keyword is used more than 60 times in this file. These are usually related to reading from and writing to memory through raw pointers.

***Motivation:*** We do not need to count references provided that all elements and inter-element references belong to the same owner or container. This is because all elements will be dropped at the same time together with their inter-element references when the container `ImpVec` is dropped.
## Examples

***Motivation:*** We should be able to define these structures without directly accessing memory through raw pointers. This is unnecessarily powerful and risky. Instead, unsafe code must be limited to methods which are specialized for and only allow defining required connections of self referential collections.

### orx_linked_list::LinkedList

Linked list implementation in this crate uses an `ImpVec` as the underlying storage and makes use of its specialized methods. This brings the following advantages:

* Allows for a higher level implementation without any use of raw pointers.
* Avoids smart pointers.
* Avoids almost completely accessing through integer indices.
* All nodes belong to the same `ImpVec` living close to each other. This allows for better cache locality.
* The full-fledged doubly linked list implementation uses the `unsafe` keyword seven times, which are repeated uses of three methods:
* `ImpVec::push_get_ref`
* `ImpVec::move_get_ref`
* `ImpVec::unsafe_truncate` (*a deref method from [`PinnedVec`](https://crates.io/crates/orx-pinned-vec)*)

Furthermore, this implementation is more performant than the standard library implementation, a likely indicator of better cache locality. You may see below the benchmark results for a series of random push/pop mutations after pushing "number of elements" elements to the list.

<img src="https://raw.githubusercontent.com/orxfun/orx-linked-list/main/docs/img/bench_mutation_ends.PNG" alt="https://raw.githubusercontent.com/orxfun/orx-linked-list/main/docs/img/bench_mutation_ends.PNG" />
```rust
use orx_linked_list::*;

However, note that the benchmark compares only the linked list implementations. `std::collections::VecDeque` is significantly more efficient than both linked lists for most operations. Therefore, it is preferable unless the flexibility of the linked list's recursive nature is required (see `split` methods in the next section).
fn eq<'a, I: Iterator<Item = &'a u32> + Clone>(iter: I, slice: &[u32]) -> bool {
    iter.clone().count() == slice.len() && iter.zip(slice.iter()).all(|(a, b)| a == b)
}

## B. Features
let _list: List<Singly, u32> = List::new();
let _list = SinglyLinkedList::<u32>::new();
let _list: List<Doubly, u32> = List::new();
let _list = DoublyLinkedList::<u32>::new();

The `orx_linked_list::LinkedList` implementation provides standard linked list operations such as constant time insertions and removals. Further, it reflects the **recursive** nature of the data structure through so-called `LinkedListSlice`s. The caller can move to the desired element of the linked list and get the rest of the list as a linked list slice, which is nothing but an immutable linked list. Furthermore, slices can simply be `collect`ed as an owned linked list.
let mut list = DoublyLinkedList::from_iter([3, 4, 5]);
assert_eq!(list.front(), Some(&3));
assert_eq!(list.back(), Some(&5));
assert!(eq(list.iter(), &[3, 4, 5]));
assert!(eq(list.iter_from_back(), &[5, 4, 3]));

```rust
use orx_linked_list::*;
assert_eq!(list.pop_front(), Some(3));
assert_eq!(list.pop_back(), Some(5));

// BASIC
let mut list = LinkedList::new();
list.extend(vec!['a', 'b', 'c']);
list.push_back(5);
list.push_front(3);
assert!(eq(list.iter(), &[3, 4, 5]));

assert_eq!(list.len(), 3);
assert!(list.contains(&'b'));
assert_eq!(list.index_of(&'c'), Some(2));
assert_eq!(list.from_back_index_of(&'c'), Some(0));
let other = DoublyLinkedList::from_iter([6, 7, 8, 9]);
list.append_back(other);
assert!(eq(list.iter(), &[3, 4, 5, 6, 7, 8, 9]));

list.push_back('d');
assert_eq!(Some(&'d'), list.back());
let other = DoublyLinkedList::from_iter([0, 1, 2]);
list.append_front(other);
assert!(eq(list.iter(), &[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]));

*list.get_at_mut(0).unwrap() = 'x';
list.retain(&|x| x < &5);
assert!(eq(list.iter(), &[0, 1, 2, 3, 4]));

list.push_front('e');
*list.front_mut().unwrap() = 'f';
let mut odds = vec![];
let mut collect_odds = |x| odds.push(x);
list.retain_collect(&|x| x % 2 == 0, &mut collect_odds);

_ = list.remove_at(1);
_ = list.pop_back();
list.insert_at(0, 'x');
list.clear();
list.push_front('y');
list.pop_front();
assert!(eq(list.iter(), &[0, 2, 4]));
assert!(eq(odds.iter(), &[1, 3]));
```

// ITER
let list: LinkedList<_> = ['a', 'b', 'c', 'd', 'e'].into_iter().collect();
## Internal Features

let forward: Vec<_> = list.iter().copied().collect();
assert_eq!(forward, &['a', 'b', 'c', 'd', 'e']);
`orx_linked_list::List` makes use of the safety guarantees and efficiency features of [orx-selfref-col::SelfRefCol](https://crates.io/crates/orx-selfref-col).
* `SelfRefCol` constructs its safety guarantees around the fact that all references will be among elements of the same collection. By preventing bringing in external references or leaking out references, it is safe to build the self referential collection with **regular `&` references**.
* With careful encapsulation, `SelfRefCol` prevents passing external references into the list and leaking references to list nodes to the outside. Once this is established, it provides methods to easily mutate the references among list nodes. These features allowed a very convenient implementation of the linked list in this crate with almost no use of the `unsafe` keyword, no reads or writes through pointers and no access by indices. Compared to the `std::collections::LinkedList` implementation, it can be observed that `orx_linked_list::List` is a much **higher level implementation**.
* Furthermore, `orx_linked_list::List` is **significantly faster** than the standard linked list. One of the main reasons for this is that `SelfRefCol` keeps all elements close to each other rather than at arbitrary locations in memory, which leads to better cache locality.
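
The cache locality point can be illustrated with a toy list whose nodes all live in one growable buffer. This is only a sketch under stated assumptions: `ArenaList` and `Node` are hypothetical names, and the links here are `usize` positions, whereas `SelfRefCol` is described above as working with regular `&` references and no access by indices.

```rust
/// A node stored inside the arena; links are positions in the same `Vec`.
struct Node<T> {
    value: T,
    next: Option<usize>,
}

/// Toy singly linked list whose nodes share one contiguous allocation.
pub struct ArenaList<T> {
    nodes: Vec<Node<T>>,
    front: Option<usize>,
}

impl<T> ArenaList<T> {
    pub fn new() -> Self {
        Self { nodes: Vec::new(), front: None }
    }

    /// Amortized O(1): the node is pushed to the arena and linked as the new front.
    pub fn push_front(&mut self, value: T) {
        let idx = self.nodes.len();
        self.nodes.push(Node { value, next: self.front });
        self.front = Some(idx);
    }

    /// Follows the links from the front; every hop stays inside `nodes`.
    pub fn iter(&self) -> impl Iterator<Item = &T> + '_ {
        let mut current = self.front;
        std::iter::from_fn(move || {
            let node = &self.nodes[current?];
            current = node.next;
            Some(&node.value)
        })
    }
}
```

Because every node sits in the same `Vec`, following a link usually lands in memory that is near the previous node, unlike a list in which each node is a separate heap allocation.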

let backward: Vec<_> = list.iter_from_back().copied().collect();
assert_eq!(backward, &['e', 'd', 'c', 'b', 'a']);
## Benchmarks

// SPLITS
let (left, right) = list.split(2).unwrap();
assert_eq!(left, &['a', 'b']);
assert_eq!(right, &['c', 'd', 'e']);
// left & right are also nothing but immutable linked lists
assert_eq!(right.front(), Some(&'c'));
assert_eq!(left.back(), Some(&'b'));
### Mutation Ends

let (front, after) = list.split_front().unwrap();
assert_eq!(front, &'a');
assert_eq!(after, &['b', 'c', 'd', 'e']);
*You may see the benchmark at [benches/mutation_ends.rs](https://github.com/orxfun/orx-linked-list/blob/main/benches/mutation_ends.rs).*

let (before, back) = list.split_back().unwrap();
assert_eq!(before, &['a', 'b', 'c', 'd']);
assert_eq!(back, &'e');
This benchmark compares time performance of calls to `push_front`, `push_back`, `pop_front` and `pop_back` methods.

let (left, right) = list.split_before(&'d').unwrap();
assert_eq!(left, &['a', 'b', 'c']);
assert_eq!(right, &['d', 'e']);
<img src="https://raw.githubusercontent.com/orxfun/orx-linked-list/main/docs/img/bench_mutation_ends.PNG" alt="https://raw.githubusercontent.com/orxfun/orx-linked-list/main/docs/img/bench_mutation_ends.PNG" />

let (left, right) = list.split_after(&'d').unwrap();
assert_eq!(left, &['a', 'b', 'c', 'd']);
assert_eq!(right, &['e']);
### Iteration

// RECURSIVE SPLITS
let (left1, left2) = left.split(1).unwrap();
assert_eq!(left1, &['a']);
assert_eq!(left2, &['b', 'c', 'd']);
*You may see the benchmark at [benches/iter.rs](https://github.com/orxfun/orx-linked-list/blob/main/benches/iter.rs).*

// SPLIT TO OWNED
let mut left_list = left.collect();
This benchmark compares time performance of iteration through the `iter` method.

assert_eq!(left_list, &['a', 'b', 'c', 'd']);
_ = left_list.pop_front();
_ = left_list.pop_back();
assert_eq!(left_list, &['b', 'c']);
```
<img src="https://raw.githubusercontent.com/orxfun/orx-linked-list/main/docs/img/iter.PNG" alt="https://raw.githubusercontent.com/orxfun/orx-linked-list/main/docs/img/iter.PNG" />

## License

104 changes: 104 additions & 0 deletions benches/append.rs
@@ -0,0 +1,104 @@
use criterion::{
    criterion_group, criterion_main, measurement::WallTime, BenchmarkGroup, BenchmarkId, Criterion,
};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;

#[derive(Clone, Copy)]
enum Action {
    PushBack(u32),
    PushFront(u32),
}

fn get_test_data(n: usize) -> Vec<Action> {
    let mut rng = ChaCha8Rng::seed_from_u64(6523);
    let vec: Vec<_> = (0..n)
        .map(|_| match rng.gen::<f32>() {
            x if x < 0.5 => Action::PushBack(rng.gen()),
            _ => Action::PushFront(rng.gen()),
        })
        .collect();
    vec
}
fn get_orx_linked_list(actions: &[Action]) -> orx_linked_list::DoublyLinkedList<u32> {
    let mut list = orx_linked_list::DoublyLinkedList::new();
    for action in actions {
        match action {
            Action::PushBack(x) => list.push_back(*x),
            Action::PushFront(x) => list.push_front(*x),
        };
    }
    list
}
fn get_std_linked_list(actions: &[Action]) -> std::collections::LinkedList<u32> {
    let mut list = std::collections::LinkedList::new();
    for action in actions {
        match action {
            Action::PushBack(x) => list.push_back(*x),
            Action::PushFront(x) => list.push_front(*x),
        };
    }
    list
}
fn get_std_vecdeque(actions: &[Action]) -> std::collections::VecDeque<u32> {
    let mut list = std::collections::VecDeque::new();
    for action in actions {
        match action {
            Action::PushBack(x) => list.push_back(*x),
            Action::PushFront(x) => list.push_front(*x),
        };
    }
    list
}

// variants
fn bench_orx_linked_list(group: &mut BenchmarkGroup<'_, WallTime>, data: &[Action], n: &usize) {
    group.bench_with_input(
        BenchmarkId::new("orx_linked_list::DoublyLinkedList", n),
        n,
        |b, _| {
            let mut list = get_orx_linked_list(data);
            b.iter(|| list.append_back(get_orx_linked_list(data)))
        },
    );
}

fn bench(c: &mut Criterion) {
    let treatments = vec![
        1_024,
        1_024 * 4,
        1_024 * 16,
        1_024 * 16 * 4,
        1_024 * 16 * 4 * 4,
    ];

    let mut group = c.benchmark_group("append");

    for n in &treatments {
        let data = get_test_data(*n);

        bench_orx_linked_list(&mut group, &data, n);

        group.bench_with_input(
            BenchmarkId::new("std::collections::LinkedList", n),
            n,
            |b, _| {
                let mut list = get_std_linked_list(&data);
                b.iter(|| list.append(&mut get_std_linked_list(&data)))
            },
        );
        group.bench_with_input(
            BenchmarkId::new("std::collections::VecDeque", n),
            n,
            |b, _| {
                let mut list = get_std_vecdeque(&data);
                b.iter(|| list.append(&mut get_std_vecdeque(&data)))
            },
        );
    }

    group.finish();
}

criterion_group!(benches, bench);
criterion_main!(benches);