Update README #6

Merged
merged 1 commit into from
Feb 3, 2024
24 changes: 16 additions & 8 deletions README.md
@@ -4,31 +4,39 @@
# PhastFT

PhastFT is a high-performance, "quantum-inspired" Fast Fourier
Transform (FFT) library written in pure and safe Rust. It is the fastest
pure-Rust FFT library according to our benchmarks.
Transform (FFT) library written in pure Rust.
Despite its simplicity, it is competitive with and often outperforms
the fastest Rust FFT libraries, including [RustFFT](https://crates.io/crates/rustfft/).

## Features

- Takes advantage of latest CPU features up to and including AVX-512, but performs well even without them.
- Simple implementation using a single, general-purpose FFT algorithm.
- Zero `unsafe` code
- Python bindings (via [PyO3](https://github.com/PyO3/pyo3)).
- Simple implementation using a single, general-purpose FFT algorithm and no costly "planning" step
- Optional parallelization of some steps to 2 threads (with even more planned).
- Takes advantage of latest CPU features up to and including AVX-512, but performs well even without them.
- Optional parallelization of some steps to 2 threads (with even more planned)
- 2x lower memory usage than [RustFFT](https://crates.io/crates/rustfft/)
- Python bindings (via [PyO3](https://github.com/PyO3/pyo3))

## Limitations

- No runtime CPU feature detection (yet). Right now achieving the highest performance requires compiling
with `-C target-cpu=native` or [`cargo multivers`](https://github.com/ronnychevalier/cargo-multivers).
- Requires nightly Rust compiler due to use of portable SIMD
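
To illustrate the first limitation: without runtime detection, the set of SIMD instructions a binary may use is fixed when it is compiled. The sketch below is not PhastFT code. `compiled_simd_level` is a hypothetical helper that uses the standard `cfg!(target_feature = "...")` check to report which x86 feature level the compiler was allowed to assume for this build.

```rust
/// Hypothetical helper (not part of PhastFT): reports the highest x86 SIMD
/// level that was enabled at compile time. The answer is baked into the
/// binary; nothing here inspects the CPU the program is running on.
fn compiled_simd_level() -> &'static str {
    if cfg!(target_feature = "avx512f") {
        "avx512f"
    } else if cfg!(target_feature = "avx2") {
        "avx2"
    } else if cfg!(target_feature = "sse2") {
        "sse2"
    } else {
        // Non-x86 targets, or x86 builds without any of the features above.
        "baseline"
    }
}

fn main() {
    println!("SIMD level assumed at compile time: {}", compiled_simd_level());
}
```

One common way to pass the flag mentioned above is `RUSTFLAGS="-C target-cpu=native" cargo build --release`, which lets the compiler assume every feature of the build machine. A default x86-64 build typically reports only `sse2`.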

## Planned features

- Runtime CPU feature detection
- More multi-threading
- More work on cache-optimal FFT

## How is it so fast?

PhastFT is designed around the capabilities and limitations of modern hardware (that is, anything made in the last 10
years or so).

The two major bottlenecks in FFT are the **CPU cycles** and **memory accesses.**

We picked an FFT algorithm that maps well to modern CPUs. The implementation can make use of latest CPU features such as
We picked an efficient, general-purpose FFT algorithm. Our implementation can make use of latest CPU features such as
AVX-512, but performs well even without them.

Our key insight for speeding up memory accesses is that FFT is equivalent to applying gates to all qubits in `[0, n)`.
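
The analogy can be made concrete with a textbook radix-2 decimation-in-frequency FFT, sketched here under the assumption that the input length is `2^n`. This is not PhastFT's implementation, and `fft_as_gates` and `bit_reverse_permute` are hypothetical names. Each of the `n` stages combines pairs of elements whose indices differ in exactly one bit, which is the same access pattern as applying a single-qubit gate to a state vector of `2^n` amplitudes.

```rust
use std::f64::consts::PI;

/// Reorders both buffers so that element `i` moves to the index obtained by
/// reversing the lowest `n` bits of `i`.
fn bit_reverse_permute(re: &mut [f64], im: &mut [f64], n: u32) {
    if n == 0 {
        return; // a single element is already in order
    }
    let shift = usize::BITS - n;
    for i in 0..re.len() {
        let j = i.reverse_bits() >> shift;
        if i < j {
            re.swap(i, j);
            im.swap(i, j);
        }
    }
}

/// In-place forward FFT over separate real and imaginary buffers.
/// The outer loop visits one "qubit" (index bit) per stage.
fn fft_as_gates(re: &mut [f64], im: &mut [f64]) {
    let len = re.len();
    assert!(len.is_power_of_two() && len == im.len());
    let n = len.trailing_zeros(); // number of "qubits"

    for qubit in (0..n).rev() {
        let dist = 1usize << qubit; // paired indices differ only in this bit
        let chunk = dist << 1;
        for base in (0..len).step_by(chunk) {
            for k in 0..dist {
                let (i, j) = (base + k, base + k + dist);
                let (ar, ai) = (re[i], im[i]);
                let (br, bi) = (re[j], im[j]);
                // Upper output of the butterfly: a + b.
                re[i] = ar + br;
                im[i] = ai + bi;
                // Lower output: (a - b) * w, with w = exp(-2*pi*i*k / chunk).
                let angle = -2.0 * PI * (k as f64) / (chunk as f64);
                let (wi, wr) = angle.sin_cos();
                let (dr, di) = (ar - br, ai - bi);
                re[j] = dr * wr - di * wi;
                im[j] = dr * wi + di * wr;
            }
        }
    }
    // Decimation in frequency leaves the result in bit-reversed order.
    bit_reverse_permute(re, im, n);
}

fn main() {
    // A unit impulse at index 1 transforms to exp(-2*pi*i*k/4) for k = 0..4.
    let mut re = vec![0.0, 1.0, 0.0, 0.0];
    let mut im = vec![0.0; 4];
    fft_as_gates(&mut re, &mut im);
    println!("re = {:?}\nim = {:?}", re, im);
}
```

Viewing the stages this way makes the memory behaviour visible: `dist` is the stride between paired elements, so stages for high bits touch memory far apart while stages for low bits stay within a small contiguous chunk.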
@@ -41,7 +49,7 @@ on large datasets and optionally run it on 2 parallel threads, accelerating it e

All of this combined results in a fast and efficient FFT implementation that surpasses the performance of existing Rust
FFT crates,
including [RustFFT](https://crates.io/crates/rustfft/), on both large and small inputs and while using significantly
including [RustFFT](https://crates.io/crates/rustfft/) on large inputs and while using significantly
less memory.

## Quickstart