Rewrite README #3

Merged: 1 commit, Jan 30, 2024
47 changes: 33 additions & 14 deletions README.md
@@ -1,25 +1,37 @@
# PHFT

**PH**ast**FT** (PHFT) is a high-performance, "quantum-inspired" Fast Fourier Transform (FFT) library written in pure
and
safe Rust.
and safe Rust. It is the fastest pure-Rust FFT library according to our benchmarks.

What's with the name? Great question!
## Features

The name, **PHFT**, is derived from the implementation of the
[Quantum Fourier Transform](https://en.wikipedia.org/wiki/Quantum_Fourier_transform) (QFT). Namely, the
[quantum circuit implementation of QFT](https://en.wikipedia.org/wiki/Quantum_Fourier_transform#Circuit_implementation)
consists of the **P**hase gates and **H**adamard gates. Hence, **PH**ast**FT**.
- Takes advantage of the latest CPU features up to and including AVX-512, but performs well even without them.
- Zero `unsafe` code
- Python bindings (via [PyO3](https://github.com/PyO3/pyo3)).
- Optional parallelization of some steps to 2 threads (with even more parallelization planned).
- Did we mention it is really fast?!

In general, the FFT is equivalent to applying gates to all qubits in `[0, n)`. This approach creates the opportunity to
leverage the same memory access patterns as a high-performance quantum state simulator. This results in a fast and
efficient FFT implementation that surpasses the performance of existing Rust FFT crates, including RustFFT.
## Limitations

## Features
- No runtime CPU feature detection (yet). Right now, achieving the highest performance requires compiling with `-C target-cpu=native` or [`cargo multivers`](https://github.com/ronnychevalier/cargo-multivers).
- Requires a nightly Rust compiler due to the use of portable SIMD.
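
As a rough illustration of what "no runtime CPU feature detection" means in practice: the SIMD feature set is fixed when the crate is compiled, for example with `RUSTFLAGS="-C target-cpu=native"` on a nightly toolchain. The sketch below reports which level a binary was compiled for. The function name `compiled_simd_level` is hypothetical and not part of PHFT, and this is not how PHFT itself selects code paths.

```rust
/// Report the widest x86 SIMD feature this binary was *compiled* with.
/// Hypothetical helper for illustration only; not part of the PHFT API.
/// `cfg!` is evaluated at compile time, so the answer never changes at runtime.
fn compiled_simd_level() -> &'static str {
    if cfg!(target_feature = "avx512f") {
        "avx512f"
    } else if cfg!(target_feature = "avx2") {
        "avx2"
    } else {
        // Default target baseline (or a non-x86 target).
        "baseline"
    }
}
```

Building the same source with and without `-C target-cpu=native` would be expected to change the returned string on a machine that supports AVX2 or AVX-512.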

## How is it so fast?

PHFT is designed around the capabilities and limitations of modern hardware (that is, anything made in the last 10 years or so).

The two major bottlenecks in an FFT are **CPU cycles** and **memory accesses**.

- Performance ...
- Python bindings (via PyO3) ...
- Safety ...
We picked an FFT algorithm that maps well to modern CPUs. The implementation can make use of the latest CPU features such as AVX-512, but performs well even without them.

Our key insight for speeding up memory accesses is that FFT is equivalent to applying gates to all qubits in `[0, n)`.
This creates the opportunity to leverage the same memory access patterns as a [high-performance quantum state simulator](https://github.com/QuState/spinoza).
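
To make the gate analogy concrete, here is a minimal scalar sketch, assuming a textbook radix-2 decimation-in-frequency FFT over separate real and imaginary slices. Each pass treats one bit `t` of the index as the "target qubit" and pairs every index with the one that differs only in that bit. The names `butterfly_pass` and `fft_dif_bit_reversed` are hypothetical; this is not PHFT's actual implementation, which is vectorized and tuned.

```rust
use std::f64::consts::PI;

/// One "gate" pass on target bit `t`: each index with bit `t` clear is paired
/// with the index that differs only in that bit (hypothetical helper).
fn butterfly_pass(re: &mut [f64], im: &mut [f64], t: u32) {
    let n = re.len();
    let dist = 1usize << t; // distance between the two paired elements
    let block = dist << 1; // size of the sub-transform handled by this pass
    for start in (0..n).step_by(block) {
        for j in 0..dist {
            let (i0, i1) = (start + j, start + j + dist);
            // Twiddle factor w = exp(-2*pi*i * j / block).
            let angle = -2.0 * PI * (j as f64) / (block as f64);
            let (wr, wi) = (angle.cos(), angle.sin());
            let (ar, ai) = (re[i0], im[i0]);
            let (br, bi) = (re[i1], im[i1]);
            // Upper output: a + b.
            re[i0] = ar + br;
            im[i0] = ai + bi;
            // Lower output: (a - b) * w.
            let (dr, di) = (ar - br, ai - bi);
            re[i1] = dr * wr - di * wi;
            im[i1] = dr * wi + di * wr;
        }
    }
}

/// Apply the pass to every bit from highest to lowest.
/// The result is the DFT of the input in bit-reversed order.
fn fft_dif_bit_reversed(re: &mut [f64], im: &mut [f64]) {
    let n = re.len();
    assert!(n.is_power_of_two() && im.len() == n);
    let bits = n.trailing_zeros();
    for t in (0..bits).rev() {
        butterfly_pass(re, im, t);
    }
}
```

Because the output is left in bit-reversed order, a separate reordering step is still needed to obtain the spectrum in natural order.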

We also use the Cache-Optimal Bit Reversal Algorithm ([COBRA](https://csaws.cs.technion.ac.il/~itai/Courses/Cache/bit.pdf))
on large datasets and optionally run it on 2 parallel threads, accelerating it even further.
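
For readers unfamiliar with the bit-reversal step, the following is a minimal sketch of the permutation itself. It is the naive in-place version, not COBRA: COBRA computes the same permutation but processes the data in cache-sized blocks. The function name `bit_reverse_permute` is hypothetical and not part of PHFT.

```rust
/// Naive in-place bit-reversal permutation (not COBRA; illustration only).
/// The element at index `i` is swapped with the element at the index obtained
/// by reversing the low log2(n) bits of `i`.
fn bit_reverse_permute<T>(data: &mut [T]) {
    let n = data.len();
    assert!(n.is_power_of_two());
    let bits = n.trailing_zeros();
    if bits == 0 {
        return; // a single element is already in place
    }
    for i in 0..n {
        // Reverse all bits of the word, then keep only the top `bits` of them.
        let j = i.reverse_bits() >> (usize::BITS - bits);
        if i < j {
            data.swap(i, j); // swap each pair exactly once
        }
    }
}
```

The permutation is its own inverse, so applying it twice restores the original order. Its scattered access pattern is what makes a cache-aware variant worthwhile on large inputs.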

All of this combined results in a fast and efficient FFT implementation that surpasses the performance of existing Rust FFT crates,
including [RustFFT](https://crates.io/crates/rustfft/), on both large and small inputs, while using significantly less memory.

## Getting Started

@@ -88,3 +100,10 @@ Finally, run:
```bash
./profile.sh
```

## What's with the name?

The name, **PHFT**, is derived from the implementation of the
[Quantum Fourier Transform](https://en.wikipedia.org/wiki/Quantum_Fourier_transform) (QFT). Namely, the
[quantum circuit implementation of QFT](https://en.wikipedia.org/wiki/Quantum_Fourier_transform#Circuit_implementation)
consists of **P**hase gates and **H**adamard gates. Hence, **PH**ast**FT**.