Merge pull request #3 from QuState/readme
Rewrite README
smu160 authored Jan 30, 2024
2 parents 666b7d6 + bdc185f commit f07c3b7
Showing 1 changed file with 33 additions and 14 deletions.
47 changes: 33 additions & 14 deletions README.md
@@ -1,25 +1,37 @@
# PHFT

**PH**ast**FT** (PHFT) is a high-performance, "quantum-inspired" Fast Fourier Transform (FFT) library written in pure
and
safe Rust.
and safe Rust. It is the fastest pure-Rust FFT library according to our benchmarks.

What's with the name? Great question!
## Features

The name, **PHFT**, is derived from the implementation of the
[Quantum Fourier Transform](https://en.wikipedia.org/wiki/Quantum_Fourier_transform) (QFT). Namely, the
[quantum circuit implementation of QFT](https://en.wikipedia.org/wiki/Quantum_Fourier_transform#Circuit_implementation)
consists of the **P**hase gates and **H**adamard gates. Hence, **PH**ast**FT**.
- Takes advantage of the latest CPU features up to and including AVX-512, but performs well even without them.
- Zero `unsafe` code.
- Python bindings (via [PyO3](https://github.com/PyO3/pyo3)).
- Optional parallelization of some steps to 2 threads (with even more parallelization planned).
- Did we mention it is really fast?!

In general, the FFT is equivalent to applying gates to all qubits in `[0, n)`. This approach creates the opportunity to
leverage the same memory access patterns as a high-performance quantum state simulator. This results in a fast and
efficient FFT implementation that surpasses the performance of existing Rust FFT crates, including RustFFT.
## Limitations

## Features
- No runtime CPU feature detection (yet). Right now, achieving the highest performance requires compiling with `-C target-cpu=native` or [`cargo multivers`](https://github.com/ronnychevalier/cargo-multivers).
- Requires a nightly Rust compiler due to the use of portable SIMD.

## How is it so fast?

PHFT is designed around the capabilities and limitations of modern hardware (that is, anything made in the last 10 years or so).

The two major bottlenecks in an FFT are **CPU cycles** and **memory accesses**.

- Performance ...
- Python bindings (via PyO3) ...
- Safety ...
We picked an FFT algorithm that maps well to modern CPUs. The implementation can make use of the latest CPU features such as AVX-512, but performs well even without them.

Our key insight for speeding up memory accesses is that FFT is equivalent to applying gates to all qubits in `[0, n)`.
This creates the opportunity to leverage the same memory access patterns as a [high-performance quantum state simulator](https://github.com/QuState/spinoza).
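
To make the gate analogy concrete, here is a minimal sketch of a textbook radix-2 decimation-in-frequency FFT, written so that each stage looks like a single-qubit gate applied to one "target" bit `t`: the stage only combines pairs of elements whose indices differ in bit `t`, which is the access pattern a state-vector simulator uses. This is not PHFT's actual code; the function names `apply_stage` and `fft_dif` and the split real/imaginary slices are assumptions made for illustration, and nothing here is vectorized or tuned.

```rust
use std::f64::consts::PI;

/// Apply one butterfly stage to "target bit" `t` (hypothetical helper).
/// Every pair (i0, i1) differs only in bit `t`, just like the amplitude
/// pairs touched by a single-qubit gate on qubit `t`.
fn apply_stage(re: &mut [f64], im: &mut [f64], t: u32) {
    let n = re.len();
    let dist = 1usize << t;
    for start in (0..n).step_by(2 * dist) {
        for k in 0..dist {
            let (i0, i1) = (start + k, start + k + dist);
            // Twiddle factor exp(-2*pi*i*k / (2*dist)): the "phase" part.
            let angle = -PI * (k as f64) / (dist as f64);
            let (w_re, w_im) = (angle.cos(), angle.sin());
            let (a_re, a_im) = (re[i0], im[i0]);
            let (b_re, b_im) = (re[i1], im[i1]);
            // Sum and difference: the (unnormalized) "Hadamard" part.
            re[i0] = a_re + b_re;
            im[i0] = a_im + b_im;
            let (d_re, d_im) = (a_re - b_re, a_im - b_im);
            re[i1] = d_re * w_re - d_im * w_im;
            im[i1] = d_re * w_im + d_im * w_re;
        }
    }
}

/// Unnormalized forward FFT for power-of-two lengths.
/// The output is left in bit-reversed order.
fn fft_dif(re: &mut [f64], im: &mut [f64]) {
    let n = re.len();
    assert!(n.is_power_of_two() && im.len() == n);
    let bits = n.trailing_zeros();
    // Visit the "qubits" from the most significant bit down to bit 0.
    for t in (0..bits).rev() {
        apply_stage(re, im, t);
    }
}
```

Calling `fft_dif` on a constant signal of length 4 puts all of the energy into bin 0, and an impulse at index 0 produces a flat spectrum; for general inputs the bins come out in bit-reversed order and still need to be permuted.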

We also use the Cache-Optimal Bit Reversal Algorithm ([COBRA](https://csaws.cs.technion.ac.il/~itai/Courses/Cache/bit.pdf))
on large datasets and optionally run it on 2 parallel threads, accelerating it even further.
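
For readers unfamiliar with the permutation that COBRA accelerates, the sketch below shows a plain, in-place bit-reversal permutation. It is deliberately the naive version, not COBRA: it has none of the cache blocking or threading described above, and the names `reverse_low_bits` and `bit_reverse_permutation` are hypothetical rather than part of PHFT's API.

```rust
/// Reverse the lowest `bits` bits of `i` (hypothetical helper).
fn reverse_low_bits(i: usize, bits: u32) -> usize {
    if bits == 0 {
        return 0; // avoid shifting by the full word width
    }
    i.reverse_bits() >> (usize::BITS - bits)
}

/// Naive in-place bit-reversal permutation.
/// `data.len()` must be a power of two.
fn bit_reverse_permutation<T>(data: &mut [T]) {
    let n = data.len();
    assert!(n.is_power_of_two(), "length must be a power of two");
    let bits = n.trailing_zeros();
    for i in 0..n {
        let j = reverse_low_bits(i, bits);
        // Swap each pair exactly once; indices that are their own
        // bit reversal stay in place.
        if i < j {
            data.swap(i, j);
        }
    }
}
```

On eight elements, index `1` (`001`) trades places with index `4` (`100`) and index `3` (`011`) with index `6` (`110`). The scattered jumps between `i` and `j` are what make this permutation cache-unfriendly on large inputs, which is the problem a cache-optimal variant is meant to address.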

All of this combined results in a fast and efficient FFT implementation that surpasses the performance of existing Rust FFT crates,
including [RustFFT](https://crates.io/crates/rustfft/), on both large and small inputs, while using significantly less memory.

## Getting Started

@@ -88,3 +100,10 @@ Finally, run:
```bash
./profile.sh
```

## What's with the name?

The name, **PHFT**, is derived from the implementation of the
[Quantum Fourier Transform](https://en.wikipedia.org/wiki/Quantum_Fourier_transform) (QFT). Namely, the
[quantum circuit implementation of QFT](https://en.wikipedia.org/wiki/Quantum_Fourier_transform#Circuit_implementation)
consists of the **P**hase gates and **H**adamard gates. Hence, **PH**ast**FT**.
