Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: enable reusing fft plans #271

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 3 additions & 15 deletions .github/workflows/UnitTest.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,36 +7,24 @@ on:
branches:
- master
pull_request:
schedule:
- cron: '20 00 1 * *'

jobs:
test:
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
julia-version: ['1.6', '1', 'nightly']
julia-version: ['1.10', '1']
os: [ubuntu-latest, windows-latest, macOS-latest]

steps:
- uses: actions/checkout@v1.0.0
- uses: actions/checkout@v4
- name: "Set up Julia"
uses: julia-actions/setup-julia@v1
with:
version: ${{ matrix.julia-version }}

- name: Cache artifacts
uses: actions/cache@v1
env:
cache-name: cache-artifacts
with:
path: ~/.julia/artifacts
key: ${{ runner.os }}-test-${{ env.cache-name }}-${{ hashFiles('**/Project.toml') }}
restore-keys: |
${{ runner.os }}-test-${{ env.cache-name }}-
${{ runner.os }}-test-
${{ runner.os }}-
- uses: julia-actions/cache@v1

- name: "Unit Test"
uses: julia-actions/julia-runtest@v1
Expand Down
4 changes: 3 additions & 1 deletion Project.toml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ ImageCore = "a09fc81d-aa75-5fe9-8630-4744c3626534"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
OffsetArrays = "6fe1bfb0-de20-5000-8ca7-80f57d26f881"
PrecompileTools = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
RealFFTs = "3645faec-0534-4e91-afef-021db0981eec"
Reexport = "189a3867-3050-52da-a836-e630ba90ab69"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
Expand All @@ -31,10 +32,11 @@ ImageCore = "0.10"
OffsetArrays = "1.9"
PrecompileTools = "1"
Reexport = "1.1"
RealFFTs = "1"
StaticArrays = "0.10, 0.11, 0.12, 1.0"
Statistics = "1"
TiledIteration = "0.2, 0.3, 0.4, 0.5"
julia = "1.6"
julia = "1.10"

[extras]
AxisArrays = "39de3d68-74b9-583c-8d2d-e117c070f3a9"
Expand Down
22 changes: 20 additions & 2 deletions docs/src/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ processing.

The main functions provided by this package are:

| Function | Action |
| Function | Action |
|:-------------------------|:---------------|
|[`imfilter`](@ref) | Filter a one, two or multidimensional array img with a kernel by computing their correlation |
|[`imfilter!`](@ref) | Filter an array img with kernel kernel by computing their correlation, storing the result in imgfilt |
Expand All @@ -25,7 +25,7 @@ The main functions provided by this package are:
|[`findlocalminima`](@ref) | Returns the coordinates of elements whose value is smaller than all of their immediate neighbors |
|[`findlocalmaxima`](@ref) | Returns the coordinates of elements whose value is larger than all of their immediate neighbors |

Common kernels (filters) are organized in the `Kernel` and `KernelFactors` modules.
Common kernels (filters) are organized in the `Kernel` and `KernelFactors` modules.

A common task in image processing and computer vision is computing
image *gradients* (derivatives), for which there is the dedicated
Expand Down Expand Up @@ -80,6 +80,24 @@ transform (FFT). By default, this choice is made based on kernel
size. You can manually specify the algorithm using [`Algorithm.FFT()`](@ref)
or [`Algorithm.FIR()`](@ref).

#### Reusing FFT plans

It is possible to reuse FFT plans if the operation is going to be done on the
same array type and dimensions i.e. on each image of an image stack

```julia
using ImageFiltering, ComputationalResources
imgstack = rand(Float64, 200, 100, 10)
imgstack_filtered = similar(imgstack)

kernel = ImageFiltering.factorkernel(Kernel.LoG(1))
fft_planned = CPU1(ImageFiltering.planned_fft(imgstack_filtered[:,:,1], kernel))

for i in axes(imgstack, 3)
imfilter!(fft_planned, imgstack_filtered[:,:,i], imgstack[:,:,i], kernel)
end
```

### Feature: Multithreading

If you launch Julia with `JULIA_NUM_THREADS=n` (where `n > 1`), then
Expand Down
13 changes: 11 additions & 2 deletions src/ImageFiltering.jl
Original file line number Diff line number Diff line change
@@ -1,12 +1,14 @@
module ImageFiltering

using FFTW
using RealFFTs
using ImageCore, FFTViews, OffsetArrays, StaticArrays, ComputationalResources, TiledIteration
# Where possible we avoid a direct dependency to reduce the number of [compat] bounds
# using FixedPointNumbers: Normed, N0f8 # reexported by ImageCore
using ImageCore.MappedArrays
using Statistics, LinearAlgebra
using Base: Indices, tail, fill_to_length, @pure, depwarn, @propagate_inbounds
import Base: copy!
using OffsetArrays: IdentityUnitRange # using the one in OffsetArrays makes this work with multiple Julia versions
using SparseArrays # only needed to fix an ambiguity in borderarray
using Reexport
Expand All @@ -30,7 +32,8 @@ export Kernel, KernelFactors,
imgradients, padarray, centered, kernelfactors, reflect,
freqkernel, spacekernel,
findlocalminima, findlocalmaxima,
blob_LoG, BlobLoG
blob_LoG, BlobLoG,
planned_fft

FixedColorant{T<:Normed} = Colorant{T}
StaticOffsetArray{T,N,A<:StaticArray} = OffsetArray{T,N,A}
Expand All @@ -50,10 +53,16 @@ function Base.transpose(A::StaticOffsetArray{T,2}) where T
end

module Algorithm
import FFTW
# deliberately don't export these, but it's expected that they
# will be used as Algorithm.FFT(), etc.
abstract type Alg end
"Filter using the Fast Fourier Transform" struct FFT <: Alg end
"Filter using the Fast Fourier Transform" struct FFT <: Alg
plan1::Union{Function,Nothing}
plan2::Union{Function,Nothing}
plan3::Union{Function,Nothing}
end
FFT() = FFT(nothing, nothing, nothing)
"Filter using a direct algorithm" struct FIR <: Alg end
"Cache-efficient filtering using tiles" struct FIRTiled{N} <: Alg
tilesize::Dims{N}
Expand Down
60 changes: 57 additions & 3 deletions src/imfilter.jl
Original file line number Diff line number Diff line change
Expand Up @@ -826,7 +826,7 @@ function _imfilter_fft!(r::AbstractCPU{FFT},
for I in CartesianIndices(axes(kern))
krn[I] = kern[I]
end
Af = filtfft(A, krn)
Af = filtfft(A, krn, r.settings.plan1, r.settings.plan2, r.settings.plan3)
if map(first, axes(out)) == map(first, axes(Af))
R = CartesianIndices(axes(out))
copyto!(out, R, Af, R)
Expand All @@ -837,13 +837,67 @@ function _imfilter_fft!(r::AbstractCPU{FFT},
src = view(FFTView(Af), axes(dest)...)
copyto!(dest, src)
end
out
return out
end

function buffered_planned_rfft(a::AbstractArray{T}) where {T}
buf = RealFFTs.RCpair{T}(undef, size(a))
plan = RealFFTs.plan_rfft!(buf; flags=FFTW.MEASURE)
return function (arr::AbstractArray{T}) where {T}
copy!(buf, OffsetArrays.no_offset_view(arr))
return plan(buf)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No test coverage here and below

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I hadn't noticed this. Sadly it turns out the new functionality isn't passing yet..

end
function buffered_planned_irfft(a::AbstractArray{T}) where {T}
buf = RealFFTs.RCpair{T}(undef, size(a))
plan = RealFFTs.plan_irfft!(buf; flags=FFTW.MEASURE)
return function (arr::AbstractArray{T}) where {T}
copy!(buf, OffsetArrays.no_offset_view(arr))
return plan(buf)
end
end

function planned_fft(A::AbstractArray{T,N},
kernel::ProcessedKernel,
border::BorderSpecAny=Pad(:replicate)
) where {T<:AbstractFloat,N}
bord = border(kernel, A, Algorithm.FFT())
_A = padarray(T, A, bord)
bfp1 = buffered_planned_rfft(_A)
kern = samedims(_A, kernelconv(kernel...))
krn = FFTView(zeros(eltype(kern), map(length, axes(_A))))
bfp2 = buffered_planned_rfft(krn)
bfp3 = buffered_planned_irfft(_A)
return Algorithm.FFT(bfp1, bfp2, bfp3)
end
planned_fft(A::AbstractArray, kernel, border::AbstractString) = planned_fft(A, kernel, borderinstance(border))
planned_fft(A::AbstractArray, kernel::Union{ArrayLike,Laplacian}, border::BorderSpecAny) = planned_fft(A, factorkernel(kernel), border)

function filtfft(A, krn, planned_rfft1::Function, planned_rfft2::Function, planned_irfft::Function)
B = complex(planned_rfft1(A))
B .*= conj!(complex(planned_rfft2(krn)))
return real(planned_irfft(complex(B)))
end
# TODO: this Colorant method does not work. See TODO below
function filtfft(A::AbstractArray{C}, krn, planned_rfft1::Function, planned_rfft2::Function, planned_irfft::Function) where {C<:Colorant}
Av, dims = channelview_dims(A)
kernrs = kreshape(C, krn)
B = complex(planned_rfft1(Av, dims)) # TODO: dims is not supported by planned_rfft1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think dims can be a point of flexibility: these plans are specific to the memory layout of the array. (The planning explores various implementations and picks the fastest discovered; performance is strongly dependent on memory layout, so the choice for one layout may not be the same as another.) You'd have to create a plan specifically to the colorant array-type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this comment to the TODO. I think I'd prefer to fix this in a follow on PR, if that sounds ok

# Quoting Tim Holy in https://github.com/JuliaImages/ImageFiltering.jl/pull/271/files#r1559210348
# I don't think dims can be a point of flexibility: these plans are specific to the
# memory layout of the array. (The planning explores various implementations and picks
# the fastest discovered; performance is strongly dependent on memory layout, so the
# choice for one layout may not be the same as another.) You'd have to create a plan
# specifically to the colorant array-type.
B .*= conj!(complex(planned_rfft2(kernrs)))
Avf = real(planned_irfft(complex(B)))
return colorview(base_colorant_type(C){eltype(Avf)}, Avf)
end
filtfft(A, krn, ::Nothing, ::Nothing, ::Nothing) = filtfft(A, krn)
function filtfft(A, krn)
B = rfft(A)
B .*= conj!(rfft(krn))
irfft(B, length(axes(A, 1)))
return irfft(B, length(axes(A, 1)))
end
function filtfft(A::AbstractArray{C}, krn) where {C<:Colorant}
Av, dims = channelview_dims(A)
Expand Down
39 changes: 24 additions & 15 deletions test/2d.jl
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,14 @@ using ImageFiltering: borderinstance
end
end

# TODO: extend planned_fft to support more types than just floats
function supported_algs(img::AbstractArray{T}, kernel, border) where {T<:AbstractFloat}
return (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT(), planned_fft(img, kernel, border))
end
function supported_algs(img::AbstractArray, kernel, border)
return (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
end

@testset "FIR/FFT" begin
f32type(img) = f32type(eltype(img))
f32type(::Type{C}) where {C<:Colorant} = base_colorant_type(C){Float32}
Expand All @@ -50,6 +58,7 @@ end
# Dense inseparable kernel
kern = [0.1 0.2; 0.4 0.5]
kernel = OffsetArray(kern, -1:0, 1:2)
border = Inner()
for img in (imgf, imgi, imgg, imgc)
targetimg = zeros(typeof(img[1]*kern[1]), size(img))
targetimg[3:4,2:3] = rot180(kern) .* img[3,4]
Expand All @@ -66,7 +75,7 @@ end
@test @inferred(imfilter(f32type(img), img, kernel, border)) ≈ float32.(targetimg)
fill!(ret, zero(eltype(ret)))
@test @inferred(imfilter!(ret, img, kernel, border)) ≈ targetimg
for alg in (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
for alg in supported_algs(img, kernel, border)
@test @inferred(imfilter(img, kernel, border, alg)) ≈ targetimg
@test @inferred(imfilter(img, (kernel,), border, alg)) ≈ targetimg
@test @inferred(imfilter(f32type(img), img, kernel, border, alg)) ≈ float32.(targetimg)
Expand All @@ -76,12 +85,12 @@ end
@test_throws MethodError imfilter!(CPU1(Algorithm.FIR()), ret, img, kernel, border, Algorithm.FFT())
end
targetimg_inner = OffsetArray(targetimg[2:end, 1:end-2], 2:5, 1:5)
@test @inferred(imfilter(img, kernel, Inner())) ≈ targetimg_inner
@test @inferred(imfilter(f32type(img), img, kernel, Inner())) ≈ float32.(targetimg_inner)
for alg in (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
@test @inferred(imfilter(img, kernel, Inner(), alg)) ≈ targetimg_inner
@test @inferred(imfilter(f32type(img), img, kernel, Inner(), alg)) ≈ float32.(targetimg_inner)
@test @inferred(imfilter(CPU1(alg), img, kernel, Inner())) ≈ targetimg_inner
@test @inferred(imfilter(img, kernel, border)) ≈ targetimg_inner
@test @inferred(imfilter(f32type(img), img, kernel, border)) ≈ float32.(targetimg_inner)
for alg in supported_algs(img, kernel, border)
@test @inferred(imfilter(img, kernel, border, alg)) ≈ targetimg_inner
@test @inferred(imfilter(f32type(img), img, kernel, border, alg)) ≈ float32.(targetimg_inner)
@test @inferred(imfilter(CPU1(alg), img, kernel, border)) ≈ targetimg_inner
end
end
# Factored kernel
Expand All @@ -96,7 +105,7 @@ end
for border in ("replicate", "circular", "symmetric", "reflect", Fill(zero(eltype(img))))
@test @inferred(imfilter(img, kernel, border)) ≈ targetimg
@test @inferred(imfilter(f32type(img), img, kernel, border)) ≈ float32.(targetimg)
for alg in (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
for alg in supported_algs(img, kernel, border)
@test @inferred(imfilter(img, kernel, border, alg)) ≈ targetimg
@test @inferred(imfilter(f32type(img), img, kernel, border, alg)) ≈ float32.(targetimg)
end
Expand All @@ -106,7 +115,7 @@ end
targetimg_inner = OffsetArray(targetimg[2:end, 1:end-2], 2:5, 1:5)
@test @inferred(imfilter(img, kernel, Inner())) ≈ targetimg_inner
@test @inferred(imfilter(f32type(img), img, kernel, Inner())) ≈ float32.(targetimg_inner)
for alg in (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
for alg in supported_algs(img, kernel, border)
@test @inferred(imfilter(img, kernel, Inner(), alg)) ≈ targetimg_inner
@test @inferred(imfilter(f32type(img), img, kernel, Inner(), alg)) ≈ float32.(targetimg_inner)
end
Expand All @@ -122,7 +131,7 @@ end
for border in ("replicate", "circular", "symmetric", "reflect", Fill(zero(eltype(img))))
@test @inferred(imfilter(img, kernel, border)) ≈ targetimg
@test @inferred(imfilter(f32type(img), img, kernel, border)) ≈ float32.(targetimg)
for alg in (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
for alg in supported_algs(img, kernel, border)
if alg == Algorithm.FFT() && eltype(img) == Int
@test @inferred(imfilter(Float64, img, kernel, border, alg)) ≈ targetimg
else
Expand All @@ -134,7 +143,7 @@ end
targetimg_inner = OffsetArray(targetimg[2:end-1, 2:end-1], 2:4, 2:6)
@test @inferred(imfilter(img, kernel, Inner())) ≈ targetimg_inner
@test @inferred(imfilter(f32type(img), img, kernel, Inner())) ≈ float32.(targetimg_inner)
for alg in (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
for alg in supported_algs(img, kernel, border)
if alg == Algorithm.FFT() && eltype(img) == Int
@test @inferred(imfilter(Float64, img, kernel, Inner(), alg)) ≈ targetimg_inner
else
Expand Down Expand Up @@ -184,7 +193,7 @@ end
targetimg = target1(img, kern, border)
@test @inferred(imfilter(img, kernel, border)) ≈ targetimg
@test @inferred(imfilter(f32type(img), img, kernel, border)) ≈ float32.(targetimg)
for alg in (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
for alg in supported_algs(img, kernel, border)
@test @inferred(imfilter(img, kernel, border, alg)) ≈ targetimg
@test @inferred(imfilter(f32type(img), img, kernel, border, alg)) ≈ float32.(targetimg)
end
Expand All @@ -195,7 +204,7 @@ end
targetimg = zerona!(copy(targetimg0))
@test @inferred(zerona!(imfilter(img, kernel, border))) ≈ targetimg
@test @inferred(zerona!(imfilter(f32type(img), img, kernel, border))) ≈ float32.(targetimg)
for alg in (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
for alg in supported_algs(img, kernel, border)
@test @inferred(zerona!(imfilter(img, kernel, border, alg), nanflag)) ≈ targetimg
@test @inferred(zerona!(imfilter(f32type(img), img, kernel, border, alg), nanflag)) ≈ float32.(targetimg)
end
Expand All @@ -208,7 +217,7 @@ end
targetimg = target1(img, kern, border)
@test @inferred(imfilter(img, kernel, border)) ≈ targetimg
@test @inferred(imfilter(f32type(img), img, kernel, border)) ≈ float32.(targetimg)
for alg in (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
for alg in supported_algs(img, kernel, border)
@test @inferred(imfilter(img, kernel, border, alg)) ≈ targetimg
@test @inferred(imfilter(f32type(img), img, kernel, border, alg)) ≈ float32.(targetimg)
end
Expand All @@ -219,7 +228,7 @@ end
targetimg = zerona!(copy(targetimg0))
@test @inferred(zerona!(imfilter(img, kernel, border))) ≈ targetimg
@test @inferred(zerona!(imfilter(f32type(img), img, kernel, border))) ≈ float32.(targetimg)
for alg in (Algorithm.FIR(), Algorithm.FIRTiled(), Algorithm.FFT())
for alg in supported_algs(img, kernel, border)
@test @inferred(zerona!(imfilter(img, kernel, border, alg), nanflag)) ≈ targetimg
@test @inferred(zerona!(imfilter(f32type(img), img, kernel, border, alg), nanflag)) ≈ float32.(targetimg)
end
Expand Down
14 changes: 9 additions & 5 deletions test/gabor.jl
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,15 @@ using ImageFiltering, Test, Statistics
size_y = 6*σy+1
γ = σx/σy
# zero size forces default kernel width, with warnings
@info "Four warnings are expected"
kernel = Kernel.gabor(0,0,σx,0,5,γ,0)
@test isequal(size(kernel[1]),(size_x,size_y))
kernel = Kernel.gabor(0,0,σx,π,5,γ,0)
@test isequal(size(kernel[1]),(size_x,size_y))

@test_logs (:warn, r"The input parameter size_") match_mode=:any begin
kernel = Kernel.gabor(0,0,σx,0,5,γ,0)
@test isequal(size(kernel[1]),(size_x,size_y))
end
@test_logs (:warn, r"The input parameter size_") match_mode=:any begin
kernel = Kernel.gabor(0,0,σx,π,5,γ,0)
@test isequal(size(kernel[1]),(size_x,size_y))
end
IanButterworth marked this conversation as resolved.
Show resolved Hide resolved

for x in 0:4, y in 0:4, z in 0:4, t in 0:4
σx1 = 2*x+1
Expand Down
Loading
Loading