Improve robustness #12

Open · timholy wants to merge 5 commits into master from teh/robust

Conversation

@timholy (Contributor) commented Nov 19, 2023

This eliminates the common failures observed in #11 (comment).

The strategy is to stop updating a component when non-finite values are encountered. Obviously this only helps if robust=true, though I'm a little unsure when one wouldn't want that.
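Schematically, the guard looks like this (a sketch only; names like update_component! are illustrative, not the package's actual internals):

K = size(γ, 2)
for k in 1:K
    w = view(γ, :, k)                  # responsibilities for component k
    all(isfinite, w) || continue       # non-finite values: freeze this component
    update_component!(dists, k, y, w)  # hypothetical per-component M-step update
end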

Commit: This eliminates the common failures observed in dmetivie#11 (comment)

@timholy mentioned this pull request Nov 19, 2023
@dmetivie (Owner) commented

I am curious: don't you think all these checks for each n will dramatically decrease performance?

@timholy (Contributor, Author) commented Nov 20, 2023

I haven't benchmarked it; do you have a test problem you want me to try it on?

But won't your replace call do the same thing, except making two traversals through the whole array rather than one?
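To illustrate what I mean (hypothetical code, not the actual loop in fit_em.jl): a post-hoc replace! pass walks the whole array a second time, whereas the check can be fused into the loop that fills it:

# two traversals: fill LL, then sanitize it in a second pass
for k in 1:K, n in 1:N
    LL[n, k] = logpdf(dists[k], y[n])
end
replace!(LL, NaN => -Inf)

# one traversal: sanitize while filling
for k in 1:K, n in 1:N
    v = logpdf(dists[k], y[n])
    LL[n, k] = isnan(v) ? -Inf : v
end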

@timholy (Contributor, Author) commented Nov 21, 2023

julia> using ExpectationMaximization, BenchmarkTools  # for E_step! and @btime below

julia> N, K = 1000, 3;

julia> LL = Matrix{Float64}(undef, N, K);

julia> c = Vector{Float64}(undef, N);

julia> γ = similar(LL);

julia> using Distributions

julia> dists = [Normal(randn(), 1) for _ = 1:K];

julia> α = rand(K); α ./= sum(α);

julia> y = randn(N);

# master branch
julia> @btime ExpectationMaximization.E_step!($LL, $c, $γ, $dists, $α, $y);
  68.838 μs (1 allocation: 15.75 KiB)

julia> @btime ExpectationMaximization.E_step!($LL, $c, $γ, $dists, $α, $y; robust=true);
  78.249 μs (1 allocation: 15.75 KiB)

# I'm using Revise
shell> git checkout teh/robust
Switched to branch 'teh/robust'
Your branch is up to date with 'myfork/teh/robust'.

julia> @btime ExpectationMaximization.E_step!($LL, $c, $γ, $dists, $α, $y);
  69.083 μs (1 allocation: 15.75 KiB)

julia> @btime ExpectationMaximization.E_step!($LL, $c, $γ, $dists, $α, $y; robust=true);
  80.663 μs (1 allocation: 15.75 KiB)

My original implementation did have a small performance hit, but this one doesn't really.

src/fit_em.jl — review comment (outdated, resolved)
@dmetivie (Owner) commented

Ideally, we should add tests. I'm not sure exactly what, though.

@timholy (Contributor, Author) commented Nov 29, 2023

How about just running it a lot of times, as in #11 (comment)? It doesn't take much to trigger this error.
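For instance, something along these lines (a sketch with made-up values, not the exact reproducer from #11):

using Random, Distributions, ExpectationMaximization

Random.seed!(0)
y = vcat(randn(300) .- 2.0, randn(300) .+ 2.0)  # two well-separated modes
failed = map(1:100) do _
    # random, often poor, 3-component initial guess
    mix0 = MixtureModel([Normal(randn(), 0.5 + rand()) for _ in 1:3])
    try
        fit_mle(mix0, y, maxiter = 100)  # without robust=true this can throw
        false
    catch
        true
    end
end
@show count(failed)  # expect at least a few failures without robust=true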

@dmetivie (Owner) commented

I was trying to recreate a case where dropout appears, in order to write a test showing that robust = true is useful. Inspired by your case, I did:

using Test, Random, Distributions, ExpectationMaximization

@testset "Test robustness against dropout issue" begin
    # See https://github.com/dmetivie/ExpectationMaximization.jl/issues/11
    # In this example, one of the mixture weights goes to zero, producing at iteration 3 an
    # ERROR: PosDefException: matrix is not Hermitian; Cholesky factorization failed.
    Random.seed!(1234)

    N = 600

    ctrue = [[-0.3, 1],
             [-0.4, 0.7],
             [0.4, -0.6]]
    X = reduce(hcat, [randn(length(c), N÷3) .+ c for c in ctrue])
    mix_bad_guess = MixtureModel([MvNormal([1.6, -2.4], [100 0.0; 0.0 1]), MvNormal([-1.1, -0.6], 0.01), MvNormal([0.4, 2.4], 1)])

    fit_mle(mix_bad_guess, X, maxiter = 1)

    # make sure our test case is problematic without the robust option
    @test_throws Exception fit_mle(mix_bad_guess, X, maxiter = 20) # triggers the PosDefException
    begin
        #! no error thrown; however, the EM converges to some bad local maximum!
        mix_mle_bad = fit_mle(mix_bad_guess, X, maxiter = 2000, robust = true)
        @test true
    end
    begin
        #! no error thrown; however, the SEM has one mixture component with zero probability (which remains the same at every iteration)
        mix_mle_S = fit_mle(mix_bad_guess, X, method = StochasticEM(), maxiter = 2000)
        @test true
    end
end

Not sure if it is the best test syntax; I just wanted to check that these runs complete despite the pathological starting guess.

Now we basically allow convergence to "weird" local minima. I am not sure whether we should throw a warning or not.
I also robustified against zero probability in the StochasticEM.
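The idea, roughly (an illustrative sketch, not the actual code; z, dists, and y are placeholder names): in the stochastic E-step, a component that receives no sampled observations would otherwise be fit to an empty subsample, so its update is skipped instead:

# z[n] = cluster label sampled for observation n in the stochastic E-step
for k in 1:K
    idx = findall(==(k), z)
    isempty(idx) && continue  # component got zero probability: keep its old parameters
    dists[k] = fit_mle(typeof(dists[k]), y[idx])
end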
