Improve robustness #12

Open · timholy wants to merge 5 commits into master from teh/robust

Conversation

@timholy (Contributor) commented Nov 19, 2023

This eliminates the common failures observed in #11 (comment).

The strategy is to stop updating a component when non-finite values are encountered. Obviously this only helps if robust=true, though I'm a little unsure when one wouldn't want that.
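Schematically, the guard looks like this (a sketch only; names like update_component! are illustrative, not the package's actual internals):

K = size(γ, 2)
for k in 1:K
    w = view(γ, :, k)                  # responsibilities for component k
    all(isfinite, w) || continue       # non-finite values: freeze this component
    update_component!(dists, k, y, w)  # hypothetical per-component M-step update
end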

Commit: This eliminates the common failures observed in dmetivie#11 (comment)

@timholy mentioned this pull request Nov 19, 2023
@dmetivie (Owner) commented

I am curious: don't you think all these checks for each n will dramatically decrease performance?

@timholy (Contributor, Author) commented Nov 20, 2023

I haven't benchmarked it; do you have a test problem you want me to try it on?

But won't your replace call do the same thing, except making two traversals through the whole array rather than one?
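To illustrate what I mean (hypothetical code, not the actual loop in fit_em.jl): a post-hoc replace! pass walks the whole array a second time, whereas the check can be fused into the loop that fills it:

# two traversals: fill LL, then sanitize it in a second pass
for k in 1:K, n in 1:N
    LL[n, k] = logpdf(dists[k], y[n])
end
replace!(LL, NaN => -Inf)

# one traversal: sanitize while filling
for k in 1:K, n in 1:N
    v = logpdf(dists[k], y[n])
    LL[n, k] = isnan(v) ? -Inf : v
end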

@timholy (Contributor, Author) commented Nov 21, 2023

julia> using ExpectationMaximization, BenchmarkTools  # for E_step! and @btime below

julia> N, K = 1000, 3;

julia> LL = Matrix{Float64}(undef, N, K);

julia> c = Vector{Float64}(undef, N);

julia> γ = similar(LL);

julia> using Distributions

julia> dists = [Normal(randn(), 1) for _ = 1:K];

julia> α = rand(K); α ./= sum(α);

julia> y = randn(N);

# master branch
julia> @btime ExpectationMaximization.E_step!($LL, $c, $γ, $dists, $α, $y);
  68.838 μs (1 allocation: 15.75 KiB)

julia> @btime ExpectationMaximization.E_step!($LL, $c, $γ, $dists, $α, $y; robust=true);
  78.249 μs (1 allocation: 15.75 KiB)

# I'm using Revise
shell> git checkout teh/robust
Switched to branch 'teh/robust'
Your branch is up to date with 'myfork/teh/robust'.

julia> @btime ExpectationMaximization.E_step!($LL, $c, $γ, $dists, $α, $y);
  69.083 μs (1 allocation: 15.75 KiB)

julia> @btime ExpectationMaximization.E_step!($LL, $c, $γ, $dists, $α, $y; robust=true);
  80.663 μs (1 allocation: 15.75 KiB)

My original implementation did have a small performance hit, but this one doesn't really.

src/fit_em.jl — review comment (outdated, resolved)
@dmetivie (Owner) commented

Ideally, we should add tests. I'm not sure exactly what, though.

@timholy (Contributor, Author) commented Nov 29, 2023

How about just running it a lot of times, as in #11 (comment)? It doesn't take much to trigger this error.
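For instance, something along these lines (a sketch with made-up values, not the exact reproducer from #11):

using Random, Distributions, ExpectationMaximization

Random.seed!(0)
y = vcat(randn(300) .- 2.0, randn(300) .+ 2.0)  # two well-separated modes
failed = map(1:100) do _
    # random, often poor, 3-component initial guess
    mix0 = MixtureModel([Normal(randn(), 0.5 + rand()) for _ in 1:3])
    try
        fit_mle(mix0, y, maxiter = 100)  # without robust=true this can throw
        false
    catch
        true
    end
end
@show count(failed)  # expect at least a few failures without robust=true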

@dmetivie (Owner) commented

I was trying to recreate a case where dropout appears, in order to write a test showing that robust = true is useful. Inspired by your case, I did:

using Test, Random, Distributions, ExpectationMaximization

@testset "Test robustness against dropout issue" begin
    # See https://github.com/dmetivie/ExpectationMaximization.jl/issues/11
    # In this example, one of the mixture weights goes to zero, producing at iteration 3 an
    # ERROR: PosDefException: matrix is not Hermitian; Cholesky factorization failed.
    Random.seed!(1234)

    N = 600

    ctrue = [[-0.3, 1],
             [-0.4, 0.7],
             [0.4, -0.6]]
    X = reduce(hcat, [randn(length(c), N÷3) .+ c for c in ctrue])
    mix_bad_guess = MixtureModel([MvNormal([1.6, -2.4], [100 0.0; 0.0 1]), MvNormal([-1.1, -0.6], 0.01), MvNormal([0.4, 2.4], 1)])

    fit_mle(mix_bad_guess, X, maxiter = 1)

    # make sure our test case is problematic without the robust option
    @test_throws Exception fit_mle(mix_bad_guess, X, maxiter = 20) # triggers the PosDefException
    begin
        #! no error thrown; however, the EM converges to some bad local maximum!
        mix_mle_bad = fit_mle(mix_bad_guess, X, maxiter = 2000, robust = true)
        @test true
    end
    begin
        #! no error thrown; however, the SEM has one mixture component with zero probability (which remains the same at every iteration)
        mix_mle_S = fit_mle(mix_bad_guess, X, method = StochasticEM(), maxiter = 2000)
        @test true
    end
end

Not sure if it is the best test syntax; I just wanted to check that these runs complete despite the pathological starting guess.

Now we basically allow convergence to "weird" local minima. I am not sure whether we should throw a warning or not.
I also robustified against zero probability in the StochasticEM.
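The idea, roughly (an illustrative sketch, not the actual code; z, dists, and y are placeholder names): in the stochastic E-step, a component that receives no sampled observations would otherwise be fit to an empty subsample, so its update is skipped instead:

# z[n] = cluster label sampled for observation n in the stochastic E-step
for k in 1:K
    idx = findall(==(k), z)
    isempty(idx) && continue  # component got zero probability: keep its old parameters
    dists[k] = fit_mle(typeof(dists[k]), y[idx])
end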
