
Slice Sampling on Turing model returns constant LP values in Chain #15

Open
dlakelan opened this issue Nov 12, 2024 · 11 comments

Comments

@dlakelan

Rather than returning actual LP values in the return chain, the sampler returns what appears to be the initial values for every sample. Here is a MWE:

using Turing, SliceSampling, StatsPlots

@model function foo()
    a ~ MvNormal(fill(0,3),1.0)
end

sam = sample(foo(),externalsampler(SliceSampling.HitAndRun(SliceSteppingOut(2.))),10,initial_params=fill(10.0,3))

plot(sam["a[1]"])
plot(sam[:lp])

The lp plot is a horizontal line, whereas the samples of a[1] clearly move around and should therefore have different lp values.

This is with SliceSampling 0.6.1 and Turing 0.34.1.

@Red-Portal
Member

Red-Portal commented Nov 12, 2024

Hi @torfjelde @mhauru , is there an additional interface I have to implement on my side to fix this, or is it a bug on the Turing side?

@torfjelde
Member

This honestly looks like a bug on Turing.jl's end; I don't think lp should even be in the chain here 🤷 At least with the current impl, that's not the intention for externalsampler.

@dlakelan
Author

Having the lp values seems really important: one of the easiest ways to diagnose convergence is to check that lp has reached stationarity, in the same region, for all chains.

@Red-Portal
Member

Red-Portal commented Nov 12, 2024

@torfjelde I was guessing that this function would be invoked by Turing to recompute lp whenever necessary, but I guess not?

@dlakelan lp can certainly be used for that purpose, but it doesn't necessarily need to be, and it isn't obviously the best quantity for doing so, no? Is looking at the $\widehat{R}$ of the parameters insufficient?

@dlakelan
Author

@Red-Portal the lp trace gives you a per-draw traceplot, whereas Rhat is a summary statistic over the entire chain. If Rhat is not 1, it doesn't tell you much about what went wrong. For example, maybe 5 out of 6 chains converged in lp to one region while the 6th got stuck; or maybe all chains reach the same region after the 100th sample, so the Rhat of that subset is 1, etc.

I find the traceplot of lp much more informative than summary stats.
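(For reference, a sketch of that multi-chain lp-traceplot diagnostic. It recomputes the log density after sampling with `logjoint`, the Turing/DynamicPPL function mentioned later in this thread; the chain count, draw count, and use of `MCMCThreads` are illustrative, and this is untested against these package versions.)

```julia
# Sketch: recompute lp after sampling and overlay the per-chain traces.
# Assumes the same packages and model as the MWE above.
using Turing, SliceSampling, StatsPlots

@model function foo()
    a ~ MvNormal(fill(0.0, 3), 1.0)
end

# One overdispersed initial point per chain (6 chains, 200 draws each).
inits = [fill(10.0, 3) for _ in 1:6]
sam = sample(foo(), externalsampler(SliceSampling.HitAndRun(SliceSteppingOut(2.0))),
             MCMCThreads(), 200, 6; initial_params = inits)

# logjoint returns a draws-by-chains matrix of log densities; plotting its
# columns gives the per-chain lp traceplot. A chain stuck in a different
# mode shows up as a trace that plateaus in a different lp region.
lp = logjoint(foo(), sam)
plot(lp; xlabel = "iteration", ylabel = "log joint")
```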

@Red-Portal
Member

I find the traceplot of lp much more informative than summary stats.

But wouldn't any trace, like one of the parameters, do the same trick?

@dlakelan
Author

No, it's entirely possible for some parameters to be fully converged while others are stuck in a local optimum or wandering around lost. lp depends on ALL the parameters.

@Red-Portal
Member

Ah you're talking in terms of comparing across chains. Okay yes that makes sense.

@torfjelde
Member

having the lp values seems really important

Sure! But these are quantities that can easily be computed after the fact too :)

julia> @model function foo()
           a ~ MvNormal(fill(0,3),1.0)
       end
foo (generic function with 2 methods)

julia> sam = sample(foo(),externalsampler(SliceSampling.HitAndRun(SliceSteppingOut(2.))),10,initial_params=fill(10.0,3))
Sampling 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:01
Chains MCMC chain (10×4×1 Array{Float64, 3}):

Iterations        = 1:1:10
Number of chains  = 1
Samples per chain = 10
Wall duration     = 1.93 seconds
Compute duration  = 1.93 seconds
parameters        = a[1], a[2], a[3]
internals         = lp

Summary Statistics
  parameters      mean       std      mcse   ess_bulk   ess_tail      rhat   ess_per_sec 
      Symbol   Float64   Float64   Float64    Float64    Float64   Float64       Float64 

        a[1]    4.2874    2.7543    0.8710     8.2309    10.0000    1.3090        4.2735
        a[2]    4.8392    5.4982    2.7238     4.2663    10.0000    1.5820        2.2151
        a[3]    5.8212    2.4718    1.0158     6.3101    10.0000    1.3013        3.2763

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

        a[1]   -0.0302    3.2358    4.1998    5.2023    9.1064
        a[2]   -3.5720    0.1340    6.1599    8.5229   11.5717
        a[3]    2.9935    3.7541    5.4587    7.8328    9.6795


julia> logjoint(foo(), sam)
10×1 Matrix{Float64}:
 -152.75681559961401
 -124.57841649146756
  -79.9584047071337
  -63.01607272527707
  -37.45955307019793
  -37.33701541789175
  -33.948795464804974
  -31.05996528484012
  -21.954809961421393
  -21.59660608994128

Long-term we definitely want to introduce some convenient way to allow external samplers to save more information in the chains, but right now I think the best way is just to compute these things after the fact.

@dlakelan
Author

That's a super useful function to know about, thanks for the pointer.

My only concern about computing it after the fact is when the model is quite costly, for example if you have to solve a PDE for a minute to get the lp. I think there's a tendency to have in mind something like a linear regression rather than a computational fluid dynamics problem or a pharmacokinetic model, but there are good reasons to avoid recomputing lp for some types of models.

@torfjelde
Member

Definitely :) As I said, we do want to make it possible for a sampler wrapped in externalsampler to specify what information should be kept around. At the moment, though, we're mainly seeing usage on models where an additional evaluation per sample isn't really a big issue, so it's probably not something that will be addressed very rapidly (though it is on our TODO 👍)
