ay340 commented Sep 15, 2025

Background
This PR aims to improve the usability of the package by addressing two issues.

  1. As a new user experimenting with the package, I found it hard to guess the correct array dimension arrangement for vectors and matrices when initializing HMMs. Hence, input checks at initialization time were implemented, with messages stating the exact expected dimensions.
  2. Furthermore, I observed that certain inputs are silently projected to fit the dimensionality of the performed operations. This is problematic because the parameters are used in an "implicit" form for calculations, yet appear exactly as the user defined them in the params variable after initialization.

Reproducible sample of the issue

```python
import jax.numpy as jnp
import jax.random as jr
from dynamax.hidden_markov_model import models  # assumed import path for the HMM models module

def test_faulty_ARHMM_config():

    # faulty initial probs: probabilities don't sum to 1
    initial_probs = jnp.array([0., 3.])
    # faulty transition matrix: below is interpreted as a 2x1 matrix and
    # zero-padded on the right as needed for the calculations
    transition_matrix = jnp.array([[0.75], [0.80]])
    # possibly unintended projection of the emission biases: each row
    # element is expanded to fill a 2x2 matrix
    emission_biases = jnp.array([[100.],
                                 [-44.]])

    num_states = 2
    emission_dim = 2

    # Construct the linear ARHMM
    hmm = models.LinearAutoregressiveHMM(num_states=num_states,
                                         emission_dim=emission_dim,
                                         num_lags=1)

    # Initialize the parameters struct with known values
    params, _ = hmm.initialize(key=jr.PRNGKey(42),
                               transition_matrix=transition_matrix,
                               initial_probs=initial_probs,
                               emission_biases=emission_biases,
    )
    regimes, _ = hmm.sample(params, num_timesteps=5, key=jr.PRNGKey(2342))
    print(regimes)
```

The result is the hidden-state vector [1, 0, 0, 0, 0], i.e. an immediate transition from state 1 to state 0 caused by the implicit zero-padding of the transition matrix. The biases vector is likewise used in a row-wise expanded form.
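For context, the row-wise expansion of the biases is ordinary JAX/NumPy broadcasting. A minimal sketch of what happens when the (2, 1) biases array meets a (2, 2) array:

```python
import jax.numpy as jnp

# A (2, 1) column vector broadcasts against a (2, 2) array:
# each row element is duplicated across the columns.
biases = jnp.array([[100.], [-44.]])
expanded = biases + jnp.zeros((2, 2))
print(expanded)
# [[100. 100.]
#  [-44. -44.]]
```

This is why the user-facing params can look different from the values actually used in the computation.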

Solution

The idea is to add shape/sanity checks as early as possible in the model definition, which is usually when model.initialize(**kwargs) is called. The checks are integrated as follows:

  • Initial-state initialization in StandardHMMInitialState: uniform across HMM models; raises an assertion error stating the exact required shape if a different shape is given, and also checks that the vector is stochastic (non-negative entries summing to 1);
  • Transition initialization in StandardHMMTransitions: same as above; additionally, each row of the matrix is checked for stochasticity;
  • Emission initialization: model-specific. I implemented checks only for CategoricalHMM and LinearAutoregressiveHMM because those are the models I currently use. Given positive feedback, extending the checks to the remaining models is straightforward.
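To illustrate the kind of check described above, here is a hypothetical standalone helper (not the exact code in this PR) that validates a transition matrix and would have rejected the faulty 2x1 input from the sample:

```python
import jax.numpy as jnp

def check_transition_matrix(transition_matrix, num_states):
    """Hypothetical sketch of the shape/stochasticity checks this PR adds."""
    expected = (num_states, num_states)
    # shape check with the exact expected dimensions in the message
    assert transition_matrix.shape == expected, (
        f"transition_matrix must have shape {expected}, "
        f"got {transition_matrix.shape}")
    # row stochasticity: non-negative entries, each row sums to 1
    assert jnp.all(transition_matrix >= 0), \
        "transition_matrix entries must be non-negative"
    assert jnp.allclose(transition_matrix.sum(axis=-1), 1.0), \
        "each row of transition_matrix must sum to 1"

# The faulty matrix from the sample above now fails loudly instead of
# being silently zero-padded:
try:
    check_transition_matrix(jnp.array([[0.75], [0.80]]), num_states=2)
except AssertionError as e:
    print(e)
```

A valid row-stochastic matrix such as [[0.6, 0.4], [0.2, 0.8]] passes all three checks unchanged.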

Let me know if there are any questions or you need further clarification!
