
add ICARRV and ICAR #6831

Merged · 12 commits · Aug 18, 2023

Conversation

@daniel-saunders-phil (Contributor) commented Jul 17, 2023

This PR adds support for intrinsic conditional autoregressive (ICAR) models to PyMC. The ICAR is a special case of the CAR distribution that arises when the $\alpha$ parameter equals $1$ and we add a zero-sum constraint. There are two big advantages of ICAR over CAR.

First, computing the logp is much faster, especially on large datasets. The CAR distribution involves either inverting a large precision matrix or a variety of eigenvalue strategies. As the dataset grows, those operations become quite expensive, even with the sparsity tricks PyMC currently employs. In some benchmarking I found it takes nearly 60 seconds to evaluate the logp of a dataset with 6000 observations. ICAR instead uses a pairwise-difference calculation whose cost grows roughly linearly with the size of the dataset.

[figure: benchmark of CAR vs ICAR logp evaluation time]
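The pairwise-difference kernel described above can be sketched in plain NumPy (a minimal illustration with made-up names, not the PyMC internals):

```python
import numpy as np

def icar_pairwise_logp(phi, node1, node2, sigma=1.0):
    # node1/node2 are parallel index arrays listing each undirected
    # edge of the adjacency graph once, so the cost grows with the
    # number of edges rather than requiring a matrix inversion
    return (-1.0 / (2.0 * sigma**2)) * np.sum(np.square(phi[node1] - phi[node2]))

# a 4-node cycle: edges (0,1), (1,2), (2,3), (3,0)
node1 = np.array([0, 1, 2, 3])
node2 = np.array([1, 2, 3, 0])
phi = np.array([0.5, -0.5, 0.5, -0.5])
print(icar_pairwise_logp(phi, node1, node2))  # -> -2.0
```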

Second, the zero-sum constraint can also help improve the identifiability of autoregressive models. Suppose you're working on a spatial statistics problem with the 50 US states. You might want a global intercept in your model plus 50 spatial random effects. With CAR, this leaves you overparameterized. If the sampler increases the mean value of the 50 spatial effects, then it can decrease the value of the global intercept to get the same overall intercept. The vector of random effects in an ICAR distribution is forced to sum to zero, which allows the sampler to independently adjust the global intercept.
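As a toy illustration of that nonidentifiability (hypothetical numbers, not from the PR), adding a constant to every spatial effect while subtracting it from the intercept leaves the linear predictor unchanged:

```python
import numpy as np

rng = np.random.default_rng(42)
intercept = 2.0
phi = rng.normal(size=50)  # 50 spatial random effects, one per US state

mu_a = intercept + phi

# shift the spatial effects up by c and the intercept down by c
c = 1.3
mu_b = (intercept - c) + (phi + c)

# the likelihood sees the same linear predictor either way, so the
# intercept and mean(phi) are not separately identified; forcing
# sum(phi) == 0 removes this extra degree of freedom
print(np.allclose(mu_a, mu_b))
```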

Finally, ICAR often serves as a sub-component of larger models, like the BYM model. Providing a convenient API makes it easy to build up to these more complex models.

More information on the design of this ICAR distribution can be found in a Stan case study.

@codecov codecov bot commented Jul 17, 2023

Codecov Report

Merging #6831 (2f8072c) into main (d59a960) will decrease coverage by 1.40%.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #6831      +/-   ##
==========================================
- Coverage   92.05%   90.65%   -1.40%     
==========================================
  Files          96       96              
  Lines       16369    16410      +41     
==========================================
- Hits        15068    14877     -191     
- Misses       1301     1533     +232     
Files Changed                          Coverage Δ
pymc/distributions/__init__.py         100.00% <ø> (ø)
pymc/distributions/multivariate.py     92.55% <100.00%> (+0.33%) ⬆️

... and 13 files with indirect coverage changes

@bwengals (Contributor)

RE checking if W contains ints. If W doesn't contain ints and the code runs but gives wrong results it'd def be good to check. If the code errors and says something informative then there's no need to check I think

_print_name = ("ICAR", "\\operatorname{ICAR}")

def __call__(self, W, node1, node2, N, sigma, centering_strength, size=None, **kwargs):
    return super().__call__(W, node1, node2, N, sigma, centering_strength, size=size, **kwargs)
I think it'd be better to use a different word than "centering", since that's usually more about centered vs non-centered parameterizations. Maybe "mean_zero_strength" or something similar instead?

@daniel-saunders-phil (Contributor, Author)

> RE checking if W contains ints. If W doesn't contain ints and the code runs but gives wrong results it'd def be good to check. If the code errors and says something informative then there's no need to check I think

Okay, that sounds good. It looks like the code fails silently when an adjacency matrix containing entries other than 0 or 1 is passed. People might be inclined to do this; some literature in the CAR area discusses using edge weights for more flexibility. But the pairwise-difference formula cannot handle edge weights, so I'll add a check that requires every entry to be 0 or 1.
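A minimal sketch of such a validation in NumPy (illustrative only, not the exact check added in this PR):

```python
import numpy as np

def check_adjacency(W):
    # validate the adjacency matrix: two-dimensional, square,
    # symmetric, and composed only of 0s and 1s
    W = np.asarray(W)
    if W.ndim != 2 or W.shape[0] != W.shape[1]:
        raise ValueError("W must be a square matrix")
    if not np.array_equal(W, W.T):
        raise ValueError("W must be symmetric")
    if not np.isin(W, (0, 1)).all():
        raise ValueError("W entries must be 0 or 1; edge weights are not supported")
    return W

check_adjacency([[0, 1], [1, 0]])        # passes
# check_adjacency([[0, 0.5], [0.5, 0]])  # raises ValueError
```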

@daniel-saunders-phil daniel-saunders-phil marked this pull request as ready for review July 25, 2023 23:38
Comment on lines 2330 to 2333
# check that adjacency matrix is two dimensional,
# square,
# symmetrical
# and composed of 1s or 0s.
This block of comments doesn't add much to the code doing the checks below; I don't think it's necessary.

N = pt.shape(W)[0]
N = pt.as_tensor_variable(N)

# check on sigma
@bwengals commented Jul 28, 2023
This comment and the "check on centering strength" comment are out of date now.

pairwise_difference = (-1 / (2 * sigma**2)) * pt.sum(
pt.square(value[node1] - value[node2])
)
soft_center = (
same as before RE wording, "center" vs zero sum. I think the latter is better and is more consistent with zero_sum_strength

Comment on lines 2249 to 2252
f(\\phi| W,\\sigma) =
-\frac{1}{2\\sigma^{2}} \\sum_{i\\sim j} (\\phi_{i} - \\phi_{j})^2 -
\frac{1}{2}*\frac{\\sum_{i}{\\phi_{i}}}{0.001N}^{2} - \\ln{\\sqrt{2\\pi}} -
\\ln{0.001N}
why are some backslashes twice and others once?

@daniel-saunders-phil (Contributor, Author)

Not sure, I wrote them with single slashes. Maybe one of the pre-commit checks added them by mistake?

class ICAR(Continuous):
r"""
The intrinsic conditional autoregressive prior. It is primarily used to model
covariance between neighboring areas on large datasets. It is a special case
on large datasets

Or small datasets!

based on the square distance from each of its neighbors. The notation $i\\sim j$
indicates a sum over all the neighbors of $\\phi_{i}$. The last three terms are the
Normal log density function where the mean is zero and the standard deviation is
$N * 0.001$ (where N is the length of the vector $\\phi$). This component imposed the zero-sum
Suggested change
$N * 0.001$ (where N is the length of the vector $\\phi$). This component imposed the zero-sum
$N * 0.001$ (where N is the length of the vector $\\phi$). This component imposes a zero-sum

----------
W : ndarray of int
Symmetric adjacency matrix of 1s and 0s indicating adjacency between elements.
Must pass either W or both node1 and node2.
Suggested change
Must pass either W or both node1 and node2.

Comment on lines 2292 to 2309
W = np.array([[0,1,0,1],
              [1,0,1,0],
              [0,1,0,1],
              [1,0,1,0]])

# centered parameterization

with pm.Model():
    sigma = pm.Exponential('sigma',1)
    phi = pm.ICAR('phi',W=W,sigma=sigma)

    mu = phi

# non-centered parameterization

with pm.Model():
    sigma = pm.Exponential('sigma',1)
    phi = pm.ICAR('phi',W=W)
Suggested change
W = np.array([[0,1,0,1],
              [1,0,1,0],
              [0,1,0,1],
              [1,0,1,0]])
# centered parameterization
with pm.Model():
    sigma = pm.Exponential('sigma',1)
    phi = pm.ICAR('phi',W=W,sigma=sigma)
    mu = phi
# non-centered parameterization
with pm.Model():
    sigma = pm.Exponential('sigma',1)
    phi = pm.ICAR('phi',W=W)
W = np.array([
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0]
])
# centered parameterization
with pm.Model():
    sigma = pm.Exponential('sigma', 1)
    phi = pm.ICAR('phi', W=W, sigma=sigma)
    mu = phi
# non-centered parameterization
with pm.Model():
    sigma = pm.Exponential('sigma', 1)
    phi = pm.ICAR('phi', W=W)
    mu = sigma * phi

I adjusted some formatting here, mostly putting a space after each comma.

Comment on lines 2275 to 2276
Controls how strongly to enforce the zero-sum constraint. It sets the
standard deviation of a normal density function with mean zero.
Could add a bit of detail here, like, "It puts an additional normal prior on the sum of the phi, such that the sum is normally distributed with mean zero and a small standard deviation, whose value is zero_sum_strength." Maybe zero_sum_stdev is a clearer name then?
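For illustration, that soft zero-sum penalty can be sketched as a normal log density on sum(phi), following the docstring formula quoted earlier in the thread; zero_sum_stdev is the suggested (not final) parameter name:

```python
import numpy as np

def soft_zero_sum_logp(phi, zero_sum_stdev=0.001):
    # normal log density on sum(phi) with mean 0 and
    # standard deviation zero_sum_stdev * N
    N = phi.shape[0]
    sd = zero_sum_stdev * N
    return -0.5 * (phi.sum() / sd) ** 2 - np.log(np.sqrt(2 * np.pi)) - np.log(sd)

phi = np.array([1.0, -1.0, 0.5, -0.5])  # sums exactly to zero
phi_off = phi + 0.01                    # small violation of the constraint
print(soft_zero_sum_logp(phi) > soft_zero_sum_logp(phi_off))  # -> True
```

Because the standard deviation is tiny, even a small departure from a zero sum is heavily penalized.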

@bwengals (Contributor) left a comment
Looks good! mostly cosmetic stuff

@daniel-saunders-phil (Contributor, Author) commented Jul 31, 2023

Hi @ricardoV94 , I've been working on adding a shape inference function since the change in #388 . I thought supp_shape_from_ref_param_shape would be my best bet. In ICAR, the output shape is always the length (equivalently, the width) of W, the adjacency matrix. I thought it was similar to the multivariate normal case, where the side length of the covariance matrix gives the output shape.

Continuous integration failed though because it couldn't import supp_shape_from_ref_param_shape. I also couldn't get the test to run locally for the same reason.

Is this related to releases? I assumed the continuous integration would run the latest pytensor but now I'm not so sure. Otherwise, do you have suggestions for how I should handle shape inference?

@bwengals (Contributor)

Looks great!

@fonnesbeck fonnesbeck merged commit 9ee5d84 into pymc-devs:main Aug 18, 2023
20 of 21 checks passed
@fonnesbeck (Member)

Thanks, @daniel-saunders-phil !
