Possibility of training a noiseless DSPP / DGP model? #1695
Replies: 3 comments 3 replies
-
In general:

```python
likelihood.initialize(noise=1e-6)
likelihood.raw_noise.requires_grad_(False)
```
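Why this works: pinning the noise to a tiny fixed jitter makes the posterior mean at the training inputs equal K (K + 1e-6 I)^{-1} y, which is numerically indistinguishable from y. A plain-NumPy sketch of that algebra (not GPyTorch code; the kernel, lengthscale, and data here are illustrative assumptions):

```python
import numpy as np

def rbf(a, b, lengthscale=0.1):
    """Squared-exponential kernel matrix between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

x = np.linspace(0, 1, 10)   # training inputs
y = np.sin(2 * np.pi * x)   # noiseless training targets

noise = 1e-6                # the small, fixed noise from the suggestion above
K = rbf(x, x)
alpha = np.linalg.solve(K + noise * np.eye(len(x)), y)
mean_at_train = K @ alpha   # GP posterior mean evaluated at the training inputs

# The mean recovers the observations almost exactly: the model interpolates.
max_err = np.max(np.abs(mean_at_train - y))
```

With noise this small, `max_err` is dominated by the jitter and stays far below the scale of the data, which is exactly the "interpolation" behavior being asked about.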
-
Hi all -- I have still been unable to achieve interpolation with the DSPP model. Please let me know if you have any guidance. Thanks again!
-
Hi @gpleiss and GPyTorch team, I still have not achieved this behavior with the DGP / DSPP models. Today I compared fitting an exact GP with no additional noise term against fitting an SVGP with no additional noise term, and these are the fits obtained: The approximate GP fit uses the code from the tutorial, except that it uses the entire (small) training set as inducing points. I am wondering whether this suggests that something inherent to the approximation causes the smoothing effect and prevents fitting the function without noise. Thanks again for any input!
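One relevant data point: when the inducing points Z are exactly the training inputs X, the Nystrom term K_xz K_zz^{-1} K_zx that SVGP uses reproduces K_xx exactly, so the inducing-point approximation by itself cannot cause the smoothing; any residual smoothing would have to come from the variational distribution, the ELBO optimization, or a nonzero likelihood noise. A quick numerical check of that identity (plain NumPy, not GPyTorch; kernel and data are illustrative assumptions):

```python
import numpy as np

def rbf(a, b, lengthscale=0.1):
    """Squared-exponential kernel matrix between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

x = np.linspace(0, 1, 10)   # training inputs
z = x.copy()                # inducing points = the entire training set

K_xx = rbf(x, x)
K_xz = rbf(x, z)
K_zz = rbf(z, z)

# Nystrom approximation of K_xx: Q_xx = K_xz K_zz^{-1} K_zx.
# With z == x this reduces to K K^{-1} K = K, i.e. it is exact.
Q_xx = K_xz @ np.linalg.solve(K_zz, K_xz.T)

nystrom_err = np.max(np.abs(Q_xx - K_xx))  # machine-precision small
```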
-
Hello,
I have been curious about fitting GPs and Deep GPs / Deep Sigma Point Processes (DSPPs) without an additional noise term in the likelihood. I am looking for the behavior shown in Figures 4 and 7 of this paper (https://arxiv.org/pdf/1905.03350.pdf), which I will also paste below:
The exact GP and the Deep GP in these figures both interpolate through all the training points exactly, given the observations are noiseless. I want to replicate this behavior in GPyTorch.
Following the Simple GP Regression tutorial, I am able to achieve this behavior by replacing the GaussianLikelihood with a FixedNoiseGaussianLikelihood and specifying the observation noise to be 0. (I am also curious if there is a preferred way to accomplish this!)
After this, I followed the DSPP tutorial, and after changing the likelihood to a FixedNoiseGaussianLikelihood with zero noise as before, I am still unable to achieve the "interpolation" behavior. The fit still appears to consider the observations to be noisy. Here is a link to a plot demonstrating the DSPP model fit on some sampled data from the same function as in the linked paper:
The model was specified just as in the tutorial, except with the following specifications:
```python
batch_size = 10              # Size of minibatch
milestones = [20, 150, 300]  # Epochs at which we will lower the learning rate by a factor of 0.1
num_inducing_pts = 20        # Number of inducing points in each hidden layer
num_epochs = 200             # Number of epochs to train for
initial_lr = 0.01            # Initial learning rate
hidden_dim = 10              # Number of GPs (i.e., the width) in the hidden layer
num_quadrature_sites = 8     # Number of quadrature sites (see paper; 5-10 generally works well)
```
I have tried a number of different settings and each time the model seems to converge, but never achieves the interpolation behavior.
Thank you for any input on this!