> Or is this something that should be left outside the model? Like, does "prediction" in a sample-based framework mean "give me _all_ the posterior trajectories?"
In a Bayesian context, we generally get distributions, not values. That holds for predictions, where we get predictive distributions. Condensing a distribution down to a single summary is a whole separate decision. But it's a pretty modular decision, and it should be pushed as close to where it's needed as possible, not made earlier.
There are packages that actually return more samples from the posterior predictive distribution than there are posterior samples, because they sample multiple times from the likelihood per posterior sample for smoothness.
Other than thinning to use only some fraction of samples where storage becomes a problem, I know of nothing that returns fewer predictive distribution samples than posterior distribution samples.
There is a sense in which any one posterior sample is as good a single-number summary of the whole distribution as any other. But posterior means and medians are more common choices, and unless we want to be in the business of offering a bunch of options here (we don't), we should modularize that decision and return the distribution. Then let the user decide later, when they need that one summary, what it should be.
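A minimal numpy sketch of that separation (the names and the normal likelihood here are illustrative, not any package's actual API): return the full posterior predictive array, with an adjustable number of likelihood draws per posterior sample, and leave summarization to the caller.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend posterior: 1000 draws of (mu, sigma) for a normal likelihood.
mu = rng.normal(2.0, 0.1, size=1000)
sigma = np.abs(rng.normal(0.5, 0.05, size=1000))

def posterior_predictive(mu, sigma, draws_per_sample=1, rng=rng):
    """Return ALL predictive samples, shape (n_posterior, draws_per_sample)."""
    n = mu.shape[0]
    # One column per likelihood draw; broadcasting pairs each (mu, sigma)
    # posterior sample with its own row of predictive draws.
    return rng.normal(mu[:, None], sigma[:, None], size=(n, draws_per_sample))

# draws_per_sample > 1 gives MORE predictive samples than posterior samples.
pred = posterior_predictive(mu, sigma, draws_per_sample=5)

# Summarization is a separate, later decision made by the user:
point = np.median(pred)  # or pred.mean(), or quantile bands
```

The point of the shape is that nothing is thrown away: downstream code can take a mean, a median, or a full quantile band from the same array.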
Curious @afmagee42 what you do with the linmod stuff. Does predict() return a single trajectory? Many? An adjustable number?
In linmod, we work with distributions until we absolutely have to summarize them. We don't currently have an explicit Model.predict(), but when we need predictions on observables we use the whole kit and caboodle.
Presumably we could use the same integer as the starting seed for both the JAX RNG key and the numpy seed?
Funny enough, linmod actually has numpy RNGs in it for generating predictions anyway; it's how we sample from the multinomial. My take is it doesn't really matter how you handle the seeds behind the scenes as long as a user only has to set one. But the easiest thing to do is to use the same integer for both seeds.
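That one-integer approach is just a sketch like the following (`make_rngs` is a hypothetical helper, not a linmod function; the JAX half is shown as a comment since it needs JAX installed):

```python
import numpy as np

def make_rngs(seed: int):
    """One user-facing integer seeds both RNG streams (sketch)."""
    np_rng = np.random.default_rng(seed)   # numpy side, e.g. multinomial draws
    # jax_key = jax.random.PRNGKey(seed)   # JAX side, same integer
    return np_rng

rng = make_rngs(12345)
# e.g. sampling variant counts from a multinomial, as in linmod predictions
counts = rng.multinomial(100, [0.2, 0.3, 0.5])
```

Seeding both streams from the same integer keeps the user-facing API to a single knob while the two RNG systems stay independent internally.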
Prediction is now corrected in #102 to use each single draw from the posterior parameter distribution to predict an entire forward trajectory. We can consider closing this issue once #102 is reviewed and merged. However, we may want to open a new issue for specific improvements to this model:
Include a link function or other nonlinearity to prevent incident uptake ($\upsilon_{t+1}$) from being negative.
Include a layer of observation uncertainty in the model.
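As a sketch of the first item (an illustrative transform, not code from #102): a softplus link maps an unconstrained linear predictor to a strictly positive incident uptake $\upsilon_{t+1}$.

```python
import numpy as np

def softplus(x):
    # log(1 + exp(x)) computed stably via logaddexp; output is strictly > 0
    return np.logaddexp(0.0, x)

# Unconstrained linear predictor for incident uptake at t+1 (illustrative):
eta = np.array([-3.0, 0.0, 2.5])
upsilon_next = softplus(eta)  # always positive, approximately linear for large eta
```

A log link (exponentiating the predictor) would also keep $\upsilon_{t+1}$ positive; softplus has the advantage of staying close to the identity for large positive values.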
Originally posted by @afmagee42 in #102 (comment)