2021-12-04

An animated exploration of Bayesian updating.

Bayesian Updating

We recently saw the posterior approximation of the globe tossing model via grid approximation.

$$W \sim \textrm{Binomial}(N, p)$$ $$p \sim \textrm{Uniform}(0, 1)$$

library(tibble)
library(dplyr)
library(ggplot2)
library(gganimate)
library(stringr)

We generalized a function that could take a sequence of observations, e.g. WWWLWLLLW, and plot the posterior distribution based on the observed data.

plot_posterior_approximation <- function(sequence, grid_length = 25) {
num_water <- stringr::str_count(sequence, "W")
df <- tibble::tibble(grid = seq(from = 0, to = 1, length.out = grid_length),
prior = rep(1, grid_length),
likelihood = dbinom(num_water,
size=stringr::str_length(sequence),
prob=grid),
posterior_unstandardized = likelihood * prior,
posterior = posterior_unstandardized / sum(posterior_unstandardized))
ggplot2::ggplot(df, ggplot2::aes(x = grid, y = posterior)) +
ggplot2::geom_point() +
ggplot2::geom_line() +
ggplot2::labs(x = "probability of water",
y = "posterior probability",
title = "Posterior (grid approximation)")
}

To emphasize the process of Bayesian updating though, now we create animations of the posterior grid approximation being updated as if the sequence had been observed one observation at a time.

It's a simple addition. We require the posteriors of all subsequences in the same tidy dataframe.

compute_posterior_approximation <- function(sequence, grid_length = 25) {
num_water <- stringr::str_count(sequence, "W")
df <- tibble::tibble(grid = seq(from = 0, to = 1, length.out = grid_length),
prior = rep(1, grid_length),
likelihood = dbinom(num_water,
size=stringr::str_length(sequence),
prob=grid),
posterior_unstandardized = likelihood * prior,
posterior = posterior_unstandardized / sum(posterior_unstandardized))
return(df)
}

compute_bayesian_update <- function(sequence, grid_length = 25) {
df <- tibble::tibble()
for (i in seq(from = 1, to = stringr::str_length(sequence) + 1)) {
subsequence <- substr(sequence, 1, i)
posterior_approximation <- compute_posterior_approximation(subsequence)
posterior_approximation <- dplyr::mutate(posterior_approximation, sequence = subsequence)
df <- bind_rows(df, posterior_approximation)
}
return(df)
}

animate_bayesian_update <- function(sequence, grid_length = 25) {
df <- compute_bayesian_update(sequence, grid_length)
animation <- ggplot2::ggplot(df, ggplot2::aes(x = grid, y = posterior, group = sequence)) +
ggplot2::geom_point() +
ggplot2::geom_line() +
ggplot2::labs(x = "probability of water",
y = "posterior probability",
title = "Posterior (grid approximation)",
subtitle = "Subsequence {closest_state}") +
gganimate::transition_states(sequence)
animate(animation, fps=60)
}

So we can create the gif with one line:

animate_bayesian_update("WLWWWLWLW") And we can look at the frames of the gif by plotting the facet:

ggplot(compute_bayesian_update("WLWWWLWLW"),
aes(x = grid, y = posterior)) +
geom_point() +
geom_line() +
labs(x = "probability of water",
y = "posterior probability",
title = "Posterior (grid approximation)") +
facet_wrap(~ sequence) 