Support and influence analysis for visualizing posteriors of probabilistic programs

A common way to interpret the results of a computational model is to visualize its output. For probabilistic programming, this often means visualizing a posterior probability distribution. The WebPPL language has a visualization library called webppl-viz that facilitates this process. A useful feature of webppl-viz is that it does some amount of automatic visualization: the user simply passes in the posterior and the library tries to construct a useful visual representation of it. For instance, consider this posterior, with a discrete component b and a continuous component m:

var dist = Infer(
  {method: 'MCMC', samples: 1000},
  function () {
    var b = flip(0.7)              // discrete component
    var m = gaussian(0, 1)         // continuous component
    var y = gaussian(m, b ? 4 : 2) // spread of y depends on b
    condition(y > 0.3)
    return {b: b, m: m}
  }
)

We visualize it by calling viz(dist), which gives us this picture:

This is a reasonable choice. There are two density curves for m: an orange curve for when b is true, and a blue curve for when b is false. webppl-viz often produces helpful graphs but, as I will show, it can also produce graphs with obvious flaws. One reason for this is that webppl-viz defines a limited set of variable types for visualization and uses heuristics to guess the types of particular components in a posterior sample. Another issue is that webppl-viz does not scale well with the number of components in a posterior. There are only a handful of ways to visually encode data, and webppl-viz gives up if the dimensionality of the posterior exceeds the number of available visual channels. In this article, I argue that methods from programming languages research suggest solutions to these two problems.
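To make the two problems concrete, here is a minimal sketch in plain JavaScript of what heuristic type guessing over posterior samples might look like. This is not webppl-viz's actual code: the function names (guessType, guessTypes, canVisualize), the type labels, the "at most 10 distinct integers" rule, and the list of channels are all assumptions made up for illustration.

```javascript
// Hypothetical sketch, NOT the webppl-viz implementation.
// Guess a coarse visualization type from the observed values of one component.
function guessType(values) {
  var allOf = function (t) {
    return values.every(function (v) { return typeof v === t; });
  };
  if (allOf('boolean') || allOf('string')) {
    return 'categorical';
  }
  if (allOf('number')) {
    var distinct = new Set(values).size;
    var allIntegers = values.every(Number.isInteger);
    // Assumed rule: a few distinct integer values are treated as ordinal.
    return (allIntegers && distinct <= 10) ? 'ordinal' : 'real';
  }
  return 'unknown';
}

// Guess a type for every component of a list of sample objects.
function guessTypes(samples) {
  var types = {};
  if (samples.length === 0) {
    return types;
  }
  Object.keys(samples[0]).forEach(function (key) {
    types[key] = guessType(samples.map(function (s) { return s[key]; }));
  });
  return types;
}

// Assumed budget of visual channels, for illustration only.
var CHANNELS = ['x', 'y', 'color', 'facet'];

// Automatic visualization is only attempted when every component gets a channel.
function canVisualize(types) {
  return Object.keys(types).length <= CHANNELS.length;
}

var types = guessTypes([{b: true, m: 0.12}, {b: false, m: -1.3}]);
console.log(types, canVisualize(types));
```

Because a guesser like this only sees sampled values, it can be wrong about the program that produced them: a real-valued component whose samples happen to be whole numbers, such as [1, 2, 3], would be labeled ordinal here. And once a posterior has more components than there are channels, canVisualize returns false and there is nothing left to try.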

Extended abstract: Ouyang abstract
