Introducing model-based thinking into AI systems

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Vikash Mansinghka on recent developments in probabilistic programming.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.

In this episode I spoke with Vikash Mansinghka, research scientist at MIT, where he leads the Probabilistic Computing Project, and co-founder of Empirical Systems. I’ve long wanted to introduce listeners to recent developments in probabilistic programming, and I found the perfect guide in Mansinghka.

Probability is the mathematical language for representing, modeling, and manipulating uncertainty, and probabilistic programming provides frameworks for representing probabilistic models as computer programs. This family of tools and techniques distinguishes between models and inference procedures and, in the process, encourages the kind of model-based thinking that may inform the design of future artificial intelligence systems and supplement current data- and compute-intensive systems that rely primarily on large-scale pattern recognition.
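To make the split between model and inference concrete, here is a minimal Python sketch. It is not taken from the podcast or from any particular probabilistic programming system: the names `coin_model` and `rejection_infer` are hypothetical, and rejection sampling is used only because it is the simplest generic inference procedure to write down. The model is an ordinary program that makes random choices; the inference routine knows nothing about coins and would work with any model that has the same shape.

```python
import random

def coin_model(rng):
    """Generative model: draw an unknown bias, then flip the coin 10 times."""
    bias = rng.random()  # prior: Uniform(0, 1)
    heads = sum(rng.random() < bias for _ in range(10))
    return {"bias": bias}, heads  # (latent values, simulated data)

def rejection_infer(model, observed, n_samples, rng):
    """Generic inference: keep the latents of runs that reproduce the data."""
    kept = []
    while len(kept) < n_samples:
        latents, simulated = model(rng)
        if simulated == observed:
            kept.append(latents)
    return kept

rng = random.Random(0)
# Suppose we observed 8 heads in 10 flips; what bias is plausible?
posterior = rejection_infer(coin_model, observed=8, n_samples=2000, rng=rng)
posterior_mean_bias = sum(s["bias"] for s in posterior) / len(posterior)
print(round(posterior_mean_bias, 2))
```

For this model the exact posterior is Beta(9, 3), with mean 0.75, so the printed estimate should land close to that value. Swapping in a different inference procedure would not require touching `coin_model`, which is the separation described above.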

Below are highlights from my conversation with Mansinghka:

Solving computer vision problems with Picture

With Picture, which was one of the probabilistic languages developed by my lab, you can solve … hard 3D computer vision problems. You can go from a single image to a 3D model that’s still useful when you rotate it. In fact, that’s a much harder problem than the one solved by typical face recognition systems based only on deep learning. The system has to imagine what that image is going to look like, not just classify it. Picture lets you solve problems like that using, in this case, about 50 lines of Picture code.

What does that code do? Part of that code is a model based on computer graphics, so you can use off-the-shelf rendering components for computer graphics that draw pictures of faces. Picture lets you turn that generative knowledge, the knowledge of how to generate faces, into knowledge about how to recognize or build 3D models of faces. Picture programs also include hints about how to do inference. When you write a Picture program, you say here’s a model of what might be out there in the world, by writing down a graphics program that’s driven with random inputs. You also write down inference hints and inference tactics that tell the Picture engine at a very high level how to explore the space of possible faces to find one that matches.

It’s not just as though you write down a model and then inference is automatic. In something like Picture, the problems are hard enough and varied enough that the user does have to tell the machine something about how to solve the problem—just at a much higher level than people are used to from computer vision.
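The workflow described above (a renderer driven by random inputs, plus user-supplied hints about how to search) can be sketched in a few lines of Python. This is a toy illustration, not Picture code: the 1D `render` function stands in for a real graphics engine, the scene is just a blob with a center and a radius, and every name and parameter here is an assumption made for the example. The "inference hints" are the arguments to `infer`: how often to propose a completely fresh scene from the prior and how large the local adjustments should be.

```python
import math
import random

WIDTH = 32  # pixels in the toy 1D "image"

def render(center, radius):
    """Stand-in for a graphics renderer: draws a soft blob on a 1D canvas."""
    return [math.exp(-((x - center) / radius) ** 2) for x in range(WIDTH)]

def sample_prior(rng):
    """Random inputs that drive the renderer: uniform center and radius."""
    return (rng.uniform(0, WIDTH), rng.uniform(1, 10))

def log_score(scene, observed, noise=0.2):
    """Uniform box prior plus a Gaussian pixel-wise likelihood."""
    center, radius = scene
    if not (0 <= center <= WIDTH and 1 <= radius <= 10):
        return -math.inf
    image = render(center, radius)
    sq_err = sum((a - b) ** 2 for a, b in zip(image, observed))
    return -sq_err / (2 * noise ** 2)

def infer(observed, steps, rng, p_global=0.2, center_step=1.0, radius_step=0.3):
    """Metropolis-Hastings search; returns the best-scoring scene visited.

    The keyword arguments act as high-level inference hints: mix occasional
    global proposals from the prior with small local random-walk moves.
    """
    scene = sample_prior(rng)
    score = log_score(scene, observed)
    best, best_score = scene, score
    for _ in range(steps):
        if rng.random() < p_global:
            proposal = sample_prior(rng)  # global move: fresh scene
        else:
            proposal = (scene[0] + rng.gauss(0, center_step),
                        scene[1] + rng.gauss(0, radius_step))  # local move
        new_score = log_score(proposal, observed)
        # Both proposal types are symmetric under the uniform prior, so the
        # acceptance test only needs to compare the two scores.
        if math.log(1.0 - rng.random()) < new_score - score:
            scene, score = proposal, new_score
            if score > best_score:
                best, best_score = scene, score
    return best

rng = random.Random(1)
# A noisy "photo" of a blob at center 20 with radius 4.
observed = [p + rng.gauss(0, 0.05) for p in render(20.0, 4.0)]
center, radius = infer(observed, steps=4000, rng=rng)
print(round(center, 1), round(radius, 1))
```

The recovered center and radius should come out close to the values used to generate the image. Changing the hints, for example setting `p_global=0.0`, leaves the model untouched but can make the search much slower, which gives a small-scale feel for why the user still has to tell the machine something about how to solve the problem.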

Probabilistic meta programming

Venture is a language for probabilistic meta programming. By ‘meta programming,’ I mean writing meta programs, which you can think of as programs that create, transform, or analyze other programs. Most programs operate on simple kinds of data, such as text or images, but meta programs operate on programs.

Zoubin Ghahramani’s group built this cool system called the Automatic Statistician. It takes time series data and produces natural language descriptions, so you could take a time series of passenger counts for airlines, and the system would take that data and spit out a description in English, like ‘Well, there’s a linearly increasing trend, but there’s a periodic cycle that goes roughly 12 months, and there are various periods where there are spikes or other short-term deviations.’ The machine can understand some of the structure behind the time series.

The Automatic Statistician produces those sorts of descriptions from time series data. How does it do it? Well, it does that by searching a very large space of little symbolic descriptions of time series, using techniques from nonparametric Bayesian statistics, such as Gaussian processes. These processes define the relationship between little symbolic expressions that say things like ‘linear trend with a certain slope’ or ‘some periodic signal’ and connect those to messy time series data with all kinds of noise and exceptions. That system was very complex, and if you talk to the authors, they’ll say it was really difficult to build.

In one of our Venture applications, we showed how you can reimplement it in just 60 lines of Venture code. This Venture code is a probabilistic meta program that works by exploring a very simple space of probabilistic programs. The structure of the description of the time series, such as a linear trend with a periodic overlay and some deviations, is represented as a little program.
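The idea of treating a time series description as a little program, and searching over such programs, can be illustrated with a short Python sketch. This is not Venture code and not the Automatic Statistician: instead of Gaussian processes and Bayesian search, it assumes ordinary least squares for fitting, the Bayesian information criterion (BIC) for scoring, and exhaustive enumeration of a handful of candidate structures. The expression format and all function names are hypothetical.

```python
import math
import random

def basis(expr, t):
    """Interpret a structure expression as basis-function values at time t."""
    op = expr[0]
    if op == "const":
        return [1.0]
    if op == "linear":
        return [float(t)]
    if op == "periodic":
        w = 2 * math.pi * t / expr[1]
        return [math.sin(w), math.cos(w)]
    if op == "+":
        return basis(expr[1], t) + basis(expr[2], t)
    raise ValueError("unknown operator: %r" % (op,))

def describe(expr):
    """Turn a structure expression into a short English description."""
    op = expr[0]
    if op == "+":
        return describe(expr[1]) + " plus " + describe(expr[2])
    if op == "periodic":
        return "a cycle of period %d" % expr[1]
    return {"const": "a constant level", "linear": "a linear trend"}[op]

def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def bic(expr, ts, ys):
    """Fit the structure by least squares and return its BIC (lower is better)."""
    rows = [basis(expr, t) for t in ts]
    k, n = len(rows[0]), len(ys)
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coef = solve(gram, rhs)
    rss = sum((y - sum(c * x for c, x in zip(coef, r))) ** 2
              for r, y in zip(rows, ys))
    return n * math.log(max(rss, 1e-12) / n) + k * math.log(n)

def candidates(periods=(6, 12, 24)):
    """Enumerate a very small space of structure expressions."""
    for trend in [("const",), ("+", ("const",), ("linear",))]:
        yield trend
        for p in periods:
            yield ("+", trend, ("periodic", p))

# Synthetic monthly series: rising trend, yearly cycle, and noise.
rng = random.Random(0)
ts = list(range(72))
ys = [100 + 2.5 * t + 10 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 2)
      for t in ts]

best = min(candidates(), key=lambda e: bic(e, ts, ys))
print(describe(best))
```

On this synthetic series the search should select the structure with a constant level, a linear trend, and a 12-step cycle, and `describe` turns that expression into a sentence fragment. A system like the one discussed here explores a far richer space and handles uncertainty about the structure itself; the sketch only shows the representation: structure as a small program, plus a procedure that searches over such programs.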
