Intro to Probabilistic Programming with PyMC

Summary posted by: Reshama Shaikh

Introduction

In this article and video, we share how to do Bayesian Modeling and Computation in Python.

Austin Rochford, a maintainer of PyMC, gave the Data Umbrella community an introduction to probabilistic programming with PyMC, with particular emphasis on how open source probabilistic programming makes Bayesian inference algorithms near the frontier of academic research accessible to a wide audience.

In the last ten years, there have been a number of advancements in the study of Hamiltonian Monte Carlo and variational inference algorithms that have enabled effective Bayesian statistical computation for much more complicated models than were previously feasible. These algorithmic advancements have been accompanied by a number of open source probabilistic programming packages that make them accessible to the general engineering, statistics, and data science communities. PyMC is one such package written in Python and supported by NumFOCUS.

The outline below summarizes the topics covered.

Topics Covered

  • Probabilistic programming from two perspectives
    • Philosophical: storytelling with data
    • Mathematical: Monte Carlo methods
  • Probabilistic programming with PyMC
    • The Monty Hall problem
    • Robust regression
  • Hamiltonian Monte Carlo
    • Aesara
  • Lego example
  • Next Steps
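
To make the "Mathematical: Monte Carlo methods" and "Monty Hall problem" items concrete, here is a minimal sketch that estimates the win rates for staying and switching by simulating many games. It uses only Python's standard library rather than PyMC, so it is not the solution shown in the talk; the function names `play_monty_hall` and `estimate_win_rate`, the number of games, and the seed are all illustrative assumptions.

```python
import random


def play_monty_hall(switch, rng):
    """Play one round and return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    first_pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != first_pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        final_pick = next(d for d in doors if d != first_pick and d != opened)
    else:
        final_pick = first_pick
    return final_pick == car


def estimate_win_rate(switch, n_games=100_000, seed=0):
    """Monte Carlo estimate of the win probability (hypothetical helper)."""
    rng = random.Random(seed)
    wins = sum(play_monty_hall(switch, rng) for _ in range(n_games))
    return wins / n_games


stay_rate = estimate_win_rate(switch=False)
switch_rate = estimate_win_rate(switch=True)
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

The estimates should land near the exact answers of 1/3 for staying and 2/3 for switching. A probabilistic programming library such as PyMC expresses the same story as a model and handles the sampling for you.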

Video

Resources

Contribute to PyMC: upcoming online hackathon (sprint)

Section Timestamps of Video

Intro

  • 00:00:00 Reshama introduces Data Umbrella
  • 00:04:40 Austin begins
  • 00:06:15 Talk agenda
  • 00:08:08 Probabilistic programming from two perspectives
  • 00:08:53 What is probabilistic programming?
  • 00:10:15 Mathematical: Monte Carlo Methods
  • 00:13:55 Monty Hall Problem (game: Let’s Make a Deal)
  • 00:16:15 Solve Monty Hall Problem using PyMC (solution)
  • 00:18:42 Using Aesara
  • 00:21:00 Doing inference with sampling
  • 00:24:00 What is Aesara? PyMC’s tensor computation backend, based on Theano; it fills a niche similar to PyTorch or TensorFlow.
  • 00:25:20 Using PyMC to do robust regression, with Anscombe’s Quartet as the example
  • 00:28:10 Using ArviZ, a library of pre-built visualizations and statistical routines that help you understand the results of your inference with PyMC
  • 00:33:08 What is Ridge Regression? (normal priors on your coefficients)
  • 00:36:05 Student’s t-distribution
  • 00:39:00 Why are we using Aesara? To do Hamiltonian Monte Carlo.
  • 00:43:10 Bayesian Analysis of Lego Prices
  • 00:49:00 Recommended books
  • 00:50:37 Meenal talks about upcoming PyMC sprint
  • 00:56:30 Q&A with Austin
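
The robust regression and Student’s t segments are related: robust regression commonly replaces the normal likelihood with a Student’s t likelihood, whose heavier tails penalize outliers far less. The sketch below compares the two log densities at a large residual. It is an illustration written for this summary rather than code from the talk, and the function names and the choice of 3 degrees of freedom are assumptions.

```python
import math


def normal_logpdf(x, mu=0.0, sigma=1.0):
    """Log density of a normal distribution."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)


def student_t_logpdf(x, nu, mu=0.0, sigma=1.0):
    """Log density of a location-scale Student's t with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1) / 2 * math.log1p(z * z / nu)
    )


# Compare how strongly each likelihood penalizes small and large residuals.
for residual in (0.0, 2.0, 6.0):
    n = normal_logpdf(residual)
    t = student_t_logpdf(residual, nu=3)  # nu=3 is an arbitrary illustrative choice
    print(f"residual={residual:4.1f}  normal={n:8.3f}  student_t={t:8.3f}")
```

Near zero the two log densities are similar, but at a residual of 6 the normal log density is far lower than the Student’s t one. A single outlier therefore pulls a normal-likelihood fit much harder than a Student’s t fit.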

About the Speaker

Bio

Austin Rochford is the Chief Data Scientist at Kibo Commerce. He is a recovering mathematician and is passionate about math education, Bayesian statistics, and machine learning.

Connect with the Speaker

Upcoming Events

Join our Meetup group for more events: Data Umbrella Meetup