Missing Value Imputation with Item Response Theory
A Bayesian Bake-Off
In many large surveys, not every respondent is asked every question, and not every respondent answers the questions they are asked. So how can we compare people who answer different sets of questions? One solution is to use item response theory (IRT) to impute missing responses – and nothing goes better with IRT than Bayesian methods!
In this talk, we report the results of a friendly competition – a bake-off – between two approaches to this problem, one using grid algorithms and a simplified model, the other using PyMC and a more detailed model. We’ll discuss the implementations, compare the results, and outline their pros and cons.
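To give a flavour of the grid approach, here is a minimal sketch rather than either team's actual implementation. It assumes a one-parameter (Rasch) model with known item difficulties and a standard normal prior on ability; the function names, difficulties, and responses are all made up for illustration. The posterior over a respondent's ability is computed on a grid from the questions they did answer, then used to fill in the probability of a correct answer on the question they skipped.

```python
import math

def p_correct(theta, difficulty):
    """Rasch (1PL) probability of a correct response; an assumed, simplified model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def ability_posterior(responses, difficulties, grid):
    """Normalized posterior over ability on a grid; None marks a missing response."""
    weights = []
    for theta in grid:
        # Unnormalized standard normal prior on ability (an assumption).
        w = math.exp(-0.5 * theta * theta)
        for r, b in zip(responses, difficulties):
            if r is None:
                continue  # missing responses carry no information here
            p = p_correct(theta, b)
            w *= p if r == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def impute(responses, difficulties, grid):
    """Keep observed responses; replace each missing one with its posterior
    predictive probability of a correct answer."""
    post = ability_posterior(responses, difficulties, grid)
    return [
        float(r) if r is not None
        else sum(pt * p_correct(t, b) for pt, t in zip(post, grid))
        for r, b in zip(responses, difficulties)
    ]

# Hypothetical data: four items, the third was not answered.
grid = [-4 + 0.1 * i for i in range(81)]
difficulties = [-1.0, 0.0, 1.0, 2.0]
responses = [1, 1, None, 0]
imputed = impute(responses, difficulties, grid)
```

The grid keeps everything explicit and fast for a single latent trait, at the cost of fixing the item parameters in advance; a fuller model would estimate difficulties and abilities jointly.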
Dynamic data are all around us. Changepoint models allow us to know when changes happen in these data and what they look like. Probabilistic modelling allows us to elegantly build customizable changepoint models for different data types, as well as provide us with uncertainty estimates for the position and magnitude of the change (both indispensable quantities for decision-making and hypothesis testing). This tutorial will briefly cover building changepoint models for multivariate data using PyMC but will primarily focus on the ways in which this “basic” model can be extended.
This tutorial is targeted towards academic researchers, data scientists, and anyone interested in easily building bespoke models that provide uncertainty estimates for inferred statistics. It aims to be accessible to beginners but leans towards intermediate users interested in changepoint modelling. Previous experience with PyMC and a background in statistical modelling are assumed. No libraries other than PyMC and the basic scientific stack (numpy, scipy, matplotlib) will be used.
The tutorial aims to be hands-on, will discuss some theory to provide context for the models discussed, and will be heavy on understanding code to construct the “guts” of the models (in particular, selection of distributions for modelling the emissions and changepoint locations, and the details of the tensor manipulation to put everything together).
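As a rough illustration of the two modelling choices mentioned above (an emission distribution and a prior over the changepoint location), here is a deliberately stripped-down sketch. It is not the PyMC model from the tutorial: it assumes a single univariate count series, Poisson emissions with known rates before and after the change, and a uniform prior on the changepoint, and it computes the posterior by plain enumeration using only the standard library. The data and rates are invented.

```python
import math

def poisson_logpmf(k, rate):
    """Log-probability of count k under a Poisson distribution with the given rate."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def changepoint_posterior(counts, rate_before, rate_after):
    """Posterior over the changepoint tau, where counts[:tau] follow rate_before
    and counts[tau:] follow rate_after. A uniform prior over tau is assumed,
    so it only contributes a constant and is omitted."""
    n = len(counts)
    taus = list(range(1, n))
    log_post = []
    for tau in taus:
        ll = sum(poisson_logpmf(k, rate_before) for k in counts[:tau])
        ll += sum(poisson_logpmf(k, rate_after) for k in counts[tau:])
        log_post.append(ll)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return {tau: w / total for tau, w in zip(taus, weights)}

# Hypothetical count data with a visible jump after the fifth observation.
counts = [2, 1, 3, 2, 2, 9, 8, 11, 10, 9]
posterior = changepoint_posterior(counts, rate_before=2.0, rate_after=9.0)
most_likely_tau = max(posterior, key=posterior.get)
```

Because the result is a full distribution over the changepoint location, it conveys uncertainty about where the change happened, not just a single best guess. In a probabilistic programming setting the rates would be given priors and inferred alongside the changepoint rather than fixed.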
Causal analysis is rapidly gaining popularity, but why? Machine learning methods might help us predict what’s going to happen with great accuracy, but what’s the value of that if it doesn’t tell us what to do to achieve a desirable outcome? Without a causal understanding of the world, it’s often impossible to identify which actions lead to a desired outcome.
Causal analysis is often embedded in a frequentist framework, which comes with some well-documented baggage. In this talk, Thomas will present how we can super-charge PyMC for Bayesian Causal Analysis by using a powerful new feature: the do operator.
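To make the idea behind the do operator concrete, here is a small sketch that does not use PyMC at all. It assumes a hypothetical structural model with a binary confounder Z that influences both the treatment X and the outcome Y; every probability below is invented for illustration. Conditioning on X lets Z tag along, while intervening on X cuts the arrow from Z to X so that Z keeps its marginal distribution.

```python
# Hypothetical structural model: Z -> X, Z -> Y, X -> Y (all variables binary).
P_Z = {0: 0.5, 1: 0.5}                      # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}             # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5,  # P(Y = 1 | X = x, Z = z)
                 (1, 0): 0.3, (1, 1): 0.7}

def p_y1_given_x(x):
    """Observational P(Y=1 | X=x): Z is weighted by how often it co-occurs with X=x."""
    numerator = 0.0
    denominator = 0.0
    for z, pz in P_Z.items():
        px = P_X1_GIVEN_Z[z] if x == 1 else 1.0 - P_X1_GIVEN_Z[z]
        numerator += pz * px * P_Y1_GIVEN_XZ[(x, z)]
        denominator += pz * px
    return numerator / denominator

def p_y1_do_x(x):
    """Interventional P(Y=1 | do(X=x)): X is set by hand, so Z keeps its marginal."""
    return sum(pz * P_Y1_GIVEN_XZ[(x, z)] for z, pz in P_Z.items())

observed_difference = p_y1_given_x(1) - p_y1_given_x(0)
causal_effect = p_y1_do_x(1) - p_y1_do_x(0)
```

In this toy setup the observed difference overstates the causal effect, because people with Z = 1 are both more likely to receive X and more likely to have Y regardless of X.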
We believe that the PyMC project is more than just a codebase. It is also a community that is interested not only in statistical methods and code, but also in sharing knowledge and helping others, whether they are Bayesian veterans or new to the world of open source. The purpose of this conference is to better our community, just as we better our code.