In this post we are going to talk about two well-known techniques used to estimate Average Treatment Effects (ATEs): propensity score analysis and inverse probability weighting. This post assumes a basic grasp of causal inference, that is, you understand the problem of estimating effects in the presence of confounding.
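To make the idea concrete, here is a minimal sketch of inverse probability weighting on simulated data. Everything in it (the variable names `x`, `t`, `y`, the data-generating process, and the true effect of 2.0) is an illustrative assumption, not taken from the post:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                                        # confounder
t = (rng.uniform(size=n) < 1 / (1 + np.exp(-x))).astype(int)  # treatment depends on x
y = 2.0 * t + x + rng.normal(size=n)                          # true ATE is 2.0

# 1. Fit a propensity score model e(x) = P(T=1 | X=x)
ps = LogisticRegression().fit(x.reshape(-1, 1), t)
e = ps.predict_proba(x.reshape(-1, 1))[:, 1]

# 2. Weight each unit by the inverse probability of the treatment it received
ate = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))
print(ate)  # close to the true effect of 2.0
```

Note that a naive difference of means, `y[t == 1].mean() - y[t == 0].mean()`, would be biased upward here because the confounder `x` pushes both treatment and outcome in the same direction.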
Introductory talk at BcnAnalytics
Bartek Skorulski and I gave an introductory talk at one of BcnAnalytics’ regular Barcelona events. We focused on how combining two techniques, A/B testing and causal inference, can give a comprehensive solution to causality problems in business. You can see the event here!
This is the third part of the post “What to expect from a causal inference business project: an executive’s guide”. You will find the second one here.
Most of these words have fuzzy meanings, at least at a popular level. Let me first define what some of them will mean in this post.
Big data: All the computing infrastructure devoted to providing access and calculations for querying, preprocessing data or training models with large data sets (they do not fit in your laptop).
One of the main ideas in big data technologies is that the more data you have…
This is the second part of the post “What to expect from a causal inference business project: an executive’s guide”. You will find the third part here.
Causal inference models how variables affect each other. Based on this information, it uses calculation tools to answer questions like: what would have happened if, instead of doing this, I had done that? Can I get an estimate of the effect of one variable on another?
Causal inference provides a broad-brush approach to getting preliminary estimates of causal effects. If you want more definitive conclusions, you should go, whenever possible, for more…
Causal inference is a new language for modeling causality that helps us better understand causes and impacts so that we can make better decisions. Here we will explain how it can help a company or organization gain insights from its data. This post is written for those in a data-driven company, not necessarily technical staff, who want to understand the key points of a causal inference project.
This is the fourth post on a series about causal inference and data science. The previous one was “Observing is not intervening”.
Simpson’s paradox is a great example. At first it challenges our intuition, but then, if we are able to dissect it properly, it gives us many ideas about how to handle the analysis of observational data (data that wasn’t obtained through a well-designed experiment). It appears in many data analyses. We will walk through it using the well-known case of kidney stones. …
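The paradox can be seen in a few lines of Python using the recovery counts commonly cited from the kidney-stone study (Charig et al., 1986); the `rate` helper is ours, purely for the example:

```python
def rate(recovered, total):
    """Recovery rate as a fraction."""
    return recovered / total

# (recovered, total) per treatment, split by stone size
small = {"A": (81, 87), "B": (234, 270)}
large = {"A": (192, 263), "B": (55, 80)}

# Within each stone size, treatment A does better...
assert rate(*small["A"]) > rate(*small["B"])   # ~93% vs ~87%
assert rate(*large["A"]) > rate(*large["B"])   # ~73% vs ~69%

# ...but aggregating over stone size reverses the conclusion.
overall_A = rate(81 + 192, 87 + 263)   # ~78%
overall_B = rate(234 + 55, 270 + 80)   # ~83%
assert overall_A < overall_B
```

The reversal happens because stone size confounds the comparison: treatment A was given more often to the harder, large-stone cases.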
In causal inference we are interested in measuring the effect that a variable A, say a treatment for a particular disease, has on some other variable B, say the probability of recovery, often from observational data. This means we are interested in measuring the difference in the probability of recovery between the cases A = treated and A = untreated.
In data science and machine learning we are used to working with conditional…
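As a sketch of why conditioning is not the same as intervening: with a binary confounder, the naive difference of conditional means is biased, while the backdoor adjustment (averaging the within-stratum differences, weighted by the stratum probabilities) recovers the causal effect. The data are simulated and all names (`z`, `t`, `y`, the true effect of 1.0) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
z = rng.integers(0, 2, size=n)               # binary confounder
p_t = np.where(z == 1, 0.8, 0.2)             # z drives treatment assignment
t = (rng.uniform(size=n) < p_t).astype(int)
y = 1.0 * t + 2.0 * z + rng.normal(size=n)   # true effect of t is 1.0

# Naive conditional difference: confounded, well above 1.0
naive = y[t == 1].mean() - y[t == 0].mean()

# Backdoor adjustment: sum_z (E[Y|T=1,Z=z] - E[Y|T=0,Z=z]) * P(Z=z)
adjusted = sum(
    (y[(t == 1) & (z == v)].mean() - y[(t == 0) & (z == v)].mean()) * np.mean(z == v)
    for v in (0, 1)
)
print(naive, adjusted)  # naive is inflated; adjusted is close to 1.0
```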
This is the second post of a series about causality in data science. You can check the first one, “Why do we need causality in data science?”, and the next one, “Observing is not intervening”. As we said, there are currently two principal frameworks for working with causality: potential outcomes and graphs. Here we will continue explaining why causal inference is necessary and how graphs help with it.
Graphs are an awesome tool. Modeling causality through graphs brings an appropriate language to describe the dynamics of causality. …
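A causal graph can be as simple as an adjacency mapping from each variable to its direct effects. The variable names below are illustrative (echoing the kidney-stone example), and the `parents` helper is ours:

```python
# A minimal DAG: stone_size confounds both treatment and recovery,
# which is what makes naive treated-vs-untreated comparisons misleading.
dag = {
    "stone_size": ["treatment", "recovery"],  # the confounder
    "treatment": ["recovery"],
    "recovery": [],
}

def parents(node, graph):
    """Return the direct causes (parents) of `node` in the DAG."""
    return [p for p, children in graph.items() if node in children]

print(parents("recovery", dag))  # ['stone_size', 'treatment']
```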
This is a series of posts explaining why we need causal inference in data science and machine learning (the next one is “Use Graphs!”). Causal inference brings a fresh set of tools and perspectives that lets us deal with old problems.
First off, designing and running experiments (typically A/B tests) is always better than using causal inference techniques: you don’t need to model how the data are generated. If you can run an experiment, go for it!
However, there are many situations where this is not entirely possible: