## CRL Task 3: Counterfactual Decision Making

In the previous blog post we discussed some theory of how to select optimal and possibly-optimal interventions in a causal framework. For those interested in decision science, this blog post may be even more inspiring. The next task involves applying counterfactual quantities to boost learning performance. This is clearly very important for an RL agent, whose entire learning mechanism is based on interventions in a system. But what if intervention isn’t possible? Let’s begin!

## This Series

1. Causal Reinforcement Learning
2. Preliminaries for CRL
3. CRL Task 1: Generalised Policy Learning
4. CRL Task 2: Interventions – When and Where?
5. CRL Task 3: Counterfactual Decision Making
6. CRL Task 4: Generalisability and Robustness
7. CRL Task 5: Learning Causal Models
8. (Coming soon) CRL Task 6: Causal Imitation Learning
9. (Coming soon) Wrapping Up: Where To From Here?

## Counterfactual Decision Making

A key feature of causal inference is its ability to deal with counterfactual queries. Reinforcement learning, by its nature, deals with interventional quantities in a trial-and-error style of learning.…

## CRL Task 2: Interventions – When and Where?

In the previous blog post we discussed the gory details of generalised policy learning – the first task of CRL. We went into some very detailed mathematical description of dynamic treatment regimes and generalised modes of learning for data-processing agents. The next task is a bit more conceptual and focuses on the question of how to identify optimal areas of intervention in a system. This is clearly very important for an RL agent, whose entire learning mechanism is based on these very interventions in some system with a feedback mechanism. Let’s begin!


## CRL Task 1: Generalised Policy Learning

In the previous blog post we developed some ideas and theory needed to discuss a causal approach to reinforcement learning. We formalised notions of multi-armed bandits (MABs), Markov decision processes (MDPs), and some causal notions. In this blog post we’ll finally get to developing some causal reinforcement learning ideas, the first of which – dubbed Task 1 – is Generalised Policy Learning. Let’s begin.


## Generalised Policy Learning

Reinforcement learning typically involves learning and optimising some policy about how to interact in an environment to maximise some reward signal. Typical reinforcement learning agents are trained in isolation, exploiting copious amounts of computing power and energy resources.…

## Causal Reinforcement Learning: A Primer

As part of any honours degree at the University of Cape Town, one is obliged to write a thesis ‘droning’ on about some topic. Luckily for me, applied mathematics can pertain to pretty much anything of interest. Lo and behold, my thesis on merging causality and reinforcement learning. This was entitled Climbing the Ladder: A Survey of Counterfactual Methods in Decision Making Processes and was supervised by Dr Jonathan Shock.

In this series of posts I will break down my thesis into digestible blog chunks and go into quite some detail about the emerging field of Causal Reinforcement Learning (CRL) – which is being spearheaded by Elias Bareinboim and Judea Pearl, among others. I will try to present this in such a way as to satisfy those craving some mathematical detail whilst also painting a broader picture of why this is generally useful and important. Each of these blog posts will be self-contained in some way.…

## Correlation vs Mutual Information

This post is based on a (very small) part of the (dense and technical) paper Fooled by Correlation by N.N. Taleb, found at (1).

Notes on the main ideas in this post are available from Universidad de Cantabria, found at (2).

The aims of this post are to 1) introduce mutual information as a measure of similarity and 2) show the nonlinear relationship between correlation and mutual information by means of a relatively simple example.

Introduction

A significant part of statistical analysis is understanding how random variables are related – how much knowledge about the value of one variable tells us about the value of another. This post will consider this issue in the context of Gaussian random variables. More specifically, we will compare, and discuss the relationship between, correlation and mutual information.

Mutual Information

The mutual information between two random variables is the amount of information that one gains about one variable by observing the value of the other.…
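For jointly Gaussian variables the mutual information has a closed form, $I(X;Y)=-\frac{1}{2}\ln(1-\rho^2)$, which makes the nonlinear relationship with correlation easy to see. A minimal sketch (the function name is my own):

```python
import numpy as np

def gaussian_mutual_information(rho):
    """Mutual information (in nats) between two jointly Gaussian
    variables with correlation coefficient rho."""
    return -0.5 * np.log(1.0 - rho ** 2)

# MI grows slowly for small |rho|, then blows up as |rho| -> 1:
for rho in [0.1, 0.5, 0.9, 0.99]:
    print(f"rho = {rho:4.2f} -> MI = {gaussian_mutual_information(rho):.4f} nats")
```

Note that doubling the correlation far more than doubles the information once $|\rho|$ is large – the relationship is decidedly nonlinear.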

## The (Central) Cauchy distribution

The core of this post comes from Mathematical Statistics and Data Analysis by John A. Rice which is a useful resource for subjects such as UCT’s STA2004F.

Introduction

The Cauchy distribution has a number of interesting properties and is considered a pathological (badly behaved) distribution. What is interesting about it is that we can think about it in a number of different ways, and we can formulate the probability density function from each of them. This post will handle the derivation of the Cauchy distribution as a ratio of independent standard normals and as a special case of the Student’s t-distribution.

Like the normal and t-distributions, the standard form is centred on, and symmetric about, 0. But unlike those distributions, it is known for its very heavy (fat) tails. Whereas you are unlikely to see values significantly larger or smaller than 0 coming from a normal distribution, this is just not the case for the Cauchy distribution.…
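One way to see those heavy tails numerically is to generate Cauchy variates as ratios of independent standard normals, per the first derivation mentioned above. A quick sketch (sample size and seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)

# A standard Cauchy variate is the ratio of two independent
# standard normals, so we can sample it without any Cauchy routine.
z1 = rng.standard_normal(100_000)
z2 = rng.standard_normal(100_000)
cauchy = z1 / z2

# Heavy tails: the Cauchy sample produces extremes far beyond
# anything the normal sample does (when z2 lands near zero).
print("max |normal| :", np.max(np.abs(z1)))      # typically around 4-5
print("max |cauchy| :", np.max(np.abs(cauchy)))  # typically in the thousands
```

The same mechanism explains why the sample mean of Cauchy data never settles down: a single near-zero denominator can swamp everything seen so far.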


## p-values (part 3): meta distribution of p-values

Introduction

So far we have discussed what p-values are and how they are calculated, as well as how bad experiments can lead to artificially small p-values. The next thing that we will look at comes from a paper by N.N. Taleb (1), in which he derives the meta-distribution of p-values i.e. what ranges of p-values we might expect if we repeatedly did an experiment where we sampled from the same underlying distribution.

The derivations are pretty in-depth, and both the content and the implications of the results are fairly new to me, so any discrepancies or misinterpretations should be pointed out and/or discussed.

Thankfully, this video (2) gives an explanation that covers some of what the paper says, along with some Monte-Carlo simulations. My discussion will focus on some simulations of my own, based on those done in the video.

We have already discussed what p-values mean and how they can go wrong.…

## Integrals with sec and tan when the power of tan is odd

We went through an example in class today which was

$\int \tan^6\theta \sec^4\theta\, d\theta$

In this case we took out two powers of $\sec$ and then converted all the remaining $\sec$ into $\tan$, which left a function of $\tan$ times $\sec^2\theta\, d\theta$. We wanted to do this because the derivative of $\tan$ is $\sec^2$, so we can make a simple substitution. If we have an odd power of $\tan$, we can employ a different trick. Let’s look at:

$I=\int \tan^5\theta\sec^7\theta d\theta$.

Here, $\sec$ appears to an odd power, so we can’t employ the same trick as before. Now we want to convert everything to a function of $\sec$ and leave only a factor which is the derivative of $\sec$. The derivative of $\sec$ is $\sec\tan$, so let’s try to take this out:

$I=\int \tan^5\theta\sec^7\theta d\theta=\int \tan^4\theta\sec^6\theta (\sec\theta\tan\theta)d\theta$.

Now convert the $\tan$ into $\sec$ by $\tan^2\theta=\sec^2\theta-1$:

$I=\int (\sec^2\theta-1)^2\sec^6\theta (\sec\theta\tan\theta)d\theta=\int (\sec^{10}\theta-2\sec^8\theta+\sec^6\theta) (\sec\theta\tan\theta)d\theta$

where here we have just expanded out the bracket and multiplied everything out.…
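From here (going slightly beyond where this excerpt stops), the substitution $u=\sec\theta$, with $du=\sec\theta\tan\theta\, d\theta$, finishes the job:

$I=\int (u^{10}-2u^8+u^6)\, du=\frac{u^{11}}{11}-\frac{2u^9}{9}+\frac{u^7}{7}+C=\frac{1}{11}\sec^{11}\theta-\frac{2}{9}\sec^9\theta+\frac{1}{7}\sec^7\theta+C.$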

## Fundamental theorem of calculus example

We did an example today in class which I wanted to go through again here. The question was to calculate

$\frac{d}{dx}\int_a^{x^4}\sec t dt$

We spot the pattern immediately that it’s an FTC part 1 type question, but it’s not quite there yet. In the FTC part 1, the upper limit of the integral is just $x$, and not $x^4$. A question that we would be able to answer is:

$\frac{d}{dx}\int_a^{x}\sec t dt$

This would just be $\sec x$. Or, of course, we can show that in exactly the same way:

$\frac{d}{du}\int_a^{u}\sec t dt=\sec u$

That’s just changing the names of the variables, which is fine, right? But that’s not quite the question. So, how can we convert from $x^4$ to $u$? Well, how about a substitution? How about letting $x^4=u$ and seeing what happens. This is actually just a chain rule. It’s like if I asked you to calculate:

$\frac{d}{dx} g(x^4)$.

You would just say: Let $x^4=u$ and then we have:

$\frac{d}{dx} g(x^4)=\frac{du}{dx}\frac{d}{du}g(u)=4x^3 g'(u)$.…
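Applying that chain rule to our integral gives $4x^3\sec(x^4)$. As a sanity check, here is a quick numerical sketch of my own (the quadrature rule, step sizes, and the choices $a=0$ and $x=1$, which keep $x^4$ below $\pi/2$, are all arbitrary):

```python
import math

def G(x, a=0.0, n=10_000):
    """Composite Simpson approximation of the integral of sec(t) from a to x**4."""
    b = x ** 4
    h = (b - a) / n
    s = 1 / math.cos(a) + 1 / math.cos(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) / math.cos(a + i * h)
    return s * h / 3

x, h = 1.0, 1e-4
numeric = (G(x + h) - G(x - h)) / (2 * h)  # central-difference derivative of G
chain_rule = 4 * x ** 3 / math.cos(x ** 4)  # the claimed answer, 4x^3 * sec(x^4)
print(numeric, chain_rule)
```

The two numbers agree to several decimal places, as the chain-rule argument predicts.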

## PDE: Physics, Math and Common Sense. Part I: Conservation Law

Source: CFDIinside blog

INTRODUCTION

The course on partial differential equations (PDEs) is usually a tough one. There are a number of factors contributing to this toughness:

• PDE course combines the knowledge from calculus, algebra, ordinary differential equations (ODEs), complex analysis and functional analysis. Simply put, there is a lot that you need to know about!
• PDE methods often (or should I say, mostly?) come from physics, but this aspect is not always emphasized and, as a result, the intuition is lost.
• There is a lot of abstraction in the PDE course material: characteristics, generalized functions (distributions), eigenfunctions, convolutions, etc. Many of these concepts actually have simple interpretations, but again, this is not emphasized.
• PDEs themselves are tough. In contrast to ODEs, there are no general methods for all kinds of PDEs. The field is young and a bit messy.

This series of posts aims to demystify PDEs and show some general way of handling PDE problems by combining physical intuition and mathematical methods.…