Sticky Post – Read this first. Categories and Links in Mathemafrica

The navigability of Mathemafrica isn’t ideal, so I have created this post, which might guide you to what you are looking for. Here are a number of different categories of post which you might like to take a look at:

Please leave a comment if there is anything you would like to see us write about, or anything you would like to write about yourself.…

By | January 17th, 2018|Uncategorized|0 Comments

A challenging limit

This post comes mostly from the YouTube video by BlackPenRedPen found here: https://www.youtube.com/watch?v=89d5f8WUf1Y&t=3s

This in turn comes from Brilliant.com; details and links can be found in the original video.

In this post we will have a look at a complicated-looking limit that has an interesting solution. Here it is:

\lim_{n \rightarrow \infty} ( \frac{n!}{n^n})^{\frac{1}{n}}

This looks pretty daunting – but we will break the solution down into sections:

  • taking the logarithms and rearranging
  • recognising something familiar
  • finding the numerical value

 

Step 1: Taking the Logarithm

The first step here is to take the logarithm, a generally useful trick when evaluating limits. First we assign the variable L to the limit (so that we can solve for it in the end). Now let’s do some algebra:

L = \lim_{n \rightarrow \infty} ( \frac{n!}{n^n})^{\frac{1}{n}}

\ln(L) = \ln(\lim_{n \rightarrow \infty} ( \frac{n!}{n^n})^{\frac{1}{n}})

Note that the natural logarithm \ln is a continuous function, so we can take the limit outside of it:

\ln(L) =  \lim_{n \rightarrow \infty} \ln( (\frac{n!}{n^n})^{\frac{1}{n}})

Next we can use the logarithm laws to bring down the exponent:

\ln(L) =  \lim_{n \rightarrow \infty}  \frac{1}{n} \ln(\frac{n!}{n^n})

Alright, now that we have taken the logarithm, step 1 is complete.…
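
As a quick numerical sanity check (not part of the original video or post), we can evaluate the expression for increasing n in Python. Working with logarithms avoids overflowing n!, and the values settle towards 1/e ≈ 0.368, which is where the remaining steps lead.

import math

# Evaluate (n!/n^n)^(1/n) in log-space: (1/n) * (ln(n!) - n*ln(n)), then exponentiate.
def nth_term(n):
    return math.exp((math.lgamma(n + 1) - n * math.log(n)) / n)

for n in (10, 100, 1000, 10000):
    print(n, nth_term(n))
print("1/e =", math.exp(-1))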

By | November 29th, 2020|MAM1000, Uncategorized|0 Comments

Parrondo’s Paradox

Introduction

In this post we will have a look at Parrondo’s paradox. In a paper* entitled “Information Entropy and Parrondo’s Discrete-Time Ratchet”** the authors demonstrate a situation where, by switching between 2 losing strategies, we can create a winning strategy.

Setup

The setup to this paradox is as follows:

We have 2 games that we can play – if we win we get 1 unit of wealth; if we lose, it costs us 1 unit of wealth. Game A gives us a payout of 1 with a probability of slightly less than 0.5. Clearly, if we play this game for long enough we will end up losing.

Game B is a little more complicated in that it is defined with reference to our existing winnings. If our current level of wealth is a multiple of M we play a game where the probability of winning is slightly less than 0.1. If it is not a multiple of M, the probability of winning is slightly less than 0.75.…
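
The following Monte Carlo sketch illustrates the effect (it is not the paper’s code; the bias ε = 0.005, the modulus M = 3, the run lengths and the strategy of picking A or B at random each turn are all illustrative assumptions). Played on their own, both games drift into losses, while the random mixture tends to drift upwards.

import random

EPS, M = 0.005, 3              # assumed parameters; the post only says "slightly less than"
N_STEPS, N_RUNS = 10_000, 200  # length of each game sequence and number of repetitions

def play_A(wealth):
    # Win 1 unit with probability just under 0.5, otherwise lose 1 unit.
    return 1 if random.random() < 0.5 - EPS else -1

def play_B(wealth):
    # Win probability depends on whether current wealth is a multiple of M.
    p = (0.1 - EPS) if wealth % M == 0 else (0.75 - EPS)
    return 1 if random.random() < p else -1

def average_final_wealth(choose_game):
    total = 0
    for _ in range(N_RUNS):
        wealth = 0
        for _ in range(N_STEPS):
            wealth += choose_game(wealth)(wealth)
        total += wealth
    return total / N_RUNS

print("A only:    ", average_final_wealth(lambda w: play_A))
print("B only:    ", average_final_wealth(lambda w: play_B))
print("random A/B:", average_final_wealth(lambda w: random.choice((play_A, play_B))))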

By | November 11th, 2020|Uncategorized|0 Comments

Basic Reverse Image Search Using an Autoencoder

Introduction

In this post we are going to create a simple reverse image search on the MNIST handwritten image dataset. That is to say, given any image, we want to return images that look most similar to it. To do this, we will use an autoencoder, trained using Tensorflow 2.

The dataset

The MNIST dataset is a commonly-used dataset in machine learning composed of 28-by-28 images of handwritten digits between 0 and 9. For our purposes we would like our image searcher to return images of the same number as the query image, i.e. if we input a 3 we want the images returned to all be 3s. However, if we had, say, four 3s and one 2, that mightn’t be too bad, considering how 2 and 3 look a bit similar. On the other hand, if we had three 3s, a 1 and a 7, we might say that the performance is not up to standard.…
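
A minimal sketch of the pipeline in TensorFlow 2 (this is not the post’s code; the dense architecture, the 32-dimensional bottleneck, the short training run and the use of Euclidean distance for the search are all illustrative assumptions):

import numpy as np
import tensorflow as tf

# Load MNIST and flatten the 28x28 images into 784-dimensional vectors in [0, 1].
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

# A small dense autoencoder; the 32-dimensional bottleneck is an assumption.
inputs = tf.keras.Input(shape=(784,))
encoded = tf.keras.layers.Dense(128, activation="relu")(inputs)
encoded = tf.keras.layers.Dense(32, activation="relu")(encoded)
decoded = tf.keras.layers.Dense(128, activation="relu")(encoded)
decoded = tf.keras.layers.Dense(784, activation="sigmoid")(decoded)

autoencoder = tf.keras.Model(inputs, decoded)
encoder = tf.keras.Model(inputs, encoded)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.fit(x_train, x_train, epochs=5, batch_size=256, verbose=0)

# Reverse image search: embed every training image, embed the query image,
# and return the indices of the nearest neighbours in embedding space.
db_embeddings = encoder.predict(x_train, verbose=0)
query_embedding = encoder.predict(x_test[:1], verbose=0)
distances = np.linalg.norm(db_embeddings - query_embedding, axis=1)
nearest = np.argsort(distances)[:5]
print("Most similar training images:", nearest)

The key design point is that the search happens in the low-dimensional embedding space produced by the encoder rather than in raw pixel space, which makes nearest-neighbour lookup both cheaper and more likely to return images of the same digit.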

By | October 21st, 2020|Uncategorized|0 Comments

A simple introduction to causal inference

 

Introduction

Causal inference is a branch of Statistics that is increasing in popularity. This is because it allows us to answer questions in a more direct way than do other methods. Usually, we can make inference about association or correlation between a variable and an outcome of interest, but these are often subject to outside influences and may not help us answer the questions in which we are most interested.

Causal inference seeks to remedy this by measuring the effect on the outcome (or response variable) that we see when we change another variable (the ‘treatment’). In a sense, we are looking to reproduce the situation that we have when we run a designed experiment (with a ‘treated’ and a ‘control’ group). The goal here is to have groups that are otherwise the same (with regard to factors that might influence the outcome) but where one is ‘treated’ and the other is not.…

By | August 20th, 2020|English, Uncategorized|0 Comments

Inverse Reinforcement Learning: Guided Cost Learning and Links to Generative Adversarial Networks

Recap

In the first post we introduced inverse reinforcement learning and stated some results on the characterisation of admissible reward functions (i.e. reward functions that solve the inverse reinforcement learning problem). In the second post we saw how to proceed with solving problems, more or less, using a maximum entropy framework, and we encountered two problems:
1. It would be hard to use the method introduced if we did not know the dynamics of the system already, and
2. We have to solve the MDP in the inner loop, which may be an expensive process.

Here we shall attempt to mitigate the challenges that we have encountered so far, and we shall give a rather beautiful closing which links concepts in this space of inverse reinforcement learning to ‘general’ machine learning structures, in particular generative adversarial networks.

Inverse Reinforcement Learning with Unknown Dynamics and Possibly Higher Dimensional Spaces

As we saw previously, the maximum entropy inverse reinforcement learning approach proceeds by defining the probability of a certain trajectory under the expert as being,

p(\tau)=\dfrac{1}{Z}e^{R_\psi (\tau)},

where

Z=\int e^{R_\psi(\tau)}d \tau.

We mentioned that this is hard to compute in higher dimensional spaces.…
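
As a toy illustration (made-up numbers, not from the post): if there were only finitely many candidate trajectories, the integral Z would collapse to a sum and p(\tau) would simply be a softmax over the trajectory rewards. The difficulty referred to above is that in high-dimensional continuous spaces this normalising integral has no such easy form.

import numpy as np

# With a finite set of trajectories, p(tau) = exp(R_psi(tau)) / Z is a softmax
# over trajectory rewards. The reward values below are made up for illustration.
rewards = np.array([1.0, 0.2, -0.5, 2.3])     # R_psi(tau) for four trajectories
shifted = rewards - rewards.max()             # shift for numerical stability
p = np.exp(shifted) / np.exp(shifted).sum()   # Z is just the sum in this finite case
print(p, p.sum())                             # higher-reward trajectories are more probable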

By | May 28th, 2020|Uncategorized|0 Comments

Maximum Entropy Inverse Reinforcement Learning: Algorithms and Computation

In the previous post we introduced inverse reinforcement learning. We defined the problem that is associated with this field, which is that of reconstructing a reward function given a set of demonstrations, and we saw what the ability to do this implies. In addition, we came across some classification results as well as convergence guarantees from selected methods that were simply referred to in the post. There were some challenges with the classification results that we discussed, and although there were attempts to deal with these, there is still quite a lot that we did not talk about.

Maximum Entropy Inverse Reinforcement Learning

We shall now introduce a probabilistic approach based on what is known as the principle of maximum entropy, which provides a well-defined, globally normalised distribution over decision sequences, while providing the same performance assurances as previously mentioned methods. This probabilistic approach allows moderate reasoning about uncertainty in the setting of inverse reinforcement learning, and its assumptions further limit the space in which we search for solutions, which, as we saw last time, is quite massive.…

By | May 22nd, 2020|Uncategorized|0 Comments

Inverse Reinforcement Learning: The general basics

Standard Reinforcement Learning

The very basic ideas in Reinforcement Learning are usually defined in the context of Markov Decision Processes. For everything that follows, unless stated otherwise, assume that the structures are finite.

A Markov Decision Process (MDP) is a tuple (S,A, P, \gamma, R) where the following is true:
1. S is the set of states s_k with k\in \mathbb{N} .
2. A is the set of actions a_k with k\in \mathbb{N} .
3. P is the matrix of transition probabilities for taking action a_k given state s_j.
4. \gamma is the discount factor in the unit interval.
5. R is defined as the reward function, and is taken as a function from A\times S\to \mathbb{R}.

In this context, we have policies as maps

\pi:S\to A ,

state value functions for a policy, \pi , evaluated at s_1 as

V^\pi(s_1)=\mathbb{E}[\sum_{i=0}^{\infty}\gamma ^i R(s_i)|\pi],

and state action values defined as

Q^\pi (s,a)=R(s)+\gamma \mathbb{E}_{s'\sim P_{sa}}[V^\pi (s')].

The optimal functions are defined as

V^*(s)=\sup_\pi V^{\pi}(s),

and

Q^*(s,a)=\sup_\pi Q^\pi (s,a).

Here we assume that we have a reward function, and this reward function is used to determine an optimal policy.
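
To make the definitions concrete, here is a small sketch (not from the post) that evaluates V^\pi for a fixed policy on a made-up two-state MDP by iterating the Bellman expectation equation; the transition matrix, rewards and \gamma = 0.9 are all assumptions.

import numpy as np

# Tiny made-up MDP: 2 states, with the fixed policy pi already folded into the
# transition matrix P[s, s'] (probability of landing in s' when following pi from s).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
R = np.array([0.0, 1.0])   # state rewards R(s)
gamma = 0.9

# Iterate V(s) = R(s) + gamma * sum_s' P[s, s'] * V(s') until it converges.
V = np.zeros(2)
for _ in range(1000):
    V = R + gamma * P @ V
print("V^pi =", V)

# A state-action value would follow the same pattern, using the transition
# distribution P_sa for the chosen action: Q(s, a) = R(s) + gamma * sum_s' P_sa[s, s'] * V[s'].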

By | May 17th, 2020|Uncategorized|0 Comments

Correlation vs Mutual Information

This post is based on a (very small) part of the (dense and technical) paper Fooled by Correlation by N.N. Taleb, found at (1)

Notes on the main ideas in this post are available from Universidad de Cantabria, found at (2)

The aims of this post are 1) to introduce mutual information as a measure of similarity, and 2) to show the nonlinear relationship between correlation and mutual information by means of a relatively simple example.

Introduction

A significant part of Statistical analysis is understanding how random variables are related – how much knowledge about the value of one variable tells us about the value of another. This post will consider this issue in the context of Gaussian random variables. More specifically, we will compare, and discuss the relationship between, correlation and mutual information.

Mutual Information

The Mutual Information between 2 random variables is the amount of information that one gains about a random variable by observing the value of the other.…
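
For jointly Gaussian variables this relationship has a well-known closed form, I(X;Y) = -\frac{1}{2}\ln(1-\rho^2), which already shows the nonlinearity. The sketch below (an illustration, not necessarily the post’s exact calculation) tabulates it for a few correlation values.

import numpy as np

# For a bivariate Gaussian with correlation rho, I(X; Y) = -0.5 * ln(1 - rho^2).
# Note how slowly information grows at small rho and how it blows up near |rho| = 1.
for rho in (0.1, 0.5, 0.9, 0.99, 0.999):
    mi = -0.5 * np.log(1 - rho**2)
    print(f"rho = {rho:5.3f}   I(X;Y) = {mi:.3f} nats")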

By | March 28th, 2020|English, Level: intermediate, Uncategorized|0 Comments

The Res-Net-NODE Narrative

Humble Beginnings: Ordinary Differential Equations

The story begins with differential equations. Consider f such that f:[0,T]\times \mathbb{R}^n\to \mathbb{R}^n is a continuous function. We can construct a rather simple differential equation given this in the following way. We let

\begin{cases}  {y'(t)}=f(t,y(t))\\  y(0)=y_0\in \mathbb{R}^n  \end{cases}

A solution to this system is a continuous map, defined in a neighbourhood of t=0, that satisfies the differential equation.

Ordinary differential equations are well-studied, and we know that, for example, a solution to the given differential equation will exist whenever the function f satisfies the following:

(\exists C>0)(\forall x,y\in \mathbb{R}^n)(||f(t,y)-f(t,x)||\leq C||y-x||)

This property is known as Lipschitz continuity. A function that satisfies this condition is said to be Lipschitz. We shall see that whenever we require this condition in upcoming situations, wonderful things happen!

A remarkable field that almost always couples with differential equations is numerical analysis, where we learn to solve differential equations numerically and study the resulting numerical schemes. We shall explore numerical integration briefly.…
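
As a small taste of that numerical side (a sketch under assumptions: the forward Euler scheme, the step size and the test equation y' = -y are not taken from the post):

import numpy as np

# Forward Euler for y'(t) = f(t, y(t)), y(0) = y0: repeat y_{k+1} = y_k + h * f(t_k, y_k).
def euler(f, y0, t_final, h):
    t, y = 0.0, np.asarray(y0, dtype=float)
    while t < t_final:
        y = y + h * f(t, y)
        t += h
    return y

# Test problem y' = -y with y(0) = 1, whose exact solution at t = 1 is e^{-1}.
print(euler(lambda t, y: -y, 1.0, 1.0, 0.001), np.exp(-1.0))

Each Euler step has the additive form y_{k+1} = y_k + h f(t_k, y_k), which is exactly the shape of a residual block; this update rule is the bridge between ResNets and neural ODEs that the title alludes to.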

By | March 18th, 2020|Uncategorized|Comments Off on The Res-Net-NODE Narrative

Scaled Reinforcement Learning: A Brief Introduction to Deep Q-Learning

This blog post is a direct translation of a talk given by the author on the 17th of February 2020. The idea was to very briefly introduce Deep Q-Learning to an audience that was familiar with the fundamental concepts of reinforcement learning. If you are not familiar with these basics, then a great introduction can be found here: An Introduction to Reinforcement Learning. Without the additional details from the talk, one will note that this post is rather brief, and should really be used as a tool to gain an overview of the method or a gateway to relevant resources. This will not be the case for posts later in the series, because the intention is to deal more with the mathematical aspects of reinforcement learning.

Basic Reinforcement Learning Notions

The idea behind reinforcement learning is that there is an agent that interacts with the environment in order to achieve a certain task.…

By | March 18th, 2020|Uncategorized|0 Comments