The Res-Net-NODE Narrative

Humble Beginnings: Ordinary Differential Equations

The story begins with differential equations. Consider a continuous function f:[0,T]\times \mathbb{R}^n\to \mathbb{R}^n. Given such an f, we can construct a rather simple differential equation in the following way. We let

\begin{cases}  {y'(t)}=f(t,y(t))\\  y(0)=y_0\in \mathbb{R}^n  \end{cases}

A solution to this system is a continuously differentiable map, defined in a neighbourhood of t=0, that satisfies both the differential equation and the initial condition.

Ordinary differential equations are well-studied, and we know that, for example, a solution to the given differential equation will exist whenever the function f satisfies the following:

(\exists C>0)(\forall t\in [0,T])(\forall x,y\in \mathbb{R}^n)(||f(t,y)-f(t,x)||\leq C||y-x||)

This property is known as Lipschitz continuity (in the second argument, uniformly in t). A function that satisfies this condition is said to be Lipschitz. We shall see that whenever we require this condition in upcoming situations, wonderful things happen!
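As a quick concrete illustration (an example added here, not taken from the original post): take n=1 and f(t,y)=\sin(y). By the mean value theorem,

|\sin(y)-\sin(x)| = |\cos(\xi)|\,|y-x| \leq |y-x|

for some \xi between x and y, so f is Lipschitz with constant C=1.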

A remarkable field that almost always goes hand in hand with differential equations is numerical analysis, where we learn to solve differential equations numerically and study the resulting numerical schemes. We shall explore numerical integration briefly.…
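To give a small taste of what such a scheme looks like, here is a minimal sketch (in Python, on a toy problem chosen for this post rather than taken from the original) of the forward Euler method, which approximates the solution by repeatedly stepping along the direction given by f:

import numpy as np

def euler(f, y0, t0, T, n_steps):
    """Forward Euler scheme: y_{k+1} = y_k + h * f(t_k, y_k)."""
    ts = np.linspace(t0, T, n_steps + 1)
    h = (T - t0) / n_steps
    ys = np.zeros(n_steps + 1)
    ys[0] = y0
    for k in range(n_steps):
        ys[k + 1] = ys[k] + h * f(ts[k], ys[k])
    return ts, ys

# Toy problem: y'(t) = -y(t), y(0) = 1, with exact solution y(t) = exp(-t).
ts, ys = euler(lambda t, y: -y, y0=1.0, t0=0.0, T=2.0, n_steps=20)
print(ys[-1], np.exp(-2.0))  # Euler approximation vs exact value at t = 2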

By | March 18th, 2020|Uncategorized|Comments Off on The Res-Net-NODE Narrative

Scaled Reinforcement Learning: A Brief Introduction to Deep Q-Learning

This blog post is a direct translation of a talk given by the author on the 17th of February 2020. The idea was to very briefly introduce Deep Q-Learning to an audience that was familiar with the fundamental concepts of reinforcement learning. If the reader is not familiar with these basics, then a great introduction can be found here: An Introduction to Reinforcement Learning. Without the additional details from the talk, one will note that this post is rather brief, and it should really be used as a tool to gain an overview of the method or as a gateway to relevant resources. This will not be the case for later posts in the series, because the intention is to deal more with the mathematical aspects of reinforcement learning.

Basic Reinforcement Learning Notions

The idea behind reinforcement learning is that there is an agent that interacts with the environment in order to achieve a certain task.…
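To make the agent–environment picture concrete, here is a minimal sketch of tabular Q-learning on a made-up toy environment (a five-state chain where the agent moves left or right and is rewarded for reaching the rightmost state); this illustrates the basic update rule rather than the deep variant discussed in the talk:

import numpy as np

# Toy chain environment: states 0..4, actions 0 (left) and 1 (right).
# Reaching state 4 gives reward 1 and ends the episode.
N_STATES, N_ACTIONS = 5, 2

def step(state, action):
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount factor, exploration rate
rng = np.random.default_rng(0)

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy action selection.
        action = rng.integers(N_ACTIONS) if rng.random() < epsilon else int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        # Q-learning update: move Q(s, a) towards r + gamma * max_a' Q(s', a').
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state

print(Q)  # the learned values should favour action 1 (move right) in every state

Deep Q-Learning replaces the table Q with a neural network, which is what allows the method to scale to large state spaces.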

By | March 18th, 2020|Uncategorized|0 Comments

Curves for the Mathematically Curious – an anthology of the unpredictable, historical, beautiful and romantic, by Julian Havil – a review

NB I was sent this book as a review copy.

What a beautiful idea. What a beautiful book! In studying mathematics, one comes across various curves while studying calculus, or number theory, or geometry in its various forms, and they appear as asides to the particular subject. The idea of flipping the script and looking at the curves themselves, and from them gaining insight into statistics, combinatorics, number theory, analysis, cryptography, fractals, Fourier series, axiomatic set theory and so much more, is just wonderful.

This book looks at ten carefully chosen curves and from them shows how much insight one can get into vast swathes of mathematics and mathematical history. The curves chosen are:

  1. The Euler Spiral – an elegant spiral which leads to many other interesting parametrically defined curves
  2. The Weierstrass Curve – an everywhere continuous but nowhere differentiable function
  3. Bezier Curves – which show up in computer graphics and beyond
  4. The Rectangular Hyperbola – which leads to the investigation of logarithms and exponentials
  5. The Quadratrix of Hippias – which is tightly linked to the impossible problems of antiquity
  6. Peano’s Function and Hilbert’s Curve – space filling curves which lead to a completely flipped understanding of the possibilities of infinitely thin lines
  7. Curves of Constant Width – curves which can perfectly fit down a hallway as they rotate.
By | March 15th, 2020|Book reviews, Reviews, Uncategorized|1 Comment

The Objective Function

In both supervised and unsupervised machine learning, most algorithms are centred around minimising (or, equivalently, maximising) some objective function. This function is supposed to somehow represent what the model knows or can get right. As one might expect, however, the objective function does not always reflect exactly what we want.

The objective function presents two main problems: 1) how do we minimise it? (the answer to this is up for debate, and there is lots of interesting research on the efficient optimisation of non-convex functions); and 2) assuming we can minimise it perfectly, is it the correct thing to be minimising?

It is point 2 which is the focus of this post.

Let’s take the example of square-loss linear regression: we train a linear regression model with the square loss \mathcal{L}(\mathbf{w})=\sum_i (y_i - \mathbf{w}^Tx_i)^2, where we take the inner product of the learned weights with each observation’s feature vector to predict the outcome.…
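As an illustration of minimising this loss (a minimal numpy sketch on made-up data, not code from the original post), we can fit the weights by gradient descent on \mathcal{L}(\mathbf{w}):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # 100 observations, 3 features
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)  # noisy linear outcomes

w = np.zeros(3)
learning_rate = 1e-3
for _ in range(1000):
    residuals = y - X @ w
    grad = -2 * X.T @ residuals              # gradient of sum_i (y_i - w^T x_i)^2
    w -= learning_rate * grad

print(w)  # should end up close to w_true

Minimising this particular loss is easy (it is convex and even has a closed-form solution); the harder question, as above, is whether the square loss is the right thing to minimise for the problem at hand.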

By | February 20th, 2020|Level: Simple, Uncategorized|0 Comments

Tales of Impossibility – The 2000 year quest to solve the mathematical problems of antiquity, by David S. Richeson – a review

NB I was sent this book as a review copy.

Four impossible puzzles, all described in detail during the height of classical Greek mathematics. All are simple to state, and yet so tempting that it has taken not only the brain power of many, many thousands of mathematicians (amateur and professional alike), but also two millennia, to show that however hard you may try, these puzzles simply cannot be solved. The puzzles are:

  • Squaring the circle: With only a compass and a straight edge, draw a square with the same area as that of a given circle.
  • Doubling the cube: With only a compass and a straight edge, draw the edge of a cube with volume twice that of a cube whose edge is given.
  • Constructing regular polygons: Given a compass and a straight edge, construct a regular n-gon in a given circle for n\ge 3.
  • Trisecting an angle: Given a compass and a straight edge, and a given angle, construct an angle that is one third of the original.
By | February 9th, 2020|Uncategorized|1 Comment

Simpson’s Paradox

Introduction

A key consideration when analysing stratified data is how the behaviour of each category differs and how these differences might influence the overall observations about the data. For example, a data set might be split into one large category that dictates the overall behaviour, or there may be a category whose statistics are significantly different from those of the other categories and which skews the overall numbers. These features of the data are important to be aware of and to look out for, in order to avoid drawing erroneous conclusions from your analysis. Context, the source of the data and a careful analysis of the data can prevent this. Simpson’s paradox is an interesting result of some of these effects.

The Paradox

Simpson’s paradox is observed in statistics when a trend that appears in each of a number of different groups disappears, or is reversed, when the groups are combined.

Observing the overall data might therefore lead us to draw a conclusion, but when the data is grouped we might conclude something different.…
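A small numerical illustration (with illustrative counts, not data from the original post): suppose a treatment is compared with a control in two groups of very different sizes. The treatment has the higher success rate within each group, yet the lower success rate overall:

# Illustrative counts: (successes, trials) for each arm in each group.
groups = {
    "Group A": {"treatment": (81, 87),   "control": (234, 270)},
    "Group B": {"treatment": (192, 263), "control": (55, 80)},
}

totals = {"treatment": [0, 0], "control": [0, 0]}
for name, arms in groups.items():
    for arm, (successes, trials) in arms.items():
        totals[arm][0] += successes
        totals[arm][1] += trials
        print(f"{name} {arm}: {successes / trials:.1%}")

for arm, (successes, trials) in totals.items():
    print(f"Overall {arm}: {successes / trials:.1%}")
# The treatment wins in both groups but loses overall: Simpson's paradox.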

By | January 5th, 2020|English, Level: Simple|1 Comment

What is mathematics?

Below you will find some thoughts on this broad question, which I encourage you to think about. What is your vision of mathematics? It will most probably be the result of your own experience with the subject and the traumas that happened along the way; realising that could make you more conscious of your relationship with the subject and of the walls you might have built against it, or against parts of it. In a sense, by understanding this bias and blockage, and by objectively thinking about the subject’s value, you could allow yourself to be equipped with the full set of skills mathematics gives you to build your own greatest life.

Many of the thoughts that follow on this topic are taken from a paper I recommend reading: Teaching and Learning “What is Mathematics and why we should ask, where one should experience and learn that and how to teach it”, by Gunter M.…

By | December 16th, 2019|Uncategorized|0 Comments

Reasoning and making sense: a pillar of mathematics?

An essential part of learning mathematics is about reasoning and making sense. What does this exactly mean?

When a student is given a problem, they need to make sense of it from their own perspective, which is unique to each individual. This will come with great struggle, and the important next step is to stay motivated and curious, to be extremely perseverant, and not to give up after the first few attempts. This might also require a good relationship with mistakes.

A student will have to develop their own strategy to solve a given problem. That might mean first translating it into their own language, using their own words and background knowledge to get at (understand) the actual question and problem they are attempting to solve.

They will have to build bridges in their mind to similar problems they have solved in the past, even though these might seem different. These bridges become easier and easier to make with practice and experience; sometimes a bridge will not work, and other connections will need to be created until a suitable one is found.…

By | December 16th, 2019|Uncategorized|0 Comments

The Wisdom of the Crowds

This content comes primarily from the notes of Mark Herbster (contributed to by Massi Pontil and John Shawe-Taylor) of University College London.

Introduction

The Wisdom of the Crowds, or majority rule, and related ideas tend to come up pretty often. Democracy is based (partly) on the majority of people being able to make the correct decision; you might often make decisions in a group of friends based on what most people want; and it is logical to take popular opinion into account when reasoning about issues where you have imperfect information. On the other hand, of course, there is the Argumentum ad Populum fallacy, which reminds us that a popular belief isn’t necessarily true.

This idea also appears in applied machine learning: ensemble methods such as Random Forests, Gradient Boosted Models (especially XGBoost) and the stacking of neural networks have resulted in overall more powerful models. This is especially notable in Kaggle competitions, where it is almost always an ensemble model (a combination of models) that achieves the best score.…
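As a toy illustration of why combining models can help (a simulated sketch, not from the original notes): if each of several classifiers is independently right 70% of the time, a majority vote over them is right considerably more often than any single one.

import numpy as np

rng = np.random.default_rng(0)
n_examples, n_classifiers, p_correct = 10_000, 11, 0.7

# correct[i, j] is True if classifier i gets example j right,
# assuming the classifiers' mistakes are independent of each other.
correct = rng.random((n_classifiers, n_examples)) < p_correct

single_accuracy = correct[0].mean()
majority_accuracy = (correct.sum(axis=0) > n_classifiers // 2).mean()

print(f"single classifier: {single_accuracy:.3f}")   # about 0.70
print(f"majority vote:     {majority_accuracy:.3f}")  # noticeably higher, about 0.92

In practice the classifiers' errors are never truly independent, which is why real ensembles (random forests, boosting, stacking) work hard to encourage diversity among their members.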

By | November 15th, 2019|Uncategorized|0 Comments

Automatic Differentiation

Much of this content is based on lecture slides from Professor David Barber at University College London; related resources can be found at www.cs.ucl.ac.uk/staff/D.Barber/brml

What is Autodiff?

Autodiff, or Automatic Differentiation, is a method of determining the exact derivative of a function with respect to its inputs. It is widely used in machine learning; in this post I will give an overview of what autodiff is and why it is a useful tool.

That on its own is not a very helpful definition, so let us first compare autodiff to symbolic differentiation and numerical approximation before going into how it works.

Symbolic differentiation is what we do when we calculate derivatives by hand: given a function f, we find a new function f'. This is really useful when we want to know how a function behaves across all of its inputs. For example, if we had f(x) = x^2 + 3x + 1 we could find the derivative f'(x) = 2x + 3 and then evaluate the derivative for any value of x.…
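To contrast this with autodiff, here is a minimal forward-mode sketch using dual numbers (written for this post as an illustration, not taken from the original slides). It computes the exact derivative of f(x) = x^2 + 3x + 1 at a single point without ever forming the symbolic expression 2x + 3:

class Dual:
    """A number a + b*eps with eps**2 = 0; the b component carries the derivative."""
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__

def f(x):
    return x * x + 3 * x + 1

x = Dual(2.0, 1.0)       # seed the derivative of x with respect to itself
y = f(x)
print(y.value, y.deriv)  # 11.0 and 7.0, matching f(2) and f'(2) = 2*2 + 3

Each arithmetic operation propagates both a value and a derivative, which is exactly the chain rule applied one elementary operation at a time.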

By | October 23rd, 2019|English, Uncategorized|0 Comments