I am a South African and a UCT alumnus (applied mathematics), currently doing a PhD in theoretical neuroscience at University College London. I am interested in various fields of mathematics, particularly probability theory and measure theory, as they relate to information theory and machine learning, which are used widely in theoretical neuroscience. I will be blogging about what I learn from "a mathematician's point of view".

## Welcome to Reproducing Kernel Hilbert Space

In a series of posts, I hope to introduce Mathemafrica readers to some useful data analysis methods which rely on operations in a little backwater of Hilbert space, namely Reproducing Kernel Hilbert Space (or RKHS).

We’ll start with the “classic” example. Consider the data plotted in figure 1. Each data point has 3 “properties”: an $x_1$ coordinate, an $x_2$ coordinate and a colour (red or blue). Suppose we want to be able to separate all data points into two groups: red points and blue points. Furthermore, we want to be able to do this linearly, i.e. we want to be able to draw a line (or plane or hyperplane) such that all points on one side are blue and all points on the other are red. This is called linear classification.

Figure 1: A scatter of data with three properties: an $x_1$ coordinate, an $x_2$ coordinate and a colour.

Suppose for each data point we generate a representation of the data point $\phi(x)=[x_1, x_2, x_1x_2]$.…
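As a minimal sketch of why a representation like $\phi(x)=[x_1, x_2, x_1x_2]$ can help, the snippet below uses four hypothetical XOR-style points (an assumption, since the data behind figure 1 is not reproduced here). No straight line in the $(x_1, x_2)$ plane separates these two colours, but in the three-dimensional feature space a single plane does. The point coordinates, the label encoding and the weight vector `w` are all illustrative choices, not taken from the post.

```python
import numpy as np


def phi(x):
    """Map a 2-D point (x1, x2) to the feature vector [x1, x2, x1*x2]."""
    x1, x2 = x
    return np.array([x1, x2, x1 * x2])


# Hypothetical XOR-style data: opposite corners share a colour.
points = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
labels = np.array([1, 1, -1, -1])  # assumed encoding: +1 = blue, -1 = red

# Represent every point in the 3-D feature space.
features = np.array([phi(p) for p in points])

# Illustrative separating plane through the origin with normal w:
# it only looks at the third feature, x1*x2.
w = np.array([0.0, 0.0, 1.0])
predictions = np.sign(features @ w).astype(int)

print(predictions)
print(bool(np.all(predictions == labels)))
```

Here the classifier is still linear, but linear in $\phi(x)$ rather than in $x$ itself.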

## You’re (probably) a Bayesian – whether you like it or not!

Statisticians have long been separated into two camps as to how they philosophically interpret their trade. These two schools of thought are usually called Frequentist and Bayesian.

Frequentists believe that a probability, $p\in[0, 1]$, associated with a specific possible outcome of an observable occurrence or process, is simply telling you that, could you observe this occurrence (or process) infinitely many times, the fraction of such observations that would yield that specific outcome is $p$. Using the age-old coin toss example: tossing the coin is the occurrence or process, and recording Heads or Tails are the two possible observations. The number 0.5 $\left(P(\text{Tails})=0.5=P(\text{Heads})\right)$ tells a Frequentist that, in the pursuit of infinitely many coin tosses, the ratio of Heads recorded to the number of tosses performed asymptotically approaches 0.5. And that’s all! The value should not be interpreted as the most likely outcome for the next observation or sample taken from the process (though I’ve always wondered how a Frequentist would gamble…).…
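The long-run fraction in the Frequentist reading can be illustrated with a small simulation. This is only a sketch: a finite number of pseudo-random tosses stands in for the "infinitely many" observations, and the seed, the number of tosses and the 1 = Heads encoding are arbitrary assumptions.

```python
import numpy as np

# Simulate a fair coin; the seed and sample size are illustrative choices.
rng = np.random.default_rng(seed=0)
n_tosses = 100_000
tosses = rng.integers(0, 2, size=n_tosses)  # assumed encoding: 1 = Heads, 0 = Tails

# Fraction of Heads observed after each toss.
running_fraction = np.cumsum(tosses) / np.arange(1, n_tosses + 1)

# Inspect the fraction at a few checkpoints; it should drift towards 0.5.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, running_fraction[n - 1])
```

The early fractions can sit far from 0.5, while the later ones settle close to it, which is exactly the statement the Frequentist attaches to the number 0.5.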