## The Objective Function

In both supervised and unsupervised machine learning, most algorithms are centred around minimising (or, equivalently, maximising) some objective function. This function is supposed to represent what the model knows or can get right. In practice, as one might expect, the objective function does not always reflect exactly what we want.

The objective function presents two main problems: 1) how do we minimise it (the answer is up for debate, and there is lots of interesting research on efficient optimisation of non-convex functions), and 2) assuming we can minimise it perfectly, is it the correct thing to be minimising?

It is point 2 which is the focus of this post.

Let’s take the example of linear regression with a square loss. We train a linear regression model by minimising the square loss $\mathcal{L}(\mathbf{w})=\sum_i (y_i - \mathbf{w}^T\mathbf{x}_i)^2$, where we take the inner product of the learned weights $\mathbf{w}$ with the feature vector $\mathbf{x}_i$ of each observation to predict the outcome $y_i$.…
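To make the square loss concrete, here is a minimal sketch in plain Python that evaluates it for a given weight vector. The function name `square_loss` and the toy data are made up for illustration and are not from the post; the sketch only evaluates the loss and makes no attempt to minimise it.

```python
def square_loss(w, xs, ys):
    """Sum of squared residuals: sum_i (y_i - w . x_i)^2."""
    total = 0.0
    for x, y in zip(xs, ys):
        # Inner product of the weights with this observation's features.
        prediction = sum(w_j * x_j for w_j, x_j in zip(w, x))
        total += (y - prediction) ** 2
    return total


# Hypothetical toy data: three observations with two features each,
# generated exactly by y = 1 * x[0] + 2 * x[1].
xs = [[1.0, 2.0], [2.0, 0.0], [3.0, 1.0]]
ys = [5.0, 2.0, 5.0]

w_exact = [1.0, 2.0]  # reproduces every outcome, so the loss is zero
w_off = [1.0, 1.0]    # under-weights the second feature

loss_exact = square_loss(w_exact, xs, ys)
loss_off = square_loss(w_off, xs, ys)
print(loss_exact, loss_off)
```

With these made-up numbers the first weight vector fits every observation, so its loss is zero. The second misses two of the three outcomes and is penalised by the squares of those misses.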

## Tales of Impossibility – The 2000 year quest to solve the mathematical problems of antiquity, by David S. Richeson – a review

NB: I was sent this book as a review copy. Four impossible puzzles, all described in detail during the height of classical Greek mathematics. All are simple to define and yet so tempting that it has taken not only the brain power of many thousands of mathematicians (amateur and professional alike), but also two millennia, to show that however hard you may try, these puzzles are just not possible. The puzzles are:

• Squaring the circle: With only a compass and a straight edge, draw a square with the same area as that of a given circle.
• Doubling the cube: With only a compass and a straight edge, draw the edge of a cube with volume twice that of a cube whose edge is given.
• Constructing regular polygons: Given a compass and a straight edge, construct a regular n-gon in a given circle for $n\ge 3$.
• Trisecting an angle: Given a compass and a straight edge, and a given angle, construct an angle that is one third of the original.