## The Objective Function

In both Supervised and Unsupervised machine learning, most algorithms are centered around minimising (or, equivalently) maximising some objective function. This function is supposed to somehow represent what the model knows/can get right. Normally, as one would expect, the objective function does not always reflect exactly what we want.

The objective function presents 2 main problems: 1. how do we minimise it (the answer to this is up for debate and there is lots of interesting research about efficient optimisation of non-convex functions and 2) assuming we can minimise it perfectly, is it the correct thing to be minimising?

It is point 2 which is the focus of this post.

Let’s take the example of square-loss-linear-regression.* *To do so we train a linear regression model with a square loss . (Where we are taking the inner product of learned weights with a vector of features for each observation to predict the outcome).…