## What is mathematics?

Below are some thoughts on this broad question, which I encourage you to think about. What is your vision of mathematics? It is most probably the result of your own experience with the subject, including any traumas that happened along the way. Realizing this could make you more conscious of your relationship with mathematics and of the walls you might have built against it, or against parts of it. By understanding these biases and blockages, and by thinking objectively about the value of mathematics, you can allow yourself to acquire the full set of skills it offers for building your own greatest life.

Many of the thoughts on this topic below are taken from a paper I recommend reading: Teaching and Learning “What is Mathematics and why we should ask, where one should experience and learn that and how to teach it”, by Gunter M.…

## Reasoning and making sense: a pillar of mathematics?

An essential part of learning mathematics is about reasoning and making sense. What does this exactly mean?

When a student is given a problem, they need to make sense of it from their own perspective, which is unique to each individual. This will often come with a big struggle, and the important next step is to stay motivated and curious, to persevere, and not to give up after the first few attempts. This might also require a healthy relationship with mistakes.

Students will have to develop their own strategy to solve a given problem. That might mean first translating it into their own language, using their own words and background knowledge to understand the actual question they are attempting to answer.

They will have to build bridges in their mind to similar problems they have solved in the past, even though these might seem different. Such bridges become easier to build with practice and experience. Sometimes a bridge will not work, and other connections will need to be created until a suitable one is found.…

## The Wisdom of the Crowds

This content comes primarily from the notes of Mark Herbster (contributed to by Massi Pontil and John Shawe-Taylor) of University College London.

Introduction

The Wisdom of the Crowds, or majority rule, and related ideas come up pretty often. Democracy is based (partly) on the majority of people being able to make the correct decision. You might often make decisions in a group of friends based on what most people want, and it is logical to take popular opinion into account when reasoning about issues where you have imperfect information. On the other hand, of course, there is the Argumentum ad Populum fallacy, which reminds us that a popular belief isn’t necessarily true.

This idea also appears in applied machine learning: ensemble methods such as Random Forests, Gradient Boosted Models (especially XGBoost) and stacking of Neural Networks have resulted in overall more powerful models. This is especially notable in Kaggle competitions, where it is almost always an ensemble model (a combination of models) that achieves the best score.…
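As a rough illustration of why combining voters (or models) can help, here is a minimal sketch that computes the probability that a strict majority is correct. It assumes every voter is right independently with the same probability, which real ensembles only approximate; the function name `majority_accuracy` and the value 0.6 are illustrative choices, not taken from the notes.

```python
from math import comb


def majority_accuracy(n_voters: int, p_correct: float) -> float:
    """Probability that a strict majority of independent voters is right.

    Assumes each voter is correct with probability p_correct,
    independently of all the others.
    """
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    smallest_majority = n_voters // 2 + 1
    # Sum the binomial probabilities of every outcome with a correct majority.
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(smallest_majority, n_voters + 1)
    )


# Compare a single voter with increasingly large crowds.
for n in (1, 3, 11, 101):
    print(n, round(majority_accuracy(n, 0.6), 4))
```

Under these assumptions the majority becomes more reliable as the crowd grows, provided each voter is right more than half the time. If voters are wrong more often than right, the same calculation shows the crowd getting worse.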

## Automatic Differentiation

Much of this content is based on lecture slides from Professor David Barber at University College London. Related resources can be found at: www.cs.ucl.ac.uk/staff/D.Barber/brml

What is Autodiff?

Autodiff, or Automatic Differentiation, is a method of determining the exact derivative of a function with respect to its inputs. It is widely used in machine learning. In this post I will give an overview of what autodiff is and why it is a useful tool.

The above is not a very helpful definition, so we can compare autodiff first to symbolic differentiation and numerical approximations before going into how it works.

Symbolic differentiation is what we do when we calculate derivatives by hand, i.e. given a function $f$, we find a new function $f'$. This is really good when we want to know how functions behave across all inputs. For example, if we had $f(x) = x^2 + 3x + 1$, we can find the derivative as $f'(x) = 2x + 3$ and then evaluate the derivative at any value of $x$.…
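To make the example concrete, here is a minimal sketch that checks the hand-derived $f'(x) = 2x + 3$ against a tiny forward-mode autodiff built from dual numbers. The `Dual` class and the `derivative` helper are illustrative names, not part of any library, and this is only one of several ways autodiff can be implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to the input."""

    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(
            self.value * other.value,
            self.deriv * other.value + self.value * other.deriv,
        )

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, whose derivative is zero."""
    return x if isinstance(x, Dual) else Dual(float(x))


def f(x):
    return x * x + 3 * x + 1


def derivative(func, x):
    """Evaluate func at x while carrying the derivative along."""
    return func(Dual(x, 1.0)).deriv


x0 = 2.0
print(derivative(f, x0), 2 * x0 + 3)  # both columns should agree
```

The point of the sketch is that `f` is written as ordinary code, and the derivative at a single point falls out of evaluating it, without ever forming the symbolic expression $2x + 3$.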


## What did you expect? Some notes on the Expectation operator.

Introduction

A significant amount of focus in statistics is on making inferences about the averages or means of phenomena. For example, we might be interested in the average number of goals scored per game by a football team, the average global temperature, or the average cost of a house in a particular area.

The two types of averages that we usually focus on are the sample mean from a set of data and the expectation that comes from a probability distribution. For example, if three men weigh 70kg, 80kg, and 90kg respectively, then the sample mean of their weight is $\bar x = \frac{70+80+90}{3} = 80$. Alternatively, if we say that the arrival times of trains are exponentially distributed with parameter $\lambda = 3$, we can use the properties of the exponential distribution to find the mean (or expectation). In this case the mean is $\mu = \frac{1}{\lambda} = \frac{1}{3}$.
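Both kinds of mean can be checked numerically. The sketch below computes the sample mean of the three weights directly, and approximates the exponential expectation by averaging a large number of simulated draws. The helper name `simulated_exponential_mean`, the sample size and the seed are arbitrary choices made for illustration.

```python
import random
from statistics import mean

# Sample mean: computed directly from observed data.
weights_kg = [70, 80, 90]
sample_mean = mean(weights_kg)


def simulated_exponential_mean(rate, n_samples=200_000, seed=0):
    """Approximate the expectation of an Exponential(rate) variable
    by averaging many simulated draws."""
    rng = random.Random(seed)
    total = sum(rng.expovariate(rate) for _ in range(n_samples))
    return total / n_samples


print(sample_mean)
print(simulated_exponential_mean(3.0), 1 / 3)  # simulation vs. 1 / lambda
```

The simulated average only approximates $\frac{1}{\lambda}$; the expectation itself is a property of the distribution, not of any particular sample.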

It is this second kind of mean (which we will call the expectation from now on), along with the generalisation of taking the expectation of functions of random variables, that we will focus on.…

## What’s the shortest known Normal Number?

Well, the answer is that it has to be infinitely long, but the real question is: what is the most compact possible form of a Normal Number?

I was motivated to look into this by a lovely Numberphile video about all the real numbers.

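Normal numbers in base 10 are those whose decimal expansion contains every finite string of digits, each with the limiting frequency you would expect from random digits. In particular, you can find every natural number in the expansion, which is the property used here.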

Champernowne’s number is a very simple example of this where it is simply written as:

0.12345678910111213…etc.

I thought that it might be interesting to see if one could write a more compact Normal Number, but using a similar procedure to Champernowne. I haven’t seen this done anywhere else. For example, in the above expression, you don’t need to include the 12 explicitly as it’s already there at the beginning. You could write

0.12345678910113

So you skip the 12, and 11 and 13 merge into 113. We will do all of this with the list of digits, rather than with the number in base 10.…
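The skipping and merging described above can be sketched as a simple greedy procedure: go through the natural numbers in order, skip any number whose digits already appear, and otherwise append only the digits not already covered by the end of the string. This is an illustrative guess at the procedure, with a made-up function name `compact_digits`; it is not claimed to give the shortest possible string.

```python
def compact_digits(limit: int) -> str:
    """Greedily build a digit string containing every number 1..limit."""
    digits = ""
    for n in range(1, limit + 1):
        s = str(n)
        if s in digits:
            continue  # already present somewhere, e.g. 12 in "123..."
        # Find the longest proper prefix of s that the string already ends with.
        overlap = 0
        for k in range(len(s) - 1, 0, -1):
            if digits.endswith(s[:k]):
                overlap = k
                break
        digits += s[overlap:]  # append only the digits that are missing
    return digits


print("0." + compact_digits(13))
```

Run up to 13, this reproduces the string 0.12345678910113 from the example: 12 is skipped, and 13 reuses the final 1 of 11.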

## A quick argument for why we don’t accept the null hypothesis

Introduction

When doing hypothesis testing, an often-repeated rule is ‘never accept the null hypothesis’. The reason for this is that we aren’t making probability statements about true underlying quantities; rather, we are making statements about the observed data, given a hypothesis.

We reject the null hypothesis if the observed data is unlikely to be observed given the null hypothesis. In a sense we are trying to disprove the null hypothesis and the strongest thing we can say about it is that we fail to reject the null hypothesis.

That is because observing data that is not unlikely given that a hypothesis is true does not make that hypothesis true. That is a bit of a mouthful, but basically what we are saying is that if we make some claim about the world and then we see some data that does not disprove this claim, we cannot conclude that the claim is true.…
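A small numerical sketch may help here. Suppose, purely hypothetically, that we flip a coin 10 times and see 6 heads. The code below computes an exact two-sided binomial p-value under two different hypotheses about the probability of heads, 0.5 and 0.6. The data, the two hypotheses, and the particular definition of "two-sided" (summing all outcomes no more likely than the observed one) are assumptions made for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(k_observed, n, p_null):
    """Total probability, under the null, of outcomes that are
    no more likely than the one actually observed."""
    observed = binom_pmf(k_observed, n, p_null)
    return sum(
        binom_pmf(k, n, p_null)
        for k in range(n + 1)
        # small tolerance so exact ties survive floating-point rounding
        if binom_pmf(k, n, p_null) <= observed * (1 + 1e-9)
    )


for p_null in (0.5, 0.6):
    print(p_null, two_sided_p_value(6, 10, p_null))
```

Neither hypothesis is rejected at the usual 5% level, yet they cannot both be true. Failing to reject a hypothesis therefore cannot be the same thing as showing that it holds.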

## Cantor–Schröder–Bernstein Theorem

Knowledge this post assumes: what a set is, set cardinality, a function, the image of a function and an injective (one-to-one) function.

David Hilbert imagines a hotel with an infinite number of rooms. In this hotel, each room can only be occupied by one guest, and each room is indeed occupied by exactly one guest. What happens if more guests show up? Can they be accommodated?

PAUSE: WHAT DO YOU THINK AND WHY?

Suppose we propose that they cannot be accommodated, since all the rooms are occupied. Hilbert then claims that he can define the functions $f: A \to B$ and $g: B \to C$, where $A$ is the set of all current guests, $f$ maps each guest to a room in the set $B$, and $g$ maps each room in $B$ to a new one in $C$. Notice that these functions must be injective: if two guests are assigned the same room, they must in fact be the same guest; recall $f(a) = f(b) \implies a = b$.…
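One standard choice for the room-to-room map, assumed here for illustration and not necessarily the one this post goes on to use, is to number the rooms 1, 2, 3, … and send room $n$ to room $n + 1$. A program cannot check infinitely many rooms, so the sketch below only verifies injectivity on a finite sample; `shift_room` is a hypothetical name.

```python
def shift_room(room: int) -> int:
    """Move the guest in a given room to the next room along."""
    return room + 1


# A finite window of rooms; the real hotel has infinitely many.
rooms = range(1, 1001)
new_rooms = [shift_room(r) for r in rooms]

# Injective on the sample: no two guests end up in the same room.
is_injective_on_sample = len(set(new_rooms)) == len(new_rooms)

# Room 1 is not the image of any room, so it is free for a new guest.
room_one_is_free = 1 not in new_rooms

print(is_injective_on_sample, room_one_is_free)
```

In a finite hotel the guest in the last room would have nowhere to go, which is exactly where the infinite case differs.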

## 1.6 Partitions

Recall the relation $\equiv \text{ mod} (4)$ on the set $\mathbb{Z}.$

One of the equivalence classes is $[0] = \{ ..., -8, -4, 0, 4, 8, ...\},$ which means we can equally write $[0] = [4] = [-4] = [8] = [-8] = ...$

We can do this because the equivalence class collects all the integers that are related to zero under the relation $\equiv \text{ mod} (4).$

The following theorem generalises this idea to any equivalence relation, including $\equiv \text{ mod} (n)$ on the set $\mathbb{Z}$ for any positive integer $n.$

Let $R$ be an equivalence relation on set $A.$ If $a, b \in A,$ then $[a] = [b] \iff aRb.$

Essentially, the equivalence classes $[a]$ and $[b]$ are equal if the elements $a, b \in A$ are related under the relation $R.$ Conversely, knowing that the equivalence classes $[a] = [b]$ are equal tells us that $a$ and $b$ are related under $R.$

The equivalence relation $\equiv \text{ mod} (n)$ divides the set $A$ it is defined on into $n$ equivalence classes. We call this situation a partition of the set $A.$

A partition of a set $A$ is defined as a set of non-empty subsets of $A,$ such that both these conditions are simultaneously satisfied:

(i) the union of all these subsets equals $A.$

(ii) the intersection of any two different subsets is the empty set $\emptyset.$

Let’s return to our example: $\equiv \text{ mod} (4)$ on the set $\mathbb{Z}.$ We could represent this set as:

• NOTE: Each equivalence class above represents an infinite set and despite the drawing suggesting $[0]$ is larger than $[3]$ for instance, this is not true.
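The two conditions in the definition can be checked mechanically on a finite window of integers. The sketch below groups the integers from $-8$ to $8$ by their remainder mod 4 and then tests conditions (i) and (ii). The window and the helper names `equivalence_classes_mod` and `is_partition` are illustrative, and a finite check like this does not replace the proof for the infinite set.

```python
def equivalence_classes_mod(n, elements):
    """Group elements by remainder mod n; each group is one class."""
    classes = {}
    for x in elements:
        # Python's % always returns a remainder in 0..n-1, even for negatives.
        classes.setdefault(x % n, set()).add(x)
    return classes


def is_partition(subsets, whole):
    """Check the definition: non-empty subsets, union equal to the
    whole set, and pairwise empty intersections."""
    subsets = list(subsets)
    non_empty = all(subsets)
    covers = set().union(*subsets) == set(whole)
    disjoint = all(
        a.isdisjoint(b)
        for i, a in enumerate(subsets)
        for b in subsets[i + 1:]
    )
    return non_empty and covers and disjoint


window = range(-8, 9)
classes = equivalence_classes_mod(4, window)
print(sorted(classes[0]))
print(is_partition(classes.values(), window))
```

Within this window the class of 0 contains exactly the multiples of 4, matching $[0]$ above, and the four classes together satisfy both conditions.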

## Review: Calculus Reordered

Book title: Calculus Reordered: A History of the Big Ideas
Author: David M. Bressoud

Princeton University Press

Discussions of the history of a field are usually dry, wordy and, when you are studying the field yourself, generally hard to read. This is because they are usually geared towards a general audience, and in doing so most authors tend to strip away the very exciting technical details. I expected the same treatment from this author, but I was pleasantly surprised.

The book contains five chapters:

1) Accumulations
2) Ratios of Change
3) Sequences of Partial Sums
4) The Algebra of Inequalities
5) Analysis

Each of these chapters covers a central theme, but they are not at all disjoint. For instance, the last three contain the history of concepts that would normally be found in a first course in Real Analysis, while the first two sit at the more applied end of the spectrum and serve as some form of motivation for going through all this trouble, although they can certainly stand on their own.…