## The Gradient Vector

Introduction

In this post we introduce two important concepts in multivariate calculus: the gradient vector and the directional derivative. These both extend the idea of the derivative of a function of one variable, each in a different way. The aim of this post is to clarify what these concepts are and how they differ, and to show that the directional derivative is maximised in the direction of the gradient vector.

The gradient vector is simply a vector of partial derivatives. So to find it, we can 1) find the partial derivatives and 2) put them into a vector. So far so good. Let's start on some familiar territory: a function of two variables.

That is, let $f: \mathbb{R}^2 \rightarrow \mathbb{R}$ be a function of two variables, $x$ and $y$. Then the gradient vector can be written as:

$\nabla f(x,y) = \left [ {\begin{array}{c} \frac{\partial f(x,y)}{\partial x} \\ \frac{\partial f(x,y)}{\partial y} \\ \end{array} } \right]$

For a more tangible example, let $f(x,y) = x^2 + 2xy$, then:

$\nabla f(x,y) = \left [ {\begin{array}{c} 2x + 2y \\ 2x \\ \end{array} } \right]$
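To sanity-check an example like this, we can compare the analytic partial derivatives against simple finite differences. A quick Python sketch (the helper names here are mine, not from the post):

```python
def f(x, y):
    # The example function f(x, y) = x^2 + 2xy
    return x**2 + 2*x*y

def grad_f(x, y):
    # Analytic gradient: [2x + 2y, 2x]
    return [2*x + 2*y, 2*x]

def numerical_grad(f, x, y, h=1e-6):
    # Central-difference approximation of each partial derivative
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return [df_dx, df_dy]

print(grad_f(1.0, 2.0))             # [6.0, 2.0]
print(numerical_grad(f, 1.0, 2.0))  # roughly [6.0, 2.0]
```

If the two disagree, either the algebra or the code is wrong, which makes this a handy habit when working with gradients.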

So far, so good. Now we can generalise this to a function $f: \mathbb{R}^n \rightarrow \mathbb{R}$ taking in a vector $\mathbf{x} = (x_1, x_2, x_3, \dots, x_n)$.…

## p-values (part 2): p-hacking, or why drinking red wine is not the same as exercising

What is p-hacking?

You might have heard about a reproducibility problem with scientific studies. Or you might have heard that drinking a glass of red wine every evening is equivalent to an hour’s worth of exercise.

Part of the reason that you might have heard about these things is p-hacking: 'torturing the data until it confesses'. The main driver is pressure on researchers to find positive results (as these are more likely to be published), but it can also arise from misapplication of statistical procedures or from bad experimental design.

Some of the content here is based on a more serious video from Veritasium: https://www.youtube.com/watch?v=42QuXLucH3Q. John Oliver has also spoken about this on Last Week Tonight, for those who are interested in some more examples of science that makes its way onto morning talk shows.

p-hacking can be done in a number of ways: basically, anything done, consciously or unconsciously, to produce statistically significant results where there aren't any.…
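One classic flavour of this is simply running lots of tests: even when every null hypothesis is true, some tests will come up 'significant' by chance. Here is a small Python simulation of my own (coin flips standing in for studies, with an exact binomial p-value helper) to illustrate:

```python
import math
import random

random.seed(42)

def binom_p_two_sided(k, n):
    # Exact two-sided p-value for k heads in n flips under H0: the coin is fair
    tail = min(k, n - k)
    p_tail = sum(math.comb(n, i) for i in range(tail + 1)) * 0.5 ** n
    return min(1.0, 2 * p_tail)

# 20 'studies', each flipping a fair coin 100 times: every null hypothesis
# is true, yet at alpha = 0.05 we expect about one false positive.
significant = 0
for study in range(20):
    heads = sum(random.random() < 0.5 for _ in range(100))
    if binom_p_two_sided(heads, 100) < 0.05:
        significant += 1
print(significant, "of 20 true-null studies came out 'significant'")
```

Report only the 'significant' studies and you have manufactured a finding out of pure noise, which is exactly the mechanism behind many of those morning-talk-show headlines.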

## A quick argument for why we don’t accept the null hypothesis

Introduction

When doing hypothesis testing, an often-repeated rule is ‘never accept the null hypothesis’. The reason for this is that we aren’t making probability statements about true underlying quantities, rather we are making statements about the observed data, given a hypothesis.

We reject the null hypothesis if the observed data would be unlikely under the null hypothesis. In a sense we are trying to disprove the null hypothesis, and the strongest thing we can say in its favour is that we fail to reject it.

That is because observing data that is not unlikely under a hypothesis does not make that hypothesis true. That is a bit of a mouthful, but basically what we are saying is that if we make some claim about the world and then see some data that does not disprove this claim, we cannot conclude that the claim is true.…
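To make this concrete: failing to reject is common even when the null hypothesis is outright false. Below is a small Python simulation of my own (not from the post) of a genuinely biased coin that a small experiment usually cannot distinguish from a fair one:

```python
import math
import random

random.seed(0)

def p_two_sided(k, n):
    # Exact two-sided binomial p-value under H0: the coin is fair
    tail = min(k, n - k)
    return min(1.0, 2 * sum(math.comb(n, i) for i in range(tail + 1)) * 0.5 ** n)

# The coin is genuinely biased (heads probability 0.55), so the null
# hypothesis 'the coin is fair' is false. Yet with only 50 flips per
# experiment, we usually fail to reject it.
n, trials = 50, 2000
rejections = sum(
    p_two_sided(sum(random.random() < 0.55 for _ in range(n)), n) < 0.05
    for _ in range(trials)
)
print(f"rejected the (false) null in {rejections / trials:.0%} of experiments")
```

Most of these experiments fail to reject a null hypothesis that is simply false, which is why 'failing to reject' can never be upgraded to 'accepting'.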

## p-values: an introduction (Part 1)

The starting point

This is the first of (at least) three posts on p-values. p-values are everywhere in statistics, especially in fields that require experimental design.

They are also pretty tricky to get your head around at first. This is because of the nature of classical (frequentist) statistics. So to motivate this I am going to talk about a non-statistical situation that will hopefully give some intuition about how to think when interpreting p-values and doing hypothesis testing.

My New Car

I want to buy a car. So I go down to the second-hand car dealership to get one. I walk around a bit until I find one that I like.

I think to myself: ‘this is a good car’.

Now because I am at a second-hand car dealership I find it appropriate to gather some data. So I chat to the lady there (looks like a bit of a scammer, but I am here for a deal) about the car.…

## R-squared values for linear regression

What we are talking about

Linear regression is a common and useful statistical tool. You will have almost certainly come across it if your studies have presented you with any sort of statistical problems.

The pros of regression are that it is relatively easy to implement and that the relationship between inputs and outputs is linear (it's in the name, but this simplifies the interpretation of the relationship significantly). On the downside, it relies fairly heavily on the frequentist interpretation of probability (which is a little counterintuitive), and it is very easy to draw erroneous conclusions from different models.

This post will deal with a measure of how good a model is: $R^2$. First, I will go through what this value means and what it measures. Then, I will discuss an example of why relying on $R^2$ is a dangerous game when it comes to linear models.
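As a preview of that danger, here is a Python sketch of my own (the helper functions are not from the post) computing $R^2 = 1 - SS_{res}/SS_{tot}$ from scratch, first on perfectly linear data and then on clearly nonlinear data, where the straight-line model still scores a high $R^2$:

```python
def linear_fit(xs, ys):
    # Ordinary least squares for y = a + b*x
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def r_squared(xs, ys):
    # R^2 = 1 - SS_res / SS_tot for the least-squares line
    a, b = linear_fit(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = list(range(1, 11))
linear_ys = [2 * x + 1 for x in xs]   # a perfectly linear relationship
curved_ys = [x ** 2 for x in xs]      # a clearly nonlinear relationship

print(r_squared(xs, linear_ys))  # 1.0
print(r_squared(xs, curved_ys))  # about 0.95, despite the wrong model
```

A high $R^2$ on the quadratic data says nothing about whether a straight line is the right model, which is exactly the trap discussed later.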

What you should know

Firstly, let’s establish a bit of context.…

## The definite integral

I realise now, in all the excitement of the FTC, that I hadn't written a post about the definite integral… that's shocking! OK, here we go. The plan for this post:

• Look at our Riemann sums and think about taking a limit of them
• Define the definite integral
• Look at a couple of theorems about the definite integral
• Do an example
• Look at properties of definite integrals

That’s quite a lot, but we are more or less going to follow along with Stewart. Stewart just has a slightly different style to mine, so I recommend reading his for more detail, and mine for potentially a bit more intuition.

So, let’s begin…

We have seen in previous lectures/sections/semesters/lives that we can approximate the area under a curve by splitting it up into rectangular regions. Here are examples of splitting up one function into rectangles (and, in the last one, trapezoids, but you don't have to worry about those).…
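The rectangle idea is easy to play with numerically. A quick Python sketch (my own helper, not from the post) using left, right, or midpoint rectangles:

```python
def riemann_sum(f, a, b, n, rule="right"):
    # Approximate the area under f on [a, b] with n rectangles
    dx = (b - a) / n
    if rule == "right":
        points = [a + i * dx for i in range(1, n + 1)]
    elif rule == "left":
        points = [a + i * dx for i in range(n)]
    else:  # midpoint
        points = [a + (i + 0.5) * dx for i in range(n)]
    return sum(f(x) for x in points) * dx

f = lambda x: x ** 2
# The true area under x^2 on [0, 1] is 1/3; more rectangles give a better estimate
for n in (10, 100, 1000):
    print(n, riemann_sum(f, 0, 1, n))
```

For an increasing function the left sums undershoot and the right sums overshoot, with the true area squeezed between them as $n$ grows, which is exactly the limit we are about to take.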

## The Fundamental Theorem of Calculus part 2 (part ii)

OK, get ready for some Calculus-Fu!

We have now said that rather than taking pesky limits of Riemann sums to calculate areas under curves (i.e. definite integrals), all we need to do is find an antiderivative of the function that we are looking at.

As a reminder, to calculate the definite integral of a continuous function, we have:

$\int_a^b f(x)dx=F(b)-F(a)$

where $F$ is any antiderivative of $f$

Remember that to calculate the area under the curve of $f(x)=x^4$ from, let's say, 2 to 5, we had to write:

$\int_2^5 x^4 dx=\lim_{n\rightarrow \infty}\sum_{i=1}^n f(x_i)\Delta x=\lim_{n\rightarrow \infty}\sum_{i=1}^n f\left(2+\frac{3i}{n}\right)\frac{3}{n}=\lim_{n\rightarrow\infty}\frac{3}{n}\sum_{i=1}^n\left(2+\frac{3i}{n}\right)^4$

And at that point we had barely even started because we still had to actually evaluate this sum, which is a hell of a calculation…then we have to calculate the limit. What a pain.
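To get a feel for how much work that limit hides, here is a quick numerical sketch in Python (the helper is my own) that grinds out right-endpoint Riemann sums for larger and larger $n$:

```python
def riemann_right(f, a, b, n):
    # Right-endpoint Riemann sum with n rectangles
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx

f = lambda x: x ** 4
for n in (10, 1000, 100000):
    print(n, riemann_right(f, 2, 5, n))
# The sums creep towards 618.6, the value the FTC hands us directly as F(5) - F(2)
```

Even at $n = 100{,}000$ the sum is only an approximation, whereas the antiderivative route gives the exact answer in one line.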

Now, we are told that all we have to do is to find any antiderivative of $f(x)=x^4$ and we are basically done.

Can we find a function which, when we take its derivative, gives us $x^4$?…

## The Fundamental Theorem of Calculus part 2 (part i)

OK, now we come to the part of the FTC that you are going to use most. We are finally going to show the direct link between the definite integral and the antiderivative. I know that you've been holding your breath until this moment. Get ready to breathe a sigh of relief:

The Fundamental Theorem of Calculus, Part 2 (also known as the Evaluation Theorem)

If $f$ is continuous on $[a,b]$ then

$\int_a^b f(x) dx=F(b)-F(a)$

where $F$ is any antiderivative of $f$, i.e. any function such that $F'=f$.

---

This means that, very excitingly, now to calculate the area under the curve of a continuous function we no longer have to do any ghastly Riemann sums. We just have to find an antiderivative!

OK, let’s prove this one straight away.

We’ll define:

$g(x)=\int_a^x f(t)dt$

and we know from the FTC part 1 how to take derivatives of this. It’s just $g'(x)=f(x)$. This says that $g$ is an antiderivative of $f$.…
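Before going further, it is worth convincing yourself numerically that $g'(x)=f(x)$. A Python sketch (the crude midpoint-rule integrator is my own, not from the post):

```python
import math

def integral(f, a, b, n=100000):
    # Midpoint-rule approximation of the definite integral
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

f = math.sin

def g(x):
    # g(x) = integral of f from a to x, with a = 0
    return integral(f, 0.0, x)

x, h = 1.0, 1e-4
approx_derivative = (g(x + h) - g(x)) / h
print(approx_derivative, f(x))  # both close to sin(1), about 0.8415
```

The difference quotient of the area function lands right on top of $f(x)$, which is the FTC part 1 in action.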

## The Fundamental Theorem of Calculus part 1 (part iii)

So, we are now ready to prove the FTC part 1. We’re going to follow the proof in Stewart and add in some discussion as we go along to motivate what we are doing. What we are going to prove is that:

$\frac{d}{dx} \int_a^x f(t) dt=f(x)$

for $x\in [a,b]$ when $f$ is continuous on $[a,b]$.

Proof:

We define $g(x)=\int_a^x f(t)dt$, and we want to find the derivative of $g$. We will do this using the limit definition of the derivative, so let's look at calculating this function at $x$ and at $x+h$, i.e. how much does it change when we change $x$ by a little bit?

$g(x+h)-g(x)=\int_a^{x+h}f(t) dt-\int_a^x f(t) dt$

But remember that the definite integral is just the area, so this difference is the area between $a$ and $x+h$ minus the area between $a$ and $x$, which is just the area between $x$ and $x+h$. Using the properties of integrals, we can write this formally as:

$g(x+h)-g(x)=\int_a^{x+h}f(t) dt-\int_a^x f(t) dt=\left(\int_a^{x}f(t)dt+\int_x^{x+h}f(t)dt\right)-\int_a^{x}f(t)dt=\int_x^{x+h}f(t)dt$

and we can write, for $h\ne 0$:

$\frac{g(x+h)-g(x)}{h}=\frac{1}{h}\int_x^{x+h}f(t)dt$

Restated, we can think of this as the area between x and x+h divided by h.…
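That 'area divided by width' is just the average value of $f$ over $[x, x+h]$, and as $h$ shrinks it homes in on $f(x)$. A quick Python sketch (with my own rough midpoint-rule integrator):

```python
import math

def integral(f, a, b, n=10000):
    # Midpoint-rule approximation of the definite integral
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

f = math.exp
x = 1.0
for h in (1.0, 0.1, 0.01, 0.001):
    average_value = integral(f, x, x + h) / h
    print(h, average_value)
# As h shrinks, the average value over [x, x+h] closes in on f(1) = e
```

This is exactly the limit the proof is building towards: the average value over a shrinking interval becomes the value of $f$ at the point itself.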