Introduction

A key consideration when analysing stratified data is how the behaviour of each category differs and how these differences might influence the overall observations about the data. For example, a data set might contain one large category that dictates the overall behaviour, or a category whose statistics differ significantly from those of the other categories and so skew the overall numbers. It is important to be aware of these features of the data, and to look for them, to avoid drawing erroneous conclusions from your analysis. Paying attention to context, the source of the data and a careful analysis of the data can prevent this. Simpson’s paradox is an interesting result of some of these effects.

Simpson’s paradox arises in statistics when a trend appears in each of a number of different groups but disappears, or even reverses, when the groups are combined.

Observing the overall data might therefore lead us to draw a conclusion, but when the data is grouped we might conclude something different.…
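To make the reversal concrete, here is a minimal sketch in Python. The counts are invented purely for illustration (they do not come from any real study), and the labels "A", "B", "group 1" and "group 2" are placeholders. Option A has the higher success rate inside each group, yet option B has the higher rate once the groups are pooled, because each option has most of its trials in a different group.

```python
from fractions import Fraction

# Hypothetical (successes, trials) counts, invented for illustration only.
counts = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}


def rate(successes, trials):
    """Success rate as an exact fraction."""
    return Fraction(successes, trials)


# Within each group, option A has the higher success rate.
for group, options in counts.items():
    a = rate(*options["A"])
    b = rate(*options["B"])
    print(f"{group}: A = {float(a):.2f}, B = {float(b):.2f}")


def pooled(option):
    """Pool the groups by summing successes and trials for one option."""
    successes = sum(counts[g][option][0] for g in counts)
    trials = sum(counts[g][option][1] for g in counts)
    return rate(successes, trials)


overall_a = pooled("A")
overall_b = pooled("B")
print(f"overall: A = {float(overall_a):.2f}, B = {float(overall_b):.2f}")
```

The group sizes are what drive the effect here: most of A's trials fall in the group where both options do poorly, while most of B's trials fall in the group where both do well.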


## What did you expect? Some notes on the Expectation operator.

Introduction

A significant amount of focus in statistics is on making inferences about the averages or means of phenomena. For example, we might be interested in the average number of goals scored per game by a football team, the average global temperature, or the average cost of a house in a particular area.

The two types of averages that we usually focus on are the sample mean from a set of data and the expectation that comes from a probability distribution. For example, if three men weigh 70kg, 80kg, and 90kg respectively, then the sample mean of their weight is $\bar x = \frac{70+80+90}{3} = 80$kg. Alternatively, if we say that the arrival times of trains are exponentially distributed with parameter $\lambda = 3$, we can use the properties of the exponential distribution to find the mean (or expectation). In this case the mean is $\mu = \frac{1}{\lambda} = \frac{1}{3}$.

It is this second kind of mean (which we will call the expectation from now on), along with the generalisation to taking the expectation of functions of random variables, that we will focus on.…
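The two kinds of average can be compared in a short Python sketch. It computes the sample mean of the three weights, then checks the exponential expectation $\frac{1}{\lambda}$ by simulation. The seed, the number of draws and the choice of $g(x) = x^2$ as an example function are assumptions made for illustration; for an exponential distribution, $E[X^2] = \frac{2}{\lambda^2}$.

```python
import random
from statistics import mean

# Sample mean of the three weights from the text (in kg).
weights = [70, 80, 90]
sample_mean = mean(weights)

# Expectation of an exponential distribution with rate lambda = 3.
rate = 3
expectation = 1 / rate

# Monte Carlo check: average many simulated draws.
# The seed and the number of draws are arbitrary choices.
rng = random.Random(0)
draws = [rng.expovariate(rate) for _ in range(100_000)]
simulated_mean = mean(draws)

# Expectation of a function of the random variable, here g(x) = x**2.
# For the exponential distribution the exact value is 2 / lambda**2.
expectation_of_square = 2 / rate**2
simulated_mean_of_square = mean(x**2 for x in draws)

print("sample mean of weights:", sample_mean)
print("E[X]   exact vs simulated:", expectation, simulated_mean)
print("E[X^2] exact vs simulated:", expectation_of_square, simulated_mean_of_square)
```

The simulated averages should land close to the exact values, but they are estimates and will change with the seed and the number of draws.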

## A quick argument for why we don’t accept the null hypothesis

Introduction

When doing hypothesis testing, an often-repeated rule is ‘never accept the null hypothesis’. The reason for this is that we aren’t making probability statements about true underlying quantities; rather, we are making statements about the observed data, given a hypothesis.

We reject the null hypothesis if data like those we observed would be unlikely given the null hypothesis. In a sense we are trying to disprove the null hypothesis, and the strongest thing we can say about it is that we fail to reject it.

That is because observing data that is not unlikely given that a hypothesis is true does not make that hypothesis true. That is a bit of a mouthful, but basically what we are saying is that if we make some claim about the world and then we see some data that does not disprove this claim, we cannot conclude that the claim is true.…
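A small calculation illustrates the point. The setting is hypothetical and chosen only for illustration: a coin is tossed $n = 20$ times, the null hypothesis is that it is fair ($p = 0.5$), the significance level is 0.05, and the coin is in fact biased with $p = 0.6$, so the null hypothesis is false. The Python sketch below uses an exact two-sided binomial test (summing the probabilities of all outcomes no more likely than the observed one) and computes how often we would still fail to reject.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k heads in n tosses with head probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(k, n, p0=0.5):
    """Exact two-sided p-value: total probability under the null of all
    outcomes that are no more likely than the observed one."""
    observed = binom_pmf(k, n, p0)
    return sum(
        binom_pmf(i, n, p0)
        for i in range(n + 1)
        if binom_pmf(i, n, p0) <= observed * (1 + 1e-9)
    )


n = 20         # number of tosses (assumed for illustration)
alpha = 0.05   # significance level (assumed for illustration)
true_p = 0.6   # hypothetical truth: the coin is biased, so the null is false

# Numbers of heads for which we would fail to reject the null hypothesis.
not_rejected = [k for k in range(n + 1) if two_sided_p_value(k, n) > alpha]

# Probability of landing in that region when the coin is actually biased.
prob_fail_to_reject = sum(binom_pmf(k, n, true_p) for k in not_rejected)

print("fail to reject for heads in:", not_rejected[0], "to", not_rejected[-1])
print("P(fail to reject | p = 0.6):", round(prob_fail_to_reject, 3))
```

Under these assumptions the test fails to reject a false null hypothesis most of the time, which is why failing to reject cannot be read as evidence that the null hypothesis is true.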

## p-values: an introduction (Part 1)

The starting point

This is the first of (at least) 3 posts on p-values. p-values are everywhere in statistics, especially in fields that require experimental design.

They are also pretty tricky to get your head around at first. This is because of the nature of classical (frequentist) statistics. So to motivate this I am going to talk about a non-statistical situation that will hopefully give some intuition about how to think when interpreting p-values and doing hypothesis testing.

My New Car

I want to buy a car. So I go down to the second-hand car dealership to get one. I walk around a bit until I find one that I like.

I think to myself: ‘this is a good car’.

Now because I am at a second-hand car dealership I find it appropriate to gather some data. So I chat to the lady there (looks like a bit of a scammer, but I am here for a deal) about the car.…

## Learn Wolfram Mathematica in the Cloud Part 6

Today we delve into Associations, aka Dictionaries in languages like Python.


## Learn Wolfram Mathematica in the Cloud part 5

Let’s do some list FU, a kind of Kung Fu with Wolfram language lists


## Learn Wolfram Mathematica in the cloud part 4

Diving deeper into lists


## Learn Wolfram Mathematica in the cloud part 3

Dipping into Lists


## Learn Wolfram Mathematica Part 2

Today we see how to use Wolfram language as a Calculator using the Notebook environment


## Introduction to Wolfram Mathematica programming

Mathematica is becoming an indispensable tool for doing all kinds of computation, and it is important to know how to use it, as it will leverage your problem-solving skills, allowing you to focus on higher-level issues of modelling solutions rather than on calculational details.

I will be posting short lessons regularly and the good news is you don’t need to install anything locally as all examples can be run online. If you need to do any form of extensive programming you can always go to the Wolfram Cloud and click on Programming Lab and get access for free.

Instructions on running code

If you don’t have a Wolfram ID, create one, as this will give you access to the Wolfram Cloud. After you have done this, sign in. If you already have one, simply sign in to the Wolfram Cloud.

If you have downloaded the notebook file from the blog and saved it somewhere then do the following to upload it to the cloud so you can play with the code.…