Introduction

In this post we will have a look at Parrondo's paradox. In a paper* entitled “Information Entropy and Parrondo’s Discrete-Time Ratchet”**, the authors demonstrate a situation where, by switching between 2 losing games, we can create a winning strategy.

Setup

The setup to this paradox is as follows:

We have 2 games that we can play – if we win, we gain 1 unit of wealth; if we lose, it costs us 1 unit of wealth. Game A pays out with probability slightly less than 0.5. Clearly, if we play this game for long enough, we will end up losing.

Game B is a little more complicated in that it is defined with reference to our existing winnings. If our current wealth is a multiple of M, we play a game where the probability of winning is slightly less than 0.1; if it is not a multiple of M, the probability of winning is slightly less than 0.75. This game is also a losing game (although this is not obvious from the outset).
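To make this concrete, here is a minimal sketch of the two games in Python, using the parameter values from the original paper (introduced properly in the next section). The function names and defaults are my own, not the paper's:

```python
import random

def play_game_a(wealth, p=0.495):
    # Game A: win 1 unit of wealth with probability p, otherwise lose 1 unit.
    return wealth + 1 if random.random() < p else wealth - 1

def play_game_b(wealth, M=3, p_mult=0.095, p_other=0.745):
    # Game B: the win probability depends on whether our current wealth
    # is a multiple of M.
    p = p_mult if wealth % M == 0 else p_other
    return wealth + 1 if random.random() < p else wealth - 1
```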

Now, we have 2 losing games – how do we win? By switching between playing Game A and Game B.

Some simulations

We now simulate playing each of the games described above (i.e. A and B), as well as a couple of strategies for switching between them. The graphs below show histograms of the results of 1000 gambles of length 50000 (i.e. 1000 different examples of playing each game for 50000 discrete time steps) for each strategy. We have specified the parameters of the games as per the original paper: M=3, p=0.495, p_{12}=0.745, p_{3}=0.095, where p is the probability of winning for Game A, and p_{3} and p_{12} are the probabilities of winning in Game B when our wealth is a multiple of 3 or not, respectively. We start with wealth of 99 and run the simulations below.
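A simulation along these lines can be sketched as follows. This is my own minimal harness rather than the paper's code; note that 1000 runs of 50000 steps is slow in pure Python, so you may want to reduce the numbers when trying it out:

```python
def simulate(strategy, n_steps=50_000, n_runs=1_000, start=99):
    # Run n_runs independent gambles of n_steps each and return the
    # final wealth of each run. `strategy` maps (wealth, time step)
    # to the new wealth.
    finals = []
    for _ in range(n_runs):
        wealth = start
        for t in range(n_steps):
            wealth = strategy(wealth, t)
        finals.append(wealth)
    return finals

# Playing Game A only:
final_wealth_a = simulate(lambda wealth, t: play_game_a(wealth))
```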

Strategy A

[Figure: histogram of final wealth for Strategy A]

Strategy B

[Figure: histogram of final wealth for Strategy B]

As can be seen, these lose on most occasions (if we let the game go on for long enough we will always lose).

Now, what if we switch between the strategies according to a specified pattern? And what if we play each game with a probability of 50%?
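Before looking at the results, here is how such switching strategies might be expressed on top of the sketch above (again, an illustration of the idea rather than the paper's code):

```python
def periodic_strategy(pattern):
    # Play the games in a repeating pattern, e.g. "BBABA" or "AABB".
    def strategy(wealth, t):
        game = pattern[t % len(pattern)]
        return play_game_a(wealth) if game == "A" else play_game_b(wealth)
    return strategy

def random_strategy(wealth, t):
    # Pick Game A or Game B with equal probability at every step.
    return play_game_a(wealth) if random.random() < 0.5 else play_game_b(wealth)

final_wealth_bbaba = simulate(periodic_strategy("BBABA"))
final_wealth_random = simulate(random_strategy)
```

Repeating the "BBABA" block gives the BBABABBABA… sequence used in the first switching strategy below.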

Strategy BBABABBABABBABA….

[Figure: histogram of final wealth for strategy BBABA…]

Strategy AABBAABBAABB…

[Figure: histogram of final wealth for strategy AABB…]

Switching randomly with equal probability

[Figure: histogram of final wealth for random switching]

As can be seen, these win on most occasions. So, by switching between games, we have managed to turn a combination of 2 losing strategies into a winning one. This doesn’t work for all switching strategies, however. Consider the extreme case of playing Game A 99999 times in a row and then playing Game B once – this is too close to playing Game A alone for Game B to have any impact. What is especially surprising is that even switching randomly gives a winning game. The following section gives an explanation, using Markov chains, of why Game B is losing but switching randomly between Game A and Game B is not.

Some maths

Now that we have seen this phenomenon at work, let’s have a look at a bit of the maths.

To see that Game A is losing in the long run is easy – the probability of winning is lower than the probability of losing in all situations.

To see that Game B is losing is a little less intuitive. The high chance of winning in certain circumstances makes it seem that this strategy may well be winning if those situations crop up often enough. Taking the parameters that we used in the simulations, we can crudely estimate the chance of winning: in one third of cases (i.e. where our wealth is a multiple of 3) we have a 9.5% chance of winning, and in two thirds of cases we have a 74.5% chance of winning. This gives us an overall probability of winning of:

\frac{1}{3} \times 0.095 + \frac{2}{3} \times 0.745 \approx 0.53. This is more than 0.5, so intuitively the strategy should be a winning one. But, as the simulations above showed, it is not.

The reason for this crude approximation being wrong is that the probability of our total wealth being a multiple of 3 (and thus us playing the game that is more biased against us) is more than \frac{1}{3}. To show that this strategy is losing we can use a Markov Chain***.

Assuming we start at a point where our wealth is a multiple of 3, it suffices to show that we are more likely to lose 3 units of wealth than to gain 3. This is because, if we start at a multiple of 3 and get down to the next multiple of 3 below more often than we get up to the next multiple of 3 above (i.e. back to the same situation in terms of the gamble taken, but with less wealth), the game is losing in general. The Markov Property**** is what allows this argument to hold. We can use the Markov chain below to represent the situation: we start at 0 (which stands for any multiple of 3) and move to 1 with probability p_3 = 0.095 and to -1 with probability 1-p_3 = 0.905. In any other state we move up with probability p_{12} = 0.745 and down with probability 1-p_{12} = 0.255. We treat 3 and -3 as absorbing states, i.e. once reached they return to themselves with probability 1, because at that point we have enough information to decide whether the game is winning: if we reach 3 with a higher probability than we reach -3, the game is winning; otherwise it is not.

Now, we can work out our probabilities as follows: let z_j be the probability of reaching state 3 (before being absorbed at -3), starting from state j. We derive the following system of equations:

z_{-2} = p_{12} z_{-1}\\
z_{-1} = (1-p_{12}) z_{-2} + p_{12} z_0\\
z_0 = (1-p_3) z_{-1} + p_3 z_1\\
z_1 = (1-p_{12}) z_0 + p_{12} z_2\\
z_2 = (1-p_{12}) z_1 + p_{12}

[Figure: Markov chain for Game B]

Now, these probabilities are all defined with reference to one another, and this is a solvable set of equations. We are most interested in z_0, as this is the one that tells us whether we expect to win on average or not. To get some intuition into how these equations work, let’s consider z_{-1}, the probability that we win 3 before losing 3 given that we have already lost 1. We can see that the expression:

z_{-1} = (1-p_{12}) z_{-2} + p_{12} z_0

is a weighted sum of the probabilities z_{-2} and z_0, weighted by the probabilities of moving to those states. That is to say, if we are 1 unit of currency down and win (with probability p_{12}), we move to 0 units of currency and from there have probability z_0 of winning 3 before losing 3. The same logic applies if we lose, and to all of the remaining states.

We are going to gloss over the solving of the equations and focus on the solution for z_0. The expression we get is:

z_0 = \frac{p_3 p_{12}^2}{(1-p_3)(1-p_{12})^2 + p_3 p_{12}^2}

and plugging in p_{12} = 0.745 and p_3 = 0.095

we get:

z_0 \approx 0.47
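If you would rather not solve the system by hand, it can also be checked numerically. Below is a minimal sketch using numpy to solve the five equations directly (the state ordering and variable names are my own):

```python
import numpy as np

p12, p3 = 0.745, 0.095

# Unknowns, in order: z_{-2}, z_{-1}, z_0, z_1, z_2.
# Each row rearranges one equation into the form A @ z = b,
# using the absorbing states z_{-3} = 0 and z_3 = 1.
A = np.array([
    [1.0,      -p12,      0.0,       0.0,      0.0],  # z_{-2} = p12 z_{-1}
    [-(1-p12),  1.0,     -p12,       0.0,      0.0],  # z_{-1} = (1-p12) z_{-2} + p12 z_0
    [0.0,      -(1-p3),   1.0,      -p3,       0.0],  # z_0 = (1-p3) z_{-1} + p3 z_1
    [0.0,       0.0,     -(1-p12),   1.0,     -p12],  # z_1 = (1-p12) z_0 + p12 z_2
    [0.0,       0.0,      0.0,      -(1-p12),  1.0],  # z_2 = (1-p12) z_1 + p12
])
b = np.array([0.0, 0.0, 0.0, 0.0, p12])

z = np.linalg.solve(A, b)
print(round(z[2], 3))  # z_0 ~ 0.473 < 0.5, so Game B is losing
```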

So this tells us that, contrary to our crude estimate above, the probability of winning in this game is less than 0.5. Starting from any multiple of 3 in our wealth, we have a less than 50% chance of reaching the higher multiple of 3 before we reach the lower multiple of 3 – and once we reach that lower multiple of 3, we are in the same position (in terms of the game’s dynamics) as we were at the start, and the process then repeats.

Alright, now that we know that A is a losing game and we have convinced ourselves that B is a losing game, we can have a look at one of the strategies that we simulated above: randomly switching between A and B.

[Figure: Markov chain for switching randomly between A and B]

We will use much the same method as above to show this, so all we need to do is find q_{12} and q_3, the analogues of p_{12} and p_3 for the combined game. For each of these, half the time we play Game A and the other half of the time we play Game B, so they are easily computed:

q_{12} = 0.5 \times 0.495 + 0.5 \times 0.745 = 0.62\\
q_3 = 0.5 \times 0.495 + 0.5 \times 0.095 = 0.295

We can use the same formula for z_0 above to get, for randomly switching between A and B:

z_{0} \approx 0.53
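Again, this is easy to verify with a quick numeric check, reusing the closed-form expression above:

```python
q12 = 0.5 * 0.495 + 0.5 * 0.745   # 0.62
q3  = 0.5 * 0.495 + 0.5 * 0.095   # 0.295

z0 = q3 * q12**2 / ((1 - q3) * (1 - q12)**2 + q3 * q12**2)
print(round(z0, 3))  # ~0.527 > 0.5, so the combined game is winning
```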

Since this is greater than 0.5 we now have a winning game!

Essentially, even though Game A is losing, by adding it in randomly amongst Game B, we can drive up the probability of winning when our wealth is a multiple of 3 sufficiently without decreasing the probability of winning when our wealth is not a multiple of 3 too much. This gives us a game that is winning overall, even though neither of the games that we are switching between are winning on their own.

This counterintuitive yet useful phenomenon has applications in engineering, biology and financial risk management. We have shown how it works with simulations and given an argument as to how randomly switching between losing games can create a winning one.

*https://pdfs.semanticscholar.org/dc1a/dc311ec732b5e8405be141617d7e473e74d7.pdf

**there is no information theory in this post

***https://en.wikipedia.org/wiki/Markov_property

****https://en.wikipedia.org/wiki/Markov_property

Other references

https://www.youtube.com/watch?v=SqbZxmCEeDc – Markov chains
