Ever since Euclid first proved that there are infinitely many prime numbers, mathematicians have found ever more creative ways to prove the same result, and also various stronger theorems that imply it. Dirichlet’s Theorem, for example, states that if$m$ and $n$ are relatively prime integers, then there are infinitely many prime numbers of the form $mk + n$ for some integer $k$. It is also known that the sum of the reciprocals of the prime numbers diverges, that the sum

$\displaystyle \sum_{\substack{p \leq n \\ p \text{ prime}}} \frac{1}{p} \sim \log(\log(n))$

and that the number of prime numbers less than $n$ is asymptotically equal to $\displaystyle \frac{n}{\log(n)}$. In this blog post, we will continue this proud tradition by proving that there are infinitely many prime numbers which have your phone number somewhere in their digits, and which simultaneously have a prime number of digits.

To do so, we will look at the convergence of two different sums: that of the reciprocals of the primes with a prime number of digits, and that of the reciprocals of the natural numbers which do not contain your phone number amongst their digits. As we shall show, the former diverges, while the latter converges.

Amongst the first examples of divergent series which students are introduced to is the harmonic series

$\displaystyle \sum_{n=1}^\infty \frac{1}{n}$

This can be shown to diverge by grouping the terms into groups of powers of $2$: We have that for $n$ lying between $2^k$ and $2^{k+1}$, the reciprocal of $n$ is at least $2^{-1 -k}$. Thus it follows that

$\displaystyle \sum_{n=2^k+1}^{2^{k+1}} \frac{1}{n} \geq \sum_{n=2^k+1}^{2^{k+1}} \frac{1}{2^{k+1}} = \frac{2^k}{2^{k+1}} = \frac{1}{2}$

From this it follows that

$\displaystyle \sum_{n=1}^{2^k} \frac{1}{n} \geq 1 + \frac{k}{2}$

and so the sum

$\displaystyle \sum_{n=1}^\infty \frac{1}{n}$

diverges.

We will use a similar argument to show that a slightly modified sequence of reciprocals converges. We will again group numbers based on their relative sizes, but will show that there are few enough numbers of each size that the series will converge.

We prove the following proposition:

Proposition 1

Let $S$ be any string of (decimal) digits, and consider the set $T$ of natural numbers which do not contain $S$ in their decimal expansion. Then the sum of the reciprocals of the elements of $T$ converges.

To prove this, we consider for each whole number $k$ the sum of the reciprocal of those elements of $T$ which have a number of digits from $mk+1$ to $m(k+1)$, where $m$ is the number of digits in the string $S$. The digits of such a natural number can be broken up into $k+1$ blocks of $m$ consecutive digits. There are $10^m$ possible strings of $m$ digits, and each of the blocks of digits could potentially be any one of these $10^m$ strings, except for $S$. Thus there are at most $10^m - 1$ options for each block of digits, and so at most $(10^m - 1)^{k+1}$ elements of $T$ which have from $mk+1$ to $m(k+1)$ digits.

A natural number with at least $mk+1$ digits is greater than or equal to $10^{mk}$, and so its reciprocal is at most $10^{-mk}$. Thus if we consider the sum of the reciprocals of the elements of $T$ which have from $mk+1$ to $m(k+1)$ digits, we see that it is bounded above by

$\displaystyle \frac{(10^m - 1)^{k+1}}{10^{mk}} = (10^m - 1) \cdot \left(\frac{10^m - 1}{10^m}\right)^k$

The sum of the reciprocals of all of the elements of $T$ is then bounded above by the geometric series

$\displaystyle \sum_{k=0}^\infty (10^m - 1) \cdot \left(\frac{10^m - 1}{10^m}\right)^k$

which converges to $10^{m} (10^m - 1)$.

$\square$

It is perhaps surprising that we can make the harmonic series converge merely by removing the reciprocals of those numbers which do not contain some string of digits $S$. At least as surprising is that if we were instead to remove the reciprocals of all of the numbers which are not prime, then the resulting series would diverge. As stated at the start of the post, it is even true that the sum of the reciprocals of the prime numbers with a prime number of digits diverges. We will in fact prove an even stronger result. To do so, we will make use of the Prime Number Theorem which was proven independently by Hadamard and by de la Vallée-Poussin in 1896.

Prime Number Theorem

Let $\pi(n)$ be the number of prime numbers which are less than $n$. Then

$\displaystyle \pi(n) \sim \frac{n}{\log(n)}$

That is to say

$\displaystyle \lim_{n \to \infty} \pi(n) \Big\slash \frac{n}{\log(n)} = 1$

This allows us to prove the following proposition:

Proposition 2

Let $S_0$ be the set of natural numbers, and for each $n$, let $S_{n+1}$ be the set of prime numbers which have a number of digits which is an element of $S_n$. Then for each $n$, the sum of the reciprocals of the elements of $S_n$ diverges.

The proof is by induction on the number $n$.

The proposition holds for $n = 0$, since the sum of the reciprocals of the elements of $S_0$ is just the harmonic series, which we earlier showed to diverge. Now suppose that for some number $n$, the sum of the reciprocals of the elements of $S_n$ diverges. We will show that the same is true for $n+1$.

By the Prime Number Theorem, there is some natural number $N$ such that for all $n \geq N$, we have that

$\displaystyle \frac{1}{2} \cdot \frac{n}{\log(n)} \leq \pi(n) \leq \frac{3}{2} \cdot \frac{n}{\log(n)}$

Consider those prime numbers with $m$ digits. The number of such prime numbers is $\pi(10^m) - \pi(10^{m-1})$. Provided that $10^{m-1} \geq N$, we see that this is equal to at least

$\displaystyle \frac{1}{2} \cdot \frac{10^m}{m \log(10)} - \frac{3}{2} \cdot \frac{10^{m-1}}{(m-1) \log(10)}$

$\displaystyle = \frac{10^{m-1}}{2\log(10)} \cdot \frac{10(m-1) - 3m}{m(m-1)}$

Such a prime number is at most $10^m$, and so its reciprocal is at least $10^{-m}$. It follows that the sum of the reciprocals of all such prime numbers is equal to at least

$\displaystyle \frac{1}{20 \log(10)} \cdot \frac{7m - 10}{m(m-1)}$

For $m \geq 4$, we have that $7m - 10 \geq 6(m-1)$, and so this is bounded below by

$\displaystyle \frac{3}{10 \log(10)} \cdot \frac{1}{m}$

Thus provided that $10^{m-1} \geq N$, and $m \geq 4$, we have that the sum of the reciprocals of the primes with $m$ digits is at least $\frac{c}{m}$, where $c$ is the constant

$\displaystyle c = \frac{3}{10 \log(10)}$

The elements of $S_{n+1}$ have a number of digits which is in $S_n$, and so it follows that the sum of the reciprocals of the elements of $S_{n+1}$ is bounded below by

$\displaystyle \sum_{\substack{m \in S_n\\ 10^{m-1} \geq 1000N}} \frac{c}{m}$

which diverges by the inductive hypothesis. Thus the sum of the reciprocals of the elements of $S_{n+1}$ diverges, and the result is proven.

$\square$

It now follows easily that there are infinitely many prime numbers which contain your phone number amongst their digits, and which simultaneously have a prime number of digits.

Suppose that this were not the case. Let $P$ be the set of prime numbers with a prime number of digits which do not contain your phone number amongst their digits. By taking the string $S$ to be the digits of your phone number in the first proposition, we see that the sum of the reciprocals of the elements of $P$ converges. But $P$ contains all but finitely many of the elements of the set $S_2$ from the second proposition, and so the sum of the reciprocals of the elements of $P$ diverges, a contradiction.

Of course, there is nothing special about decimal notation. It should be clear that the results above hold for any base.

In the special case in which the string of digits $S$ is taken to be the string $9"$, the sequence described in the first proposition above is known as the Kempner Series. The second proposition was brought to my attention by this thread on reddit.

 How clear is this post?