Wednesday, December 22, 2021

A fun argument due to Erdös

 I wanted to share a fun argument that I read in the book Gamma, Exploring Euler's Constant by Julian Havil. Whenever I start reading through proofs of the prime number theorem, this sort of thing crops up and I always find it hard to get my mind to grok it. I understand it line by line, but have difficulty really feeling the essence.

Theorem: The number of primes less than or equal to $n$ is at least $\frac{1}{2}\log_2n.$

The proof is as follows. Let $p_1,p_2,\ldots,p_k$ be the primes less than or equal to $n$. Then any natural number less than or equal to $n$ can be uniquely written as $$p_1^{\epsilon_1}\cdots p_k^{\epsilon_k}\cdot m^2,$$ where the exponents $\epsilon_i\in\{0,1\}$ and $m$ is an integer. In other words, we have decomposed the number into a square free part times a square, and all the exponents of the primes in the square free part will be either $0$ or $1$. Obviously $m^2\leq n$, so $m\leq\sqrt{n}$.

So in order to construct any integer less than or equal to $n$ you must make $k$ binary choices on the exponents of the squarefree part as well as pick an integer $m\leq\sqrt{n}$. Not all such choices will make an integer less than or equal to $n$, but you will definitely construct every integer less than or equal to $n$ in this way! So $$n\leq\sqrt{n}\cdot 2^k.$$ Dividing both sides by $\sqrt{n}$ and taking the base $2$ logarithm, we get the desired result. $\Box$

Well, I am happy I wrote that down, and I have to say I understand the argument a lot better now! I think one of the things that made it a bit hard to grasp at first is that the inequalities are not just about numbers but about numbers of numbers. It is a really nice argument now that I understand it.

Actually, I got to thinking about how this argument fits into a set of "nearby" arguments.  The most straightforward variation is to see what happens when you dispense with the square free factorization. Every number $\leq n$ can be written as $$p_1^{r_1}\cdots p_k^{r_k},$$ but how do we bound the exponents. One try would be to note that the largest any exponent can get is when the base is $2$, where it cannot exceed $\log_2 n$. So, with a very crude estimate, each exponent can be chosen in $\leq \log_2n$ ways, leading to the the inequality $$n\leq (\log_2n)^k.$$ Taking natural logs, we derive 

$$k\geq \frac{\ln n }{\ln(\log_2 n)}. $$

I am kind of surprised this gives as much information as it does. I had expected to get an inequality with no content like $2n>n$! This crude inequality still approaches infinity as $n$ grows, implying the infinitude of primes. When $n=100,000$, we get $k\geq 4.097$ which is a fairly modest statement. On the other hand the original estimate of $1/2 \log_2n$ gives $k\geq 8.304,$ which is not much better. However, one can see that the ratio of the Erdös estimate to the crude estimate goes to infinity, so the Erdös estimate is a substantial improvement.

Another way one could vary the argument is to look at the cube free factorization instead. In that case the exponents all have three choices and there is a factor $m^3$. So we get $$n\leq 3^k\cdot \sqrt[3]{n},$$ leading to the inequality $$k\geq\frac{2}{3}\log_3 n.$$ When $n=100,000$ we get $k\geq 6.986,$ which is again not bad compared to the original, though still worse.  In fact the cubic estimate is always $\frac{4\log2}{3\log 3}\approx 0.84$ times the original estimate. So the original estimate is "only" an improvement of the multiplicative constant.





No comments:

Post a Comment