Chebyshev’s Theorem (weak version)

Prerequirements:

Chebyshev dealed with, in addition to mathematics and statistics, also theoretical physics: this is his “plantigrade machine”, able to convert an axis rotation into a walk (Wikipedia)

When studying the least common multiple of the first positive integers, which we indicated with the notation \psi^{\star}(x), an interesting connection arised, with the function computing the number of primes less than or equal to x, wchich in turn we denoted with the notation \pi(x). This connection is given by the Corollary of Proposition N.2:

\psi^{\star}(x) \leq x^{\pi(x)} \tag{1}

Now we aim at studying the function \pi(x) and in particular its order of magnitude, using the tools of asymptotic analysis. Since, in (1), the function we are interested in appears at the exponent, the first passage to be made is computing the logarithm of both members, in order to take it back to the base. For simplifying the notation, let’s define then the functions \theta(x) and \psi(x) as the logarithms of the corresponding functions with the asterisk:

Logarithmic functions \theta(x) and \psi(x)

We define the functions

\theta(x) := \log \theta^{\star}(x)
\psi(x) := \log \psi^{\star}(x)

where x is a positive integer.

As logarithm base, it’s convenient to use the Nepero’s number e, for reasons that will be clear later. However for some theorems, like the Chebyshev’s theorem we’ll see at the end of this post, the base is irrelevant; but we’ll talk about this better later.

We can note that, by applying respectively Definition N.4 (Product of the first prime numbers) and Proposition N.2 (Calculation of \psi^{\star}(x)), we obtain two expressions it’s worth remembering:

\theta(x) = \log \prod_{p \leq x} p = \sum_{p \leq x} \log p
\psi(x) = \log \prod_{p \leq x} p^{\left \lfloor \log_p x \right \rfloor} = \sum_{p \leq x} \left \lfloor \log_p x \right \rfloor \log p

Going back to (1), by applying the logarithm we have:

\psi(x) \leq \pi(x) \log x

hence

\pi(x) \geq \frac{\psi(x)}{\log x} \tag{1'}

So certainly the order of magnitude of \pi(x) cannot exceed that of \frac{\psi(x)}{\log x}. In order to understand what is this order of magnitude, let’s proceed step by step, studying the order of magnitude of \psi(x) and of \theta(x) first, recalling that these two functions are strictly connected by Proposition N.4 (Connection between the functions \psi^{\star} and \theta^{\star}); then we’ll see how their order of magnitude are connected to that of \pi(x). This way we’ll obtain the following theorem, known as Chebyshev’s theorem:

Chebyshev’s theorem: order of magnitude of \pi(x)

\pi(x) \asymp \frac{x}{\log x} \tag{15}

Actually, as mentioned in the introductive page, a much stronger result than Chebyshev’s theorem holds: it’s the famous prime number Theorem, according to which

\pi(x) \sim \frac{x}{\log x} \tag{16}

In spite of the formal similarity, the proof of this theorem is far more complex than the one of Chebyshev’s theorem: we’ll devote great part of the next posts to it.

Asymptotic equivalence and order of magnitude of \theta(x) and \psi(x)

In order to establish the order of magnitude of the functions \theta(x) and \psi(x), the overestimations and the underestimations we obtained previously will prove useful. In particular, about the function \psi^{\star}, Proposition N.3 (Underestimation of \psi^{\star}(x)) states that

\psi^{\star}(x) \geq \sqrt[3]{2^x}

hence

\psi(x) \geq \log \sqrt[3]{2^x} = \frac{1}{3} \log 2^x = \frac{1}{3} x \log 2 = \frac{\log 2}{3} x \tag{2}

So \psi(x) is greater or equal to a constant multiplied by x. On the other hand, Proposition N.1 (Overestimation of the product of primes up to x) states that

\theta^{\star}(x) \leq 2^{2(x-1)}

hence

\theta(x) \leq \log 2^{2(x-1)} = 2(x-1) \log 2 \leq 2x \log 2 = (2 \log 2) x \tag{3}

So \theta(x) is less than or equal to a constant multiplied by x: the opposite of what we have seen to hold for \psi(x).
But equations (2) and (3) are tightly connected through the already cited Proposition N.4, according to which:

\psi^{\star}(x) = \theta^{\star}(x) \cdot \theta^{\star}(\sqrt{x}) \cdot \ldots \cdot \theta^{\star}(\sqrt[R]{x}) \tag{4}

where R := \left \lfloor \log_2 x \right \rfloor.
Passing to logarithms, (4) can be rewritten as follows:

\psi(x) = \theta(x) + \theta(\sqrt{x}) + \ldots + \theta(\sqrt[R]{x})

Being \theta(x) strictly increasing, the sum terms become smaller and smaller, because the function arguments, x, \sqrt{x}, \ldots, \sqrt[R]{x}, decrease. So we can overestimate the sum by repeating the second term in place of the next ones, obtaining:

\psi(x) \leq \theta(x) + \underbrace{\theta(\sqrt{x}) + \ldots + \theta(\sqrt{x})}_{R - 1 \textrm{ times}} = \theta(x) + (R - 1) \theta(\sqrt{x})

But, by (3), \theta(\sqrt{x}) \leq (2 \log 2) \sqrt{x}. Moreover R - 1 = \left \lfloor \log_2 x \right \rfloor\ - 1 \leq \log_2 x - 1 \leq \log_2 x. Substituting, we have:

\psi(x) \leq \theta(x) + (\log_2 x) (2 \log 2) \sqrt{x} \tag{5}

This equation connecting \psi(x) and \theta(x) is the key for obtaining the order of magnitude of both, starting from equations (2) and (3). In fact, (2) lacks an underestimation for \psi(x) and (3) lacks an overestimation for \theta(x), and in both cases we find in (5) the missing part. So, with some passages we can obtain inequalities like:

A x \leq \psi(x) \leq B x
C x \leq \theta(x) \leq D x

where the capital letters indicate some constants. These inequalities, by Definition A.3, mean that \psi(x) \asymp x and \theta(x) \asymp x, that is \psi(x) and \theta(x) have the same order as the function x or, adopting an abbreviated terminology, they have order \bm{x}.

Finally, starting from formulas (2), (3) and (5) again, we can prove that, definitively:

\psi(x) \cdot c \leq \theta(x), \forall c \lt 1 \tag{6}

Furthermore we know that \theta(x) \leq \psi(x) and this, together with the previous formula, implies that \psi(x) and \theta(x) are asymptotically equivalent (\psi(x) \sim \theta(x)). Putting it altogether, the following Theorem is obtained.

Asymptotic equivalence and order of magnitude of \theta(x) and \psi(x)

The functions \theta(x) and \psi(x) are asymptotically equivalent and have order x:

\theta(x) \sim \psi(x)
\theta(x) \asymp x \asymp \psi(x)

From (2) and (5) we have that:

\frac{\log 2}{3} x \leq \psi(x) \leq \theta(x) + (\log_2 x) (2 \log 2) \sqrt{x} \tag{7}

While from (3) and (5) we have that:

\psi(x) - (\log_2 x) (2 \log 2) \sqrt{x} \leq \theta(x) \leq (2 \log 2) x \tag{8}

Now we have all the elements for finding the order of magnitudes, using the tools of asymptotic analysis.

Let’s start from (7). We can use (3) again for overestimating \theta(x), in addition we can note that (\log_2 x) (2 \log 2) \sqrt{x} is a small oh of x (in fact we know that \lim_{x \to \infty} \frac{(\log_2 x) (2 \log 2) \sqrt{x}}{x} = 0). In the light of this, replacing constants with generic symbols, as their values are not of interest when doing asymptotic analysis, we can develop (7) as follows:

\frac{\log 2}{3} x \leq \psi(x) \leq \theta(x) + (\log_2 x) (2 \log 2) \sqrt{x} \implies
\frac{\log 2}{3} x \leq \psi(x) \leq (2 \log 2) x + o(x) \implies
A x \leq \psi(x) \leq B x + o(x) \tag{9}

About the rightmost member, by Definition A.3 (Asymptotic relationships, alternative definitions using inequalities) we know that there exists a constant C such that definitively o(x) \leq Cx, so definitively B x + o(x) \leq B x + C x = (B + C) x. We can call D the constant B + C and rewrite (9) as follows:

A x \leq \psi(x) \leq D x

And this, as we observed, means that \psi(x) \asymp x, that is \psi(x) has order x.

Starting from (8) and reasoning in a similar way as we did for (7), we can prove that also \theta(x) has order x. But we want to achieve this result by understanding the link between the two functions more in depth. Up to now we asymptotically compared them with the function x, but it’s just so much interesting to compare them between each other. In fact, from the definitions of \theta^{\star}(x) and \psi^{\star}(x) we immediately obtain that \theta^{\star}(x) \leq \psi^{\star}(x), and so also \theta(x) \leq \psi(x). This can be used for completing (5), this way:

\psi(x) - (\log_2 x) (2 \log 2) \sqrt{x} \leq \theta(x) \leq \psi(x)

By dividing all by \psi(x) we obtain:

1 - \frac{(\log_2 x) (2 \log 2) \sqrt{x}}{\psi(x)} \leq \frac{\theta(x)}{\psi(x)} \leq 1 \tag{10}

But we saw that \psi(x) is of order x, so definitively \psi(x) \leq A x, hence \frac{(\log_2 x) (2 \log 2) \sqrt{x}}{A x} \geq \frac{(\log_2 x) (2 \log 2) \sqrt{x}}{\psi(x)}. So we can extend (10):

1 - \frac{(\log_2 x) (2 \log 2) \sqrt{x}}{A x} \leq 1 - \frac{(\log_2 x) (2 \log 2) \sqrt{x}}{\psi(x)} \leq \frac{\theta(x)}{\psi(x)} \leq 1

Disregarding the second expression and passing to the limit for x \to \infty we obtain:

1 - \lim_{x \to \infty} \frac{(\log_2 x) (2 \log 2) \sqrt{x}}{A x} \leq \lim_{x \to \infty} \frac{\theta(x)}{\psi(x)} \leq 1

We note, incidentally, that the leftmost member is the constant c of (6). Like before, we have that the first limit is zero, so

1 \leq \lim_{x \to \infty} \frac{\theta(x)}{\psi(x)} \leq 1

that is

\lim_{x \to \infty} \frac{\theta(x)}{\psi(x)} = 1

In other terms, by Definition A.4 (Asymptotical relationships, alternative definitions using limits), \psi(x) and \theta(x) are asymptotically equivalent. Now we can understand better why also \theta(x) has order x: it’s asymptotically equivalent to the function \psi(x) having order x, and two asymptotically equivalent functions have the same order, being interchangeable in limit calculations.

Theorem N.4 does not imply that \theta(x) \sim x or \psi(x) \sim x. In fact, if these relationships were true, the limits \lim_{x \to \infty} \frac{\psi(x)}{x} and \lim_{x \to \infty} \frac{\theta(x)}{x} should be 1, but it cannot be so, by (2) and (3).

In order to avoid getting confused, it should be remembered that the relationship \asymp is “weaker” than \sim, this means that two functions can have the same order still being not asymptotically equivalent. This happens, for example, for polynomials: all polynomials of degree three are of the same order, but only those ones having the same leading coefficient are asymptotically equivalent (see Definition A.2 (Asymptotic equivalence, case of polynomial functions).

Asymptotic equivalence between \pi(x) and \frac{\theta(x)}{\log x}, order of magnitude of \pi(x)

Now we can put the previous elements altogether in order to find the order of magnitude of \pi(x).

Resuming (1′), we have that

\frac{\psi(x)}{\log x} \leq \pi(x)

But, since \theta(x) \leq \psi(x), even more so:

\frac{\theta(x)}{\log x} \leq \pi(x) \tag{11}

Actually the functions \frac{\theta(x)}{\log x} and \pi(x) are linked by a very stronger relation: they are asymptotically equivalent. This means, by Definition A.3 (Asymptotic relationships, alternative definitions using inequalities), that two conditions must be true definitively:

  • \frac{\theta(x)}{\log x} \cdot c \leq \pi(x), for every constant c \lt 1
  • \frac{\theta(x)}{\log x} \cdot C \geq \pi(x), for every constant C \gt 1

The first condition is a direct consequence of (11). The second condition instead is proved starting from Lemma N.3 (Underestimation of \theta^{\star}(x) through \pi(x)) that, rewritten in logarithmic form, becomes:

\theta(x) \gt (\pi(x) - x^{\delta}) \delta \log x

hence

\frac{\theta(x)}{\log x} \frac{1}{\delta} \gt (\pi(x) - x^{\delta})

We can note a strong similarity with the second condition, that could be proved starting from the formula above. This way an alternative proof would be obtained, because the classical proof shown in the detail below, using limit calculus, starts from Lemma N.3 too but it rewrites it in a different form. In both cases, the final point is the following Theorem.

Asymptotic equivalence between \pi(x) and \frac{\theta(x)}{\log x}

\pi(x) \sim \frac{\theta(x)}{\log x}

Let’s see the proof based on limit calculus. By Definition A.4 (Asymptotic relationships, alternative definitions using limits), we have to prove that the ratio between the functions \pi(x) and \frac{\theta(x)}{\log x} tends to 1:

\lim_{x \to +\infty} \frac{\pi(x)}{\frac{\theta(x)}{\log x}} = 1

Thanks to (11) we are halfway in proving this limit:

\lim_{x \to +\infty} \frac{\pi(x)}{\frac{\theta(x)}{\log x}} \geq \lim_{x \to +\infty} \frac{\pi(x)}{\pi(x)} = 1 \tag{12}

Yet we have to answer to the question:

\lim_{x \to +\infty} \frac{\pi(x)}{\frac{\theta(x)}{\log x}} \leq 1 \ ?

For answering, we can apply Lemma N.3 (Underestimation of \theta^{\star}(x) through \pi(x)). Rewriting it in logarithmic form, we have that:

\theta(x) \gt (\pi(x) - x^{\delta}) \delta \log x

Now let’s rewrite the formula in order to isolate the quantity \frac{\pi(x)}{\frac{\theta(x)}{\log x}} we are interested in:

\theta(x) \gt (\pi(x) - x^{\delta}) \delta \log x \implies
\pi(x) - x^{\delta} \lt \frac{\theta(x)}{\log x} \frac{1}{\delta} \implies
\pi(x) \lt \frac{\theta(x)}{\log x} \frac{1}{\delta} + x^{\delta} \implies
\frac{\pi(x)}{\frac{\theta(x)}{\log x}} \lt \frac{1}{\delta} + x^{\delta} \frac{\log x}{\theta(x)}

If we multiply and divide by x the rightmost term, we obtain the ratio \frac{x}{\theta(x)} that, by Theorem N.4 and Definition A.4, tends to a constant A:

\frac{\pi(x)}{\frac{\theta(x)}{\log x}} \lt \frac{1}{\delta} + x^{\delta} \frac{\log x}{x} \frac{x}{\theta(x)} \implies
\lim_{x \to \infty} \frac{\pi(x)}{\frac{\theta(x)}{\log x}} \leq \lim_{x \to \infty} \frac{1}{\delta} + \left( \lim_{x \to \infty} x^{\delta} \frac{\log x}{x} \right) \left( \lim_{x \to \infty} \frac{x}{\theta(x)} \right) = \lim_{x \to \infty} \frac{1}{\delta} + \left( \lim_{x \to \infty} x^{\delta} \frac{\log x}{x} \right) A \tag{13}

The term \frac{1}{\delta} does not depend on x, so its limit is the term itself. We can rewrite the last limit inside brackets as \frac{\log x}{x^{1 - \delta}} and observe that it, for \delta \neq 1, tends to zero, because \log x = o(x^k) for any k \neq 0.

This passage is fundamental: you may recall that, before obtaining Lemma N.3, we thought about two alternatives about how many primes to exclude: \sqrt{x} and \frac{1}{2} x, for \delta = \frac{1}{2}; that is x^{\delta} and \delta x, for a generic \delta. Finally we chose x^{\delta} postponing the reason for later. Well, this reason lies just in the limit of x^{\delta} \frac{\log x}{x} = \frac{\log x}{x^{1 - \delta}}. If we had chosen \delta x instead of x^{\delta} and had done the same passages as before, we would have obtained the expression \delta x \frac{\log x}{x} = \delta \log x the limit of which, though, does not tend to zero, and this would not let us achieve the result we’ll obtain in a moment. So we can see that, sometimes, the same argument can conduct either to an interesting result or to a deadlock, depending on how we choose the functions we work with.

Going back to (13), for what we said it can be rewritten simply as

\lim_{x \to \infty} \frac{\pi(x)}{\frac{\theta(x)}{\log x}} \leq \frac{1}{\delta} \textrm{ (}\delta \neq 1\textrm{)} \tag{14}

The value \delta = 1 we excluded for getting to (14), as often happens in analysis, is just the most interesting. In fact if we try to compute the limit of both members of (14) as \delta \to 1, we obtain

\lim_{\delta \to 1} \lim_{x \to \infty} \frac{\pi(x)}{\frac{\theta(x)}{\log x}} \leq \lim_{\delta \to 1} \frac{1}{\delta} \textrm{ (}\delta \neq 1\textrm{)}

But the first limit is unnecessary because the expression \lim_{x \to \infty} \frac{\pi(x)}{\frac{\theta(x)}{\log x}} does not depend on \delta. So we have:

\lim_{x \to \infty} \frac{\pi(x)}{\frac{\theta(x)}{\log x}} \leq \lim_{\delta \to 1} \frac{1}{\delta} = 1

hence \lim_{x \to \infty} \frac{\pi(x)}{\frac{\theta(x)}{\log x}} \leq 1: it’s just the inequality we were looking for. Joining it with (12), we obtain Theorem N.5.

From Theorem N.4 we know that \theta(x) \sim \psi(x), hence \frac{\theta(x)}{\log x} \sim \frac{\psi(x)}{\log x}, so we obtain the following corollary:

Asymptotic equivalence between \pi(x) and \frac{\psi(x)}{\log x}

\pi(x) \sim \frac{\psi(x)}{\log x}

But from Theorem N.4 we also know that the functions \theta(x) and \psi(x) have order x, so from Theorem N.5 or from its corollary the Chebyshev’s Theorem cited at the beginning is immediately obtained.

To summarize, what does Chebyshev’s Theorem tell us about prime numbers? It’s difficult to make an example, because it’s an asymptotic relationship between two functions, in which it doesn’t matter what happens for a particular value of x, but the behaviour of the functions as x grows. But we can have an idea of the importance of the theorem imagining all involved functions to be polynomials: that is, let’s imagine that in Chebyshev’s Theorem \pi(x) is a polynomial, and that in place of the function \frac{x}{\log x} there is a polynomial, for example the second degree polynomial 2x^2 - 3x + 4. The question is: in this simplified context, what information the Theorem would convey about \pi(x)? As we already know, if two polynomials are of the same order, then they have the same degree. So \pi(x) would be of degree two too, that is something like ax^2 + bx + c, where the coefficients a, b, c are unknown yet. So the kind of information that Chebyshev’s Theorem conveys about the function \pi(x) is equivalent to know, about a polynomial, only its degree. In a hypothetical game “guess the polynomial”, the degree may be the first important question. Then the coefficients would have to be guessed too, but the degree is the first important step.

We end with some important remarks. Starting from this post, as you have noticed, our development of number theory makes intense use of logarithms. One of the reasons is rather simple: it’s that, in several formulas like (1), the function \pi(x) or other functions of interest appear at the exponent, so logarithm is necessary for getting them back to the base, so obtaining theorems like Chebyshev’s one. Then there is another reason, more profound, that we’ll understand in the next posts: logarithms allow to convert products into sums; this passage lets in turn convert sums into integrals, so letting use some integration techniques… but we’ll see all this with time.
Also the choice of the logarithm base will be clarified later: for example we’ll see why in the prime number Theorem the base must be the Nepero’s number e. This is not true for Chebyshev’s Theorem, in which any base is good. In fact, we suppose the logarithm in (15) to be in a certain base a, with the following passages we can express it in another base b:

\pi(x) \asymp \frac{x}{\log_a x} = \frac{x}{\frac{\log_b x}{\log_b a}} = (\log_b a) \frac{x}{\log_b x}

But the multiplication by a constant \log_b a is irrelevant for the asymptotic order: if \pi(x) is of the same order as (\log_b a) \frac{x}{\log_b x}, it’s also of the same order as \frac{x}{\log_b x}, that is Chebyshev’s Theorem is also true with base b. This means that in (16) we can choose any base for the logarithm.
The same remark is also valid for other formulas, like the first of Theorem N.4:

\theta(x) \sim \psi(x)

Here logarithms are implicit, inside the definitions of the functions \theta(x) and \psi(x) (Definition N.7). Here the base does not matter too, as long as it’s the same in the definition of \theta and in the definition of \psi: a possible change of base would multiply both members by the same constant, preserving the asymptotic equivalence.
The case of the prime number Theorem is instead very different: in (16) there is only one logarithm, because there is no logarithm inside the definition of \pi(x). A possible change of base would entail, as in (15), the multiplication by a constant, but the asymptotic equivalence \sim is not preserved by the multiplication by a generic constant, differently from \asymp. For this reason it will be important, for getting to this Theorem, choosing the correct base.

Leave a Reply

Your email address will not be published. Required fields are marked *