Prerequirements:

- Elements of asymptotic analysis
- Chebyshev’s Theorem: weak version
- The lemma of bar chart area
- From integer numbers to real numbers
- The sum of inverses of the first positive integers
- The limit inferior and the limit superior of a sequence
- Properties of asymptotic orders
- Chebyshev’s Theorem: strong version
- The functions W and V
- The integral mean value and the absolute error function R
- Selberg’s Theorem: proof and application

Looking back to the path travelled so far, we can identify a turning point: it was when we introduced Hypothesis N.1, which consists in an inequality with an integral inside it. At that point, in order to simplify, we decided to replace the integral with a summation (which is reasonable, because the Riemann integral is defined as the limit of a summation). Since then, we have always worked with summations, up to the proof of Selberg’s Theorem and some consequences of it. So we obtained some relationships similar to Hypothesis N.1, except for the presence of summations instead of integrals. Now it’s time to return to the integral form, for resuming the argument which brought us to Hypothesis N.1, trying to prove it.

We have to remember that Hypothesis N.1 consists in supposing the relationship \alpha \leq \beta (where the constants \alpha and \beta were introduced with Definition N.14) to be true without the limit superior, i.e. supposing that for all positive integers x the following hypothetical relationship is true:

The most similar relationship which we arrived to so far, is the one of Corollary 2 of Proposition N.8:

where the coefficients a_n are such that:

The goal of this post is to pass from the relationship (2) to one of the kind:

That is, we want to pass from a summation with coefficients a_n to an integral with some terms f(t) to be determined.

The first problem, in doing that, is to establish which asymptotic order is needed in formula (3): this way we’ll be able to understand what is the most suitable function f.

## Determination of the asymptotic order to be used when passing to the integral: a possible alternative proof

In order to understand what asymptotic order is required in (3), let’s consider for the moment what would happen if the logarithm was not squared. In the previous post, in particular in Figure 2, we saw that the following relationship turns out to be empirically plausible:

Still at an empirical level, in the previous post we saw that the summation \sum_{n \leq x} \Lambda(n) \left|R\left(\frac{x}{n}\right)\right| is well approximated by the integral \int_1^x \left| R\left(\frac{x}{t}\right) \right| dt, so it would result:

If this small experimental error was proved to be O(x), by the Corollary of Property A.9 the last two terms on the right would join together into a single term O(x) + O(x) = O(x):

hence Hypothesis N.1 would be proved up to a term of order O(x). This formula seems to be coherent with real data, as we can see in the following picture, where the term O(x) seems to be a bit smaller in absolute value than the function -\frac{3}{2} x:

Now we think it’s worth doing a brief digression: let’s see what implications would (5′) have, if it was proved. For this aim, let’s express it in terms of the function V. Using formula (13) of the post The integral mean value and the absolute error function R, from (5′) we’ll obtain:

hence, by dividing all by x, applying Property A.8A and remembering that, by formula (12) of the cited post, V(\log x) = \frac{R(x)}{x}:

Dividing all by \log x and applying Property A.8A again, we’ll obtain:

hence, making the limit superior of both sides and noting that the limit superior of a O(\frac{1}{\log x}) is zero, we’ll obtain:

Why the limit superior of a O(\frac{1}{\log x}) is zero?

The expression “the limit superior of a O(\frac{1}{\log x}) is zero” means that, given any function f = O(\frac{1}{\log x}), then \limsup_{x \to +\infty} f(x) = 0. Let’s see why. If f = O(\frac{1}{\log x}) then, by Definition A.4, \lim_{x \to +\infty} f(x) / \frac{1}{\log x} = L, where L \in \mathbb{R}. Then \lim_{x \to +\infty} f(x) \log x = L, and in particular, by Property A.3, \limsup_{x \to +\infty} f(x) \log x = L. But \limsup_{x \to +\infty} \log x = \lim_{x \to +\infty} \log x = +\infty, so the only possibility in order that \limsup_{x \to +\infty} f(x) \log x = L is that the limit superior of f(x) is 0, because if it was \pm \infty or a non-zero real number, the limit superior of the product f(x) \log x would be \pm \infty (we assumed that the limit superior of a product has analogous properties to the ones of the limit). So \limsup_{x \to +\infty} f(x) = 0.

Remembering Definition N.14 (Constants \alpha and \beta), equation (6) can be written simply as follows:

But this is just the relationship we wanted to prove! So, if formula (5′) was proved, it would let simplify considerably a part of the proof of the Prime Number Theorem: so we think it’s important. In fact we’ll see, in this post and in the next one, that the canonical proof of formula (6′) requires some quite heavy technical artifices.

Now many our readers may be asking: why isn’t worth completing such an alternative proof, if it promises to be simpler? Indeed we don’t know, so the only way to answer this question for us would be by trying. Some great technical difficulty could arise, and this may be the reason why Erdős and Selberg preferred to follow another, apparently more complex, way. Or maybe, who knows, the goal can be achieved, obtaining actually a simpler proof the the inequality \alpha \leq \beta… that can be a good challenge for our readers!

Now let’s come back to the real proof. In order to obtain the same left member as (3), we have to multiply both members of (5′) by \log x (by also using Corollary 1 of Property A.8):

If we compare this formula with (3), we can say that the asymptotic order we are looking for should be O(x \log x). Clearly this argument is not a proof, because we based on formula (5′) which is not proved, but it gives the idea of what to look for, and in the next section we’ll prove that this idea is correct.

Unfortunately, Hypothesis N.1 will remain an open problem for us. We needed it as a starting point for developing the proof, but in order to go on it’s not essential to establish whether it’s true or false, so we won’t do that. We think it’s true up to a term O(x), i.e. formula (5′) is true; so Hypothesis N.1 is linked to the argument developed so far.

## Proof strategy: the two phases

Let’s suppose that the idea developed in the previous section is correct, i.e. the asymptotic order which appears in (3) is just O(x \log x):

One technique for proving this relationship is to use a summation as an intermediate step towards the integral, i.e. to prove that:

where the first inequality is formula (2) we saw at the beginning, while the second one, by the properties of asymptotic orders Corollary 3 of Property A.8, Property A.10 and Corollary of Property A.9, is equivalent to:

So we have to prove the latter asymptotic relationship. We’ll do that in two phases, by introducing an expression which is a middle ground between the summation and the integral:

- The first phase consists in proving that:
\left| \sum_{n \leq x} a_n \left|R\left(\frac{x}{n}\right)\right| - \sum_{2 \leq n \leq x} \left|R\left(\frac{x}{n}\right)\right| \int_{n-1}^n f(t) dt \right| = O(x \log x) \tag{8}
- The second phase consists in proving that:
\left| \sum_{2 \leq n \leq x} \left|R\left(\frac{x}{n}\right)\right| \int_{n-1}^n f(t) dt - \int_1^x f(t) \left| R\left(\frac{x}{t}\right) \right| dt \right| = O(x \log x) \tag{9}

Now let’s call A := \sum_{n \leq x} a_n \left|R\left(\frac{x}{n}\right)\right|, B := \sum_{2 \leq n \leq x} \left|R\left(\frac{x}{n}\right)\right| \int_{n-1}^n f(t) dt and C := \int_1^x f(t) \left| R\left(\frac{x}{t}\right) \right| dt, hence the absolute value of (8) becomes |A - B| and the one of (9) becomes |B - C|. So, joining the results of the two phases together, we’ll obtain:

hence, by Property A.12, A - C = O(x \log x), which is (7).

The two phases of the proof are the one the consequence of the other, meaning that the function f(t) can be determined from (8), completing the first phase; next, in the second phase, the aim is to check that such a function satisfies also (9).

By several technical passages, it can be proved that the resulting function is f(t) = 2 \log t. So, running over the passages backwards, this function can be put also into (3′), obtaining:

Consequence of Selberg’s Theorem, integral form

First of all we can note that, in (8), the coefficient a_1 is equal to zero. In fact, we have to remember that the coefficients which we denoted by a_n are given by Corollary 1 of Proposition N.8 (Consequence of Selberg’s Theorem), where we set:

But, if n = 1, we’ll have that a_1 = \Lambda(1) \Lambda(1) + \Lambda(1) \log 1, hence, since \Lambda(1) = \log 1 = 0 (by Definition N.11), we’ll have that a_1 = 0. So in (8) we can replace the summation \sum_{n \leq x} a_n \left|R\left(\frac{x}{n}\right)\right| with the summation \sum_{2 \leq n \leq x} a_n \left|R\left(\frac{x}{n}\right)\right|, which can be combined with the next one as follows:

Setting c_n := a_n - \int_{n-1}^n f(t) dt, the last formula can be rewritten as follows:

Applying the Lemma of bar chart area, we’ll obtain:

where, for all n = 2, \ldots, x:

This expression is important because it introduces the summation \sum_{h \leq n} a_h, which we know from (2′).

We can note that \left|R\left(\frac{x}{x}\right)\right| = R(1) = \overline{\psi}(1) - 1 = \psi(1) - 1 = \log \psi^{\star}(1) = \log 1 = 0 (we applied the Definitions N.15, N.8, N.7 and N.5), hence the first term inside the absolute value of (a) goes away:

By the triangle inequality:

So, in order to prove (a’), which in turn is equivalent to the previous formulas, by Property A.12 and Corollary della Prorietà A.6, we have to prove that:

Now, in order to find the function f(t), first we have to “guess” the asymptotic order of C(n), in such a way to satisfy the last asymptotic relationship. After that, we’ll put the appropriate asymptotic order into (b) and so we’ll finally understand what is the function f(t).

Now let’s find an overestimation for the expression \left| \left|R\left(\frac{x}{n}\right)\right| - \left|R\left(\frac{x}{n+1}\right)\right| \right|. For this aim, we can use a general property of inequalities with absolute value: if k and h are two real numbers, we’ll have that \left| |k| - |h| \right| \leq |k - h|. This property is a consequence of the triangle inequality; in fact the triangle inequality states that, if a and b are two real numbers, then |a + b| \leq |a| + |b|, hence |a + b| - |a| \leq |b| and, applying the absolute value to both sides, \left||a + b| - |a|\right| \leq |b|. If in the latter inequality we set a := h and b := k - h, we’ll obtain that \left||k| - |h|\right| \leq |k - h|, which is the property we mentioned before. By applying it, we’ll have:

Using Definition N.15, by some simple algebric passages, we’ll obtain:

where in the last passage we noted that \frac{x}{n} \gt \frac{x}{n + 1} and \overline{\psi}\left(\frac{x}{n}\right) \geq \overline{\psi}\left(\frac{x}{n+1}\right), since \overline{\psi} is an increasing function (being a simple extension of the function \psi = \log \psi^{\star}, where \psi^{\star} is increasing by definition).

Joining (d) with (e), we’ll have:

hence, substituting into (c):

Now, in the absence of particular constraints, an obvious choice for C(n) would be C(n) := 1 because this way in the summation, thanks to the alternation of signs, the majority of terms would elimitate each other. But for our purposes another choice is more convenient; it’s a bit more complex but it generates a similar elimination effect: C(n) := n. With this choice we’ll have:

where:

- in the passage (*) we applied Theorem N.4, according to which \psi(x) \asymp x, hence \psi(x) = O(x) and so also \overline{\psi}(x) = O(x) (in fact \overline{\psi} and \psi, by the definition of simple extension, differ at most by a unity, so by a O(1), hence \overline{\psi}(x) = \psi(x) + O(1) = O(x) + O(1) = O(x));
- in the next passage we applied Property A.8A and Theorem N.6 (by the latter, \sum_{1 \leq n \leq x} \frac{1}{n} = \log x + \gamma + O\left(\frac{1}{x}\right), but \gamma and O\left(\frac{1}{x}\right) are both O(1) respectively by Corollary of Property A.7 and Property A.10);
- in the next passage we applied Property A.9 in order to bring the big Oh symbol outside the summation; we also noted that 1 + \frac{1}{2} + \frac{1}{x} is a constant and so it’s O(1) by Corollary of Property A.7;
- in the next passages we noted that we can apply to the summatin \sum_{2 \leq n \leq x - 1} \frac{1}{n} the same argument applied to the summation \sum_{3 \leq n \leq x - 1} \frac{1}{n}; we also applied the properties of asymptotic orders Corollary 1 of Property A.8, Property A.5 and Property A.10.

We can note that, if in the initial formula of (g) we put O(n) instead of n, the final result is the same:

where in the first, in the second and in the last equality we used respectively the properties Corollary 1 of Property A.8, Property A.9 and Property A.11.

But if C(n) = O(n), also |C(n)| = O(n), by Corollary of Property A.6). So, by (f) and (h):

But then, by Property A.12:

so we proved (c), which was our intermediate goal. This means that the choice C(n) := O(n) is plausible according to asymptotic orders. Now we still have to determine what is the function f, given (b). In fact, if in (b) we substitute C(n) := O(n), we’ll obtain:

The possible functions f now can be obtained by derivating both members with respect to n. By the way, we can note two things: first, n is an integer, so in order to do the derivation passage it should be replaced by a real variable; second, the solution is not unique because for example, if f is a solution, also f + c is a solution for any constant c. For brevity we prefer to skip the passages and directly verify that the function f(t) := 2 \log t is a possible solution (actually it’s the simplest possible solution):

where in the second passage we applied the integral of logarithm, which can be calculated by parts and is often used as an exercise in analysis courses.

In summary, we can say that:

- The function f(t) := 2 \log t satisfies (b), making sure that C(n) = O(n)
- If C(n) = O(n) then (c) is satisfied, and so also (a’), which in turn is equivalent to (8)
- So the function f(t) := 2 \log t satisfies (8)

In order to prove (9), first of all we’ll replace f by the function f(t) := 2 \log t which we have found in the proof of (8):

The left member can be developed as follows:

where in the passage (*) we applied an inequality which can be proved with formulas which are similar to formulas (d) and (e) of the previous detail:

where in the last passage we have considered that t \leq n, because it’s the integration variable of an integral between n - 1 and n. For the same reason, t \geq n - 1, relationship which has been used in (i) in the passage (**), obtaining an expression that does not depend on t, and so can be brought outside of the integral. Hence in (i) it remains to develop the integral \int_{n-1}^n 2 \log t\ dt (like in the previous detail, we’ll suppose that the indefinite integral of logarithm is known):

We can note that (n - 1) \log \frac{n}{n-1} = \log \left( \frac{n}{n-1} \right) ^ {n - 1} = \log \left( 1 + \frac{1}{n-1} \right) ^ {n - 1} and, by a known limit, the expression \left( 1 + \frac{1}{n-1} \right) ^ {n - 1}, for n \to \infty, tends to Napier’s number e. So, for n \to \infty, (n - 1) \log \frac{n}{n-1} \to \log e = 1, hence it’s expressible as an asymptotic order as O(1) (this is a consequence of Property A.7). Remembering that, as it’s known in analysis, \log n = o(n), and so by Property A.10 \log n = O(n), we’ll have that:

where we applied Property A.10, Corollary of Property A.9 and Corollary 2 of Property A.8.

Joining (j) with (k) we’ll have:

Now let’s substitute into (i):

Finally:

where:

- in the passage (*) we replaced all the terms before the summation by O(x) (the terms with the function \overline{\psi} are O(x) by virtue of Theorem N.4, as we have already noted in the previous detail);
- in the passage (**) we applied formula (h) of the previous detail
- in the next passage we applied, as usual, Property A.10 and Corollary of Property A.9.

Thus we have completed the proof of (9).

Proposition N.12 is depicted in the following picture, where the term O(x \log x) seems to be a bit smaller in absolute value than the function -\frac{5}{2} \log x.

We can note that the function \int_1^x \log t \left| R\left(\frac{x}{t}\right) \right| dt which appears in Proposition N.12 is not the typical integral function like g(x) := \int_a^x f(t) dt. In the case of such a function g, the integrand function does not depend on x, so for x = 1, 2, 3, \ldots we’ll obtain the values g(1) = \int_a^1 f(t) dt,\ g(2) = \int_a^2 f(t) dt,\ g(3) = \int_a^3 f(t) dt, \ldots, which correspond, if f is not negative, to the areas below the graph of f limited to abscissa intervals [a, 1],\ [a, 2],\ [a, 3], \ldots. In the function \int_1^x \log t \left| R\left(\frac{x}{t}\right) \right| dt instead, the x appears also inside the integrand function, hence the function of which we compute the integral changes as x does: for x = 1, 2, 3, \ldots we’ll obtain the values \int_1^1 \log t \left| R\left(\frac{1}{t}\right) \right| dt, \int_1^2 \log t \left| R\left(\frac{2}{t}\right) \right| dt, \int_1^3 \log t \left| R\left(\frac{3}{t}\right) \right| dt, \ldots. The values shown in Figure 2 have been obtained by computing numerically these integrals. The same is true also for Figure 1.