Prerequisites:
- Selberg’s sieve: study of the error term
- Numerical series and prime numbers
- Arithmetic functions: definitions and some properties
- The Möbius inversion formula
- The sum of the inverses of the first positive integers
- The Möbius function and its connection with the Λ function
This article concludes the “technical” part of the proof of Selberg’s sieve and lays the foundation for the applications that we’ll see later. The parts of the proof that we have seen in the previous articles will lead us, in a few steps, to the statement from which we started, Theorem C.4, which however is only a simplified version of the theorem. In this article we’ll also state its ordinary form and we’ll see that its proof differs only in a few points from that of the simplified version.
A stronger version of the simplified statement
In the previous articles we have seen some parts of the proof of Selberg’s sieve in the simplified version; we’ll now reconstruct a unitary picture.
At the end of the article Selberg’s sieve: statement and beginning of the proof we overapproximated the sieve function S(A, \mathbb{P}, z) this way:
In the next article, formula (9), we expressed T in the form:
where the function \phi has been defined as:
We then obtained the values of the w_d parameters that minimize the expression of T, reaching the conclusion that (formula 11):
where V(z) is an auxiliary function that we initially defined as follows:
In light of the previous article, however, in all the summations on d we can introduce the condition d \lt z, in order to bound the error term. The definitions of T and V(z) then become:
Substituting w_d into the expression of T, we’ll get:
where the central passage is motivated by the observation that, as the Möbius function is defined, \mu(d) can only be 1 or -1 since d is square-free, so we have \mu^2(d) = 1 in any case.
In (3), the term \frac{1}{V^2(z)} can be taken out of the summation, because it does not depend on its variable, namely d. We’ll therefore have:
where in the second step we substituted (2), that is the definition of V(z).
Combining (3) and (4), we can express T in the simple form:
Returning to (1), as regards the error term R, in the previous article we have arrived at the following overestimate:
where we recall that \omega(d) is the number of distinct prime factors appearing in the factorization of d, while r(d) is the function defined in the initial Theorem statement, equal to the difference between the number of multiples of d which are in the set A, and the ratio |A|/d.
Combining formulas (1), (5) and (6) we’ll get:
This overestimation is very similar to that of Theorem C.4: the only difference is that in the latter there is \sum_{1 \leq n \lt z} \frac{1}{n} instead of V(z); however, (7) is also a valid upper bound, and could be an alternative statement of the Theorem if one wanted. Furthermore, (7) is even a better overestimation than that of Theorem C.4. It can in fact be proved that:
from which:
hence we’ll finally obtain, by transitivity, the thesis of Theorem C.4:
Therefore, Theorem C.4 is weaker than (7), because it’s a consequence of it. Despite this, the official formula of Selberg’s sieve, in its simplified form, remains (7′). We could hypothesize that this formula is preferred for its greater simplicity, since it uses the well-known harmonic series instead of the V(z) function, and probably the asymptotic difference with respect to the other formula is not relevant. On this last point we propose the following example, but if any reader has studied the matter in depth we would be happy to receive further information:
Now that we have concluded the proof, it is interesting to compare the estimate given by the statement with the true value of the sieve function, that is S(A, \mathbb{P}, z), and also with the more accurate estimate that we found during the proof. To make this comparison we’ll choose, as already done in previous articles, A := \{1, 2, \ldots, N\}, where N is a fixed positive integer, and z := \sqrt{N+1}; that is, we’ll eliminate from A all multiples of prime numbers less than \sqrt{N+1}, that is, less than or equal to \sqrt{N}. With these choices, the estimate (7′), the one of the ordinary statement of the theorem, becomes:
while the formula of the other estimate becomes the following:
For increasing values of N up to 10,000, we obtained the following picture:

Apparently the two estimates are not too different from each other and the asymptotic order is not very different from that of the true function; however, it’s not possible to be certain without a proof, since differences between the asymptotic orders may start to show for numbers larger than those shown in the graph.
Conclusion of the proof of the simplified form
From what we have seen in the previous paragraph, the only relation still to be proved, to complete the proof of Theorem C.4, is (8), that is:
The first step to prove this is to write the \phi function differently:
We have defined (Selberg’s sieve: study of the parameters λ, step 2) the function \phi as:
Recall that this formula was obtained by applying the Möbius inversion formula to the identity function; now we’ll prove that this formula is equivalent to (9), although it has a completely different form.
We’ll proceed by induction on the number of prime factors that appear in the factorization of d (remembering that they are all distinct, with exponent equal to 1).
Let’s first prove the base case. If only one prime appears in the factorization of d, that is if d = q for some prime q, (a) becomes:
With formula (9) we’ll get the same result:
We have therefore shown that, with a single prime number, (a) and (9) give the same value. Let’s now move on to the inductive part: let’s suppose that the equality holds for k prime numbers, and let’s prove that it also holds for k + 1 prime numbers. So we’ll suppose that the following equality holds:
Based on this induction hypothesis, let’s show that the two formulas give the same result even when the number of prime factors is k + 1:
Let’s compare the left-hand sides of (b) and (c). In (b) the indices of the sum are all the possible pairs (d, d^{\prime}) the product of which is p_1 \cdot \ldots \cdot p_k, while in (c) the product of the pair is p_1 \cdot \ldots \cdot p_k \cdot p_{k+1}. What’s the relationship between the pairs in (b) and those in (c)? Suppose we start from a pair in (c), that is, from two numbers d, d^{\prime} such that:
Since p_{k+1} is prime, one and only one of these two cases must occur:
- p_{k+1} \mid d^{\prime}. In this case, dividing both sides of (d) by p_{k+1}, we’ll get that \frac{d^{\prime}}{p_{k+1}} d^{\prime \prime} = p_1 \cdot \ldots \cdot p_k, so \left(\frac{d^{\prime}}{p_{k+1}}, d^{\prime \prime} \right) is a pair of (b)
- p_{k+1} \mid d^{\prime \prime}. In this case, proceeding as before, we’ll obtain that d^{\prime} \frac{d^{\prime \prime}}{p_{k+1}} = p_1 \cdot \ldots \cdot p_k, so \left(d^{\prime}, \frac{d^{\prime \prime}}{p_{k+1}}\right) is a pair of (b)
We can then rewrite the sum of (c) as follows:
Now an important aspect to observe is that in each of the two cases the set of pairs obtained (of the type \left(\frac{d^{\prime}}{p_{k+1}}, d^{\prime \prime} \right) in the first case and \left(d^{\prime}, \frac{d^{\prime \prime}}{p_{k+1}}\right) in the second) includes all the possible pairs of (b). To understand why, let’s try to reason in the opposite direction. If (d^{\prime}, d^{\prime \prime}) is a pair of indices of the sum of (b), then d^{\prime} d^{\prime \prime} = p_1 \cdot \ldots \cdot p_k, so two cases can occur:
- Multiplying d^{\prime} by p_{k+1}, we’ll obtain the pair (d^{\prime} p_{k+1}, d^{\prime \prime}) the product of which is p_1 \cdot \ldots \cdot p_k \cdot p_{k+1}, therefore this pair of indices belongs to the sum (c) and the first index is a multiple of p_{k+1}
- Multiplying d^{\prime \prime} by p_{k+1}, we’ll obtain the pair (d^{\prime}, d^{\prime \prime} p_{k+1}) which, as in the previous case, belongs to the sum (c), but this time the second index is a multiple of p_{k+1}
So we have partitioned the indices of the sum (c) into two subsets: those for which the first index is a multiple of p_{k+1} and those for which the second index is, and both these subsets are in one-to-one correspondence with the indices of the sum (b).
Suppose that k = 2 and (p_1, p_2, p_3) = (2, 3, 5). The following table compares the indices of (b), i.e. the pairs of integers the product of which is p_1 \cdot \ldots \cdot p_k = p_1 \cdot p_2 = 2 \cdot 3, with those of (c), i.e. the pairs of integers the product of which is p_1 \cdot \ldots \cdot p_k \cdot p_{k+1} = p_1 \cdot p_2 \cdot p_3 = 2 \cdot 3 \cdot 5:
Pairs of integers the product of which is 2 \cdot 3 = 6 | Pairs of integers the product of which is 2 \cdot 3 \cdot 5 = 30 |
---|---|
(1, 2 \cdot 3) (2, 3) (3, 2) (2 \cdot 3, 1) |
(1 \cdot \mathbf{5}, 2 \cdot 3) (2 \cdot \mathbf{5}, 3) (3 \cdot \mathbf{5}, 2) (2 \cdot 3 \cdot \mathbf{5}, 1) |
(1, 2 \cdot 3 \cdot \mathbf{5}) (2, 3 \cdot \mathbf{5}) (3, 2 \cdot \mathbf{5}) (2 \cdot 3, 1 \cdo t \mathbf{5}) |
We can observe that in the right column we have all the 8 possible pairs of indices of the sum (c), divided into two subsets obtained by multiplying by p_{k+1} = p_3 = 5 respectively the first and second elements of the pairs of the first column, which are the indices of the sum (b).
So we can rewrite the first sum of (e) as follows:
where the equality is motivated by the one-to-one correspondence that we have seen: the d^{\prime} p_{k+1} of the right-hand summation, where d^{\prime} multiplied by d^{\prime \prime} gives p_1 \cdot \ldots \cdot p_k, coincides numerically with the d^{\prime} of the left-hand summation, where d^{\prime} multiplied by d^{\prime \prime} gives p_1 \cdot \ldots \cdot p_k \cdot p_{k+1} and is a multiple of p_{k+1} (you have to not get confused by the use of the same symbol d^{\prime} with two different meanings in the two summations; this practice is quite common).
Reasoning in a similar way, we’ll get that:
Combining (e), (f) and (g) we’ll get:
Now we can observe that p_{k+1} can be taken out of the second summation, while in the first summation \mu(d^{\prime} p_{k+1}) = - \mu(d^{\prime}), because if d^{\prime} is the product of an even number of prime factors (so \mu(d^{\prime}) = 1), d^{\prime} p_{k+1} is the product of an odd number of prime factors (so \mu(d^{\prime}) = -1), while if d^{\prime} is the product of an odd number of prime factors (so \mu(d^{\prime}) = -1), d^{\prime} p_{k+1} is the product of an even number of prime factors (so \mu(d^{\prime}) = 1): in both cases the sign changes. Therefore:
Now it’s sufficient to apply the induction hypothesis (b) and carry out the calculations:
We got (c), which is what we wanted to prove.
We originally defined the function \phi as a sum (\phi(d) := \sum_{d^{\prime} d^{\prime \prime} = d} \mu(d^{\prime}) d^{\prime \prime}), while in (9) we have products. In fact, the sum of the definition we knew is nothing more than the result obtained after having solved all the products of (9), as this example shows for d = 2 \cdot 3 \cdot 5 = 30:
where the last formula is precisely \phi(30) according to the definition we gave in the article Selberg’s sieve: study of the parameters λ, step 2.
In the previous article we mentioned the property that \phi is multiplicative, i.e.:
- \phi(1) = 1
- if a and b are two square-free and coprime numbers, then \phi(a) \phi(b) = \phi(ab)
We can note that both points are an immediate consequence of (9).
As for the first point, just replace d = 1 in the formula: \phi(1) = 1 \cdot \prod_{p \mid 1} \left(1 - \frac{1}{p} \right). Since no prime number is a divisor of 1, the product is empty, so it is 1; therefore \phi(1) = 1 \cdot 1 = 1.
As for the second point, suppose that a = p_1 \cdot \ldots \cdot p_k and b = q_1 \cdot \ldots \cdot q_h, each of which is a product of distinct prime factors, with k \gt 0 and h \gt 0 (if it were k = 0 or h = 0 it would be respectively a = 1 or b = 1 and we could apply the previous case). Then, by (9):
where in the penultimate equality we have considered that all and only the primes that divide ab are precisely p_1, \ldots, p_k, q_1, \ldots, q_h. In fact, since by hypothesis a and b are coprime, the sets \{p_1, \ldots, p_k\} and \{q_1, \ldots, q_h\} cannot have elements in common.
From (9) we’ll have that:
But \frac{1}{1 - \frac{1}{p}} = \frac{1}{\frac{p - 1}{p}} = \frac{p}{p - 1}, by Property A.19, is the sum of the series
So (10) can be written as:
In turn, by Property A.26, this formula can be rewritten as follows:
We can observe that the statement “in the factorization of a there are only prime factors of d” is equivalent to “a is the product of all and only the prime factors of d, with exponent greater than or equal to 0″; in fact, any prime factor of d missing in a can be inserted into the factorization with zero exponent, as in 2 = 2^1 \cdot 3^0 where there are all the prime factors of d = 6, including 3 (of course, this in not strictly a prime factorization, but it’s a useful tool for some proofs like the present one).
So, if we multiply each of these a by d, which, being square-free, is the product of all and only its prime factors, we are adding 1 to all the exponents of the prime factors in the writing of a (for example, if d = 6, from 2 = 2^1 \cdot 3^0 we go to 6 \cdot 2 = 2^2 \cdot 3^1 = 2^{1+1} \cdot 3^{0+1}), so the minimum possible exponent becomes 1 instead of 0 (so, differently from before, we now have a true factorization). Consequently, after multiplying by d, in the factorization of a there must be all the prime factors of d. We can therefore conclude that:
For d = 6, we’ll have that:
We can substitute this expression in (8); so the relation to be proved becomes:
To prove this inequality, we’ll show that all the terms of the right-hand summation also appear in the double left-hand summation. This technique is justified by Theorem A.1: since all the terms of the left-hand series are positive, and since it’s convergent because by (11) it has sum \frac{1}{\phi(d)}, the order of the terms does not matter, so if it has at least the same terms as the right-hand series (regardless of the order), then the relation of \geq holds.
Let’s check that, given n such that 1 \leq n \lt z, in the double summation on the left we can find the term \frac{1}{n}. First, let’s observe that n can only have prime factors not greater than z, because it’s itself less than z. But the prime factors not greater than z are precisely those the product of which is P(z), let’s call them p_1, p_2, \ldots, p_m. So we can write the factorization of n in the form:
where h_1, h_2, \ldots, h_k \gt 0, \{q_1, q_2, \ldots, q_k\} \subseteq \{p_1, p_2, \ldots, p_m \}, p_1 \cdot p_2 \cdot \ldots \cdot p_m = P(z).
Now consider the product q_1 q_2 \ldots q_k. Each factor is one of the prime factors of P(z), so the whole product divides P(z); moreover the product is less than z because q_1 q_2 \ldots q_k \leq q_1^{h_1} q_2^{h_2} \ldots q_k^{h_k} = n \lt z. Ultimately we can set d := q_1 q_2 \ldots q_k, because such d satisfies both conditions of the external summation of (8′). Let’s now verify that the fraction \frac{1}{n} appears within the internal sum obtained in correspondence with the value of d we have constructed, that is:
The verification is immediate, because the factorization of n, given by (a), contains precisely all (and only) the prime factors of q_1 q_2 \ldots q_k, therefore n is one of the “a“s of this last sum.
In conclusion surely, in the double sum on the left in (8′), one of the terms is equal to \frac{1}{n}. By arbitrariness of n, this happens for all fractions \frac{1}{n} with 1 \leq n \lt z. The upper bound (8′) is therefore proved.
General form of Selberg’s sieve
As mentioned at the beginning, the most famous statement of Selberg’s sieve is a bit more general than the form we have proved so far. The most general form is the following:
Selberg’s sieve
Let them be:
- A a finite set of natural numbers
- \mathbb{P} the set of prime numbers
- z \gt 1 a real number
- P(z) the product of all prime numbers less than z
- A_d the set of multiples of a natural number d which are in A
- f a fully multiplicative arithmetic function, such that 0 \lt f(p) \lt 1 for every prime number p
- r(d) := |A_d| - |A| f(d)
- \omega(d) the number of distinct prime factors of d
Then:
Comparing this statement with its simplified version (Theorem C.4), we can note that the only difference is the introduction of the function f; in fact, as it’s easy to check by substitution, the simplified version is obtained by setting:
This function f is the key to the whole sieve. It must be chosen so that the product |A| f(d), for each d, is as close as possible to the number of multiples of d present in the set A, which with the notations of the Theorem is indicated by |A_d|. In fact, the quantity r(d) = |A_d| - |A| f(d) appears in the error term, and this is what we want to minimize; the ideal is therefore that, for each d, r(d) is as small as possible, that is, |A| f(d) is as close as possible to |A_d|. If f(d) = \frac{1}{d} and A = \{1, \ldots, N\} this is exactly what happens, because the number of multiples of d in the set A is very close, if not equal, to |A| f(d) = \frac{|A|}{d} = \frac{N}{d}. However, by varying the set A, there may exist functions f that are better able to minimize the error r(d). The generalization of the Theorem is important precisely because it allows us to choose the most appropriate function on a case-by-case basis: we’ll see an example in the next article.
Now that we have almost concluded our discussion of Selberg’s sieve, it’s worth noting that, unlike the intuitive concept of sieve, based on the sieve of Eratosthenes, there is no explicit elimination process in Selberg’s sieve. The advantage of Selberg’s sieve, however, is that it separates the term that provides the estimate of the sieve function – the meaning of which is difficult to grasp without reviewing the details of the proof – from the error term, which unlike the previous one has a much more intuitive meaning. In fact, the term |r(d)| appears in it, which we could think of as the difference between the number of multiples of d “eliminated” by the sieve (A f(d)) and the actual number of multiples of d present in A (|A_d|). The similarities with the intuitive concept of sieve, however, stop there, because in the error term these terms |r(d)| are evaluated independently of one another and not in successive steps: as if, after having eliminated the multiples of a certain number, we started again from the initial set when eliminating the multiples of another number. This is a big advantage: one of the major difficulties in calculating a sieve is in fact taking into account the numbers already eliminated previously (see Why don’t algorithmic approaches work well in sieve theory? (part I)), while Selberg’s sieve manages to trace back the calculation to a function which expresses the effect of the elimination of the multiples of only one number at a time.
The reason we have so far proved the simplified version of Selberg’s sieve is that the proof of the general version is more symbol-heavy, but conceptually identical: it’s enough to replace \frac{1}{n} with f(n) in the appropriate places. In fact, if you try to make this substitution within the proof of the simplified version, you’ll notice that the only two properties of the function f that you use are those required by the statement; it’s not necessary for f to be precisely the function \frac{1}{n}.
Where are the two properties of the f function used in the general proof?
As an example, let’s look at a point in the general proof where both properties of f are used.
Let’s start with the function \phi (usually indicated with F in the general proof, because the similarity with Euler’s \phi is only valid in the simplified version). The definition we know:
becomes:
While the definition of V(z), being based on that of \phi, remains the same:
However, this definition presupposes that \phi(d) \neq 0, which in the simplified proof is a consequence of (9), while in the general proof it must be proved separately. In particular, if p is a prime number, we have that:
where in the central equality we used the property that f(1) = 1 (part of the definition that f is completely multiplicative), while the final inequality is a consequence of the property that 0 \lt f(p) \lt 1 (if f(p) were 0 the expression \frac{1}{f(p)} would not make sense, while if f(p) were less than 0 or greater than or equal to 1, the expression \frac{1}{f(p)} - 1 would be a negative number).
As for \phi(d), where d is a product of distinct prime factors that we can denote by p_1 \cdot \ldots \cdot p_k, it can be shown that:
which is the generalization of (9), and which can be similarly proved by induction. In particular, as observed for the previous function \phi, this new \phi is multiplicative (a property that is used in the proof, as we saw in the previous article).
In conclusion, from (c) and (d) it follows that \phi(d) \gt 0; in particular \phi(d) is different from zero, so V(z), in the formula of which \phi(d) appears at the denominator, is well defined.
There is another important point of the proof, however, where it’s not enough that \phi(d) \neq 0, but it’s necessary that \phi(d) \gt 0. In the article Selberg’s sieve: study of the parameters λ we have found the values of the \lambda_d parameters that minimize our estimate of the sieve function, given by (formula (9) of the cited article):
and we have proved that, for such values of \lambda_d (not directly visible in the formula, but inherent in the definition of w_d), the value assumed by T is \frac{1}{V(z)}. It’s important that this value is positive, because our estimate must be positive too. In both proofs \frac{1}{V(z)} is positive because V(z) is positive, which in turn, by (b), is positive because \phi(d) is positive for every d.
Another important point of the proof in which the first property of f is used, namely that 0 \lt f(p) \lt 1 for every prime p, is the proof of (8). In fact, as we have seen, it uses the sum of a geometric series of ratio \frac{1}{p}, which in the general proof becomes f(p); the hypothesis 0 \lt f(p) \lt 1 assures us about the convergence of this geometric series, since it implies that |f(p)| \lt 1, which is precisely the necessary and sufficient condition for a geometric series of ratio f(p) to converge.