Why don't algorithmic approaches work well in sieve theory? (part I)

Prerequisites:

In the previous article we examined in detail the sieve of Eratosthenes, both at an algorithmic level and as a sieve function. We have seen that there is a strong connection between the two formulations, as the sieve function calculates how many elements remain in the set $\{2, 3, 4, \ldots, N\}$ after having applied the algorithm. But so, to calculate the sieve function, why not start from the study of the algorithm from which it originates? In this article and the next one we’ll see that this idea does not work in practice, and we’ll understand why.

From Eratosthenes’ sieve algorithm to the calculation of the corresponding sieve function

Given an integer $N \gt 1$ , we’ll consider the set $A := \{1, 2, 3, \ldots, N\}$ and we’ll suppose that, starting from this set, we have executed the sieve of Eratosthenes, in the version that cancels all the multiples of a prime $p$ , including $p$ itself (even if we know that this is not the most common version of the Erathostenes’ sieve, but it is a simplification useful for what follows). Unlike the previous article, this time we have also included 1 in the starting set to simplify some formulas that we’ll see shortly, but the substance does not change: in the end, only the numbers that are not divisible by any prime less than or equal to $\sqrt{N}$ will remain in $A$ , i.e. 1 and all primes greater than $\sqrt{N}$ . Using the definition of sieve function, we can indicate the number of such primes with the following notation:

S(A, \mathbb{P}, \sqrt{N + 1}) \tag{1}

This number can be obtained by following Eratosthenes’ sieve algorithm step by step, starting from the initial number of elements and gradually subtracting the number of elements eliminated in each step:

Initially the set $A$ contains all integers from 1 to $N$ , then $|A| = N$ .

First the sieve of Eratosthenes eliminates from $A$ all multiples of 2. The eliminated elements will therefore be 2, 4, 6, …, $N$ if $N$ is even; 2, 4, 6, …, $N - 1$ if $N$ is odd: in total, they are $\frac{N}{2}$ elements if $N$ is even and $\frac{N - 1}{2}$ if $N$ is odd.
Combining the two cases, we can say the algorithm will eliminate $\left\lfloor \frac{N}{2} \right\rfloor$ elements (where $\left\lfloor \right\rfloor$ is the integer part rounded down), given that the result of this formula coincides with $\frac{N}{2}$ if $N$ is even and with $\frac{N - 1}{2}$ if $N$ is odd. So, whatever $N$ is, after eliminating all multiples of 2, in $A$ there will be $N - \left\lfloor \frac{N}{2} \right\rfloor$ elements left.

Subsequently, the sieve of Eratosthenes eliminates from A all multiples of 3. However, the eliminated numbers will not be 3, 6, 9, 12, …, because some of them, for example 6, will have already been eliminated previously, as they are also divisible by 2. However, to calculate how many multiples of 3 are eliminated, we can carry out two steps:
- first we count all the multiples of 3 less than or equal to $N$ (i.e. how many there were in the initial set $A$ , before the previous step);
- from the number obtained, we subtract the number of multiples of 6 less than or equal to $N$ ; in fact, since they are also divisible by 2, they have already been eliminated previously and must not be recounted.
In general, the number of positive multiples of a certain natural number $k$ that are less than or equal to $N$ is $\left\lfloor \frac{N}{k} \right\rfloor$ , as it is easy to prove; therefore the number of multiples of 3 and the number of multiples of 6 less than or equal to $N$ are respectively given by the formulas $\left\lfloor \frac{N}{3} \right\rfloor$ and $\left\lfloor \frac{N}{6} \right\rfloor$ . So, based on the previous two points, the number of items eliminated in this step is $\left\lfloor \frac{N}{3} \right\rfloor - \left\lfloor \frac{N}{6} \right\rfloor$ . So, if after deleting the multiples of 2 there were $N - \left\lfloor \frac{N}{2} \right\rfloor$ elements in $A$ , after having eliminated also the multiples of 3 the set will remain with $N - \left\lfloor \frac{N}{2} \right\rfloor - \left( \left\lfloor \frac{N}{3} \right\rfloor - \left\lfloor \frac{N}{6} \right\rfloor \right) = N - \left\lfloor \frac{N}{2} \right\rfloor - \left\lfloor \frac{N}{3} \right\rfloor + \left\lfloor \frac{N}{6} \right\rfloor$ elements.

These first two steps are represented in the following image:

Figure 1: elimination of multiples of 2 (first row) and of 3 (second row) from the set of integers from 1 to N. The multiples of 3 eliminated from step 2 to step 3 are equal to the multiples of 3 less than or equal to N, except multiples of 6.

Subsequently, the sieve of Eratosthenes eliminates all multiples of 5 from A. We must consider, however, that the multiples of 5 divisible by 2 or 3 or both have already been eliminated, so we could reason as follows:
1. first we count all multiples of 5 less than or equal to $N$ ;
2. from the number obtained, we subtract the number of multiples of 10 less than or equal to $N$ ; in fact, since they are also divisible by 2, they have already been eliminated previously;
3. from the number obtained, we subtract the number of multiples of 15 less than or equal to $N$ ; in fact, since they are also divisible by 3, they have already been eliminated previously.
Now, however, the question becomes more complex because, if we stopped here, we would remove twice (in points b. and c.) the multiples of 30, i.e. the multiples of 5 which are also divisible simultaneously by 2 and by 3. To fix this double deletion, we can add one more step:
1. first we count all multiples of 5 present less than or equal to $N$ ;
2. from the number obtained, we subtract the number of multiples of 10 less than or equal to $N$ ; in fact, since they are also divisible by 2, they have already been eliminated previously;
3. from the number obtained, we subtract the number of multiples of 15 less than or equal to $N$ ; in fact, since they are also divisible by 3, they have already been eliminated previously;
4. to the obtained number, add the number of multiples of 30 less than or equal to $N$ ; in fact, since they are also divisible by 2 and 3, they have been subtracted twice in the previous steps, while they must be subtracted only once.
So the number of multiples of 5 eliminated by $A$ is $\left\lfloor \frac{N}{5} \right\rfloor - \left\lfloor \frac{N}{10} \right\rfloor - \left\lfloor \frac{N}{15} \right\rfloor + \left\lfloor \frac{N}{30} \right\rfloor$ . The procedure is represented in the following image:

Figure 2: elimination of multiples of 5 from the set of integers from 1 to N already stripped of multiples of 2 and 3. In calculating the number of multiples of 5 to eliminate (first row), the number of multiples of 30 must be added to the formula for compensating for double subtraction, both as part of multiples of 10 and as part of multiples of 15.

The number of elements of $A$ before this step was:

$N - \left\lfloor \frac{N}{2} \right\rfloor - \left\lfloor \frac{N}{3} \right\rfloor + \left\lfloor \frac{N}{6} \right\rfloor$

So, after also eliminating the multiples of 5, the number of elements of $A$ becomes:

$\begin{aligned} N - \left\lfloor \frac{N}{2} \right\rfloor - \left\lfloor \frac{N}{3} \right\rfloor + \left\lfloor \frac{N}{6} \right\rfloor - \left( \left\lfloor \frac{N}{5} \right\rfloor - \left\lfloor \frac{N}{10} \right\rfloor - \left\lfloor \frac{N}{15} \right\rfloor + \left\lfloor \frac{N}{30} \right\rfloor \right) & = \\ N - \left\lfloor \frac{N}{2} \right\rfloor - \left\lfloor \frac{N}{3} \right\rfloor + \left\lfloor \frac{N}{6} \right\rfloor - \left\lfloor \frac{N}{5} \right\rfloor + \left\lfloor \frac{N}{10} \right\rfloor + \left\lfloor \frac{N}{15} \right\rfloor - \left\lfloor \frac{N}{30} \right\rfloor &= \\ N - \left\lfloor \frac{N}{2} \right\rfloor - \left\lfloor \frac{N}{3} \right\rfloor - \left\lfloor \frac{N}{5} \right\rfloor + \left\lfloor \frac{N}{6} \right\rfloor + \left\lfloor \frac{N}{10} \right\rfloor + \left\lfloor \frac{N}{15} \right\rfloor - \left\lfloor \frac{N}{30} \right\rfloor \end{aligned} \tag{2}$

As we know, the Eratosthenes’ sieve algorithm continues until the multiples of the last prime number to be considered are eliminated, i.e. the multiples of the largest prime number less than or equal to $\sqrt{N}$ . In the subsequent elimination steps, we can reason in the same way as in the previous steps but, as you can imagine, the formulas become more and more complicated.

At the basis of the above reasoning there is a more general principle, called inclusion-exclusion principle, which, however, we do not need to know for our purposes, therefore we leave this topic to the discretion of our readers.

Resuming the examples of the previous article, suppose that $N = 32$ . Let’s compare our formulas step by step with the elements actually eliminated by the sieve of Eratosthenes at each step:

Initially $A = \{1, 2, 3, 4, \ldots, 32\}$ .

The multiples of 2 to eliminate are 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30 and 32; in total they are $16 = \left\lfloor \frac{32}{2} \right\rfloor$ elements. So after this step the set $A$ will have a number of elements equal to:
$\begin{aligned} |A| - |\{2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32\}| &= \\ 32 - \left\lfloor \frac{32}{2} \right\rfloor &= \\ 32 - 16 &= \\ 16 \end{aligned} \tag{a}$

Indeed, at the end of this passage, $A = \{1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31\}$ , so it will have 16 elements.

The multiples of 3 to eliminate in the new set $A$ are 3, 9, 15, 21 and 27. These 5 elements are given by the set of all multiples of 3 less than or equal to 32 ( $\{3, 6, 9, 12, 15, 18, 21, 24, 27, 30\}$ ) from which, however, we must remove the set of multiples of 6 ( $\{6, 12, 18, 24, 30\}$ ):
$\begin{aligned} \{3, 9, 15, 21, 27\} &= \\ \{3, 6, 9, 12, 15, 18, 21, 24, 27, 30\} \setminus \{6, 12, 18, 24, 30\} \end{aligned}$

from which, passing to cardinalities:

$\begin{aligned} |\{3, 9, 15, 21, 27\}| &= \\ |\{3, 6, 9, 12, 15, 18, 21, 24, 27, 30\}| - |\{6, 12, 18, 24, 30\}| &= \\ \left\lfloor \frac{32}{3} \right\rfloor - \left\lfloor \frac{32}{6} \right\rfloor &= \\ 10 - 5 &= \\ 5 \end{aligned} \tag{b}$

The new number of elements of $A$ will therefore be:

$\begin{aligned} |\{1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31\}| - |\{3, 9, 15, 21, 27\}| &= \text{[by (a)]} \\ 32 - \left\lfloor \frac{32}{2} \right\rfloor - |\{3, 9, 15, 21, 27\}| &= \text{[by (b)]} \\ 32 - \left\lfloor \frac{32}{2} \right\rfloor - \left( \left\lfloor \frac{32}{3} \right\rfloor - \left\lfloor \frac{32}{6} \right\rfloor \right) &= \\ 32 - \left\lfloor \frac{32}{2} \right\rfloor - \left\lfloor \frac{32}{3} \right\rfloor + \left\lfloor \frac{32}{6} \right\rfloor &= \\ 32 - 16 - 10 + 5 &= \\ 11 \end{aligned} \tag{c}$

Indeed, at the end of this step, $A = \{1, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31\}$ , so it will have 11 elements.

In the new set $A$ there are only two multiples of 5: 5 and 25. They can be obtained with the following set operation:
$\begin{aligned} \{5, 25\} &= \\ \{5, 10, 15, 20, 25, 30\} \setminus \{10, 20, 30\} \setminus \{15, 30\} \end{aligned}$

that is the set of multiples of 5, minus the set of multiples of 10, to consider the multiples of 2 already removed, minus the set of multiples of 15, to consider the multiples of 3 already removed. However, passing to cardinalities, we have an inconsistency:

$\begin{aligned} |\{5, 25\}| &\neq \\ |\{5, 10, 15, 20, 25, 30\}| - |\{10, 20, 30\}| - |\{15, 30\}| \end{aligned}$

In fact, the second and third sets are not disjoint because they have 30 in common, which in the expression on the right is counted twice with a minus sign. To count it only once, re-establishing equality, we must add the count of the elements which are common to the last two sets (in this case only 30):

$\begin{aligned} |\{5, 25\}| &= \\ |\{5, 10, 15, 20, 25, 30\}| - |\{10, 20, 30\}| - |\{15, 30\}| + |\{30\}| &= \\ 6 - 3 - 2 + 1 &= \\ 2 \end{aligned}$

But the four sets in the second passage are nothing more than, respectively, the multiples of 5, of 10, of 15 and of 30, so their cardinalities are given respectively by the formulas $\left\lfloor \frac {32}{5} \right\rfloor$ , $\left\lfloor \frac{32}{10} \right\rfloor$ , $\left\lfloor \frac{32}{15} \right\rfloor$ and $\left\lfloor \frac{32}{30} \right\rfloor$ . Algebraically adding up all, the number of multiples of 5 removed by the algorithm in this step (i.e. only 5 and 25) is given by the formula:

$\begin{aligned} |\{5, 25\}| &= \\ \left\lfloor \frac{32}{5} \right\rfloor - \left\lfloor \frac{32}{10} \right\rfloor - \left\lfloor \frac{32}{15} \right\rfloor + \left\lfloor \frac{32}{30} \right\rfloor \end{aligned} \tag{d}$

The new number of elements of $A$ will therefore be:

$\begin{aligned} |\{1, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31\}| - |\{5, 25\}| &= \text{[by (c)]} \\ 32 - \left\lfloor \frac{32}{2} \right\rfloor - \left\lfloor \frac{32}{3} \right\rfloor + \left\lfloor \frac{32}{6} \right\rfloor - |\{5, 25\}| &= \text{[by (d)]} \\ 32 - \left\lfloor \frac{32}{2} \right\rfloor - \left\lfloor \frac{32}{3} \right\rfloor + \left\lfloor \frac{32}{6} \right\rfloor - \left(\left\lfloor \frac{32}{5} \right\rfloor - \left\lfloor \frac{32}{10} \right\rfloor - \left\lfloor \frac{32}{15} \right\rfloor + \left\lfloor \frac{32}{30} \right\rfloor \right) &= \\ 32 - \left\lfloor \frac{32}{2} \right\rfloor - \left\lfloor \frac{32}{3} \right\rfloor - \left\lfloor \frac{32}{5} \right\rfloor + \left \lfloor \frac{32}{6} \right\rfloor + \left\lfloor \frac{32}{10} \right\rfloor + \left\lfloor \frac{32}{15} \right\rfloor - \left\lfloor \frac{32}{30} \right\rfloor &= \\ 32 - 16 - 10 - 6 + 5 + 3 + 2 - 1 &= \\ 9\end{aligned}$

Indeed, at the end of this step, $A = \{1, 7, 11, 13, 17, 19, 23, 29, 31\}$ , so it will have 9 elements.

Eratosthenes’ sieve ends here, because 5 is the largest prime less than or equal to $\sqrt{32}$ .

In summary, we can therefore say that the formula

32 - \left\lfloor \frac{32}{2} \right\rfloor - \left\lfloor \frac{32}{3} \right\rfloor - \left\lfloor \frac{32}{5} \right\rfloor + \left\lfloor \frac{32}{6} \right\rfloor + \left\lfloor \frac{32}{10} \right\rfloor + \left\lfloor \frac{32}{15} \right\rfloor - \left\lfloor \frac{32}{30} \right\rfloor

correctly calculates the number of elements of the initial set $A$ that are not divisible by 2, 3 or 5. Thus, by definition of sieve function, we can say that:

S(A, \mathbb{P}, \sqrt{33}) = 32 - \left\lfloor \frac{32}{2} \right\rfloor - \left\lfloor \frac{32}{3} \right\rfloor - \left\lfloor \frac{32}{5} \right\rfloor + \left\lfloor \frac{32}{6} \right\rfloor + \left\lfloor \frac{32}{10} \right\rfloor + \left\lfloor \frac{32}{15} \right\rfloor - \left\lfloor \frac{32}{30} \right\rfloor

The general formula of the sieve of Eratosthenes

We can note that formula (2) is an algebraic sum of expressions of the type $\left\lfloor \frac{N}{d} \right\rfloor$ , with a negative sign if $d$ is the product of an odd number of prime numbers, and with a positive sign if $d$ is the product of an even number of prime numbers. The first term, $N$ , is no exception, because we can understand it as $\left\lfloor \frac{N}{1} \right\rfloor$ , where 1 is the product of zero prime numbers, therefore of an even number of prime numbers. Furthermore, $d$ is always the product of distinct prime factors (equivalently, it is square free). The converse is also valid: for every possible product $d$ of zero or more distinct prime numbers, chosen from the set $\{2, 3, 5\}$ of the prime numbers considered in the sieve, we have a term like $\left\lfloor \frac{N}{d} \right\rfloor$ .
This rule becomes increasingly evident as the number of primes used in the sieve increases. For example, if we eliminated from $A$ also the multiples of 7, the number of remaining elements would be given by the following formula:

\begin{aligned} N \\ - \left\lfloor \frac{N}{2} \right\rfloor - \left\lfloor \frac {N}{3} \right\rfloor - \left\lfloor \frac{N}{5} \right\rfloor - \left\lfloor \frac{N}{7} \right\rfloor \\ + \left\lfloor \frac{N}{6} \right\rfloor + \left\lfloor \frac{N}{10} \right\rfloor + \left\lfloor \frac{N}{15} \right\rfloor + \left \lfloor \frac{N}{14} \right\rfloor + \left\lfloor \frac{N}{21} \right\rfloor + \left\lfloor \frac{N}{35} \right\rfloor \\ - \left\lfloor \frac{N}{30} \right\rfloor - \left\lfloor \frac{N}{42} \right\rfloor - \left\lfloor \frac{N}{70} \right\rfloor - \left\lfloor \frac{N}{105} \right\rfloor \\ + \left\lfloor \frac{N}{210} \right\rfloor \end{aligned} \tag{3}

In this expression we have grouped the terms $\left\lfloor \frac{N}{d} \right\rfloor$ on different lines based on the number of prime factors that constitute the factorization of $d$ :

on the first line we find the term with $d = 1$ , 1 being the product of 0 prime numbers, with a positive sign because 0 is even;
on the second line $d$ is a prime number chosen from the set $\{2, 3, 5, 7\}$ and the terms have a negative sign because 1 is odd;
on the third line we have all the possible terms where $d$ is the product of 2 distinct prime numbers chosen from the same set, and the terms have a positive sign because 2 is even;
the distinct primes that make up $d$ become 4 on the fourth line, with terms having a negative sign;
finally, on the last line, we have the only possible product of four distinct prime numbers chosen from the set $\{2, 3, 5, 7\}$ , i.e. $d = 2 \cdot 3 \cdot 5 \cdot 7 = 210$ , and the sign of the term is positive.

The formula contains a total of $16 = 2^4$ terms. In general, given an integer $k \gt 0$ , we can consider the set of the first $k$ prime numbers and obtain, starting from it, a formula structurally analogous to formulas (2) and (3) (in which $k$ is respectively 3 and 4):

General formula of the sieve of Eratosthenes

Let $N \gt 1$ be an integer and let $k$ be the number of prime numbers less than or equal to $\sqrt{N}$ . Consider the formula defined by the following properties:

The formula is an algebraic sum of terms like $\pm \left\lfloor \frac{N}{d} \right\rfloor$ , where $d$ is a product of zero or more distinct primes chosen from the set of the first $k$ prime numbers;
For every possible product of distinct prime numbers chosen from the set of the first $k$ prime numbers, there is one and only one term of the type $\pm \left\lfloor \frac{N}{d} \right\rfloor$ ;
The sign of each term $\pm \left\lfloor \frac{N}{d} \right\rfloor$ is positive if $d$ is the product of an even number of prime numbers; negative if $d$ is the product of an odd number of prime numbers.

We will call this formula general formula of the sieve of Eratosthenes with starting number $N$ .

It’s easy to prove that the following property holds:

Number of terms of the general formula of the sieve of Eratosthenes

For each $N \gt 1$ , setting $k$ the number of prime numbers less than or equal to $\sqrt{N}$ , the general formula of the sieve of Eratosthenes with starting number $N$ has $2^k$ terms.

By definition, the general formula of the sieve of Eratosthenes contains, up to the sign, all possible terms of the type $\left\lfloor \frac{N}{d} \right\rfloor$ , where $d$ is a product of zero or more primes chosen from the set of the first $k$ prime numbers, without repetitions. We can represent such a $d$ as a $k$ -tuple $(a_1, a_2, \ldots, a_k)$ , where each element $a_i$ can be 1 or 0, depending on whether or not the $i$ -th prime appears in the factorization of $d$ . For example, if $k = 4$ , as in formula (3), the number 10 would be represented by the quadruple $(1, 0, 1, 0)$ , because it is the product of the first prime number by the third prime number; the number 35 would be represented by the quadruple $(0, 0, 1, 1)$ , because it is the product of the third and fourth prime numbers; the numbers 1 and 210 would be represented by the quadruples $(0, 0, 0, 0)$ and $(1, 1, 1, 1)$ respectively, because they are respectively the product of no prime number and of all the first four prime numbers.

It’s evident that each $k$ -uple determines one and only one number $d$ (it’s enough to calculate the product of the corresponding prime numbers) and, vice versa, each number $d$ has one and only one corresponding $k$ -tuple (to obtain it, it is enough to see which primes are present in its factorization). In other words, there is a one-to-one correspondence between the set of possible $d$ s and that of possible $k$ -tuples formed by binary values (0 or 1), hence the cardinality of the two sets must be the same. But the possible $k$ -tuples are $2^k$ , because there are two choices for the first position, for every choice of the first position there are two possible choices for the second position, and so on up to the $k$ -th position, so that in total there are $\underbrace{2 \cdot 2 \cdot \ldots \cdot 2}_{k \text{ times}} = 2^k$ possible $k$ -tuples. Then also the possible $d$ s will be $2^k$ and, since each $d$ determines a single term of the general formula, the latter will contain $2^k$ terms.

It’s precisely this exponential growth in the number of terms that makes Eratosthenes’ sieve algorithm not very useful for calculating the corresponding sieve function.
This might seem counterintuitive, because an algorithm is also a form of calculation; however, it must be considered that the sieve of Eratosthenes was designed to calculate the set of prime numbers less than or equal to $N$ , not the cardinality of such set. It is clear that, once the set has been calculated, it becomes immediate to also calculate its cardinality, but obtaining a formula to directly calculate the cardinality without going through the set is a different type of problem and, as we have seen, the approach of starting from a algorithm that calculates the set does not work satisfactorily. With the exception of very small values of $k$ , it is definitely more convenient to apply the sieve of Eratosthenes algorithm as it is and finally count how many elements appear in the result, rather than applying the general formula with $2^k$ terms, although this formula derives from the same algorithm.

Legendre’s identity

So far we have seen how sometimes an effective approach from an algorithmic point of view may not be effective from a mathematical point of view. However, this wasn’t the only purpose of this article. Fortunately, the approach we have described so far has its theoretical utility, in fact we’ll use it as a starting point for the creation of more sophisticated sieves that we’ll see in the next articles. To this end, however, it is convenient to state the general formula in a suitable form, obtaining the so-called Legendre’s identity.
To obtain it, we can start from formula (3) and make two considerations:

Indicating as usual with $d$ the denominator of the generic term of the formula, we can note that these $d$ s form, on the whole, the set of divisors of 210, which is the product of the prime numbers used in the formula ( $210 = 2 \cdot 3 \cdot 5 \cdot 7$ ). Indeed all the denominators (1, 2, 3, 5, 7, 6, 10, 15, …) are divisors of 210; conversely, if $d$ is a divisor of 210, it is one of the denominators because it satisfies the first point of Definition C.2, that is, it is the product of distinct prime numbers chosen from the set $\{2, 3, 5, 7\}$ .

Why, if $d$ is a divisor of 210, it satisfies the first point of Definition C.2?

If $p$ is a prime factor of $d$ , which by hypothesis divides 210, $p$ must also be a prime factor of 210, so $p \in \{2, 3, 5, 7\}$ . Also, $p$ cannot be present in the factorization of $d$ more than once, otherwise, $d$ being a divisor of 210, it would be present multiple times also in the factorization of 210, but this is not possible because 210 is square-free. So $d$ is the product of distinct prime numbers chosen in the set $\{2, 3, 5, 7\}$ .
We said that the sign of the generic term $\left\lfloor \frac{N}{d} \right\rfloor$ is positive if $d$ is the product of a even number of primes; negative if $d$ is the product of an odd number of primes. But there is a function, well known in number theory, which performs this calculation: it’s the Möbius function, denoted by the symbol $\mu$ . The definition, which is detailed in the prime number Theorem path, is as follows:
$\mu(d) := \begin{cases} 1 & \begin{aligned} & \text{if $d$ is square free and is the product} \\ & \text{of an even number of prime factors} \end{aligned} \\ \\ -1 & \begin{aligned} & \text{if $d$ is square free and is the product} \\ & \text{of an odd number of prime factors} \end{aligned} \\ \\ 0 & \text{if $d$ is not square free} \end{cases}$ < /div>

where $d$ is a positive integer.
In our case $d$ is a divisor of 210, but since the latter is square free, $d$ will also have the same property, so $\mu(d)$ will never be zero. Furthermore, by definition, $\mu(d) = -1$ if $d$ is the product of an odd number of prime factors, while $\mu(d) = 1$ if $d$ is the product of an even number of prime factors. So we can rewrite the positive terms of the formula (where $d$ is the product of an even number of prime factors) as $\left\lfloor \frac{N}{d} \right\rfloor = 1 \cdot \left\lfloor \frac{N}{d} \right\rfloor = \mu(d) \left\lfloor \frac{N}{d} \right\rfloor$ and the negative terms (where $d$ is the product of an odd number of prime factors) as $- \left\lfloor \frac{N}{d} \right\rfloor = (-1) \cdot \left\lfloor \frac{N}{d} \right\rfloor = \mu(d) \left\lfloor \frac{N}{d} \right\rfloor$ , obtaining in both cases the expression $\mu(d) \left\lfloor \frac{N}{d} \right\rfloor$ .

Taking these two observations into account, formula (3) can be rewritten as follows:

\sum_{d \mid 210} \mu(d) \left\lfloor \frac{210}{d} \right\rfloor

Now the formula is much easier to generalize, because we just need to replace 210 with a suitable expression that depends on our starting number $N$ . In this case, 210 is the product of the first 4 prime numbers, i.e. the product of the prime numbers less than or equal to $\sqrt{N} = 32$ , or equivalently, as we saw in the previous article, less than $\sqrt{N + 1}$ . In the previous article we also saw a simple notation to indicate the product of prime numbers less than a certain number $z$ , i.e. $P(z)$ . So $210 = P(\sqrt{33}) = P(\sqrt{N + 1})$ . This observation allows us to write the general formula of the Eratosthenes’ sieve as follows:

General formula of the sieve of Eratosthenes: algebraic expression

For every integer $N \gt 1$ , the general formula of the sieve of Eratosthenes with starting number $N$ is as follows:

\sum_{d \mid P(\sqrt{N+1})} \mu(d) \left\lfloor \frac{N}{d} \right\rfloor

where $P(\sqrt{N+1})$ denotes the product of prime numbers less than $\sqrt{N+1}$ .

So far we have concentrated our attention on the formula alone, but we have seen that this formula is based on a set reasoning, the result of which is precisely the value of the sieve function $S(A, \mathbb{P}, \sqrt{N + 1})$ . We have verified this on the basis of an example, but in order to prove it in general we should use the principle of inclusion-exclusion, the proof of which however is not relevant for our purposes, but is achieved through the same basic mechanism. For this reason we’ll omit the proof of the following Proposition:

Eratosthenes’ sieve formula as a sieve function

For every integer $N \gt 1$ , setting $A := \{1, 2, 3, \ldots, N\}$ , the following equality holds:

S(A, \mathbb{P}, \sqrt{N + 1}) = \sum_{d \mid P(\sqrt{N+ 1})} \mu(d) \left\lfloor \frac{N}{d} \right\rfloor \tag{4}

where $P(\sqrt{N+1})$ denotes the product of prime numbers less than $\sqrt{N+1}$ .

Again using the inclusion-exclusion principle, it can be proved that formula (4) is valid even if the sieving level $z$ is changed. The sieve of Eratosthenes is obtained with the choice $z = \sqrt{N + 1}$ , i.e. considering the prime numbers less than or equal to $\sqrt{N}$ ; if we changed $z$ we would obtain a different sieve, which eliminates from $A$ all multiples of prime numbers less than $z$ , but formula (4) would still hold replacing $\sqrt{N+1}$ with $z$ . This generalization is precisely Legendre’s identity:

Legendre’s identity

For every integer $N \gt 1$ and for every real number $z \gt 1$ , setting $A := \{1, 2, 3, \ldots, N\}$ , the following equality holds:

S(A, \mathbb{P}, z) = \sum_{d \mid P(z)} \mu(d) \left\lfloor \frac{N}{d} \right\rfloor

where $P(z)$ denotes the product of prime numbers less than $z$ .

As an example, the one we saw at the beginning is valid, provided that we stop before eliminating the multiples of 5. In fact, with $N = 32$ the last prime number considered by the sieve of Eratosthenes is 5, so if we stop before (for example at 3) we’ll have a different sieve, with a different value of $z$ (any $z$ such that $3 \lt z \leq 5$ ). In such modified sieve we can verify the validity of Legendre’s identity: for example for $3 \lt z \leq 5$ Legendre’s identity expands to the formula $32 - \left\lfloor \frac{32}{2} \right\rfloor - \left\lfloor \frac{32}{3} \right\rfloor + \left\lfloor \frac{32}{6} \right\rfloor$ which , as can be seen in formula (c) of the example, correctly calculates the number of elements of $A$ after removing the multiples of 2 and the multiples of 3. In essence, therefore, Legendre’s identity tells us that the general formula of the sieve of Eratosthenes holds not only at the end of the algorithm, but also in the intermediate steps.

Why don’t algorithmic approaches work well in sieve theory? (part I)

From Eratosthenes’ sieve algorithm to the calculation of the corresponding sieve function

The general formula of the sieve of Eratosthenes

Legendre’s identity

Leave a Reply Cancel reply

Dashed line viewer

Goldbach pairs viewer

Bidimensional sieve of Eratosthenes

Factorizer

Maximum space distance calculator

Number theory statements

Number theory definitions and symbols

Dashed line theory definitions and symbols

Bookshop

Prizes

Our readers' contributions