
The proof strategy shown here is aimed at proving the Hypothesis H.1 (Hypothesis of existence of Goldbach pairs based on dashed lines) by using properties of spaces. We have found two different methods for doing this:

- A method that makes use of a formula for explicit calculation of the function \mathrm{t\_space};
- A method that makes use of the concept of maximum distance between consecutive spaces.

Both have, as a starting point, a reformulation of Hypothesis H.1, different in each case.

## Method based on explicit calculation of function \mathrm{t\_space}

Starting from Hypothesis H.1, we can use the function \mathrm{t\_space} to rewrite the two unknown spaces p and q as \mathrm{t\_space}(x) and \mathrm{t\_space}(y) respectively. We thus obtain the following Hypothesis, in which the unknown variables are no longer p and q, but x and y:

Hypothesis of existence of Goldbach pairs based on function \mathrm{t\_space}

Let 2n \gt 4 be an even number, and k the related validity order. Then, two positive integers x and y exist, such that, referring to dashed line T_k:

- \mathrm{t\_space}(x) and \mathrm{t\_space}(y) are both within the validity interval;
- \mathrm{t\_space}(x) + \mathrm{t\_space}(y) = 2n.

This method, then, consists in proving that the equation \mathrm{t\_space}(x) + \mathrm{t\_space}(y) = 2n has at least one solution (x, y), for each 2n \gt 4. We’ll name this equation *Goldbach’s equation about spaces*:

Goldbach’s equation about spaces

Let 2n \gt 4 be an even number, and k the related validity order. Then, the following equation in unknown variables x and y, referring to dashed line T_k:

\mathrm{t\_space}(x) + \mathrm{t\_space}(y) = 2n

is named **Goldbach’s equation about spaces**.

The following picture represents graphically this resolution method, based on an example:

The problems which must be solved in order to reach the goal are essentially the following:

- Find a formula for calculating the function \mathrm{t\_space} for linear dashed lines of any order (even if treating dashed lines T_k, for any k, could be enough);
- Rewrite Goldbach’s equation about spaces by using the formula, once it has been found;
- Prove that the rewritten equation has solutions;
- Prove that, for at least one of the solutions (x, y) found, the related values of \mathrm{t\_space}(x) and \mathrm{t\_space}(y) are within the validity interval.

At the end, we will be able to apply the Property T.1 (Spaces and prime numbers), which ensures that all spaces in the validity interval are also prime numbers, in order to assert that the solutions found this way are also Goldbach pairs, which would complete the proof of the conjecture.
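Before a closed formula for \mathrm{t\_space} is available, the equation can be explored numerically. The following Python sketch is only a brute-force illustration, not the explicit formula this method is looking for. It assumes that \mathrm{t\_space}(x) denotes the x-th positive space of T_k, counting 1 as the first space; this indexing, and the helper names `first_primes`, `validity_order`, `t_space` and `goldbach_space_solutions`, are assumptions made for the example. The validity interval is not checked explicitly: the sketch only discards the space 1.

```python
from itertools import count


def first_primes(k):
    """Return the first k prime numbers, found by trial division."""
    primes = []
    for candidate in count(2):
        if len(primes) == k:
            break
        if all(candidate % p for p in primes):
            primes.append(candidate)
    return primes


def validity_order(two_n):
    """Smallest k such that p_(k+1)^2 > 2n."""
    k = 1
    while first_primes(k + 1)[-1] ** 2 <= two_n:
        k += 1
    return k


def t_space(x, components):
    """x-th positive integer not divisible by any component (assumed indexing: x >= 1)."""
    found = 0
    for value in count(1):
        if all(value % c for c in components):
            found += 1
            if found == x:
                return value


def goldbach_space_solutions(two_n):
    """Pairs (x, y), x <= y, with t_space(x) + t_space(y) = 2n and both spaces > 1."""
    components = first_primes(validity_order(two_n))
    spaces = []
    x = 1
    while t_space(x, components) < two_n:
        spaces.append(t_space(x, components))
        x += 1
    # Map every space back to its index, so that solutions are expressed in x and y.
    index = {s: i + 1 for i, s in enumerate(spaces)}
    return [(index[s], index[two_n - s]) for s in spaces
            if 1 < s <= two_n - s and (two_n - s) in index]


print(goldbach_space_solutions(20))
```

For 2n = 20 the validity order is 2, the dashed line is T_2 = (2, 3), and the only solution with x \leq y is (x, y) = (3, 5), which corresponds to 7 + 13 = 20.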

## Method based on the concept of maximum distance between consecutive spaces

Solving Hypothesis H.1 would be much easier if, instead of finding the two spaces p and q, it were enough to find just one of them, say p. This would be possible if a mechanism existed to ensure that the integer q = 2n - p is in turn a space, without explicitly stating this condition. Such a mechanism exists, and it is based on a dashed line slightly more complicated than the T_k of Hypothesis H.1. Let’s see how it works.

The basic principle consists in limiting the possible choices of p by identifying conditions that ensure that q is a space too; in order to find them, we can start from the definition of space. If q must be a space, then it must not be divisible by any component of the dashed line. In Hypothesis H.1 the dashed line is T_k, so q must not be divisible by p_1, \ldots, p_k, i.e. q \mathrm{\ mod\ } p_i must be different from 0 for each i = 1, \ldots, k. The interesting aspect is that this condition on q, thanks to the Goldbach equation p + q = 2n that binds p and q, can be transformed into a similar condition on p. This follows from a general property of integers, expressed by the following Lemma:

Relationship between the moduli of two positive integers and the one of their sum

Let a, b and m be three positive integers. Then

a \mathrm{\ mod\ } m = 0 \iff b \mathrm{\ mod\ } m = (a + b) \mathrm{\ mod\ } m

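Let h := a + b, so that:

a + b = h \tag{1}

Let’s indicate with a^{\prime}, b^{\prime} and h^{\prime}, respectively, the quotients of the division of a, b and h by m, so that a = ma^{\prime} + a \mathrm{\ mod\ } m, b = mb^{\prime} + b \mathrm{\ mod\ } m and h = mh^{\prime} + h \mathrm{\ mod\ } m. Replacing in (1) we have:

ma^{\prime} + a \mathrm{\ mod\ } m + mb^{\prime} + b \mathrm{\ mod\ } m = mh^{\prime} + h \mathrm{\ mod\ } m

Separating the moduli from the other terms and collecting m, we have:

a \mathrm{\ mod\ } m + b \mathrm{\ mod\ } m - h \mathrm{\ mod\ } m = m(h^{\prime} - a^{\prime} - b^{\prime}) \tag{2}

If a \mathrm{\ mod\ } m = 0, we have:

h \mathrm{\ mod\ } m - b \mathrm{\ mod\ } m = m(a^{\prime} + b^{\prime} - h^{\prime})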

In particular, h \mathrm{\ mod\ } m - b \mathrm{\ mod\ } m is a multiple of m. But h \mathrm{\ mod\ } m and b \mathrm{\ mod\ } m, by definition of modulus, are both less than m, so their difference can vary from a minimum of -(m - 1) (when h \mathrm{\ mod\ } m = 0 and b \mathrm{\ mod\ } m = m - 1) to a maximum of m - 1 (when h \mathrm{\ mod\ } m = m - 1 and b \mathrm{\ mod\ } m = 0). Then, the only multiple of m which can be equal to the difference h \mathrm{\ mod\ } m - b \mathrm{\ mod\ } m is 0, so b \mathrm{\ mod\ } m = h \mathrm{\ mod\ } m. But h = a + b, so b \mathrm{\ mod\ } m = (a + b) \mathrm{\ mod\ } m. This way, we have proved the implication from left to right.

As for the opposite implication, if we suppose that b \mathrm{\ mod\ } m = (a + b) \mathrm{\ mod\ } m = h \mathrm{\ mod\ } m, replacing in (2) we have:

a \mathrm{\ mod\ } m = m(h^{\prime} - a^{\prime} - b^{\prime})

In particular, a \mathrm{\ mod\ } m is a multiple of m; but, by definition of modulus, its value must be between 0 and m - 1, so it can only be equal to zero. Then, the implication from right to left is proved too.
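The Lemma can also be checked numerically on small values. The following Python sketch assumes that the statement is a \mathrm{\ mod\ } m = 0 \iff b \mathrm{\ mod\ } m = (a + b) \mathrm{\ mod\ } m, as the proof above suggests; the function names are hypothetical. An exhaustive check on a finite range is a sanity test, not a proof.

```python
def lemma_holds(a, b, m):
    """True when the two sides of the equivalence have the same truth value."""
    left = (a % m == 0)
    right = (b % m == (a + b) % m)
    return left == right


def check_lemma(limit):
    """Check the equivalence for all positive a, b, m up to the given limit."""
    return all(lemma_holds(a, b, m)
               for a in range(1, limit + 1)
               for b in range(1, limit + 1)
               for m in range(1, limit + 1))


print(check_lemma(30))
```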

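Applying Lemma L.1 to our case, with a := q, b := p and m := p_i, and recalling that p + q = 2n, we have:

q \mathrm{\ mod\ } p_i = 0 \iff p \mathrm{\ mod\ } p_i = 2n \mathrm{\ mod\ } p_i

Then, by negating both sides, we have:

q \mathrm{\ mod\ } p_i \neq 0 \iff p \mathrm{\ mod\ } p_i \neq 2n \mathrm{\ mod\ } p_i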

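And this is true for each component p_i, because we made no particular assumption about which component we chose. It means that the condition stating that q is a space, i.e.:

\begin{cases} q \mathrm{\ mod\ } p_1 \neq 0 \\ q \mathrm{\ mod\ } p_2 \neq 0 \\ \ldots \\ q \mathrm{\ mod\ } p_k \neq 0 \end{cases}

is equivalent to:

\begin{cases} p \mathrm{\ mod\ } p_1 \neq 2n \mathrm{\ mod\ } p_1 \\ p \mathrm{\ mod\ } p_2 \neq 2n \mathrm{\ mod\ } p_2 \\ \ldots \\ p \mathrm{\ mod\ } p_k \neq 2n \mathrm{\ mod\ } p_k \end{cases} \tag{3}

The condition that q is a space, then, has been translated into an equivalent condition on p. This way, we can focus on the single variable p.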
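The translation from q to p can be verified on concrete numbers. The following Python sketch, with hypothetical function names, compares the two conditions for every p between 1 and 2n - 1: "q = 2n - p is not divisible by any component" on one side, and "p \mathrm{\ mod\ } p_i \neq 2n \mathrm{\ mod\ } p_i for each i" on the other.

```python
def is_space(value, components):
    """True if value is not divisible by any component of the dashed line."""
    return all(value % c != 0 for c in components)


def condition_on_p(p, two_n, components):
    """The condition on q, rewritten as a condition on p only."""
    return all(p % c != two_n % c for c in components)


def translation_holds(two_n, components):
    """Compare the two conditions for every p with 1 <= p <= 2n - 1."""
    return all(is_space(two_n - p, components) == condition_on_p(p, two_n, components)
               for p in range(1, two_n))


print(translation_holds(100, [2, 3, 5, 7]))
```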

Going back to Hypothesis H.1, we have to consider that also p must be a space; then, recapping all, p must satisfy two conditions:

- It must be a space;
- It must be such that also q = 2n - p is a space.

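The first condition can be written as we have already done for q:

\begin{cases} p \mathrm{\ mod\ } p_1 \neq 0 \\ p \mathrm{\ mod\ } p_2 \neq 0 \\ \ldots \\ p \mathrm{\ mod\ } p_k \neq 0 \end{cases} \tag{4}

The second condition, as we have already seen, is given by (3). Combining formulas (3) and (4), we obtain the following formula:

\begin{cases} p \mathrm{\ mod\ } p_1 \notin \{0, 2n \mathrm{\ mod\ } p_1\} \\ p \mathrm{\ mod\ } p_2 \notin \{0, 2n \mathrm{\ mod\ } p_2\} \\ \ldots \\ p \mathrm{\ mod\ } p_k \notin \{0, 2n \mathrm{\ mod\ } p_k\} \end{cases}

We can observe that p_1 = 2, so, in the first line, we have 2n \mathrm{\ mod\ } p_1 = 2n \mathrm{\ mod\ } 2 = 0; then, the set \{0, 2n \mathrm{\ mod\ } p_1\} is actually made up of the number 0 only. We can make this condition more explicit by separating the first line from the following ones, obtaining:

\begin{cases} p \mathrm{\ mod\ } p_1 \neq 0 \\ p \mathrm{\ mod\ } p_2 \notin \{0, 2n \mathrm{\ mod\ } p_2\} \\ \ldots \\ p \mathrm{\ mod\ } p_k \notin \{0, 2n \mathrm{\ mod\ } p_k\} \end{cases} \tag{5}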

For the lines related to p_2, \ldots, p_k, the same situation as in the first line may occur: for a generic component p_i, with i = 2, \ldots, k, the value of 2n \mathrm{\ mod\ } p_i may be 0, in which case the set \{0, 2n \mathrm{\ mod\ } p_i\} reduces to the single element 0. This can happen but, unlike in the first line, it is not always the case.

Based on previous passages, we can prove the following Proposition:

Characterization of Goldbach pairs formed by spaces of T_k

Let 2n \gt 4 be an even number, and k the related validity order. Then, given two positive integers p and q, the following conditions are equivalent:

1. (p, q) is a Goldbach pair for 2n formed by two spaces of T_k;
2. The following three conditions hold:
   - (a) 1 \lt p \lt 2n - 1;
   - (b) \begin{cases} p \mathrm{\ mod\ } p_1 \neq 0 \\ p \mathrm{\ mod\ } p_2 \notin \{0, 2n \mathrm{\ mod\ } p_2\} \\ \ldots \\ p \mathrm{\ mod\ } p_k \notin \{0, 2n \mathrm{\ mod\ } p_k\} \end{cases};
   - (c) q = 2n - p.

Let’s start by proving 1. \Rightarrow 2. First of all, p cannot be 1, otherwise it would not be prime and could not belong to a Goldbach pair. Since p is a positive integer by hypothesis, it must be p \gt 1. The same reasoning can be repeated for q, so we also have q \gt 1. But, by Goldbach’s equation, p + q = 2n, so p = 2n - q and, since q \gt 1, p \lt 2n - 1. This way, we have proved (a).

We have seen that (b) is equivalent to stating that p and q = 2n - p are both spaces of T_k, which is true by hypothesis 1., so (b) is proved too. Finally, (c) is just a rewriting of Goldbach’s equation.

Let’s prove the opposite implication, 2. \Rightarrow 1., i.e. that if (a), (b) and (c) are true, then (p, q) is a Goldbach pair for 2n formed by spaces of T_k, i.e.:

- (i) p and q are spaces of T_k;
- (ii) p and q are primes;
- (iii) p + q = 2n.

Point (i) follows directly from (b), as we have seen before. About (ii), by Property T.1 (Spaces and prime numbers) we just need to prove that p and q are within the validity interval, i.e. that p_k + 1 \leq p \leq p_{k+1}^2 - 1 (the argument for q is the same, because (a) and (c) imply 1 \lt q \lt 2n - 1, and (b) states that q is a space too). This is true because:

- The smallest space of T_k greater than 1 is at least p_k + 1. In fact, if m is an integer between 2 and p_k, all its prime factors, which are less than or equal to m, are also less than or equal to p_k, so each of them coincides with one of the primes p_1, \ldots, p_k, which are the components of T_k. Then m, which is divisible by one of the components of the dashed line, isn’t a space. This holds for every m such that 2 \leq m \leq p_k, so none of these m can be a space, and the smallest space greater than 1 is greater than or equal to p_k + 1.
- k is the validity order related to 2n, i.e. the smallest integer such that p_{k+1}^2 \gt 2n, so 2n \leq p_{k+1}^2 - 1; then, by (a), p \lt 2n - 1 \lt 2n \leq p_{k+1}^2 - 1, hence p \leq p_{k+1}^2 - 1.

Finally, (iii) follows directly from (c).
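The Proposition can be tested by brute force on small even numbers. The following Python sketch, whose helper names are hypothetical, computes on one side the values of p that satisfy conditions (a) and (b), and on the other side the values of p such that p and 2n - p are both primes and both spaces of T_k; it assumes the validity order is the smallest k such that p_{k+1}^2 \gt 2n. The two lists are expected to coincide.

```python
from itertools import count
from math import isqrt


def first_primes(k):
    """Return the first k prime numbers, found by trial division."""
    primes = []
    for candidate in count(2):
        if len(primes) == k:
            break
        if all(candidate % p for p in primes):
            primes.append(candidate)
    return primes


def validity_order(two_n):
    """Smallest k such that p_(k+1)^2 > 2n."""
    k = 1
    while first_primes(k + 1)[-1] ** 2 <= two_n:
        k += 1
    return k


def is_prime(value):
    return value > 1 and all(value % d for d in range(2, isqrt(value) + 1))


def candidates(two_n):
    """Values of p satisfying conditions (a) and (b) of the Proposition."""
    primes = first_primes(validity_order(two_n))
    return [p for p in range(2, two_n - 1)  # (a): 1 < p < 2n - 1
            if all(p % c not in {0, two_n % c} for c in primes)]  # (b)


def goldbach_pairs_of_spaces(two_n):
    """Values of p such that (p, 2n - p) is a Goldbach pair formed by spaces of T_k."""
    primes = first_primes(validity_order(two_n))

    def is_space(value):
        return all(value % c for c in primes)

    return [p for p in range(1, two_n)
            if is_prime(p) and is_prime(two_n - p)
            and is_space(p) and is_space(two_n - p)]


print(candidates(20))
print(candidates(20) == goldbach_pairs_of_spaces(20))
```

For 2n = 20 the candidates are 7 and 13, corresponding to the pair 7 + 13. The pair 3 + 17 is a Goldbach pair as well, but it is not found because 3 is a component of T_2 = (2, 3), hence not a space.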

In order to understand formula (5), it is better to decompose each condition of the kind p \mathrm{\ mod\ } p_i \notin \{0, 2n \mathrm{\ mod\ } p_i\} into the two conditions p \mathrm{\ mod\ } p_i \neq 0 and p \mathrm{\ mod\ } p_i \neq 2n \mathrm{\ mod\ } p_i, and to sort them as follows:

- p \mathrm{\ mod\ } p_1 \neq 0
- p \mathrm{\ mod\ } p_2 \neq 0
- \ldots
- p \mathrm{\ mod\ } p_k \neq 0
- p \mathrm{\ mod\ } p_2 \neq 2n \mathrm{\ mod\ } p_2
- \ldots
- p \mathrm{\ mod\ } p_k \neq 2n \mathrm{\ mod\ } p_k

The first k conditions state that p must not be divisible by p_1, \ldots, p_k, i.e. that it must be a space of the dashed line T_k. As an example, let’s consider, for k = 4, the following dashed line:

Referring to the representation of this dashed line, the values of p which satisfy the first k conditions correspond to the columns not containing dashes. This happens because, on each row, we put a dash in correspondence of the values of p which *do not* satisfy the related condition: in general, on row i there’s a dash in correspondence of the columns p such that p \mathrm{\ mod\ } p_i = 0, which is the opposite of p \mathrm{\ mod\ } p_i \neq 0. Conversely, the columns which *do not* have a dash on row i are the ones that *satisfy* the i-th condition, i.e. such that p \mathrm{\ mod\ } p_i \neq 0. As a consequence, if a column contains *no* dashes on any row, it means that such column satisfies *all* the first k conditions, i.e., by definition, it’s a space.

The same process can be extended to the remaining conditions; this step, which is not part of the construction of classic linear dashed lines, is needed for passing from T_k to the new type of dashed line required to proceed. By relating the condition p \mathrm{\ mod\ } p_i \neq 2n \mathrm{\ mod\ } p_i to row i, with i = 2, \ldots, k, on each row we can add a dash in correspondence with the columns for which the corresponding condition is *not* satisfied (if the cell already contains a dash, we keep the existing one). So, on the second row we will add a dash in correspondence of the columns p such that p \mathrm{\ mod\ } p_2 = 2n \mathrm{\ mod\ } p_2, and we will go on like this until row k, where we add a dash in correspondence of the columns p such that p \mathrm{\ mod\ } p_k = 2n \mathrm{\ mod\ } p_k, each time adding a dash only if one isn’t already present. For example, if 2n \mathrm{\ mod\ } p_i is 1 for i = 2, 0 for i = 3 and 2 for i = 4, we will obtain the following dashed line, which we will name U:

We can see that 2n \mathrm{\ mod\ } p_3 = 0, so we added no dashes on row 3 with respect to the starting dashed line T_k: the latter already had the dashes of the third row in correspondence with the columns p such that p \mathrm{\ mod\ } p_3 = 0 = 2n \mathrm{\ mod\ } p_3. By comparing the representation of the dashed line U with the one of T_k, we can say that some rows (1 and 3) of the new dashed line coincide with the same rows of T_k, but other rows (2 and 4) have twice as many dashes (more exactly, considering a set of consecutive columns whose cardinality is a multiple of the component related to the row, U has twice as many dashes as T_k has in the same columns). For this reason, we will name U a *double linear dashed line*, or simply *double dashed line*. The mathematical function expressing U is the following:

In general, starting from the dashed line T_k and setting r_i := 2n \mathrm{\ mod\ } p_i for i = 1, \ldots, k, we obtain a dashed line which will be named T_k^{(r_1, \ldots, r_k)}, given by the following function:

It’s possible to verify that (7) coincides with (6) for k = 4 and (r_1, \ldots, r_k) = (r_1, r_2, r_3, r_4) = (0, 1, 0, 2), which are the hypotheses of our example.
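The construction of U described above can be reproduced with a few lines of code. The following Python sketch does not implement formulas (6) and (7); it only applies the rule stated in the text, i.e. on row i there is a dash on column p when p \mathrm{\ mod\ } p_i is 0 or r_i. The text rendering, with `-` for a dash and `.` for an empty cell, and the function names are assumptions made for the example.

```python
def dash_rows(components, displacements, columns):
    """One string per row: '-' where column p has a dash, '.' elsewhere."""
    rows = []
    for n_i, r_i in zip(components, displacements):
        rows.append("".join("-" if p % n_i in (0, r_i) else "."
                            for p in range(1, columns + 1)))
    return rows


def spaces(components, displacements, columns):
    """Columns from 1 to `columns` that contain no dash on any row."""
    return [p for p in range(1, columns + 1)
            if all(p % n_i not in (0, r_i)
                   for n_i, r_i in zip(components, displacements))]


# The example of the text: k = 4 and displacements (0, 1, 0, 2).
for row in dash_rows((2, 3, 5, 7), (0, 1, 0, 2), 30):
    print(row)
print(spaces((2, 3, 5, 7), (0, 1, 0, 2), 30))
```

The displacements (0, 1, 0, 2) are obtained, for instance, with 2n = 100, whose validity order is 4. In that case the spaces of U between 2 and 98 are 11, 17, 29, 41, 47, 53, 59, 71, 83 and 89, each of which forms a Goldbach pair for 100 together with its complement.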

More generally, if instead of T_k we consider a generic linear dashed line T, we obtain the following definition of double dashed line, which builds on Definitions T.1 (Dashed line) and T.2 (Linear dashed line):

Double dashed line

Let T = (n_1, \ldots, n_k) be a linear dashed line of order k with I as the set of its indices. Let r_1, \ldots, r_k be some integers such that, for each i = 1, \ldots, k, we have 0 \leq r_i \lt n_i. Then, we define the dashed line T^{(r_1, \ldots, r_k)}: I \times \mathbb{N} \rightarrow \mathbb{N} such that:

It’s named **double linear dashed line** (or simply **double dashed line**) based on T, with displacements r_1, \ldots, r_k.

By using the definition of double dashed line, we can rewrite Proposition L.1 (Characterization of Goldbach pairs formed by spaces of T_k) in the following form:

Characterization of Goldbach pairs formed by spaces of T_k

Let 2n \gt 4 be an even number, and k the related validity order. Then, given two positive integers p and q, the following conditions are equivalent:

1. (p, q) is a Goldbach pair for 2n formed by spaces of T_k;
2. The following three conditions hold:
   - (a) 1 \lt p \lt 2n;
   - (b) p is a space of the double dashed line T_k^{(0, r_2, \ldots, r_k)}, with r_i := 2n \mathrm{\ mod\ } p_i for each i = 2, \ldots, k;
   - (c) q = 2n - p.

By using this Proposition, we can rewrite the Hypothesis H.1 in this new form:

Hypothesis of existence of Goldbach pairs based on dashed lines (second form)

Let 2n \gt 4 be an even number, and let k be the related validity order. Let r_i := 2n \mathrm{\ mod\ } p_i, for each i = 2, \ldots, k. Then, the double dashed line T_k^{(0, r_2, \ldots, r_k)} contains at least one space p such that 1 \lt p \lt 2n.

So, we have come to a fundamental problem: how can we ensure that a dashed line (a double dashed line in this case) contains a space in a preset range?

A simple way of doing it is to consider the maximum distance between two consecutive spaces of the dashed line. In fact, if we name this number d, i.e. if two consecutive spaces of the dashed line can have a maximum distance equal to d, we just have to compare d with the width A of the range, including its bounds:

Maximum distance between spaces and existence of spaces in a range

Let I be a range of width A, including its bounds. Let d be the maximum distance between two consecutive spaces of a dashed line T. If d \leq A, then I contains at least one space of T.

In the case of Hypothesis H.1 (second form), our range is [2, 2n - 1], the width of which is (2n - 1) - 2 + 1 = 2n - 2. Then, if we prove that the maximum distance between consecutive spaces of the dashed line T_k^{(0, r_2, \ldots, r_k)} is less than or equal to 2n - 2, a space p surely exists in the range [2, 2n - 1], i.e. such that 1 \lt p \lt 2n, and the hypothesis would be proved. This observation allows us to formulate a further Hypothesis:

Hypothesis of existence of Goldbach pairs based on maximum distance between spaces

Let 2n \gt 4 be an even number, and let k be the related validity order. Let r_i := 2n \mathrm{\ mod\ } p_i, for each i = 2, \ldots, k. Then, the maximum distance between two consecutive spaces of the double dashed line T_k^{(0, r_2, \ldots, r_k)} is less than or equal to 2n - 2.

A flaw of this Hypothesis is that the number 2n - 2, with which we must compare the maximum distance between consecutive spaces, depends on n, whereas the dashed line T_k^{(0, r_2, \ldots, r_k)} depends on k (and only indirectly on n, because k, in turn, is defined starting from n). We could simplify things by replacing 2n - 2 with an expression depending on k, so that we can study more easily what happens as k varies. This is possible, remembering that k has been defined (Definition L.2) as the smallest integer such that p_{k+1}^2 \gt 2n. It means that p_k^2 \leq 2n (otherwise, the relation defining k would also be satisfied by k - 1, so k would no longer be *the smallest* integer satisfying it), so p_k^2 - 2 \leq 2n - 2. Then, if we prove that the maximum distance between two consecutive spaces of the double dashed line T_k^{(0, r_2, \ldots, r_k)} is less than or equal to p_k^2 - 2, it will also be less than or equal to 2n - 2. So, in Hypothesis H.1.MDS, we can replace 2n - 2 with p_k^2 - 2, obtaining a new Hypothesis which is slightly stronger, but more appropriate for studying what happens as k varies:

Hypothesis of existence of Goldbach pairs based on maximum distance between spaces (stronger version)

Let 2n \gt 4 be an even number, and let k be the related validity order. Let r_i := 2n \mathrm{\ mod\ } p_i, for each i = 2, \ldots, k. Then, the maximum distance between two consecutive spaces of the double dashed line T_k^{(0, r_2, \ldots, r_k)} is less than or equal to p_k^2 - 2.
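Both versions of the Hypothesis can be checked numerically for small even numbers. The following Python sketch is a brute-force experiment, not a proof: it assumes that the spaces of a double dashed line repeat with a period equal to the product of the components (which holds here because membership depends only on residues modulo the components), so that scanning two periods is enough to find the maximum distance. The function names are hypothetical.

```python
from itertools import count
from math import prod


def first_primes(k):
    """Return the first k prime numbers, found by trial division."""
    primes = []
    for candidate in count(2):
        if len(primes) == k:
            break
        if all(candidate % p for p in primes):
            primes.append(candidate)
    return primes


def validity_order(two_n):
    """Smallest k such that p_(k+1)^2 > 2n."""
    k = 1
    while first_primes(k + 1)[-1] ** 2 <= two_n:
        k += 1
    return k


def mds(components, displacements):
    """Maximum distance between consecutive spaces, found by scanning two periods."""
    period = prod(components)
    found = [p for p in range(1, 2 * period + 1)
             if all(p % n_i not in (0, r_i)
                    for n_i, r_i in zip(components, displacements))]
    return max(b - a for a, b in zip(found, found[1:]))


def check_mds_hypotheses(two_n):
    """Return (distance, distance <= 2n - 2, distance <= p_k^2 - 2)."""
    primes = first_primes(validity_order(two_n))
    displacements = [two_n % p for p in primes]  # the first one is always 0
    distance = mds(primes, displacements)
    return distance, distance <= two_n - 2, distance <= primes[-1] ** 2 - 2


print(check_mds_hypotheses(26))
```

For 2n = 26 the validity order is 3 and the displacements are (0, 2, 1): the maximum distance is 18, which is below both 2n - 2 = 24 and p_k^2 - 2 = 23.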

At this point, the question can be posed in more general terms, trying to understand how the maximum distance between two consecutive spaces of a double linear dashed line varies as the order k varies, and possibly proving that this distance does not depend on the displacements r_2, \ldots, r_k, but only on k.

Since the concept of maximum distance between consecutive spaces of a dashed line will recur often, it’s convenient to define a symbol to write it in a compact form:

Maximum distance between consecutive spaces of a dashed line

Let T be a dashed line. The maximum distance between consecutive spaces of T will be indicated with the symbol \mathrm{MDS}(T).

We can observe that, in order to try to prove Hypothesis H.1.MDSk, we do not actually need to calculate \mathrm{MDS}(T_k^{(0, r_2, \ldots, r_k)}) *exactly*: an upper bound is enough. In other words, we may even have no idea of what this maximum distance is; the important point is to prove that it’s less than or equal to p_k^2 - 2 or, equivalently, that it’s less than or equal to a constant M_k which, in turn, is less than or equal to p_k^2 - 2. We have already made some progress in this direction, starting for simplicity from linear dashed lines instead of double dashed lines. We have already proved that, for a linear dashed line (n_1, \ldots, n_k), independently of its components, this M_k is:

- 2, if the dashed line is of first order (Property D.1.S (Upper bound for maximum distance between consecutive spaces for first order linear dashed line));
- 4, if the dashed line is of second order (Property D.2.S (Upper bound for maximum distance between consecutive spaces for second order linear dashed lines));
- n_1 \cdot (n_3 + 4), if the dashed line is of third order (Property D.3.S (Upper bound for maximum distance between consecutive spaces for third order linear dashed lines)). The proof has still some open points, but proving them in turn should not be hard.

These results are compatible with Hypothesis H.1.MDSk because, for dashed lines of type T_k, the value we found is always less than or equal to p_k^2 - 2, as shown in the following table:

| k | T_k | p_k | p_k^2 - 2 | M_k such that \mathrm{MDS}(T_k) \leq M_k |
|---|---|---|---|---|
| 1 | (2) | 2 | 2 | 2 |
| 2 | (2, 3) | 3 | 7 | 4 |
| 3 | (2, 3, 5) | 5 | 23 | 2 \cdot (5 + 4) = 18 |

From the table, we see that M_k \leq p_k^2 - 2 in all three cases. Strictly speaking, the table cannot be compared directly with Hypothesis H.1.MDSk, because the latter considers *double dashed lines* based on dashed lines T_k, whereas the table is limited to the dashed lines T_k themselves. However, it is a first step, which can be extended in two phases:
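The upper bounds in the table can be compared with the exact values of \mathrm{MDS}(T_k), computed by brute force. The following Python sketch assumes, as in the definition of space, that a space of a linear dashed line is a positive integer not divisible by any component, and that the pattern of spaces repeats with a period equal to the product of the components; the function name `mds_linear` is hypothetical.

```python
from math import prod


def mds_linear(components):
    """Maximum distance between consecutive spaces of a linear dashed line."""
    period = prod(components)
    # Two periods are scanned, so the gap across the period boundary is included.
    found = [p for p in range(1, 2 * period + 1)
             if all(p % n for n in components)]
    return max(b - a for a, b in zip(found, found[1:]))


# Upper bounds M_k taken from the table above.
table_bounds = {(2,): 2, (2, 3): 4, (2, 3, 5): 18}
for components, bound in table_bounds.items():
    exact = mds_linear(components)
    print(components, exact, bound, components[-1] ** 2 - 2)
```

The exact values are 2, 4 and 6 respectively: for the first two orders the bound is attained, while for T_3 the exact value is well below the bound of 18.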

- Generalizing with respect to k, for example by finding a function f such that M_k \leq f(k) for each k;
- Introducing the concept of double dashed line, understanding how the maximum distance between consecutive spaces changes, when we pass from a linear dashed line to a double dashed line based upon it. For example, we could find a function g such that, if for a linear dashed line T we have \mathrm{MDS}(T) = d, then for any double dashed line T^{\prime} based on T we have \mathrm{MDS}(T^{\prime}) \leq g(d).

Finally, putting these results together, we can state that a double dashed line T_k^{\prime} based upon T_k (like the dashed line T_k^{(0, r_2, \ldots, r_k)} of Hypothesis H.1.MDSk) is such that \mathrm{MDS}(T_k^{\prime}) \leq g(f(k)), so the Hypothesis H.1.MDSk will be proved if g(f(k)) \leq p_k^2 - 2, for each k.