In 1637, Pierre de Fermat wrote in the margin of a book that he had a proof of his famous "Last Theorem":
If $A^n + B^n = C^n$,
where $A, B, C, n$ are positive integers
then $n \le 2$.
Centuries passed before Andrew Beal, a businessman and amateur mathematician, made his conjecture in 1993:
If $A^x + B^y = C^z$,
where $A, B, C, x, y, z$ are positive integers and $x, y, z$ are all greater than $2$,
then $A, B$ and $C$ must have a common prime factor.
Andrew Wiles proved Fermat's theorem in 1995, but Beal's offer of $1,000,000 for a proof or disproof of his conjecture remains unclaimed. I don't have the mathematical skills of Wiles, so all I can do is write a program to search for counterexamples. I first wrote that program in 2000, and my name got associated with Beal's Conjecture, which means I get a lot of emails with purported proofs or counterexamples (many asking how they can collect their prize money). So far, all the emails have been wrong. This page catalogs some of the more common errors—including two mistakes of my own—and shows an updated program.
A counterexample to Beal's conjecture would have to satisfy all four of these conditions:
* $A, B, C, x, y, z$ are positive integers
* $x, y, z > 2$
* $A^x + B^y = C^z$
* $A, B, C$ have no common prime factor.
(If you think you have a valid counterexample, before you share it with Andrew Beal, or me, or anyone else, you can check it with my Online Beal Counterexample Checker.)
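The four conditions translate directly into exact integer checks. What follows is only a minimal sketch, not the code behind the online checker; the function name `is_beal_counterexample` is made up for illustration. It insists on Python `int` arguments so that only exact arithmetic is ever used.

```python
from math import gcd

def is_beal_counterexample(A, B, C, x, y, z):
    """Return True only if all four counterexample conditions hold.
    Hypothetical helper; every argument must be an exact Python int."""
    values = (A, B, C, x, y, z)
    # Condition 1: positive integers (floats such as 3. are rejected outright).
    if not all(type(v) is int and v > 0 for v in values):
        return False
    # Condition 2: all three exponents are greater than 2.
    if min(x, y, z) <= 2:
        return False
    # Condition 3: the equation holds exactly.
    if A ** x + B ** y != C ** z:
        return False
    # Condition 4: no prime factor is shared by all of A, B, and C.
    return gcd(gcd(A, B), C) == 1

print(is_beal_counterexample(3, 6, 3, 3, 3, 5))   # equation holds, but 3 is a common factor
print(is_beal_counterexample(2, 3, 5, 3, 3, 3))   # equation does not hold
```

Both calls print `False`: the first fails only the common-factor condition, the second fails the equation itself.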
from math import gcd #### In Python versions < 3.5, use "from fractions import gcd"
A, B, C = 60000000000000000000, 70000000000000000000, 82376613842809255677
x = y = z = 3.
A ** x + B ** y == C ** z and gcd(gcd(A, B), C) == 1
True
WOW! The result is `True`! Is this a real counterexample to Beal? And also a disproof of Fermat?

Alas, it is not. Notice the decimal point in "`3.`", indicating a floating point number, with inexact, limited precision. Change the inexact "`3.`" to an exact "`3`" and the result changes to "`False`". Below we see that the two sides of the equation are the same for the first 18 digits, but differ starting with the 19th:
(A ** 3 + B ** 3,
C ** 3)
(559000000000000000000000000000000000000000000000000000000000, 559000000000000000063037470301555182935702892172500189973733)
They say "close" only counts in horseshoes and hand grenades, and if you threw two horseshoes at a stake on the planet Kapteyn-b (a possibly habitable and thus possibly horseshoe-playing exoplanet 12.8 light years from Earth) and the two paths differed in the 19th digit, the horseshoes would end up less than an inch apart. That's really, really close, but close doesn't count in number theory.
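To see why the floating point comparison cannot be trusted here, one can measure the exact relative gap between the two sides and compare it with the resolution of a double. This is a sketch for illustration; the variable names are my own, and `sys.float_info.epsilon` is used as a rough stand-in for "the smallest relative difference a float can reliably resolve".

```python
import sys
from fractions import Fraction

A, B, C = 60000000000000000000, 70000000000000000000, 82376613842809255677

lhs = A ** 3 + B ** 3          # exact integer arithmetic
rhs = C ** 3
exactly_equal = (lhs == rhs)   # the integers are compared digit for digit

# Exact relative gap between the two sides, as a rational number.
relative_gap = Fraction(abs(lhs - rhs), rhs)

# A gap smaller than machine epsilon is below what a float comparison can resolve.
gap_below_epsilon = relative_gap < Fraction(sys.float_info.epsilon)

print(exactly_equal, float(relative_gap), gap_below_epsilon)
```

The exact comparison reports that the sides differ, while the relative gap is far smaller than machine epsilon, which is why the float version of the test cannot tell them apart.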
Speaking of close: in two different episodes of The Simpsons, close counterexamples to Fermat's Last Theorem are shown: $1782^{12} + 1841^{12} = 1922^{12}$ and $3987^{12} + 4365^{12} = 4472^{12}$. These were designed by Simpsons writer David X. Cohen to be correct up to the precision found in most handheld calculators. Cohen found the equations with a program that must have been something like this:
from itertools import combinations
def simpsons(bases, powers):
    """Find the integers (A, B, C, n) that come closest to solving
    Fermat's equation, A ** n + B ** n == C ** n.
    Let A, B range over all pairs of bases and n over all powers."""
    equations = ((A, B, iroot(A ** n + B ** n, n), n)
                 for A, B in combinations(bases, 2)
                 for n in powers)
    return min(equations, key=relative_error)

def iroot(i, n):
    "The integer closest to the nth root of i."
    return int(round(i ** (1./n)))

def relative_error(equation):
    "Error between LHS and RHS of equation, relative to RHS."
    (A, B, C, n) = equation
    LHS = A ** n + B ** n
    RHS = C ** n
    return abs(LHS - RHS) / RHS
simpsons(range(1000, 2000), [11, 12, 13])
(1782, 1841, 1922, 12)
simpsons(range(3000, 5000), [12])
(3987, 4365, 4472, 12)
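Neither of these near misses needs big arithmetic to refute. As a small illustrative sketch (the helper name `fermat_gap` is mine), exact integers show a nonzero gap, and simple divisibility arguments rule each equation out: the first fails on parity, the second on divisibility by 3.

```python
def fermat_gap(A, B, C, n):
    "Exact integer difference A**n + B**n - C**n (zero only for a true solution)."
    return A ** n + B ** n - C ** n

near_misses = [(1782, 1841, 1922, 12), (3987, 4365, 4472, 12)]
gaps = [fermat_gap(A, B, C, n) for (A, B, C, n) in near_misses]
for (A, B, C, n), gap in zip(near_misses, gaps):
    print(A, B, C, n, 'gap digits:', len(str(abs(gap))), 'of', len(str(C ** n)))

# Parity rules out the first: even**12 + odd**12 is odd, but 1922**12 is even.
first_lhs_is_odd = (1782 ** 12 + 1841 ** 12) % 2 == 1
first_rhs_is_even = (1922 ** 12) % 2 == 0

# Divisibility by 3 rules out the second: 3987 and 4365 are multiples of 3, 4472 is not.
second_lhs_divisible_by_3 = (3987 ** 12 + 4365 ** 12) % 3 == 0
second_rhs_divisible_by_3 = (4472 ** 12) % 3 == 0
```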
`beal` 2.0 and 2.1

In October 2015 I looked back at my original program from 2000. I ported it from Python 1.5 to 3.5 (by putting parens around the argument to `print` and adding `long = int`). It runs 250 times faster today, a tribute to both computer hardware engineers and the developers of the Python interpreter.
I found that I had misunderstood the problem in 2000. I thought that, by definition, $A$ and $B$ could not have a common factor, but actually, the definition of the conjecture only rules out examples where all three of $A, B, C$ share a common factor. I rewrote the program to reflect that, but then [Mark Tiefenbruck](mailto:mark@tiefenbruck.org) (and later Edwin P. Berlin and Shen Lixing) wrote to point out that my original program was actually correct, not by definition, but by derivation: if $A$ and $B$ have a common prime factor $p$, then the sum $A^x + B^y$ must also have that factor $p$, and since $A^x + B^y = C^z$, then $C^z$ and hence (because $p$ is prime) $C$ must have the factor $p$. So I was wrong twice: I originally failed to understand the problem completely, and then I failed to recognize the optimization. That means the original program was correct.
Mark Tiefenbruck also suggested another optimization: only consider exponents that are odd primes, or 4. The idea is that a number like 512 can be expressed as either $2^9$ or $8^3$, and my program doesn't need to consider both. In general, any time we have a composite exponent, such as $b^{qp}$, where $p$ is prime, we should ignore $A=b, x=qp$, and instead consider only $A=b^q, x=p$. There's one complication to this scheme: 2 is a prime, but 2 is not a valid exponent for a Beal counterexample. So we will allow 4 as an exponent, as well as all odd primes up to `max_x`.
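The rewriting rule can be made concrete with a short sketch. The helper names `smallest_odd_prime_factor` and `canonical_power` are hypothetical and are not part of the program; when an exponent has several odd prime factors, picking the smallest is an arbitrary choice made here for illustration.

```python
def smallest_odd_prime_factor(n):
    "Return the smallest odd prime dividing n, or None if n is 1 or a power of 2."
    while n % 2 == 0:
        n //= 2
    p = 3
    while p * p <= n:
        if n % p == 0:
            return p
        p += 2
    return n if n > 1 else None

def canonical_power(base, exp):
    """Rewrite base ** exp (with exp > 2) as (new_base, new_exp) with the same
    value, where new_exp is an odd prime or 4."""
    assert exp > 2
    p = smallest_odd_prime_factor(exp)
    if p is None:
        p = 4          # exp is 4, 8, 16, ...: no odd prime factor, so fall back to 4
    return base ** (exp // p), p

print(canonical_power(2, 9))    # (8, 3), since 2**9 == 8**3 == 512
print(canonical_power(2, 8))    # (4, 4), since 2**8 == 4**4 == 256
```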
Here is the complete, updated program:
from math import gcd, log
from itertools import combinations, product
def beal(max_A, max_x):
    """See if any A ** x + B ** y equals some C ** z, with gcd(A, B) == 1.
    Consider any 1 <= A,B <= max_A and x,y <= max_x, with x,y prime or 4."""
    Apowers = make_Apowers(max_A, max_x)
    Czroots = make_Czroots(Apowers)
    for (A, B) in combinations(Apowers, 2):
        if gcd(A, B) == 1:
            for (Ax, By) in product(Apowers[A], Apowers[B]):
                Cz = Ax + By
                if Cz in Czroots:
                    C = Czroots[Cz]
                    x, y, z = exponent(Ax, A), exponent(By, B), exponent(Cz, C)
                    print('{} ** {} + {} ** {} == {} ** {} == {}'
                          .format(A, x, B, y, C, z, C ** z))

def make_Apowers(max_A, max_x):
    "A dict of {A: [A**3, A**4, ...], ...}."
    exponents = exponents_upto(max_x)
    return {A: [A ** x for x in (exponents if (A != 1) else [3])]
            for A in range(1, max_A+1)}

def make_Czroots(Apowers): return {Cz: C for C in Apowers for Cz in Apowers[C]}

def exponents_upto(max_x):
    "Return all odd primes up to max_x, as well as 4."
    exponents = [3, 4] if max_x >= 4 else [3] if max_x == 3 else []
    for x in range(5, max_x + 1, 2):  # max_x + 1 so that max_x itself is included
        if not any(x % p == 0 for p in exponents):
            exponents.append(x)
    return exponents

def exponent(Cz, C):
    """Recover z such that C ** z == Cz (or equivalently z = log Cz base C).
    For exponent(1, 1), arbitrarily choose to return 3."""
    return 3 if (Cz == C == 1) else int(round(log(Cz, C)))
It takes less than a second to verify that there are no counterexamples for combinations up to $100^{100}$, a computation that took Andrew Beal thousands of hours on his 1990s-era computers:
%time beal(100, 100)
CPU times: user 352 ms, sys: 2.2 ms, total: 354 ms Wall time: 354 ms
The execution time goes up roughly with the square of `max_A`, so with 5 times more `A` values, this computation takes about 25 times longer:
%time beal(500, 100)
CPU times: user 10.8 s, sys: 143 ms, total: 11 s Wall time: 11.1 s
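The factor of 25 is just the growth in the number of unordered pairs of bases. A quick check of that arithmetic (a sketch; the variable names are mine):

```python
from math import comb   # available in Python 3.8+

pairs_100 = comb(100, 2)   # unordered pairs of bases when max_A = 100
pairs_500 = comb(500, 2)   # unordered pairs of bases when max_A = 500
ratio = pairs_500 / pairs_100
print(pairs_100, pairs_500, round(ratio, 1))   # 4950 124750 25.2
```

The measured ratio above, 11.1 s against 354 ms, is about 31, so "roughly quadratic" is the right description rather than an exact law.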
How `beal` Works

The function `beal` first does some precomputation, creating two data structures:
* `Apowers`: a dict of the form `{A: [A**3, A**4, ...]}` giving the nonredundant powers (prime and 4th powers) of each base, `A`, from 1 to `max_A`.
* `Czroots`: a dict of `{C**z : C}` pairs, giving the zth root of each power in `Apowers`.

Then we consider all combinations of two bases, `A` and `B`, from `Apowers`.
Here is a very small example Apowers table:
Apowers = make_Apowers(6, 10)
Apowers
{1: [1], 2: [8, 16, 32, 128], 3: [27, 81, 243, 2187], 4: [64, 256, 1024, 16384], 5: [125, 625, 3125, 78125], 6: [216, 1296, 7776, 279936]}
Consider the combination where `A` is `3` and `B` is `6`. Of course `gcd(3, 6) == 3`, so the program would not consider them further, but imagine if they did not share a common factor. Then we would look at all possible `Ax + By` sums, for `Ax` in `[27, 81, 243, 2187]` and `By` in `[216, 1296, 7776, 279936]`. One of these would be `27 + 216`, which sums to `243`. We look up `243` in `Czroots`:
Czroots = make_Czroots(Apowers)
print(Czroots)
Czroots[243]
{128: 2, 1: 1, 1296: 6, 1024: 4, 32: 2, 8: 2, 64: 4, 2187: 3, 78125: 5, 256: 4, 16384: 4, 16: 2, 81: 3, 279936: 6, 243: 3, 3125: 5, 625: 5, 216: 6, 7776: 6, 27: 3, 125: 5}
3
We see that `243` is in `Czroots`, with value `3`, so this would be a counterexample (except for the common factor). The program uses the `exponent` function to recover the values of `x, y, z`, and prints the results.

Can we gain confidence in the program? It is difficult to test `beal`, because the expected output is nothing, for all known inputs. One thing we can do is verify that `beal` finds cases like `3 ** 3 + 6 ** 3 == 3 ** 5 == 243` that would be a counterexample except for the common factor `3`. We can test this by temporarily replacing the `gcd` function with a mock function that always reports no common factors:
def gcd(a, b): return 1
beal(100, 100)
3 ** 3 + 6 ** 3 == 3 ** 5 == 243
7 ** 7 + 49 ** 3 == 98 ** 3 == 941192
8 ** 4 + 16 ** 3 == 2 ** 13 == 8192
8 ** 5 + 32 ** 3 == 16 ** 4 == 65536
9 ** 3 + 18 ** 3 == 9 ** 4 == 6561
16 ** 5 + 32 ** 4 == 8 ** 7 == 2097152
17 ** 4 + 34 ** 4 == 17 ** 5 == 1419857
19 ** 4 + 38 ** 3 == 57 ** 3 == 185193
27 ** 3 + 54 ** 3 == 3 ** 11 == 177147
28 ** 3 + 84 ** 3 == 28 ** 4 == 614656
34 ** 5 + 51 ** 4 == 85 ** 4 == 52200625
Let's make sure all those expressions are true:
{3 ** 3 + 6 ** 3 == 3 ** 5 == 243,
7 ** 7 + 49 ** 3 == 98 ** 3 == 941192,
8 ** 4 + 16 ** 3 == 2 ** 13 == 8192,
8 ** 5 + 32 ** 3 == 16 ** 4 == 65536,
9 ** 3 + 18 ** 3 == 9 ** 4 == 6561,
16 ** 5 + 32 ** 4 == 8 ** 7 == 2097152,
17 ** 4 + 34 ** 4 == 17 ** 5 == 1419857,
19 ** 4 + 38 ** 3 == 57 ** 3 == 185193,
27 ** 3 + 54 ** 3 == 3 ** 11 == 177147,
28 ** 3 + 84 ** 3 == 28 ** 4 == 614656,
34 ** 5 + 51 ** 4 == 85 ** 4 == 52200625}
{True}
I get nervous having an incorrect version of `gcd` around; let's change it back, quick!
from math import gcd
beal(100, 100)
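One way to avoid ever having a wrong `gcd` in the global namespace is to pass the common-factor test in as a parameter, so that a mock lives only for the duration of one call. The sketch below is a simplified stand-in for `beal`, not the program itself; the name `power_sums` and its `common_factor` parameter are assumptions made for illustration, and it returns its findings instead of printing them.

```python
from math import gcd
from itertools import combinations, product

def power_sums(max_A, exponents, common_factor=gcd):
    """Toy search: return (A, x, B, y, C, z) with A**x + B**y == C**z for bases
    up to max_A, keeping only pairs where common_factor(A, B) == 1."""
    powers = {A: [(A ** x, x) for x in (exponents if A != 1 else exponents[:1])]
              for A in range(1, max_A + 1)}
    roots = {Cz: (C, z) for C in powers for (Cz, z) in powers[C]}
    found = []
    for A, B in combinations(powers, 2):
        if common_factor(A, B) == 1:
            for (Ax, x), (By, y) in product(powers[A], powers[B]):
                if Ax + By in roots:
                    C, z = roots[Ax + By]
                    found.append((A, x, B, y, C, z))
    return found

real = power_sums(6, [3, 4, 5, 7])                                   # uses the real gcd
mocked = power_sums(6, [3, 4, 5, 7], common_factor=lambda a, b: 1)   # mock, local to this call
print(real, mocked)
```

With the real `gcd` nothing is found; with the mock, `3 ** 3 + 6 ** 3 == 3 ** 5` shows up, and the real `gcd` was never replaced.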
We can also provide some test cases for the subfunctions of `beal`:
def tests():
    assert make_Apowers(6, 10) == {
        1: [1],
        2: [8, 16, 32, 128],
        3: [27, 81, 243, 2187],
        4: [64, 256, 1024, 16384],
        5: [125, 625, 3125, 78125],
        6: [216, 1296, 7776, 279936]}

    assert make_Czroots(make_Apowers(5, 8)) == {
        1: 1, 8: 2, 16: 2, 27: 3, 32: 2, 64: 4, 81: 3,
        125: 5, 128: 2, 243: 3, 256: 4, 625: 5, 1024: 4,
        2187: 3, 3125: 5, 16384: 4, 78125: 5}

    Czroots = make_Czroots(make_Apowers(100, 100))
    assert 3 ** 3 + 6 ** 3 in Czroots
    assert 99 ** 97 in Czroots
    assert 101 ** 100 not in Czroots
    assert Czroots[99 ** 97] == 99

    assert exponent(10 ** 5, 10) == 5
    assert exponent(7 ** 3, 7) == 3
    assert exponent(1234 ** 999, 1234) == 999
    assert exponent(12345 ** 6789, 12345) == 6789
    assert exponent(3 ** 10000, 3) == 10000
    assert exponent(1, 1) == 3

    assert exponents_upto(2) == []
    assert exponents_upto(3) == [3]
    assert exponents_upto(4) == [3, 4]
    assert exponents_upto(40) == [3, 4, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
    assert exponents_upto(100) == [
        3, 4, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61,
        67, 71, 73, 79, 83, 89, 97]

    assert gcd(3, 6) == 3
    assert gcd(3, 7) == 1
    assert gcd(861591083269373931, 94815872265407) == 97
    assert gcd(2*3*5*(7**10)*(11**12), 3*(7**5)*(11**13)*17) == 3*(7**5)*(11**12)
    return 'tests pass'
tests()
'tests pass'
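The `exponent` function depends on a floating point `log`. If that is a worry, the exponent can also be recovered with integer arithmetic alone, by dividing repeatedly. This is a sketch of an alternative, not a replacement the program requires; the name `exact_exponent` is hypothetical.

```python
def exact_exponent(Cz, C):
    """Return z such that C ** z == Cz, using only integer arithmetic.
    Mirrors exponent(1, 1) == 3 for the special case, and raises ValueError
    if Cz is not an exact power of C."""
    if Cz == C == 1:
        return 3
    if C < 2 or Cz < 1:
        raise ValueError('need C >= 2 and Cz >= 1')
    z = 0
    while Cz % C == 0:      # strip one factor of C per iteration
        Cz //= C
        z += 1
    if Cz != 1:
        raise ValueError('not an exact power')
    return z

print(exact_exponent(7 ** 3, 7), exact_exponent(3 ** 10000, 3))
```

The trade-off is speed: this takes one big-integer division per unit of the exponent, whereas `log` is a single call.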
The program is mostly straightforward, but relies on the correctness of these arguments:
* Are we justified in taking `combinations` without replacement from the table? In other words, are we sure there are no solutions of the form $A^x + A^x = C^z$? Yes, we can be sure, because then $2\;A^x = C^z$, and all the factors of $A$ would also be factors of $C$.
* Are we justified in having a single value for each key in the `Czroots` table? Consider that $81 = 3^4 = 9^2$. We put `{81: 3}` in the table and discard `{81: 9}`, because any number that has 9 as a factor will always have 3 as a factor as well, so 3 is all we need to know. But what if a number could be formed with two bases where neither was a multiple of the other? For example, what if $2^7 = 5^3 = s$; then wouldn't we have to have both 2 and 5 as values for $s$ in the table? Fortunately, that can never happen, because of the [fundamental theorem of arithmetic](https://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic).
* Could there be a rounding error involving the `exponent` function that was not caught by the tests? Possibly; but `exponent` is not used to find counterexamples, only to print them, so any such error wouldn't cause us to miss a counterexample.
* Are we justified in only considering exponents that are odd primes, or the number 4? In one sense, yes, because when we consider the two terms $A^{qp}$ and $(A^q)^p$, we find they are always equal, and always have the same prime factors (the factors of $A$), so for the purposes of the Beal problem, they are equivalent, and we only need consider one of them. In another sense, there is a difference. With this optimization, when we run `beal(6, 10)`, we are no longer testing $512$ as a value of $A$ or $B$, even though $512 = 2^9$ and both $2$ and $9$ are within range, because the program chooses to express $512$ as $8^3$, and $8$ is not in the specified range. So the program is still correctly searching for counterexamples, but the space that it searches for given `max_A` and `max_x` is different with this optimization.
* Are we really sure that when $A$ and $B$ have a common factor greater than 1, then $C$ also shares that common factor? Yes, because if $p$ is a factor of both $A$ and $B$, then it is a factor of $A^x + B^y$, and since we know this is equal to $C^z$, then $p$ must also be a factor of $C^z$, and thus a factor of $C$.
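The single-value argument for `Czroots` can also be checked empirically on a small range. The sketch below (helper names `prime_factors` and `bases_by_power` are mine, and the exponent list is an arbitrary small sample) collects every value that is reachable from more than one base and confirms that those bases always share exactly the same prime factors, so keeping just one of them loses nothing for the common-factor test.

```python
from collections import defaultdict

def prime_factors(n):
    "Set of primes dividing n, by trial division (fine for small n)."
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def bases_by_power(max_C, exponents):
    "Map each value C ** z to the list of bases C that produce it."
    table = defaultdict(list)
    for C in range(2, max_C + 1):
        for z in exponents:
            table[C ** z].append(C)
    return table

table = bases_by_power(100, [3, 4, 5, 7, 11, 13])
shared = {Cz: bases for Cz, bases in table.items() if len(bases) > 1}
# Every value reachable from several bases has bases with identical prime factors.
consistent = all(len({frozenset(prime_factors(C)) for C in bases}) == 1
                 for bases in shared.values())
print(len(shared), consistent, shared[4096])
```

For instance, 4096 arises both as `8 ** 4` and as `16 ** 3`, and both bases have 2 as their only prime factor.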
Arithmetic is slow with integers that have thousands of digits. If we want to explore much further, we'll have to make the program more efficient. An obvious improvement would be to do all the arithmetic modulo some prime number $p$ that fits in one word. Then we know:

$$\mbox{if} ~~ A^x + B^y = C^z ~~ \mbox{then} ~~ \left(A^x \;(\mbox{mod} ~ p) + B^y \;(\mbox{mod} ~ p)\right) \;(\mbox{mod} ~ p) = C^z \;(\mbox{mod} ~ p)$$

So we can do efficient tests modulo $p$, and then do the full arithmetic only for combinations that work modulo $p$. Unfortunately there will be collisions (two numbers that are distinct, but are equal mod $p$), so the tables will have to have lists of values. Here is a simple, unoptimized implementation:
from math import gcd
from itertools import combinations, product
from collections import defaultdict
def beal_modp(max_A, max_x, p=2**31-1):
    """See if any A ** x + B ** y equals some C ** z (mod p), with gcd(A, B) == 1.
    If so, verify that the equation works without the (mod p).
    Consider any 1 <= A,B <= max_A and x,y <= max_x, with x,y prime or 4."""
    assert p >= max_A
    Apowers = make_Apowers_modp(max_A, max_x, p)
    Czroots = make_Czroots_modp(Apowers)
    for (A, B) in combinations(Apowers, 2):
        if gcd(A, B) == 1:
            for (Axp, x), (Byp, y) in product(Apowers[A], Apowers[B]):
                # Reduce the sum mod p; otherwise sums >= p would never match a key.
                Czp = (Axp + Byp) % p
                if Czp in Czroots:
                    lhs = A ** x + B ** y
                    for (C, z) in Czroots[Czp]:
                        if lhs == C ** z:
                            print('{} ** {} + {} ** {} == {} ** {} == {}'
                                  .format(A, x, B, y, C, z, C ** z))

def make_Apowers_modp(max_A, max_x, p):
    "A dict of {A: [(A**3 (mod p), 3), (A**4 (mod p), 4), ...]}."
    exponents = exponents_upto(max_x)
    return {A: [(pow(A, x, p), x) for x in (exponents if (A != 1) else [3])]
            for A in range(1, max_A+1)}

def make_Czroots_modp(Apowers):
    "A dict of {C**z (mod p): [(C, z),...]}"
    Czroots = defaultdict(list)
    for A in Apowers:
        for (Axp, x) in Apowers[A]:
            Czroots[Axp].append((A, x))
    return Czroots
Here we see that each entry in the `Apowers` table is a list of `(A**x (mod p), x)` pairs. For example, $6^7 = 279,936$, so in our (mod 1000) table we have the pair `(936, 7)` under `6`.
Apowers = make_Apowers_modp(6, 10, 1000)
Apowers
{1: [(1, 3)], 2: [(8, 3), (16, 4), (32, 5), (128, 7)], 3: [(27, 3), (81, 4), (243, 5), (187, 7)], 4: [(64, 3), (256, 4), (24, 5), (384, 7)], 5: [(125, 3), (625, 4), (125, 5), (125, 7)], 6: [(216, 3), (296, 4), (776, 5), (936, 7)]}
And each item in the `Czroots` table is of the form `{C**z (mod p): [(C, z), ...]}`. For example, `936: [(6, 7)]`.
make_Czroots_modp(Apowers)
defaultdict(list, {1: [(1, 3)], 8: [(2, 3)], 16: [(2, 4)], 24: [(4, 5)], 27: [(3, 3)], 32: [(2, 5)], 64: [(4, 3)], 81: [(3, 4)], 125: [(5, 3), (5, 5), (5, 7)], 128: [(2, 7)], 187: [(3, 7)], 216: [(6, 3)], 243: [(3, 5)], 256: [(4, 4)], 296: [(6, 4)], 384: [(4, 7)], 625: [(5, 4)], 776: [(6, 5)], 936: [(6, 7)]})
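Two properties of the residue tables are worth checking by hand with the same toy modulus of 1000. The sketch below (variable names are mine) shows a collision, shows that a true equation always survives the filter, and shows that the sum of two residues has to be reduced mod $p$ again before it can be compared with a residue.

```python
p = 1000   # the toy modulus from the tables above (not prime, but fine for illustration)

# Distinct powers can share a residue: a collision.
distinct_powers = 5 ** 3 != 5 ** 5
collision = pow(5, 3, p) == pow(5, 5, p) == 125

# A true equation always survives the filter: 3**3 + 6**3 == 3**5.
true_equation_passes = (pow(3, 3, p) + pow(6, 3, p)) % p == pow(3, 5, p)

# The sum of two residues can exceed p - 1, so it must be reduced again.
a, b = pow(6, 7, p), pow(4, 7, p)     # 936 and 384
raw_sum = a + b                        # 1320, not a valid residue mod 1000
reduced_sum = raw_sum % p              # 320, which equals (6**7 + 4**7) % 1000
print(collision, true_equation_passes, raw_sum, reduced_sum)
```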
Let's run the program:
%time beal_modp(500, 100)
CPU times: user 9 s, sys: 145 ms, total: 9.14 s Wall time: 9.27 s
This is a bit faster than the previous version, and the idea is that as we start dealing with much larger integers, this version will be even faster, relatively. I could improve this version by caching certain computations, managing the memory layout better, moving some computations out of loops, considering using multiple primes (as in a Bloom filter), finding a way to parallelize the program, and re-coding in a faster compiled language (such as C++ or Go or Julia). Then I could invest thousands (or millions) of CPU hours searching for counterexamples.
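As a rough idea of what "multiple primes" could look like, here is a sketch of a filter that requires a candidate equation to hold modulo several primes at once, which makes accidental matches much rarer. The particular primes and the name `survives_all_moduli` are my own assumptions for illustration; a real implementation would store residues in tables, as `beal_modp` does, and would not recompute them per candidate.

```python
# Three well-known primes; the choice is arbitrary and only for illustration.
PRIMES = (2**31 - 1, 1_000_000_007, 2**61 - 1)

def survives_all_moduli(A, x, B, y, C, z, primes=PRIMES):
    """Cheap necessary test for A**x + B**y == C**z: the equation must hold
    modulo every prime. A True result still needs exact verification."""
    return all((pow(A, x, p) + pow(B, y, p)) % p == pow(C, z, p) for p in primes)

print(survives_all_moduli(3, 3, 6, 3, 3, 5))            # a true equation always survives
print(survives_all_moduli(1782, 12, 1841, 12, 1922, 12)) # the Simpsons near miss does not
```

A candidate that fails any one modulus can be discarded without touching big integers; only the survivors need the full comparison.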
But Witold Jarnicki and David Konerding already did that: they wrote a C++ program that built a table of $C^z \;(\mbox{mod} \; p)$ up to $5000^{5000}$, and, in parallel across thousands of machines, searched for $A, B$ up to 200,000 and $x, y$ up to 5,000, but found no counterexamples. On a smaller scale, Edwin P. Berlin searched all $C^z$ up to $10^{17}$ and also found nothing. So I don't think it is worthwhile to continue on that path.
This was fun, but I can't recommend anyone spend a serious amount of computer time looking for counterexamples to the Beal Conjecture—the money you invest in computer time would be more than the expected value of your prize winnings. I suggest you work on a proof rather than a counterexample, or work on some other interesting problem instead!