Euler Problem 21

Let d(n) be defined as the sum of proper divisors of n (numbers less than n which divide evenly into n). If d(a) = b and d(b) = a, where a ≠ b, then a and b are an amicable pair and each of a and b are called amicable numbers.

For example, the proper divisors of 220 are 1, 2, 4, 5, 10, 11, 20, 22, 44, 55 and 110; therefore d(220) = 284. The proper divisors of 284 are 1, 2, 4, 71 and 142; so d(284) = 220.

Evaluate the sum of all the amicable numbers under 10000.

We reuse the sieve of eratosthenes (which we called get_primes_smaller previously). What annoys me about the implementation is that we do not stop scanning for prospects once the current prime squared is greater than the limit. Let's test if it returns the same result and compare the execution time.

In [1]:
def get_primes_smaller(limit):
    primes = []
    prospects = [n for n in range(2, limit)]
    while prospects:
        p = prospects[0]
        prospects = [x for x in prospects if x % p != 0]
        primes.append(p)
    return primes

def sieve_of_eratosthenes(limit):
    primes = []
    prospects = [n for n in range(2, limit)]
    while prospects:
        p = prospects[0]
        prospects = [x for x in prospects if x % p != 0]
        primes.append(p)
        if p * p > limit:
            break
    primes += prospects
    return primes    

import timeit
import functools
print(timeit.timeit(functools.partial(get_primes_smaller, 10000), number=100))
print(timeit.timeit(functools.partial(sieve_of_eratosthenes, 10000), number=100))
assert(get_primes_smaller(10000) == sieve_of_eratosthenes(10000))
5.062845234464119
0.3188800166435408

Okay, this difference is sick actually. Interesting to see what the difference between O(sqrt(n)) and O(n) can mean. Now we can use the primes to do the factorization. I will again apply some optimizations and see how it turns out

In [2]:
def get_prime_factors_old(number):
    prime_generator = sieve_of_eratosthenes(number // 2)
    remainder = number
    factors = []
    for p in prime_generator:
        while remainder % p == 0:
            remainder //= p
            factors.append(p)
        if remainder == 1 or p * p > number:
            break
    if remainder != 1:
        factors.append(remainder)
    return factors

def get_prime_factors(number):
    remainder = number
    factors = []
    for p in sieve_of_eratosthenes(number // 2):
        while remainder % p == 0:
            remainder //= p
            factors.append(p)
        if remainder == 1:
            break
        if p > remainder or p * p > number:
            factors.append(remainder)
            break
    return factors

import timeit
import functools
print(timeit.timeit(functools.partial(get_prime_factors_old, 1000000), number=10))
print(timeit.timeit(functools.partial(get_prime_factors, 1000000), number=10))
assert(get_prime_factors_old(100000) == get_prime_factors(100000))
6.370271180404346
5.785088868397899

Okay, actually no big difference. Kind of expected if you check how similar the implementations are. I though stopping if p is greater than the remainder may be a common case.

Now we can use the prime factors to get all factors. For this each prime number can be used 0 to n times where n is the number how often the prime occurs in the prime factorization. For example, all possible fators for the primes $2, 2, 3$ are $2^0 \times 3^0, 2^1 \times 3^0, 2^2 \times 3^0, 2^0 \times 3^1, 2^1 \times 3^1, 2^2 \times 3^1$ which results in $1, 2, 4, 3, 6, 12$.

So we proceed in the following way. Once we have the prime factors we group then. Then we calculate the factors for each group. For example for $2,2$ the factors would be $1, 2, 4$ as explained in the previous paragraph. Then we calculate the combinations of all factor groups.

In [3]:
def group(xs):
    from functools import reduce
    rs = []
    def f(xss, x):
        if xss and x in xss[-1]:
            xss[-1].append(x)
        else:
            xss.append([x])
        return xss
    return reduce(f, xs, [])

assert(group([1, 2, 2, 2, 4, 4, 5]) == [[1], [2, 2, 2], [4, 4], [5]])

def group(xs):
    r = []
    for x in xs:
        if r and r[-1][0] == x:
            r[-1].append(x)
        else:
            r.append([x])
    return r

assert(group([1, 2, 2, 2, 4, 4, 5]) == [[1], [2, 2, 2], [4, 4], [5]])

As you can see I had a fancy version with reduce first, but the loop is way more readable. We iterate over the input list. If the current item is in the last sublist of the return list we add it to the list. Otherwise we append a new sublist with the item.

Next we write a function that calculates the resulting factors for a factor group.

In [4]:
def prime_group_to_factors(group):
    f = group[0]
    n = len(group)
    return [f**i for i in range(n + 1)]

assert(prime_group_to_factors([2, 2, 2]) == [1, 2, 4, 8])

No we only have to get all factors by calculating the combination of two factor lists.

In [5]:
def combine_factors(xs, ys):
    return [x * y for y in ys for x in xs]

assert(combine_factors([1, 2, 4], [1, 3]) == [1, 2, 4, 3, 6, 12])

Now actually we want to combine an arbitrary number of factor lists.

In [6]:
def combine_factors(xss):
    ys = [1]
    for xs in xss:
        ys = [x * y for y in ys for x in xs]
    return ys
In [7]:
def get_divisors(n):
    prime_factors = get_prime_factors(n)
    prime_groups = group(prime_factors)
    factor_groups = list(map(prime_group_to_factors, prime_groups))
    divisors = combine_factors(factor_groups)
    divisors.sort()
    return divisors
    
assert(get_divisors(220) == [1, 2, 4, 5, 10, 11, 20, 22, 44, 55, 110, 220])
assert(get_divisors(13) == [1, 13])
assert(get_divisors(26) == [1, 2, 13, 26])

We can finally start to get the solution. Remember that we only want to get the sum of proper divisors which are the divisors withouth the number itself.

In [8]:
def sum_of_proper_divisors(number):
    return sum(get_divisors(number)[:-1])

assert(sum_of_proper_divisors(220) == 284)
assert(sum_of_proper_divisors(284) == 220)

def is_amicable(number):
    s = sum_of_proper_divisors(number)
    if s == number:
        return False
    return number == sum_of_proper_divisors(s)

assert(is_amicable(220))

The if statement is very import because a number pair $a, b$ is only amicable if $a \neq b$.

In [9]:
amicable_till_10000 = [i for i in range(1, 10000) if is_amicable(i)]
assert(sum(amicable_till_10000) == 31626)
print(sum(amicable_till_10000))
31626

I found some brute force solutions in the Euler forum and they claim to have a sick performance. I want to compare them.

In [10]:
def brute_force():
    return sum([i for i in range(1, 10000) if is_amicable(i)])

import timeit
print(timeit.timeit(brute_force, number=10))
108.68903765424807
In [14]:
def sum_of_proper_divisors(n):
    from math import sqrt
    s = 1
    for i in range(2, int(sqrt(n)) + 1):
        if n % i == 0:
            s += i
            s += (n // i)
    return s

assert(sum_of_proper_divisors(220) == 284)
       
def is_amicable(number):
    s = sum_of_proper_divisors(number)
    if s == number:
        return False
    return number == sum_of_proper_divisors(s)

def brute_force():
    return sum([i for i in range(1, 10000) if is_amicable(i)])   
    
import timeit
print(timeit.timeit(brute_force, number=10))
0.7862764837699387

This is actually really embarrassing. Especially I was aware of the sqrt trick all the way but did not manage to put it into use properly. Another example where greate engineer will leave you far behind compared to great thinking.

Supplement: Also I have discovered that my sum of proper divisors function has a bug when n is a square, because in that case i == (n // 1) which adds this value two times. So there is no amicable number under 10000 which is also a square because otherwise we would have failed.