This week, I was helping my son with prime number factorization in his math homework. Beyond a basic understanding of primes and factorization, I have not spent a lot of time thinking about primes. However, when you stop to think about them, they are fascinating. It is amazing to me that a concept that is fundamental to math is relatively unknown. We don’t have a way to easily find large prime numbers. What is it about prime numbers that makes them difficult to find or test?
This made me interested in taking a minor exploration of prime numbers. Notably, I wanted to implement the following:
- is_prime: function to determine whether a given number is prime
- get_primes: get prime numbers up to n
- get_factors: get the prime factors of a given number
Determine Whether a Number is Prime Number
The first goal was to determine whether a number is prime. We started with a simple brute force algorithm.
def is_prime(num):
'''Determine whether a given number is a prime number or not.'''
if num == 1:
return False
elif num == 2:
return True
for i in range(3,num):
if num%i==0:
return False
return True
We can significantly speed this up. We based our improvements on this wikipedia article.
First, we recognize that 2 is the only even number. Therefore, we don’t need to evaluate even numbers. We can modify the search range to:
for i in range(3,num,2):
if num%i==0:
return False
Second, we don’t need to test every number from 3 to n. Instead, we can stop looking at sqrt(n). Why is this true? If we assume the number is composite, then we could write it as n = a * b. A and b can’t both be larger than the sqrt(n). For example, if n=16. Sqrt(16) = 4. If a = 5 and b = 5, then a*b = 25. Therefore, either a or b must be equal to or below sqrt(n). If we check relevant numbers up to sqrt(n), we will ensure the test is comprehensive.
for i in range(3,math.sqrt(num),2):
if num%i==0:
return False
These edits are really significant. For example, if we are testing the number 101. Under our original approach, we had to check 100 possible numbers. By only searching odd numbers, we can cut that in half. By further stopping at the sqrt(101), we stop checking at 10, meaning we only need to check 5 numbers.
Running some time tests with the number 6700417, we get the following execution times:
- All Numbers: 0.421952 seconds
- Odd Numbers: 0.20841 seconds
- Odd Numbers up to sqrt(n): 0.000125 seconds
For further improvement, we can construct numbers as 6k+i where i is {0,1,2,3,4,5}. For various reasons, we only actually need to test i=1 and i=5. This means we only need to check 6k+1 and 6k+5 up to the sqrt(n).
def fast_prime(num):
'''Determine whether a given number is a prime number or not.'''
if num == 2 or num == 3 or num == 5:
return True
elif num == 1 or num % 2 == 0 or num % 3 == 0 or num % 5 == 0:
return False
k = 0
i = 5
test_number = 6*k + i
target = math.sqrt(num)
while test_number <= target:
if num % test_number == 0:
return False
if i == 5:
i = 1
k += 1
else:
i = 5
test_number = 6*k+i
return True
Get All Prime Numbers (up to N)
Determining whether a number is a prime is useful; however, sometimes you need to list all primes up to a given number n. Again, we can take brute force approach:
def get_primes(num):
'''Gets the first n number of primes.'''
primes = []
current_number = 1
while len(primes) < num:
if is_prime(current_number):
primes.append(current_number)
current_number += 1
return primes
Alternatively, we can use the Sieve of Eratosthenes. The basic premise is that once we know a prime number, we do not need to check any number that is a multiple of the prime. In a way, we used this to determine whether a given number is prime. We know that 2 is a prime number. However, any positive even number other than 2 is composite because any even number is divisible by 2. We take the same approach with 3 (3, 9, 12, … , 3n), 5 (5, 10, 15, …, 5n), 7 (7, 14, 21, …, 7n), etc until we get to the square root of our target.
def prime_sieve(num):
'''Uses Erastosthenes to find primes smaller than n (n < 10e6)'''
prime = [True for i in range(num+1)]
prime[0] = False
prime[1] = False
p = 2
while p*p < num:
if prime[p]:
# Mark the multiples
for i in range(p*p,num+1,p):
prime[i] = False
p += 1
final_prime = []
for index in range(len(prime)):
if prime[index]:
final_prime.append(index)
return final_prime
Note that this algorithm only works for “small” primes. This limitation is primarily due to complexity (i.e., memory limits). For this method, we are storing all composite factors up to the square root of the target number. On a 32 bit system, this appears to be a max of 500,000,000 elements. It would likely be larger on a 64 bit system, but I could not find a definitive answer.
Prime Number Factorization
Now that we can identify primes, the next function will get the prime factors. This function will be recursive. First, we determine whether the current number is prime. If so, we simply return the number. If it is not prime, then we will get all of the primes that are less than or equal to the square root of the number. We will then test those primes against the current number. Once we find a prime that divides evenly, we add the factor to our list, and recursively send the remainder to the get factors function. Here is the code.
def get_factors(num):
if is_prime(num):
return [num]
factors = []
primes = prime_sieve(math.floor(math.sqrt(num)))
for i in primes:
if num % i == 0:
next_factor = i
break
other_factor = num//i
factors.append(next_factor)
factors += get_factors(other_factor)
return sorted(factors)
Conclusion
I don’t imagine that this post will be very useful to many people. It is definitely not groundbreaking code. However, it was fun to think through primes and how to implement algorithms to calculate primes and factors.