Algorithms in Ruby: Largest Prime Factor
Another post today continuing with my math problem project. This one is very mathy, but if you understand the straightforwardness of the theory, this one is not as confusing as my previous entry.
- The prime factors of 13195 are 5, 7, 13 and 29. The largest prime factor in this case is 29.
- What is the largest prime factor of the number 600851475143 ?
So, quick story: I bought a four pack of Rubik cubes this week. I’ve got the strategy down for a 3x3, memorized quite a few of the algorithms, and pretty much know what patterns to look for. Interesting point: there are 42 quintillion possible movements, but only one right solution to solving this problem. I got inspired and tried programming it, beginning with creating an idea of arrays to represent sides. I didn’t get very far tho. When I decided to see who else has attempted to program these algorithms, this popped up. Check out it out.
The first question I asked myself was “how do I determine if a prime number is a factor of the input?” Went to quick research, and got this idea back:
Any number which is not prime can be written as the product of prime numbers: we simply keep dividing it into more parts until all factors are prime.
So my function will be: check if my input is divisible with no remainder when divided by a prime number. Try increasingly larger prime numbers each time, till it’s impossible to satisfy that. Simply keep dividing into more parts till all factors are prime, and pluck out the greatest one.
This approach is taking the “dividing it into more parts” seriously.
Start with the smallest prime, and work up to the largest prime that’ll satisfy, and return that largest prime.
The original input is subject to evaluation. Take it, and manipulate it till it represents the correct output. Upon eval, its returns can be assigned to variables, which can then in turn be evaluated similarly. To this approach, we divide a non-prime number by any of all prime numbers.
Let’s say we start with the lowest prime and working our way up till a prime number that is a factor is found. Maybe that is 3, maybe 13. But the point is, a very, very large number like 600851475143 — which is not prime — is going to be divisible by a smaller number, and/or one that is prime.
That quotient is then divided by the next largest prime multiple until we’ve reached the point that there is no prime number larger that can be a factor
How do we determine that point without testing every prime number all the way back up to our initial input? That would be an absurd process.
Maybe I was kidding when I was saying this is less confusing than the last problem.
In a best case, we do math till the quotient and the prime multiple are the same. At that point we’ve found our last multiple.
So we’ll have a break there. Let’s look at an example:
If we want the largest prime multiple of 10, and the first prime number is 2, and when divided, is 5, with no remainder. Then, next prime multiple is 5. If we take the quotient of 5 and divide it by the prime multiple, 5, we get 1. Therefore, we have found the largest, but also the last prime number that will satisfy. Stop doing work, and return that last prime.
For this, it seems like that is a sane breaking point. But what if that point doesn’t occur for a much longer time?
A big number like 13,195.
Can it be divided by two with zero remainder? No, move on to next prime number. Can it be by 3? No, move on to to next prime. By 5? Yes. Do the math: 13,195/5 = 2639; output is set at this moment to 2639, and increment to next prime; loops continues. By 7? Yes. 23639/7 = 377; increment to 11; no; increment to 13, yes 377/13 = 29, increment to next prime of 17, then 19, then 23, then 29 hits, and we break out of the loop.
Problem here is, first, I need an array populated with all the prime numbers already. “Maybe there is a built in ruby method to get around this,” is my first thought. Because I certainly can’t create an array on my own that includes all the primes to 1 million. Nobody is that smart.
And what if my input is very, very large, and it’s not divisible by a prime number for a long time, ie it is divisible by a very large prime? I’ll have to run tests all the way up to that large prime, which could take a long time.
We’re running tests here we know are not going to work. Once that quotient is prime, that’s that. So I considered that…
In actuality, what is happening is that when my quotient becomes a prime number, and thus no longer divisible by anything else, that will be our largest prime factor if we have been following the procedure of dividing non-primes quotients by increasingly large prime numbers. However, this means that every loop of our routine would have to add a procedure that evaluates if the quotient is a prime number each time. Built into Ruby or not, that would add on so much time.
That’s when I took to the internet, and being the place that is it is, someone has already discussed this problem. I wasn’t going to to deduce this on my own, and hearing it definitely gave me some closure to get started on pseudocoding.
Apparently, if the prime number we are testing is squared, and the product is greater than our current quotient, that and any other prime number thereafter would be failures. If we are at the point that our test number is squared is greater than a smaller division of our original input, then it stands to reason that that test number cannot be a factor of the smaller division, and therefore the larger original input. Looking at this article, its got my head spinning.
Can someone better explain this math better? I’m just going to accept it and move forward in the meantime. I’m not a mathematician. I’m a coder, and I’m trying to make code that works for the real world. I’ll leave it to scientists to figure the real reasoning behind equations, but I can program a computer to do their calculations for them. So, now with an understanding of their calculations: pseudocode!
So here’s how I make it work then make it pretty. UGLY version here:
Breaking things down here, you can see I implemented my own #is_prime? Just for show, because in reality this can be costly. Each iteration takes O(n) time, so the function as we have it is O of n squared time. However, consider:
Does this procedure loop over an array of prime numbers? Yes, by using Ruby’s fascinating Prime class, which comes with all sorts of its own methods, one which actually does the very thing we are trying to solve. So we won’t actually utilize that yet. Point here is to grasp what is going on under the hood. And while looping over increasing larger primes, we are slowly dividing up our original input, thus decreasing the length to which Prime.each has to go over the entire prime array. This is an ok way to do things. But on bigger numbers, we’re screwed. This will take all day to compute.
Here’s the SPOILER ALERT: ruby has this covered:
def lpf(n) p Prime.prime_division(n)endlpf(600851475143)# prints: [[71, 1], [839, 1], [1471, 1], [6857, 1]]
Calling #prime_division on the prime class with a given value will return and array of arrays. Each index of the array is an array, who’s 0 index is the prime multiple of the number. If we access that then, boom, we are done:
def lpf(n)p Prime.prime_division(n).lastendlpf(600851475143)#prints: 6857
So, I went back to the drawing board. Instead of evaluating each and every number, I’ve got to find a way to cut things down.
Lots printing to get an idea of where we are in our loop. But what is neat and different this time is that we are cutting down our tests significantly. The reassignment of our limit in a while loop allows us to trim things down. And our break cases are thus easier to achieve in a shorter time.
For final consideration, here is a nice object oriented version to bring it on home:
I barely made sense of this one, but was able to take math beyond my understanding, write it out and implement it through long, head bashing testing.
Each time I do this, tho, it feels more and more like a routine of things, and the patterns of solving a problem become more illuminated. Once you start seeing how questions like this are often solved, you can see many similarities. So this definitely takes more than lots of practice; it takes continuous observation and questioning of methods to uncover all the hidden patterns.