Checking for primes? Dumber algorithm is faster algorithm

Mar 21 2011 Tags: , ,

Blogging burns a lot of caffeine. If you enjoy my posts, you should buy me a cup of tea.

It’s been a busy few weeks since I last posted about my toying around with project euler, but never you mind. I did do some progress nontheless! HA!

Just so I won’t bore you with the details, the most interesting thing I discovered is that the way I was calculating prime numbers was totally wrong. Not in the sense that it didn’t work, but in the sense that it was only fast in theory. In practice it turns out that either my implementation was completely and utterly horrible or that I simply don’t know enough about performance bottlenecks of clojure and made a boo boo somewhere.

Here’s my original way of creating prime numbers and checking if a number is prime:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
(defn any? [l]
  (reduce #(or %1 %2) l))
 
(defn prime? [n known]
  (loop [cnt (dec (count known)) acc []]
    (if (< cnt 0) (not (any? acc))
	(recur (dec cnt) (concat acc [(zero? (mod n (nth known cnt)))])))))
 
(defn next-prime [primes]
  (let [n (inc (count primes))]
    (let [lk (if (even? (inc (last primes))) (+ 2 (last primes)) (inc (last primes)))]
      (loop [cnt lk p primes]
	(if (>= (count p) n) (last p)
	    (recur (+ cnt 2) (if (prime? cnt p) (concat p [cnt]) p)))))))
 
(defn n-primes [n]
  (loop [cnt 1 p [2]]
    (if (>= cnt n) p
	(recur (inc cnt) (concat p [(next-prime p)])))))

Not exactly certain what sieve I was using here, but the idea is that a number is prime if it isn’t divisible by any primes smaller than the number you’re checking. This works great in theory because you’re always dividing with very few numbers.

But turns out, there’s a much much faster way of doing it:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
(defn any? [l]
  (reduce #(or %1 %2) l))
 
(defn prime? [n]
  (if (even? n) false
      (let [root (num (int (Math/sqrt n)))]
	(loop [i 3]
	  (if (> i root) true
	      (if (zero? (mod n i)) false
		  (recur (+ i 2))))))))
 
(defn n-primes [n]
  (loop [num 2 p []]
    (if (>= (count p) n) p
	(recur (inc num) (if (prime? num) (concat p [num]) p)))))

This way of checking for primes just loops through all numbers smaller than the square root of the one we’re checking by starting off with 3 and jumping by two places.

Obviously a very big improvement here is using the square root as the looping boundary, but turns out even adding this trick to the previous method only goes so far. The dumber version is still at least a 100 times faster. In fact the dumb version produced all primes under 2,000,000 in about a minute whereas the first version wasn’t even finished after ten minutes.

So what’s happening here? Considering the sparseness of prime number distribution I’m thinking that the dumber method performed at least ten times more divisions.

My theory is that maybe, just maybe, the problem was in copying that huge arse list of known primes around when performing function calls. Or maybe it was the appending of primes to the big list that was the problem … whatever it was, I learned a valuable lesson: Keep it Stupid Simple For Realz.

Enhanced by Zemanta

---
Need a freelance developer? Email me!

You should follow me on twitter
 Subscribe to RSS

5 responses so far

  • Simon

    Function calls are not very expensive in Clojure and even then the costs that are there are not due to copying as such. Clojure copies on modification and even then resulting data is shared with the original where possible.

    Some general tips:
    * any? ~= some? from core
    * you almost never need (nor should) use “naked” loop
    * your indentation of if is bad/wrong
    * (num (int (…)) should probably be (contrib.math/floor …)

    (def primes (concat [2] (lazy-seq
    ((fn next-prime [n]
    (if (some (comp zero? (partial rem n))
    (take-while #(<= (* % %) n) primes))
    (recur (+ n 2))
    (lazy-seq (cons n (next-prime (+ n 2))))))
    3))))

  • Simon

    Oh, and btw, Eratosten called, he wants you to use his sieve. ;)

  • http://swizec.com Swizec

    Yes, any? is very similar to some?, I’ve learned about apply since writing this code. It makes life easier :)

    And what’s a naked loop? The clojure docs describe this is the proper way to write recursion because otherwise it can’t do tail-recursion optimisation …

    As for indentation, blame my emacs. I’m just pressing tab. :P

  • Simon

    You can (and should) often use reduce, iterate, reductions, dotimes, map, … (none of which will blow the stack) instead of loop.

    Don’t blame emacs, it’s your job to press enter after the conditional. ;)

  • http://swizec.com Swizec

    Oh you meant my newlining not so much the indentation; noted!

    And loops, very well, more lispy lisp should be more lispy! Guess I won’t be showing you that one solution where I used a tripple nested naked loop :P

« Project euler is a fun way to... Circadian rhythm meltdown »