I worked on the Project Euler problem solving at Clojure to get better, and I have come across several times generating prime numbers. My problem is that it takes too much time. I was hoping someone could help me find an effective way to do this Clojure-y.
When I made it with my fist, I rudely forced it. It was easy to do. But calculating 10,001 primes took 2 minutes at Xeon 2.33 GHz, too long for the rules and generally too long. Here was the algorithm:
(defn next-prime-slow "Find the next prime number, checking against our already existing list" ([sofar guess] (if (not-any?
Replacing next-prime-slow with a newer procedure that took into account some additional rules (for example, property 6n + / - 1), I was able to speed up the process by about 70 seconds.
Then I tried to make a sieve from Eratosthenes in pure Clojure. I don't think I pulled out all the errors, but I gave up because it was just too slow (even worse than the higher, I think).
(defn clean-sieve "Clean the sieve of what we know isn't prime based" [seeds-left sieve] (if (zero? (count seeds-left)) sieve ; Nothing left to filter the list against (recur (rest seeds-left) ; The numbers we haven't checked against (filter #(> (mod % (first seeds-left)) 0) sieve)))) ; Filter out multiples (defn self-clean-sieve ; This seems to be REALLY slow "Remove the stuff in the sieve that isn't prime based on it self" ([sieve] (self-clean-sieve (rest sieve) (take 1 sieve))) ([sieve clean] (if (zero? (count sieve)) clean (let [cleaned (filter #(> (mod % (last clean)) 0) sieve)] (recur (rest cleaned) (into clean [(first cleaned)])))))) (defn find-primes "Finds prime numbers, hopefully faster" ([] (find-primes 10001 [2])) ([needed seeds] (if (>= (count seeds) needed) seeds ; We have enough (recur ; Recalculate needed (into seeds ; Stuff we've already found (let [start (last seeds) end-range (+ start 150000)] ; NOTE HERE (reverse (self-clean-sieve (clean-sieve seeds (range (inc start) end-range))))))))))
This is bad. It also causes a stack overflow if the number 150000 is less. This is despite the fact that I use recur. It could be my fault.
Then I tried a sieve using Java methods on Java ArrayList. It took quite a bit of time and memory.
My last attempt was to use a sieve using a Clojure hash map, inserting all the numbers into the sieve, and then separating the numbers that are not prime. In the end, he takes a list of keys, which are the primes he found. It takes about 10-12 seconds to find 10,000 primes. I am not sure that it is fully debugged. This is also recursive (using recur and loop), since I'm trying to be Lispy.
Thus, with such problems, problem 10 (summing all primes up to 2,000,000) is killing me. My fastest code found the correct answer, but it took 105 seconds, and it took me a lot of memory (I allocated 512 MB to him, so I did not have to bother with this). My other algorithms take so much time that I always ended up stopping them first.
I could use a sieve to calculate so many primes in Java or C pretty quickly and without using a lot of memory. I know that I am missing something in my Clojure / Lisp style that is causing the problem.
Is there something I'm doing really wrong? Is Clojure just slow with large sequences? After reading some of the discussions on the Euler project, people figured out the first 10,000 primes in other Lispx in less than 100 milliseconds. I understand that the JVM can slow things down, and Clojure is relatively young, but I would not expect a difference of 100 times.
Can someone explain me a quick way to calculate prime numbers in Clojure?