I would make a few changes.
- It seems strange that you are doing
filterPrimes for all numbers between 2 and max / 2 , for an "actual" sieve you only need to do filterPrimes for all primes between 2 and sqrt(max) . - It also seems strange that you are using a var and a for loop. To make this a “functional” way, I would use a recursive function instead.
- Instead of doing
filterPrimes all over the list, you can collect primes as you go; no need to throw them through the filter again and again. - This is rather strange for
map , and then filter , how you do it, since the map just determines which elements to filter, you can do the same using only the filter.
So here is my first attempt at these changes:
def filterFactors(seed: Int, xs: List[Int]) = { xs.filter(x => x % seed != 0) } def sieve(max: Int) = { def go(xs: List[Int]) : List[Int] = xs match { case y :: ys => { if (y*y > max) y :: ys else y :: go(filterFactors(y, ys)) } case Nil => Nil } go((2 to max).toList) }
However, this reflects my Haskell bias and has a huge flaw: it will take up a huge amount of stack space due to the recursive call of y :: go(...) in the go helper function. Running sieve(1000000) led to "OutOfMemoryError" for me.
Try using the generic FP trick: tail recursion with batteries.
def sieve(max: Int) = { def go(xs: List[Int], acc: List[Int]) : List[Int] = xs match { case y :: ys => { if (y*y > max) acc.reverse ::: (y :: ys) else go(filterFactors(y, ys), y :: acc) } case Nil => Nil } go((2 to max).toList, Nil) }
By adding a battery value, we can write the auxiliary function go in a tail recursive form, thereby avoiding the huge glass problem. (Haskell's evaluation strategy is very different, so it does not need any benefits from tail recursion)
Now compare speed with the imperative, mutation-based approach.
def mutationSieve (max: Int) = { var arr: Array[Option[Int]] = (2 to max).map (x => Some (x)).toArray var i = 0 var seed = (arr (i)).get while (seed * seed < max) { for (j: Int <- (i + seed) to (max - 2) by seed) { arr (j) = None } i += 1 while (arr (i).isEmpty) { i += 1 } seed = (arr (i)).get } arr.flatten }
Here I use Array[Option[Int]] and “cross out” the number, replacing its entry with “No”. There is a small portion of optimization opportunities; perhaps a slight speedup can be obtained using the bools array, where the index represents a specific number. Whatever.
Using very primitive technologies (carefully placed calls to new Date() ...) I compared the functional version about 6 times slower than the actual version. It is clear that both approaches have the same complexity in time, but the constant factors associated with programming with linked lists are costly.
I also compared your version using Math.sqrt(max).ceil.toInt instead of max / 2 : it was about 15 times slower than the functional version presented here. Interestingly, it is estimated 1 that approximately 1 out of every 7 numbers between 1 and 1000 ( sqrt(1000000) ) is prime (1 / ln (1000)), so a significant part of the slowdown may be due to the fact that you loop on each separate room, while I perform my function only for every simple one. Of course, if it took 15 times to complete ~ 1000 iterations, it would take ~ 7500x to complete 500,000 iterations , so your source code is painfully slow.