Why is this cyclic function so slow compared to the card?

I looked at the source code of the cards, which basically continues to create lazy sequences. I would have thought that iterating over the collection and adding to the transition vector would be faster, but that is clearly not the case. What I don't understand about clojures performance behavior?

;=> (time (do-with / (range 1 1000) (range 1 1000))) ;"Elapsed time: 23.1808 msecs" ; ; vs ;=> (time (doall (map #(/ %1 %2) (range 1 1000) (range 1 1000)))) ;"Elapsed time: 2.604174 msecs" (defn do-with [fn coll1 coll2] (let [end (count coll1)] (loop [i 0 res (transient [])] (if (= i end) (persistent! res) (let [x (nth coll1 i) y (nth coll2 i) r (fn xy)] (recur (inc i) (conj! res r))) )))) 
+7
clojure lazy-sequences
source share
1 answer

In order of expected impact on relative results:

  • The do-with function uses nth to access individual elements of input collections. nth operates in linear time on ranges, making do-with quadratic. Needless to say, this will kill productivity in large collections.

  • range creates segmented sequences and map processes them extremely efficiently. (In fact, it creates fragments up to 32 elements long - in fact, it will be exactly 32 - by performing a tight loop on the internal array of each input block, in turn, outputting the results to the internal arrays of the output blocks).

  • Benchmarking with time does not give you stable operation. (That's why you really need to use the proper benchmarking library; in the case of Clojure, Criterium is the standard solution.)

By the way, (map #(/ %1 %2) xs ys) can simply be written as (map / xs ys) .

Update:

I compared the map version, the original do-with and the new do-with with Criterium, using (range 1 1000) as both inputs in each case (as in the question text), getting the following average runtime:

 ;;; (range 1 1000) new do-with 170.383334 µs (doall (map ...)) 230.756753 µs original do-with 15.624444 ms 

In addition, I repeated the test using a vector stored as input and not a range (i.e. with (def r (vec (range 1 1000))) at the beginning and using r as both collection arguments in each test) . It is not surprising that the original do-with came in first - nth very fast on vectors (plus using nth with a vector avoids all intermediate distributions involved in seq-traversal).

 ;;; (vec (range 1 1000)) original do-with 73.975419 µs new do-with 87.399952 µs (doall (map ...)) 153.493128 µs 

Here's a new do-with with linear time complexity:

 (defn do-with [f xs ys] (loop [xs (seq xs) ys (seq ys) ret (transient [])] (if (and xs ys) (recur (next xs) (next ys) (conj! ret (f (first xs) (first ys)))) (persistent! ret)))) 
+14
source share

All Articles