Can Clojure Optimize Scanning?

Question

;; Suppose we want to compute the min and max of a collection.
;; Ideally there would be a way to tell Clojure that we want to perform
;; only one scan, which should in theory save a little time.

;; First we define some data to test with:
;; a lazy seq of 10 million elements
(def data (for [x (range 10000000)] (rand-int 100)))

;; Realize the lazy-seq 
(dorun data)

;; Here is the amount of time it takes to go through the data once
(time (apply min data))
==> "Elapsed time: 413.805 msecs"

;; Here is the time to calc min, max by explicitly scanning twice
(time (vector (apply min data) (apply max data)))
==> "Elapsed time: 836.239 msecs"

;; Shouldn't this be more efficient, since it is going over the data only once?
(time (apply (juxt min max) data))
==> "Elapsed time: 833.61 msecs"

Chuck, here are my results after using your solution:

test.core=> (def data (for [x (range 10000000)] (rand-int 100)))
#'test.core/data

test.core=> (dorun data)
nil

test.core=> (realized? data)
true

test.core=> (defn minmax1 [coll] (vector (apply min coll) (apply max coll)))    
#'test.core/minmax1

test.core=> (defn minmax2 [[x & xs]] (reduce (fn [[tiny big] n] [(min tiny n) (max big n)]) [x x] xs))    
#'test.core/minmax2

test.core=> (time (minmax1 data))
"Elapsed time: 806.161 msecs"
[0 99]

test.core=> (time (minmax2 data))
"Elapsed time: 6072.587 msecs"
[0 99]

+4

optimization clojure traversal sequence

Badmanchild Jan 03 '14 at 23:35

2 answers

No, juxt does not do anything clever here: ((juxt f g) x) is simply equivalent to [(f x) (g x)]. It does not merge the two scans into one.
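A small illustration of that equivalence, using a hypothetical `sample` vector that is not part of the original question. Each function passed to juxt receives the full argument list, so `min` and `max` each walk the data on their own:

```clojure
;; `sample` is an illustrative name, not from the original post.
(def sample [3 1 4 1 5 9 2 6])

;; juxt builds a function that calls min and max separately on the same args.
(def via-juxt (apply (juxt min max) sample))

;; The explicit two-scan version produces the same vector.
(def via-vector [(apply min sample) (apply max sample)])
```

Both definitions evaluate to the same `[min max]` pair, which is consistent with the nearly identical timings reported in the question.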

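If you want to traverse the collection only once, you will have to write that yourself, for example with reduce: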

(defn minmax [[x & xs]]
  (reduce 
    (fn [[tiny big] n] [(min tiny n) (max big n)]) 
    [x x]
    xs))
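As a hedged variation on the reduce above (not part of the original answer), the same single pass can be written with loop/recur. This sketch keeps the running minimum and maximum in two loop locals instead of building and destructuring a new two-element vector for every element, which is one plausible source of overhead in the reduce version. The name `minmax-loop` is made up for this example, and no timing claim is made for it:

```clojure
(defn minmax-loop
  "Single-pass [min max] of coll; returns nil for an empty collection."
  [coll]
  (when-let [s (seq coll)]
    (loop [lo   (first s)   ; running minimum
           hi   (first s)   ; running maximum
           more (next s)]   ; remaining elements, nil when exhausted
      (if more
        (let [n (first more)]
          (recur (min lo n) (max hi n) (next more)))
        [lo hi]))))
```

For example, `(minmax-loop [3 1 4 1 5 9 2 6])` returns `[1 9]`. Whether this is faster than two `apply` scans on a realized seq would need to be measured, ideally with a proper benchmarking tool rather than a single `time` call.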

+3

Chuck Jan 04 '14 at 0:44

mikera · Accepted Answer · 2014-01-04T15:09:00+0000

This may not exactly answer your general question (i.e. how to scan Clojure data structures), but it is worth knowing that this kind of code is often better served by a specialized data structure or library if you really care about performance.

e.g. using core.matrix / vectorz-clj and a bit of cheeky Java interop:

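;; assumes core.matrix and vectorz-clj are on the classpath
(require '[clojure.core.matrix :refer [array]])
(import 'mikera.vectorz.Vectorz)

;; define the raw data
(def data (for [x (range 10000000)] (rand-int 100)))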

;; convert to a Vectorz array
(def v (array :vectorz data))

(time (Vectorz/minValue v))
"Elapsed time: 18.974904 msecs"
0.0

(time (Vectorz/maxValue v))
"Elapsed time: 21.310835 msecs"
99.0

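i.e. this is perhaps 20-50 times faster than the original code given in the question.

I doubt that you will get close to this with any code that relies on regular Clojure collections, regardless of whether you do it in one pass or not. Basically: use the right tool for the job.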
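The answer above relies on Vectorz. As a rough sketch of the same idea using only clojure.core (an assumption of this example, not what the answer measured), the data can be copied into a primitive long array and scanned once with a primitive loop. The function name `minmax-longs` is hypothetical, and no timings are claimed:

```clojure
(defn minmax-longs
  "Single-pass [min max] over a primitive long array; nil when it is empty."
  [^longs arr]
  (let [n (alength arr)]
    (when (pos? n)
      (loop [i  1
             lo (aget arr 0)   ; running minimum, kept as a primitive long
             hi (aget arr 0)]  ; running maximum, kept as a primitive long
        (if (< i n)
          (let [x (aget arr i)]
            (recur (inc i) (min lo x) (max hi x)))
          [lo hi])))))

;; Usage: convert a seq to a long array first, then scan it.
(def small (long-array [3 1 4 1 5 9 2 6]))
```

Calling `(minmax-longs small)` returns `[1 9]`. The conversion with `long-array` is itself a full pass over the data, so this approach mainly pays off when the array is built once and scanned many times.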
