So counting files in parallel should be simple, right?
It isn't :)
I tried to find a better solution to this problem. I realized that I was doing blocking I/O, so pmap is not the right tool for the job. I thought it might make sense to hand pieces of the directory tree (branches) to agents so that each piece is processed independently. It seems like it does :) Well, I haven't benchmarked it yet.
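To make the agent idea concrete, here is a minimal sketch (not the code from this answer). It assumes one agent per directory and uses `send-off`, whose thread pool is meant for actions that may block, whereas pmap only keeps a small number of tasks in flight, roughly one per core. The names `dir-file-bytes` and `tree-bytes` are made up for the illustration, and `file-seq` still enumerates the directories on the calling thread.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical helper: total length of the regular files directly inside dir.
(defn dir-file-bytes [^java.io.File dir]
  (->> (.listFiles dir)                       ; nil if dir cannot be read
       (filter #(.isFile ^java.io.File %))
       (map #(.length ^java.io.File %))
       (reduce + 0)))

;; Hypothetical driver: one agent per directory, dispatched with send-off
;; because listing a directory is blocking I/O.
(defn tree-bytes [root]
  (let [dirs   (filter #(.isDirectory ^java.io.File %)
                       (file-seq (io/file root)))
        agents (mapv (fn [d] (send-off (agent 0) (fn [_] (dir-file-bytes d))))
                     dirs)]
    (apply await agents)                      ; block until every action has run
    (reduce + 0 (map deref agents))))
```

Calling `(tree-bytes "/some/dir")` returns the sum of file lengths under that directory. In a standalone script you would probably call `(shutdown-agents)` at the end so the JVM can exit promptly.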
This works, but there may be some problems with symbolic links on UNIX-like systems.
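The symlink problem arises because `.isDirectory` follows links, so a link that points back to one of its ancestors makes a recursive walk loop forever. One possible guard, sketched here under the assumption that simply skipping links is acceptable, is to test each entry with `java.nio.file.Files/isSymbolicLink`. The names `symlink?` and `tree-bytes-no-links` are hypothetical and not part of the code in this answer.

```clojure
(import '(java.nio.file Files))

;; Hypothetical predicate: true when f is itself a symbolic link.
(defn symlink? [^java.io.File f]
  (Files/isSymbolicLink (.toPath f)))

;; Hypothetical walker: sums file lengths and never descends into
;; (or counts) symbolic links.
(defn tree-bytes-no-links [^java.io.File f]
  (cond
    (symlink? f)     0
    (.isDirectory f) (reduce + 0 (map tree-bytes-no-links (.listFiles f)))
    :else            (.length f)))
```

Skipping links means that data reachable only through a link is not counted; whether that is what you want depends on what the total is supposed to mean.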
(def user-dir (clojure.java.io/file "/home/janko/projects/"))
(def root-dir (clojure.java.io/file "/"))
(def run? (atom true))
(def *max-queue-length* 1024)
(def *max-wait-time* 1000) ;; wait max 1 second then process anything left
(def *chunk-size* 64)
(def queue (java.util.concurrent.LinkedBlockingQueue. *max-queue-length*))
(def agents (atom []))
(def size-total (atom 0))
(def a (agent []))

(defn branch-producer [node]
  (if @run?
    (doseq [f node]
      (when (.isDirectory f)
        (do (.put queue f)
            (branch-producer (.listFiles f)))))))

(defn producer [node]
  (future (branch-producer node)))

(defn node-consumer [node]
  (if (.isFile node)
    (.length node)
    0))

(defn chunk-length []
  (min (.size queue) *chunk-size*))

(defn compute-sizes [a]
  (doseq [i (map (fn [f] (.listFiles f)) a)]
    (swap! size-total
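The consumer side is not shown above, so the following is only an illustration of the general pattern that the queue settings suggest: wait a bounded time for work, then take whatever is available up to a chunk size. It is a sketch, not a reconstruction of the missing code, and `take-chunk` with its parameters is a made-up name.

```clojure
(import '(java.util.concurrent LinkedBlockingQueue TimeUnit)
        '(java.util ArrayList))

;; Hypothetical helper: wait at most wait-ms for a first item, then grab
;; up to chunk-size items in total without waiting any further.
(defn take-chunk [^LinkedBlockingQueue q chunk-size wait-ms]
  (let [buf (ArrayList.)]
    (when-let [head (.poll q (long wait-ms) TimeUnit/MILLISECONDS)]
      (.add buf head)
      (.drainTo q buf (int (dec chunk-size))))
    (vec buf)))                 ; empty vector when the wait timed out
```

A consumer loop could call `take-chunk` repeatedly and send each non-empty chunk to an agent. An empty result only means that nothing arrived within the wait time, not necessarily that the producer has finished.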
You can run it by typing:
(producer (list user-dir)) (consumer)
To see the result, type:
@size-total
You can stop it like this (there are running futures involved; correct me if I am wrong):
(swap! run? not)
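Note that `(swap! run? not)` toggles the flag, so evaluating it a second time switches the work back on. A small sketch of the same cooperative-stop idea using `reset!` instead is shown here; the names `running?` and `start-worker` are hypothetical, and the loop body is only a stand-in for real work.

```clojure
(def running? (atom true))      ; hypothetical flag, plays the role of run?

;; Hypothetical worker: the future checks the flag on every iteration
;; and finishes on its own once the flag is false.
(defn start-worker [counter]
  (future
    (while @running?
      (swap! counter inc)
      (Thread/sleep 1))))

;; Usage:
;; (def w (start-worker (atom 0)))
;; (reset! running? false)      ; idempotent, unlike (swap! running? not)
;; @w                           ; returns once the loop has exited
```

A future only stops this way if its code actually re-reads the flag. Work that is blocked inside a long I/O call will not notice the change until that call returns.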
If you find any errors or bugs, feel free to share your ideas!