Split seq into a window predicate in Clojure

I would like to β€œchunk” seq in subseqs the same way as for the section, except that the function is not applied to each individual element, but to a range of elements.

So for example:

(gather (fn [ab] (> (- ba) 2)) [1 4 5 8 9 10 15 20 21]) 

will result in:

 [[1] [4 5] [8 9 10] [15] [20 21]] 

Similarly:

 (defn f [ab] (> (- ba) 2)) (gather f [1 2 3 4]) ;; => [[1 2 3] [4]] (gather f [1 2 3 4 5 6 7 8 9]) ;; => [[1 2 3] [4 5 6] [7 8 9]] 

The idea is that I apply the beginning of the list and the next element to the function, and if the function returns true, we split the current list title to this point into a new section.

I wrote this:

 (defn gather [pred? lst] (loop [acc [] cur [] l lst] (let [a (first cur) b (first l) nxt (conj cur b) rst (rest l)] (cond (empty? l) (conj acc cur) (empty? cur) (recur acc nxt rst) ((complement pred?) ab) (recur acc nxt rst) :else (recur (conj acc cur) [b] rst))))) 

and it works, but I know there is an easier way. My question is:

Is there a built-in function for this where this function is not needed? If not, is there a more idiomatic (or simpler) solution that I missed? Something combining reduction and reduction of time?

Thanks.

+7
clojure
source share
5 answers

Original interpretation of the question

We (all) seemed to misinterpret your question as wanting to start a new section whenever a predicate was held for consecutive elements.

Another lazy partition-by

 (defn partition-between [pred? coll] (let [switch (reductions not= true (map pred? coll (rest coll)))] (map (partial map first) (partition-by second (map list coll switch))))) 

 (partition-between (fn [ab] (> (- ba) 2)) [1 4 5 8 9 10 15 20 21]) ;=> ((1) (4 5) (8 9 10) (15) (20 21)) 

Actual question

The actual question asks us to start a new section whenever pred? runs to start the current section and current item. To do this, we can simply drop the partition-by with a few settings to our source.

 (defn gather [pred? coll] (lazy-seq (when-let [s (seq coll)] (let [fst (first s) run (cons fst (take-while #((complement pred?) fst %) (next s)))] (cons run (gather pred? (seq (drop (count run) s)))))))) 

 (gather (fn [ab] (> (- ba) 2)) [1 4 5 8 9 10 15 20 21]) ;=> ((1) (4 5) (8 9 10) (15) (20 21)) (gather (fn [ab] (> (- ba) 2)) [1 2 3 4]) ;=> ((1 2 3) (4)) (gather (fn [ab] (> (- ba) 2)) [1 2 3 4 5 6 7 8 9]) ;=> ((1 2 3) (4 5 6) (7 8 9)) 
+10
source share

Since you need to have information from the previous or next elements other than the one you are currently deciding on, partition pairs with reduce can do the trick in this case.

Here is what I came up with after several iterations:

 (defn gather [pred s] (->> (partition 2 1 (repeat nil) s) ; partition the sequence and if necessary ; fill the last partition with nils (reduce (fn [acc [x :as s]] (let [n (dec (count acc)) acc (update-in acc [n] conj x)] (if (apply pred s) (conj acc []) acc))) [[]]))) (gather (fn [ab] (when (and ab) (> (- ba) 2))) [1 4 5 8 9 10 15 20 21]) ;= [[1] [4 5] [8 9 10] [15] [20 21]] 

The basic idea is to partition the number of elements that the predicate function performs, filling out the last nil section if necessary. The function then reduces each section to determine whether the predicate is executed; if so, then the first element in the section is added to the current group and a new group is created. Since the last section could be filled with zeros, the predicate must be changed.

Towing possible improvements to this feature will allow the user to:

  • Define the value to fill in the last section, so the reduction function can check if any of the elements in the section is this value.
  • Indicate the arity of the predicate, which allows you to determine the grouping taking into account the current and next n elements.
+2
source share

I wrote this some time ago in useful:

 (defn partition-between [split? coll] (lazy-seq (when-let [[x & more] (seq coll)] (lazy-loop [items [x], coll more] (if-let [[x & more] (seq coll)] (if (split? [(peek items) x]) (cons items (lazy-recur [x] more)) (lazy-recur (conj items x) more)) [items]))))) 

It uses lazy-loop , which is a way of writing lazy-seq expressions that look like loop/recur , but I hope this is clear enough.

I am connected with the historical version of the function, because later I understood there a more general function that you can use to implement partition-between , or partition-by , or indeed many other sequential functions. These days, the implementation is much shorter , but less obvious what happens if you are not familiar with the more general function that I called glue

 (defn partition-between [split? coll] (glue conj [] (fn [vx] (not (split? [(peek v) x]))) (constantly false) coll)) 

Please note that both of these solutions are lazy, which while I am writing this does not apply to any of the other solutions in this thread.

+1
source share

Here is one way with a breakdown of steps. It can be narrowed down to fewer statements.

 (def l [1 4 5 8 9 10 15 20 21]) (defn reduce_fn [fxy] (cond (f (last (last x)) y) (conj x [y]) :else (conj (vec (butlast x)) (conj (last x) y)) ) ) (def reduce_fn1 (partial reduce_fn #(> (- %2 %1) 2))) (reduce reduce_fn1 [[(first l)]] (rest l)) 
0
source share

keep-indexed is a great feature. For the function f and the vector lst ,

 (keep-indexed (fn [idx it] (if (apply f it) idx)) (partition 2 1 lst))) (0 2 5 6) 

this returns the indices after which you want to split. Let them increase them and fix them on the front 0:

 (cons 0 (map inc (.....))) (0 1 3 6 7) 

Separate them to get ranges:

 (partition 2 1 nil (....)) ((0 1) (1 3) (3 6) (6 7) (7)) 

Now use them to generate subvecs :

 (map (partial apply subvec lst) ....) ([1] [4 5] [8 9 10] [15] [20 21]) 

Putting it all together:

 (defn gather [f lst] (let [indices (cons 0 (map inc (keep-indexed (fn [idx it] (if (apply f it) idx)) (partition 2 1 lst))))] (map (partial apply subvec (vec lst)) (partition 2 1 nil indices)))) (gather #(> (- %2 %) 2) '(1 4 5 8 9 10 15 20 21)) ([1] [4 5] [8 9 10] [15] [20 21]) 
0
source share

All Articles