Problem with lazy convolution fn in Clojure

I'm writing some kind of signal processing software, and I'm starting to write discrete convolution function ,

This works fine for the first ten thousand or so list of values, but as they get larger (say 100k), I start getting StackOverflow errors, of course.

Unfortunately, I have many problems converting the urgent convolution algorithm that I have for a recursive and lazy version that is actually fast enough to use (to have at least a modest elegance style as well).

I am also not 100% sure that I have this function completely right, but - please let me know if I missed something or did something wrong. I think it is right.

(defn convolve " Convolves xs with is. This is a discrete convolution. 'xs :: list of numbers 'is :: list of numbers " [xs is] (loop [xs xs finalacc () acc ()] (if (empty? xs) (concat finalacc acc) (recur (rest xs) (if (empty? acc) () (concat finalacc [(first acc)])) (if (empty? acc) (map #(* (first xs) %) is) (vec-add (map #(* (first xs) %) is) (rest acc))))))) 

I would really appreciate any help: I ​​still get support in Clojure, and making this elegant, lazy and / or recursive would be great.

I'm a little surprised how difficult it is to express an algorithm that is pretty easy to express in an imperative language in Lisp. But maybe I'm doing it wrong!

EDIT: Just to show how easy it is to express in an imperative language and give people an algorithm that works beautifully and is easy to read, here is the Python version. Besides being shorter, more concise, and much easier to reason about, it runs faster than Clojure code: even my imperative Clojure code using Java arrays.

 from itertools import repeat def convolve(ns, ms): y = [i for i in repeat(0, len(ns)+len(ms)-1)] for n in range(len(ns)): for m in range(len(ms)): y[n+m] = y[n+m] + ns[n]*ms[m] return y 

Here, on the other hand, is the imperative Clojure code. It also transfers the last, not completely submerged values ​​from the convolution. Therefore, apart from being slow and ugly, it is not 100% functional. Not functional.

 (defn imp-convolve-1 [xs is] (let [ys (into-array Double (repeat (dec (+ (count xs) (count is))) 0.0)) xs (vec xs) is (vec is)] (map #(first %) (for [i (range (count xs))] (for [j (range (count is))] (aset ys (+ ij) (+ (* (nth xs i) (nth is j)) (nth ys (+ ij))))))))) 

It is so discouraging. Please show me, I just missed something obvious.

EDIT 3:

Here is another version that I came up with yesterday, showing how I would like to express it (although the other solutions are pretty elegant, I just put another one there!)

 (defn convolve-2 [xs is] (reduce #(vec-add %1 (pad-l %2 (inc (count %1)))) (for [x xs] (for [i is] (* xi))))) 

It uses this vec-add utility function:

 (defn vec-add ([xs] (vec-add xs [])) ([xs ys] (let [lxs (count xs) lys (count ys) xs (pad-r xs lys) ys (pad-r ys lxs)] (vec (map #(+ %1 %2) xs ys)))) ([xs ys & more] (vec (reduce vec-add (vec-add xs ys) more)))) (vec (reduce vec-add (vec-add xs ys) more)))) 
+6
functional-programming clojure signal-processing convolution
source share
5 answers
 (defn ^{:static true} convolve ^doubles [^doubles xs ^doubles is] (let [xlen (count xs) ilen (count is) ys (double-array (dec (+ xlen ilen)))] (dotimes [p xlen] (dotimes [q ilen] (let [n (+ pq), x (aget xs p), i (aget is q), y (aget ys n)] (aset ys n (+ (* xi) y))))) ys)) 

Riffing in jg-faustus version if I do this in the Clojure equiv branch. It works for me. ~ 400 ms for 1,000,000 points; ~ 25 ms for 100,000 on the i7 Mackbook Pro.

+4
source share

The likely cause of errors is that lazy tricks get too deep. ( concat and map lazy). Try wrapping these calls in doall to force an evaluation of their return values.

For a more functional solution, try something like this:

 (defn circular-convolve "Perform a circular convolution of vectors f and g" [fg] (letfn [(point-mul [mn] (* (fm) (g (mod (- nm) (count g))))) (value-at [n] (reduce + (map #(point-mul % n) (range (count g)))))] (map value-at (range (count g))))) 

Usage can use reduce to easily summarize, and since map creates a lazy sequence, this function is also lazy.

+4
source share

It is not possible to help with the high-performance functional version, but you can get 100x acceleration for the imperative version by abandoning laziness and adding type hints:

 (defn imp-convolve-2 [xs is] (let [^doubles xs (into-array Double/TYPE xs) ^doubles is (into-array Double/TYPE is) ys (double-array (dec (+ (count xs) (count is)))) ] (dotimes [i (alength xs)] (dotimes [j (alength is)] (aset ys (+ ij) (+ (* (aget xs i) (aget is j)) (aget ys (+ ij)))))) ys)) 

If the xs size is 100k and is size 2, your imp-convolve-1 takes about 6,000 ms on my machine when it is wrapped in doall, and this will take ~ 35 ms.

Update

Here is the lazy functional version:

 (defn convolve ([xs is] (convolve xs is [])) ([xs is parts] (cond (and (empty? xs) (empty? parts)) nil (empty? xs) (cons (reduce + (map first parts)) (convolve xs is (remove empty? (map rest parts)))) :else (cons (+ (* (first xs) (first is)) (reduce + (map first parts))) (lazy-seq (convolve (rest xs) is (cons (map (partial * (first xs)) (rest is)) (remove empty? (map rest parts))))))))) 

In sizes 100k and 2, it works with a frequency of ~ 600 ms (from 450 to 750 ms) versus ~ 6,000 ms for improvisation-1 and ~ 35 ms for imp-convolve-2.

Thus, it is functional, lazy and has a tolerant performance. However, it is twice as much code as the strong version, and it took me 1-2 hours to find, so I'm not sure I see the point.

I use pure functions when they make the code shorter or simpler, or have another advantage over the imperative version. When they do not, I do not mind the transition to an imperative regime.

This is one of the reasons why I think Clojure is great, as you can use any approach you like.

Update 2:

I will correct my “that it is necessary to do this functionally”, saying that I like this functional implementation (second, further down the page) by David Kabana.

This is short, readable, and time up to ~ 140 ms with the same input sizes as above (100,000 and 2), which makes it by far the most effective functional implementation of those that I have tried.

Given that it is functional (but not lazy), it does not use any type hints and works for all numeric types (not only doubles), which is quite impressive.

+4
source share
 (defn convolve [xs is] (if (> (count xs) (count is)) (convolve is xs) (let [os (dec (+ (count xs) (count is))) lxs (count xs) lis (count is)] (for [i (range os)] (let [[start-x end-x] [(- lxs (min lxs (- os i))) (min i (dec lxs))] [start-i end-i] [(- lis (min lis (- os i))) (min i (dec lis))]] (reduce + (map * (rseq (subvec xs start-x (inc end-x))) (subvec is start-i (inc end-i))))))))) 

You can express a lazy, functional solution in concise terms. Alas, performance for> 2k is impractical. I am interested to know if there are ways to speed it up without sacrificing readability.

Edit: After reading drcabana's informative post on the topic ( http://erl.nfshost.com/2010/07/17/discrete-convolution-of-finite-vectors/ ), I updated my code to accept vectors of different sizes. Its implementation is more efficient: for xs size 3, has a size of 1,000,000, ~ 2s versus ~ 3s

Edit 2: Accepting drcabana's ideas, just changing xs and padding, I came:

 (defn convolve [xs is] (if (> (count xs) (count is)) (convolve is xs) (let [is (concat (repeat (dec (count xs)) 0) is)] (for [s (take-while not-empty (iterate rest is))] (reduce + (map * (rseq xs) s)))))) 

This is probably as concise as collecting, but still slower, probably due to capture. Kudos to the blog author for a well thought out approach. The only advantage here is that the above is really lazy in that if I ask (nth res 10000), only the first 10k calculations are needed to achieve the result.

+3
source share

Not quite the answer to any of the many questions you asked, but I have a few comments on those that you did not ask.

  • You probably shouldn't use nth for vectors. Yes, this is O (1), but since nth works with other sequences in O (n), it (a) does not make it clear that you expect the input to be a vector, and (b) means if you make a mistake, your the code will mysteriously become very slow, and will not work immediately.

  • for and map are lazy, and aset are just side effects. This combination is a recipe for disaster: for a side effect of for use, use doseq .

  • for and doseq allow multiple bindings, so you do not need to accumulate loads from them, as you (apparently) in Python.

     (doseq [i (range (count cs)) j (range (count is))] ...) 

    will do what you want.

  • #(first %) more briefly written as first ; similarly, #(+ %1 %2) is + .

  • Calling vec on a bunch of intermediate results that shouldn't be vectors will slow you down. In particular, in vec-add enough to call only vec when you make the final return value: in (vec (reduce foo bar)) there is no reason for foo turn its intermediate results into vectors if it never uses them for random access.

+3
source share

All Articles