Julia: Parallel for loop over partitions iterator

So, I'm trying to iterate over the list of partitions of something, say 1:n for some n between 13 and 21. The code I would ideally like to run looks something like this:

    valid_num = @parallel (+) for p in partitions(1:n)
        int(is_valid(p))
    end
    println(valid_num)

This uses the reducing form of @parallel for. Compare it, for example, with this example from the Julia documentation:

    nheads = @parallel (+) for i = 1:200000000
        Int(rand(Bool))
    end

However, if I try to adapt the loop, I get the following error:

    ERROR: `getindex` has no method matching getindex(::SetPartitions{UnitRange{Int64}}, ::Int64)
     in anonymous at no file:1433
     in anonymous at multi.jl:1279
     in run_work_thunk at multi.jl:621
     in run_work_thunk at multi.jl:630
     in anonymous at task.jl:6

which I think is because I'm trying to iterate over something that doesn't have the indexable 1:n form (EDIT: more precisely, because you cannot call p[3] if p = partitions(1:n)).
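Here is a minimal illustration (my own sketch, using the v0.3-era start/done/next iteration protocol this error comes from): the partitions object supports sequential iteration but not random access, and @parallel needs getindex to split the loop range among workers.

    p = partitions(1:4)

    # Sequential iteration works: SetPartitions implements start/done/next.
    state = start(p)
    first_partition, state = next(p, state)
    println(first_partition)   # one partition, an array of arrays

    # Random access does not work, and @parallel relies on it to divide
    # the loop range among workers:
    # p[3]  # ERROR: `getindex` has no method matching getindex(::SetPartitions{...}, ::Int64)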

I tried using pmap to solve this problem, but since the number of partitions gets very large very quickly (there are more than 2.5 million partitions of 1:13, and by the time I get to 1:21 things will be huge), creating such a large array becomes a problem. I left it running overnight and it still had not finished.

Does anyone have any tips on how to do this efficiently in Julia? I have access to a 30-core machine, and my task seems easily parallelizable, so I would be very grateful if anyone knows a good way to do this in Julia.

Thank you very much!

2 answers

The code below gives me 511, the number of partitions of a set of size 10 into 2 blocks.

    using Iterators

    s = [1,2,3,4,5,6,7,8,9,10]
    is_valid(p) = length(p)==2

    valid_num = @parallel (+) for i = 1:30
        sum(map(is_valid, takenth(chain(1:29, drop(partitions(s), i-1)), 30)))
    end

This solution combines the takenth, drop, and chain iterators to get the same effect as the take_every iterator below under PREVIOUS ANSWER. Note that in this solution, every process still has to compute every partition. However, because each process passes a different argument to drop, no two processes ever call is_valid on the same partition.
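To make the trick concrete, here is a small single-process trace (my own example, assuming the same Iterators package): the 29 dummy values prepended by chain shift the stream so that the first element takenth picks is exactly partition number i.

    using Iterators

    s = [1,2,3]      # a set with only 5 partitions, so the effect is easy to see
    i = 2            # pretend we are worker 2 of 30
    for p in takenth(chain(1:29, drop(partitions(s), i-1)), 30)
        println(p)   # yields partitions 2, 32, 62, ... (here only partition 2 exists)
    end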

Unless you want to do a lot of math to figure out how to actually skip ahead to the partitions you need, there is no way to avoid iterating over the partitions sequentially in at least one process. I think Simon's answer has one process do this and then distributes the partitions. Mine has each worker process compute the partitions itself, which means the computation is duplicated. However, it is duplicated in parallel, which (if you actually have 30 processors) costs you no extra wall-clock time.
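As a rough back-of-the-envelope cost model (my own assumption, not from either answer): if producing one partition costs t_iter and testing it costs t_valid, then each of W workers pays the full iteration cost but only 1/W of the testing cost, so the duplication only hurts when iteration dominates.

    # Hypothetical cost model with made-up unit costs.
    # Sequential time: P*(t_iter + t_valid); per-worker time: P*t_iter + P*t_valid/W.
    speedup(t_iter, t_valid, W) = (t_iter + t_valid) / (t_iter + t_valid / W)

    println(speedup(1.0, 10.0, 30))   # ~8.3x despite every worker iterating everything
    println(speedup(1.0, 0.1, 30))    # ~1.1x: cheap tests make the duplication dominate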

Here is a resource on how iterators over partitions are computed: http://www.informatik.uni-ulm.de/ni/Lehre/WS03/DMM/Software/partitions.pdf .

PREVIOUS ANSWER (More complicated than necessary)

I noticed that Simon answered while I was writing mine. Our solutions are quite similar, except that mine uses iterators to avoid storing the partitions in memory. I'm not sure which will actually be faster for which set sizes, but I figure it's good to have both options. Assuming it takes you significantly longer to compute is_valid than to compute the partitions themselves, you can do something like this:

    s = [1,2,3,4]
    is_valid(p) = length(p)==2

    valid_num = @parallel (+) for i = 1:30
        foldl((x,y)->(x + int(is_valid(y))), 0, take_every(partitions(s), i-1, 30))
    end

which gives me 7, the number of partitions of a set of size 4 into 2 blocks. The take_every function returns an iterator that yields every 30th partition, starting from the i-th. Here is its code:

    import Base: start, done, next

    # Iterator that skips `offset` elements of `itr` and then yields every
    # `skip`-th element after that.
    immutable TakeEvery{Itr}
        itr::Itr
        start::Any   # state of the underlying iterator
        value::Any   # next value to be yielded
        flag::Bool   # true if `value` is the last element
        skip::Int64
    end

    function take_every(itr, offset, skip)
        value, state = Nothing, start(itr)
        for i = 1:(offset+1)
            if done(itr, state)
                return TakeEvery(itr, state, value, false, skip)
            end
            value, state = next(itr, state)
        end
        if done(itr, state)
            TakeEvery(itr, state, value, true, skip)
        else
            TakeEvery(itr, state, value, false, skip)
        end
    end

    function start{Itr}(itr::TakeEvery{Itr})
        itr.value, itr.start, itr.flag
    end

    function next{Itr}(itr::TakeEvery{Itr}, state)
        value, state_, flag = state
        for i = 1:itr.skip
            if done(itr.itr, state_)
                return state[1], (value, state_, false)
            end
            value, state_ = next(itr.itr, state_)
        end
        if done(itr.itr, state_)
            state[1], (value, state_, !flag)
        else
            state[1], (value, state_, false)
        end
    end

    function done{Itr}(itr::TakeEvery{Itr}, state)
        done(itr.itr, state[2]) && !state[3]
    end
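As a quick sanity check (my own addition, not part of the original answer), the 30 interleaved take_every iterators together should visit each partition exactly once:

    s = [1,2,3,4]   # 15 partitions in total
    total = 0
    for i = 1:30
        for p in take_every(partitions(s), i-1, 30)
            total += 1
        end
    end
    println(total == length(partitions(s)))   # true: every partition counted once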

One approach is to divide the problem into pieces that are not too large to materialize, and then process the elements within each piece in parallel, e.g. as follows:

    function my_take(iter, state, n)
        i = n
        arr = Array[]
        while !done(iter, state) && (i > 0)
            a, state = next(iter, state)
            push!(arr, a)
            i = i - 1
        end
        return arr, state
    end

    function get_part(npart, npar)
        valid_num = 0
        p = partitions(1:npart)
        s = start(p)
        while !done(p, s)
            arr, s = my_take(p, s, npar)
            valid_num += @parallel (+) for a in arr
                length(a)
            end
        end
        return valid_num
    end

    valid_num = @time get_part(10, 30)

I was going to use the take() method to materialize up to npar elements from the iterator, but take() seems to be deprecated, so I've included my own implementation, which I've called my_take(). The get_part() function then uses my_take() to obtain up to npar partitions at a time and performs a calculation on them. In this case, the calculation simply adds up their lengths, because I don't have the code for the OP's is_valid() function. get_part() then returns the result.
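For instance, with a stand-in predicate (my own substitution, reusing my_take from above; the placeholder is_valid mimics the one in the other answer, not the OP's real test), the reduction would count valid partitions instead of summing lengths:

    @everywhere is_valid(p) = length(p) == 2   # placeholder; workers need the definition too

    function get_valid(npart, npar)
        valid_num = 0
        p = partitions(1:npart)
        s = start(p)
        while !done(p, s)
            arr, s = my_take(p, s, npar)
            valid_num += @parallel (+) for a in arr
                int(is_valid(a))
            end
        end
        return valid_num
    end

    println(get_valid(10, 30))   # 511 two-block partitions, matching the other answer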

Because computing length() is not very time consuming, this code actually runs slower on multiple parallel processes than it does on a single process:

    $ julia -p 1 parpart.jl
    elapsed time: 10.708567515 seconds (373025568 bytes allocated, 6.79% gc time)

    $ julia -p 2 parpart.jl
    elapsed time: 15.70633439 seconds (548394872 bytes allocated, 9.14% gc time)

Alternatively, pmap() could be used on each piece of the problem instead of the parallel for loop.
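Here is a sketch of that variant (my own, untested), reusing my_take from above with the same stand-in length calculation:

    function get_part_pmap(npart, npar)
        total = 0
        p = partitions(1:npart)
        s = start(p)
        while !done(p, s)
            arr, s = my_take(p, s, npar)
            total += sum(pmap(length, arr))   # farm each materialized chunk out to workers
        end
        return total
    end

    println(get_part_pmap(10, 30))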

Regarding the memory issue, materializing 30 elements from partitions(1:10) took about 1 gigabyte of memory on my PC when I ran Julia with 4 worker processes, so I expect that materializing even a small subset of partitions(1:21) will require a great deal of memory. It may be advisable to estimate how much memory would be needed, to see whether the computation is possible at all, before attempting it.

Regarding the calculation time, please note that:

    julia> length(partitions(1:10))
    115975

    julia> length(partitions(1:21))
    474869816156751

... so even efficient parallel processing on 30 cores may not be enough to solve the larger problem in a reasonable amount of time.
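To put a number on that (my own arithmetic, with a deliberately optimistic made-up throughput):

    total = 474869816156751   # length(partitions(1:21)) from above
    rate  = 1e9               # hypothetical: 10^9 partitions/second across all 30 cores
    println(total / rate / 86400)   # ~5.5 days just to visit every partition once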

