I am trying to implement a modified parallel depth-first search algorithm in Erlang (let's call it `dfs_mod`).
All I want to get is all the "dead ends": basically, the paths that are returned when `dfs_mod` visits a vertex with no neighbours, or a vertex whose neighbours have all been visited already. I save each path to `ets_table1` if my user-defined function `fun1(Path)` returns `true`, and to `ets_table2` if `fun1(Path)` returns `false` (I need to filter the resulting dead-end paths with some client-side filter).
I implemented a serial version of this algorithm, and for some strange reason it performs better than the parallel one.
The idea of the parallel implementation is simple:

- visit a `Vertex` from `[Vertex|Other_vertices] = Unvisited_neighbours`;
- add this `Vertex` to the current path;
- send `{self(), wait}` to the 'collector' process;
- run `dfs_mod` for the `Unvisited_neighbours` of the current `Vertex` in a new process;
- continue the current `dfs_mod` with the remaining vertices (`Other_vertices`);
- when there are no more vertices to visit, send `{self(), done}` to the collector process and return.
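In pseudo-Erlang, the steps above look something like the sketch below. This is a simplified illustration, not my exact code: the dead-end bookkeeping (saving `Path` to `ets_table1`/`ets_table2` via `fun1`) is omitted, and the function and variable names are only meant to match the description.

```erlang
-module(dfs_sketch).
-export([dfs_mod/4]).

%% Simplified sketch of the parallel step (ETS writes omitted).
dfs_mod(_Graph, [], _Path, Collector) ->
    %% No vertices left on this branch: report back.
    %% (The real code would also save the dead-end Path to
    %% ets_table1 or ets_table2 depending on fun1(Path).)
    Collector ! {self(), done},
    ok;
dfs_mod(Graph, [Vertex | Other_vertices], Path, Collector) ->
    Collector ! {self(), wait},
    %% Neighbours of Vertex that are not already on the current path.
    Unvisited = [V || V <- digraph:out_neighbours(Graph, Vertex),
                      not lists:member(V, Path)],
    %% A new process explores the neighbours; we keep the siblings.
    spawn(fun() -> dfs_mod(Graph, Unvisited, [Vertex | Path], Collector) end),
    dfs_mod(Graph, Other_vertices, Path, Collector).
```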
So, basically, every time I visit a vertex with unvisited neighbours, I spawn a new depth-first search process and then continue with the other vertices.
Immediately after spawning the first `dfs_mod` process, I start collecting all the `{Pid, wait}` and `{Pid, done}` messages (a `wait` message makes the collector keep waiting for the `done` messages). After N milliseconds with no new messages, the collector function returns `ok`.
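Roughly, I understand my collector as the following loop (a simplified sketch matching the description above, with N being the timeout in milliseconds):

```erlang
-module(collector_sketch).
-export([collector/1]).

%% The collector keeps draining {Pid, wait} / {Pid, done} messages;
%% once nothing arrives for N milliseconds it assumes every dfs_mod
%% process has finished and returns ok.
collector(N) ->
    receive
        {_Pid, wait} -> collector(N);
        {_Pid, done} -> collector(N)
    after N ->
        ok
    end.
```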
For some reason, this parallel implementation takes anywhere from 8 to 160 seconds, while the serial version takes only 4 seconds (testing was done on a fully connected digraph with 5 vertices, on a machine with an Intel i5 processor).
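For reference, the test graph can be built like this (the vertex labels here are illustrative; the real ones carry the "n"/"m" letters mentioned below):

```erlang
-module(graph_sketch).
-export([build_graph/0]).

%% A fully connected digraph on 5 vertices: every ordered pair
%% (A, B) with A =/= B gets an edge, so 5 * 4 = 20 edges total.
build_graph() ->
    G = digraph:new(),
    Vs = [v1, v2, v3, v4, v5],
    [digraph:add_vertex(G, V) || V <- Vs],
    [digraph:add_edge(G, A, B) || A <- Vs, B <- Vs, A =/= B],
    G.
```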
Here are my thoughts on such poor performance:
- I pass the `Graph` digraph to every new process that runs `dfs_mod`. Maybe running `digraph:out_neighbours(Graph)` against one digraph from many processes causes the slowness?
- I accumulate the current path in a list and pass it to each new `dfs_mod` process; maybe copying so many lists is the problem?
- I use an ETS table to save a path every time I visit a new vertex and add it to the path. The ETS properties are `[bag, public, {write_concurrency, true}]`, but maybe I am doing something wrong there?
- Every time I visit a new vertex and add it to the path, I check the path with the custom function `fun1()` (it basically checks whether the path contains vertices labelled with the letter "n" occurring before vertices labelled with "m", and returns `true`/`false` depending on the result). Maybe this `fun1()` slows things down?
- I tried running `dfs_mod` without collecting the `done` and `wait` messages, but `htop` shows a lot of Erlang activity for quite some time after `dfs_mod` has returned `ok` in the shell, so I don't think the message passing is what slows it down.
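For completeness, this is how I create the table for the dead-end paths (the table name here is illustrative; `ets_table2` is created the same way):

```erlang
-module(ets_sketch).
-export([new_path_table/0]).

%% Dead-end paths table: `bag` keeps every inserted path,
%% `public` lets any dfs_mod process write to it, and
%% write_concurrency reduces lock contention on inserts.
new_path_table() ->
    ets:new(ets_table1, [bag, public, {write_concurrency, true}]).
```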
How can I make my parallel `dfs_mod` faster than its sequential counterpart?
Edit: when the parallel `dfs_mod` is running, `pman` shows no processes at all, although `htop` shows that all 4 CPU threads are busy.
skanatek