Topological Sorting in PHP

I found this topological sort function for PHP:

Source: http://www.calcatraz.com/blog/php-topological-sort-function-384/

function topological_sort($nodeids, $edges) { $L = $S = $nodes = array(); foreach($nodeids as $id) { $nodes[$id] = array('in'=>array(), 'out'=>array()); foreach($edges as $e) { if ($id==$e[0]) { $nodes[$id]['out'][]=$e[1]; } if ($id==$e[1]) { $nodes[$id]['in'][]=$e[0]; } } } foreach ($nodes as $id=>$n) { if (empty($n['in'])) $S[]=$id; } while (!empty($S)) { $L[] = $id = array_shift($S); foreach($nodes[$id]['out'] as $m) { $nodes[$m]['in'] = array_diff($nodes[$m]['in'], array($id)); if (empty($nodes[$m]['in'])) { $S[] = $m; } } $nodes[$id]['out'] = array(); } foreach($nodes as $n) { if (!empty($n['in']) or !empty($n['out'])) { return null; // not sortable as graph is cyclic } } return $L; } 

I look beautiful and short. In any case, for some input - I get duplicate lines in the output - see http://codepad.org/thpzCOyn

Generally, the sorting seems correct if I delete duplicates using array_unique()

I checked the function with two examples, and the sort itself looks right.

Should I just call array_unique() for the result?

+7
source share
2 answers

You get duplicate rows because there are duplicate edges. I'm not the head of graph theory, but I'm sure this is not legal:

 0 => array ( 0 => 'nominal', 1 => 'subtotal', ), 2 => array ( 0 => 'nominal', 1 => 'subtotal', ), ... 

You can add a test to the part that creates the nodes, something like this:

 if ($id==$e[0] && !in_array($e[1], $nodes[$id]['out'])) { $nodes[$id]['out'][]=$e[1]; } if ($id==$e[1] && !in_array($e[0], $nodes[$id]['in'])) // Not needed but cleaner { $nodes[$id]['in'][]=$e[0]; } 

... or just make sure you don't pass duplicate edges to the function .: P

+3
source

I am the author of the original topological sort function. Thanks to Alex for drawing attention to the issue of duplication. I updated the function to correctly remove duplicate edges and nodes. The updated version is here:

http://www.calcatraz.com/blog/php-topological-sort-function-384 (Same as the original link)

I added the following to achieve deduplication:

 // remove duplicate nodes $nodeids = array_unique($nodeids); // remove duplicate edges $hashes = array(); foreach($edges as $k=>$e) { $hash = md5(serialize($e)); if (in_array($hash, $hashes)) { unset($edges[$k]); } else { $hashes[] = $hash; }; } 

I had to serialize the edges to make sure the duplicates were deleted correctly. I also tidied up the rest of the function a bit and added some comments.

+7
source

All Articles