Depth Tree Iterator Implementation in Python

I am trying to implement the iterator class for optional binary trees in Python. After the iterator is built with the root of the node tree, its next() function can be called again to traverse the tree in depth order (for example, this order ), finally returning None when there are no nodes.

Here is the base Node class for a tree:

 class Node(object): def __init__(self, title, children=None): self.title = title self.children = children or [] self.visited = False def __str__(self): return self.title 

As you can see above, I introduced the visited property for nodes for my first approach, since I did not see the path around it. With this extra state, the Iterator class looks like this:

 class Iterator(object): def __init__(self, root): self.stack = [] self.current = root def next(self): if self.current is None: return None self.stack.append(self.current) self.current.visited = True # Root case if len(self.stack) == 1: return self.current while self.stack: self.current = self.stack[-1] for child in self.current.children: if not child.visited: self.current = child return child self.stack.pop() 

This is good and good, but I want to get rid of the need for the visited property without resorting to recursion or other changes to the Node class.

All the state I need should take care of the iterator, but I don't understand how this can be done. Saving the visited list for the whole tree is non-scalable and out of the question, so there should be a smart way to use the stack.

What bothers me especially is because the next() function certainly returns, how can I remember where I was without labeling or using redundant storage? Intuitively, I think about iterating over children, but this logic is interrupted / forgotten when the next() function returns!

UPDATE - here is a little test:

 tree = Node( 'A', [ Node('B', [ Node('C', [ Node('D') ]), Node('E'), ]), Node('F'), Node('G'), ]) iter = Iterator(tree) out = object() while out: out = iter.next() print out 
+7
python iterator algorithm depth-first-search tree
source share
2 answers

If you really need to avoid recursion, this iterator works:

 from collections import deque def node_depth_first_iter(node): stack = deque([node]) while stack: # Pop out the first element in the stack node = stack.popleft() yield node # push children onto the front of the stack. # Note that with a deque.extendleft, the first on in is the last # one out, so we need to push them in reverse order. stack.extendleft(reversed(node.children)) 

With that said, I think you think too much about it. A good (recursive) generator also does the trick:

 class Node(object): def __init__(self, title, children=None): self.title = title self.children = children or [] def __str__(self): return self.title def __iter__(self): yield self for child in self.children: for node in child: yield node 

both of them pass your tests:

 expected = ['A', 'B', 'C', 'D', 'E', 'F', 'G'] # Test recursive generator using Node.__iter__ assert [str(n) for n in tree] == expected # test non-recursive Iterator assert [str(n) for n in node_depth_first_iter(tree)] == expected 

and you can easily make Node.__iter__ use a non-recursive form if you want:

 def __iter__(self): return node_depth_first_iter(self) 
+7
source share

However, it could still potentially contain every label. I want an iterator to save only a subset of the tree at a time.

But you already hold everything. Remember that an object is essentially a dictionary with an entry for each attribute. Having self.visited = False in __init__ of Node means that you keep the redundant key "visited" and False for each Node object no matter what. At the very least, the set also has the potential to not hold every ID node. Try the following:

 class Iterator(object): def __init__(self, root): self.visited_ids = set() ... def next(self): ... #self.current.visited = True self.visited_ids.add(id(self.current)) ... #if not child.visited: if id(child) not in self.visited_ids: 

Finding an identifier in a set should be as fast as accessing the node attribute. The only way that can be more wasteful than your decision is the overhead of the set object itself (and not its elements), which is worrying only if you have several parallel iterators (which you obviously don’t do, otherwise node visited may not be useful to you).

0
source share

All Articles