Find if this amount exists on the way to BST

The question is to determine if a given amount exists on any path in the BST. The question is simple if the path means the root for the leaf, or easily if the path means part of the path from the root to the leaf, which may not include the root or leaf. But here it becomes difficult because the path can span both the left and right child of the node. For example, in this figure, the sum 132 exists in a circular way. How can I find the existence of such a path? Using a hash to store all possible amounts in a node is discouraged!

enter image description here

+8
algorithm binary-tree binary-search-tree
source share
4 answers

You can, of course, generate all possible paths, adding up gradually when you go. The fact that the tree is a BST can save you time by limiting certain amounts, although I'm not sure if this will give an asymptotic increase in speed. The problem is that the amount formed using the left child of a given node will not necessarily be less than the amount formed using the correct child, since the path for the first sum can contain many more nodes. The following algorithm will work for all trees, not just BST.

To create all possible paths, note that the topmost point of the path is special: it is the only point on the path that is allowed (although not required) to have both children in the path. Each path contains a single highest point. Therefore, the outer recursion layer should be to visit each node tree and generate all the paths that have the node as the highest point.

// Report whether any path whose topmost node is t sums to target. // Recurses to examine every node under t. EnumerateTopmost(Tree t, int target) { // Get a list of sums for paths containing the left child. // Include a 0 at the start to account for a "zero-length path" that // does not contain any children. This will be in increasing order. a = append(0, EnumerateSums(t.left)) // Do the same for paths containing the right child. This needs to // be sorted in decreasing order. b = reverse(append(0, EnumerateSums(t.right))) // "List match" to detect any pair of sums that works. // This is a linear-time algorithm that takes two sorted lists -- // one increasing, the other decreasing -- and detects whether there is // any pair of elements (one from the first list, the other from the // second) that sum to a given value. Starting at the beginning of // each list, we compute the current sum, and proceed to strike out any // elements that we know cannot be part of a satisfying pair. // If the sum of a[i] and b[j] is too small, then we know that a[i] // cannot be part of any satisfying pair, since all remaining elements // from b that it could be added to are at least as small as b[j], so we // can strike it out (which we do by advancing i by 1). Similarly if // the sum of a[i] and b[j] is too big, then we know that b[j] cannot // be part of any satisfying pair, since all remaining elements from a // that b[j] could be added to are at least as big as a[i], so we can // strike it out (which we do by advancing j by 1). If we get to the // end of either list without finding the right sum, there can be // no satisfying pair. i = 0 j = 0 while (i < length(a) and j < length(b)) { if (a[i] + b[j] + t.value < target) { i = i + 1 } else if (a[i] + b[j] + t.value > target) { j = j + 1 } else { print "Found! Topmost node=", t return } } // Recurse to examine the rest of the tree. EnumerateTopmost(t.left) EnumerateTopmost(t.right) } // Return a list of all sums that contain t and at most one of its children, // in increasing order. EnumerateSums(Tree t) { If (t == NULL) { // We have been called with the "child" of a leaf node. return [] // Empty list } else { // Include a 0 in one of the child sum lists to stand for // "just node t" (arbitrarily picking left here). // Note that even if t is a leaf node, we still call ourselves on // its "children" here -- in C/C++, a special "NULL" value represents // these nonexistent children. a = append(0, EnumerateSums(t.left)) b = EnumerateSums(t.right) Add t.value to each element in a Add t.value to each element in b // "Ordinary" list merge that simply combines two sorted lists // to produce a new sorted list, in linear time. c = ListMerge(a, b) return c } } 

The above pseudo code only reports the topmost node in the path. The entire path can be restored if EnumerateSums() returns a list of pairs (sum, goesLeft) instead of a simple list of sums, where goesLeft is a Boolean value indicating whether the path used to generate this sum will first go from the parent node.

The above pseudo-code calculates lists of sums several times for each node: EnumerateSums(t) will be called once for each node above t in the tree, in addition to calling for t itself. It would be possible to make EnumerateSums() memoise a list of sums for each node so that it would not be recounted on subsequent calls, but this does not actually improve the asymptotics: only O (n) work is required, print a list of n sums using regular recursion, and changing this to O (1) will not change the overall time complexity, because the entire list of sums caused by any call to EnumerateSums() should generally be read by the caller, and this takes O (n) time. EDIT: As Evgeni Klyuyev noted, EnumerateSums() actually behaves like a merge sort, being O (nlog n) when the tree is perfectly balanced and O (n ^ 2) when this is the only way. Thus, memoisation will actually give an asymptotic performance improvement.

You can get rid of temporary sumlists by rearranging EnumerateSums() into an iterator-like object that merges lists lazily and can be queried to get the next sum in ascending order. This would also entail the creation of EnumerateSumsDown() , which does the same, but extracts the amounts in descending order and uses this instead of reverse(append(0, EnumerateSums(t.right))) . Doing this reduces the complexity of the algorithm space to O (n), where n is the number of nodes in the tree, since each iterator object requires constant space (pointers to left and right child iterator objects, and also the place to write the last sum), and may not be more than one per tree node.

+2
source share

i to move along the left subtree and inversely cross the right subtree at the same time as merge sort works. move the iterator each time, which makes aum closer. its order n

+2
source share

Not the fastest, but easiest approach is to use two nested depth searches.

Use the regular depth search to start the start of the node. Use the second modified version of the depth search to check the sums for all paths starting from this node.

enter image description here

The second depth search differs from the usual depth search in two details:

  • It saves the current amount of the path. It adds a value to the sum every time a new node is added to the path and removes the value from the sum when a certain node is deleted.
  • It crosses the edges of the path from the root to the starting node only in the opposite direction (red edges in the diagram). All other edges intersect in the right direction, as usual (black edges in the diagram). To cross edges in the opposite direction, it either uses the "parent" pointers of the original BST (if any), or peeks into the stack of the first search for the first depth to get these "parent" pointers.

The time complexity of each DFS is O (N), so the total time complexity is O (N 2 ). Space requirements are O (N) (space for both DFS stacks). If the original BST contains โ€œparentโ€ pointers, the space requirements are O (1) (the โ€œparentโ€ pointers allow you to navigate the tree in any direction without stacks).


Another approach is based on the ideas of j_random_hacker and robert king (maintaining lists of sums, matching them, and then combining them). It processes the tree from the bottom up (starting from the leaves).

Use DFS to find the leaf node. Then go back and find the last branch of the node, which is grand ... the progenitor of this leaf node. This gives a chain between branches and leaf nodes. Process this chain:

 match1(chain) sum_list = sum(chain) match1(chain): i = j = sum = 0 loop: while (sum += chain[i]) < target: ++i while (sum -= chain[j]) > target: ++j if sum == target: success! sum(chain): result = [0] sum = 0 i = chain.length - 1 loop: sum += chain[i] --i result.append(sum) return result 

enter image description here

Continue DFS and find other leaf chains. When two chains originating from the same node meet, possibly, another chain ahead (red and green chains in the diagram preceded by the blue chain), process these chains:

 match2(parent, sum_list1, sum_list2) sum_list3 = merge1(parent, sum_list1, sum_list2) if !chain3.empty: match1(chain3) match3(sum_list3, chain3) sum_list4 = merge2(sum_list3, chain3) match2(parent, sum_list1, sum_list2): i = 0 j = chain2.length - 1 sum = target - parent.value loop: while sum < sum_list1[i] + sum_list2[j]: ++i while sum > sum_list1[i] + sum_list2[j]: --j if sum == sum_list1[i] + sum_list2[j]: success! merge1(parent, sum_list1, sum_list2): result = [0, parent.value] i = j = 1 loop: if sum_list1[i] < sum_list2[j]: result.append(parent.value + sum_list1[i]) ++i else: result.append(parent.value + sum_list2[j]) ++j return result match3(sum_list3, chain3): i = sum = 0 j = sum_list3.length - 1 loop: sum += chain3[i++] while sum_list3[j] + sum > target: --j if sum_list3[j] + sum == target: success! merge2(sum_list3, chain3): result = [0] sum = 0 i = chain3.length - 1 loop: sum += chain3[i--] result.append(sum) result.append(sum_list3[1...] + sum) 

Do the same where any two lists of sums or chains and the list of sums are descendants of the same node. This process can be continued until there is only one list of amounts belonging to the root node.

+2
source share

Are there any limitations of complexity? As you said: โ€œIt is easy if the path meansโ€œ root โ€for the leaf, or it is easy if the path means part of the path from the root to the leaf, which may not include the root or leaf.โ€ You can reduce the problem to this statement by setting the root each time to a different node and doing the search n times. This would be a simple approach, not sure which one is optimal.

Edit : if the tree is unidirectional, something like this might work (pseudo-code):

 findSum(tree, sum) if(isLeaf(tree)) return (sum == tree->data) for (i = 0 to sum) isfound |= findSum(leftSubStree, i) && findSum(rightSubTree, sum-i) return isfound; 

There are probably many mistakes here, but hopefully this clarifies the idea.

+1
source share

All Articles