User:Nomen4Omen/sandbox
In computer science, tree traversal (also known as tree search) is a form of graph traversal and refers to the process of visiting (checking and/or updating) each node in a tree data structure, exactly once. Such traversals are classified by the order in which the nodes are visited. Most of the following algorithms describe traversal of a binary tree, but they may be generalized to other trees as well.
Types
Unlike linked lists, one-dimensional arrays and other linear data structures, which are canonically traversed in linear order, trees may be traversed in multiple ways. They may be traversed in depth-first or breadth-first order. There are three common ways to traverse them in depth-first order: in-order, pre-order and post-order.[1] Beyond these basic traversals, various more complex or hybrid schemes are possible, such as depth-limited searches like iterative deepening depth-first search. The latter, as well as breadth-first search, can also be used to traverse infinite trees; see below.
Data structures for tree traversal
Traversing a tree involves iterating over all nodes in some manner. Because a given node can have more than one possible next node (a tree is not a linear data structure), some nodes must be deferred—that is, stored in some way for later visiting—assuming sequential (not parallel) computation. This is often done via a stack (LIFO) or queue (FIFO). As a tree is a self-referential (recursively defined) data structure, traversal can be defined by recursion or, more subtly, corecursion, in a very natural and clear fashion; in these cases the deferred nodes are stored implicitly in the call stack.
Depth-first search is easily implemented via a stack, including recursively (via the call stack), while breadth-first search is easily implemented via a queue, including corecursively.
Depth-first search
These searches are referred to as depth-first search (DFS), since the search is deepened as much as possible on each child before going to the next sibling. DFS is well-suited for recursive programming.[2][3]
Binary tree
In the binary example tree (see figure Depth-first traversal), the left child (L) is accessed before the right child (R), which results in a counterclockwise circular route around the tree ("peripheral circuit"). This is called left-to-right traversal. Accessing (R) before (L) is also possible and is called right-to-left traversal, e.g. reverse in-order.
The general recursive pattern for traversing a binary tree is as follows, a pattern which is considered the definition of the terms pre-, in- and post-order:
- Go down one level to some child N.
- If N exists (is a node), execute the necessary operations within the recursive subroutine visit. The colored display operations are optional; each produces a specific sequentialisation:
  - [pre-order] (red in the figure) Display node N for compiling a pre-order sequentialisation.[a]
  - (L) Recursively visit N's left subtree N.L.
  - [in-order] (green in the figure) Display node N for compiling an in-order sequentialisation.[b]
  - (R) Recursively visit N's right subtree N.R.
  - [post-order] (blue in the figure) Display node N for compiling a post-order sequentialisation.[c]
- Return by going up one level, arriving at the parent node of N.
- ^ If in-order or post-order operations are not present, the whole traversal is called a (standard[4]: Background ) pre-order traversal.
- ^ If pre-order or post-order operations are not present, the whole traversal is called a (standard) in-order traversal.
- ^ If pre-order or in-order operations are not present, the whole traversal is called a (standard) post-order traversal.
The figure Depth-first traversal shows a complete test case for a DFS traversal of a binary tree, as elaborated in the adjacent table.
incoming edge | outgoing edge | sample path in the figure
---|---|---
down left | down left | 5,2,1 |
up right | up right | 0,1,2 |
down right | down left | 2,4,3 |
up right | up left | 3,4,2 |
down left | up right | 1,0,1 |
up right | down right | 2,5,8 |
down right | up left | 6,7,6 |
down left | down right | 8,6,7 |
up left | up right | 7,6,8 |
down right | down right | 8,9,X |
up left | up left | 9,8,5 |
The figure also shows that an in-order traversal can be cut short: once the visit count equals the number of nodes, the traversal may stop, because ascending the right spine (the path 9,8,5 in the figure) executes only (empty) post-order operations.
- The pre-order sequentialisation is topologically sorted, because a parent node is processed before any of its child nodes.
- A binary search tree is ordered such that in each node the key is greater than all keys in its left subtree and less than all keys in its right subtree. In such a tree, the in-order sequentialisation compiles (retrieves) the keys in ascending sorted order (hence the name).[5]
- If, in an in-order traversal of such a tree, the right child (R) is accessed before the left (L), this is called reverse in-order. It retrieves the keys in descending sorted order.
The trace of a traversal is called a sequentialisation of the tree. The traversal trace is an ordered list of the visited nodes. No single sequentialisation according to pre-, in- or post-order describes the underlying tree uniquely. Given a tree with distinct elements, either pre-order or post-order paired with in-order is sufficient to describe the tree uniquely. However, pre-order with post-order leaves some ambiguity in the tree structure.[6]
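The uniqueness of the pre-order/in-order pair can be demonstrated constructively. The following sketch (the function name rebuild is illustrative, not from the cited sources) reconstructs a binary tree with distinct values from its two sequentialisations: the first pre-order element is the root, and its position in the in-order list splits that list into left and right subtree.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct mynode mynode;
struct mynode
{
    mynode *childL;  // -> left child
    mynode *childR;  // -> right child
    int data;        // user data
};

/* Rebuild the unique binary tree whose pre-order sequentialisation is
   pre[0..n-1] and whose in-order sequentialisation is in[0..n-1].
   All values are assumed distinct. (Allocation check omitted.) */
mynode *rebuild(const int *pre, const int *in, int n)
{
    if (n == 0) return NULL;
    mynode *N = malloc(sizeof *N);
    N->data = pre[0];                  // pre-order starts with the root
    int k = 0;
    while (in[k] != pre[0]) k++;       // split point in the in-order list
    N->childL = rebuild(pre + 1,     in,         k);
    N->childR = rebuild(pre + 1 + k, in + k + 1, n - 1 - k);
    return N;
}
```

Replacing in-order by post-order works analogously (the root is then the last post-order element), whereas pre-order plus post-order cannot distinguish a single left child from a single right child.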
General tree
To traverse any tree with depth-first search, perform the following operations recursively at each node N:
- Perform pre-order operations at node N.
- For each i from 1 to the number of children:
  - Visit the i-th child, if present.
  - In between two children, perform in-order operations at node N.
- Perform post-order operations at node N.
Depending on the problem at hand, the pre-order, in-order or post-order operations may be void, or you may want to visit only a specific child, so these operations are optional. In practice, more than one of the pre-order, in-order and post-order operations may be required. For example, when inserting into a ternary tree, a pre-order operation is performed by comparing items. A post-order operation may be needed afterwards to re-balance the tree.
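The scheme above can be sketched in C for a general node holding an array of children. The gnode layout and the textual trace written into a buffer are illustrative assumptions, not part of the cited sources:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical layout of a general (n-ary) tree node. */
typedef struct gnode gnode;
struct gnode
{
    int     data;       // user data
    int     nChildren;  // number of children
    gnode **child;      // child[0..nChildren-1]
};

/* Append a record of each pre-, in- and post-order operation to out,
   so the resulting string shows how the operations interleave. */
void generalTraversal(gnode *N, char *out)
{
    if (N == NULL) return;
    sprintf(out + strlen(out), "pre%d ", N->data);    // pre-order operation
    for (int i = 0; i < N->nChildren; i++)
    {
        generalTraversal(N->child[i], out);           // visit i-th child
        if (i + 1 < N->nChildren)                     // in between two children
            sprintf(out + strlen(out), "in%d ", N->data);
    }
    sprintf(out + strlen(out), "post%d ", N->data);   // post-order operation
}
```

For a binary node (two children) this degenerates to exactly one pre-order, one in-order and one post-order position, as in the definition above.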
Breadth-first search / level order
Trees can also be traversed in level-order, where we visit every node on a level before going to a lower level. This search is referred to as breadth-first search (BFS), as the search tree is broadened as much as possible on each depth before going to the next depth.
Other types
There are also tree traversal algorithms that classify as neither depth-first search nor breadth-first search. One such algorithm is Monte Carlo tree search, which concentrates on analyzing the most promising moves, basing the expansion of the search tree on random sampling of the search space.
Applications
A traversal that deletes or frees nodes and values can delete or free an entire binary tree. A node is freed only after its children have been freed, so the operation free is post-order.
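A minimal sketch of such a post-order free, using the mynode layout from the implementation section:

```c
#include <stdlib.h>

typedef struct mynode mynode;
struct mynode
{
    mynode *childL;  // -> left child
    mynode *childR;  // -> right child
    int data;        // user data
};

/* Free the whole subtree rooted at N. The call free(N) sits in the
   post-order position: a node is released only after both children. */
void freeTree(mynode *N)
{
    if (N == NULL) return;
    freeTree(N->childL);   // left subtree
    freeTree(N->childR);   // right subtree
    free(N);               // post-order operation
}
```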
Similarly, a traversal that duplicates nodes and edges can make a complete duplicate of a binary tree. Space for the copy of a node, in which the pointers to the copies of the two children are to be placed, has to be allocated first, so the allocation is a pre-order operation. But the final (and essential) return of the pointer to the copy is a post-order operation, which cannot happen before both children have returned their pointers as well (see implementation), and the right spine (see remark above) has to be ascended in order to collect all the pointers to the copies, which is a post-order operation. In summary, because there are essential pre-order as well as essential post-order operations, the traversal as a whole can be considered neither a standard pre-order nor a standard post-order traversal as defined in section Binary tree. Dwyer calls it "triple order".[7]
Pre-order traversal can be used to make a prefix expression (Polish notation) from an expression tree: traverse the expression tree in pre-order. For example, traversing the depicted arithmetic expression in pre-order yields "+ * A - B C + D E".
Post-order traversal can compile a postfix representation (Reverse Polish notation) of a binary tree. Traversing the depicted arithmetic expression in post-order yields "A B C - * D E + +"; the latter can easily be transformed into machine code to evaluate the expression by a stack machine.
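Both notations can be compiled from the same expression tree; in the following sketch (node layout assumed, one character per operator or operand, output collected into a string) the position of the output statement alone decides between prefix and postfix:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical expression-tree node: one character per symbol. */
typedef struct enode enode;
struct enode
{
    char   symbol;            // operator or operand letter
    enode *childL, *childR;   // operands of an operator, NULL for a leaf
};

/* Pre-order walk emits Polish (prefix) notation. */
void prefix(const enode *N, char *out)
{
    if (N == NULL) return;
    sprintf(out + strlen(out), "%c ", N->symbol);  // pre-order position
    prefix(N->childL, out);
    prefix(N->childR, out);
}

/* Post-order walk emits reverse Polish (postfix) notation. */
void postfix(const enode *N, char *out)
{
    if (N == NULL) return;
    postfix(N->childL, out);
    postfix(N->childR, out);
    sprintf(out + strlen(out), "%c ", N->symbol);  // post-order position
}
```

Applied to the tree for (A * (B - C)) + (D + E), these yield the two strings quoted above.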
Implementations
Depth-first traversal
Recursive
With the recursive traversal, pre-order, in-order as well as post-order operations are easily supported in a single program for traversal. It requires additional space for the call stack, proportional to h, the height of the tree.[8]: 4.9.1
typedef struct mynode mynode;
struct mynode
{
mynode *childL; // -> left child
mynode *childR; // -> right child
int data; // user data
};
void recursiveTraversal(mynode *N)
{
if (N == NULL) return;
// N is in pre-order position
recursiveTraversal(N->childL); // left child
// N is in in-order position
recursiveTraversal(N->childR); // right child
// N is in post-order position
}
// perform the DFS traversal
recursiveTraversal(tree->root);
For example, the duplicate-binary-tree application can be coded as follows, where the assignment of the return value is separated into an extra statement in order to show the proper temporal sequence:[8]: §77
mynode* recursiveDuplicateTree(mynode* N)
{
mynode* copy;
mynode* temp;
if (N == NULL) return NULL;
copy = (mynode*)malloc(sizeof(* N)); // pre-order position
if (copy == NULL)
{
fprintf (stderr, "Out of memory!\n");
exit (EXIT_FAILURE);
}
copy->data = N->data;
temp = recursiveDuplicateTree(N->childL); // access the left child
copy->childL = temp; // in-order position
temp = recursiveDuplicateTree(N->childR); // access the right child
copy->childR = temp; // post-order position
return copy;
}
Iterative
As shown in the figure Depth-first traversal, whether the position is pre-, in- or post-order is determined by the existence of a child and its left-right situation. At a given node N, there are four possibilities:
- N is a leaf: pre-, in- and post-order immediately follow each other.
- N has two children: pre-order precedes access to left child precedes in-order precedes access to right child precedes post-order.
- N has left child only: pre-order precedes access to left child precedes in-order precedes post-order.
- N has right child only: pre-order precedes in-order precedes access to right child precedes post-order.
Iterative traversal may be more easily demonstrated solely with either a pre-order, in-order, or post-order operation.[9] All three programs traverse the subtree rooted at node N and require additional space for the ancestor stack, proportional to h, the height of this subtree.
Iterative Preorder
void iterativePreorder(mynode *N)
{
mynode *save[100]; // stack of ancestors
int top = 0;
if (N == NULL) return;
save[top++] = N;
while (top != 0)
{
N = save[--top];
// Display node N for compiling a pre-order sequentialisation.
// right child is pushed first so that left is processed first
if (N->childR != NULL)
save[top++] = N->childR;
if (N->childL != NULL)
save[top++] = N->childL;
}
}
Iterative Inorder
void iterativeInorder(mynode *N)
{
mynode *save[100]; // stack of ancestors
int top = 0;
while(N != NULL)
{
while (N != NULL)
{
if (N->childR != NULL)
save[top++] = N->childR;
save[top++] = N;
N = N->childL;
}
N = save[--top];
while(top != 0 && N->childR == NULL)
{
printf("[%d] ", N->data);
N = save[--top];
}
printf("[%d] ", N->data); // Display N for compiling an in-order sequentialisation.
N = (top != 0) ? save[--top] : (mynode *) NULL;
}
}
Iterative Postorder
void iterativePostorder(mynode *N)
{
struct
{
mynode *node;
unsigned vleft :1; // Visited left?
unsigned vright :1; // Visited right?
}
save[100];
int top = 0;
if (N == NULL) return;
save[top].vleft = save[top].vright = 0; // flags of the current node N
while (1)
{
/* Move to the left subtree if present and not visited */
if (N->childL != NULL && !save[top].vleft)
{
save[top].vleft = 1;
save[top++].node = N;
save[top].vleft = save[top].vright = 0; // fresh flags for the child
N = N->childL;
continue;
}
/* Move to the right subtree if present and not visited */
if (N->childR != NULL && !save[top].vright)
{
save[top].vright = 1;
save[top++].node = N;
save[top].vleft = save[top].vright = 0; // fresh flags for the child
N = N->childR;
continue;
}
// Display node N for compiling a post-order sequentialisation.
if (top == 0)
return;
/* Move up */
N = save[--top].node; // save[top] again holds N's flags
}
}
All the above implementations require stack space proportional to the height of the tree which is a call stack for the recursive and a parent stack for the iterative ones. In a poorly balanced tree, this can be considerable. With the iterative implementations we can remove the stack requirement by maintaining parent pointers in each node, or by threading the tree (next section).
Advancing to the next node
Advancing to the in-order next node[8]: 4.9.3.7
This is useful, e.g., when the search key determines only a region and the intended object has to be found by a subsequent sequential search.
mynode *inOrderNext(mynode *N) // requires a parent field in struct mynode
{
mynode *next = N->childR;
if (next != NULL)
{
N = next->childL;
while (N != NULL)
{
next = N;
N = next->childL;
}
return next; // a node without left child
}
// ascend some right spine:
do
{
next = N;
if ((N = next->parent) == NULL)
return NULL; // next is the root:
// i.e. N has been the largest element
} while (next == N->childR);
return N;
}
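A self-contained sketch of how successive calls to this function enumerate a search tree in ascending order; it assumes a parent field in mynode and starts the iteration at the leftmost node:

```c
#include <stdio.h>
#include <stddef.h>

typedef struct mynode mynode;
struct mynode
{
    mynode *childL, *childR;  // children
    mynode *parent;           // required by inOrderNext
    int data;                 // user data
};

mynode *inOrderNext(mynode *N)
{
    mynode *next = N->childR;
    if (next != NULL)
    {
        N = next->childL;
        while (N != NULL) { next = N; N = next->childL; }
        return next;               // leftmost node of the right subtree
    }
    // ascend some right spine:
    do
    {
        next = N;
        if ((N = next->parent) == NULL)
            return NULL;           // next was the root, i.e. the largest element
    } while (next == N->childR);
    return N;
}
```

Starting at the leftmost node and calling inOrderNext until it returns NULL is an in-order traversal without any stack.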
A Bounded-Space Tree Traversal Algorithm
The descent (to the leaves) is provided by the pointers to the children, commonly known as left and right child. In a recursive program the information required for the ascent to the root is contained in the program's call stack: the local context is automatically saved at each recursive call and restored at each return. This requires O(h) space, where h denotes the tree height.
In an iterative program, some information required for the ascent to the root has to be provided explicitly. It consists of the pointer to the parent node and a piece of information, called chirality or orientation, indicating whether the node under consideration is a left or a right child. Below are some possibilities for providing this information, ranked from expensive to cheap. Here n is the number of nodes and h is the height of the tree.
| technical approach | space cost | Landau symbol
---|---|---|---
1 | parent pointer held in node[a] | n words | O(n)
2 | parent pointer uses child fields + chirality bit in node | n bits | O(n)
3 | parent pointer held in stack[a] | h words[b] | O(h)
4 | parent pointer uses child fields + chirality bit in stack | h bits[b] | O(h)
5 | parent pointer uses child fields + chirality by "pointer inversion"[10] | 0 | O(1)
6 | parent pointer uses child fields + chirality by comparing again[c] | 0 | O(1)
All six approaches have the same asymptotic time complexity, namely O(n). If child fields are used to record the parent pointer, the tree is temporarily modified. But with all approaches except approach 5, the tree is the same after the traversal. Approach 5 cannot be taken when the sizes of the nodes vary or when there are pointers into the tree.[11] Approach 6 cannot be taken when the tree is not a search tree or when the search tree contains duplicates.
If both approaches, 5 and 6, are possible, the constant factor of the time complexity of approach 6 should be the better one in most cases.
The following code example shows an iteratively programmed traversal which supports all three sequentialisations (called "triple order"[7] traversal) in one single circular route. It is quite close to ALGORITHM A of Hirschberg and Seiden, even in its use of goto statements.[4]: p.4 It implements approach 6.
#include <stdio.h>
typedef struct mynode mynode;
struct mynode
{
mynode* L; // -> left child
mynode* R; // -> right child
int key; // sort key
int value; // user data
};
void iterativeTripleOrder(mynode* N)
// N shall be the root of the subtree to be traversed.
{
mynode *child, *parent;
if (N == NULL) return;
parent = NULL; /* simulates the first parent */
goto Re1;
Gr0: /* N is a left child. */
child = N;
N = parent;
parent = N->L; /* in any case get next higher parent */
N->L = child; /* restore correct left child */
/* N's left child has been handled. */
Gr1:
printf("in-order: [%d] ", N->value); /* N is in in-order position */
child = N->R;
if (child == NULL)
goto Bl1; /* N has no right child. */
/* N has a right child. */
N->R = parent;
Re0:
/* all edges down */
/* Try to loop down some left spine, i.e.
all available left (grand)children until there is none. */
parent = N;
N = child;
Re1:
printf("pre-order: [%d] ", N->value); /* N is in pre-order position */
child = N->L;
if (child == NULL)
goto Gr1; /* N has no left child. */
/* N has a left child */
N->L = parent;
goto Re0;
Bl0: /* N is a right child. */
child = N;
N = parent;
parent = N->R; /* in any case get next higher parent */
N->R = child; /* restore correct right child */
/* N's right child has been handled. */
Bl1:
printf("post-order: [%d] ", N->value); /* N is in post-order position */
if (parent == NULL) /* N is the root. */
return;
/* N is not the root: */
/* Try to loop up some right spine, i.e.
all available (grand)parents to the left until there is none. */
if (N->key < parent->key)
goto Gr0; /* N is a left child. */
goto Bl0; /* N is a right child. */
}
The place in which approach 6 differs from the other approaches is the chirality test after label Bl1, i.e. the comparison N->key < parent->key. E.g., in approach 5 this test is designed to be a comparison of addresses, say, if ((uintptr_t)N < (uintptr_t)(parent->L)). The statement groups at the labels Gr0 and Bl0, together with the assignments N->L = parent resp. N->R = parent on descent, characterize the technique "parent pointer uses child fields" and are specific to approaches 2, 4, and 6. With approach 5 these assignments are slightly more complicated.[10][4]: 3
A version using C loops:
#include <stdio.h>
typedef struct mynode mynode;
struct mynode
{
mynode* L; // -> left child
mynode* R; // -> right child
int key; // sort key
int value; // user data
};
void iterativeTripleOrder(mynode* N)
// N shall be the root of the subtree to be traversed.
{
mynode *child, *parent;
if (N == NULL) return;
child = N;
N = NULL; /* will simulate the first parent */
goto Red;
while (N->key > parent->key) /* N is the right child of parent. */
{
child = N;
N = parent;
parent = N->R; /* in any case get next higher parent */
N->R = child; /* restore correct right child */
/* N's right child has been handled. */
Blue:
printf("post-order: [%d] ", N->value); /* N is in post-order position */
if (parent == NULL) /* N is the root. */
return;
/* N is not the root; it is some child of parent. */
/* Try to loop up some right spine, i.e.
all available (grand)parents to the left until there is none. */
}
/* N is the left child of parent. */
child = N;
N = parent;
parent = N->L; /* in any case get next higher parent */
N->L = child; /* restore correct left child */
/* N's left child has been handled. */
Green:
printf("in-order: [%d] ", N->value); /* N is in in-order position */
child = N->R;
if (child == NULL)
goto Blue; /* N has no right child. */
/* N has a right child. */
N->R = parent;
goto Red; /* N has a right child. */
while (child != NULL) /* N has a left child. */
{
N->L = parent;
Red:
/* all edges down */
/* Try to loop down some left spine, i.e.
all available left (grand)children until there is none. */
parent = N;
N = child;
printf("pre-order: [%d] ", N->value); /* N is in pre-order position */
child = N->L;
}
/* N has no left child. */
goto Green;
}
The place in which approach 6 differs from the other approaches is again the chirality test, here the condition of the ascending while loop, N->key > parent->key. E.g., in approach 5 this test is designed to be a comparison of addresses of N with the child fields of parent. The four-statement groups preceding the labels Blue and Green, together with the assignments N->L = parent and N->R = parent on descent, characterize the technique "parent pointer uses child fields" and are specific to approaches 2, 4, and 6. On descent, these assignments, followed by parent = N; N = child;, save the pointer to the parent of the current node N in the node structure and switch to the new (N, parent) pair, whereas on ascent the statement groups before Blue and Green restore the previous contents of the child fields.
With approach 5 these assignments are slightly more complicated.[10][4]: 3
The program can be easily adapted to a single step advancement to the next node in each of the three sequentialisations.
Morris in-order traversal using threading
A binary tree is threaded by making every left child pointer (that would otherwise be null) point to the in-order predecessor of the node (if it exists) and every right child pointer (that would otherwise be null) point to the in-order successor of the node (if it exists).
Advantages:
- Avoids recursion, which uses a call stack and consumes memory and time.
- The node keeps a record of its parent.
Disadvantages:
- The tree is more complex.
- We can make only one traversal at a time.
- It is more prone to errors when both children of a node are absent and both of its pointers point to ancestors.
Morris traversal is an implementation of in-order traversal that uses threading:[12]
- Create links to the in-order successor.
- Print the data using these links.
- Revert the changes to restore original tree.
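The three steps above can be sketched as follows; the output is collected into a caller-supplied string for demonstration, and the tree is restored to its original shape before the function returns:

```c
#include <stdio.h>
#include <string.h>

typedef struct mynode mynode;
struct mynode
{
    mynode *childL, *childR;  // children
    int data;                 // user data
};

/* Morris in-order traversal: before descending left, thread the in-order
   predecessor's right pointer to the current node; when the thread is met
   on the way back, remove it again. No stack and no recursion is needed. */
void morrisInorder(mynode *N, char *out)
{
    while (N != NULL)
    {
        if (N->childL == NULL)
        {
            sprintf(out + strlen(out), "%d ", N->data);  // visit
            N = N->childR;                   // real edge or thread
        }
        else
        {
            mynode *pred = N->childL;        // find in-order predecessor
            while (pred->childR != NULL && pred->childR != N)
                pred = pred->childR;
            if (pred->childR == NULL)
            {
                pred->childR = N;            // create link (thread)
                N = N->childL;
            }
            else
            {
                pred->childR = NULL;         // revert change, restore tree
                sprintf(out + strlen(out), "%d ", N->data);  // visit
                N = N->childR;
            }
        }
    }
}
```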
Breadth-first search
Listed below is pseudocode for a simple queue-based level-order traversal. It requires space proportional to the maximum number of nodes at a given depth, which can be as much as half the total number of nodes. A more space-efficient approach for this type of traversal can be implemented using an iterative deepening depth-first search.
levelorder(node)
    q ← empty queue
    q.enqueue(node)
    while not q.isEmpty() do
        node ← q.dequeue()
        visit(node)
        if node.childL ≠ null then
            q.enqueue(node.childL)
        if node.childR ≠ null then
            q.enqueue(node.childR)
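The pseudocode translates directly to C with an explicit FIFO queue; the fixed queue capacity here is a simplification for the sketch, and the visit is recorded in a string:

```c
#include <stdio.h>
#include <string.h>

typedef struct mynode mynode;
struct mynode
{
    mynode *childL, *childR;  // children
    int data;                 // user data
};

/* Level-order traversal with an explicit FIFO queue.
   (Fixed capacity for brevity; a production version would grow it.) */
void levelOrder(mynode *root, char *out)
{
    mynode *queue[100];
    int head = 0, tail = 0;
    if (root == NULL) return;
    queue[tail++] = root;                        // enqueue the root
    while (head != tail)
    {
        mynode *N = queue[head++];               // dequeue
        sprintf(out + strlen(out), "%d ", N->data);   // visit
        if (N->childL != NULL) queue[tail++] = N->childL;
        if (N->childR != NULL) queue[tail++] = N->childR;
    }
}
```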
Infinite trees
While traversal is usually done for trees with a finite number of nodes (and hence finite depth and finite branching factor) it can also be done for infinite trees. This is of particular interest in functional programming (particularly with lazy evaluation), as infinite data structures can often be easily defined and worked with, though they are not (strictly) evaluated, as this would take infinite time. Some finite trees are too large to represent explicitly, such as the game tree for chess or go, and so it is useful to analyze them as if they were infinite.
A basic requirement for traversal is to visit every node eventually. For infinite trees, simple algorithms often fail this. For example, given a binary tree of infinite depth, a depth-first search will go down one side (by convention the left side) of the tree, never visiting the rest, and indeed an in-order or post-order traversal will never visit any nodes, as it has not reached a leaf (and in fact never will). By contrast, a breadth-first (level-order) traversal will traverse a binary tree of infinite depth without problem, and indeed will traverse any tree with bounded branching factor.
On the other hand, given a tree of depth 2, where the root has infinitely many children, and each of these children has two children, a depth-first search will visit all nodes, as once it exhausts the grandchildren (children of children of one node), it will move on to the next (assuming it is not post-order, in which case it never reaches the root). By contrast, a breadth-first search will never reach the grandchildren, as it seeks to exhaust the children first.
A more sophisticated analysis of running time can be given via infinite ordinal numbers; for example, the breadth-first search of the depth 2 tree above will take ω·2 steps: ω for the first level, and then another ω for the second level.
Thus, simple depth-first or breadth-first searches do not traverse every infinite tree, and are not efficient on very large trees. However, hybrid methods can traverse any (countably) infinite tree, essentially via a diagonal argument ("diagonal"—a combination of vertical and horizontal—corresponds to a combination of depth and breadth).
Concretely, given the infinitely branching tree of infinite depth, label the root (), the children of the root (1), (2), …, the grandchildren (1, 1), (1, 2), …, (2, 1), (2, 2), …, and so on. The nodes are thus in a one-to-one correspondence with finite (possibly empty) sequences of positive numbers, which are countable and can be placed in order first by sum of entries, and then by lexicographic order within a given sum (only finitely many sequences sum to a given value, so all entries are reached; formally there are a finite number of compositions of a given natural number, specifically 2^(n−1) compositions of n ≥ 1), which gives a traversal. Explicitly:
0: ()
1: (1)
2: (1, 1) (2)
3: (1, 1, 1) (1, 2) (2, 1) (3)
4: (1, 1, 1, 1) (1, 1, 2) (1, 2, 1) (1, 3) (2, 1, 1) (2, 2) (3, 1) (4)
etc.
This can be interpreted as mapping the infinite depth binary tree onto this tree and then applying breadth-first search: replace the "down" edges connecting a parent node to its second and later children with "right" edges from the first child to the second child, from the second child to the third child, etc. Thus at each step one can either go down (append a (, 1) to the end) or go right (add one to the last number) (except the root, which is extra and can only go down), which shows the correspondence between the infinite binary tree and the above numbering; the sum of the entries (minus one) corresponds to the distance from the root, which agrees with the 2^(n−1) nodes at depth n − 1 in the infinite binary tree (2 corresponds to binary).
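The ordering by sum, then lexicographically within a sum, can be sketched with a small recursive generator of compositions (function and parameter names are illustrative):

```c
#include <stdio.h>
#include <string.h>

/* Append, in lexicographic order, every composition of n (an ordered
   sequence of positive integers summing to n) to out. prefix holds the
   parts chosen so far as a comma-separated string. */
void compositions(int n, const char *prefix, char *out)
{
    if (n == 0)
    {
        sprintf(out + strlen(out), "(%s) ", prefix);  // one complete sequence
        return;
    }
    for (int first = 1; first <= n; first++)          // choose the next part
    {
        char next[64];
        if (prefix[0] == '\0')
            sprintf(next, "%d", first);
        else
            sprintf(next, "%s, %d", prefix, first);
        compositions(n - first, next, out);
    }
}
```

Calling compositions(k, "", out) for k = 0, 1, 2, … reproduces the levels of the listing above; level k contains 2^(k−1) sequences for k ≥ 1.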
References
- ^ "Lecture 8, Tree Traversal". Retrieved 2 May 2015.
- ^ http://www.cise.ufl.edu/~sahni/cop3530/slides/lec216.pdf
- ^ "Preorder Traversal Algorithm". Retrieved 2 May 2015.
- ^ a b c d Hirschberg, Dan S.; Seiden, S. S. (Sep 1993). "A Bounded-Space Tree Traversal Algorithm" (PDF). Information Processing Letters. 47 (4): 215–219.
- ^ Wittman, Todd. "Tree Traversal" (PDF). UCLA Math. Archived from the original (PDF) on February 13, 2015. Retrieved January 2, 2016.
- ^ "Algorithms, Which combinations of pre-, post- and in-order sequentialisation are unique?, Computer Science Stack Exchange". Retrieved 2 May 2015.
- ^ a b Dwyer, Barry (1974). "Simple algorithms for traversing a tree without an auxiliary stack". Information Processing Letters. 2 (5): 143–145.
- ^ a b c d Pfaff, Ben (2007). An Introduction to Binary Search Trees and Balanced Trees. Free Software Foundation, Inc. Retrieved 2020-03-05.
- ^ All 3 programs taken from crazyforcode.
- ^ a b c Schorr, Herbert; Waite, William M. (1967). "An Efficient Machine-Independent Procedure for Garbage Collection in Various List Structures". Communications of the ACM. 10 (8): 501–506.
- ^ It may be remarked that conventional tree rotations do not observe the kind of ordering of approach 5.
- ^ Morris, Joseph M. (1979). "Traversing binary trees simply and cheaply". Information Processing Letters. 9 (5). doi:10.1016/0020-0190(79)90068-1.
- General
- Dale, Nell; Lilly, Susan D. (1995). Pascal Plus Data Structures (4th ed.). Lexington, MA: D. C. Heath and Company.
- Drozdek, Adam (2001). Data Structures and Algorithms in C++ (2nd ed.). Pacific Grove, CA: Brooks/Cole.
- http://www.math.northwestern.edu/~mlerma/courses/cs310-05s/notes/dm-treetran
External links
- Storing Hierarchical Data in a Database with traversal examples in PHP
- Managing Hierarchical Data in MySQL
- Working with Graphs in MySQL
- Sample code for recursive and iterative tree traversal implemented in C.
- Sample code for recursive tree traversal in C#.
- See tree traversal implemented in various programming language on Rosetta Code
- Tree traversal without recursion
Category:Trees (data structures)
Category:Articles with example pseudocode
Category:Graph algorithms
Category:Recursion
Category:Iteration in programming