Encyclopedia > Binary search tree
A binary search tree of size 9 and depth 3, with root 8 and leaves 1, 4, 7 and 13

In computer science, a binary search tree (BST) is a binary tree data structure which has the following properties:

• each node (item in the tree) has a value;
• a total order (linear order) is defined on these values;
• the left subtree of a node contains only values less than the node's value;
• the right subtree of a node contains only values greater than or equal to the node's value.
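
The properties above can be checked mechanically. The following is a minimal sketch, not part of the original article: the `Node` class and the `is_bst` helper are hypothetical names, and the interior values 3, 6, 10 and 14 of the example tree are assumed (the figure caption only gives the root and the leaves).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node type used only for this illustration."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_bst(node, low=None, high=None):
    """Return True if every value in the subtree lies in [low, high).

    Left subtrees must hold values strictly less than the node's value;
    right subtrees may hold values greater than or equal to it.
    """
    if node is None:
        return True
    if low is not None and node.value < low:
        return False
    if high is not None and node.value >= high:
        return False
    return (is_bst(node.left, low, node.value)
            and is_bst(node.right, node.value, high))


# Root 8 and leaves 1, 4, 7 and 13 as in the figure caption;
# the interior values are an assumption.
figure_tree = Node(8,
                   Node(3, Node(1), Node(6, Node(4), Node(7))),
                   Node(10, None, Node(14, Node(13))))

print(is_bst(figure_tree))                       # True
print(is_bst(Node(8, Node(3, None, Node(9)))))   # False: 9 sits left of 8
```

Passing the bounds down the recursion matters: comparing each node only with its direct children would miss a value such as the 9 above, which is larger than its parent 3 but still lies in the left subtree of 8.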

The major advantage of binary search trees over other data structures is that the related sorting algorithms and search algorithms such as in-order traversal can be very efficient.

Binary search trees can choose to allow or disallow duplicate values, depending on the implementation.

Binary search trees are a fundamental data structure used to construct more abstract data structures such as sets, multisets, and associative arrays.

Operations on a binary tree require comparisons between nodes. These comparisons are made with calls to a comparator, which is a subroutine that computes the total order (linear order) on any two values. This comparator can be explicitly or implicitly defined, depending on the language in which the BST is implemented.
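
As an illustration of an explicitly defined comparator, here is a small sketch in Python. The names `compare_case_insensitive`, `insert` and `in_order` are hypothetical, and the tree is stored as nested `[left, value, right]` lists purely to keep the example short.

```python
def compare_case_insensitive(a, b):
    """Explicit comparator: negative, zero or positive result."""
    a, b = a.lower(), b.lower()
    return (a > b) - (a < b)


def insert(tree, value, compare):
    """Insert value into a tree of nested [left, value, right] lists."""
    if tree is None:
        return [None, value, None]
    if compare(value, tree[1]) < 0:
        tree[0] = insert(tree[0], value, compare)
    else:
        tree[2] = insert(tree[2], value, compare)
    return tree


def in_order(tree):
    """Return the values in the order defined by the comparator."""
    if tree is None:
        return []
    return in_order(tree[0]) + [tree[1]] + in_order(tree[2])


words = None
for w in ["pear", "apple", "Fig", "banana"]:
    words = insert(words, w, compare_case_insensitive)

print(in_order(words))  # ['apple', 'banana', 'Fig', 'pear']
```

With Python's implicit string ordering, `"Fig"` would sort before `"apple"` because uppercase letters compare lower; the explicit comparator is what places it between `"banana"` and `"pear"`.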

### Searching

Searching a binary tree for a specific value can be a recursive or iterative process. This explanation covers a recursive method.

We begin by examining the root node. If the value we are searching for equals the root, the value exists in the tree. If the value we are searching for is less than the root, it must be in the left subtree. Similarly, if it is greater than the root, then it must be in the right subtree.

This process is repeated on each subsequent node until the value is found or the subtree to be searched is empty. If we run past a leaf node without finding the value, the item is not present in the tree.

Here is the search algorithm in the Python programming language:

```python
def search_binary_tree(node, key):
    if node is None:
        return None  # key not found
    if key < node.key:
        return search_binary_tree(node.left, key)
    elif key > node.key:
        return search_binary_tree(node.right, key)
    else:  # key is equal to node key
        return node.value  # found key
```

This operation requires O(log n) time in the average case, but needs O(n) time in the worst case, when the unbalanced tree resembles a linked list (degenerate tree).
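
The search can also be written iteratively, which avoids deep recursion on a degenerate tree. This is a sketch under the assumption that nodes expose `key`, `value`, `left` and `right` attributes; the `TreeNode` class and the name `search_iteratively` are hypothetical.

```python
class TreeNode:
    """Hypothetical node with key, value, left and right attributes."""

    def __init__(self, key, value, left=None, right=None):
        self.key = key
        self.value = value
        self.left = left
        self.right = right


def search_iteratively(node, key):
    """Walk down from node; return the stored value, or None if absent."""
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            node = node.right
        else:
            return node.value  # found key
    return None  # ran past a leaf: key not found


root = TreeNode(8, "eight", TreeNode(3, "three"), TreeNode(10, "ten"))
print(search_iteratively(root, 3))   # three
print(search_iteratively(root, 9))   # None
```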

### Insertion

Insertion begins as a search would begin; if the root is not equal to the value, we search the left or right subtrees as before. Eventually, we will reach an external node and add the value as its right or left child, depending on the node's value. In other words, we examine the root and recursively insert the new node to the left subtree if the new value is less than the root, or the right subtree if the new value is greater than or equal to the root.

Here's how a typical binary search tree insertion might be performed in C++:

```cpp
/* Inserts the node pointed to by "newNode" into the subtree rooted at "treeNode" */
void InsertNode(Node *&treeNode, Node *newNode)
{
    if (treeNode == NULL)
        treeNode = newNode;
    else if (newNode->key < treeNode->key)
        InsertNode(treeNode->left, newNode);
    else
        InsertNode(treeNode->right, newNode);
}
```

The above "destructive" procedural variant modifies the tree in place. It uses only constant space, but the previous version of the tree is lost. Alternatively, as in the following Python example, we can reconstruct all ancestors of the inserted node; any reference to the original tree root remains valid, making the tree a persistent data structure:

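```python
from collections import namedtuple

# Immutable node; the field order matches the constructor calls below.
TreeNode = namedtuple("TreeNode", ["left", "key", "value", "right"])

def binary_tree_insert(node, key, value):
    if node is None:
        return TreeNode(None, key, value, None)
    if key == node.key:
        # same key: build a node with the new value, reusing both subtrees
        return TreeNode(node.left, key, value, node.right)
    if key < node.key:
        return TreeNode(binary_tree_insert(node.left, key, value),
                        node.key, node.value, node.right)
    else:
        return TreeNode(node.left, node.key, node.value,
                        binary_tree_insert(node.right, key, value))
```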

The part that is rebuilt uses Θ(log n) space in the average case and Ω(n) in the worst case (see big-O notation).

In either version, this operation requires time proportional to the height of the tree in the worst case, which is O(log n) time in the average case over all trees, but Ω(n) time in the worst case.
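
The persistence property can be observed directly. The sketch below restates the non-destructive insertion so that it runs on its own; defining `TreeNode` as a `namedtuple` with fields `left`, `key`, `value`, `right` is an assumption made for this illustration.

```python
from collections import namedtuple

# Assumed node layout for this illustration.
TreeNode = namedtuple("TreeNode", ["left", "key", "value", "right"])


def binary_tree_insert(node, key, value):
    """Return a new root; only the ancestors of the new node are rebuilt."""
    if node is None:
        return TreeNode(None, key, value, None)
    if key == node.key:
        return TreeNode(node.left, key, value, node.right)
    if key < node.key:
        return TreeNode(binary_tree_insert(node.left, key, value),
                        node.key, node.value, node.right)
    return TreeNode(node.left, node.key, node.value,
                    binary_tree_insert(node.right, key, value))


old_root = None
for k in [8, 3, 10]:
    old_root = binary_tree_insert(old_root, k, str(k))

new_root = binary_tree_insert(old_root, 14, "14")

print(old_root.right.right)               # None: the old version is untouched
print(new_root.right.right.key)           # 14
print(new_root.left is old_root.left)     # True: the left subtree is shared
```

Only the nodes on the path from the root to the insertion point (here 8 and 10) are copied; everything off that path is shared between the two versions.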

Another way to explain insertion is that in order to insert a new node in the tree, its value is first compared with the value of the root. If its value is less than the root's, it is then compared with the value of the root's left child. If its value is greater, it is compared with the root's right child. This process continues, until the new node is compared with a leaf node, and then it is added as this node's right or left child, depending on its value.

There are other ways of inserting nodes into a binary tree, but this is the only way of inserting nodes at the leaves and at the same time preserving the BST structure.

### Deletion

Some examples of Binary Search Tree insertions and deletions.

There are several cases to be considered:

• Deleting a leaf: Deleting a node with no children is easy, as we can simply remove it from the tree.
• Deleting a node with one child: Delete it and replace it with its child.
• Deleting a node with two children: Suppose the node to be deleted is called N. We replace the value of N with either its in-order successor (the left-most child of the right subtree) or the in-order predecessor (the right-most child of the left subtree).


Once we find either the in-order successor or predecessor, swap it with N, and then delete it. Since both the successor and the predecessor must have fewer than two children, either one can be deleted using the previous two cases. A good implementation avoids consistently using one of these nodes, however, because this can unbalance the tree.

Here is C++ sample code for a destructive version of deletion. (We assume the node to be deleted has already been located using `search`.)

```cpp
void DeleteNode(Node *&node)
{
    if (node->left == NULL) {
        Node *temp = node;
        node = node->right;
        delete temp;
    } else if (node->right == NULL) {
        Node *temp = node;
        node = node->left;
        delete temp;
    } else {
        // Node has two children: use the in-order predecessor
        // (rightmost child of the left subtree)
        Node **temp = &node->left;  // start at the left child of the original node

        // find the rightmost child of the left subtree
        while ((*temp)->right != NULL) {
            temp = &(*temp)->right;
        }

        // copy the value from the in-order predecessor to the original node
        node->value = (*temp)->value;

        // then delete the predecessor
        DeleteNode(*temp);
    }
}
```

Although this operation does not always traverse the tree down to a leaf, this is always a possibility; thus in the worst case it requires time proportional to the height of the tree. It does not require more even when the node has two children, since it still follows a single path and does not visit any node twice.

Here is the code in Python:

```python
# Methods of a node class whose instances carry the attributes key, parent,
# leftChild and rightChild. The node being deleted is assumed to have a
# parent, so deleting the root is not handled here.

def findSuccessor(self):
    succ = None
    if self.rightChild:
        succ = self.rightChild.findMin()
    else:
        if self.parent.leftChild == self:
            succ = self.parent
        else:
            # hide this node from its parent while searching upward
            self.parent.rightChild = None
            succ = self.parent.findSuccessor()
            self.parent.rightChild = self
    return succ

def findMin(self):
    n = self
    while n.leftChild:
        n = n.leftChild
    print('found min, key = ', n.key)
    return n

def spliceOut(self):
    # unlink a node that has at most one child
    if not self.leftChild and not self.rightChild:
        if self == self.parent.leftChild:
            self.parent.leftChild = None
        else:
            self.parent.rightChild = None
    else:
        child = self.leftChild if self.leftChild else self.rightChild
        if self == self.parent.leftChild:
            self.parent.leftChild = child
        else:
            self.parent.rightChild = child
        child.parent = self.parent

def binary_tree_delete(self, key):
    if self.key == key:
        if not (self.leftChild and self.rightChild):
            # leaf or single child: link the parent past this node
            self.spliceOut()
        else:
            # two children: the in-order successor takes this node's place
            succ = self.findSuccessor()
            succ.spliceOut()
            if self == self.parent.leftChild:
                self.parent.leftChild = succ
            else:
                self.parent.rightChild = succ
            succ.parent = self.parent
            succ.leftChild = self.leftChild
            succ.rightChild = self.rightChild
            if succ.leftChild:
                succ.leftChild.parent = succ
            if succ.rightChild:
                succ.rightChild.parent = succ
    else:
        if key < self.key:
            if self.leftChild:
                self.leftChild.binary_tree_delete(key)
            else:
                print("trying to remove a nonexistent node")
        else:
            if self.rightChild:
                self.rightChild.binary_tree_delete(key)
            else:
                print("trying to remove a nonexistent node")
```

### Traversal

Once the binary search tree has been created, its elements can be retrieved in order by recursively traversing the left subtree of the root node, accessing the node itself, then recursively traversing the right subtree of the node, continuing this pattern with each node in the tree as it's recursively accessed. The tree may also be traversed in pre-order or post-order traversals.

```python
def traverse_binary_tree(treenode):
    # nodes are (left, value, right) tuples; visit is supplied by the caller
    if treenode is None:
        return
    left, nodevalue, right = treenode
    traverse_binary_tree(left)
    visit(nodevalue)
    traverse_binary_tree(right)
```

Traversal requires Ω(n) time, since it must visit every node. This algorithm is also O(n), and so it is asymptotically optimal.
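
The pre-order and post-order traversals mentioned above differ from the in-order one only in where the node itself is visited. A minimal sketch using generators, assuming nodes are `(left, value, right)` tuples; the function names are hypothetical.

```python
def pre_order(tree):
    """Visit the node first, then the left and right subtrees."""
    if tree is None:
        return
    left, value, right = tree
    yield value
    yield from pre_order(left)
    yield from pre_order(right)


def in_order(tree):
    """Visit the left subtree, then the node, then the right subtree."""
    if tree is None:
        return
    left, value, right = tree
    yield from in_order(left)
    yield value
    yield from in_order(right)


def post_order(tree):
    """Visit both subtrees first, then the node."""
    if tree is None:
        return
    left, value, right = tree
    yield from post_order(left)
    yield from post_order(right)
    yield value


example = ((None, 1, None), 2, (None, 3, None))
print(list(pre_order(example)))   # [2, 1, 3]
print(list(in_order(example)))    # [1, 2, 3]
print(list(post_order(example)))  # [1, 3, 2]
```

Only the in-order traversal yields the values of a binary search tree in sorted order.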

### Sort

A binary search tree can be used to implement a simple but inefficient sorting algorithm. Similar to heapsort, we insert all the values we wish to sort into a new ordered data structure (in this case a binary search tree) and then traverse it in order, building our result:

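```python
def build_binary_tree(values):
    tree = None
    for v in values:
        # binary_tree_insert is used here as a two-argument variant that
        # stores nodes as (left, value, right) tuples
        tree = binary_tree_insert(tree, v)
    return tree

def traverse_binary_tree(treenode):
    if treenode is None:
        return []
    else:
        left, value, right = treenode
        # concatenate the three parts into one flat, sorted list
        return traverse_binary_tree(left) + [value] + traverse_binary_tree(right)
```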

The worst-case time of `build_binary_tree` is Θ(n²): if you feed it a sorted list of values, it chains them into a linked list with no left subtrees. For example, `build_binary_tree([1, 2, 3, 4, 5])` yields the tree `(None, 1, (None, 2, (None, 3, (None, 4, (None, 5, None)))))`.
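
The effect of input order on the shape of the tree can be measured by comparing heights. This is a small sketch, not the article's code: `insert`, `build` and `height` are hypothetical helpers working on nested `[left, value, right]` lists, written iteratively so that a long chain does not hit Python's recursion limit.

```python
import random


def insert(tree, value):
    """Insert value iteratively into a tree of [left, value, right] lists."""
    node = [None, value, None]
    if tree is None:
        return node
    current = tree
    while True:
        side = 0 if value < current[1] else 2
        if current[side] is None:
            current[side] = node
            return tree
        current = current[side]


def build(values):
    tree = None
    for v in values:
        tree = insert(tree, v)
    return tree


def height(tree):
    """Number of nodes on the longest path from the root to a leaf."""
    best, stack = 0, [(tree, 1)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        best = max(best, depth)
        stack.append((node[0], depth + 1))
        stack.append((node[2], depth + 1))
    return best


n = 200
sorted_height = height(build(range(n)))    # every node has only a right child

shuffled = list(range(n))
random.Random(0).shuffle(shuffled)         # fixed seed for repeatability
shuffled_height = height(build(shuffled))

print(sorted_height, shuffled_height)
```

For sorted input the height equals n, so each insertion walks the whole chain; for the shuffled input the height is much smaller, although it can never drop below 8 for 200 nodes.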

There are several schemes for overcoming this flaw with simple binary trees; the most common is the self-balancing binary search tree. If this same procedure is done using such a tree, the overall worst-case time is O(n log n), which is asymptotically optimal for a comparison sort. In practice, the poor cache performance and added overhead in time and space for a tree-based sort (particularly for node allocation) make it inferior to other asymptotically optimal sorts such as quicksort and heapsort for static list sorting. On the other hand, it is one of the most efficient methods of incremental sorting, adding items to a list over time while keeping the list sorted at all times.

### Example for a Binary Search Tree in Python:

```python
class Node:
    def __init__(self, lchild=None, rchild=None, value=-1, data=None):
        self.lchild = lchild
        self.rchild = rchild
        self.value = value
        self.data = data


class Bst:
    """Implements Binary Search Tree"""

    def __init__(self):
        self.root = None

    def add(self, key, dt):
        """Add a node in tree"""
        if self.root is None:
            self.root = Node(value=key, data=dt)
            return 0
        self.p = self.root
        while True:
            if self.p.value > key:
                if self.p.lchild is None:
                    self.p.lchild = Node(value=key, data=dt)
                    return 0  # success
                self.p = self.p.lchild
            elif self.p.value == key:
                return -1  # value already in tree
            else:
                if self.p.rchild is None:
                    self.p.rchild = Node(value=key, data=dt)
                    return 0  # success
                self.p = self.p.rchild

    def search(self, key):
        """Searches Tree for a key and returns data; if not found returns None"""
        self.p = self.root
        while self.p is not None:
            if self.p.value > key:
                self.p = self.p.lchild
            elif self.p.value == key:
                return self.p.data
            else:
                self.p = self.p.rchild
        return None  # not found

    def deleteNode(self, key):
        """Deletes node with value == key; returns 1 if deleted, 0 if not found"""
        if self.root is None:
            return 0
        if self.root.value == key:
            self.root = self.proceed(None, self.root)
            return 1
        self.p = self.root
        while True:
            # self.p.value is never equal to key at this point
            if self.p.value > key:
                if self.p.lchild is None:
                    return 0  # found nothing to delete
                elif self.p.lchild.value == key:
                    self.p.lchild = self.proceed(self.p, self.p.lchild)
                    return 1
                else:
                    self.p = self.p.lchild
            else:
                if self.p.rchild is None:
                    return 0  # found nothing to delete
                elif self.p.rchild.value == key:
                    self.p.rchild = self.proceed(self.p, self.p.rchild)
                    return 1
                else:
                    self.p = self.p.rchild

    def proceed(self, parent, delValue):
        """Returns the subtree that takes the place of the deleted node"""
        if delValue.lchild is None:
            return delValue.rchild
        if delValue.rchild is None:
            return delValue.lchild
        # Two children: hang the left subtree below the smallest node
        # of the right subtree, then promote the right subtree
        smallest = delValue.rchild
        while smallest.lchild is not None:
            smallest = smallest.lchild
        smallest.lchild = delValue.lchild
        return delValue.rchild

    def sort(self):
        self.__traverse__(self.root, mode=1)

    def __traverse__(self, v, mode=0):
        """Traverse in: preorder = 0, inorder = 1, postorder = 2"""
        if v is None:
            return
        if mode == 0:
            print(v.value, v.data)
            self.__traverse__(v.lchild)
            self.__traverse__(v.rchild)
        elif mode == 1:
            self.__traverse__(v.lchild, 1)
            print(v.value, v.data)
            self.__traverse__(v.rchild, 1)
        else:
            self.__traverse__(v.lchild, 2)
            self.__traverse__(v.rchild, 2)
            print(v.value, v.data)


def main():
    tree = Bst()
    tree.add(4, "test1")
    tree.add(10, "test2")
    tree.add(23, "test3")
    tree.add(1, "test4")
    tree.add(3, "test5")
    tree.add(2, "test6")
    tree.sort()
    print(tree.search(3))
    print(tree.deleteNode(10))
    print(tree.deleteNode(23))
    print(tree.deleteNode(4))
    print(tree.search(3))
    tree.sort()


if __name__ == "__main__":
    main()
```

## Types of binary search trees

There are many types of binary search trees. AVL trees and red-black trees are both forms of self-balancing binary search trees. A splay tree is a binary search tree that automatically moves frequently accessed elements nearer to the root. In a treap ("tree heap"), each node also holds a priority and the parent node has higher priority than its children.

Two other terms describe the shape of a binary search tree: complete and degenerate.

A complete tree is a tree with n levels, where for each level d <= n - 1, the number of existing nodes at level d is equal to 2^d. This means all possible nodes exist at these levels. An additional requirement for a complete binary tree is that for the nth level, while every node does not have to exist, the nodes that do exist must fill from left to right.

A degenerate tree is a tree where for each parent node, there is only one associated child node. What this means is that in a performance measurement, the tree will essentially behave like a linked list data structure.
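
Both shapes can be tested programmatically. The following sketch assumes nodes are `(left, value, right)` tuples; `is_degenerate`, `is_complete` and `leaf` are hypothetical helpers written for this illustration.

```python
from collections import deque


def is_degenerate(tree):
    """True if no node has two children, so the tree is a single chain."""
    while tree is not None:
        left, _, right = tree
        if left is not None and right is not None:
            return False
        tree = left if left is not None else right
    return True


def is_complete(tree):
    """True if a breadth-first walk never meets a node after a gap."""
    queue = deque([tree])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
            continue
        if seen_gap:
            return False  # a node appears to the right of a missing one
        queue.append(node[0])
        queue.append(node[2])
    return True


def leaf(value):
    return (None, value, None)


complete = ((leaf(1), 2, leaf(3)), 4, (leaf(5), 6, None))
chain = (None, 1, (None, 2, (None, 3, None)))

print(is_complete(complete), is_degenerate(complete))  # True False
print(is_complete(chain), is_degenerate(chain))        # False True
```

The breadth-first check relies on the left-to-right filling rule: in a complete tree, all missing positions come after all present nodes in level order.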

### Performance comparisons

D. A. Heger (2004) [1] presented a performance comparison of binary search trees. Treap was found to have the best average performance, while red-black tree was found to have the smallest amount of performance fluctuations.

### Optimal binary search trees

If we don't plan on modifying a search tree, and we know exactly how often each item will be accessed, we can construct an optimal binary search tree, which is a search tree where the average cost of looking up an item (the expected search cost) is minimized.

Assume that we know the elements and that for each element, we know the proportion of future lookups which will be looking for that element. We can then use a dynamic programming solution, detailed in section 15.5 of Introduction to Algorithms (Second Edition) by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, to construct the tree with the least possible expected search cost.
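
A simplified version of such a dynamic program can be sketched as follows. It is not the textbook algorithm verbatim: it considers successful lookups only (the textbook formulation also models unsuccessful lookups, which this sketch leaves out), it runs in O(n³) time, and the name `optimal_bst_cost` and the example weights are assumptions made for illustration. The cost of a lookup is taken to be the number of nodes examined.

```python
def optimal_bst_cost(weights):
    """Return (minimal weighted lookup cost, root table).

    weights[i] is the access frequency of the i-th smallest key.
    cost[i][j] is the best cost for the keys i .. j-1, and root[i][j]
    is the index of the key chosen as the root of that range.
    """
    n = len(weights)
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)

    cost = [[0] * (n + 1) for _ in range(n + 1)]
    root = [[None] * (n + 1) for _ in range(n + 1)]

    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # every key in the range sits one level deeper than in its subtree
            total = prefix[j] - prefix[i]
            best = None
            for r in range(i, j):
                candidate = cost[i][r] + cost[r + 1][j] + total
                if best is None or candidate < best:
                    best, root[i][j] = candidate, r
            cost[i][j] = best
    return cost[0][n], root


best_cost, root = optimal_bst_cost([34, 8, 50])
print(best_cost)    # 142
print(root[0][3])   # 2: the most frequently accessed key becomes the root
```

With frequencies 34, 8 and 50, the best tree puts the third key at the root and the first key as its left child, for a cost of 50·1 + 34·2 + 8·3 = 142; a balanced tree rooted at the middle key would cost 8·1 + 34·2 + 50·2 = 176.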

Even if we only have estimates of the search costs, such a system can considerably speed up lookups on average. For example, if you have a BST of English words used in a spell checker, you might balance the tree based on word frequency in text corpora, placing words like "the" near the root and words like "agerasia" near the leaves. Such a tree might be compared with Huffman trees, which similarly seek to place frequently-used items near the root in order to produce a dense information encoding; however, Huffman trees only store data elements in leaves and these elements need not be ordered.

If we do not know the sequence in which the elements in the tree will be accessed in advance, we can use splay trees which are asymptotically as good as any static search tree we can construct for any particular sequence of lookup operations.

Alphabetic trees are Huffman trees with the additional constraint on order, or, equivalently, search trees with the modification that all elements are stored in the leaves. Faster algorithms exist for optimal alphabetic binary trees (OABTs).


## References

1. ^ Heger, Dominique A. (2004), "A Disquisition on The Performance Behavior of Binary Search Tree Data Structures", European Journal for the Informatics Professional 5 (5), <http://www.upgrade-cepis.org/issues/2004/5/up5-5Mosaic.pdf>
