data_tree_biologyprune

One of the questions that arises in a decision tree algorithm is the optimal size of the final tree. A tree that is too large risks overfitting the training data and generalizing poorly to new samples. A tree that is too small might not capture important structural information about the sample space. A common strategy is to grow the tree until each node contains a small number of instances, then use pruning to remove nodes that do not provide additional information. Decision tree pruning is an accepted post-processing technique that reduces the size of an inferred decision tree by removing nodes, or whole sections of the tree, that provide little power to classify instances. In doing so, it reduces the number of partitions that the tree imposes on the instance space. Pruning has been demonstrated to improve the predictive accuracy of inferred decision trees in a wide variety of domains.

The metaphor for this method is taken from horticultural practice, where pruning involves the selective removal of parts of a plant to shape a tree, control or direct its growth, improve its health, reduce the risk from falling branches, prepare nursery specimens for transplanting, and both harvest and increase the yield or quality of flowers and fruits. The practice entails targeted removal of diseased, damaged, dead, non-productive, structurally unsound, or otherwise unwanted tissue. Pruning young trees, known as developmental tree pruning, is performed for structural enhancement. This tree care procedure helps ensure that young trees have a desirable branch architecture and structural integrity.
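Returning to decision trees, the grow-then-prune strategy described above can be illustrated with a short sketch. The text does not name a specific pruning criterion, so this example assumes one common choice, reduced-error pruning: working bottom-up, a subtree is replaced by a leaf whenever the leaf makes no more mistakes on held-out validation data than the subtree does. The `Node` class, the helper functions, the hand-built tree, and the validation rows are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Majority class of the training instances that reached this node
    # (assumed to have been recorded while the tree was grown).
    label: str
    feature: Optional[int] = None      # index of the feature tested here
    threshold: Optional[float] = None  # go left if x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def predict(node: Node, x) -> str:
    """Follow the splits from `node` down to a leaf and return its label."""
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def count_leaves(node: Node) -> int:
    if node.is_leaf():
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


def errors(node: Node, data) -> int:
    """Number of (x, y) pairs in `data` that `node` misclassifies."""
    return sum(predict(node, x) != y for x, y in data)


def prune(node: Node, data) -> Node:
    """Reduced-error pruning; `data` holds the validation pairs reaching `node`."""
    if node.is_leaf():
        return node
    left_data = [(x, y) for x, y in data if x[node.feature] <= node.threshold]
    right_data = [(x, y) for x, y in data if x[node.feature] > node.threshold]
    # Prune the children first so decisions are made bottom-up.
    node.left = prune(node.left, left_data)
    node.right = prune(node.right, right_data)
    # Collapse this subtree if a single leaf is at least as accurate.
    leaf = Node(label=node.label)
    if errors(leaf, data) <= errors(node, data):
        return leaf
    return node


# Hypothetical overgrown tree: the split on feature 1 fits noise.
tree = Node(
    label="a", feature=0, threshold=5.0,
    left=Node(label="a"),
    right=Node(
        label="b", feature=1, threshold=2.0,
        left=Node(label="b"),
        right=Node(label="a"),
    ),
)

# Hypothetical held-out validation set of (features, class) pairs.
validation = [
    ([1, 0], "a"), ([2, 3], "a"),
    ([7, 1], "b"), ([8, 3], "b"), ([9, 4], "b"),
]

leaves_before = count_leaves(tree)
pruned = prune(tree, validation)
print(leaves_before, count_leaves(pruned))  # prints "3 2"
```

In this toy case the split on feature 1 misclassifies two validation rows, so it is collapsed into a single leaf, while the root split is kept because removing it would increase the validation error. Other criteria, such as cost-complexity pruning, follow the same outline but use a different test for when to collapse a subtree.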