Decision Tree


Picture Source: https://towardsdatascience.com/machine-learning-classifiers-a5cc4e1b0623
Python Code: Github Link

What is a Decision Tree?

1. A decision tree is a set of simple IF-THEN rules arranged along the branches of a tree; following them down to a leaf gives us the output and the reason for it.
2. It splits the data on feature values at each branch and continues until it reaches the leaves.
3. So, the hyperparameters of a decision tree are the maximum depth of the tree, the minimum leaf size, and the evaluation criterion used to choose splits.
4. How do we evaluate a leaf? We always look for a homogeneous leaf.
5. What is homogeneous? Take 2 leaves, L1 and L2. L1 has 50 Yes values and 50 No values. L2 has 90 Yes values and 10 No values. Which one do we prefer? L2, obviously: it is the more homogeneous leaf, so it gives us more information.
6. So, the evaluation criterion is based on how much information we gain from a split.
7. Because it keeps splitting as long as a split gains some information, a decision tree left unconstrained tends to overfit the training data.
8. So how do we stop our tree from overfitting?
9. Here comes pruning. Pruning is cutting the tree back based on information gain.
10. If there is not much information gain from one level of the tree to the next, there is no use expanding the tree further.
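
To make the L1 and L2 comparison concrete, here is a minimal sketch in plain Python. It assumes Shannon entropy as the measure of homogeneity (the list above does not name one, and Gini impurity is another common choice). The function names `entropy`, `information_gain` and `worth_splitting`, the `min_gain` threshold, and the parent node that holds both leaves are all hypothetical and chosen only for illustration.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a node, given its class counts."""
    total = sum(counts)
    # Skip empty classes: 0 * log2(0) is treated as 0.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, children_counts):
    """Parent entropy minus the size-weighted entropy of the children."""
    total = sum(parent_counts)
    weighted = sum(sum(child) / total * entropy(child) for child in children_counts)
    return entropy(parent_counts) - weighted


def worth_splitting(gain, min_gain=0.01):
    """Assumed stopping rule: only expand when the gain clears a threshold."""
    return gain >= min_gain


leaf_l1 = [50, 50]  # 50 Yes, 50 No
leaf_l2 = [90, 10]  # 90 Yes, 10 No

print(entropy(leaf_l1))  # 1.0, as mixed as a two-class leaf can be
print(entropy(leaf_l2))  # about 0.47, much closer to pure

# Hypothetical parent node that was split into L1 and L2 (140 Yes, 60 No).
gain = information_gain([140, 60], [leaf_l1, leaf_l2])
print(worth_splitting(gain))
```

Lower entropy means a more homogeneous leaf, which is why L2 is preferred. The `worth_splitting` check is one simple way to express item 10: when the gain from a further split falls below the chosen threshold, the tree stops growing at that node.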