During the fourth week at Zipfian, we implemented a few more basic machine learning algorithms from scratch using Python. One of the more interesting ones for me is the decision tree. As a classification technique, the decision tree has quite a number of appealing features, the chief of which are its simplicity and interpretability. Despite its conceptual simplicity, it is actually not that straightforward to implement from a programming perspective. Hopefully, this post can serve as a brief overview on how to program a simple decision tree.
Like many other algorithms we’ve learned, the first step is to define a cost function that we will be minimizing. In the case of a decision tree, the cost function determines how variance is calculated at each branch of the tree. For our implementation, we used a concept called entropy to calculate this variance: