© 2024 Yu-Hang

Course:

Introduction to Machine Learning (10-601)

Time Spent:

10 hours

Source Code:
to github

Decision Tree

In this assignment, I implemented an end-to-end Decision Tree classifier from scratch to solve binary classification problems. The programming tasks included:

Data Inspection:

  • Developed a Python program to calculate the label entropy at the root of the dataset and the error rate when using a majority vote classifier. This provided a baseline for evaluating the classifier's performance.
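The two baseline quantities from the data inspection step can be sketched as follows. This is a minimal illustration rather than the assignment's actual code: the function names `entropy` and `majority_vote_error` are hypothetical, and entropy is assumed to be measured in bits (log base 2).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a label list, in bits (log base 2 is an assumption)."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def majority_vote_error(labels):
    """Error rate of a classifier that always predicts the most common label."""
    n = len(labels)
    most_common_count = Counter(labels).most_common(1)[0][1]
    return 1 - most_common_count / n


# Toy labels for illustration only (not taken from the heart or education data).
toy_labels = [1, 1, 1, 0]
print(entropy(toy_labels))              # entropy at the root, in bits
print(majority_vote_error(toy_labels))  # fraction of labels the majority vote misses
```

Any learned tree should at least match the majority-vote error on the training set, which is why this number serves as a baseline.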

Decision Tree Implementation:

    Built a binary Decision Tree learner capable of:
  • Training on datasets using mutual information as the splitting criterion.
  • Supporting configurable maximum tree depth to control overfitting.
  • Breaking ties in mutual information by selecting the first of the tied columns.
  • Classifying examples using majority voting at leaf nodes.
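The four capabilities listed above can be sketched in a compact form. This is an illustrative sketch under stated assumptions, not the submitted implementation: the dictionary-based node layout, the function names (`mutual_information`, `majority`, `train`, `predict`), the rule for ties in the majority vote, and the fallback for unseen attribute values are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(rows, labels, col):
    """I(Y; X_col) = H(Y) - H(Y | X_col)."""
    n = len(labels)
    conditional = 0.0
    for value in set(r[col] for r in rows):
        subset = [y for r, y in zip(rows, labels) if r[col] == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def majority(labels):
    """Most common label; vote ties go to the largest label (an assumption)."""
    counts = Counter(labels)
    best = max(counts.values())
    return max(label for label, c in counts.items() if c == best)


def train(rows, labels, max_depth, depth=0):
    """Recursively grow a tree, stopping at max_depth or when no split helps."""
    if depth >= max_depth or len(set(labels)) == 1:
        return {"leaf": majority(labels)}
    n_cols = len(rows[0])
    gains = [mutual_information(rows, labels, c) for c in range(n_cols)]
    # max() returns the first maximal element, so ties go to the first column.
    best_col = max(range(n_cols), key=lambda c: gains[c])
    if gains[best_col] <= 1e-12:  # no informative split remains
        return {"leaf": majority(labels)}
    node = {"col": best_col, "default": majority(labels), "children": {}}
    for value in sorted(set(r[best_col] for r in rows)):
        idx = [i for i, r in enumerate(rows) if r[best_col] == value]
        node["children"][value] = train(
            [rows[i] for i in idx], [labels[i] for i in idx], max_depth, depth + 1
        )
    return node


def predict(node, row):
    """Walk from the root to a leaf and return its majority-vote label."""
    while "leaf" not in node:
        child = node["children"].get(row[node["col"]])
        if child is None:  # attribute value unseen during training (assumption)
            return node["default"]
        node = child
    return node["leaf"]


# Toy data for illustration only: column 0 determines the label, column 1 is noise.
toy_rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
toy_labels = [0, 0, 1, 1]
tree = train(toy_rows, toy_labels, max_depth=2)
print([predict(tree, r) for r in toy_rows])
```

Setting `max_depth=0` reduces the tree to a single majority-vote leaf, which connects the learner back to the baseline from the data inspection step.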

Training and Evaluation:

  • Applied the Decision Tree to the heart and education datasets.
  • Conducted experiments at various maximum tree depths and evaluated performance through training and testing error rates.
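The depth experiments can be organized as a small sweep that records training and testing error at each maximum depth. The sketch below is an assumption about how such a loop might look: `error_rate` and `depth_sweep` are hypothetical names, the learner is passed in as a pair of callables, and a majority-vote stand-in replaces the real tree so that the example runs on its own. The toy data is illustrative and does not reproduce any result from the heart or education datasets.

```python
from collections import Counter


def error_rate(predictions, labels):
    """Fraction of predictions that differ from the true labels."""
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)


def depth_sweep(train_fn, predict_fn, train_data, test_data, depths):
    """Train one model per depth and collect (depth, train_error, test_error)."""
    x_train, y_train = train_data
    x_test, y_test = test_data
    results = []
    for depth in depths:
        model = train_fn(x_train, y_train, depth)
        train_err = error_rate([predict_fn(model, x) for x in x_train], y_train)
        test_err = error_rate([predict_fn(model, x) for x in x_test], y_test)
        results.append((depth, train_err, test_err))
    return results


# Stand-in learner (assumption): a majority-vote model that ignores max depth.
def majority_train(rows, labels, max_depth):
    return Counter(labels).most_common(1)[0][0]


def majority_predict(model, row):
    return model


toy_train = ([(0,), (1,), (1,)], [1, 1, 0])
toy_test = ([(0,), (1,)], [1, 0])
for depth, train_err, test_err in depth_sweep(
    majority_train, majority_predict, toy_train, toy_test, depths=[0, 1, 2]
):
    print(f"depth={depth} train_error={train_err:.4f} test_error={test_err:.4f}")
```

With a real tree learner plugged in, training error typically falls as depth grows, while testing error may level off or rise once the tree starts to overfit.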

This project enhanced my understanding of decision trees, information theory concepts (e.g., entropy, mutual information), and the practical challenges of machine learning implementation.


  • Machine Learning
  • Decision Trees
  • Information Theory
  • Entropy
  • Mutual Information
  • Binary Classification
  • Data Analysis
  • Python
  • ML Evaluation