Decision Tree

In this assignment, I implemented an end-to-end Decision Tree classifier from scratch to solve binary classification problems. The programming tasks included:

Data Inspection:

Developed a Python program that computes the label entropy at the root of the tree (before any split) and the error rate of a majority vote classifier. These two numbers serve as a baseline for evaluating the Decision Tree's performance.
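
A minimal sketch of the two inspection metrics, assuming the labels have already been read into a non-empty Python list. The function names `label_entropy` and `majority_vote_error` and the toy labels are illustrative, not the assignment's actual code or data.

```python
import math
from collections import Counter

def label_entropy(labels):
    """Entropy, in bits, of the label distribution (assumes a non-empty list)."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def majority_vote_error(labels):
    """Error rate of always predicting the most common label."""
    n = len(labels)
    majority_count = Counter(labels).most_common(1)[0][1]
    return (n - majority_count) / n

# Toy labels for illustration only, not taken from the heart or education data.
toy_labels = [1, 1, 1, 0]
print("entropy:", label_entropy(toy_labels))
print("error:", majority_vote_error(toy_labels))
```

A tree that cannot beat the majority vote error on the training data has not learned anything useful from the features, which is why this number works as a baseline.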

Decision Tree Implementation:

Training on datasets using mutual information as the splitting criterion.
Supporting configurable maximum tree depth to control overfitting.
Handling ties in mutual information by selecting the first column.
Classifying examples using majority voting at leaf nodes.
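
The four points above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the submitted implementation: features are categorical values stored in tuples, the tree is a nested dict, and the names `train`, `predict`, `mutual_information`, and `majority` are hypothetical. The rule for breaking a tie in the majority vote (larger label wins) is also an assumption.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(rows, labels, col):
    """I(Y; X_col) = H(Y) - H(Y | X_col)."""
    n = len(labels)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[col], []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def majority(labels):
    # Assumed tie rule: on equal counts the larger label value wins.
    counts = Counter(labels)
    return max(counts, key=lambda v: (counts[v], v))

def train(rows, labels, max_depth, depth=0):
    # Stop at the depth limit or when the node is already pure.
    if depth >= max_depth or len(set(labels)) == 1:
        return {"leaf": majority(labels)}
    gains = [mutual_information(rows, labels, c) for c in range(len(rows[0]))]
    # max() returns the first maximal item, so the first column wins ties.
    best = max(range(len(gains)), key=lambda c: gains[c])
    if gains[best] <= 1e-12:  # no column is informative
        return {"leaf": majority(labels)}
    groups = {}
    for row, y in zip(rows, labels):
        sub_rows, sub_labels = groups.setdefault(row[best], ([], []))
        sub_rows.append(row)
        sub_labels.append(y)
    children = {value: train(r, l, max_depth, depth + 1)
                for value, (r, l) in groups.items()}
    return {"col": best, "children": children, "default": majority(labels)}

def predict(node, row):
    while "leaf" not in node:
        child = node["children"].get(row[node["col"]])
        if child is None:  # feature value unseen in training at this node
            return node["default"]
        node = child
    return node["leaf"]

# Toy data for illustration: column 0 fully determines the label.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]
tree = train(rows, labels, max_depth=2)
print(predict(tree, (1, 0)))
```

With `max_depth=0` the tree collapses to a single leaf, so it behaves exactly like the majority vote classifier.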

Training and Evaluation:

Applied the Decision Tree to the heart dataset and the education dataset.
Conducted experiments at various maximum tree depths and evaluated performance through training and testing error rates.
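
A depth sweep of this kind can be sketched as below. The helper names `error_rate` and `sweep_depths` are hypothetical, and so is the stand-in learner (`train_prefix`, `predict_prefix`), which simply memorizes the majority label for each combination of the first `depth` features. It is not a decision tree and is included only so that the example runs on its own. The data is a toy set, not the heart or education dataset.

```python
from collections import Counter, defaultdict

def error_rate(predictions, labels):
    """Fraction of predictions that differ from the true labels."""
    return sum(p != y for p, y in zip(predictions, labels)) / len(labels)

def sweep_depths(train_fn, predict_fn, train_set, test_set, depths):
    """Return a (depth, train_error, test_error) tuple for each depth."""
    (train_x, train_y), (test_x, test_y) = train_set, test_set
    results = []
    for depth in depths:
        model = train_fn(train_x, train_y, depth)
        train_err = error_rate([predict_fn(model, x) for x in train_x], train_y)
        test_err = error_rate([predict_fn(model, x) for x in test_x], test_y)
        results.append((depth, train_err, test_err))
    return results

def train_prefix(rows, labels, depth):
    # Stand-in learner (assumption): majority label per prefix of `depth` features.
    table = defaultdict(Counter)
    for row, y in zip(rows, labels):
        table[tuple(row[:depth])][y] += 1
    return {
        "depth": depth,
        "table": {k: c.most_common(1)[0][0] for k, c in table.items()},
        "default": Counter(labels).most_common(1)[0][0],
    }

def predict_prefix(model, row):
    return model["table"].get(tuple(row[:model["depth"]]), model["default"])

# Toy data for illustration only.
train_set = ([(0, 0), (0, 1), (1, 0), (1, 1), (1, 0)], [0, 0, 1, 0, 1])
test_set = ([(1, 0), (0, 1)], [1, 0])
for depth, train_err, test_err in sweep_depths(
        train_prefix, predict_prefix, train_set, test_set, depths=range(3)):
    print(depth, train_err, test_err)
```

Passing the learner in as a pair of functions keeps the sweep independent of how the tree is implemented. Comparing the two error columns across depths shows where training error keeps falling while test error stops improving, which is the usual sign of overfitting.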

This project enhanced my understanding of decision trees, information theory concepts (e.g., entropy, mutual information), and the practical challenges of machine learning implementation.

Machine Learning
Decision Trees
Information Theory
Entropy
Mutual Information
Binary Classification
Data Analysis
Python
ML Evaluation

Phone:

+(886) 909 756 966

Email:

moneychien20639@gmail.com

Arthur Chien

Course:

Time Spent:

Source Code:
