
© 2024 Yu-Hang

Course:

Introduction to Machine Learning (10-601)

Time Spent:

10 hours

Source Code:

On GitHub

Neural Networks

In this assignment, I implemented a one-hidden-layer neural network from scratch to perform image classification on an Optical Character Recognition (OCR) dataset. This project helped solidify my understanding of forward/backward propagation, gradient-based optimization, and hyperparameter tuning.


Model Definition:

  • Architecture: A single hidden layer with sigmoid activation; output layer uses softmax to generate class probabilities.
  • Objective Function: Average cross-entropy loss over the training dataset.
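The two model pieces can be sketched in NumPy; the helper names (`sigmoid`, `softmax`, `cross_entropy`) are my own for illustration, not the assignment's starter code:

```python
import numpy as np

def sigmoid(z):
    # elementwise sigmoid for the hidden-layer activation
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # subtract the row max for numerical stability before exponentiating
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, y):
    # average cross-entropy over the batch; y holds integer class labels
    return -np.mean(np.log(probs[np.arange(len(y)), y]))
```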

Tasks Accomplished:

  • Forward Propagation: Computed activations for hidden/output layers with matrix operations; included bias terms.
  • Backward Propagation: Derived gradients using the chain rule for both layers to enable error backpropagation.
  • Parameter Updates: Performed SGD with configurable learning rate and epoch count.
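The three steps can be sketched for a single example as follows. Here `alpha` and `beta` denote the hidden- and output-layer weight matrices with the bias folded into a leading column; that layout and the function names are my assumptions, not the handout's notation:

```python
import numpy as np

def forward(x, alpha, beta):
    a = alpha @ np.append(1.0, x)      # hidden pre-activation, bias folded in
    z = 1.0 / (1.0 + np.exp(-a))       # sigmoid hidden activation
    b = beta @ np.append(1.0, z)       # output pre-activation
    e = np.exp(b - b.max())
    y_hat = e / e.sum()                # softmax class probabilities
    return z, y_hat

def backward(x, y, z, y_hat, beta):
    # chain rule through softmax + cross-entropy, then the hidden layer
    g_b = y_hat.copy(); g_b[y] -= 1.0          # dL/db = y_hat - onehot(y)
    g_beta = np.outer(g_b, np.append(1.0, z))
    g_z = beta[:, 1:].T @ g_b                  # skip the bias column
    g_a = g_z * z * (1.0 - z)                  # sigmoid derivative
    g_alpha = np.outer(g_a, np.append(1.0, x))
    return g_alpha, g_beta
```

An SGD update is then just `alpha -= lr * g_alpha; beta -= lr * g_beta` per example.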

Implementation Details:

  • Hyperparameters: Included hidden layer size, learning rate, and initialization scheme.
  • Initialization: Supported random initialization (uniform in [-0.1, 0.1]) and zero initialization.
  • Optimization: Reshuffled data each epoch; provided CLI args for flexible training control.
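The two initialization schemes might look like the sketch below; the `init_weights` helper and its bias-column convention are assumptions for illustration:

```python
import numpy as np

def init_weights(rows, cols, scheme="random", rng=None):
    """Weight matrix with a leading bias column.

    'random' draws non-bias entries uniformly from [-0.1, 0.1] and leaves
    the bias column at zero; 'zero' initializes everything to zero.
    """
    w = np.zeros((rows, cols + 1))
    if scheme == "random":
        rng = rng or np.random.default_rng()
        w[:, 1:] = rng.uniform(-0.1, 0.1, size=(rows, cols))
    return w
```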

Empirical Evaluations:

  • Compared cross-entropy loss for varying hidden layer sizes.
  • Analyzed learning rate effects on convergence (e.g., 0.03, 0.003, 0.0003).
  • Investigated the role of random vs. zero initialization.
  • Compared one vs. two hidden layers to explore underfitting/overfitting.
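The initialization comparison has a crisp mechanism behind it: under zero initialization every hidden unit receives the same gradient, so the rows of the hidden weight matrix never diversify. The toy run below is a self-contained sketch of that symmetry argument, not the assignment's actual experiment code:

```python
import numpy as np

def step(x, y, alpha, beta, lr):
    # one SGD step on a one-hidden-layer sigmoid/softmax network
    z = 1.0 / (1.0 + np.exp(-(alpha @ np.append(1.0, x))))
    b = beta @ np.append(1.0, z)
    e = np.exp(b - b.max()); y_hat = e / e.sum()
    g_b = y_hat.copy(); g_b[y] -= 1.0
    g_beta = np.outer(g_b, np.append(1.0, z))
    g_a = (beta[:, 1:].T @ g_b) * z * (1.0 - z)
    g_alpha = np.outer(g_a, np.append(1.0, x))
    return alpha - lr * g_alpha, beta - lr * g_beta

# zero init: all hidden units start identical and stay identical
alpha = np.zeros((4, 6)); beta = np.zeros((3, 5))
x = np.array([0.2, -0.1, 0.4, 0.0, 0.3]); y = 1
for _ in range(5):
    alpha, beta = step(x, y, alpha, beta, lr=0.1)
assert np.allclose(alpha, alpha[0])   # every hidden row is still the same
```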

Programming Techniques:

  • Vectorization: Leveraged NumPy operations for efficient computation.
  • Encapsulation: Created modular classes (e.g., Linear, Sigmoid) with forward/backward methods.
  • Unit Tests: Used provided tests to verify correctness of forward/backward passes.
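The modular layer design might look like this minimal `Linear`/`Sigmoid` sketch; the input-caching convention is an assumption about the interface, not the provided skeleton:

```python
import numpy as np

class Linear:
    """Fully connected layer with paired forward/backward methods."""
    def __init__(self, w):
        self.w = w                                  # (out, in) weight matrix
    def forward(self, x):
        self.x = x                                  # cache input for backward
        return self.w @ x
    def backward(self, grad_out):
        self.grad_w = np.outer(grad_out, self.x)    # gradient w.r.t. weights
        return self.w.T @ grad_out                  # gradient w.r.t. input

class Sigmoid:
    def forward(self, x):
        self.out = 1.0 / (1.0 + np.exp(-x))         # cache output for backward
        return self.out
    def backward(self, grad_out):
        return grad_out * self.out * (1.0 - self.out)
```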

Outputs and Results:

  • Tracked cross-entropy loss on training and validation sets per epoch.
  • Reported final training and validation error rates.
  • Produced label predictions in the required output format.
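Predictions and error rates follow directly from the softmax probabilities; the helper names below are hypothetical:

```python
import numpy as np

def predict(probs):
    # predicted label = argmax over class probabilities, one row per example
    return probs.argmax(axis=1)

def error_rate(labels, preds):
    # fraction of examples whose predicted label disagrees with the truth
    return float(np.mean(labels != preds))
```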

This assignment strengthened my understanding of neural networks from theory to implementation. It deepened my practical skills in debugging, modular code design, and performance analysis of machine learning models.


  • Neural Networks
  • Image Classification
  • OCR
  • Forward Propagation
  • Backward Propagation
  • Cross-Entropy Loss
  • SGD
  • Python
  • NumPy