Lab 2: Evaluating Classification Models — Confusion Matrices, Errors, and ROC Curves
Learning how to measure model performance: from true/false positives to confusion matrices, error types, and ROC curves — with hands-on examples in Python.
Introduction
Building predictive models is only half the story — understanding how good they are is equally important. In this lab, I explored the fundamentals of model evaluation, starting with simple metrics like true positives and negatives, and moving to more nuanced concepts like confusion matrices and ROC curves.
Through real-world-inspired examples, including a medical diagnosis case study, I discovered why accuracy alone is often misleading, and why metrics like precision, recall (also called sensitivity), and specificity matter.
Key Steps Covered
- True/False Positives and Negatives
  - Defined core evaluation terms.
- Type I & II Errors
  - False positives vs. false negatives explained.
- Confusion Matrix
  - Implemented from scratch and with sklearn.metrics.
  - Case study: evaluating different medical diagnostic classifiers.
- Visualization
  - Generated heatmaps of confusion matrices with Matplotlib & Seaborn.
- ROC Curve
  - Introduced the trade-off between sensitivity and specificity.
  - Compared models using ROC analysis.
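To make the confusion-matrix step concrete, here is a minimal sketch in the spirit of the lab. The labels and predictions are made up purely for illustration; it builds the matrix from scratch and checks it against sklearn.metrics.confusion_matrix:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy ground truth and predictions (1 = disease present, 0 = absent);
# these values are invented for illustration only.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 0, 0])

# From scratch: count each of the four outcomes.
tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # true negatives
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives (Type I errors)
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives (Type II errors)
manual = np.array([[tn, fp], [fn, tp]])

# sklearn uses the same [[TN, FP], [FN, TP]] layout for binary 0/1 labels.
auto = confusion_matrix(y_true, y_pred)
print(manual)
```

From here the heatmap step is a single call, e.g. `sns.heatmap(manual, annot=True)` with Seaborn.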
Takeaway
This lab emphasized that evaluation goes beyond raw accuracy. By learning how to interpret confusion matrices and ROC curves, I gained tools to select the right model for the right context — especially in high-stakes applications like medical testing, where false positives and negatives carry very different consequences.
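The sensitivity/specificity trade-off behind the ROC curve can be sketched numerically. The scores below are made up for illustration; the idea is to sweep the decision threshold by hand, record one (FPR, TPR) point per threshold, and confirm the resulting area matches sklearn's roc_auc_score:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy labels and predicted scores, invented purely for illustration
# (1 = disease present, 0 = absent).
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9])

def tpr_fpr(y, s, t):
    """Sensitivity (TPR) and 1 - specificity (FPR) when predicting
    positive for every score >= t."""
    pred = s >= t
    tp = np.sum(pred & (y == 1))
    fn = np.sum(~pred & (y == 1))
    fp = np.sum(pred & (y == 0))
    tn = np.sum(~pred & (y == 0))
    return tp / (tp + fn), fp / (fp + tn)

# Sweep the threshold from strict to lenient; each threshold yields
# one point on the ROC curve.
thresholds = np.sort(np.unique(scores))[::-1]
curve = [(0.0, 0.0)]  # (FPR, TPR) points, starting at the origin
for t in thresholds:
    tpr, fpr = tpr_fpr(y_true, scores, t)
    curve.append((fpr, tpr))

# Area under the curve via the trapezoid rule; it should equal sklearn's AUC.
auc_manual = sum((x1 - x0) * (y1 + y0) / 2
                 for (x0, y0), (x1, y1) in zip(curve, curve[1:]))
print(auc_manual, roc_auc_score(y_true, scores))
```

Plotting `curve` with Matplotlib shows the trade-off directly: a model whose curve hugs the top-left corner dominates one that sits near the diagonal, which is how ROC analysis compares classifiers.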
🔗 View the full Lab Notebook on GitHub
▶️ Run in Google Colab
