Lab 4: Estimating Probabilities — Maximum Likelihood vs Maximum A Posteriori

Exploring parameter estimation through coin flips: learning how MLE and MAP differ, and why Bayesian priors make estimates more robust.


Introduction

In this lab, I studied two foundational techniques for estimating model parameters: Maximum Likelihood Estimation (MLE) and Maximum A Posteriori Estimation (MAP).

Both aim to estimate unknown parameters (like the probability of getting heads in a coin toss), but they take different philosophical approaches:

  • MLE assumes a single “true” parameter value and chooses the one that maximizes the likelihood of the observed data.
  • MAP combines observed data with prior beliefs (expressed as a distribution), producing more stable estimates — especially with small datasets.
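The contrast between the two can be sketched with closed-form estimators for a coin. This is a minimal sketch, not the lab's notebook code; the Beta(6, 6) prior is an assumed choice (mean 0.5) that happens to reproduce the numbers used later in this writeup.

```python
def mle_heads(n_heads, n_tails):
    """MLE for P(heads): simply the empirical frequency of heads."""
    return n_heads / (n_heads + n_tails)

def map_heads(n_heads, n_tails, a, b):
    """MAP for P(heads) under a Beta(a, b) prior.

    The posterior is Beta(n_heads + a, n_tails + b); the MAP estimate
    is its mode: (n_heads + a - 1) / (n_heads + n_tails + a + b - 2).
    """
    return (n_heads + a - 1) / (n_heads + n_tails + a + b - 2)

# 7 heads, 3 tails observed
print(mle_heads(7, 3))        # 0.7 -- follows the data exactly
print(map_heads(7, 3, 6, 6))  # 0.6 -- assumed Beta(6, 6) prior pulls toward 0.5
```

Note how the MAP estimate lands between the raw frequency (0.7) and the prior mean (0.5); a stronger prior (larger a, b) would pull it further toward 0.5.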

Key Steps Covered

  • MLE on Unbiased Coin
    • Estimated the probability of heads as θ̂ = α_H / (α_H + α_T), where α_H and α_T are the observed counts of heads and tails.
    • Example: 7 heads, 3 tails → θ̂ = 0.7.
  • MAP on Unbiased Coin
    • Incorporated prior knowledge using a Beta distribution.
    • Example: Prior with mean 0.5 and observed 7 heads, 3 tails → θ̂ = 0.6, pulled toward the fair value of 0.5 (the exact estimate depends on the prior's strength).
  • Biased Coin Simulation
    • Ran experiments with biased coins (true θ = 0.6).
    • Showed how MLE converges to the true value with many samples, but fluctuates with fewer.
    • MAP produced more stable estimates by incorporating prior knowledge.
  • Multiple Experiments
    • Compared MLE estimates across multiple trials.
    • Visualized convergence trends with line plots.
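The biased-coin experiment above can be sketched as follows. This is a hedged reconstruction, not the notebook itself: the sample sizes, random seed, and Beta(6, 6) prior are assumptions; only the true bias θ = 0.6 comes from the lab.

```python
import random

random.seed(0)          # fixed seed for reproducibility (assumed)
TRUE_THETA = 0.6        # true bias of the simulated coin, as in the lab
A, B = 6, 6             # hypothetical Beta prior with mean 0.5

# Simulate 1000 flips of the biased coin (1 = heads, 0 = tails)
flips = [1 if random.random() < TRUE_THETA else 0 for _ in range(1000)]

n_h = n_t = 0
for i, f in enumerate(flips, start=1):
    n_h += f
    n_t += 1 - f
    mle = n_h / (n_h + n_t)
    map_est = (n_h + A - 1) / (n_h + n_t + A + B - 2)
    if i in (10, 100, 1000):
        print(f"n={i:4d}  MLE={mle:.3f}  MAP={map_est:.3f}")
```

With few flips the MLE swings with each sample, while the MAP estimate stays anchored near the prior mean; as n grows, the prior's influence fades and both converge toward 0.6. Plotting both estimates against n (as the lab's line plots do) makes the stabilizing effect of the prior easy to see.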

Takeaway

This lab highlighted the trade-off between MLE and MAP:

  • MLE is straightforward and works well with abundant data.
  • MAP is more robust with limited data, as prior beliefs smooth out fluctuations.

Together, they form the backbone of statistical parameter estimation — one rooted in frequentist thinking, the other in Bayesian reasoning.


🔗 View the full Lab Notebook on GitHub
▶️ Run in Google Colab

Written on August 22, 2025