Lab 8: Dimensionality Reduction with PCA and ICA
Reducing complexity with PCA and uncovering hidden signals with ICA — applying dimensionality reduction techniques to the Iris dataset and beyond.
Introduction
High-dimensional datasets are common in data mining, but working with them directly can be inefficient and even misleading. That’s where dimensionality reduction comes in. In this lab, I explored two powerful techniques:
- Principal Component Analysis (PCA): Projects data onto a lower-dimensional subspace while retaining as much variance as possible.
- Independent Component Analysis (ICA): Separates observed mixtures into statistically independent source signals — useful for tasks like blind source separation.
Using the Iris dataset, I applied PCA step by step, from covariance matrices to eigen decomposition and singular value decomposition (SVD). Then I compared it with ICA to see how independent signals can be extracted.
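The PCA pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming scikit-learn is available for loading Iris; the variable names are mine, not taken from the lab notebook:

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data                      # 150 samples x 4 features

# Step 1: center each feature (standardizing is optional here)
X_centered = X - X.mean(axis=0)

# Step 2: covariance matrix of the centered data (4 x 4)
cov = np.cov(X_centered, rowvar=False)

# Step 3: eigen decomposition; eigh is appropriate for symmetric matrices
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]         # sort components by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 4: project the 4D data onto the top two principal components
X_2d = X_centered @ eigvecs[:, :2]

# Equivalent route via SVD: the right singular vectors of the centered
# data are the same principal directions (up to sign)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_2d_svd = X_centered @ Vt[:2].T
```

Both routes give the same 2D projection up to a per-component sign flip, which is why the lab can implement PCA either way.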
Key Steps Covered
- Principal Component Analysis (PCA)
  - Centered and standardized the data.
  - Computed the covariance matrix, eigenvectors, and eigenvalues.
  - Implemented PCA with both eigen decomposition and SVD.
  - Visualized the 4D Iris data projected into 2D.
- Explained Variance
  - Analyzed how much variance is retained by each principal component.
- Independent Component Analysis (ICA)
  - Introduced ICA as a complementary technique to PCA.
  - Demonstrated how ICA can separate mixed signals into independent sources.
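For the explained-variance step, scikit-learn's `PCA` exposes the retained variance directly. A short sketch (the percentages in the comment are the well-known approximate values for unstandardized Iris, stated loosely rather than as the lab's exact output):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
pca = PCA(n_components=4).fit(X)

# explained_variance_ratio_ gives the fraction of total variance per component
for i, ratio in enumerate(pca.explained_variance_ratio_, start=1):
    print(f"PC{i}: {ratio:.1%} of variance")
# For Iris, the first component alone captures roughly 92% of the
# variance, and the first two together over 97% — which is why a
# 2D projection preserves most of the dataset's structure.
```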
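The blind-source-separation demonstration can be sketched with scikit-learn's `FastICA`. The two "sources" below (a sine and a square wave) and the mixing matrix are made up for illustration; they are not the signals used in the lab:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)                        # source 1: sine wave
s2 = np.sign(np.sin(3 * t))               # source 2: square wave
S = np.column_stack([s1, s2])
S += 0.05 * rng.standard_normal(S.shape)  # a little noise for realism

A = np.array([[1.0, 0.5],                 # hypothetical mixing matrix:
              [0.5, 1.0]])                # each channel blends both sources
X = S @ A.T                               # observed mixed signals

ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)              # recovered sources
```

ICA recovers the sources only up to permutation and scale, so the recovered columns may be reordered or flipped relative to the originals — a key contrast with PCA, whose components are ordered by variance.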
Takeaway
This lab showed how PCA can simplify high-dimensional data while preserving important structure — a crucial step in preprocessing for machine learning. ICA added another perspective, revealing how independent signals can be disentangled from observed mixtures. Together, these tools are foundational for data compression, visualization, and signal processing.
🔗 View the full Lab Notebook on GitHub
▶️ Run in Google Colab
