
Fundamentals of Machine Learning: Unsupervised Learning

Activities & Teaching Strategies

Active learning works for unsupervised learning because students need to physically experience how algorithms impose structure on unlabeled data. Moving bodies, plotting points, and comparing visuals help learners internalize that patterns emerge from mathematical choices, not objective truths.

12th Grade · Computer Science · 4 activities · 15–40 min

Learning Objectives

  1. Classify data points into distinct groups based on inherent similarities using clustering algorithms.
  2. Compare the effectiveness of k-means and hierarchical clustering for different dataset structures.
  3. Analyze the trade-off between dimensionality reduction and information loss using techniques like PCA.
  4. Evaluate the suitability of unsupervised learning methods for anomaly detection in financial transaction data.
  5. Design a process to visualize high-dimensional data by applying dimensionality reduction techniques.


25 min·Whole Class

Simulation Activity: Human K-Means Clustering

Tape a large coordinate grid on the floor. Give each student a card with (x, y) values and have them stand at their position. The teacher randomly assigns two students as initial centroids. Students assign themselves to the nearest centroid by walking toward it, then recompute centroids as a group average. Repeat for two more rounds. Students observe convergence and discuss whether the result is globally optimal.
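The loop the class acts out can be sketched in a few lines of Python. The points and the k=2 setup below are illustrative stand-ins for the students' coordinate cards, not a prescribed dataset.

```python
# Minimal sketch of the human k-means activity: 12 "students" with (x, y)
# cards, two of them chosen as initial centroids, three rounds of walking.
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(0, 10, size=(12, 2))  # one (x, y) card per student

# Two randomly chosen students start as the centroids.
centroids = points[rng.choice(len(points), size=2, replace=False)]

for round_num in range(3):  # three rounds, as in the activity
    # Each point "walks" to its nearest centroid (Euclidean distance).
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Recompute each centroid as the group average.
    new_centroids = []
    for c in range(2):
        members = points[labels == c]
        # Keep the old centroid if a cluster ever ends up empty (rare edge case).
        new_centroids.append(members.mean(axis=0) if len(members) else centroids[c])
    centroids = np.array(new_centroids)

print(labels)     # final cluster assignments
print(centroids)  # final centroid positions
```

Running this with different random seeds mirrors the classroom discussion: different initial centroids can converge to different partitions, which is exactly the "is this globally optimal?" question the activity surfaces.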

Prepare & details

Explain how unsupervised learning can discover patterns without explicit labels.

Facilitation Tip: During the Human K-Means Clustering activity, have students physically walk to new centroid positions step-by-step rather than jumping to final clusters immediately.

Setup: Groups at tables with access to research materials

Materials: Problem scenario document, KWL chart or inquiry framework, Resource library, Solution presentation template

Analyze · Evaluate · Create · Decision-Making · Self-Management · Relationship Skills

Collaborative Problem-Solving: Clustering Unlabeled Data

Students run k-means on a dataset of their choice (such as customer purchase data, penguin measurements, or movie ratings) using Python and scikit-learn. They experiment with different values of k, visualize the results, and write a paragraph interpreting what each cluster might represent. The ambiguity of interpreting unlabeled clusters is a key learning moment.
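One possible shape for the lab code, with scikit-learn's make_blobs standing in for whatever dataset the students choose:

```python
# Sketch of the clustering lab: fit k-means for several values of k and
# plot the resulting assignments side by side. The synthetic blob data
# is a placeholder for the students' chosen dataset.
import matplotlib
matplotlib.use("Agg")  # headless-safe backend for saving figures
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, k in zip(axes, [2, 4, 8]):  # experiment with different k
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    ax.scatter(X[:, 0], X[:, 1], c=km.labels_, s=10)
    ax.set_title(f"k={k}, inertia={km.inertia_:.0f}")
fig.savefig("kmeans_k_comparison.png")
```

Inertia always decreases as k grows, so the plot titles give students a concrete reason to discuss why "lower inertia" alone cannot pick the best k.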

Prepare & details

Compare the applications of clustering and dimensionality reduction in data analysis.

Facilitation Tip: In the Clustering Unlabeled Data lab, model how to interpret silhouette scores by comparing two different k values side-by-side with the class.
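The side-by-side silhouette comparison in the tip might look like this in code; the blob data is again a placeholder for the class dataset:

```python
# Compare silhouette scores for two values of k on the same data.
# Scores range from -1 to 1; higher means tighter, better-separated clusters.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=1)

scores = {}
for k in (2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
    print(f"k={k}: silhouette = {scores[k]:.3f}")
```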

Setup: Groups at tables with problem materials

Materials: Problem packet, Role cards (facilitator, recorder, timekeeper, reporter), Problem-solving protocol sheet, Solution evaluation rubric

Apply · Analyze · Evaluate · Create · Relationship Skills · Decision-Making · Self-Management
15 min·Pairs

Think-Pair-Share: Is This Clustering Useful?

Present two clustering results for the same dataset: one with two clusters, one with eight. Pairs discuss which is more useful for a specific business decision (e.g., designing a marketing campaign). There is no single right answer; the discussion surfaces the fact that 'good' clustering depends on the question being asked, not just on a mathematical metric.

Prepare & details

Analyze the challenges of evaluating the performance of unsupervised learning models.


Setup: Standard classroom seating; students turn to a neighbor

Materials: Discussion prompt (projected or printed), Optional: recording sheet for pairs

Understand · Apply · Analyze · Self-Awareness · Relationship Skills
18 min·Small Groups

Gallery Walk: Dimensionality Reduction Visualization

Post printouts showing the same dataset in 3D and as a 2D PCA projection, alongside visualizations of t-SNE and UMAP. Students annotate each with what information appears preserved and what appears lost. The walk helps students understand dimensionality reduction as a compression decision with trade-offs rather than as a magical reveal of hidden truth.
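The printouts could be generated along these lines. The 3-D blob data is illustrative, and the t-SNE panel would follow the same pattern with sklearn.manifold.TSNE (UMAP requires the separate umap-learn package):

```python
# Project the same 3-D dataset down to 2-D with PCA for the gallery walk.
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

X, _ = make_blobs(n_samples=200, n_features=3, centers=3, random_state=0)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print(X_2d.shape)                     # (200, 2)
print(pca.explained_variance_ratio_)  # fraction of variance each component keeps
```

The explained_variance_ratio_ output gives students a number to annotate on the printout: how much of the original spread survives the "compression decision."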

Prepare & details

Explain how unsupervised learning can discover patterns without explicit labels.

Facilitation Tip: For the Gallery Walk of dimensionality reduction, assign each group one technique so they become the 'experts' who explain trade-offs to peers.

Setup: Wall space or tables arranged around room perimeter

Materials: Large paper/poster boards, Markers, Sticky notes for feedback

Understand · Apply · Analyze · Create · Relationship Skills · Social Awareness

Teaching This Topic

Start with the Human K-Means activity to ground the concept in embodied learning. Then use the lab to connect abstract metrics like inertia to concrete decisions. Avoid rushing to applications before students have wrestled with how algorithms 'see' data differently than humans do. Visual and kinesthetic experiences help dispel the misconception that clustering results are objective truths.

What to Expect

Successful learning looks like students articulating why different distance metrics or cluster counts produce different groupings. They should connect mathematical assumptions to real-world outcomes and critique when unsupervised methods are appropriate or misleading.


Watch Out for These Misconceptions

Common Misconception: During Human K-Means Clustering, watch for students assuming the final clusters reveal the 'true' groups in the data.

What to Teach Instead

Pause after two iterations and ask groups to compare their current centroids to the starting points, explicitly naming the mathematical assumption (e.g., Euclidean distance, fixed k) that shaped the shift.

Common Misconception: During the Clustering Unlabeled Data lab, watch for students believing clustering is only useful when labels are unknown.

What to Teach Instead

Ask students to overlay the ground-truth labels (if available) onto their clusters and calculate the adjusted Rand index to quantify overlap, then discuss why mismatch occurs even with clear patterns.
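The overlay-and-score step might be sketched as follows, with synthetic blobs standing in for the lab data:

```python
# Cluster the data, then quantify agreement between the discovered
# clusters and the ground-truth labels with the adjusted Rand index.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

X, y_true = make_blobs(n_samples=300, centers=3, random_state=7)
y_pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# ARI is 1.0 for perfect agreement and near 0 for random labeling;
# it is invariant to how the cluster IDs happen to be numbered.
ari = adjusted_rand_score(y_true, y_pred)
print(f"Adjusted Rand index: {ari:.3f}")
```

Because ARI ignores cluster numbering, students see that "cluster 0" matching "label 2" is not a mismatch, which sharpens the discussion of why overlap can still be imperfect.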

Common Misconception: During the Gallery Walk: Dimensionality Reduction Visualization, watch for students concluding that reduced dimensions always lose critical information.

What to Teach Instead

Have each group plot the same two principal components twice—once with original data and once with synthetic noise added—then ask them to identify which patterns persist, linking variance retention to noise reduction.
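One hedged sketch of the noise experiment, using synthetic five-feature blobs rather than the class dataset:

```python
# Fit PCA on the same data twice: once as-is and once with synthetic
# noise added, then compare how much variance the top two components keep.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X, _ = make_blobs(n_samples=200, n_features=5, centers=3, random_state=0)
X_noisy = X + rng.normal(scale=2.0, size=X.shape)  # synthetic noise

retained = {}
for name, data in [("original", X), ("noisy", X_noisy)]:
    pca = PCA(n_components=2).fit(data)
    retained[name] = pca.explained_variance_ratio_.sum()
    print(f"{name}: top-2 components retain {retained[name]:.1%} of the variance")
```

Isotropic noise spreads variance evenly across all five dimensions, so the top two components retain a smaller fraction of it; the cluster structure they capture, however, persists, which is the point students should identify.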

Assessment Ideas

Quick Check

After the Human K-Means Clustering activity, present students with a new unlabeled scatter plot. Ask them to sketch potential clusters, label their centroids, and explain the distance metric they implicitly used. Collect responses to identify students who default to Euclidean assumptions.

Discussion Prompt

After the Clustering Unlabeled Data lab, pose a scenario where students must choose between k-means and hierarchical clustering for a dataset of patient symptoms. Have them defend their choice using inertia scores and known label overlap, assessing their ability to connect metrics to real decisions.

Exit Ticket

During the Gallery Walk, give each student a sticky note to write one insight about how dimensionality reduction affects cluster separation. Review notes to see if students recognize that noise reduction can sharpen patterns, not just obscure them.

Extensions & Scaffolding

  • Challenge early finishers to rerun the clustering lab using a density-based algorithm (DBSCAN) and compare results to k-means.
  • Scaffolding for struggling students: provide a partially clustered scatter plot and ask them to adjust centroids manually before running code.
  • Deeper exploration: have students apply PCA to the same dataset twice—once with all features and once with only the top two principal components—then visualize both to see how variance retention affects patterns.
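The DBSCAN extension might start from something like this; make_moons and the eps/min_samples values below are illustrative starting points, not prescribed settings:

```python
# Compare k-means and DBSCAN on a dataset whose shape k-means handles
# poorly: two interleaved half-moons.
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
db_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)  # -1 marks noise points

print("k-means clusters:", sorted(set(km_labels.tolist())))
print("DBSCAN clusters: ", sorted(set(db_labels.tolist()) - {-1}))
```

Plotting both label sets makes the contrast visible: k-means slices the moons with a straight boundary, while a density-based method can follow their curved shape.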

Key Vocabulary

Clustering: An unsupervised learning technique that groups data points into clusters based on their similarity, without prior knowledge of group labels.
Centroid: The center of a cluster, typically calculated as the mean of all data points within that cluster; used in algorithms like k-means.
Dimensionality Reduction: A process that reduces the number of random variables under consideration by obtaining a set of principal variables, simplifying data while retaining essential information.
Principal Component Analysis (PCA): A statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
Unlabeled Data: Data that does not have predefined categories or tags, requiring algorithms to discover patterns or structures independently.
