Training Data and Model Evaluation: Activities & Teaching Strategies
Active learning works for this topic because students need to see how data quality and evaluation choices shape real model behavior. Watching a model fail due to overfitting or trusting a high-accuracy metric without context sticks better than abstract lectures.
Learning Objectives
1. Explain the critical role of training data quality in the development of reliable machine learning models.
2. Analyze and compare common metrics such as accuracy, precision, and recall for evaluating AI model performance.
3. Critique the consequences of overfitting and underfitting on a model's ability to generalize to new data.
4. Design a simple experiment to demonstrate the impact of data quantity on model performance.
Think-Pair-Share: Accuracy Isn't Everything
Present a scenario: a disease affects 1% of the population, and a diagnostic AI claims 99% accuracy by always predicting 'healthy.' Ask partners to explain why this is misleading and what metric would be better. After sharing, introduce precision and recall as tools for understanding model behavior on imbalanced datasets.
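The scenario's arithmetic can be shown directly. This sketch uses the numbers from the prompt (1% prevalence, a model that always predicts 'healthy'); note that precision is undefined for this model, since it never makes a positive prediction.

```python
# The scenario from the prompt: 1,000 patients, 1% (10) actually sick,
# and a "model" that always predicts 'healthy'.
n_patients = 1000
n_sick = 10

true_positives = 0          # sick patients flagged sick: none, model never says "sick"
false_negatives = n_sick    # every sick patient is missed
true_negatives = n_patients - n_sick
false_positives = 0

accuracy = (true_positives + true_negatives) / n_patients
# Recall: of all actually sick patients, how many did we catch?
recall = true_positives / (true_positives + false_negatives)

print(f"accuracy = {accuracy:.2%}")  # 99.00% -- looks impressive
print(f"recall   = {recall:.2%}")    # 0.00% -- catches no sick patients at all
```

Seeing a 99% accuracy paired with 0% recall makes the gap between the two metrics concrete before students ever hear the formal definitions.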
Prepare & details
Analyze and compare common metrics such as accuracy, precision, and recall for evaluating AI model performance.
Facilitation Tip: During Think-Pair-Share: Accuracy Isn't Everything, assign one student in each pair to argue for accuracy and the other to critique it using the provided scenario cards.
Setup: Standard classroom seating; students turn to a neighbor
Materials: Discussion prompt (projected or printed), Optional: recording sheet for pairs
Overfitting Experiment
Students train a simple model (using a provided notebook) on progressively smaller subsets of training data while testing on the same fixed test set. They plot training vs. test accuracy as sample size decreases and observe overfitting emerge. Pairs write a paragraph describing what they observed and predicting what would happen with even less data.
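A minimal version of the experiment can be sketched as follows, assuming scikit-learn is available (the dataset and model here are illustrative stand-ins, not the actual provided notebook). An unconstrained decision tree is free to memorize its training set, so the train/test gap widens as the training subset shrinks.

```python
# Sketch of the experiment: shrink the training set, keep the test set fixed.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train_full, X_test, y_train_full, y_test = train_test_split(
    X, y, test_size=500, random_state=0)

# Train on progressively smaller subsets; the test set never changes.
for n in (1500, 500, 100, 30):
    model = DecisionTreeClassifier(random_state=0)  # unconstrained: free to memorize
    model.fit(X_train_full[:n], y_train_full[:n])
    train_acc = model.score(X_train_full[:n], y_train_full[:n])
    test_acc = model.score(X_test, y_test)
    print(f"n={n:5d}  train={train_acc:.2f}  test={test_acc:.2f}")
```

Training accuracy stays at 1.00 throughout, while test accuracy degrades as n shrinks; students can plot these two columns to see the gap emerge.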
Prepare & details
Critique the consequences of overfitting and underfitting on a model's ability to generalize to new data.
Facilitation Tip: When running the Overfitting Experiment, give each group two different model sizes and let them present their validation curves side by side on the same chart.
Setup: Pairs at computers with the provided notebook open
Materials: Provided notebook, fixed test set, plotting template for train vs. test accuracy
Gallery Walk: Critique the AI Claim
Post five printed AI headlines or marketing claims ('Our model achieves 98% accuracy!', 'AI outperforms doctors in diagnosis'). Student groups annotate each with questions they'd need answered before accepting the claim: What is the test set? Is accuracy the right metric? What population was tested? How were edge cases handled? Class discusses which claims hold up to scrutiny.
Prepare & details
Critique the potential pitfalls of overfitting and underfitting in model training.
Facilitation Tip: During the Gallery Walk: Critique the AI Claim, post each claim at a station with a single evaluation metric; students must write why that metric alone is insufficient.
Setup: Wall space or tables arranged around room perimeter
Materials: Large paper/poster boards, Markers, Sticky notes for feedback
Feature Engineering Challenge
Give teams a raw dataset (e.g., raw text strings, timestamps) and ask them to engineer three new features they think would help a model predict a given outcome. Teams present their features and justify why they might be predictive. The class votes on which features they think would most improve the model, then tests the predictions using a provided script.
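A short example helps teams see what "engineering a feature" means in practice. The records and feature names below are hypothetical, not from the activity's dataset; the point is turning raw strings into numbers a model can use.

```python
from datetime import datetime

# Hypothetical raw records: a timestamp string and a free-text comment.
raw = [
    {"timestamp": "2024-03-15T22:41:00", "comment": "URGENT!!! please respond"},
    {"timestamp": "2024-03-16T09:05:00", "comment": "thanks, all good"},
]

def engineer_features(record):
    """Turn raw strings into numeric features a model can consume."""
    ts = datetime.fromisoformat(record["timestamp"])
    text = record["comment"]
    return {
        "hour_of_day": ts.hour,                # time-of-day patterns
        "is_weekend": ts.weekday() >= 5,       # Saturday/Sunday flag
        "comment_length": len(text),           # verbosity
        "exclamation_count": text.count("!"),  # possible urgency signal
    }

for record in raw:
    print(engineer_features(record))
```

Teams can defend each derived feature the same way: name the raw field it came from and the pattern it is meant to capture.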
Prepare & details
Explain the critical role of training data in machine learning model development.
Facilitation Tip: In the Feature Engineering Challenge, provide a dataset with 10 raw features and require teams to submit their code before they can add or drop any.
Setup: Teams at computers with the raw dataset loaded
Materials: Raw dataset files, provided prediction script, feature justification worksheet
Teaching This Topic
Start with concrete examples before theory. Use the same dataset across activities so students see how data choices ripple into evaluation and model behavior. Avoid jumping straight to code; focus first on the reasoning behind each step. Research shows students grasp overfitting better when they see a model’s validation curve move as they change complexity.
What to Expect
Students will explain why accuracy can mislead, recognize overfitting in models, critique claims about AI systems, and justify their own feature choices. They will use evaluation metrics to make decisions, not just report numbers.
Watch Out for These Misconceptions
Common Misconception: During Think-Pair-Share: Accuracy Isn't Everything, watch for students who insist accuracy is always the best metric. Redirect them to the imbalanced dataset card and ask them to calculate precision and recall from the confusion matrix provided.
What to Teach Instead
Use the confusion matrix on the card to recalculate precision and recall. Ask students to compare the two metrics to the reported accuracy and explain why a model with 95% accuracy might miss half the positive cases.
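A worked confusion matrix makes the comparison concrete. The counts below are chosen for illustration, not taken from the activity's card: out of 1,000 cases, 100 are truly positive and the model finds only half of them, yet its accuracy is still 95%.

```python
# Illustrative confusion matrix: 1,000 cases, 100 of them truly positive.
tp, fn = 50, 50    # the model finds only half the real positives
tn, fp = 900, 0    # ...but never raises a false alarm

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"accuracy  = {accuracy:.0%}")   # 95%
print(f"precision = {precision:.0%}")  # 100%
print(f"recall    = {recall:.0%}")     # 50% -- half the positives slip through
```

Students can plug in the numbers from their own card and explain which of the three metrics exposes the missed cases.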
Common Misconception: During the Overfitting Experiment, watch for students who blame overfitting on the model being 'too smart'. Redirect them to the validation curve on their shared screen and ask what happens to training and validation error as model size increases.
What to Teach Instead
Ask students to point out where the validation error starts rising while the training error keeps falling. Then ask them to explain what the model is memorizing instead of learning.
Common Misconception: During the Feature Engineering Challenge, watch for students who add every feature hoping for better results. Redirect them to the performance plot on the whiteboard and ask which features actually improved the score.
What to Teach Instead
Have teams present their feature list and the code that generated it. Ask the class to vote on which features were truly informative and which introduced noise.
Assessment Ideas
After Think-Pair-Share: Accuracy Isn't Everything, collect each pair’s written justification for which evaluation metric matters most in their scenario and one concrete example of a model that would mislead if judged by accuracy alone.
During the Gallery Walk: Critique the AI Claim, ask students to write a short note at each station explaining why the single metric shown is insufficient and what additional metrics or data they would need.
After the Overfitting Experiment, facilitate a whole-class discussion using the prompt: 'If we only had 100 training examples, how would that limit the largest model size we could use without overfitting? Use the validation curves from your experiment to support your answer.'
Extensions & Scaffolding
- Challenge: Ask students to design a new evaluation metric for a scenario where false positives are 100 times more costly than false negatives.
- Scaffolding: Provide a partially completed confusion matrix template and ask students to fill in the missing values before calculating precision and recall.
- Deeper exploration: Have students research and present one regularization technique (L1, L2, dropout) and explain how it changes the training curve in the Overfitting Experiment.
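For the challenge extension, one possible shape for a cost-weighted metric is sketched below. The 100:1 weighting comes from the scenario; the function name and the two example confusion matrices are illustrative, and designing the actual metric is left to the students.

```python
# One way to encode "a false positive costs 100x a false negative":
# a total-cost metric, where lower is better.
def weighted_cost(tp, fp, tn, fn, fp_cost=100, fn_cost=1):
    """Total misclassification cost under asymmetric error weights."""
    return fp * fp_cost + fn * fn_cost

# Two models with identical accuracy (980/1000) but very different costs:
model_a = dict(tp=90, fp=10, tn=890, fn=10)   # aggressive: more false alarms
model_b = dict(tp=81, fp=1, tn=899, fn=19)    # cautious: more misses
print(weighted_cost(**model_a))  # 10*100 + 10 = 1010
print(weighted_cost(**model_b))  # 1*100 + 19 = 119
```

The pair of models shows why the challenge matters: accuracy cannot distinguish them, but the cost-weighted metric strongly prefers the cautious one.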
Key Vocabulary
| Term | Definition |
| --- | --- |
| Training Data | The dataset used to teach a machine learning model patterns and relationships. Its quality directly impacts the model's effectiveness. |
| Feature Engineering | The process of selecting, transforming, and creating features from raw data to improve model performance and accuracy. |
| Accuracy | A metric that measures the proportion of correct predictions made by a model out of the total number of predictions. |
| Precision | A metric that measures the proportion of true positive predictions among all positive predictions made by the model. It answers, 'Of all the times the model predicted X, how often was it correct?' |
| Recall | A metric that measures the proportion of true positive predictions among all actual positive instances. It answers, 'Of all the actual X cases, how many did the model correctly identify?' |
| Overfitting | A phenomenon where a machine learning model learns the training data too well, including its noise and outliers, leading to poor performance on unseen data. |