The Role of Training Data Quality

Active learning works for this topic because students need to experience the consequences of data choices firsthand. Simply discussing automation’s impact won’t help them grasp how training data shapes outcomes. Simulation and investigation tasks let students test their assumptions and see how flawed data leads to real-world problems.

Common Core State Standards · CSTA: 3A-AP-13 · CSTA: 3A-IC-24

20–45 min · Pairs → Whole Class · 3 activities

Activity 01

Simulation Game · 45 min · Whole Class

Simulation Game: The Automation Wave

Assign students different 'jobs' (truck driver, surgeon, artist). Introduce 'AI breakthroughs' one by one. Students must decide if their job is automated, assisted, or unchanged, and then 're-skill' by finding a new role.

Analyze the role of training data quality in the success of an AI model.

Facilitation Tip: For 'Simulation Game: The Automation Wave,' give teams a limited set of job roles and require them to justify their automation predictions using specific data-heavy criteria from the overview.

What to look for: Present students with two short descriptions of AI training datasets for a loan application model. One dataset is described as 'diverse and up-to-date,' the other as 'older and primarily from urban areas.' Ask students to write one sentence explaining which dataset is likely to produce a fairer model and why.
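For classes that are ready to work with code, the dataset comparison can be made concrete. The following is a minimal Python sketch, not part of the original activity: the record layout, the `region` field, the helper names `group_shares` and `representation_gaps`, and the reference shares are all invented for illustration. It compares each group's share of a dataset with its share of the population the model is meant to serve.

```python
from collections import Counter


def group_shares(records, key):
    """Return each group's share of the records as a fraction of the total."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def representation_gaps(sample_shares, reference_shares):
    """Return sample share minus reference share for every reference group."""
    return {
        group: sample_shares.get(group, 0.0) - ref
        for group, ref in reference_shares.items()
    }


# Hypothetical loan-application records: 8 urban, 1 suburban, 1 rural.
older_dataset = (
    [{"region": "urban"}] * 8
    + [{"region": "suburban"}]
    + [{"region": "rural"}]
)

# Made-up reference shares for the population the model should serve.
reference = {"urban": 0.5, "suburban": 0.3, "rural": 0.2}

gaps = representation_gaps(group_shares(older_dataset, "region"), reference)
for group, gap in gaps.items():
    # A positive gap means the group is over-represented in the dataset.
    print(f"{group}: {gap:+.0%}")
```

With these invented numbers, urban applicants are over-represented by about 30 percentage points, while suburban and rural applicants are under-represented. Students can change the counts and watch the gaps shift, which gives them evidence for their one-sentence answer.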

Apply · Analyze · Evaluate · Create · Social Awareness · Decision-Making

Activity 02

Inquiry Circle · 40 min · Small Groups

Inquiry Circle: Industry 4.0

Groups research how a specific industry (like farming or fashion) has changed due to technology over the last 50 years and predict what it will look like in 2050.

Critique the potential biases introduced by poor quality or unrepresentative training data.

Facilitation Tip: During 'Inquiry Circle: Industry 4.0,' assign each group a different industry document to read first, then have them teach their findings to peers to ensure accountability.

What to look for: Facilitate a class discussion using the prompt: 'Imagine you are building an AI to recommend books. What potential biases could exist in your training data, and what specific steps would you take to ensure your data is representative of a wide range of readers?'

Analyze · Evaluate · Create · Self-Management · Self-Awareness

Activity 03

Think-Pair-Share · 20 min · Pairs

Think-Pair-Share: The Un-automatable

Students brainstorm a list of skills they think a robot will *never* be able to do. They pair up to challenge each other's lists and narrow it down to the top three 'human-only' skills.

Design strategies for improving the quality and diversity of training datasets.

Facilitation Tip: In 'Think-Pair-Share: The Un-automatable,' provide a timer for the pair discussion phase to keep the energy focused and prevent off-topic conversations.

What to look for: Provide students with a scenario where an AI chatbot exhibits biased language. Ask them to identify one possible cause related to training data quality and suggest one method to improve the chatbot's responses.

Understand · Apply · Analyze · Self-Awareness · Relationship Skills

A few notes on teaching this unit

Teachers should emphasize that data quality is not just a technical detail but a human-centered issue. Avoid presenting automation as an abstract future event; ground discussions in current real-world examples from students’ potential career fields. Research shows that students grasp complex systems better when they analyze concrete, relatable cases rather than theoretical scenarios.

Successful learning looks like students identifying which job tasks are automatable and explaining why training data quality matters. They should connect dataset characteristics to model fairness and articulate clear steps to improve data quality. Discussions should reflect nuanced understanding beyond initial misconceptions.

Watch Out for These Misconceptions

During 'Simulation Game: The Automation Wave,' watch for students assuming automation will eliminate all jobs permanently. Redirect them by pointing to the simulation’s output showing how new roles emerge from automation.
During 'Inquiry Circle: Industry 4.0,' have students compare job postings from 10 years ago and today in their assigned industry. Ask them to identify tasks that no longer exist and new ones that have appeared, reinforcing the idea that automation transforms rather than erases jobs.

More in The Impact of Artificial Intelligence

Machine Learning vs. Traditional Programming

Students will understand how machine learning differs from traditional rule-based programming.

Supervised and Unsupervised Learning

Students will understand how computers learn from examples through supervised and unsupervised learning.

AI Creativity and Mimicry

Students will discuss whether a computer can truly be creative or if it is just mimicking patterns.

Sources of Algorithmic Bias

Students will analyze how human prejudices can be encoded into software and the resulting social impact.

Ethical Decision-Making in AI

Students will discuss ethical dilemmas faced by AI systems and the importance of human oversight.
