The Role of Training Data Quality: Activities & Teaching Strategies

Active learning works for this topic because students need to experience the consequences of data choices firsthand. Simply discussing automation’s impact won’t help them grasp how training data shapes outcomes. Simulation and investigation tasks let students test their assumptions and see how flawed data leads to real-world problems.

9th Grade·Computer Science·3 activities·20-45 min

Learning Objectives

  1. Analyze how the quality and quantity of training data impact the performance and fairness of an AI model.
  2. Critique specific examples of AI bias resulting from unrepresentative or inaccurate training datasets.
  3. Design a plan to identify and mitigate bias in a given AI training dataset.
  4. Explain the ethical considerations related to data collection and its use in AI model training.

45 min·Whole Class

Simulation Game: The Automation Wave

Assign students different 'jobs' (truck driver, surgeon, artist). Introduce 'AI breakthroughs' one by one. Students must decide if their job is automated, assisted, or unchanged, and then 're-skill' by finding a new role.

Analyze the role of training data quality in the success of an AI model.

Facilitation Tip: During 'Simulation Game: The Automation Wave,' give teams a limited set of job roles and require them to justify their automation predictions using specific data-heavy criteria from the overview.

Setup: Flexible space for group stations

Materials: Role cards with goals/resources, Game currency or tokens, Round tracker

Apply·Analyze·Evaluate·Create·Social Awareness·Decision-Making

40 min·Small Groups

Inquiry Circle: Industry 4.0

Groups research how a specific industry (like farming or fashion) has changed due to technology over the last 50 years and predict what it will look like in 2050.

Critique the potential biases introduced by poor quality or unrepresentative training data.

Facilitation Tip: During 'Inquiry Circle: Industry 4.0,' assign each group a different industry document to read first, then have them teach their findings to peers to ensure accountability.

Setup: Groups at tables with access to source materials

Materials: Source material collection, Inquiry cycle worksheet, Question generation protocol, Findings presentation template

Analyze·Evaluate·Create·Self-Management·Self-Awareness

20 min·Pairs

Think-Pair-Share: The Un-automatable

Students brainstorm a list of tasks they think a robot will *never* be able to do. They pair up to challenge each other's lists and agree on the top three 'human-only' skills.

Design strategies for improving the quality and diversity of training datasets.

Facilitation Tip: In 'Think-Pair-Share: The Un-automatable,' provide a timer for the pair discussion phase to keep the energy focused and prevent off-topic conversations.

Setup: Standard classroom seating; students turn to a neighbor

Materials: Discussion prompt (projected or printed), Optional: recording sheet for pairs

Understand·Apply·Analyze·Self-Awareness·Relationship Skills

Teaching This Topic

Teachers should emphasize that data quality is not just a technical detail but a human-centered issue. Avoid presenting automation as an abstract future event; ground discussions in current real-world examples from students’ potential career fields. Research shows that students grasp complex systems better when they analyze concrete, relatable cases rather than theoretical scenarios.

What to Expect

Successful learning looks like students identifying which job tasks are automatable and explaining why training data quality matters. They should connect dataset characteristics to model fairness and articulate clear steps to improve data quality. Discussions should reflect nuanced understanding beyond initial misconceptions.
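For teachers who want a concrete demonstration of how dataset makeup can affect fairness, the following is a minimal Python sketch, not a real AI model. Everything in it is invented for illustration: the groups "A" and "B", the scores, and the assumed rule that the correct cutoff is 50 for group A and 30 for group B. The "model" simply learns one cutoff from a training set dominated by group A, and its accuracy is then reported per group.

```python
def learn_threshold(train):
    """Pick the single cutoff that makes the fewest mistakes on the training set."""
    candidates = range(0, 101, 5)

    def errors(t):
        return sum((x >= t) != label for _, x, label in train)

    return min(candidates, key=errors)


def accuracy_by_group(data, threshold):
    """Fraction of correct predictions, reported separately for each group."""
    correct, total = {}, {}
    for group, x, label in data:
        total[group] = total.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + ((x >= threshold) == label)
    return {g: correct[g] / total[g] for g in total}


# Invented rule for illustration: the correct cutoff differs between groups.
TRUE_CUTOFF = {"A": 50, "B": 30}


def make_examples(group, scores):
    """Build (group, score, correct_label) tuples from the invented rule."""
    return [(group, x, x >= TRUE_CUTOFF[group]) for x in scores]


# Skewed training set: 20 examples from group A, only 2 from group B.
train_set = make_examples("A", range(0, 100, 5)) + make_examples("B", [20, 40])
# Balanced test set: 20 examples per group.
test_set = make_examples("A", range(2, 100, 5)) + make_examples("B", range(2, 100, 5))

threshold = learn_threshold(train_set)
print(threshold)                               # 50
print(accuracy_by_group(test_set, threshold))  # {'A': 1.0, 'B': 0.8}
```

The learned cutoff fits group A perfectly because group A supplied almost all of the training examples, while group B is misclassified one time in five. Students can change how many group B examples appear in the training set and observe how the learned cutoff and the per-group accuracy respond.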

Watch Out for These Misconceptions

Common Misconception

During 'Simulation Game: The Automation Wave,' watch for students assuming automation will eliminate all jobs permanently. Redirect them by pointing to the simulation’s output showing how new roles emerge from automation.

What to Teach Instead

During 'Inquiry Circle: Industry 4.0,' have students compare job postings from 10 years ago and today in their assigned industry. Ask them to identify tasks that no longer exist and new ones that have appeared, reinforcing the idea that automation transforms rather than erases jobs.

Assessment Ideas

Quick Check

After 'Simulation Game: The Automation Wave,' present students with two short descriptions of AI training datasets for a loan application model. One dataset is described as 'diverse and up-to-date,' the other as 'older and primarily from urban areas.' Ask students to write one sentence explaining which dataset is likely to produce a fairer model and why.
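One way to make "primarily from urban areas" measurable is to compare each group's share of the dataset with its share of the population the model is meant to serve. The sketch below is a minimal illustration in Python. The ten toy records, the "area" field, and the 50/50 reference shares are all made-up assumptions, not real lending or census figures.

```python
from collections import Counter


def group_shares(records, key):
    """Return each group's share of the records as a fraction of the total."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def representation_gaps(records, key, reference_shares):
    """Dataset share minus reference share; negative means under-represented."""
    shares = group_shares(records, key)
    return {g: shares.get(g, 0.0) - ref for g, ref in reference_shares.items()}


# Made-up toy dataset: 9 urban applicants and 1 rural applicant.
urban_heavy = [{"area": "urban"} for _ in range(9)] + [{"area": "rural"}]
# Made-up reference population: half urban, half rural.
reference = {"urban": 0.5, "rural": 0.5}

gaps = representation_gaps(urban_heavy, "area", reference)
print({g: round(v, 2) for g, v in gaps.items()})  # {'urban': 0.4, 'rural': -0.4}
```

A large negative gap flags a group that the dataset under-represents. It does not prove the resulting model is unfair, but it tells students where to look first.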

Discussion Prompt

During 'Think-Pair-Share: The Un-automatable,' facilitate a class discussion using the prompt: 'Imagine you are building an AI to recommend books. What potential biases could exist in your training data, and what specific steps would you take to ensure your data is representative of a wide range of readers?' Listen for students to name concrete biases (e.g., over-representation of bestsellers) and propose data collection strategies.
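If students propose "use fewer bestseller ratings" as a step, the idea can be sketched as capping how many records any one category contributes. This is only one possible mitigation, shown under invented assumptions: the `book_type` field, the 80/15/5 split, and the cap of 10 are hypothetical values chosen for the example.

```python
import random
from collections import Counter, defaultdict


def cap_per_group(records, key, cap, seed=0):
    """Keep at most `cap` randomly chosen records from each group."""
    rng = random.Random(seed)  # fixed seed so the demo is repeatable
    by_group = defaultdict(list)
    for r in records:
        by_group[r[key]].append(r)
    balanced = []
    for items in by_group.values():
        if len(items) > cap:
            items = rng.sample(items, cap)
        balanced.extend(items)
    return balanced


# Made-up toy ratings: bestsellers dominate the raw data.
ratings = (
    [{"book_type": "bestseller", "rating_id": i} for i in range(80)]
    + [{"book_type": "small press", "rating_id": i} for i in range(80, 95)]
    + [{"book_type": "translated", "rating_id": i} for i in range(95, 100)]
)

balanced = cap_per_group(ratings, "book_type", cap=10)
print(dict(Counter(r["book_type"] for r in balanced)))
# {'bestseller': 10, 'small press': 10, 'translated': 5}
```

Capping throws data away and cannot create records for a group that has too few, so it is worth asking students when collecting more data from under-represented readers would be the better step.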

Exit Ticket

After 'Inquiry Circle: Industry 4.0,' provide students with a scenario where an AI chatbot exhibits biased language. Ask them to identify one possible cause related to training data quality and suggest one method to improve the chatbot's responses.

Extensions & Scaffolding

  • Challenge: Have students research a specific job they’re interested in and create a one-page proposal for how training data could be improved for an AI tool in that field.
  • Scaffolding: Provide sentence starters for the 'Think-Pair-Share' activity, such as 'One task that is hard to automate is _____ because _____.'
  • Deeper: Invite a local professional in a data-driven field (e.g., healthcare analytics, supply chain management) to discuss how training data quality impacts their daily work.

Key Vocabulary

Training Data: The dataset used to teach an AI model patterns and relationships. The model learns from this data to make predictions or decisions.
Data Bias: Systematic errors or prejudices in a dataset that can lead an AI model to produce unfair or discriminatory outcomes.
Representative Data: A dataset that accurately reflects the diversity and characteristics of the real-world population or phenomenon the AI model is intended to serve.
Data Cleaning: The process of detecting and correcting or removing corrupt, inaccurate, or irrelevant records from a dataset used for AI training.
Algorithmic Fairness: The principle that AI systems should not create or perpetuate unjust discrimination against individuals or groups, often achieved through careful data management.
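
To show what data cleaning can look like in practice, here is a minimal Python sketch that removes records with missing values, impossible values, or exact duplicates. The field names (`age`, `income`), the valid ranges, and the five toy records are assumptions invented for the example. Real projects define these rules from their own data.

```python
def clean_records(records, required_fields, valid_ranges):
    """Drop records with missing fields, out-of-range values, or exact duplicates."""
    seen = set()
    cleaned = []
    for r in records:
        # Missing data: a required field is absent or None.
        if any(r.get(f) is None for f in required_fields):
            continue
        # Inaccurate data: a value is missing or falls outside its plausible range.
        out_of_range = False
        for field, (low, high) in valid_ranges.items():
            value = r.get(field)
            if value is None or not (low <= value <= high):
                out_of_range = True
                break
        if out_of_range:
            continue
        # Duplicate data: an identical record was already kept.
        fingerprint = tuple(sorted(r.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(r)
    return cleaned


# Made-up toy records with one duplicate, one missing value, one impossible age.
raw = [
    {"age": 34, "income": 52000},
    {"age": 34, "income": 52000},    # exact duplicate
    {"age": None, "income": 41000},  # missing age
    {"age": 212, "income": 38000},   # impossible age
    {"age": 58, "income": 61000},
]

cleaned = clean_records(raw, ["age", "income"], {"age": (0, 120)})
print(len(cleaned))  # 2
```

Cleaning rules are themselves choices that can introduce bias. For example, dropping every record with a missing field may remove more records from one group than another, which is a useful discussion point for students.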