Computer Science · 12th Grade

Active learning ideas

Introduction to Data Science Workflow

Active learning works for this topic because students need to experience firsthand how messy, human-centered decisions shape every stage of the data science workflow. When students clean data, debate categories, or interpret visualizations, they confront the real challenges of turning raw information into meaningful insight.

Standards: CSTA 3B-DA-05 · CCSS.ELA-LITERACY.RST.11-12.7
20–50 min · Pairs → Whole Class · 3 activities

Activity 01

Inquiry Circle · 50 min · Small Groups

Inquiry Circle: Bias in the Data

Provide groups with a dataset used for a fictional 'college admissions AI' that contains historical biases (e.g., favoring certain zip codes). Students must find the patterns that lead to unfair outcomes and propose a way to 'clean' or adjust the data to ensure equity.
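One way groups might start looking for unfair patterns is to compare outcome rates across a grouping variable such as zip code. The sketch below is illustrative only: the records, the zip codes, and the function name `admission_rate_by_group` are invented for this example and are not part of any provided dataset.

```python
from collections import defaultdict

# Hypothetical records for the fictional admissions dataset: (zip_code, admitted).
records = [
    ("10001", True), ("10001", True), ("10001", True), ("10001", False),
    ("20002", True), ("20002", False), ("20002", False), ("20002", False),
]


def admission_rate_by_group(rows):
    """Return {group: share of rows in that group that were admitted}."""
    totals = defaultdict(int)
    admitted = defaultdict(int)
    for group, was_admitted in rows:
        totals[group] += 1
        if was_admitted:
            admitted[group] += 1
    return {group: admitted[group] / totals[group] for group in totals}


rates = admission_rate_by_group(records)
for zip_code, rate in sorted(rates.items()):
    print(zip_code, f"{rate:.0%}")
```

A large gap between groups does not prove bias on its own, but it gives students a concrete pattern to question: what else differs between these zip codes, and who decided which records were collected?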

Explain the iterative nature of the data science workflow and its key stages.

Facilitation Tip: During Inquiry Circle: Bias in the Data, circulate and listen for groups that conflate 'common' with 'correct' when identifying bias in datasets, then ask them to justify their claims with examples from the data.

What to look for: Present students with a short, messy dataset (e.g., a CSV with inconsistent formatting and missing entries). Ask them to identify at least three specific cleaning steps needed and explain why each step is important for accurate analysis.
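For teachers who want a concrete reference point, the following is a minimal sketch of three cleaning steps applied to a tiny CSV. The data, column names, and the `clean_rows` function are all made up for illustration; a real classroom dataset will likely need different steps.

```python
import csv
import io

# Hypothetical messy CSV: stray whitespace, inconsistent capitalization,
# a missing value, and a row that is a duplicate once normalized.
RAW = """name,city,age
 Ana ,boston,17
Ben,BOSTON,
Ana,Boston,17
Cy,chicago ,18
"""


def clean_rows(text):
    """Apply three cleaning steps and return a list of cleaned dict rows."""
    cleaned, seen = [], set()
    for row in csv.DictReader(io.StringIO(text)):
        # Step 1: trim whitespace and standardize capitalization.
        name = row["name"].strip()
        city = row["city"].strip().title()
        # Step 2: record missing ages explicitly instead of guessing a value.
        age_text = row["age"].strip()
        age = int(age_text) if age_text else None
        # Step 3: drop rows that duplicate an earlier row after normalization.
        key = (name, city, age)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "city": city, "age": age})
    return cleaned


rows = clean_rows(RAW)
for row in rows:
    print(row)
```

Each step is a judgment call worth discussing. For example, keeping the missing age as `None` rather than filling in an average is a choice that changes what later analysis can claim.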

Analyze · Evaluate · Create · Self-Management · Self-Awareness

Activity 02

Gallery Walk · 45 min · Individual

Gallery Walk: Data Visualizations

Students take a raw dataset and create a visualization (chart, map, or infographic) that tells a specific story. They display their work around the room, and peers use a 'See-Think-Wonder' protocol to evaluate what the data is saying and what might be missing.

Analyze the importance of data cleaning and preprocessing in ensuring reliable insights.

Facilitation Tip: For the Gallery Walk: Data Visualizations, post guiding questions at each station to push students beyond 'it looks pretty' to 'what pattern does this reveal, and why?'

What to look for: Pose the scenario: 'A city wants to use data from traffic cameras to optimize traffic light timing.' Ask students to discuss: What types of data would be acquired? What are the potential ethical concerns regarding privacy? How would they communicate their findings to city officials?

Understand · Apply · Analyze · Create · Relationship Skills · Social Awareness

Activity 03

Think-Pair-Share · 20 min · Pairs

Think-Pair-Share: Correlation vs. Causation

Present students with 'spurious correlations' (e.g., ice cream sales and shark attacks). Students work in pairs to explain why the two quantities are correlated even though neither causes the other, and then share their own examples of how Big Data might lead to false conclusions if it is not interpreted correctly.
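A short simulation can make the role of a hidden third variable visible. In the sketch below, all numbers are invented: monthly temperature drives both series, and neither series is computed from the other. The `pearson` helper is a plain implementation of the Pearson correlation coefficient written for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up monthly average temperatures (the hidden common cause).
temperature = [5, 8, 12, 18, 24, 29, 31, 30, 25, 17, 10, 6]

# Both series depend only on temperature, never on each other.
ice_cream_sales = [40 + 3 * t for t in temperature]
shark_attacks = [t // 6 for t in temperature]

r = pearson(ice_cream_sales, shark_attacks)
print(f"correlation between ice cream sales and shark attacks: {r:.2f}")
```

The correlation comes out strongly positive even though the code contains no link from one series to the other. Pairs can use this as a prompt: what plays the role of `temperature` in their own examples?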

Design a basic data science project plan for a given real-world problem.

Facilitation Tip: In Think-Pair-Share: Correlation vs. Causation, deliberately pair students with opposing initial interpretations so they must reconcile their differences using evidence from the dataset.

What to look for: On an index card, have students list the four main stages of the data science workflow in order. For each stage, ask them to write one sentence describing a key activity or challenge associated with it.

Understand · Apply · Analyze · Self-Awareness · Relationship Skills

A few notes on teaching this unit

Approach this topic by treating data science as a human practice, not just a technical skill. Teach students to question every step, from data collection to final claims, by modeling your own skepticism during demonstrations. Avoid rushing to tools before students understand what those tools are actually doing to the data. Students tend to grasp the Four Vs of big data (volume, velocity, variety, and veracity) better when they grapple with the concrete consequences of each V, such as velocity overwhelming analysis or poor veracity making predictions unreliable.

Successful learning looks like students recognizing that data is not neutral, questioning the stories charts tell, and justifying their reasoning with evidence from datasets. By the end of these activities, students should articulate why workflow steps matter and how to avoid common pitfalls like confusing correlation with causation.


Watch Out for These Misconceptions

  • During Inquiry Circle: Bias in the Data, watch for students who assume larger datasets automatically correct for bias because they include more examples.

    Use the dataset’s metadata and collection context to help students notice that even large datasets can encode bias if the original sampling excluded certain groups or measured irrelevant variables.

  • During Inquiry Circle: Bias in the Data, watch for students who believe data is neutral if it comes from 'official' sources like government records.

    Have students trace a single variable’s journey from collection to publication, highlighting the human choices in defining categories, setting thresholds, and omitting outliers.


Methods used in this brief