Computer Science · 9th Grade

Active learning ideas

Ethical Data Scraping and Privacy

Active learning helps students confront the real-world tensions between data utility and privacy head-on. When students scrape data or build models, they quickly see how choices affect individuals, which builds lasting ethical awareness. Role-playing and structured discussion make abstract privacy risks tangible and memorable.

Common Core State Standards · CSTA: 3A-DA-11 · CSTA: 3A-IC-24
25–40 min · Pairs → Whole Class · 3 activities

Activity 01

Simulation Game · 40 min · Small Groups

Simulation Game: The Mystery Predictor

Give students a 'training' dataset (e.g., shoe size vs. reading level in elementary students). They build a simple 'model' to predict one from the other, then test it against a 'hidden' dataset to see if their prediction holds up.
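For classes that want to see the train-then-test idea in code, here is a minimal sketch. The numbers are invented for illustration and are not real student data; the helper names `fit_line` and `mean_abs_error` are hypothetical. It fits a straight line to the training pairs and then measures how far off the predictions are on the hidden pairs.

```python
# Hypothetical classroom data: (shoe size, reading level) pairs, invented for illustration.
training = [(1, 1.0), (2, 1.8), (3, 3.1), (4, 3.9), (5, 5.2)]
hidden = [(2, 2.2), (4, 4.1), (6, 5.8)]


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(points, slope, intercept):
    """Average distance between each prediction and the actual value."""
    return sum(abs(y - (slope * x + intercept)) for x, y in points) / len(points)


# Build the "model" from the training data only.
slope, intercept = fit_line(training)

# Compare how well it does on data it has seen versus data it has not.
train_error = mean_abs_error(training, slope, intercept)
hidden_error = mean_abs_error(hidden, slope, intercept)

print(f"model: reading = {slope:.2f} * shoe_size + {intercept:.2f}")
print(f"average error on training data: {train_error:.2f}")
print(f"average error on hidden data:   {hidden_error:.2f}")
```

A prediction that holds up on the hidden data still says nothing about cause: in the shoe size example, both values plausibly rise with age.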

Critique the ethical considerations of scraping data from public websites.

Facilitation Tip: During The Mystery Predictor, circulate and listen for students to name the specific third variable that explains a spurious correlation (for example, summer heat driving up both ice cream sales and drownings).

What to look for: Pose the following scenario: 'A student wants to build a website that aggregates job postings from various company career pages. What ethical questions should they consider before they start scraping these sites? What are the potential privacy risks for job applicants?' Facilitate a class discussion around their responses.
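One concrete question from this scenario is whether a site has asked automated tools to stay out of certain pages. The sketch below uses Python's standard `urllib.robotparser` on a made-up robots.txt; the site `careers.example.com`, the paths, and the agent name `StudentJobAggregator` are all hypothetical. Passing this check is only one consideration: a site's terms of service and the privacy of the people described in the data still need separate thought.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt for a hypothetical careers site (not a real company's file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /applicants/
Crawl-delay: 10
"""

parser = RobotFileParser()
# Parse the rules from the inline text so the example needs no network access.
parser.parse(ROBOTS_TXT.splitlines())

AGENT = "StudentJobAggregator"  # hypothetical name for the student's scraper

# Public job postings are not listed under Disallow.
can_read_jobs = parser.can_fetch(AGENT, "https://careers.example.com/jobs/123")

# Applicant pages are explicitly off limits in this example file.
can_read_applicants = parser.can_fetch(AGENT, "https://careers.example.com/applicants/42")

# Seconds the site asks scrapers to wait between requests, if it says so.
delay = parser.crawl_delay(AGENT)

print("May fetch job postings:", can_read_jobs)
print("May fetch applicant pages:", can_read_applicants)
print("Requested delay between requests (seconds):", delay)
```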

Apply · Analyze · Evaluate · Create · Social Awareness · Decision-Making

Activity 02

Formal Debate · 30 min · Small Groups

Formal Debate: Correlation vs. Causation

Present several 'spurious correlations' (e.g., ice cream sales and shark attacks). Groups must argue whether there is a causal link, a hidden third variable, or if it is just a coincidence.
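To give debaters something to test their arguments against, the following sketch simulates a hidden third variable. All numbers are invented: temperature is made to drive both an ice cream sales figure and a shark encounter index, and the two never influence each other. The helper `pearson` is a hypothetical name for a plain correlation calculation.

```python
import random

random.seed(0)  # fixed seed so every group sees the same numbers


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


# Simulated days: temperature drives both quantities (invented relationships).
days = []
for _ in range(5000):
    temp = random.uniform(10, 35)                # degrees Celsius
    ice_cream = 20 * temp + random.gauss(0, 50)  # sales depend on temperature only
    sharks = 0.3 * temp + random.gauss(0, 1)     # encounter index depends on temperature only
    days.append((temp, ice_cream, sharks))

# Across all days the two look strongly linked.
overall_r = pearson([d[1] for d in days], [d[2] for d in days])

# Among days with nearly the same temperature, the link mostly disappears.
similar = [d for d in days if 20 <= d[0] < 22]
within_band_r = pearson([d[1] for d in similar], [d[2] for d in similar])

print(f"correlation across all days: {overall_r:.2f}")
print(f"correlation on days between 20 and 22 degrees: {within_band_r:.2f}")
```

Holding the suspected third variable roughly constant and checking whether the link survives is one way groups can support the "hidden variable" position with evidence.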

Justify the importance of data privacy in the context of data collection.

Facilitation Tip: For the Correlation vs. Causation debate, assign roles explicitly (affirmative, negative, and moderator) to keep the discussion focused on evidence rather than opinion.

What to look for: Present students with two hypothetical scenarios: Scenario A involves scraping publicly available, non-personal data like weather patterns. Scenario B involves scraping user profiles from a social media site without explicit consent. Ask students to write one sentence explaining which scenario raises more significant privacy concerns and why.

Analyze · Evaluate · Create · Self-Management · Decision-Making

Activity 03

Think-Pair-Share · 25 min · Pairs

Think-Pair-Share: Model Ethics

Students read a short case study about an algorithm used to predict which students might drop out of school. They discuss the benefits and the potential dangers of relying on such a model.

Predict the potential negative impacts of unauthorized data collection.

Facilitation Tip: In the Model Ethics think-pair-share, prompt pairs to swap written responses so they can compare justifications before sharing with the whole class.

What to look for: Ask students to define 'Personally Identifiable Information (PII)' in their own words and list two examples. Then, have them write one sentence explaining why protecting PII is crucial when collecting data.

Understand · Apply · Analyze · Self-Awareness · Relationship Skills

A few notes on teaching this unit

Teachers should frame ethics as a design constraint, not an add-on. Start with familiar tools students already use, then layer in privacy concepts. Research shows that students grasp abstract rules better when they see immediate consequences, so activities should surface real dilemmas early. Avoid long lectures; instead, let students experience the tension and then reflect together.

Students will articulate why correlation does not imply causation, identify ethical pitfalls in data collection, and justify privacy safeguards with concrete examples. Success looks like clear explanations, respectful debate, and thoughtful written justifications tied to the activities.


Watch Out for These Misconceptions

  • During The Mystery Predictor, watch for students who assume that because two variables appear linked, one must cause the other.

    Use the activity’s spurious correlation examples (like shoe size and reading ability) and ask teams to brainstorm a third hidden factor that could explain the pattern.

  • During The Mystery Predictor, watch for students who believe that a model which fits the training data perfectly will also work on new data.

    Have teams test their predictor on a separate dataset they haven’t seen before and discuss why performance often drops when conditions change.
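The second misconception can be made concrete with a small sketch, assuming invented data in which the true pattern is y = 2x and the training measurements carry some noise. A "memorizer" that simply stores every training answer is compared with a fitted straight line; the names `memorizer`, `fit_line`, and `mean_abs_error` are hypothetical.

```python
# Invented data: the underlying pattern is y = 2x; training values include noise.
training = [(1, 2.5), (2, 3.5), (3, 6.6), (4, 7.4), (5, 10.5)]
new_data = [(1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0), (5, 10.0)]

# Model A: memorize every training answer exactly (a perfect fit to the training data).
lookup = dict(training)


def memorizer(x):
    return lookup[x]


# Model B: a least-squares straight line, which cannot fit the noise exactly.
def fit_line(points):
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(training)


def line(x):
    return slope * x + intercept


def mean_abs_error(model, points):
    """Average distance between the model's prediction and the actual value."""
    return sum(abs(y - model(x)) for x, y in points) / len(points)


memorizer_train = mean_abs_error(memorizer, training)
memorizer_new = mean_abs_error(memorizer, new_data)
line_train = mean_abs_error(line, training)
line_new = mean_abs_error(line, new_data)

print(f"memorizer: training error {memorizer_train:.2f}, new-data error {memorizer_new:.2f}")
print(f"line:      training error {line_train:.2f}, new-data error {line_new:.2f}")
```

In this made-up case the memorizer looks flawless on the training data yet does worse than the line on new data, because it stored the noise along with the pattern.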


Methods used in this brief