Computer Science · 11th Grade · Data Structures and Management · Weeks 1-9

Ethical Considerations in Data Collection

Examining the privacy, consent, and bias issues inherent in collecting and storing large datasets.

Common Core State Standards · CSTA: 3B-IC-24 · CSTA: 3B-IC-25

About This Topic

Ethical data collection is central to CSTA standards 3B-IC-24 and 3B-IC-25, which ask students to evaluate the social and ethical implications of data systems. In 11th grade, students move beyond simply understanding how data is collected to asking whether it should be collected at all, and under what conditions. This topic is directly relevant to their lives as users of social platforms, apps, and government services that constantly gather personal information.

In the US K-12 context, this topic connects naturally to ongoing policy debates around student data privacy laws (FERPA, COPPA) and broader conversations about commercial data brokers. Students often assume that using a free service means they have consented to all possible data uses, a common misconception that structured analysis can correct. Examining real cases like Cambridge Analytica or school district student monitoring software grounds abstract ethics concepts in tangible events.

Active learning is especially productive here because ethical reasoning requires students to weigh competing values, not memorize facts. Structured deliberation formats like philosophical chairs or case-based role plays give students practice articulating and defending positions while hearing perspectives different from their own.

Key Questions

  1. Analyze the ethical implications of collecting and storing personal data.
  2. Differentiate between informed consent and implied consent in data collection.
  3. Predict the potential societal impact of widespread data collection without proper safeguards.

Learning Objectives

  • Analyze the ethical trade-offs between data utility and individual privacy in a given scenario.
  • Evaluate the validity of consent mechanisms used by popular online services based on established privacy principles.
  • Critique the potential for algorithmic bias to emerge from specific data collection practices.
  • Propose safeguards to mitigate ethical risks associated with collecting sensitive personal data.

Before You Start

Introduction to Data Types and Structures

Why: Students need a foundational understanding of what data is and how it can be organized to discuss collection and storage.

Basic Principles of Algorithms

Why: Understanding that algorithms process data is essential for grasping how collection practices can lead to bias or other ethical issues.

Key Vocabulary

Informed Consent: Permission granted by an individual after being fully informed about how their data will be collected, used, and protected.
Implied Consent: Permission that is not expressly granted but is inferred from an individual's actions or inaction, often in less sensitive contexts.
Data Minimization: The practice of collecting only the data that is strictly necessary for a specific, defined purpose.
Algorithmic Bias: Systematic and repeatable errors in a computer system that create unfair outcomes, such as favoring one arbitrary group of users over others.
Data Broker: A company that collects and sells personal information about individuals, often gathered from public records and online activity.
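
For classes that want to see data minimization in code, the following is a minimal Python sketch, not a production pattern. The weather-app scenario, the field names, and the `minimize` helper are all invented for illustration; the idea is simply that a program keeps an explicit allowlist of fields tied to its purpose and discards everything else before storing a record.

```python
# Hypothetical scenario: a weather app that only needs a coarse location.
# REQUIRED_FIELDS is an assumed, purpose-specific allowlist.
REQUIRED_FIELDS = {"city", "temperature_unit"}


def minimize(record: dict, required: set) -> dict:
    """Keep only the fields on the allowlist; drop everything else."""
    return {key: value for key, value in record.items() if key in required}


# Invented sign-up data that asks for far more than the app needs.
signup_form = {
    "city": "Denver",
    "temperature_unit": "F",
    "birthdate": "2008-04-12",
    "contacts": ["555-0100", "555-0101"],
    "precise_gps": (39.7392, -104.9903),
}

stored = minimize(signup_form, REQUIRED_FIELDS)
print(stored)  # {'city': 'Denver', 'temperature_unit': 'F'}
```

A useful discussion question is who decides what belongs on the allowlist, and what would have to change in the app's purpose to justify adding a field.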

Watch Out for These Misconceptions

Common Misconception: Free services do not collect much meaningful data.

What to Teach Instead

Free services are typically funded by advertising revenue that depends on detailed user profiling. The data a user generates can be worth more to the provider than a subscription fee would be. Data audit activities help students see the scope of what is collected even in simple apps.

Common Misconception: Checking a terms-of-service box counts as informed consent.

What to Teach Instead

True informed consent requires that people actually understand what they are agreeing to. Long, complex terms of service written in legal language rarely meet that standard. Analyzing real TOS excerpts alongside plain-language summaries makes this distinction concrete.

Common Misconception: Bias in datasets only matters when someone actively intends to discriminate.

What to Teach Instead

Data can reflect and perpetuate historical inequities even when no one intends to discriminate. Algorithms trained on biased datasets produce biased outputs automatically. Examining documented cases of algorithmic bias in hiring or criminal justice helps students understand this mechanism.
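
The mechanism described above can be shown with a deliberately tiny Python sketch. Everything here is an assumption made for teaching purposes: the hiring records are invented, the "model" is just a per-neighborhood hire rate, and real hiring systems are far more complex. The point is that no line of the code mentions discrimination, yet the program reproduces the skew in its training data.

```python
from collections import defaultdict

# Invented historical hiring records: (neighborhood, was_hired).
# Assume past decisions favored neighborhood "A" for reasons unrelated to skill.
history = [
    ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]


def train(records):
    """Learn the historical hire rate for each neighborhood."""
    hired = defaultdict(int)
    total = defaultdict(int)
    for neighborhood, was_hired in records:
        total[neighborhood] += 1
        hired[neighborhood] += int(was_hired)
    return {n: hired[n] / total[n] for n in total}


def predict(model, neighborhood):
    """Recommend an applicant when the learned rate is at least 50%."""
    return model[neighborhood] >= 0.5


model = train(history)
print(model)                # {'A': 0.75, 'B': 0.25}
print(predict(model, "A"))  # True
print(predict(model, "B"))  # False
```

Two equally qualified applicants get different recommendations purely because of where earlier hires lived, which is a starting point for discussing what data a hiring tool should be allowed to collect in the first place.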

Real-World Connections

  • Healthcare providers must navigate HIPAA regulations when collecting patient data, balancing the need for comprehensive medical history with strict privacy requirements to protect sensitive health information.
  • Social media platforms like TikTok and Meta collect vast amounts of user data, raising ongoing debates about the adequacy of their consent agreements and the potential for misuse of personal information for targeted advertising or other purposes.
  • Law enforcement agencies sometimes use facial recognition technology, which relies on large datasets of images, prompting discussions about privacy violations and the potential for biased identification of individuals.

Assessment Ideas

Discussion Prompt

Present students with a scenario: A school district wants to implement AI-powered software to monitor student engagement during online classes. Ask: What data would this software likely collect? What are the potential benefits? What are the major ethical concerns regarding privacy and consent? How could the school ensure informed consent from students and parents?

Quick Check

Provide students with short descriptions of two different data collection methods (e.g., a fitness tracker app asking for location data vs. a weather app asking for general location). Ask them to identify which scenario is more likely to rely on informed consent versus implied consent and to explain their reasoning in one to two sentences for each.

Exit Ticket

Ask students to write down one potential source of bias in a dataset used for hiring algorithms and one specific strategy to mitigate that bias. They should also briefly explain why data minimization is an important ethical principle.

Frequently Asked Questions

What is the difference between informed consent and implied consent in data collection?
Informed consent means a person actively agrees to specific data practices after receiving clear explanations. Implied consent assumes agreement based on context or behavior, like continued use of a service. The distinction matters legally and ethically, especially when sensitive data is involved, and where to draw the line remains an active debate in US privacy law.
What does FERPA protect for students?
FERPA (Family Educational Rights and Privacy Act) protects the privacy of student education records. It gives parents and eligible students the right to access, review, and request correction of those records, and generally restricts schools from sharing them without written consent, subject to specific exceptions. It applies to schools that receive federal funding, which includes most US public schools.
How can data collection cause harm if individual data points seem harmless?
Even individually harmless data points can become sensitive when combined. Location data alone can reveal where someone worships, receives medical care, or attends political meetings. This aggregation problem means assessing risk requires looking at the full dataset and its possible combinations, not just individual fields.
How does active learning help students think through data ethics?
Data ethics involves weighing competing values like convenience, privacy, and public safety, which requires practice with structured argument, not memorization. Role-play and deliberation activities give students a low-stakes environment to develop and pressure-test their own positions before they encounter real decisions as citizens and future developers.
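
The aggregation problem raised in the FAQ above can also be demonstrated in a few lines of Python. This is a classroom sketch under stated assumptions: the place names, the category lookup, the location log, and the `infer_profile` helper are all invented, and real location analysis would rely on map data and far larger logs. It shows how individually harmless location entries, once counted and combined with a lookup table, suggest sensitive facts about a person.

```python
# Invented lookup of place categories (a real system would use map data).
PLACE_CATEGORIES = {
    "Main St Clinic": "medical care",
    "Elm St Mosque": "place of worship",
    "City Library": "public library",
    "Union Hall": "political meeting",
}

# Invented location log for one user: (day, place).
# Each entry on its own looks harmless.
pings = [
    ("Mon", "City Library"),
    ("Tue", "Main St Clinic"),
    ("Fri", "Elm St Mosque"),
    ("Tue", "Main St Clinic"),
    ("Fri", "Elm St Mosque"),
]


def infer_profile(log, categories, min_visits=2):
    """Count visits per place and report the categories visited repeatedly."""
    counts = {}
    for _day, place in log:
        counts[place] = counts.get(place, 0) + 1
    return sorted(categories[p] for p, n in counts.items() if n >= min_visits)


print(infer_profile(pings, PLACE_CATEGORIES))
# ['medical care', 'place of worship']
```

Students can extend the log or the lookup table and discuss which safeguards (coarser locations, shorter retention, fewer fields) would make the inference harder.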