Computer Science · 11th Grade · Data Structures and Management · Weeks 1-9

Ethical Considerations in Data Collection

Examining the privacy, consent, and bias issues inherent in collecting and storing large datasets.

Common Core State Standards · CSTA: 3B-IC-24 · CSTA: 3B-IC-25

About This Topic

Ethical data collection is central to CSTA standards 3B-IC-24 and 3B-IC-25, which ask students to evaluate the social and ethical implications of data systems. In 11th grade, students move beyond simply understanding how data is collected to asking whether it should be collected at all, and under what conditions. This topic is directly relevant to their lives as users of social platforms, apps, and government services that constantly gather personal information.

In the US K-12 context, this topic connects naturally to ongoing policy debates around student data privacy laws (FERPA, COPPA) and broader conversations about commercial data brokers. Students often assume that using a free service means they have consented to all possible data uses, a common misconception that structured analysis can correct. Examining real cases like Cambridge Analytica or school district student monitoring software grounds abstract ethics concepts in tangible events.

Active learning is especially productive here because ethical reasoning requires students to weigh competing values, not memorize facts. Structured deliberation formats like philosophical chairs or case-based role plays give students practice articulating and defending positions while hearing perspectives different from their own.

Key Questions

Analyze the ethical implications of collecting and storing personal data.
Differentiate between informed consent and implied consent in data collection.
Predict the potential societal impact of widespread data collection without proper safeguards.

Learning Objectives

Analyze the ethical trade-offs between data utility and individual privacy in a given scenario.
Evaluate the validity of consent mechanisms used by popular online services based on established privacy principles.
Critique the potential for algorithmic bias to emerge from specific data collection practices.
Propose safeguards to mitigate ethical risks associated with collecting sensitive personal data.

Before You Start

Introduction to Data Types and Structures

Why: Students need a foundational understanding of what data is and how it can be organized to discuss collection and storage.

Basic Principles of Algorithms

Why: Understanding that algorithms process data is essential for grasping how collection practices can lead to bias or other ethical issues.

Key Vocabulary

Informed Consent	Permission granted by an individual after being fully informed about how their data will be collected, used, and protected.
Implied Consent	Permission that is not expressly granted but is inferred from an individual's actions or inaction, often in less sensitive contexts.
Data Minimization	The practice of collecting only the data that is strictly necessary for a specific, defined purpose.
Algorithmic Bias	Systematic and repeatable errors in a computer system that create unfair outcomes, such as favoring one arbitrary group of users over others.
Data Broker	A company that collects and sells personal information about individuals, often gathered from public records and online activity.
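
Data minimization can be shown in a few lines of code. The sketch below is only an illustration: the signup record, the field names, and the `minimize` helper are all hypothetical, and it assumes the stated purpose is account login.

```python
# Hypothetical example: keep only the fields needed for one stated purpose.
REQUIRED_FIELDS = {"username", "email"}  # assumed purpose: account login

def minimize(record, required=REQUIRED_FIELDS):
    """Return a copy of the record containing only the required fields."""
    return {key: value for key, value in record.items() if key in required}

# Made-up form submission that asks for more than login needs.
signup_form = {
    "username": "jlee",
    "email": "jlee@example.com",
    "birthdate": "2008-04-12",
    "home_address": "12 Oak St",
    "contacts_list": ["a@example.com", "b@example.com"],
}

stored = minimize(signup_form)
print(stored)  # {'username': 'jlee', 'email': 'jlee@example.com'}
```

Students can change `REQUIRED_FIELDS` and debate which additions a given purpose actually justifies.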

Watch Out for These Misconceptions

Common Misconception: Free services do not collect much meaningful data.

What to Teach Instead

Free services are typically funded by advertising revenue that depends on detailed user profiling. The data users generate can be worth more to the provider than a subscription fee would be. Data audit activities help students see the scope of what is collected even in simple apps.

Common Misconception: Checking a terms-of-service box counts as informed consent.

What to Teach Instead

True informed consent requires that people actually understand what they are agreeing to. Long, complex terms of service written in legal language rarely meet that standard. Analyzing real TOS excerpts alongside plain-language summaries makes this distinction concrete.

Common Misconception: Bias in datasets only matters when someone actively intends to discriminate.

What to Teach Instead

Data can reflect and perpetuate historical inequities even when no one intends to discriminate. Algorithms trained on biased datasets tend to reproduce that bias in their outputs unless it is detected and corrected. Examining documented cases of algorithmic bias in hiring or criminal justice helps students understand this mechanism.
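
A toy example can make this mechanism visible. The sketch below uses fabricated data and a deliberately simple "model" that learns each group's past hire rate. The groups, the numbers, and the `train` and `predict` functions are all assumptions for illustration; real systems are more complex and usually pick up group membership indirectly through proxy features.

```python
from collections import defaultdict

# Fabricated toy data for illustration only: (group, was_hired) pairs
# standing in for a company's past hiring decisions.
history = (
    [("A", True)] * 8 + [("A", False)] * 2 +
    [("B", True)] * 3 + [("B", False)] * 7
)

def train(records):
    """'Learn' the historical hire rate for each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [hired, total]
    for group, hired in records:
        counts[group][0] += int(hired)
        counts[group][1] += 1
    return {group: hired / total for group, (hired, total) in counts.items()}

def predict(model, group, threshold=0.5):
    """Recommend a candidate if their group's past hire rate meets the threshold."""
    return model[group] >= threshold

model = train(history)
print(model)                # {'A': 0.8, 'B': 0.3}
print(predict(model, "A"))  # True
print(predict(model, "B"))  # False
```

No line of this code expresses an intent to discriminate, yet the predictions repeat the pattern in the historical records.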

Active Learning Ideas

Philosophical Chairs: Should Schools Track Student Device Activity?

Students take positions for and against a school district's policy of monitoring all student internet activity on school devices. They physically move to sides of the room based on their stance, respond to arguments from the other side, and may change position as their thinking evolves. A class debrief identifies which arguments were most persuasive and why.

30 min · Whole Class

Case Study Analysis: Data Broker Audit

Small groups research a real data broker company and map out what data is collected, how it is obtained, who it is sold to, and what consent model is used. Groups present findings and the class compares consent practices across different brokers to surface patterns.

40 min · Small Groups

Think-Pair-Share: Informed vs. Implied Consent

Present three real-world scenarios (a fitness app, a loyalty card program, a hospital intake form). Students individually classify each as informed or implied consent, then compare their reasoning with a partner before a whole-class discussion that surfaces edge cases.

20 min · Pairs

Design Sprint: Privacy-First Data Collection Policy

Groups draft a one-page data collection policy for a hypothetical school app, specifying what data is collected, why, who can access it, and how long it is retained. Groups swap drafts and provide written critique, then revise before a brief share-out.

35 min · Small Groups
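
Groups that want to go further could express their policy as data that a program can check. The following is a minimal sketch under stated assumptions: the field names, roles, purposes, and retention periods are invented for a hypothetical school app, and `FieldPolicy`, `can_access`, and `is_expired` are illustrative helpers rather than part of any real system.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class FieldPolicy:
    purpose: str         # why the field is collected
    allowed_roles: set   # who may read it
    retention_days: int  # how long it may be kept

# Hypothetical policy for a hypothetical school app.
POLICY = {
    "attendance": FieldPolicy("track class attendance", {"teacher", "registrar"}, 365),
    "device_location": FieldPolicy("recover lost devices", {"it_admin"}, 30),
}

def can_access(field_name, role):
    """A role may read a field only if the policy lists it for that field."""
    policy = POLICY.get(field_name)
    return policy is not None and role in policy.allowed_roles

def is_expired(field_name, collected_on, today):
    """Data is expired once it is older than the field's retention period."""
    limit = timedelta(days=POLICY[field_name].retention_days)
    return today - collected_on > limit

print(can_access("device_location", "teacher"))                           # False
print(is_expired("device_location", date(2024, 1, 1), date(2024, 3, 1)))  # True
print(is_expired("attendance", date(2024, 1, 1), date(2024, 3, 1)))       # False
```

Fields that are not listed in `POLICY` are denied by default, which mirrors the idea that uncollected or unjustified data should not be accessible at all.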

Real-World Connections

Healthcare providers must navigate HIPAA regulations when collecting patient data, balancing the need for comprehensive medical history with strict privacy requirements to protect sensitive health information.
Social media companies like TikTok and Meta collect vast amounts of user data, raising ongoing debates about whether their consent agreements are adequate and whether personal information could be misused for targeted advertising or other purposes.
Law enforcement agencies sometimes use facial recognition technology, which relies on large datasets of images, prompting discussions about privacy violations and the potential for biased identification of individuals.

Assessment Ideas

Discussion Prompt

Present students with a scenario: A school district wants to implement AI-powered software to monitor student engagement during online classes. Ask: What data would this software likely collect? What are the potential benefits? What are the major ethical concerns regarding privacy and consent? How could the school ensure informed consent from students and parents?

Quick Check

Provide students with short descriptions of two different data collection methods (e.g., a fitness tracker app asking for location data vs. a weather app asking for general location). Ask them to identify which scenario is more likely to rely on informed consent versus implied consent and to explain their reasoning in one to two sentences for each.

Exit Ticket

Ask students to write down one potential source of bias in a dataset used for hiring algorithms and one specific strategy to mitigate that bias. They should also briefly explain why data minimization is an important ethical principle.

Frequently Asked Questions

What is the difference between informed consent and implied consent in data collection?

Informed consent means a person actively agrees to specific data practices after receiving clear explanations. Implied consent assumes agreement based on context or behavior, like continued use of a service. The distinction matters legally and ethically, especially when sensitive data is involved, and is a live debate in US privacy law.
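
One way to picture the difference in a program is to record consent per purpose instead of treating use of the service as blanket agreement. This is a simplified sketch, not a legal model: the `ConsentRecord` class, the user name, and the purpose labels are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ConsentRecord:
    user: str
    granted_purposes: set = field(default_factory=set)

    def grant(self, purpose):
        """Record that the user explicitly agreed to one specific purpose."""
        self.granted_purposes.add(purpose)

    def allows(self, purpose):
        """Permit a use only if that exact purpose was explicitly granted."""
        return purpose in self.granted_purposes

consent = ConsentRecord("jlee")
consent.grant("order_shipping")

print(consent.allows("order_shipping"))  # True
print(consent.allows("ad_targeting"))    # False
```

Under an implied-consent design, the second check might instead return `True` simply because the user kept using the service.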

What does FERPA protect for students?

FERPA (Family Educational Rights and Privacy Act) protects the privacy of student education records. It gives parents and eligible students the right to access, review, and request correction of those records, and generally restricts schools from disclosing them without written consent, subject to certain exceptions. It applies to schools that receive federal funding, which includes most US public schools.

How can data collection cause harm if individual data points seem harmless?

Even individually harmless data points can become sensitive when combined. Location data alone can reveal where someone worships, receives medical care, or attends political meetings. This aggregation problem means assessing risk requires looking at the full dataset and its possible combinations, not just individual fields.
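
The aggregation problem can be sketched with a small join. Everything below is made up for illustration: the coordinates, the lookup table of place categories, and the `infer_visits` helper are assumptions, not real data or a real API.

```python
# Made-up location pings: (latitude, longitude) rounded to 3 decimal places.
pings = [(40.712, -74.006), (40.730, -73.991), (40.712, -74.006)]

# Hypothetical lookup table mapping coordinates to a place category.
places = {
    (40.712, -74.006): "oncology clinic",
    (40.730, -73.991): "place of worship",
}

def infer_visits(pings, places):
    """Count visits per place category by joining pings with the lookup table."""
    visits = {}
    for ping in pings:
        category = places.get(ping)
        if category is not None:
            visits[category] = visits.get(category, 0) + 1
    return visits

print(infer_visits(pings, places))
# {'oncology clinic': 2, 'place of worship': 1}
```

Neither dataset says anything about health or religion on its own; the sensitive inference only appears once the two are combined.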

How does active learning help students think through data ethics?

Data ethics involves weighing competing values like convenience, privacy, and public safety, which requires practice with structured argument, not memorization. Role-play and deliberation activities give students a low-stakes environment to develop and pressure-test their own positions before they encounter real decisions as citizens and future developers.
