Causation vs. Correlation
Distinguishing situations where a correlation reflects a causal relationship from those where it does not.
About This Topic
The distinction between correlation and causation is one of the most practically important concepts in 9th-grade statistics, and one of the most frequently ignored in US media, health reporting, and political discourse. Students who can reliably make this distinction have a foundational tool for evaluating claims they will encounter throughout their lives. The CCSS standard HSS.ID.C.9 asks students to apply this distinction to specific data examples, not just to state it as an abstract principle.
The mechanism that explains most spurious correlations is the confounding variable: a third variable that causally influences both of the variables being compared, making them appear causally linked when neither actually affects the other. Teaching students to ask "What else might explain this connection?" is the key habit of mind this topic builds.
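The confounding mechanism can be demonstrated with a short simulation. The sketch below is illustrative only: the temperature range, slopes, and noise levels are invented assumptions, not real data. Temperature drives both simulated ice cream sales and simulated drownings, which never influence each other. The two still correlate strongly, and the correlation largely disappears once the linear effect of temperature is removed from each.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after subtracting its best-fit line on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the class sees the same numbers each run

# Assumed, made-up model: temperature is the confounder.
temperature = [rng.uniform(10, 35) for _ in range(500)]
# Both variables depend on temperature plus independent noise,
# and neither appears in the other's formula.
ice_cream = [20 * t + rng.gauss(0, 60) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 1.5) for t in temperature]

r_raw = pearson_r(ice_cream, drownings)
# Correlate what remains after temperature's linear effect is removed.
r_controlled = pearson_r(residuals(ice_cream, temperature),
                         residuals(drownings, temperature))

print(f"r (ice cream vs. drownings): {r_raw:.2f}")
print(f"r after controlling for temperature: {r_controlled:.2f}")
```

Students can change the slopes or noise levels and predict how each correlation will respond before rerunning the code.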
Active learning is particularly effective here because compelling examples exist everywhere and students genuinely enjoy arguing about them. Structured debate or structured academic controversy, where students argue both sides of a causal claim using real data, develops the analytical skills that make the distinction durable rather than a formula recalled on a test.
Key Questions
- Differentiate between correlation and causation with real-world examples.
- Analyze common pitfalls in assuming causation from correlation.
- Construct an argument for or against a causal link based on given data.
Learning Objectives
- Analyze provided datasets to identify potential correlations between two variables.
- Evaluate common media claims for causal links, identifying confounding variables or alternative explanations.
- Construct a written argument, supported by data, for or against a causal relationship between two specific phenomena.
- Compare and contrast the definitions of correlation and causation using real-world scenarios.
Before You Start
Why: Students need to be able to read and interpret basic graphs and data tables to identify patterns and relationships between variables.
Why: Understanding basic statistical measures helps students interpret the data they will use to explore correlations.
Key Vocabulary
| Correlation | A statistical measure that describes the extent to which two variables change together. A strong correlation means that as one variable changes, the other tends to change in a predictable way. |
| Causation | The relationship between cause and effect, where one event (the cause) directly produces another event (the effect). |
| Confounding Variable | A variable that influences both the dependent variable and independent variable, causing a spurious association. It is an 'extra' variable that is not accounted for. |
| Spurious Correlation | A correlation between two variables that appears to reflect a direct relationship but is actually due to coincidence or a third, unobserved variable. |
Watch Out for These Misconceptions
Common Misconception: A strong correlation coefficient (r close to 1 or -1) proves that one variable causes the other.
What to Teach Instead
Correlation strength measures how consistently variables move together, not why. Presenting well-known spurious high correlations, such as the correlation between per capita cheese consumption and deaths from bedsheet tangling, makes this viscerally clear. Laughing at the example together and then analyzing why the correlation exists without any causal connection makes for an effective and memorable correction.
Common Misconception: If a study is published or reported in the news, its causal claims have been verified.
What to Teach Instead
Observational studies can establish correlation but generally cannot, on their own, establish causation. The strongest support for a causal claim comes from controlled experiments with random assignment, because random assignment balances confounding variables across groups on average. Students who learn to ask whether a study was experimental or observational have a powerful and practical filter for evaluating media claims. Partner analysis of real news summaries builds this habit efficiently.
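The difference between observational and experimental evidence can also be simulated. The sketch below uses a hypothetical tutoring scenario with invented numbers; it is an assumption made for illustration, not a real study. Tutoring has no effect on scores in this model. When motivated students choose tutoring themselves, a large score gap appears anyway. When tutoring is assigned by coin flip, the gap shrinks toward zero.

```python
import random


def mean(values):
    return sum(values) / len(values)


def mean_difference(in_group, outcomes):
    """Mean outcome of the flagged group minus mean outcome of the rest."""
    treated = [y for flag, y in zip(in_group, outcomes) if flag]
    control = [y for flag, y in zip(in_group, outcomes) if not flag]
    return mean(treated) - mean(control)


rng = random.Random(1)  # fixed seed for repeatable classroom results
n = 2000

# Hypothetical confounder: student motivation (made-up scale).
motivation = [rng.gauss(0, 1) for _ in range(n)]
# Scores depend on motivation only; tutoring is deliberately absent
# from this formula, so its true effect is zero.
score = [70 + 8 * m + rng.gauss(0, 5) for m in motivation]

# Observational setting: more motivated students tend to choose tutoring.
chose_tutoring = [m + rng.gauss(0, 1) > 0 for m in motivation]
# Experimental setting: a coin flip decides who gets tutoring.
assigned_tutoring = [rng.random() < 0.5 for _ in range(n)]

obs_gap = mean_difference(chose_tutoring, score)
rand_gap = mean_difference(assigned_tutoring, score)

print(f"Score gap when students self-select: {obs_gap:.1f}")
print(f"Score gap under random assignment: {rand_gap:.1f}")
```

The design choice worth discussing is that the scores never change between the two comparisons. Only the way students end up in the tutoring group differs, which is the point of asking whether a study was observational or experimental.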
Common Misconception: Common sense reliably tells us when correlation implies causation.
What to Teach Instead
Many spurious correlations feel intuitively plausible until the confounding variable is identified. Students who rely on intuition rather than systematic analysis are vulnerable to motivated reasoning. Presenting counterintuitive examples where the intuitive causal story is demonstrably wrong builds appropriate epistemic humility about gut-level causal judgments.
Active Learning Ideas
Inquiry Circle: Find the Confounding Variable
Present groups with three real-world spurious correlations such as ice cream sales correlating with drowning rates, or shoe size correlating with reading ability in children. Groups identify the likely confounding variable for each, explain how it drives both variables, and present their reasoning to the class for critique.
Think-Pair-Share: Does This Prove Causation?
Show a news headline that claims a causal relationship based on a correlational study. Students individually write whether they accept the causal claim and why, then compare with a partner. Pairs that disagree discuss what additional evidence would be needed to establish causation rather than just correlation.
Structured Academic Controversy: Did X Cause Y?
Assign pairs a position (causal or non-causal) for a specific data relationship. Each pair prepares a two-minute argument, then the four-person group hears both sides and works toward a consensus about what the data does and does not support. Groups share their final position and reasoning with the class.
Gallery Walk: Evaluate the Claim
Post four data visualizations from real published studies around the room. Students rotate, evaluate each claim for causal validity, and write on sticky notes one piece of evidence that would strengthen or weaken the causal argument. Groups read previous groups' notes and add a response if they disagree.
Real-World Connections
- Medical researchers often observe correlations between lifestyle factors and disease rates, and must rule out confounding variables before drawing causal conclusions. A classic illustration of the problem is the correlation between ice cream sales and drowning incidents: both are likely caused by a third factor, hot weather.
- Economists analyze data for correlations between economic indicators like unemployment rates and consumer spending. They must be careful to distinguish correlation from causation when proposing policy changes.
- Marketers might see a correlation between advertising campaigns and product sales. However, they need to consider other factors like seasonality, competitor actions, or economic conditions before concluding the ads caused the sales increase.
Assessment Ideas
Present students with three scenarios: 1) A clear causal link (e.g., flipping a light switch and the light turning on). 2) A strong correlation with a likely confounding variable (e.g., the number of firefighters at a fire and the amount of damage). 3) A spurious correlation (e.g., the number of pirates and global warming). Ask students to label each as causation, correlation with a confounding variable, or spurious correlation, and briefly explain their reasoning for the latter two.
Pose the question: 'If two things are correlated, does that mean one causes the other?' Facilitate a class discussion using student-generated examples. Prompt students to ask: 'What else could be causing this?' or 'Is there another explanation?'
Provide students with a news headline that implies causation from correlation (e.g., 'Study Shows Coffee Drinkers Live Longer'). Ask them to write one sentence explaining why this headline might be misleading and suggest one question they would ask to investigate further.