Correlation and Causation
Understanding the difference between correlation and causation in bivariate data.
About This Topic
Correlation describes the strength and direction of association between two variables in bivariate data, often shown in scatterplots and measured by coefficients from -1 to 1. Year 10 students distinguish this from causation, where one variable directly causes changes in another. They analyze patterns, such as the link between study hours and test scores, and recognize that correlation alone cannot prove cause and effect.
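As a rough illustration of how such a coefficient is calculated, the sketch below computes Pearson's r for a small study-hours and test-scores dataset. The data values and the names `pearson_r`, `study_hours`, and `test_scores` are invented for this example and do not come from any real study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical class data, made up for illustration only.
study_hours = [1, 2, 3, 4, 5, 6, 7, 8]
test_scores = [52, 55, 61, 60, 68, 72, 75, 80]

r = pearson_r(study_hours, test_scores)
print(f"r = {r:.2f}")
```

A value close to 1 indicates a strong positive linear association. On its own it says nothing about whether the extra study caused the higher scores.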
Aligned with AC9M10ST01, this topic strengthens statistical investigations by introducing confounding variables, like exercise influencing both diet and health outcomes. Students evaluate real-world data from sources like Australian Bureau of Statistics reports on crime rates and ice cream sales, both peaking in summer due to temperature. These exercises build data literacy and skepticism toward oversimplified claims in news or advertising.
Active learning suits this topic well. When students debate causal claims from graphs, hunt for lurking variables in datasets, or generate their own bivariate examples, they internalize the distinction through trial and error. Group analysis uncovers shared errors, while peer teaching reinforces precise reasoning.
Key Questions
- Explain why correlation does not necessarily imply causation between two variables.
- Analyze real-world examples where correlation is mistaken for causation.
- Justify the importance of considering confounding variables in statistical analysis.
Learning Objectives
- Explain why a strong correlation between two variables does not automatically mean one causes the other.
- Analyze real-world scenarios to identify instances where correlation is incorrectly interpreted as causation.
- Evaluate the role of confounding variables in obscuring or creating apparent relationships in bivariate data.
- Critique statistical claims made in media or advertising by distinguishing between correlation and causation.
Before You Start
Why: Students need to be able to interpret scatterplots and understand how to visually represent the relationship between two quantitative variables.
Why: Understanding how to read and interpret tables and graphs is essential for analyzing statistical information and identifying patterns.
Key Vocabulary
| Correlation | A statistical measure that describes the extent to which two variables change together. It indicates the strength and direction of a linear relationship. |
| Causation | A relationship where a change in one variable directly produces or brings about a change in another variable. |
| Confounding Variable | A variable, often unmeasured, that influences both the independent and dependent variables, potentially creating a spurious correlation. |
| Spurious Correlation | An apparent relationship between two variables that is actually due to coincidence or to the influence of a third, unobserved factor. |
Watch Out for These Misconceptions
Common Misconception: A strong positive correlation always means one variable causes the other.
What to Teach Instead
Correlation shows association, but causation requires evidence like controlled experiments. Hands-on graphing of confounders, such as temperature in summer sales data, helps students visualize lurking influences. Peer debates expose this flaw through counterexamples.
Common Misconception: No correlation means no causal relationship exists.
What to Teach Instead
Absence of correlation in observed data does not rule out causation: a confounder can mask a real effect, and a non-linear relationship can produce a coefficient near zero. Group investigations of datasets reveal hidden patterns, while role-playing scenarios build understanding that an observed correlation is neither necessary nor sufficient for a causal claim.
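One way to make this concrete is a relationship that is fully determined yet shows no linear correlation. The sketch below is an invented example (the helper `pearson_r` and the data are assumptions for illustration): every y value is produced directly from x by squaring, but because the pattern is symmetric and non-linear, the coefficient comes out at zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative data: y depends entirely on x, but not in a straight line.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.2f}")
```

Plotting these points gives a clear U shape, which is a useful prompt for discussing why students should always look at the scatterplot as well as the coefficient.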
Common Misconception: Correlation direction determines which variable causes the other.
What to Teach Instead
Direction indicates association type, not causal order. Students clarify this by swapping axes in pair scatterplot activities, prompting discussions on reverse causation or bidirectionality. Collaborative hypothesis testing strengthens analytical precision.
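The axis-swapping activity can also be checked numerically. In this minimal sketch (hypothetical data, with `pearson_r` written out as an assumed helper), the coefficient is the same whichever variable is placed on the horizontal axis, so the number cannot indicate which variable is the cause.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical survey data, invented for illustration only.
hours_of_sleep = [5, 6, 6, 7, 8, 8, 9]
mood_rating = [3, 4, 5, 6, 6, 8, 8]

r_sleep_then_mood = pearson_r(hours_of_sleep, mood_rating)
r_mood_then_sleep = pearson_r(mood_rating, hours_of_sleep)

print(f"sleep on x-axis: r = {r_sleep_then_mood:.3f}")
print(f"mood on x-axis:  r = {r_mood_then_sleep:.3f}")
```

Because both calls give the same value, any claim about direction (sleep improving mood, or mood improving sleep) has to come from evidence outside the coefficient.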
Active Learning Ideas
Jigsaw: Spurious Correlations
Assign small groups one real-world example, such as per capita cheese consumption and deaths from becoming tangled in bedsheets. Groups research data, identify possible confounders or decide the link is coincidental, and create posters. Regroup into jigsaw groups so each expert teaches peers, followed by a class vote on the most convincing case. Conclude with shared scatterplot sketches.
Scatterplot Debates: Pairs Challenge Claims
Pairs receive a scatterplot with a causal headline, like 'More parks cause lower obesity.' They list evidence for and against causation, then debate with another pair. Switch roles and vote on strongest arguments using correlation coefficient criteria.
Data Detective Hunt: Whole Class Analysis
Project three datasets from Australian sources, such as rainfall and crop yields. Class brainstorms causal hypotheses in a shared digital whiteboard, then identifies confounders via think-pair-share. Tally votes and discuss experimental design needs.
Simulation Stations: Confounding Variables
Set up stations with props: one for ice cream/shark attacks (weather confounder), another for homework/grades (parental involvement). Groups rotate, model with graphs, and predict coefficient changes if confounder is controlled. Share insights in plenary.
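For teachers who want a numerical version of the ice cream station, the following sketch simulates the situation. Everything here is an assumption made for illustration: the data are randomly generated, the slopes and noise levels are arbitrary, and "controlling" for temperature is done by removing each variable's straight-line trend on temperature and correlating what is left over (one common approach, not the only one).

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def residuals(ys, xs):
    """What is left of ys after removing its least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]

random.seed(0)  # fixed seed so the simulated data is reproducible

# Simulated days: temperature drives both variables; neither affects the other.
temperature = [random.uniform(15, 40) for _ in range(500)]
ice_cream_sales = [10 * t + random.gauss(0, 20) for t in temperature]
shark_attack_index = [0.5 * t + random.gauss(0, 2) for t in temperature]

r_raw = pearson_r(ice_cream_sales, shark_attack_index)
r_controlled = pearson_r(residuals(ice_cream_sales, temperature),
                         residuals(shark_attack_index, temperature))

print(f"ignoring temperature:        r = {r_raw:.2f}")
print(f"controlling for temperature: r = {r_controlled:.2f}")
```

Students can compare the two printed values with the predictions they made at the station: the raw coefficient is strongly positive, while the controlled one sits close to zero because temperature was the only link built into the simulation.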
Real-World Connections
- Public health officials in Sydney must be careful not to assume that increased ice cream sales directly cause higher rates of drowning. Both are correlated with warmer weather, which is the confounding variable.
- Market researchers analyzing sales data for a new smartphone app might observe a correlation between downloads and user engagement. They need to investigate whether other factors, like targeted advertising campaigns, are driving engagement rather than the download itself.
- Economists studying the relationship between education levels and income in Australia need to account for factors like socioeconomic background and geographic location, which can influence both education attainment and earning potential.
Assessment Ideas
Present students with a graph showing a strong positive correlation between the number of firefighters at a fire and the amount of damage caused. Ask: 'Does this graph prove that sending more firefighters causes more damage? Why or why not? What other factors might explain this relationship?'
Provide students with three brief statements, each describing a correlation. For example: 'A study shows that students who eat breakfast perform better on tests.' Ask students to write one sentence for each statement explaining if it demonstrates correlation or causation, and to identify a potential confounding variable if applicable.
Ask students to define correlation and causation in their own words. Then, have them provide one example of a correlation they have observed (or heard about) and explain why it might not be a causal relationship.
Frequently Asked Questions
What are effective real-world examples for teaching correlation vs causation?
How do I explain confounding variables to Year 10 students?
How can active learning help students understand correlation and causation?
Why is distinguishing correlation from causation key in Year 10 Maths?