Scatter Graphs and Correlation
Students will construct and interpret scatter graphs, identifying types of correlation and drawing lines of best fit.
About This Topic
Bivariate data involves looking at two variables simultaneously to see if there is a relationship between them. In Year 9, students learn to construct scatter graphs, identify types of correlation (positive, negative, or none), and draw lines of best fit. This is a crucial part of the Statistics attainment targets, teaching students how to interpret data and make evidence-based predictions.
A key lesson in this topic is that 'correlation does not imply causation': just because two things happen together doesn't mean one causes the other. Students also learn the difference between interpolation (predicting within the data range) and extrapolation (predicting outside it). This topic particularly benefits from collaborative investigations where students collect their own data, as it gives them ownership over the variables and a deeper interest in the results.
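The distinction between interpolation and extrapolation can be made concrete with a few lines of Python. This is only a minimal sketch: the function name `classify_prediction` and the sample data are hypothetical, chosen for illustration rather than taken from this resource.

```python
def classify_prediction(x_values, x_new):
    """Label a prediction by where x_new sits relative to the observed x range."""
    lowest, highest = min(x_values), max(x_values)
    # Inside the observed range (endpoints included): interpolation.
    # Outside it: extrapolation, which is generally less reliable.
    return "interpolation" if lowest <= x_new <= highest else "extrapolation"


# Hypothetical data: hours studied by six students.
hours_studied = [1, 2, 3, 5, 6, 8]

print(classify_prediction(hours_studied, 4))   # interpolation
print(classify_prediction(hours_studied, 12))  # extrapolation
```

A check like this could sit alongside any prediction students read off their line of best fit, prompting them to state how far they trust it.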
Key Questions
- Differentiate between positive, negative, and no correlation on a scatter graph.
- Analyze whether a strong correlation between two variables always implies causation.
- Construct a line of best fit and justify its position on a scatter graph.
Learning Objectives
- Construct scatter graphs to represent bivariate data sets.
- Analyze scatter graphs to identify and classify correlation as positive, negative, or none.
- Evaluate the strength of correlation shown on a scatter graph.
- Create a line of best fit on a scatter graph and justify its position.
- Distinguish between correlation and causation using examples.
Before You Start
Why: Students need to be able to accurately plot points on a Cartesian grid to construct scatter graphs.
Why: Familiarity with representing data visually helps students understand the purpose and construction of different graph types, including scatter graphs.
Key Vocabulary
| Term | Definition |
| --- | --- |
| Bivariate Data | Data that consists of two variables, allowing for the investigation of relationships between them. |
| Scatter Graph | A graph that displays values for two variables for a set of data, with the values shown as a collection of points. |
| Correlation | The statistical relationship between two variables, indicating whether they tend to move together (positive), in opposite directions (negative), or show no consistent pattern (none). |
| Line of Best Fit | A straight line drawn on a scatter graph that best represents the trend of the data points, used for prediction. |
| Causation | The relationship where one event directly causes another event to occur. |
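Students at this level classify correlation by eye, but a teacher may want a numerical check when preparing data sets. The sketch below computes Pearson's correlation coefficient, a standard measure that this resource does not itself introduce. The function names, the 0.3 cut-off, and the sample measurements are all assumptions made for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient; assumes both lists actually vary."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe_correlation(r, threshold=0.3):
    """Map r to the three labels used in class; the threshold is an arbitrary choice."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "none"


# Hypothetical class measurements in centimetres.
heights = [150, 155, 160, 165, 170]
arm_spans = [148, 156, 159, 166, 171]

print(describe_correlation(pearson_r(heights, arm_spans)))  # positive
```

Values of r close to 1 or -1 correspond to points lying tightly along a line, which gives a way to discuss the strength of a correlation as well as its direction.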
Watch Out for These Misconceptions
Common Misconception: Thinking the line of best fit must pass through the origin or connect the first and last points.
What to Teach Instead
The line should follow the overall direction of the points, with roughly equal numbers of points above and below it. Using a transparent ruler and 'peer-reviewing' each other's lines helps students understand that the line represents the *trend*, not a path between specific dots.
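One way to check a hand-drawn line is to compare it with a calculated one. The sketch below uses least-squares regression, which is a swapped-in technique: students draw the line by eye, and least squares is not the method this resource teaches. The function names and the data set are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares straight line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the mean point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def count_above_below(xs, ys, slope, intercept):
    """Count how many data points lie strictly above and strictly below the line."""
    above = sum(1 for x, y in zip(xs, ys) if y > slope * x + intercept)
    below = sum(1 for x, y in zip(xs, ys) if y < slope * x + intercept)
    return above, below


# Hypothetical data: hours studied and test scores for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 60, 68, 70]

slope, intercept = least_squares_line(hours, scores)
print(round(slope, 2), round(intercept, 2))
print(count_above_below(hours, scores, slope, intercept))
```

For this made-up data the intercept is far from zero, so the line does not pass through the origin, and the points split evenly on either side of it. Least squares does not guarantee an even split in general; the 'equal numbers above and below' rule is a by-eye guide, not a definition.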
Common Misconception: Assuming a strong correlation means one variable causes the other.
What to Teach Instead
This is a classic logical error. Use 'spurious correlation' examples to spark discussion. Active learning through debate helps students internalise the idea that data shows a relationship, but logic determines the cause.
Active Learning Ideas
Inquiry Circle: The Human Scatter Graph
Students collect data on themselves (e.g., arm span vs. height). They plot this on a large coordinate grid on the floor or a wall using sticky notes. As a class, they discuss the correlation and use a piece of string to determine the 'line of best fit'.
Formal Debate: Correlation vs Causation
Present pairs with 'spurious correlations' (e.g., ice cream sales and shark attacks). Students must debate whether one causes the other or if there is a 'hidden variable' (like warm weather) and then present their reasoning to the class.
Think-Pair-Share: The Danger of Extrapolation
Show a scatter graph of a child's growth from age 1 to 10. Ask students to use a line of best fit to predict the person's height at age 50. Pairs discuss why this prediction is likely wrong and the risks of 'predicting the unknown'.
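The absurdity of this prediction can be shown numerically. The sketch below fits a least-squares line (standing in for a line drawn by eye) to invented heights for ages 1 to 10 and extends it to age 50. The heights are illustrative values, not real measurements, and the function names are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares straight line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Illustrative (invented) heights in cm for a child aged 1 to 10.
ages = list(range(1, 11))
heights_cm = [75, 86, 95, 102, 109, 115, 121, 127, 133, 138]

slope, intercept = least_squares_line(ages, heights_cm)


def predict_height(age):
    """Read a height off the fitted line, whether or not the age is in range."""
    return slope * age + intercept


print(round(predict_height(7)))   # interpolation: inside ages 1 to 10
print(round(predict_height(50)))  # extrapolation: far outside the data
```

With these invented values the line predicts a height of over 4 metres at age 50, because it assumes the childhood growth rate continues unchanged for decades. The prediction for age 7, by contrast, stays close to the recorded value.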
Real-World Connections
- Meteorologists use scatter graphs to analyze the relationship between atmospheric pressure and temperature, helping to predict weather patterns for regions like the UK.
- Market researchers plot customer spending versus advertising expenditure on scatter graphs to understand the impact of campaigns, informing business strategies for companies like Tesco or Sainsbury's.
- Sports analysts examine scatter graphs to see if there is a correlation between training hours and performance metrics for athletes, guiding training programs for teams in the Premier League.
Assessment Ideas
Provide students with three pre-drawn scatter graphs, each showing a different type of correlation (positive, negative, none). Ask students to label each graph with the correct correlation type and write one sentence explaining their choice.
Give students a small data set (e.g., hours studied vs. test score). Ask them to construct a scatter graph on a mini-whiteboard. Then, ask them to draw a line of best fit and write one prediction based on their line, stating whether it is interpolation or extrapolation.
Present a scenario: 'Ice cream sales increase when the temperature rises.' Ask students: 'Does this mean hot weather causes people to buy ice cream, or is there another factor at play?' Facilitate a discussion on correlation versus causation, using this or similar examples.