Scatter Graphs and Correlation
Students will plot and interpret scatter graphs, identifying types of correlation and drawing lines of best fit.
About This Topic
Scatter graphs show the relationship between two variables by plotting data points on a coordinate grid. Year 11 students plot points from bivariate data sets, such as heights and weights or exam scores and study hours, then identify positive correlation where points trend upward, negative where they trend downward, and no correlation where points scatter randomly. They draw lines of best fit by eye, ensuring the line passes close to most points, and use these to estimate missing values or predict trends.
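For teachers who want a quick numerical check on a class data set, the direction of correlation described above can be sketched in Python. This is a minimal illustration, not part of the GCSE method: the data are made up, the helper names `pearson_r` and `describe_direction` are hypothetical, and the 0.3 cut-off for "no correlation" is an arbitrary assumption chosen for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe_direction(r, threshold=0.3):
    """Label the direction of correlation; the threshold is an assumed cut-off."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "none"


# Made-up illustrative data: hours of study and exam score for six students.
study_hours = [1, 2, 3, 4, 5, 6]
exam_scores = [35, 42, 50, 49, 63, 70]

r = pearson_r(study_hours, exam_scores)
print(round(r, 2), describe_direction(r))
```

Students still classify correlation by eye from the plotted points; a calculation like this only helps a teacher confirm that a chosen data set shows the pattern intended.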
This topic aligns with GCSE Statistics requirements for data interpretation, building skills in pattern recognition and critical analysis. Students learn that correlation measures association strength but does not prove causation, a key distinction applied to real-world claims like ice cream sales and drowning rates both rising in summer. Practising with varied data sets strengthens their ability to question data reliability and context.
Active learning suits scatter graphs because students construct meaning from their own or peer-collected data. When they gather measurements in class, plot collaboratively, and debate lines of best fit, misconceptions surface naturally. Group critiques refine judgments, making abstract statistical concepts concrete and relevant to decision-making.
Key Questions
- Differentiate between positive, negative, and no correlation in scatter graphs.
- Explain why correlation does not imply causation.
- Predict future values using a line of best fit, assessing the reliability of the prediction.
Learning Objectives
- Analyze bivariate data sets to identify and classify the type of correlation present in a scatter graph.
- Evaluate the strength and direction of correlation from a scatter graph, distinguishing between strong, moderate, and weak relationships.
- Construct a line of best fit on a scatter graph by eye, justifying its placement relative to the data points.
- Predict future values or trends using a line of best fit, and critique the reliability of these predictions based on the data's spread and the prediction's distance from the plotted data.
- Explain the difference between correlation and causation, providing a reasoned example to support the explanation.
Before You Start
Why: Students must be able to accurately plot points on a Cartesian grid to create scatter graphs.
Why: Students need to differentiate between independent and dependent variables to correctly label axes and interpret relationships.
Key Vocabulary
| Bivariate Data | A data set consisting of two variables for each individual observation, used to investigate relationships. |
| Correlation | A statistical measure that describes the extent to which two variables change together. It can be positive, negative, or none. |
| Line of Best Fit | A straight line drawn on a scatter graph that best represents the trend in the data, minimizing the distance between the line and the data points. |
| Causation | The relationship where one event is the result of another event; correlation does not imply causation. |
Watch Out for These Misconceptions
Common Misconception: A strong correlation always means one variable causes the other.
What to Teach Instead
Correlation shows association, not causation; third factors often explain links. Group debates on real examples like cricket scores and rainfall help students articulate alternatives. Peer challenges expose flawed reasoning quickly.
Common Misconception: The line of best fit must pass through every data point.
What to Teach Instead
A line of best fit follows the overall trend, with roughly equal numbers of points above and below it; it does not need to pass through any particular point. Collaborative plotting activities let students test and adjust lines together, seeing how outliers affect the balance. Visual feedback from class graphs corrects over-precision.
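The idea of balancing points above and below the line can be illustrated with a least-squares line. This is a swapped-in technique for checking, not the by-eye method students use, and it goes beyond what the topic requires. The sketch below uses made-up data and a hypothetical helper `fit_line`; it shows that such a line passes through the mean point and that the vertical distances above and below it cancel out.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative data; no point needs to lie exactly on the line.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]

slope, intercept = fit_line(xs, ys)

# Vertical distance of each point from the line (positive means above).
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
print("residuals:", [round(res, 2) for res in residuals])
print("line value at mean x:", round(slope * mean_x + intercept, 2), "mean y:", round(mean_y, 2))
```

Comparing a student's by-eye line with this one can prompt discussion about whether their line sits too high, too low, or at the wrong gradient.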
Common Misconception: No correlation means the variables have zero relationship.
What to Teach Instead
No correlation means there is no linear relationship between the variables; a non-linear relationship may still exist. Sorting activities with varied plots train students to distinguish different strengths of correlation. Student-led examples from daily life reinforce nuanced interpretation.
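The point about non-linear relationships can be demonstrated with a small sketch. Assuming a made-up data set where y is exactly x squared, and a hypothetical helper `pearson_r`, the linear correlation comes out at zero (up to rounding) even though y is completely determined by x.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data: x values symmetric about zero, y is exactly x squared.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r = pearson_r(xs, ys)
print("linear correlation:", round(r, 2))
```

Plotted as a scatter graph, these points form a clear U-shaped curve, which makes a useful card for the sorting activity: the pattern is obvious, yet no straight line describes it.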
Active Learning Ideas
Data Hunt: Class Measurements
Students pair up to measure partners' heights and arm spans using tape measures. Pairs plot their data on shared graph paper, discuss correlation type, and draw a line of best fit. Groups compare graphs and predict values for new students.
Correlation Card Sort: Real Datasets
Prepare cards with scatter plots, descriptions, and correlation types. Small groups sort cards into positive, negative, or none categories, then justify choices. Extend by drawing lines of best fit on selected plots.
Prediction Challenge: Sports Data
Provide data on athletes' training hours and race times. Whole class plots on a large board, votes on line of best fit, then predicts performance for hypothetical athletes. Discuss prediction reliability based on scatter strength.
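One way to frame the reliability discussion is to separate predictions inside the range of the plotted data (interpolation) from those outside it (extrapolation). The sketch below is an illustration under stated assumptions: the training and race figures are made up, the helpers `fit_line` and `predict` are hypothetical, and a least-squares line stands in for the line the class draws by eye.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(x, xs, ys):
    """Return (estimate, kind), where kind says if x lies within the data range."""
    slope, intercept = fit_line(xs, ys)
    estimate = slope * x + intercept
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    return estimate, kind


# Made-up illustrative data: weekly training hours and race time in minutes.
training_hours = [2, 4, 6, 8, 10]
race_times = [62, 58, 55, 51, 49]

for hours in (5, 20):
    estimate, kind = predict(hours, training_hours, race_times)
    print(hours, "hours:", round(estimate, 1), "minutes,", kind)
```

The estimate for 20 hours lies far beyond the plotted data, so it relies on the trend continuing unchanged; students can discuss why that assumption is hard to justify for real athletes.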
Causation Debate: Mystery Pairs
Show paired variables such as shoe size and IQ. Individuals plot sample data and note the correlation, then debate in small groups whether one variable causes the other. Reveal lurking variables to clarify the concepts.
Real-World Connections
- Market researchers use scatter graphs to analyze the relationship between advertising spend and product sales, helping to predict future sales based on marketing budgets.
- Environmental scientists plot data on scatter graphs to investigate links between pollution levels and respiratory illnesses in specific urban areas, informing public health initiatives.
- Financial analysts examine scatter graphs to see if there is a correlation between a company's profit margins and its stock price, aiding investment decisions.
Assessment Ideas
Provide students with a scatter graph showing a clear positive correlation. Ask them to: 1. Describe the correlation in one sentence. 2. Draw a line of best fit. 3. Predict the value of the dependent variable when the independent variable is X (a value within the range of the data).
Present students with two scenarios: Scenario A: Ice cream sales increase as the temperature rises. Scenario B: The number of shark attacks increases as ice cream sales rise. Ask: 'Both scenarios show correlation. In which one is a causal link plausible, and in which is the link better explained by a third variable? Explain your reasoning.'
Display three scatter graphs: one with positive correlation, one with negative, and one with no correlation. Ask students to hold up fingers corresponding to the type of correlation (e.g., 1 for positive, 2 for negative, 3 for none) for each graph shown.
Frequently Asked Questions
How do I teach students to draw lines of best fit accurately?
What real-world examples work best for correlation types?
How can active learning improve understanding of scatter graphs?
Why is distinguishing correlation from causation important in GCSE Maths?