Scatter Plots and Correlation
Creating and interpreting scatter plots to visualize relationships between two quantitative variables.
About This Topic
Scatter plots give students their first graphical tool for exploring relationships between two quantitative variables. Each point represents an individual case with two measured attributes, and the overall pattern of points reveals whether and how the variables relate. In 9th grade, students describe correlation qualitatively (positive, negative, none) and connect the visual pattern to the correlation coefficient r, which is introduced or previewed depending on the specific course sequence.
The crucial conceptual move at this level is separating the pattern in the plot from what that pattern means causally. Strong correlation means the variables move together, but it does not explain why. Students naturally seek causal explanations, so this topic is a valuable opportunity to build skepticism about the data claims that appear regularly in US news media.
Active learning is especially effective here because scatter plot interpretation requires calibrated judgment. Students looking at the same plot often describe the pattern differently, and structured peer discussion where they must agree on a description before moving forward builds the precise vocabulary and perceptual accuracy that later statistical work requires.
Key Questions
- Analyze what the pattern of points on a scatter plot reveals about the relationship between variables.
- Differentiate between positive, negative, and no correlation.
- Explain why correlation does not imply causation.
Learning Objectives
- Create scatter plots to visually represent the relationship between two quantitative variables from a given dataset.
- Analyze the pattern of points on a scatter plot to describe the direction and strength of the relationship between variables.
- Differentiate between positive, negative, and no correlation based on the visual distribution of points on a scatter plot.
- Explain why a strong correlation between two variables does not necessarily imply a causal relationship, using a concrete example.
Before You Start
Why: Students need to be able to plot and interpret points in a two-dimensional coordinate system to create scatter plots.
Why: Students should have experience working with numerical data and understanding what different values represent.
Key Vocabulary
| Scatter Plot | A graph that displays the relationship between two quantitative variables. Each point on the plot represents a pair of values for the two variables. |
| Correlation | A statistical measure that describes the extent to which two variables change together. It indicates the direction and strength of a linear relationship. |
| Positive Correlation | A relationship where as one variable increases, the other variable also tends to increase. Points on the scatter plot generally rise from left to right. |
| Negative Correlation | A relationship where as one variable increases, the other variable tends to decrease. Points on the scatter plot generally fall from left to right. |
| No Correlation | A relationship where there is no discernible linear pattern between the two variables. Points on the scatter plot appear randomly scattered, with no overall upward or downward trend. |
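For teachers who want to connect these vocabulary terms to a number, the following is a minimal Python sketch of how the correlation coefficient r can be computed and turned into a direction label. The function names `pearson_r` and `describe_direction`, the 0.1 cutoff, and the five data points are all assumptions made for illustration; they do not come from any curriculum standard.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: how the two variables deviate from their means together.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator pieces: how much each variable varies on its own.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe_direction(r, threshold=0.1):
    """Label the direction of r; the 0.1 cutoff is an arbitrary choice."""
    if r > threshold:
        return "positive"
    if r < -threshold:
        return "negative"
    return "none"

# Hypothetical data, made up for illustration only.
hours_studied = [1, 2, 3, 4, 5]
test_scores = [62, 70, 71, 80, 85]

r = pearson_r(hours_studied, test_scores)
print(round(r, 3), describe_direction(r))
```

With this made-up dataset r comes out close to 1, which matches a plot whose points rise steadily from left to right.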
Watch Out for These Misconceptions
Common Misconception: A correlation of 0 means the two variables have no relationship at all.
What to Teach Instead
A correlation of 0 means there is no linear relationship. Two variables can have a strong curved relationship and still produce an r value near 0. Showing a U-shaped scatter plot where the points clearly follow a pattern but r is approximately 0 is an effective way to correct this persistent misunderstanding.
Common Misconception: The correlation coefficient r tells you how steep the line of best fit will be.
What to Teach Instead
r describes the strength and direction of the linear relationship, not its steepness. Two scatter plots can have the same r but very different slopes, depending on the scales of the variables. Comparing two such plots side by side in a partner activity makes this distinction concrete.
Common Misconception: Positive correlation means both variables always have large values at the same time.
What to Teach Instead
Positive correlation means that as one variable increases, the other tends to increase too, regardless of the absolute magnitude of either variable. Both variables could be small in value and still show strong positive correlation. Using scatter plots with small-scale data helps students focus on the direction of change rather than the size of the numbers.
Active Learning Ideas
Inquiry Circle: Create and Interpret a Real-Data Scatter Plot
Provide groups with a dataset of two quantitative variables such as hours of sleep versus average GPA or temperature versus energy use across US cities. Groups create the scatter plot, describe the form, direction, and strength of the association in complete sentences, and identify any points that appear to be outliers with a written justification.
Think-Pair-Share: Describe What You See
Show a scatter plot without labels or context. Students individually write a description of the pattern using specific vocabulary (positive or negative, strong or weak, linear or nonlinear), then compare descriptions with a partner and reconcile any differences in language or interpretation before sharing with the class.
Gallery Walk: Matching Correlation to Scatter Plot
Post scatter plots at stations alongside cards showing r values such as -0.9, -0.4, 0.1, 0.7, and 0.95. Students rotate to each station and match the plot to the most appropriate r value, writing one sentence explaining the visual feature that most influenced their choice.
Real-World Connections
- Market researchers analyze scatter plots to see if there is a relationship between advertising spending and product sales for a new consumer good, helping to optimize marketing budgets.
- Environmental scientists use scatter plots to investigate the relationship between average daily temperature and the number of reported heat-related illnesses in a city, informing public health advisories.
- Economists examine scatter plots to explore potential links between a country's GDP per capita and its life expectancy, contributing to discussions about economic development and public health policy.
Assessment Ideas
Provide students with a small dataset (e.g., hours studied vs. test scores for 5 students). Ask them to: 1. Plot the data on a scatter plot. 2. Describe the apparent correlation (positive, negative, or none) and its strength. 3. Write one sentence explaining why this correlation does not prove that studying causes higher scores.
Present students with a scatter plot showing a strong positive correlation between ice cream sales and drowning incidents. Ask: 'What is the relationship shown in this plot? Is it reasonable to conclude that eating ice cream causes drowning? What other factor might explain both of these trends?'
Show students three different scatter plots: one showing positive correlation, one showing negative correlation, and one showing no correlation. Ask students to label each plot with the correct type of correlation and briefly justify their choice based on the pattern of points.