Mathematics · Grade 8 · Patterns in Data · Term 3

Interpreting Scatter Plots and Association

Interpreting scatter plots to look for patterns, clusters, and outliers in data sets.

Ontario Curriculum Expectations: 8.SP.A.1

About This Topic

A line of best fit (or trend line) is a straight line that best represents the data on a scatter plot. In the Ontario Grade 8 curriculum, students learn to informally fit these lines by eye, ensuring that the line follows the general direction of the dots with roughly equal numbers of points above and below it. This is a crucial step in moving from data visualization to data prediction.
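One way to check a line drawn by eye is to count how many points fall above and below it. The sketch below is illustrative only: the data values, the candidate line (earned = 15 × hours), and the helper name `count_above_below` are hypothetical and are not part of the curriculum.

```python
def count_above_below(xs, ys, slope, intercept):
    """Count points above, below, and on the line y = slope * x + intercept."""
    above = below = on = 0
    for x, y in zip(xs, ys):
        line_y = slope * x + intercept
        if y > line_y:
            above += 1
        elif y < line_y:
            below += 1
        else:
            on += 1
    return above, below, on


# Hypothetical data: hours worked vs. money earned (dollars).
hours = [1, 2, 3, 4, 5]
earned = [16, 31, 44, 61, 73]

# Candidate line drawn by eye: earned = 15 * hours + 0.
above, below, on = count_above_below(hours, earned, slope=15, intercept=0)
print(f"above: {above}, below: {below}, on the line: {on}")
```

A roughly even split (here 3 above and 2 below) suggests the line is a reasonable informal fit. A lopsided count is a cue to shift or tilt the line.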

Students use the equation of the line of best fit to make interpolations (predictions within the data range) and extrapolations (predictions outside the range). They also interpret the slope and y-intercept of the line in the context of the data. For example, in a graph of 'hours worked' vs. 'money earned,' the slope represents the hourly wage. This connects data analysis directly back to students' work with linear relations in algebra.
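The following sketch shows how the equation of a trend line can be used for prediction. The text above describes fitting by eye; this example swaps in the standard least-squares formula so the line can be computed. The data set and the names `fit_line` and `predict` are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the trend line y = slope * x + intercept at x."""
    return slope * x + intercept


# Hypothetical data: hours worked vs. money earned (dollars).
hours = [1, 2, 3, 4, 5]
earned = [16, 31, 44, 61, 73]

slope, intercept = fit_line(hours, earned)
print(f"slope (dollars per hour): {slope:.2f}")
print(f"y-intercept (dollars at 0 hours): {intercept:.2f}")

# 2.5 hours lies inside the data range (interpolation);
# 8 hours lies outside it (extrapolation).
print(f"predicted earnings for 2.5 hours: {predict(2.5, slope, intercept):.2f}")
print(f"predicted earnings for 8 hours: {predict(8, slope, intercept):.2f}")
```

For this made-up data the slope is about 14.4, which reads as an hourly wage of roughly $14.40, and the small positive intercept reflects scatter in the data rather than a real payment for zero hours.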

This topic comes alive when students can engage in collaborative investigations. By using real-world data sets, like the growth of a plant over time or the cooling of a cup of cocoa, students see how a simple line can help them predict the future and understand the underlying rate of change.

Key Questions

  1. Analyze what the strength and direction of a correlation tell us about the relationship between two variables.
  2. Explain how outliers influence our interpretation of a data set.
  3. Differentiate between positive, negative, and no association in scatter plots.

Learning Objectives

  • Analyze scatter plots to identify patterns, clusters, and outliers in bivariate data sets.
  • Explain the meaning of positive, negative, and no association between two variables as represented on a scatter plot.
  • Evaluate how the presence of outliers can influence the perceived relationship between two variables in a scatter plot.
  • Compare the strength and direction of association between different pairs of variables presented in scatter plots.
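For teachers who want a numerical check on what students describe visually, the sketch below computes the correlation coefficient r and labels the direction of the association. This goes beyond the informal, by-eye reading expected of students. The cutoff of 0.3, the function names, and the three small data sets are illustrative assumptions, not curriculum values.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe_association(xs, ys, threshold=0.3):
    """Label the direction of association; threshold is an arbitrary cutoff."""
    r = correlation(xs, ys)
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "no clear association"


# Hypothetical data sets sharing the same x-values.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]
falling = [9, 7, 6, 4, 2]
scattered = [3, 1, 4, 1, 3]

for name, y in [("rising", rising), ("falling", falling), ("scattered", scattered)]:
    print(name, round(correlation(x, y), 2), describe_association(x, y))
```

The sign of r gives the direction, and its distance from zero gives the strength: values near 1 or -1 indicate points that lie close to a straight line.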

Before You Start

Data Representation

Why: Students need to be able to plot and read points on a coordinate plane before interpreting scatter plots.

Introduction to Data Analysis

Why: Understanding basic data sets and how to organize them is foundational to interpreting patterns within them.

Key Vocabulary

Scatter Plot: A graph that displays the relationship between two quantitative variables by plotting individual data points.
Association: The relationship between two variables. This can be positive, negative, or show no clear pattern.
Outlier: A data point that is significantly different from other data points in the set, potentially affecting the interpretation of the overall trend.
Cluster: A group of data points that are close together on a scatter plot, suggesting a concentration of values for the two variables.

Watch Out for These Misconceptions

Common Misconception: Students often think the line of best fit must pass through the origin (0, 0).

What to Teach Instead

Show data sets where the starting value isn't zero (like 'age vs. height'). Peer discussion about why a baby isn't 0 cm tall at birth helps them see that the y-intercept must match the data rather than default to the corner of the graph.

Common Misconception: Students may try to 'connect the dots' like a dot-to-dot puzzle.

What to Teach Instead

Use the 'spaghetti' method, in which students physically move a strand of spaghetti over the plot as their line, to show that the line represents the *trend*, not every single point. Collaborative work where students compare their straight lines to a 'connected' line helps them see which one is more useful for making general predictions.

Real-World Connections

  • Environmental scientists use scatter plots to examine the relationship between air pollution levels and respiratory illness rates in urban areas, helping to identify potential health risks.
  • Economists analyze scatter plots to investigate correlations between factors like education level and average income across different regions, informing policy decisions.
  • Sports analysts use scatter plots to compare player statistics, such as points scored versus assists, to understand player performance and team dynamics.

Assessment Ideas

Quick Check

Provide students with 2-3 different scatter plots. Ask them to write one sentence describing the association (positive, negative, or none) for each plot and identify any obvious outliers.

Exit Ticket

Give students a scatter plot showing student study hours versus test scores. Ask them to explain in their own words what the pattern on the plot suggests about the relationship between these two variables and to identify one outlier if present.

Discussion Prompt

Present a scatter plot with a clear outlier. Ask students: 'How does this single data point affect our understanding of the overall relationship between the two variables? What might this outlier represent in the real world?'
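To support this discussion, the effect of a single outlier can be made concrete by fitting a trend line with and without it. The sketch below uses the least-squares formula and a made-up 'study hours vs. test score' data set in which the last student is the outlier. All values and the name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: study hours vs. test score.
# The last point (6 hours, score 30) is the outlier.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 63, 70, 74, 30]

slope_with, _ = fit_line(hours, scores)
slope_without, _ = fit_line(hours[:-1], scores[:-1])

print(f"slope with the outlier:    {slope_with:.2f}")
print(f"slope without the outlier: {slope_without:.2f}")
```

In this example the one unusual point is enough to turn a clearly positive trend into a slightly negative one, which gives students a reason to ask what the outlier represents before deciding how much weight to give it.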

Frequently Asked Questions

How do you draw a line of best fit?
You draw a straight line that follows the general 'path' of the data points. Try to have about the same number of points above the line as below it, and make sure the line reflects the overall steepness (slope) of the data cluster.
What is the difference between interpolation and extrapolation?
Interpolation is making a prediction *inside* the range of data you already have. Extrapolation is making a prediction *outside* that range (for example, further into the future or past when the x-axis shows time). Extrapolation is usually riskier because the trend may not continue beyond the data that was collected.
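The distinction comes down to whether the x-value being predicted lies between the smallest and largest x-values in the data. A minimal sketch, using a hypothetical helper `prediction_type` and made-up plant-growth measurement days:

```python
def prediction_type(x, xs):
    """Label a prediction at x relative to the range of observed x-values."""
    if min(xs) <= x <= max(xs):
        return "interpolation"
    return "extrapolation"


# Hypothetical data: days on which a plant's height was measured.
days_measured = [2, 4, 6, 8, 10]

print(prediction_type(5, days_measured))   # day 5 is between day 2 and day 10
print(prediction_type(30, days_measured))  # day 30 is beyond the last measurement
```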
How can active learning help students understand lines of best fit?
Active learning, like the 'Spaghetti Fit,' takes the guesswork out of drawing trend lines. When students can physically move a line (the spaghetti) and see how it balances the points, they develop a visual 'feel' for the data. Calculating the slope of their own physical line then makes the algebra feel like a natural extension of their visual work.
What does the slope of a trend line tell us?
The slope tells us the average rate of change between the two variables. If the slope is 2 for 'study hours' vs. 'test score,' it suggests that for every extra hour of study, the score goes up by an average of 2 points.
