Mathematics · 7th Grade · Probability and Statistics · Weeks 28-36

Comparing Data Sets

Using measures of center and variability to compare two numerical data distributions.

Common Core State Standards: CCSS.Math.Content.7.SP.B.3, CCSS.Math.Content.7.SP.B.4

About This Topic

Comparing two data distributions is one of the central applications of statistical reasoning in 7th grade. Rather than analyzing a single data set in isolation, students use measures of center and variability together to draw conclusions about how two groups differ or overlap. This is directly tied to CCSS 7.SP.B.3, which asks students to informally assess the degree of visual overlap between two distributions with similar variability, and to 7.SP.B.4, which asks them to use measures of center and variability to draw informal comparative inferences about two populations.

The key insight is that two groups can have similar centers but very different variability, or vice versa. A difference in means only tells part of the story. When data sets overlap heavily, even a noticeable mean difference may not represent a meaningful distinction between groups. Students develop language for describing these comparisons: 'Group A's median is about 5 points higher than Group B's, and there is minimal overlap between the two distributions,' versus 'While the means differ, the ranges overlap substantially.'

Active learning is especially powerful for this topic because comparison requires interpretation, not just calculation. Structured discussions and peer argumentation push students to move beyond simply reporting numbers toward making evidence-based claims about differences between groups.

Key Questions

When is the median a better measure of center than the mean?
How does the overlap of two data sets affect our ability to say they are meaningfully different?
Why does the range or interquartile range matter when comparing two groups?

Learning Objectives

Calculate the mean, median, and interquartile range for two different numerical data sets.
Compare the measures of center (mean and median) and measures of variability (range and IQR) for two data sets using precise language.
Evaluate the degree of overlap between two data distributions and explain how it impacts conclusions about their differences.
Construct a written or verbal argument justifying whether two data sets represent meaningfully different groups, using calculated statistics and visual representations as evidence.

Before You Start

Calculating Measures of Center

Why: Students need to be able to accurately calculate the mean and median before they can compare them across data sets.

Calculating Measures of Variability

Why: Students must understand how to find the range and interquartile range to compare the spread of data distributions.

Data Visualization (Dot Plots, Box Plots)

Why: Students benefit from visual representations of data to understand overlap and variability intuitively before formal calculations.

Key Vocabulary

Mean	The average of a data set, calculated by summing all values and dividing by the number of values. It can be sensitive to extreme values.
Median	The middle value in a data set when the values are ordered from least to greatest (or the mean of the two middle values when the count is even). It is resistant to extreme values, which makes it a good measure of center for skewed data.
Range	The difference between the maximum and minimum values in a data set. It provides a simple measure of the spread of the data.
Interquartile Range (IQR)	The difference between the third quartile (75th percentile) and the first quartile (25th percentile) of a data set. It measures the spread of the middle 50% of the data and is less affected by outliers than the range.
Overlap	The extent to which the values of two data sets fall in the same interval. Substantial overlap suggests the groups may not be meaningfully different.
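
As a rough illustration of the vocabulary above, the sketch below computes all four statistics for two made-up sets of test scores. The data, the function names, and the choice to find quartiles as the medians of the lower and upper halves (a method commonly taught in middle school; software packages may use other conventions) are assumptions made for this example.

```python
from statistics import mean, median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    When the count is odd, the middle value is left out of both halves.
    Assumes at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(lower), median(upper)


def summarize(values):
    """Collect the four statistics from the vocabulary list."""
    q1, q3 = quartiles(values)
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }


# Hypothetical test scores for two classes
class_a = [72, 74, 76, 78, 80, 82, 84]
class_b = [58, 66, 72, 78, 84, 90, 98]

print(summarize(class_a))  # mean 78, median 78, range 12, iqr 8
print(summarize(class_b))  # mean 78, median 78, range 40, iqr 24
```

Both classes share a mean and median of 78, yet Class B's range and IQR are about three times as large, so the centers alone would hide a real difference between the groups.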

Watch Out for These Misconceptions

Common Misconception: If one group has a higher mean, that group is definitively better or different.

What to Teach Instead

A higher mean matters more when the two distributions have low overlap. When distributions overlap heavily, the mean difference may be within the natural variability of the data and may not represent a meaningful distinction. Students need to consider both center and spread together.

Common Misconception: Variability doesn't matter if you're comparing centers.

What to Teach Instead

Variability provides essential context for interpreting mean or median differences. A 10-point difference in means carries very different weight when the IQRs are about 5 than when they are about 40. Ignoring variability leads to overconfident comparisons.
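
One way to make this concrete is to express the difference between centers as a multiple of a measure of variability, which is the approach 7.SP.B.3 describes. The sketch below is only an illustration: the function name and the center values of 80 and 70 are hypothetical, and it assumes the two groups have roughly the same spread.

```python
def gap_in_spreads(center_a, center_b, spread):
    """Difference between two centers, as a multiple of a shared measure of spread."""
    return abs(center_a - center_b) / spread


# The same 10-point difference in centers, judged against two different IQRs
narrow = gap_in_spreads(80, 70, 5)   # 10 / 5
wide = gap_in_spreads(80, 70, 40)    # 10 / 40

print(narrow)  # 2.0  -> the centers are two IQRs apart
print(wide)    # 0.25 -> the centers are a quarter of an IQR apart
```

A gap of two IQRs usually shows up as clearly separated plots, while a gap of a quarter of an IQR is easily lost in the natural variability of the data.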

Active Learning Ideas

Structured Academic Controversy

Which Group Performed Better?

Provide pairs of dot plots or box plots comparing two groups (e.g., Class A vs. Class B test scores). Groups are assigned a position (Class A scored better / Class B scored better) and must support their claim using center and variability measures. After arguing their position, groups switch sides and argue the opposite view, then reach a consensus.

35 min·Pairs

Think-Pair-Share

Does Overlap Matter?

Present two pairs of dot plots: one pair with clearly separated distributions and one pair with significant overlap, both with the same mean difference. Students individually describe what the overlap tells them, compare with a partner, then share with the class why overlap changes the interpretation.

20 min·Pairs

Case Study Analysis

Data Analysis Station Rotation

Set up four stations, each with a different real-world comparison (for example, heights of plants in two conditions, scores from two classes, or speeds in two trials). Groups rotate every 8 minutes, recording center and variability measures and writing one comparison sentence at each station. A final debrief connects all four comparisons.

40 min·Small Groups

Real-World Connections

Sports analysts compare statistics like batting averages or points per game between two teams or players to determine performance differences. They consider not just the average but also how consistent each player's performance is over a season.
Environmental scientists compare temperature readings or pollution levels from two different geographic locations to assess environmental impact or climate change. They look at both average conditions and the variability to understand if one location is consistently more extreme than the other.
Market researchers compare customer satisfaction scores for two different product versions. They analyze the average scores and the spread of responses to decide which product is performing better overall and for most customers.

Assessment Ideas

Exit Ticket

Provide students with two small data sets (e.g., test scores for two different classes). Ask them to calculate the mean, median, and range for each set. Then, have them write one sentence comparing the centers and one sentence comparing the spreads.

Discussion Prompt

Present students with two box plots showing the heights of two different plant species. Ask: 'Based on these box plots, can we confidently say that one plant species is taller than the other? Explain your reasoning, referring to the median, IQR, and any overlap you observe.'

Quick Check

Show students two dot plots of student performance on a task. Ask them to identify which data set has a larger median and which has a larger range. Then, ask them to describe the overlap between the two data sets in their own words.

Frequently Asked Questions

How do you compare two data sets using measures of center and variability?

Calculate the mean or median for each group to see which tends to be higher, then compare their ranges or IQRs to understand how spread out each is. Also note how much the two distributions overlap: heavy overlap suggests the groups are more similar than the difference in centers implies.

When is median a better measure than mean for comparing groups?

Median is better when one or both distributions contain outliers or are skewed, because the mean gets pulled toward extreme values. Comparing medians gives a more accurate sense of what's typical for each group when the data isn't symmetric.
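
A small numerical sketch can show the pull of an extreme value. The scores below are invented for illustration; the second list simply adds one very low score to the first.

```python
from statistics import mean, median

# Hypothetical scores, then the same scores with one extreme low value added
typical = [70, 72, 74, 76, 78]
with_outlier = typical + [2]

print(mean(typical), median(typical))            # 74 and 74
print(mean(with_outlier), median(with_outlier))  # 62 and 73.0
```

The single low score drags the mean down by 12 points but moves the median by only 1, so the median stays closer to what is typical for the group.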

What does it mean when two data distributions overlap?

Overlap means that many individual values from both groups fall in the same range. A lot of overlap suggests the two groups are quite similar, even if their centers differ slightly. Less overlap means the groups are more meaningfully different from each other.
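
Overlap is usually judged by eye from dot plots or box plots, but a rough number can support the discussion. The sketch below uses one possible informal measure, which is an assumption of this example rather than a standard statistic: the share of all values that fall inside the interval covered by both groups. The data sets and function names are hypothetical.

```python
from statistics import mean


def shared_interval(a, b):
    """Interval covered by both groups' ranges, or None if the ranges do not meet."""
    low = max(min(a), min(b))
    high = min(max(a), max(b))
    return (low, high) if low <= high else None


def share_in_overlap(a, b):
    """Fraction of all values (both groups combined) that lie in the shared interval."""
    interval = shared_interval(a, b)
    if interval is None:
        return 0.0
    low, high = interval
    combined = list(a) + list(b)
    inside = [v for v in combined if low <= v <= high]
    return len(inside) / len(combined)


# Two hypothetical pairs of groups, each pair with a 10-point difference in means
separated_a = [60, 62, 64, 66, 68]
separated_b = [70, 72, 74, 76, 78]
overlapping_a = [50, 57, 64, 71, 78]
overlapping_b = [60, 67, 74, 81, 88]

print(mean(separated_b) - mean(separated_a))          # 10
print(mean(overlapping_b) - mean(overlapping_a))      # 10
print(share_in_overlap(separated_a, separated_b))     # 0.0
print(share_in_overlap(overlapping_a, overlapping_b)) # 0.6
```

Both pairs have the same difference in means, but in the second pair 60% of the values sit in the region the groups share, which is why the same 10-point gap is far less convincing there.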

Why is active learning effective for teaching data set comparisons?

Comparing data distributions requires judgment and argumentation, not just arithmetic. When students debate which group 'really' performed better using the same data, they encounter the complexity of statistical interpretation, the kind of reasoning that can't be learned by watching a teacher solve problems.
