Comparing Data Sets using Box Plots and Histograms
Activities & Teaching Strategies
Active learning works well for comparing data sets because students must physically and visually engage with the spread, center, and shape of data to truly understand differences between groups. Moving beyond calculations to interpret visual displays helps students develop a nuanced understanding of variability and distribution.
Learning Objectives
- Compare the distribution, center, and spread of two or more data sets using summary statistics and visual displays.
- Critique the suitability of box plots and histograms for representing and comparing different types of data distributions.
- Analyze visual displays of data to identify potential outliers and assess the symmetry or skewness of data sets.
- Formulate arguments about differences between populations based on statistical evidence from comparative displays.
Inquiry Circle: The Reaction Time Challenge
Students use an online tool to measure their reaction times (e.g., dominant vs. non-dominant hand). In groups, they create back-to-back box plots of the results and write a 'statistical report' comparing the median and spread of the two groups.
Explain how visual displays can be used to argue that two populations are significantly different.
Facilitation Tip: During the Collaborative Investigation, circulate and ask student groups to point to the parts of their human box plot that represent the median and quartiles, ensuring physical movement reinforces abstract concepts.
Setup: Groups at tables with access to source materials
Materials: Source material collection, Inquiry cycle worksheet, Question generation protocol, Findings presentation template
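The five-number summaries behind the groups' back-to-back box plots can be checked with a short script. The sketch below is illustrative only: the reaction times are invented, `five_number_summary` is a hypothetical helper, and Python's `statistics.quantiles` with `method="inclusive"` is just one quartile convention, so results may differ slightly from the median-of-halves method often taught by hand.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    # quantiles() sorts the data and returns the three quartile cut points.
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

# Hypothetical reaction times in milliseconds (invented for illustration).
dominant = [210, 225, 230, 240, 250, 255, 265, 280]
non_dominant = [240, 255, 270, 285, 300, 310, 335, 360]

for label, data in (("dominant", dominant), ("non-dominant", non_dominant)):
    low, q1, med, q3, high = five_number_summary(data)
    print(f"{label}: min={low} Q1={q1} median={med} Q3={q3} max={high} IQR={q3 - q1}")
```

With this made-up data, the non-dominant hand has both a higher median and a wider IQR, which is the kind of evidence students would cite in their statistical report.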
Gallery Walk: Skewness and Stories
The teacher posts four different histograms (e.g., house prices, heights, dice rolls). Groups must match each histogram to a 'story' or data source and explain their reasoning based on the shape (symmetric, left-skewed, right-skewed) to the rest of the class.
Compare the central tendency and spread of two data sets based on their box plots.
Facilitation Tip: For the Gallery Walk, provide sticky notes with sentence stems like 'This skewness suggests...' to scaffold written interpretations of the displays.
Setup: Wall space or tables arranged around room perimeter
Materials: Large paper/poster boards, Markers, Sticky notes for feedback
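To prepare histograms for the Gallery Walk, or to let students verify a shape, the binning step can be sketched in a few lines. This is a minimal sketch under stated assumptions: the house prices are invented, `histogram_counts` is a hypothetical helper, the values are whole numbers, and the bin width of 100 is an arbitrary choice.

```python
from statistics import mean, median

def histogram_counts(values, bin_width):
    """Count how many values fall in each bin [start, start + bin_width)."""
    first = (min(values) // bin_width) * bin_width
    last = (max(values) // bin_width) * bin_width
    # Start every bin at zero so empty bins still appear in the display.
    counts = {start: 0 for start in range(first, last + bin_width, bin_width)}
    for value in values:
        counts[(value // bin_width) * bin_width] += 1
    return counts

# Hypothetical house prices in thousands of dollars (invented for illustration).
prices = [180, 195, 210, 220, 230, 240, 250, 260, 275, 290, 310, 340, 420, 560, 890]

counts = histogram_counts(prices, bin_width=100)
for start, count in counts.items():
    # Labels assume whole-number data, so each bin covers start to start + 99.
    print(f"{start:>4}-{start + 99:<4} {'#' * count}")

print("mean:", round(mean(prices), 1), "median:", median(prices))
```

The bars trail off to the right, and the mean sits above the median. That pattern is typical of right-skewed data, although comparing mean and median is a rule of thumb rather than a definition of skewness.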
Think-Pair-Share: The Outlier Debate
Students are given a data set with one extreme outlier. They individually calculate the mean and median, then pair up to discuss which 'average' is a fairer representation of the group. They must agree on a recommendation for a 'news report' based on their findings.
Critique the effectiveness of different graphical displays for comparing data sets.
Facilitation Tip: During the Think-Pair-Share debate, assign one student in each pair to argue for the mean and the other for the median, forcing both perspectives to be considered before consensus.
Setup: Standard classroom seating; students turn to a neighbor
Materials: Discussion prompt (projected or printed), Optional: recording sheet for pairs
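A quick calculation can anchor the debate by showing how far a single extreme value moves each measure. The data below are hypothetical salaries in thousands of dollars, invented for illustration, with 250 planted as the outlier.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars; 250 is the planted outlier.
with_outlier = [32, 35, 36, 38, 40, 41, 250]
without_outlier = [value for value in with_outlier if value != 250]

for label, data in (("with outlier", with_outlier), ("without outlier", without_outlier)):
    print(f"{label}: mean={mean(data):.1f} median={median(data)}")
```

In this example, removing the outlier shifts the mean by about 30 while the median moves by only 1, which gives pairs concrete numbers to cite in their recommendation.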
Teaching This Topic
Teach this topic by balancing visual interpretation with hands-on construction of both box plots and histograms. Avoid overemphasizing calculation and instead focus on what each display reveals about the data. Use real-world data sets that naturally lead to questions about center and spread, and encourage students to critique which display better answers their questions. Research shows that students better understand variability when they compare and contrast multiple representations of the same data.
What to Expect
Successful learning is evident when students confidently compare data sets using precise language about median, IQR, spread, and skewness, not just shape or range. They should justify their comparisons with evidence from both box plots and histograms, and recognize when one display reveals information the other does not.
Watch Out for These Misconceptions
Common Misconception
During the Collaborative Investigation, watch for students assuming that a longer box or whisker contains more data points.
What to Teach Instead
Use the human box plot to have students count the number of students in each quartile section. Then, measure the distance between quartiles and discuss how spread relates to variability, not quantity of data.
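The counting exercise above can be mirrored numerically. The sketch below uses invented data and a hypothetical helper, `quarter_groups`, and it simplifies by assuming the data set's size is a multiple of four so that the sorted values split evenly. In a real box plot the sections are bounded by the quartile values themselves, which may fall between data points.

```python
def quarter_groups(sorted_values):
    """Split sorted data into four equal-sized groups (length must be a multiple of 4)."""
    size = len(sorted_values) // 4
    return [sorted_values[i * size:(i + 1) * size] for i in range(4)]

# Hypothetical data, already sorted (invented for illustration).
data = [1, 2, 3, 4, 5, 6, 8, 10, 13, 17, 24, 40]

groups = quarter_groups(data)
for number, group in enumerate(groups, start=1):
    width = group[-1] - group[0]
    print(f"quarter {number}: count={len(group)}, from {group[0]} to {group[-1]}, width={width}")
```

Every quarter holds three values, yet the last one stretches across a far wider interval than the first. A long whisker therefore signals more spread, not more data.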
Common Misconception
During the Think-Pair-Share debate, watch for students defaulting to the mean when comparing skewed data sets.
What to Teach Instead
Provide each pair with a data set like annual incomes and have them calculate both the mean and median. Ask them to explain which measure better represents the 'typical' value and why.
Assessment Ideas
After the Collaborative Investigation, present students with two box plots comparing reaction times from two different age groups. Ask: 'Which group has more consistent reaction times? Justify your answer using the IQR and any outliers.'
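For an answer key to this prompt, a teacher might compute the IQR and flag outliers as sketched below. The sketch rests on several assumptions: the reaction times for both groups are invented, `iqr_and_outliers` is a hypothetical helper, the fences use the common 1.5 × IQR convention, and the quartiles follow the `"inclusive"` method of Python's `statistics.quantiles`.

```python
from statistics import quantiles

def iqr_and_outliers(values):
    """Return (IQR, outliers) using the 1.5 * IQR fence convention."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Anything beyond either fence is flagged as a potential outlier.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return iqr, outliers

# Hypothetical reaction times in milliseconds for two age groups (invented).
group_a = [220, 230, 235, 240, 245, 250, 260, 400]
group_b = [230, 250, 270, 290, 310, 330, 350, 370]

for label, data in (("group A", group_a), ("group B", group_b)):
    iqr, outliers = iqr_and_outliers(data)
    print(f"{label}: IQR={iqr} outliers={outliers}")
```

Here group A has the smaller IQR and so the more consistent middle 50%, even though it also contains one potential outlier. A strong student answer would address both points.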
During the Gallery Walk, give each student a clipboard with a checklist to evaluate two displays. Items include: 'Is the median clearly marked? Is the IQR labeled or calculated? Does the display help explain skewness?' Collect these to assess understanding of key concepts.
After the Think-Pair-Share debate, have students exchange their written comparisons of the two data sets. Each student uses a rubric to evaluate their partner's work on clarity, use of statistics, and reasoning about outliers or skewness.
Extensions & Scaffolding
- Challenge: Provide a data set with a known outlier and ask students to create both a histogram and box plot, then write a paragraph explaining how the outlier affects each display and their interpretation of the data.
- Scaffolding: For students struggling with quartiles, give them a pre-sorted set of data cards to physically divide into four equal groups before plotting.
- Deeper exploration: Introduce students to a bimodal data set and ask them to create both displays, then hypothesize about the underlying cause of the two peaks and design a follow-up data collection to test their hypothesis.
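For the bimodal exploration, a small script can show what a histogram reveals that a box plot's summary values do not. This is a sketch under stated assumptions: the commute times are invented to form two clusters, `histogram_counts` is a hypothetical helper, and the bin width of 5 is an arbitrary choice.

```python
from statistics import median

def histogram_counts(values, bin_width):
    """Count how many values fall in each bin [start, start + bin_width)."""
    first = (min(values) // bin_width) * bin_width
    last = (max(values) // bin_width) * bin_width
    # Include empty bins so the gap between clusters stays visible.
    counts = {start: 0 for start in range(first, last + bin_width, bin_width)}
    for value in values:
        counts[(value // bin_width) * bin_width] += 1
    return counts

# Hypothetical commute times in minutes from two clusters (invented).
times = [10, 11, 12, 12, 13, 14, 30, 31, 32, 32, 33, 34]

counts = histogram_counts(times, bin_width=5)
middle = median(times)

for start, count in counts.items():
    print(f"{start:>2}-{start + 4:<2} {'#' * count}")
print("median:", middle)
```

The histogram shows two separate peaks with empty bins between them, while the median lands in the gap where no data lie. A box plot of the same values would give no hint of the two clusters.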
Key Vocabulary
| Term | Definition |
| --- | --- |
| Box Plot | A visual representation of the distribution of data through quartiles. It shows the minimum, first quartile (Q1), median, third quartile (Q3), and maximum values. |
| Histogram | A graphical display in which data are grouped into bins (intervals) and the frequency of data points in each bin is shown by the height of a bar. |
| Interquartile Range (IQR) | The difference between the third quartile (Q3) and the first quartile (Q1) of a data set, representing the spread of the middle 50% of the data. |
| Median | The middle value in a data set when the data is ordered from least to greatest. It is a measure of central tendency. |
| Outlier | A data point that is significantly different from other data points in a data set. Box plots often use fences to identify potential outliers. |