Box Plots and Data Comparison
Drawing and interpreting box plots to compare distributions of two or more datasets.
About This Topic
Box plots summarise data distributions by showing the median, lower quartile, upper quartile, interquartile range, and outliers. Year 10 students order datasets to find these values, then draw the plot: a box spanning the lower quartile to the upper quartile with a median line inside, and whiskers extending to the smallest and largest non-outlier values. This reveals central tendency, spread, and skewness without needing every data point plotted.
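The calculation step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: it assumes the median-of-halves convention for quartiles (the middle value is left out of both halves when the count is odd), and the function name `five_number_summary` and the reaction times are made up for the example. Other quartile conventions can give slightly different values, and this sketch does not screen for outliers.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two numbers."""
    ordered = sorted(values)              # quartiles need ordered data
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]           # values before the middle position
    upper_half = ordered[half + n % 2:]   # skip the middle value when n is odd
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

# Hypothetical reaction times in milliseconds (illustration data only).
times = [310, 250, 420, 280, 350, 300, 270, 390, 330]
low, q1, med, q3, high = five_number_summary(times)
iqr = q3 - q1

print(low, q1, med, q3, high)  # 250 275.0 310 370.0 420
print(iqr)                     # 95.0
```

The box would run from 275 to 370 with the median line at 310, and the whiskers would reach 250 and 420.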
In the GCSE Statistics unit on statistical measures and graphs, students compare box plots from two or more sets, such as pupil reaction times or exam marks across year groups. They note whether a narrower box indicates less variability and whether differing medians indicate a shift in centre. They justify choosing box plots over histograms by discussing how compactly box plots show skew and outliers when several datasets share one axis.
Active learning suits this topic well. Students gather real data like step counts from fitness trackers, build plots collaboratively, and debate comparisons. This approach makes statistics concrete, encourages precise calculation through peer checks, and builds skills in visual interpretation that stick for exams.
Key Questions
- Explain what each section of a box plot reveals about data distribution.
- Compare two datasets using their box plots, focusing on central tendency and spread.
- Justify the use of box plots for visual comparison of data distributions.
Learning Objectives
- Calculate the median, quartiles, and interquartile range for two or more datasets.
- Construct accurate box plots representing given datasets, including whiskers and median lines.
- Compare and contrast the central tendency and spread of two or more distributions using their box plots.
- Justify the selection of box plots over other graphical representations for comparing specific datasets, considering skewness and outliers.
Before You Start
Why: Students need to be able to find the median of a dataset, which is a key component of a box plot.
Why: Understanding how to find the range and IQR is fundamental to interpreting the spread shown in a box plot.
Why: The process of finding quartiles and the median requires data to be ordered from smallest to largest.
Key Vocabulary
| Median | The middle value in an ordered dataset, dividing the data into two equal halves. |
| Quartiles | Values that divide an ordered dataset into four equal parts; the lower quartile (Q1) is the median of the lower half, and the upper quartile (Q3) is the median of the upper half. |
| Interquartile Range (IQR) | The difference between the upper quartile (Q3) and the lower quartile (Q1), representing the spread of the middle 50% of the data. |
| Outlier | A data point that is significantly different from other observations in the dataset, often calculated as being more than 1.5 times the IQR below Q1 or above Q3. |
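The 1.5 × IQR rule in the outlier definition can be sketched as follows. This is an illustrative example under stated assumptions: quartiles are found by the median-of-halves method (middle value excluded when the count is odd), and the helper names `quartiles` and `find_outliers` and the marks are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    return median(ordered[:half]), median(ordered[half + n % 2:])

def find_outliers(values):
    """Return values more than 1.5 x IQR below Q1 or above Q3."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical exam marks (illustration data only).
marks = [52, 55, 58, 60, 61, 63, 65, 67, 70, 98]
print(quartiles(marks))      # (58, 67)
print(find_outliers(marks))  # [98]
```

Here the IQR is 9, so the fences sit at 44.5 and 80.5; only the mark of 98 falls outside them, and the upper whisker would stop at 70.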
Watch Out for These Misconceptions
Common Misconception: The line inside the box shows the mean.
What to Teach Instead
It marks the median. Pairs activities calculating both mean and median from the same data, then plotting, highlight the difference. Students teach each other, cementing the distinction through discussion.
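A quick check of the kind pairs might run shows how far the two measures can drift apart. This is a small sketch using Python's standard `statistics` module; the dataset is invented, with one large value to exaggerate the gap.

```python
from statistics import mean, median

# Hypothetical dataset with one extreme value (illustration data only).
data = [2, 3, 3, 4, 5, 6, 40]

print(mean(data))    # 9
print(median(data))  # 4
```

The line inside the box would be drawn at 4, not 9: the single value of 40 pulls the mean upward but leaves the median unchanged.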
Common Misconception: Outliers are errors to ignore.
What to Teach Instead
Outliers are often genuine extreme values rather than mistakes. Group analysis of datasets such as sports records shows how outliers contribute to the full picture of a distribution. Debating whether to include them builds judgement about data validity.
Common Misconception: Box plots require at least 30 data points.
What to Teach Instead
They work for smaller sets. Class surveys with 15-20 points, plotted and compared, demonstrate this. Hands-on construction reveals flexibility early.
Active Learning Ideas
Pairs Task: Heights Comparison
Pairs measure and record heights of 20 classmates, split by gender. Order data to calculate medians and quartiles, draw side-by-side box plots. Discuss which group has greater spread and why.
Small Groups: Reaction Time Challenge
Provide each group with a stopwatch to record 30 reaction times to a signal, split into two conditions such as rested versus tired. Groups construct box plots, compare medians and ranges, and justify which condition performs better.
Whole Class: Sleep Survey Analysis
Conduct a 1-minute survey on weekly sleep hours. Collate the data on the board, split by type of day. The class draws shared box plots, then votes on interpretations of spread and outliers.
Individual: Invent and Interpret
Students invent two small datasets from sports scores. Draw individual box plots, swap with a partner to interpret and critique. Share strongest comparisons with class.
Real-World Connections
- Sports analysts use box plots to compare player statistics, such as the distribution of points scored per game by forwards versus midfielders in a football league, to identify performance trends.
- Financial advisors might use box plots to compare the historical returns of different investment funds over a specific period, helping clients understand variability and potential risk.
- Medical researchers can employ box plots to compare the effectiveness of different treatments by visualising the distribution of patient recovery times or symptom severity scores.
Assessment Ideas
Provide students with two sets of data (e.g., test scores from two different classes). Ask them to calculate the median, Q1, Q3, and IQR for each set, and then draw comparative box plots on the same axis.
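A teacher preparing an answer key for this task might use a sketch like the one below. It is illustrative only: the quartiles use the median-of-halves method (middle value excluded when the count is odd), the names `summarise` and `compare` are hypothetical, and both sets of scores are invented.

```python
from statistics import median

def summarise(values):
    """Return the median, Q1, Q3 and IQR of a dataset as a dictionary."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + n % 2:])
    return {"median": median(ordered), "q1": q1, "q3": q3, "iqr": q3 - q1}

def compare(name_a, data_a, name_b, data_b):
    """Compare centre (median) and spread (IQR); ties default to name_b."""
    a, b = summarise(data_a), summarise(data_b)
    higher = name_a if a["median"] > b["median"] else name_b
    tighter = name_a if a["iqr"] < b["iqr"] else name_b
    return (f"{higher} has the higher median; "
            f"{tighter} has the smaller IQR (more consistent).")

# Hypothetical test scores for two classes (illustration data only).
class_a = [45, 52, 58, 61, 64, 66, 70, 73, 79, 85]
class_b = [55, 58, 60, 62, 63, 65, 66, 68, 70, 72]

print(summarise(class_a))  # {'median': 65.0, 'q1': 58, 'q3': 73, 'iqr': 15}
print(summarise(class_b))  # {'median': 64.0, 'q1': 60, 'q3': 68, 'iqr': 8}
result = compare("Class A", class_a, "Class B", class_b)
print(result)
```

On these figures Class A has a slightly higher median (65 against 64) while Class B has the narrower box (IQR 8 against 15), which is the kind of two-part statement students are expected to write when comparing box plots.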
Present students with two box plots, one representing exam scores for a class that used a new teaching method and another for a class using the traditional method. Ask: 'Which class performed better overall, and how do you know? What does the spread of the data tell you about the consistency of learning in each class?'
Give each student a box plot. Ask them to write down: 1. The value of the median. 2. The range of the middle 50% of the data. 3. One observation about the distribution (e.g., is it skewed, is there a large spread?).
Frequently Asked Questions
How do you teach students to draw box plots accurately?
Why use box plots to compare datasets in GCSE stats?
How can active learning help with box plots?
What errors occur when interpreting box plot comparisons?