Shapes of Distributions
Identifying normal, skewed, and bimodal distributions and their implications.
About This Topic
Box plots (or box-and-whisker plots) provide a visual summary of a data set's distribution based on a five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. In 9th grade, students use these plots to compare different data sets and identify variability. This topic is central to the Common Core standards for summarizing and comparing data distributions.
Students learn to use the Interquartile Range (IQR) to measure the 'spread' of the middle 50% of the data, which is a more stable measure than the full range. This topic comes alive when students can create 'human box plots' where they physically stand in a line and divide themselves into quartiles, making the abstract concept of '25% of the data' a visible reality.
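To make the five-number summary and the IQR concrete, here is a minimal Python sketch. It assumes the "median of halves" method for quartiles that is common in school textbooks (other quartile conventions give slightly different values), and the quiz scores are hypothetical values invented for illustration.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above it (middle value skipped when n is odd)
    return data[0], median(lower), median(data), median(upper), data[-1]

# Hypothetical quiz scores, invented for illustration.
scores = [4, 7, 8, 10, 12, 13, 15, 18, 21]
minimum, q1, med, q3, maximum = five_number_summary(scores)
iqr = q3 - q1                 # spread of the middle 50% of the data
full_range = maximum - minimum

# Replace the largest score with an extreme value and recompute.
stretched = scores[:-1] + [60]
s_min, s_q1, s_med, s_q3, s_max = five_number_summary(stretched)
stretched_iqr = s_q3 - s_q1
stretched_range = s_max - s_min

print("IQR:", iqr, "Range:", full_range)
print("IQR with extreme value:", stretched_iqr, "Range:", stretched_range)
```

In this example the range grows from 17 to 56 when one score becomes extreme, while the IQR stays at 9.0, which is what "more stable" means here.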
Key Questions
- Analyze what real-world phenomena typically follow a normal distribution.
- Explain how the tail of a distribution influences the mean.
- Justify why a bimodal distribution might suggest the presence of two different groups.
Learning Objectives
- Classify given data sets as representing normal, skewed, or bimodal distributions based on their graphical representations.
- Explain how the position of the mean relative to the median indicates the direction and severity of skew in a distribution.
- Analyze the characteristics of a bimodal distribution to infer the potential presence of two distinct underlying groups within the data.
- Compare the implications of a normal distribution versus a skewed distribution for making predictions about future data points.
Before You Start
Why: Students need to understand how to calculate and interpret the mean and median to analyze their relationship within different distribution shapes.
Why: Students must be able to read and interpret histograms to identify the visual patterns characteristic of normal, skewed, and bimodal distributions.
Key Vocabulary
| Term | Definition |
| --- | --- |
| Normal Distribution | A symmetrical, bell-shaped distribution where data clusters around the mean, with most values close to the mean and fewer values farther away. |
| Skewness | A measure of the asymmetry of a probability distribution. A distribution can be skewed left (negative skew) or skewed right (positive skew). |
| Bimodal Distribution | A distribution with two distinct peaks, suggesting that the data set may be composed of two separate groups or populations. |
| Mean | The average of a data set, calculated by summing all values and dividing by the number of values. It is sensitive to outliers and extreme values. |
| Median | The middle value in a data set when the data is ordered from least to greatest. Unlike the mean, it is resistant to outliers and extreme values. |
Watch Out for These Misconceptions
Common Misconception: Students often think a longer 'whisker' or a wider 'box' means there are more data points in that section.
What to Teach Instead
Use the 'Human Box Plot.' Peer discussion helps students realize that each of the four sections contains the SAME number of people; a wider section just means those people's values are more spread out.
Common Misconception: Students confuse the median (the line in the box) with the mean.
What to Teach Instead
Have students calculate both for a skewed data set. Collaborative analysis of the box plot shows that the median is a physical 'middle' of the sorted list, which may not be the same as the 'balance point' (mean).
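The "calculate both" exercise can be sketched in a few lines of Python. The allowance values are hypothetical and chosen so that one large value creates a right tail. Comparing the mean with the median is only a rule of thumb for the direction of skew, not a formal test.

```python
from statistics import mean, median

# Hypothetical weekly allowances in dollars; the single large value forms a right tail.
allowances = [5, 8, 10, 10, 12, 15, 80]

avg = mean(allowances)     # the "balance point", pulled toward the tail
mid = median(allowances)   # the middle value of the sorted list

# Rule of thumb: the mean is usually pulled toward the longer tail.
if avg > mid:
    direction = "right (positive) skew"
elif avg < mid:
    direction = "left (negative) skew"
else:
    direction = "roughly symmetric"

print(f"mean = {avg}, median = {mid}, suggests {direction}")
```

Here the mean is 20 while the median is 10, so the one extreme value doubles the "average" without moving the middle of the list.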
Active Learning Ideas
Simulation Game: The Human Box Plot
The whole class stands in order of their birth month or height. Students are then 'divided' into four equal groups to find the median and quartiles. They use a long rope to create the 'box' and 'whiskers' around the students standing at the key positions.
Inquiry Circle: Comparing the Leagues
Groups are given the heights of players from two different sports (e.g., NBA vs. MLB). They create box plots for both on the same scale and must write a report comparing the 'typical' height and the 'consistency' (spread) of the two groups.
Think-Pair-Share: Outlier Detectives
Give students a data set with one extreme value. Pairs must use the 1.5 × IQR rule (a value is flagged if it lies below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR) to determine mathematically whether that value qualifies as an outlier, then discuss whether it should be included in a final report.
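One way to check the outlier rule by computer is sketched below. It assumes the median-of-halves method for Q1 and Q3, so the fences may differ slightly under other quartile conventions. The function names and the commute times are hypothetical, invented for illustration.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    return median(data[:half]), median(data[half + n % 2:])

def iqr_outliers(values, k=1.5):
    """Return the values outside the fences, plus the fences themselves."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    flagged = [v for v in values if v < low_fence or v > high_fence]
    return flagged, (low_fence, high_fence)

# Hypothetical commute times in minutes, with one extreme value.
commutes = [12, 15, 15, 18, 20, 22, 25, 27, 30, 95]
outliers, fences = iqr_outliers(commutes)

print("Fences:", fences)
print("Outliers:", outliers)
```

With these values Q1 is 15 and Q3 is 27, so the fences sit at -3 and 45 and only the 95-minute commute is flagged. Whether to keep it in a report is still a judgment call for the pairs to discuss.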
Real-World Connections
- Height measurements for adult males in a large population are approximately normally distributed. This allows clothing manufacturers to predict the most common sizes needed for production.
- Test scores for an exam might show a negatively skewed (left-skewed) distribution, with most students scoring high but a few scoring very low. This suggests the test was manageable for most students, while a small group struggled and pulled the mean below the median.
- Customer satisfaction survey data could reveal a bimodal distribution if there are two distinct groups of customers: those highly satisfied and those highly dissatisfied, indicating potential issues with different aspects of a product or service.
Assessment Ideas
Provide students with three histograms, each representing a different distribution (normal, right-skewed, left-skewed). Ask them to label each histogram with the correct distribution type and briefly explain their reasoning, referencing the shape and the relative positions of the mean and median.
Present students with a scenario: 'A study found that the number of hours students in a school spent on homework per week had two peaks, one around 3 hours and another around 8 hours.' Ask students: 1. What type of distribution does this suggest? 2. What might this tell us about the student population?
Pose the question: 'Imagine you are analyzing the salaries of employees at a company. If the distribution is heavily skewed to the right, what does this imply about the salary structure? How would this differ if the distribution were normal?' Facilitate a class discussion on the implications for understanding typical earnings.
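For the homework scenario above, a quick text histogram can help students see the two peaks before discussing what they mean. The sketch below uses hypothetical data for 30 students, invented to match the scenario (peaks near 3 and 8 hours). The peak check is deliberately naive: it flags any value whose count is higher than both neighbors, which works for tidy classroom data but would be too sensitive for noisy real-world data.

```python
from collections import Counter

# Hypothetical weekly homework hours for 30 students, invented for illustration.
hours = [1, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6,
         7, 7, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11]

counts = Counter(hours)  # a missing key counts as 0
span = range(min(hours), max(hours) + 1)

# Print one row of '#' marks per hour value.
for h in span:
    print(f"{h:>2} | {'#' * counts[h]}")

# Naive peak check: a count strictly higher than both neighboring counts.
peaks = [h for h in span
         if counts[h] > counts[h - 1] and counts[h] > counts[h + 1]]

print("Peaks at:", peaks)
```

Two peaks in the output suggest a bimodal distribution, which students might explain by proposing two groups, for example students in different course loads.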
Frequently Asked Questions
What is the 'five-number summary'?
How can active learning help students understand box plots?
How do you calculate the Interquartile Range (IQR)?
Why are box plots useful for comparing data?