Comparing Data Distributions: Activities & Teaching Strategies
Active learning works for comparing data distributions because students need to manipulate real numbers, visualize shifts in data, and justify their reasoning aloud. When they adjust outliers or compare class heights, they connect abstract measures like mean and MAD to concrete outcomes they can see and debate.
Learning Objectives
1. Calculate the mean, median, and mean absolute deviation for two different data sets.
2. Compare the measures of center (mean and median) and spread (MAD) for two populations, identifying the impact of outliers.
3. Explain why the median is a more appropriate measure of center than the mean for data sets with extreme outliers.
4. Evaluate the variability of two data sets to determine the confidence level in predictions made about each population.
Pairs: Outlier Adjustment
Give pairs two data sets on cardstock, one of which contains an outlier such as an extreme test score. Pairs calculate the mean, median, and MAD before and after removing the outlier. They sketch dot plots, record the changes in a shared chart, and then share their findings with the class.
Which measure of center is most affected by extreme outliers in a data set?
Facilitation Tip: During the Outlier Adjustment activity, circulate and ask pairs: 'How does removing or adding an outlier change the mean compared to the median?' to prompt immediate reflection.
Setup: Groups at tables with case materials
Materials: Case study packet (3-5 pages), Analysis framework worksheet, Presentation template
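The before-and-after comparison in this activity can be sketched in a few lines of Python, which may be useful for checking pairs' hand calculations. This is a minimal sketch, not part of the original materials: the test scores and the helper names `mad` and `summarize` are made up for illustration.

```python
from statistics import mean, median

def mad(values):
    """Mean absolute deviation: average distance of each value from the mean."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)

def summarize(values):
    """Collect the three measures used in the activity."""
    return {"mean": mean(values), "median": median(values), "mad": mad(values)}

# Hypothetical test scores; 12 plays the role of the extreme outlier.
scores = [78, 82, 85, 88, 90, 12]
without_outlier = [s for s in scores if s != 12]

before = summarize(scores)
after = summarize(without_outlier)

for name in ("mean", "median", "mad"):
    print(f"{name:>6}: before = {before[name]:.2f}, after = {after[name]:.2f}")
```

With these made-up scores, dropping the outlier moves the mean from 72.5 to 84.6, while the median moves only from 83.5 to 85. The MAD falls from about 20.2 to 3.68.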
Small Groups: Class Height Comparison
Measure the heights of students in two groups, for example grouped by birth month. Groups create side-by-side dot plots and compute the measures of center and the MAD for each group. They then discuss which group's heights cluster more tightly around a typical value and why variability matters.
How does the 'spread' or variability of data impact our confidence in a prediction?
Facilitation Tip: For the Class Height Comparison, ensure each small group measures their own heights first to create authentic data sets they care about analyzing.
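For teachers who want a quick digital check of the groups' hand-drawn plots, the sketch below prints two text dot plots on a shared scale. It is an illustration under stated assumptions: the heights (in centimeters) are invented, `dot_plot` is a hypothetical helper, and the plot is rotated so that each row is one height value and each `*` is one student.

```python
from collections import Counter

def dot_plot(values, low, high):
    """Return a text dot plot with one row per whole-number value from low to high."""
    counts = Counter(values)
    rows = []
    for v in range(low, high + 1):
        # Counter returns 0 for values that never occur, giving an empty row.
        rows.append(f"{v:>4} | {'*' * counts[v]}".rstrip())
    return "\n".join(rows)

# Hypothetical heights in centimeters for two groups of students.
group_a = [150, 152, 152, 153, 155]
group_b = [148, 152, 156, 156, 158]

# A shared scale keeps the two plots directly comparable.
low = min(group_a + group_b)
high = max(group_a + group_b)

for name, group in (("Group A", group_a), ("Group B", group_b)):
    print(name)
    print(dot_plot(group, low, high))
    print()
```

Putting both plots on the same scale is the design choice that matters here: a wider scatter of dots in one group is then visible before any measure is computed.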
Whole Class: Prediction Challenge
Display two data sets on the board, such as rainfall totals for two cities. The class votes on predictions, then calculates the measures together. Adjust the data live based on student suggestions to show how spread affects confidence in a prediction.
When is the median a better representation of a 'typical' value than the mean?
Facilitation Tip: In the Prediction Challenge, intentionally include one set with high variability to highlight why MAD matters when predicting future values.
Individual: Data Doctor
Students receive mixed data sets drawn from sports or weather. Each student identifies the best measures for comparing them and justifies the choice in writing. Follow with a pair share to refine arguments.
Which measure of center is most affected by extreme outliers in a data set?
Facilitation Tip: During Data Doctor, require students to write a brief justification for their diagnosis using at least two measures of center and variability.
Teaching This Topic
Experienced teachers approach this topic by having students repeatedly calculate mean, median, and MAD while altering one data value at a time. They avoid teaching these measures in isolation, instead embedding them in tasks where students must defend their choices. Research suggests that starting with physical manipulatives, such as sticky notes on a board, helps students grasp how outliers skew data before they move to abstract calculations.
What to Expect
Successful learning looks like students selecting the most appropriate measure of center based on context, explaining how variability affects prediction confidence, and using data displays to support their comparisons. They should articulate why a data set's median can be more representative than its mean, especially when outliers are present.
Watch Out for These Misconceptions
Common Misconception: During the Outlier Adjustment activity, watch for students who automatically choose the mean as the best measure of center without considering the presence of outliers.
What to Teach Instead
After pairs adjust the outlier, ask them to recalculate both mean and median and explain which value better represents a typical data point in their adjusted set, using their plotted points as visual evidence.
Common Misconception: During the Class Height Comparison activity, watch for students who confuse MAD with the range because both describe spread.
What to Teach Instead
Have groups calculate both measures side by side and ask: 'Why does MAD, which uses every data point, give a clearer picture of variability than the range, which uses only the highest and lowest values?'
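A small example can make the contrast concrete. The sketch below, which assumes two invented data sets and the hypothetical helpers `mad` and `data_range`, shows two sets that share the same range but have very different MADs because of how the middle values are arranged.

```python
from statistics import mean

def mad(values):
    """Mean absolute deviation: average distance of each value from the mean."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)

def data_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)

# Hypothetical data: same lowest and highest values, different middles.
clustered = [10, 20, 20, 20, 20, 20, 30]
scattered = [10, 10, 10, 20, 30, 30, 30]

for name, data in (("clustered", clustered), ("scattered", scattered)):
    print(f"{name}: range = {data_range(data)}, MAD = {mad(data):.2f}")
```

Both sets have a range of 20, yet the MAD is about 2.86 for the clustered set and about 8.57 for the scattered one. The range cannot tell the two apart because it ignores everything between the extremes.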
Common Misconception: During the Prediction Challenge activity, watch for students who assume similar means indicate identical distributions.
What to Teach Instead
After the whole-class simulation, display two sets with close means but different MADs and ask students to predict the next value, then discuss how variability affects confidence in their predictions.
Assessment Ideas
After the Outlier Adjustment activity, collect students’ adjusted data sets and their written justifications for choosing the mean or median as the better measure of center in each scenario.
During the Class Height Comparison activity, listen for students’ explanations of how variability in their data sets affects whether they’d use the mean or median to describe a 'typical' height in their class.
After the Prediction Challenge, give students two new data sets with different spreads and ask them to write one sentence explaining how the MAD of each set would affect their confidence in predicting the next value.
Extensions & Scaffolding
- Challenge students to create a data set where the mean and median are identical but the MADs are very different, then explain how this impacts predictions.
- Scaffolding: Provide pre-labeled dot plots for students to compare, focusing first on identifying which measure of center fits best before they calculate.
- Deeper exploration: Have students research a real-world scenario (e.g., salaries, temperatures) where comparing distributions led to a policy decision, and present how measures of center and variability informed the outcome.
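For the challenge item above, one possible construction (among many) is to build data sets that are symmetric about the same center, since symmetry forces the mean and median to coincide. The sketch below is only an illustration: `symmetric_set` and `mad` are hypothetical helpers and the numbers are invented.

```python
from statistics import mean, median

def mad(values):
    """Mean absolute deviation: average distance of each value from the mean."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)

def symmetric_set(center, offsets):
    """Build a data set symmetric about center, so mean and median both equal center."""
    below = [center - o for o in offsets]
    above = [center + o for o in offsets]
    return sorted(below + [center] + above)

narrow = symmetric_set(50, [1, 2])    # values close to the center
wide = symmetric_set(50, [20, 40])    # values far from the center

for name, data in (("narrow", narrow), ("wide", wide)):
    print(f"{name}: {data}")
    print(f"  mean = {mean(data)}, median = {median(data)}, MAD = {mad(data):.1f}")
```

Both sets have a mean and median of 50, but the MAD is 1.2 for the narrow set and 24 for the wide one, so a prediction of "about 50" is far more trustworthy for the first.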
Key Vocabulary
| Term | Definition |
| --- | --- |
| Mean | The average of a data set, calculated by summing all values and dividing by the number of values. |
| Median | The middle value in a data set when the values are arranged in order; if there are two middle values, it is the average of those two. |
| Mean Absolute Deviation (MAD) | The average distance of each data point from the mean of the data set, indicating the spread or variability. |
| Outlier | A data point that is significantly different from other observations in a data set. |
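The definitions of the mean and the MAD above can be written as formulas. The notation here is an assumption for illustration: $x_1, \dots, x_n$ are the data values, $n$ is how many there are, and $\bar{x}$ is the mean.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert
```

As a worked example with the made-up data set 2, 4, 6, 8: the mean is $(2 + 4 + 6 + 8)/4 = 5$, the distances from the mean are 3, 1, 1, and 3, and the MAD is $(3 + 1 + 1 + 3)/4 = 2$.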
Suggested Methodologies
Planning templates for Mathematics
5E Model
The 5E Model structures lessons through five phases (Engage, Explore, Explain, Elaborate, and Evaluate), guiding students from curiosity to deep understanding through inquiry-based learning.
Unit Planner: Math Unit
Plan a multi-week math unit with conceptual coherence: from building number sense and procedural fluency to applying skills in context and developing mathematical reasoning across a connected sequence of lessons.
Rubric: Math Rubric
Build a math rubric that assesses problem-solving, mathematical reasoning, and communication alongside procedural accuracy, giving students feedback on how they think, not just whether they got the right answer.