Probability and Inferential Statistics · Weeks 19-27

Statistical Significance

Using p values and confidence intervals to evaluate the validity of experimental claims.


Key Questions

  1. What does it truly mean for a result to be statistically significant?
  2. How does the choice of confidence level affect the width of a confidence interval?
  3. Why can we never 'prove' a null hypothesis, but only fail to reject it?

Common Core State Standards

CCSS.Math.Content.HSS.IC.A.2 · CCSS.Math.Content.HSS.IC.B.5
Grade: 12th Grade
Subject: Mathematics
Unit: Probability and Inferential Statistics
Period: Weeks 19-27

About This Topic

Statistical significance is one of the most important, and most frequently misunderstood, concepts in data analysis. In US AP Statistics and senior mathematics courses, students learn that a result is statistically significant when the p-value falls below the predetermined significance level α. This means the observed result would be unlikely if the null hypothesis were true, not that the result is large, important, or meaningful in a practical sense.

Confidence intervals and p-values are complementary tools for evaluating significance. When a 95% confidence interval for a difference does not include zero, the corresponding two-tailed test at α = 0.05 will also yield a significant result. Students who understand both perspectives can evaluate statistical claims more completely and identify cases where a technically significant result has negligible real-world impact.
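This duality between intervals and tests can be checked numerically. Below is a minimal sketch under an assumed normal (z) approximation, using a hypothetical estimated difference and standard error chosen purely for illustration:

```python
import math

# Hypothetical summary statistics for illustration: an estimated
# difference of 2.5 with a standard error of 1.0 (normal approximation).
diff, se = 2.5, 1.0

# 95% z-interval for the difference
z_crit = 1.959964
lo, hi = diff - z_crit * se, diff + z_crit * se

# Two-tailed p-value for H0: difference = 0
z = diff / se
p = math.erfc(abs(z) / math.sqrt(2))

# The interval excludes zero exactly when the test rejects at alpha = 0.05
print((round(lo, 2), round(hi, 2)), round(p, 4), (lo > 0 or hi < 0) == (p < 0.05))
```

Changing `diff` or `se` preserves the agreement: the interval excludes zero precisely when p < 0.05, because both are built from the same z statistic.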

A central philosophical point of this topic is why we can never prove a null hypothesis. The logic of statistical testing is asymmetric: we gather evidence against H₀, not for it. Failing to reject does not confirm H₀; it only means the data were not unusual enough to meet the evidence threshold. Active learning discussions and analysis of published research help students develop the critical mindset that statistics education ultimately aims to build.

Learning Objectives

  • Critique experimental claims by evaluating the relationship between p-values, significance levels, and the plausibility of the null hypothesis.
  • Calculate and interpret confidence intervals for population parameters, explaining how the confidence level impacts interval width and precision.
  • Compare and contrast the outcomes of hypothesis testing and confidence interval estimation for a given dataset.
  • Explain the asymmetry in hypothesis testing, articulating why one can fail to reject but never accept the null hypothesis.

Before You Start

Introduction to Hypothesis Testing

Why: Students need a foundational understanding of null and alternative hypotheses and the concept of testing claims with data.

Basic Probability and Distributions

Why: Understanding probability is essential for interpreting p-values and the likelihood of observed results under the null hypothesis.

Sampling Distributions

Why: Knowledge of sampling distributions is crucial for understanding how sample statistics relate to population parameters and for constructing confidence intervals.

Key Vocabulary

p-value: The probability of observing a test statistic as extreme as, or more extreme than, the one computed from sample data, assuming the null hypothesis is true.
Significance Level (α): A predetermined threshold for rejecting the null hypothesis. Commonly set at 0.05, it represents the maximum acceptable probability of a Type I error.
Confidence Interval: A range of values, derived from sample statistics, that is likely to contain the value of an unknown population parameter.
Null Hypothesis (H₀): A statement of no effect or no difference, which is tested against the sample data. It is the hypothesis that researchers aim to find evidence against.
Type I Error: Rejecting the null hypothesis when it is actually true. The probability of a Type I error is equal to the significance level α.
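The link between α and the Type I error rate can be demonstrated with a short simulation (an illustrative setup assumed here: normal data with known σ and a two-tailed z-test). When H₀ is true, a test at α = 0.05 should reject in roughly 5% of repeated samples:

```python
import math
import random

random.seed(1)
alpha, n, trials = 0.05, 30, 20_000

rejections = 0
for _ in range(trials):
    # H0 is TRUE here: the data really come from N(0, 1)
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(sample) / n) / (1 / math.sqrt(n))   # z statistic, sigma known
    p = math.erfc(abs(z) / math.sqrt(2))         # two-tailed p-value
    rejections += p < alpha

type1_rate = rejections / trials
print(type1_rate)  # empirical Type I error rate, close to alpha
```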


Real-World Connections

Pharmaceutical companies use p-values and confidence intervals to determine if a new drug is significantly more effective than a placebo or existing treatments, influencing FDA approval decisions for medications like new antibiotics.

Market researchers analyze survey data using these statistical tools to assess whether a new advertising campaign has led to a statistically significant increase in product sales for brands such as Coca-Cola or Nike.

Political pollsters report margins of error, which are directly related to confidence intervals, to indicate the uncertainty in their estimates of public opinion for national elections.

Watch Out for These Misconceptions

Common Misconception: Statistical significance means the result is important or meaningful.

What to Teach Instead

Significance only says the result is unlikely under H₀, not that it matters practically. Effect size and context determine practical importance. Having students compare a medically trivial but highly significant result with a moderate effect in a small study illustrates the distinction effectively.

Common Misconception: A higher confidence level always gives a better result.

What to Teach Instead

Increasing confidence widens the interval, reducing precision. There is a direct trade-off: more confidence means less precision. Students who adjust a Desmos slider and watch the interval widen develop this intuition naturally through observation.
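The trade-off is easy to show numerically as well. A sketch with assumed sample values (s = 10, n = 25) and the standard normal critical values:

```python
import math

s, n = 10, 25            # assumed sample standard deviation and sample size
levels = [(0.90, 1.645), (0.95, 1.960), (0.99, 2.576)]  # z critical values

widths = []
for conf, z in levels:
    width = 2 * z * s / math.sqrt(n)   # full width of the interval
    widths.append(width)
    print(f"{conf:.0%} confidence -> interval width {width:.2f}")
# 90% -> 6.58, 95% -> 7.84, 99% -> 10.30: more confidence, wider interval
```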

Common Misconception: Failing to reject H₀ at α = 0.05 means the null is probably true.

What to Teach Instead

Failure to reject means only that the evidence threshold was not met; it is not evidence for H₀. This is the classic absence-of-evidence vs. evidence-of-absence distinction, best understood through structured discussion and concrete case examples.

Assessment Ideas

Discussion Prompt

Present students with a news headline reporting a statistically significant finding (e.g., "Study finds eating chocolate reduces stress by 10%"). Ask: "What is the null hypothesis here? What does 'statistically significant' likely mean in this context? What additional information, like the p-value or confidence interval, would you need to assess the practical importance of this finding?"

Quick Check

Provide students with a scenario: 'A researcher tests if a new fertilizer increases crop yield, finding a p-value of 0.03. The significance level was set at α = 0.05.' Ask them to: 1. State the conclusion regarding the null hypothesis. 2. Explain what the p-value of 0.03 means in this context. 3. Identify the type of error they might have made.

Exit Ticket

Ask students to write a short paragraph explaining the difference between a statistically significant result and a practically important result, using an example of their own or one discussed in class. They should also define 'confidence level' in their own words.


Frequently Asked Questions

What does 'statistically significant' actually mean?
A result is statistically significant when the p-value is less than the chosen significance level α. It means the observed result would be unlikely if the null hypothesis were true. It does not mean the effect is large, certain, or practically important; those judgments require additional context.
How does the confidence level affect the width of a confidence interval?
A higher confidence level (e.g., 99% vs. 90%) produces a wider interval because you need to cast a broader net to be more certain of capturing the true parameter. The trade-off is between certainty and precision; the right level depends on the consequences of being wrong in a given research context.
Why can we never prove the null hypothesis?
Statistical tests only evaluate how surprising the data are under H₀. Failing to reject simply means the data were not sufficiently unusual; there could be a real effect that the study lacked the power to detect. We can only reject or fail to reject; we cannot confirm the null is true.
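This point can be demonstrated by simulation (assumed setup: a small but real effect of 0.2 standard deviations, n = 20, two-tailed z-test). Most runs fail to reject even though H₀ is false:

```python
import math
import random

random.seed(2)
n, trials, mu = 20, 10_000, 0.2   # small but REAL effect: H0 (mu = 0) is false

failures = 0
for _ in range(trials):
    sample = [random.gauss(mu, 1) for _ in range(n)]
    z = (sum(sample) / n) / (1 / math.sqrt(n))   # z statistic, sigma known
    p = math.erfc(abs(z) / math.sqrt(2))         # two-tailed p-value
    failures += p >= 0.05                        # failed to reject H0

fail_rate = failures / trials
print(fail_rate)  # the large majority of runs fail to reject despite a true effect
```

Failing to reject here says nothing about H₀ being true; the study is simply underpowered for an effect this small.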
How does active learning help students think critically about statistical significance?
Analyzing real cases of statistical overclaiming, replication failures, and misleading headlines requires evaluative thinking that drill problems cannot build. Structured discussions and case studies help students apply significance concepts to messy, real contexts, which is exactly what AP exam investigative tasks require.