Mathematics · 12th Grade · Probability and Inferential Statistics · Weeks 19-27

Hypothesis Testing: T-Tests

Performing t-tests for population means when the population standard deviation is unknown.

Common Core State Standards: CCSS.Math.Content.HSS.IC.B.5

About This Topic

Hypothesis testing with t-tests is a cornerstone of AP Statistics and data literacy education at the 12th grade level in the US. When the population standard deviation is unknown, which is the norm in real research, statisticians substitute the sample standard deviation, and the resulting test statistic follows a t-distribution rather than the standard normal, provided the data are approximately normal or the sample is reasonably large. The t-distribution has heavier tails than the normal, reflecting the extra uncertainty introduced by estimating the standard deviation from sample data.

Degrees of freedom, calculated as n - 1 for one-sample and paired tests, control how heavy those tails are. With small samples, the distribution is noticeably wider; as sample size grows, it converges toward the standard normal. Students encounter three test variants: the one-sample t-test compares a sample mean against a stated value, the two-sample t-test compares means from two independent groups, and the paired t-test analyzes within-pair differences when observations are matched or are repeated measures on the same subject.
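
The three test statistics described above can be written out as follows. This is a sketch in standard textbook notation rather than notation taken from this page: \bar{x} is a sample mean, s a sample standard deviation, \mu_0 the hypothesized value, and \bar{d} and s_d the mean and standard deviation of the within-pair differences. The two-sample statistic is shown in its unpooled form, which is an assumption, since the text does not say whether variances are pooled. Its degrees of freedom depend on the method: software typically uses the Welch approximation, while a conservative hand calculation uses the smaller of n_1 - 1 and n_2 - 1.

```latex
\begin{align*}
\text{One-sample:} \quad & t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, & df &= n - 1 \\[6pt]
\text{Two-sample (unpooled):} \quad & t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, & df &\text{ depends on the method used} \\[6pt]
\text{Paired:} \quad & t = \frac{\bar{d}}{s_d / \sqrt{n}}, & df &= n - 1 \quad (n = \text{number of pairs})
\end{align*}
```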

Active learning is well suited to t-test instruction because the hardest skill is not computation but scenario judgment. Students who argue about which test applies to ambiguous cases, check conditions together, and interpret p-values in context retain the reasoning process rather than just the steps.

Key Questions

  1. Explain why the t-distribution is used instead of the normal distribution when sigma is unknown.
  2. Differentiate between one-sample, two-sample, and paired t-tests.
  3. Analyze the impact of degrees of freedom on the shape of the t-distribution.

Learning Objectives

  • Explain the rationale for using the t-distribution over the normal distribution when the population standard deviation is unknown.
  • Differentiate between the hypotheses and conditions for one-sample, two-sample independent, and paired t-tests.
  • Calculate the appropriate t-statistic for one-sample, two-sample independent, and paired scenarios.
  • Analyze the effect of sample size and degrees of freedom on the critical values and p-values of a t-test.
  • Interpret the results of a t-test in the context of a given research question, including stating conclusions in plain language.

Before You Start

Introduction to Statistical Inference

Why: Students need a foundational understanding of sampling distributions and the logic of hypothesis testing before applying t-tests.

Descriptive Statistics: Mean and Standard Deviation

Why: Calculating sample means and sample standard deviations is fundamental to computing t-statistics.

Normal Distribution and Z-scores

Why: Familiarity with the normal distribution and z-scores provides a basis for understanding the t-distribution's properties and its relationship to the normal distribution.

Key Vocabulary

t-distribution: A probability distribution that is bell-shaped and symmetric like the normal distribution, but has heavier tails. It is used for inference when the population standard deviation is unknown.
degrees of freedom (df): A parameter that characterizes the shape of the t-distribution, typically related to the sample size. For a one-sample t-test, df = n - 1.
null hypothesis (H0): A statement of no effect or no difference, which the t-test aims to find evidence against.
alternative hypothesis (Ha): A statement that contradicts the null hypothesis, proposing that there is an effect or difference.
p-value: The probability of observing a test statistic as extreme as, or more extreme than, the one calculated from the sample data, assuming the null hypothesis is true.

Watch Out for These Misconceptions

Common Misconception: The t-distribution and the normal distribution are separate, unrelated tools.

What to Teach Instead

As degrees of freedom increase, the t-distribution converges to the standard normal. For large enough samples, t and z critical values are nearly identical. Having students overlay both distributions in Desmos at varying df values makes this relationship concrete rather than abstract.
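
For teachers who want a numeric companion to the visual overlay, the convergence can also be checked with a short script. The sketch below uses only the Python standard library; the helper names `t_pdf` and `two_sided_tail`, the cutoff of 2, and the chosen df values are illustrative assumptions. It integrates the t density numerically (Simpson's rule) to estimate the probability of landing more than 2 units from the center, then compares that with the standard normal.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    # lgamma keeps the normalizing constant stable for larger df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_tail(cutoff: float, df: float, steps: int = 2000) -> float:
    """Estimate P(|T| > cutoff) using Simpson's rule on [0, cutoff].

    steps must be even for Simpson's rule.
    """
    h = cutoff / steps
    total = t_pdf(0.0, df) + t_pdf(cutoff, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_cutoff = total * h / 3
    # By symmetry, the central area is twice the area from 0 to cutoff.
    return 1 - 2 * area_zero_to_cutoff


cutoff = 2.0
tails = {df: two_sided_tail(cutoff, df) for df in (3, 10, 30, 100)}
normal_tail = 2 * (1 - NormalDist().cdf(cutoff))

for df, tail in tails.items():
    print(f"df = {df:>3}: P(|T| > {cutoff}) is about {tail:.4f}")
print(f"normal  : P(|Z| > {cutoff}) is about {normal_tail:.4f}")
```

The tail probability shrinks toward the normal value as df grows, which is the same pattern students see when the curves are overlaid.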

Common Misconception: A paired t-test and a two-sample t-test can be used interchangeably depending on preference.

What to Teach Instead

Paired tests use within-pair differences and remove between-subject variability, making them more powerful when observations are genuinely matched or repeated. Using a two-sample test on paired data ignores the pairing, which typically inflates the standard error and reduces statistical power. Scenario card sorts where students must defend their choice help fix this distinction.
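
The effect on the standard error is easy to see with numbers. The sketch below uses made-up before and after scores for six students (hypothetical data, chosen only for illustration) and computes the test statistic both ways: correctly, from the within-pair differences, and incorrectly, by treating the two columns as independent groups with an unpooled two-sample formula.

```python
import math
from statistics import mean, stdev

# Hypothetical before/after scores for the same six students (illustrative only).
before = [72, 85, 64, 90, 78, 69]
after = [75, 88, 66, 94, 80, 73]

# Paired analysis: work with the within-pair differences.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
se_paired = stdev(diffs) / math.sqrt(n)
t_paired = mean(diffs) / se_paired

# Two-sample (unpooled) analysis: wrongly treats the columns as independent.
se_two = math.sqrt(stdev(before) ** 2 / len(before) + stdev(after) ** 2 / len(after))
t_two = (mean(after) - mean(before)) / se_two

print(f"mean difference       : {mean(diffs):.2f}")
print(f"paired     SE = {se_paired:.3f}, t = {t_paired:.2f}")
print(f"two-sample SE = {se_two:.3f}, t = {t_two:.2f}")
```

Both approaches see the same mean difference, but the two-sample standard error absorbs the large student-to-student spread, so its t statistic is far smaller.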

Common Misconception: A statistically significant result means the effect is large or practically important.

What to Teach Instead

Statistical significance only tells you that the result is unlikely under the null hypothesis. A tiny, practically meaningless difference can produce a very small p-value with a large sample. Pairing t-test results with effect size calculations and context-based interpretation helps students separate statistical from practical significance.
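
A quick calculation makes the point. The sketch below uses hypothetical summary statistics for a one-sample test (all numbers are invented for illustration) and reports both the t statistic and Cohen's d, one common effect size measure, chosen here as an assumption since the text does not name a specific one.

```python
import math

# Hypothetical summary statistics (illustrative only).
n = 100_000          # very large sample
sample_mean = 500.5  # observed mean
null_mean = 500.0    # value stated in the null hypothesis
sample_sd = 10.0     # sample standard deviation

# One-sample t statistic: difference divided by its standard error.
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - null_mean) / standard_error

# Cohen's d: difference expressed in standard deviation units.
cohens_d = (sample_mean - null_mean) / sample_sd

print(f"t statistic: {t_stat:.2f}")
print(f"Cohen's d  : {cohens_d:.3f}")
```

The t statistic is very large, so the p-value would be tiny, yet the difference is only a twentieth of a standard deviation. Whether that matters is a question of context, not of the test.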

Active Learning Ideas

Card Sort: Which T-Test Applies?

Groups of three receive a set of eight scenario cards and sort them into one-sample, two-sample, and paired categories. Each group writes a one-sentence justification for every card, then the class debriefs on the most contested cases. Disagreement between groups is the discussion goal, not just getting the right answer.

18 min · Small Groups

Think-Pair-Share: Estimating Sigma

Present a realistic research scenario where sigma is unknown and ask pairs to explain what changes when the sample standard deviation substitutes for the population value, and why that requires a different distribution. Each pair writes one sentence before the teacher formalizes the idea for the class.

10 min · Pairs

Desmos Degrees-of-Freedom Gallery

Students use a Desmos t-distribution slider to observe how the curve changes at df = 3, 10, 30, and 100, then sketch and annotate each shape in their notes. They answer three comparison questions about tail area and critical values before a brief whole-class discussion on the practical significance of sample size.

15 min · Pairs

Paired T-Test Lab: Before and After

Students collect a small paired dataset, such as dominant versus non-dominant hand grip strength or reaction time before and after a short warm-up, calculate the mean difference and its standard error by hand, and run the test. Each group writes a one-paragraph conclusion interpreting the p-value in plain language before comparing conclusions across groups.

30 min · Small Groups

Real-World Connections

  • Medical researchers use paired t-tests to evaluate the effectiveness of a new drug by measuring blood pressure in the same patients before and after treatment.
  • Quality control engineers in manufacturing might use a two-sample t-test to determine if there is a significant difference in the average length of bolts produced by two different machines.
  • Social scientists conduct one-sample t-tests to investigate if the average score on a standardized test for a particular school district differs significantly from the national average.

Assessment Ideas

Exit Ticket

Provide students with a scenario describing a research question. Ask them to: 1. Identify whether a one-sample, two-sample independent, or paired t-test is most appropriate. 2. State the null and alternative hypotheses in symbols. 3. List the conditions that must be met for the chosen test.

Discussion Prompt

Present students with two scenarios: one where the population standard deviation is known (use a z-test) and one where it is unknown (use a t-test). Ask: 'Why do we use different distributions in these cases? What is the practical implication of using the t-distribution?'

Quick Check

Give students a small dataset (e.g., 5 pairs of measurements). Ask them to calculate the mean difference and the sample standard deviation of the differences. Then, ask them to determine the degrees of freedom for a paired t-test on this data.
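
A worked version of this check, useful as an answer-key template, might look like the sketch below. The five pairs are hypothetical values invented for illustration, and the final t statistic goes one step beyond what the prompt asks, in case the check is extended.

```python
import math
from statistics import mean, stdev

# Hypothetical dataset: 5 pairs of measurements (illustrative only).
first = [12, 15, 11, 14, 13]
second = [14, 18, 12, 17, 14]

# Step 1: within-pair differences.
diffs = [b - a for a, b in zip(first, second)]

# Step 2: mean and sample standard deviation of the differences.
mean_diff = mean(diffs)
sd_diff = stdev(diffs)  # divides by n - 1

# Step 3: degrees of freedom for a paired t-test is (number of pairs) - 1.
n_pairs = len(diffs)
df = n_pairs - 1

# Optional extension: the paired t statistic.
t_stat = mean_diff / (sd_diff / math.sqrt(n_pairs))

print(f"differences        : {diffs}")
print(f"mean difference    : {mean_diff}")
print(f"sd of differences  : {sd_diff}")
print(f"degrees of freedom : {df}")
print(f"t statistic        : {t_stat:.3f}")
```

Students often use the number of individual measurements (10 here) instead of the number of pairs (5) when finding df, so that step is worth checking first.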

Frequently Asked Questions

Why use a t-distribution instead of the normal distribution when the population standard deviation is unknown?
Substituting the sample standard deviation for the population value introduces extra uncertainty that the normal distribution does not account for. The t-distribution compensates with heavier tails, producing wider confidence intervals and larger (more conservative) p-values for the same test statistic. As sample size grows, this extra uncertainty shrinks and the t-distribution approaches the standard normal.
What is the difference between a one-sample, two-sample, and paired t-test?
A one-sample t-test compares a sample mean to a known or hypothesized value. A two-sample t-test compares means from two independent groups. A paired t-test analyzes differences between matched pairs or repeated measures on the same subjects. The key distinction is whether the observations are independent or naturally linked.
How do degrees of freedom affect the t-distribution shape?
Smaller degrees of freedom produce a flatter, wider distribution with heavier tails, reflecting greater uncertainty about the population standard deviation. As degrees of freedom increase, the distribution narrows and the tails lighten. A common rule of thumb is that with around 30 or more degrees of freedom, the t-distribution is close to the standard normal, although t critical values remain slightly larger than the corresponding z values.
How does active learning help students understand t-tests?
T-test instruction often stalls at calculation steps rather than building judgment about when each test applies. Scenario-based card sorts, paired discussions about which test fits a given design, and real data labs where students interpret results in context develop the reasoning skills that AP free-response questions and real research both require.
