Mathematics · 12th Grade · Probability and Inferential Statistics · Weeks 19-27

Chi-Square Tests for Categorical Data

Using chi-square tests to analyze relationships between categorical variables (goodness-of-fit, independence).

Common Core State Standards: CCSS.Math.Content.HSS.IC.B.6

About This Topic

Chi-square tests address a type of question that other tests in the AP Statistics curriculum cannot: whether observed counts for categorical variables differ from expected counts by more than chance alone would explain. The two main procedures, goodness-of-fit and test of independence, appear in US 12th grade courses as students apply inferential reasoning beyond the continuous, normally distributed setting. A goodness-of-fit test checks whether a sample distribution matches a hypothesized distribution; a test of independence evaluates whether two categorical variables are associated in a two-way table.

Both tests use the same chi-square statistic: the sum of (observed - expected)²/expected across all cells. Large values indicate a large discrepancy between observed and expected counts, which produces a small p-value; when that p-value falls below the chosen significance level, the null hypothesis is rejected. The key intellectual tasks are computing expected counts correctly and verifying that each expected cell count is at least 5 so that the chi-square approximation is valid.
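As a minimal sketch of this arithmetic, the statistic and its p-value can be computed with the Python standard library. The counts below are invented for illustration (60 hypothetical rolls of a die tested against a fair-die null), and the function names are illustrative rather than part of any library. The sketch assumes a goodness-of-fit test, where the degrees of freedom equal the number of categories minus one.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected across all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_value(statistic, df):
    """Upper-tail area of the chi-square distribution with df degrees of freedom.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1)
    for the regularized upper incomplete gamma function, with y = statistic / 2.
    """
    y = statistic / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # starting value Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # starting value Q(1/2, y)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Hypothetical data: 60 rolls of a die, null hypothesis "the die is fair".
observed = [8, 12, 9, 11, 6, 14]
expected = [60 / 6] * 6          # 10 expected rolls per face

statistic = chi_square_statistic(observed, expected)
p_value = chi_square_p_value(statistic, df=len(observed) - 1)

print(f"chi-square = {statistic:.2f}, p-value = {p_value:.3f}")
```

For these invented counts the statistic is 4.2 with 5 degrees of freedom, and the p-value is roughly 0.52, so the data would not lead to rejecting the null hypothesis at the 0.05 level.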

Active learning fits chi-square well because the contexts are naturally engaging: Are M&M colors distributed as the company claims? Is there a relationship between study habits and grade category? These questions lend themselves to genuine data collection, which makes the procedure feel purposeful rather than abstract.

Key Questions

  1. Explain the purpose of a chi-square goodness-of-fit test versus a test of independence.
  2. Analyze how observed frequencies compare to expected frequencies in a chi-square test.
  3. Justify the conditions required for a valid chi-square test.

Learning Objectives

  • Compare the expected frequencies to observed frequencies for a given categorical data set to determine statistical significance.
  • Justify the conditions necessary for the valid application of chi-square goodness-of-fit and independence tests.
  • Calculate the chi-square test statistic and p-value for a given scenario involving categorical variables.
  • Distinguish between the null and alternative hypotheses for a chi-square goodness-of-fit test and a chi-square test of independence.
  • Evaluate the conclusion of a chi-square test based on its p-value and a chosen significance level.

Before You Start

Introduction to Probability

Why: Students need a foundational understanding of probability to grasp the concept of expected frequencies and how they relate to observed data.

Basic Concepts of Hypothesis Testing

Why: Students must be familiar with the general framework of hypothesis testing, including null and alternative hypotheses, p-values, and significance levels, to understand the application of chi-square tests.

Working with Proportions and Percentages

Why: Chi-square tests often involve comparing proportions or expected counts derived from proportions, making this skill essential.

Key Vocabulary

Categorical Variable: A variable that can take on one of a limited, and usually fixed, number of possible values, assigning each individual or other unit of observation to a particular group or nominal category.
Observed Frequency: The actual count of data points that fall into a specific category or cell in a study.
Expected Frequency: The count of data points that would be expected in a specific category or cell if the null hypothesis were true.
Chi-Square Statistic: A test statistic calculated from observed and expected frequencies, used to assess the goodness of fit or independence of categorical variables.
Goodness-of-Fit Test: A statistical test used to determine whether a sample distribution matches a hypothesized population distribution.
Test of Independence: A statistical test used to determine whether there is a significant association between two categorical variables in a population.

Watch Out for These Misconceptions

Common Misconception: Chi-square tests work with any cell size, including very small expected counts.

What to Teach Instead

The chi-square approximation becomes unreliable when expected cell counts fall below 5. Students must check this condition before running the test; some cells may need to be combined or a Fisher's exact test used instead. Condition-checking gallery walks build this as a consistent habit.
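The condition check can be sketched in a few lines of Python. The sample size, the proportions, and the helper name `small_expected_cells` are all hypothetical, chosen only to show a case where two cells fall below 5 and combining them resolves the problem.

```python
def small_expected_cells(expected_counts, minimum=5):
    """Return the positions of cells whose expected count is below `minimum`."""
    return [i for i, e in enumerate(expected_counts) if e < minimum]


# Hypothetical goodness-of-fit setup: n = 40 with four hypothesized proportions.
n = 40
proportions = [0.5, 0.3, 0.1, 0.1]
expected = [n * p for p in proportions]
flagged = small_expected_cells(expected)
print(expected, flagged)

# One possible remedy from the text: combine the two sparse categories.
combined = [0.5, 0.3, 0.1 + 0.1]
expected_combined = [n * p for p in combined]
flagged_combined = small_expected_cells(expected_combined)
print(expected_combined, flagged_combined)
```

If categories are combined this way, the observed counts for those categories must be added together as well, and the degrees of freedom drop because there is one fewer category.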

Common Misconception: A significant chi-square test result tells you which specific cells differ most.

What to Teach Instead

The overall test only tells you whether the null hypothesis of no difference or no association should be rejected. To identify which cells drove the result, students must examine individual (observed - expected)²/expected contributions, a follow-up analysis not automatically provided by the overall p-value.
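A short Python sketch of that follow-up analysis, using invented counts for 60 candies and a null hypothesis that three colors are equally likely. The data and the name `cell_contributions` are hypothetical.

```python
def cell_contributions(observed, expected):
    """Map each category to its (observed - expected)^2 / expected term."""
    return {
        category: (observed[category] - expected[category]) ** 2 / expected[category]
        for category in observed
    }


# Hypothetical counts for 60 candies, null hypothesis "colors are equally likely".
observed = {"red": 30, "blue": 18, "green": 12}
expected = {"red": 20, "blue": 20, "green": 20}

contributions = cell_contributions(observed, expected)
statistic = sum(contributions.values())          # the overall chi-square statistic
largest = max(contributions, key=contributions.get)

# List cells from largest to smallest contribution, with the direction of the gap.
for category, value in sorted(contributions.items(), key=lambda kv: kv[1], reverse=True):
    direction = "above" if observed[category] > expected[category] else "below"
    print(f"{category}: contribution {value:.2f} (observed {direction} expected)")
```

In this invented example the contributions are 5.0 for red, 3.2 for green, and 0.2 for blue, so red and green account for nearly all of the statistic of 8.4. The contributions show where the discrepancy is concentrated, but they are descriptive and do not by themselves test individual cells.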

Common Misconception: Goodness-of-fit and test of independence use different formulas.

What to Teach Instead

Both use the same chi-square statistic formula. The difference is in how expected counts are computed and what the null hypothesis states. Students who understand the shared formula can transfer reasoning between the two tests rather than treating them as entirely separate procedures.

Real-World Connections

  • Market researchers use chi-square tests of independence to analyze survey data, determining if customer preferences for product features (e.g., color, size) are associated with demographic groups (e.g., age, income). This helps companies tailor marketing strategies.
  • Biologists use goodness-of-fit tests to check inheritance patterns. They compare observed frequencies of offspring genotypes against the frequencies predicted by Mendelian genetics to identify deviations that might indicate evolutionary pressures or errors in data collection.
  • Political scientists use chi-square tests to analyze election results. They might test if voting patterns in different precincts are independent of the precinct's socioeconomic characteristics, helping to understand regional political behavior.

Assessment Ideas

Quick Check

Present students with a scenario describing a categorical data set and a research question (e.g., 'Does the distribution of M&M colors in a bag match the company's stated proportions?'). Ask them to write the null and alternative hypotheses and list the conditions they would need to check before performing a chi-square test.

Exit Ticket

Provide students with a small two-way table showing observed counts for two categorical variables. Ask them to calculate the expected counts for one specific cell and explain how they arrived at that value. Also, ask them to state what a large chi-square statistic would imply about the relationship between the variables.

Discussion Prompt

Pose the question: 'When would you choose a chi-square goodness-of-fit test, and when would you choose a chi-square test of independence?' Have students discuss in pairs, focusing on the type of data and the research question each test addresses. Call on pairs to share their reasoning.

Frequently Asked Questions

What is the difference between a chi-square goodness-of-fit test and a test of independence?
A goodness-of-fit test compares an observed sample distribution to a specific theoretical distribution (e.g., are dice fair?). A test of independence examines whether two categorical variables are associated using a two-way frequency table (e.g., is political affiliation related to age group?).
How do I calculate expected counts for a chi-square test?
For goodness-of-fit, expected count = n × hypothesized proportion. For independence, expected count = (row total × column total) / grand total. In both cases, expected counts represent what you would see if the null hypothesis were exactly true.
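Both expected-count rules can be sketched in Python as follows. The sample size, proportions, and 2x2 table are hypothetical values chosen so the arithmetic is easy to check by hand, and the function names are illustrative.

```python
def expected_goodness_of_fit(n, proportions):
    """Expected count for each category: n times its hypothesized proportion."""
    return [n * p for p in proportions]


def expected_independence(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical goodness-of-fit case: 60 observations, three hypothesized proportions.
gof_expected = expected_goodness_of_fit(60, [0.5, 0.3, 0.2])

# Hypothetical 2x2 table of observed counts for two categorical variables.
observed_table = [[24, 16],
                  [12, 28]]
ind_expected = expected_independence(observed_table)

print(gof_expected)
print(ind_expected)
```

In the hypothetical table both row totals are 40 and the column totals are 36 and 44, so each row has expected counts of 40 × 36 / 80 = 18 and 40 × 44 / 80 = 22. Expected counts always add up to the same row and column totals as the observed table, which is a quick way for students to check their work.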
What conditions are required for a valid chi-square test?
The three key conditions are: data come from a random sample or random assignment, observations are independent of each other, and all expected cell counts are at least 5. Violating the last condition makes the chi-square distribution approximation unreliable.
How does active learning help with chi-square tests?
Chi-square tests are most memorable when the data are real and the question is genuine. Hands-on labs (testing whether M&M color distributions match stated proportions, or whether class survey results show independence) make the test procedure feel purposeful. Students who collect their own data tend to be more invested in checking conditions and interpreting results carefully.
