Inference for Quantitative Data: Means — AP Statistics Unit Overview
For: Students preparing for the AP Statistics exam.
Covers: All core inference methods for population means required for the exam, from t-distribution fundamentals to one-sample intervals and tests, two-sample independent inference, and matched pairs mean difference inference.
You should already know: Sampling distribution behavior for sample means. General logic of confidence intervals and hypothesis tests. Conditions for valid statistical inference.
A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the AP Statistics style for educational use. They are not reproductions of past College Board / Cambridge / IB papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official mark schemes for grading conventions.
1. Why This Matters
This unit is one of the highest-weighted on the AP Statistics exam, accounting for 10-18% of the exam per the official College Board CED, and it appears in both the multiple-choice and free-response sections, often as a full-length FRQ. Unlike inference for proportions (which applies only to categorical data), inference for means applies to almost all real-world quantitative data: blood pressure, crop yield, monthly sales, test scores, and delivery wait time are all routinely summarized by averages, making this one of the most widely used inference frameworks in science, business, and social science.
This unit also resolves a key limitation of z-based inference you learned for proportions: in almost all real research, you do not know the true population standard deviation $\sigma$, so you need a new distribution (the t-distribution) to account for the additional uncertainty from estimating $\sigma$ with the sample standard deviation $s$. The unit builds incrementally from single-population inference to the comparative inference that is the backbone of experimental design, helping you answer questions like “does a new drug lower blood pressure more than a placebo?” or “do 1-bedroom rents differ on average between two cities?”
2. Concept Map
The six sub-topics in this unit build incrementally from foundational theory to applied comparative inference, following the core logic of inference you learned earlier:
- First, What Is a t Distribution? lays the foundational correction for when the population standard deviation $\sigma$ is unknown (the almost universal case for means). This replaces the z-distribution you used for proportions, adjusting for the extra uncertainty from estimating $\sigma$ with $s$. Without this correction, your intervals will be too narrow and your p-values too small.
- Next, Inference for a Population Mean introduces the general framework, conditions, and structure for inference when estimating or testing a single population mean, setting up all subsequent work in the unit.
- Confidence Intervals for a Population Mean applies the framework to interval estimation, teaching you how to calculate and interpret a range of plausible values for the unknown population mean.
- Hypothesis Tests for a Population Mean applies the framework to significance testing for claims about a single population mean, walking through p-value and critical value approaches.
- Next, the unit extends to comparative inference: Inference for the Difference in Two Population Means adapts all the earlier logic to the case of two independent groups, the most common design for comparing two treatments or populations.
- Finally, Inference for a Mean Difference with Paired Data covers the specialized matched pairs design, where observations are dependent (same subject before/after, matched pairs of subjects), requiring a different approach that reduces to a one-sample test on the differences.
3. A Guided Tour
We will use a common exam-style problem to show how two core sub-topics work together:
Problem: A food regulation analyst wants to test if the average calorie content of a popular brand of “low-calorie” frozen pizza is higher than the advertised 280 calories per serving. A random sample of 12 servings gives a sample mean of 286 calories and sample standard deviation of 11 calories. Carry out an appropriate test.
Step 1: First, we use the foundational sub-topic What Is a t Distribution? to choose the correct distribution. We do not know the true population standard deviation $\sigma$ of calorie content for all servings of this pizza; we only have the sample standard deviation $s = 11$. The t-distribution accounts for the additional uncertainty introduced by estimating $\sigma$ with $s$, so we use a t-distribution here instead of a z-distribution. Degrees of freedom are calculated as $df = n - 1 = 12 - 1 = 11$.
Step 2: Next, we apply Hypothesis Tests for a Population Mean to answer the research question. First, we state hypotheses: $H_0: \mu = 280$, $H_a: \mu > 280$, where $\mu$ is the true mean calorie content per serving. We check conditions: random sample (given), 10% condition (12 servings is less than 10% of all servings), and, because the sample is small, we assume the sample data show no strong skew or outliers so that the t-procedure is reasonably accurate. We calculate the test statistic:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{286 - 280}{11 / \sqrt{12}} \approx 1.89$$
For $df = 11$, the one-tailed p-value falls between 0.04 and 0.05. At the $\alpha = 0.05$ significance level, we reject $H_0$ and conclude there is convincing evidence that the true mean calorie content is higher than advertised.
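Key takeaway: Every inference procedure for means relies on the t-distribution foundation first, before you can correctly calculate intervals or tests. If you had incorrectly used a z-distribution here, your p-value would be too small (about 0.03 instead of about 0.04), overstating the evidence against $H_0$. In this problem the decision at the 0.05 level happens to be the same, but in a closer case the error can flip the conclusion.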
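The Step 2 calculation and the z-versus-t comparison in the key takeaway can be checked numerically. The sketch below is one possible way to do that using only the Python standard library: it integrates the t density with Simpson's rule to get the tail area. The helper names `t_pdf` and `t_upper_tail` are illustrative choices, not part of any AP resource, and on the exam you would use a table or calculator instead.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t  # the density is symmetric about 0


# Summary statistics from the frozen pizza problem
n, x_bar, s, mu_0 = 12, 286, 11, 280

df = n - 1
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
p_t = t_upper_tail(t_stat, df)        # correct: t distribution, df = 11
p_z = 1 - NormalDist().cdf(t_stat)    # what a z-procedure would report

print(f"t = {t_stat:.2f}, df = {df}")
print(f"one-tailed p-value using t: {p_t:.4f}")
print(f"one-tailed p-value using z: {p_z:.4f}")
```

The z-based tail area comes out smaller than the t-based one because the t-distribution with 11 degrees of freedom has heavier tails.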
Exam tip for the unit: Always stop to identify the study design (one sample, two independent samples, or paired data) before choosing your procedure. This narrows your choice immediately and avoids the most common mistakes.
4. Common Cross-Cutting Pitfalls
- Wrong move: Using a z-distribution instead of a t-distribution for means inference because you remember z from proportions. Why: Students confuse the case for proportions (where we have a hypothesized $p_0$ to get the standard deviation) with means, where $\sigma$ is almost always unknown. Correct move: Always use the t-distribution for means inference unless the problem explicitly states you know the population standard deviation $\sigma$.
- Wrong move: Comparing two dependent paired samples with a two-sample t-procedure for independent means. Why: Students forget to check if data is paired before choosing the procedure, defaulting to the more recently taught two-sample method. Correct move: Always check if observations are linked (same subject before/after, matched pairs) first; if yes, use the paired t-procedure on the differences.
- Wrong move: Interpreting a confidence interval for a mean as “95% of sample means are in this interval” or “95% of individual data values are in this interval”. Why: Students confuse the interpretation of the confidence level for the parameter estimate with the distribution of sample or individual data. Correct move: Always interpret the interval as “We are 95% confident that the true population mean [context] is between [lower bound] and [upper bound].”
- Wrong move: Using $df = n_1 + n_2 - 2$ for two-sample t-procedures instead of the conservative $df = \min(n_1 - 1, n_2 - 1)$. Why: Students confuse total sample size with degrees of freedom for two-sample inference. Correct move: For the AP exam, use the conservative df calculation for two-sample t-procedures unless you are using a calculator to get the exact df.
- Wrong move: Using the pooled two-sample t-procedure by default, even when not required. Why: Some textbooks introduce the pooled procedure, leading students to think it is the standard method. Correct move: Never use the pooled two-sample t-procedure unless the problem explicitly tells you the population variances are equal (this almost never happens on the AP exam).
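To make the degrees-of-freedom pitfall concrete, the sketch below compares three candidate df values for a two-sample t-procedure: the pooled-style $n_1 + n_2 - 2$, the conservative $\min(n_1 - 1, n_2 - 1)$, and the Welch-Satterthwaite approximation, which is the "exact" df that calculators typically report. The sample sizes and standard deviations are made-up illustrative numbers, and the function names are hypothetical.

```python
def conservative_df(n1: int, n2: int) -> int:
    """Conservative df: the smaller sample size minus one."""
    return min(n1 - 1, n2 - 1)


def welch_df(s1: float, n1: int, s2: float, n2: int) -> float:
    """Welch-Satterthwaite df for the unpooled two-sample t-procedure."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Hypothetical summary statistics (not from any real study)
n1, s1 = 20, 4.0
n2, s2 = 10, 9.0

df_pooled_style = n1 + n2 - 2          # only valid with equal variances
df_conservative = conservative_df(n1, n2)
df_welch = welch_df(s1, n1, s2, n2)

print(f"n1 + n2 - 2        : {df_pooled_style}")
print(f"conservative df    : {df_conservative}")
print(f"Welch-Satterthwaite: {df_welch:.1f}")
```

The Welch value always falls between the conservative df and $n_1 + n_2 - 2$, which is why the conservative choice gives slightly wider intervals and slightly larger p-values.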
5. Quick Check: When Do I Use Which Procedure?
For each scenario, identify which inference procedure from this unit you would use:
- You want to estimate the average difference in test scores after students complete an online review module, with scores measured before and after for the same 30 students.
- You want to test whether the average SAT score of first-year students at a liberal arts college is higher than the national average of 1050.
- You want to compare the average monthly rent of 1-bedroom apartments in two different cities, with independent random samples taken from each city.
- You want to calculate a 90% confidence interval for the average time it takes for a delivery driver to complete a route, based on a random sample of 25 routes.
- You want to test whether the average change in resting heart rate after 8 weeks of a new exercise program differs from zero, with measurements on 40 participants.
Answers:
- Inference for a Mean Difference with Paired Data (paired t-interval, since the goal is to estimate)
- Hypothesis Test for a Population Mean (one-sample t-test)
- Inference for the Difference in Two Population Means (two-sample t-test)
- Confidence Intervals for a Population Mean (one-sample t-interval)
- Inference for a Mean Difference with Paired Data (paired t-test)
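The decision process behind these answers can be written as a small rule: check for pairing first, then count independent groups, then ask whether the goal is to estimate or to test. The function below is only a study aid sketching that rule; its name and parameters are hypothetical.

```python
def choose_procedure(paired: bool, two_groups: bool, goal: str) -> str:
    """Pick a t-procedure from the study design.

    paired     -- True if observations are linked (before/after, matched pairs)
    two_groups -- True if there are two independent samples
    goal       -- "estimate" for a confidence interval, "test" for a significance test
    """
    if goal not in ("estimate", "test"):
        raise ValueError("goal must be 'estimate' or 'test'")
    if paired:                      # pairing takes priority over everything else
        design = "paired t"
    elif two_groups:
        design = "two-sample t"
    else:
        design = "one-sample t"
    return design + ("-interval" if goal == "estimate" else "-test")


# The five Quick Check scenarios, in order
print(choose_procedure(paired=True, two_groups=False, goal="estimate"))
print(choose_procedure(paired=False, two_groups=False, goal="test"))
print(choose_procedure(paired=False, two_groups=True, goal="test"))
print(choose_procedure(paired=False, two_groups=False, goal="estimate"))
print(choose_procedure(paired=True, two_groups=False, goal="test"))
```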
6. Practice Questions (AP Statistics Style)
Question 1 (Multiple Choice)
A researcher calculates a 95% confidence interval for the true mean difference in reaction time for subjects after consuming caffeine versus before, based on a random sample of 25 subjects. The interval is calculated as (12 ms, 48 ms). Which of the following is a correct interpretation of the 95% confidence level?
- A) 95% of the subjects have a difference in reaction time between 12 ms and 48 ms.
- B) There is a 95% probability that the true mean difference in reaction time is between 12 ms and 48 ms.
- C) If we took many random samples of 25 subjects and constructed a 95% confidence interval from each sample, about 95% of the intervals would contain the true mean difference in reaction time.
- D) We are 95% confident that the sample mean difference in reaction time is between 12 ms and 48 ms.
Worked Solution: This question tests core understanding of confidence level interpretation, a concept tested across all inference for means. Option A is incorrect because the interval estimates the population mean, not individual values. Option B is incorrect because the true mean difference is a fixed unknown value, so it does not have a probability of being in the interval; the confidence level describes the method, not a specific interval. Option D is incorrect because we already know the sample mean difference is the center of the interval, so we do not need to estimate it. Only Option C correctly describes the long-run behavior of the confidence interval method. Correct answer: C
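The long-run behavior described in Option C can be seen in a quick simulation. The sketch below assumes a hypothetical Normal population of differences (the true mean and standard deviation are invented for illustration), repeatedly draws samples of 25, builds a 95% t-interval from each, and counts how often the interval captures the true mean. The critical value 2.064 is the table value of $t^*$ for 95% confidence with $df = 24$.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is reproducible

TRUE_MEAN = 30.0   # hypothetical true mean difference (ms)
TRUE_SD = 40.0     # hypothetical population SD of differences (ms)
N = 25             # sample size, as in the question
T_STAR = 2.064     # t* for 95% confidence with df = 24 (t table)
TRIALS = 4000      # number of simulated samples


def interval_covers_truth() -> bool:
    """Draw one sample, build a 95% t-interval, report whether it contains TRUE_MEAN."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    x_bar = statistics.mean(sample)
    margin = T_STAR * statistics.stdev(sample) / math.sqrt(N)
    return x_bar - margin <= TRUE_MEAN <= x_bar + margin


hits = sum(interval_covers_truth() for _ in range(TRIALS))
coverage = hits / TRIALS
print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

Each individual interval either contains the true mean or does not; the 95% describes the proportion of intervals that succeed over many repetitions.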
Question 2 (Free Response)
A horticulturist is comparing the growth rate of two varieties of tomato plants, Variety A and Variety B, grown under identical conditions. She plants 15 Variety A plants and 12 Variety B plants, and measures the height after 8 weeks. The summary statistics are below:
| Variety | Mean Height (cm) | Standard Deviation (cm) |
|---|---|---|
| A | 62.4 | 7.2 |
| B | 55.8 | 6.5 |
(a) Explain why a two-sample t-procedure is appropriate for this study, instead of a paired t-procedure.
(b) Construct and interpret a 95% confidence interval for the difference in mean height ($\mu_A - \mu_B$) between the two varieties.
(c) Based on your interval, does the data provide convincing evidence that the mean height of Variety A differs from the mean height of Variety B at the $\alpha = 0.05$ significance level? Explain.
Worked Solution:
(a) A paired t-procedure is only used when each observation in the first group is matched with, or otherwise dependent on, an observation in the second group. In this study, the Variety A and Variety B plants are separate, unmatched groups, so the samples are independent and a two-sample t-procedure is appropriate.
(b) Use the conservative degrees of freedom: $df = \min(15 - 1, 12 - 1) = 11$. The critical value for 95% confidence and $df = 11$ is $t^* = 2.201$. The standard error is:
$$SE = \sqrt{\frac{7.2^2}{15} + \frac{6.5^2}{12}} \approx 2.641$$
The interval is $(62.4 - 55.8) \pm 2.201(2.641) = 6.6 \pm 5.81 = (0.79, 12.41)$. We are 95% confident that the true difference in mean 8-week height (Variety A minus Variety B) is between 0.79 cm and 12.41 cm.
(c) Yes, there is convincing evidence that the means differ. For a two-tailed test at $\alpha = 0.05$, if the 95% confidence interval does not contain 0 (the null hypothesized difference), we reject the null hypothesis. Since 0 is not in our interval, we conclude there is convincing evidence that the mean heights differ.
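The arithmetic in part (b) can be reproduced with a few lines of Python. This is a minimal sketch using the summary statistics from the table and the table value $t^* = 2.201$ for $df = 11$; the variable names are illustrative.

```python
import math

# Summary statistics from the tomato study
n_a, mean_a, sd_a = 15, 62.4, 7.2
n_b, mean_b, sd_b = 12, 55.8, 6.5

T_STAR = 2.201  # t* for 95% confidence, conservative df = min(14, 11) = 11

diff = mean_a - mean_b                              # point estimate, A minus B
se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)       # unpooled standard error
margin = T_STAR * se
lower, upper = diff - margin, diff + margin

print(f"SE = {se:.3f}, margin of error = {margin:.2f}")
print(f"95% CI for mu_A - mu_B: ({lower:.2f}, {upper:.2f})")
print("Interval contains 0:", lower <= 0 <= upper)
```

Because the interval excludes 0, the same numbers support the two-sided conclusion in part (c).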
Question 3 (Real-World Application)
A coffee shop chain wants to test whether their average wait time for mobile orders is less than the target of 3 minutes. A random sample of 30 mobile orders gives a sample mean wait time of 2.68 minutes and sample standard deviation of 0.75 minutes. Carry out a significance test at the $\alpha = 0.05$ level to answer the chain's question.
Worked Solution: This is a one-sample t-test for a population mean. Hypotheses: $H_0: \mu = 3$, $H_a: \mu < 3$, where $\mu$ is the true mean wait time (in minutes) for all mobile orders. Conditions are met: random sample given, 30 < 10% of all mobile orders, and $n = 30 \geq 30$ so the Normal condition is satisfied. Calculate the test statistic:
$$t = \frac{2.68 - 3}{0.75 / \sqrt{30}} \approx -2.34$$
Degrees of freedom $df = 30 - 1 = 29$, so the one-tailed p-value is between 0.01 and 0.015, which is less than $\alpha = 0.05$. We reject the null hypothesis. There is convincing evidence that the true average wait time for mobile orders at this chain is less than the 3-minute target.
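The same decision can be reached with the critical value approach instead of a p-value. The sketch below assumes the table value $t^* = 1.699$ for an upper tail area of 0.05 with $df = 29$; for this lower-tailed test we reject when the test statistic falls below $-t^*$.

```python
import math

# Summary statistics from the coffee shop problem
n, x_bar, s, mu_0 = 30, 2.68, 0.75, 3.0

T_CRIT = 1.699  # t table: upper tail area 0.05, df = 29

df = n - 1
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))

# Lower-tailed test (H_a: mu < 3): reject H_0 if t is below -T_CRIT
reject = t_stat < -T_CRIT

print(f"t = {t_stat:.2f}, df = {df}")
print("Reject H_0 at alpha = 0.05:", reject)
```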
7. Quick Reference Cheatsheet
| Category | Formula | Notes |
|---|---|---|
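| One-sample t-test statistic | $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$ | $df = n - 1$, use when testing a claim about one population mean, $\sigma$ unknown |
| One-sample t-confidence interval | $\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$ | $t^*$ is critical value for given confidence level and $df = n - 1$ |
| Two-sample t-test statistic | $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ | $\mu_1 - \mu_2 = 0$ under $H_0$ for most tests comparing two means |
| Two-sample standard error | $SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ | Never pool unless explicitly told populations have equal variance |
| Two-sample conservative degrees of freedom | $df = \min(n_1 - 1, n_2 - 1)$ | Accepted for full credit on all AP exam questions |
| Two-sample confidence interval | $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ | Use for independent groups, estimating difference in population means |
| Paired t-procedure | One-sample t-procedure on differences | For matched/paired dependent data, $df = n - 1$ where $n$ is the number of pairs |
| Paired t-test statistic | $t = \dfrac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}}$ | $\mu_0 = 0$ for almost all paired tests of no difference |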
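The last two rows say that a paired t-procedure is a one-sample t-procedure applied to the differences. The sketch below illustrates that reduction on a small made-up data set (the before and after values are invented purely for illustration): compute one difference per pair, then apply the one-sample formula with $\mu_0 = 0$.

```python
import math
import statistics

# Hypothetical before/after measurements on the same 8 subjects
before = [72, 68, 75, 80, 66, 71, 78, 69]
after = [70, 65, 75, 76, 64, 70, 74, 68]

# Step 1: reduce the paired data to a single list of differences (after - before)
diffs = [a - b for a, b in zip(after, before)]

# Step 2: one-sample t statistic on the differences, with mu_0 = 0
n = len(diffs)                      # number of pairs, not number of measurements
df = n - 1
x_bar_d = statistics.mean(diffs)
s_d = statistics.stdev(diffs)       # sample SD of the differences
t_stat = (x_bar_d - 0) / (s_d / math.sqrt(n))

print(f"mean difference = {x_bar_d:.3f}, s_d = {s_d:.3f}")
print(f"t = {t_stat:.2f}, df = {df}")
```

Note that the degrees of freedom come from the number of pairs (8), not from the 16 individual measurements.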
8. See Also (Sub-Topics in This Unit)
After working through this unit overview, you will dive into each sub-topic to build the specific skills needed for the AP exam. This unit builds on your earlier knowledge of inference for proportions to create a unified framework for statistical inference, and it prepares you for the next unit on chi-square procedures for categorical data. Inference for means also underpins many exam questions that analyze data from experiments, such as comparing the mean response of two treatment groups.