Introducing Hypothesis Tests for Proportions — AP Statistics Study Guide
For: Candidates sitting the AP Statistics exam.
Covers: Null and alternative hypotheses for a population proportion, conditions for a one-proportion z-test, test statistic calculation, p-value interpretation, and significance conclusions for one-sided and two-sided tests of a single proportion.
You should already know: Sampling distributions for sample proportions. Confidence interval basics for a population proportion. The general definition of a p-value and significance level α.
A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the AP Statistics style for educational use. They are not reproductions of past College Board / Cambridge / IB papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official mark schemes for grading conventions.
1. What Is a Hypothesis Test for a Proportion?
This topic is your first introduction to formal, inferential hypothesis testing for a categorical population proportion, and it forms the foundation of all hypothesis testing in AP Statistics. It is part of Unit 6: Inference for Categorical Data: Proportions, which makes up 12-15% of the total AP exam weight. This subtopic itself appears on both multiple choice (MCQ) and free response (FRQ) sections of the exam, and is often the opening part of a multi-part FRQ that tests more complex inference skills.
Formally, a hypothesis test for a proportion uses sample data to evaluate a claim about the true unknown value of a population proportion $p$. Key notation conventions: $p$ = population proportion (the unknown parameter we are testing), $\hat{p}$ = sample proportion (the known statistic calculated from our data), $p_0$ = the hypothesized value of $p$ from the null hypothesis. This method is sometimes called a one-proportion z-test, because the test statistic follows a standard normal (z) distribution when conditions are met.
2. Stating Null and Alternative Hypotheses
All hypothesis tests start with two competing claims about the population parameter $p$. The null hypothesis ($H_0$) is the default claim of no effect, no difference, or the status quo. By convention, the null hypothesis always includes an equals sign: $H_0: p = p_0$, where $p_0$ is the hypothesized value from the original claim. The alternative hypothesis ($H_a$) is the research claim we are trying to find evidence for, and it never includes equality. There are three possible forms for the alternative hypothesis, depending on the question:
- One-sided (left-tailed): $H_a: p < p_0$ (we suspect the true proportion is lower than the null claim)
- One-sided (right-tailed): $H_a: p > p_0$ (we suspect the true proportion is higher than the null claim)
- Two-sided: $H_a: p \neq p_0$ (we suspect the true proportion is just different from the null claim, no direction specified)
A critical rule: Hypotheses are always about the population parameter $p$, never the sample statistic $\hat{p}$. We already know the value of $\hat{p}$ from our sample; we are testing a claim about the unknown population value. The logic of hypothesis testing is similar to a criminal trial: we assume the null is true (innocent) unless we get enough evidence to reject it (prove guilty beyond a reasonable doubt).
Worked Example
A social media platform claims that 60% of its active users log on to the site every day. A privacy researcher suspects the true proportion of daily active users is lower than 60%. State the appropriate null and alternative hypotheses for this test, including a definition of the parameter of interest.
Solution:
- Define the parameter: $p$ = the true proportion of all active users of the platform who log on every day.
- The hypothesized value from the platform's claim is $p_0 = 0.60$.
- The null hypothesis is always a statement of equality: $H_0: p = 0.60$.
- The researcher suspects $p$ is lower than 0.60, so the one-sided alternative is: $H_a: p < 0.60$.
Exam tip: Always define the parameter in context before writing your hypotheses. AP Statistics graders require this step to award full credit, even if your hypotheses are written correctly.
3. Checking Conditions for a One-Proportion Z-Test
Before conducting inference, we must check three core conditions to ensure our sampling distribution is well-behaved, so our p-value will be accurate. The three conditions are Random, Independent, Normal:
- Random: The data must come from a random sample from the population of interest, or a randomized experiment. This condition ensures our sample is unbiased, so we can generalize results to the population.
- Independent: When sampling without replacement, we require the 10% condition: the sample size is less than 10% of the total population size ($n < 0.10N$). When this holds, removing one individual barely changes the makeup of the remaining population, so we can treat individual observations as independent.
- Normal (Large Counts Condition): The sampling distribution of $\hat{p}$ is approximately normal if $np_0 \geq 10$ and $n(1 - p_0) \geq 10$. A key difference from confidence intervals for proportions: for hypothesis tests, we use the null hypothesized value $p_0$ (not $\hat{p}$) to check this condition. Why? We assume the null hypothesis is true for the test, so we use the hypothesized proportion to confirm normality.
Worked Example
Continuing the social media example: the researcher takes a random sample of 150 active users, and 82 report logging on daily. The platform has more than 2 million total active users. Check all conditions for a hypothesis test.
Solution:
- Random: The problem explicitly states the sample is random, so this condition is satisfied.
- Independent: The total population is more than 2 million, so 10% of the population is more than 200,000. Our sample size $n = 150 < 200{,}000$, so the 10% condition is met, and independence is satisfied.
- Normal (Large Counts): Using $p_0 = 0.60$ from the null hypothesis: $np_0 = 150(0.60) = 90 \geq 10$, and $n(1 - p_0) = 150(0.40) = 60 \geq 10$. Both thresholds are met, so the normality condition is satisfied. All conditions are met to conduct a one-proportion z-test.
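The same three checks can be sketched in code. This is a minimal illustration, not an official AP procedure: the function name `check_conditions` and its arguments are hypothetical, and the Random condition is passed in as a flag because it depends on how the data were collected rather than on any calculation.

```python
def check_conditions(n, p0, population_size, is_random):
    """Return a dict mapping each condition name to True/False."""
    expected_successes = n * p0        # uses the null value p0, not p-hat
    expected_failures = n * (1 - p0)
    return {
        "random": is_random,
        "ten_percent": n < 0.10 * population_size,
        "large_counts": expected_successes >= 10 and expected_failures >= 10,
    }

# Social media example: n = 150, p0 = 0.60, population of at least 2,000,000
result = check_conditions(n=150, p0=0.60, population_size=2_000_000, is_random=True)
print(result)
```

Using the lower bound of 2,000,000 for the population is a conservative choice: if the 10% condition holds for the smallest possible population, it holds for any larger one.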
Exam tip: If the problem does not explicitly state the population size, assume the 10% condition is met as long as the population is clearly larger than the sample (e.g., all users of a platform, all customers of a store).
4. Calculating the Test Statistic and P-Value
If all conditions are met, we assume $H_0$ is true, so the sampling distribution of $\hat{p}$ is approximately normal with mean equal to $p_0$ and standard deviation $\sqrt{\dfrac{p_0(1 - p_0)}{n}}$ (this is the standard error of $\hat{p}$ under the null hypothesis).
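One way to see that this mean and standard deviation are reasonable is to simulate many samples under the null hypothesis. The sketch below is illustrative only: the values `p0 = 0.60` and `n = 150` are borrowed from the running social media example, and the helper name `simulate_p_hats`, the seed, and the number of repetitions are arbitrary assumptions.

```python
import math
import random
import statistics

def simulate_p_hats(p0, n, reps, seed=0):
    """Simulate `reps` sample proportions from samples of size n, assuming p = p0."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(reps):
        # Each individual is a "success" with probability p0
        successes = sum(1 for _ in range(n) if rng.random() < p0)
        p_hats.append(successes / n)
    return p_hats

p0, n = 0.60, 150
p_hats = simulate_p_hats(p0, n, reps=5000)

theoretical_sd = math.sqrt(p0 * (1 - p0) / n)   # about 0.04 for these values
print("simulated mean:", round(statistics.mean(p_hats), 3))
print("simulated SD:  ", round(statistics.stdev(p_hats), 3))
print("theoretical SD:", round(theoretical_sd, 3))
```

The simulated mean and standard deviation should land close to $p_0$ and the formula value, though they will not match exactly because of random variation.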
The z-test statistic measures how far our observed sample proportion $\hat{p}$ is from the hypothesized value $p_0$, measured in standard error units. The formula is: $$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$ A large absolute value of $z$ means our observed result is very far from what we would expect if $H_0$ were true, which provides evidence against $H_0$.
The p-value is the probability of getting a test statistic as extreme or more extreme than the one we observed, assuming $H_0$ is true. The calculation depends on the alternative hypothesis:
- $H_a: p < p_0$: p-value $= P(Z < z)$ (left tail probability)
- $H_a: p > p_0$: p-value $= P(Z > z)$ (right tail probability)
- $H_a: p \neq p_0$: p-value $= 2P(Z > |z|)$ (two tail probability, double the single-tail area)
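The test statistic and the three p-value cases can be collected into one small function. This is a sketch under stated assumptions: the function name `one_proportion_z_test` and the labels `"less"`, `"greater"`, and `"two-sided"` are hypothetical choices, and `statistics.NormalDist` from the Python standard library stands in for a z-table or calculator.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative):
    """Return (z, p_value) for a one-proportion z-test.

    alternative: "less" (Ha: p < p0), "greater" (Ha: p > p0),
                 or "two-sided" (Ha: p != p0).
    """
    p_hat = successes / n
    se_null = math.sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (p_hat - p0) / se_null
    std_normal = NormalDist()                # mean 0, standard deviation 1
    if alternative == "less":
        p_value = std_normal.cdf(z)                     # left tail
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)                 # right tail
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))      # both tails
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value

# Data from the running social media example: 82 of 150 users, H0: p = 0.60
for alt in ("less", "greater", "two-sided"):
    z, p = one_proportion_z_test(82, 150, 0.60, alt)
    print(f"{alt:>9}: z = {z:.2f}, p-value = {p:.4f}")
```

The same data give three different p-values depending on the alternative, which is why the alternative must be chosen from the research question before looking at the data.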
Worked Example
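Continuing the social media example: we have $n = 150$, 82 daily users in the sample, $p_0 = 0.60$, and $H_a: p < 0.60$. Calculate the test statistic and p-value.
Solution:
- Calculate $\hat{p}$: $\hat{p} = \dfrac{82}{150} \approx 0.547$.
- Plug into the z formula: $z = \dfrac{0.547 - 0.60}{\sqrt{\dfrac{0.60(0.40)}{150}}} = \dfrac{-0.053}{0.04} \approx -1.33$
- For $H_a: p < 0.60$, the p-value is $P(Z < -1.33)$. Using a standard normal table, the p-value is approximately 0.092.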
Exam tip: When using a calculator for p-values, you do not need to show the z-table lookup; just report the z statistic and p-value. On FRQ, however, you must show the formula for z to earn full credit.
5. Drawing a Conclusion in Context
After calculating the p-value, we compare it to the pre-specified significance level $\alpha$ (usually $\alpha = 0.05$, unless another value is given in the problem). There are only two possible correct conclusions:
- If p-value $< \alpha$: Reject $H_0$. There is convincing statistical evidence to support the alternative hypothesis in the context of the problem.
- If p-value $\geq \alpha$: Fail to reject $H_0$. There is not convincing statistical evidence to support the alternative hypothesis in context.
Critical rules for conclusions: Never say "we accept $H_0$" or "we prove $H_0$ is true". We cannot prove the null hypothesis is true; we only fail to find enough evidence to reject it. We also never "prove" the alternative is true; we only say there is convincing evidence for it.
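The decision rule is simple enough to write as a two-branch function. This is a minimal sketch; the function name `conclude` and the p-values passed to it are hypothetical and chosen only to exercise each branch.

```python
def conclude(p_value, alpha=0.05):
    """Return the test decision. Note there is no 'accept H0' outcome."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

print(conclude(0.03))   # p-value below alpha
print(conclude(0.20))   # p-value above alpha
print(conclude(0.05))   # p-value equal to alpha is not below it
```

The function only produces the decision. A full AP response still needs a sentence that states the conclusion in the context of the problem.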
Worked Example
Continuing the social media example: we have a p-value of approximately 0.092, and we use the standard significance level $\alpha = 0.05$. State the conclusion in context.
Solution:
- Compare p-value to α: $0.092 > 0.05$.
- Decision: We fail to reject the null hypothesis $H_0: p = 0.60$.
- Conclusion in context: At the 0.05 significance level, there is not convincing statistical evidence that the true proportion of active users who log on every day is lower than the 60% claimed by the social media platform.
Exam tip: If you get a p-value just barely above or below α, make sure your conclusion matches the comparison; AP exam graders will deduct points if your conclusion contradicts your p-value comparison.
6. Common Pitfalls (and how to avoid them)
- Wrong move: Stating hypotheses about the sample proportion $\hat{p}$ instead of the population proportion $p$, writing $H_0: \hat{p} = p_0$ instead of $H_0: p = p_0$. Why: Students confuse the known sample statistic with the unknown population parameter when setting up tests. Correct move: Always define $p$ as the true population proportion in context first, then write hypotheses only in terms of $p$.
- Wrong move: Checking the Large Counts condition using $\hat{p}$ instead of $p_0$ for a hypothesis test. Why: Students memorize the Large Counts condition from confidence intervals and incorrectly apply it without adjustment. Correct move: Always use the null hypothesized value $p_0$ when checking normality for one-proportion hypothesis tests; reserve $\hat{p}$ for confidence interval condition checks.
- Wrong move: Forgetting to double the single-tail probability when calculating a p-value for a two-sided alternative hypothesis. Why: Students remember one-sided p-value calculation and overlook the "extreme in either direction" logic for two-sided tests. Correct move: Explicitly multiply the single-tail probability by 2 when your alternative is $H_a: p \neq p_0$.
- Wrong move: Using $\sqrt{\hat{p}(1 - \hat{p})/n}$ for the denominator of the z-test statistic instead of $\sqrt{p_0(1 - p_0)/n}$. Why: Students carry over the confidence interval standard error formula to the hypothesis test setting. Correct move: Always use $p_0$ in the test statistic denominator, since we assume the null hypothesis is true for the test.
- Wrong move: Concluding "we accept the null hypothesis" when the p-value $\geq \alpha$. Why: Students think the binary decision means either hypothesis is proven true, but we start with the null as an assumption. Correct move: Always use the phrasing "fail to reject the null hypothesis" and state there is not convincing evidence for the alternative; never claim the null is proven true.
- Wrong move: Writing a conclusion that is not in the context of the problem, e.g., "we reject $H_0$" with no connection to the original claim. Why: Students rush at the end of problems and overlook the AP requirement for contextual interpretation. Correct move: Always end your conclusion by linking the decision to the original research question about the population.
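To see how much the standard error pitfall matters in practice, the short sketch below computes the z statistic both ways for the social media data (82 of 150, $p_0 = 0.60$). It is an illustration only, and the variable names are hypothetical.

```python
import math

n, successes, p0 = 150, 82, 0.60
p_hat = successes / n

se_null = math.sqrt(p0 * (1 - p0) / n)          # correct for a test: uses p0
se_sample = math.sqrt(p_hat * (1 - p_hat) / n)  # confidence-interval formula: uses p-hat

z_correct = (p_hat - p0) / se_null
z_wrong = (p_hat - p0) / se_sample

print(f"SE with p0:    {se_null:.4f}, z = {z_correct:.3f}")
print(f"SE with p-hat: {se_sample:.4f}, z = {z_wrong:.3f}")
```

The two z values differ only slightly here because $\hat{p}$ is close to $p_0$, but the gap grows as the sample proportion moves away from the hypothesized value, and only the version using $p_0$ matches the assumption that the null hypothesis is true.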
7. Practice Questions (AP Statistics Style)
Question 1 (Multiple Choice)
A city council candidate claims that more than 55% of registered voters approve of their job performance. A pollster takes a random sample of 200 registered voters and finds 120 approve of the candidate's performance. Which of the following is closest to the p-value for the appropriate test of the candidate's claim? A) 0.10 B) 0.15 C) 0.20 D) 0.25
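Worked Solution: First, identify the hypotheses: $H_0: p = 0.55$, $H_a: p > 0.55$ (one-sided right-tailed, since the candidate claims more than 55% approve). Calculate the sample proportion: $\hat{p} = \dfrac{120}{200} = 0.60$. Calculate the z test statistic: $z = \dfrac{0.60 - 0.55}{\sqrt{\dfrac{0.55(0.45)}{200}}} \approx \dfrac{0.05}{0.0352} \approx 1.42$. The p-value is $P(Z > 1.42) \approx 0.078$, which is closest to 0.10. The correct answer is A.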
Question 2 (Free Response)
A candy company claims that 25% of all their candy boxes contain a prize. A group of customers suspects that the true proportion of boxes with prizes is less than 25%. They take a random sample of 120 candy boxes, and 23 boxes contain prizes. Use $\alpha = 0.05$ for all inference. (a) State the appropriate null and alternative hypotheses, and define the parameter of interest. (b) Check all conditions for inference. (c) Calculate the test statistic and p-value, then state your conclusion in context.
Worked Solution: (a) Let $p$ = true proportion of all candy boxes produced by the company that contain a prize. The hypotheses are $H_0: p = 0.25$ and $H_a: p < 0.25$.
(b) 1. Random: The sample is stated to be random, so the random condition is satisfied. 2. 10% Condition: The total population of candy boxes produced by the company is far larger than $10(120) = 1200$, so independence is satisfied. 3. Large Counts: Using $p_0 = 0.25$, $np_0 = 120(0.25) = 30 \geq 10$ and $n(1 - p_0) = 120(0.75) = 90 \geq 10$, so the normality condition is satisfied. All conditions are met.
(c) $\hat{p} = \dfrac{23}{120} \approx 0.192$. The test statistic is: $z = \dfrac{0.192 - 0.25}{\sqrt{\dfrac{0.25(0.75)}{120}}} \approx \dfrac{-0.058}{0.0395} \approx -1.48$. p-value = $P(Z < -1.48) \approx 0.069$. Since $0.069 > 0.05$, we fail to reject $H_0$. Conclusion: At the 0.05 significance level, there is not convincing statistical evidence that the true proportion of candy boxes with prizes is less than the 25% claimed by the company.
Question 3 (Application / Real-World Style)
Genetic theory predicts that 75% of pea plants grown from hybrid seeds will have purple flowers. A botanist grows a random sample of 60 hybrid pea plants, and 51 have purple flowers. Using a significance level of $\alpha = 0.05$, what conclusion does the botanist draw about whether the observed data matches the genetic theory?
Worked Solution: Hypotheses: $p$ = true proportion of hybrid pea plants with purple flowers. $H_0: p = 0.75$, $H_a: p \neq 0.75$ (two-sided, because we test if the proportion differs from the theoretical prediction). Conditions: Random sample stated, population of pea plants is much larger than 600, so 10% condition met. $np_0 = 60(0.75) = 45 \geq 10$, $n(1 - p_0) = 60(0.25) = 15 \geq 10$, so normality is met. Calculate $\hat{p} = \dfrac{51}{60} = 0.85$. $z = \dfrac{0.85 - 0.75}{\sqrt{\dfrac{0.75(0.25)}{60}}} \approx \dfrac{0.10}{0.0559} \approx 1.79$. Two-sided p-value = $2P(Z > 1.79) \approx 2(0.0367) \approx 0.073$. Since $0.073 > 0.05$, we fail to reject $H_0$. Conclusion: At the 0.05 significance level, there is not convincing evidence that the true proportion of purple-flowered hybrid pea plants differs from the 75% predicted by genetic theory.
8. Quick Reference Cheatsheet
| Category | Formula | Notes |
|---|---|---|
| Null Hypothesis | $H_0: p = p_0$ | Always includes equality; states the hypothesized population proportion |
| Left-tailed Alternative | $H_a: p < p_0$ | Used when the claim is the true proportion is lower than hypothesized |
| Right-tailed Alternative | $H_a: p > p_0$ | Used when the claim is the true proportion is higher than hypothesized |
| Two-sided Alternative | $H_a: p \neq p_0$ | Used when the claim is the true proportion is different from hypothesized |
| Large Counts Condition (Test) | $np_0 \geq 10$, $n(1 - p_0) \geq 10$ | Use $p_0$, not $\hat{p}$; confirms approximate normality |
| 10% Condition | $n < 0.10N$ | $N$ = population size; confirms independence for sampling without replacement |
| One-Proportion Z-Test Statistic | $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$ | Use $p_0$ in the denominator (standard error under the null) |
| P-Value (Left-tailed) | $P(Z < z)$ | Probability of a z this small or smaller if $H_0$ is true |
| P-Value (Right-tailed) | $P(Z > z)$ | Probability of a z this large or larger if $H_0$ is true |
| P-Value (Two-sided) | $2P(Z > \lvert z \rvert)$ | Double the single-tail area; extreme in either direction |
| Conclusion (p < α) | Reject $H_0$ | Convincing evidence for $H_a$ in context |
| Conclusion (p ≥ α) | Fail to reject $H_0$ | No convincing evidence for $H_a$; never say "accept $H_0$" |
9. What's Next
This topic introduces the full four-step hypothesis testing framework that you will use for all inference in the rest of AP Statistics, and it is the foundational test for all inference on proportions. It typically makes up 4-6% of the total AP exam score, so mastering it is critical for earning a high score. Next you will learn about Type I and Type II errors, which describe potential mistakes in hypothesis test conclusions, followed by two-sample inference for the difference between two population proportions. Without mastering the basics of one-proportion hypothesis testing covered here, more complex tests and error analysis will be very difficult to navigate, as they build directly on the logic and methods you learned here.