AP · Confidence Intervals for the Difference in Two Proportions · 14 min read · Updated 2026-05-10

Confidence Intervals for the Difference in Two Proportions — AP Statistics Study Guide

For: Students preparing for the AP Statistics exam.

Covers: Conditions for inference for the difference in two proportions, the confidence interval formula, interval interpretation in context, pooled vs unpooled standard errors, and step-by-step construction for two independent samples.

You should already know: How to construct a one-proportion z-interval for a single population proportion. The Central Limit Theorem for sample proportions. The meaning of confidence level and interval interpretation for one-sample inference.

A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the AP Statistics style for educational use. They are not reproductions of past College Board / Cambridge / IB papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official mark schemes for grading conventions.

1. What Is a Confidence Interval for the Difference in Two Proportions?

A confidence interval for the difference in two population proportions ($p_{1} - p_{2}$) gives a range of plausible values for the true difference between the proportions of successes in two independent populations. This topic is part of Unit 6: Inference for Categorical Data: Proportions, a unit that accounts for roughly 12-15% of the AP Statistics exam per the official College Board CED. It appears on both the multiple-choice (MCQ) and free-response (FRQ) sections of the exam.

This method is used when we want to compare proportions from two distinct groups: for example, the proportion of defective parts from two manufacturing lines, the proportion of students who pass an exam after a new prep course vs a traditional course, or the proportion of vaccinated people in two different regions. Synonyms you may see on the exam include "two-sample z-interval for the difference in proportions" or "confidence interval for $p_{1} - p_{2}$". Unlike a one-proportion interval, which estimates a single proportion, this interval lets us assess whether the two groups differ: if 0 is not in the interval, we have evidence of a difference at the corresponding significance level (for example, 5% for a 95% interval).

2. Conditions for Two-Proportion Confidence Intervals

Before constructing any confidence interval for $p_{1} - p_{2}$, you must verify four core conditions to ensure the sampling distribution of $\hat{p}_{1} - \hat{p}_{2}$ is approximately normal and your standard error estimate is reliable. All four conditions are required for full credit on AP FRQs:

Random: Both samples must be independently drawn random samples from their respective populations, or come from a randomized controlled experiment. This ensures $\hat{p}_{1}$ and $\hat{p}_{2}$ are unbiased estimators of $p_{1}$ and $p_{2}$.
Independent Groups: The two samples must be independent of one another, with no pairing or matching between observations in the two groups. This method only applies to independent samples; paired (matched) categorical data uses a different framework.
10% Condition: When sampling without replacement, each sample size must be less than 10% of its population. This ensures individual observations within each sample are approximately independent.
Large Counts (Normal Approximation): For each sample, there must be at least 10 observed successes and 10 observed failures, meaning: $n_{1} \hat{p}_{1} \geq 10$, $n_{1} (1 - \hat{p}_{1}) \geq 10$, $n_{2} \hat{p}_{2} \geq 10$, $n_{2} (1 - \hat{p}_{2}) \geq 10$. This confirms the sampling distribution of $\hat{p}_{1} - \hat{p}_{2}$ is approximately normal.

Worked Example

A high school counselor wants to compare the proportion of 12th grade students who have taken an SAT prep course before graduation between students who attend public vs private high schools in a large state. She randomly samples 150 public school 12th graders and 80 private school 12th graders. 63 public school students and 41 private school students report taking a prep course. Verify all conditions for a 95% confidence interval for $p_{\text{public}} - p_{\text{private}}$.

Solution:

Random: The problem explicitly states both samples are random, so this condition is met.
Independent Groups: Samples are drawn separately from two distinct populations, no pairing, so groups are independent.
10% Condition: In a large state, the populations of public and private 12th graders are far larger than $10 \times 150 = 1500$ and $10 \times 80 = 800$ respectively, so this condition is met.
Large Counts: Public: 63 successes, 87 failures; Private: 41 successes, 39 failures. All values are ≥10, so this condition is met. All conditions for inference are satisfied.
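The Large Counts check above is simple enough to automate. The following is a minimal Python sketch, not part of the AP course itself; the helper name `large_counts_ok` is made up for illustration, and the counts are the ones from this worked example.

```python
def large_counts_ok(successes, n, minimum=10):
    """Return True if a sample has at least `minimum` successes and failures.

    `large_counts_ok` is a hypothetical helper name used only for this sketch.
    """
    failures = n - successes
    return successes >= minimum and failures >= minimum


# Counts from the worked example: (observed successes, sample size)
samples = {"public": (63, 150), "private": (41, 80)}

for name, (x, n) in samples.items():
    # Report the observed counts, as an FRQ answer would
    print(name, "successes:", x, "failures:", n - x, "ok:", large_counts_ok(x, n))
```

Printing the counts themselves, rather than only the True/False result, mirrors what graders expect to see written out on an FRQ.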

Exam tip: On AP FRQs, you must explicitly name and verify every condition, not just say "conditions are met". You will lose an entire point if you do not show the counts for the Large Counts condition.

3. Constructing the Confidence Interval

Once conditions are verified, we construct the interval using the unpooled standard error (the only version required for confidence intervals by the AP CED). First, recall notation: $p_{1}$ = true proportion of successes for population 1, $p_{2}$ = true proportion for population 2, $\hat{p}_{1} = x_{1} / n_{1}$, $\hat{p}_{2} = x_{2} / n_{2}$, where $x_{1}, x_{2}$ are the numbers of observed successes in each sample, and $n_{1}, n_{2}$ are the sample sizes.

The general formula for a confidence interval for $p_{1} - p_{2}$ is: $(\hat{p}_{1} - \hat{p}_{2}) \pm z^{*} \sqrt{\frac{\hat{p}_{1} (1 - \hat{p}_{1})}{n_{1}} + \frac{\hat{p}_{2} (1 - \hat{p}_{2})}{n_{2}}}$ Where:

$\hat{p}_{1} - \hat{p}_{2}$ is the point estimate for the true difference,
$z^{*}$ is the critical z-value for the desired confidence level (common values: 90% = 1.645, 95% = 1.96, 99% = 2.576),
The term inside the square root is the variance of the difference, which is the sum of the variances of each sample proportion (for independent variables, the variance of a difference equals the sum of variances).

Unlike hypothesis tests for two proportions, we never pool the sample proportions for a confidence interval. Pooling is only used for hypothesis tests when we assume the null hypothesis $p_{1} = p_{2}$ is true; for confidence intervals, we make no such assumption, so we use the individual sample proportions to calculate standard error.

Worked Example

Using the prep course data from the previous example: 63 out of 150 public school students took an SAT prep course, 41 out of 80 private school students took a prep course. Construct a 95% confidence interval for $p_{\text{public}} - p_{\text{private}}$.

Solution:

Conditions are already verified, so we proceed.
Calculate point estimate: $\hat{p}_{\text{public}} = 63/150 = 0.42$, $\hat{p}_{\text{private}} = 41/80 = 0.5125$. Point estimate: $0.42 - 0.5125 = -0.0925$.
Calculate standard error: $SE = \sqrt{\frac{0.42 (0.58)}{150} + \frac{0.5125 (0.4875)}{80}} = \sqrt{0.001624 + 0.003123} = \sqrt{0.004747} \approx 0.0689$.
Critical $z^{*}$ for 95% confidence is 1.96. Margin of error: $ME = 1.96 \times 0.0689 \approx 0.135$.
Final interval: $-0.0925 \pm 0.135 = (-0.2275, 0.0425)$, or about $(-0.228, 0.043)$ to three decimal places.
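The same steps can be sketched in Python as a check on the hand calculation. This is an illustrative sketch under stated assumptions: the function name `two_prop_z_interval` is made up for this guide, and it obtains $z^{*}$ from the standard library's `statistics.NormalDist` rather than from a table, so the last decimal places can differ slightly from work done with $z^{*} = 1.96$.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Unpooled two-proportion z-interval for p1 - p2.

    Hypothetical helper for illustration; x1, x2 are success counts and
    n1, n2 are sample sizes. Conditions are assumed to be checked already.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error: add the two variance terms, then take the root
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Critical value z* leaves (1 - confidence) / 2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * se
    diff = p1 - p2
    return diff - margin, diff + margin


# Prep course data: population 1 = public, population 2 = private
low, high = two_prop_z_interval(63, 150, 41, 80, confidence=0.95)
print(round(low, 3), round(high, 3))
```

On the exam you would still write out the formula, the substituted values, and the interval by hand; a script like this is only useful for checking practice work.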

Exam tip: Always explicitly label which population is 1 and which is 2 at the start. This prevents sign errors that can lead to incorrect inference conclusions later.

4. Interpreting Intervals and Drawing Conclusions

Interpretation is one of the most frequently tested skills on the AP exam for this topic. A correct interpretation has two key components: interpreting the interval itself in context, and interpreting the confidence level if asked.

For the confidence interval, the correct phrasing is: We are [C]% confident that the true difference in the proportion of [successes] between [population 1] and [population 2] is between [lower bound] and [upper bound]. Never say "there is a C% chance the true difference is in the interval"—the true difference is a fixed, unchanging number, so it is either in the interval or not. The C% confidence refers to the method: if we repeated the random sampling process and constructed a new interval each time, about C% of all intervals would capture the true difference.
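The "repeated sampling" meaning of the confidence level can be illustrated with a small simulation. The sketch below is only a demonstration under assumptions: the true proportions 0.40 and 0.50 are invented for the simulation (in real problems they are unknown), and the function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def unpooled_interval(x1, n1, x2, n2, confidence):
    """Unpooled two-proportion z-interval for p1 - p2 (illustrative helper)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


def simulate_coverage(p1, n1, p2, n2, confidence=0.95, reps=2000, seed=1):
    """Fraction of simulated intervals that capture the true p1 - p2."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_diff = p1 - p2
    hits = 0
    for _ in range(reps):
        # Draw one random sample from each (assumed) population
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n2))
        low, high = unpooled_interval(x1, n1, x2, n2, confidence)
        if low <= true_diff <= high:
            hits += 1
    return hits / reps


# Assumed (made-up) true proportions; sample sizes match the worked example
coverage = simulate_coverage(0.40, 150, 0.50, 80)
print(f"Fraction of intervals capturing the true difference: {coverage:.3f}")
```

The printed fraction should land near 0.95. Each individual interval either captures the true difference or it does not; the 95% describes how often the method succeeds across many samples.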

For inference, the rule of thumb is: if 0 is not inside a C% confidence interval, we have convincing evidence at the $\alpha = 1 - C$ significance level (for example, $\alpha = 0.05$ for a 95% interval) that the two population proportions differ. If 0 is inside the interval, we do not have convincing evidence that they differ. We can never conclude that the proportions are equal, because the interval contains many non-zero plausible values.

Worked Example

Use the interval we calculated in the previous example, $(-0.228, 0.043)$ for $p_{\text{public}} - p_{\text{private}}$. (a) Interpret the interval in context. (b) What conclusion can we draw about whether the proportion of students who take SAT prep differs between public and private schools? (c) Interpret what "95% confidence" means here.

Solution:

(a) We are 95% confident that the true difference (public minus private) in the proportion of 12th grade students in this state who take an SAT prep course is between -0.228 and 0.043. This means the true proportion for public schools could be as much as 22.8 percentage points lower, or as much as 4.3 percentage points higher, than for private schools.
(b) Because 0 is inside the 95% interval, we do not have convincing evidence that the proportion of 12th graders who take SAT prep differs between public and private schools in this state.
(c) "95% confidence" means that if we repeatedly took random samples of 150 public and 80 private 12th graders, and constructed a 95% confidence interval for the difference in proportions each time, about 95% of the intervals would capture the true difference in population proportions.

Exam tip: AP graders require context for interpretation points. If you write a generic interpretation without naming the two populations and the parameter being measured, you will not earn full credit.

5. Common Pitfalls (and how to avoid them)

Wrong move: Pooling the sample proportions when calculating standard error for a confidence interval. Why: Students confuse confidence interval rules with hypothesis test rules, where pooling is sometimes used. Correct move: Always use unpooled standard error (with separate $\hat{p}_{1}$ and $\hat{p}_{2}$) for two-proportion confidence intervals, per AP CED requirements.
Wrong move: Verifying the Large Counts condition using a pooled $\hat{p}$ instead of individual sample counts. Why: Students carry over the pooled value from the standard error mistake to the condition check. Correct move: Always check that each sample has at least 10 successes and 10 failures using the observed counts from each sample separately.
Wrong move: Interpreting the interval as "there is a 95% chance that the true difference is between $a$ and $b$ ". Why: Students mix up the probability of the method working with the probability of the fixed parameter being in the interval. Correct move: Always use the phrasing "we are [C]% confident that the true difference... is between..." and reserve probability language for the sampling method when interpreting the confidence level.
Wrong move: Subtracting the variance terms inside the standard error square root instead of adding them. Why: Students match the subtraction in the point estimate to the variance calculation. Correct move: Remember that the variance of a difference of independent variables is the sum of the variances, so always add the terms inside the square root.
Wrong move: Concluding that two proportions are equal when 0 is inside the confidence interval. Why: Students confuse "no evidence of a difference" with "evidence of no difference". Correct move: When 0 is inside the interval, only conclude that there is not convincing evidence that the two proportions differ. The interval contains many non-zero values, so you cannot confirm equality.
Wrong move: Using this method for matched pairs (dependent) categorical data. Why: Students see two proportions and automatically use this method, even when samples are dependent (e.g., before/after on the same subjects). Correct move: For paired binary data, use a method designed for matched pairs (for example, McNemar's test, which is outside the AP Statistics course), not this two-sample independent method.
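To see why the pooling pitfall matters numerically, the sketch below compares the two standard errors on the same data. It is an illustration under assumptions: the counts (90 of 100 and 50 of 100) are invented to make the gap visible, and the pooled formula shown is the standard one used in two-proportion z-tests, $\sqrt{\hat{p}_{c}(1 - \hat{p}_{c})\left(\frac{1}{n_{1}} + \frac{1}{n_{2}}\right)}$ with $\hat{p}_{c} = \frac{x_{1} + x_{2}}{n_{1} + n_{2}}$.

```python
from math import sqrt


def unpooled_se(x1, n1, x2, n2):
    """Standard error for a confidence interval: separate sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def pooled_se(x1, n1, x2, n2):
    """Standard error for a hypothesis test of p1 = p2: combined proportion."""
    p_c = (x1 + x2) / (n1 + n2)
    return sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))


# Made-up counts chosen only to show that the two formulas disagree
x1, n1, x2, n2 = 90, 100, 50, 100

print("unpooled SE (use for intervals):", round(unpooled_se(x1, n1, x2, n2), 4))
print("pooled SE (tests of p1 = p2):  ", round(pooled_se(x1, n1, x2, n2), 4))
```

When the two sample proportions are close, the two values nearly coincide, which is one reason the mistake is easy to miss; the further apart the sample proportions are, the more the pooled value distorts the interval.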

6. Practice Questions (AP Statistics Style)

Question 1 (Multiple Choice)

A farmer tests a new organic fertilizer against a conventional fertilizer to compare germination rates of carrot seeds. He plants 200 seeds with the new fertilizer, and 160 germinate. He plants 150 seeds with conventional fertilizer, and 105 germinate. What is the margin of error for a 90% confidence interval for the difference (new minus conventional) in germination proportion?
A) 0.052
B) 0.064
C) 0.077
D) 0.091

Worked Solution: First calculate the sample proportions: $\hat{p}_{\text{new}} = 160/200 = 0.8$, $\hat{p}_{\text{conventional}} = 105/150 = 0.7$. Next calculate the standard error: $SE = \sqrt{\frac{0.8 (0.2)}{200} + \frac{0.7 (0.3)}{150}} = \sqrt{0.0008 + 0.0014} = \sqrt{0.0022} \approx 0.0469$. The critical $z^{*}$ for 90% confidence is 1.645. Multiply to get the margin of error: $ME = 1.645 \times 0.0469 \approx 0.077$. The correct answer is C.

Question 2 (Free Response)

A college admissions office wants to compare the proportion of first-year students who graduate within 4 years between in-state and out-of-state applicants. Independent random samples of 120 in-state first-years and 100 out-of-state first-years from 10 years ago are selected. 84 in-state students and 62 out-of-state students graduated within 4 years.
(a) Construct and interpret a 95% confidence interval for the difference $p_{\text{in-state}} - p_{\text{out-of-state}}$.
(b) Based on your interval, is there convincing evidence that the 4-year graduation rate differs between in-state and out-of-state first-years at this college? Justify your answer.
(c) Explain what "95% confidence" means in this context.

Worked Solution:

(a) Verify conditions: independent random samples are given; we assume each population of first-years is more than 10 times its sample size; and the counts (84 successes and 36 failures for in-state, 62 successes and 38 failures for out-of-state) are all ≥10, so conditions are met. Calculate: $\hat{p}_{\text{in}} = 84/120 = 0.7$, $\hat{p}_{\text{out}} = 62/100 = 0.62$. Point estimate = $0.7 - 0.62 = 0.08$. $SE = \sqrt{\frac{0.7 (0.3)}{120} + \frac{0.62 (0.38)}{100}} = \sqrt{0.001750 + 0.002356} \approx 0.0641$. $ME = 1.96 \times 0.0641 \approx 0.126$. Interval = $(0.08 - 0.126, 0.08 + 0.126) = (-0.046, 0.206)$. Interpretation: We are 95% confident that the true difference (in-state minus out-of-state) in 4-year graduation rates for first-years at this college is between -0.046 and 0.206.
(b) Because 0 is inside the 95% interval, there is not convincing evidence that 4-year graduation rates differ between in-state and out-of-state first-years at this college.
(c) If we repeatedly took random samples of 120 in-state and 100 out-of-state first-years and constructed a 95% confidence interval each time, about 95% of the intervals would capture the true difference in 4-year graduation rates.

Question 3 (Application / Real-World Style)

A transit agency wants to compare the on-time rate for bus routes on Main Street vs 1st Avenue. A random sample of 120 trips on Main Street found 102 were on time. A random sample of 150 trips on 1st Avenue found 135 were on time. Construct a 90% confidence interval for the difference (Main Street minus 1st Avenue) in true on-time proportion, and interpret your result in context.

Worked Solution: All conditions are met: the samples are independent random samples, the populations of trips are assumed to be more than 10 times the sample sizes, and the counts (102 on time and 18 late for Main Street, 135 on time and 15 late for 1st Avenue) are all ≥10. Calculate: $\hat{p}_{\text{Main}} = 102/120 = 0.85$, $\hat{p}_{\text{1st}} = 135/150 = 0.9$. Point estimate = $0.85 - 0.9 = -0.05$. $SE = \sqrt{\frac{0.85 (0.15)}{120} + \frac{0.9 (0.1)}{150}} = \sqrt{0.0010625 + 0.0006} \approx 0.0408$. $z^{*}$ for 90% = 1.645, so $ME = 1.645 \times 0.0408 \approx 0.067$. Interval = $(-0.05 - 0.067, -0.05 + 0.067) = (-0.117, 0.017)$. Interpretation: We are 90% confident that the true on-time rate for Main Street is between 11.7 percentage points lower and 1.7 percentage points higher than the on-time rate for 1st Avenue. Because 0 is inside the 90% interval, we do not have convincing evidence that the on-time rates differ between the two routes.

7. Quick Reference Cheatsheet

Category	Formula / Value	Notes
Parameter	$p_{1} - p_{2}$	True difference in proportions for two independent populations
Point Estimate	$\hat{p}_{1} - \hat{p}_{2} = \frac{x_{1}}{n_{1}} - \frac{x_{2}}{n_{2}}$	$x_{1}, x_{2}$ = number of successes per sample
Large Counts Condition	$n_{1} \hat{p}_{1} \geq 10$, $n_{1} (1 - \hat{p}_{1}) \geq 10$, $n_{2} \hat{p}_{2} \geq 10$, $n_{2} (1 - \hat{p}_{2}) \geq 10$	Check per sample, do not use pooled $\hat{p}$
Unpooled Standard Error	$SE = \sqrt{\frac{\hat{p}_{1} (1 - \hat{p}_{1})}{n_{1}} + \frac{\hat{p}_{2} (1 - \hat{p}_{2})}{n_{2}}}$	Always use unpooled for confidence intervals, never pool
Confidence Interval	$(\hat{p}_{1} - \hat{p}_{2}) \pm z^{*} \times SE$	General formula for any confidence level
Common Critical $z^{*}$	90% = 1.645, 95% = 1.96, 99% = 2.576	Memorize these to save time on the exam
Inference Rule	0 not in interval → convincing evidence of a difference; 0 in interval → no convincing evidence of a difference	Never conclude "proportions are equal" when 0 is in the interval
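The critical values in the cheatsheet come from the standard normal distribution: $z^{*}$ is the point that leaves $(1 - C)/2$ in each tail. As a quick check, the sketch below recovers them with Python's standard library; the helper name `z_star` is made up for illustration.

```python
from statistics import NormalDist


def z_star(confidence):
    """Critical z-value that leaves (1 - confidence) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")
# prints:
# 90%: z* = 1.645
# 95%: z* = 1.960
# 99%: z* = 2.576
```

On the exam itself you would read these from the provided table or a calculator's inverse-normal function rather than compute them this way.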

8. What's Next

This topic lays the foundational framework for comparing two independent populations, which is the core of most applied statistical inference. It directly sets up the next topic in Unit 6: hypothesis tests for the difference in two proportions. Without mastering the conditions, standard error logic, and interpretation rules here, you will struggle to distinguish between confidence interval and hypothesis test rules (like when pooling is appropriate), a common source of lost points on the AP exam. Beyond Unit 6, this comparison framework extends to comparing two population means in Unit 7, and to comparing proportions across more than two groups with chi-square tests in Unit 8. Mastery of this topic is required for all subsequent inference for categorical data in AP Statistics.
