AP Statistics · Sampling Distributions · 14 min read · Updated 2026-05-09

Sampling Distributions — AP Statistics Study Guide

For: students preparing for the AP Statistics exam.

Covers: Sampling distribution of a sample mean and a sample proportion, Central Limit Theorem, conditions (Random / 10% / Normal or Large Counts) for using normal approximations — AP Statistics Unit 5.

You should already know: Probability and random variables (Unit 4), normal distribution (Unit 1).

A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the AP Statistics style for educational use. They are not reproductions of past College Board papers. Use them to practise the technique; cross-check with official College Board mark schemes for grading conventions.


1. Why Sampling Distributions Matter

Unit 5 is the conceptual bridge between probability (Units 3-4) and inference (Units 6-9). Every confidence interval and hypothesis test you'll learn from Unit 6 onwards rests on knowing the sampling distribution of the relevant statistic. Unit 5 carries about 7-12% of the exam weighting directly, but its concepts underpin the inference units that account for much of the rest.

The single big idea: a statistic computed from a random sample is itself a random variable, with its own probability distribution called the sampling distribution. Knowing the shape, centre, and spread of this distribution lets you say "the sample mean is unlikely to be more than X away from the population mean".

2. Sampling distribution of a sample proportion

If we take samples of size n from a population with true proportion p, the sample proportion p̂ has:

  • Mean: μ_p̂ = p (the sampling distribution is unbiased).
  • Standard deviation (= standard error): σ_p̂ = √(p(1 − p)/n).
  • Shape: approximately Normal when the Large Counts condition holds: np ≥ 10 AND n(1 − p) ≥ 10.

Conditions for using these formulas:

  1. Random: data come from a random sample.
  2. 10% condition: n ≤ N/10 (sample is no more than 10% of the population, so independence of draws is approximately valid).
  3. Large Counts: as above.
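A quick simulation makes these formulas concrete. The sketch below (population values p = 0.3, n = 100 are arbitrary illustrative choices, not from the text) draws many samples and compares the empirical mean and SD of p̂ with μ_p̂ = p and σ_p̂ = √(p(1 − p)/n):

```python
import random
import math

random.seed(42)

p, n, reps = 0.3, 100, 20_000  # illustrative values

# Each p-hat is the fraction of successes in a Bernoulli(p) sample of size n.
p_hats = [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]

mean_sim = sum(p_hats) / reps
sd_sim = math.sqrt(sum((x - mean_sim) ** 2 for x in p_hats) / reps)

mean_theory = p                           # mu_p-hat = p
sd_theory = math.sqrt(p * (1 - p) / n)    # sigma_p-hat = sqrt(p(1-p)/n)

print(f"mean: simulated {mean_sim:.4f} vs theory {mean_theory:.4f}")
print(f"SD:   simulated {sd_sim:.4f} vs theory {sd_theory:.4f}")
```

With 20,000 repetitions the simulated mean and SD should agree with the formulas to two or three decimal places.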

3. Sampling distribution of a sample mean

For samples of size n from a population with mean μ and standard deviation σ:

  • Mean: μ_x̄ = μ.
  • Standard deviation: σ_x̄ = σ/√n (decreases as n grows).
  • Shape: approximately Normal when either the population is Normal or n ≥ 30 (Central Limit Theorem).

Conditions:

  1. Random: random sample.
  2. 10% condition: n ≤ N/10.
  3. Normal or large n: population Normal or n ≥ 30, as above (CLT).
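To see σ_x̄ = σ/√n in action, the sketch below simulates many sample means (the population values μ = 50, σ = 12 and the sample size n = 36 are assumptions for illustration):

```python
import random
import math

random.seed(0)

mu, sigma, n, reps = 50.0, 12.0, 36, 20_000  # illustrative values

# Draw many samples of size n from Normal(mu, sigma) and record each mean.
xbars = [sum(random.gauss(mu, sigma) for _ in range(n)) / n for _ in range(reps)]

mean_sim = sum(xbars) / reps
sd_sim = math.sqrt(sum((x - mean_sim) ** 2 for x in xbars) / reps)

print(f"mean of x-bar: {mean_sim:.3f}  (theory: {mu})")
print(f"SD of x-bar:   {sd_sim:.3f}  (theory sigma/sqrt(n) = {sigma / math.sqrt(n):.3f})")
```

Here σ/√n = 12/6 = 2, so the simulated SD of the sample means should land near 2 even though individual observations have SD 12.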

4. Central Limit Theorem (CLT)

For a sample of size n drawn from any population (Normal, skewed, bimodal, etc.) with finite mean μ and standard deviation σ:

The sampling distribution of x̄ approaches Normal(μ, σ/√n) as n → ∞.

Practical rule of thumb: n ≥ 30 is "large enough" for most populations. Exceptions: extremely skewed populations may need larger n for the approximation to be good.

CLT does not say:

  • The original population becomes Normal (it doesn't).
  • Individual data points are Normally distributed (they're not).
  • The shape of x̄'s distribution for small n is Normal (it isn't, unless the population is).
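A simulation makes the CLT's claim concrete. The sketch below (an Exponential population is an assumed example of a strongly right-skewed distribution, with theoretical skewness 2) shows that the sampling distribution of x̄ for n = 30 is far less skewed than the population itself:

```python
import random
import math

random.seed(1)

def skewness(xs):
    """Sample skewness: average cubed standardised deviation."""
    m = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))
    return sum(((x - m) / sd) ** 3 for x in xs) / len(xs)

# Strongly right-skewed population: Exponential(rate = 1), skewness = 2.
pop = [random.expovariate(1.0) for _ in range(50_000)]

# Sampling distribution of x-bar for n = 30.
n, reps = 30, 10_000
xbars = [sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

print(f"population skewness:       {skewness(pop):.2f}")
print(f"skewness of x-bar (n=30):  {skewness(xbars):.2f}")
```

The population stays just as skewed as ever; only the distribution of the sample mean moves towards Normal, which is exactly the distinction in the bullet list above.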

5. Bias and variability

A statistic is unbiased if the mean of its sampling distribution equals the parameter it estimates (e.g. μ_x̄ = μ). x̄ and p̂ are both unbiased.

Variability is the spread of the sampling distribution — controlled by n (larger n → less variable). To halve the standard error, multiply n by 4.

A good estimator is both unbiased and low-variability. A precise but biased estimator (always 5% off) and a high-variability unbiased estimator (averaging right but each estimate wildly different) are both bad.
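The "quadruple n to halve the SE" rule is just the √n in the denominator. A two-line check, with σ = 10 and n = 25 as assumed illustrative values:

```python
import math

sigma, n = 10.0, 25  # illustrative values

se = sigma / math.sqrt(n)                  # 10 / 5  = 2.0
se_quadrupled = sigma / math.sqrt(4 * n)   # 10 / 10 = 1.0, exactly half

print(se, se_quadrupled)  # 2.0 1.0
```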

6. Differences and combinations

Sampling distribution of p̂₁ − p̂₂ (independent samples): mean p₁ − p₂, standard deviation √(p₁(1 − p₁)/n₁ + p₂(1 − p₂)/n₂).

Sampling distribution of x̄₁ − x̄₂ (independent samples): mean μ₁ − μ₂, standard deviation √(σ₁²/n₁ + σ₂²/n₂).

Variances add (under independence); standard deviations don't. This is the foundation of two-sample inference.
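These two standard-error formulas can be wrapped in small helper functions. The values passed at the bottom are assumptions chosen for illustration, not taken from the text:

```python
import math

def se_diff_props(p1, n1, p2, n2):
    """SE of p-hat1 - p-hat2: variances add under independence, then take the root."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def se_diff_means(s1, n1, s2, n2):
    """SE of x-bar1 - x-bar2: sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Illustrative values:
print(f"{se_diff_props(0.55, 500, 0.50, 400):.4f}")
print(f"{se_diff_means(8.0, 40, 6.0, 50):.4f}")
```

Note that adding the SEs directly (0.0315 + 0.0250 for the proportions example) would overstate the spread; only the variances add.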

7. Worked Example

A factory claims 92% of its widgets pass inspection. A consumer group samples 200 widgets randomly and finds 175 pass (so p̂ = 175/200 = 0.875). What is the probability of getting a sample proportion of 0.875 or lower if the true rate is really 92%?

Solution.

  1. Conditions check:

    • Random ✓ (stated).
    • 10% condition: assume N ≥ 10n = 2000, i.e. the factory makes ≥ 2000 widgets — reasonable.
    • Large Counts: np = 200 × 0.92 = 184 ≥ 10 ✓; n(1 − p) = 200 × 0.08 = 16 ≥ 10 ✓.
  2. Sampling distribution of p̂: approximately Normal with μ_p̂ = 0.92, σ_p̂ = √(0.92 × 0.08/200) ≈ 0.0192.

  3. Z-score: z = (0.875 − 0.92)/0.0192 ≈ −2.35.

  4. Probability: P(p̂ ≤ 0.875) = P(Z ≤ −2.35) ≈ 0.0094.

Interpretation: only ~1% chance of seeing a sample proportion this low if the company's claim is true. That's strong evidence against the 92% claim — suggests the true rate is below 92%.
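The arithmetic above can be verified in a few lines of Python using only the standard library (n = 200, since 175 passing widgets with a sample proportion of 0.875 implies 175/0.875 = 200; the Normal CDF is built from math.erf):

```python
import math

def normal_cdf(z):
    """Standard Normal CDF via the error function (stdlib only)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p, n = 0.92, 200          # claimed rate and sample size
p_hat = 175 / n           # observed sample proportion, 0.875

sd = math.sqrt(p * (1 - p) / n)   # SD of the sampling distribution of p-hat
z = (p_hat - p) / sd              # standardise the observed proportion
prob = normal_cdf(z)              # P(p-hat <= 0.875) under the claim

print(f"SD = {sd:.4f}, z = {z:.2f}, P = {prob:.4f}")
```

Running it reproduces the hand calculation: SD ≈ 0.0192, z ≈ −2.35, probability just under 1%.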

8. Common Pitfalls

  • Confusing σ with σ/√n: the population SD is one thing; the SD of the sampling distribution of the mean is σ/√n, which is smaller. AP graders catch this often.
  • Ignoring conditions: state and check Random, 10%, and Normal/Large Counts conditions every time on the FRQ. Missing one loses points.
  • CLT confusion: CLT applies to the sampling distribution, not the population. The population's shape stays whatever it is.
  • Sample size confusion: bigger n → smaller standard error, not a smaller population SD.

9. Practice Questions (CED Style)

  1. A population has mean μ and SD σ. For samples of size 36, express σ_x̄ in terms of σ, and explain what condition on the population or on n is needed to justify a Normal approximation for x̄.
  2. A poll surveys 1000 voters. The true population proportion supporting candidate A is 0.55. Find the probability the sample proportion will be between 0.52 and 0.58.
  3. Two independent factories produce circuit boards. Factory A averages 5 defects per 100 boards (n = 400), Factory B averages 3 defects per 100 boards (n = 300). Find the standard error of p̂_A − p̂_B, and the probability the observed difference is greater than 4 percentage points.

10. Quick Reference Cheatsheet

  • p̂: μ_p̂ = p, σ_p̂ = √(p(1 − p)/n), Normal if np ≥ 10 AND n(1 − p) ≥ 10.
  • x̄: μ_x̄ = μ, σ_x̄ = σ/√n, Normal if pop. Normal OR n ≥ 30.
  • Conditions: Random, 10%, Normal/Large Counts. State all 3 every time.
  • CLT: x̄ approaches Normal as n → ∞ for any population shape.
  • Difference of two: variances add, SE = √(SE₁² + SE₂²).
  • Halve SE: quadruple n.

11. What's Next

Sampling distributions are the foundation for Unit 6 (Inference: Proportions) and Unit 7 (Inference: Means) — confidence intervals and hypothesis tests are direct applications. Drill the conditions check until it's automatic. Use Ollie to step through specific FRQ scenarios: "Why is my sample size too small to use the Normal approximation here?" or "Walk me through the Z-score logic for a difference of two means".

