Sampling Distribution of a Sample Mean — AP Statistics Study Guide
For: Candidates sitting the AP Statistics exam.
Covers: The mean and standard error of the sampling distribution of $\bar{x}$, the Central Limit Theorem, normality conditions, the 10% condition for independence, probability calculations for sample means, and distinguishing between population, sample, and sampling distributions.
You should already know: 1) The difference between a population parameter and a sample statistic. 2) Basic properties of the normal distribution and z-score calculation. 3) The general definition of a sampling distribution.
A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the AP Statistics style for educational use. They are not reproductions of past College Board / Cambridge / IB papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official mark schemes for grading conventions.
1. What Is the Sampling Distribution of a Sample Mean?
The sampling distribution of a sample mean (often shortened to the sampling distribution of $\bar{x}$) is the probability distribution of the sample mean calculated from every possible random sample of the same fixed size $n$ drawn from a given population. This is distinct from the population distribution (the distribution of all individual values in the population) and the sample distribution (the distribution of values in a single collected sample). By convention, we use $\mu$ for the population mean, $\sigma$ for the population standard deviation, $\mu_{\bar{x}}$ for the mean of the sampling distribution, and $\sigma_{\bar{x}}$ (called the standard error of the mean) for the standard deviation of the sampling distribution. According to the AP Statistics Course and Exam Description (CED), this topic makes up roughly 10-15% of Unit 5 (Sampling Distributions), which corresponds to 1-3% of the total AP exam score. It appears in multiple choice (MCQ) as standalone questions and in free response (FRQ) as a foundational step for inference questions about population means.
2. Mean and Standard Error of the Sampling Distribution of $\bar{x}$
Two core properties of any sampling distribution are its center (mean) and spread (standard deviation, called standard error here). For any simple random sample of size $n$ drawn from a population with mean $\mu$ and standard deviation $\sigma$, the mean of the sampling distribution of $\bar{x}$ is always equal to the population mean:
$$\mu_{\bar{x}} = \mu$$
This means the sample mean $\bar{x}$ is an unbiased estimator of the population mean $\mu$: over repeated sampling, the average of all possible sample means equals the true population mean. The spread of the sampling distribution is given by:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
This formula holds only when the observations in the sample are independent. For sampling without replacement, this requires the 10% condition: the sample size is no more than 10% of the total population size ($n \leq 0.10N$). If we sample with replacement or the population is effectively infinite, the observations are independent and the 10% check is not needed. Intuitively, increasing the sample size reduces the standard error, meaning larger samples produce sample means that are closer on average to the true population mean.
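A short derivation sketch shows where both formulas come from. It assumes the $n$ observations $X_1, \dots, X_n$ are independent draws with common mean $\mu$ and standard deviation $\sigma$; the $X_i$ notation is introduced here only for illustration.

```latex
\begin{align*}
E(\bar{x}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}(n\mu) = \mu, \\
\operatorname{Var}(\bar{x}) &= \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
            = \frac{1}{n^2}\left(n\sigma^2\right) = \frac{\sigma^2}{n}, \\
\sigma_{\bar{x}} &= \sqrt{\operatorname{Var}(\bar{x})} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

The first line needs no independence assumption, which is why $\mu_{\bar{x}} = \mu$ always holds. The second line adds variances term by term, which is valid only for independent observations; that is the reason the independence check matters for the standard error formula.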
Worked Example
The distribution of annual household income in a large city has a mean of $\mu$ and a standard deviation of $\sigma$. We take a random sample of 100 households from this city. Find the mean and standard error of the sampling distribution of the sample mean household income, justifying any conditions.
- Check 10% condition: 100 households is far less than 10% of all households in a large city, so the condition is satisfied, and we can use the standard error formula.
- Calculate the mean of the sampling distribution: $\mu_{\bar{x}} = \mu$.
- Calculate the standard error: $\sigma_{\bar{x}} = \sigma/\sqrt{n} = \sigma/\sqrt{100} = \sigma/10$.
Final answer: The mean of the sampling distribution is $\mu$, the population mean income, and the standard error is $\sigma/10$, one tenth of the population standard deviation.
Exam tip: On AP FRQs, you will lose a point if you do not explicitly state and check the 10% condition before using the formula. Always include this step even if the condition is obviously satisfied.
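To see why 10% is a reasonable cutoff, the sketch below compares $\sigma/\sqrt{n}$ with the exact standard deviation of $\bar{x}$ for sampling without replacement, which multiplies the formula by the finite population correction $\sqrt{(N-n)/(N-1)}$. The correction factor is a standard result that is not part of the AP formula sheet, and the function names and numerical values are hypothetical, chosen only for illustration.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean, assuming independent observations."""
    return sigma / math.sqrt(n)


def exact_sd_without_replacement(sigma: float, n: int, N: int) -> float:
    """Exact SD of the sample mean for a simple random sample of size n drawn
    without replacement from a finite population of size N."""
    fpc = math.sqrt((N - n) / (N - 1))  # finite population correction
    return standard_error(sigma, n) * fpc


# Hypothetical population: sigma = 10, N = 1000 individuals.
sigma, N = 10.0, 1000
for n in (50, 100, 500):
    approx = standard_error(sigma, n)
    exact = exact_sd_without_replacement(sigma, n, N)
    print(f"n={n:4d}  n/N={n / N:.2f}  sigma/sqrt(n)={approx:.4f}  "
          f"exact={exact:.4f}  ratio={exact / approx:.3f}")
```

When the sample is at most 10% of the population, the ratio stays at roughly 0.95 or higher, so $\sigma/\sqrt{n}$ is a close approximation. At 50% of the population the formula overstates the spread substantially.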
3. The Central Limit Theorem (CLT)
The Central Limit Theorem (CLT) is the key result that allows us to use normal distribution calculations for sample means even when the underlying population is not normally distributed. Formally, the CLT states that for any population distribution with a finite standard deviation (regardless of its shape: skewed, uniform, bimodal, etc.), the sampling distribution of the sample mean $\bar{x}$ becomes approximately normal as the sample size $n$ increases. For AP Statistics, we use the rule of thumb that $n \geq 30$ is a large enough sample size for the CLT approximation to hold. If the original population is already normally distributed, the sampling distribution of $\bar{x}$ is exactly normal for any sample size, no matter how small, so we do not need the CLT in that case.
Intuitively, the CLT works because averaging cancels out extreme values in individual observations. Even if many individual values are very high or very low, the average of multiple values will tend to cluster around the mean, producing a bell-shaped distribution for the average even if the original distribution is not bell-shaped.
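The averaging effect can be checked with a small simulation. The sketch below assumes an exponential population with mean 1 as a stand-in for a strongly right-skewed distribution (a choice made here for illustration, not taken from the guide) and measures how the skewness of the simulated sample means shrinks as $n$ grows.

```python
import random
import statistics


def skewness(values):
    """Skewness: mean cubed deviation divided by the cubed standard deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([(v - m) ** 3 for v in values]) / s ** 3


def simulate_sample_means(n, reps, rng):
    """Draw `reps` samples of size n from an exponential population (mean 1)
    and return the list of their sample means."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (1, 5, 30, 100):
    means = simulate_sample_means(n, 5000, rng)
    results[n] = skewness(means)
    print(f"n={n:3d}  skewness of sample means = {results[n]:.2f}")
```

A skewness near 0 indicates a roughly symmetric distribution. The simulated skewness should fall steadily toward 0 as $n$ increases, which is the CLT at work; it does not reach exactly 0 at $n = 30$, a reminder that the threshold is a rule of thumb.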
Worked Example
The distribution of waiting time for a bus at a busy downtown stop is heavily right-skewed, because most buses arrive within a few minutes, but occasionally there is a long gap between buses. A city transit planner takes a random sample of 36 waiting times to study the distribution. Is the sampling distribution of the sample mean waiting time approximately normal? Justify your answer.
- We know the population distribution of waiting times is heavily right-skewed, so it is not normal.
- The sample size is $n = 36$, which is greater than the AP threshold of 30 for the Central Limit Theorem.
- By the CLT, the sampling distribution of the sample mean waiting time will be approximately normal, regardless of the skewness of the population distribution.
Conclusion: Yes, the sampling distribution is approximately normal.
Exam tip: Only invoke the Central Limit Theorem when the population is non-normal. If the population is already normal, you do not need CLT to claim normality of the sampling distribution; stating the population is normal is sufficient for any sample size. AP graders will deduct points for mis-citing CLT in this case.
4. Calculating Probabilities for Sample Means
The most common application of this topic on the AP exam is calculating probabilities for a sample mean falling in a given range. The step-by-step process for this calculation is:
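- Check conditions: (a) 10% condition for independence, (b) normality of the sampling distribution (either the population is normal, or $n \geq 30$ and the CLT applies).
- Calculate $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
- Calculate the z-score for the observed sample mean $\bar{x}$: $z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$. Note that this is different from the z-score for an individual observation, which uses $\sigma$ (population standard deviation) instead of $\sigma_{\bar{x}}$ (standard error).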
- Use the standard normal distribution to find the desired probability.
Worked Example
The mean score on a national high school biology exam is 72 with a standard deviation of 12. A random sample of 64 students who took the exam is selected. What is the probability that the mean exam score for the sample is between 70 and 74?
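- Check conditions: 10% condition: 64 students is less than 10% of all students who took the national exam, so it is satisfied. Normality: $n = 64 \geq 30$, so the CLT applies and the sampling distribution is approximately normal.
- Calculate parameters: $\mu_{\bar{x}} = 72$, $\sigma_{\bar{x}} = 12/\sqrt{64} = 1.5$.
- Calculate z-scores: $z_1 = (70 - 72)/1.5 \approx -1.33$, $z_2 = (74 - 72)/1.5 \approx 1.33$.
- Find probability: $P(70 < \bar{x} < 74) = P(-1.33 < Z < 1.33) = 0.9082 - 0.0918 = 0.8164$.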
Final answer: The probability is approximately 0.816.
Exam tip: If the question asks for the probability that an individual observation falls in a range, use $\sigma$; if it asks for the probability that a sample mean falls in a range, always use $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. Double-check which one the question asks for before calculating.
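The four steps can be sketched as a small Python function using the standard library's `statistics.NormalDist`. The function name `prob_sample_mean_between` is hypothetical, and the code assumes the conditions have already been checked by hand; it performs only the arithmetic.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_between(mu, sigma, n, low, high):
    """P(low < x-bar < high), assuming the sampling distribution of x-bar is
    (approximately) normal with mean mu and standard deviation sigma/sqrt(n)."""
    se = sigma / sqrt(n)          # standard error of the sample mean
    z_low = (low - mu) / se       # z-score of the lower bound
    z_high = (high - mu) / se     # z-score of the upper bound
    std_normal = NormalDist()     # standard normal: mean 0, sd 1
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


# Values from the biology exam worked example in this section.
p = prob_sample_mean_between(mu=72, sigma=12, n=64, low=70, high=74)
print(f"P(70 < x-bar < 74) = {p:.4f}")
```

Because the code does not round the z-scores to two decimal places, its result is slightly higher than the table-based 0.816. Either value should be acceptable as long as the work is shown.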
5. Common Pitfalls (and how to avoid them)
- Wrong move: Using $\sigma$ instead of $\sigma/\sqrt{n}$ when calculating the standard error or z-score for a sample mean. Why: Confuses the standard deviation of the population (for individual observations) with the standard deviation of the sampling distribution of the sample mean, a very common mix-up on MCQs. Correct move: Every time you work with a sample mean $\bar{x}$, use $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ for the standard deviation, regardless of the problem context.
- Wrong move: Claiming the Central Limit Theorem makes the population distribution normal when $n$ is large. Why: Confuses the sampling distribution of the statistic with the original population distribution. Correct move: Remember the CLT describes the shape of the distribution of all possible sample means, not the shape of the population or a single sample.
- Wrong move: Forgetting to check the 10% condition when using the formula $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ for sampling without replacement. Why: Students focus on normality conditions and skip the independence check required for the formula to hold. Correct move: Explicitly state and check the 10% condition on every FRQ problem involving a sampling distribution of a sample mean.
- Wrong move: Claiming that $n < 30$ means the sampling distribution of $\bar{x}$ can never be normal. Why: Confuses the CLT requirement for non-normal populations with the case of normally distributed populations. Correct move: If the original population is normally distributed, the sampling distribution of $\bar{x}$ is exactly normal even for $n < 30$; the $n \geq 30$ rule only applies to non-normal populations.
- Wrong move: Invoking the Central Limit Theorem to claim normality when the population is already normally distributed. Why: Students memorize "CLT = normality" and cite it automatically, even when it is unnecessary. Correct move: Only cite the CLT when the population is non-normal and $n \geq 30$; if the population is normal, state that the sampling distribution is exactly normal for any sample size.
- Wrong move: Interpreting $\mu_{\bar{x}}$ as the mean of one sample. Why: Mixes up the three levels of distribution: population, sample, and sampling distribution. Correct move: Remember $\mu_{\bar{x}}$ is the mean of all possible sample means from samples of size $n$, not the mean of one sample. A single sample's mean is $\bar{x}$.
6. Practice Questions (AP Statistics Style)
Question 1 (Multiple Choice)
The distribution of the weight of avocados sold at a farmers market is approximately normal with mean 150 grams and standard deviation 20 grams. A random sample of 4 avocados is selected. What is the probability that the mean weight of the sample is less than 135 grams?
A) 0.0668 B) 0.2266 C) 0.2734 D) 0.9332
Worked Solution: First, confirm conditions: the population of avocado weights is normal, so the sampling distribution of $\bar{x}$ is exactly normal, and the 10% condition is satisfied because 4 avocados is less than 10% of all avocados at the market. Calculate $\mu_{\bar{x}} = 150$ grams and $\sigma_{\bar{x}} = 20/\sqrt{4} = 10$ grams. The z-score for $\bar{x} = 135$ is $z = (135 - 150)/10 = -1.5$. Looking up $z = -1.5$ in the standard normal table gives 0.0668. Option B is the result of incorrectly using $\sigma = 20$ instead of $\sigma_{\bar{x}} = 10$, which gives the probability for an individual avocado. The correct answer is A.
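As a quick numerical check of Question 1, the sketch below computes both the correct probability for the sample mean and the distractor obtained by treating the problem as one about a single avocado. Variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 150, 20, 4  # values from Question 1

# Population distribution: weight of one avocado.
individual = NormalDist(mu, sigma)
# Sampling distribution: mean weight of n avocados.
sample_mean = NormalDist(mu, sigma / sqrt(n))

p_individual = individual.cdf(135)  # uses sigma = 20 (the common mistake)
p_mean = sample_mean.cdf(135)       # uses sigma / sqrt(n) = 10

print(f"P(X < 135)     = {p_individual:.4f}")
print(f"P(x-bar < 135) = {p_mean:.4f}")
```

The first value corresponds to option B and the second to option A, which shows how much the answer changes when the wrong standard deviation is used.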
Question 2 (Free Response)
A large chain of coffee shops collects data on the amount of time customers wait in line to receive their order. The population distribution of wait times is strongly right-skewed with a mean of 3.2 minutes and standard deviation of 1.5 minutes. A random sample of 50 customers is selected. (a) Calculate the mean and standard deviation of the sampling distribution of the sample mean wait time. Be sure to check any necessary conditions. (b) What is the shape of this sampling distribution? Justify your answer. (c) Would the shape of the sampling distribution change if the sample size was decreased to 8? If so, how? Explain.
Worked Solution: (a) First, check the 10% condition: 50 customers is less than 10% of all customers of the large chain, so the condition is satisfied. The mean of the sampling distribution is equal to the population mean: $\mu_{\bar{x}} = \mu = 3.2$ minutes. The standard error is $\sigma_{\bar{x}} = 1.5/\sqrt{50} \approx 0.212$ minutes. (b) The shape of the sampling distribution is approximately normal. The population distribution is strongly right-skewed (non-normal), but the sample size is $n = 50 \geq 30$, so by the Central Limit Theorem the sampling distribution of $\bar{x}$ is approximately normal. (c) Yes, the shape would change. For $n = 8 < 30$, the sample is too small for the Central Limit Theorem to apply, so the sampling distribution of $\bar{x}$ would remain noticeably right-skewed (though less skewed than the population itself) and would not be approximately normal.
Question 3 (Application / Real-World Style)
A bottling machine fills 12-ounce soda bottles. The actual amount of soda dispensed by the machine follows a uniform distribution between 11.8 and 12.4 ounces, so the population mean is 12.1 ounces and population standard deviation is approximately 0.173 ounces. A quality control inspector takes a random sample of 35 bottles to check the machine's calibration. What is the probability that the mean amount of soda in the sample is between 12.0 and 12.2 ounces? Interpret your result in context.
Worked Solution:
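- Check conditions: 35 bottles is less than 10% of all bottles filled by the machine, so the 10% condition is satisfied. $n = 35 \geq 30$, so the CLT applies and the sampling distribution is approximately normal.
- Calculate parameters: $\mu_{\bar{x}} = 12.1$ ounces, $\sigma_{\bar{x}} = 0.173/\sqrt{35} \approx 0.0292$ ounces.
- Calculate z-scores: $z_1 = (12.0 - 12.1)/0.0292 \approx -3.42$, $z_2 = (12.2 - 12.1)/0.0292 \approx 3.42$.
- Find probability: $P(12.0 < \bar{x} < 12.2) = P(-3.42 < Z < 3.42) \approx 0.9997 - 0.0003 = 0.9994$.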
Interpretation: If the machine is calibrated correctly as described, approximately 99.94% of all random samples of 35 bottles will have a mean amount of soda between 12.0 and 12.2 ounces.
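The interpretation describes what happens over many repeated samples, so it can be checked by simulation. The sketch below is a Monte Carlo check, not part of the AP solution method: it draws many samples of 35 fills from the stated uniform distribution and counts how often the sample mean lands between 12.0 and 12.2 ounces. The seed and number of repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is repeatable
reps, n = 20000, 35      # number of simulated samples, bottles per sample

inside = 0
for _ in range(reps):
    # One sample of n fills from Uniform(11.8, 12.4), reduced to its mean.
    xbar = statistics.fmean([rng.uniform(11.8, 12.4) for _ in range(n)])
    if 12.0 < xbar < 12.2:
        inside += 1

proportion = inside / reps
print(f"Simulated P(12.0 < x-bar < 12.2) = {proportion:.4f}")
```

The simulated proportion should land very close to the normal-approximation answer, with only a handful of the simulated samples falling outside the interval.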
7. Quick Reference Cheatsheet
| Category | Formula | Notes |
|---|---|---|
| Mean of Sampling Distribution of $\bar{x}$ | $\mu_{\bar{x}} = \mu$ | Always true for simple random samples; $\bar{x}$ is an unbiased estimator of $\mu$ |
| Standard Error of the Sample Mean | $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ | Requires $n \leq 10\%$ of the population for sampling without replacement |
| Z-score for a Sample Mean | $z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$ | Different from z-score for an individual observation, which uses $\sigma$ |
| 10% Condition | $n \leq 0.10N$ | $N$ = population size; ensures independent observations when sampling without replacement |
| Normality: Normal Population | Sampling distribution of $\bar{x}$ is exactly normal | Applies for any sample size, no minimum $n$ required |
| Normality: Non-Normal Population | Approximately normal by CLT if $n \geq 30$ | $n \geq 30$ is the AP Statistics rule of thumb for "large enough" |
| Central Limit Theorem | For large $n$, $\bar{x}$ is approximately $N(\mu, \sigma/\sqrt{n})$ | Applies to any population distribution, regardless of original shape |
8. What's Next
This topic is the foundational prerequisite for all statistical inference involving population means, which makes up roughly 20% of the total AP exam score across Units 6, 7, and 8. Next, you will apply the properties of the sampling distribution of $\bar{x}$ you learned here to construct confidence intervals for an unknown population mean and conduct significance tests for claims about a population mean. Without mastering this topic, you will not be able to correctly check conditions or interpret results for these inference procedures, which are heavily tested on the FRQ section. This topic also paves the way for inference comparing two population means, a common FRQ question.
Follow-on topics:
- Confidence Intervals for a Population Mean
- Significance Tests for a Population Mean
- Sampling Distribution of a Sample Proportion
- Inference for Two Population Means