AP · Comparing Distributions of a Quantitative Variable · 14 min read · Updated 2026-05-10

Comparing Distributions of a Quantitative Variable — AP Statistics Study Guide

For: Students preparing for the AP Statistics exam.

Covers: Systematic comparison of shape, center, spread, and outliers across two or more distributions of quantitative variables; graphical comparison with dotplots/histograms/boxplots; correct pairing of summary statistics; the 1.5*IQR outlier rule for comparisons.

You should already know: How to describe a single quantitative variable’s distribution. How to calculate common one-variable summary statistics. How to interpret basic graphical displays for quantitative data.

A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the AP Statistics style for educational use. They are not reproductions of past College Board / Cambridge / IB papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official mark schemes for grading conventions.

1. What Is Comparing Distributions of a Quantitative Variable?

Comparing distributions of a quantitative variable is the foundational exploratory skill of systematically identifying similarities and differences between two or more groups of quantitative data, rather than describing a single distribution in isolation. Per the AP Statistics Course and Exam Description (CED), this topic falls within Unit 1 (Exploring One-Variable Data), which accounts for 15-20% of total exam score weight. This topic appears regularly on both multiple-choice (MCQ) and free-response (FRQ) sections of the exam: you can expect 1-2 standalone MCQs, and it is almost always the first component of a multi-part FRQ focused on exploratory analysis. Notation conventions follow standard one-variable statistics: subscripts label distinct groups, so $\bar{x}_{1}, s_{1}, M_{1}, \text{IQR}_{1}$ refer to the mean, standard deviation, median, and interquartile range of Group 1, with matching notation for other groups. The core goal of comparison is to answer a practical question: do the groups differ systematically in their measured values, and if so, how?

2. Comparing Distributions Graphically

Comparisons almost always start with a graphical display, which lets you identify overall patterns and unusual features that may be hidden in summary statistics. Common displays for comparison are side-by-side dotplots (best for small datasets, show all individual points), overlapping or side-by-side histograms (best for large datasets, show overall shape), and side-by-side boxplots (ideal for comparing center, spread, and outliers across groups by explicitly displaying the five-number summary for each group). The consistent framework for graphical comparison follows four required components, in order: (1) compare shape, (2) compare center, (3) compare spread, (4) note any unusual features (outliers, clusters). All comparisons must be relative: you must explicitly state how one group differs from the other, not just describe one group in isolation.

Worked Example

A fitness researcher compares the number of minutes spent weekly doing cardio for two groups of adults: those who work full-time office jobs (Group O) and those who work full-time physically active jobs (Group A). The side-by-side dotplot of 12 adults per group is summarized below:

Group O: 10, 15, 20, 20, 25, 30, 30, 35, 40, 45, 50, 60
Group A: 0, 5, 10, 15, 15, 20, 20, 25, 30, 35, 40, 45

Compare the distributions of weekly cardio time graphically.

Shape: Both distributions are roughly unimodal and symmetric, with no extreme skewness.
Center: The center of the office worker distribution is higher: the median cardio time for Group O is 30 minutes, compared to 20 minutes for Group A.
Spread: The spread of values is similar for both groups: the range for Group O is $60 - 10 = 50$ minutes, and the range for Group A is $45 - 0 = 45$ minutes, so only a small difference in variability.
Unusual features: Neither distribution has outliers or clusters.
Conclusion: Office workers typically get more weekly cardio time outside of work than workers with physically active jobs, with similar variability in cardio time across both groups.
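The numbers in this worked example can be checked with a short Python sketch. It uses only the standard library and assumes quartiles are found as the medians of the lower and upper halves of the sorted data, which is one common classroom convention; other software may use a different quartile method. The helper names `quartiles` and `summarize` are illustrative, not part of any library.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    The middle value is left out of both halves when n is odd.
    Needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[len(data) - half:])


def summarize(values):
    """Median, range, and any values outside the 1.5*IQR cutoffs."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "median": median(values),
        "range": max(values) - min(values),
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


# Weekly cardio minutes from the worked example
group_o = [10, 15, 20, 20, 25, 30, 30, 35, 40, 45, 50, 60]
group_a = [0, 5, 10, 15, 15, 20, 20, 25, 30, 35, 40, 45]

summary_o = summarize(group_o)
summary_a = summarize(group_a)
median_gap = summary_o["median"] - summary_a["median"]

print(f"Median for Group O is {median_gap:g} minutes higher than for Group A")
print(f"Ranges: O = {summary_o['range']} min, A = {summary_a['range']} min")
print(f"Outliers: O = {summary_o['outliers']}, A = {summary_a['outliers']}")
```

Under that quartile convention, the sketch reports a 10-minute gap in medians, ranges of 50 and 45 minutes, and no values outside the 1.5*IQR cutoffs for either group, which matches the written comparison.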

Exam tip: Always make your comparison contextual and relative. Instead of writing "the median is 30 minutes," write "the median for office workers is 10 minutes higher than the median for active workers" to earn full credit on FRQs.

3. Comparing Center and Spread with Summary Statistics

Graphical comparison gives a qualitative big picture, while numerical summary statistics quantify the magnitude of differences between groups. The key rule for choosing appropriate summary statistics depends on the shape of the distribution and presence of outliers, and requires matching resistant measures to resistant measures and non-resistant to non-resistant:

If the distribution is symmetric with no outliers: Use mean for center and standard deviation for spread. Both are non-resistant (affected by skewness and outliers), so they pair correctly.
If the distribution is skewed or has outliers: Use median for center and interquartile range (IQR) for spread. Both are resistant (unaffected by extreme values), so they pair correctly.

Pairing median with standard deviation or mean with IQR is almost always incorrect, as it mixes resistant and non-resistant measures.
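The pairing rule is simple enough to write as a small decision function. The sketch below is only an illustration of the rule as stated above; the function name `choose_summary_pair` and the shape labels are assumptions made for the example.

```python
def choose_summary_pair(shape: str, has_outliers: bool) -> tuple[str, str]:
    """Return (center, spread) measures matched on resistance.

    shape is a plain label such as "symmetric", "right-skewed", or "left-skewed".
    """
    if shape == "symmetric" and not has_outliers:
        # Non-resistant pair: fine when nothing distorts the mean
        return ("mean", "standard deviation")
    # Resistant pair: skew or outliers would distort mean and SD
    return ("median", "IQR")


print(choose_summary_pair("symmetric", False))
print(choose_summary_pair("right-skewed", False))
print(choose_summary_pair("symmetric", True))
```

Only the first call returns the mean and standard deviation; any skew or outlier switches the choice to the median and IQR.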

Worked Example

The table below gives summary statistics for the monthly rent (in dollars) for one-bedroom apartments in two neighborhoods in a large city:

Neighborhood	Mean	Median	Standard Deviation	IQR	Shape
Downtown	1850	1725	420	550	Right-skewed
Suburb	1580	1490	310	420	Right-skewed

Compare the center and spread of monthly rents using appropriate summary statistics.

Choose appropriate statistics: Both distributions are right-skewed, so we use median (resistant center) and IQR (resistant spread).
Compare center: The median monthly rent for downtown one-bedroom apartments is \$1725, which is \$235 higher than the median rent of \$1490 in the suburb. This means typical rent is higher downtown.
Compare spread: The IQR of downtown rents is \$550, which is \$130 larger than the suburb IQR of \$420. This means there is more variability in typical rent prices downtown than in the suburb.
Why not mean/SD?: The right skew pulls the mean upward, so the difference in means (\$1850 - \$1580 = \$270) overstates the difference in typical rent between the two neighborhoods.
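The differences quoted in this example are plain subtractions, and a short Python sketch makes it easy to see the resistant and non-resistant gaps side by side. The dictionary layout and the helper name `gap` are assumptions made for illustration; the values come from the table above.

```python
# Summary statistics for monthly rent (dollars), copied from the table
rent = {
    "Downtown": {"mean": 1850, "median": 1725, "sd": 420, "iqr": 550},
    "Suburb": {"mean": 1580, "median": 1490, "sd": 310, "iqr": 420},
}


def gap(stat):
    """Downtown value minus Suburb value for the named statistic."""
    return rent["Downtown"][stat] - rent["Suburb"][stat]


median_gap = gap("median")  # resistant center
iqr_gap = gap("iqr")        # resistant spread
mean_gap = gap("mean")      # non-resistant center, shown for contrast

print(f"Downtown median rent is ${median_gap} higher than the suburb median")
print(f"Downtown IQR is ${iqr_gap} larger than the suburb IQR")
print(f"Difference in means is ${mean_gap}, larger than the median gap")
```

The gap in means (270 dollars) exceeds the gap in medians (235 dollars), which is the numerical reason the solution prefers the median for these right-skewed distributions.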

Exam tip: AP FRQs almost always award a separate point for choosing the correct summary statistics. Explicitly state why you chose your statistics if the question gives you shape information, to guarantee you earn that point.

4. Comparing Shape and Outliers Across Groups

Shape describes the overall pattern of a distribution, and differences in shape between groups are often as important as differences in center or spread. Key features to compare include: (1) skewness (symmetric vs right/left skewed), (2) modality (unimodal vs bimodal), and (3) presence and location of outliers. Skewness can usually be inferred from the relative position of mean and median: the mean is pulled toward the long tail of the distribution, so $\bar{x} > M$ indicates right skew, and $\bar{x} < M$ indicates left skew. Outliers are values far from the overall pattern of the data, and we use the formal 1.5*IQR rule to confirm outliers: any value less than $Q_{1} - 1.5 \times \text{IQR}$ or greater than $Q_{3} + 1.5 \times \text{IQR}$ is classified as an outlier. When comparing, note how many outliers each group has and where they fall (e.g., "the treatment group has two unusually high yield outliers").
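The mean-versus-median check described above can be sketched as a small Python function. Treat it as a quick heuristic rather than a formal test of skewness; the function name `skew_direction` is an assumption made for illustration, and the sample values are the means and medians from the rent and battery-life tables in this guide.

```python
def skew_direction(mean, median):
    """Suggest a skew direction from the relative position of mean and median."""
    if mean > median:
        return "right-skewed"  # mean pulled toward a long upper tail
    if mean < median:
        return "left-skewed"   # mean pulled toward a long lower tail
    return "no skew indicated"


# Downtown rent: mean 1850, median 1725
print(skew_direction(1850, 1725))
# Brand X battery life: mean 16.2, median 17
print(skew_direction(16.2, 17))
```

Both results agree with the shapes listed in those tables: right-skewed for downtown rent and left-skewed for Brand X battery life.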

Worked Example

A coffee shop chain compares the number of daily customer transactions at two of its locations over 25 days. The five-number summaries (in hundreds of transactions) are:

Location A (mall): Min = 12, Q1 = 18, Med = 24, Q3 = 29, Max = 42
Location B (street corner): Min = 8, Q1 = 19, Med = 25, Q3 = 30, Max = 36

Compare shape and identify any outliers for the two locations.

Calculate outlier cutoffs for each location:
- Location A: $\text{IQR} = 29 - 18 = 11$, lower cutoff = $18 - 1.5(11) = 1.5$, upper cutoff = $29 + 1.5(11) = 45.5$. The minimum of 12 is above 1.5 and the maximum of 42 is below 45.5, so no outliers.
- Location B: $\text{IQR} = 30 - 19 = 11$, lower cutoff = $19 - 1.5(11) = 2.5$, upper cutoff = $30 + 1.5(11) = 46.5$. All values are between 8 and 36, which lies inside the cutoffs, so no outliers.
Infer skewness from five-number summary:
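- For Location A: The distance from min to median is $24 - 12 = 12$, and from median to max is $42 - 24 = 18$. The distance from Q1 to median is $24 - 18 = 6$, and from median to Q3 is $29 - 24 = 5$. The middle 50% is nearly balanced, but the upper tail is noticeably longer than the lower tail, which indicates a mild right skew.
- For Location B: The distance from min to median is $25 - 8 = 17$, and from median to max is $36 - 25 = 11$. The distance from Q1 to median is $25 - 19 = 6$, and from median to Q3 is $30 - 25 = 5$. The middle 50% is again nearly balanced, but the lower tail is longer, which indicates a mild left skew.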
Conclusion: Location A has a mildly right-skewed distribution of daily transactions with no outliers, while Location B has a mildly left-skewed distribution with no outliers.
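The two steps in this worked example (cutoffs, then tail lengths) can be sketched in Python using only the five-number summaries. The class name `FiveNumber` and its method names are assumptions made for illustration, and the skew hint simply compares the two halves of the range, so it should be read as a rough guide rather than a definitive shape diagnosis.

```python
from dataclasses import dataclass


@dataclass
class FiveNumber:
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    def fences(self):
        """Lower and upper 1.5*IQR cutoffs."""
        iqr = self.q3 - self.q1
        return self.q1 - 1.5 * iqr, self.q3 + 1.5 * iqr

    def extremes_outside_fences(self):
        """Min and/or max if they fall outside the cutoffs."""
        low, high = self.fences()
        return [v for v in (self.minimum, self.maximum) if v < low or v > high]

    def skew_hint(self):
        """Compare min-to-median distance with median-to-max distance."""
        lower = self.median - self.minimum
        upper = self.maximum - self.median
        if upper > lower:
            return "right"
        if lower > upper:
            return "left"
        return "none"


# Daily transactions (hundreds), from the worked example
location_a = FiveNumber(12, 18, 24, 29, 42)
location_b = FiveNumber(8, 19, 25, 30, 36)

for name, loc in (("A", location_a), ("B", location_b)):
    print(name, loc.fences(), loc.extremes_outside_fences(), loc.skew_hint())
```

A five-number summary only exposes the minimum and maximum, so this check can confirm that no outliers exist (as here) but cannot count outliers when an extreme is flagged.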

Exam tip: If summary statistics give you mean and median, always link skewness to their relative position explicitly: "since the mean is greater than the median, the distribution is right-skewed" is a clear, point-earning statement for AP rubrics.

5. Common Pitfalls (and how to avoid them)

Wrong move: Listing summary statistics for each group but not making an explicit comparison between groups, e.g., writing "downtown median is 1725, suburb median is 1490" without noting downtown is higher. Why: Students are used to describing single distributions and forget the task requires comparison. Correct move: For every statistic, explicitly state which group has a higher/larger value and by how much, in context.
Wrong move: Pairing median (resistant center) with standard deviation (non-resistant spread), or mean (non-resistant center) with IQR (resistant spread). Why: Students memorize measures separately but forget the pairing rule matching resistance. Correct move: Before writing your comparison, check shape: if skewed/outliers, use median + IQR; if symmetric/no outliers, use mean + standard deviation.
Wrong move: Calling a distribution right-skewed when the mean is less than the median. Why: Students mix up the direction of the tail and its effect on the mean. Correct move: Remember the mean is pulled toward the long tail: $\bar{x} > M$ = right skew, $\bar{x} < M$ = left skew.
Wrong move: Calling an extreme value an outlier without using the 1.5*IQR rule to confirm. Why: Students rely on visual guesswork instead of the formal rule required by AP rubrics. Correct move: Always calculate 1.5*IQR cutoffs if the five-number summary is given to confirm a value is an outlier.
Wrong move: Omitting units for numerical differences or reporting units for the wrong variable. Why: Students focus on the comparison and forget contextual units. Correct move: Always include units for all values and differences, e.g., "a $235 difference in median monthly rent" not just "a 235 difference."

6. Practice Questions (AP Statistics Style)

Question 1 (Multiple Choice)

The five-number summaries for the maximum daily temperature (in degrees Fahrenheit) in July for two U.S. cities are given below:

Portland, OR: Min = 60, Q1 = 68, Med = 73, Q3 = 80, Max = 92
Seattle, WA: Min = 61, Q1 = 69, Med = 75, Q3 = 81, Max = 89

Both distributions are approximately symmetric with no outliers. Which of the following comparisons is correct?
A) Seattle has a higher median maximum temperature and a larger IQR than Portland.
B) Seattle has a higher median maximum temperature and a smaller IQR than Portland.
C) Portland has a lower median maximum temperature and the same IQR as Seattle.
D) Portland has a higher median maximum temperature and a smaller IQR than Seattle.

Worked Solution: First, calculate IQR for each city to compare spread. For Portland: $\text{IQR} = 80 - 68 = 12$. For Seattle: $\text{IQR} = 81 - 69 = 12$. Compare medians: Portland's median is 73, which is 2 degrees lower than Seattle's median of 75. The IQR is identical for both cities. This matches the description in option C. Correct answer: C.

Question 2 (Free Response)

A consumer group compares the battery life (in hours) of two brands of rechargeable headphones. Summary statistics for 40 tested headphones of each brand are below:

Brand	Mean	Median	Standard Deviation	IQR	Shape
Brand X	16.2	17	3.1	3.8	Left-skewed
Brand Y	17.1	18	2.8	3.2	Left-skewed

(a) Using appropriate summary statistics, compare the center and spread of battery life for the two brands. (b) Explain why you chose the summary statistics you used in part (a) instead of the other available options. (c) What conclusion can you draw about the battery life of the two brands based on your comparison?

Worked Solution:
(a) Both distributions are left-skewed, so we use median for center and IQR for spread. The median battery life of Brand X is 17 hours, which is 1 hour shorter than the median battery life of 18 hours for Brand Y. The IQR for Brand X is 3.8 hours, which is 0.6 hours larger than the IQR of 3.2 hours for Brand Y. So Brand Y has longer typical battery life with less variability than Brand X.
(b) The mean is non-resistant, so the left skew pulls it toward the long left tail of low battery life values and it understates typical battery life. The median is resistant to skewness and extreme values, so it is a better measure of center. By the pairing rule, we match the resistant measure of center with the resistant measure of spread (IQR) instead of the non-resistant standard deviation.
(c) Brand Y's typical battery life is longer and more consistent than Brand X's, so consumers looking for longer battery life should choose Brand Y.

Question 3 (Application / Real-World Style)

A farm supply company tests two new varieties of corn to compare their yield (in bushels per acre). The five-number summaries for 20 test plots of each variety are given below:

Variety A: Min = 112, Q1 = 128, Med = 142, Q3 = 158, Max = 185
Variety B: Min = 108, Q1 = 135, Med = 148, Q3 = 162, Max = 210

Use the 1.5*IQR rule to check for outliers, then compare the yield distributions, and interpret your conclusion in context.

Worked Solution:
Outlier check: For Variety A, $\text{IQR} = 158 - 128 = 30$, lower cutoff = $128 - 1.5(30) = 83$, upper cutoff = $158 + 1.5(30) = 203$. All yields are between 112 and 185, so no outliers. For Variety B, $\text{IQR} = 162 - 135 = 27$, lower cutoff = $135 - 1.5(27) = 94.5$, upper cutoff = $162 + 1.5(27) = 202.5$. The maximum yield of 210 is above 202.5, so 210 bushels per acre is an outlier for Variety B. A five-number summary shows only the extremes, so Variety B has at least one high outlier.
Compare distributions: The median yield for Variety B is 148 bushels per acre, 6 bushels per acre higher than Variety A's median of 142. The IQR of Variety A (30 bushels per acre) is slightly larger than Variety B's (27 bushels per acre). Variety B has at least one unusually high yield outlier, while Variety A has no outliers.
Conclusion: Variety B has a higher typical corn yield per acre than Variety A, with similar variability in the middle 50% of yields and at least one test plot that produced a far higher yield than any plot of Variety A.
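The arithmetic in this solution can be reproduced with a brief Python sketch. The dictionary keys and the helper names `fences` and `flagged_extremes` are assumptions made for illustration. Because only five-number summaries are available, the check can flag the minimum or maximum but cannot say how many other plots lie beyond the cutoffs.

```python
def fences(summary):
    """Lower and upper 1.5*IQR cutoffs from a five-number summary dict."""
    iqr = summary["q3"] - summary["q1"]
    return summary["q1"] - 1.5 * iqr, summary["q3"] + 1.5 * iqr


def flagged_extremes(summary):
    """Return whichever of min and max falls outside the cutoffs."""
    low, high = fences(summary)
    return [v for v in (summary["min"], summary["max"]) if v < low or v > high]


# Corn yield (bushels per acre), from Question 3
variety_a = {"min": 112, "q1": 128, "median": 142, "q3": 158, "max": 185}
variety_b = {"min": 108, "q1": 135, "median": 148, "q3": 162, "max": 210}

median_gap = variety_b["median"] - variety_a["median"]
iqr_gap = (variety_a["q3"] - variety_a["q1"]) - (variety_b["q3"] - variety_b["q1"])

for name, summary in (("Variety A", variety_a), ("Variety B", variety_b)):
    print(name, "cutoffs:", fences(summary), "flagged:", flagged_extremes(summary))
print(f"Variety B median is {median_gap} bushels per acre higher")
print(f"Variety A IQR is {iqr_gap} bushels per acre larger")
```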

7. Quick Reference Cheatsheet

Category	Formula	Notes
Group notation	$\bar{x}_{g}, M_{g}, s_{g}, \text{IQR}_{g}$	Subscript $g$ labels distinct groups, standard for all comparisons
1.5*IQR Outlier Cutoffs	Lower: $Q_{1} - 1.5 \times \text{IQR}$ Upper: $Q_{3} + 1.5 \times \text{IQR}$	Any value outside this range is an outlier; required for AP problems
Skewness Rule	Right skew: $\bar{x} > M$ Left skew: $\bar{x} < M$	Mean is always pulled toward the long tail
Pairing (Symmetric, no outliers)	Compare $\bar{x}_{1}$ vs $\bar{x}_{2}$ (center), $s_{1}$ vs $s_{2}$ (spread)	Use non-resistant measures for symmetric, outlier-free data
Pairing (Skewed / has outliers)	Compare $M_{1}$ vs $M_{2}$ (center), $\text{IQR}_{1}$ vs $\text{IQR}_{2}$ (spread)	Use resistant measures for skewed data or data with outliers
Interquartile Range (IQR)	$Q_{3} - Q_{1}$	Resistant measure of spread, always paired with median
Range	$\text{Max} - \text{Min}$	Rough measure of spread, less precise than IQR or standard deviation for comparisons

8. What's Next

Mastering comparison of one-variable quantitative distributions is a critical prerequisite for all future topics in AP Statistics. Immediately after this topic in Unit 1, you will refine your exploratory analysis skills for categorical variables, before moving on to exploring relationships between two variables in Unit 2. This chapter’s focus on systematic, contextual comparison of groups directly feeds into the entire topic of statistical inference for two groups later in the course: when you conduct a hypothesis test for the difference in two population means, you are formally answering the question you first posed as an exploratory comparison in Unit 1. Without being able to correctly describe and compare distributions, you will not be able to interpret the results of inference tests or communicate conclusions clearly to stakeholders.
