IBO · ibo-math-aa-hl · IB Math: Analysis & Approaches HL · Statistics and Probability · 18 min read · Updated 2026-05-06

Statistics and Probability: IB Math AA HL Study Guide

For: IB Math AA HL candidates sitting IB Math: Analysis & Approaches HL.

Covers: Descriptive statistics (measures of centre and spread), conditional probability and independent events, discrete and continuous distributions (binomial and normal), Bayes' theorem (HL), and hypothesis testing intuition.

You should already know: IGCSE / pre-DP math, comfort with proof and algebra.

A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the IB Math AA HL style for educational use. They are not reproductions of past IBO papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official IBO mark schemes for grading conventions.

1. What Is Statistics and Probability?

Statistics and Probability is the branch of mathematics focused on collecting, analyzing, and interpreting numerical data, and quantifying the likelihood of random events occurring. For IB Math AA HL, this topic blends computational skills (e.g. calculating distribution probabilities) with conceptual reasoning (e.g. justifying conclusions in hypothesis tests) and accounts for 30-40% of your final assessment across Papers 2 and 3, including both short-response and extended problem-solving questions. It is also one of the most practically applicable topics, with real-world uses in fields from medicine to economics.

2. Descriptive statistics — measures of centre and spread

Descriptive statistics summarize datasets to make them interpretable, without drawing broader conclusions about the population the data is drawn from. The two core categories of descriptive measures are centre (average value) and spread (variability of values).

Measures of centre

Mean: The arithmetic average of the dataset. For ungrouped data: $\bar{x} = \frac{\sum x}{n}$ . For grouped data, use class midpoints ( $x$ ) and frequencies ( $f$ ): $\bar{x} = \frac{\sum f x}{\sum f}$ .
Median: The middle value of an ordered dataset. For ungrouped data with $n$ points, it is the value at position $\frac{n + 1}{2}$ ; when $n$ is even, this position falls halfway between two values, so take their average. For grouped data, use linear interpolation to estimate the median within the median class.
Mode: The most frequently occurring value, rarely tested for HL except for discrete distributions.

Measures of spread

Interquartile range (IQR): Difference between the 75th percentile ( $Q_{3}$ ) and 25th percentile ( $Q_{1}$ ), $\text{IQR} = Q_{3} - Q_{1}$ . Unlike the range, it is not affected by outliers.
Variance: Average squared deviation from the mean. For ungrouped data: $σ^{2} = \frac{\sum (x - \bar{x})^{2}}{n} = \frac{\sum x^{2}}{n} - \bar{x}^{2}$ . For grouped data, weight by frequency: $σ^{2} = \frac{\sum f x^{2}}{\sum f} - \bar{x}^{2}$ .
Standard deviation: Square root of variance, measured in the same units as the original data.

Worked example: For the ungrouped dataset [2, 4, 4, 6, 7, 9, 10], the mean is $\frac{42}{7} = 6$ , the median is the 4th value = 6, $Q_{1} = 4$ , $Q_{3} = 9$ , $\text{IQR} = 5$ , variance $= \frac{302}{7} - 6^{2} \approx 7.14$ , and standard deviation ≈ 2.67.

Exam tip: Always check if the question asks for population (divide by $n$ ) or sample (divide by $n - 1$ ) variance; if unspecified, use population variance for descriptive statistics questions.
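The worked example above can be checked with Python's standard `statistics` module. This is a minimal sketch rather than an exam method: the variable names are illustrative, and the `"exclusive"` quartile method is chosen here because it reproduces the hand values for this dataset (calculators and other software may use slightly different quartile conventions).

```python
import statistics

data = [2, 4, 4, 6, 7, 9, 10]

mean = statistics.mean(data)        # arithmetic mean
median = statistics.median(data)    # middle value of the sorted data

# "exclusive" places the quartiles at positions (n + 1)/4, (n + 1)/2, 3(n + 1)/4
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

pop_var = statistics.pvariance(data)     # population variance: divides by n
pop_sd = statistics.pstdev(data)         # square root of the population variance
sample_var = statistics.variance(data)   # sample variance: divides by n - 1

print(mean, median, q1, q3, iqr)
print(round(pop_var, 2), round(pop_sd, 2), round(sample_var, 2))
```

Comparing `pop_var` with `sample_var` shows how much the choice of divisor matters for a small dataset like this one.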

3. Probability concepts — conditional, independent events

Probability quantifies the likelihood of an event occurring, with values between 0 (impossible) and 1 (certain). Core foundational rules include:

Complement rule: $P (A^{'}) = 1 - P (A)$ , where $A^{'}$ is the event that $A$ does not occur.
Union rule: $P (A \cup B) = P (A) + P (B) - P (A \cap B)$ . For mutually exclusive events (cannot occur at the same time), $P (A \cap B) = 0$ , so $P (A \cup B) = P (A) + P (B)$ .

Conditional probability

Conditional probability is the probability of event $A$ occurring given that event $B$ has already occurred, defined as: $P (A \mid B) = \frac{P (A \cap B)}{P (B)}, \quad P (B) \neq 0$

Independent events

Two events are independent if the occurrence of one does not affect the probability of the other. This means $P (A ∣ B) = P (A)$ and $P (B ∣ A) = P (B)$ , which rearranges to the independence test: $P (A \cap B) = P (A) P (B)$

Worked example: You draw one card from a standard 52-card deck. Let $A$ = drawing a heart, $B$ = drawing a king. $P (A) = \frac{1}{4}$ , $P (B) = \frac{1}{13}$ , $P (A \cap B) = \frac{1}{52}$ . Since $\frac{1}{4} \times \frac{1}{13} = \frac{1}{52}$ , $A$ and $B$ are independent.

Exam tip: Never confuse mutually exclusive and independent events! If two events are mutually exclusive and have non-zero probability, they cannot be independent, as $P (A \cap B) = 0$ cannot equal $P (A) P (B) > 0$ .
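The card example and the exam tip can both be verified by brute-force enumeration. The sketch below builds a 52-card deck and counts outcomes with exact fractions; the helper `prob` and the event names are illustrative choices, not part of any syllabus notation.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely (rank, suit) pairs

def prob(event):
    """Probability that one uniformly drawn card satisfies the predicate `event`."""
    favourable = sum(1 for card in deck if event(card))
    return Fraction(favourable, len(deck))

def is_heart(card):
    return card[1] == "hearts"

def is_king(card):
    return card[0] == "K"

def is_queen(card):
    return card[0] == "Q"

p_a = prob(is_heart)                                      # 1/4
p_b = prob(is_king)                                       # 1/13
p_a_and_b = prob(lambda c: is_heart(c) and is_king(c))    # 1/52
p_a_given_b = p_a_and_b / p_b                             # conditional probability rule

hearts_kings_independent = p_a_and_b == p_a * p_b

# Contrast: "king" and "queen" are mutually exclusive with non-zero probabilities,
# so the intersection is 0 and the independence test must fail.
p_k_and_q = prob(lambda c: is_king(c) and is_queen(c))
kings_queens_independent = p_k_and_q == prob(is_king) * prob(is_queen)

print(p_a, p_b, p_a_and_b, p_a_given_b, hearts_kings_independent)
print(p_k_and_q, kings_queens_independent)
```

Using `Fraction` keeps every probability exact, so the independence test is a true equality check rather than a floating-point comparison.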

4. Discrete and continuous distributions — binomial, normal

Probability distributions describe the probability of every possible outcome of a random variable.

Discrete distributions: Binomial

A discrete random variable takes distinct, countable values. The binomial distribution applies to scenarios with:

Fixed number of independent trials $n$
Two possible outcomes (success/failure)
Constant probability of success $p$ across all trials

Notation: $X \sim B (n, p)$ . The probability of $k$ successes is: $P (X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}, \quad k = 0, 1, ..., n$ . Mean $E (X) = n p$ , variance $\text{Var} (X) = n p (1 - p)$ .

Worked example: You roll a fair die 10 times, $X$ = number of 6s rolled. $X \sim B (10, \frac{1}{6})$ . $P (X = 2) = \binom{10}{2} (\frac{1}{6})^{2} (\frac{5}{6})^{8} \approx 0.291$ , $E (X) = \frac{10}{6} \approx 1.67$ , $\text{Var} (X) = \frac{50}{36} \approx 1.39$ .
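The binomial formula translates directly into a few lines of Python. This is a small sketch of the die example; `binomial_pmf` is an illustrative helper name, and on the exam you would normally use your calculator's binomial functions instead.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): choose which k trials succeed, then multiply the probabilities."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 1 / 6

p_two_sixes = binomial_pmf(2, n, p)
expected = n * p                 # E(X) = np
variance = n * p * (1 - p)       # Var(X) = np(1 - p)

# Sanity check: the probabilities over k = 0, ..., n should sum to 1
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(round(p_two_sixes, 3), round(expected, 2), round(variance, 2))
```

Summing `binomial_pmf` over a range of `k` values gives cumulative probabilities such as $P (X \leq 2)$ in the same way.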

Continuous distributions: Normal

A continuous random variable takes any value in an interval, so the probability of a single exact value is 0. The normal distribution is a symmetric bell-shaped distribution defined by its mean $μ$ and variance $σ^{2}$ .

Notation: $X \sim N (μ, σ^{2})$ . The standard normal distribution has $μ = 0$ , $σ^{2} = 1$ , denoted $Z \sim N (0, 1)$ . To standardize any normal variable: $Z = \frac{X - μ}{σ}$

Worked example: Adult male heights are $X \sim N (175, 36)$ (mean 175 cm, standard deviation 6 cm). The probability a man is shorter than 180 cm is $P (X < 180) = P (Z < \frac{180 - 175}{6}) \approx P (Z < 0.833) \approx 0.798$ , or ~80%.

Exam tip: The second parameter of the normal distribution is variance, not standard deviation. Writing $N (175, 6)$ instead of $N (175, 36)$ is one of the most common mark-losing mistakes on HL stats questions.
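The same variance-versus-standard-deviation trap appears in software. As a sketch, Python's `statistics.NormalDist` takes the standard deviation `sigma` as its second argument, so the variance from the $N (μ, σ^{2})$ notation has to be square-rooted first. The variable names below are illustrative.

```python
from statistics import NormalDist

mu = 175
variance = 36              # second parameter in the N(175, 36) notation
sigma = variance ** 0.5    # NormalDist expects the standard deviation, here 6.0

heights = NormalDist(mu=mu, sigma=sigma)

# Route 1: evaluate the cumulative distribution function directly
p_direct = heights.cdf(180)

# Route 2: standardize first, then use the standard normal Z ~ N(0, 1)
z = (180 - mu) / sigma
p_via_z = NormalDist().cdf(z)

print(round(z, 3), round(p_direct, 3), round(p_via_z, 3))
```

Both routes give the same probability, which is the point of standardizing: every normal calculation can be reduced to one about $Z$ .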

5. Bayes' theorem (HL)

Bayes' theorem is used to update the probability of an event given new evidence, and is a required HL-only concept. It is derived directly from the conditional probability rule. For a set of mutually exclusive, exhaustive events $A_{1}, A_{2}, ..., A_{n}$ that partition the sample space: $P (A_{i} ∣ B) = \frac{P ( B ∣ A _{i} ) P ( A _{i} )}{\sum _{j = 1}^{n} P ( B ∣ A _{j} ) P ( A _{j} )}$

The most common 2-event version is: $P (A ∣ B) = \frac{P ( B ∣ A ) P ( A )}{P ( B ∣ A ) P ( A ) + P ( B ∣ A ^{'} ) P ( A ^{'} )}$
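One way to see why this formula holds, using only the conditional probability definition from Section 3 (a sketch that assumes $P (A)$ , $P (A')$ and $P (B)$ are all non-zero):

Step 1, write the intersection in two ways: $P (A \cap B) = P (A \mid B) P (B) = P (B \mid A) P (A)$ , so $P (A \mid B) = \frac{P (B \mid A) P (A)}{P (B)}$ .
Step 2, split $B$ across the partition $A$ , $A'$ : $P (B) = P (B \cap A) + P (B \cap A') = P (B \mid A) P (A) + P (B \mid A') P (A')$ .

Substituting Step 2 into the denominator of Step 1 gives the 2-event version. The general version follows the same way, with $P (B) = \sum_{j = 1}^{n} P (B \mid A_{j}) P (A_{j})$ .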

Worked example: A disease affects 1% of the population. A test for the disease has a 95% true positive rate ( $P (\text{positive} \mid \text{diseased}) = 0.95$ ) and a 2% false positive rate ( $P (\text{positive} \mid \text{not diseased}) = 0.02$ ). What is the probability you have the disease if you test positive? Let $A$ = diseased, $B$ = test positive, so $P (A) = 0.01$ and $P (A') = 0.99$ . Substitute: $P (A \mid B) = \frac{0.95 \times 0.01}{(0.95 \times 0.01) + (0.02 \times 0.99)} = \frac{0.0095}{0.0293} \approx 0.324$ . Only ~32% of positive test results are true positives, due to the low prevalence of the disease.

Exam tip: HL questions often use 3 or more partition events. Always explicitly label each event and list all given conditional probabilities before plugging into the formula to avoid mixing up $P (A ∣ B)$ and $P (B ∣ A)$ .
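The labelling habit from the exam tip maps neatly onto code: list the priors $P (A_{i})$ and the likelihoods $P (B \mid A_{i})$ in the same order, then apply the formula. The function below is a minimal sketch (the name `posterior` and its argument layout are illustrative), shown on the disease example; it accepts any number of partition events.

```python
def posterior(priors, likelihoods):
    """Return P(A_i | B) for each partition event A_i.

    priors[i] is P(A_i); likelihoods[i] is P(B | A_i). The priors must sum to 1
    because the A_i are assumed to partition the sample space.
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1) > 1e-9:
        raise ValueError("priors must sum to 1")
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(B | A_i) P(A_i)
    evidence = sum(joint)                                  # P(B), total probability
    return [j / evidence for j in joint]

# Disease example: A_1 = diseased, A_2 = not diseased, B = test positive
priors = [0.01, 0.99]
likelihoods = [0.95, 0.02]

p_diseased, p_not_diseased = posterior(priors, likelihoods)
print(round(p_diseased, 3), round(p_not_diseased, 3))
```

Keeping the two lists in matching order is what prevents the $P (A \mid B)$ versus $P (B \mid A)$ mix-up: the inputs are always "given $A_{i}$ ", the outputs are always "given $B$ ".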

6. Hypothesis testing intuition

Hypothesis testing is a statistical framework to test a claim about a population parameter using sample data. The core steps are:

State hypotheses: The null hypothesis $H_{0}$ is the default assumption of no effect/no difference, e.g. $μ = μ_{0}$ , $p = p_{0}$ . The alternative hypothesis $H_{1}$ is the claim you are testing, and can be one-tailed (e.g. $μ > μ_{0}$ ) or two-tailed (e.g. $μ \neq μ_{0}$ ).
Set significance level: The $α$ value (usually 5% = 0.05 for IB) is the maximum allowed probability of a Type I error (rejecting $H_{0}$ when it is true).
Calculate test statistic: For a normal population with known standard deviation $σ$ and sample size $n$ , the z-test statistic is $Z = \frac{\bar{x} - μ_{0}}{σ / \sqrt{n}}$ .
Make a decision: Compare the test statistic to the critical value for your significance level, or calculate the p-value. If $p < α$ , reject $H_{0}$ ; otherwise, fail to reject $H_{0}$ .
Conclusion: Always state your conclusion in the context of the original question.

Worked example: A café claims their average latte temperature is 65°C. You sample 10 lattes, find a mean of 62°C, with known population standard deviation 3°C. Test at 5% significance if the temperature is lower than claimed.

$H_{0} : μ = 65$ , $H_{1} : μ < 65$ (one-tailed)
$α = 0.05$ , critical z-value = -1.645
Test statistic: $Z = \frac{62 - 65}{3 / \sqrt{10}} \approx -3.16 < -1.645$
Reject $H_{0}$ : There is sufficient evidence at the 5% level that the average latte temperature is lower than the café's claim.
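The latte test can be reproduced numerically, including the p-value route that the steps mention. This is a sketch for the lower-tailed case only, assuming a known population standard deviation; the function name and return layout are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_test_lower_tail(sample_mean, mu0, sigma, n, alpha=0.05):
    """One-tailed z-test of H0: mu = mu0 against H1: mu < mu0 with known sigma."""
    standard_error = sigma / sqrt(n)           # standard deviation of the sample mean
    z = (sample_mean - mu0) / standard_error   # test statistic
    p_value = NormalDist().cdf(z)              # P(Z <= z) under H0
    critical = NormalDist().inv_cdf(alpha)     # lower-tail critical value
    return z, p_value, critical, p_value < alpha

z, p_value, critical, reject_h0 = z_test_lower_tail(
    sample_mean=62, mu0=65, sigma=3, n=10
)
print(round(z, 2), round(critical, 3), reject_h0)
```

The critical-value comparison ( $z$ below the critical value) and the p-value comparison ( $p < α$ ) always lead to the same decision; the code returns both so you can check one against the other.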

7. Common Pitfalls (and how to avoid them)

Wrong move: Using standard deviation instead of variance when writing $N (μ, σ^{2})$ notation. Why: Students often use standard deviation for calculations, so they mix up the second parameter. Correct move: Always square the standard deviation to get variance for distribution notation, and double-check before using calculator distribution functions.
Wrong move: Confusing mutually exclusive and independent events. Why: Both describe relationships between events, so students mix their definitions. Correct move: Remember mutually exclusive = $P (A \cap B) = 0$ , independent = $P (A \cap B) = P (A) P (B)$ . Non-zero probability mutually exclusive events cannot be independent.
Wrong move: Forgetting to double p-values or split significance levels for two-tailed hypothesis tests. Why: Students default to one-tailed calculations even when $H_{1}$ uses a $\neq$ sign. Correct move: For two-tailed tests, use $α / 2$ for each tail, or double the one-tailed p-value before comparing to $α$ .
Wrong move: Skipping linear interpolation for grouped medians or quartiles, using only the class midpoint. Why: Students take shortcuts to save time, losing method marks. Correct move: Use the interpolation formula: $\text{value} = L + \frac{\text{position} - F}{f} \times w$ , where $L$ = lower class bound, $F$ = cumulative frequency before the class, $f$ = class frequency, $w$ = class width.
Wrong move: Stating only "reject $H_{0}$ " as the conclusion of a hypothesis test, without context. Why: Students forget that examiners require application to the scenario. Correct move: Always link your decision back to the original question, e.g. "there is evidence at the 5% level that the new production method increases bulb lifetime".

8. Practice Questions (IB Math: Analysis & Approaches HL Style)

Question 1

The grouped data below shows the time taken for 50 students to complete a HL math problem:

Time (t, minutes)	0 ≤ t < 2	2 ≤ t < 4	4 ≤ t < 6	6 ≤ t < 8	8 ≤ t < 10
Frequency	7	18	15	7	3
(a) Calculate the estimated mean time. (2 marks)
(b) Calculate the interquartile range using linear interpolation. (4 marks)

Solution 1

(a) Class midpoints: 1, 3, 5, 7, 9. Sum of weighted values: $\sum f x = (7 \times 1) + (18 \times 3) + (15 \times 5) + (7 \times 7) + (3 \times 9) = 212$ . Estimated mean = $\frac{212}{50} = 4.24$ minutes.
(b) $n = 50$ , so $Q_{1}$ is at the $\frac{50}{4}$ = 12.5th value and $Q_{3}$ is at the $\frac{3 \times 50}{4}$ = 37.5th value. The cumulative frequencies are 7, 25, 40, 47, 50.

$Q_{1}$ in 2≤t<4 class: $Q_{1} = 2 + \frac{12.5 - 7}{18} \times 2 \approx 2.61$ minutes
$Q_{3}$ in 4≤t<6 class: $Q_{3} = 4 + \frac{37.5 - 25}{15} \times 2 \approx 5.67$ minutes
$\text{IQR} = 5.67 - 2.61 = 3.06$ minutes
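Solution 1 can be double-checked with a short script that applies the grouped-mean formula and the interpolation formula $L + \frac{\text{position} - F}{f} \times w$ from the pitfalls section. This is a sketch: the `(lower, upper, frequency)` tuple layout and the function names are illustrative choices.

```python
# Each class is (lower bound, upper bound, frequency), taken from the question's table
classes = [(0, 2, 7), (2, 4, 18), (4, 6, 15), (6, 8, 7), (8, 10, 3)]

def grouped_mean(classes):
    """Estimate the mean using class midpoints weighted by frequency."""
    total = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / total

def interpolate(classes, position):
    """Estimate the value at a cumulative position by linear interpolation."""
    cumulative = 0  # F: cumulative frequency before the current class
    for lo, hi, f in classes:
        if cumulative + f >= position:
            return lo + (position - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("position exceeds the total frequency")

n = sum(f for _, _, f in classes)
mean = grouped_mean(classes)
q1 = interpolate(classes, n / 4)
q3 = interpolate(classes, 3 * n / 4)
iqr = q3 - q1

print(round(mean, 2), round(q1, 2), round(q3, 2), round(iqr, 2))
```

Calling `interpolate(classes, n / 2)` gives the estimated median in the same way.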

Question 2

A bag contains 3 red balls and 5 blue balls. You draw two balls without replacement. Let $R_{1}$ = first ball red, $R_{2}$ = second ball red.
(a) Calculate $P (R_{1} \cap R_{2})$ . (2 marks)
(b) Calculate $P (R_{2} \mid R_{1}^{'})$ . (2 marks)
(c) Are $R_{1}$ and $R_{2}$ independent? Justify your answer. (2 marks)

Solution 2

(a) $P (R_{1}) = \frac{3}{8}$ , $P (R_{2} \mid R_{1}) = \frac{2}{7}$ , so $P (R_{1} \cap R_{2}) = \frac{3}{8} \times \frac{2}{7} = \frac{3}{28} \approx 0.107$
(b) $R_{1}^{'}$ = first ball blue, so 3 red balls remain out of 7 total. $P (R_{2} \mid R_{1}^{'}) = \frac{3}{7} \approx 0.429$
(c) For independence, we need $P (R_{2}) = P (R_{2} \mid R_{1})$ . $P (R_{2}) = (\frac{3}{8} \times \frac{2}{7}) + (\frac{5}{8} \times \frac{3}{7}) = \frac{21}{56} = \frac{3}{8} = 0.375$ . $P (R_{2} \mid R_{1}) = \frac{2}{7} \approx 0.286 \neq 0.375$ , so the events are not independent.
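Drawing without replacement can also be checked by listing every ordered pair of distinct balls, which makes the dependence between the two draws visible. The sketch below treats the 8 balls as distinguishable so that all 56 ordered draws are equally likely; `prob` and its `given` argument are illustrative names.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R"] * 3 + ["B"] * 5
# Ordered pairs of distinct ball indices: 8 * 7 = 56 equally likely draws
draws = list(permutations(range(len(balls)), 2))

def prob(event, given=lambda draw: True):
    """P(event | given), found by counting within the draws that satisfy `given`."""
    pool = [d for d in draws if given(d)]
    return Fraction(sum(1 for d in pool if event(d)), len(pool))

def r1(draw):
    return balls[draw[0]] == "R"   # first ball red

def r2(draw):
    return balls[draw[1]] == "R"   # second ball red

p_both_red = prob(lambda d: r1(d) and r2(d))              # part (a)
p_r2_given_not_r1 = prob(r2, given=lambda d: not r1(d))   # part (b)
p_r2 = prob(r2)                                           # unconditional
p_r2_given_r1 = prob(r2, given=r1)
independent = p_r2_given_r1 == p_r2                       # part (c)

print(p_both_red, p_r2_given_not_r1, p_r2, p_r2_given_r1, independent)
```

Note that $P (R_{2})$ equals $P (R_{1})$ even though the events are dependent: before any ball is seen, each position in the draw is equally likely to hold a red ball.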

Question 3

A factory produces light bulbs with lifetime normally distributed with mean 1200 hours, standard deviation 80 hours. A new production method is tested, and a sample of 20 bulbs from the new line has a mean lifetime of 1240 hours. Assume the standard deviation is unchanged. Test at the 5% significance level if the new method increases bulb lifetime. (6 marks)

Solution 3

Hypotheses: $H_{0} : μ = 1200$ , $H_{1} : μ > 1200$ (one-tailed test)
Significance level $α = 0.05$ , critical z-value = 1.645
Test statistic: $Z = \frac{1240 - 1200}{80 / \sqrt{20}} \approx 2.236$
2.236 > 1.645, so reject $H_{0}$
Conclusion: There is sufficient evidence at the 5% significance level that the new production method increases the average lifetime of the light bulbs.

9. Quick Reference Cheatsheet

Formula/Rule	Key Notes
Grouped mean: $\bar{x} = \frac{\sum f x}{\sum f}$	$x$ = midpoint of class interval
Population variance: $σ^{2} = \frac{\sum x^{2}}{n} - \bar{x}^{2}$	Sample variance divides by $n - 1$
Conditional probability: $P (A \mid B) = \frac{P (A \cap B)}{P (B)}$	Requires $P (B) \neq 0$
Independent events: $P (A \cap B) = P (A) P (B)$	$P (A \mid B) = P (A)$
Binomial distribution: $X \sim B (n, p)$	$E (X) = n p$ , $\text{Var} (X) = n p (1 - p)$
Normal distribution: $X \sim N (μ, σ^{2})$	Standardize: $Z = \frac{X - μ}{σ}$ , $Z \sim N (0, 1)$
Bayes' theorem (2-event): $P (A \mid B) = \frac{P (B \mid A) P (A)}{P (B \mid A) P (A) + P (B \mid A^{'}) P (A^{'})}$	HL only; $A$ and $A^{'}$ partition the sample space
Hypothesis testing workflow	1. State $H_{0} / H_{1}$ 2. Set $α$ 3. Calculate test statistic 4. Compare to critical value 5. Contextual conclusion

10. What's Next

This guide covers the core foundational content for IB Math AA HL Statistics and Probability, and builds the base for more advanced HL-only topics including Poisson distributions, linear regression, chi-squared goodness of fit tests, and confidence intervals that appear on Paper 3. Probability concepts also cross over to pure math topics like combinatorics and counting principles, while continuous distribution questions are often paired with calculus problems requiring integration of probability density functions for HL candidates.

If you struggle with any of the concepts, worked examples, or practice questions in this guide, you can ask Ollie for step-by-step explanations, extra practice problems, or custom quizzes tailored to your weak spots directly on Ollie. You can also find more topic-specific study guides for IB Math AA HL on the homepage to build your exam readiness and target specific gaps in your knowledge.
