Statistics and Probability — IB Math AA HL Study Guide
For: IB Math AA HL candidates sitting IB Math: Analysis & Approaches HL.
Covers: Descriptive statistics (measures of centre and spread), conditional probability and independent events, discrete and continuous distributions (binomial and normal), Bayes' theorem (HL), and hypothesis testing intuition.
You should already know: IGCSE / pre-DP math, comfort with proof and algebra.
A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the IB Math AA HL style for educational use. They are not reproductions of past IBO papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official IBO mark schemes for grading conventions.
1. What Is Statistics and Probability?
Statistics and Probability is the branch of mathematics focused on collecting, analyzing, and interpreting numerical data, and quantifying the likelihood of random events occurring. For IB Math AA HL, this topic blends computational skills (e.g. calculating distribution probabilities) with conceptual reasoning (e.g. justifying conclusions in hypothesis tests) and accounts for 30-40% of your final assessment across Papers 2 and 3, including both short-response and extended problem-solving questions. It is also one of the most practically applicable topics, with real-world uses in fields from medicine to economics.
2. Descriptive statistics — measures of centre and spread
Descriptive statistics summarize datasets to make them interpretable, without drawing broader conclusions about the population the data is drawn from. The two core categories of descriptive measures are centre (average value) and spread (variability of values).
Measures of centre
- Mean: The arithmetic average of the dataset. For ungrouped data: $\bar{x} = \frac{\sum x_i}{n}$. For grouped data, use class midpoints ($x_i$) and frequencies ($f_i$): $\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$.
- Median: The middle value of an ordered dataset. For ungrouped data with $n$ points, it is the value at position $\frac{n+1}{2}$. For grouped data, use linear interpolation to estimate the median within the median class.
- Mode: The most frequently occurring value, rarely tested for HL except for discrete distributions.
Measures of spread
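- Interquartile range (IQR): Difference between the 75th percentile ($Q_3$) and 25th percentile ($Q_1$), $\text{IQR} = Q_3 - Q_1$. Unlike the range, it is not affected by outliers.
- Variance: Average squared deviation from the mean. For ungrouped data: $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n}$. For grouped data, weight by frequency: $\sigma^2 = \frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i}$.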
- Standard deviation: Square root of variance, measured in the same units as the original data.
Worked example: For the ungrouped dataset [2,4,4,6,7,9,10], the mean is $\bar{x} = \frac{42}{7} = 6$, median is the 4th value = 6, $Q_1 = 4$, $Q_3 = 9$, $\text{IQR} = 5$, variance ≈7.14, and standard deviation ≈2.67.
Exam tip: Always check if the question asks for population (divide by $n$) or sample (divide by $n-1$) variance; if unspecified, use population variance for descriptive statistics questions.
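The worked example above can be checked with Python's standard `statistics` module. This is a minimal sketch rather than an exam method. The variable names are illustrative, and the quartile convention (`method="exclusive"`) is an assumption chosen because it agrees with the median-of-halves quartiles for this dataset; calculators and other software may use a different convention.

```python
import statistics

data = [2, 4, 4, 6, 7, 9, 10]

mean = statistics.mean(data)        # arithmetic average
median = statistics.median(data)    # middle value of the ordered data

# Quartiles: the "exclusive" method is an assumed convention (see note above)
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

pop_var = statistics.pvariance(data)    # population variance: divides by n
pop_sd = statistics.pstdev(data)        # square root of population variance
sample_var = statistics.variance(data)  # sample variance: divides by n - 1

print(f"mean={mean}, median={median}, Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"population variance={pop_var:.2f}, standard deviation={pop_sd:.2f}")
print(f"sample variance={sample_var:.2f}")
```

Comparing `pop_var` with `sample_var` shows how much the choice of divisor matters for a small dataset.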
3. Probability concepts — conditional, independent events
Probability quantifies the likelihood of an event occurring, with values between 0 (impossible) and 1 (certain). Core foundational rules include:
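- Complement rule: $P(A') = 1 - P(A)$, where $A'$ is the event that $A$ does not occur.
- Union rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. For mutually exclusive events (cannot occur at the same time), $P(A \cap B) = 0$, so $P(A \cup B) = P(A) + P(B)$.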
Conditional probability
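Conditional probability is the probability of event $A$ occurring given that event $B$ has already occurred, defined (for $P(B) > 0$) as: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
Independent events
Two events $A$ and $B$ are independent if the occurrence of one does not affect the probability of the other. This means $P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$, which rearranges to the independence test: $P(A \cap B) = P(A)P(B)$
Worked example: You draw one card from a standard 52-card deck. Let $A$ = drawing a heart, $B$ = drawing a king. $P(A) = \frac{13}{52} = \frac{1}{4}$, $P(B) = \frac{4}{52} = \frac{1}{13}$, $P(A \cap B) = \frac{1}{52}$. Since $P(A)P(B) = \frac{1}{4} \times \frac{1}{13} = \frac{1}{52} = P(A \cap B)$, $A$ and $B$ are independent.
Exam tip: Never confuse mutually exclusive and independent events! If two events are mutually exclusive and have non-zero probability, they cannot be independent, as $P(A \cap B) = 0$ cannot equal $P(A)P(B) > 0$.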
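The card example and the exam tip can both be verified by brute-force counting. The sketch below enumerates a 52-card deck and uses exact fractions; the helper names (`prob`, `is_heart`, and so on) are illustrative, not part of any library.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely (rank, suit) outcomes


def prob(event):
    """Probability that a single drawn card satisfies the predicate `event`."""
    favourable = sum(1 for card in deck if event(card))
    return Fraction(favourable, len(deck))


def is_heart(card):
    return card[1] == "hearts"


def is_king(card):
    return card[0] == "K"


def is_spade(card):
    return card[1] == "spades"


# Heart and king: independence test P(A and B) == P(A) * P(B)
p_heart = prob(is_heart)
p_king = prob(is_king)
p_heart_and_king = prob(lambda c: is_heart(c) and is_king(c))
heart_king_independent = p_heart_and_king == p_heart * p_king

# Heart and spade: mutually exclusive, so the intersection has probability 0
p_spade = prob(is_spade)
p_heart_and_spade = prob(lambda c: is_heart(c) and is_spade(c))
heart_spade_independent = p_heart_and_spade == p_heart * p_spade

print(p_heart, p_king, p_heart_and_king, heart_king_independent)
print(p_heart_and_spade, heart_spade_independent)
```

Hearts and spades cannot occur together, so their intersection has probability 0 while the product of their probabilities does not. That is why mutually exclusive events with non-zero probability fail the independence test.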
4. Discrete and continuous distributions — binomial, normal
Probability distributions describe the probability of every possible outcome of a random variable.
Discrete distributions: Binomial
A discrete random variable takes distinct, countable values. The binomial distribution applies to scenarios with:
- Fixed number of independent trials
- Two possible outcomes (success/failure)
- Constant probability of success across all trials
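Notation: $X \sim B(n, p)$. The probability of $k$ successes is: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. Mean $E(X) = np$, variance $\text{Var}(X) = np(1-p)$.
Worked example: You roll a fair die 10 times, $X$ = number of 6s rolled. $X \sim B(10, \frac{1}{6})$. $P(X = 2) = \binom{10}{2} (\frac{1}{6})^2 (\frac{5}{6})^8 \approx 0.291$, $E(X) = \frac{10}{6} \approx 1.67$, $\text{Var}(X) = 10 \times \frac{1}{6} \times \frac{5}{6} \approx 1.39$.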
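For the die-rolling scenario above, the binomial probabilities can be computed directly from the formula with `math.comb`. This is a small sketch; `binomial_pmf` is a hypothetical helper name, and $k = 2$ is just an illustrative choice of how many 6s to ask about.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): choose k successes out of n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 1 / 6                       # 10 rolls, P(rolling a 6) = 1/6

p_two_sixes = binomial_pmf(2, n, p)    # probability of exactly two 6s
mean = n * p                           # E(X) = np
variance = n * p * (1 - p)             # Var(X) = np(1 - p)

# Sanity check: probabilities over k = 0..n should sum to 1
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(f"P(X=2)={p_two_sixes:.4f}, E(X)={mean:.2f}, Var(X)={variance:.2f}")
```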
Continuous distributions: Normal
A continuous random variable takes any value in an interval, so the probability of a single exact value is 0. The normal distribution is a symmetric bell-shaped distribution defined by its mean $\mu$ and variance $\sigma^2$.
Notation: $X \sim N(\mu, \sigma^2)$. The standard normal distribution has $\mu = 0$, $\sigma^2 = 1$, written $Z \sim N(0, 1)$. To standardize any normal variable: $Z = \frac{X - \mu}{\sigma}$
Worked example: Adult male heights are $X \sim N(175, 6^2)$ (mean 175cm, standard deviation 6cm). The probability a man is shorter than 180cm is $P(X < 180) = P(Z < \frac{180 - 175}{6}) = P(Z < 0.833) \approx 0.798$, or ~80%.
Exam tip: The second parameter of the normal distribution is variance, not standard deviation. Writing $N(\mu, \sigma)$ instead of $N(\mu, \sigma^2)$ is one of the most common mark-losing mistakes on HL stats questions.
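The height example can be reproduced with `statistics.NormalDist` from the Python standard library. Note that `NormalDist` takes the standard deviation (`sigma`) as its second argument, not the variance, so the notation point in the exam tip also applies to software. The variable names below are illustrative.

```python
from statistics import NormalDist

mu, sigma = 175, 6                      # sigma is the standard deviation (cm)
heights = NormalDist(mu=mu, sigma=sigma)

# Direct calculation of P(X < 180)
p_shorter = heights.cdf(180)

# Same probability via standardization: Z = (X - mu) / sigma
z = (180 - mu) / sigma
p_via_z = NormalDist().cdf(z)           # standard normal, mean 0 and sd 1

print(f"z={z:.3f}, P(X<180)={p_shorter:.4f}, via Z: {p_via_z:.4f}")
```

Both routes give the same probability; standardizing just re-expresses 180cm as a number of standard deviations above the mean.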
5. Bayes' theorem (HL)
Bayes' theorem is used to update the probability of an event given new evidence, and is a required HL-only concept. It is derived directly from the conditional probability rule. For a set of mutually exclusive, exhaustive events $A_1, A_2, \ldots, A_n$ that partition the sample space: $P(A_i \mid B) = \frac{P(B \mid A_i)P(A_i)}{\sum_{j=1}^{n} P(B \mid A_j)P(A_j)}$
The most common 2-event version is: $P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid A')P(A')}$
Worked example: A disease affects 1% of the population. A test for the disease has a 95% true positive rate ($P(T \mid D) = 0.95$) and 2% false positive rate ($P(T \mid D') = 0.02$). What is the probability you have the disease if you test positive? Let $D$ = diseased, $T$ = test positive. Substitute: $P(D \mid T) = \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.02 \times 0.99} = \frac{0.0095}{0.0293} \approx 0.324$. Only ~32% of positive test results are true positives, due to the low prevalence of the disease.
Exam tip: HL questions often use 3 or more partition events. Always explicitly label each event and list all given conditional probabilities before plugging into the formula to avoid mixing up $P(A \mid B)$ and $P(B \mid A)$.
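Listing the priors and conditional probabilities before substituting, as the exam tip suggests, maps naturally onto a short function. The sketch below handles any number of partition events; `posterior` is a hypothetical helper name, and the inputs are the figures from the disease-testing example.

```python
def posterior(priors, likelihoods, index):
    """Bayes' theorem over a partition.

    priors[j]      = P(A_j), which must sum to 1 over the partition
    likelihoods[j] = P(B | A_j)
    Returns P(A_index | B).
    """
    if abs(sum(priors) - 1) > 1e-9:
        raise ValueError("priors must sum to 1 over the partition")
    # Denominator: total probability of the evidence B
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[index] * likelihoods[index] / evidence


priors = [0.01, 0.99]        # P(diseased), P(not diseased)
likelihoods = [0.95, 0.02]   # P(positive | diseased), P(positive | not diseased)

p_disease_given_positive = posterior(priors, likelihoods, 0)
print(f"P(diseased | positive) = {p_disease_given_positive:.4f}")
```

For a three-event partition, pass three priors and three likelihoods; the denominator grows by one term per event.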
6. Hypothesis testing intuition
Hypothesis testing is a statistical framework to test a claim about a population parameter using sample data. The core steps are:
- State hypotheses: The null hypothesis $H_0$ is the default assumption of no effect/no difference, e.g. $H_0: \mu = \mu_0$. The alternative hypothesis $H_1$ is the claim you are testing, and can be one-tailed (e.g. $H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$) or two-tailed (e.g. $H_1: \mu \neq \mu_0$).
- Set significance level: The value $\alpha$ (usually 5% = 0.05 for IB) is the maximum allowed probability of a Type I error (rejecting $H_0$ when it is true).
- Calculate test statistic: For a normally distributed population with known standard deviation $\sigma$, the z-test statistic is $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$.
- Make a decision: Compare the test statistic to the critical value for your significance level, or calculate the p-value. If the p-value is less than $\alpha$, reject $H_0$; otherwise, fail to reject $H_0$.
- Conclusion: Always state your conclusion in the context of the original question.
Worked example: A café claims their average latte temperature is 65°C. You sample 10 lattes, find a mean of 62°C, with known population standard deviation 3°C. Test at 5% significance if the temperature is lower than claimed.
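- $H_0: \mu = 65$, $H_1: \mu < 65$ (one-tailed)
- $\alpha = 0.05$, critical z-value = -1.645
- Test statistic: $z = \frac{62 - 65}{3 / \sqrt{10}} \approx -3.162$
- Reject $H_0$ (since $-3.162 < -1.645$): There is sufficient evidence at the 5% level that the average latte temperature is lower than the café's claim.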
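The steps of the latte test can be sketched in code as follows. `z_test` is a hypothetical helper (not a library function) that assumes a normal population with known standard deviation; it returns both the test statistic and the p-value so that either decision rule can be applied.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, tail="lower"):
    """Return (z, p_value) for a z-test with known population sigma.

    tail: "lower" for H1: mu < mu0, "upper" for H1: mu > mu0,
          "two" for H1: mu != mu0.
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    if tail == "lower":
        p_value = std_normal.cdf(z)
    elif tail == "upper":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "two":
        # Two-tailed: double the one-tailed probability
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    return z, p_value


alpha = 0.05
z, p_value = z_test(sample_mean=62, mu0=65, sigma=3, n=10, tail="lower")
critical_z = NormalDist().inv_cdf(alpha)   # lower-tail critical value
reject_h0 = p_value < alpha                # same decision as z < critical_z

print(f"z={z:.3f}, critical z={critical_z:.3f}, p-value={p_value:.5f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Calling the same helper with `tail="upper"`, a sample mean of 1240, `mu0=1200`, `sigma=80` and `n=20` runs the upper-tailed version used for the light bulb question later in this guide.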
7. Common Pitfalls (and how to avoid them)
- Wrong move: Using standard deviation instead of variance when writing $N(\mu, \sigma^2)$ notation. Why: Students often use standard deviation for calculations, so they mix up the second parameter. Correct move: Always square the standard deviation to get variance for distribution notation, and check whether your calculator's distribution functions expect $\sigma$ or $\sigma^2$ before using them.
- Wrong move: Confusing mutually exclusive and independent events. Why: Both describe relationships between events, so students mix their definitions. Correct move: Remember mutually exclusive means $P(A \cap B) = 0$, independent means $P(A \cap B) = P(A)P(B)$. Mutually exclusive events with non-zero probabilities cannot be independent.
- Wrong move: Forgetting to double p-values or split significance levels for two-tailed hypothesis tests. Why: Students default to one-tailed calculations even when $H_1$ uses a $\neq$ sign. Correct move: For two-tailed tests, use $\alpha/2$ for each tail, or double the one-tailed p-value before comparing to $\alpha$.
- Wrong move: Skipping linear interpolation for grouped medians or quartiles, using only the class midpoint. Why: Students take shortcuts to save time, losing method marks. Correct move: Use the interpolation formula: $\text{Median} = L + \frac{\frac{n}{2} - F}{f} \times w$, where $L$ = lower class bound, $F$ = cumulative frequency before the class, $f$ = class frequency, $w$ = class width, and $n$ = total frequency (use $\frac{n}{4}$ or $\frac{3n}{4}$ in place of $\frac{n}{2}$ for $Q_1$ or $Q_3$).
- Wrong move: Stating only "reject $H_0$" as the conclusion of a hypothesis test, without context. Why: Students forget that examiners require application to the scenario. Correct move: Always link your decision back to the original question, e.g. "there is evidence at the 5% level that the new production method increases bulb lifetime".
8. Practice Questions (IB Math: Analysis & Approaches HL Style)
Question 1
The grouped data below shows the time taken for 50 students to complete a HL math problem:
| Time (t, minutes) | 0 ≤ t < 2 | 2 ≤ t < 4 | 4 ≤ t < 6 | 6 ≤ t < 8 | 8 ≤ t < 10 |
|---|---|---|---|---|---|
| Frequency | 7 | 18 | 15 | 7 | 3 |

(a) Calculate the estimated mean time. (2 marks)
(b) Calculate the interquartile range using linear interpolation. (4 marks)
Solution 1
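(a) Class midpoints: 1, 3, 5, 7, 9. Sum of weighted values: $7(1) + 18(3) + 15(5) + 7(7) + 3(9) = 212$. Estimated mean = $\frac{212}{50} = 4.24$ minutes.
(b) $n = 50$, $Q_1$ position = 12.5th value, $Q_3$ position = 37.5th value.
- $Q_1$ in 2≤t<4 class: $Q_1 = 2 + \frac{12.5 - 7}{18} \times 2 \approx 2.61$ minutes
- $Q_3$ in 4≤t<6 class: $Q_3 = 4 + \frac{37.5 - 25}{15} \times 2 \approx 5.67$ minutes
- $\text{IQR} = 5.67 - 2.61 = 3.06$ minutes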
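The grouped-data calculations in this solution can be checked with a short script. The sketch below uses the frequency table from Question 1; `grouped_quantile` is a hypothetical helper that applies linear interpolation within the class containing the target position ($n/4$ for $Q_1$, $3n/4$ for $Q_3$).

```python
def grouped_quantile(bounds, freqs, fraction):
    """Estimate a quantile of grouped data by linear interpolation.

    bounds:   class boundaries, e.g. [0, 2, 4] for classes 0<=t<2 and 2<=t<4
    freqs:    class frequencies, one per class
    fraction: 0.25 for Q1, 0.5 for the median, 0.75 for Q3
    """
    target = fraction * sum(freqs)   # position of the quantile in the data
    cumulative = 0                   # cumulative frequency before the class
    for lower, upper, f in zip(bounds, bounds[1:], freqs):
        if f > 0 and cumulative + f >= target:
            # lower bound + (position within class / class frequency) * width
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("fraction must be between 0 and 1")


bounds = [0, 2, 4, 6, 8, 10]
freqs = [7, 18, 15, 7, 3]

# Estimated mean from class midpoints
midpoints = [(a + b) / 2 for a, b in zip(bounds, bounds[1:])]
mean = sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)

q1 = grouped_quantile(bounds, freqs, 0.25)
q3 = grouped_quantile(bounds, freqs, 0.75)
iqr = q3 - q1

print(f"mean={mean:.2f}, Q1={q1:.2f}, Q3={q3:.2f}, IQR={iqr:.2f}")
```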
Question 2
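A bag contains 3 red balls and 5 blue balls. You draw two balls without replacement. Let $A$ = first ball red, $B$ = second ball red.
(a) Calculate $P(A \cap B)$. (2 marks)
(b) Calculate $P(B \mid A')$. (2 marks)
(c) Are $A$ and $B$ independent? Justify your answer. (2 marks)
Solution 2
(a) $P(A) = \frac{3}{8}$, $P(B \mid A) = \frac{2}{7}$, so $P(A \cap B) = \frac{3}{8} \times \frac{2}{7} = \frac{3}{28}$.
(b) $A'$ = first ball blue, so 3 red balls remain out of 7 total. $P(B \mid A') = \frac{3}{7}$.
(c) For independence, $P(A \cap B) = P(A)P(B)$. $P(B) = P(B \mid A)P(A) + P(B \mid A')P(A') = \frac{3}{28} + \frac{3}{7} \times \frac{5}{8} = \frac{3}{8}$. $P(A)P(B) = \frac{9}{64} \neq \frac{3}{28}$, so events are not independent.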
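Probabilities for drawing without replacement can be confirmed by listing every ordered pair of draws. The sketch below does this for the bag of 3 red and 5 blue balls using exact fractions; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R"] * 3 + ["B"] * 5
# Every ordered pair of two distinct balls (first draw, second draw)
draws = list(permutations(range(len(balls)), 2))


def prob(event):
    """Probability that an equally likely ordered draw satisfies `event`."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))


def first_red(d):
    return balls[d[0]] == "R"


def second_red(d):
    return balls[d[1]] == "R"


p_first_red = prob(first_red)
p_second_red = prob(second_red)
p_both_red = prob(lambda d: first_red(d) and second_red(d))

# Conditional probability: P(second red | first blue)
p_first_blue = prob(lambda d: not first_red(d))
p_second_red_given_first_blue = (
    prob(lambda d: not first_red(d) and second_red(d)) / p_first_blue
)

independent = p_both_red == p_first_red * p_second_red

print(p_both_red, p_second_red_given_first_blue, p_second_red, independent)
```

The second ball is red with the same unconditional probability as the first, yet the two events still fail the product test because the first draw changes what remains in the bag.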
Question 3
A factory produces light bulbs with lifetime normally distributed with mean 1200 hours, standard deviation 80 hours. A new production method is tested, and a sample of 20 bulbs from the new line has a mean lifetime of 1240 hours. Assume the standard deviation is unchanged. Test at the 5% significance level if the new method increases bulb lifetime. (6 marks)
Solution 3
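- Hypotheses: $H_0: \mu = 1200$, $H_1: \mu > 1200$ (one-tailed test)
- Significance level $\alpha = 0.05$, critical z-value = 1.645
- Test statistic: $z = \frac{1240 - 1200}{80 / \sqrt{20}} \approx 2.236$
- 2.236 > 1.645, so reject $H_0$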
- Conclusion: There is sufficient evidence at the 5% significance level that the new production method increases the average lifetime of the light bulbs.
9. Quick Reference Cheatsheet
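| Formula/Rule | Key Notes |
|---|---|
| Grouped mean: $\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$ | $x_i$ = midpoint of class interval |
| Population variance: $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n}$ | Sample variance divides by $n-1$ |
| Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ | Defined only when $P(B) > 0$ |
| Independent events: $P(A \cap B) = P(A)P(B)$ | $P(A \mid B) = P(A)$ |
| Binomial distribution: $X \sim B(n, p)$, $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ | $E(X) = np$, $\text{Var}(X) = np(1-p)$ |
| Normal distribution: $X \sim N(\mu, \sigma^2)$ | Standardize: $Z = \frac{X - \mu}{\sigma}$, $Z \sim N(0, 1)$ |
| Bayes' theorem (2-event): $P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid A')P(A')}$ | $A$ and $A'$ partition the sample space |
| Hypothesis testing workflow | 1. State $H_0$ and $H_1$ 2. Set $\alpha$ 3. Calculate test statistic 4. Compare to critical value 5. Contextual conclusion |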
10. What's Next
This guide covers the core foundational content for IB Math AA HL Statistics and Probability, and builds the base for more advanced HL-only topics including Poisson distributions, linear regression, chi-squared goodness of fit tests, and confidence intervals that appear on Paper 3. Probability concepts also cross over to pure math topics like combinatorics and counting principles, while continuous distribution questions are often paired with calculus problems requiring integration of probability density functions for HL candidates.