AP · What Is Statistics? · 14 min read · Updated 2026-05-10

What Is Statistics? — AP Statistics Study Guide

For: Candidates sitting the AP Statistics exam.

Covers: Key definitions of statistics, populations vs. samples, parameters vs. statistics, types of data, and variable classification for one-variable data, aligned with AP Statistics CED Unit 1 learning objectives.

You should already know: Basic algebraic operations, ability to interpret data tables, basic proportional reasoning.

A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the AP Statistics style for educational use. They are not reproductions of past College Board / Cambridge / IB papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official mark schemes for grading conventions.


1. What Is Statistics?

Statistics is the science of collecting, organizing, analyzing, interpreting, and presenting data to answer questions and make decisions in the face of natural variation. Unlike pure mathematics, which often deals with fixed, certain values, statistics inherently accounts for uncertainty and variation, so all conclusions require context and consideration of error. Per the AP Statistics Course and Exam Description (CED), this topic is the foundation of Unit 1: Exploring One-Variable Data, which accounts for 15-23% of the total AP exam weight. This topic rarely appears as a standalone free-response question, but its concepts are tested repeatedly in multiple-choice questions (MCQ) and embedded in every free-response question (FRQ) that involves data context. Standard notation conventions you will use throughout the course are: $N$ = population size, $n$ = sample size, $x_i$ = individual quantitative data value, $p$ = population proportion, $\hat{p}$ = sample proportion. Mastering the terminology here is critical: mixing up core definitions will lead to errors on nearly every other topic in the course.

2. Populations vs. Samples and Parameters vs. Statistics

The core goal of almost all statistical work is to learn something about a large group without measuring every individual in that group. A population is the entire group of individuals we want information about, regardless of whether we can actually measure every individual. A sample is a smaller subset of the population that we actually collect data from, which we use to draw conclusions about the whole population. A parameter is a number that describes a characteristic of the entire population; it is almost always unknown, because we rarely have the resources to measure the entire population. A statistic is a number calculated from sample data that we use to estimate the unknown population parameter. AP Statistics follows a strict notation convention that is tested on exams: population parameters use Greek letters, while sample statistics use Latin letters. Common examples include: population mean $\mu$ (Greek mu) vs. sample mean $\bar{x}$ (Latin x-bar); population standard deviation $\sigma$ (Greek sigma) vs. sample standard deviation $s$ (Latin); population proportion $p$ (the standard exception, Latin) vs. sample proportion $\hat{p}$ (marked with a hat to indicate it is a sample estimate).

Worked Example

A city school district wants to know what proportion of middle school students bring their own lunch to school at least three times per week. They randomly select 200 middle schoolers from the district's total 3,200 middle school students, and find that 88 of the sampled students bring their own lunch at least three times per week. Identify the population, sample, parameter, and statistic in this context, and write the correct notation for the parameter and statistic.

  1. Population: All 3,200 middle school students in the city school district. This is the entire group the district wants information about.
  2. Sample: The 200 randomly selected middle schoolers that were actually surveyed by the district.
  3. Parameter: The true proportion of all middle school students in the district who bring their own lunch at least three times per week. The correct notation for this population proportion is $p$.
  4. Statistic: The proportion of sampled students who meet the criteria, calculated as $88/200 = 0.44$. The correct notation for this sample proportion is $\hat{p}$, so $\hat{p} = 0.44$.

Exam tip: If an FRQ asks you to identify a parameter or statistic with correct notation, you will lose a point for swapping Greek/Latin letters or failing to mark the sample proportion with a hat ($\hat{p}$). Always confirm whether your number describes the entire population or just the sample before writing notation.
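To make the parameter/statistic distinction concrete, here is a minimal Python sketch. The sample numbers (88 of 200) come from the worked example above. The population itself is an assumption made purely for illustration: in a real study the true proportion is unknown, so the value 0.45 and the helper name `draw_sample_proportion` are hypothetical.

```python
import random

# Numbers from the worked example: 88 of 200 sampled students bring lunch.
sample_size = 200
lunch_bringers_in_sample = 88
p_hat = lunch_bringers_in_sample / sample_size  # statistic: describes the sample only

# ASSUMPTION for illustration only: pretend 1,440 of the 3,200 students
# bring lunch (1 = brings lunch, 0 = does not). In practice this is unknown.
population = [1] * 1440 + [0] * 1760
p = sum(population) / len(population)  # parameter: describes the whole population

rng = random.Random(1)  # fixed seed so repeated runs give the same samples


def draw_sample_proportion(n):
    """Draw a simple random sample of size n and return its sample proportion."""
    sample = rng.sample(population, n)
    return sum(sample) / n


# Each new sample gives a slightly different statistic; the parameter never changes.
simulated_p_hats = [draw_sample_proportion(sample_size) for _ in range(5)]

print("worked-example statistic:", p_hat)
print("assumed parameter:", p)
print("statistics from five new samples:", simulated_p_hats)
```

The point of the sketch is the contrast: the parameter is one fixed number for the whole population, while the statistic depends on which 200 students happen to be selected.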

3. Categorical vs. Quantitative Data

After identifying your population and sample, the next critical step is to classify your data type, because different data types require different graphs, summaries, and analyses. Categorical (or qualitative) data places individuals into one of several distinct groups or categories based on a characteristic. Categorical data can be further split into ordinal (categories have a natural order, e.g., movie ratings of 1-5 stars, or class year: freshman/sophomore/junior/senior) or nominal (no inherent order to categories, e.g., eye color, phone brand). Quantitative data consists of numerical values that represent a count or measurement, such that arithmetic operations (like averaging) produce a meaningful result. Quantitative data is split into discrete (counted values that can only take specific, separate values, usually whole numbers: e.g., number of AP classes a student takes) and continuous (measured values that can take any value within an interval: e.g., height, time spent studying). A common confusion is that numerical labels are not always quantitative: for example, a student ID number is numerical, but it is just a label for an individual, so averaging student IDs gives no meaningful information, making it categorical. The "meaningful average" test is a reliable check for classification: if averaging gives a meaningful result, the variable is quantitative; if not, it is categorical.

Worked Example

Classify each of the following variables as categorical or quantitative, and add any relevant secondary classifications (ordinal, discrete, continuous): (a) The number of goals scored by a soccer team in a season (b) The neighborhood a resident lives in within a city (c) The total miles per gallon of gas for a passenger car (d) A runner's finishing place in a 5K race (1st, 2nd, 3rd, etc.)

  1. (a): Number of goals is a count where averaging across a sample of teams is meaningful, and it can only take whole number values. This is quantitative discrete.
  2. (b): Neighborhood groups residents into unordered categories, and averaging neighborhood labels gives no meaningful result. This is categorical nominal.
  3. (c): Miles per gallon is a measurement that can take any value within a reasonable range, and averaging is meaningful. Even if we round to one decimal place, this is just a measurement limitation, so this is quantitative continuous.
  4. (d): Finishing place is a ranked category with a clear order, but it is not a numerical measurement of the runner's speed (two runners with different speeds can get the same place in different races). This is categorical ordinal.

Exam tip: If you are stuck classifying ordinal data, remember that it is always categorical, even with an order. Only classify as quantitative if the variable is a direct count or measurement of the characteristic you are studying.
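The classification logic used in this section can be sketched as a small decision function. This is only an illustration of the reasoning, not an official AP procedure; the function name `classify` and its three yes/no inputs are hypothetical, and you still have to answer those questions yourself from the context.

```python
def classify(is_count_or_measurement, is_counted=False, has_natural_order=False):
    """Return the data type label used in this guide.

    is_count_or_measurement: does averaging the values give a meaningful result?
    is_counted: (quantitative only) is the value counted rather than measured?
    has_natural_order: (categorical only) do the categories have a natural order?
    """
    if is_count_or_measurement:
        return "quantitative discrete" if is_counted else "quantitative continuous"
    return "categorical ordinal" if has_natural_order else "categorical nominal"


# The four variables from the worked example above.
classifications = {
    "goals scored in a season": classify(True, is_counted=True),
    "neighborhood": classify(False),
    "miles per gallon": classify(True),
    "finishing place in a 5K": classify(False, has_natural_order=True),
}

for variable, label in classifications.items():
    print(f"{variable}: {label}")
```

Notice that the first question is always whether the variable is a count or measurement; order and counted-versus-measured only matter once that question is settled.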

4. Individuals and Variables

Every data set is built from two core components: individuals and variables. Individuals are the objects described by a set of data; they can be people, animals, plants, objects, or even events, depending on the study context. For example, if you test how long different brands of batteries last, the individuals are the batteries, not people. A variable is any characteristic of an individual that can take different values across different individuals. If every individual has the same value of a characteristic, it is a constant, not a variable, and it will not be the focus of analysis. One-variable data, the focus of this unit, means we measure exactly one variable per individual, and we are only interested in the distribution of that single variable (the distribution tells us what values the variable takes and how often it takes those values). For example, measuring the height of 50 students is one-variable data, while measuring both height and weight of 50 students is two-variable data, which we use to study relationships between variables in Unit 2. Recognizing individuals, variables, and the number of variables per individual is the first step of any statistical analysis, so AP exam questions regularly test this skill to confirm you understand the study context.

Worked Example

A consumer research group tests 60 different models of electric bikes, recording the maximum range (in miles) on a full charge and the price of the bike (in US dollars). For this study, identify (a) the individuals, (b) the number of variables, (c) classify each variable as categorical or quantitative, with secondary classification if applicable.

  1. (a) The individuals in this study are the 60 different electric bike models being tested by the consumer group.
  2. (b) There are two characteristics measured per individual, so this is a two-variable data set (not one-variable).
  3. (c) First variable: maximum range on a full charge. This is a numerical measurement where averaging across models is meaningful, and it can take any value within a range, so it is quantitative continuous. Second variable: price of the bike. Price is a numerical measurement that can take any value within a range (even if it is usually rounded to whole dollars for display), so it is also quantitative continuous.

Exam tip: When asked if a data set is one-variable or two-variable, count how many characteristics are measured per individual. One = one-variable, two = two-variable, regardless of how many individuals there are.
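One way to picture individuals and variables is as rows and columns of a data table. The sketch below stores a few electric bike records in Python. The model names, ranges, and prices are invented for illustration (the study in the worked example has 60 models; only three hypothetical rows are shown here).

```python
# Each dictionary is one individual (one bike model).
# All values below are hypothetical, made up for this illustration.
bikes = [
    {"model": "Model A", "range_miles": 42.5, "price_usd": 1299.0},
    {"model": "Model B", "range_miles": 55.0, "price_usd": 1850.0},
    {"model": "Model C", "range_miles": 38.2, "price_usd": 999.0},
]

# Individuals = number of rows.
number_of_individuals = len(bikes)

# Variables = characteristics recorded per individual.
# "model" is only an identifying label, so it is not counted as a measured variable.
variables = [key for key in bikes[0] if key != "model"]

labels = {1: "one-variable", 2: "two-variable"}
data_set_type = labels.get(len(variables), "multi-variable")

print(number_of_individuals, "individuals")
print(variables, "->", data_set_type)
```

Adding more rows changes the number of individuals but never the number of variables, which is why the one-variable versus two-variable decision does not depend on sample size.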

5. Common Pitfalls (and how to avoid them)

  • Wrong move: Calling numerical labels like student ID numbers or zip codes quantitative because they are written as numbers. Why: Students assume all numerical values are quantitative, ignoring the requirement that arithmetic must produce a meaningful result. Correct move: Always apply the "meaningful average" test: if averaging two zip codes gives no meaningful characteristic of the group, the variable is categorical.
  • Wrong move: Swapping notation for population parameters and sample statistics, e.g., writing the sample mean as $\mu$ or the population standard deviation as $s$. Why: Students forget the Greek vs. Latin rule, and mix up which group the number describes. Correct move: Before writing notation, explicitly ask "Is this number describing the entire population or just the sample?" Use Greek for population parameters, Latin for sample statistics.
  • Wrong move: Defining the population as the group that was sampled, instead of the entire group of interest. Why: Students confuse where data came from with the group the study wants to learn about. Correct move: To identify the population, always ask "What group is this study trying to draw conclusions about?" That group is the population, regardless of sample size.
  • Wrong move: Classifying ordinal data as quantitative just because it has a natural order. Why: Students confuse order with numerical measurement, assuming any ordered characteristic is quantitative. Correct move: Even if categories are ordered, if they are still distinct groups rather than direct numerical measurements, they are categorical.
  • Wrong move: Assuming all population parameters are known because textbook problems often give you parameter values. Why: Working with given parameters in practice problems leads students to incorrect assumptions about real studies. Correct move: Remember that in almost all real statistical studies, population parameters are unknown, and we use sample statistics to estimate them.

6. Practice Questions (AP Statistics Style)

Question 1 (Multiple Choice)

A national restaurant chain wants to estimate the average wait time for a table at all of its locations during peak weekend hours. They randomly sample 80 locations across the country and collect wait time data for 10 customers at each location, for a total of 800 customer wait times. Which of the following is the statistic in this study? A) The average wait time for all customers at all locations of the chain during peak weekend hours B) All customers of the chain during peak weekend hours C) The average wait time of the 800 sampled customers D) The 800 sampled customers

Worked Solution: First, recall that a statistic is a numerical value calculated from sample data that describes a characteristic of the sample. Option A is a numerical value describing the entire population of interest, so it is the parameter, not the statistic. Options B and D describe the population and sample themselves, not numerical characteristics, so they can be eliminated. Option C is the numerical summary calculated from the sample, which matches the definition of a statistic. Correct answer: C.


Question 2 (Free Response)

A state park service wants to know what proportion of park visitors are satisfied with the new trail system the park installed. They randomly survey 400 visitors who exit the park, and find that 312 of the surveyed visitors are satisfied. (a) Identify the population and sample in this context. (2 points) (b) Identify the parameter and statistic in this context, give correct notation for each, and calculate the value of the statistic. (3 points) (c) Is the variable "satisfaction with the new trail system" categorical or quantitative? Justify your answer. (2 points)

Worked Solution: (a) Population: All visitors to the state park. This is the entire group the park service wants information about. Sample: The 400 randomly selected exiting visitors that were actually surveyed. (b) Parameter: The true proportion of all park visitors who are satisfied with the new trail system. Correct notation for this population proportion is $p$. Statistic: The proportion of sampled visitors who are satisfied, denoted $\hat{p}$ and calculated as $\hat{p} = 312/400 = 0.78$. (c) The variable is categorical. Each visitor is placed into one of two groups: satisfied or not satisfied. Averaging coded values (1 = satisfied, 0 = not satisfied) gives a summary proportion, but the variable itself is a group classification, not a quantitative measurement of an individual characteristic.
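Part (c) mentions that averaging 1/0 coded values gives the sample proportion. A short Python sketch of that idea follows, using the counts from the question (312 satisfied out of 400). The list of individual responses is reconstructed from those counts, and the variable names are hypothetical.

```python
# Rebuild the 400 survey responses from the counts given in the question.
responses = ["satisfied"] * 312 + ["not satisfied"] * 88

# The raw variable is categorical: each visitor falls into one of two groups.
categories = sorted(set(responses))

# The sample proportion is a count of one category divided by the sample size.
p_hat = responses.count("satisfied") / len(responses)

# Coding the categories as 1/0 and averaging gives the same number,
# but the coding is a convenience, not a measurement of each visitor.
coded = [1 if response == "satisfied" else 0 for response in responses]
mean_of_coded = sum(coded) / len(coded)

print(categories)
print(p_hat, mean_of_coded)
```

The average of the coded values summarizes the sample as a whole; it does not turn the underlying yes/no variable into a quantitative one.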


Question 3 (Application / Real-World Style)

An agricultural researcher wants to estimate the average yield of corn (in bushels per acre) for a new variety of corn developed by a seed company. She plants 38 test plots of the new corn variety and records the yield for each plot. Yields are measured to the nearest 0.1 bushels per acre. Identify the individuals, classify the variable of interest, and name the parameter and statistic the researcher will use.

Worked Solution: The individuals in this study are the 38 test plots planted with the new corn variety; the population of interest is all possible test plots of this new corn variety. The variable of interest is yield in bushels per acre: this is a numerical measurement where averaging is meaningful, and can take any value within a reasonable range (rounding to 0.1 is just a measurement limitation), so it is quantitative continuous. The parameter of interest is the true mean yield of the new corn variety across all possible test plots, denoted $\mu$. The researcher will calculate the sample mean yield $\bar{x}$ from the 38 sampled plots to estimate the unknown population parameter. In context, this means the researcher uses data from 38 test plots to learn about the expected yield of the new corn variety overall.
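For reference, the statistic in this question can be written out explicitly. Assuming $x_i$ stands for the recorded yield of plot $i$ (a labeling introduced here for illustration, since the question gives no individual yields), the sample mean over the $n = 38$ plots is:

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_{38}}{38}
$$

This value is computed entirely from the sample, which is why it is a statistic; the population mean it estimates cannot be computed because the full population of possible plots is never observed.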

7. Quick Reference Cheatsheet

| Category | Definition / Notation | Notes |
|---|---|---|
| Population | Entire group of interest | Always the group you want to draw conclusions about, not just the sampled group |
| Sample | Subset of the population you actually measure | Used to draw conclusions about the full population |
| Parameter | Numerical value describing a population characteristic | Uses Greek letters ($\mu$, $\sigma$) for AP notation; $p$ is the standard exception for population proportion; almost always unknown in practice |
| Sample Statistic | Numerical value calculated from sample data | Uses Latin letters ($\bar{x}$, $s$, $\hat{p}$) for AP notation; used to estimate unknown population parameters |
| Categorical Data | Data that groups individuals into categories | Use the "meaningful average" test: no meaningful average = categorical |
| Quantitative Data | Data that is a count or measurement | Meaningful average = quantitative, regardless of discrete/continuous classification |
| Discrete Quantitative | Counted, distinct values | Only takes specific separate values (usually whole numbers); e.g., number of students |
| Continuous Quantitative | Measured values that take any value in an interval | Rounding is just a measurement limitation, not an inherent discrete property |
| Individual | Object described by a data set | Can be people, animals, plants, objects, or events |
| One-Variable Data | One variable measured per individual | Core focus of Unit 1: Exploring One-Variable Data |

8. What's Next

This topic is the foundational terminology for all of AP Statistics, and the classifications you learned here will guide every analytical choice you make for the rest of the course. Next, you will move on to representing one-variable data with graphs, where the first step in choosing the correct graph is always classifying your data as categorical or quantitative. Without mastering the definitions of population vs. sample and parameter vs. statistic, you will not be able to correctly interpret sampling distributions, confidence intervals, or hypothesis tests later in the course, all of which rely on this core distinction. This topic feeds into the overarching AP Statistics goal of using sample data to draw inferences about unknown population parameters, which is the core of the inference units (Units 6 through 9) of the course.

