Hypothesis Tests for the Slope of a Regression Model — AP Statistics Study Guide
For: Candidates sitting the AP Statistics exam.
Covers: This chapter covers stating null and alternative hypotheses for a regression slope, checking inference conditions, calculating the t-test statistic for slope, finding p-values, interpreting results in context, and connecting hypothesis tests to confidence intervals for slope.
You should already know: How to calculate the slope of a least-squares regression line from sample data. The four conditions for inference for regression. How to calculate t-test statistics and p-values for t-procedures.
A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the AP Statistics style for educational use. They are not reproductions of past College Board / Cambridge / IB papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official mark schemes for grading conventions.
1. What Are Hypothesis Tests for the Slope of a Regression Model?
This topic makes up roughly 5-7% of the overall AP Statistics exam, and it appears on both MCQ and FRQ sections, most commonly as a multi-part FRQ paired with confidence intervals for slope. The core goal of this test is to answer a critical question: when we fit a least-squares regression line to sample data, do we have evidence that a true linear relationship exists between the explanatory and response variables in the full population?
Common synonyms for this test: t-test for regression slope, significance test for linear association, inference for regression slope. Standard notation: the true population slope is denoted $\beta$ (beta), and the estimated sample slope is denoted $b$ or $\hat{\beta}$. The default null hypothesis is $H_0: \beta = 0$, because a slope of zero means no linear association between the two variables. Almost all AP questions ask for a test of non-zero slope, though one-tailed directional tests are possible if a directional claim is made. This topic unifies your knowledge of significance testing and linear regression, two of the most heavily tested themes on the exam.
2. Stating Hypotheses and Checking Inference Conditions
The first step of any hypothesis test for slope is to correctly define the parameter, state hypotheses, and verify the conditions required for the sampling distribution of the slope to follow a t-distribution.
First, you must always state hypotheses in terms of the population parameter, not the sample statistic. For a test of any linear relationship, the default hypotheses are:
- $H_0: \beta = 0$ (no linear relationship between $x$ and $y$ in the population)
- $H_a: \beta \neq 0$ (there is a non-zero linear relationship between $x$ and $y$ in the population)
If the question claims the slope is positive or negative, use a one-tailed alternative ($H_a: \beta > 0$ or $H_a: \beta < 0$). Next, check the four conditions for inference, remembered by the acronym LINE:
- Linear: The true relationship between $x$ and $y$ is linear. Check with a residual plot (no curved pattern = condition met).
- Independent: Individual observations are independent. Check the 10% condition for sampling without replacement, or random assignment for experiments.
- Normal: Residuals are normally distributed around the regression line. Check with a linear normal probability plot of residuals, or a roughly symmetric histogram of residuals.
- Equal Variance: The spread of residuals is constant across all values of $x$. Check on the residual plot (no fanning in/out = condition met).
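Three of the LINE checks are made on residuals, so it can help to see where residuals come from. The following is a minimal sketch, assuming a small invented data set; the helper names `least_squares` and `residuals` are illustrative and not part of any AP formula sheet. It fits the least-squares line and lists each residual, which is what you would then plot against $x$.

```python
from statistics import mean

def least_squares(x, y):
    """Return (intercept a, slope b) of the least-squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def residuals(x, y):
    """Residual = observed y minus predicted y, one per data point."""
    a, b = least_squares(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical data, invented purely for illustration.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

for xi, ri in zip(x, residuals(x, y)):
    print(f"x = {xi}: residual = {ri:+.3f}")
```

A residual plot with no curve and no fanning supports the Linear and Equal Variance conditions; a roughly linear normal probability plot of the same residuals supports the Normal condition.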
Worked Example
A public health researcher wants to test if there is a linear relationship between average daily step count and resting heart rate (beats per minute) for adults in a large city. She collects a random sample of 45 adults. State appropriate hypotheses and describe how to check all conditions for this test.
- Define the parameter: Let $\beta$ = the true slope of the regression line relating average daily step count to resting heart rate for all adults in the city.
- State hypotheses: Since the question tests for any linear relationship, use a two-tailed test: $H_0: \beta = 0$ and $H_a: \beta \neq 0$.
- Check conditions: Linearity: Plot residuals vs. step count; no curved pattern = condition satisfied. Independence: The sample is random, and 45 adults is less than 10% of all adults in the city = condition satisfied. Normality: Create a normal probability plot of residuals; a roughly linear plot = condition satisfied. Equal Variance: Check the residual plot for consistent spread of residuals across all step count values; no fanning = condition satisfied.
Exam tip: On AP FRQs, you must name each condition and explain how you check it in context; just writing "LINE" is not enough to earn full credit.
3. Calculating the Test Statistic and P-Value
If all conditions are met, the standardized sample slope follows a t-distribution with degrees of freedom $df = n - 2$, where $n$ is the number of observations in the sample. We lose 2 degrees of freedom because we estimate two parameters for the regression line: the intercept and the slope.
The formula for the t-test statistic is: $t = \frac{b - \beta_0}{SE_b}$, where $b$ is the sample slope from least-squares regression, $\beta_0$ is the hypothesized population slope from the null hypothesis (almost always 0), and $SE_b$ is the standard error of the slope. On the AP exam, $SE_b$ is almost always given in regression computer output; you will almost never need to calculate it by hand.
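Although the standard error of the slope is normally read from output, it can help to see where the number comes from. The sketch below assumes the usual formula $SE_b = \frac{s}{s_x\sqrt{n-1}}$, where $s$ is the standard deviation of the residuals (computed with $n - 2$ in the denominator) and $s_x$ is the sample standard deviation of the explanatory variable. The function name `slope_and_se` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def slope_and_se(x, y):
    """Return (sample slope b, standard error of the slope SE_b)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    # s: standard deviation of the residuals, using n - 2 degrees of freedom
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))
    # SE_b = s / (s_x * sqrt(n - 1))
    se_b = s / (stdev(x) * sqrt(n - 1))
    return b, se_b

# Hypothetical data, invented purely for illustration.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
b, se_b = slope_and_se(x, y)
print(f"b = {b:.4f}, SE_b = {se_b:.4f}")
```

The value this produces plays the role of the "SE Coef" entry in the slope row of regression output.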
Once you calculate the t-statistic, find the p-value, which is the probability of observing a t-statistic as extreme or more extreme than your result, assuming $H_0$ is true. For a two-tailed test, double the one-tailed p-value; for a one-tailed test, use the one-tailed p-value matching the direction of $H_a$.
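The calculation can be sketched in code. This is an illustrative, standard-library-only version, not how a calculator or statistics package implements it: the tail area is approximated by Simpson's rule on the t density, and the names `t_pdf`, `upper_tail`, and `slope_t_test` are assumptions made for this example. Because it uses `math.gamma`, it is only suitable for modest degrees of freedom.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def slope_t_test(b, se_b, n, beta_0=0.0, alternative="two-sided"):
    """Return (t, df, p) for a test of H0: beta = beta_0."""
    df = n - 2
    t = (b - beta_0) / se_b
    if alternative == "two-sided":
        p = 2 * upper_tail(abs(t), df)
    elif alternative == "greater":
        p = upper_tail(t, df) if t >= 0 else 1 - upper_tail(-t, df)
    elif alternative == "less":
        p = upper_tail(-t, df) if t <= 0 else 1 - upper_tail(t, df)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return t, df, p

# Hypothetical inputs chosen so that t = 2.43 with n = 18 observations.
t, df, p = slope_t_test(b=2.43, se_b=1.0, n=18)
print(f"t = {t:.2f}, df = {df}, two-sided p = {p:.4f}")
```

On the exam itself, a calculator's t-distribution function or the t table replaces `upper_tail`.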
Worked Example
An economist studies the relationship between monthly advertising spending (hundreds of dollars, $x$) and monthly sales (thousands of dollars, $y$) for 18 small retail stores. From regression output he reads the sample slope $b$ and its standard error $SE_b$. Calculate the t-test statistic and bound the p-value for a two-tailed test of $H_0: \beta = 0$.
- Degrees of freedom: $n = 18$, and $df = n - 2 = 16$.
- Calculate t-statistic: $t = \frac{b - 0}{SE_b} = 2.43$.
- Find p-value bounds: For $df = 16$, a t-statistic of 2.43 falls between the critical values for one-tailed p = 0.025 (t = 2.120) and p = 0.01 (t = 2.583), so the one-tailed p-value is between 0.01 and 0.025. For a two-tailed test, double the one-tailed bounds, so $0.02 < p < 0.05$.
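The table lookup in this example can be mirrored in a short sketch. The critical values below are selected columns of the $df = 16$ row of a standard t table; the function name `bound_p_value` is hypothetical.

```python
# Selected upper-tail critical values for df = 16: (tail probability, t*).
# Listed from largest tail probability to smallest.
T_TABLE_DF16 = [
    (0.05, 1.746),
    (0.025, 2.120),
    (0.01, 2.583),
    (0.005, 2.921),
]

def bound_p_value(t, table, two_tailed=True):
    """Bracket the p-value between adjacent table columns."""
    t_abs = abs(t)
    lower, upper = 0.0, 0.5  # a one-tailed area beyond t >= 0 is at most 0.5
    for tail, crit in table:
        if t_abs >= crit:
            upper = tail      # t is at least this extreme, so p <= tail
        else:
            lower = tail      # t is less extreme than this column, so p > tail
            break
    factor = 2 if two_tailed else 1
    return factor * lower, factor * upper

low, high = bound_p_value(2.43, T_TABLE_DF16)
print(f"{low} < p < {high}")
```

If the statistic exceeds every listed critical value, the lower bound stays at 0, which corresponds to reporting the p-value as smaller than twice the last column.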
Exam tip: If you are given full regression output with pre-calculated t and p-values for the slope row, you do not need to recalculate them; just use the given values directly to save time on the exam.
4. Drawing a Conclusion in Context
After finding the p-value, you compare it to the pre-specified significance level $\alpha$ (almost always $\alpha = 0.05$ unless stated otherwise). The decision rule is standard for significance testing: if $p < \alpha$, reject the null hypothesis; if $p \ge \alpha$, fail to reject the null hypothesis.
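As a minimal sketch, the decision rule fits in one small function. The name `decide` and the example p-values are illustrative only, and the wording of the returned strings deliberately avoids "accept".

```python
def decide(p_value, alpha=0.05):
    """Apply the standard decision rule for a significance test."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

# Illustrative p-values only.
print(decide(0.015))  # reject H0
print(decide(0.07))   # fail to reject H0
```

The function returns only the decision; the written conclusion still has to be stated in the context of the problem.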
The most heavily tested skill on the AP exam for this topic is writing a correct conclusion in context. There are two common mistakes students make here: (1) not tying the conclusion back to the problem context, and (2) incorrectly stating that the null hypothesis is true when failing to reject. You can never prove the null is true; you only have enough or not enough evidence to reject it. Never use the phrase "accept $H_0$". Also, remember that statistical significance does not equal practical significance: a very small slope can be significant with a large sample size, but may not have any real-world meaning.
Worked Example
In the advertising and sales example from the previous section, the p-value is between 0.02 and 0.05, and $\alpha = 0.05$. State a correct conclusion in context.
- Compare p-value to $\alpha$: $p < 0.05 = \alpha$, so we reject the null hypothesis $H_0: \beta = 0$.
- Write conclusion in context: There is convincing statistical evidence at the 0.05 significance level that there is a non-zero linear relationship between monthly advertising spending and monthly sales for the population of small retail stores similar to those in the study.
- If the p-value were 0.07 instead, we would fail to reject $H_0$, and the conclusion would be: There is not convincing statistical evidence at the 0.05 significance level that there is a linear relationship between monthly advertising spending and monthly sales for this population of small retail stores.
Exam tip: Always include the phrase "convincing statistical evidence" and explicitly reference the significance level in your AP FRQ conclusion; omitting these can cost you a full point.
Common Pitfalls (and how to avoid them)
- Wrong move: Stating hypotheses in terms of the sample slope $b$ instead of the population slope $\beta$, e.g., $H_0: b = 0$. Why: Students mix up sample statistics and population parameters, a confusion carried over from earlier inference topics. Correct move: Always define $\beta$ as the true population slope first, then state all hypotheses in terms of $\beta$.
- Wrong move: Using degrees of freedom $df = n - 1$ instead of $df = n - 2$. Why: Students reuse the degrees of freedom from one-sample t-tests for means, which is incorrect for regression. Correct move: Always remember regression estimates two parameters (intercept and slope), so subtract 2 from the sample size to get $df = n - 2$.
- Wrong move: Checking normality by looking at the distribution of the explanatory variable $x$, instead of the distribution of residuals. Why: Students confuse the conditions for inference on means with the conditions for regression inference. Correct move: The Normal condition in regression applies to the residuals, so always check normality with a plot of the residuals.
- Wrong move: Concluding "there is no linear relationship" when you fail to reject $H_0$. Why: Students assume failing to reject means the null is proven true, which is a core logical error in significance testing. Correct move: Always write "there is not convincing statistical evidence of a linear relationship", never that no relationship exists.
- Wrong move: Using the standard deviation of the response variable instead of the standard error of the slope $SE_b$ when calculating the t-statistic. Why: Students confuse different types of standard deviation/error in regression output. Correct move: Always pull $SE_b$ from the "SE Coef" column in the row for your explanatory variable.
- Wrong move: Claiming a statistically significant slope means the relationship is strong or practically important. Why: Students confuse statistical significance with practical significance. Correct move: If asked about importance, check the size of the slope; a small slope can be significant with a large sample but have no practical impact.
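The last pitfall can be made concrete with a small sketch. It assumes the standard-error formula $SE_b = \frac{s}{s_x\sqrt{n-1}}$ and holds the slope, the residual standard deviation $s$, and the spread of the explanatory variable $s_x$ fixed at invented values, so that only the sample size changes. The function name `t_for_slope` is hypothetical.

```python
from math import sqrt

def t_for_slope(b, s, s_x, n):
    """t-statistic for H0: beta = 0, using SE_b = s / (s_x * sqrt(n - 1))."""
    se_b = s / (s_x * sqrt(n - 1))
    return b / se_b

# Invented values: a tiny slope, with s and s_x held fixed as n grows.
b, s, s_x = 0.01, 1.0, 1.0
for n in (25, 1_000, 100_001):
    print(f"n = {n:>7}: t = {t_for_slope(b, s, s_x, n):.3f}")
```

The slope never changes, yet the t-statistic grows with $\sqrt{n-1}$, so a large enough sample makes even a negligible slope statistically significant.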
Practice Questions (AP Statistics Style)
Question 1 (Multiple Choice)
An ecologist wants to test if there is a linear relationship between tree diameter at breast height ($x$, cm) and total tree height ($y$, meters). She calculates a 95% confidence interval for the population slope $\beta$ to be (-0.02, 0.15). For a two-tailed hypothesis test of $H_0: \beta = 0$ at $\alpha = 0.05$, which of the following is correct?
A) Reject $H_0$, because 0 is inside the confidence interval.
B) Fail to reject $H_0$, because 0 is inside the confidence interval.
C) Reject $H_0$, because the interval contains both positive and negative values.
D) Fail to reject $H_0$, because the upper bound of the interval is positive.
Worked Solution: A $(1 - \alpha) \cdot 100\%$ confidence interval for $\beta$ directly corresponds to a two-tailed hypothesis test at significance level $\alpha$. If the null value (0) is inside the interval, we fail to reject $H_0$; if 0 is outside, we reject $H_0$. In this case, 0 is inside the interval (-0.02, 0.15), so we fail to reject $H_0$. The other options do not follow the correct connection between confidence intervals and hypothesis tests. The correct answer is B.
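The shortcut used in this solution can be written as a one-line check. This is a sketch only; the function name `decision_from_ci` is hypothetical, and it applies solely to a two-tailed test whose significance level matches the interval's confidence level (for example, a 95% interval with a 0.05 level).

```python
def decision_from_ci(lower, upper, null_value=0.0):
    """Two-tailed decision implied by a confidence interval for the slope."""
    if lower <= null_value <= upper:
        return "fail to reject H0"
    return "reject H0"

# Interval from Question 1.
print(decision_from_ci(-0.02, 0.15))  # fail to reject H0
```

An interval lying entirely above or entirely below 0 would return "reject H0" instead.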
Question 2 (Free Response)
An education researcher studies the relationship between average class size ($x$, number of students) and average end-of-year test score ($y$, points out of 100) across 25 randomly selected 4th grade classes in a state. She wants to test if there is a negative linear relationship between class size and test scores.
(a) State the appropriate null and alternative hypotheses, defining all parameters.
(b) Regression output gives the sample slope $b$ and its standard error $SE_b$. Calculate the test statistic and degrees of freedom for this test.
(c) The p-value for the test is 0.015. Using $\alpha = 0.05$, state a full conclusion in context.
Worked Solution: (a) Let $\beta$ = the true slope of the regression line relating average class size to average end-of-year test score for all 4th grade classes in the state. Hypotheses: $H_0: \beta = 0$, $H_a: \beta < 0$.
(b) Degrees of freedom: $df = n - 2 = 25 - 2 = 23$. Test statistic: $t = \frac{b - 0}{SE_b}$.
(c) Since $p = 0.015 < 0.05 = \alpha$, we reject the null hypothesis. There is convincing statistical evidence at the 0.05 significance level that there is a negative linear relationship between average class size and average end-of-year test score for 4th grade classes in this state.
Question 3 (Application / Real-World Style)
A coffee shop owner wants to test if there is a linear relationship between daily high temperature (°F, $x$) and daily iced coffee sales (dollars, $y$). He collects data from a random sample of 25 days and obtains the sample slope $b$ and its standard error $SE_b$ from regression output. Test for a linear relationship at $\alpha = 0.05$, and interpret your result in context. Assume all conditions for inference are met.
Worked Solution: Hypotheses are $H_0: \beta = 0$, $H_a: \beta \neq 0$, with $df = 25 - 2 = 23$. The test statistic is $t = \frac{b - 0}{SE_b}$. For a two-tailed test with $df = 23$, the p-value is between 0.01 and 0.02. Since $p < 0.05 = \alpha$, we reject the null hypothesis. In context: There is convincing statistical evidence at the 0.05 significance level that there is a non-zero linear relationship between daily high temperature and daily iced coffee sales at this coffee shop.
Quick Reference Cheatsheet
| Category | Formula | Notes |
|---|---|---|
| Population Slope Parameter | $\beta$ | True slope for the entire population; hypotheses are stated in terms of $\beta$ |
| Sample Slope Estimate | $b$ | Estimated slope from sample least-squares regression; never use $b$ for hypotheses |
| Default Null Hypothesis | $H_0: \beta = 0$ | Tests for no linear relationship between $x$ and $y$ |
| Two-Tailed Alternative | $H_a: \beta \neq 0$ | Used when testing for any linear association with no direction claimed |
| One-Tailed Alternative | $H_a: \beta > 0$ / $H_a: \beta < 0$ | Used when the question claims a specific direction of association |
| t-test Statistic for Slope | $t = \frac{b - \beta_0}{SE_b}$ | $\beta_0 = 0$ for almost all tests; $SE_b$ comes from regression output |
| Degrees of Freedom | $df = n - 2$ | Subtract 2 because we estimate intercept and slope |
| Inference Conditions | LINE: Linear, Independent, Normal, Equal Variance | All conditions checked on residuals; name and describe check on AP FRQ |
| Conclusion (Reject $H_0$) | "Convincing evidence of a linear relationship" | Use when $p < \alpha$; always write in context |
| Conclusion (Fail to Reject $H_0$) | "No convincing evidence of a linear relationship" | Never say "accept $H_0$" or "no relationship exists" |
| Connection to Confidence Intervals | Reject $H_0$ at $\alpha = 0.05$ if 0 not in 95% CI | Common shortcut for multiple choice questions |
What's Next
Hypothesis testing for the regression slope is the core significance testing topic for Unit 9, and it is a prerequisite for the next topic: confidence intervals for the slope of a regression model. Confidence intervals provide additional information about the magnitude of the true slope, not just whether it is significantly different from zero, so you need to master the conditions, t-procedure, and conclusion framework from this chapter before moving on. This topic also extends the general logic of significance testing you learned for proportions and means to the context of linear regression, one of the most widely used statistical methods in real-world research. Mastering this topic builds the foundation for more advanced regression topics you will encounter in college statistics courses.
Follow-on topics: Confidence Intervals for the Slope of a Regression Model, Least-Squares Regression, Significance Testing for Means, Inference for Proportions