AP · Competing function model validation · 14 min read · Updated 2026-05-10

Competing function model validation — AP Precalculus Study Guide

For: Students preparing for the AP Precalculus exam.

Covers: Residual analysis, coefficient of determination ( $R^{2}$ ), log-transformation linearization, and context-based goodness-of-fit comparison for competing linear, exponential, and power function models per AP Precalculus CED.

You should already know: How to fit linear and exponential models to bivariate data. Properties of logarithms for algebraic transformation. Calculating residuals for regression models.

A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the AP Precalculus style for educational use. They are not reproductions of past College Board / Cambridge / IB papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official mark schemes for grading conventions.

1. What Is Competing function model validation?

Competing function model validation is the process of testing two or more candidate function models (most commonly linear, exponential, and power) against a set of real-world bivariate data to select the model that best describes the underlying relationship. According to the AP Precalculus CED, this topic accounts for approximately 2-3% of the total exam score, and it appears in both multiple-choice (MCQ) and free-response (FRQ) sections. On the AP exam, you will typically be given a scatterplot, data table, or pre-fit candidate models, then asked to justify which model is most appropriate using quantitative or graphical evidence. Sometimes, you will need to linearize an exponential or power model using logarithms first, then compare goodness of fit to validate which non-linear model works best. Unlike fitting a single model, validation focuses on comparing competing options, a critical skill for applied data analysis that is heavily weighted for justification points on FRQs.

2. Graphical Residual Analysis

Residual analysis is the most intuitive and widely tested method for comparing model fit on the AP exam. A residual for a data point $(x_{i}, y_{i})$ from a model $\hat{y} = f (x)$ is defined as $e_{i} = y_{i} - \hat{y}_{i}$ , the difference between the observed $y$ -value and the model's predicted $y$ -value. If a model fits the data well, its residuals will be randomly scattered around the horizontal axis $e = 0$ , with no clear pattern (such as a curve, upward/downward trend, or funnel shape that widens/narrows over $x$ ). If residuals show a clear systematic pattern, the model is systematically missing the underlying trend in the data, so another competing model is almost certainly a better choice. For example, a linear model fit to exponential growth data will almost always produce a curved residual pattern, because the linear model cannot capture the accelerating growth trend.
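To make the curved-residual claim concrete, here is a minimal sketch that fits a least-squares line to exponential data and prints the sign of each residual. The data set (exact values of $50 \cdot 1.3^{x}$ ) and the helper names `fit_line` and `residuals` are assumptions made up for this illustration, not part of any exam question.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, predict):
    """Residual e_i = observed y_i minus the model's predicted value."""
    return [y - predict(x) for x, y in zip(xs, ys)]


# Hypothetical data: exact exponential growth y = 50 * 1.3^x for x = 0..8
xs = list(range(9))
ys = [50 * 1.3 ** x for x in xs]

slope, intercept = fit_line(xs, ys)
linear_residuals = residuals(xs, ys, lambda x: slope * x + intercept)

# One character per data point: "+" for a positive residual, "-" otherwise
signs = "".join("+" if e > 0 else "-" for e in linear_residuals)
print(signs)
```

The residuals come out positive at both ends and negative in the middle, which is the U-shaped pattern that signals a linear model is missing curvature in the data.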

Worked Example

Problem: An ecologist measures the population of deer in a protected forest over 12 years. Residual plots for two candidate models are available: Model 1 (linear growth) has residuals that start positive, become negative in the middle of the time range, then become positive again, forming a clear upward-opening parabolic pattern. Model 2 (exponential growth) has residuals randomly scattered between $- 18$ and $15$ deer, centered evenly around $e = 0$ . Which model is a better fit? Justify your choice.

Recall that a well-fitting model produces residuals with no systematic pattern, randomly distributed around the zero line.
Evaluate Model 1: The clear parabolic pattern in residuals means the linear model fails to capture the non-linear trend in deer population growth, so it is a poor fit.
Evaluate Model 2: Residuals have no systematic pattern and are evenly scattered around zero, which indicates the model correctly captures the underlying trend.
Conclude that Model 2 (exponential growth) is the better fitting model.

Exam tip: On AP FRQ, you must explicitly reference the presence/absence of a pattern in residuals to earn the justification point; just saying "the residuals are better" will not get you full credit.

3. Coefficient of Determination ( $R^{2}$ ) for Model Comparison

The coefficient of determination, written $R^{2}$ , is a quantitative measure of the proportion of variation in the response variable $y$ that is explained by the explanatory variable $x$ in the fitted model. For a least-squares linear regression, $R^{2}$ ranges from $0$ to $1$ (or 0% to 100%). When comparing two competing models fit to the same data, the model with the higher $R^{2}$ explains more of the variation in $y$ , so it is generally the better fitting model. A critical caveat for the AP exam: this comparison is only valid when both models use the same response variable. If you linearize an exponential model by taking the natural log of $y$ , the $R^{2}$ you get from linear regression of $\ln (y)$ on $x$ measures variation in the transformed $\ln (y)$ data, not the original $y$ , so you cannot directly compare that $R^{2}$ to the $R^{2}$ of a linear model fit to original $y$ .
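As a rough illustration of "proportion of variation explained", the sketch below computes $R^{2}$ with the standard formula $R^{2} = 1 - SS_{res} / SS_{tot}$ , where $SS_{res}$ is the sum of squared residuals and $SS_{tot}$ is the sum of squared deviations of $y$ from its mean. The function name `r_squared` and the four-point data set are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot for paired observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


# Hypothetical observed values and two sets of model predictions
observed = [2.0, 4.0, 6.0, 8.0]
perfect_predictions = [2.0, 4.0, 6.0, 8.0]  # every residual is 0
rough_predictions = [3.0, 3.0, 7.0, 7.0]    # every residual is +1 or -1

print(r_squared(observed, perfect_predictions))  # 1.0
print(r_squared(observed, rough_predictions))    # SS_res = 4, SS_tot = 20
```

Because $SS_{tot}$ depends on the observed response values, passing $\ln (y)$ instead of $y$ changes the denominator, which is why $R^{2}$ values from different response variables are not directly comparable.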

Worked Example

Problem: A small business owner compares two models for annual revenue over 10 years: a linear model $R (t) = 125000 + 18000 t$ and an exponential model $R (t) = 120000 (1.11)^{t}$ . Both models are fit to the original untransformed annual revenue data. The linear model has $R^{2} = 0.84$ , and the exponential model has $R^{2} = 0.95$ . Which model is a better fit? Justify.

Confirm that both models use the same response variable (original untransformed annual revenue $R$ ), so direct comparison of $R^{2}$ is valid.
Recall that for competing models with the same response variable, a higher $R^{2}$ indicates more variation explained and a better overall fit.
Compare values: $0.95 > 0.84$ , so the exponential model explains 11 percentage points more of the variation in annual revenue than the linear model (95% vs. 84%).
Conclude the exponential model is the better fit for this revenue data.

Exam tip: Always check that the response variable is identical for both models before comparing $R^{2}$ ; transformed models have $R^{2}$ values that only compare to other models of the same transformed response.

4. Log-Transformation for Linearization of Non-Linear Models

When comparing two non-linear models (exponential vs. power), we use log-transformation to linearize both models, then compare the fit of the linearized versions to select the better original model. This is a common skill on AP FRQs, where you may be asked to transform data and compare fit by hand.

An exponential model has the form $y = a b^{x}$ . Taking the natural log of both sides gives the linear form:

$\ln (y) = \ln (a) + x \ln (b)$

This is linear in $x$ , with slope $\ln (b)$ and intercept $\ln (a)$ .

A power model has the form $y = a x^{b}$ . Taking the natural log of both sides gives its linear form:

$\ln (y) = \ln (a) + b \ln (x)$

This is linear in $\ln (x)$ , with slope $b$ and intercept $\ln (a)$ .

To compare which non-linear model fits better, we check the $R^{2}$ of the linearized regression: the model whose linearized form has a higher $R^{2}$ (and random residuals after regression) is the better original non-linear model. Both linearizations use $\ln (y)$ as the response, so their $R^{2}$ values can be compared directly. Because the logarithm is only defined for positive inputs, the exponential linearization requires $y > 0$ , and the power linearization requires both $x > 0$ and $y > 0$ .
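The two linearizations can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed exam procedure: the data are hypothetical (exact values of $y = 2 x^{1.5}$ ), and `linear_fit_r2` is a helper name invented here. It fits a least-squares line and returns $R^{2}$ as the squared correlation, which is what $R^{2}$ equals for a one-predictor linear regression.

```python
import math


def linear_fit_r2(us, vs):
    """Fit v = slope * u + intercept by least squares; return (slope, intercept, R^2)."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    suu = sum((u - mean_u) ** 2 for u in us)
    svv = sum((v - mean_v) ** 2 for v in vs)
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    slope = suv / suu
    intercept = mean_v - slope * mean_u
    return slope, intercept, suv ** 2 / (suu * svv)


# Hypothetical data generated from the power law y = 2 * x^1.5 (all values positive)
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2 * x ** 1.5 for x in xs]
ln_x = [math.log(x) for x in xs]
ln_y = [math.log(y) for y in ys]

# Exponential candidate: regress ln(y) on x
exp_slope, exp_intercept, exp_r2 = linear_fit_r2(xs, ln_y)
# Power candidate: regress ln(y) on ln(x)
pow_slope, pow_intercept, pow_r2 = linear_fit_r2(ln_x, ln_y)

print(f"exponential linearization R^2: {exp_r2:.4f}")
print(f"power linearization R^2:       {pow_r2:.4f}")
print(f"recovered power model: a = {math.exp(pow_intercept):.3f}, b = {pow_slope:.3f}")
```

Since the data were built from a power law, the power linearization is essentially a perfect line and recovers $a \approx 2$ and $b \approx 1.5$ , while the exponential linearization gives a lower $R^{2}$ .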

Worked Example

Problem: We have 9 data points $(x, y)$ with $x > 0$ and $y > 0$ , and we want to choose between an exponential model $y = a b^{x}$ and a power model $y = a x^{b}$ . After correct transformation and regression, the exponential model has $R^{2} = 0.88$ , and the power model has $R^{2} = 0.97$ . Both linearized models have randomly scattered residuals. Which original model is better? Justify.

Confirm all $x$ and $y$ are positive, so log-transformation is valid for both models.
Confirm each model was correctly linearized: exponential is regressed as $\ln (y)$ vs $x$ , power is regressed as $\ln (y)$ vs $\ln (x)$ .
Compare the $R^{2}$ of the correctly linearized models: $0.97 > 0.88$ , so the power model's linearization has better fit.
A better fit for the linearized transformation corresponds to a better fit for the original non-linear model, so the original power model is preferred.

Exam tip: Always remember that exponential models linearize against $x$ , while power models linearize against $\ln (x)$ ; mixing up the predictor variable will give an incorrect $R^{2}$ and wrong conclusion.

5. Common Pitfalls (and how to avoid them)

Wrong move: Comparing $R^{2}$ of a linear model fit to original $y$ directly to the $R^{2}$ of a linearized exponential model fit to $\ln (y)$ . Why: Students forget $R^{2}$ measures variation in the response variable, so it is only comparable when the response is identical. Correct move: If you need to compare on the original scale, calculate $R^{2}$ for the exponential model using predicted $\hat{y}$ (not $\ln (\hat{y})$ ) on the original scale, then compare.
Wrong move: Claiming a model is bad just because one residual is much larger than the rest. Why: Students confuse a single outlier with a systematic pattern across all data points. Correct move: Judge model fit based on the overall pattern of all residuals, not just one extreme outlier.
Wrong move: Linearizing a power model by regressing $\ln (y)$ on $x$ instead of $\ln (x)$ . Why: Students mix up the linearization formulas for exponential and power models. Correct move: Memorize the two transformations: exponential uses predictor $x$ , power uses predictor $\ln (x)$ , and match to the original model form.
Wrong move: Concluding a model is better just because it has a higher $R^{2}$ , even when it has a clear systematic residual pattern. Why: Students over-rely on $R^{2}$ and ignore critical graphical evidence of poor fit. Correct move: Always check residual patterns first; a model with slightly lower $R^{2}$ but random residuals is better than a higher $R^{2}$ model with a clear systematic pattern.
Wrong move: Attempting to take the logarithm of zero or a negative $y$ -value when linearizing. Why: Students do not check the domain of the data before applying transformation. Correct move: Confirm all $x$ and $y$ values are positive before using log-transformation; if non-positive values exist, use residual analysis on original data for comparison instead.
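The first pitfall's "correct move" (comparing a linear and an exponential model on the original $y$ scale) might look like the sketch below. The data are invented for illustration, and `fit_line` and `r_squared` are hypothetical helper names. The exponential model is fit through $\ln (y)$ , then its predictions are transformed back to the original scale before $R^{2}$ is computed as $1 - SS_{res} / SS_{tot}$ .

```python
import math


def fit_line(us, vs):
    """Least-squares slope and intercept for v = slope * u + intercept."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    suu = sum((u - mean_u) ** 2 for u in us)
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    slope = suv / suu
    return slope, mean_v - slope * mean_u


def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot, measured on whatever scale is passed in."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


# Hypothetical data, invented for illustration (roughly 30% growth per step)
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [10, 13, 17, 21, 28, 37, 47, 62]

# Linear model fit directly to y
m, c = fit_line(xs, ys)
linear_r2 = r_squared(ys, [m * x + c for x in xs])

# Exponential model fit through ln(y), then back-transformed to y-hat = a * b^x
k, ln_a = fit_line(xs, [math.log(y) for y in ys])
a, b = math.exp(ln_a), math.exp(k)
exponential_r2 = r_squared(ys, [a * b ** x for x in xs])

print(f"linear R^2 on y:      {linear_r2:.3f}")
print(f"exponential R^2 on y: {exponential_r2:.3f}")
```

Both values are now measured against the same response (the original $y$ ), so the comparison is fair. The $R^{2}$ reported by the regression of $\ln (y)$ on $x$ would not be.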

6. Practice Questions (AP Precalculus Style)

Question 1 (Multiple Choice)

A researcher compares three models for the height of a growing tomato plant over time, all fit to the original height data (in cm). The results are below:

Linear Model: $R^{2} = 0.82$ , residuals form a clear upward curved trend
Exponential Model: $R^{2} = 0.91$ , residuals are randomly scattered around 0
Quadratic Model: $R^{2} = 0.92$ , residuals are randomly scattered around 0

Based on this information, which model is the most appropriate?

A) Linear model, because it has the lowest $R^{2}$
B) Quadratic model, because it has the highest $R^{2}$ and random residuals
C) Exponential model, because it has a higher $R^{2}$ than the linear model
D) Quadratic model, because growth is always curved

Worked Solution: First, eliminate option A, because lower $R^{2}$ means less variation explained and worse fit, so A is incorrect. The linear model has a clear curved residual pattern, so it is already invalid. We are left with two models that both have random residuals, fit to the same original response variable (height). The quadratic model has a higher $R^{2}$ (0.92 vs 0.91 for exponential), meaning it explains more variation in height, so it is the better fit. Option C is incorrect because it ignores the superior quadratic model, and option D's reasoning is wrong because we select models based on empirical evidence, not assumptions about growth. The correct answer is B.
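The reasoning in this solution follows a two-step rule: discard any model with patterned residuals, then pick the highest $R^{2}$ among what remains. A minimal sketch of that rule, using the numbers from Question 1, is shown here. The `residuals_random` flag is a hypothetical stand-in for a judgment you would make by looking at the residual plot, and the sketch assumes every $R^{2}$ was computed on the same response variable.

```python
# Summary of each candidate model from Question 1
candidates = [
    {"name": "Linear", "r2": 0.82, "residuals_random": False},
    {"name": "Exponential", "r2": 0.91, "residuals_random": True},
    {"name": "Quadratic", "r2": 0.92, "residuals_random": True},
]


def choose_model(models):
    """Step 1: keep models with random residuals. Step 2: take the highest R^2."""
    acceptable = [m for m in models if m["residuals_random"]]
    if not acceptable:
        return None  # no candidate passes the residual check
    return max(acceptable, key=lambda m: m["r2"])["name"]


print(choose_model(candidates))  # Quadratic
```

The residual check comes first on purpose: a model with patterned residuals is excluded even if its $R^{2}$ happens to be the largest.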

Question 2 (Free Response)

A data set of 10 points with $x > 0, y > 0$ is given, and researchers want to choose between an exponential model $y = a b^{x}$ and a power model $y = a x^{b}$ .

(a) State the correct transformed linear equation for each model, clearly identifying the response and predictor variables for each linear regression.
(b) After fitting both transformed linear regressions, the exponential model has $R^{2} = 0.912$ and the power model has $R^{2} = 0.968$ . The residual plots for both linearized models show random scatter around 0. Which model is preferred for the original data? Justify your answer.
(c) For the preferred model, the linearized regression has intercept 1.2 and slope 0.8. Find the original model parameters $a$ and $b$ , writing the final original model in terms of $x$ and $y$ .

Worked Solution:

(a) For the exponential model $y = a b^{x}$ : take the natural log of both sides to get $\ln (y) = \ln (a) + (\ln b) x$ . The response variable is $\ln (y)$ , and the predictor variable is $x$ . For the power model $y = a x^{b}$ : take the natural log of both sides to get $\ln (y) = \ln (a) + b \ln (x)$ . The response variable is $\ln (y)$ , and the predictor variable is $\ln (x)$ .

(b) All $x$ and $y$ are positive, so log-transformation is valid for both models. Both linearized models have random residuals, so we compare their $R^{2}$ values: $0.968 > 0.912$ . The power model's linearized form has a higher $R^{2}$ , meaning it explains more variation in the transformed response, which corresponds to a better fit for the original power model. Thus the power model is preferred.

(c) The preferred model is the power model, so its linearized form is $\ln (y) = \ln (a) + b \ln (x)$ . We have intercept $= 1.2 = \ln (a)$ and slope $= 0.8 = b$ . Solving for $a$ gives $a = e^{1.2} \approx 3.32$ . The final original model is $y = 3.32 x^{0.8}$ .
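The back-transformation in part (c) can be checked numerically. This short sketch uses the intercept and slope given in the question. The variable names and the `power_model` function are chosen here for illustration only.

```python
import math

# Linearized power model from part (c): ln(y) = 1.2 + 0.8 * ln(x)
intercept = 1.2
slope = 0.8

a = math.exp(intercept)  # intercept = ln(a), so a = e^intercept
b = slope                # for a power model, the slope is b itself


def power_model(x):
    """Original-scale power model y = a * x^b."""
    return a * x ** b


print(round(a, 2))  # 3.32
# Sanity check: at x = e, ln(x) = 1, so ln(y) should equal 1.2 + 0.8 = 2.0
print(round(math.log(power_model(math.e)), 6))  # 2.0
```

Had the exponential model been preferred instead, the slope would equal $\ln (b)$ , so the back-transformation would be $b = e^{\text{slope}}$ rather than $b = \text{slope}$ .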

Question 3 (Application / Real-World Style)

A bakery owner tracks the number of daily loaf sales over 14 days after launching a new marketing campaign, with the following results: A linear model fit to the data has residuals that are positive for the first 4 days, negative for days 5-10, and positive again for days 11-14, forming a clear U-shaped pattern. The linear model has $R^{2} = 0.72$ . An exponential model fit to the same original sales data has residuals that are randomly scattered between -8 and 9 loaves around 0, with $R^{2} = 0.91$ . Daily sales are always positive. Which model should the owner use to predict future sales? Justify your answer, and interpret what the better model implies about sales growth after the campaign.

Worked Solution: Both models are fit to the same original response variable (daily loaf sales), so we can compare both residual patterns and $R^{2}$ . The linear model has a clear U-shaped residual pattern, which means it systematically misses the non-linear trend in sales, and it has a much lower $R^{2}$ of 0.72. The exponential model has randomly scattered residuals with no systematic pattern, and a higher $R^{2}$ of 0.91, meaning it explains 91% of the variation in daily sales vs only 72% for the linear model. Thus, the exponential model is the better choice for prediction. In context, this means daily loaf sales are growing at an increasing (exponential) rate after the marketing campaign, rather than at a constant linear rate.

7. Quick Reference Cheatsheet

Category	Formula / Rule	Notes
Residual Calculation	$e_{i} = y_{i} - \hat{y}_{i}$	$y_{i}$ = observed $y$ , $\hat{y}_{i}$ = predicted $y$ from the model
Good Residual Pattern	No systematic pattern, random around $e = 0$	Indicates a well-fitting model
Bad Residual Pattern	Curve, linear trend, or funnel shape	Indicates a poorly fitting model
$R^{2}$ Comparison Rule	Higher $R^{2}$ = better fit	Only valid when both models have the same response variable
Exponential Model Linearization	$\ln (y) = \ln (a) + (\ln b) x$	Response = $\ln (y)$ , Predictor = $x$ , only valid for $y > 0$
Power Model Linearization	$\ln (y) = \ln (a) + b \ln (x)$	Response = $\ln (y)$ , Predictor = $\ln (x)$ , only valid for $x > 0, y > 0$
Model Selection Priority	1. Check residual pattern first 2. Compare $R^{2}$ if multiple models have good residuals	Residual pattern always takes priority over $R^{2}$

8. What's Next

Competing function model validation is a core modeling skill in Unit 2: Exponential and Logarithmic Functions, and it prepares you for later modeling-focused content on the AP Precalculus exam. It builds on the polynomial and rational function models of Unit 1, and the same validation skills carry into the rest of Unit 2 and into Unit 3, which covers trigonometric and polar functions, where you will justify sinusoidal models for periodic data. Without mastering residual analysis and $R^{2}$ comparison from this chapter, justifying model selection in those settings will be much more difficult, as the same principles apply. This topic also feeds into the bigger picture of quantitative reasoning, where you will need to select appropriate models for real-world data across all areas of applied math and science.

Related topics: Exponential model parameters and interpretation; Logarithm properties for transformation; Linear regression for bivariate data; Polynomial model validation
