AP · Function model selection and assumption articulation · 14 min read · Updated 2026-05-10

Function model selection and assumption articulation — AP Precalculus Study Guide

For: Candidates sitting the AP Precalculus exam.

Covers: Selecting linear, quadratic, higher-degree polynomial, and rational function models from contextual data, articulating modeling assumptions, comparing model fit, and validating extrapolation of results per AP Precalculus CED.

You should already know: End behavior of polynomials and rational functions, basic scatter plot and regression concepts, limit behavior of functions at asymptotes.

A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the AP Precalculus style for educational use. They are not reproductions of past College Board / Cambridge / IB papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official mark schemes for grading conventions.

1. What Is Function model selection and assumption articulation?

Function model selection and assumption articulation is the process of matching a contextual data set or relationship description to an appropriate polynomial or rational function model, then explicitly stating the unstated simplifying assumptions that make the model valid. Per the AP Precalculus Course and Exam Description (CED), this topic accounts for approximately 2.5% of the total exam score, and appears in both multiple-choice (MCQ) and free-response (FRQ) sections. MCQ questions typically test model identification via elimination of inappropriate candidates, while FRQ questions require justification of model choice and explicit articulation of assumptions for full credit. Standard notation uses $x$ for the independent input variable (usually time, count, or mass) and $f (x)$ for the dependent output variable (usually cost, population, or concentration). Synonyms for this process include contextual model fitting, model validation, and assumption analysis. Unlike pure statistical regression, this topic emphasizes matching functional behavior to contextual constraints rather than just minimizing prediction error, making it a frequent test of conceptual understanding.

2. Model Selection Based on End Behavior and Contextual Constraints

The first step in any model selection problem is to narrow down candidate models by matching their end behavior and domain properties to the requirements of the context. For polynomials, end behavior is entirely determined by degree and leading coefficient: odd-degree polynomials have opposite end behavior (one end goes to $+ \infty$ , the other to $- \infty$ ), while even-degree polynomials have matching end behavior (both ends go to the same signed infinity). Polynomials are defined for all real inputs, so they can never have vertical asymptotes (infinite output at a finite input).
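The degree-and-sign rule above can be written as a small lookup. This is a minimal sketch for illustration only; the function name `polynomial_end_behavior` is a hypothetical helper, not part of any AP resource or library.

```python
import math

def polynomial_end_behavior(degree, leading_coefficient):
    """Return (limit as x -> -inf, limit as x -> +inf) as signed infinities.

    Hypothetical helper: only the degree and the sign of the leading
    coefficient matter for a polynomial's end behavior.
    """
    if degree < 1 or leading_coefficient == 0:
        raise ValueError("need degree >= 1 and a nonzero leading coefficient")
    # The right end follows the sign of the leading coefficient.
    right = math.copysign(math.inf, leading_coefficient)
    # Even degree: both ends match. Odd degree: the ends are opposite.
    left = right if degree % 2 == 0 else -right
    return left, right

# A cubic with positive leading coefficient falls to the left and rises to the right.
print(polynomial_end_behavior(3, 2.0))
# A quadratic with negative leading coefficient falls on both ends.
print(polynomial_end_behavior(2, -1.0))
```

Because the result depends only on two pieces of information, candidates can often be eliminated before any parameter is fitted.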

For rational functions (ratios of two polynomials), vertical asymptotes occur at values of $x$ that make the denominator zero (and do not cancel with a root in the numerator), so among polynomial and rational models, rational functions are the only choice for contexts where output approaches infinity at a finite input. End behavior for rational functions depends on the degrees of the numerator and denominator: if the numerator degree is less than the denominator degree, there is a horizontal asymptote at $y = 0$ ; if the numerator degree equals the denominator degree, there is a horizontal asymptote at the ratio of leading coefficients; if the numerator degree is one higher than the denominator, there is an oblique asymptote; if the numerator degree is more than one higher, the magnitude of the output grows without bound as $x$ increases, with no horizontal or oblique asymptote.
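The degree comparison for rational functions can be sketched as a short classifier. The function name and the returned labels are hypothetical choices for this illustration. It also covers the case where the numerator degree is lower than the denominator degree, which gives a horizontal asymptote at 0.

```python
def rational_end_behavior(num_degree, den_degree, num_lead, den_lead):
    """Classify end behavior of p(x)/q(x) from degrees and leading coefficients.

    Hypothetical helper. Returns (kind, value), where value is the height of
    the horizontal asymptote, or None when there is no horizontal asymptote.
    """
    if num_lead == 0 or den_lead == 0:
        raise ValueError("leading coefficients must be nonzero")
    if num_degree < den_degree:
        return ("horizontal", 0.0)
    if num_degree == den_degree:
        # Horizontal asymptote at the ratio of leading coefficients.
        return ("horizontal", num_lead / den_lead)
    if num_degree == den_degree + 1:
        return ("oblique", None)
    # Numerator degree exceeds denominator degree by two or more.
    return ("unbounded", None)

# (200 + 3n) / n: equal degrees, leading coefficients 3 and 1.
print(rational_end_behavior(1, 1, 3, 1))
```

The example call uses an average-cost style model with equal degrees, so the classifier reports a horizontal asymptote at the ratio of the leading coefficients.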

Contextual constraints that eliminate candidates include impossible outputs (e.g., negative cost, negative population) and required asymptotic behavior. Always eliminate inappropriate candidates first before fitting parameters or comparing fit.

Worked Example

Problem: A city planner models the total energy consumption $E$ (in megawatt-hours) of a new residential neighborhood as a function of the number of homes $h$ , where $h > 0$ . Context tells us: (1) Total energy consumption grows without bound as the number of homes becomes very large, increasing at a roughly constant rate. (2) If the neighborhood has 0 homes, total energy consumption for homes is 0. Which model is most appropriate, and why?

Solution:

List required properties: End behavior as $h \to + \infty$ is $E (h) \to + \infty$ with constant slope, and $E (0) = 0$ (finite output at 0, no vertical asymptote).
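Eliminate even-degree polynomials: A quadratic or other even-degree polynomial does not increase at a constant rate, because its rate of change keeps changing as $h$ grows. Its matching end behavior would also give $E (h) \to + \infty$ as $h \to - \infty$ , which predicts positive energy use for negative (contextually meaningless) numbers of homes.
Eliminate rational models: A rational model with a numerator one degree higher than the denominator only approaches a constant slope asymptotically, and its nonconstant denominator adds behavior (such as a vertical asymptote at any real root of the denominator) that nothing in the context calls for. This is unnecessary here because a constant rate with $E (0) = 0$ is described exactly by a simpler model.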
The only matching model is a linear (first-degree) polynomial: $E (h) = k h$ where $k > 0$ is energy consumption per home, satisfying all required constraints.

Exam tip: Always start with end behavior and domain constraints first, before worrying about parameter values. 70% of AP MCQ model selection questions can be answered just by eliminating candidates that don't match context, no calculation needed.

3. Articulating Explicit Modeling Assumptions

Once you select a model, AP Precalculus requires you to explicitly state the assumptions that underpin your choice. All models are simplifications of real-world complexity, and assumptions are the unstated constraints you accept to use the model. Assumptions must be tied directly to the form of your model, not generic statements like "measurements are accurate."

Common model-specific assumptions for polynomial and rational models include: constant marginal change (for linear models), constant acceleration (for quadratic projectile models), fixed carrying capacity (for rational population models), and no external unmodeled factors (e.g., no change in material cost when modeling production cost). FRQ questions almost always require at least one stated assumption for full credit, even if you correctly select the model.

Worked Example

Problem: A civil engineer uses a quadratic model $h (t) = - 16 t^{2} + 64 t + 12$ to model the height of water from a fire hose, where $t$ is time in seconds after the water leaves the nozzle. Name one key assumption the engineer is making that is required for this model to be valid, and explain why it is necessary.

Solution:

The core assumption here is that air resistance on the water stream is negligible, and acceleration due to gravity is constant.
This assumption is necessary because a quadratic model has a constant second derivative (constant acceleration), which only holds if acceleration does not change with speed or time.
If air resistance were not negligible, the acceleration of the water would depend on its speed instead of staying constant, so the height would no longer be a quadratic function of time and a different, more complicated model would be needed.
This assumption directly impacts model predictions: ignoring air resistance leads the quadratic model to overpredict the maximum height and range of the water stream.
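One way to see the constant-acceleration assumption in the numbers is to tabulate the model at equally spaced times and take differences twice. The sketch below uses the model from the problem; the choice of one-second time steps is an assumption made for illustration.

```python
def h(t):
    """Height model from the worked example: h(t) = -16 t^2 + 64 t + 12."""
    return -16 * t**2 + 64 * t + 12

# Heights at t = 0, 1, 2, 3, 4 seconds (one-second steps are an illustrative choice).
heights = [h(t) for t in range(5)]

# First differences: change in height over each one-second interval.
first_differences = [b - a for a, b in zip(heights, heights[1:])]

# Second differences: change in the first differences.
second_differences = [b - a for a, b in zip(first_differences, first_differences[1:])]

# Vertex of a quadratic a t^2 + b t + c is at t = -b / (2a).
t_peak = -64 / (2 * -16)
peak_height = h(t_peak)

print(heights)
print(first_differences)
print(second_differences)
print(t_peak, peak_height)
```

The second differences come out identical, which is the discrete counterpart of the constant acceleration that the quadratic form assumes. Data from a real stream with significant air resistance would not show this pattern.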

Exam tip: For FRQ, always tie your assumption to the function form you selected. Generic assumptions will not earn credit; you need to connect the assumption to why your model is the right choice instead of another model.

4. Comparing Model Fit and Validating Extrapolation

When multiple candidate models satisfy the basic contextual constraints, you next compare how well they fit observed data and whether their extrapolations (predictions outside the range of observed data) make contextual sense. A model that fits the observed data well can still be invalid if it extrapolates to impossible values.

The standard metric for fit is the sum of squared errors (SSE), which sums the squared difference between each model prediction and the observed data value: lower SSE means better fit. However, contextual validity of extrapolation always takes priority over SSE on the AP exam. For example, a linear model for average production cost may have a low SSE for small production runs but predict negative average cost for large runs, which is impossible, so it should be rejected in favor of a rational model with the correct asymptotic behavior.

Worked Example

Problem: A manufacturing company has collected the following data for average cost $C (n)$ (in dollars per unit) of producing $n$ units of a custom part:

n	10	50	100	200
C(n)	27	7.4	5.2	4.6
A linear model fit to the data gives $C (n) = - 0.11 n + 16$ , and a rational model gives $C (n) = \frac{200 + 3 n}{n}$ . Which model is better for extrapolating to $n = 1000$ units, and why?

Solution:

Calculate extrapolated values: For the linear model, $C (1000) = - 0.11 (1000) + 16 = - 94$ dollars per unit. For the rational model, $C (1000) = \frac{200 + 3 ( 1000 )}{1000} = 3.2$ dollars per unit.
Contextually, average cost per unit can never be negative, so the linear model's extrapolation is impossible. The rational model has a horizontal asymptote at $\lim_{n \to + \infty} C (n) = 3$ dollars per unit, which matches the real-world expectation that average cost approaches the variable cost per unit for large production runs.
Check fit to observed data: The linear model has an SSE of approximately 268, while the rational model has an SSE of approximately 17, meaning the rational model fits the observed data far better.
Conclusion: The rational model is the better choice for extrapolation.
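The comparison in this example can be reproduced with a few lines of code. The sketch below uses the data table and the two models from the problem; the helper name `sse` and the variable names are hypothetical.

```python
# Observed data from the problem: units produced and average cost per unit.
n_values = [10, 50, 100, 200]
observed_costs = [27, 7.4, 5.2, 4.6]

def linear_cost(n):
    """Linear model from the problem: C(n) = -0.11 n + 16."""
    return -0.11 * n + 16

def rational_cost(n):
    """Rational model from the problem: C(n) = (200 + 3n) / n."""
    return (200 + 3 * n) / n

def sse(model, xs, ys):
    """Sum of squared differences between observed values and model predictions."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))

sse_linear = sse(linear_cost, n_values, observed_costs)
sse_rational = sse(rational_cost, n_values, observed_costs)

print(f"SSE linear:   {sse_linear:.2f}")
print(f"SSE rational: {sse_rational:.2f}")

# Extrapolate both models to n = 1000 and flag impossible (negative) costs.
for name, model in [("linear", linear_cost), ("rational", rational_cost)]:
    prediction = model(1000)
    status = "valid" if prediction >= 0 else "impossible (negative cost)"
    print(f"{name}: C(1000) = {prediction:.2f} -> {status}")
```

Checking the sign of the extrapolated value alongside the SSE keeps both criteria in view: fit on the observed data and contextual validity outside it.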

Exam tip: Always check extrapolated values for contextual sense (no negative counts, costs, or concentrations) even if the model fits observed data well. AP exam questions frequently test this by giving a low-SSE model that extrapolates to an impossible result.

Common Pitfalls (and how to avoid them)

Wrong move: Selecting a polynomial model for a context that requires a vertical asymptote at a finite positive input. Why: Students default to simpler polynomials and forget that polynomials are defined for all real inputs, so they cannot produce infinite output at a finite input. Correct move: Always check if your context requires output to approach infinity at a finite input; if it does, select a rational model with a vertical asymptote at that input.
Wrong move: Stating a generic assumption (e.g., "all data is accurate") instead of a model-specific assumption on FRQ. Why: Students confuse general measurement assumptions with functional form assumptions, which are what AP questions ask for. Correct move: Always tie your assumption to the form of your selected model, e.g., "I assume average revenue per customer is constant, which makes a linear model appropriate instead of a quadratic model."
Wrong move: Selecting an even-degree polynomial for a context where output grows to $+ \infty$ as positive input grows, but the polynomial predicts positive output for all negative inputs. Why: Students forget that even-degree polynomials have matching end behavior on both ends, leading to nonsensical predictions for contextually meaningless negative inputs. Correct move: For contexts where only positive inputs are valid and output grows to $+ \infty$ as $x \to + \infty$ , prioritize odd-degree polynomials unless the context specifically requires even-degree behavior.
Wrong move: Assuming that a model with lower error on observed data is automatically the best choice for extrapolation. Why: Students focus on goodness of fit for existing data and ignore the model's behavior outside the observed data range. Correct move: Always check the model's behavior at the edges of the domain for contextual validity before selecting it for extrapolation.
Wrong move: Forgetting to restrict the model domain to contextually valid inputs, leading to incorrect interpretation of results. Why: Students assume the entire mathematical domain of the function is valid, even for inputs that do not make sense in context. Correct move: Always state the domain of your model explicitly, e.g., $n > 0$ for number of units, to avoid interpreting invalid inputs as meaningful.

Practice Questions (AP Precalculus Style)

Question 1 (Multiple Choice)

A chemist models the concentration $C$ (in mol/L) of a solute in a solution as a function of the total mass $m$ (in grams) of solute added to 1 L of water, where $0 < m < 360$ g. The maximum solubility of the solute is 360 g per liter, so as $m$ approaches 360 g from below, concentration approaches infinity (the solution becomes fully saturated, and no more solute can dissolve). As $m$ approaches 0 from above, concentration approaches 0. Which of the following models is most appropriate for this context?

A) Quadratic: $C (m) = a m^{2} + bm$ , $a > 0$ , $b > 0$
B) Rational: $C (m) = \frac{k m}{360 - m}$ , $k > 0$
C) Rational: $C (m) = \frac{k m}{360 + m}$ , $k > 0$
D) Linear: $C (m) = k m$ , $k > 0$

Worked Solution: First, the context requires a vertical asymptote at $m = 360$ , which only occurs for rational functions with a zero in the denominator at $m = 360$ . This eliminates the polynomial models A and D. Next, check the denominator for the remaining options: Option C has a denominator of $360 + m$ , which never equals zero for $0 < m < 360$ , so it has no vertical asymptote at 360, and does not produce infinite concentration at maximum solubility. Only Option B has a zero in the denominator at $m = 360$ , satisfies $C (0) = 0$ , and matches all required contextual behavior. Correct answer: B.
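A quick numerical check makes the difference between options B and C concrete. The sketch below evaluates both rational candidates as $m$ approaches 360 from below; the value $k = 1$ is an arbitrary assumption, since the problem only states $k > 0$.

```python
def option_b(m, k=1.0):
    """Option B: C(m) = k m / (360 - m). k = 1 is an arbitrary illustrative value."""
    return k * m / (360 - m)

def option_c(m, k=1.0):
    """Option C: C(m) = k m / (360 + m). k = 1 is an arbitrary illustrative value."""
    return k * m / (360 + m)

# Approach m = 360 from below and compare the two candidates.
for m in (300, 350, 359, 359.9):
    print(f"m = {m}: B = {option_b(m):.2f}, C = {option_c(m):.4f}")
```

Option B grows rapidly as $m$ nears 360, while option C stays below $k/2$ on the whole interval, so only B shows the required vertical asymptote.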

Question 2 (Free Response)

A café owner wants to model the total daily profit $P$ (in dollars) as a function of the number of customers $x$ that visit in a day, where $x \geq 0$ . The café has fixed daily costs of \$500, and the average profit per customer is approximately \$15. (a) Identify an appropriate polynomial or rational function model for $P (x)$ based on the context, and explain why your model is appropriate. (b) State one key assumption that the café owner is making when using this model, and explain why the assumption is necessary. (c) Explain why an even-degree quadratic polynomial model would not be appropriate for this context.

Worked Solution:
(a) The appropriate model is the linear polynomial $P (x) = 15 x - 500$ . This is appropriate because profit increases at a constant rate of \$15 per customer, which matches the constant slope of a linear polynomial. When $x = 0$ (no customers), profit is $- 500$ (covering fixed costs), which matches the context, and end behavior as $x \to + \infty$ is $P (x) \to + \infty$ , which matches the expectation that profit grows with more customers.
(b) One key assumption is that average profit per customer is constant, regardless of how many customers visit. This assumption is necessary because a linear model has a constant slope, which corresponds to constant marginal profit per customer. If average profit decreased for large numbers of customers (e.g., due to overtime labor costs), the linear model would overestimate profit for large $x$ , and a different model would be required.
(c) An even-degree quadratic polynomial with positive leading coefficient has end behavior $P (x) \to + \infty$ as both $x \to + \infty$ and $x \to - \infty$ . Negative $x$ (negative number of customers) is contextually meaningless, and the quadratic model incorrectly predicts positive profit for negative $x$ , which is impossible. Additionally, a quadratic model would have profit increasing at an increasing rate, which implies average profit per customer grows with the number of customers, contradicting the given context that average profit is constant.

Question 3 (Application / Real-World Style)

The population of deer in a wildlife management area is modeled as a function of the number of years $t$ since 2010, where $t \geq 0$ . Observed population counts are: $t = 0$ (2010): 250 deer, $t = 5$ : 320 deer, $t = 10$ : 380 deer, $t = 15$ : 410 deer. The area has a maximum carrying capacity of 500 deer due to limited food and space. A park ranger is comparing two models: (1) a linear model $P_{1} (t) = 11 t + 250$ , and (2) a rational model $P_{2} (t) = \frac{500 t + 6250}{t + 25}$ . Which model is more appropriate for predicting the deer population in 2050 ( $t = 40$ ), and what does the appropriate model predict? Interpret your result in context.

Worked Solution: First, the context requires that the population approaches a horizontal asymptote at 500 deer (the carrying capacity) as $t$ becomes large. For the linear model, $\lim_{t \to + \infty} P_{1} (t) = + \infty$ , which predicts unbounded population growth that violates the carrying capacity constraint. For the rational model, $\lim_{t \to + \infty} P_{2} (t) = \frac{500}{1} = 500$ , which matches the carrying capacity constraint. Calculate the prediction for 2050: $P_{2} (40) = \frac{500 ( 40 ) + 6250}{40 + 25} = \frac{26250}{65} \approx 404$ . The linear model would predict $P_{1} (40) = 11 (40) + 250 = 690$ , which far exceeds the 500 deer carrying capacity. The rational model is the more appropriate choice. In context, this means the model predicts a deer population of approximately 404 in 2050, with growth continuing to slow as the population approaches the maximum sustainable size of 500 deer. Note that $P_{2}$ with these parameters underestimates the observed counts (for example, $P_{2} (15) \approx 344$ versus 410 observed), so the numerical prediction should be treated as rough; the rational form is preferred because of its correct long-run behavior, not because of its fit to the four data points.
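The carrying-capacity check in this solution can be sketched in code. The two models and the data come from the problem; the constant name `CAPACITY` and the layout of the comparison are illustrative choices.

```python
CAPACITY = 500  # maximum carrying capacity stated in the problem

# Observed counts from the problem: years since 2010 -> deer population.
observed = {0: 250, 5: 320, 10: 380, 15: 410}

def p1(t):
    """Linear model from the problem: P1(t) = 11 t + 250."""
    return 11 * t + 250

def p2(t):
    """Rational model from the problem: P2(t) = (500 t + 6250) / (t + 25)."""
    return (500 * t + 6250) / (t + 25)

# Compare both models with the observed data.
for t, count in observed.items():
    print(f"t = {t}: observed {count}, P1 = {p1(t):.1f}, P2 = {p2(t):.1f}")

# Extrapolate to 2050 (t = 40) and test each prediction against the capacity.
for name, model in [("P1", p1), ("P2", p2)]:
    prediction = model(40)
    verdict = "within capacity" if prediction <= CAPACITY else "exceeds capacity"
    print(f"{name}(40) = {prediction:.1f} -> {verdict}")
```

Printing the predictions next to the observed counts also shows how closely each model tracks the data, which is a separate question from whether its long-run behavior respects the carrying capacity.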

Quick Reference Cheatsheet

Category	Formula / Rule	Notes
Linear (1st-degree polynomial)	$f (x) = a x + b$	Applies for constant rate of change; no vertical asymptotes; odd-degree end behavior
Quadratic (2nd-degree polynomial)	$f (x) = a x^{2} + b x + c$	Applies for constant rate of change of the slope (constant second difference); used for projectile motion, profit models; even-degree end behavior
Cubic (3rd-degree polynomial)	$f (x) = a x^{3} + b x^{2} + c x + d$	Applies for constant third difference; used for volume models; odd-degree end behavior
General polynomial	$f (x) = \sum_{k = 0}^{n} a_{k} x^{k}$	Defined for all real $x$ ; no vertical asymptotes; end behavior determined by degree $n$ and leading coefficient
Rational function	$f (x) = \frac{p ( x )}{q ( x )}$ , $p (x), q (x)$ polynomials	Vertical asymptotes at non-cancelled roots of $q (x)$ ; used for contexts with asymptotic behavior
Rational with horizontal asymptote	$\lim_{x \to + \infty} f (x) = L$	Applies for carrying capacity, minimum average cost, saturation concentration
Rational with vertical asymptote	$q (a) = 0, p (a) \neq 0$	Applies for contexts where output approaches infinity at finite input (e.g., solute saturation)
Model selection rule	Lower sum of squared error = better fit to observed data	Extrapolation context validity always takes priority over fit to observed data
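The difference patterns in the table can be checked mechanically when outputs are given at equally spaced inputs. The following is a minimal sketch with hypothetical helper names; it assumes exact values at equally spaced inputs, so real measured data would need a tolerance instead of an exact equality test.

```python
def differences(values):
    """Differences between consecutive values in a list."""
    return [b - a for a, b in zip(values, values[1:])]

def constant_difference_order(values, max_order=5):
    """Return the smallest k for which the k-th differences are all equal.

    Hypothetical helper. Assumes equally spaced inputs and exact outputs.
    Returns None if no constant difference is found with at least two
    entries to compare. Constant data reports order 1 with differences of 0.
    """
    current = list(values)
    for order in range(1, max_order + 1):
        current = differences(current)
        if len(current) < 2:
            return None
        if all(d == current[0] for d in current):
            return order
    return None

inputs = range(6)
print(constant_difference_order([2 * x + 1 for x in inputs]))  # linear data
print(constant_difference_order([x**2 for x in inputs]))       # quadratic data
print(constant_difference_order([x**3 for x in inputs]))       # cubic data
```

Linear data has constant first differences, quadratic data has constant second differences, and cubic data has constant third differences, so the reported order matches the degree of the polynomial that generated the values.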

What's Next

This topic establishes the core process of contextual model selection and assumption validation that you will use across the entire AP Precalculus course. Immediately after mastering this topic for polynomial and rational functions, you will apply the exact same process to exponential and logarithmic models in Unit 2, so mastering the process here will make that transition far smoother. This topic is also foundational for the AP Precalculus cross-cutting theme of mathematical modeling, which makes up approximately 30% of the total exam score. Without mastering how to match functional behavior to context and articulate your assumptions, you will lose significant points on FRQ questions that require justifying model choices.
