Linearity of Regression Equation


Well-Known Member
Interestingly, I have seen that a regression equation is linear if and only if it is linear in the parameters, i.e., if it is linear in the coefficients. (I have also seen definitions of linear regression that require linearity in both the coefficients and the independent variables.)

Operating under the requirement that the equation be linear in the parameters, I have the following question:

Y = B0 + B1X (simple linear regression equation.)

Let's assume B0 = 1 and B1 = 9.

Then we have Y = 1 + 9X (Still a simple linear regression equation.)

However, we could also write this as

Y = 1 + (3^2) X

(i.e., Y = 1 + (B1^2) X, which would NOT be deemed a linear regression.)

What am I missing here? (It's probably something a bit abstract, related to B1-hat being an estimate and the difference between a population regression equation and a sample regression equation... not sure.)
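For concreteness, here is a quick numeric sketch (my own hypothetical data, generated without noise) showing that once the slope has been estimated as the number 9, nothing about the fit changes if we later rewrite 9 as 3^2; the "linear in parameters" condition is about the unknown B1 inside the estimation, not its numeric value:

```python
# Fit Y = B0 + B1*X by ordinary least squares on noiseless data
# generated with B0 = 1 and B1 = 9 (hypothetical sample).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1 + 9 * xi for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope: sum of cross-deviations over sum of squared x-deviations
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
print(b0, b1)  # recovers 1.0 and 9.0; writing 9 as 3^2 changes nothing here
```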



David Harper CFA FRM

Staff member
Hi Brian,

Good to see you can post in the forum :cool: I will be grateful if a member with more expertise can help (this is straight-up econometrics and I'd like to see a better answer myself ...), but my understanding, related to your hunch, is that it's the essence of the Gauss–Markov theorem: http://en.wikipedia.org/wiki/Gauss–Markov_theorem

In my words, the linearity assumption enables us to perform the relatively simple OLS estimation (it really is not complex; it's shown in our regression learning XLS: the coefficient estimates, slope and intercept, are ultimately sample means) and produce so-called "best" BLUE estimators (U = unbiased). Shorter version: linearity enables the generation of "best" estimates via relatively simple calculations. I *think* that if OLS estimators are applied to a model that is nonlinear in the parameters, the produced estimates will technically be biased (other methods are available for that case).

While it's true that 3^2 = 9, the regression does not know the constant parameter (B). The regression applies an estimator to (cooks a "recipe" with) the sample in order to infer the unknown (B), and the calculation used to infer (B) in OLS very much depends on whether the unknown is (B) or (B^2).

A statistician might laugh at this, but I just think of a simple example:
  • say the true parameter B = 2 and a very small sample of three nicely surrounds it with {1, 2, 3}
  • If we use the familiar average (an estimator: a recipe) of the sample, average(1, 2, 3) = 2, we correctly estimate the parameter; the "sample mean is an unbiased estimator."
  • But say we instead infer from the squared sample, B^2-style: {1, 4, 9}
  • Estimation of the squared set = average(1, 4, 9) = 4.67, and SQRT(4.67) = 2.16; knowing we observe B^2, we probably want a different recipe (estimator) to infer the unknown parameter. This isn't OLS, of course; it just illustrates how knowledge of B^2 does not leave us indifferent to the recipe used (e.g., the sample mean).
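A quick numeric check of the toy example above (illustrative only; this averaging is not OLS):

```python
# True parameter B = 2; the small sample {1, 2, 3} surrounds it.
sample = [1, 2, 3]
mean = sum(sample) / len(sample)        # sample mean recovers B = 2 exactly

# Averaging the SQUARED sample and square-rooting does NOT recover B:
squared = [s ** 2 for s in sample]      # {1, 4, 9}
mean_sq = sum(squared) / len(squared)   # 4.67 (i.e., 14/3)
root = mean_sq ** 0.5                   # 2.16, above the true B = 2
print(mean, round(mean_sq, 2), round(root, 2))
```

The gap between 2.00 and 2.16 is the point: the same "average" recipe applied to the squared observations no longer centers on the true parameter.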
But I will take any expert help, thanks,

Arka Bose

Active Member
Hi Brian,

The way you are thinking about linearity, like 3^2 = 9 where 9 is the beta, is totally incorrect.

Basically, if you break down the beta, it is Σ(x - xbar)(y - ybar) / Σ(x - xbar)^2

This is in the form Σ α (y - ybar), where α is just a number, the weight of each observation: α = (x - xbar) / Σ(x - xbar)^2

Here, linearity in the parameters actually refers to (y - ybar): the estimator has to be linear in (y - ybar).
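A small sketch of this decomposition (my own made-up sample), confirming that the OLS slope is exactly a weighted sum of the (y - ybar) terms with weights that depend only on x:

```python
# Hypothetical sample data for illustration
x = [1.0, 2.0, 4.0, 7.0]
y = [2.1, 3.9, 8.2, 13.8]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Direct formula: sum of cross-deviations over sum of squared x-deviations
beta_direct = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

# Same slope as a LINEAR combination: beta = sum(alpha_i * (y_i - y_bar)),
# where alpha_i = (x_i - x_bar) / sxx depends only on the x's
alpha = [(xi - x_bar) / sxx for xi in x]
beta_weighted = sum(a * (yi - y_bar) for a, yi in zip(alpha, y))

print(beta_direct, beta_weighted)  # identical values
```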

Source :- http://www.infocobuild.com/educatio...omics/economics421-winter2009-mark-thoma.html

Watch lecture 1 (Mark Thoma is a brilliant professor and I regularly follow some of his writings on his blog)

I was deeply into linear regression too and watched his whole lecture series; however, Lecture 1 would be more than sufficient for the FRM.


Kaiser

Linearity refers to X (to the power 1) and Y (to the power 1), not to the constants B0 and B1 (they are not "variates"; they are constants).

Otherwise you could rewrite any constant endlessly: 9 = 3^2 = exp(ln 9) = sqrt[abs(9) x abs(9)] ... but in the end what matters is that y = f(x) with the function f a straight line, such that f(x + z) = f(x) + f(z)

Arka Bose

Active Member
Hi @Kaiser ,

Linearity in regression actually means the parameters B0, B1, etc. enter linearly.
The variables may or may not be linear, since we can transform a nonlinear variable to obtain an equation that is linear in the (transformed) variables.
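As a sketch of that last point (hypothetical noiseless data), Y = B0 + B1*X^2 is nonlinear in the variable X but linear in the parameters, so substituting Z = X^2 lets plain OLS recover B0 and B1:

```python
# Data generated from Y = 3 + 2*X^2, i.e., B0 = 3 and B1 = 2 (no noise)
x = [1.0, 2.0, 3.0, 4.0]
y = [3 + 2 * xi ** 2 for xi in x]

z = [xi ** 2 for xi in x]  # transformed regressor Z = X^2

n = len(z)
z_bar = sum(z) / n
y_bar = sum(y) / n

# Ordinary OLS on (Z, Y): the model is linear in B0 and B1
b1 = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / sum(
    (zi - z_bar) ** 2 for zi in z
)
b0 = y_bar - b1 * z_bar
print(b0, b1)  # recovers 3.0 and 2.0
```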