The idea of least squares is to find the best-fit
function (e.g., linear, quadratic, or cubic) for a data set, such that
the discrepancy between the function and the data is a minimum. This discrepancy
is usually measured by the error energy, i.e., the sum of the squared
errors (differences) between the best-fit curve and the data.
The error at each point is written as:
$$e_i = d_i - y_i$$
where $d_i$ is the value of the desired curve and $y_i$ is the actual data value at each
point $x_i$, $i = 1, 2, \ldots, N$.
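$I$ is known as the total error energy or the
$\ell_2$ norm (strictly, it is the square of the $\ell_2$ norm of the error). It is widely used in the physical
sciences, and it is defined by the equation:
$$I = \sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N} (d_i - y_i)^2$$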
There are many other measures of error, e.g., the $\ell_1$ norm, which is based on the sum of the absolute values of the differences, $\sum_{i=1}^{N} |d_i - y_i|$.
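As a small illustration of these two error measures, the following Python sketch computes the total error energy and the sum of absolute errors for a candidate curve. The function names `error_energy` and `abs_error_sum` and the sample values are assumptions made for this example; they do not come from the original text.

```python
def error_energy(d, y):
    """Total error energy I: sum of squared errors e_i = d_i - y_i."""
    return sum((di - yi) ** 2 for di, yi in zip(d, y))


def abs_error_sum(d, y):
    """l1-style measure: sum of absolute errors |d_i - y_i|."""
    return sum(abs(di - yi) for di, yi in zip(d, y))


# Hypothetical sample values: d is the candidate curve, y the measured data.
d = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.5, 2.0, 4.0]

print(error_energy(d, y))   # prints 1.25 (errors are 0, -0.5, 1, 0)
print(abs_error_sum(d, y))  # prints 1.5
```

In this example the single largest error contributes 1.0 of the 1.25 total under the squared measure, but only 1.0 of 1.5 under the absolute measure, which shows how the squared measure weights large errors more heavily.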
To find the least squares fit for any function $d$, we must minimize $I$. Assume that the desired function is a straight line, $d_i = b x_i + a$, for $i = 1, \ldots, N$, and that the data values are $y_i$, $i = 1, \ldots, N$. Then:
$$I = \sum_{i=1}^{N} (b x_i + a - y_i)^2$$
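We can now take the partial derivatives of $I$ with respect to $b$ and $a$ and set them to zero to get:
$$\frac{\partial I}{\partial b} = 2\sum_{i=1}^{N} (b x_i + a - y_i)\, x_i = 0$$
$$\frac{\partial I}{\partial a} = 2\sum_{i=1}^{N} (b x_i + a - y_i) = 0$$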
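We can rearrange these equations to get two equations with two unknowns ($a$ and $b$):
$$b\sum_{i=1}^{N} x_i^2 + a\sum_{i=1}^{N} x_i = \sum_{i=1}^{N} x_i y_i$$
$$b\sum_{i=1}^{N} x_i + N a = \sum_{i=1}^{N} y_i$$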
We must now solve these equations simultaneously.
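Before continuing by hand, note that a system of two linear equations in $a$ and $b$ can also be solved numerically. The sketch below is a minimal Python illustration, assuming the straight-line model $b x_i + a$ described above. The function names `normal_equation_sums` and `solve_line` and the sample data are assumptions made for this example, and Cramer's rule is used only because the system is 2x2.

```python
def normal_equation_sums(x, y):
    """Return the sums that appear in the two equations for a and b.

    b * sum(x^2) + a * sum(x) = sum(x * y)
    b * sum(x)   + a * N      = sum(y)
    """
    n = len(x)
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return n, sx, sy, sxx, sxy


def solve_line(x, y):
    """Solve the 2x2 system for intercept a and slope b by Cramer's rule."""
    n, sx, sy, sxx, sxy = normal_equation_sums(x, y)
    det = n * sxx - sx * sx
    if det == 0:
        # Happens when every x_i is the same, so no unique line exists.
        raise ValueError("all x values are equal; the line is not unique")
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b


# Hypothetical data lying exactly on the line y = 2x + 1.
print(solve_line([0, 1, 2, 3], [1, 3, 5, 7]))  # prints (1.0, 2.0)
```

Because the sample points lie exactly on a line, the solver recovers the intercept 1 and the slope 2 with no residual error.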
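Rewriting, with the second equation divided by $N$, we have
$$b\sum_{i=1}^{N} x_i^2 + a\sum_{i=1}^{N} x_i = \sum_{i=1}^{N} x_i y_i$$
$$b\,\frac{1}{N}\sum_{i=1}^{N} x_i + a = \frac{1}{N}\sum_{i=1}^{N} y_i$$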
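From the second equation, we immediately get
$$a = \bar{y} - b\,\bar{x}$$
where
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i$$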
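Now, to solve for $b$, multiply the second equation, written as $b\bar{x} + a = \bar{y}$,
by $N\bar{x}$ and subtract it from the first (noting that $\sum_{i=1}^{N} x_i = N\bar{x}$):
$$b\left(\sum_{i=1}^{N} x_i^2 - N\bar{x}^2\right) = \sum_{i=1}^{N} x_i y_i - N\bar{x}\,\bar{y}$$
which gives
$$b = \frac{\sum_{i=1}^{N} x_i y_i - N\bar{x}\,\bar{y}}{\sum_{i=1}^{N} x_i^2 - N\bar{x}^2}$$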
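The closed-form result can be checked with a few lines of code. The following Python sketch assumes the straight-line model $b x_i + a$ and computes the slope as $b = (\sum x_i y_i - N\bar{x}\bar{y}) / (\sum x_i^2 - N\bar{x}^2)$ and the intercept as $a = \bar{y} - b\bar{x}$. The function name `fit_line` and the sample data are assumptions made for this example.

```python
def fit_line(x, y):
    """Least squares straight line y ~ b*x + a, using the closed-form result.

    Returns the intercept a and the slope b.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    num = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    den = sum(xi * xi for xi in x) - n * x_bar ** 2
    if den == 0:
        # All x_i identical: the slope is undefined.
        raise ValueError("all x values are equal; the slope is undefined")
    b = num / den
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data that do not lie exactly on a line.
a, b = fit_line([1, 2, 3, 4], [2, 4, 5, 7])
print(f"a = {a:.3f}, b = {b:.3f}")  # prints "a = 0.500, b = 1.600"
```

For these sample points the means are 2.5 and 4.5, so the numerator is 53 - 45 = 8 and the denominator is 30 - 25 = 5, giving a slope of 1.6 and an intercept of 4.5 - 1.6 * 2.5 = 0.5.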