In general, $\lVert(a,b)\rVert^2=(a,b)^T(a,b)=a^Ta+b^Tb=\lVert a\rVert^2+\lVert b\rVert^2$
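A quick numerical sketch of the stacked-norm identity (vectors here are arbitrary examples, using NumPy):

```python
import numpy as np

# Arbitrary example vectors a and b
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0])

stacked = np.concatenate([a, b])            # the stacked vector (a, b)
lhs = np.linalg.norm(stacked) ** 2          # ||(a, b)||^2
rhs = np.linalg.norm(a) ** 2 + np.linalg.norm(b) ** 2

assert np.isclose(lhs, rhs)                 # ||(a,b)||^2 = ||a||^2 + ||b||^2
```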

Linear Least Squares

$\min_x \frac{1}{2}\lVert Ax-b\rVert^2_2$

If $A$ has full column rank ($m \ge n$), then $x_{LS}$ uniquely solves the normal equations

$A^TAx_{LS} = A^Tb$

$x_{LS} = (A^TA)^{-1}A^Tb$

$\bar{y}=Ax_{LS}=A(A^TA)^{-1}A^Tb$ ($A(A^TA)^{-1}A^T$ is the orthogonal projector onto $range(A)$, so $\bar{y}$ is the projection of $b$ onto $range(A)$)
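A minimal sketch of the normal-equations solve, checked against NumPy's built-in least-squares routine (the matrix and right-hand side are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))   # m > n; full column rank with probability 1
b = rng.standard_normal(6)

# x_LS = (A^T A)^{-1} A^T b, computed via a linear solve rather than an inverse
x_ls = np.linalg.solve(A.T @ A, A.T @ b)

# Reference solution from the library's least-squares routine
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_ls, x_ref)

# The residual b - A x_LS is orthogonal to range(A): A^T (b - A x_LS) = 0
assert np.allclose(A.T @ (b - A @ x_ls), 0)
```

In practice one avoids forming $A^TA$ explicitly (e.g. via QR), precisely because of the conditioning issue discussed next.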

“Conditioning” of a matrix determines the attainable accuracy of a linear system solve. For symmetric positive definite $M$: $cond(M)=\frac{\lambda_{max}(M)}{\lambda_{min}(M)}$

If $M$ is not square (or not symmetric positive definite):

$cond(M)=\frac{\sigma_{max}(M)}{\sigma_{min}(M)}$, where $\sigma_i$ are the singular values of $M$

For $M = A^TA$: $cond(A^TA)=cond(A)^2$, since

$\lambda_i(A^TA)=\sigma_i(A)^2$
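The squaring of the condition number can be checked numerically (random test matrix; singular values via SVD, eigenvalues via a symmetric eigensolver):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))

sigma = np.linalg.svd(A, compute_uv=False)   # singular values of A, descending
lam = np.linalg.eigvalsh(A.T @ A)[::-1]      # eigenvalues of A^T A, descending

# lambda_i(A^T A) = sigma_i(A)^2
assert np.allclose(lam, sigma**2)

# cond(A^T A) = cond(A)^2
cond_A = sigma.max() / sigma.min()
cond_AtA = lam.max() / lam.min()
assert np.isclose(cond_AtA, cond_A**2)
```

This is why solving the normal equations directly loses roughly twice as many digits as a method that works on $A$ itself.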