Normal Equations
- Linear prediction in multi-dimensional spaces
- Normal equations and projections
- Different cases and pseudo-inversion
Polynomial Approximation:
In case of a polynomial function:
y = w0 + w1 x + w2 x^2 + … + wk x^k
we can see it as a linear multi-variable function by treating each power of x as a separate input variable:
y = w0 + w1 z1 + w2 z2 + … + wk zk,   with zi = x^i
The model is then linear in the weights w0, …, wk, so the same normal-equation machinery applies.
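The idea above can be sketched in NumPy (a minimal illustration, not part of the original notes; the cubic and its coefficients are made up for the example): each power of x becomes one row of the data matrix, and the weights are found with the normal equations.

```python
import numpy as np

# Illustrative cubic: y = 2 - x^2 + 0.5 x^3, sampled on a grid.
x = np.linspace(-2.0, 2.0, 50)
y = 2.0 - x**2 + 0.5 * x**3

# One column per observation; rows are the features (1, x, x^2, x^3).
X = np.vstack([np.ones_like(x), x, x**2, x**3])

# Least-squares weights via the normal equations: w = (X X^T)^{-1} X y
w = np.linalg.solve(X @ X.T, X @ y)
print(np.round(w, 6))  # ≈ [2, 0, -1, 0.5]
```

Because the targets are generated exactly by the cubic, the recovered weights match the true coefficients up to floating-point error.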
Normal Equation - Solutions
To solve a normal equation we need more data points than unknown weights, i.e. enough data to learn the weights.
Given the matrix X, the “example” matrix (or data matrix): we put in each column one observation, plus the bias input (usually 1). The number of columns corresponds to the total number of observations we took; the number of rows corresponds to the number of features plus the bias.
- More rows = more parameters = more weights to learn, but a more “complete” solution, i.e. one where the edge cases of the problem are taken into account
- More columns = more data = more precise weights
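The data-matrix convention described above can be sketched as follows (a small made-up example; the numbers are illustrative): each column holds one observation, with the bias input of 1 in the first row.

```python
import numpy as np

# Three observations, two input features each (illustrative values).
observations = np.array([[0.5, 1.2],
                         [1.0, 0.3],
                         [2.0, 2.5]])

# One column per observation; the bias input 1 occupies the first row.
X = np.vstack([np.ones(len(observations)), observations.T])
print(X.shape)  # (3, 3): rows = bias + features, columns = observations
```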
Given:
- X : information matrix (one observation per column, bias row included)
- w* : “perfect” weight matrix, such that X^T w* = t
- t : target matrix
We have that rank(X) ≤ min(number of rows, number of columns), where the right-hand side is the maximum rank. And we can say that, assuming X has maximum rank, X X^T is invertible and the least-squares weights satisfy the normal equation:
X X^T w = X t
So:
w = (X X^T)^(-1) X t
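A minimal sketch of this solution on synthetic data (the sizes, true weights, and noise level are assumptions for the example): solving the normal equation directly, and checking it against the pseudo-inverse route mentioned in the outline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overdetermined case: d = 3 unknown weights, n = 20 observations (n > d).
d, n = 3, 20
X = np.vstack([np.ones(n), rng.standard_normal((d - 1, n))])  # shape (d, n)
w_true = np.array([1.0, -2.0, 0.5])
t = X.T @ w_true + 0.01 * rng.standard_normal(n)  # slightly noisy targets

# Normal equations: X X^T w = X t
w_ne = np.linalg.solve(X @ X.T, X @ t)

# Same solution via the Moore-Penrose pseudo-inverse of X^T
w_pinv = np.linalg.pinv(X.T) @ t

print(np.allclose(w_ne, w_pinv))  # the two routes agree
```

When X has maximum rank the two computations coincide; the pseudo-inverse form is the one that generalizes to the rank-deficient cases listed in the outline.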