Learning Methods for ANN

First we need to choose a cost function and decide how many training samples are seen before each weight update:

We can process all the training samples before taking a step of backpropagation; this uses the Batch Criterion Function.
Or we can take a step of backpropagation after each training sample; this uses the Online Criterion Function.

At each update step we apply the Gradient Descent formula to each weight.

This formula is only a reference: used directly, it gives the $Δ w$ of the output layer only. The more general formula is the Delta Rule.

Batch Criterion Function

$$C (τ, w) = C (W) = \frac{1}{2} \sum_{j=1}^{n} \sum_{i=1}^{m} (\hat{y}_{i} - y_{i})^{2}$$

Where:

$τ = \{(\underline{x_{1}}, \underline{y_{1}}), \dots, (\underline{x_{n}}, \underline{y_{n}})\}$ : the training set (supervised)
$w$ : weight
$n$ : number of samples in the training set (indexed by $j$)
$m$ : dimension of the output (indexed by $i$)
$\hat{y}_{i}$ : predicted output given by the ANN; $y_{i}$ is the corresponding target from the training set.
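The double sum can be read as two nested loops, one over the $n$ samples and one over the $m$ outputs. The following is a minimal sketch in plain Python; the function name `batch_cost` and the sample values are illustrative assumptions, not part of the original notes.

```python
def batch_cost(predictions, targets):
    """Batch criterion: half the summed squared error over n samples and m outputs."""
    total = 0.0
    for y_hat, y in zip(predictions, targets):    # outer sum: the n samples (index j)
        for y_hat_i, y_i in zip(y_hat, y):        # inner sum: the m outputs (index i)
            total += (y_hat_i - y_i) ** 2
    return 0.5 * total


# Hypothetical data: n = 2 samples, m = 2 outputs each.
predictions = [[1.0, 0.0], [0.5, 0.5]]   # outputs predicted by the network
targets = [[1.0, 1.0], [0.0, 0.5]]       # desired outputs from the training set

cost = batch_cost(predictions, targets)
print(cost)  # 0.625
```

The squared errors here are 0 and 1 for the first sample and 0.25 and 0 for the second, so the cost is half of 1.25.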

Online Criterion Function

$$C (W) = \frac{1}{2} \sum_{i=1}^{m} (\hat{y}_{i} - y_{i})^{2}$$

The difference from the Batch Criterion is that in online mode only one training sample is considered when calculating the cost $C (W)$ (summing over its $m$ outputs), while in batch mode each cost considers all $n$ training samples (a batch).
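One way to see how the two criteria relate is to compute the online cost one training sample at a time and then add the results: the total equals the batch cost over the same data. This is a minimal sketch; the name `online_cost` and the sample values are illustrative assumptions.

```python
def online_cost(y_hat, y):
    """Online criterion: half the squared error of ONE sample, summed over its m outputs."""
    return 0.5 * sum((y_hat_i - y_i) ** 2 for y_hat_i, y_i in zip(y_hat, y))


# Hypothetical data: n = 2 samples, m = 2 outputs each.
predictions = [[1.0, 0.0], [0.5, 0.5]]
targets = [[1.0, 1.0], [0.0, 0.5]]

# One cost per training sample (online mode evaluates these one at a time).
per_sample = [online_cost(p, t) for p, t in zip(predictions, targets)]

# Adding the per-sample costs gives the batch cost over the whole set.
batch_total = sum(per_sample)

print(per_sample)   # [0.5, 0.125]
print(batch_total)  # 0.625
```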

Gradient Descent

$$w^{'} = w + Δ w$$

Where $Δ w$ is calculated as:

$$Δ w = - η \frac{\partial C}{\partial w}$$

Where:

$w$ : weight
$η$ : learning rate ( $\in \mathbb{R}$ , typically small and positive), chosen by hand rather than learned (it can also be non-constant)
$C$ : Cost Function
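To make the update rule concrete, here is a minimal sketch that applies it to a single weight in both batch and online mode. The one-weight model $\hat{y} = w x$, the data, the learning rate and the function names are all assumptions chosen for illustration; a real network would compute the gradient for every weight through backpropagation.

```python
def gradient(w, samples):
    """dC/dw for C = 1/2 * sum((w*x - y)^2), assuming the toy model y_hat = w * x."""
    return sum((w * x - y) * x for x, y in samples)


def gradient_descent_step(w, samples, eta):
    """Apply one update: delta_w = -eta * dC/dw, then w' = w + delta_w."""
    delta_w = -eta * gradient(w, samples)
    return w + delta_w


# Hypothetical training set generated by y = 2x, so the ideal weight is 2.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
eta = 0.05  # learning rate, picked by hand for this toy problem

# Batch mode: one update per epoch, using the gradient of the batch cost.
w_batch = 0.0
for epoch in range(100):
    w_batch = gradient_descent_step(w_batch, samples, eta)

# Online mode: one update per training sample, using the online cost.
w_online = 0.0
for epoch in range(100):
    for sample in samples:
        w_online = gradient_descent_step(w_online, [sample], eta)

print(w_batch, w_online)  # both end up very close to 2.0
```

With this learning rate both modes converge to the same weight; a much larger $η$ would make the updates overshoot and diverge, which is why its value has to be chosen with care.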
