Learning Methods for ANN

First we need to decide on a cost function and on how many training samples are considered for each cost evaluation (one in online mode, a batch in batch mode).

For each epoch we need to apply the Gradient Descent formula to each weight.

  • This formula is only a reference: on its own it can be used only to calculate the weight updates ($\Delta w$) of the output layer. The more general formula is the Delta Rule.

Batch Criterion Function
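
A plausible form of the batch criterion, assuming a sum-of-squares cost averaged over $N$ training samples and $c$ output dimensions, can be sketched as below. The symbol names ($J$ for the cost, $y^{(n)}_k$ for the target and $\hat{y}^{(n)}_k$ for the prediction of output $k$ on sample $n$) and the $\frac{1}{2N}$ normalisation are assumptions; other cost functions fit the same scheme.

```latex
J(w) = \frac{1}{2N} \sum_{n=1}^{N} \sum_{k=1}^{c} \left( y^{(n)}_k - \hat{y}^{(n)}_k(w) \right)^2
```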

Where:

  • $D$: the training set (supervised)
  • $w$: weight
  • $N$: number of samples in the training set
  • $c$: dimension of the output
  • $\hat{y}$: predicted output given by the ANN
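
As a rough illustration, the batch criterion could be computed as in the following sketch. It assumes a half-squared-error cost averaged over all samples; the function name `batch_cost` and the toy targets and predictions are hypothetical.

```python
def batch_cost(targets, predictions):
    """Half squared error, summed over output dimensions and averaged over N samples.

    targets, predictions: lists of N samples, each a list of c output values.
    The squared-error form and the 1/(2N) factor are assumptions of this sketch.
    """
    n_samples = len(targets)
    total = 0.0
    for target, prediction in zip(targets, predictions):
        # Sum the squared error over the c output dimensions of one sample.
        total += sum((t - p) ** 2 for t, p in zip(target, prediction))
    return total / (2.0 * n_samples)


# Hypothetical toy data: N = 2 samples, c = 2 outputs each.
targets = [[1.0, 0.0], [0.0, 1.0]]
predictions = [[0.8, 0.2], [0.4, 0.6]]
cost = batch_cost(targets, predictions)
print(cost)
```

A single cost value summarises the whole training set, so one weight update per epoch is based on all samples at once.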

Online Criterion Function

The difference from the Batch Criterion is that in online mode only one output (a single training sample) is considered when calculating the cost $J$, while in batch mode each cost considers $N$ outputs (a batch).
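
The contrast can be sketched in code. Assuming a squared-error cost, the online cost of one sample would be $\frac{1}{2}\sum_{k}(y_k - \hat{y}_k)^2$; the function name `online_cost` and the toy data are hypothetical.

```python
def online_cost(target, prediction):
    """Half squared error of a single training sample (assumed cost form)."""
    return 0.5 * sum((t - p) ** 2 for t, p in zip(target, prediction))


# Hypothetical toy data: 2 samples, 2 outputs each.
targets = [[1.0, 0.0], [0.0, 1.0]]
predictions = [[0.8, 0.2], [0.4, 0.6]]

# Online mode: one cost (and hence one possible weight update) per sample.
per_sample_costs = [online_cost(t, p) for t, p in zip(targets, predictions)]

# Averaging the per-sample costs gives a batch-style cost over the same data.
mean_cost = sum(per_sample_costs) / len(per_sample_costs)
print(per_sample_costs, mean_cost)
```

Under these assumptions the online costs differ from sample to sample, which is why online updates are noisier than batch updates.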


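Gradient Descent

$$w \leftarrow w + \Delta w$$

Where $\Delta w$ is calculated as:

$$\Delta w = -\eta \frac{\partial J}{\partial w}$$

Where:

  • $w$: weight
  • $\eta$: learning rate ($\eta > 0$), which is chosen arbitrarily (it can also be non-constant)
  • $J$: Cost Function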
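
A minimal sketch of gradient descent in batch mode follows. It assumes a single linear unit standing in for the network and a half-squared-error cost averaged over the samples; all names (`predict`, `batch_gradient`, `gradient_descent`) and the toy data are hypothetical.

```python
def predict(weights, inputs):
    """Output of a single linear unit: the weighted sum of its inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def batch_gradient(weights, samples):
    """Gradient of J(w) = 1/(2N) * sum_n (y_n - w.x_n)^2 with respect to each weight."""
    n_samples = len(samples)
    gradient = [0.0] * len(weights)
    for inputs, target in samples:
        error = predict(weights, inputs) - target
        for i, x in enumerate(inputs):
            gradient[i] += error * x / n_samples
    return gradient


def gradient_descent(samples, weights, learning_rate=0.1, epochs=2000):
    """Apply w <- w + delta_w, with delta_w = -learning_rate * dJ/dw, once per epoch."""
    for _ in range(epochs):
        gradient = batch_gradient(weights, samples)
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


# Hypothetical toy data generated from y = 2*x + 1; the constant 1.0 input acts as a bias.
samples = [((0.0, 1.0), 1.0), ((1.0, 1.0), 3.0), ((2.0, 1.0), 5.0), ((3.0, 1.0), 7.0)]
weights = gradient_descent(samples, [0.0, 0.0])
print(weights)
```

On this toy data the weights should approach the slope and bias used to generate it. A learning rate that is too large would make the updates diverge, which is one reason a non-constant rate is sometimes used.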
