Rosenblatt’s Perceptron Algorithm

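Given a training set $\{(x_i, y_i)\}_{i=1}^{\ell}$ with targets $y_i$ taking values $\pm 1$, find $\hat{w}$ and $t$ such that the hyperplane perpendicular to $\hat{w}$ correctly separates the examples and $t$ is the number of times that $\hat{w}$ is updated.

  1. INITIALIZE: Set $\hat{w}_0 \leftarrow 0$, $t \leftarrow 0$, $j \leftarrow 1$, and $m \leftarrow 0$.
  2. NORMALIZE: Compute $R$ and, for all $i = 1, \dots, \ell$, set $\hat{x}_i \leftarrow (x_i, R)'$.
  3. CARROT OR STICK ?: If $y_j \hat{w}_t' \hat{x}_j \le 0$, set $\hat{w}_{t+1} \leftarrow \hat{w}_t + y_j \hat{x}_j$, $t \leftarrow t + 1$, $m \leftarrow m + 1$.
  4. ALL TESTED ?: Set $j \leftarrow j + 1$; if $j \ne \ell + 1$, go back to step 3.
  5. NO MISTAKES ?: If $m = 0$, the algorithm terminates; set $\hat{w} \leftarrow \hat{w}_t$ and return $(\hat{w}, t)$.
  6. TRY AGAIN: Set $j \leftarrow 1$, $m \leftarrow 0$, and go back to step 3.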
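
The six steps can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation. It assumes that $R$ is the largest Euclidean norm among the inputs, that the update on a mistake is $\hat{w} \leftarrow \hat{w} + y_j \hat{x}_j$, and that the bias is stored as the last weight. The function name `perceptron` and the `max_epochs` safeguard are hypothetical additions for this sketch.

```python
import math


def perceptron(xs, ys, max_epochs=1000):
    """Sketch of the steps above. xs: list of feature tuples, ys: list of +1/-1.

    max_epochs is a safeguard added for this sketch (not one of the steps):
    on data that is not linearly separable the loop would never end.
    """
    # Step 2: R is taken as the largest Euclidean norm (an assumption).
    R = max(math.sqrt(sum(v * v for v in x)) for x in xs)
    x_hat = [tuple(x) + (R,) for x in xs]  # append R as a constant input
    # Step 1: zero weights (bias is the last component) and update counter.
    w = [0.0] * len(x_hat[0])
    t = 0
    for _ in range(max_epochs):
        m = 0  # mistakes in this pass (reset as in step 6)
        for xj, yj in zip(x_hat, ys):  # steps 3-4: scan every example
            if yj * sum(wi * xi for wi, xi in zip(w, xj)) <= 0:
                w = [wi + yj * xi for wi, xi in zip(w, xj)]
                t += 1
                m += 1
        if m == 0:  # step 5: a full pass without mistakes
            return w, t
    raise RuntimeError("no separating hyperplane found within max_epochs")


# Small made-up data set, separable by a line through the origin.
xs = [(2.0, 1.0), (1.0, 3.0), (-1.0, -2.0), (-2.0, -1.0)]
ys = [1, 1, -1, -1]
w, t = perceptron(xs, ys)
```

The returned `w` has one more component than the inputs; the last one multiplies the constant $R$ and plays the role of the bias.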

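Note on the test $y_j \hat{w}_t' \hat{x}_j \le 0$ in step 3: because $y_j$ can only be $+1$ or $-1$ (two classes), correct classification in this case is defined as sign agreement between $y_j$ and $\hat{w}_t' \hat{x}_j$, and the supervisor stops the algorithm only when the signs agree for every example ($y_j \hat{w}' \hat{x}_j > 0$).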

This algorithm can also be seen as a neural network (NN) with a single neuron whose activation function is the sign function.


ReLU

Rectified Linear Unit

Another activation function that can be used in place of the sigmoid function.

Prevents saturation for $x > 0$ but not for $x < 0$.

Also note that the derivative at $x = 0$ doesn’t exist, so we have to specify its value directly in the code (not too difficult).
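
As a small sketch of the point about the derivative, ReLU is $\max(0, x)$ and its derivative can be coded with the value at zero chosen by hand. The function names and the `grad_at_zero` parameter are hypothetical; using 0 at the kink is a common convention, but any value in $[0, 1]$ is a valid choice.

```python
def relu(x):
    """Rectified Linear Unit: max(0, x)."""
    return x if x > 0 else 0.0


def relu_grad(x, grad_at_zero=0.0):
    """Derivative of ReLU; the value at x == 0 is a convention we pick."""
    if x > 0:
        return 1.0  # linear region: no saturation
    if x < 0:
        return 0.0  # flat region: the unit is saturated
    return grad_at_zero  # x == 0: derivative undefined, so we specify it
```

For example, `relu_grad(0.0)` returns the chosen convention, while `relu_grad(-2.0)` is always 0, which is the saturation for negative inputs mentioned above.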


Robust Linear Separation

Rosenblatt’s perceptron algorithm relies on a robust linear separation of the data.

It is robust because every point is kept away from the separating hyperplane by a distance of at least $\delta$. This value is not known a priori; it can be found as the minimum of the distances between the separating line and all the points.
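
To make the "minimum of the distances" concrete, here is a minimal sketch that computes it for a given separator $w \cdot x + b = 0$. The data, the separator, and the function name `margin` are made up for illustration. The result depends on which separator is chosen, so this is the margin of one particular line, not necessarily the best one.

```python
import math


def margin(xs, ys, w, b):
    """Smallest signed distance from the points to the hyperplane w.x + b = 0.

    The result is positive only if every point lies on the side of its label.
    """
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w
        for x, y in zip(xs, ys)
    )


xs = [(2.0, 0.0), (3.0, 1.0), (-1.0, 0.0), (-4.0, 2.0)]
ys = [1, 1, -1, -1]
# Separator: the vertical axis, i.e. w = (1, 0) and b = 0.
delta = margin(xs, ys, w=(1.0, 0.0), b=0.0)
```

Here the closest point is $(-1, 0)$, at distance 1 from the vertical axis, so `delta` is 1.0.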

Also, from calculations and theorems, we get that the number of updates the algorithm needs to find a perfect solution is bounded by a quantity of order $(R/\delta)^2$,

where $R$ is the radius of the space occupied by the points, as shown in the figure.
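
A sketch of the classical convergence argument, given here only as an illustration. It assumes the weights start at zero, each mistake triggers the update $\hat{w}_t = \hat{w}_{t-1} + y_j \hat{x}_j$, the inputs are augmented as $\hat{x}_i = (x_i, R)'$ so that $\|\hat{x}_i\|^2 \le 2R^2$, and there is a unit vector $\hat{a}$ with $y_i \hat{a}' \hat{x}_i \ge \delta$ for every example. The constant 2 comes from the augmentation; other statements of the bound may use a different constant.

```latex
\begin{align*}
\hat{a}'\hat{w}_t &\ge t\,\delta
  && \text{each update adds } y_j \hat{a}'\hat{x}_j \ge \delta \\
\|\hat{w}_t\|^2 &\le \|\hat{w}_{t-1}\|^2 + 2R^2 \le \dots \le 2R^2\,t
  && \text{the cross term } 2 y_j \hat{w}_{t-1}'\hat{x}_j \le 0 \text{ on a mistake} \\
t\,\delta &\le \hat{a}'\hat{w}_t \le \|\hat{w}_t\| \le \sqrt{2t}\,R
  && \text{Cauchy--Schwarz with } \|\hat{a}\| = 1 \\
t &\le 2\left(\frac{R}{\delta}\right)^2
\end{align*}
```

The bound grows when the points spread out (larger $R$) or when they come closer to the separator (smaller $\delta$).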


Linear Separation with more variables

Let’s now get an intuition for why having more features can increase the chance of finding a linear separation in the data.

To do this, let’s look at the opposite case.

Take 3 points in a 2D plane:

All of these configurations can be linearly separated by a single line. Even if we project the points onto a 1D line, they can still be linearly separated:

The first one, for example:

So all of these can be linearly separated.

Now take, for example, these points and their projection:

Notice how the 3 points in the 2D plane can be linearly separated, while the points projected onto the 1D line cannot.
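
The same effect can be checked numerically with a small sketch. The coordinates and labels below are made up for illustration and are not the ones from the figures; the helper `separable_1d` is likewise hypothetical and assumes both classes are present.

```python
def separable_1d(values, labels):
    """True if one threshold puts all +1 on one side and all -1 on the other."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == -1]
    return max(pos) < min(neg) or max(neg) < min(pos)


points = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
labels = [1, -1, 1]

# In 2D the horizontal line x2 = 0.5 works: sign(0.5 - x2) matches each label.
ok_2d = all(y * (0.5 - p[1]) > 0 for p, y in zip(points, labels))

# Projecting onto the first axis drops x2, leaving 0, 1, 2 labelled +, -, +.
projected = [p[0] for p in points]
ok_1d = separable_1d(projected, labels)
```

`ok_2d` is true and `ok_1d` is false: the negative point sits between the two positive ones once the second feature is removed, so no single threshold can separate them.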