Perceptron learning algorithm: Intuition and implementation

Moshood Abidemi
3 min read · Aug 20, 2019

The perceptron is the simplest form of a neural network. It is also called a single-layer neural network because it consists of only one layer of weights: the inputs connect directly to the output, with no hidden layers in between. The perceptron learning rule was proposed by Frank Rosenblatt to model how human nerve cells work. He defined an algorithm that automatically learns the optimal weight coefficients, which are then multiplied by the corresponding input features. The net input derived from this multiplication of weights and features is passed into an activation function. A threshold is defined for this activation function: if the net input meets the threshold, one class label is predicted; otherwise, the other class is predicted. Although a perceptron is meant for binary classification, techniques like One-versus-Rest (OvR) allow it to be used for multi-class classification.

Now, before we jump to the implementation, let’s understand the mathematics a little better. Given input features (x0, x1, x2, …, xn) and corresponding weights (w0, w1, w2, …, wn), the net input z, which is the linear combination of the input features and their weights, is given by the formula:

z = w0x0 + w1x1 + w2x2 + … + wnxn
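
In code, the net input is just a dot product. As a tiny illustration with made-up numbers:

import numpy as np

w = np.array([0.5, -1.0, 2.0])  # weights w0, w1, w2 (w0 plays the role of the bias, with x0 = 1)
x = np.array([1.0, 3.0, 0.5])   # input features x0, x1, x2
z = np.dot(w, x)                # z = 0.5*1.0 + (-1.0)*3.0 + 2.0*0.5 = -1.5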

After calculating the net input z, we put it into our activation function, a unit step function, to predict the outcome ŷ. For example, in a binary classification scenario, we can say our predicted class label ŷ = 1 if z >= 0 and ŷ = -1 if z < 0. What the perceptron algorithm does is find the weights w that correctly classify the points. The algorithm can be summarized in two steps:

1. Initialize the weights to 0 or small random numbers

2. For each training sample, calculate the output ŷ and update the weights. Repeat for a specified number of passes over the training set, or until convergence.

Convergence is reached when the model’s error no longer decreases significantly.

w = w + ∆w

where ∆w = η(y − ŷ)x

η is the learning rate (a constant between 0.0 and 1.0), y is the true class label, ŷ is the predicted outcome, and x is the training sample. After calculating ∆w, we add it to the weight to update it.
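
To make the update concrete, here is a quick worked example with made-up numbers:

eta = 0.1               # learning rate
y_true, y_pred = 1, -1  # a misclassified sample: true label 1, predicted -1
x = 2.0                 # the feature value
delta_w = eta * (y_true - y_pred) * x  # 0.1 * 2 * 2.0 = 0.4

The weight moves by +0.4, nudging future predictions toward the correct class. If the sample had been classified correctly, (y_true − y_pred) would be 0 and the weight would stay unchanged.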

Now that we understand the perceptron learning rule, let’s start the implementation. For this, we will be using Python. The perceptron will be a Python class that fits the model through its fit method and makes predictions through its predict method.
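
A minimal NumPy implementation along these lines might look like the following sketch (the method and attribute names, net_input, predict, and errors_, match the walkthrough below):

import numpy as np

class Perceptron:
    """Perceptron classifier.

    eta : learning rate (a float between 0.0 and 1.0)
    n_iter : number of passes over the training set
    """
    def __init__(self, eta=0.01, n_iter=10):
        self.eta = eta
        self.n_iter = n_iter

    def fit(self, X, y):
        # One weight per feature, plus one extra slot at w[0] for the bias
        self.w = np.zeros(1 + X.shape[1])
        self.errors_ = []
        for _ in range(self.n_iter):
            errors = 0
            for xi, target in zip(X, y):
                # eta * (true label - predicted label); zero if classified correctly
                update = self.eta * (target - self.predict(xi))
                self.w[1:] += update * xi
                self.w[0] += update  # bias update: no multiplication by the feature
                errors += int(update != 0.0)
            self.errors_.append(errors)  # misclassifications per epoch
        return self

    def net_input(self, X):
        # Linear combination z = w . x + bias
        return np.dot(X, self.w[1:]) + self.w[0]

    def predict(self, X):
        # Unit step function: 1 if z >= 0, otherwise -1
        return np.where(self.net_input(X) >= 0.0, 1, -1)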

Now let’s explain what the code above is doing. First, we initialize the weights to 0 using the zeros method in NumPy. We add 1 to the number of features (X.shape[1] + 1) to make room for the bias term, which lives in self.w[0]. Through the net_input method, we calculate the dot product using NumPy’s dot method. The predict method is actually the unit step function: using NumPy’s where method, we predict the class label based on the outcome of the net_input method.

Back to the fit method: we loop for the given number of iterations, and for each training sample we calculate the update, which is the learning rate multiplied by the difference between the true class label and the predicted class label. We then multiply the update by the feature values x and add the result to the previous weights. self.w[0] is the bias, so there is no multiplication by the features for it. Finally, we count the misclassifications for each epoch; these can be accessed through the errors_ attribute.

Now we can start training our model with the Perceptron classifier. The classifier is initialized by specifying the learning rate (eta) and the number of iterations (n_iter):

ppn = Perceptron(eta=0.1, n_iter=10)
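
From there, training and prediction are one call each. For example, on a tiny made-up linearly separable dataset:

import numpy as np

# Made-up data: class 1 where the first feature exceeds the second, class -1 otherwise
X = np.array([[2.0, 1.0], [3.0, 1.5], [1.0, 2.0], [0.5, 3.0]])
y = np.array([1, 1, -1, -1])

ppn = Perceptron(eta=0.1, n_iter=10)
ppn.fit(X, y)

print(ppn.errors_)     # misclassifications per epoch; drops to 0 once the data is separated
print(ppn.predict(X))  # [ 1  1 -1 -1] after convergence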

Note: for a perceptron to converge, the data set must be linearly separable and the learning rate sufficiently small.
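
To see why linear separability matters, try the classic XOR problem (reusing the Perceptron class above): no straight line splits the two classes, so the errors never settle at zero.

import numpy as np

X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([-1, 1, 1, -1])

ppn = Perceptron(eta=0.1, n_iter=10).fit(X_xor, y_xor)
print(ppn.errors_)  # stays above 0 for every epoch; the perceptron cannot converge here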
