Perceptron

The Perceptron is inspired by the information processing of a single neural cell (called a neuron). A neuron accepts input signals via its dendrites, which pass the electrical signal down to the cell body. The axon carries the signal out to synapses, which are the connections of a cell's axon to other cells' dendrites. In a synapse, the electrical activity is converted into molecular activity (neurotransmitter molecules crossing the synaptic cleft and binding with receptors). The molecular binding develops an electrical signal which is passed on to the connected cell's dendrites.

The information processing objective of the technique is to model a given function by modifying internal weightings of input signals to produce an expected output signal. The system is trained using a supervised learning method, where the error between the system’s output and a known expected output is presented to the system and used to modify its internal state. State is maintained in a set of weightings on the input signals. The weights are used to represent an abstraction of the mapping of input vectors to the output signal for the examples that the system was exposed to during training.

The Perceptron consists of a data structure (weights) and separate procedures for training and applying the structure. The structure is really just a vector of weights (one for each expected input) and a bias term.
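
As a concrete illustration, a minimal sketch of this structure in Python might look like the following; the function name init_weights and the convention of storing the bias weight as the last element are assumptions made for this example, not details from the source.

```python
import random

def init_weights(n_inputs):
    # One weight per expected input, plus a trailing weight for the
    # constant bias input. Initial values are small random numbers in
    # [0, 0.5], as suggested in the heuristics later in this section.
    return [random.uniform(0.0, 0.5) for _ in range(n_inputs + 1)]
```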

The Perceptron is trained as follows. A weight is initialized for each input, plus an additional weight for a fixed bias constant input that is almost always set to 1.0. The activation of the network for a given input pattern is calculated as follows:

activation = (Σ_{k=1}^{n} w_k × x_{ki}) + (w_bias × 1.0)

where n is the number of weights and inputs, x_ki is the k-th attribute of the i-th input pattern, and w_bias is the bias weight.
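
A hedged sketch of this activation calculation, building on the structure above (the function name activate and the bias-last convention are assumptions of this example):

```python
def activate(weights, pattern):
    # The last weight is the bias weight w_bias, paired with a constant
    # input of 1.0.
    activation = weights[-1] * 1.0
    # Add the weighted contribution of each of the n input attributes x_ki.
    for k in range(len(pattern)):
        activation += weights[k] * pattern[k]
    return activation
```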

The weights are updated as follows:

w_i(t+1) = w_i(t) + α × (e(t) − a(t)) × x_i(t)

where w_i is the i-th weight at time t and t+1, α is the learning rate, e(t) and a(t) are the expected and actual output at time t, and x_i is the i-th input. This update process is applied to each weight in turn (as well as to the bias weight with its constant input).
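
A minimal sketch of one such online update step, assuming the step transfer function described in the heuristics below and building on the activate sketch above (the names transfer and update_weights are illustrative):

```python
def transfer(activation):
    # Step transfer function: binary output 1 if activation >= 0, otherwise 0.
    return 1.0 if activation >= 0.0 else 0.0

def update_weights(weights, pattern, expected, alpha):
    # e(t) - a(t): difference between expected and actual output for this pattern.
    actual = transfer(activate(weights, pattern))
    error = expected - actual
    # Apply the update rule to each input weight in turn.
    for k in range(len(pattern)):
        weights[k] += alpha * error * pattern[k]
    # The bias weight is updated with its constant input of 1.0.
    weights[-1] += alpha * error * 1.0
```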

The Perceptron can be used to approximate arbitrary linear functions and can be used for regression or classification problems. The Perceptron cannot learn a non-linear mapping between the input and output attributes. The XOR problem is a classical example of a problem that the Perceptron cannot learn, because its two classes are not linearly separable.

Input and output values should be normalized such that x ∈ [0, 1]. The learning rate α ∈ [0, 1] controls the amount of change each error has on the system; lower learning rates are common, such as 0.1. The weights can be updated in an online manner (after exposure to each input pattern) or in batch (after a fixed number of patterns have been observed). Batch updates are expected to be more stable than online updates for some complex problems. A bias weight is used with a constant input signal to provide stability to the learning process. A step transfer function is commonly used to transfer the activation to a binary output value: 1 if activation ≥ 0, otherwise 0. It is good practice to expose the system to input patterns in a different random order each pass through the input set. The initial weights are typically small random values, commonly in [0, 0.5].
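
Tying the sketches above together, the following illustrative example trains on the linearly separable OR problem; the learning rate of 0.1 and the random presentation order follow the heuristics above, while the epoch count, function names, and dataset are assumptions for this example.

```python
import random

def train(patterns, expected, n_inputs, alpha=0.1, epochs=50):
    weights = init_weights(n_inputs)
    for _ in range(epochs):
        # Present the patterns in a different random order on each pass.
        order = list(range(len(patterns)))
        random.shuffle(order)
        for i in order:
            update_weights(weights, patterns[i], expected[i], alpha)
    return weights

# The OR problem is linearly separable, so the Perceptron can learn it
# (unlike XOR, as noted above).
patterns = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
expected = [0.0, 1.0, 1.0, 1.0]
weights = train(patterns, expected, n_inputs=2)
for p, e in zip(patterns, expected):
    print(p, e, transfer(activate(weights, p)))
```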