Tutorial 3.1: The Perceptron

1 Tutorial 3.1: The Perceptron

Reading: Stephen Marsland: Chapter 1-3

In this tutorial we shall implement our first learning algorithm, namely a single neuron. The celebrate artificial neural networks (ANN) are built up of numerous neurons, so this tutorial is the first step.

1.1 Problem 1: Single Neuron and Perceptron Training

pict

Figure 1: The single neuron with threshold function.

In this Problem we will implement only a single neuron with the perceptron training algorithm. The neuron is depicted in Figure 1. It acts as a function, with input $\vec{x} = (x_{1}, \dots, x_{n})$ on the left, and output $y$ on the right. The weights $\vec{w} = (w_{0}, \dots, w_{n})$ defines a particular neuron. Different neurons (as far as we are concerned) use the same summing and thresholding functions, but they have different weights.

1.1.1 Step 1: Data Types

At the conceptual (mathematical) level, the neuron receives a real vector as input. The output is 0.0 or 1.0, which we also consider as a real number.

Discuss the following:

1.: What data type should be used in Haskell for the output $y$ ?
2.: What data type should be used in Haskell for the input $\vec{x}$ ?
3.: What data type should be used in Haskell for the weights $\vec{w}$ ?

1.1.2 Step 2: The Neuron Type

Let us define a data type (alias) Neuron to represent a single neuron, recording all the weights.

1.: Discuss: What information must be stored as part of the neuron?
2.: Discuss: What types can we use to define the Neuron type?
3.: Create a new module with the name Perceptron.
4.: Add a definition for the Neuron type to the module.

1.1.3 Step 3: Initialisation

We need a function to create a new, pristine neuron. In a production system, this should be done randomly, but randomness is non-trivial, so we have to return to that later. For the time being, we are happy to initialise the neuron with small constant weights (say 0.01).

1.: Give the type declaration for the initialisation function in your module: initNeuron :: Int −> NeuronThe input is the number $n$ of input values. The number of weights is $n + 1$ .
2.: Add a definition for initNeuron. The return value is a list of $n + 1$ numbers, each equal to 0.01.
You can start with the list [0..n] to get the right number of weights, and then use either map or list comprehension to generate a list of the same length and the right values (0.01).
3.: Test the function in ghci. Does the function give you what you expect?

1.1.4 Step 4: Recall

The neuron as depicted in Figure 1 defines a function called recall. In Haskell it would have the following signature.

recall :: Neuron −> [Double] −> Double

The function takes the neuron and the input list, and it produces a scalar output.

1.

Looking at Figure 1, we see that recall is the composition of two functions: the summation (circle) and the thresholding (square). In Haskell, this can be written as follows:

recall :: Neuron −> [Double] −> Double 
recall n = threshold . neuronSum n

It remains to implement the the threshold and neuronSum.

2.

The threshold function is defined as

\begin{align} t h r e s h o l d x = \begin{aligned} 0 for x < 0, \\ 1 for x \geq 0 . \end{aligned} & (1) \end{align}

Implement this function in your module using guards. Use the following type declaration.

threshold :: Double −> Double

3.

Secondly, we implement the summation (the circle node in Figure 1). Add the following to your module.

neuronSum :: Neuron −> [Double] −> Double 
neuronSum n is = sum $ zipWith (*) n ((−1):is)

Discuss how this function works.

a): What does (-1):is mean?
b): What does the zipWith function do?
c): What does the sum function do?

4.

Test the function. Start ghci and try the following

recall (initNeuron 3) [ 1.0, 0.5, −1.0 ]

Do you get the expected output?

5.

Obviously, you do not learn all that you want to know from the above test, but at least you get to check for type errors. Develop your own test, by manually defining a test neuron with other weights, and use that in lieu of initNeuron.

1.1.5 Step 5: Training

The first step of implementing training is to update the neuron weights based on a single input/output pair. That is a function

trainOne :: Double −> [Double] −> Double 
                   −> Neuron −> Neuron

The first argument is the training factor

η

. The second is the input vector

\vec{x}

, and the third argument is the target output value

t

. The last (fourth) argument is the old neuron

\vec{w}

. The output is the updated neuron

{\vec{w}}^{'}

The updated weights are defined as

\begin{align} w_{i}^{'} & = w_{i} - η (y - t) x_{i}, where & (2) \\ y & = r e c a l l \vec{w} \vec{x} . & (3) \end{align}

In other words, if the actual output is different from the target output ( $y \neq t$ ), then the weight is adjusted proportionally to the difference ( $y - t$ ).

1.

Implemente the weight update as defined above. We need a function with the following signature

weightUpdate :: Double −> Double −> Double 
                       −> Double −> Double 
weightUpdate eta diff w x  =

We have introduced diff for the different

y - t

, the other arguments are

η

w

and

x

as in (2). Complete the implementation and add it to your module.

2.

We implement the trainOne as follows:

trainOne :: Double −> [Double] −> Double 
                   −> Neuron −> Neuron 
trainOne eta xs t ws = zipWith (weightUpdate eta diff) 
                               ws ((−1):xs) 
                       where diff = recall ws xs − t

This implementation uses zipWith and partial application of weightUpdate. Discuss the following:

a): What does zipWith do?
b): What do we mean by partial application?
c): What is the type of the first argument to zipWith, i.e. weightUpdate eta diff?

3.

To test the function, start ghci and try the following

trainOne 0.5 [ 1.0, 0.5, −1.0 ] 1.0 (initNeuron 3)

Do you get the expected output?

1.1.6 Step 6: Training on a Set of Vectors

The trainOne function uses only a single object for training. Now we need a trainSet function which uses a set of objects for training. This is a beautiful application of recursion over the list of training objects.

1.

The function declaration is similar to that of trainOne, except that we get a lists instead of a single input vector and a single output value. Add it to your module as follows:

trainSet :: Double −> [[Double]] −> [Double] 
                   −> Neuron −> Neuron

2.

Add the base case:

trainSet _ [] _ n = n

Discuss, what does the base case do?

3.

Then, add the recursive case:

trainSet eta (v:vs) (t:ts) n = trainSet eta vs ts 
                             $ trainOne eta v t n

Discuss the following

a): What does the notation (v:vs) (and (t:ts)) mean?
b): What is the last argument to trainSet? What is its type? How is it computed?
c): How does the recursion work?

4.

Test the function as you did with trainOne, but replace the input vector with a list of two input vectors (of you choice), each of length 3.

1.1.7 Step 7: Complete Training Function

It is usually not sufficient to run the training once only. Usually, we want to repeat the trainSet operation $T$ times. In other words, we want a function with the following signature:

train :: Int −> Double −> [[Double]] −> [Double] 
                 −> Neuron −> Neuron

The first argument is the number

T

of iterations, while the other argumens are as they were for trainSet.

1.: Add the function declaration to your module.
2.: Add a base case, defining the return value for $T = 0$ iterations.
3.: Add a recursive case which uses trainSet to do a single iteration and calls train recursively to do $T - 1$ more iterations.
You may look at the definition of trainSet above for an example of recursion, but remember that train recurses over an integer (the number of iterations) while trainSet recursed over a list.
4.: Test the function using the same test data as you used for trainSet. Try both $T = 2$ and $T = 5$ . replace the input vector with a list of two input vectors (of you choice), each of length 3.

1.1.8 Step 8: Testing

A simple test to device is to take a simple function and try to predict whether it is positive or negative. Take for instance the following:

f (x, y, z) = t h r e s h o l d [{(x - 1.0)}^{4} + {(y - 1.0)}^{5} - z - 1.0] .

1.: Choose a couple of feature vectors $(x, y, z)$ (randomly or otherwise), and calculate the corresponding class label $f (x, y, z)$ . This gives a training set.
2.: Use the training set to train a neuron $n$ in GHCi.
3.: Choose another feature vector $(x, y, z)$ and calculate $f (x, y, z)$ . Use GHCi to calculate $r e c a l l (x, y, z) n$ .
4.: Compare the real class label $f (x, y, z)$ with the prediction obtained in GHCi. Do they match?
5.: Repeat the last two steps a couple of times.

1.2 Problem 2: Multi-Neuron Perceptrons

1.2.1 Step 1: Data Type

Define a data type Layer to represent a multi-neuron perceptron.

What data type can be used to hold a set of neurons?

The name ‘layer’ will make sense when we advance to more complex neural networks. The perceptron consists of a single layer only, but other neural networks will be lists of layers.

1.2.2 Step 2: Initialisation

Define a function initLayer to return a perceptron (layer) where all weights in all neurons is set to some small, constant, non-zero value.

Remember arguments so that the user can choose both the number of neurons in the layer and the number of inputs. Each neuron in the layer should be created by a call to initNeuron which you defined above.

1.2.3 Step 3: Recall

Define a function recallLayer which does a recall for each neuron in the layer, and returns a list of output values.

1.2.4 Step 4: Training

Generalise each of the training functions trainOne, trainSet, and train for perceptrons. The training functions for perceptrons have to apply the corresponding training function for each neuron in the layer.

1.3 Epilogue

You have just implemented your first classifier. Well done.

However, this prototype leaves much to be desired.

1.: We cannot initialise with random weights.
2.: We have to type in the data for training and for testing.
3.: We only have a single layer, and not a full network.

As you can see, we have to go back and learn some more techniques in Haskell. First of all, we will learn I/O in the next tutorial, to be able to read complete data sets from file, both for training and for testing.

[next] [front] [up]