1 Tutorial 3.1: The Perceptron
Reading: Stephen Marsland: Chapters 1-3
In this tutorial we shall implement our first learning algorithm, namely a single neuron. The celebrated artificial neural networks (ANN) are built up of numerous neurons, so this tutorial is the first step.
1.1 Problem 1: Single Neuron and Perceptron Training
In this problem we will implement only a single neuron with the perceptron training algorithm. The neuron is depicted in Figure 1. It acts as a function, with input on the left and output on the right. The weights define a particular neuron. Different neurons (as far as we are concerned) use the same summing and thresholding functions, but they have different weights.
1.1.1 Step 1: Data Types
At the conceptual (mathematical) level, the neuron receives a real vector as input. The output is 0.0 or 1.0, which we also consider as a real number.
Discuss the following:
1. What data type should be used in Haskell for the output?
2. What data type should be used in Haskell for the input?
3. What data type should be used in Haskell for the weights?
1.1.2 Step 2: The Neuron Type
Let us define a data type (alias) Neuron to represent a single neuron, recording all the weights.
1. Discuss: What information must be stored as part of the neuron?
2. Discuss: What types can we use to define the Neuron type?
3. Create a new module with the name Perceptron.
4. Add a definition for the Neuron type to the module.
1.1.3 Step 3: Initialisation
We need a function to create a new, pristine neuron. In a production system, this should be done randomly, but randomness is non-trivial, so we have to return to that later. For the time being, we are happy to initialise the neuron with small constant weights (say 0.01).
1. Give the type declaration for the initialisation function in your module:
initNeuron :: Int -> Neuron
The input is the number n of input values. The number of weights is n + 1, since there is one weight per input plus one for the bias.
2. Add a definition for initNeuron. The return value is a list of numbers, each equal to 0.01. You can start with the list [0..n] to get the right number of weights, and then use either map or a list comprehension to generate a list of the same length and the right values (0.01). A possible sketch follows this list.
3. Test the function in ghci. Does the function give you what you expect?
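For reference, here is one minimal sketch, assuming Neuron is simply an alias for the list of weights:

type Neuron = [Double]

-- One weight per input plus one bias weight, all set to 0.01.
initNeuron :: Int -> Neuron
initNeuron n = [ 0.01 | _ <- [0..n] ]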
1.1.4 Step 4: Recall
The neuron as depicted in Figure 1 defines a function called recall. In Haskell it would have the following signature.
recall :: Neuron -> [Double] -> Double
The function takes the neuron and the input list, and it produces a scalar output.
1. Looking at Figure 1, we see that recall is the composition of two functions: the summation (circle) and the thresholding (square). In Haskell, this can be written as follows:
recall :: Neuron -> [Double] -> Double
recall n = threshold . neuronSum n
It remains to implement threshold and neuronSum.
2. The threshold function is defined as
threshold(x) = 1.0 if x > 0, and 0.0 otherwise.
Implement this function in your module using guards. Use the following type declaration.
threshold :: Double -> Double
3. Secondly, we implement the summation (the circle node in Figure 1). Add the following to your module.
neuronSum :: Neuron -> [Double] -> Double
neuronSum n is = sum $ zipWith (*) n ((-1):is)
Discuss how this function works.
a) What does (-1):is mean?
b) What does the zipWith function do?
c) What does the sum function do?
4. Test the function. Start ghci and try the following
recall (initNeuron 3) [ 1.0, 0.5, -1.0 ]
Do you get the expected output?
5. Obviously, you do not learn all that you want to know from the above test, but at least you get to check for type errors. Develop your own test, by manually defining a test neuron with other weights, and use that in lieu of initNeuron. A sketch of the complete recall machinery follows this list.
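To check your own implementation against something concrete, the following minimal sketch (using the Neuron alias from Step 2) shows one way the pieces may fit together:

-- Fire (1.0) when the weighted sum is positive, stay silent (0.0) otherwise.
threshold :: Double -> Double
threshold x
  | x > 0     = 1.0
  | otherwise = 0.0

-- Weighted sum over the bias input (-1) followed by the real inputs.
neuronSum :: Neuron -> [Double] -> Double
neuronSum n is = sum $ zipWith (*) n ((-1):is)

recall :: Neuron -> [Double] -> Double
recall n = threshold . neuronSum n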
1.1.5 Step 5: Training
The first step of implementing training is to update the neuron weights based on a single input/output pair. That is a function
trainOne :: Double -> [Double] -> Double -> Neuron -> Neuron
The first argument is the training factor η. The second is the input vector x, and the third argument is the target output value t. The last (fourth) argument is the old neuron. The output is the updated neuron.
The updated weights are defined as
w_i' = w_i - η (y - t) x_i    (2)
where y is the actual output of the neuron. In other words, if the actual output is different from the target output (y ≠ t), then the weight is adjusted proportionally to the difference (y - t).
1. Implement the weight update as defined above. We need a function with the following signature
weightUpdate :: Double -> Double -> Double -> Double -> Double
weightUpdate eta diff w x =
We have introduced diff for the difference y - t; the other arguments are η, w_i, and x_i as in (2). Complete the implementation and add it to your module (one possible completion is sketched after this list).
2. We implement trainOne as follows:
trainOne :: Double -> [Double] -> Double -> Neuron -> Neuron
trainOne eta xs t ws = zipWith (weightUpdate eta diff) ws ((-1):xs)
  where diff = recall ws xs - t
This implementation uses zipWith and partial application of weightUpdate. Discuss the following:
a) What does zipWith do?
b) What do we mean by partial application?
c) What is the type of the first argument to zipWith, i.e. weightUpdate eta diff?
3. To test the function, start ghci and try the following
trainOne 0.5 [ 1.0, 0.5, -1.0 ] 1.0 (initNeuron 3)
Do you get the expected output?
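For reference, a minimal completion of weightUpdate consistent with the update rule (2) could be:

-- Adjust the weight against the error; no change when diff is zero.
weightUpdate :: Double -> Double -> Double -> Double -> Double
weightUpdate eta diff w x = w - eta * diff * x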
1.1.6 Step 6: Training on a Set of Vectors
The trainOne function uses only a single object for training. Now we need a trainSet function which uses a set of objects for training. This is a beautiful application of recursion over the list of training objects.
1. The function declaration is similar to that of trainOne, except that we get lists instead of a single input vector and a single output value. Add it to your module as follows:
trainSet :: Double -> [[Double]] -> [Double] -> Neuron -> Neuron
2. Add the base case:
trainSet _ [] _ n = n
Discuss: what does the base case do?
3. Then, add the recursive case:
trainSet eta (v:vs) (t:ts) n = trainSet eta vs ts $ trainOne eta v t n
Discuss the following:
a) What does the notation (v:vs) (and (t:ts)) mean?
b) What is the last argument to trainSet? What is its type? How is it computed?
c) How does the recursion work?
4. Test the function as you did with trainOne, but replace the input vector with a list of two input vectors (of your choice), each of length 3. An example call is sketched below.
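For instance, a test call in ghci, with two arbitrarily chosen input vectors and matching targets, might look like:

trainSet 0.5 [ [1.0, 0.5, -1.0], [0.0, 1.0, 0.5] ] [ 1.0, 0.0 ] (initNeuron 3)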
1.1.7 Step 7: Complete Training Function
It is usually not sufficient to run the training once only. Usually, we want to repeat the trainSet operation n times.
In other words, we want a function with the following signature:
train :: Int -> Double -> [[Double]] -> [Double] -> Neuron -> Neuron
The first argument is the number n of iterations, while the other arguments are as they were for trainSet.
1. Add the function declaration to your module.
2. Add a base case, defining the return value for zero iterations.
3. Add a recursive case which uses trainSet to do a single iteration and calls train recursively to do more iterations. You may look at the definition of trainSet above for an example of recursion, but remember that train recurses over an integer (the number of iterations) while trainSet recursed over a list. One possible shape is sketched after this list.
4. Test the function using the same test data as you used for trainSet. Try both a single iteration and a larger number of iterations.
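As a point of reference, one possible shape for train (assuming the trainSet from Step 6) is:

train :: Int -> Double -> [[Double]] -> [Double] -> Neuron -> Neuron
train 0 _   _  _  n = n                                   -- zero iterations: neuron unchanged
train i eta vs ts n = train (i-1) eta vs ts $ trainSet eta vs ts n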
1.1.8 Step 8: Testing
A simple test to devise is to take a simple function and try to predict whether it is positive or negative; that is, the class label is t = 1.0 when the function value is positive and t = 0.0 otherwise. Proceed for instance as follows:
1. Choose a couple of feature vectors x (randomly or otherwise), and calculate the corresponding class label t. This gives a training set.
2. Use the training set to train a neuron in GHCi.
3. Choose another feature vector x and calculate its class label t. Use GHCi to calculate the prediction recall n x.
4. Compare the real class label with the prediction obtained in GHCi. Do they match?
5. Repeat the last two steps a couple of times.
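As an illustration only, suppose we pick the hypothetical labelling function f(x1, x2, x3) = x1 + x2 - x3, so that t = 1.0 exactly when x1 + x2 - x3 > 0. A GHCi session could then look like:

ghci> let n = train 10 0.25 [ [1,1,0], [0,0,1], [1,0,0], [0,1,1] ] [1,0,1,0] (initNeuron 3)
ghci> recall n [1, 0.5, 0]    -- f = 1.5 > 0, so we hope for 1.0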
1.2 Problem 2: Multi-Neuron Perceptrons
1.2.1 Step 1: Data Type
Define a data type Layer to represent a multi-neuron perceptron. What data type can be used to hold a set of neurons?
The name ‘layer’ will make sense when we advance to more complex neural networks. The perceptron consists of a single layer only, but other neural networks will be lists of layers.
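One natural choice, for example, is simply a list of neurons:

type Layer = [Neuron]    -- one entry per neuron in the layer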
1.2.2 Step 2: Initialisation
Define a function initLayer to return a perceptron (layer) where all weights in all neurons are set to some small, constant, non-zero value.
Remember to provide arguments so that the user can choose both the number of neurons in the layer and the number of inputs. Each neuron in the layer should be created by a call to initNeuron, which you defined above.
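A minimal sketch, assuming the Layer alias above and taking the number of neurons first and the number of inputs second (the argument order is our own choice):

initLayer :: Int -> Int -> Layer
initLayer m n = [ initNeuron n | _ <- [1..m] ]    -- m neurons, each with n inputs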
1.2.3 Step 3: Recall
Define a function recallLayer which does a recall for each neuron in the layer, and returns a list of output values.
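For example, one can map recall over the neurons (a sketch under the same assumptions as above):

recallLayer :: Layer -> [Double] -> [Double]
recallLayer l xs = map (\n -> recall n xs) l    -- one output value per neuron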
1.2.4 Step 4: Training
Generalise each of the training functions trainOne, trainSet, and train for perceptrons. The training functions for perceptrons have to apply the corresponding training function for each neuron in the layer.
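To give the flavour, a sketch of the single-object case might pair each neuron with its own target value (the name trainOneLayer is our own):

trainOneLayer :: Double -> [Double] -> [Double] -> Layer -> Layer
trainOneLayer eta xs ts l = zipWith (trainOne eta xs) ts l    -- one target per neuron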
1.3 Epilogue
You have just implemented your first classifier. Well done.
However, this prototype leaves much to be desired.
1. We cannot initialise with random weights.
2. We have to type in the data for training and for testing.
3. We only have a single layer, and not a full network.
As you can see, we have to go back and learn some more techniques in Haskell. First of all, we will learn I/O in the next tutorial, to be able to read complete data sets from file, both for training and for testing.