Real Problems for Neural Networks

Functional Programming and Intelligent Algorithms

(Last updated: 1 January 2015)

Binary Classification

We have worked mostly with the breast cancer data from Wisconsin. This is a very simple case, since we have two classes; we call it binary classification. Thus we only need one neuron in the output layer and the two-valued (0, 1) output is sufficient.

The input (feature vector) is already coded for us, so we do not have to think about that.

When we move to other problems we may have to think about coding, both of input and of output.

Multi-class problems (The iris data set)

We have also introduced the iris data set previously. The input already consists of floating point features, so we do not need to think about that. However, there are three classes, so we cannot just use the single binary output neuron that we have used so far.

What options do we have for coding the output vectors? Discuss.

What are the advantages and disadvantages of each of the following options?

  1. Change the threshold functions to allow three outputs from one neuron.
  2. Use two binary output neurons, which gives four possible classes: (0,0), (0,1), (1,0), (1,1), where we use only three of the four.
  3. Use three binary output neurons, one for each class. Thus the classes are represented as: (1,0,0), (0,1,0), (0,0,1)
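
As a starting point for the discussion, options 2 and 3 can be sketched in Python. This is only an illustration: the class order, the helper names (`one_hot`, `binary_code`, `decode_one_hot`) and the rule of decoding by the largest output are assumptions made for the example, not something the exercise prescribes.

```python
# Illustrative sketch: two ways of coding three iris classes as target vectors.
# The class order below is an assumption made for this example.
CLASSES = ["setosa", "versicolor", "virginica"]

def one_hot(label):
    """Option 3: one output neuron per class, e.g. (0, 1, 0)."""
    return tuple(1 if c == label else 0 for c in CLASSES)

def binary_code(label):
    """Option 2: two output neurons holding the class index in binary."""
    i = CLASSES.index(label)
    return (i // 2, i % 2)   # the code (1, 1) is never used

def decode_one_hot(outputs):
    """Pick the class whose output neuron has the largest value."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return CLASSES[best]

print(one_hot("versicolor"))             # (0, 1, 0)
print(binary_code("virginica"))          # (1, 0)
print(decode_one_hot([0.1, 0.3, 0.7]))   # virginica
```

Note that with option 2 the network may output the unused code (1,1), and the decoder then has to decide what that means; with option 3 every output neuron corresponds to exactly one class.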

Regression

The neural network discussed so far gives discrete output. Either the neuron fires, or it does not. This is suitable for classification problems, because classes are discrete entities.

Very often we have prediction problems, where we want to estimate some numeric characteristic from a set of features. Say you want to estimate the price a picture will command at an auction, based on visual features. The price is a continuous variable, so a discrete output is not suitable.
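
The difference between a discrete and a continuous output can be illustrated with a single neuron. The sketch below assumes one common adaptation, which the text does not spell out: the output neuron simply returns its weighted sum instead of applying a threshold. The weights, bias and feature values are arbitrary example numbers.

```python
def weighted_sum(weights, bias, inputs):
    """Bias plus the weighted sum of the inputs."""
    return bias + sum(w * x for w, x in zip(weights, inputs))

def threshold_neuron(weights, bias, inputs):
    """Discrete output: 1 if the neuron fires, 0 otherwise."""
    return 1 if weighted_sum(weights, bias, inputs) > 0 else 0

def linear_neuron(weights, bias, inputs):
    """Continuous output: the weighted sum itself, with no threshold."""
    return weighted_sum(weights, bias, inputs)

weights, bias = [0.5, -0.25], 1.0   # arbitrary example values
features = [2.0, 4.0]

print(threshold_neuron(weights, bias, features))   # 1
print(linear_neuron(weights, bias, features))      # 1.0
```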

Consider the wine quality data set from MCR. Open the CSV file and inspect the data. Discuss the following:

  1. Is this a regression or classification problem? Or could it be either?
  2. What reasons/advantages exist for interpreting it as a regression problem?
  3. What reasons/advantages exist for interpreting it as a classification problem?
  4. How can you adapt a neural network to solve the problem?

Train and test a neural network to assess wine quality.
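
Whichever interpretation you choose, the quality score has to be turned into a target for the output layer. The following sketch shows one possible coding for each interpretation. It assumes that quality is an integer score in a known range; the range 0 to 10 used here is an assumption that should be checked against the CSV file, and all function names are hypothetical.

```python
# Assumed range of the integer quality score; check the CSV file for the real one.
MIN_Q, MAX_Q = 0, 10

def as_regression_target(quality):
    """Scale the score to [0, 1] for a single continuous output neuron."""
    return (quality - MIN_Q) / (MAX_Q - MIN_Q)

def from_regression_output(y):
    """Map a continuous network output back to the nearest integer score."""
    return round(MIN_Q + y * (MAX_Q - MIN_Q))

def as_classification_target(quality):
    """One output neuron per possible score (one-of-n coding)."""
    return [1 if q == quality else 0 for q in range(MIN_Q, MAX_Q + 1)]

print(as_regression_target(6))
print(from_regression_output(0.57))
print(as_classification_target(6))
```

The regression coding preserves the ordering of the scores, so predicting 5 for a wine rated 6 counts as a small error. The classification coding treats every wrong score as equally wrong.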

Hyphenation

Hyphenation is a difficult problem: when a line is broken mid-word, where are you allowed to break? Admittedly, it is not too hard for native writers to learn, but when you typeset a text you really want the software to do it for you. How?

Neural networks have been proposed for this purpose. You can check Google Scholar for examples.

Coding of the input and output is non-trivial here. You are working with sequences of letters, rather than floating point numbers. What options do you have?

Discuss. How can you code input and output for the hyphenation problem?
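
One option you might consider, sketched below, is a sliding window over the word: each letter in the window is coded as a one-of-n vector, and the single binary output says whether a break is allowed right after the centre letter. This is only one possible coding and not necessarily the one used in the papers. The window width, the padding symbol and the restriction to the letters a to z are assumptions made for the example; a Norwegian alphabet would also need æ, ø and å.

```python
import string

# Assumed alphabet: a-z plus "_" as padding beyond the word boundaries.
# Any other character would be coded as an all-zero vector.
ALPHABET = string.ascii_lowercase + "_"
WINDOW = 5   # assumed window width (must be odd)

def encode_letter(ch):
    """One-of-n vector for a single letter."""
    return [1 if ch == a else 0 for a in ALPHABET]

def training_pairs(hyphenated):
    """Yield (input vector, target) pairs from a word like 'hy-phen-ation'.

    The target is 1 if a break is allowed right after the centre letter
    of the window, and 0 otherwise.
    """
    word = hyphenated.replace("-", "")
    # Collect the letter indices after which a hyphen is allowed.
    breaks, i = set(), 0
    for ch in hyphenated:
        if ch == "-":
            breaks.add(i - 1)
        else:
            i += 1
    half = WINDOW // 2
    padded = "_" * half + word + "_" * half
    for k in range(len(word)):
        window = padded[k:k + WINDOW]   # centred on word[k]
        vector = [bit for ch in window for bit in encode_letter(ch)]
        yield vector, (1 if k in breaks else 0)

pairs = list(training_pairs("hy-phen-ation"))
print(len(pairs))                # one window per letter
print(len(pairs[0][0]))          # WINDOW * len(ALPHABET) inputs
print([t for _, t in pairs])     # targets, one per letter
```

With this coding the network has a fixed number of inputs regardless of word length, at the cost of only seeing a few letters of context around each candidate break point.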

You can get some ideas from a search on Google Scholar, or have a look at the following two papers, which take different approaches.

  1. Czech
  2. Norwegian


Hans Georg Schaathun / hasc@hials.no