What is an Artificial Neural Network (ANN)?

A Combustion File downloaded from the IFRF Online Combustion Handbook
ISSN 1607-9116

Combustion File No: 46
Version No: 1
Date: 02-06-2003
Author(s): Chee Keong Tan
Source(s): See CF
Sub-editor: John Ward
Referee(s): S J Wilcox
Status: Published
Sponsor: University of Glamorgan

1. Background

The past fifteen years or so have seen an increasing use of Artificial Neural Networks (ANNs) to model or represent a large class of “real” problems and systems which are very difficult to analyse by conventional methods. These systems, including some in the field of combustion, are frequently characterised by the following features:

• It is not always possible to develop a mathematical model for the system or problem that adequately represents the actual physical processes.

• The solution methodology employed by a “human expert” is often based on a discrete rule-based reasoning framework. Attempts to generalise these rules in a conventional expert system can often be difficult.

• The complexity and size of the problem is such that “hard and fast rules” cannot easily be applied and, moreover, the computational requirements of conventional models can be excessive.

There are many different types of ANN, see for example Jain et al. (1996). However, an ANN is typically made up of a large number of processing elements linked together by connections. The essential structure of the network is therefore similar to that of a biological brain in which a series of neurons are connected by synapses. An ANN can be characterised by a massively distributed and parallel computing paradigm, in which “learning” of the process replaces a priori program and model development. The learning phase can be understood as the process of updating the network structure and connection weights in response to a set of training data obtained from experimental measurements or, less often, from a rigorous mathematical model of the process. The overall behaviour and functionality of an ANN is determined by the network architecture, the individual neuron characteristics, the learning or training strategy, and the quality of the training data. Once trained, the network has a relatively modest computing requirement.

It must be emphasised that an ANN is a relatively simple method of representing complex relationships between parameters for which corresponding data is already available. No new knowledge is generated by an ANN, but the approach is often easier and more effective than conventional curve-fitting techniques, or computer models based on the underlying physical and chemical processes. In particular, an ANN offers the capability of simulating and even controlling a combustion process in real time, as well as being a powerful tool for the classification of complex data sets.

2. The Feed-Forward Multilayer Perceptron Network

The Feed-Forward Multilayer Perceptron Network (FFMLP), as shown in Figure 1(a), is the most popular supervised-learning network and has found application across many disciplines, including combustion, see Wilcox et al. (2002). It is particularly suited to tasks involving prediction and classification of complex data. A FFMLP network is typically organised as a series of layers made up of a number of interconnected neurons, each of which contains a transfer function, f (Figure 1(b)). Input data are first presented to the network via the input layer, which then communicates these data via a system of weighted connections to one or more hidden layers, where the actual neuron processing is carried out. The response of the hidden layers is then transferred to an output layer where the network outputs are computed. During the training phase these outputs are compared with known “target” values for the process. The resultant errors arising from the difference between the predicted and known values are back-propagated through the network and used to update the weights of the connections. The process is repeated until the errors fall within pre-specified limits, at which stage the network training is said to be complete. However, it should be appreciated that training is not necessarily a “once-off” procedure, since the network can “learn” and be updated on-line if the system or process changes.
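As a concrete illustration of the training procedure described above, the sketch below (in Python with NumPy, using made-up training arrays X and T rather than any data from this Combustion File) trains a single-hidden-layer feed-forward network by back-propagating the output error and updating the connection weights. It is only an outline of the idea, not the implementation referred to in the sources.

```python
import numpy as np

# Hypothetical training data: X has one row per training pattern,
# T holds the corresponding target values (both assumed pre-scaled).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))      # 200 patterns, 3 inputs
T = np.sin(X.sum(axis=1, keepdims=True))       # illustrative target

n_inputs, n_hidden, n_outputs = 3, 8, 1
lr = 0.1                                       # learning rate

# Random initial connection weights and biases
W1 = rng.normal(0.0, 0.5, size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.5, size=(n_hidden, n_outputs))
b2 = np.zeros(n_outputs)

for epoch in range(2000):
    # Forward pass: tanh transfer function in the hidden layer,
    # linear transfer function in the output layer.
    h = np.tanh(X @ W1 + b1)
    y = h @ W2 + b2

    # Error between predicted and known target values (mean squared error)
    err = y - T
    mse = np.mean(err ** 2)
    if mse < 1e-4:                             # pre-specified error limit
        break

    # Back-propagate the error and update the connection weights
    grad_y = 2.0 * err / len(X)
    grad_W2 = h.T @ grad_y
    grad_b2 = grad_y.sum(axis=0)
    grad_h = grad_y @ W2.T * (1.0 - h ** 2)    # derivative of tanh
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)

    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2
```

In practice, dedicated neural network toolboxes are normally used rather than hand-written loops such as this, but the sequence of forward pass, error calculation, back-propagation and weight update is the same.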

[Figure 1(a): a typical layered network with an input layer, one or more hidden layers and an output layer. Figure 1(b): the computational model of a single neuron with inputs i1 … in, connection weights w1 … wn, bias b and transfer function f.]

The output o of a single neuron is given by:

o = f( Σ_{j=1}^{n} (w_j × i_j) + b )

Figure 1: (a) Typical ANN Architecture and (b) Computational Model of a Single Neuron.
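A minimal sketch of the single-neuron model of Figure 1(b), in Python with NumPy and with purely illustrative input values, weights and bias; the tanh transfer function is used here only as one possible choice of f.

```python
import numpy as np

def neuron_output(inputs, weights, bias, transfer=np.tanh):
    """Single neuron of Figure 1(b): o = f(sum_j(w_j * i_j) + b).

    The tanh transfer function is used purely as an example."""
    return transfer(np.dot(weights, inputs) + bias)

# Illustrative values (not taken from this Combustion File)
i = np.array([0.2, -0.5, 0.9])     # inputs i_1 ... i_n
w = np.array([0.4,  0.1, -0.3])    # connection weights w_1 ... w_n
b = 0.05                           # bias

print(neuron_output(i, w, b))
```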

Figure 2: Neuron Transfer Functions. (a) ‘Tansig’, (b) ‘Logsig’, (c) ‘RBF’ and (d) ‘Purelin’.

The three most widely used transfer functions for the hidden neurons are (a) the tanh-sigmoid (tansig), (b) the log-sigmoid (logsig) and (c) the radial basis function (RBF), whilst a commonly used transfer function for the output neurons is a linear function (purelin) (Figure 2). As a result of the use of these non-linear transfer functions in the hidden neurons, a FFMLP network can successfully represent highly complex input-output relationships, so that it can often be employed as a “black box” model of a system or process. A simple example of the use of a FFMLP is presented in CF237 “How do I use a neural network to predict the calorific value of a solid fuel?”.
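The following sketch (Python/NumPy) gives one common definition of each of the transfer functions named in Figure 2; the exact form of the radial basis function varies between implementations, so the Gaussian used here is an assumption.

```python
import numpy as np

# Common transfer functions referred to in Figure 2 (names as used in
# many ANN toolboxes; the RBF form below is one typical choice).
def tansig(x):
    return np.tanh(x)                  # tanh-sigmoid, output in (-1, 1)

def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))    # log-sigmoid, output in (0, 1)

def rbf(x):
    return np.exp(-x ** 2)             # Gaussian radial basis function

def purelin(x):
    return x                           # linear transfer function

x = np.linspace(-3.0, 3.0, 7)
for f in (tansig, logsig, rbf, purelin):
    print(f.__name__, np.round(f(x), 3))
```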

3. Training an ANN

The successful implementation of an ANN should take into account the following features, which can affect the performance of the trained network.

Data Pre-Processing

Neural network training can be made more efficient if certain pre-processing steps are performed on the network inputs and targets to prevent individual parameters dominating the response. One way of doing this is to scale the inputs and targets so that they fall in the range [-1, 1]. Another useful approach is to normalise the values so that they have zero mean and unit standard deviation.
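A minimal sketch of both pre-processing options, using a small made-up input array; in practice the same scaling constants derived from the training data would also be applied to the targets and to any subsequent inputs presented to the trained network.

```python
import numpy as np

# Hypothetical raw training inputs: one row per pattern, one column per
# input parameter, with very different ranges.
X = np.array([[850.0, 0.021, 12.0],
              [900.0, 0.018, 15.0],
              [950.0, 0.025,  9.0]])

# Option 1: scale each column to the range [-1, 1]
x_min, x_max = X.min(axis=0), X.max(axis=0)
X_scaled = 2.0 * (X - x_min) / (x_max - x_min) - 1.0

# Option 2: normalise each column to zero mean and unit standard deviation
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_scaled)
print(X_norm)
```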

The Number of Hidden Neurons

The response of a neural network is sensitive to the number of neurons in the hidden layer(s). The use of too few neurons can lead to under-fitting of the training data, whilst too many neurons can contribute to over-fitting. In this latter case all the training values are closely represented, but the relationships predicted by the network can oscillate wildly between these points. Consequently, large errors can occur during subsequent prediction of previously unseen conditions. Unfortunately, there is still no single reliable method, apart from trial and error, to determine the optimal number of hidden neurons for a particular application.
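One possible form of such a trial-and-error search is sketched below, assuming the scikit-learn library is available and using made-up, pre-scaled training and validation arrays; the candidate hidden-layer sizes and all other settings are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

# Hypothetical, pre-scaled data sets; the validation set contains
# conditions that were not used for training.
rng = np.random.default_rng(1)
X_train = rng.uniform(-1, 1, size=(150, 4))
t_train = np.sin(X_train).sum(axis=1)
X_val = rng.uniform(-1, 1, size=(50, 4))
t_val = np.sin(X_val).sum(axis=1)

results = {}
for n_hidden in (2, 4, 8, 16, 32):
    net = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="tanh",
                       max_iter=2000, random_state=0)
    net.fit(X_train, t_train)
    # The error on previously unseen data exposes under- or over-fitting
    results[n_hidden] = mean_squared_error(t_val, net.predict(X_val))

best = min(results, key=results.get)
print(results, "-> selected", best, "hidden neurons")
```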

Number of Output Neurons

The number of output neurons is problem specific, in that it depends on how many dependent variables are modelled. In some cases each dependent variable is modelled using a separate network, since this allows more efficient training of smaller networks and hence can improve generalisation of the network.

Use of Feed-back Connections (Recurrent Networks)

With time-varying data a FFMLP network can be made recurrent by the addition of feed-back connections (with appropriate time delays), either from the outputs of the hidden layers to the inputs (internal feed-back connections) or from the output neurons to the input layer (global feed-back connections). These feed-back paths allow the network to both learn and store the time-varying patterns in the training data, and consequently this type of network is particularly useful for modelling time-dependent or dynamic systems.
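One simple way to sketch global feed-back behaviour, assuming made-up time series u and y, is to append time-delayed copies of the measured output to the network inputs so that an ordinary feed-forward network can be trained on the resulting rows. This is only an external emulation of recurrence, not how internal feed-back connections are implemented in any particular toolbox.

```python
import numpy as np

def lagged_inputs(u, y, n_delays=2):
    """Build an input matrix in which each row contains the current external
    input u[k] together with the previous outputs y[k-1] ... y[k-n_delays].

    A feed-forward network trained on such rows behaves like a network with
    global feed-back connections from the output to the input layer."""
    rows = []
    for k in range(n_delays, len(u)):
        rows.append(np.concatenate(([u[k]], y[k - n_delays:k][::-1])))
    return np.array(rows)

# Illustrative time series (not combustion data)
u = np.linspace(0.0, 1.0, 10)        # external input
y = np.cumsum(u) * 0.1               # measured process output
X = lagged_inputs(u, y, n_delays=2)
targets = y[2:]                      # the network would be trained to predict y[k]
print(X.shape, targets.shape)
```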

4. Generalisation of the Network

Following successful training, the outputs of the network should adequately “match” the target values corresponding to the given inputs in the training data set. Usually, however, the purpose of using a neural network is to predict or control the behaviour of a system for a range of previously unseen input data, i.e. the network should be capable of generalisation. Neural networks do not automatically generalise, and care should be exercised in interpreting network predictions. In this context it is important that continuous functional relationships (albeit highly complex, approximate and unspecified) exist between the system inputs and outputs. Moreover, the training data set should be sufficiently large to be representative of the whole problem domain, since neural networks can usually only be used reliably to interpolate within the range of the training conditions. Therefore ANN predictions are often subject to significant errors if the network is applied to a region of the system where the initial training data are “sparse”. Furthermore, the use of an ANN to extrapolate the behaviour of the system to conditions which lie outside the range of the training data is notoriously unreliable. Hence it is important to have sufficient, and suitably distributed, training data to obtain satisfactory predictions and to avoid the need for extrapolation.
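A small, hedged check along these lines, assuming made-up training inputs: before trusting a prediction, verify that the new input lies within the per-parameter range of the training data. This bounding-box test flags extrapolation but cannot detect sparse regions inside the training domain.

```python
import numpy as np

def inside_training_range(x_new, X_train):
    """Return True where every element of x_new lies within the per-input
    minimum and maximum of the training data, i.e. where the network would
    be interpolating rather than extrapolating."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    return np.all((x_new >= lo) & (x_new <= hi), axis=-1)

# Hypothetical training inputs and two query points
X_train = np.random.default_rng(2).uniform(-1.0, 1.0, size=(100, 3))
queries = np.array([[0.2, -0.4, 0.7],   # inside the training domain
                    [1.5,  0.0, 0.0]])  # outside -> prediction unreliable
print(inside_training_range(queries, X_train))
```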

Glossary terms

Artificial neural network – A computational model inspired by the structure of a biological brain. It is typically composed of a large number of highly interconnected processing elements, analogous to neurons, linked together by weighted connections, analogous to synapses in the brain.

Connection weights – In an artificial neural network, the connection weights between neurons represent the “knowledge” needed to solve the problem.

Generalisation – A measure of how well an artificial neural network can respond to new inputs on which it has not been trained. An ability to generalise is crucial to the predictive and decision-making ability of the network.

Hidden Layer – A layer of processing elements (neurons) positioned between the input and output layers of a neural network.

Input Layer – A layer of processing elements that receives the input to an artificial neural network. The input layer neurons are used only to “hold” input values and to distribute these values to neurons in the next layer; hence, they do not implement a separate mapping or conversion of the input data.

Output Layer – The final layer of processing elements in an artificial neural network; its neurons yield the output of the network.

Supervised learning – With supervised learning, an artificial neural network learns to approximately represent the desired outputs (or targets) corresponding to particular specified input patterns.

Training data – In artificial neural networks, these are ranges of appropriate data, obtained from experimental measurements or from a rigorous mathematical model of a system, which are used to “train” the neural network to represent a specific problem.

Transfer function – In artificial neural networks, a function applied by a neuron to convert its input activations to an output. It is also known as an activation or thresholding function.

Keywords

Neural network; modelling; artificial intelligence; control; simulation; computing; model; mathematical; simulator

Related Combustion Files

CF237 How do I use a neural network to predict the calorific value of a solid fuel?

Sources

Jain, A.K., Mao, J.C. and Mohiuddin, K.M. (1996). Artificial Neural Networks: A Tutorial. Computer, 29(3): 31-44.

Wilcox, S.J., Ward, J., Tan, C.K., Tan, O.H. and Payne, R. (2002). The Application of Neural Networks in a Range of Combustion Systems. Proc. 6th European Conference on Industrial Furnaces and Boilers, Estoril, Portugal, 4: 203-214.

Acknowledgements

None

File Placing

[Modelling]; [Mathematical]; [Artificial Intelligence]; [Burners]; [Safety and control]; [Advanced Process Control]

Access Domain

[Open Domain]

Parity between this pdf and the present html version of this Combustion File

The information contained in this pdf Combustion File edition is derived from the html edition of the same number and version, as published in the IFRF Combustion Handbook (http://www.handbook.ifrf.net).

The information published in this pdf edition is that which was included in the original html edition and has not been updated since. For example, there may have been minor corrections of errors in the html version which have been drawn to our attention by our readers. More importantly, with the passage of time and the continuous growth of the handbook, a number of other changes may have been made to the published html version, such as:

• The related combustion files may have been augmented;

• The filing system may have been further developed;

• The Access Domain may have changed.

These changes can be made without altering the main text and graphics. If substantial changes have been made, a new version of the Combustion File will have been published. Thus, to be sure of up-to-date information, go to the Handbook and download the latest html version of the Combustion File.

Limits of Liability

A full Limits of Liability declaration is shown at the entry of the IFRF ONLINE Combustion Handbook at www.handbook.ifrf.net. Through possession of this document, it is assumed that the holder has read and accepted the limits. The essential limitation is that: The International Flame Research Foundation, its Officers, its Member Organisations, its Individual Members and its staff accept no legal liability or responsibility whatsoever for the consequences of unqualified use or misuse of the information presented in the IFRF Combustion Handbook or any results derived from the Combustion Files which comprise this Handbook.

© IFRF 1999-2003
