An Artificial Neural Network for Data Mining

Rev. Téc. Ing. Univ. Zulia. Vol. 37, Nº 2, 65 - 73, 2014

An Artificial Neural Network for Data Mining Dimitrios C. Kyritsis Dept. of Mechanical Engineering, Khalifa Univ. of Science, Technology and Research, Abu Dhabi, United Arab Emirates

Abstract: Data mining is the logical process of extracting useful information and patterns from large volumes of data; it is also called knowledge discovery, or knowledge mining from data. Its goal is to find previously unknown patterns which, once found, can be used to support decisions. A neural network is a set of connected input/output units in which each connection has an associated weight; during the learning phase, the network learns by adjusting these weights. Neural networks can derive meaning from complicated or imprecise data and so can be used to extract patterns and detect trends. Companies maintain massive data warehouses holding data collected over many years, yet few are able to realize its actual value. Data mining is an efficient analysis method that helps companies take maximum advantage of these data resources, and neural networks are among the most effective ways of performing it: they make no prior assumptions about the data, are highly accurate, and are noise tolerant. Depending on the architecture, neural networks provide association, classification, clustering, prediction and forecasting to the data mining industry.

Keywords – Artificial Neural Network (ANN), Data Mining

1. INTRODUCTION

Huge amounts of data are now widely available, and there is a pressing need to turn such data into useful information and knowledge. The information and knowledge gained can serve applications ranging from market analysis, fraud detection and customer retention to production control and scientific exploration. Data mining tools perform data analysis and uncover important data patterns, contributing greatly to business strategies, knowledge bases, and scientific and medical research. Data mining integrates techniques from multiple disciplines, such as database and data warehouse technology, statistics, machine learning, high-performance computing, pattern recognition, neural networks, data visualization, information retrieval, image and signal processing, and spatial or temporal data analysis.

1.1 Prediction

Whereas classification predicts categorical (discrete, unordered) labels, prediction models continuous-valued functions; that is, it is used to predict missing or unavailable numerical data values rather than class labels.

1.2 Clustering

Clustering analyses data objects without consulting a known class label. In general, class labels are absent from the training data simply because they are not known to begin with; clustering can be used to generate such labels. Objects are clustered or grouped on the principle of maximizing intra-class similarity and minimizing inter-class similarity.

1.3 Outlier Analysis

A database may contain data objects that do not comply with the general behaviour or model of the data; these objects are outliers. In some applications, such as fraud detection, the rare events can be more interesting than the regularly occurring ones. The analysis of outlier data is referred to as outlier mining.
Outliers may be detected using statistical tests that assume a distribution or probability model for the data, or using distance measures where objects that are a substantial distance from any other cluster are considered outliers.
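As a small illustration (not from the paper), the statistical approach can be sketched with a z-score rule in Python. It assumes the data are roughly normally distributed; the 2.5-standard-deviation threshold is simply a common convention:

```python
import statistics

def zscore_outliers(values, threshold=2.5):
    """Flag values whose z-score exceeds the threshold.

    Assumes roughly normally distributed data; note that a single
    extreme value also inflates the standard deviation, which is why
    modest thresholds are often used in practice.
    """
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / stdev > threshold]

data = [10, 12, 11, 13, 12, 11, 10, 12, 95]  # 95 is an obvious outlier
print(zscore_outliers(data))
```

Distance-based detection works analogously: compute each object's distance to its nearest cluster and flag objects whose distance exceeds a chosen cutoff.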

2. NEURAL NETWORKS

2.1 Human Nervous System (HNS)

The human brain is composed of special cells called neurons. The estimated number of neurons in a human brain is 50 to 150 billion, of which there are more than 100 different kinds. The brain and the central nervous system control thinking, the ability to learn, and reaction to changes in the environment. People who suffer brain damage have difficulty learning and reacting to changing environments; even so, undamaged parts of the brain can often compensate through new learning.


2.2 Parts of a nerve cell

Nucleus: the central processing portion of the neuron.
Dendrites: provide the input signals to the cell.
Axon: sends the output signals; the axon terminals of one cell connect to the dendrites of adjacent cells.
Synapse: can increase or decrease the strength of the connection from neuron to neuron, causing excitation or inhibition of the subsequent neuron. Signals can be transmitted unchanged, or they can be altered by synapses.

3. ARTIFICIAL NEURAL NETWORK (ANN) ANALOGY TO HNS

An ANN model emulates a biological neural network. Neural computing actually uses a very limited set of concepts from biological neural systems. It is more of an analogy to the human brain than an accurate model of it. Neural concepts are usually implemented as software simulations of the massively parallel processes that involve processing elements interconnected in a network architecture. The artificial neuron receives inputs analogous to the electrochemical impulses the dendrites of biological neurons receive from other neurons. The output of the artificial neuron corresponds to signals sent out from a biological neuron over its axon. The artificial neurons receive the information from other neurons or external input stimuli, perform a transformation on the inputs, and then pass on the transformed information to other neurons or external outputs. These artificial signals can be changed by weights in a manner similar to the physical changes that occur in the synapses.

4. CHARACTERISTICS OF ANN:


4.1 I/P–O/P Mapping

The free parameters are adjusted so as to produce the desired output: the network learns from examples, much as our brain does.

4.2 Adaptivity

Networks can adapt quickly because the free parameters determine the strength of each connection.

4.3 Nonlinearity

Since most real-life problems are nonlinear, this property is an added advantage of neural networks.

4.4 Fault Tolerance

Because processing is parallel and distributed, the network is not easily disrupted by faults or errors.

4.5 VLSI Implementability

Neural networks can readily be implemented with modern VLSI technology.

5. ELEMENTS OF ANN

5.1 Processing Elements

The processing elements (PEs) of an ANN are artificial neurons. Each neuron receives inputs, processes them, and delivers a single output. The input can be raw input data or the output of other processing elements; the output can be the final result (e.g., 1 means yes, 0 means no), or it can be an input to other neurons.

5.2 Network Structure

Each ANN is composed of a collection of neurons grouped in three basic layers: input, intermediate (hidden) and output. A hidden layer is a layer of neurons that takes input from the previous layer and converts those inputs into outputs for further processing. Often a single hidden layer is present, which forms a nonlinear combination of the inputs and passes the transformed values to the output layer.

5.3 Network Information Processing

6. INPUTS

Each input corresponds to the numerical representation of a single attribute. For example, if the problem is to approve or reject a loan, attributes could include the applicant's income level, age and home ownership. Several types of data, such as text, pictures and voice, can be used as inputs; pre-processing may be needed to convert symbolic data into meaningful numeric inputs or to scale the data.

7. OUTPUTS

The outputs of a network contain the solution to a problem. In the loan example, the outputs can be yes or no; the ANN assigns numeric values to them, such as 1 for yes and 0 for no. Often, post-processing of the outputs is required.

8. CONNECTION WEIGHTS

Weights express the relative strength (or mathematical value) of the input data and of the many connections that transfer data from layer to layer; that is, they express the relative importance of each input to a processing element and, ultimately, to the outputs. Weights are crucial because they store learned patterns of information: it is through repeated adjustment of weights that a network learns.

9. SUMMATION FUNCTION


It computes the weighted sum of all the inputs entering each processing element: the summation function multiplies each input value by its weight and totals the products to give the weighted sum Y. For n inputs in one processing element, the formula is:

Y = X1W1 + X2W2 + ... + XnWn = Σ XiWi  (i = 1, ..., n)

For the jth neuron of several processing neurons in a layer, the formula is:

Yj = Σ XiWij  (i = 1, ..., n)

10. TRANSFORMATION (TRANSFER) FUNCTION

The summation function computes the internal stimulation, or activation level, of the neuron. Based on this level, the neuron may or may not produce an output. The relationship between the internal activation level and the output can be linear or nonlinear and is given by the transformation function; a commonly used choice is the sigmoid:

YT = 1 / (1 + e^-Y)

where YT is the transformed (i.e., normalized) value of Y.
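The summation and transformation functions of a single processing element can be combined in a few lines of Python. This sketch assumes the common sigmoid transfer function; the input values and weights are made up for illustration:

```python
import math

def process_element(inputs, weights):
    """One artificial neuron: weighted sum followed by a transfer function."""
    # Summation function: Y = sum_i Xi * Wi
    y = sum(w * x for w, x in zip(weights, inputs))
    # Transformation (transfer) function: sigmoid, YT = 1 / (1 + e^-Y)
    return 1.0 / (1.0 + math.exp(-y))

# Illustrative inputs and weights, not values from the paper.
print(round(process_element([3.0, 1.0, 2.0], [0.2, 0.4, 0.1]), 3))
```

With these numbers the weighted sum is Y = 0.6 + 0.4 + 0.2 = 1.2, which the sigmoid squashes into the interval (0, 1).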

11. NEURAL NETWORK ARCHITECTURES

11.1 Fully Connected Network

In a network with n nodes there are n^2 weights, one for each ordered pair of nodes (including self-connections).


11.2 Layered Network

The nodes are partitioned into subsets called layers, with no connection from layer j to layer k if j > k; connections never point backward.

11.3 Acyclic Network

A layered network with no intra-layer connections: every connection goes from a node in some layer j to a node in a later layer.

11.4 Feedforward Architecture Information flows in one direction along connecting pathways from the input layer via hidden layers to final output layers. There is no feedback.
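The one-directional flow of a feedforward pass is easy to sketch in code. The layer sizes, weight values and sigmoid transfer function below are illustrative assumptions, not values from the paper:

```python
import math

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def forward(x, layers):
    """Propagate an input vector through a list of layers, in one direction.

    Each layer is a list of (weights, bias) pairs, one per neuron; the
    output of each layer becomes the input of the next, with no feedback.
    """
    for layer in layers:
        x = [sigmoid(sum(w * xi for w, xi in zip(weights, x)) + b)
             for weights, b in layer]
    return x

# Hypothetical 2-2-1 network: input layer -> hidden layer -> output layer.
hidden = [([0.5, -0.6], 0.1), ([0.3, 0.8], -0.2)]
output = [([1.0, -1.0], 0.0)]
print(forward([1.0, 0.5], [hidden, output]))
```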


11.5 Recurrent Network There is at least one feedback loop. There could also be self-feedback links.


12. NEURAL NETWORKS IN DATA MINING

12.1 ANN Algorithms Used in Data Mining

Of all the algorithms used in ANNs, the two most frequently applied in data mining are back-propagation for classification and self-organizing maps for clustering.

12.2 Back-Propagation Algorithm

12.3 Steps

1. Initialize the weights and biases.
2. Feed in a training sample.
3. Propagate the inputs forward.
4. Back-propagate the error.
5. Update the weights and biases to reflect the propagated errors.
6. Stop when the terminating condition is reached.
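These steps can be sketched for a tiny 2-2-1 network. The network size, learning rate, logical-OR training set and fixed epoch count are illustrative choices, not prescribed by the paper:

```python
import math
import random

random.seed(0)

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

# Step 1: initialize weights and biases (2 inputs, 2 hidden neurons, 1 output).
w_h = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)]
b_h = [0.0, 0.0]
w_o = [random.uniform(-0.5, 0.5) for _ in range(2)]
b_o = 0.0
lr = 0.5  # learning rate

def train_step(x, target):
    global b_o
    # Steps 2-3: feed the training sample and propagate the inputs forward.
    h = [sigmoid(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j]) for j in range(2)]
    o = sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
    # Step 4: back-propagate the error (the sigmoid's derivative is f * (1 - f)).
    delta_o = (target - o) * o * (1 - o)
    delta_h = [h[j] * (1 - h[j]) * w_o[j] * delta_o for j in range(2)]
    # Step 5: update weights and biases to reflect the propagated errors.
    for j in range(2):
        w_o[j] += lr * delta_o * h[j]
        b_h[j] += lr * delta_h[j]
        for i in range(2):
            w_h[j][i] += lr * delta_h[j] * x[i]
    b_o += lr * delta_o
    return (target - o) ** 2

# Step 6: terminate after a fixed number of epochs (logical OR as a toy dataset).
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
errors = [sum(train_step(x, t) for x, t in data) for _ in range(2000)]
print(round(errors[0], 3), "->", round(errors[-1], 3))
```

The per-epoch squared error shrinks as the back-propagated deltas repeatedly adjust the weights.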

12.4 Advantages

• Easy to implement, with few parameters to adjust
• Applicable to a wide range of problems
• Able to form arbitrarily complex nonlinear mappings

12.5 Disadvantages

• No way to know how to precisely generate an arbitrary mapping, and training is slow
• Hard to determine the number of neurons and layers necessary
• No inherent novelty detection


12.6 Self-Organising Maps (SOM)

12.7 Introduction

• SOM is based on competitive learning and is mainly used for clustering.
• Principle of topographic map formation: the spatial location of an output neuron corresponds to a particular feature of the input space.
• Output neurons compete with each other to be activated, and only one is activated at any one time.
• The activated neuron is called the winner-takes-all, or winning, neuron.

12.8 Kohonen Neural Network

• It is the most frequently used structure for SOM.
• It has a feed-forward structure with a single computational layer arranged in rows and columns.
• Each neuron is fully connected to all the source nodes in the input layer.


1. Each node's weights are initialized.
2. A vector is chosen at random from the training data and presented to the lattice.
3. The node whose weight vector is most like the input vector is called the Best Matching Unit (BMU).
4. The radius of the BMU's neighbourhood is initially set to the radius of the lattice and shrinks at each step; all nodes within this radius are said to be in the BMU's neighbourhood.
5. Each neighbouring node's weights are adjusted to make them more like the input vector; the closer a node is to the BMU, the more its weights are altered.
6. Steps 2-5 are repeated for N iterations.
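The training loop above can be sketched in Python. The 4x4 lattice, linear decay of the learning rate and radius, and Gaussian neighbourhood influence are illustrative choices, not mandated by the paper:

```python
import math
import random

random.seed(1)

def train_som(data, rows=4, cols=4, iters=200, lr0=0.5):
    """Minimal SOM sketch on 2-D data; grid size and rates are illustrative."""
    # Step 1: initialize each node's weight vector randomly.
    weights = {(r, c): [random.random(), random.random()]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for t in range(iters):
        # Step 2: pick a random training vector.
        x = random.choice(data)
        # Step 3: the Best Matching Unit is the node closest to the input.
        bmu = min(weights,
                  key=lambda n: sum((w - xi) ** 2 for w, xi in zip(weights[n], x)))
        # Step 4: shrink the neighbourhood radius (and learning rate) over time.
        frac = 1.0 - t / iters
        radius = radius0 * frac
        lr = lr0 * frac
        # Step 5: pull every node inside the radius toward the input;
        # nodes nearer the BMU are pulled harder.
        for (r, c), w in weights.items():
            dist = math.hypot(r - bmu[0], c - bmu[1])
            if dist <= radius:
                influence = math.exp(-dist ** 2 / (2 * radius ** 2 + 1e-9))
                weights[(r, c)] = [wi + lr * influence * (xi - wi)
                                   for wi, xi in zip(w, x)]
    return weights  # Step 6: stop after a fixed number of iterations.

data = [[0.1, 0.1], [0.9, 0.9], [0.1, 0.9], [0.9, 0.1]]
som = train_som(data)
```

After training, nearby lattice nodes hold similar weight vectors, which is the topographic map property described above.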

13. CONCLUSION

13.1 In Classification

METHOD:
• Each o/p node is taken as a class.
• The i/p pattern is determined to belong to class i if the ith o/p node computes a higher value than the other nodes.

PRACTICAL EXAMPLES:
• Recognizing printed or handwritten characters
• Classifying loan applications as credit-worthy or not
• Analysing sonar and radar signals
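The decision rule, with one output node per class, amounts to an argmax over the output activations. The class names and activation values here are hypothetical:

```python
def predict_class(outputs, classes):
    """Assign the input pattern to the class whose output node fires highest."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return classes[best]

# Hypothetical output activations for a three-class loan decision.
print(predict_class([0.12, 0.81, 0.33], ["reject", "approve", "review"]))
```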


13.2 In Clustering

• Initially, each node reacts randomly to the input samples.
• Nodes with higher outputs for an input sample learn to react more strongly to that sample and to those around it.
• Thus the nodes specialize themselves to different i/p samples.

13.3 Practical Examples

Detecting patterns in product sales and in bioinformatics.

13.4 In Forecasting

Forecasting is a special case of function approximation in which the function values are represented as a time series. From a training set S, tuples of d+1 values are selected if the next value is to be predicted from d inputs, and the last component of each tuple is taken as the output; in other words, the task is to predict the next value in a sequence.

PRACTICAL EXAMPLES: Predicting stock-market indices
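Building the (d inputs, one output) training tuples from a time series is a simple sliding window; the price series below is made up for illustration:

```python
def make_training_tuples(series, d):
    """Slide a window of d inputs over the series; the (d+1)th value is the target."""
    return [(series[i:i + d], series[i + d]) for i in range(len(series) - d)]

prices = [101, 103, 102, 105, 107, 106]  # hypothetical index values
for inputs, target in make_training_tuples(prices, d=3):
    print(inputs, "->", target)
```

Each tuple then serves as one training sample for a network with d input nodes and one output node.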

ACKNOWLEDGEMENT

This research was supported by the Institute of Technology, Nirma University, Ahmedabad, Gujarat, India.

