CS407 Neural Computation
Lecturer: A/Prof. M. Bennamoun
General Information
Lecturer/Tutor/Lab Demonstrator:
– A/Prof. M. Bennamoun
Timetable:
– Lectures: Mondays 10am-11:45am, CS-Room 1.24
– Tutorials: fortnightly, starting in week 3
– Laboratories: starting in week 3
– Consultations: Thursdays 2-3pm
Assessment:
– Project (weeks 9-13): 40%
– Final Exam (2 hours): 60%
Recommendations: Familiarize yourselves with Matlab and its neural network toolbox.
Text Books
M. Hassoun, Fundamentals of Artificial Neural Networks, MIT Press. – A thorough and mathematically oriented text.
S. Haykin, “Neural Networks: A Comprehensive Foundation” – An extremely thorough, strongly mathematically grounded text on the subject. – “If you have strong mathematical analysis basics and you love Neural Networks then you have found your book”
L. Fausett, “Fundamentals of Neural Networks” – “clear and useful in presenting the topics, and more importantly, in presenting the algorithms in a clear simple format which makes it very easy to produce a computer program implementing these algorithms just by reading the book”
Phil Picton, “Neural Networks”, Prentice Hall. – A simple introduction to the subject. It does not contain algorithm descriptions but includes other useful material.
Today’s Lecture
• Motivation + What are ANNs?
• The Brain
• Brain vs. Computers
• Historical overview
• Applications
• Course content
Motivations: Why ANNs? “Sensors” or “power of the brain”
[Figure: sagittal and transverse views of the brain]
Computer vs. Brain
Computers are good at:
1/ Fast arithmetic
2/ Doing precisely what the programmer programs them to do
Computers are not good at:
1/ Interacting with noisy data or data from the environment
2/ Massive parallelism
3/ Fault tolerance
4/ Adapting to circumstances
Brain and Machine
• The Brain
– Pattern Recognition
– Association
– Complexity
– Noise Tolerance
• The Machine
– Calculation
– Precision
– Logic
The contrast in architecture
• The Von Neumann architecture uses a single processing unit:
– Tens of millions of operations per second
– Absolute arithmetic precision
• The brain uses many slow, unreliable processors acting in parallel
Features of the Brain
• Ten billion (10^10) neurons
• On average, several thousand connections per neuron
• Hundreds of operations per second
• Neurons die off frequently (never replaced)
• Compensates for problems by massive parallelism
The biological inspiration
• The brain has been extensively studied by scientists.
• Its vast complexity, and the ethical limits on research, prevent all but a rudimentary understanding.
• Even the behaviour of an individual neuron is extremely complex.
The biological inspiration
• Single “percepts” are distributed among many neurons.
• Localized parts of the brain are responsible for certain well-defined functions (e.g. vision, motion).
• Which features are integral to the brain's performance?
• Which are incidentals imposed by the fact of biology?
Do we want computers which get confused and make mistakes…?
Where can neural network systems help?
• Where we can't formulate an algorithmic solution.
• Where we can get lots of examples of the behaviour we require.
• Where we need to pick out the structure from existing data.
http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html
What are “Artificial neural networks”? http://webopedia.internet.com/TERM/n/neural_network.html
ANNs are a type of artificial intelligence that attempts to imitate the way a human brain works. Rather than using a digital model, in which all computations manipulate zeros and ones, a neural network works by creating connections between processing elements, the computer equivalent of neurons. The organization and weights of the connections determine the output.
What are Artificial Neural Networks? …
(i) Hardware inspired by biological neural networks, e.g. human brain
(ii) Parallel, Distributed Computing Paradigm (i.e. Method)
(iii) Algorithm for learning by example
(iv) Tolerant to errors in data and in hardware
(v) Example of a complex system built from simple parts
Sims, history & areas of use…
Strictly speaking, a neural network implies a nondigital computer, but neural networks can be simulated on digital computers. The approach is beginning to prove useful in certain areas that involve recognizing complex patterns, such as voice recognition and image recognition.
Definition of an ANN
An ANN is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in 2 respects:
1. Knowledge is acquired by the network through a learning process.
2. Interconnection strengths known as synaptic weights are used to store the knowledge.
Other terms/names
• connectionist
• parallel distributed processing
• neural computation
• adaptive networks..
Simple explanation – how NNs work http://www.zsolutions.com/light.htm
Neural Networks use a set of processing elements (or nodes) loosely analogous to neurons in the brain (hence the name, neural networks). These nodes are interconnected in a network that can then identify patterns in data as it is exposed to the data. In a sense, the network learns from experience just as people do (case of supervised learning). This distinguishes neural networks from traditional computing programs, that simply follow instructions in a fixed sequential order.
Simple explanation – how NNs work http://www.zsolutions.com/light.htm
The structure of a feed forward neural network
The bottom layer represents the input layer, in this case with 5 inputs labelled X1 through X5. In the middle is something called the hidden layer, with a variable number of nodes. It is the hidden layer that performs much of the work of the network. The output layer in this case has two nodes, Z1 and Z2 representing output values we are trying to determine from the inputs. For example, we may be trying to predict sales (output) based on past sales, price and season (input).
Simple explanation – hidden layer http://www.zsolutions.com/light.htm
More on the Hidden Layer
Each node in the hidden layer is fully connected to the inputs. That means what is learned in a hidden node is based on all the inputs taken together. This hidden layer is where the network learns interdependencies in the model. The following diagram provides some detail into what goes on inside a hidden node (see more details later). Simply speaking a weighted sum is performed: X1 times W1 plus X2 times W2 on through X5 and W5. This weighted sum is performed for each hidden node and each output node and is how interactions are represented in the network. Each summation is then transformed using a nonlinear function before the value is passed on to the next layer.
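The weighted sum and nonlinear transformation described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the logistic sigmoid is assumed as the nonlinear function, the hidden layer is given 3 nodes, the weights are random, there are no bias terms, and the names `node_output`, `layer_output` and `forward` are hypothetical rather than taken from the lecture.

```python
import math
import random

def sigmoid(a):
    """Logistic function: one common choice of nonlinearity (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-a))

def node_output(inputs, weights):
    """One node: weighted sum X1*W1 + X2*W2 + ..., then the nonlinear transform."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(weighted_sum)

def layer_output(inputs, weight_rows):
    """Fully connected layer: every node sees all of the inputs."""
    return [node_output(inputs, row) for row in weight_rows]

def forward(x, hidden_weights, output_weights):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_output(x, hidden_weights)
    return layer_output(hidden, output_weights)

random.seed(0)
N_INPUTS, N_HIDDEN, N_OUTPUTS = 5, 3, 2  # 5 inputs and 2 outputs as in the slide; 3 hidden nodes assumed
hidden_weights = [[random.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
output_weights = [[random.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(N_OUTPUTS)]

x = [0.2, 0.5, 0.1, 0.9, 0.4]            # X1..X5, made-up input values
z1, z2 = forward(x, hidden_weights, output_weights)
print(z1, z2)                            # Z1 and Z2, each between 0 and 1
```

With untrained random weights the outputs carry no meaning; the sketch only shows how values flow from one layer to the next.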
Where does the NN get the weights? (case of supervised learning) http://www.zsolutions.com/light.htm
Again, the simple explanation... The network is repeatedly shown observations from available data related to the problem to be solved, including both inputs (the X1 through X5 in the diagram above) and the desired outputs (Z1 and Z2 in the diagram). The network then tries to predict the correct output for each set of inputs by gradually reducing the error. There are many algorithms for accomplishing this, but they all involve an iterative search for the proper set of weights (the W1-W5) that will do the best job of accurately predicting the outputs.
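As a rough illustration of this search for weights, the sketch below trains a single linear node with the delta (LMS) rule: after each observation, every weight is nudged in proportion to the prediction error. The choice of rule, the learning rate, the number of passes and the four training samples are all assumptions made for the example; the slide itself does not name a particular algorithm.

```python
def predict(weights, x):
    """Output of a single linear node: the weighted sum of its inputs."""
    return sum(w * xi for w, xi in zip(weights, x))

def train(samples, n_inputs, learning_rate=0.1, epochs=200):
    """Repeatedly show the samples and adjust the weights to reduce the error."""
    weights = [0.0] * n_inputs
    history = []                      # total squared error per pass over the data
    for _ in range(epochs):
        total_sq_error = 0.0
        for x, target in samples:
            error = target - predict(weights, x)
            # Delta rule: move each weight in the direction that shrinks the error.
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            total_sq_error += error ** 2
        history.append(total_sq_error)
    return weights, history

# Made-up observations (inputs, desired output), generated from weights (2, -1).
samples = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
    ([0.5, 0.5], 0.5),
]

weights, history = train(samples, n_inputs=2)
print(weights)                        # close to [2.0, -1.0]
print(history[0], history[-1])        # the error shrinks over the passes
```

Multi-layer networks need a more elaborate procedure, but the loop has the same shape: predict, measure the error, adjust the weights, repeat.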
Historical Overview 40’s, 50’s 60’s
(i) McCulloch & Pitts (1943) - Threshold neuron. McCulloch & Pitts are generally recognised as the designers of the first neural network
(ii) Hebb (1949) - first learning rule
(iii) Rosenblatt (1958) - Perceptron & Learning Rule
(iv) Widrow & Hoff (1962) - Adaline
Historical Overview 70’s
(v) Minsky & Papert (1969) – Perceptron limitations described, interest wanes (death of ANNs)
(vi) 1970’s were quiet years
Historical Overview 80’s 90’s
(vii) 1980’s: Explosion of interest
(viii) Backpropagation discovered (Werbos ‘74, Parker ‘85, Rumelhart ‘86)
(ix) Hopfield (1982): Associative memory, fast optimisation
(x) Fukushima (1988): neocognitron for character recognition
Applications of ANNs
• Signal Processing, e.g. Adaptive Echo Cancellation
• Pattern Recognition, e.g. Character Recognition
• Speech Synthesis (e.g. Text-to-Speech) & recognition
• Forecasting and prediction
• Control & Automation (neuro-controllers), e.g. Broom Balancing
• Radar interpretation
• Interpreting brain scans
• Stock market prediction
• Associative memory
• Optimization, etc…
For more reference, see the Proceedings of the IEEE, Special Issue on Artificial Neural Network Applications, Oct. 1996, Vol. 84, No. 10.
A simple example
Pattern classifier
We want to classify each input pattern as one of the 10 numerals (3 in the above figure).
0 1 2 3 4 5 6 7 8 9
ANN sol’n
ANNs are able to perform an accurate classification even if the input is corrupted by noise. Example:
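A toy version of this noise tolerance can be sketched as follows. It is not the network from the slide: it assumes a single layer with one output node per class, whose weights are simply a stored template of that numeral, and it picks the node with the largest weighted sum. Only three numerals (0, 1 and 3) are included to keep it short, and the 5x3 bitmaps and the names `to_bipolar`, `classify` and `flip_pixel` are made up for illustration.

```python
# Hypothetical 5x3 pixel templates ("1" = ink, "0" = background).
TEMPLATES = {
    0: ["111", "101", "101", "101", "111"],
    1: ["010", "110", "010", "010", "111"],
    3: ["111", "001", "111", "001", "111"],
}

def to_bipolar(rows):
    """Flatten a bitmap into a vector of +1 (ink) and -1 (background)."""
    return [1 if ch == "1" else -1 for row in rows for ch in row]

# One output node per numeral; its weights are the numeral's own template.
WEIGHTS = {digit: to_bipolar(rows) for digit, rows in TEMPLATES.items()}

def classify(pattern):
    """Return the numeral whose output node has the largest weighted sum."""
    scores = {
        digit: sum(w * p for w, p in zip(weights, pattern))
        for digit, weights in WEIGHTS.items()
    }
    return max(scores, key=scores.get)

def flip_pixel(pattern, index):
    """Corrupt a pattern by inverting a single pixel."""
    noisy = list(pattern)
    noisy[index] = -noisy[index]
    return noisy

clean_three = to_bipolar(TEMPLATES[3])
noisy_three = flip_pixel(clean_three, 4)   # invert the middle pixel of the second row
print(classify(clean_three), classify(noisy_three))  # prints: 3 3
```

Because these three templates differ from one another in at least three pixels, any single flipped pixel still leaves the correct node with the highest score. Learned weights, as covered later in the course, aim for the same kind of robustness on less tidy data.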
Artificial Neural Networks (Taxonomy)
Neurons connected together form a network. These networks (ANNs) differ from each other according to 3 main criteria:
(1) The properties of the neuron or cell (threshold and activation function)
(2) The architecture of the network, or topology
(3) The learning mechanism or learning rule (weight calculation), and the way the weights are updated (update rule, e.g. synchronous, continuous)
And of course the type of implementation: software, analog hardware, digital hardware.
Topology/Architecture
There are 3 main types of topologies:
• Single-Layer Feedforward Networks
• Multilayer Feedforward Networks
• Recurrent Networks
Network Architecture
[Figures: single-layer network; multiple-layer, fully connected network; recurrent network without hidden units; recurrent network with hidden units. Labels: inputs, outputs, unit delay operator]
Topics to be covered
• Background information
• Threshold gates
• Multilayer networks
• Classification problem
• Learning process
• Correlation matrix
• Perceptron learning rule
• Supervised learning
• Multi-layer perceptron (MLP): Backpropagation alg.
• Unsupervised learning
Topics to be covered (…)
• Hebbian learning rule, Oja’s rule
• Competitive learning
• Instar/Outstar Networks (ART-Adaptive Resonance Theory)
• Self-organizing feature maps (Kohonen’s nets)
• Hopfield networks, stochastic neurons
• Boltzmann machines & their applications
• Recurrent neural nets & temporal NNs
Assumed Background
• Basic linear algebra
• Basic differential calculus
• Basic combinatorics
• Basic probability