## Week 3 - Artificial Neural Networks

In which we look at the general structure of artificial neural networks, including the mathematical description of processing within a node, and how networks can learn by gradient descent.

### Learning Objectives

By the end of the session the student will be able to:

- describe the key developments in the history of artificial neural networks for machine learning
- describe the perceptron learning algorithm
- explain how gradient descent works in multi-layer networks
- use the Keras toolkit to implement, train and test neural network models for simple problems

### Outline

- Neural networks for machine learning
- The Perceptron
  - Warren McCulloch and Walter Pitts, A Logical Calculus of the Ideas Immanent in Nervous Activity, 1943
  - Frank Rosenblatt, The Perceptron: A Perceiving and Recognizing Automaton, 1957
  - Perceptron learning animation
- Multiple layers of perceptrons
  - Marvin Minsky and Seymour Papert, Perceptrons, 1969
- Introduction to learning by gradient descent
- Automatic differentiation
- Deep neural networks
  - A. Krizhevsky, I. Sutskever, G. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NeurIPS, 2012
  - D. Kingma, J. Ba, Adam: A Method for Stochastic Optimization, 2015
- Programming DNNs

We view the brain as an information processing system, looking at the operation of neurons and how a network of neurons can perform complex calculations. We discuss how artificial neural networks can be motivated by the processing networks of the brain without being simulations of biological neurons.

We discuss a simple mathematical model of a neuron proposed by McCulloch and Pitts, and a means to train such a model from data proposed by Rosenblatt: the Perceptron learning rule.
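As a sketch of the idea (not the course's own code), the Perceptron learning rule can be written in a few lines of NumPy. Here a single threshold unit learns the AND function; the choice of task, learning rate and epoch count are illustrative assumptions:

```python
import numpy as np

# Training data for the AND function (an illustrative, linearly separable task)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w = np.zeros(2)  # weights
b = 0.0          # bias
lr = 0.1         # learning rate

def predict(x):
    # McCulloch-Pitts style threshold unit: fire if the weighted sum exceeds 0
    return int(np.dot(w, x) + b > 0)

for epoch in range(20):
    errors = 0
    for xi, target in zip(X, y):
        # Perceptron rule: nudge weights toward each misclassified example
        update = lr * (target - predict(xi))
        w += update * xi
        b += update
        errors += int(update != 0)
    if errors == 0:  # converged: every training point classified correctly
        break

print([predict(xi) for xi in X])
```

Because AND is linearly separable, the rule is guaranteed to converge; XOR, the classic counterexample raised by Minsky and Papert, is not.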

We look at the criticisms of perceptrons as a means to perform information processing, and a solution to the problem of training multiple layers of perceptrons through the use of gradient descent.

We briefly review the history of how networks of multi-layer perceptrons turned into "deep" networks. Problems in extending gradient descent to large networks were gradually overcome by improvements in algorithms and an increase in computer power. We outline common activation functions, loss functions and optimisation methods.
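For reference, the common activation and loss functions mentioned above can be written directly in NumPy (a sketch; the epsilon guard and stability trick are standard conventions, not course-specific):

```python
import numpy as np

# Common activation functions
def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row maximum for numerical stability before exponentiating
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# A common loss function for classification
def cross_entropy(probs, targets):
    # targets are one-hot rows; a small epsilon guards against log(0)
    return -np.mean(np.sum(targets * np.log(probs + 1e-12), axis=-1))
```

Optimisers such as Adam (see the reading above) then decide how the gradients of these losses are turned into parameter updates.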

We introduce the Keras toolkit for building, training and running deep neural networks.
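A minimal Keras sketch, assuming TensorFlow 2.x is installed; the task (learning XOR) and the layer sizes are illustrative choices, not the session's worked example:

```python
import numpy as np
import tensorflow as tf

# XOR training data
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype="float32")
y = np.array([0, 1, 1, 0], dtype="float32")

# Build a small feed-forward network: 2 inputs -> 8 hidden units -> 1 output
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Compile with an optimiser and loss, then train
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=500, verbose=0)

preds = model.predict(X, verbose=0)
print(preds.round().ravel())
```

The same three steps — build, compile, fit — carry over to the exercise notebooks below.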

### Research Paper of the Week

- I. Howard, M. Huckvale, Two-level recognition of isolated words using neural nets, First IEE International Conference on Artificial Neural Networks, 1989.

### Web Resources

- How Deep Neural Networks Work, excellent introductory video. [If you want to be challenged, try to build a version of this example in Keras - classify 4-pixel inputs as solid, or made up of a vertical, horizontal or diagonal line]
- LinkedIn video course: **Building Recommender Systems with Machine Learning and AI: History of artificial neural networks**. Accessible to all UCL staff and students through this sign-on.
- LinkedIn video course: **Artificial Intelligence Foundations: Neural Networks**. Accessible to all UCL staff and students through this sign-on.
- Introduction to Neural Networks, YouTube videos by 3Blue1Brown.

### Readings

Be sure to read one or more of these discussions of deep learning:

- Keras tutorial: deep learning in Python.
- Your First Deep Learning Project in Python with Keras Step-By-Step.

### Exercises

Implement answers to the problems described in the notebooks below. Save your completed notebooks into your personal Google Drive account.

Last modified: 22:45 11-Mar-2022.