Hardware Architectures for Deep Neural Networks
MICRO Tutorial, October 16, 2016
Website: http://eyeriss.mit.edu/tutorial.html

Speakers
•  Joel Emer: Senior Distinguished Research Scientist, NVIDIA; Professor, MIT
•  Vivienne Sze: Professor, MIT
•  Yu-Hsin Chen: PhD Candidate, MIT

Outline
•  Overview of Deep Neural Networks
•  DNN Development Resources
•  Survey of DNN Computation
•  DNN Accelerators
•  Network Optimizations
•  Benchmarking Metrics for Evaluation
•  DNN Training

Participant Takeaways
•  Understand the key design considerations for DNNs
•  Be able to evaluate different DNN implementations with benchmarks and comparison metrics
•  Understand the tradeoffs between various architectures and platforms
•  Assess the utility of various optimization approaches
•  Understand recent implementation trends and opportunities

Background of Deep Neural Networks


AI and Machine Learning: Machine Learning is a field within Artificial Intelligence.

“Field of study that gives computers the ability to learn without being explicitly programmed” – Arthur Samuel, 1959


Brain-Inspired Machine Learning: a subset of Machine Learning.

An algorithm that takes its basic functionality from our understanding of how the brain operates


How Does the Brain Work?

•  The basic computational unit of the brain is a neuron → ~86B neurons in the brain
•  Neurons are connected with roughly 10^14 – 10^15 synapses
•  Neurons receive input signals from dendrites and produce output signals along the axon, which interact with the dendrites of other neurons via synaptic weights
•  Synaptic weights are learnable and control influence strength

Image Source: Stanford

Spiking-based Machine Learning: one brain-inspired approach within Machine Learning.


Spiking Architecture
•  Brain-inspired
•  Integrate and fire
•  Example: IBM TrueNorth

[Merolla et al., Science 2014; Esser et al., PNAS 2016]
http://www.research.ibm.com/articles/brain-chip.shtml

Machine Learning with Neural Networks: neural networks are another brain-inspired approach, alongside spiking architectures.

Neural Networks: Weighted Sum

Image Source: Stanford


Many Weighted Sums

Image Source: Stanford


Deep Learning: a subset of neural-network-based machine learning.

What is Deep Learning?

Example: an input image is mapped through successive layers of learned features to the label “Volvo XC90”.

Image Source: [Lee et al., Comm. ACM 2011]

Why is Deep Learning Hot Now?
•  Big Data availability: 350M images uploaded per day; 2.5 petabytes of customer data hourly; 300 hours of video uploaded every minute
•  GPU acceleration
•  New ML techniques

ImageNet Challenge
•  Image Classification Task: 1.2M training images, 1000 object categories
•  Object Detection Task: 456k training images, 200 object categories

ImageNet: Image Classification Task

Top-5 classification error (%) fell steeply from 2010 to 2015: hand-crafted feature-based designs (2010–2011) were overtaken by deep CNN-based designs (2012 onward), with a large error-rate reduction due to deep CNNs, eventually reaching the human benchmark shown on the plot.

[Russakovsky et al., IJCV 2015]

GPU Usage for ImageNet Challenge


Deep Learning on Images
•  Image Classification
•  Image Segmentation
•  Object Localization
•  Action Recognition
•  Object Detection
•  Image Generation

Deep Learning for Speech
•  Speech Recognition
•  Natural Language Processing
•  Speech Translation
•  Audio Generation

Deep Learning on Games: Google DeepMind AlphaGo

Medical Applications of Deep Learning
•  Brain Cancer Detection

Image Source: [Jermyn et al., JBO 2016]

Deep Learning for Self-driving Cars


Connectomics – Finding Synapses

Pipeline: (1) EM → (2) ML Membrane Detection → (3) Watershed → (4) Agglomeration → (5) Merging → (6) Synapses → (7) Skeletons → (8) Graph

The machine learning step requires orders of magnitude more computation than the other parts.

Image Source: MIT

Mature Applications
•  Image
   o  Classification: image to object class
   o  Recognition: same as classification (except for faces)
   o  Detection: assigning bounding boxes to objects
   o  Segmentation: assigning object class to every pixel
•  Speech & Language
   o  Speech Recognition: audio to text
   o  Translation
   o  Natural Language Processing: text to meaning
   o  Audio Generation: text to audio
•  Games

Emerging Applications
•  Medical (Cancer Detection, Pre-Natal)
•  Finance (Trading, Energy Forecasting, Risk)
•  Infrastructure (Structure Safety and Traffic)
•  Weather Forecasting and Event Detection

This tutorial will focus on image classification.
http://www.nextplatform.com/2016/09/14/next-wave-deep-learning-applications/

Opportunities
$500B market over 10 years!

Image Source: Tractica

Opportunities
From EE Times – September 27, 2016:

“Today the job of training machine learning models is limited by compute, if we had faster processors we’d run bigger models…in practice we train on a reasonable subset of data that can finish in a matter of months. We could use improvements of several orders of magnitude – 100x or greater.”

–  Greg Diamos, Senior Researcher, SVAIL, Baidu

Overview of Deep Neural Networks


DNN Timeline
•  1940s: Neural networks were proposed
•  1960s: Deep neural networks were proposed
•  1990s: Early hardware for shallow neural nets
   –  Example: Intel ETANN (1992)
•  1998: LeNet for MNIST
•  2011: Speech recognition using DNN (Microsoft)
•  2012: Deep learning starts supplanting traditional ML
   –  AlexNet for image classification
•  Early 2010s: Rise of DNN accelerator research
   –  Examples: Neuflow, DianNao, etc.

Publications at Architecture Conferences: MICRO, ISCA, HPCA, ASPLOS

So Many Neural Networks!

http://www.asimovinstitute.org/neural-network-zoo/

DNN Terminology 101: Neurons

Image Source: Stanford

DNN Terminology 101: Synapses

Image Source: Stanford

DNN Terminology 101
Each synapse has a weight for neuron activation:

\[ y_j = \mathrm{activation}\left( \sum_{i=1}^{3} W_{ij} \times x_i \right) \]

(Figure: three input neurons x1, x2, x3 fully connected to four output neurons y1 … y4 through weights W11 … W34.)

Image Source: Stanford
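To make the weighted sum concrete, here is a minimal C sketch of one fully-connected layer matching the 3-input, 4-output figure above. The ReLU activation and the specific weight values are illustrative assumptions; the slides leave the activation function unspecified.

#include <stdio.h>

#define NUM_IN  3   /* x1..x3 in the figure */
#define NUM_OUT 4   /* y1..y4 in the figure */

/* ReLU, an assumed choice of activation function. */
static float activation(float v) { return v > 0.0f ? v : 0.0f; }

/* One fully-connected layer: each output is an activated weighted sum of all inputs. */
static void fc_layer(const float W[NUM_OUT][NUM_IN], const float x[NUM_IN], float y[NUM_OUT]) {
    for (int j = 0; j < NUM_OUT; j++) {
        float sum = 0.0f;
        for (int i = 0; i < NUM_IN; i++)
            sum += W[j][i] * x[i];   /* W_ij * x_i, as in the equation above */
        y[j] = activation(sum);
    }
}

int main(void) {
    /* Illustrative weights and inputs, not from the slides. */
    const float W[NUM_OUT][NUM_IN] = {{0.1f, 0.2f, 0.3f}, {0.4f, 0.5f, 0.6f},
                                      {-0.7f, 0.8f, 0.9f}, {1.0f, -1.0f, 0.5f}};
    const float x[NUM_IN] = {1.0f, 2.0f, 3.0f};
    float y[NUM_OUT];
    fc_layer(W, x, y);
    for (int j = 0; j < NUM_OUT; j++) printf("y%d = %f\n", j + 1, y[j]);
    return 0;
}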

DNN Terminology 101
Weight Sharing: multiple synapses use the same weight value.

\[ y_j = \mathrm{activation}\left( \sum_{i=1}^{3} W_{ij} \times x_i \right) \]

(Figure: the same network as above, with several synapses sharing a single weight value.)

Image Source: Stanford

DNN Terminology 101
Layer 1: L1 input neurons (e.g. image pixels) → L1 output neurons, a.k.a. activations.

Image Source: Stanford

DNN Terminology 101
Layer 2: L2 input activations → L2 output activations.

Image Source: Stanford

DNN Terminology 101
Fully-Connected: all input neurons are connected to all output neurons. Sparsely-Connected: only a subset of these connections is present.

Image Source: Stanford

DNN Terminology 101
Feed Forward: connections only flow from inputs toward outputs. Feedback: connections also flow backward to earlier neurons.

Image Source: Stanford

Popular Types of DNNs
•  Fully-Connected NN: feed forward, a.k.a. multilayer perceptron (MLP)
•  Convolutional NN (CNN): feed forward, sparsely-connected w/ weight sharing
•  Recurrent NN (RNN): feedback
•  Long Short-Term Memory (LSTM): feedback + storage

Inference vs. Training
•  Training: Determine weights
   –  Supervised: training set has inputs and outputs, i.e., labeled
   –  Reinforcement: output assessed via rewards and punishments
   –  Unsupervised: training set is unlabeled
   –  Semi-supervised: training set is partially labeled
•  Inference: Apply weights to determine output

Deep Convolutional Neural Networks

Pipeline: CONV Layer → Low-Level Features → … → CONV Layer → High-Level Features → FC Layer → Classes

Modern deep CNNs have 5 – 1000 CONV layers, followed by 1 – 3 FC layers.

Deep Convolutional Neural Networks

Each CONV layer in the pipeline consists of a convolution followed by an activation function.

Deep Convolutional Neural Networks

Each FC layer consists of a fully-connected computation followed by an activation function.

Deep Convolutional Neural Networks

Optional layers can appear in between CONV and/or FC layers: a NORM layer (normalization) and/or a POOL layer (pooling).
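To illustrate what a POOL layer does, below is a minimal C sketch of 2x2 max pooling with stride 2 over a single feature map. The window size, stride, and the choice of max (rather than average) pooling are common defaults assumed here, not specified on the slide.

/* Max pooling over one feature map: each output is the maximum of a
 * 2x2 input window. H and W are assumed even for simplicity. */
void maxpool_2x2(int H, int W, const float in[H][W], float out[H / 2][W / 2]) {
    for (int i = 0; i < H / 2; i++) {
        for (int j = 0; j < W / 2; j++) {
            float m = in[2 * i][2 * j];
            if (in[2 * i][2 * j + 1]     > m) m = in[2 * i][2 * j + 1];
            if (in[2 * i + 1][2 * j]     > m) m = in[2 * i + 1][2 * j];
            if (in[2 * i + 1][2 * j + 1] > m) m = in[2 * i + 1][2 * j + 1];
            out[i][j] = m;   /* keep the strongest activation in the window */
        }
    }
}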

Deep Convolutional Neural Networks

Convolutions account for more than 90% of overall computation, dominating runtime and energy consumption.

Convolution (CONV) Layer

A plane of input activations, a.k.a. the input feature map (fmap), of size H × W, is convolved with a filter (weights) of size R × S.

Convolution (CONV) Layer

The R × S filter overlaps a region of the input fmap, and an element-wise multiplication is performed.

Convolution (CONV) Layer

The element-wise products are accumulated into a partial sum (psum); each fully accumulated psum becomes one output activation in the E × F output fmap.

Convolution (CONV) Layer

Sliding window processing: the filter slides across the input fmap, producing one output activation per position.
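The sliding-window picture implies the output fmap dimensions. Assuming a stride of U and no padding (neither parameter appears on these slides), the standard relations are:

\[ E = \frac{H - R}{U} + 1, \qquad F = \frac{W - S}{U} + 1 \]

With unit stride, E = H − R + 1 and F = W − S + 1.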

Convolution (CONV) Layer

Many input channels (C): the input fmap and the filter each have C channels; the products from all C channels are accumulated into a single output channel.

Convolution (CONV) Layer

Many filters (M): applying M filters, each of size C × R × S, to the input fmap produces the M output channels of the output fmap.

Convolution (CONV) Layer

Many input fmaps (N): a batch of N input fmaps is processed with the same M filters, producing N output fmaps. In total: input fmaps are N × C × H × W, filters are M × C × R × S, and output fmaps are N × M × E × F.

CONV Layer Implementation

Output fmaps (O) are computed from the biases (B), input fmaps (I), and filter weights (W):

\[ O[n][m][e][f] = \mathrm{activation}\left( B[m] + \sum_{c=0}^{C-1} \sum_{r=0}^{R-1} \sum_{s=0}^{S-1} I[n][c][e+r][f+s] \times W[m][c][r][s] \right) \]

(unit stride assumed)

CONV Layer Implementation
Naïve 7-layer for-loop implementation (see the sketch below).
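Below is a sketch of the naive 7-loop implementation the slide refers to, using the dimension names defined on the preceding slides (N, M, C, E, F, R, S), assuming unit stride and no padding, and wrapped in a C99 function so the array shapes are explicit.

/* Naive 7-loop CONV layer (unit stride, no padding assumed):
 * I: N x C x H x W input fmaps, with H = E + R - 1 and W = F + S - 1
 * W: M x C x R x S filter weights, B: M biases
 * O: N x M x E x F output fmaps                                      */
void conv_layer(int N, int M, int C, int E, int F, int R, int S,
                const float I[N][C][E + R - 1][F + S - 1],
                const float W[M][C][R][S], const float B[M],
                float O[N][M][E][F]) {
    for (int n = 0; n < N; n++)                 /* input fmaps (batch)  */
      for (int m = 0; m < M; m++)               /* output channels      */
        for (int e = 0; e < E; e++)             /* output fmap rows     */
          for (int f = 0; f < F; f++) {         /* output fmap cols     */
            float sum = B[m];                   /* start from the bias  */
            for (int c = 0; c < C; c++)         /* input channels       */
              for (int r = 0; r < R; r++)       /* filter rows          */
                for (int s = 0; s < S; s++)     /* filter cols          */
                  sum += I[n][c][e + r][f + s] * W[m][c][r][s];
            O[n][m][e][f] = sum;  /* activation is applied in a separate step */
          }
}

The loops over n, m, e, f, c, r, and s give the seven levels; the multiply-accumulate in the innermost loop executes N × M × E × F × C × R × S times, which is why CONV layers dominate overall computation.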
