ImageNet Classification with Deep Convolutional Neural Networks
Choi Yongchan, Department of Statistics
May 4, 2017
Choi Yongchan (Department of Statistics) ImageNet Classification with Deep Convolutional Neural Networks May 4, 2017
Outline
Dataset
Architecture
Reducing Overfitting
Results
Discussion
Dataset
ImageNet Large-Scale Visual Recognition Challenge (ILSVRC)
Model performance is evaluated on the ILSVRC-2010 data.
Roughly 1,000 images in each of 1,000 categories: 1.2 million training images, 50,000 validation images, and 150,000 test images.
The images are down-sampled to a fixed resolution of 256 × 256.
Architecture
Architecture - ReLU Nonlinearity
Standard (saturating) ways to model a neuron's output: f(x) = tanh(x) or f(x) = (1 + exp(−x))⁻¹
Non-saturating nonlinearity (ReLU): f(x) = max(0, x)
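A minimal NumPy sketch (not from the slides or the paper) contrasting the saturating activations with ReLU:

```python
import numpy as np

def tanh_act(x):
    # Saturating: output flattens toward ±1, so gradients vanish for large |x|.
    return np.tanh(x)

def sigmoid(x):
    # Saturating: output flattens toward 0 and 1.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Non-saturating: identity for positive inputs, zero otherwise,
    # which keeps gradients from vanishing and speeds up training.
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))  # [0. 0. 3.]
```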
Architecture - Training on Multiple GPUs
Two GTX 580 3GB GPUs; the GPUs communicate only in certain layers.
This scheme reduces the top-1 and top-5 error rates by 1.7 and 1.2 percent, respectively, compared with a net with half as many kernels in each convolutional layer trained on one GPU.
Architecture
- Local Response Normalization
ReLUs have the desirable property that they do not require input normalization, but a local normalization scheme still aids generalization.
This scheme reduces the top-1 and top-5 error rates by 1.4 and 1.2 percent.

b^i_{x,y} = a^i_{x,y} / ( k + α · Σ_{j = max(0, i−n/2)}^{min(N−1, i+n/2)} (a^j_{x,y})² )^β

with k = 2, n = 5, α = 10⁻⁴, β = 0.75
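The formula above can be transcribed directly into NumPy. This is a sketch, not the authors' implementation; the input layout (channels, height, width) and the function name are assumptions:

```python
import numpy as np

def local_response_norm(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """AlexNet-style local response normalization across channels.

    a: activations of shape (N_channels, H, W).
    """
    N = a.shape[0]
    b = np.empty_like(a)
    for i in range(N):
        # Sum squared activations over the n adjacent channels,
        # clipped to the valid channel range [0, N-1].
        lo = max(0, i - n // 2)
        hi = min(N - 1, i + n // 2)
        s = (a[lo:hi + 1] ** 2).sum(axis=0)
        b[i] = a[i] / (k + alpha * s) ** beta
    return b
```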
Architecture - Overlapping Pooling
Traditional pooling: the stride s equals the pooling window size z (s = z).
Overlapping pooling: s < z (here s = 2, z = 3).
This scheme reduces the top-1 and top-5 error rates by 0.4 and 0.3 percent compared with the non-overlapping scheme s = 2, z = 2.
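A small NumPy sketch (an illustration, not the paper's code) showing how the stride/window relationship distinguishes the two schemes:

```python
import numpy as np

def max_pool2d(x, z, s):
    """Max pooling over a 2-D map with window size z and stride s.

    s == z is the traditional non-overlapping scheme;
    s < z gives overlapping pooling, as used in AlexNet (z=3, s=2).
    """
    h, w = x.shape
    out_h = (h - z) // s + 1
    out_w = (w - z) // s + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * s:i * s + z, j * s:j * s + z].max()
    return out

x = np.arange(16.0).reshape(4, 4)
print(max_pool2d(x, z=2, s=2))  # non-overlapping: 2x2 output
print(max_pool2d(x, z=3, s=2))  # overlapping: windows share columns/rows
```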
Architecture
1st convolutional layer: 96 kernels of size 11 × 11 × 3 (stride 4)
2nd convolutional layer: 256 kernels of size 5 × 5 × 48
3rd convolutional layer (inter-GPU connection): 384 kernels of size 3 × 3 × 256
4th convolutional layer: 384 kernels of size 3 × 3 × 192
5th convolutional layer: 256 kernels of size 3 × 3 × 192
The fully connected layers have 4096 neurons each.
Reducing Overfitting
The network architecture above has 60 million parameters.
AlexNet takes two primary approaches to reduce overfitting: data augmentation and dropout.
Reducing Overfitting - Data Augmentation
Image translations and horizontal reflections:
1. At training time, extract random 224 × 224 patches (and their horizontal reflections) from the 256 × 256 images, yielding 2048 patches per image.
2. At test time, extract ten 224 × 224 patches and average the network's predictions over them.
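The training-time step can be sketched as follows (a minimal illustration; the function name and H × W × 3 image layout are assumptions, not the authors' code):

```python
import numpy as np

def random_crop_flip(img, crop=224):
    """Extract a random crop x crop patch and flip it horizontally
    with probability 0.5, as in AlexNet-style translation augmentation.

    img: array of shape (H, W, 3), e.g. a 256 x 256 ImageNet image.
    """
    h, w = img.shape[:2]
    top = np.random.randint(0, h - crop + 1)
    left = np.random.randint(0, w - crop + 1)
    patch = img[top:top + crop, left:left + crop]
    if np.random.rand() < 0.5:
        patch = patch[:, ::-1]  # horizontal reflection
    return patch
```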
Reducing Overfitting - Data Augmentation
Altering the intensities of the RGB channels: to each pixel I_{xy} = [I^R_{xy}, I^G_{xy}, I^B_{xy}]^T, add the quantity

[p_1, p_2, p_3][α_1 λ_1, α_2 λ_2, α_3 λ_3]^T

where p_i and λ_i are the i-th eigenvector and eigenvalue of the 3 × 3 covariance matrix of RGB pixel values, and α_i ~ N(0, 0.1²).
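This PCA-based color augmentation can be sketched in NumPy (an illustrative reimplementation, not the authors' code; the function name and the choice of computing the covariance per image are assumptions):

```python
import numpy as np

def pca_color_augment(img, sigma=0.1):
    """Add [p1, p2, p3][a1*l1, a2*l2, a3*l3]^T to every pixel, where
    p_i / l_i are eigenvectors / eigenvalues of the 3x3 RGB covariance
    and a_i ~ N(0, sigma^2).

    img: array of shape (H, W, 3).
    """
    flat = img.reshape(-1, 3).astype(np.float64)
    cov = np.cov(flat, rowvar=False)          # 3x3 covariance of RGB values
    eigvals, eigvecs = np.linalg.eigh(cov)    # columns of eigvecs are p_i
    alphas = np.random.normal(0.0, sigma, 3)  # alpha_i ~ N(0, sigma^2)
    shift = eigvecs @ (alphas * eigvals)      # [p1 p2 p3][a1*l1, ...]^T
    return img + shift                        # same shift for every pixel
```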
Reducing Overfitting - Dropout
Each neuron is dropped with probability 0.5 during training.
At test time, all neurons are used but their outputs are multiplied by 0.5.
Without dropout, the network exhibits substantial overfitting.
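The train/test behavior above can be sketched as two functions (a minimal illustration; the function names are assumptions):

```python
import numpy as np

def dropout_train(x, p=0.5):
    """Training: zero each activation independently with probability p."""
    mask = (np.random.rand(*x.shape) >= p).astype(x.dtype)
    return x * mask

def dropout_test(x, p=0.5):
    """Test time: keep every neuron but scale outputs by (1 - p),
    matching the expected activation seen during training."""
    return x * (1.0 - p)
```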
Results
Discussion
The depth of the network is really important for achieving these results.
Future direction: classification on video, since video sequences provide temporal structure that carries very helpful information.