Commit 9634e2c (merge of parents d7c60fd and e686a54)

README.md — 52 additions & 2 deletions
# GoogLeNet for Image Classification

- TensorFlow implementation of [Going Deeper with Convolutions](https://research.google.com/pubs/pub43022.html) (CVPR'15).
<!-- - **The inception structure** -->
- This repository contains examples of natural image classification using a pre-trained model, as well as training an Inception network from scratch on the [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset (93.64% accuracy on the testing set). The pre-trained model on CIFAR-10 can be downloaded from [here](https://www.dropbox.com/sh/kab0bzpy0zymljx/AAD2YCVm0J1Qmlor8EoPzgQda?dl=0).
- Architecture of GoogLeNet from the paper:
![googlenet](fig/arch.png)

- The GoogLeNet model is defined in [`src/nets/googlenet.py`](src/nets/googlenet.py).
- Inception module is defined in [`src/models/inception_module.py`](src/models/inception_module.py).
- An example of image classification using a pre-trained model is in [`examples/inception_pretrained.py`](examples/inception_pretrained.py).
- An example of training a network from scratch on CIFAR-10 is in [`examples/inception_cifar.py`](examples/inception_cifar.py).

For testing the pre-trained model
- Images are rescaled so that the smallest side equals 224 before being fed into the model. This is not the same as the original paper, which uses an ensemble of 7 similar models and 144 224x224 crops per image for testing, so the performance here will not be as good as reported in the paper.
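For illustration, here is a minimal sketch of that rescaling step (the helper below and the use of Pillow are my assumptions, not code from this repository):

```
# Sketch: resize an image so that its smallest side is 224 while keeping the aspect ratio.
from PIL import Image

def rescale_smallest_side(im, target=224):
    w, h = im.size
    scale = target / float(min(w, h))
    return im.resize((int(round(w * scale)), int(round(h * scale))), Image.BILINEAR)
```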
<!--- **LRN** -->
For training from scratch on CIFAR-10
- All the LRN layers are removed from the convolutional layers.
- [Batch normalization](https://arxiv.org/abs/1502.03167) and ReLU activation are used in all the convolutional layers, including those inside the Inception modules, except for the output layer (a rough sketch follows this list).
- Two auxiliary classifiers are used as mentioned in the paper, though with 512 instead of 1024 hidden units in the two fully connected layers to reduce computation. However, I found that the results on CIFAR-10 are almost the same with and without the auxiliary classifiers.
- Since the 32 x 32 images are downsampled to 1 x 1 before being fed into `inception_5a`, the multi-scale structure of the inception layers becomes useless and harms performance (around **80%** accuracy). To make full use of the multi-scale structure, the stride of the first convolutional layer is reduced to 1 and the first two max pooling layers are removed. The feature map (32 x 32 x channels) then has almost the same size as described in Table 1 of the paper (28 x 28 x channels) before being fed into `inception_3a`. I have also tried only reducing the stride or only removing one max pooling layer, but I found the current setting gives the best performance on the testing set.
- During training, dropout with keep probability 0.4 is applied to the two fully connected layers, and weight decay of 5e-4 is used as well.
- The network is trained with the Adam optimizer. The batch size is 128. The initial learning rate is 1e-3, decays to 1e-4 after 30 epochs, and finally decays to 1e-5 after 50 epochs (see the sketch after this list).
- The mean value of each color channel, computed from the training set, is subtracted from the corresponding channel of the input images.
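As a rough illustration of the two architectural choices above (conv + batch norm + ReLU everywhere, and a stride-1 first convolution with the first two max pooling layers removed), here is a sketch in tf.keras. The repository defines its own layers in `src/nets/googlenet.py` and `src/models/inception_module.py`, so treat this only as an illustration, not the actual implementation:

```
import tensorflow as tf
from tensorflow.keras import layers

def conv_bn_relu(x, filters, kernel_size, strides=1):
    # every convolution is followed by batch normalization and ReLU
    x = layers.Conv2D(filters, kernel_size, strides=strides,
                      padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

def cifar_stem(inputs):
    # Original GoogLeNet stem: 7x7/2 conv -> max pool -> 1x1 conv -> 3x3 conv -> max pool.
    # For 32 x 32 CIFAR-10 inputs the first conv uses stride 1 and the first two
    # max pooling layers are dropped, so the feature map stays 32 x 32.
    x = conv_bn_relu(inputs, 64, 7, strides=1)
    x = conv_bn_relu(x, 64, 1)
    x = conv_bn_relu(x, 192, 3)
    return x

inputs = tf.keras.Input(shape=(32, 32, 3))
features = cifar_stem(inputs)  # 32 x 32 x 192 before inception_3a
```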

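The training-time preprocessing and learning rate schedule described above can be summarized by the following sketch (illustrative only; the actual training loop is in `examples/inception_cifar.py`):

```
import numpy as np

def preprocess(images, channel_mean):
    # channel_mean has shape (3,) and is computed once from the CIFAR-10 training set
    return images.astype(np.float32) - channel_mean.reshape(1, 1, 1, 3)

def learning_rate(epoch):
    # 1e-3 initially, 1e-4 after 30 epochs, 1e-5 after 50 epochs
    if epoch < 30:
        return 1e-3
    if epoch < 50:
        return 1e-4
    return 1e-5
```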
## Usage
### ImageNet Classification
#### Preparation
- Download the pre-trained parameters [here](https://www.dropbox.com/sh/axnbpd1oe92aoyd/AADpmuFIJTtxS7zkL_LZrROLa?dl=0). These are originally from [here](http://www.deeplearningmodel.net/).
- Set up the paths in [`examples/inception_pretrained.py`](examples/inception_pretrained.py): `PRETRINED_PATH` is the path of the pre-trained model; `DATA_PATH` is the folder containing the testing images.

#### Run
Go to `examples/`, put the test images in the folder `DATA_PATH`, then run the script:
```
python inception_pretrained.py --im_name PART-OF-IMAGE-NAME
```
- `--im_name` is the option for the image names you want to test. If the testing images are all `png` files, this can be `png`. The default setting is `.jpg`.
- The output will be the top-5 class labels and probabilities.
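For example, to classify all `png` images placed in `DATA_PATH` (with the paths set up as above):

```
python inception_pretrained.py --im_name .png
```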

### Train the network on CIFAR-10
#### Preparation
- Download the CIFAR-10 dataset from [here](https://www.cs.toronto.edu/~kriz/cifar.html).
- Set up the paths in [`examples/inception_cifar.py`](examples/inception_cifar.py): `DATA_PATH` is the path containing CIFAR-10; `SAVE_PATH` is the path for saving or loading the summary files and the trained model.
#### Train the model
Go to `examples/` and run the script:
```
python inception_cifar.py --train --lr LEARNING-RATE --bsize BATCH-SIZE --keep_prob KEEP-PROB-OF-DROPOUT \
    --maxepoch MAX-TRAINING-EPOCH
```
- The summary and the trained model will be saved in `SAVE_PATH`. A pre-trained model on CIFAR-10 can be downloaded from [here](https://www.dropbox.com/sh/kab0bzpy0zymljx/AAD2YCVm0J1Qmlor8EoPzgQda?dl=0).
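For example, a run matching the setting described earlier in this README (initial learning rate 1e-3, batch size 128, dropout keep probability 0.4; the epoch count of 100 is my choice based on the roughly 100 training epochs mentioned in the results):

```
python inception_cifar.py --train --lr 1e-3 --bsize 128 --keep_prob 0.4 --maxepoch 100
```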
#### Evaluate the model
Go to `examples/` and put the pre-trained model in `SAVE_PATH`. Then run the script:
```
python inception_cifar.py --eval --load PRE-TRAINED-MODEL-ID
```
- The pre-trained model ID is the epoch ID shown in the saved model file name. The default value is `99`, which corresponds to the model I uploaded.
- The output will be the accuracy on the training and testing sets.
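For example, to evaluate the uploaded pre-trained model (epoch ID `99`):

```
python inception_cifar.py --eval --load 99
```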

## Results
### Image classification using pre-trained model
Self Collection | <img src='data/IMG_4379.jpg' height='200px'>|1: probability: 0.32, label: Egyptian cat<br>2: probability: 0.30, label: tabby, tabby cat<br>3: probability: 0.05, label: tiger cat<br>4: probability: 0.02, label: mouse, computer mouse<br>5: probability: 0.02, label: paper towel
Self Collection | <img src='data/IMG_7940.JPG' height='200px'>|1: probability: 1.00, label: streetcar, tram, tramcar, trolley, trolley car<br>2: probability: 0.00, label: passenger car, coach, carriage<br>3: probability: 0.00, label: trolleybus, trolley coach, trackless trolley<br>4: probability: 0.00, label: electric locomotive<br>5: probability: 0.00, label: freight car

### Train the network from scratch on CIFAR-10
- [Here](https://github.com/conan7882/VGG-cifar-tf/blob/master/README.md#train-the-network-from-scratch-on-cifar-10) is a similar experiment using VGG19.
Learning curve for the training set
![train_lc](fig/train_lc.png)
Learning curve for the testing set
- The accuracy on the testing set is 93.64% at around 100 epochs. Slight over-fitting can be observed at the end of training.
![valid_lc](fig/valid_lc.png)
## Author
Qian Ge
