Merge pull request #4823 from luotao1/doc
remove duplicated doc/tutorials, and rename tutorials to v1_api_tutorials
@ -1,221 +0,0 @@
Image Classification Tutorial
==============================

This tutorial will guide you through training a convolutional neural network to classify objects using the CIFAR-10 image classification dataset.
As shown in the following figure, the convolutional neural network can recognize the main object in an image and output the classification result.

<center>![Image Classification](./image_classification.png)</center>

## Data Preparation
First, download the CIFAR-10 dataset from its official website:

<https://www.cs.toronto.edu/~kriz/cifar.html>

We have prepared a script that downloads the CIFAR-10 dataset from the official website, converts it to JPEG images, and organizes them into the directory structure required by this tutorial. Make sure that you have installed pillow and its dependencies.
Run the following commands:

1. Install pillow and its dependencies

```bash
sudo apt-get install libjpeg-dev
pip install pillow
```

2. Download and prepare the data

```bash
cd demo/image_classification/data/
sh download_cifar.sh
```

The CIFAR-10 dataset consists of 60000 32x32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

Here are the classes in the dataset, as well as 10 random images from each:
<center>![Image Classification](./cifar.png)</center>


After downloading and converting, we should find a directory (cifar-out) containing the dataset in the following format:

```
train
---airplane
---automobile
---bird
---cat
---deer
---dog
---frog
---horse
---ship
---truck
test
---airplane
---automobile
---bird
---cat
---deer
---dog
---frog
---horse
---ship
---truck
```

It has two directories: `train` and `test`. These two directories contain the training data and testing data of CIFAR-10, respectively. Each of these two folders contains 10 sub-folders, ranging from `airplane` to `truck`. Each sub-folder contains images with the corresponding label. After the images are organized into this structure, we are ready to train an image classification model.

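Before moving on, it can help to confirm that the conversion produced the layout described above. The following is a minimal sketch, not part of the demo: the helper `count_images` is hypothetical, and the path `data/cifar-out` is assumed from the commands in this tutorial. It only counts the files found in each class sub-folder.

```python
import os


def count_images(root):
    """Return {(split, class_name): number_of_files} for a cifar-out style tree."""
    counts = {}
    for split in ("train", "test"):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            continue  # tolerate a missing split instead of failing
        for class_name in sorted(os.listdir(split_dir)):
            class_dir = os.path.join(split_dir, class_name)
            if os.path.isdir(class_dir):
                counts[(split, class_name)] = len(os.listdir(class_dir))
    return counts


if __name__ == "__main__":
    # Path assumed from the tutorial; adjust it to where cifar-out actually lives.
    for (split, class_name), n in sorted(count_images("data/cifar-out").items()):
        print(split, class_name, n)
```

Given the dataset sizes quoted above, a complete conversion should report 5000 files for each class under `train` and 1000 for each class under `test`.
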
## Preprocess
After the data has been downloaded, it needs to be pre-processed into the Paddle format. We can run the following commands for preprocessing.

```
cd demo/image_classification/
sh preprocess.sh
```

`preprocess.sh` calls `./demo/image_classification/preprocess.py` to preprocess image data.
```sh
export PYTHONPATH=$PYTHONPATH:../../
data_dir=./data/cifar-out
python preprocess.py -i $data_dir -s 32 -c 1
```

`./demo/image_classification/preprocess.py` has the following arguments:

- `-i` or `--input` specifies the input data directory.
- `-s` or `--size` specifies the processed size of images.
- `-c` or `--color` specifies whether images are color images or gray images.


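The three flags listed above could be declared with `argparse` roughly as follows. This is a hypothetical stand-in rather than the actual `preprocess.py`: the defaults, the integer types, and the reading of `-c 1` as "color" are assumptions made for illustration.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the documented flags of preprocess.py.
    parser = argparse.ArgumentParser(description="Sketch of the preprocess.py flags")
    parser.add_argument("-i", "--input", required=True,
                        help="input data directory")
    parser.add_argument("-s", "--size", type=int, default=32,
                        help="processed size of images (assumed default)")
    parser.add_argument("-c", "--color", type=int, default=1,
                        help="assumed: 1 for color images, 0 for gray images")
    return parser


# Same arguments that preprocess.sh passes in the tutorial.
args = build_parser().parse_args(["-i", "./data/cifar-out", "-s", "32", "-c", "1"])
print(args.input, args.size, args.color)
```
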
## Model Training
We need to create a model config file before training the model. An example of the config file (vgg_16_cifar.py) is listed below. **Note** that it differs slightly from the actual `vgg_16_cifar.py`, which is also used for prediction.

```python
from paddle.trainer_config_helpers import *
data_dir='data/cifar-out/batches/'
meta_path=data_dir+'batches.meta'
args = {'meta':meta_path, 'mean_img_size': 32,
        'img_size': 32, 'num_classes': 10,
        'use_jpeg': 1, 'color': "color"}
define_py_data_sources2(train_list=data_dir+"train.list",
                        test_list=data_dir+'test.list',
                        module='image_provider',
                        obj='processData',
                        args=args)
settings(
    batch_size = 128,
    learning_rate = 0.1 / 128.0,
    learning_method = MomentumOptimizer(0.9),
    regularization = L2Regularization(0.0005 * 128))

img = data_layer(name='image', size=3*32*32)
lbl = data_layer(name="label", size=10)
# small_vgg is predefined in trainer_config_helpers.networks
predict = small_vgg(input_image=img, num_channels=3)
outputs(classification_cost(input=predict, label=lbl))
```

The first line imports the Python functions for defining networks.
```python
from paddle.trainer_config_helpers import *
```

Then `define_py_data_sources2` declares the data sources, which use the Python data provider
interface. The arguments in `args` are used by `image_provider.py`, which
yields image data and passes it to Paddle.
 - `meta`: the path of the meta file that stores the mean value of the training set.
 - `mean_img_size`: the size of the mean feature map.
 - `img_size`: the height and width of the input image.
 - `num_classes`: the number of classes.
 - `use_jpeg`: the data storage type used when preprocessing.
 - `color`: specifies that the images are color images.

`settings` specifies the training algorithm. In the following example,
the learning rate is 0.1 divided by the batch size, and the weight decay
is 0.0005 multiplied by the batch size.
```python
settings(
    batch_size = 128,
    learning_rate = 0.1 / 128.0,
    learning_method = MomentumOptimizer(0.9),
    regularization = L2Regularization(0.0005 * 128)
)
```

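To make the scaling concrete, the two expressions passed to `settings` evaluate as shown below. The variable names are illustrative only. One plausible reading, which is an assumption and not stated in the tutorial, is that the trainer accumulates gradients over the batch instead of averaging them, so the nominal rate of 0.1 is divided by the batch size and the decay is multiplied by it to compensate.

```python
batch_size = 128

# Values passed to settings() in the config.
learning_rate = 0.1 / batch_size    # nominal rate 0.1, scaled down per sample
weight_decay = 0.0005 * batch_size  # nominal decay 0.0005, scaled up per batch

print(learning_rate)  # 0.00078125
print(weight_decay)   # 0.064
```
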
`small_vgg` specifies the network. We use a small version of the VGG convolutional network
for classification. A description of the VGG network can be found at [http://www.robots.ox.ac.uk/~vgg/research/very_deep/](http://www.robots.ox.ac.uk/~vgg/research/very_deep/).
```python
# small_vgg is predefined in trainer_config_helpers.networks
predict = small_vgg(input_image=img, num_channels=3)
```
After writing the config, we can train the model by running the script train.sh.

```bash
config=vgg_16_cifar.py
output=./cifar_vgg_model
log=train.log

paddle train \
  --config=$config \
  --dot_period=10 \
  --log_period=100 \
  --test_all_data_in_one_period=1 \
  --use_gpu=1 \
  --save_dir=$output \
  2>&1 | tee $log

python -m paddle.utils.plotcurve -i $log > plot.png
```

- Here we use GPU mode to train. If you have no GPU environment, just set `use_gpu=0`.

- `./demo/image_classification/vgg_16_cifar.py` is the network and data configuration file. The meaning of the other flags can be found in the documentation of the command line flags.

- The script `plotcurve.py` requires the Python module `matplotlib`; if it fails, you may need to install `matplotlib`.


After training finishes, the training and testing error curves will be saved to `plot.png` by the `plotcurve.py` script. An example of the plot is shown below:

<center>![Training and testing curves.](./plot.png)</center>


## Prediction
After we train the model, the model file as well as the model parameters are stored in path `./cifar_vgg_model/pass-%05d`. For example, the model of the 300-th pass is stored at `./cifar_vgg_model/pass-00299`.

To make a prediction for an image, one can run `predict.sh` as follows. The script will output the label of the classification.

```
sh predict.sh
```

predict.sh:
```
model=cifar_vgg_model/pass-00299/
image=data/cifar-out/test/airplane/seaplane_s_000978.png
use_gpu=1
python prediction.py $model $image $use_gpu
```

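The `pass-%05d` pattern and the example above imply that passes are numbered from zero, so the 300-th pass has index 299. A small sketch of how such a path can be built follows; the helper `pass_dir` is hypothetical and not part of the demo.

```python
def pass_dir(save_dir, pass_index):
    """Directory for a zero-based pass index, following the pass-%05d pattern."""
    return "%s/pass-%05d" % (save_dir.rstrip("/"), pass_index)


print(pass_dir("./cifar_vgg_model", 299))  # ./cifar_vgg_model/pass-00299
```
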
## Exercise
Train an image classification model for birds using the VGG model and the CUB-200 dataset. The birds dataset, which contains photos of 200 bird species (mostly North American), can be downloaded here:

<http://www.vision.caltech.edu/visipedia/CUB-200.html>



## Delve into Details
### Convolutional Neural Network
A Convolutional Neural Network is a feedforward neural network that uses convolution layers. It is very suitable for building neural networks that process and understand images. A standard convolutional neural network is shown below:

![Convolutional Neural Network](./lenet.png)

A Convolutional Neural Network contains the following layers:

- Convolutional layer: It uses the convolution operation to extract features from an image or a feature map.
- Pooling layer: It uses max-pooling to downsample feature maps.
- Fully Connected layer: It uses fully connected connections to transform features.

A Convolutional Neural Network achieves strong performance for image classification because it exploits two important characteristics of images: *local correlation* and *spatial invariance*. By iteratively applying convolution and max-pooling operations, a convolutional neural network can represent these two characteristics of images well.


For more details of how to define layers and their connections, please refer to the documentation of layers.
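
As a rough illustration of the pooling layer described above, the sketch below downsamples a small feature map with max-pooling. It is plain Python, not Paddle code; the 2x2 window with stride 2 and the function name `max_pool_2x2` are assumptions chosen for the example.

```python
def max_pool_2x2(feature_map):
    """Downsample a 2-D list with non-overlapping 2x2 windows (stride 2)."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        row = []
        for c in range(0, cols - 1, 2):
            window = (feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1])
            row.append(max(window))  # keep only the strongest response
        pooled.append(row)
    return pooled


fm = [[1, 3, 2, 0],
      [4, 2, 1, 1],
      [0, 1, 5, 6],
      [2, 2, 7, 8]]
print(max_pool_2x2(fm))  # [[4, 2], [2, 8]]
```

Each output value depends only on a small neighbourhood of the input, and shifting a feature by one pixel inside a window leaves the output unchanged, which is how pooling relates to the local correlation and spatial invariance mentioned above.
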
@ -1,13 +0,0 @@
# Complete Tutorials

* [Quick Start](quick_start/index_cn.rst)
* [Personalized Recommendation](rec/ml_regression_cn.rst)
* [Image Classification](image_classification/index_cn.md)
* [Sentiment Analysis](sentiment_analysis/index_cn.md)
* [Semantic Role Labeling](semantic_role_labeling/index_cn.md)
* [Machine Translation](text_generation/index_cn.md)

## Common Models

* [ResNet Model](imagenet_model/resnet_model_cn.md)
* [Word Embedding Model](embedding_model/index_cn.md)
@ -1,14 +0,0 @@
# TUTORIALS
There are several examples and demos here.

* [Quick Start](quick_start/index_en.md)
* [MovieLens Regression](rec/ml_regression_en.rst)
* [Image Classification](image_classification/index_en.md)
* [Sentiment Analysis](sentiment_analysis/index_en.md)
* [Semantic Role Labeling](semantic_role_labeling/index_en.md)
* [Text Generation](text_generation/index_en.md)
* [Image Auto-Generation](gan/index_en.md)

## Model Zoo
* [ImageNet: ResNet](imagenet_model/resnet_model_en.md)
* [Embedding: Chinese Word](embedding_model/index_en.md)
@ -1,111 +0,0 @@
```eval_rst
.. _demo_ml_dataset:
```

# MovieLens Dataset

The [MovieLens Dataset](http://grouplens.org/datasets/movielens/) was collected by GroupLens Research.
The data set contains some user information, movie information, and many movie ratings from \[1-5\].
The data set comes in several versions that differ in size.
We use the [MovieLens 1M Dataset](http://files.grouplens.org/datasets/movielens/ml-1m.zip) as a demo dataset, which contains
1 million ratings from 6000 users on 4000 movies. Released 2/2003.

## Dataset Features

The [ml-1m Dataset](http://files.grouplens.org/datasets/movielens/ml-1m.zip) contains many features.
The data files (which have the ".dat" extension) in the [ml-1m Dataset](http://files.grouplens.org/datasets/movielens/ml-1m.zip)
are basically CSV files whose delimiter is "::". We quote the descriptions from the README here.

### RATINGS FILE DESCRIPTION(ratings.dat)


All ratings are contained in the file "ratings.dat" and are in the
following format:

UserID::MovieID::Rating::Timestamp

- UserIDs range between 1 and 6040
- MovieIDs range between 1 and 3952
- Ratings are made on a 5-star scale (whole-star ratings only)
- Timestamp is represented in seconds since the epoch as returned by time(2)
- Each user has at least 20 ratings

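A line in this format can be split on the `::` delimiter and converted to integers. The sketch below is an illustration only: the `Rating` tuple, the `parse_rating` helper, and the sample line are hypothetical and merely follow the documented layout.

```python
from collections import namedtuple

Rating = namedtuple("Rating", ["user_id", "movie_id", "rating", "timestamp"])


def parse_rating(line):
    """Parse one 'UserID::MovieID::Rating::Timestamp' line."""
    user_id, movie_id, rating, timestamp = line.strip().split("::")
    return Rating(int(user_id), int(movie_id), int(rating), int(timestamp))


# Illustrative line written in the documented format.
r = parse_rating("1::1193::5::978300760\n")
print(r.user_id, r.movie_id, r.rating, r.timestamp)
```
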
### USERS FILE DESCRIPTION(users.dat)

User information is in the file "users.dat" and is in the following
format:

UserID::Gender::Age::Occupation::Zip-code

All demographic information is provided voluntarily by the users and is
not checked for accuracy. Only users who have provided some demographic
information are included in this data set.

- Gender is denoted by a "M" for male and "F" for female
- Age is chosen from the following ranges:

    * 1: "Under 18"
    * 18: "18-24"
    * 25: "25-34"
    * 35: "35-44"
    * 45: "45-49"
    * 50: "50-55"
    * 56: "56+"

- Occupation is chosen from the following choices:

    * 0: "other" or not specified
    * 1: "academic/educator"
    * 2: "artist"
    * 3: "clerical/admin"
    * 4: "college/grad student"
    * 5: "customer service"
    * 6: "doctor/health care"
    * 7: "executive/managerial"
    * 8: "farmer"
    * 9: "homemaker"
    * 10: "K-12 student"
    * 11: "lawyer"
    * 12: "programmer"
    * 13: "retired"
    * 14: "sales/marketing"
    * 15: "scientist"
    * 16: "self-employed"
    * 17: "technician/engineer"
    * 18: "tradesman/craftsman"
    * 19: "unemployed"
    * 20: "writer"

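Because the age field stores a range code and not an actual age, decoding it needs a lookup table. The following sketch shows one way to do that; the `AGE_GROUPS` dictionary copies the ranges listed above, while `parse_user` and the sample line are hypothetical.

```python
# Age codes and labels as listed in the README excerpt.
AGE_GROUPS = {1: "Under 18", 18: "18-24", 25: "25-34", 35: "35-44",
              45: "45-49", 50: "50-55", 56: "56+"}


def parse_user(line):
    """Parse one 'UserID::Gender::Age::Occupation::Zip-code' line."""
    user_id, gender, age, occupation, zip_code = line.strip().split("::")
    return {
        "user_id": int(user_id),
        "gender": gender,
        "age_group": AGE_GROUPS[int(age)],
        "occupation": int(occupation),
        "zip_code": zip_code,  # kept as text so leading zeros survive
    }


# Illustrative line written in the documented format.
user = parse_user("1::F::1::10::48067")
print(user["age_group"])  # Under 18
```
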
### MOVIES FILE DESCRIPTION(movies.dat)

Movie information is in the file "movies.dat" and is in the following
format:

MovieID::Title::Genres

- Titles are identical to titles provided by the IMDB (including
year of release)
- Genres are pipe-separated and are selected from the following genres:

    * Action
    * Adventure
    * Animation
    * Children's
    * Comedy
    * Crime
    * Documentary
    * Drama
    * Fantasy
    * Film-Noir
    * Horror
    * Musical
    * Mystery
    * Romance
    * Sci-Fi
    * Thriller
    * War
    * Western

- Some MovieIDs do not correspond to a movie due to accidental duplicate
entries and/or test entries
- Movies are mostly entered by hand, so errors and inconsistencies may exist
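
Movie lines use two delimiters: `::` between fields and `|` between genres. A minimal sketch of parsing one follows; the `parse_movie` helper and the sample line are hypothetical and only follow the documented format.

```python
def parse_movie(line):
    """Parse one 'MovieID::Title::Genres' line into (id, title, [genres])."""
    movie_id, title, genres = line.rstrip("\n").split("::")
    return int(movie_id), title, genres.split("|")


# Illustrative line written in the documented format.
movie_id, title, genres = parse_movie("1::Toy Story (1995)::Animation|Children's|Comedy")
print(movie_id, title)
print(genres)  # ['Animation', "Children's", 'Comedy']
```
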