@ -0,0 +1,23 @@
# Executor Design Doc

## Motivation

We use an executor to evaluate a `ProgramDesc` at runtime.
## Overview

An executor takes a `ProgramDesc`, a `block_id` and a `Scope`. The `ProgramDesc` is a list of blocks, and each block contains the protobuf definitions of all its parameters and operators. The `block_id` specifies the entry block. The `Scope` is the container of all variable instances and persists across runs.

### What does the executor do?

It evaluates all the operators in the `block_id`th block of a `ProgramDesc`.

### What does the executor NOT do?

It does not do runtime optimization, meaning it does not analyze the dependencies of each op to choose which ops to run or in which order to run them.

It does not do graph partitioning, meaning it does not divide the `ProgramDesc` into several small pieces and execute them on different devices.
## Implementation

`Executor` evaluates a `ProgramDesc`. Essentially, it instantiates Variables and Operators, then runs all the operators in sequence. [[code]](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/executor.cc)
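
The control flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in `executor.cc`: plain dicts stand in for the protobuf `ProgramDesc`, and `Scope`, `run`, and the `kernels` table are hypothetical names invented for this sketch.

```python
class Scope:
    """Toy stand-in for a Scope: holds variable instances by name across runs."""

    def __init__(self):
        self.vars = {}


def run(program, block_id, scope, kernels):
    """Instantiate the block's variables, then run its ops in listed order."""
    block = program[block_id]
    # Create every variable the block declares, keeping values already in scope.
    for name in block["vars"]:
        scope.vars.setdefault(name, None)
    # No dependency analysis and no partitioning: ops run strictly in sequence.
    for op in block["ops"]:
        inputs = [scope.vars[name] for name in op["inputs"]]
        scope.vars[op["output"]] = kernels[op["type"]](*inputs)


# A one-block "program" (hypothetical layout): y = x + x, then z = y * x.
program = [{
    "vars": ["x", "y", "z"],
    "ops": [
        {"type": "add", "inputs": ["x", "x"], "output": "y"},
        {"type": "mul", "inputs": ["y", "x"], "output": "z"},
    ],
}]
kernels = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

scope = Scope()
scope.vars["x"] = 3
run(program, 0, scope, kernels)
```

After the call, `scope.vars["z"]` holds 18, and the scope can be passed to a later run unchanged.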
@ -0,0 +1,78 @@
# Design Doc: InferVarType

## The Problem Posed

A variable in our design can hold one of several types, such as `LoDTensor` and `SelectedRows`. An operator should be able to infer the variable types of its outputs.

For example, a `lookup table` operator takes two `LoDTensor`s: one is a float tensor holding the embedding table, the other is an int tensor holding word IDs. The gradient operator of `lookup table` generates a `SelectedRows` as its output. A `sum` operator can take both `LoDTensor` and `SelectedRows` as its inputs; it generates a `LoDTensor` if any of its inputs is a `LoDTensor`, and a `SelectedRows` otherwise.

The variable type is constant at runtime. Every variable's type is either set by the user (input data and parameters) or inferred by the operator at compile time.
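
The rule stated for the `sum` operator can be written down as a small function. This is a sketch for illustration only: type names are modeled as plain strings, and `infer_sum_output_type` is a hypothetical helper, not part of the framework.

```python
LOD_TENSOR = "LoDTensor"
SELECTED_ROWS = "SelectedRows"


def infer_sum_output_type(input_types):
    """Return LoDTensor if any input is a LoDTensor, otherwise SelectedRows."""
    if any(t == LOD_TENSOR for t in input_types):
        return LOD_TENSOR
    return SELECTED_ROWS


mixed = infer_sum_output_type([SELECTED_ROWS, LOD_TENSOR])
sparse_only = infer_sum_output_type([SELECTED_ROWS, SELECTED_ROWS])
```

Here `mixed` is `"LoDTensor"` and `sparse_only` is `"SelectedRows"`. Because the result depends only on the declared input types, it can be computed once at compile time.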
## Proposed Solution

`InferVarType` is a compile-time function that is registered to each operator. The interface of that function is:

```c++
using InferVarTypeFN = std::function<
    void (const OpDescBind& /*op_desc*/, BlockDescBind* /*block*/)>;
```

It takes an operator description as its input, infers the types of the operator's output variables, and stores them in the block description.

The `InferVarTypeFN` will be registered in `OpInfo`, to replace the `infer_var_type_` field. The `OpInfo` should be:

```cpp
struct OpInfo {
  InferVarTypeFN infer_var_type_;
  ...
};
```

The default `InferVarType` sets the output type to `LoDTensor`. This can be done by `GetInferVarType()`.

```cpp
void DefaultInferVarType(const OpDescBind& op_desc, BlockDescBind* block) {
  // set the output type of variable as `LoDTensor`.
  // ...
}

struct OpInfo {
  InferVarTypeFN infer_var_type_;
  InferVarTypeFN GetInferVarType() const {
    if (infer_var_type_) {
      return infer_var_type_;
    } else {
      return DefaultInferVarType;
    }
  }
};
```

## Register InferVarType

We provide a thin base class for registering an `InferVarTypeFN`. Using a base class eases the implementation of the registry, since we can detect whether a registry entry is an `InferVarTypeFN` or not.

```cpp
class VarTypeInferer {
public:
  virtual void operator()(const OpDescBind& op_desc, BlockDescBind* block) const = 0;
};
```

Operator developers can write a specialized `VarTypeInferer` as follows.

```cpp
class SpecialVarTypeInferer : public VarTypeInferer {
public:
  virtual void operator()(const OpDescBind& op_desc, BlockDescBind* block) const {
    // .. own logic
  }
};
```

Users can then register the `InferVarType` just like `GradOpDescMaker` and `OpInfoMaker`.

```
REGISTER_OPERATOR(some_op, OpType, SpecialVarTypeInferer, ...);
```
@ -0,0 +1,63 @@
# Prune

## Motivation

We want to support running inference, training and checkpointing in one `ProgramDesc`. We implement a
`void Prune(const ProgramDesc* input, ProgramDesc* output)` function, which takes a `ProgramDesc`
and generates a pruned `ProgramDesc`.

## Challenge

Pruning needs to support both variables and operators as evaluation targets. Consider the following
different situations.

```python
# Case 1: run forward pass.
cost_np = session.run(target=cost)
# Case 2: run backward pass.
opts_np, _ = session.run(target=[cost, opt])
# Case 3: run checkpointing
_ = session.run(target=checkpoint)
```

## Solution

To support evaluation of operators, we add an `is_target` field to the `OpDesc`.

```protobuf
message OpDesc {
  required string type = 3;
  repeated Var inputs = 1;
  repeated Var outputs = 2;
  repeated Attr attrs = 4;
  optional bool is_target = 5 [ default = false ];
};
```

To support evaluation of variables, we add [fetch_op](https://github.com/PaddlePaddle/Paddle/pull/4599).
For each variable in the `target`, we insert a `fetch_op` into the `ProgramDesc` with that variable being
the `fetch_op`'s input. We then also mark the `fetch_op` as a target.

### Algorithm

If an operator needs to be run, it must fall into one of the following cases:

1. It is a target.
2. Some other op depends on it, meaning one of its outputs is another op's input.

The first case can be checked with `op_desc.is_target()`. The second case can be implemented as

```c++
bool HasDependentVar(const OpDesc& op_desc, const std::set<std::string>& dependent_vars) {
  for (auto& var : op_desc.outputs()) {
    for (auto& argu : var.arguments()) {
      if (dependent_vars.count(argu) != 0) {
        return true;
      }
    }
  }
  return false;
}
```

Then the whole algorithm can be implemented as the following [code](https://github.com/tonyyang-svail/Paddle/blob/prune_impl/paddle/framework/prune.cc).
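
For readers who want the shape of the algorithm without opening the link, here is a minimal Python sketch of one way to combine the two cases. It is not taken from `prune.cc`: it assumes a single block whose ops are listed in execution order, models each op as a dict, and the `prune` function and the example ops are hypothetical.

```python
def prune(ops):
    """Keep an op if it is a target or if a kept op consumes one of its outputs.

    `ops` is a list of dicts with 'type', 'inputs', 'outputs' and 'is_target',
    given in execution order. The kept ops are returned in the same order.
    """
    dependent_vars = set()
    kept = []
    # Walk backwards so every consumer is seen before its producers.
    for op in reversed(ops):
        has_dependent_var = any(v in dependent_vars for v in op["outputs"])
        if op["is_target"] or has_dependent_var:
            dependent_vars.update(op["inputs"])
            kept.append(op)
    kept.reverse()
    return kept


# Forward pass, an optimizer step, and a fetch_op marked as the target.
ops = [
    {"type": "mul", "inputs": ["x", "w"], "outputs": ["h"], "is_target": False},
    {"type": "cost", "inputs": ["h", "label"], "outputs": ["cost"], "is_target": False},
    {"type": "sgd", "inputs": ["cost", "w"], "outputs": ["w_new"], "is_target": False},
    {"type": "fetch", "inputs": ["cost"], "outputs": [], "is_target": True},
]
kept_types = [op["type"] for op in prune(ops)]
```

With only the fetch of `cost` as a target, `kept_types` is `["mul", "cost", "fetch"]`: the `sgd` op is dropped because nothing that is kept reads `w_new`.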
@ -1,221 +0,0 @@
Image Classification Tutorial
==============================

This tutorial will guide you through training a convolutional neural network to classify objects using the CIFAR-10 image classification dataset.
As shown in the following figure, the convolutional neural network can recognize the main object in images, and output the classification result.

<center>![Image Classification](./image_classification.png)</center>

## Data Preparation
First, download the CIFAR-10 dataset. It can be downloaded from its official website.

<https://www.cs.toronto.edu/~kriz/cifar.html>

We have prepared a script to download and process the CIFAR-10 dataset. The script will download the dataset from the official website.
It will convert it to jpeg images and organize them into a directory with the required structure for the tutorial. Make sure that you have installed pillow and its dependencies.
Consider the following commands:

1. Install pillow and its dependencies

```bash
sudo apt-get install libjpeg-dev
pip install pillow
```

2. Download and prepare the data

```bash
cd demo/image_classification/data/
sh download_cifar.sh
```

The CIFAR-10 dataset consists of 60000 32x32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

Here are the classes in the dataset, as well as 10 random images from each:
<center>![Image Classification](./cifar.png)</center>


After downloading and converting, we should find a directory (cifar-out) containing the dataset in the following format:

```
train
---airplane
---automobile
---bird
---cat
---deer
---dog
---frog
---horse
---ship
---truck
test
---airplane
---automobile
---bird
---cat
---deer
---dog
---frog
---horse
---ship
---truck
```

It has two directories: `train` and `test`. These two directories contain the training data and testing data of CIFAR-10, respectively. Each of these two folders contains 10 sub-folders, ranging from `airplane` to `truck`. Each sub-folder contains images with the corresponding label. After the images are organized into this structure, we are ready to train an image classification model.

## Preprocess
After the data has been downloaded, it needs to be pre-processed into the Paddle format. We can run the following command for preprocessing.

```
cd demo/image_classification/
sh preprocess.sh
```

`preprocess.sh` calls `./demo/image_classification/preprocess.py` to preprocess image data.
```sh
export PYTHONPATH=$PYTHONPATH:../../
data_dir=./data/cifar-out
python preprocess.py -i $data_dir -s 32 -c 1
```

`./demo/image_classification/preprocess.py` has the following arguments:

- `-i` or `--input` specifies the input data directory.
- `-s` or `--size` specifies the processed size of images.
- `-c` or `--color` specifies whether images are color images or gray images.
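
To make the three flags concrete, the sketch below shows how a script could declare them with Python's standard `argparse` module. It is an assumption-laden illustration, not the actual `preprocess.py`: the `build_parser` helper, the help strings, the integer types, and the reading of `-c 1` as "color" are all guesses based on the command line shown above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the -i / -s / -c flags described above."""
    parser = argparse.ArgumentParser(description="Preprocess an image dataset.")
    parser.add_argument("-i", "--input", required=True,
                        help="input data directory")
    parser.add_argument("-s", "--size", type=int,
                        help="processed size of images")
    # Assumption: 1 means color images and 0 means gray images.
    parser.add_argument("-c", "--color", type=int,
                        help="1 for color images, 0 for gray images")
    return parser


# Same flags as the call in preprocess.sh, passed as an explicit list.
args = build_parser().parse_args(["-i", "./data/cifar-out", "-s", "32", "-c", "1"])
```

The long forms (`--input`, `--size`, `--color`) parse to the same attributes, `args.input`, `args.size` and `args.color`.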

## Model Training
We need to create a model config file before training the model. An example of the config file (vgg_16_cifar.py) is listed below. **Note**: it is slightly different from the `vgg_16_cifar.py` which also applies to the prediction.

```python
from paddle.trainer_config_helpers import *
data_dir='data/cifar-out/batches/'
meta_path=data_dir+'batches.meta'
args = {'meta':meta_path, 'mean_img_size': 32,
        'img_size': 32, 'num_classes': 10,
        'use_jpeg': 1, 'color': "color"}
define_py_data_sources2(train_list=data_dir+"train.list",
                        test_list=data_dir+'test.list',
                        module='image_provider',
                        obj='processData',
                        args=args)
settings(
    batch_size = 128,
    learning_rate = 0.1 / 128.0,
    learning_method = MomentumOptimizer(0.9),
    regularization = L2Regularization(0.0005 * 128))

img = data_layer(name='image', size=3*32*32)
lbl = data_layer(name="label", size=10)
# small_vgg is predefined in trainer_config_helpers.network
predict = small_vgg(input_image=img, num_channels=3)
outputs(classification_cost(input=predict, label=lbl))
```

The first line imports python functions for defining networks.
```python
from paddle.trainer_config_helpers import *
```

Then call `define_py_data_sources2`, which uses the Python data provider
interface. The arguments in `args` are used in `image_provider.py`, which
yields image data and passes it to Paddle.
 - `meta`: the mean value of training set.
 - `mean_img_size`: the size of mean feature map.
 - `img_size`: the height and width of input image.
 - `num_classes`: the number of classes.
 - `use_jpeg`: the data storage type when preprocessing.
 - `color`: specify color image.

`settings` specifies the training algorithm. In the following example,
it sets the learning rate to 0.1 divided by the batch size, and the weight decay
to 0.0005 multiplied by the batch size.
```python
settings(
    batch_size = 128,
    learning_rate = 0.1 / 128.0,
    learning_method = MomentumOptimizer(0.9),
    regularization = L2Regularization(0.0005 * 128)
)
```
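
For reference, the two scaled values in that call work out as follows. The snippet only repeats the arithmetic from the config; the variable names are chosen for this illustration.

```python
batch_size = 128

# Learning rate: the base value 0.1 divided by the batch size.
learning_rate = 0.1 / batch_size      # 0.00078125

# Weight decay: the base value 0.0005 multiplied by the batch size.
weight_decay = 0.0005 * batch_size    # 0.064
```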

`small_vgg` specifies the network. We use a small version of the VGG convolutional network as our network
for classification. A description of the VGG network can be found at [http://www.robots.ox.ac.uk/~vgg/research/very_deep/](http://www.robots.ox.ac.uk/~vgg/research/very_deep/).
```python
# small_vgg is predefined in trainer_config_helpers.network
predict = small_vgg(input_image=img, num_channels=3)
```
After writing the config, we can train the model by running the script train.sh.

```bash
config=vgg_16_cifar.py
output=./cifar_vgg_model
log=train.log

paddle train \
--config=$config \
--dot_period=10 \
--log_period=100 \
--test_all_data_in_one_period=1 \
--use_gpu=1 \
--save_dir=$output \
2>&1 | tee $log

python -m paddle.utils.plotcurve -i $log > plot.png
```

- Here we use GPU mode to train. If you do not have a GPU environment, set `use_gpu=0`.

- `./demo/image_classification/vgg_16_cifar.py` is the network and data configuration file. The meaning of the other flags can be found in the documentation of the command line flags.

- The script `plotcurve.py` requires the python module `matplotlib`, so if it fails, you may need to install `matplotlib`.


After training finishes, the training and testing error curves will be saved to `plot.png` using the `plotcurve.py` script. An example of the plot is shown below:

<center>![Training and testing curves.](./plot.png)</center>

## Prediction
After we train the model, the model file as well as the model parameters are stored in the path `./cifar_vgg_model/pass-%05d`. For example, the model of the 300-th pass is stored at `./cifar_vgg_model/pass-00299`.

To make a prediction for an image, one can run `predict.sh` as follows. The script will output the label of the classification.

```
sh predict.sh
```

predict.sh:
```
model=cifar_vgg_model/pass-00299/
image=data/cifar-out/test/airplane/seaplane_s_000978.png
use_gpu=1
python prediction.py $model $image $use_gpu
```
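
The directory name used in `predict.sh` follows the `pass-%05d` pattern mentioned above, where the number is the zero-based pass index padded to five digits. A tiny sketch, with `pass_dir` being a hypothetical helper written for this illustration:

```python
def pass_dir(save_dir, pass_index):
    """Build the directory name for a zero-based pass index."""
    return "%s/pass-%05d" % (save_dir, pass_index)


# The 300-th pass has index 299.
last_pass = pass_dir("./cifar_vgg_model", 300 - 1)
```

Here `last_pass` is `./cifar_vgg_model/pass-00299`, matching the path in the example.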

## Exercise
Train an image classification model for birds using the VGG model and the CUB-200 dataset. The birds dataset can be downloaded from the link below. It contains an image dataset with photos of 200 bird species (mostly North American).

<http://www.vision.caltech.edu/visipedia/CUB-200.html>

## Delve into Details
### Convolutional Neural Network
A Convolutional Neural Network is a feedforward neural network that uses convolution layers. It is very suitable for building neural networks that process and understand images. A standard convolutional neural network is shown below:

![Convolutional Neural Network](./lenet.png)

A Convolutional Neural Network contains the following layers:

- Convolutional layer: It uses the convolution operation to extract features from an image or a feature map.
- Pooling layer: It uses max-pooling to downsample feature maps.
- Fully Connected layer: It uses fully connected connections to transform features.

Convolutional Neural Networks achieve strong performance for image classification because they exploit two important characteristics of images: *local correlation* and *spatial invariance*. By iteratively applying convolution and max-pooling operations, a convolutional neural network can represent these two characteristics of images well.
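
To illustrate how max-pooling downsamples a feature map, here is a small pure-Python sketch of 2x2 max-pooling with stride 2. It is a teaching example, not Paddle's implementation: the `max_pool_2x2` name is hypothetical, the feature map is a list of lists, and a trailing odd row or column is simply dropped.

```python
def max_pool_2x2(feature_map):
    """Take the maximum of each non-overlapping 2x2 window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
# The 4x4 map shrinks to 2x2, keeping the strongest response in each window.
pooled = max_pool_2x2(feature_map)  # [[4, 2], [2, 8]]
```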

For more details of how to define layers and their connections, please refer to the documentation of layers.