@@ -0,0 +1,168 @@
# Benchmark
Machine:

- CPU: 12-core Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
- GPU: Tesla K40m
- cuDNN: v5.1
- System: Docker 1.12.1; all platforms are tested inside Docker containers.

Platform:

- PaddlePaddle:
- TensorFlow: gcr.io/tensorflow/tensorflow:0.11.0rc0-gpu
- Caffe:

Several convolutional neural networks and a recurrent neural network are used in the tests.
## Image

### Benchmark Model

AlexNet, GoogleNet, and a small network based on the cifar10 config in Caffe are used.

- [AlexNet](https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet): the group size is set to one.

- [GoogleNet](https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet): loss1 and loss2 are removed when running the benchmark.

- [SmallNet](https://github.com/BVLC/caffe/blob/master/examples/cifar10/cifar10_quick_train_test.prototxt)

### Single-GPU
- AlexNet: input 3 * 227 * 227, time in ms/batch

| BatchSize    | 64  | 128 | 256  | 512  |
|--------------|-----|-----|------|------|
| PaddlePaddle | 195 | 334 | 602  | 1629 |
| TensorFlow   | 223 | 364 | 645  | 1235 |
| Caffe        | 324 | 627 | 1232 | 2513 |

##### Notation

All platforms use cuDNN v5.1. Caffe appears slower here because the workspace limit in Caffe's cuDNN convolution interface is 8 * 1024 * 1024 bytes, while PaddlePaddle and TensorFlow allow a larger workspace; Caffe becomes faster if this limit is increased.
- GoogleNet: input 3 * 224 * 224, time in ms/batch

| BatchSize    | 64  | 128  | 256           |
|--------------|-----|------|---------------|
| PaddlePaddle | 613 | 1149 | 2348          |
| TensorFlow   | 644 | 1176 | 2219          |
| Caffe        | 694 | 1364 | out of memory |

- SmallNet: input 3 * 32 * 32, time in ms/batch

| BatchSize    | 64     | 128     | 256     | 512    |
|--------------|--------|---------|---------|--------|
| PaddlePaddle | 10.463 | 18.184  | 33.113  | 63.039 |
| TensorFlow   | 9      | 15      | 28      | 59     |
| Caffe        | 9.373  | 16.6606 | 31.4797 | 59.719 |

##### Notation

All Caffe tests are run with `caffe time`, which does not include the parameter-update step, whereas the PaddlePaddle and TensorFlow timings do include it.

TensorFlow implements its own convolution-algorithm search instead of using the algorithm-search interface in cuDNN.
### Multi-GPU: 4 GPUs

- AlexNet, ms/batch

| total-BatchSize | 128 * 4 | 256 * 4 |
|-----------------|---------|---------|
| PaddlePaddle    | 347     | 622     |
| TensorFlow      | 377     | 675     |
| Caffe           | 1229    | 2435    |

For example, if `total-BatchSize = 128 * 4`, the speedup over one GPU is calculated as

```
time_at_1gpu_batch_128 * 4 / time_at_4gpu_total_batch_512
= (334 * 4) / 347
= 3.85
```
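
The same calculation applies to every entry in the tables above. As a quick check, a minimal sketch with the timings copied from the AlexNet single-GPU and 4-GPU tables in this document:

```
# Speedup of 4 GPUs over 1 GPU, per platform and per-GPU batch size.
# time_1gpu: single-GPU ms/batch; time_4gpu: 4-GPU ms/batch at 4x the total batch.
time_1gpu = {'PaddlePaddle': {128: 334, 256: 602},
             'TensorFlow':   {128: 364, 256: 645},
             'Caffe':        {128: 627, 256: 1232}}
time_4gpu = {'PaddlePaddle': {128: 347, 256: 622},
             'TensorFlow':   {128: 377, 256: 675},
             'Caffe':        {128: 1229, 256: 2435}}

for platform in ['PaddlePaddle', 'TensorFlow', 'Caffe']:
    for batch in [128, 256]:
        speedup = 4.0 * time_1gpu[platform][batch] / time_4gpu[platform][batch]
        print('%s, total batch %d: speedup %.2f' % (platform, 4 * batch, speedup))
```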
<img src="figs/alexnet-4gpu.png" width="420">

- GoogleNet, ms/batch

| total-BatchSize | 128 * 4 | 256 * 4       |
|-----------------|---------|---------------|
| PaddlePaddle    | 1178    | 2367          |
| TensorFlow      | 1210    | 2292          |
| Caffe           | 2007    | out of memory |

<img src="figs/googlenet-4gpu.png" width="420">
## RNN

We use an LSTM network for text classification as the RNN benchmark.

### Dataset

- [IMDB](http://www.iro.umontreal.ca/~lisa/deep/data/imdb.pkl)
- Sequence length = 100. PaddlePaddle supports training with variable-length sequences, but TensorFlow requires padding, so for a fair comparison we also pad every sequence to length 100 in PaddlePaddle (see the sketch after this list).
- Dictionary size = 30000
- Peephole connections are used in `lstmemory` by default in PaddlePaddle; they are enabled in the TensorFlow configuration as well.
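
A minimal sketch of this padding step (the full helper, `pad_sequences`, lives in the RNN data provider `provider.py` included in this benchmark; the function name below is only illustrative):

```
import numpy as np

def pad_to_fixed_length(sequences, maxlen=100, value=0):
    # Post-truncate long sequences and post-pad short ones with `value`,
    # so every sample has exactly `maxlen` word ids.
    out = np.full((len(sequences), maxlen), value, dtype='int32')
    for i, seq in enumerate(sequences):
        trunc = seq[:maxlen]
        out[i, :len(trunc)] = trunc
    return out

print(pad_to_fixed_length([[4, 25, 7], [9, 13]], maxlen=5))
# [[ 4 25  7  0  0]
#  [ 9 13  0  0  0]]
```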
### Single GPU

#### LSTM in Text Classification

We test a `2 lstm layer + fc` network with different hidden sizes and batch sizes.
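
For reference, the layer stack being timed looks roughly like this (condensed from the `rnn.py` config in this benchmark, with `lstm_num = 2`):

```
from paddle.trainer_config_helpers import *

vocab_size, hidden_size, num_class = 30000, 256, 2

net = data_layer('data', size=vocab_size)
net = embedding_layer(input=net, size=128)
for i in xrange(2):                      # two stacked LSTM layers
    net = simple_lstm(input=net, size=hidden_size)
net = last_seq(input=net)                # take the last time step
net = fc_layer(input=net, size=num_class, act=SoftmaxActivation())
```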
- Batch size = 64, ms/batch

| hidden_size  | 256 | 512 | 1280 |
|--------------|-----|-----|------|
| PaddlePaddle | 83  | 184 | 641  |
| TensorFlow   | 175 | 280 | 818  |

- Batch size = 128, ms/batch

| hidden_size  | 256 | 512 | 1280 |
|--------------|-----|-----|------|
| PaddlePaddle | 110 | 261 | 1007 |
| TensorFlow   | 181 | 361 | 1237 |

- Batch size = 256, ms/batch

| hidden_size  | 256 | 512 | 1280 |
|--------------|-----|-----|------|
| PaddlePaddle | 170 | 414 | 1655 |
| TensorFlow   | 238 | 536 | 1905 |

<img src="figs/rnn_lstm_cls.png" width="600">

#### Seq2Seq

The benchmark of the sequence-to-sequence network will be added later.
### Multi-GPU: 4 GPUs

#### LSTM in Text Classification

- hidden_size = 256, ms/batch

| batch_size   | 256 | 512 |
|--------------|-----|-----|
| PaddlePaddle | 90  | 118 |
| TensorFlow   | 226 | 118 |

- hidden_size = 512, ms/batch

| batch_size   | 256 | 512 |
|--------------|-----|-----|
| PaddlePaddle | 189 | 268 |
| TensorFlow   | 297 | 383 |

<img src="figs/rnn_lstm_4gpus.png" width="420">

#### Seq2Seq

The benchmark of the sequence-to-sequence network will be added later.
@@ -0,0 +1,30 @@
#!/bin/bash
set -e

function test() {
    cfg=$1
    batch=$2
    prefix=$3
    # The line following `input: "data"` / `input: "label"` in the prototxt holds
    # the batch dimension; patch it to the requested batch size.
    sed -i "/input: \"data\"/{n;s/^input_dim.*/input_dim: $batch/g}" $cfg
    sed -i "/input: \"label\"/{n;s/^input_dim.*/input_dim: $batch/g}" $cfg
    caffe time --model=$cfg --iterations=50 --gpu 0 > logs/$prefix-1gpu-batch${batch}.log 2>&1
}

if [ ! -d "logs" ]; then
    mkdir logs
fi

# alexnet
test alexnet.prototxt 64 alexnet
test alexnet.prototxt 128 alexnet
test alexnet.prototxt 256 alexnet
test alexnet.prototxt 512 alexnet

# googlenet
test googlenet.prototxt 64 googlenet
test googlenet.prototxt 128 googlenet

# small net
test smallnet_mnist_cifar.prototxt 64 smallnet
test smallnet_mnist_cifar.prototxt 128 smallnet
test smallnet_mnist_cifar.prototxt 256 smallnet
test smallnet_mnist_cifar.prototxt 512 smallnet
@@ -0,0 +1,24 @@
#!/bin/bash
set -e

function test() {
    cfg=$1
    batch=$2
    prefix=$3
    # The total batch is split evenly across the 4 GPUs.
    batch_per_gpu=`expr ${batch} / 4`
    sed -i "/input: \"data\"/{n;s/^input_dim.*/input_dim: ${batch_per_gpu}/g}" $cfg
    sed -i "/input: \"label\"/{n;s/^input_dim.*/input_dim: ${batch_per_gpu}/g}" $cfg
    sed -i "1c\net : \"${cfg}\"" solver.prototxt
    caffe train --solver=solver.prototxt -gpu all > logs/${prefix}-4gpu-batch${batch}.log 2>&1
}

if [ ! -d "logs" ]; then
    mkdir logs
fi

# alexnet
test alexnet.prototxt 512 alexnet
test alexnet.prototxt 1024 alexnet

# googlenet
test googlenet.prototxt 512 googlenet
@@ -0,0 +1,198 @@
name: "mnist/cifar"
input: "data"
input_dim: 128
input_dim: 3
input_dim: 32
input_dim: 32
input: "label"
input_dim: 128
input_dim: 1
input_dim: 1
input_dim: 1
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
@@ -0,0 +1,10 @@
net: "alexnet.prototxt"
base_lr: 0.01
lr_policy: "fixed"
display: 20
max_iter: 200
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/caffe_alexnet_train"
solver_mode: GPU
@@ -0,0 +1,57 @@
#!/usr/bin/env python

from paddle.trainer_config_helpers import *

height = 227
width = 227
num_class = 1000
batch_size = get_config_arg('batch_size', int, 128)

args = {'height': height, 'width': width, 'color': True, 'num_class': num_class}
define_py_data_sources2("train.list",
                        None,
                        module="provider",
                        obj="process",
                        args=args)

settings(
    batch_size=batch_size,
    learning_rate=0.01 / batch_size,
    learning_method=MomentumOptimizer(0.9),
    regularization=L2Regularization(0.0005 * batch_size)
)

# conv1
net = data_layer('data', size=height * width * 3)
net = img_conv_layer(input=net, filter_size=11, num_channels=3,
                     num_filters=96, stride=4, padding=1)
net = img_cmrnorm_layer(input=net, size=5, scale=0.0001, power=0.75)
net = img_pool_layer(input=net, pool_size=3, stride=2)

# conv2
net = img_conv_layer(input=net, filter_size=5, num_filters=256,
                     stride=1, padding=2, groups=1)
net = img_cmrnorm_layer(input=net, size=5, scale=0.0001, power=0.75)
net = img_pool_layer(input=net, pool_size=3, stride=2)

# conv3
net = img_conv_layer(input=net, filter_size=3, num_filters=384,
                     stride=1, padding=1)
# conv4
net = img_conv_layer(input=net, filter_size=3, num_filters=384,
                     stride=1, padding=1, groups=1)

# conv5
net = img_conv_layer(input=net, filter_size=3, num_filters=256,
                     stride=1, padding=1, groups=1)
net = img_pool_layer(input=net, pool_size=3, stride=2)

net = fc_layer(input=net, size=4096, act=ReluActivation(), layer_attr=ExtraAttr(drop_rate=0.5))
net = fc_layer(input=net, size=4096, act=ReluActivation(), layer_attr=ExtraAttr(drop_rate=0.5))
net = fc_layer(input=net, size=1000, act=SoftmaxActivation())

lab = data_layer('label', num_class)
loss = cross_entropy(input=net, label=lab)
outputs(loss)
@@ -0,0 +1,147 @@
#!/usr/bin/env python
from paddle.trainer_config_helpers import *

height = 224
width = 224
num_class = 1000
batch_size = get_config_arg('batch_size', int, 128)

args = {'height': height, 'width': width, 'color': True, 'num_class': num_class}
define_py_data_sources2("train.list",
                        None,
                        module="provider",
                        obj="process",
                        args=args)

settings(
    batch_size=batch_size,
    learning_rate=0.01 / batch_size,
    learning_method=MomentumOptimizer(0.9),
    regularization=L2Regularization(0.0005 * batch_size)
)


def inception2(name, input, channels,
               filter1,
               filter3R, filter3,
               filter5R, filter5,
               proj):
    conv1 = name + '_1'
    conv3r = name + '_3r'
    conv3 = name + '_3'
    conv5r = name + '_5r'
    conv5 = name + '_5'
    maxpool = name + '_max'
    convproj = name + '_proj'

    cov1 = img_conv_layer(name=conv1, input=input, filter_size=1,
                          num_channels=channels, num_filters=filter1,
                          stride=1, padding=0)

    cov3r = img_conv_layer(name=conv3r, input=input, filter_size=1,
                           num_channels=channels, num_filters=filter3R,
                           stride=1, padding=0)
    cov3 = img_conv_layer(name=conv3, input=cov3r, filter_size=3,
                          num_filters=filter3, stride=1, padding=1)

    cov5r = img_conv_layer(name=conv5r, input=input, filter_size=1,
                           num_channels=channels, num_filters=filter5R,
                           stride=1, padding=0)
    cov5 = img_conv_layer(name=conv5, input=cov5r, filter_size=5,
                          num_filters=filter5, stride=1, padding=2)

    pool1 = img_pool_layer(name=maxpool, input=input, pool_size=3,
                           num_channels=channels, stride=1, padding=1)
    covprj = img_conv_layer(name=convproj, input=pool1, filter_size=1,
                            num_filters=proj, stride=1, padding=0)

    cat = concat_layer(name=name, input=[cov1, cov3, cov5, covprj])
    return cat


def inception(name, input, channels,
              filter1,
              filter3R, filter3,
              filter5R, filter5,
              proj):
    cov1 = conv_projection(input=input, filter_size=1, num_channels=channels,
                           num_filters=filter1, stride=1, padding=0)

    cov3r = img_conv_layer(name=name + '_3r', input=input, filter_size=1,
                           num_channels=channels, num_filters=filter3R,
                           stride=1, padding=0)
    cov3 = conv_projection(input=cov3r, filter_size=3, num_filters=filter3,
                           stride=1, padding=1)

    cov5r = img_conv_layer(name=name + '_5r', input=input, filter_size=1,
                           num_channels=channels, num_filters=filter5R,
                           stride=1, padding=0)
    cov5 = conv_projection(input=cov5r, filter_size=5, num_filters=filter5,
                           stride=1, padding=2)

    pool1 = img_pool_layer(name=name + '_max', input=input, pool_size=3,
                           num_channels=channels, stride=1, padding=1)
    covprj = conv_projection(input=pool1, filter_size=1, num_filters=proj,
                             stride=1, padding=0)

    cat = concat_layer(name=name, input=[cov1, cov3, cov5, covprj],
                       bias_attr=True, act=ReluActivation())
    return cat


lab = data_layer(name="label", size=1000)
data = data_layer(name="input", size=3 * height * width)

# stage 1
conv1 = img_conv_layer(name="conv1", input=data, filter_size=7,
                       num_channels=3, num_filters=64, stride=2, padding=3)
pool1 = img_pool_layer(name="pool1", input=conv1, pool_size=3,
                       num_channels=64, stride=2)

# stage 2
conv2_1 = img_conv_layer(name="conv2_1", input=pool1, filter_size=1,
                         num_filters=64, stride=1, padding=0)
conv2_2 = img_conv_layer(name="conv2_2", input=conv2_1, filter_size=3,
                         num_filters=192, stride=1, padding=1)
pool2 = img_pool_layer(name="pool2", input=conv2_2, pool_size=3,
                       num_channels=192, stride=2)

# stage 3
ince3a = inception("ince3a", pool2, 192, 64, 96, 128, 16, 32, 32)
ince3b = inception("ince3b", ince3a, 256, 128, 128, 192, 32, 96, 64)
pool3 = img_pool_layer(name="pool3", input=ince3b, num_channels=480, pool_size=3, stride=2)

# stage 4
ince4a = inception("ince4a", pool3, 480, 192, 96, 208, 16, 48, 64)
ince4b = inception("ince4b", ince4a, 512, 160, 112, 224, 24, 64, 64)
ince4c = inception("ince4c", ince4b, 512, 128, 128, 256, 24, 64, 64)
ince4d = inception("ince4d", ince4c, 512, 112, 144, 288, 32, 64, 64)
ince4e = inception("ince4e", ince4d, 528, 256, 160, 320, 32, 128, 128)
pool4 = img_pool_layer(name="pool4", input=ince4e, num_channels=832, pool_size=3, stride=2)

# stage 5
ince5a = inception("ince5a", pool4, 832, 256, 160, 320, 32, 128, 128)
ince5b = inception("ince5b", ince5a, 832, 384, 192, 384, 48, 128, 128)
pool5 = img_pool_layer(name="pool5", input=ince5b, num_channels=1024, pool_size=7, stride=7, pool_type=AvgPooling())

# loss1 and loss2 are removed for all systems when running the benchmark.
# output 1
# pool_o1 = img_pool_layer(name="pool_o1", input=ince4a, num_channels=512, pool_size=5, stride=3, pool_type=AvgPooling())
# conv_o1 = img_conv_layer(name="conv_o1", input=pool_o1, filter_size=1, num_filters=128, stride=1, padding=0)
# fc_o1 = fc_layer(name="fc_o1", input=conv_o1, size=1024, layer_attr=ExtraAttr(drop_rate=0.7), act=ReluActivation())
# out1 = fc_layer(name="output1", input=fc_o1, size=1000, act=SoftmaxActivation())
# loss1 = cross_entropy(name='loss1', input=out1, label=lab, coeff=0.3)

# output 2
# pool_o2 = img_pool_layer(name="pool_o2", input=ince4d, num_channels=528, pool_size=5, stride=3, pool_type=AvgPooling())
# conv_o2 = img_conv_layer(name="conv_o2", input=pool_o2, filter_size=1, num_filters=128, stride=1, padding=0)
# fc_o2 = fc_layer(name="fc_o2", input=conv_o2, size=1024, layer_attr=ExtraAttr(drop_rate=0.7), act=ReluActivation())
# out2 = fc_layer(name="output2", input=fc_o2, size=1000, act=SoftmaxActivation())
# loss2 = cross_entropy(name='loss2', input=out2, label=lab, coeff=0.3)

# output 3
dropout = dropout_layer(name="dropout", input=pool5, dropout_rate=0.4)
out3 = fc_layer(name="output3", input=dropout, size=1000, act=SoftmaxActivation())
loss3 = cross_entropy(name='loss3', input=out3, label=lab)

outputs(loss3)
@@ -0,0 +1,24 @@
import io, os
import random
import numpy as np
from paddle.trainer.PyDataProvider2 import *


def initHook(settings, height, width, color, num_class, **kwargs):
    settings.height = height
    settings.width = width
    settings.color = color
    settings.num_class = num_class
    if settings.color:
        settings.data_size = settings.height * settings.width * 3
    else:
        settings.data_size = settings.height * settings.width

    settings.slots = [dense_vector(settings.data_size), integer_value(settings.num_class)]


@provider(init_hook=initHook, min_pool_size=-1, cache=CacheType.CACHE_PASS_IN_MEM)
def process(settings, file_list):
    with open(file_list, 'r') as fdata:
        for line in fdata:
            # Random data is sufficient for timing; one sample per line of the list file.
            img = np.random.rand(1, settings.data_size).reshape(-1, 1).flatten()
            lab = random.randint(0, settings.num_class - 1)
            yield img.tolist(), int(lab)
@@ -0,0 +1,54 @@
#!/bin/bash
set -e

function gen_file() {
    # Generate a dummy 1024-line file list; the provider produces random data anyway.
    if [ ! -f "train.txt" ]; then
        for ((i=1;i<=1024;i++))
        do
            echo "train/n09246464/n09246464_38735.jpeg 972" >> train.txt
        done
    fi

    if [ ! -f "train.list" ]; then
        echo "train.txt" > train.list
    fi
}

function train() {
    cfg=$1
    thread=$2
    bz=$3
    args="batch_size=$3"
    prefix=$4
    paddle train --job=time \
        --config=$cfg \
        --use_gpu=True \
        --trainer_count=$thread \
        --log_period=10 \
        --test_period=100 \
        --config_args=$args \
        --cudnn_dir=/home/dangqingqing/tools/cudnn-5.1/lib64 \
        > logs/$prefix-${thread}gpu-$bz.log 2>&1
}

gen_file
if [ ! -d "logs" ]; then
    mkdir logs
fi

#========single-gpu=========#
# alexnet
train alexnet.py 1 64 alexnet
train alexnet.py 1 128 alexnet
train alexnet.py 1 256 alexnet
train alexnet.py 1 512 alexnet

# googlenet
train googlenet.py 1 64 googlenet
train googlenet.py 1 128 googlenet
train googlenet.py 1 256 googlenet

# smallnet
train smallnet_mnist_cifar.py 1 64 smallnet
train smallnet_mnist_cifar.py 1 128 smallnet
train smallnet_mnist_cifar.py 1 256 smallnet
train smallnet_mnist_cifar.py 1 512 smallnet
@@ -0,0 +1,42 @@
#!/bin/bash
set -e

function gen_file() {
    if [ ! -f "train.txt" ]; then
        for ((i=1;i<=1024;i++))
        do
            echo "train/n09246464/n09246464_38735.jpeg 972" >> train.txt
        done
    fi

    if [ ! -f "train.list" ]; then
        echo "train.txt" > train.list
    fi
}

function train() {
    cfg=$1
    thread=$2
    bz=$3
    args="batch_size=$3"
    prefix=$4
    paddle train --job=time \
        --config=$cfg \
        --use_gpu=True \
        --trainer_count=$thread \
        --log_period=10 \
        --test_period=100 \
        --config_args=$args \
        > logs/$prefix-${thread}gpu-$bz.log 2>&1
}

gen_file
if [ ! -d "logs" ]; then
    mkdir logs
fi

#========multi-gpus=========#
train alexnet.py 4 512 alexnet
train alexnet.py 4 1024 alexnet

train googlenet.py 4 512 googlenet
train googlenet.py 4 1024 googlenet
@@ -0,0 +1,47 @@
#!/usr/bin/env python

from paddle.trainer_config_helpers import *

height = 32
width = 32
num_class = 10

batch_size = get_config_arg('batch_size', int, 128)

args = {'height': height, 'width': width, 'color': True, 'num_class': num_class}
define_py_data_sources2("train.list",
                        None,
                        module="provider",
                        obj="process",
                        args=args)

settings(
    batch_size=batch_size,
    learning_rate=0.01 / batch_size,
    learning_method=MomentumOptimizer(0.9),
    regularization=L2Regularization(0.0005 * batch_size)
)


# conv1
net = data_layer('data', size=height * width * 3)
net = img_conv_layer(input=net, filter_size=5, num_channels=3,
                     num_filters=32, stride=1, padding=2)
net = img_pool_layer(input=net, pool_size=3, stride=2, padding=1)

# conv2
net = img_conv_layer(input=net, filter_size=5, num_filters=32,
                     stride=1, padding=2)
net = img_pool_layer(input=net, pool_size=3, stride=2, padding=1, pool_type=AvgPooling())

# conv3
net = img_conv_layer(input=net, filter_size=3, num_filters=64,
                     stride=1, padding=1)
net = img_pool_layer(input=net, pool_size=3, stride=2, padding=1, pool_type=AvgPooling())

net = fc_layer(input=net, size=64, act=ReluActivation())
net = fc_layer(input=net, size=10, act=SoftmaxActivation())

lab = data_layer('label', num_class)
loss = classification_cost(input=net, label=lab)
outputs(loss)
@@ -0,0 +1,42 @@
from __future__ import print_function
import six.moves.cPickle as pickle
import gzip
import os
import numpy


def get_dataset_file(dataset, default_dataset, origin):
    data_dir, data_file = os.path.split(dataset)
    if (not os.path.isfile(dataset)) and data_file == default_dataset:
        from six.moves import urllib
        print('Downloading data from %s' % origin)
        urllib.request.urlretrieve(origin, dataset)

    return dataset


def create_data(path="imdb.pkl"):

    if not os.path.isfile('imdb.train.pkl'):
        path = get_dataset_file(
            path, "imdb.pkl",
            "http://www.iro.umontreal.ca/~lisa/deep/data/imdb.pkl")

        if path.endswith(".gz"):
            f = gzip.open(path, 'rb')
        else:
            f = open(path, 'rb')

        # imdb.pkl holds two pickled objects back to back: the train and test sets.
        train_set = pickle.load(f)
        test_set = pickle.load(f)
        f.close()

        pickle.dump(train_set, open('imdb.train.pkl', 'wb'))
        pickle.dump(test_set, open('imdb.test.pkl', 'wb'))

    if not os.path.isfile('train.list'):
        open('train.list', 'w').write('imdb.train.pkl\n')


def main():
    create_data('imdb.pkl')


if __name__ == "__main__":
    main()
@@ -0,0 +1,64 @@
import io, os
import random
import numpy as np
import six.moves.cPickle as pickle
from paddle.trainer.PyDataProvider2 import *


def remove_unk(x, n_words):
    return [[1 if w >= n_words else w for w in sen] for sen in x]


# ==============================================================
# TensorFlow uses fixed-length sequences, while PaddlePaddle can
# process variable-length ones. Padding is used in this benchmark
# so that the platforms are comparable.
# ==============================================================
def pad_sequences(sequences, maxlen=None, dtype='int32', padding='post',
                  truncating='post', value=0.):
    lengths = [len(s) for s in sequences]

    nb_samples = len(sequences)
    if maxlen is None:
        maxlen = np.max(lengths)

    x = (np.ones((nb_samples, maxlen)) * value).astype(dtype)
    for idx, s in enumerate(sequences):
        if len(s) == 0:
            continue  # empty list was found
        if truncating == 'pre':
            trunc = s[-maxlen:]
        elif truncating == 'post':
            trunc = s[:maxlen]
        else:
            raise ValueError("Truncating type '%s' not understood" % truncating)

        if padding == 'post':
            x[idx, :len(trunc)] = trunc
        elif padding == 'pre':
            x[idx, -len(trunc):] = trunc
        else:
            raise ValueError("Padding type '%s' not understood" % padding)
    return x


def initHook(settings, vocab_size, pad_seq, maxlen, **kwargs):
    settings.vocab_size = vocab_size
    settings.pad_seq = pad_seq
    settings.maxlen = maxlen
    settings.input_types = [
        integer_value_sequence(vocab_size),
        integer_value(2)]


@provider(init_hook=initHook, min_pool_size=-1, cache=CacheType.CACHE_PASS_IN_MEM)
def process(settings, file):
    f = open(file, 'rb')
    train_set = pickle.load(f)
    f.close()
    x, y = train_set

    # remove unk, i.e. map words outside the dictionary to id 1
    x = remove_unk(x, settings.vocab_size)
    if settings.pad_seq:
        x = pad_sequences(x, maxlen=settings.maxlen, value=0.)

    for i in range(len(y)):
        yield map(int, x[i]), int(y[i])
@@ -0,0 +1,42 @@
#!/usr/bin/env python

from paddle.trainer_config_helpers import *
import imdb

num_class = 2
vocab_size = 30000
fixedlen = 100
batch_size = get_config_arg('batch_size', int, 128)
lstm_num = get_config_arg('lstm_num', int, 1)
hidden_size = get_config_arg('hidden_size', int, 128)
# whether to pad sequences to a fixed length
pad_seq = get_config_arg('pad_seq', bool, True)
imdb.create_data('imdb.pkl')

args = {'vocab_size': vocab_size, 'pad_seq': pad_seq, 'maxlen': fixedlen}
define_py_data_sources2("train.list",
                        None,
                        module="provider",
                        obj="process",
                        args=args)

settings(
    batch_size=batch_size,
    learning_rate=2e-3,
    learning_method=AdamOptimizer(),
    regularization=L2Regularization(8e-4),
    gradient_clipping_threshold=25
)

net = data_layer('data', size=vocab_size)
net = embedding_layer(input=net, size=128)

for i in xrange(lstm_num):
    net = simple_lstm(input=net, size=hidden_size)

net = last_seq(input=net)
net = fc_layer(input=net, size=2, act=SoftmaxActivation())

lab = data_layer('label', num_class)
loss = classification_cost(input=net, label=lab)
outputs(loss)
@@ -0,0 +1,38 @@
#!/bin/bash
set -e

function train() {
    cfg=$1
    thread=$2
    args="lstm_num=${3},pad_seq=${4},hidden_size=${5},batch_size=${6}"
    paddle train --job=time \
        --config=$cfg \
        --use_gpu=1 \
        --trainer_count=$thread \
        --log_period=10 \
        --test_period=100 \
        --num_passes=1 \
        --feed_data=1 \
        --config_args=$args \
        >logs/rnn-pad${4}-${thread}gpu-lstm${3}-batch${6}-hid${5}.log 2>&1
}

if [ ! -d "logs" ]; then
    mkdir logs
fi

## padding, single gpu
#-----config--gpu--lstm_num--padding--hidden_size--batch_size
## lstm_num=2, batch_size=64
train rnn.py 1 2 1 256 64
train rnn.py 1 2 1 512 64
train rnn.py 1 2 1 1280 64

## lstm_num=2, batch_size=128
train rnn.py 1 2 1 256 128
train rnn.py 1 2 1 512 128
train rnn.py 1 2 1 1280 128

## lstm_num=2, batch_size=256
train rnn.py 1 2 1 256 256
train rnn.py 1 2 1 512 256
train rnn.py 1 2 1 1280 256
@@ -0,0 +1,34 @@
#!/bin/bash
set -e

function train() {
    cfg=$1
    thread=$2
    args="lstm_num=${3},pad_seq=${4},hidden_size=${5},batch_size=${6}"
    paddle train --job=time \
        --config=$cfg \
        --use_gpu=1 \
        --trainer_count=$thread \
        --log_period=10 \
        --test_period=100 \
        --num_passes=1 \
        --feed_data=1 \
        --config_args=$args \
        >logs/rnn-pad${4}-${thread}gpu-lstm${3}-hid${5}-batch${6}.log 2>&1
}


if [ ! -d "logs" ]; then
    mkdir logs
fi

#-----config--gpu--lstm_num--padding--hidden_size--batch_size
#==================multi gpus=====================#
# hidden_size=256, lstm_num=2, different batch sizes
train rnn.py 4 2 1 256 128
train rnn.py 4 2 1 256 256
train rnn.py 4 2 1 256 512

# hidden_size=512, lstm_num=2, different batch sizes
train rnn.py 4 2 1 512 128
train rnn.py 4 2 1 512 256
train rnn.py 4 2 1 512 512