# Use Case
## Local Training
These command line arguments are commonly used for local training experiments, such as image classification and natural language processing.
```
paddle train \
--use_gpu=1/0 \ #1:GPU,0:CPU(default:true)
--config=network_config \
--save_dir=output \
--trainer_count=COUNT \ #(default:1)
--test_period=M \ #(default:1000)
--test_all_data_in_one_period=true \ #(default:false)
--num_passes=N \ #(default:100)
--log_period=K \ #(default:100)
--dot_period=1000 \ #(default:1)
#[--show_parameter_stats_period=100] \ #(default:0)
#[--saving_period_by_batches=200] \ #(default:0)
```
`show_parameter_stats_period` and `saving_period_by_batches` are optional, depending on your task.
### 1) Pass Command Arguments to the Network Config
`config_args` is a useful parameter for passing arguments to the network config.
```
--config_args=generating=1,beam_size=5,layer_num=10 \
```
`get_config_arg` can then be used to parse these arguments in the network config:
```
generating = get_config_arg('generating', bool, False)
beam_size = get_config_arg('beam_size', int, 3)
layer_num = get_config_arg('layer_num', int, 8)
```
The signature of `get_config_arg` is:
```
get_config_arg(name, type, default_value)
```
- name: the argument name specified in `--config_args`
- type: the value type: bool, int, str, float, etc.
- default_value: the default value used if the argument is not set
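For example, a parsed flag can switch a single config file between training and generating modes. The sketch below is illustrative; the branch bodies are placeholders, not part of the original example:
```
generating = get_config_arg('generating', bool, False)
beam_size = get_config_arg('beam_size', int, 3)

if generating:
    # Build the generation branch, e.g. a decoder that uses beam_size.
    # (Placeholder: the actual layers depend on your task.)
    pass
else:
    # Build the training branch with its cost layer.
    pass
```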
### 2) Use Model to Initialize Network
Add the following arguments:
```
--init_model_path=model_path
--load_missing_parameter_strategy=rand
```
## Local Testing
Method 1:
```
paddle train --job=test \
--use_gpu=1/0 \
--config=network_config \
--trainer_count=COUNT \
--init_model_path=model_path
```
- Use `init_model_path` to specify the model to test.
- Only one model can be tested.

Method 2:
```
paddle train --job=test \
--use_gpu=1/0 \
--config=network_config \
--trainer_count=COUNT \
--model_list=model.list
```
- Use `model_list` to specify the models to test.
- Several models can be tested; `model.list` looks like:
```
./alexnet_pass1
./alexnet_pass2
```
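A `model.list` file like the one above can also be generated with a short script, for example (a sketch; the path pattern comes from the example above):
```
import glob

# Write one saved-model path per line to model.list.
with open('model.list', 'w') as f:
    for path in sorted(glob.glob('./alexnet_pass*')):
        f.write(path + '\n')
```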
Method 3:
```
paddle train --job=test \
--use_gpu=1/0 \
--config=network_config \
--trainer_count=COUNT \
--save_dir=model \
--test_pass=M \
--num_passes=N
```
This method requires model paths saved by Paddle in the form `model/pass-%05d`. Models from the M-th pass to the (N-1)-th pass are tested. For example, M=12 and N=14 will test `model/pass-00012` and `model/pass-00013`.
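The selected passes follow half-open range semantics, which the following sketch makes explicit:
```
# Which saved models --test_pass=M --num_passes=N select:
M, N = 12, 14
print(['model/pass-%05d' % p for p in range(M, N)])
# ['model/pass-00012', 'model/pass-00013']
```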
## Sparse Training
Sparse training is usually used to accelerate computation when the input is sparse data of high dimensionality. For example, the dictionary dimension of the input data may be 1 million, while each sample contains only a few words. In Paddle, sparse matrix multiplication is used in forward propagation, and sparse updating is performed on the weights after backward propagation.
### 1) Local Training
You need to set **sparse\_update=True** in the network config. Check the network config documentation for more details.
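For illustration, a minimal sketch assuming the `trainer_config_helpers`-style config API, where the flag is passed through a parameter attribute (the layer names and sizes are placeholders):
```
from paddle.trainer_config_helpers import *

# Enable sparse updating for an embedding whose input dictionary is
# large (1M words) while each sample touches only a few rows.
word = data_layer(name='word', size=1000000)
emb = embedding_layer(input=word,
                      size=128,
                      param_attr=ParamAttr(sparse_update=True))
```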
### 2) Cluster Training
Add the following argument for cluster training of a sparse model. At the same time, you need to set **sparse\_remote\_update=True** in the network config. Check the network config documentation for more details.
```
--ports_num_for_sparse=1 #(default: 0)
```
## parallel_nn
`parallel_nn` enables mixed use of GPUs and CPUs to compute layers. That is to say, you can deploy a network so that a GPU computes some layers while a CPU computes the others. Layers can also be split across different GPUs, which can **reduce GPU memory usage** or **use parallel computation to accelerate some layers**.
To use these features, specify a device ID for each layer in the network config (denoted deviceId below) and add the command line argument:
```
--parallel_nn=true
```
### Case 1: Mixed Use of GPU and CPU
Consider the following example:
```
#command line:
paddle train --use_gpu=true --parallel_nn=true --trainer_count=COUNT
#network:
default_device(0)
fc1=fc_layer(...)
fc2=fc_layer(...)
fc3=fc_layer(...,layer_attr=ExtraAttr(device=-1))
```
- default_device(0): set the default device ID to 0. This means that except for the layers with device=-1, all layers use a GPU; the specific GPU used by each layer depends on trainer\_count and gpu\_id (0 by default). Here, layers fc1 and fc2 are computed on the GPU.
- device=-1: use the CPU for layer fc3.
- trainer_count:
  - trainer_count=1: if gpu\_id is not set, use the first GPU to compute layers fc1 and fc2. Otherwise use the GPU with gpu\_id.
  - trainer_count>1: use trainer\_count GPUs to compute one layer with data parallelism. For example, trainer\_count=2 means that GPUs 0 and 1 compute layers fc1 and fc2 using data parallelism.
### Case 2: Specify Layers on Different Devices
```
#command line:
paddle train --use_gpu=true --parallel_nn=true --trainer_count=COUNT
#network:
fc2=fc_layer(input=l1, layer_attr=ExtraAttr(device=0), ...)
fc3=fc_layer(input=l1, layer_attr=ExtraAttr(device=1), ...)
fc4=fc_layer(input=fc2, layer_attr=ExtraAttr(device=-1), ...)
```
In this case, we assume that there are 4 GPUs in one machine.
- trainer_count=1:
  - Use GPU 0 to compute layer fc2.
  - Use GPU 1 to compute layer fc3.
  - Use the CPU to compute layer fc4.
- trainer_count=2:
  - Use GPUs 0 and 1 to compute layer fc2.
  - Use GPUs 2 and 3 to compute layer fc3.
  - Use the CPU to compute layer fc4 in two threads.
- trainer_count=4:
  - It will fail (recall that we have assumed there are 4 GPUs in the machine), because the argument `allow_only_one_model_on_one_gpu` is true by default.
**Allocation of device ID when `device!=-1`**:
```
(deviceId + gpu_id + threadId * numLogicalDevices_) % numDevices_

deviceId: specified per layer
gpu_id: 0 by default
threadId: thread ID, in the range 0, 1, ..., trainer_count - 1
numDevices_: number of devices (GPUs) in the machine
numLogicalDevices_: min(max(deviceId + 1), numDevices_), where the max is taken over all layers
```
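As an illustration, the following sketch (plain Python, not Paddle code) evaluates the formula for a machine with 4 GPUs, trainer_count=2, and layers with deviceId 0 and 1:
```
# Evaluate the device allocation formula for each thread and layer.
num_devices = 4                       # GPUs in the machine
gpu_id = 0                            # default
layer_device_ids = [0, 1]             # deviceIds used by the layers
num_logical = min(max(d + 1 for d in layer_device_ids), num_devices)

for thread_id in range(2):            # threadId: 0 .. trainer_count - 1
    for device_id in layer_device_ids:
        gpu = (device_id + gpu_id + thread_id * num_logical) % num_devices
        print('threadId=%d deviceId=%d -> GPU %d' % (thread_id, device_id, gpu))
```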