Contents
- LSTM Description
- Model Architecture
- Dataset
- Environment Requirements
- Quick Start
- Script Description
- Model Description
- Description of Random Situation
- ModelZoo Homepage
LSTM Description
This example is for LSTM model training and evaluation.
Paper: Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, Christopher Potts. Learning Word Vectors for Sentiment Analysis. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 2011
Model Architecture
The LSTM model contains embedding, encoder, and decoder modules. The encoder module consists of an LSTM layer. The decoder module consists of a fully connected layer.
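To make the data flow through the three modules concrete, here is a minimal shape-tracing sketch. It is not the code in `src/lstm.py`. The function name `trace_shapes`, the sequence length of 500, and the hidden size of 100 are hypothetical, and the sketch assumes the decoder reads only the final encoder time step, which the actual model may not do.

```python
def trace_shapes(batch_size, seq_len, embed_size, num_hiddens,
                 bidirectional, num_classes):
    """Return the tensor shape produced by each stage of the model."""
    # A bidirectional LSTM concatenates forward and backward hidden states.
    directions = 2 if bidirectional else 1
    return {
        # Integer word ids, one per token.
        "tokens": (batch_size, seq_len),
        # Embedding module: each id becomes an embed_size vector.
        "embedding": (batch_size, seq_len, embed_size),
        # Encoder module (LSTM layer): one hidden vector per time step.
        "encoder": (batch_size, seq_len, num_hiddens * directions),
        # Assumption: only the last time step is passed to the decoder.
        "decoder_input": (batch_size, num_hiddens * directions),
        # Decoder module (fully connected layer): one score per class.
        "logits": (batch_size, num_classes),
    }


if __name__ == "__main__":
    # Illustrative values only; see config.py for the real settings.
    for stage, shape in trace_shapes(64, 500, 300, 100, True, 2).items():
        print(f"{stage:14s} {shape}")
```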
Dataset
You can run the scripts with the dataset used in the original paper or with datasets widely used in the relevant domain or network architecture. The following sections describe how to run the scripts with the datasets below.
- aclImdb_v1 for training and evaluation. Large Movie Review Dataset
- GloVe: Vector representations for words. GloVe: Global Vectors for Word Representation
Environment Requirements
- Hardware(GPU/CPU/Ascend)
- Prepare hardware environment with Ascend, GPU or CPU processor.
- Framework
- For more information, please check the resources below:
Quick Start
- Running on Ascend
  # run training example
  bash run_train_ascend.sh 0 ./aclimdb ./glove_dir
  # run evaluation example
  bash run_eval_ascend.sh 0 ./preprocess lstm-20_390.ckpt
- Running on GPU
  # run training example
  bash run_train_gpu.sh 0 ./aclimdb ./glove_dir
  # run evaluation example
  bash run_eval_gpu.sh 0 ./aclimdb ./glove_dir lstm-20_390.ckpt
- Running on CPU
  # run training example
  bash run_train_cpu.sh ./aclimdb ./glove_dir
  # run evaluation example
  bash run_eval_cpu.sh ./aclimdb ./glove_dir lstm-20_390.ckpt
Script Description
Script and Sample Code
.
├── lstm
├── README.md # descriptions about LSTM
├── script
│ ├── run_eval_gpu.sh # shell script for evaluation on GPU
│ ├── run_eval_ascend.sh # shell script for evaluation on Ascend
│ ├── run_eval_cpu.sh # shell script for evaluation on CPU
│ ├── run_train_gpu.sh # shell script for training on GPU
│ ├── run_train_ascend.sh # shell script for training on Ascend
│ └── run_train_cpu.sh # shell script for training on CPU
├── src
│ ├── config.py # parameter configuration
│ ├── dataset.py # dataset preprocess
│ ├── imdb.py # imdb dataset read script
│ ├── lr_schedule.py # dynamic_lr script
│ └── lstm.py # Sentiment model
├── eval.py # evaluation script on GPU, CPU and Ascend
└── train.py # training script on GPU, CPU and Ascend
Script Parameters
Training Script Parameters
usage: train.py [-h] [--preprocess {true, false}] [--aclimdb_path ACLIMDB_PATH]
[--glove_path GLOVE_PATH] [--preprocess_path PREPROCESS_PATH]
[--ckpt_path CKPT_PATH] [--pre_trained PRE_TRAINING]
[--device_target {GPU, CPU, Ascend}]
Mindspore LSTM Example
options:
-h, --help # show this help message and exit
--preprocess {true, false} # whether to preprocess data.
--aclimdb_path ACLIMDB_PATH # path where the dataset is stored.
--glove_path GLOVE_PATH # path where the GloVe is stored.
--preprocess_path PREPROCESS_PATH # path where the pre-process data is stored.
--ckpt_path CKPT_PATH # the path to save the checkpoint file.
--pre_trained # the pretrained checkpoint file path.
--device_target # the target device to run, support "GPU", "CPU", "Ascend". Default: "Ascend".
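As a rough illustration, the options listed above could be declared with `argparse` as follows. This is a reconstruction from the usage text, not the parser in `train.py`. The function name `build_parser` is hypothetical, and every default except the documented `"Ascend"` for `--device_target` is an assumption.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented train.py options."""
    parser = argparse.ArgumentParser(description="Mindspore LSTM Example")
    parser.add_argument("--preprocess", type=str, default="false",
                        choices=["true", "false"],
                        help="whether to preprocess data.")
    # Path defaults are assumptions; the README does not state them.
    parser.add_argument("--aclimdb_path", type=str, default=None,
                        help="path where the dataset is stored.")
    parser.add_argument("--glove_path", type=str, default=None,
                        help="path where the GloVe is stored.")
    parser.add_argument("--preprocess_path", type=str, default=None,
                        help="path where the pre-process data is stored.")
    parser.add_argument("--ckpt_path", type=str, default=None,
                        help="the path to save the checkpoint file.")
    parser.add_argument("--pre_trained", type=str, default=None,
                        help="the pretrained checkpoint file path.")
    # The README documents "Ascend" as the default device target.
    parser.add_argument("--device_target", type=str, default="Ascend",
                        choices=["GPU", "CPU", "Ascend"],
                        help="the target device to run.")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```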
Running Options
config.py:
GPU/CPU:
num_classes # classes num
dynamic_lr # if use dynamic learning rate
learning_rate # value of learning rate
momentum # value of momentum
num_epochs # epoch size
batch_size # batch size of input dataset
embed_size # the size of each embedding vector
num_hiddens # number of features of hidden layer
num_layers # number of layers of stacked LSTM
bidirectional # specifies whether it is a bidirectional LSTM
save_checkpoint_steps # steps for saving checkpoint files
Ascend:
num_classes # classes num
momentum # value of momentum
num_epochs # epoch size
batch_size # batch size of input dataset
embed_size # the size of each embedding vector
num_hiddens # number of features of hidden layer
num_layers # number of layers of stacked LSTM
bidirectional # specifies whether it is a bidirectional LSTM
save_checkpoint_steps # steps for saving checkpoint files
keep_checkpoint_max # max num of checkpoint files
dynamic_lr # if use dynamic learning rate
lr_init # init learning rate of Dynamic learning rate
lr_end # end learning rate of Dynamic learning rate
lr_max # max learning rate of Dynamic learning rate
lr_adjust_epoch # Dynamic learning rate adjust epoch
warmup_epochs # warmup epochs
global_step # global step
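The Ascend options `lr_init`, `lr_max`, `lr_end`, and `warmup_epochs` suggest a warmup phase followed by a decay phase. The sketch below shows one plausible schedule with linear warmup and linear decay. The actual curve implemented in `src/lr_schedule.py` is not described here and may differ. The function name `warmup_decay_lr` and the example values are hypothetical.

```python
def warmup_decay_lr(lr_init, lr_end, lr_max, warmup_epochs,
                    num_epochs, steps_per_epoch):
    """Return one learning rate per training step.

    Assumption: linear warmup from lr_init to lr_max, then linear
    decay from lr_max towards lr_end over the remaining steps.
    """
    total_steps = num_epochs * steps_per_epoch
    warmup_steps = warmup_epochs * steps_per_epoch
    decay_steps = total_steps - warmup_steps
    rates = []
    for step in range(total_steps):
        if step < warmup_steps:
            # Warmup phase: ramp up towards lr_max.
            lr = lr_init + (lr_max - lr_init) * step / warmup_steps
        else:
            # Decay phase: move from lr_max towards lr_end.
            progress = (step - warmup_steps) / decay_steps
            lr = lr_max - (lr_max - lr_end) * progress
        rates.append(lr)
    return rates


if __name__ == "__main__":
    # Illustrative values only; 390 steps per epoch matches the training log.
    schedule = warmup_decay_lr(0.0, 0.0001, 0.1, 1, 20, 390)
    print(len(schedule), schedule[0], max(schedule), schedule[-1])
```

A list with one value per step is convenient because it can be indexed by the current global step.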
Network Parameters
Dataset Preparation
- Download the dataset aclImdb_v1.
  Unzip the aclImdb_v1 dataset to any path you want. The folder structure should be as follows:
  .
  ├── train  # train dataset
  └── test   # infer dataset
- Download the GloVe file.
  Unzip glove.6B.zip to any path you want. The folder structure should be as follows:
  .
  ├── glove.6B.100d.txt
  ├── glove.6B.200d.txt
  ├── glove.6B.300d.txt  # we will use this one later.
  └── glove.6B.50d.txt
  Add a new line at the beginning of the file named glove.6B.300d.txt. The line states that the file holds a total of 400,000 words, each represented by a 300-dimensional word vector:
  400000 300
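The header line can be added by hand in an editor, but the file is large, so a small script may be more convenient. The following sketch streams the file into a temporary copy with the header in front and then replaces the original. The helper name `prepend_header` is hypothetical and is not part of this repository.

```python
import os
import shutil
import tempfile


def prepend_header(path, vocab_size=400000, dim=300):
    """Insert '<vocab_size> <dim>' as the first line of a GloVe text file.

    Returns False (and changes nothing) if the header is already present.
    """
    header = f"{vocab_size} {dim}\n"
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "r", encoding="utf-8") as src:
        if src.readline() == header:
            return False
        src.seek(0)
        # Write to a temporary file in the same directory so the final
        # replace stays on one filesystem.
        with tempfile.NamedTemporaryFile("w", encoding="utf-8",
                                         dir=directory, delete=False) as tmp:
            tmp.write(header)
            shutil.copyfileobj(src, tmp)  # stream; do not load the file
    os.replace(tmp.name, path)
    return True


if __name__ == "__main__":
    # Assumes the GloVe archive was unzipped into ./glove_dir.
    target = os.path.join("glove_dir", "glove.6B.300d.txt")
    if os.path.exists(target):
        print("header added" if prepend_header(target) else "header present")
```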
Training Process
- Set options in config.py, including learning rate and network hyperparameters.
- Running on Ascend
  Run bash run_train_ascend.sh for training:
  bash run_train_ascend.sh 0 ./aclimdb ./glove_dir
  The above shell script will train in the background. You will get the loss values as follows:
  # grep "loss is " log.txt
  epoch: 1 step: 390, loss is 0.6003723
  epoch: 2 step: 390, loss is 0.35312173
  ...
- Running on GPU
  Run bash run_train_gpu.sh for training:
  bash run_train_gpu.sh 0 ./aclimdb ./glove_dir
  The above shell script will run distributed training in the background. You will get the loss values as follows:
  # grep "loss is " log.txt
  epoch: 1 step: 390, loss is 0.6003723
  epoch: 2 step: 390, loss is 0.35312173
  ...
- Running on CPU
  Run bash run_train_cpu.sh for training:
  bash run_train_cpu.sh ./aclimdb ./glove_dir
  The above shell script will train in the background. You will get the loss values as follows:
  # grep "loss is " log.txt
  epoch: 1 step: 390, loss is 0.6003723
  epoch: 2 step: 390, loss is 0.35312173
  ...
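To follow the loss over time without reading the log by eye, the lines shown above can be parsed with a short script. This is a sketch that assumes the log format matches the sample output. The helper name `parse_losses` is hypothetical.

```python
import re

# Matches "<epoch> step: <step>, loss is <value>" and ignores the leading
# label, so small variations in the label spelling do not matter.
LOSS_LINE = re.compile(
    r"(\d+)\s+step:\s*(\d+),\s*loss is\s*(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"
)


def parse_losses(lines):
    """Return a list of (epoch, step, loss) tuples found in the log lines."""
    records = []
    for line in lines:
        match = LOSS_LINE.search(line)
        if match:
            epoch, step, loss = match.groups()
            records.append((int(epoch), int(step), float(loss)))
    return records


if __name__ == "__main__":
    sample = [
        "epoch: 1 step: 390, loss is 0.6003723",
        "epoch: 2 step: 390, loss is 0.35312173",
    ]
    for epoch, step, loss in parse_losses(sample):
        print(f"epoch {epoch:2d}  step {step}  loss {loss:.4f}")
```

To read the real log, pass an open file object, for example `parse_losses(open("log.txt", encoding="utf-8"))`.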
Evaluation Process
- Evaluation on Ascend
  Run bash run_eval_ascend.sh for evaluation:
  bash run_eval_ascend.sh 0 ./preprocess lstm-20_390.ckpt
- Evaluation on GPU
  Run bash run_eval_gpu.sh for evaluation:
  bash run_eval_gpu.sh 0 ./aclimdb ./glove_dir lstm-20_390.ckpt
- Evaluation on CPU
  Run bash run_eval_cpu.sh for evaluation:
  bash run_eval_cpu.sh ./aclimdb ./glove_dir lstm-20_390.ckpt
Model Description
Performance
Training Performance
Parameters | LSTM (Ascend) | LSTM (GPU) | LSTM (CPU) |
---|---|---|---|
Resource | Ascend 910 | Tesla V100-SXM2-16GB | Ubuntu X86-i7-8565U-16GB |
Uploaded Date | 12/21/2020 (month/day/year) | 10/28/2020 (month/day/year) | 10/28/2020 (month/day/year) |
MindSpore Version | 1.1.0 | 1.0.0 | 1.0.0 |
Dataset | aclimdb_v1 | aclimdb_v1 | aclimdb_v1 |
Training Parameters | epoch=20, batch_size=64 | epoch=20, batch_size=64 | epoch=20, batch_size=64 |
Optimizer | Momentum | Momentum | Momentum |
Loss Function | Softmax Cross Entropy | Softmax Cross Entropy | Softmax Cross Entropy |
Speed | 1049 | 1022 (1pcs) | 20 |
Loss | 0.12 | 0.12 | 0.12 |
Params (M) | 6.45 | 6.45 | 6.45 |
Checkpoint for inference | 292.9M (.ckpt file) | 292.9M (.ckpt file) | 292.9M (.ckpt file) |
Scripts | lstm script | lstm script | lstm script |
Evaluation Performance
Parameters | LSTM (Ascend) | LSTM (GPU) | LSTM (CPU) |
---|---|---|---|
Resource | Ascend 910 | Tesla V100-SXM2-16GB | Ubuntu X86-i7-8565U-16GB |
Uploaded Date | 12/21/2020 (month/day/year) | 10/28/2020 (month/day/year) | 10/28/2020 (month/day/year) |
MindSpore Version | 1.1.0 | 1.0.0 | 1.0.0 |
Dataset | aclimdb_v1 | aclimdb_v1 | aclimdb_v1 |
batch_size | 64 | 64 | 64 |
Accuracy | 85% | 84% | 83% |
Description of Random Situation
Randomness comes from the following sources:
- Shuffle of the dataset.
- Initialization of some model weights.
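Both sources can in principle be made repeatable by fixing a seed. The sketch below illustrates the idea for dataset shuffling using only Python's standard library. It does not use the framework's own seeding calls, which this README does not describe, and the helper name `seeded_shuffle` is hypothetical.

```python
import random


def seeded_shuffle(items, seed):
    """Return a shuffled copy of items that is identical for a given seed."""
    rng = random.Random(seed)  # private generator; global state untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    indices = range(10)
    first = seeded_shuffle(indices, seed=1)
    second = seeded_shuffle(indices, seed=1)
    # Same seed, same order: two runs see the samples in the same sequence.
    print(first == second)
```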
ModelZoo Homepage
Please check the official homepage.