- LSTM Description
- Model Architecture
- Dataset
- Environment Requirements
- Quick Start
- Script Description
- Model Description
  - Performance
    - Training Performance
    - Evaluation Performance
- Description of Random Situation
- ModelZoo Homepage

LSTM Description

This example is for LSTM model training and evaluation.

Paper: Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, Christopher Potts. Learning Word Vectors for Sentiment Analysis. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 2011

Model Architecture

LSTM contains embedding, encoder and decoder modules. The encoder module consists of an LSTM layer. The decoder module consists of a fully connected layer.
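
The flow described above (embedding lookup, LSTM encoder, fully connected decoder) can be sketched in plain Python. This is only an illustrative toy: it is single-layer and unidirectional, uses random weights and made-up sizes, and is not the code in src/lstm.py, which builds the network from MindSpore layers and may be stacked or bidirectional depending on config.py. All function names below (`lstm_step`, `make_model`, `forward`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def lstm_step(x, h, c, weights, bias):
    """One LSTM time step; `weights` has 4*H rows over the concatenation [x, h]."""
    hidden = len(h)
    gates = [g + b for g, b in zip(matvec(weights, x + h), bias)]
    i_gate = [sigmoid(g) for g in gates[:hidden]]                  # input gate
    f_gate = [sigmoid(g) for g in gates[hidden:2 * hidden]]        # forget gate
    cand = [math.tanh(g) for g in gates[2 * hidden:3 * hidden]]    # candidate cell
    o_gate = [sigmoid(g) for g in gates[3 * hidden:]]              # output gate
    c_new = [f * c_k + i * g for f, c_k, i, g in zip(f_gate, c, i_gate, cand)]
    h_new = [o * math.tanh(c_k) for o, c_k in zip(o_gate, c_new)]
    return h_new, c_new


def make_model(vocab_size, embed_size, num_hiddens, num_classes, seed=0):
    rng = random.Random(seed)
    return {
        "embedding": rand_matrix(vocab_size, embed_size, rng),
        "lstm_w": rand_matrix(4 * num_hiddens, embed_size + num_hiddens, rng),
        "lstm_b": [0.0] * (4 * num_hiddens),
        "dense_w": rand_matrix(num_classes, num_hiddens, rng),
        "dense_b": [0.0] * num_classes,
        "num_hiddens": num_hiddens,
    }


def forward(model, token_ids):
    """Embedding lookup -> LSTM encoder over the sequence -> dense decoder."""
    h = [0.0] * model["num_hiddens"]
    c = [0.0] * model["num_hiddens"]
    for token in token_ids:
        x = model["embedding"][token]                                  # embedding
        h, c = lstm_step(x, h, c, model["lstm_w"], model["lstm_b"])    # encoder
    logits = matvec(model["dense_w"], h)                               # decoder
    return [value + b for value, b in zip(logits, model["dense_b"])]


# Sizes are placeholders chosen for readability, not the values in config.py.
model = make_model(vocab_size=10, embed_size=4, num_hiddens=3, num_classes=2)
logits = forward(model, [1, 5, 7, 2])
print(len(logits))  # one score per class
```

In this sketch the decoder only sees the final hidden state; the real model may combine encoder outputs differently.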

Dataset

You can run the scripts with the dataset used in the original paper or with datasets widely used for this domain and network architecture. The following sections show how to run the scripts with the datasets below.

- aclImdb_v1 (Large Movie Review Dataset) for training and evaluation.
- GloVe (Global Vectors for Word Representation): vector representations for words.

Environment Requirements

- Hardware (GPU/CPU/Ascend)
  - Prepare hardware environment with Ascend, GPU or CPU processor.
- Framework
  - MindSpore
- For more information, please check the resources below:
  - MindSpore Tutorials
  - MindSpore Python API

Quick Start

running on Ascend

# run training example
bash run_train_ascend.sh 0 ./aclimdb ./glove_dir

# run evaluation example
bash run_eval_ascend.sh 0 ./preprocess lstm-20_390.ckpt

running on GPU

# run training example
bash run_train_gpu.sh 0 ./aclimdb ./glove_dir

# run evaluation example
bash run_eval_gpu.sh 0 ./aclimdb ./glove_dir lstm-20_390.ckpt

running on CPU

# run training example
bash run_train_cpu.sh ./aclimdb ./glove_dir

# run evaluation example
bash run_eval_cpu.sh ./aclimdb ./glove_dir lstm-20_390.ckpt

Script Description

Script and Sample Code

.
├── lstm
    ├── README.md               # descriptions about LSTM
    ├── script
    │   ├── run_eval_gpu.sh     # shell script for evaluation on GPU
    │   ├── run_eval_ascend.sh  # shell script for evaluation on Ascend
    │   ├── run_eval_cpu.sh     # shell script for evaluation on CPU
    │   ├── run_train_gpu.sh    # shell script for training on GPU
    │   ├── run_train_ascend.sh # shell script for training on Ascend
    │   └── run_train_cpu.sh    # shell script for training on CPU
    ├── src
    │   ├── config.py           # parameter configuration
    │   ├── dataset.py          # dataset preprocess
    │   ├── imdb.py             # imdb dataset read script
    │   ├── lr_schedule.py      # dynamic_lr script
    │   └── lstm.py             # Sentiment model
    ├── eval.py                 # evaluation script on GPU, CPU and Ascend
    └── train.py                # training script on GPU, CPU and Ascend

Script Parameters

Training Script Parameters

usage: train.py  [-h] [--preprocess {true, false}] [--aclimdb_path ACLIMDB_PATH]
                 [--glove_path GLOVE_PATH] [--preprocess_path PREPROCESS_PATH]
                 [--ckpt_path CKPT_PATH] [--pre_trained PRE_TRAINING]
                 [--device_target {GPU, CPU, Ascend}]

Mindspore LSTM Example

options:
  -h, --help                          # show this help message and exit
  --preprocess {true, false}          # whether to preprocess data.
  --aclimdb_path ACLIMDB_PATH         # path where the dataset is stored.
  --glove_path GLOVE_PATH             # path where the GloVe is stored.
  --preprocess_path PREPROCESS_PATH   # path where the pre-process data is stored.
  --ckpt_path CKPT_PATH               # the path to save the checkpoint file.
  --pre_trained                       # the pretrained checkpoint file path.
  --device_target                     # the target device to run, support "GPU", "CPU", "Ascend". Default: "Ascend".
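
The usage text above maps naturally onto an `argparse` parser. The following is a minimal sketch that mirrors the listed options; it is not the parser in train.py. Only the `--device_target` default is stated in this README, so every other default is left as `None` here, which is an assumption.

```python
import argparse


def build_parser():
    """Parser mirroring the options listed above (illustrative only)."""
    parser = argparse.ArgumentParser(description="MindSpore LSTM Example")
    parser.add_argument("--preprocess", choices=["true", "false"],
                        help="whether to preprocess data")
    parser.add_argument("--aclimdb_path",
                        help="path where the dataset is stored")
    parser.add_argument("--glove_path",
                        help="path where the GloVe is stored")
    parser.add_argument("--preprocess_path",
                        help="path where the pre-process data is stored")
    parser.add_argument("--ckpt_path",
                        help="the path to save the checkpoint file")
    parser.add_argument("--pre_trained",
                        help="the pretrained checkpoint file path")
    parser.add_argument("--device_target", default="Ascend",
                        choices=["GPU", "CPU", "Ascend"],
                        help="the target device to run")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args([
    "--preprocess", "true",
    "--aclimdb_path", "./aclimdb",
    "--glove_path", "./glove_dir",
    "--device_target", "GPU",
])
print(args.device_target, args.preprocess)
```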

Running Options

config.py:
GPU/CPU:
    num_classes                   # classes num
    dynamic_lr                    # if use dynamic learning rate
    learning_rate                 # value of learning rate
    momentum                      # value of momentum
    num_epochs                    # epoch size
    batch_size                    # batch size of input dataset
    embed_size                    # the size of each embedding vector
    num_hiddens                   # number of features of hidden layer
    num_layers                    # number of layers of stacked LSTM
    bidirectional                 # specifies whether it is a bidirectional LSTM
    save_checkpoint_steps         # steps for saving checkpoint files

Ascend:
    num_classes                   # classes num
    momentum                      # value of momentum
    num_epochs                    # epoch size
    batch_size                    # batch size of input dataset
    embed_size                    # the size of each embedding vector
    num_hiddens                   # number of features of hidden layer
    num_layers                    # number of layers of stacked LSTM
    bidirectional                 # specifies whether it is a bidirectional LSTM
    save_checkpoint_steps         # steps for saving checkpoint files
    keep_checkpoint_max           # max num of checkpoint files
    dynamic_lr                    # if use dynamic learning rate
    lr_init                       # init learning rate of Dynamic learning rate
    lr_end                        # end learning rate of Dynamic learning rate
    lr_max                        # max learning rate of Dynamic learning rate
    lr_adjust_epoch               # Dynamic learning rate adjust epoch
    warmup_epochs                 # warmup epochs
    global_step                   # global step
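
The Ascend options `lr_init`, `lr_max`, `lr_end`, `warmup_epochs` and `lr_adjust_epoch` suggest a learning rate that warms up and then decays. One common shape for such a schedule is sketched below: a linear warmup from `lr_init` to `lr_max`, a cosine decay to `lr_end` until `lr_adjust_epoch`, then a constant rate. This shape is an assumption for illustration; the actual rule is defined in src/lr_schedule.py and may differ. The learning rate values are placeholders, not the repository's defaults.

```python
import math


def dynamic_lr(lr_init, lr_end, lr_max, warmup_epochs, lr_adjust_epoch,
               total_epochs, steps_per_epoch):
    """Return one learning rate per training step (hypothetical schedule)."""
    warmup_steps = warmup_epochs * steps_per_epoch
    adjust_steps = lr_adjust_epoch * steps_per_epoch
    total_steps = total_epochs * steps_per_epoch
    rates = []
    for step in range(total_steps):
        if step < warmup_steps:
            # Linear ramp from lr_init up to lr_max.
            lr = lr_init + (lr_max - lr_init) * step / warmup_steps
        elif step < adjust_steps:
            # Cosine decay from lr_max down to lr_end.
            progress = (step - warmup_steps) / (adjust_steps - warmup_steps)
            lr = lr_end + 0.5 * (lr_max - lr_end) * (1 + math.cos(math.pi * progress))
        else:
            # After the adjustment window the rate stays at lr_end.
            lr = lr_end
        rates.append(lr)
    return rates


# 20 epochs and 390 steps per epoch follow the logs and tables in this README;
# the learning rate values and epoch boundaries are placeholders.
schedule = dynamic_lr(lr_init=0.0, lr_end=0.0001, lr_max=0.1,
                      warmup_epochs=1, lr_adjust_epoch=6,
                      total_epochs=20, steps_per_epoch=390)
print(len(schedule), schedule[0], schedule[-1])
```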

Network Parameters

Dataset Preparation

Download the dataset aclImdb_v1.

Unzip the aclImdb_v1 dataset to any path you want and the folder structure should be as follows:
```
.
├── train  # train dataset
└── test   # infer dataset
```
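
Before training, it can help to confirm that the unzipped folder has the layout shown above. The helper below is a small sketch with a hypothetical name (`missing_subdirs`); it only checks for the `train` and `test` folders and does not inspect their contents.

```python
from pathlib import Path


def missing_subdirs(root, expected=("train", "test")):
    """Return the expected sub-folders that are absent under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


# "./aclimdb" is the dataset path used in the example commands of this README.
missing = missing_subdirs("./aclimdb")
if missing:
    print("missing folders:", ", ".join(missing))
else:
    print("dataset layout looks complete")
```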
Download the GloVe file.

Unzip the glove.6B.zip to any path you want and the folder structure should be as follows:
```
.
├── glove.6B.100d.txt
├── glove.6B.200d.txt
├── glove.6B.300d.txt    # we will use this one later.
└── glove.6B.50d.txt
```
Add a new line at the beginning of the file named glove.6B.300d.txt. It indicates that the file contains a total of 400,000 words, each represented by a 300-dimensional word vector.
```
400000    300
```
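
Because the GloVe file is large, editing it by hand can be awkward. The sketch below prepends the header line shown above by streaming the file into a temporary copy and then replacing the original. The function name `prepend_header` is hypothetical, and the four-space separator simply reproduces the line above; whether the loader needs that exact spacing is not stated here.

```python
import os
import shutil
import tempfile


def prepend_header(path, vocab_size, dim):
    """Insert '<vocab_size>    <dim>' as the first line of `path`.

    Returns False without changing the file if the header is already there.
    """
    header = f"{vocab_size}    {dim}".encode("utf-8")
    with open(path, "rb") as src:
        if src.readline().split() == header.split():
            return False  # header already present
    directory = os.path.dirname(os.path.abspath(path))
    # Write header plus original content to a temp file in the same folder,
    # so the final os.replace stays on one filesystem.
    with tempfile.NamedTemporaryFile("wb", dir=directory, delete=False) as tmp:
        tmp.write(header + b"\n")
        with open(path, "rb") as src:
            shutil.copyfileobj(src, tmp)
    os.replace(tmp.name, path)
    return True


# Example call (commented out so the sketch has no side effects when run):
# prepend_header("./glove_dir/glove.6B.300d.txt", 400000, 300)
```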

Training Process

Set options in config.py, including learning rate and network hyperparameters.

running on Ascend

Run bash run_train_ascend.sh for training.

bash run_train_ascend.sh 0 ./aclimdb ./glove_dir

The above shell script will train in the background. You will get the loss value as follows:

# grep "loss is " log.txt
epoch: 1 step: 390, loss is 0.6003723
epoch: 2 step: 390, loss is 0.35312173
...

running on GPU

Run bash run_train_gpu.sh for training.

bash run_train_gpu.sh 0 ./aclimdb ./glove_dir

The above shell script will run distributed training in the background. You will get the loss value as follows:

# grep "loss is " log.txt
epoch: 1 step: 390, loss is 0.6003723
epoch: 2 step: 390, loss is 0.35312173
...

running on CPU

Run bash run_train_cpu.sh for training.

bash run_train_cpu.sh ./aclimdb ./glove_dir

The above shell script will train in the background. You will get the loss value as follows:

# grep "loss is " log.txt
epoch: 1 step: 390, loss is 0.6003723
epoch: 2 step: 390, loss is 0.35312173
...
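
The loss lines that grep pulls out of log.txt can also be parsed in Python, for example to plot or compare runs. The sketch below assumes the line format shown in the samples above; the names `LOSS_LINE` and `parse_losses` are hypothetical, and the pattern deliberately ignores the leading label so that it matches any spelling of it.

```python
import re

# Matches "<epoch> step: <step>, loss is <value>" anywhere in a line.
LOSS_LINE = re.compile(r"(\d+)\s+step:\s*(\d+),\s*loss is\s*([-+0-9.eE]+)")


def parse_losses(lines):
    """Return (epoch, step, loss) tuples for every line that reports a loss."""
    records = []
    for line in lines:
        match = LOSS_LINE.search(line)
        if match:
            epoch, step, loss = match.groups()
            records.append((int(epoch), int(step), float(loss)))
    return records


# Sample lines copied from the log excerpt in this README.
sample = [
    "epoch: 1 step: 390, loss is 0.6003723",
    "epoch: 2 step: 390, loss is 0.35312173",
]
for epoch, step, loss in parse_losses(sample):
    print(epoch, step, loss)
```

To read a real log, pass an open file object, for example `parse_losses(open("log.txt", encoding="utf-8"))`.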

Evaluation Process

evaluation on Ascend

Run bash run_eval_ascend.sh for evaluation.
```
bash run_eval_ascend.sh 0 ./preprocess lstm-20_390.ckpt
```

evaluation on GPU

Run bash run_eval_gpu.sh for evaluation.

bash run_eval_gpu.sh 0 ./aclimdb ./glove_dir lstm-20_390.ckpt

evaluation on CPU

Run bash run_eval_cpu.sh for evaluation.

bash run_eval_cpu.sh ./aclimdb ./glove_dir lstm-20_390.ckpt

Model Description

Performance

Training Performance

| Parameters | LSTM (Ascend) | LSTM (GPU) | LSTM (CPU) |
| --- | --- | --- | --- |
| Resource | Ascend 910 | Tesla V100-SXM2-16GB | Ubuntu X86-i7-8565U-16GB |
| Uploaded Date | 12/21/2020 (month/day/year) | 10/28/2020 (month/day/year) | 10/28/2020 (month/day/year) |
| MindSpore Version | 1.1.0 | 1.0.0 | 1.0.0 |
| Dataset | aclImdb_v1 | aclImdb_v1 | aclImdb_v1 |
| Training Parameters | epoch=20, batch_size=64 | epoch=20, batch_size=64 | epoch=20, batch_size=64 |
| Optimizer | Momentum | Momentum | Momentum |
| Loss Function | Softmax Cross Entropy | Softmax Cross Entropy | Softmax Cross Entropy |
| Speed | 1049 | 1022 (1pcs) | 20 |
| Loss | 0.12 | 0.12 | 0.12 |
| Params (M) | 6.45 | 6.45 | 6.45 |
| Checkpoint for inference | 292.9M (.ckpt file) | 292.9M (.ckpt file) | 292.9M (.ckpt file) |
| Scripts | lstm script | lstm script | lstm script |

Evaluation Performance

| Parameters | LSTM (Ascend) | LSTM (GPU) | LSTM (CPU) |
| --- | --- | --- | --- |
| Resource | Ascend 910 | Tesla V100-SXM2-16GB | Ubuntu X86-i7-8565U-16GB |
| Uploaded Date | 12/21/2020 (month/day/year) | 10/28/2020 (month/day/year) | 10/28/2020 (month/day/year) |
| MindSpore Version | 1.1.0 | 1.0.0 | 1.0.0 |
| Dataset | aclImdb_v1 | aclImdb_v1 | aclImdb_v1 |
| batch_size | 64 | 64 | 64 |
| Accuracy | 85% | 84% | 83% |

Description of Random Situation

There are two sources of randomness:

- Shuffle of the dataset.
- Initialization of some model weights.
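
The effect of a fixed seed on dataset shuffling can be illustrated with the Python standard library alone. This sketch is not how the training scripts seed themselves; MindSpore has its own seeding mechanism, which is not shown here, and the helper name `shuffled_indices` is hypothetical.

```python
import random


def shuffled_indices(n, seed):
    """Return a permutation of range(n) that depends only on `seed`."""
    rng = random.Random(seed)   # private generator, leaves global state untouched
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


# The same seed reproduces the same sample order on every call.
print(shuffled_indices(8, seed=1) == shuffled_indices(8, seed=1))
```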

ModelZoo Homepage

Please check the official homepage.
