!8727 Clean code for gnmt_v2 network

From: @gaojing22 Reviewed-by: @yingjy,@guoqi1024 Signed-off-by: @yingjy
4 years ago · 7689062c7d
parent 044c7d183c a23ddaf409
commit 7689062c7d
15 changed files with 385 additions and 132 deletions
--- a/model_zoo/official/nlp/gnmt_v2/README.md
+++ b/model_zoo/official/nlp/gnmt_v2/README.md
@ -12,16 +12,12 @@
    - [Dataset Preparation](#dataset-preparation)
    - [Configuration File](#configuration-file)
    - [Training Process](#training-process)
-    - [Evaluation Process](#evaluation-process)
+    - [Inference Process](#inference-process)
 - [Model Description](#model-description)
    - [Performance](#performance)
        - [Result](#result)
            - [Training Performance](#training-performance)
            - [Inference Performance](#inference-performance)
    - [Practice](#practice)
        - [Dataset Preprocessing](#dataset-preprocessing)
        - [Training](#training-1)
        - [Inference](#inference-1)
 - [Random Situation Description](#random-situation-description)
 - [Others](#others)
 - [ModelZoo](#modelzoo)
@ -50,8 +46,8 @@ Note that you can run the scripts based on the dataset mentioned in original pap
 - Framework
  - Install [MindSpore](https://www.mindspore.cn/install/en).
 - For more information, please check the resources below:
-  - [MindSpore tutorials](https://www.mindspore.cn/tutorial/en/master/index.html) 
+  - [MindSpore tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) 
-  - [MindSpore API](https://www.mindspore.cn/api/en/master/index.html)
+  - [MindSpore API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
 ## Software
 ```txt
@ -62,18 +58,26 @@ subword_nmt==0.3.7
 ```
 # [Quick Start](#contents)
 The process of GNMTv2 performing the text translation task is as follows:
 1. Download the wmt16 data corpus and extract the dataset. For details, see the chapter "_Dataset_" above.
 2. Dataset preparation and configuration.
 3. Training.
 4. Inference.
 After dataset preparation, you can start training and evaluation as follows: 
 ```bash
 # run training example
-python train.py --config /home/workspace/gnmt_v2/config/config.json
+cd ./scripts
 sh run_standalone_train_ascend.sh DATASET_SCHEMA_TRAIN PRE_TRAIN_DATASET
 # run distributed training example
 cd ./scripts
-sh run_distributed_train_ascend.sh
+sh run_distributed_train_ascend.sh RANK_TABLE_ADDR DATASET_SCHEMA_TRAIN PRE_TRAIN_DATASET
 # run evaluation example
 cd ./scripts
-sh run_standalone_eval_ascend.sh
+sh run_standalone_eval_ascend.sh DATASET_SCHEMA_TEST TEST_DATASET EXISTED_CKPT_PATH \
  VOCAB_ADDR BPE_CODE_ADDR TEST_TARGET
 ```
 # Script Description
@ -130,7 +134,7 @@ The GNMT network script and code result are as follows:
 ```
 ## Dataset Preparation
-You may use this [shell script](https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/Translation/GNMT/scripts/wmt16_en_de.sh) to download and preprocess WMT English-German dataset. Assuming you get the following files:
+You may use this [shell script](https://github.com/NVIDIA/DeepLearningExamples/blob/master/TensorFlow/Translation/GNMT/scripts/wmt16_en_de.sh) to download and preprocess WMT English-German dataset. Assuming you get the following files:
  - train.tok.clean.bpe.32000.en
  - train.tok.clean.bpe.32000.de
  - vocab.bpe.32000
@ -146,35 +150,60 @@ You may use this [shell script](https://github.com/NVIDIA/DeepLearningExamples/b
 ## Configuration File
 The JSON file in the `config/` directory is the template configuration file.
-Almost all required options and parameters can be easily assigned, including the training platform, dataset and model configuration, and optimizer parameters. By setting the corresponding options, you can also obtain optional functions such as loss scale and checkpoint.
+Almost all required options and parameters can be easily assigned, including the training platform, model configuration, and optimizer parameters.
-For more information about attributes, see the `config/config.py` file.
+- config for GNMTv2
  ```python
  'random_seed': 50         # global random seed
  'epochs':6                # total training epochs
  'batch_size': 128         # training batch size
  'dataset_sink_mode': true # whether use dataset sink mode
  'seq_length': 51          # max length of source sentences
  'vocab_size': 32320       # vocabulary size
  'hidden_size': 125        # the output's last dimension of dynamicRNN
  'initializer_range': 0.1  # initializer range
  'max_decode_length': 125  # max length of decoder
  'lr': 0.1                 # initial learning rate
  'lr_scheduler': 'WarmupMultiStepLR'  # learning rate scheduler
  'existed_ckpt': ''        # the absolute full path to save the checkpoint file
  ```
 For more configuration details, please refer the script `config/config.py` file.
 ## Training Process
-The model training requires the shell script `scripts/run_standalone_train_ascend.sh`. In this script, set environment variables and the training script `train.py` to be executed in `gnmt_v2/`.
+For a pre-trained model, configure the following options in the `scripts/run_standalone_train_ascend.json` file:
-Start task training on a single device and run the following command in bash:
+- Select an optimizer ('momentum/adam/lamb' is available).
 - Specify `ckpt_prefix` and `ckpt_path` in `checkpoint_path` to save the model file.
 - Set other parameters, including dataset configuration and network configuration.
 - If a pre-trained model exists, assign `existed_ckpt` to the path of the existing model during fine-tuning.
 Start task training on a single device and run the shell script `scripts/run_standalone_train_ascend.sh`:
 ```bash
 cd ./scripts
-sh run_standalone_train_ascend.sh
+sh run_standalone_train_ascend.sh DATASET_SCHEMA_TRAIN PRE_TRAIN_DATASET
 ```
- or multiple devices 
+In this script, the `DATASET_SCHEMA_TRAIN` and `PRE_TRAIN_DATASET` are the dataset schema and dataset address.
 Run `scripts/run_distributed_train_ascend.sh` for distributed training of GNMTv2 model.
 Task training on multiple devices and run the following command in bash to be executed in `scripts/`.:
 ```bash
 cd ./scripts
-sh run_distributed_train_ascend.sh
+sh run_distributed_train_ascend.sh RANK_TABLE_ADDR DATASET_SCHEMA_TRAIN PRE_TRAIN_DATASET
 ```
-Note: Ensure that the hccl_json file is assigned when distributed training is running.
+Note: the `RANK_TABLE_ADDR` is the hccl_json file assigned when distributed training is running. 
-Currently, inconsecutive device IDs are not supported in `scripts/run_distributed_train_ascend.sh`. The device ID must start from 0 in the `distribute_script/rank_table_8p.json` file.
+Currently, inconsecutive device IDs are not supported in `scripts/run_distributed_train_ascend.sh`. The device ID must start from 0 in the `RANK_TABLE_ADDR` file.
 ## Evaluation Process
 Set options in `config/config_test.json`. Make sure the 'existed_ckpt', 'dataset_schema' and 'test_dataset' are set to your own path.
-Run `scripts/run_standalone_eval_ascend.sh` to process the output token ids to get the BLEU scores.
+## Inference Process
 For inference using a trained model on multiple hardware platforms, such as Ascend 910.
 Set options in `config/config_test.json`.
 Run the shell script `scripts/run_standalone_eval_ascend.sh` to process the output token ids to get the BLEU scores.
 ```bash
 cd ./scripts
 sh run_standalone_eval_ascend.sh
 sh run_standalone_eval_ascend.sh DATASET_SCHEMA_TEST TEST_DATASET EXISTED_CKPT_PATH \
  VOCAB_ADDR BPE_CODE_ADDR TEST_TARGET
 ```
 The `DATASET_SCHEMA_TEST` and the `TEST_DATASET` are the schema and address of inference dataset respectively, and `EXISTED_CKPT_PATH` is the path of the model file generated during training process.
 The `VOCAB_ADDR` is the vocabulary address, `BPE_CODE_ADDR` is the bpe code address and the `TEST_TARGET` are the path of answers.
 # Model Description
 ## Performance
@ -190,8 +219,9 @@ sh run_standalone_eval_ascend.sh
 | Training Parameters        | epoch=6, batch_size=128                                        |
 | Optimizer                  | Adam                                                           |
 | Loss Function              | Softmax Cross Entropy                                          |
-| BLEU Score                 | 24.05                                                          |
+| outputs                    | probability                                                    |
 | Speed                      | 344ms/step (8pcs)                                              |
 | Total Time                 | 7800s (8pcs)                                                   |
 | Loss                       | 63.35                                                          |
 | Params (M)                 | 613                                                            |
 | Checkpoint for inference   | 1.8G (.ckpt file)                                              |
@ -206,48 +236,10 @@ sh run_standalone_eval_ascend.sh
 | MindSpore Version   | 1.0.0                       |
 | Dataset             | WMT newstest2014            |
 | batch_size          | 128                         |
-| outputs             | BLEU score                  |
+| Total Time          | 1560s                       |
-| Accuracy            | BLEU= 24.05                 |
+| outputs             | probability                 |
-
+| Accuracy            | BLEU Score= 24.05           |
-## Practice
+| Model for inference | 1.8G (.ckpt file)           |
 The process of GNMTv2 performing the text translation task is as follows:
 1. Download the wmt16 data corpus and extract the dataset. For details, see the chapter "_Dataset_" above.
 2. Dataset preprocessing.
 3. Perform training.
 4. Perform inference.
 ### Dataset Preprocessing
 For a pre-trained model, configure the following options in the `config.json` file:
 ```
 python create_dataset.py --src_folder /home/work_space/wmt16_de_en  --output_folder /home/work_space/dataset_menu
 ```
 ### Training
 For a pre-trained model, configure the following options in the `config/config.json` file:
 - Assign `pre_train_dataset` and `dataset_schema` to the training dataset path.
 - Select an optimizer ('momentum/adam/lamb' is available).
 - Specify `ckpt_prefix` and `ckpt_path` in `checkpoint_path` to save the model file.
 - Set other parameters, including dataset configuration and network configuration.
 - If a pre-trained model exists, assign `existed_ckpt` to the path of the existing model during fine-tuning.
 Run the shell script `run.sh`:
 ```bash
 cd ./scripts
 sh run_standalone_train_ascend.sh
 ```
 ### Inference
 For inference using a trained model on multiple hardware platforms, such as GPU, Ascend 910, and Ascend 310, see [Network Migration](https://www.mindspore.cn/tutorial/en/master/advanced_use/network_migration.html).
 For inference interruption, configure the following options in the `config/config.json` file:
 - Assign `test_dataset` and the `dataset_schema` to the inference dataset path.
 - Assign `existed_ckpt` and the `checkpoint_path` to the path of the model file generated during training.
 - Set other parameters, including dataset configuration and network configuration.
 Run the shell script `run.sh`:
 ```bash
 cd ./scripts
 sh run_standalone_eval_ascend.sh
 ```
 # Random Situation Description
 There are three random situations:
@ -260,4 +252,4 @@ Some seeds have already been set in train.py to avoid the randomness of dataset
 This model has been validated in the Ascend environment and is not validated on the CPU and GPU.
 # ModelZoo 主页
- [链接](https://gitee.com/mindspore/mindspore/tree/master/mindspore/model_zoo)
+ [链接](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)
--- a/model_zoo/official/nlp/gnmt_v2/eval.py
+++ b/model_zoo/official/nlp/gnmt_v2/eval.py
@ -15,6 +15,7 @@
 """Evaluation api."""
 import argparse
 import pickle
 import os
 from mindspore.common import dtype as mstype
@ -26,11 +27,17 @@ from src.dataset.tokenizer import Tokenizer
 parser = argparse.ArgumentParser(description='gnmt')
 parser.add_argument("--config", type=str, required=True,
                    help="model config json file path.")
 parser.add_argument("--dataset_schema_test", type=str, required=True,
                    help="dataset schema for evaluation.")
 parser.add_argument("--test_dataset", type=str, required=True,
                    help="test dataset address.")
 parser.add_argument("--existed_ckpt", type=str, required=True,
                    help="existed checkpoint address.")
 parser.add_argument("--vocab", type=str, required=True,
                    help="Vocabulary to use.")
 parser.add_argument("--bpe_codes", type=str, required=True,
                    help="bpe codes to use.")
-parser.add_argument("--test_tgt", type=str, required=False,
+parser.add_argument("--test_tgt", type=str, required=True,
                    default=None,
                    help="data file of the test target")
 parser.add_argument("--output", type=str, required=False,
@ -45,9 +52,20 @@ def get_config(config):
    return config
 def _check_args(config):
    if not os.path.exists(config):
        raise FileNotFoundError("`config` is not existed.")
    if not isinstance(config, str):
        raise ValueError("`config` must be type of str.")
 if __name__ == '__main__':
    args, _ = parser.parse_known_args()
    _check_args(args.config)
    _config = get_config(args.config)
    _config.dataset_schema = args.dataset_schema_test
    _config.test_dataset = args.test_dataset
    _config.existed_ckpt = args.existed_ckpt
    result = infer(_config)
    with open(args.output, "wb") as f:
--- a/model_zoo/official/nlp/gnmt_v2/scripts/run_distributed_train_ascend.sh
+++ b/model_zoo/official/nlp/gnmt_v2/scripts/run_distributed_train_ascend.sh
@ -13,14 +13,31 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # ============================================================================
 echo "=============================================================================================================="
 echo "Please run the scipt as: "
 echo "sh run_distributed_train_ascend.sh RANK_TABLE_ADDR DATASET_SCHEMA_TRAIN PRE_TRAIN_DATASET"
 echo "for example:"
 echo "sh run_distributed_train_ascend.sh \
  /home/workspace/rank_table_8p.json \
  /home/workspace/dataset_menu/train.tok.clean.bpe.32000.en.json \
  /home/workspace/dataset_menu/train.tok.clean.bpe.32000.en.tfrecord-001-of-001"
 echo "It is better to use absolute path."
 echo "=============================================================================================================="
 RANK_TABLE_ADDR=$1
 DATASET_SCHEMA_TRAIN=$2
 PRE_TRAIN_DATASET=$3
 current_exec_path=$(pwd)
 echo ${current_exec_path}
-export RANK_TABLE_FILE=/home/workspace/rank_table_8p.json
+export RANK_TABLE_FILE=$RANK_TABLE_ADDR
-export MINDSPORE_HCCL_CONFIG_PATH=/home/workspace/rank_table_8p.json
+export MINDSPORE_HCCL_CONFIG_PATH=$RANK_TABLE_ADDR
 echo $RANK_TABLE_FILE
 export RANK_SIZE=8
 export GLOG_v=2
 for((i=0;i<=7;i++));
 do
@ -32,7 +49,10 @@ do
    cp -r ../../config .
    export RANK_ID=$i
    export DEVICE_ID=$i
-	python ../../train.py --config /home/workspace/gnmt_v2/config/config.json > log_gnmt_network${i}.log 2>&1 &
+	python ../../train.py \
 	  --config=${current_exec_path}/device${i}/config/config.json \
 	  --dataset_schema_train=$DATASET_SCHEMA_TRAIN \
 	  --pre_train_dataset=$PRE_TRAIN_DATASET > log_gnmt_network${i}.log 2>&1 &
    cd ${current_exec_path} || exit
 done
 cd ${current_exec_path} || exit
--- a/model_zoo/official/nlp/gnmt_v2/scripts/run_standalone_eval_ascend.sh
+++ b/model_zoo/official/nlp/gnmt_v2/scripts/run_standalone_eval_ascend.sh
@ -13,10 +13,36 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # ============================================================================
 echo "=============================================================================================================="
 echo "Please run the scipt as: "
 echo "sh run_standalone_eval_ascend.sh DATASET_SCHEMA_TEST TEST_DATASET EXISTED_CKPT_PATH \
  VOCAB_ADDR BPE_CODE_ADDR TEST_TARGET"
 echo "for example:"
 echo "sh run_standalone_eval_ascend.sh \
  /home/workspace/dataset_menu/newstest2014.en.json \
  /home/workspace/dataset_menu/newstest2014.en.tfrecord-001-of-001 \
  /home/workspace/gnmt_v2/gnmt-6_3452.ckpt \
  /home/workspace/wmt16_de_en/vocab.bpe.32000 \
  /home/workspace/wmt16_de_en/bpe.32000 \
  /home/workspace/wmt16_de_en/newstest2014.de"
 echo "It is better to use absolute path."
 echo "=============================================================================================================="
 DATASET_SCHEMA_TEST=$1
 TEST_DATASET=$2
 EXISTED_CKPT_PATH=$3
 VOCAB_ADDR=$4
 BPE_CODE_ADDR=$5
 TEST_TARGET=$6
 current_exec_path=$(pwd)
 echo ${current_exec_path}
 export DEVICE_NUM=1
 export DEVICE_ID=5
 export RANK_ID=0
 export RANK_SIZE=1
 export GLOG_v=2
 if [ -d "eval" ];
 then
@ -29,5 +55,12 @@ cp -r ../config ./eval
 cd ./eval || exit
 echo "start eval for device $DEVICE_ID"
 env > env.log
-python eval.py --config /home/workspace/gnmt_v2/config/config_test.json --vocab /home/workspace/wmt16_de_en/vocab.bpe.32000 --bpe_codes /home/workspace/wmt16_de_en/bpe.32000 --test_tgt /home/workspace/wmt16_de_en/newstest2014.de >log_infer.log 2>&1 &
+python eval.py \
  --config=${current_exec_path}/eval/config/config_test.json \
  --dataset_schema_test=$DATASET_SCHEMA_TEST \
  --test_dataset=$TEST_DATASET \
  --existed_ckpt=$EXISTED_CKPT_PATH \
  --vocab=$VOCAB_ADDR \
  --bpe_codes=$BPE_CODE_ADDR \
  --test_tgt=$TEST_TARGET >log_infer.log 2>&1 &
 cd ..
--- a/model_zoo/official/nlp/gnmt_v2/scripts/run_standalone_train_ascend.sh
+++ b/model_zoo/official/nlp/gnmt_v2/scripts/run_standalone_train_ascend.sh
@ -13,11 +13,26 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # ============================================================================
 echo "=============================================================================================================="
 echo "Please run the scipt as: "
 echo "sh run_standalone_train_ascend.sh DATASET_SCHEMA_TRAIN PRE_TRAIN_DATASET"
 echo "for example:"
 echo "sh run_standalone_train_ascend.sh \
  /home/workspace/dataset_menu/train.tok.clean.bpe.32000.en.json \
  /home/workspace/dataset_menu/train.tok.clean.bpe.32000.en.tfrecord-001-of-001"
 echo "It is better to use absolute path."
 echo "=============================================================================================================="
 DATASET_SCHEMA_TRAIN=$1
 PRE_TRAIN_DATASET=$2
 export DEVICE_NUM=1
 export DEVICE_ID=4
 export RANK_ID=0
 export RANK_SIZE=1
-
+export GLOG_v=2
 current_exec_path=$(pwd)
 echo ${current_exec_path}
 if [ -d "train" ];
 then
    rm -rf ./train
@ -29,5 +44,8 @@ cp -r ../config ./train
 cd ./train || exit
 echo "start training for device $DEVICE_ID"
 env > env.log
-python train.py --config /home/workspace/gnmt_v2/config/config.json > log_gnmt_network.log 2>&1 &
+python train.py \
 	--config=${current_exec_path}/train/config/config.json \
 	--dataset_schema_train=$DATASET_SCHEMA_TRAIN \
 	--pre_train_dataset=$PRE_TRAIN_DATASET > log_gnmt_network${i}.log 2>&1 &
 cd ..
--- a/model_zoo/official/nlp/gnmt_v2/src/dataset/bi_data_loader.py
+++ b/model_zoo/official/nlp/gnmt_v2/src/dataset/bi_data_loader.py
@ -103,7 +103,6 @@ class BiLingualDataLoader(DataLoader):
                    src_padding = np.zeros(shape=self.source_max_sen_len, dtype=np.int64)
                    for i in range(src_len):
                        src_padding[i] = 1
                    src_length = np.array([src_len], dtype=np.int64)
                    # decoder inputs
                    decoder_input = self.padding(tgt_tokens[:-1], self.tokenizer.padding_index, self.target_max_sen_len)
                    # decoder outputs
@ -119,7 +118,6 @@ class BiLingualDataLoader(DataLoader):
                    example = {
                        "src": encoder_input,
                        "src_padding": src_padding,
                        "src_length": src_length,
                        "prev_opt": decoder_input,
                        "target": decoder_output,
                        "tgt_padding": tgt_padding
@ -133,9 +131,9 @@ class BiLingualDataLoader(DataLoader):
                print(f" | Total  sen = {count}.")
                if self.schema_address is not None:
-                    provlist = [count, self.source_max_sen_len, self.source_max_sen_len, 1,
+                    provlist = [count, self.source_max_sen_len, self.source_max_sen_len,
                                self.target_max_sen_len, self.target_max_sen_len, self.target_max_sen_len]
-                    columns = ["src", "src_padding", "src_length", "prev_opt", "target", "tgt_padding"]
+                    columns = ["src", "src_padding", "prev_opt", "target", "tgt_padding"]
                    with open(self.schema_address, "w", encoding="utf-8") as  f:
                        f.write("{\n")
                        f.write('  "datasetType":"TF",\n')
@ -196,12 +194,10 @@ class TextDataLoader(DataLoader):
                src_padding = np.zeros(shape=self.source_max_sen_len, dtype=np.int64)
                for i in range(src_len):
                    src_padding[i] = 1
                src_length = np.array([src_len], dtype=np.int64)
                example = {
                    "src": encoder_input,
-                    "src_padding": src_padding,
+                    "src_padding": src_padding
                    "src_length": src_length
                }
                self._add_example(example)
                count += 1
@ -211,8 +207,8 @@ class TextDataLoader(DataLoader):
            print(f" | Total  sen = {count}.")
            if self.schema_address is not None:
-                provlist = [count, self.source_max_sen_len, self.source_max_sen_len, 1]
+                provlist = [count, self.source_max_sen_len, self.source_max_sen_len]
-                columns = ["src", "src_padding", "src_length"]
+                columns = ["src", "src_padding"]
                with open(self.schema_address, "w", encoding="utf-8") as  f:
                    f.write("{\n")
                    f.write('  "datasetType":"TF",\n')
--- a/model_zoo/official/nlp/gnmt_v2/src/dataset/load_dataset.py
+++ b/model_zoo/official/nlp/gnmt_v2/src/dataset/load_dataset.py
@ -60,20 +60,20 @@ def _load_dataset(input_files, schema_file, batch_size, epoch_count=1,
        ds = de.TFRecordDataset(
            input_files, schema_file,
            columns_list=[
-                "src", "src_padding", "src_length",
+                "src", "src_padding",
                "prev_opt",
                "target", "tgt_padding"
            ],
-            shuffle=shuffle, num_shards=rank_size, shard_id=rank_id,
+            shuffle=False, num_shards=rank_size, shard_id=rank_id,
            shard_equal_rows=True, num_parallel_workers=8)
        ori_dataset_size = ds.get_dataset_size()
        print(f" | Dataset size: {ori_dataset_size}.")
-
+        if shuffle:
            ds = ds.shuffle(buffer_size=ori_dataset_size // 20)
        type_cast_op = deC.TypeCast(mstype.int32)
        ds = ds.map(input_columns="src", operations=type_cast_op, num_parallel_workers=8)
        ds = ds.map(input_columns="src_padding", operations=type_cast_op, num_parallel_workers=8)
        ds = ds.map(input_columns="src_length", operations=type_cast_op, num_parallel_workers=8)
        ds = ds.map(input_columns="prev_opt", operations=type_cast_op, num_parallel_workers=8)
        ds = ds.map(input_columns="target", operations=type_cast_op, num_parallel_workers=8)
        ds = ds.map(input_columns="tgt_padding", operations=type_cast_op, num_parallel_workers=8)
@ -81,13 +81,11 @@ def _load_dataset(input_files, schema_file, batch_size, epoch_count=1,
        ds = ds.rename(
            input_columns=["src",
                           "src_padding",
                           "src_length",
                           "prev_opt",
                           "target",
                           "tgt_padding"],
            output_columns=["source_eos_ids",
                            "source_eos_mask",
                            "source_eos_length",
                            "target_sos_ids",
                            "target_eos_ids",
                            "target_eos_mask"]
@ -97,26 +95,24 @@ def _load_dataset(input_files, schema_file, batch_size, epoch_count=1,
        ds = de.TFRecordDataset(
            input_files, schema_file,
            columns_list=[
-                "src", "src_padding", "src_length"
+                "src", "src_padding"
            ],
-            shuffle=shuffle, num_shards=rank_size, shard_id=rank_id,
+            shuffle=False, num_shards=rank_size, shard_id=rank_id,
            shard_equal_rows=True, num_parallel_workers=8)
        ori_dataset_size = ds.get_dataset_size()
        print(f" | Dataset size: {ori_dataset_size}.")
-
+        if shuffle:
            ds = ds.shuffle(buffer_size=ori_dataset_size // 20)
        type_cast_op = deC.TypeCast(mstype.int32)
        ds = ds.map(input_columns="src", operations=type_cast_op, num_parallel_workers=8)
        ds = ds.map(input_columns="src_padding", operations=type_cast_op, num_parallel_workers=8)
        ds = ds.map(input_columns="src_length", operations=type_cast_op, num_parallel_workers=8)
        ds = ds.rename(
            input_columns=["src",
-                           "src_padding",
+                           "src_padding"],
                           "src_length"],
            output_columns=["source_eos_ids",
-                            "source_eos_mask",
+                            "source_eos_mask"]
                            "source_eos_length"]
        )
        ds = ds.batch(batch_size, drop_remainder=drop_remainder)
--- a/model_zoo/official/nlp/gnmt_v2/src/dataset/schema.py
+++ b/model_zoo/official/nlp/gnmt_v2/src/dataset/schema.py
@ -17,7 +17,6 @@
 SCHEMA = {
    "src": {"type": "int64", "shape": [-1]},
    "src_padding": {"type": "int64", "shape": [-1]},
    "src_length": {"type": "int64", "shape": [-1]},
    "prev_opt": {"type": "int64", "shape": [-1]},
    "target": {"type": "int64", "shape": [-1]},
    "tgt_padding": {"type": "int64", "shape": [-1]},
--- a/model_zoo/official/nlp/gnmt_v2/src/gnmt_model/encoder.py
+++ b/model_zoo/official/nlp/gnmt_v2/src/gnmt_model/encoder.py
@ -71,7 +71,7 @@ class GNMTEncoder(nn.Cell):
        self.reverse_v2 = P.ReverseV2(axis=[0])
        self.dropout = nn.Dropout(keep_prob=1.0 - config.hidden_dropout_prob)
-    def construct(self, inputs, source_len, attention_mask=None):
+    def construct(self, inputs):
        """Encoder."""
        inputs = self.dropout(inputs)
        # bidirectional layer, fwd_encoder_outputs: [T,N,D]
--- a/model_zoo/official/nlp/gnmt_v2/src/gnmt_model/gnmt.py
+++ b/model_zoo/official/nlp/gnmt_v2/src/gnmt_model/gnmt.py
@ -109,8 +109,7 @@ class GNMT(nn.Cell):
            self.beam_decoder.add_flags(loop_can_unroll=True)
        self.shape = P.Shape()
-    def construct(self, source_ids, source_mask=None, source_len=None,
+    def construct(self, source_ids, source_mask=None, target_ids=None):
                  target_ids=None):
        """
        Construct network.
@ -121,8 +120,6 @@ class GNMT(nn.Cell):
            source_mask (Tensor): Source sentences padding mask with shape (N, T),
                where 0 indicates padding position.
            target_ids (Tensor): Target sentences with shape (N, T').
            target_mask (Tensor): Target sentences padding mask with shape (N, T'),
                where 0 indicates padding position.
        Returns:
            Tuple[Tensor], network outputs.
@ -133,7 +130,7 @@ class GNMT(nn.Cell):
        # T, N, D
        inputs = self.transpose(src_embeddings, self.transpose_orders)
        # encoder. encoder_outputs: [T, N, D]
-        encoder_outputs = self.gnmt_encoder(inputs, source_len=source_len)
+        encoder_outputs = self.gnmt_encoder(inputs)
        # decoder.
        if self.is_training:
--- a/model_zoo/official/nlp/gnmt_v2/src/gnmt_model/gnmt_for_infer.py
+++ b/model_zoo/official/nlp/gnmt_v2/src/gnmt_model/gnmt_for_infer.py
@ -66,13 +66,11 @@ class GNMTInferCell(nn.Cell):
    def construct(self,
                  source_ids,
-                  source_mask,
+                  source_mask):
                  source_len):
        """Defines the computation performed."""
        predicted_ids = self.network(source_ids,
-                                     source_mask,
+                                     source_mask)
                                     source_len)
        return predicted_ids
@ -143,21 +141,18 @@ def gnmt_infer(config, dataset):
                                    [config.batch_size, 1]), mstype.int32)
    source_mask_pad = Tensor(np.tile(np.array([[1, 1] + [0] * (config.seq_length - 2)]),
                                     [config.batch_size, 1]), mstype.int32)
    source_len_pad = Tensor(np.tile(np.array([[2]]), [config.batch_size, 1]), mstype.int32)
    for batch in dataset.create_dict_iterator():
        source_sentences.append(batch["source_eos_ids"].asnumpy())
        source_ids = Tensor(batch["source_eos_ids"], mstype.int32)
        source_mask = Tensor(batch["source_eos_mask"], mstype.int32)
        source_len = Tensor(batch["source_eos_length"], mstype.int32)
        active_num = shape(source_ids)[0]
        if active_num < config.batch_size:
            source_ids = concat((source_ids, source_ids_pad[active_num:, :]))
            source_mask = concat((source_mask, source_mask_pad[active_num:, :]))
            source_len = concat((source_len, source_len_pad[active_num:, :]))
        start_time = time.time()
-        predicted_ids = model.predict(source_ids, source_mask, source_len)
+        predicted_ids = model.predict(source_ids, source_mask)
        print(f" | BatchIndex = {batch_index}, Batch size: {config.batch_size}, active_num={active_num}, "
              f"Time cost: {time.time() - start_time}.")
--- a/model_zoo/official/nlp/gnmt_v2/src/gnmt_model/gnmt_for_train.py
+++ b/model_zoo/official/nlp/gnmt_v2/src/gnmt_model/gnmt_for_train.py
@ -81,21 +81,19 @@ class GNMTTraining(nn.Cell):
        self.gnmt = GNMT(config, is_training, use_one_hot_embeddings)
        self.projection = PredLogProbs(config)
-    def construct(self, source_ids, source_mask, source_len, target_ids):
+    def construct(self, source_ids, source_mask, target_ids):
        """
        Construct network.
        Args:
            source_ids (Tensor): Source sentence.
            source_mask (Tensor): Source padding mask.
            source_len (Tensor): Effective length of source sentence.
            target_ids (Tensor): Target sentence.
        Returns:
            Tensor, prediction_scores.
        """
-        decoder_outputs = self.gnmt(source_ids, source_mask, source_len, target_ids)
+        decoder_outputs = self.gnmt(source_ids, source_mask, target_ids)
        prediction_scores = self.projection(decoder_outputs)
        return prediction_scores
@ -175,11 +173,10 @@ class GNMTNetworkWithLoss(nn.Cell):
    def construct(self,
                  source_ids,
                  source_mask,
                  source_len,
                  target_ids,
                  label_ids,
                  label_weights):
-        prediction_scores = self.gnmt(source_ids, source_mask, source_len, target_ids)
+        prediction_scores = self.gnmt(source_ids, source_mask, target_ids)
        total_loss = self.loss(prediction_scores, label_ids, label_weights)
        return self.cast(total_loss, mstype.float32)
@ -265,7 +262,6 @@ class GNMTTrainOneStepWithLossScaleCell(nn.Cell):
    def construct(self,
                  source_eos_ids,
                  source_eos_mask,
                  source_eos_length,
                  target_sos_ids,
                  target_eos_ids,
                  target_eos_mask,
@ -276,7 +272,6 @@ class GNMTTrainOneStepWithLossScaleCell(nn.Cell):
        Args:
            source_eos_ids (Tensor): Source sentence.
            source_eos_mask (Tensor): Source padding mask.
            source_eos_length (Tensor): Effective length of source sentence.
            target_sos_ids (Tensor): Target sentence.
            target_eos_ids (Tensor): Prediction sentence.
            target_eos_mask (Tensor): Prediction padding mask.
@ -287,7 +282,6 @@ class GNMTTrainOneStepWithLossScaleCell(nn.Cell):
        """
        source_ids = source_eos_ids
        source_mask = source_eos_mask
        source_len = source_eos_length
        target_ids = target_sos_ids
        label_ids = target_eos_ids
        label_weights = target_eos_mask
@ -295,7 +289,6 @@ class GNMTTrainOneStepWithLossScaleCell(nn.Cell):
        weights = self.weights
        loss = self.network(source_ids,
                            source_mask,
                            source_len,
                            target_ids,
                            label_ids,
                            label_weights)
@ -309,7 +302,6 @@ class GNMTTrainOneStepWithLossScaleCell(nn.Cell):
            scaling_sens = sens
        grads = self.grad(self.network, weights)(source_ids,
                                                 source_mask,
                                                 source_len,
                                                 target_ids,
                                                 label_ids,
                                                 label_weights,
--- a/model_zoo/official/nlp/gnmt_v2/train.py
+++ b/model_zoo/official/nlp/gnmt_v2/train.py
@ -40,6 +40,8 @@ from src.utils.optimizer import Adam
 parser = argparse.ArgumentParser(description='GNMT train entry point.')
 parser.add_argument("--config", type=str, required=True, help="model config json file path.")
 parser.add_argument("--dataset_schema_train", type=str, required=True, help="dataset schema for train.")
 parser.add_argument("--pre_train_dataset", type=str, required=True, help="pre-train dataset address.")
 device_id = os.getenv('DEVICE_ID', None)
 if device_id is None:
@ -351,9 +353,9 @@ if __name__ == '__main__':
    args, _ = parser.parse_known_args()
    _check_args(args.config)
    _config = get_config(args.config)
-
+    _config.dataset_schema = args.dataset_schema_train
    _config.pre_train_dataset = args.pre_train_dataset
    set_seed(_config.random_seed)
    if _rank_size is not None and int(_rank_size) > 1:
        train_parallel(_config)
    else:
--- a/tests/st/model_zoo_tests/gnmt_v2/test_gnmt_v2.py
+++ b/tests/st/model_zoo_tests/gnmt_v2/test_gnmt_v2.py
@ -0,0 +1,109 @@
 # Copyright 2020 Huawei Technologies Co., Ltd
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 # http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # ============================================================================
 """Train and eval api."""
 import os
 import argparse
 import pickle
 import datetime
 import mindspore.common.dtype as mstype
 from mindspore.common import set_seed
 from config import GNMTConfig
 from train import train_parallel
 from src.gnmt_model import infer
 from src.gnmt_model.bleu_calculate import bleu_calculate
 from src.dataset.tokenizer import Tokenizer
 parser = argparse.ArgumentParser(description='GNMT train and eval.')
 # train
 parser.add_argument("--config_train", type=str, required=True,
                    help="model config json file path.")
 parser.add_argument("--dataset_schema_train", type=str, required=True,
                    help="dataset schema for train.")
 parser.add_argument("--pre_train_dataset", type=str, required=True,
                    help="pre-train dataset address.")
 # eval
 parser.add_argument("--config_test", type=str, required=True,
                    help="model config json file path.")
 parser.add_argument("--dataset_schema_test", type=str, required=True,
                    help="dataset schema for evaluation.")
 parser.add_argument("--test_dataset", type=str, required=True,
                    help="test dataset address.")
 parser.add_argument("--existed_ckpt", type=str, required=True,
                    help="existed checkpoint address.")
 parser.add_argument("--vocab", type=str, required=True,
                    help="Vocabulary to use.")
 parser.add_argument("--bpe_codes", type=str, required=True,
                    help="bpe codes to use.")
 parser.add_argument("--test_tgt", type=str, required=True,
                    default=None,
                    help="data file of the test target")
 parser.add_argument("--output", type=str, required=False,
                    default="./output.npz",
                    help="result file path.")
 def get_config(config):
    config = GNMTConfig.from_json_file(config)
    config.compute_type = mstype.float16
    config.dtype = mstype.float32
    return config
 def _check_args(config):
    if not os.path.exists(config):
        raise FileNotFoundError("`config` is not existed.")
    if not isinstance(config, str):
        raise ValueError("`config` must be type of str.")
 if __name__ == '__main__':
    start_time = datetime.datetime.now()
    _rank_size = os.getenv('RANK_SIZE')
    args, _ = parser.parse_known_args()
    # train
    _check_args(args.config_train)
    _config_train = get_config(args.config_train)
    _config_train.dataset_schema = args.dataset_schema_train
    _config_train.pre_train_dataset = args.pre_train_dataset
    set_seed(_config_train.random_seed)
    assert _rank_size is not None and int(_rank_size) > 1
    if _rank_size is not None and int(_rank_size) > 1:
        train_parallel(_config_train)
    # eval
    _check_args(args.config_test)
    _config_test = get_config(args.config_test)
    _config_test.dataset_schema = args.dataset_schema_test
    _config_test.test_dataset = args.test_dataset
    _config_test.existed_ckpt = args.existed_ckpt
    result = infer(_config_test)
    with open(args.output, "wb") as f:
        pickle.dump(result, f, 1)
    result_npy_addr = args.output
    vocab = args.vocab
    bpe_codes = args.bpe_codes
    test_tgt = args.test_tgt
    tokenizer = Tokenizer(vocab, bpe_codes, 'en', 'de')
    scores = bleu_calculate(tokenizer, result_npy_addr, test_tgt)
    print(f"BLEU scores is :{scores}")
    end_time = datetime.datetime.now()
    cost_time = (end_time - start_time).seconds
    print(f"Cost time is {cost_time}s.")
    assert scores >= 23.8
    assert cost_time < 10800.0
    print("----done!----")
--- a/tests/st/model_zoo_tests/gnmt_v2/test_gnmt_v2.sh
+++ b/tests/st/model_zoo_tests/gnmt_v2/test_gnmt_v2.sh
@ -0,0 +1,86 @@
 #!/bin/bash
 # Copyright 2020 Huawei Technologies Co., Ltd
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 # http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # ============================================================================
 echo "=============================================================================================================="
 echo "Please run the scipt as: "
 echo "sh run_distributed_train_ascend.sh \
  GNMT_ADDR RANK_TABLE_ADDR \
  DATASET_SCHEMA_TRAIN PRE_TRAIN_DATASET \
  DATASET_SCHEMA_TEST TEST_DATASET EXISTED_CKPT_PATH \
  VOCAB_ADDR BPE_CODE_ADDR TEST_TARGET"
 echo "for example:"
 echo "sh run_distributed_train_ascend.sh \
  /home/workspace/gnmt_v2 \
  /home/workspace/rank_table_8p.json \
  /home/workspace/dataset_menu/train.tok.clean.bpe.32000.en.json \
  /home/workspace/dataset_menu/train.tok.clean.bpe.32000.en.tfrecord-001-of-001 \
  /home/workspace/dataset_menu/newstest2014.en.json \
  /home/workspace/dataset_menu/newstest2014.en.tfrecord-001-of-001 \
  /home/workspace/gnmt_v2/gnmt-6_3452.ckpt \
  /home/workspace/wmt16_de_en/vocab.bpe.32000 \
  /home/workspace/wmt16_de_en/bpe.32000 \
  /home/workspace/wmt16_de_en/newstest2014.de"
 echo "It is better to use absolute path."
 echo "=============================================================================================================="
 GNMT_ADDR=$1
 RANK_TABLE_ADDR=$2
 # train dataset addr
 DATASET_SCHEMA_TRAIN=$3
 PRE_TRAIN_DATASET=$4
 # eval dataset addr
 DATASET_SCHEMA_TEST=$5
 TEST_DATASET=$6
 EXISTED_CKPT_PATH=$7
 VOCAB_ADDR=$8
 BPE_CODE_ADDR=$9
 TEST_TARGET=${10}
 current_exec_path=$(pwd)
 echo ${current_exec_path}
 export RANK_TABLE_FILE=$RANK_TABLE_ADDR
 export MINDSPORE_HCCL_CONFIG_PATH=$RANK_TABLE_ADDR
 echo $RANK_TABLE_FILE
 export RANK_SIZE=8
 export GLOG_v=2
 for((i=0;i<=7;i++));
 do
    rm -rf ${current_exec_path}/device$i
    mkdir ${current_exec_path}/device$i
    cd ${current_exec_path}/device$i || exit
    cp ${current_exec_path}/*.py .
    cp ${GNMT_ADDR}/*.py .
    cp -r ${GNMT_ADDR}/src .
    cp -r ${GNMT_ADDR}/config .
    export RANK_ID=$i
    export DEVICE_ID=$i
 	python test_gnmt_v2.py \
 	  --config_train=${GNMT_ADDR}/config/config.json \
 	  --dataset_schema_train=$DATASET_SCHEMA_TRAIN \
 	  --pre_train_dataset=$PRE_TRAIN_DATASET \
 	  --config_test=${GNMT_ADDR}/config/config_test.json \
 	  --dataset_schema_test=$DATASET_SCHEMA_TEST \
 	  --test_dataset=$TEST_DATASET \
 	  --existed_ckpt=$EXISTED_CKPT_PATH \
 	  --vocab=$VOCAB_ADDR \
 	  --bpe_codes=$BPE_CODE_ADDR \
 	  --test_tgt=$TEST_TARGET > log_gnmt_network${i}.log 2>&1 &
    cd ${current_exec_path} || exit
 done
 cd ${current_exec_path} || exit