@ -30,8 +30,8 @@ The backbone structure of TinyBERT is transformer, the transformer contains four
- Download the GLUE dataset for task distillation. Convert the dataset files from JSON format to TFRecord format; please refer to run_classifier.py in the [BERT](https://github.com/google-research/bert) repository.
# [Environment Requirements](#contents)
- Hardware(Ascend)
- Prepare a hardware environment with an Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Hardware(Ascend/GPU)
- Prepare a hardware environment with an Ascend or GPU processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- For more information, please check the resources below:
@ -42,22 +42,26 @@ The backbone structure of TinyBERT is transformer, the transformer contains four
After installing MindSpore via the official website, you can start general distillation, task distillation and evaluation as follows:
```bash
# run standalone general distill example
bash scripts/run_standalone_gd_ascend.sh
bash scripts/run_standalone_gd.sh
Before running the shell script, please set the `load_teacher_ckpt_path`, `data_dir` and `schema_dir` in the run_standalone_gd_ascend.sh file first.
Before running the shell script, please set `load_teacher_ckpt_path`, `data_dir` and `schema_dir` in the run_standalone_gd.sh file first. If running on GPU, please set `device_target=GPU`.
# run distributed general distill example
# For Ascend device, run distributed general distill example
Before running the shell script, please set the `task_name`, `load_teacher_ckpt_path`, `load_gd_ckpt_path`, `train_data_dir`, `eval_data_dir` and `schema_dir` in the run_standalone_td_ascend.sh file first.
Before running the shell script, please set the `task_name`, `load_teacher_ckpt_path`, `load_gd_ckpt_path`, `train_data_dir`, `eval_data_dir` and `schema_dir` in the run_standalone_td.sh file first.
If running on GPU, please set `device_target=GPU`.
```
For distributed training, an HCCL configuration file in JSON format needs to be created in advance.
For distributed training on Ascend, an HCCL configuration file in JSON format needs to be created in advance.
--device_target device where the code will be implemented: "Ascend" | "GPU", default is "Ascend"
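As an illustration of the `--device_target` option described above, the following is a minimal sketch of how such a flag could be declared with Python's `argparse`. The function name `build_parser` is hypothetical, and this is not taken from the repository's actual argument parser; only the flag name, the two allowed values and the default come from the text above.

```python
import argparse


def build_parser():
    """Build a parser with a --device_target flag (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="device_target flag sketch")
    parser.add_argument(
        "--device_target",
        type=str,
        default="Ascend",            # default stated in the option description
        choices=["Ascend", "GPU"],   # the two values listed in the option description
        help='device where the code will be implemented: "Ascend" | "GPU"',
    )
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--device_target", "GPU"])
print(args.device_target)
```

With `choices` set, any value other than the two listed ones is rejected by `argparse` before the script body runs.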
@ -198,7 +202,7 @@ Parameters for bert network:
#### running on Ascend
Before running the command below, please check that `load_teacher_ckpt_path`, `data_dir` and `schema_dir` have been set. Please set each path to an absolute path, e.g. "/username/checkpoint_100_300.ckpt".
```
bash scripts/run_standalone_gd_ascend.sh
bash scripts/run_standalone_gd.sh
```
The command above runs in the background; you can view the results in the file log.txt. After training, checkpoint files are saved under the script folder by default. The loss values will be reported as follows:
Before running the command below, please check that `load_teacher_ckpt_path`, `data_dir`, `schema_dir` and `device_target=GPU` have been set. Please set each path to an absolute path, e.g. "/username/checkpoint_100_300.ckpt".
```
bash scripts/run_standalone_gd.sh
```
The command above runs in the background; you can view the results in the file log.txt. After training, checkpoint files are saved under the script folder by default. The loss values will be reported as follows:
```
# grep "epoch" log.txt
epoch: 1, step: 100, outpus are 28.2093
...
```
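If the loss values are needed for plotting or comparison rather than just viewing, the log lines can be parsed programmatically. The sketch below assumes only the line format shown in the sample output above; the function name `parse_losses` is hypothetical, and accepting the spelling "outputs" in addition to "outpus" is an assumption in case the log wording differs between versions.

```python
import re

# Matches lines such as "epoch: 1, step: 100, outpus are 28.2093".
# "outpus" is the spelling in the sample log; "outputs" is also accepted
# as an assumption about other versions.
LOSS_LINE = re.compile(
    r"epoch:\s*(\d+),\s*step:\s*(\d+),\s*(?:outpus|outputs) are\s*(-?\d+(?:\.\d+)?)"
)


def parse_losses(text):
    """Return a list of (epoch, step, loss) tuples found in the log text."""
    records = []
    for match in LOSS_LINE.finditer(text):
        epoch, step, loss = match.groups()
        records.append((int(epoch), int(step), float(loss)))
    return records


sample = "epoch: 1, step: 100, outpus are 28.2093\n"
print(parse_losses(sample))
```

To use it on a real run, read the contents of log.txt into a string and pass it to `parse_losses`.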
### Distributed Training
#### running on Ascend
Before running the command below, please check that `load_teacher_ckpt_path`, `data_dir` and `schema_dir` have been set. Please set each path to an absolute path, e.g. "/username/checkpoint_100_300.ckpt".
The command above runs in the background; you can view the results in the file log.txt. After training, checkpoint files are saved under the LOG* folder by default. The loss values will be reported as follows:
```
# grep "epoch" LOG*/log.txt
epoch: 1, step: 1, outpus are 63.4098
...
```
## [Evaluation Process](#contents)
### Evaluation
If you want to train and then continue to evaluation, please set `do_train=true` and `do_eval=true`. If you want to run evaluation alone, please set `do_train=false` and `do_eval=true`.
#### evaluation on SST-2 dataset when running on Ascend
If you want to train and then continue to evaluation, please set `do_train=true` and `do_eval=true`. If you want to run evaluation alone, please set `do_train=false` and `do_eval=true`. If running on GPU, please set `device_target=GPU`.
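The two flag combinations described above can be summarized in a short sketch. This only illustrates the documented behavior and is not the repository's actual code: the helper names `str_to_bool` and `planned_phases` are hypothetical, and treating the flag values as the strings "true"/"false" is an assumption based on how they are written in this section.

```python
def str_to_bool(value):
    """Interpret the strings "true"/"false" (case-insensitive) as booleans."""
    lowered = value.strip().lower()
    if lowered not in ("true", "false"):
        raise ValueError("expected 'true' or 'false', got %r" % value)
    return lowered == "true"


def planned_phases(do_train, do_eval):
    """Return the phases that would run for the given flag values, in order."""
    phases = []
    if str_to_bool(do_train):
        phases.append("train")
    if str_to_bool(do_eval):
        phases.append("eval")
    return phases


# Train and then evaluate.
print(planned_phases("true", "true"))
# Evaluate alone.
print(planned_phases("false", "true"))
```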
#### evaluation on SST-2 dataset
```
bash scripts/run_standalone_td_ascend.sh
bash scripts/run_standalone_td.sh
```
The command above runs in the background; you can view the results in the file log.txt. The accuracy on the test dataset will be reported as follows:
```bash
@ -240,10 +268,10 @@ The best acc is 0.899305
The best acc is 0.902777
...
```
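Since the log prints a "The best acc is" line repeatedly, the last one is the value of interest after a run finishes. A minimal sketch for extracting it, assuming only the line format shown in the sample output above (the function name `last_best_acc` is hypothetical):

```python
import re

# Matches lines such as "The best acc is 0.902777".
BEST_ACC = re.compile(r"The best acc is\s*(\d+(?:\.\d+)?)")


def last_best_acc(text):
    """Return the last reported best accuracy in the log text, or None."""
    matches = BEST_ACC.findall(text)
    return float(matches[-1]) if matches else None


sample = "The best acc is 0.899305\nThe best acc is 0.902777\n"
print(last_best_acc(sample))
```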
#### evaluation on MNLI dataset when running on Ascend
#### evaluation on MNLI dataset
Before running the command below, please check that the path of the pretrained checkpoint to load has been set. Please set the checkpoint path to an absolute path, e.g. "/username/pretrain/checkpoint_100_300.ckpt".
```
bash scripts/run_standalone_td_ascend.sh
bash scripts/run_standalone_td.sh
```
The command above runs in the background; you can view the results in the file log.txt. The accuracy on the test dataset will be reported as follows:
```
@ -255,10 +283,10 @@ The best acc is 0.810355
The best acc is 0.813929
...
```
#### evaluation on QNLI dataset when running on Ascend
#### evaluation on QNLI dataset
Before running the command below, please check that the path of the pretrained checkpoint to load has been set. Please set the checkpoint path to an absolute path, e.g. "/username/pretrain/checkpoint_100_300.ckpt".
```
bash scripts/run_standalone_td_ascend.sh
bash scripts/run_standalone_td.sh
```
The command above runs in the background; you can view the results in the file log.txt. The accuracy on the test dataset will be reported as follows: