For distributed training, an HCCL configuration file in JSON format needs to be created in advance.
Please follow the instructions in the link below:
https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools.
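If you want to sanity-check the generated file before launching training, a few lines of plain Python are enough. This is a minimal sketch under one assumption: the file name `hccl_8p.json` is a placeholder for whatever the hccl_tools script actually produces on your machine.

```python
import json

# Minimal sketch: verify the generated HCCL configuration parses as JSON.
# "hccl_8p.json" is a placeholder; use the file produced by hccl_tools.
with open("hccl_8p.json", "r") as f:
    rank_table = json.load(f)

# Print the top-level keys so the rank-table layout can be eyeballed.
print(sorted(rank_table.keys()))
```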
For the dataset, if you want to set the format and parameters, a schema configuration file in JSON format needs to be created; please refer to the [tfrecord](https://www.mindspore.cn/tutorial/zh-CN/master/use/data_preparation/loading_the_datasets.html#tfrecord) format.
```
For pretraining, the schema file contains ["input_ids", "input_mask", "segment_ids", "next_sentence_labels", "masked_lm_positions", "masked_lm_ids", "masked_lm_weights"].
For the NER or classification task, the schema file contains ["input_ids", "input_mask", "segment_ids", "label_ids"].
For the SQuAD task, the schema file for training contains ["start_positions", "end_positions", "input_ids", "input_mask", "segment_ids"], and for evaluation contains ["input_ids", "input_mask", "segment_ids"].
`numRows` is the only option that can be set by the user; the other values must be set according to the dataset.
For example, for the cn-wiki-128 dataset, the schema file for pretraining is as follows:
{
"datasetType": "TF",
"numRows": 7680,
"columns": {
"input_ids": {
"type": "int64",
"rank": 1,
"shape": [256]
},
"input_mask": {
"type": "int64",
"rank": 1,
"shape": [256]
},
"segment_ids": {
"type": "int64",
"rank": 1,
"shape": [256]
},
"next_sentence_labels": {
"type": "int64",
"rank": 1,
"shape": [1]
},
"masked_lm_positions": {
"type": "int64",
"rank": 1,
"shape": [32]
},
"masked_lm_ids": {
"type": "int64",
"rank": 1,
"shape": [32]
},
"masked_lm_weights": {
"type": "float32",
"rank": 1,
"shape": [32]
}
}
}
```
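To check that the schema and the data files line up, the TFRecord files can be loaded through MindSpore's `mindspore.dataset` API. The sketch below is illustrative only; the file names `cn-wiki-128.tfrecord` and `schema.json` are placeholders for your actual paths.

```python
import mindspore.dataset as ds

# Minimal sketch: load pretraining TFRecord data against the JSON schema.
# Both file paths below are placeholders; point them at your real files.
dataset = ds.TFRecordDataset(
    dataset_files=["cn-wiki-128.tfrecord"],
    schema="schema.json",
)
dataset = dataset.batch(32, drop_remainder=True)

# Pull one batch to confirm the declared columns and shapes.
for item in dataset.create_dict_iterator():
    print(item["input_ids"].shape)  # expected (32, 256) for this schema
    break
```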
# [Script Description](#contents)
## [Script and Sample Code](#contents)
├─hyper_parameter_config.ini # hyper parameter for distributed pretraining
├─run_distribute_pretrain.py # script for distributed pretraining
├─README.md
├─run_classifier.sh # shell script for standalone classifier task on ascend or gpu
├─run_ner.sh # shell script for standalone NER task on ascend or gpu
├─run_squad.sh # shell script for standalone SQuAD task on ascend or gpu
├─run_standalone_pretrain_ascend.sh # shell script for standalone pretrain on ascend
├─run_distributed_pretrain_ascend.sh # shell script for distributed pretrain on ascend
├─run_distributed_pretrain_gpu.sh # shell script for distributed pretrain on gpu
└─run_standaloned_pretrain_gpu.sh # shell script for standalone pretrain on gpu
├─src
├─__init__.py
## [Model Description](#contents)
## [Performance](#contents)
### Pretraining Performance
| Parameters | Ascend | GPU |
| -------------------------- | ---------------------------------------------------------- | ------------------------- |
| Model Version | BERT_base | BERT_base |
| Resource | Ascend 910, CPU 2.60GHz, 56 cores, memory 314G | NV SXM2 V100-32G |
| uploaded Date | 08/22/2020 | 05/06/2020 |
| MindSpore Version | 0.6.0 | 0.3.0 |
| Dataset | cn-wiki-128 (40M samples) | ImageNet |
| Training Parameters | src/config.py | src/config.py |
| Optimizer | Lamb | Momentum |
| Loss Function | SoftmaxCrossEntropy | SoftmaxCrossEntropy |
| outputs | probability | |
| Epoch | 40 | |
| Batch_size | 256*8 | 130(8P) |
| Loss | 1.7 | 1.913 |
| Speed | 340ms/step | |
| Total time | 73h | |
| Params (M) | 110 | |
| Checkpoint for Fine tuning | 1.2G(.ckpt file) | |
| Parameters | Ascend | GPU |
| -------------------------- | ---------------------------------------------------------- | ------------------------- |
| Model Version | BERT_NEZHA | BERT_NEZHA |
| Resource | Ascend 910, CPU 2.60GHz, 56 cores, memory 314G | NV SXM2 V100-32G |
| uploaded Date | 08/20/2020 | 05/06/2020 |
| MindSpore Version | 0.6.0 | 0.3.0 |
| Dataset | cn-wiki-128 (40M samples) | ImageNet |
| Training Parameters | src/config.py | src/config.py |
| Optimizer | Lamb | Momentum |
| Loss Function | SoftmaxCrossEntropy | SoftmaxCrossEntropy |
| outputs | probability | |
| Epoch | 40 | |
| Batch_size | 96*8 | 130(8P) |
| Loss | 1.7 | 1.913 |
| Speed | 360ms/step | |
| Total time | 200h | |
| Params (M) | 340 | |
| Checkpoint for Fine tuning | 3.2G(.ckpt file) | |
### Inference Performance
| Parameters | Ascend | GPU |
| -------------------------- | ----------------------------- | ------------------------- |
| Model Version | | |
| Resource | Ascend 910 | NV SXM2 V100-32G |
| uploaded Date | 08/22/2020 | 05/22/2020 |
| MindSpore Version | 0.6.0 | 0.2.0 |
| Dataset | cola, 12k samples | ImageNet, 12k samples |
| batch_size | 32(1P) | 130(8P) |
| Accuracy | 0.588986 | ACC1[72.07%] ACC5[90.90%] |
| Speed | 59.25ms/step | |
| Total time | 15min | |
| Model for inference | 1.2G(.ckpt file) | |
# [Description of Random Situation](#contents)