!12184 Fix some spelling mistakes in model_zoo training script

From: @zlq2020
Reviewed-by: @guoqi1024
Signed-off-by: @guoqi1024
pull/12184/MERGE
mindspore-ci-bot 4 years ago committed by Gitee
commit adfe6e1bc2

@ -20,7 +20,6 @@
# [MobileNetV2 Description](#contents)
MobileNetV2 is tuned to mobile phone CPUs through a combination of hardware- aware network architecture search (NAS) complemented by the NetAdapt algorithm and then subsequently improved through novel architecture advances.Nov 20, 2019.
[Paper](https://arxiv.org/pdf/1905.02244) Howard, Andrew, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang et al. "Searching for MobileNetV2." In Proceedings of the IEEE International Conference on Computer Vision, pp. 1314-1324. 2019.
@ -43,7 +42,6 @@ Dataset used: [imagenet](http://www.image-net.org/)
- Data format: RGB images.
- Note: Data will be processed in src/dataset.py
# [Features](#contents)
## [Mixed Precision](#contents)
@ -61,7 +59,6 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Script description](#contents)
## [Script and sample code](#contents)
@ -84,7 +81,6 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
├── export.py # export checkpoint files into air/onnx
```
## [Script Parameters](#contents)
Parameters for both training and evaluation can be set in config.py
@ -113,13 +109,11 @@ Parameters for both training and evaluation can be set in config.py
### Usage
You can start training using python or shell scripts. The usage of shell scripts as follows:
- bash run_train.sh [Ascend] [RANK_TABLE_FILE] [DATASET_PATH] [PRETRAINED_CKPT_PATH]\(optional)
- bash run_train.sh [GPU] [DEVICE_ID_LIST] [DATASET_PATH] [PRETRAINED_CKPT_PATH]\(optional)
### Launch
``` bash
@ -133,7 +127,7 @@ You can start training using python or shell scripts. The usage of shell scripts
Training result will be stored in the example path. Checkpoints trained by `Ascend` will be stored at `./train/device$i/checkpoint` by default, and training log will be redirected to `./train/device$i/train.log`. Checkpoints trained by `GPU` will be stored in `./train/checkpointckpt_$i` by default, and training log will be redirected to `./train/train.log`.
`train.log` is as follows:
```
``` bash
epoch: [ 0/200], step:[ 624/ 625], loss:[5.258/5.258], time:[140412.236], lr:[0.100]
epoch time: 140522.500, per step time: 224.836, avg loss: 5.258
epoch: [ 1/200], step:[ 624/ 625], loss:[3.917/3.917], time:[138221.250], lr:[0.200]
@ -150,7 +144,7 @@ You can start training using python or shell scripts. The usage of shell scripts
### Launch
```
``` bash
# infer example
shell:
Ascend: sh run_infer_quant.sh Ascend ~/imagenet/val/ ~/train/mobilenet-60_1601.ckpt
@ -160,9 +154,9 @@ You can start training using python or shell scripts. The usage of shell scripts
### Result
Inference result will be stored in the example path, you can find result like the followings in `./val/infer.log`.
Inference result will be stored in the example path, you can find result like the following in `./val/infer.log`.
```
``` bash
result: {'acc': 0.71976314102564111}
```

@ -35,7 +35,7 @@ def parse_args():
>>> parse_args()
"""
parser = ArgumentParser(description="mindspore distributed training launch "
"helper utilty that will spawn up "
"helper utility that will spawn up "
"multiple distributed processes")
parser.add_argument("--nproc_per_node", type=int, default=1,
help="The number of processes to launch on each node, "

@ -43,13 +43,12 @@ Dataset used: [ImageNet2012](http://www.image-net.org/)
- NoteData will be processed in dataset.py
- Download the dataset, the directory structure is as follows:
```
```python
└─dataset
├─ilsvrc # train dataset
└─validation_preprocess # evaluate dataset
```
# [Features](#contents)
## [Mixed Precision](#contents)
@ -67,7 +66,6 @@ For FP16 operators, if the input data type is FP32, the backend of MindSpore wil
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Script description](#contents)
## [Script and sample code](#contents)
@ -124,18 +122,19 @@ Parameters for both training and evaluation can be set in config.py
### Usage
- Ascend: sh run_train.sh Ascend [RANK_TABLE_FILE] [DATASET_PATH] [PRETRAINED_CKPT_PATH]\(optional)
### Launch
```
```bash
# training example
Ascend: bash run_train.sh Ascend ~/hccl.json ~/imagenet/train/ ~/pretrained_ckeckpoint
```
### Result
Training result will be stored in the example path. Checkpoints will be stored at `./train/device$i/` by default, and training log will be redirected to `./train/device$i/train.log` like followings.
Training result will be stored in the example path. Checkpoints will be stored at `./train/device$i/` by default, and training log will be redirected to `./train/device$i/train.log` like following.
```
```bash
epoch: 1 step: 5004, loss is 4.8995576
epoch: 2 step: 5004, loss is 3.9235563
epoch: 3 step: 5004, loss is 3.833077
@ -153,7 +152,7 @@ You can start training using python or shell scripts. The usage of shell scripts
### Launch
```
```bash
# infer example
shell:
Ascend: sh run_infer.sh Ascend ~/imagenet/val/ ~/train/Resnet50-30_5004.ckpt
@ -163,9 +162,9 @@ You can start training using python or shell scripts. The usage of shell scripts
### Result
Inference result will be stored in the example path, you can find result like the followings in `./eval/infer.log`.
Inference result will be stored in the example path, you can find result like the following in `./eval/infer.log`.
```
```bash
result: {'acc': 0.76576314102564111}
```
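Note on the `train.log` format shown in this README hunk: lines of the form `epoch: 1 step: 5004, loss is 4.8995576` are regular enough to parse for plotting or quick checks. The following is a minimal sketch, not part of the repository; the function name `parse_loss_lines` and the regular expression are assumptions made for illustration.

```python
import re

# Matches lines such as "epoch: 1 step: 5004, loss is 4.8995576".
LOSS_LINE = re.compile(r"epoch:\s*(\d+)\s+step:\s*(\d+),\s*loss is\s*([0-9.eE+-]+)")


def parse_loss_lines(text):
    """Return (epoch, step, loss) tuples for every matching line in text."""
    records = []
    for line in text.splitlines():
        match = LOSS_LINE.search(line)
        if match:
            epoch, step, loss = match.groups()
            records.append((int(epoch), int(step), float(loss)))
    return records


# Sample lines copied from the README hunk above.
sample = """epoch: 1 step: 5004, loss is 4.8995576
epoch: 2 step: 5004, loss is 3.9235563
epoch: 3 step: 5004, loss is 3.833077"""
records = parse_loss_lines(sample)
```

Lines that do not match (for example the `epoch time: ...` lines in other logs) are skipped rather than raising an error.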

@ -34,7 +34,7 @@ def parse_args():
>>> parse_args()
"""
parser = ArgumentParser(description="mindspore distributed training launch "
"helper utilty that will spawn up "
"helper utility that will spawn up "
"multiple distributed processes")
parser.add_argument("--nproc_per_node", type=int, default=1,
help="The number of processes to launch on each node, "

@ -93,7 +93,7 @@ class DetectionEngine:
def _nms(self, dets, thresh):
"""Calculate NMS."""
# conver xywh -> xmin ymin xmax ymax
# convert xywh -> xmin ymin xmax ymax
x1 = dets[:, 0]
y1 = dets[:, 1]
x2 = x1 + dets[:, 2]
@ -185,7 +185,7 @@ class DetectionEngine:
x_top_left = x - w / 2.
y_top_left = y - h / 2.
# creat all False
# create all False
flag = np.random.random(cls_emb.shape) > sys.maxsize
for i in range(flag.shape[0]):
c = cls_argmax[i]
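Note on the two comments corrected in this hunk: the first describes converting boxes from `x, y, w, h` to corner form, and the second describes building an all-`False` mask by comparing random values against `sys.maxsize`. The sketch below illustrates both under stated assumptions: `xywh_to_corners` is a hypothetical helper, and the `y2 = y1 + h` line is assumed by symmetry with the `x2` line visible in the diff. It also shows `np.zeros(..., dtype=bool)` as a direct equivalent of the random-comparison idiom.

```python
import sys

import numpy as np


def xywh_to_corners(dets):
    """Convert rows of [x_min, y_min, w, h, ...] to [x_min, y_min, x_max, y_max]."""
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = x1 + dets[:, 2]
    y2 = y1 + dets[:, 3]  # assumed by symmetry; not shown in the hunk
    return np.stack([x1, y1, x2, y2], axis=1)


corners = xywh_to_corners(np.array([[10.0, 20.0, 30.0, 40.0, 0.9]]))

shape = (2, 3)
# Idiom from the hunk: values drawn from [0, 1) are never greater than sys.maxsize,
# so the comparison yields an all-False boolean array.
flag_random = np.random.random(shape) > sys.maxsize
# Equivalent result without drawing random numbers.
flag_zeros = np.zeros(shape, dtype=bool)
```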

@ -59,7 +59,7 @@ cp ../*.py ./eval$3
cp -r ../src ./eval$3
cd ./eval$3 || exit
env > env.log
echo "start infering for device $DEVICE_ID"
echo "start inferring for device $DEVICE_ID"
python eval.py \
--data_dir=$DATASET_PATH \
--pretrained=$CHECKPOINT_PATH \

@ -94,7 +94,7 @@ def get_interp_method(interp, sizes=()):
Neighbors method. (used by default).
4: Lanczos interpolation over 8x8 pixel neighborhood.
9: Cubic for enlarge, area for shrink, bilinear for others
10: Random select from interpolation method metioned above.
10: Random select from interpolation method mentioned above.
Note:
When shrinking an image, it will generally look best with AREA-based
interpolation, whereas, when enlarging an image, it will generally look best
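Note on the docstring in this hunk: codes 9 and 10 are meta-options that resolve to a concrete interpolation method. The sketch below shows one way that rule could be implemented; it is not the repository's code. It assumes the integer codes follow OpenCV's flags (0 nearest, 1 bilinear, 2 bicubic, 3 area, 4 Lanczos), that `sizes` is `(old_h, old_w, new_h, new_w)`, and that cubic is the fallback when no sizes are given. The name `choose_interp` is hypothetical.

```python
import random

# Assumed to match OpenCV's interpolation flags.
NEAREST, BILINEAR, BICUBIC, AREA, LANCZOS = 0, 1, 2, 3, 4


def choose_interp(interp, sizes=()):
    """Resolve the meta-codes 9 and 10 to a concrete interpolation code."""
    if interp == 9:
        if not sizes:
            return BICUBIC  # assumption: no size information, fall back to cubic
        old_h, old_w, new_h, new_w = sizes
        if new_h > old_h and new_w > old_w:
            return BICUBIC  # enlarging in both dimensions
        if new_h < old_h and new_w < old_w:
            return AREA  # shrinking in both dimensions
        return BILINEAR  # mixed or unchanged size
    if interp == 10:
        # Random choice among the concrete methods listed above.
        return random.randint(NEAREST, LANCZOS)
    if interp not in (NEAREST, BILINEAR, BICUBIC, AREA, LANCZOS):
        raise ValueError(f"unknown interpolation code: {interp}")
    return interp
```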

@ -58,7 +58,7 @@ class YoloBlock(nn.Cell):
Args:
in_channels: Integer. Input channel.
out_chls: Interger. Middle channel.
out_chls: Integer. Middle channel.
out_channels: Integer. Output channel.
Returns:
@ -108,7 +108,7 @@ class YOLOv3(nn.Cell):
Args:
backbone_shape: List. Darknet output channels shape.
backbone: Cell. Backbone Network.
out_channel: Interger. Output channel.
out_channel: Integer. Output channel.
Returns:
Tensor, output tensor.

@ -44,7 +44,7 @@ def has_valid_annotation(anno):
# if all boxes have close to zero area, there is no annotation
if _has_only_empty_bbox(anno):
return False
# keypoints task have a slight different critera for considering
# keypoints task have a slight different criteria for considering
# if an annotation is valid
if "keypoints" not in anno[0]:
return True

@ -121,7 +121,7 @@ def parse_args():
args.rank = get_rank()
args.group_size = get_group_size()
# select for master rank save ckpt or all rank save, compatiable for model parallel
# select for master rank save ckpt or all rank save, compatible for model parallel
args.rank_save_ckpt_flag = 0
if args.is_save_on_master:
if args.rank == 0:
@ -140,6 +140,14 @@ def conver_training_shape(args):
training_shape = [int(args.training_shape), int(args.training_shape)]
return training_shape
def build_quant_network(network):
quantizer = QuantizationAwareTraining(bn_fold=True,
per_channel=[True, False],
symmetric=[True, False],
one_conv_fold=False)
network = quantizer.quantize(network)
return network
def train():
"""Train function."""
@ -168,11 +176,7 @@ def train():
config = ConfigYOLOV3DarkNet53()
# convert fusion network to quantization aware network
if config.quantization_aware:
quantizer = QuantizationAwareTraining(bn_fold=True,
per_channel=[True, False],
symmetric=[True, False],
one_conv_fold=False)
network = quantizer.quantize(network)
network = build_quant_network(network)
network = YoloWithLossCell(network)
args.logger.info('finish get network')
@ -198,11 +202,8 @@ def train():
lr = get_lr(args)
opt = Momentum(params=get_param_groups(network),
learning_rate=Tensor(lr),
momentum=args.momentum,
weight_decay=args.weight_decay,
loss_scale=args.loss_scale)
opt = Momentum(params=get_param_groups(network), learning_rate=Tensor(lr), momentum=args.momentum,
weight_decay=args.weight_decay, loss_scale=args.loss_scale)
network = TrainingWrapper(network, opt)
network.set_train()
@ -213,9 +214,7 @@ def train():
ckpt_config = CheckpointConfig(save_checkpoint_steps=args.ckpt_interval,
keep_checkpoint_max=ckpt_max_num)
save_ckpt_path = os.path.join(args.outputs_dir, 'ckpt_' + str(args.rank) + '/')
ckpt_cb = ModelCheckpoint(config=ckpt_config,
directory=save_ckpt_path,
prefix='{}'.format(args.rank))
ckpt_cb = ModelCheckpoint(config=ckpt_config, directory=save_ckpt_path, prefix='{}'.format(args.rank))
cb_params = _InternalCallbackParam()
cb_params.train_network = network
cb_params.epoch_num = ckpt_max_num

@ -19,7 +19,6 @@
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [YOLOv3_ResNet18 Description](#contents)
YOLOv3 network based on ResNet-18, with support for training and evaluation.
@ -32,8 +31,8 @@ The overall network architecture of YOLOv3 is show below:
And we use ResNet18 as the backbone of YOLOv3_ResNet18. The architecture of ResNet18 has 4 stages. The ResNet architecture performs the initial convolution and max-pooling using 7×7 and 3×3 kernel sizes respectively. Afterward, every stage of the network has different Residual blocks (2, 2, 2, 2) containing two 3×3 conv layers. Finally, the network has an Average Pooling layer followed by a fully connected layer.
# [Dataset](#contents)
Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below.
Dataset used: [COCO2017](<http://images.cocodataset.org/>)
@ -48,6 +47,7 @@ Dataset used: [COCO2017](<http://images.cocodataset.org/>)
- Dataset
1. The directory structure is as follows:
```
.
├── annotations # annotation jsons
@ -55,7 +55,7 @@ Dataset used: [COCO2017](<http://images.cocodataset.org/>)
└── val2017 # infer dataset
```
2. Organize the dataset infomation into a TXT file, each row in the file is as follows:
2. Organize the dataset information into a TXT file, each row in the file is as follows:
```
train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
@ -63,7 +63,6 @@ Dataset used: [COCO2017](<http://images.cocodataset.org/>)
Each row is an image annotation which split by space, the first column is a relative path of image, the others are box and class infomations of the format [xmin,ymin,xmax,ymax,class]. `dataset.py` is the parsing script, we read image from an image path joined by the `image_dir`(dataset directory) and the relative path in `anno_path`(the TXT file path), `image_dir` and `anno_path` are external inputs.
# [Environment Requirements](#contents)
- HardwareAscend
@ -74,13 +73,11 @@ Dataset used: [COCO2017](<http://images.cocodataset.org/>)
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
After installing MindSpore via the official website, you can start training and evaluation on Ascend as follows:
- runing on Ascend
- running on Ascend
```shell script
#run standalone training example
@ -92,11 +89,12 @@ After installing MindSpore via the official website, you can start training and
#run evaluation example
sh run_eval.sh [DEVICE_ID] [CKPT_PATH] [MINDRECORD_DIR] [IMAGE_DIR] [ANNO_PATH]
```
# [Script Description](#contents)
## [Script and Sample Code](#contents)
```
```python
└── cv
├── README.md // descriptions about all the models
├── mindspore_hub_conf.md // config for mindspore hub
@ -117,9 +115,9 @@ After installing MindSpore via the official website, you can start training and
## [Script Parameters](#contents)
```
Major parameters in train.py and config.py as follows:
```python
device_num: Use device nums, default is 1.
lr: Learning rate, default is 0.001.
epoch_size: Epoch size, default is 50.
@ -133,24 +131,23 @@ After installing MindSpore via the official website, you can start training and
img_shape: Image height and width used as input to the model.
```
## [Training Process](#contents)
### Training on Ascend
To train the model, run `train.py` with the dataset `image_dir`, `anno_path` and `mindrecord_dir`. If the `mindrecord_dir` is empty, it wil generate [mindrecord](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/convert_dataset.html) file by `image_dir` and `anno_path`(the absolute image path is joined by the `image_dir` and the relative path in `anno_path`). **Note if `mindrecord_dir` isn't empty, it will use `mindrecord_dir` rather than `image_dir` and `anno_path`.**
- Stand alone mode
```
```bash
sh run_standalone_train.sh 0 50 ./Mindrecord_train ./dataset ./dataset/train.txt
```
The input variables are device id, epoch size, mindrecord directory path, dataset directory path and train TXT file path.
- Distributed mode
```
```bash
sh run_distribute_train.sh 8 150 /data/Mindrecord_train /data /data/train.txt /data/hccl.json
```
@ -158,7 +155,7 @@ To train the model, run `train.py` with the dataset `image_dir`, `anno_path` and
You will get the loss value and time of each step as following:
```
```bash
epoch: 145 step: 156, loss is 12.202981
epoch time: 25599.22742843628, per step time: 164.0976117207454
epoch: 146 step: 156, loss is 16.91706
@ -173,14 +170,15 @@ You will get the loss value and time of each step as following:
epoch time: 25319.57221031189, per step time: 162.30495006610187
```
Note the results is two-classification(person and face) used our own annotations with coco2017, you can change `num_classes` in `config.py` to train your dataset. And we will suport 80 classifications in coco2017 the near future.
Note the results is two-classification(person and face) used our own annotations with coco2017, you can change `num_classes` in `config.py` to train your dataset. And we will support 80 classifications in coco2017 the near future.
## [Evaluation Process](#contents)
### Evaluation on Ascend
To eval, run `eval.py` with the dataset `image_dir`, `anno_path`(eval txt), `mindrecord_dir` and `ckpt_path`. `ckpt_path` is the path of [checkpoint](https://www.mindspore.cn/tutorial/training/en/master/use/save_model.html) file.
```
```bash
sh run_eval.sh 0 yolo.ckpt ./Mindrecord_eval ./dataset ./dataset/eval.txt
```
@ -188,15 +186,15 @@ The input variables are device id, checkpoint path, mindrecord directory path, d
You will get the precision and recall value of each class:
```
```bash
class 0 precision is 88.18%, recall is 66.00%
class 1 precision is 85.34%, recall is 79.13%
```
Note the precision and recall values are results of two-classification(person and face) used our own annotations with coco2017.
# [Model Description](#contents)
## [Performance](#contents)
### Evaluation Performance
@ -217,7 +215,6 @@ Note the precision and recall values are results of two-classification(person an
| Parameters (M) | 189 |
| Scripts | [yolov3_resnet18 script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_resnet18) | [yolov3_resnet18 script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_resnet18) |
### Inference Performance
| Parameters | Ascend |
@ -235,7 +232,7 @@ Note the precision and recall values are results of two-classification(person an
In dataset.py, we set the seed inside “create_dataset" function. We also use random seed in train.py.
# [ModelZoo Homepage](#contents)
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
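Note on the annotation format described in this README: each row holds a relative image path followed by space-separated boxes of the form `xmin,ymin,xmax,ymax,class`, and the absolute image path is formed by joining `image_dir` with that relative path. The following minimal sketch parses one such row. It is an illustration only, not the logic in `dataset.py`; the function name `parse_annotation_row` is hypothetical.

```python
import os


def parse_annotation_row(row, image_dir):
    """Split one annotation row into an image path and a list of boxes."""
    fields = row.split()
    # The first column is the image path relative to image_dir.
    image_path = os.path.join(image_dir, fields[0])
    boxes = []
    for field in fields[1:]:
        xmin, ymin, xmax, ymax, cls = (int(value) for value in field.split(","))
        boxes.append((xmin, ymin, xmax, ymax, cls))
    return image_path, boxes


# Sample row copied from the README.
row = "train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2"
image_path, boxes = parse_annotation_row(row, "./dataset")
```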

@ -15,7 +15,7 @@
# ============================================================================
echo "======================================================================================================================================================="
echo "Please run the scipt as: "
echo "Please run the script as: "
echo "sh run_distribute_train.sh DEVICE_NUM EPOCH_SIZE MINDRECORD_DIR IMAGE_DIR ANNO_PATH RANK_TABLE_FILE PRE_TRAINED PRE_TRAINED_EPOCH_SIZE"
echo "For example: sh run_distribute_train.sh 8 150 /data/Mindrecord_train /data /data/train.txt /data/hccl.json /opt/yolov3-150.ckpt(optional) 100(optional)"
echo "It is better to use absolute path."
@ -47,7 +47,7 @@ then
exit 1
fi
echo "After running the scipt, the network runs in the background. The log will be generated in LOGx/log.txt"
echo "After running the script, the network runs in the background. The log will be generated in LOGx/log.txt"
export RANK_TABLE_FILE=$6
export RANK_SIZE=$1

@ -15,7 +15,7 @@
# ============================================================================
echo "=============================================================================================================="
echo "Please run the scipt as: "
echo "Please run the script as: "
echo "sh run_eval.sh DEVICE_ID CKPT_PATH MINDRECORD_DIR IMAGE_DIR ANNO_PATH"
echo "for example: sh run_eval.sh 0 yolo.ckpt ./Mindrecord_eval ./dataset ./dataset/eval.txt"
echo "=============================================================================================================="

@ -15,7 +15,7 @@
# ============================================================================
echo "========================================================================================================================================="
echo "Please run the scipt as: "
echo "Please run the script as: "
echo "sh run_standalone_train.sh DEVICE_ID EPOCH_SIZE MINDRECORD_DIR IMAGE_DIR ANNO_PATH PRE_TRAINED PRE_TRAINED_EPOCH_SIZE"
echo "for example: sh run_standalone_train.sh 0 50 ./Mindrecord_train ./dataset ./dataset/train.txt /opt/yolov3-50.ckpt(optional) 30(optional)"
echo "========================================================================================================================================="
