!8899 Add OpenPose network to modelzoo

From: @zhanghuiyao
Reviewed-by: 
Signed-off-by:
pull/8899/MERGE
Committed by mindspore-ci-bot via Gitee
commit f4c126ddeb

@ -0,0 +1,225 @@
# Contents
- [Openpose Description](#openpose-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Features](#features)
    - [Mixed Precision](#mixed-precision)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Training](#training)
        - [Distributed Training](#distributed-training)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Evaluation Performance](#evaluation-performance)
# [Openpose Description](#contents)
The Openpose network proposes a bottom-up human pose estimation algorithm based on Part Affinity Fields (PAFs), in contrast to top-down algorithms, which first detect people and then regress the key-points and skeleton for each detection. The advantage of Openpose is that its computing time does not increase significantly as the number of people in the image grows, whereas the runtime of a top-down algorithm, which depends on the detection results, grows linearly with the number of people.
[Paper](https://arxiv.org/abs/1611.08050): Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh, "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
# [Model Architecture](#contents)
In the first step, the image is passed through a baseline CNN to extract feature maps of the input; in the paper, the authors use the first 10 layers of the VGG-19 network.
The feature maps are then processed by a multi-stage CNN pipeline to generate the Part Confidence Maps and Part Affinity Fields.
In the last step, the Confidence Maps and Part Affinity Fields generated above are processed by a greedy bipartite matching algorithm to obtain the pose for each person in the image.
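That pipeline can be sketched as a backbone followed by two-branch refinement stages. The snippet below is only a minimal illustration of the structure (the actual network lives in `src/openposenet.py`; the backbone and layer sizes here are simplified assumptions), using 38 PAF channels (19 limbs, x/y each) and 19 heatmap channels (18 keypoints plus background) as in the COCO setting:

```python
import mindspore.nn as nn
from mindspore.ops import operations as P


def two_branch_stage(in_ch, paf_ch=38, heat_ch=19):
    """One refinement stage: a PAF branch and a keypoint-heatmap branch."""
    paf = nn.SequentialCell([nn.Conv2d(in_ch, 128, 3), nn.ReLU(), nn.Conv2d(128, paf_ch, 1)])
    heat = nn.SequentialCell([nn.Conv2d(in_ch, 128, 3), nn.ReLU(), nn.Conv2d(128, heat_ch, 1)])
    return paf, heat


class TinyOpenPose(nn.Cell):
    """Backbone features feed stage 1; later stages refine the predictions from
    the backbone features concatenated with the previous stage's outputs."""
    def __init__(self):
        super(TinyOpenPose, self).__init__()
        # stands in for the first 10 layers of VGG-19 (plus the extra convolutions)
        self.backbone = nn.SequentialCell([nn.Conv2d(3, 128, 3), nn.ReLU()])
        self.paf1, self.heat1 = two_branch_stage(128)
        self.paf2, self.heat2 = two_branch_stage(128 + 38 + 19)
        self.concat = P.Concat(axis=1)

    def construct(self, img):
        feat = self.backbone(img)
        paf_1, heat_1 = self.paf1(feat), self.heat1(feat)
        x = self.concat((feat, paf_1, heat_1))
        paf_2, heat_2 = self.paf2(x), self.heat2(x)
        # training supervises every stage; inference uses the last stage only,
        # followed by the greedy bipartite matching step to assemble the poses
        return (paf_1, paf_2), (heat_1, heat_2)
```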
# [Dataset](#contents)
Prepare the datasets, including the training set, validation set, and annotations. The training and validation samples are located under the "dataset" directory; the supported datasets include the COCO2014 and COCO2017 datasets.
The provided training script uses the COCO2017 dataset as an example for data preprocessing during training. If you use a dataset in another format, please modify the dataset loading and preprocessing accordingly.
- Download the data from the official COCO2017 website and unzip it.
```bash
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
```
- Create the mask dataset.
Run the script gen_ignore_mask.py:
```bash
python gen_ignore_mask.py --train_ann ../dataset/annotations/person_keypoints_train2017.json --val_ann ../dataset/annotations/person_keypoints_val2017.json --train_dir train2017 --val_dir val2017
```
- The dataset folder is generated in the root directory and contains the following files:
```python
├── dataset
    ├── annotations
        ├─person_keypoints_train2017.json
        └─person_keypoints_val2017.json
    ├─ignore_mask_train
    ├─ignore_mask_val
    ├─train2017
    └─val2017
```
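To sanity-check the layout, the annotation files can be opened with `pycocotools`, the same library that `src/gen_ignore_mask.py` uses. A small check script, assuming the default `./dataset` root shown above:

```python
import os
from pycocotools.coco import COCO

data_dir = './dataset'  # assumed dataset root; adjust to your own layout
coco = COCO(os.path.join(data_dir, 'annotations', 'person_keypoints_val2017.json'))
img_ids = sorted(coco.getImgIds(catIds=coco.getCatIds()))
print('val2017 images with person annotations:', len(img_ids))

# every image referenced by the annotation file should exist on disk
first = coco.loadImgs(img_ids[:1])[0]
print(os.path.exists(os.path.join(data_dir, 'val2017', first['file_name'])))
```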
# [Features](#contents)
## Mixed Precision
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users can check the reduced-precision operators by enabling the INFO log and then searching for `reduce precision`.
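In this repository, mixed-precision training is driven by a fixed loss scale: the optimizer and the `Model` are built with the `loss_scale` value from `src/config.py`, as done in `train.py`. A minimal single-device sketch of that wiring (it mirrors `train.py` but skips the grouped learning rates):

```python
from mindspore.train import Model
from mindspore.train.loss_scale_manager import FixedLossScaleManager
from mindspore.nn.optim import Adam
from src.openposenet import OpenPoseNet
from src.loss import openpose_loss, BuildTrainNetwork
from src.config import params

net = OpenPoseNet(vggpath=params['vgg_path'])
train_net = BuildTrainNetwork(net, openpose_loss())
# a fixed loss scale keeps small FP16 gradients from underflowing
loss_scale = params['loss_scale']
opt = Adam(train_net.trainable_params(), learning_rate=params['lr'], loss_scale=loss_scale)
model = Model(train_net, optimizer=opt,
              loss_scale_manager=FixedLossScaleManager(loss_scale, drop_overflow_update=False))
```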
# [Environment Requirements](#contents)
- Hardware (Ascend)
- Prepare hardware environment with Ascend. If you want to try, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- Download the VGG19 model of the MindSpore version:
- [vgg19-0-97_5004.ckpt](http://10.154.33.38:51203/tutorials/image_classification.html)
- For more information, please check the resources below
- [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html)
# [Quick Start](#contents)
After installing MindSpore via the official website, you can start training and evaluation as follows:
```bash
# run training example
python train.py --train_dir train2017 --train_ann person_keypoints_train2017.json > train.log 2>&1 &

# run distributed training example
bash run_distribute_train.sh [RANK_TABLE_FILE]

# run evaluation example
python eval.py --model_path path_to_eval_model.ckpt --imgpath_val ./dataset/val2017 --ann ./dataset/annotations/person_keypoints_val2017.json > eval.log 2>&1 &
# OR
bash scripts/run_eval_ascend.sh
```
[RANK_TABLE_FILE] is the path of the multi-device information configuration table in the environment. The configuration table can be generated automatically by the tool [hccl_tools](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools).
# [Script Description](#contents)
## [Script and Sample Code](#contents)
```python
├── ModelZoo_openpose_MS_MIT
├── README.md // descriptions about openpose
├── scripts
│ ├──run_standalone_train.sh // shell script for standalone training on Ascend
│ ├──run_distribute_train.sh // shell script for distributed on Ascend with 8p
│ ├──run_eval_ascend.sh // shell script for evaluation on Ascend
├── src
│ ├──openposenet.py // Openpose architecture
│ ├──loss.py // Loss function
│ ├──config.py // parameter configuration
│ ├──dataset.py // Data preprocessing
│ ├──utils.py // Utils
│ ├──gen_ignore_mask.py // Generating mask data script
├── export.py // model conversion script
├── train.py // training script
├── eval.py // evaluation script
```
## [Script Parameters](#contents)
Parameters for both training and evaluation can be set in `config.py`.
- config for openpose
```python
'data_dir': 'path to dataset'               # absolute full path to the training and evaluation datasets
'vgg_path': 'path to vgg model'             # absolute full path to the vgg19 model
'save_model_path': 'path of saving models'  # absolute full path for output models
'load_pretrain': 'False'                    # whether to train based on a pre-trained model
'pretrained_model_path': ''                 # path of the pre-trained model to load
'lr': 1e-4                                  # initial learning rate
'batch_size': 10                            # training batch size
'lr_gamma': 0.1                             # factor by which lr is scaled at each step in lr_steps
'lr_steps': '100000,200000,250000'          # steps at which lr is multiplied by lr_gamma
'loss_scale': 16386                         # loss scale for mixed precision
'max_epoch_train': 60                       # total training epochs
'insize': 368                               # image size used as the model input
'keep_checkpoint_max': 5                    # only keep the last keep_checkpoint_max checkpoints
'log_interval': 100                         # interval (in steps) for printing logs
'ckpt_interval': 5000                       # interval (in steps) for saving checkpoints
```
For more configuration details, please refer to the script `config.py`.
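The parameters are plain Python values in the `params` dictionary of `src/config.py`, so they can be inspected or overridden in code before training starts; for example, `train.py` turns the `lr_steps` string into a list of step indices as sketched below:

```python
from src.config import params

params['batch_size'] = 8  # override before the dataset and model are created
lr_steps = list(map(int, params['lr_steps'].split(',')))
print(lr_steps)           # [100000, 200000, 250000]
```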
## [Training Process](#contents)
### Training
- running on Ascend
```bash
python train.py --train_dir train2017 --train_ann person_keypoints_train2017.json > train.log 2>&1 &
```
The python command above will run in the background; you can view the results through the file `train.log`.
After training, you will get some checkpoint files under the script folder by default. The loss values will be displayed as follows:
```python
# grep "epoch " train.log
epoch[0], iter[0], loss[0.29211228793809957], 0.13 imgs/sec, vgglr=0.0,baselr=2.499999936844688e-05,stagelr=9.999999747378752e-05
epoch[0], iter[100], loss[0.060355084178521694], 24.92 imgs/sec, vgglr=0.0,baselr=2.499999936844688e-05,stagelr=9.999999747378752e-05
epoch[0], iter[200], loss[0.026628130997662272], 26.20 imgs/sec, vgglr=0.0,baselr=2.499999936844688e-05,stagelr=9.999999747378752e-05
...
```
The model checkpoints will be saved in the directory specified by 'save_model_path' in config.py.
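A checkpoint saved there can later be restored the same way `eval.py` and `export.py` do; in the sketch below the checkpoint file name is only a placeholder:

```python
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from src.openposenet import OpenPoseNet

net = OpenPoseNet()
param_dict = load_checkpoint('./checkpoints/ckpt_0/0-60_663.ckpt')  # placeholder file name
load_param_into_net(net, param_dict)
net.set_train(False)
```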
## [Evaluation Process](#contents)
### Evaluation
- running on Ascend
Before running the command below, please check the checkpoint path used for evaluation. Please set the checkpoint path to an absolute full path, e.g., "username/openpose/outputs/\*time*\/0-6_30000.ckpt".
```bash
python eval.py --model_path path_to_eval_model.ckpt --imgpath_val ./dataset/val2017 --ann ./dataset/annotations/person_keypoints_val2017.json > eval.log 2>&1 &
# OR
bash scripts/run_eval_ascend.sh
```
The above python command will run in the background. You can view the results through the file "eval.log". The accuracy of the test dataset will be as follows:
```python
# grep "AP" eval.log
{'AP': 0.40030956300341397, 'Ap .5': 0.6658941566481336, 'AP .75': 0.396047897339743, 'AP (M)': 0.3075356543635785, 'AP (L)': 0.533772768618845, 'AR': 0.4519836272040302, 'AR .5': 0.693639798488665, 'AR .75': 0.4570214105793451, 'AR (M)': 0.32155148866429945, 'AR (L)': 0.6330360460795242}
```
# [Model Description](#contents)
## [Performance](#contents)
### Evaluation Performance
| Parameters                 | Ascend                                                |
| -------------------------- | ----------------------------------------------------- |
| Model Version              | openpose                                              |
| Resource                   | Ascend 910; CPU 2.60GHz, 192 cores; Memory, 755G      |
| Uploaded Date              | 10/20/2020 (month/day/year)                           |
| MindSpore Version          | 1.0.1-alpha                                           |
| Training Parameters        | epoch = 60, steps = 30k, batch_size = 10, lr = 0.0001 |
| Optimizer                  | Adam                                                  |
| Loss Function              | MSE                                                   |
| Outputs                    | pose                                                  |
| Speed                      | 1pc: 29 imgs/s                                        |
| Total time                 | 1pc: 30h                                              |
| Checkpoint for Fine tuning | 602.33M (.ckpt file)                                  |

File diff suppressed because it is too large.

@ -0,0 +1,38 @@
# Copyright 2020 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""export"""
import argparse
import numpy as np
from mindspore import Tensor
from mindspore import context
from mindspore.train.serialization import load_checkpoint, load_param_into_net, export
from src.openposenet import OpenPoseNet
parser = argparse.ArgumentParser(description='checkpoint export')
parser.add_argument('--checkpoint_path', type=str, default=None, help='Checkpoint file path')
args_opt = parser.parse_args()
if __name__ == '__main__':
context.set_context(mode=context.GRAPH_MODE, save_graphs=False)
# define net
net = OpenPoseNet()
# load checkpoint
param_dict = load_checkpoint(args_opt.checkpoint_path)
load_param_into_net(net, param_dict)
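# dummy NCHW input matching the model's expected 1 x 3 x 368 x 368 shape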
inputs = np.random.uniform(0.0, 1.0, size=[1, 3, 368, 368]).astype(np.float32)
export(net, Tensor(inputs), file_name="openpose.air", file_format='AIR')

@ -0,0 +1,61 @@
#!/bin/bash
# Copyright 2020 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 1 ]
then
echo "Usage: sh run_distribute_train.sh [RANK_TABLE_FILE]"
exit 1
fi
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
RANK_TABLE_FILE=$(get_real_path $1)
echo $RANK_TABLE_FILE
if [ ! -f $RANK_TABLE_FILE ]
then
echo "error: RANK_TABLE_FILE=$RANK_TABLE_FILE is not a file"
exit 1
fi
export DEVICE_NUM=8
export RANK_SIZE=8
export RANK_TABLE_FILE=$RANK_TABLE_FILE
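# launch one training process per device, each in its own working directory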
for((i=0; i<${DEVICE_NUM}; i++))
do
export DEVICE_ID=$i
export RANK_ID=$i
rm -rf ./train_parallel$i
mkdir ./train_parallel$i
cp ../*.py ./train_parallel$i
cp -r ../src ./train_parallel$i
cd ./train_parallel$i || exit
echo "start training for rank $RANK_ID, device $DEVICE_ID"
env > env.log
python train.py \
--train_dir train2017 \
--group_size 8 \
--train_ann person_keypoints_train2017.json > log.txt 2>&1 &
cd ..
done

@ -0,0 +1,23 @@
#!/bin/bash
# Copyright 2020 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
export DEVICE_ID=0
export RANK_ID=0
python eval.py \
--model_path ./scripts/train_parallel0/checkpoints/ckpt_0/0-60_663.ckpt \
--imgpath_val /data0/zhy/dataset/coco/val2017 \
--ann /data0/zhy/dataset/coco/annotations/person_keypoints_val2017.json \
> eval.log 2>&1 &

@ -0,0 +1,18 @@
#!/bin/bash
# Copyright 2020 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
cd ..
python train.py --train_dir train2017 --train_ann person_keypoints_train2017.json > scripts/train.log 2>&1 &

@ -0,0 +1,171 @@
# Copyright 2020 Huawei Technologies Co., Ltd
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
from enum import IntEnum
class JointType(IntEnum):
Nose = 0
Neck = 1
RightShoulder = 2
RightElbow = 3
RightHand = 4
LeftShoulder = 5
LeftElbow = 6
LeftHand = 7
RightWaist = 8
RightKnee = 9
RightFoot = 10
LeftWaist = 11
LeftKnee = 12
LeftFoot = 13
RightEye = 14
LeftEye = 15
RightEar = 16
LeftEar = 17
params = {
# paths
'data_dir': '/data0/zhy/dataset/coco',
'vgg_path': '/data0/zhy/dataset/coco/vgg19-0-97_5004.ckpt',
'save_model_path': './checkpoints/',
'load_pretrain': False,
'pretrained_model_path': "",
# training params
'batch_size': 10,
'lr': 1e-4,
'lr_gamma': 0.1,
'lr_steps': '100000,200000,250000',
'lr_steps_NP': '250000',
'loss_scale': 16386,
'max_epoch_train': 60,
'min_keypoints': 5,
'min_area': 32 * 32,
'insize': 368,
'downscale': 8,
'paf_sigma': 8,
'heatmap_sigma': 7,
'eva_num': 100,
'keep_checkpoint_max': 5,
'log_interval': 100,
'ckpt_interval': 663, # 5000,
'min_box_size': 64,
'max_box_size': 512,
'min_scale': 0.5,
'max_scale': 2.0,
'max_rotate_degree': 40,
'center_perterb_max': 40,
# inference params
'inference_img_size': 368,
'inference_scales': [0.5, 1, 1.5, 2],
# 'inference_scales': [1.0],
'heatmap_size': 320,
'gaussian_sigma': 2.5,
'ksize': 17,
'n_integ_points': 10,
'n_integ_points_thresh': 8,
'heatmap_peak_thresh': 0.05,
'inner_product_thresh': 0.05,
'limb_length_ratio': 1.0,
'length_penalty_value': 1,
'n_subset_limbs_thresh': 3,
'subset_score_thresh': 0.2,
'limbs_point': [
[JointType.Neck, JointType.RightWaist],
[JointType.RightWaist, JointType.RightKnee],
[JointType.RightKnee, JointType.RightFoot],
[JointType.Neck, JointType.LeftWaist],
[JointType.LeftWaist, JointType.LeftKnee],
[JointType.LeftKnee, JointType.LeftFoot],
[JointType.Neck, JointType.RightShoulder],
[JointType.RightShoulder, JointType.RightElbow],
[JointType.RightElbow, JointType.RightHand],
[JointType.RightShoulder, JointType.RightEar],
[JointType.Neck, JointType.LeftShoulder],
[JointType.LeftShoulder, JointType.LeftElbow],
[JointType.LeftElbow, JointType.LeftHand],
[JointType.LeftShoulder, JointType.LeftEar],
[JointType.Neck, JointType.Nose],
[JointType.Nose, JointType.RightEye],
[JointType.Nose, JointType.LeftEye],
[JointType.RightEye, JointType.RightEar],
[JointType.LeftEye, JointType.LeftEar]
],
'joint_indices': [
JointType.Nose,
JointType.LeftEye,
JointType.RightEye,
JointType.LeftEar,
JointType.RightEar,
JointType.LeftShoulder,
JointType.RightShoulder,
JointType.LeftElbow,
JointType.RightElbow,
JointType.LeftHand,
JointType.RightHand,
JointType.LeftWaist,
JointType.RightWaist,
JointType.LeftKnee,
JointType.RightKnee,
JointType.LeftFoot,
JointType.RightFoot
],
# face params
'face_inference_img_size': 368,
'face_heatmap_peak_thresh': 0.1,
'face_crop_scale': 1.5,
'face_line_indices': [
[0, 1], [1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [6, 7], [7, 8], [8, 9], [9, 10], [10, 11], [11, 12], [12, 13], [13, 14], [14, 15], [15, 16], # face outline
[17, 18], [18, 19], [19, 20], [20, 21],
[22, 23], [23, 24], [24, 25], [25, 26],
[27, 28], [28, 29], [29, 30],
[31, 32], [32, 33], [33, 34], [34, 35],
[36, 37], [37, 38], [38, 39], [39, 40], [40, 41], [41, 36],
[42, 43], [43, 44], [44, 45], [45, 46], [46, 47], [47, 42],
[48, 49], [49, 50], [50, 51], [51, 52], [52, 53], [53, 54], [54, 55], [55, 56], [56, 57], [57, 58], [58, 59], [59, 48], # outer lip contour
[60, 61], [61, 62], [62, 63], [63, 64], [64, 65], [65, 66], [66, 67], [67, 60]
],
# hand params
'hand_inference_img_size': 368,
'hand_heatmap_peak_thresh': 0.1,
'fingers_indices': [
[[0, 1], [1, 2], [2, 3], [3, 4]],
[[0, 5], [5, 6], [6, 7], [7, 8]],
[[0, 9], [9, 10], [10, 11], [11, 12]],
[[0, 13], [13, 14], [14, 15], [15, 16]],
[[0, 17], [17, 18], [18, 19], [19, 20]],
],
}

File diff suppressed because it is too large.

@ -0,0 +1,133 @@
# Copyright 2020 Huawei Technologies Co., Ltd
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import os
import argparse
import cv2
import numpy as np
from tqdm import tqdm
from pycocotools.coco import COCO as ReadJson
from config import params
class DataLoader():
def __init__(self, coco, dir_name, data_mode='train'):
self.train = coco
self.dir_name = dir_name
assert data_mode in ['train', 'val'], 'Data loading mode is invalid.'
self.mode = data_mode
self.catIds = coco.getCatIds() # catNms=['person']
self.imgIds = sorted(coco.getImgIds(catIds=self.catIds))
def __len__(self):
return len(self.imgIds)
def gen_masks(self, image, anns):
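# Build two binary masks over the image:
#   mask_all  - union of every person segmentation
#   mask_miss - crowd regions plus persons with too few keypoints or too
#               small an area, i.e. regions the training loss should ignore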
_mask_all = np.zeros(image.shape[:2], 'bool')
_mask_miss = np.zeros(image.shape[:2], 'bool')
for ann in anns:
mask = self.train.annToMask(ann).astype('bool')
if ann['iscrowd'] == 1:
intxn = _mask_all & mask
_mask_miss = np.bitwise_or(_mask_miss.astype(int), np.subtract(mask, intxn, dtype=np.int32))
_mask_all = np.bitwise_or(_mask_all.astype(int), mask.astype(int))
elif ann['num_keypoints'] < params['min_keypoints'] or ann['area'] <= params['min_area']:
_mask_all = np.bitwise_or(_mask_all.astype(int), mask.astype(int))
_mask_miss = np.bitwise_or(_mask_miss.astype(int), mask.astype(int))
else:
_mask_all = np.bitwise_or(_mask_all.astype(int), mask.astype(int))
return _mask_all, _mask_miss
def dwaw_gen_masks(self, image, mask, color=(0, 0, 1)):
bimsk = np.repeat(mask[:, :, np.newaxis], 3, axis=2)
mskd = image * bimsk.astype(np.int32)
clmsk = np.ones(bimsk.shape) * bimsk
for ind in range(3):
clmsk[:, :, ind] = clmsk[:, :, ind] * color[ind] * 255
image = image + 0.7 * clmsk - 0.7 * mskd
return image.astype(np.uint8)
def draw_masks_and_keypoints(self, image, anns):
for ann in anns:
# masks
mask = self.train.annToMask(ann).astype(np.uint8)
if ann['iscrowd'] == 1:
color = (0, 0, 1)
elif ann['num_keypoints'] == 0:
color = (0, 1, 0)
else:
color = (1, 0, 0)
bimsk = np.repeat(mask[:, :, np.newaxis], 3, axis=2)
mskd = image * bimsk.astype(np.int32)
clmsk = np.ones(bimsk.shape) * bimsk
for ind in range(3):
clmsk[:, :, ind] = clmsk[:, :, ind] * color[ind] * 255
image = image + 0.7 * clmsk - 0.7 * mskd
# keypoints
for x, y, v in np.array(ann['keypoints']).reshape(-1, 3):
if v == 1:
cv2.circle(image, (x, y), 3, (255, 255, 0), -1)
elif v == 2:
cv2.circle(image, (x, y), 3, (255, 0, 255), -1)
return image.astype(np.uint8)
def get_img_annotation(self, ind=None, image_id=None):
if ind is not None:
image_id = self.imgIds[ind]
anno_ids = self.train.getAnnIds(imgIds=[image_id])
_annotations = self.train.loadAnns(anno_ids)
img_file = os.path.join(params['data_dir'], self.dir_name, self.train.loadImgs([image_id])[0]['file_name'])
_image = cv2.imread(img_file)
return _image, _annotations, image_id
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--vis', action='store_true', help='visualize annotations and ignore masks')
parser.add_argument('--train_ann', type=str, help='train annotations json')
parser.add_argument('--val_ann', type=str, help='val annotations json')
parser.add_argument('--train_dir', type=str, help='name of train dir')
parser.add_argument('--val_dir', type=str, help='name of val dir')
args = parser.parse_args()
path_list = [args.train_ann, args.val_ann, args.train_dir, args.val_dir]
for index, mode in enumerate(['train', 'val']):
train = ReadJson(path_list[index])
data_loader = DataLoader(train, path_list[index+2], data_mode=mode)
save_dir = os.path.join(params['data_dir'], 'ignore_mask_{}'.format(mode))
if not os.path.exists(save_dir):
os.makedirs(save_dir)
for i in tqdm(range(len(data_loader))):
img, annotations, img_id = data_loader.get_img_annotation(ind=i)
mask_all, mask_miss = data_loader.gen_masks(img, annotations)
if args.vis:
ann_img = data_loader.draw_masks_and_keypoints(img, annotations)
msk_img = data_loader.dwaw_gen_masks(img, mask_miss)
cv2.imshow('image', np.hstack((ann_img, msk_img)))
k = cv2.waitKey()
if k == ord('q'):
break
elif k == ord('s'):
cv2.imwrite('aaa.png', np.hstack((ann_img, msk_img)))
if np.any(mask_miss) and not args.vis:
mask_miss = mask_miss.astype(np.uint8) * 255
save_path = os.path.join(save_dir, '{:012d}.png'.format(img_id))
cv2.imwrite(save_path, mask_miss)

@ -0,0 +1,207 @@
# Copyright 2020 Huawei Technologies Co., Ltd
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import time
import mindspore.nn as nn
from mindspore.ops import operations as P
from mindspore.nn.loss.loss import _Loss
from mindspore.train.callback import Callback
from mindspore.ops import functional as F
from mindspore.ops import composite as C
from mindspore.communication.management import get_group_size
from mindspore.context import ParallelMode
from mindspore import context
from mindspore import RowTensor
context.set_context(mode=context.GRAPH_MODE, save_graphs=True)
time_stamp_init = False
time_stamp_first = 0
grad_scale = C.MultitypeFuncGraph("grad_scale")
reciprocal = P.Reciprocal()
GRADIENT_CLIP_TYPE = 1
GRADIENT_CLIP_VALUE = 1.0
@grad_scale.register("Tensor", "Tensor")
def tensor_grad_scale(scale, grad):
return grad * F.cast(reciprocal(scale), F.dtype(grad))
@grad_scale.register("Tensor", "RowTensor")
def tensor_grad_scale_row_tensor(scale, grad):
return RowTensor(grad.indices,
grad.values * F.cast(reciprocal(scale), F.dtype(grad.values)),
grad.dense_shape)
clip_grad = C.MultitypeFuncGraph("clip_grad")
@clip_grad.register("Number", "Number", "Tensor")
class openpose_loss(_Loss):
def __init__(self):
super(openpose_loss, self).__init__()
self.expand_dims = P.ExpandDims()
self.tile = P.Tile()
self.mul = P.Mul()
self.l2_loss = P.L2Loss()
self.square = P.Square()
self.reduceMean = P.ReduceMean()
self.reduceSum = P.ReduceSum()
self.print = P.Print()
self.shape = P.Shape()
self.maxoftensor = P.ArgMaxWithValue(-1)
def mean_square_error(self, map1, map2, mask=None):
# print("mask", mask)
# import pdb; pdb.set_trace()
if mask is None:
mse = self.reduceMean((map1 - map2) ** 2)
return mse
squareMap = self.square(map1 - map2)
squareMap_mask = self.mul(squareMap, mask)
mse = self.reduceMean(squareMap_mask)
return mse
def construct(self, logit_paf, logit_heatmap, gt_paf, gt_heatmap, ignore_mask):
# Input
# ignore_mask, make sure the ignore_mask the 0-1 array instead of the bool-false array
heatmaps_loss = []
pafs_loss = []
total_loss = 0
paf_masks = self.tile(self.expand_dims(ignore_mask, 1), (1, self.shape(gt_paf)[1], 1, 1))
heatmap_masks = self.tile(self.expand_dims(ignore_mask, 1), (1, self.shape(gt_heatmap)[1], 1, 1))
paf_masks = F.stop_gradient(paf_masks)
heatmap_masks = F.stop_gradient(heatmap_masks)
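# accumulate the MSE of every refinement stage (intermediate supervision),
# masking out the ignore regions in both the PAF and heatmap terms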
for logit_paf_t, logit_heatmap_t in zip(logit_paf, logit_heatmap):
# TEST
# tensor1 -- tuple
# tensor1 = self.maxoftensor(logit_paf_t)[1]
# tensor2 = self.maxoftensor(logit_heatmap_t)[1]
# tensor3 = self.maxoftensor(tensor1)[1]
# tensor4 = self.maxoftensor(tensor2)[1]
# self.print("paf",tensor3)
# self.print("heatmaps",tensor2)
pafs_loss_t = self.mean_square_error(logit_paf_t, gt_paf, paf_masks)
heatmaps_loss_t = self.mean_square_error(logit_heatmap_t, gt_heatmap, heatmap_masks)
total_loss += pafs_loss_t + heatmaps_loss_t
heatmaps_loss.append(heatmaps_loss_t)
pafs_loss.append(pafs_loss_t)
return total_loss, heatmaps_loss, pafs_loss
class Depend_network(nn.Cell):
def __init__(self, network):
super(Depend_network, self).__init__()
self.network = network
def construct(self, *args):
loss, _, _ = self.network(*args) # loss, heatmaps_loss, pafs_loss
return loss
class TrainingWrapper(nn.Cell):
def __init__(self, network, optimizer, sens=1):
super(TrainingWrapper, self).__init__(auto_prefix=False)
self.network = network
self.depend_network = Depend_network(network)
# self.weights = ms.ParameterTuple(network.trainable_params())
self.weights = optimizer.parameters
self.optimizer = optimizer
self.grad = C.GradOperation(get_by_list=True, sens_param=True)
self.sens = sens
self.reducer_flag = False
self.grad_reducer = None
self.print = P.Print()
self.parallel_mode = context.get_auto_parallel_context("parallel_mode")
if self.parallel_mode in [ParallelMode.DATA_PARALLEL, ParallelMode.HYBRID_PARALLEL]:
self.reducer_flag = True
if self.reducer_flag:
mean = context.get_auto_parallel_context("gradients_mean")
#if mean.get_device_num_is_set():
# if mean:
#degree = context.get_auto_parallel_context("device_num")
# else:
degree = get_group_size()
self.grad_reducer = nn.DistributedGradReducer(optimizer.parameters, mean, degree)
def construct(self, *args):
weights = self.weights
loss, heatmaps_loss, pafs_loss = self.network(*args)
sens = P.Fill()(P.DType()(loss), P.Shape()(loss), self.sens)
#grads = self.grad(self.network, weights)(*args, sens)
grads = self.grad(self.depend_network, weights)(*args, sens)
if self.reducer_flag:
grads = self.grad_reducer(grads)
#return F.depend(loss, self.optimizer(grads))
# for grad in grads:
# self.print(grad)
loss = F.depend(loss, self.optimizer(grads))
return loss, heatmaps_loss, pafs_loss
class BuildTrainNetwork(nn.Cell):
def __init__(self, network, criterion):
super(BuildTrainNetwork, self).__init__()
self.network = network
self.criterion = criterion
def construct(self, input_data, gt_paf, gt_heatmap, mask):
logit_pafs, logit_heatmap = self.network(input_data)
loss, _, _ = self.criterion(logit_pafs, logit_heatmap, gt_paf, gt_heatmap, mask)
return loss
class LossCallBack(Callback):
"""
Monitor the loss in training.
If the loss is NAN or INF terminating training.
Note:
If per_print_times is 0 do not print loss.
Args:
per_print_times (int): Print loss every times. Default: 1.
"""
def __init__(self, per_print_times=1):
super(LossCallBack, self).__init__()
if not isinstance(per_print_times, int) or per_print_times < 0:
raise ValueError("print_step must be int and >= 0.")
self._per_print_times = per_print_times
self.count = 0
self.loss_sum = 0
global time_stamp_init, time_stamp_first
if not time_stamp_init:
time_stamp_first = time.time()
time_stamp_init = True
def step_end(self, run_context):
cb_params = run_context.original_args()
loss = cb_params.net_outputs.asnumpy()
self.count += 1
self.loss_sum += float(loss)
cur_step_in_epoch = (cb_params.cur_step_num - 1) % cb_params.batch_num + 1
if self.count >= 1:
global time_stamp_first
time_stamp_current = time.time()
loss = self.loss_sum/self.count
loss_file = open("./loss.log", "a+")
loss_file.write("%lu epoch: %s step: %s ,loss: %.5f" %
(time_stamp_current - time_stamp_first, cb_params.cur_epoch_num, cur_step_in_epoch,
loss))
loss_file.write("\n")
loss_file.close()
self.count = 0
self.loss_sum = 0

File diff suppressed because it is too large.

@ -0,0 +1,157 @@
# Copyright 2020 Huawei Technologies Co., Ltd
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import argparse
import os
import time
import numpy as np
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from mindspore.train.callback import LossMonitor
from mindspore.common.tensor import Tensor
from src.config import params
class MyLossMonitor(LossMonitor):
def __init__(self, per_print_times=1):
super(MyLossMonitor, self).__init__()
self._per_print_times = per_print_times
self._start_time = time.time()
self._loss_list = []
def step_end(self, run_context):
cb_params = run_context.original_args()
loss = cb_params.net_outputs
if isinstance(loss, (tuple, list)):
if isinstance(loss[0], Tensor) and isinstance(loss[0].asnumpy(), np.ndarray):
loss = loss[0]
if isinstance(loss, Tensor) and isinstance(loss.asnumpy(), np.ndarray):
loss = np.mean(loss.asnumpy())
cur_step_in_epoch = (cb_params.cur_step_num - 1) % cb_params.batch_num + 1
if isinstance(loss, float) and (np.isnan(loss) or np.isinf(loss)):
raise ValueError("epoch: {} step: {}. Invalid loss, terminating training.".format(
cb_params.cur_epoch_num, cur_step_in_epoch))
if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:
# print("epoch: %s step: %s, loss is %s, step time: %.3f s." % (cb_params.cur_epoch_num, cur_step_in_epoch,
# loss,
# (time.time() - self._start_time)), flush=True)
self._loss_list.append(loss)
if cb_params.cur_step_num % 100 == 0:
print("epoch: %s, steps: [%s] mean loss is: %s"%(cb_params.cur_epoch_num, cur_step_in_epoch,
np.array(self._loss_list).mean()), flush=True)
self._loss_list = []
self._start_time = time.time()
def parse_args():
"""Parse train arguments."""
parser = argparse.ArgumentParser('mindspore openpose training')
# dataset related
parser.add_argument('--train_dir', type=str, default='train2017', help='train data dir')
parser.add_argument('--train_ann', type=str, default='person_keypoints_train2017.json',
help='train annotations json')
parser.add_argument('--group_size', type=int, default=1, help='world size of distributed')
args, _ = parser.parse_known_args()
args.jsonpath_train = os.path.join(params['data_dir'], 'annotations/' + args.train_ann)
args.imgpath_train = os.path.join(params['data_dir'], args.train_dir)
args.maskpath_train = os.path.join(params['data_dir'], 'ignore_mask_train')
return args
def get_lr(lr, lr_gamma, steps_per_epoch, max_epoch_train, lr_steps, group_size):
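# Build three per-step learning-rate schedules:
#   lr_stage - schedule for the refinement stages, scaled by lr_gamma at each step in lr_steps
#   lr_base  - lr_stage / 4, applied to the extra base convolutions
#   lr_vgg   - same as lr_base but kept at 0 for the first 2000 steps so the VGG layers start frozen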
lr_stage = np.array([lr] * steps_per_epoch * max_epoch_train).astype('f')
for step in lr_steps:
step //= group_size
lr_stage[step:] *= lr_gamma
lr_base = lr_stage.copy()
lr_base = lr_base / 4
lr_vgg = lr_base.copy()
vgg_freeze_step = 2000
lr_vgg[:vgg_freeze_step] = 0
return lr_stage, lr_base, lr_vgg
# zhang add
def adjust_learning_rate(init_lr, lr_gamma, steps_per_epoch, max_epoch_train, stepvalues):
lr_stage = np.array([init_lr] * steps_per_epoch * max_epoch_train).astype('f')
for epoch in stepvalues:
lr_stage[epoch * steps_per_epoch:] *= lr_gamma
lr_base = lr_stage.copy()
lr_base = lr_base / 4
lr_vgg = lr_base.copy()
vgg_freeze_step = 2000
lr_vgg[:vgg_freeze_step] = 0
return lr_stage, lr_base, lr_vgg
def load_model(test_net, model_path):
if model_path:
param_dict = load_checkpoint(model_path)
# print(type(param_dict))
param_dict_new = {}
for key, values in param_dict.items():
# print('key:', key)
if key.startswith('moment'):
continue
elif key.startswith('network.'):
param_dict_new[key[8:]] = values
# else:
# param_dict_new[key] = values
load_param_into_net(test_net, param_dict_new)
class show_loss_list():
def __init__(self, name):
self.loss_list = np.zeros(6).astype('f')
self.sums = 0
self.name = name
def add(self, list_of_tensor):
self.sums += 1
for i, loss_tensor in enumerate(list_of_tensor):
self.loss_list[i] += loss_tensor.asnumpy()
def show(self):
print(self.name + ' stage_loss:', self.loss_list / (self.sums + 1e-8), flush=True)
self.loss_list = np.zeros(6).astype('f')
self.sums = 0
class AverageMeter():
def __init__(self):
self.loss = 0
self.sum = 0
def add(self, tensor):
self.sum += 1
self.loss += tensor.asnumpy()
def meter(self):
avergeLoss = self.loss / (self.sum + 1e-8)
self.loss = 0
self.sum = 0
return avergeLoss

@ -0,0 +1,124 @@
# Copyright 2020 Huawei Technologies Co., Ltd
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import os
from mindspore import context
from mindspore.context import ParallelMode
from mindspore.communication.management import init, get_rank, get_group_size
from mindspore.train import Model
from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, TimeMonitor
from mindspore.train.loss_scale_manager import FixedLossScaleManager
from mindspore.nn.optim import Adam
from src.dataset import create_dataset
from src.openposenet import OpenPoseNet
from src.loss import openpose_loss, BuildTrainNetwork
from src.config import params
from src.utils import parse_args, get_lr, load_model, MyLossMonitor
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", save_graphs=False)
def train():
"""Train function."""
args = parse_args()
args.outputs_dir = params['save_model_path']
if args.group_size > 1:
init()
context.set_auto_parallel_context(device_num=get_group_size(), parallel_mode=ParallelMode.DATA_PARALLEL,
gradients_mean=True)
args.outputs_dir = os.path.join(args.outputs_dir, "ckpt_{}/".format(str(get_rank())))
args.rank = get_rank()
else:
args.outputs_dir = os.path.join(args.outputs_dir, "ckpt_0/")
args.rank = 0
# with out loss_scale
if args.group_size > 1:
args.loss_scale = params['loss_scale'] / 2
args.lr_steps = list(map(int, params["lr_steps_NP"].split(',')))
else:
args.loss_scale = params['loss_scale']
args.lr_steps = list(map(int, params["lr_steps"].split(',')))
# create network
print('start create network')
criterion = openpose_loss()
criterion.add_flags_recursive(fp32=True)
network = OpenPoseNet(vggpath=params['vgg_path'])
# network.add_flags_recursive(fp32=True)
if params["load_pretrain"]:
print("load pretrain model:", params["pretrained_model_path"])
load_model(network, params["pretrained_model_path"])
train_net = BuildTrainNetwork(network, criterion)
# create dataset
if os.path.exists(args.jsonpath_train) and os.path.exists(args.imgpath_train) \
and os.path.exists(args.maskpath_train):
print('start create dataset')
else:
print('Error: wrong data path')
num_worker = 20 if args.group_size > 1 else 48
de_dataset_train = create_dataset(args.jsonpath_train, args.imgpath_train, args.maskpath_train,
batch_size=params['batch_size'],
rank=args.rank,
group_size=args.group_size,
num_worker=num_worker,
multiprocessing=True,
shuffle=True,
repeat_num=1)
steps_per_epoch = de_dataset_train.get_dataset_size()
print("steps_per_epoch: ", steps_per_epoch)
# lr scheduler
lr_stage, lr_base, lr_vgg = get_lr(params['lr'] * args.group_size,
params['lr_gamma'],
steps_per_epoch,
params["max_epoch_train"],
args.lr_steps,
args.group_size)
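# split the trainable parameters into three groups so each group follows its
# own learning-rate schedule: VGG backbone, extra base convolutions, stages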
vgg19_base_params = list(filter(lambda x: 'base.vgg_base' in x.name, train_net.trainable_params()))
base_params = list(filter(lambda x: 'base.conv' in x.name, train_net.trainable_params()))
stages_params = list(filter(lambda x: 'base' not in x.name, train_net.trainable_params()))
group_params = [{'params': vgg19_base_params, 'lr': lr_vgg},
{'params': base_params, 'lr': lr_base},
{'params': stages_params, 'lr': lr_stage}]
opt = Adam(group_params, loss_scale=args.loss_scale)
train_net.set_train(True)
loss_scale_manager = FixedLossScaleManager(args.loss_scale, drop_overflow_update=False)
model = Model(train_net, optimizer=opt, loss_scale_manager=loss_scale_manager)
params['ckpt_interval'] = max(steps_per_epoch, params['ckpt_interval'])
config_ck = CheckpointConfig(save_checkpoint_steps=params['ckpt_interval'],
keep_checkpoint_max=params["keep_checkpoint_max"])
ckpoint_cb = ModelCheckpoint(prefix='{}'.format(args.rank), directory=args.outputs_dir, config=config_ck)
time_cb = TimeMonitor(data_size=de_dataset_train.get_dataset_size())
callback_list = [MyLossMonitor(), time_cb, ckpoint_cb]
print("============== Starting Training ==============")
model.train(params["max_epoch_train"], de_dataset_train, callbacks=callback_list,
dataset_sink_mode=False)
if __name__ == "__main__":
# mindspore.common.seed.set_seed(1)
train()