###### `export` Modify the input parameters and the exported file name ([!7385](https://gitee.com/mindspore/mindspore/pulls/7385), [!9057](https://gitee.com/mindspore/mindspore/pulls/9057/files))
Export the MindSpore prediction model to a file in the specified format.
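A minimal usage sketch of the updated interface follows; the toy `nn.Dense` network, input shape, and file name are illustrative assumptions, not taken from the pull requests:

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.train.serialization import export

# Toy stand-in for a trained network; any nn.Cell works the same way.
net = nn.Dense(16, 10)
inp = Tensor(np.ones([1, 16]).astype(np.float32))

# file_name is the base name; the output suffix follows the chosen file_format.
export(net, inp, file_name="net", file_format="MINDIR")
```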
###### `EmbeddingLookup` add a config in the interface: sparse ([!8202](https://gitee.com/mindspore/mindspore/pulls/8202))
sparse (bool): Whether to use sparse mode. When 'target' is set to 'CPU', 'sparse' must be True. Default: True.
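A minimal sketch of the new flag (the vocabulary and embedding sizes below are made-up values for illustration):

```python
import numpy as np
import mindspore as ms
import mindspore.nn as nn
from mindspore import Tensor

# With target='CPU', sparse must stay True (the default).
lookup = nn.EmbeddingLookup(vocab_size=10000, embedding_size=64,
                            target='CPU', sparse=True)
ids = Tensor(np.array([[1, 3, 5]]), ms.int32)
out = lookup(ids)  # shape: (1, 3, 64)
```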
- Fix bug where a list could not be used as input in PyNative mode ([!1765](https://gitee.com/mindspore/mindspore/pulls/1765))
- Fix bug in kernel selection ([!2103](https://gitee.com/mindspore/mindspore/pulls/2103))
- Fix bug in pattern matching for batchnorm fusion under auto mixed precision ([!1851](https://gitee.com/mindspore/mindspore/pulls/1851))
- Fix bug in generating HCCL kernel info ([!2393](https://gitee.com/mindspore/mindspore/pulls/2393))
- GPU platform
- Fix bug where the summary feature was invalid ([!2173](https://gitee.com/mindspore/mindspore/pulls/2173))
## [Mixed Precision](#contents)
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware. For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching ‘reduce precision’.
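As a hedged illustration of how this is typically switched on in MindSpore training scripts (the toy network, loss, and optimizer below are stand-ins, not this model's definitions):

```python
import mindspore.nn as nn
from mindspore.train.model import Model

# Toy components; in the model zoo scripts these come from the network definition.
net = nn.Dense(16, 10)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
opt = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)

# amp_level="O2" runs FP16-capable operators in half precision while keeping
# precision-sensitive parts (e.g. batchnorm) and the loss in FP32.
model = Model(net, loss_fn=loss, optimizer=opt, amp_level="O2")
```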
# [Environment Requirements](#contents)
To run the Python scripts in the repository, you need to prepare the environment:
- Easydict
- MXNet 1.6.0 if running the script `param_convert.py`
- For more information, please check the resources below:
[Paper](https://arxiv.org/pdf/1905.02244): Howard, Andrew, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang et al. "Searching for MobileNetV3." In Proceedings of the IEEE International Conference on Computer Vision, pp. 1314-1324. 2019.
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [RetinaFace Description](#contents)
RetinaFace is a face detection model that was proposed in 2019 and achieved the best results on the WIDER FACE dataset at that time. The full title of its paper is "RetinaFace: Single-stage Dense Face Localisation in the Wild". Compared with S3FD and MTCNN, it brings a significant improvement and a higher recall rate for small faces; however, single-scale detectors are not good at multi-scale face detection. To solve these problems, RetinaFace uses a feature pyramid structure for feature fusion between different scales and adds the SSH module.
RetinaFace needs a ResNet50 backbone to extract image features for detection. You can get the ResNet50 training script from our Model Zoo, modify the pad structure of ResNet50 according to the ResNet definition in ./src/network.py, and finally train it on ImageNet2012 to get the ResNet50 pretrained model.
Steps:
1. Get resnet50 train script from our modelzoo.
2. Modify the ResNet50 architecture according to the ResNet definition in `./src/network.py`. (You can also leave the structure unchanged, but the accuracy will be 2-3 percentage points lower.)
3. Train resnet50 on imagenet2012.
Specifically, the RetinaFace network is based on RetinaNet: it uses RetinaNet's feature pyramid structure and adds the SSH structure. Besides the traditional detection branch, a key-point prediction branch and a self-supervised branch are added to the network. The paper indicates that these two branches can improve the performance of the model. We do not implement the self-supervised branch here.
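As a rough, hypothetical sketch of the SSH idea only (not the implementation in `./src/network.py`), the module runs stacked 3x3 convolution branches with growing receptive fields over a feature map and concatenates them along the channel axis:

```python
import mindspore.nn as nn
import mindspore.ops as ops

class SSHSketch(nn.Cell):
    """Simplified SSH context module: a 3x3 branch, a 5x5-equivalent branch
    (two stacked 3x3 convs) and a 7x7-equivalent branch (three stacked 3x3
    convs), concatenated along the channel axis."""
    def __init__(self, in_channel, out_channel):
        super().__init__()
        self.conv3x3 = nn.Conv2d(in_channel, out_channel // 2, 3, pad_mode='same')
        self.conv5x5_1 = nn.Conv2d(in_channel, out_channel // 4, 3, pad_mode='same')
        self.conv5x5_2 = nn.Conv2d(out_channel // 4, out_channel // 4, 3, pad_mode='same')
        self.conv7x7 = nn.Conv2d(out_channel // 4, out_channel // 4, 3, pad_mode='same')
        self.relu = nn.ReLU()
        self.concat = ops.Concat(axis=1)

    def construct(self, x):
        branch1 = self.conv3x3(x)
        mid = self.relu(self.conv5x5_1(x))
        branch2 = self.conv5x5_2(mid)
        branch3 = self.conv7x7(self.relu(branch2))
        return self.relu(self.concat((branch1, branch2, branch3)))
```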
After training, you'll get some checkpoint files under the folder `./checkpoint/ckpt_0/` by default.
## [Evaluation Process](#contents)
### Evaluation
Before running the command below, check the checkpoint path used for evaluation. Set the checkpoint path to an absolute path in src/config.py, e.g., "username/retinaface/checkpoint/ckpt_0/RetinaFace-100_402.ckpt".
```bash
export CUDA_VISIBLE_DEVICES=0
python eval.py > eval.log 2>&1 &
```
The above python command will run in the background. You can view the results through the file "eval.log". The result of the test dataset will be as follows:
```text
# grep "Val AP" eval.log
Easy Val AP : 0.9422
Medium Val AP : 0.9325
OR,
```bash
bash run_standalone_gpu_eval.sh 0
```
The above shell script will run in the background. You can view the results through the file "eval/eval.log". The result of the test dataset will be as follows:
```text
# grep "Val AP" eval.log
Easy Val AP : 0.9422
Medium Val AP : 0.9325
Hard Val AP : 0.8900
```
# [Model Description](#contents)
## [Performance](#contents)
### Evaluation Performance
| Checkpoint for Fine tuning | 336.3M (.ckpt file) |
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware. For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching ‘reduce precision’.
# [Environment Requirements](#contents)
To run the Python scripts in the repository, you need to prepare the environment:
- opencv-python 4.3.0.36
- pycocotools 2.0
- For more information, please check the resources below:
YOLOv4 is a state-of-the-art detector which is faster (FPS) and more accurate (MS COCO AP50...95 and AP50) than all available alternative detectors.
YOLOv4 verified a large number of features and selected for use those that improve the accuracy of both the classifier and the detector.
These features can be used as best-practice for future studies and developments.
Dataset support: [MS COCO] or datasets with the same format as MS COCO
Annotation support: [MS COCO] or annotation as the same format as MS COCO
- The directory structure is as follows; the directory and file names are user-defined:
```text
├── dataset
├── YOLOv4
├── annotations
└─picturen.jpg
```
We suggest users use the MS COCO dataset to experience our model; other datasets need to use the same format as MS COCO.
- Hardware(Ascend)
- Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
### Distributed Training
For the Ascend device, here is a distributed training example (8P) using a shell script:
```bash
sh run_distribute_train.sh dataset/coco2017 cspdarknet53_backbone.ckpt rank_table_8p.json
```
The above shell script will run distributed training in the background. You can view the results through the file train_parallel[X]/log.txt. The loss values will be as follows:
```text
...
```
## [Evaluation Process](#contents)
### Valid
```bash
python eval.py \
--data_dir=./dataset/coco2017 \
--pretrained=yolov4.ckpt
OR
sh run_eval.sh dataset/coco2017 checkpoint/yolov4.ckpt
```
The above python command will run in the background. You can view the results through the file "log.txt". The mAP of the test dataset will be as follows:
```text
# log.txt
=============coco eval reulst=========
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.442
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.717
```
### Test-dev
```bash
python test.py \
--data_dir=./dataset/coco2017 \
--pretrained=yolov4.ckpt
OR
sh run_test.sh dataset/coco2017 checkpoint/yolov4.ckpt
```
The predict_xxx.json will be found in test/outputs/%Y-%m-%d_time_%H_%M_%S/.
Rename the file predict_xxx.json to detections_test-dev2017_yolov4_results.json and compress it to detections_test-dev2017_yolov4_results.zip, as sketched below.
Submit the file detections_test-dev2017_yolov4_results.zip to the MS COCO evaluation server for test-dev2019 (bbox): <https://competitions.codalab.org/competitions/20794#participate>.
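A small helper sketch for the rename-and-compress step (the glob pattern over the timestamped output directory is an assumption):

```python
import glob
import shutil
import zipfile

# Pick the predict json from the most recent test run.
src = sorted(glob.glob("test/outputs/*/predict_*.json"))[-1]
shutil.copy(src, "detections_test-dev2017_yolov4_results.json")

# Compress it for upload to the evaluation server.
with zipfile.ZipFile("detections_test-dev2017_yolov4_results.zip", "w",
                     zipfile.ZIP_DEFLATED) as zf:
    zf.write("detections_test-dev2017_yolov4_results.json")
```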
You will get results like the following at the end of the file "View scoring output log":
```text
overall performance
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.447
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.642
## Inference
If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/migrate_3rd_scripts.html).
For inference, first configure the options in `config.json`:
- Assign the `test_dataset` under `dataset_config` node to the dataset path.
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [NCF Description](#contents)
NCF is a general framework for collaborative filtering of recommendations in which a neural network architecture is used to model user-item interactions. Unlike traditional models, NCF does not resort to Matrix Factorization (MF) with an inner product on latent features of users and items. It replaces the inner product with a multi-layer perceptron that can learn an arbitrary function from data.
[Paper](https://arxiv.org/abs/1708.05031): He X, Liao L, Zhang H, et al. Neural collaborative filtering[C]//Proceedings of the 26th international conference on world wide web. 2017: 173-182.
# [Model Architecture](#contents)
Two instantiations of NCF are Generalized Matrix Factorization (GMF) and Multi-Layer Perceptron (MLP). GMF applies a linear kernel to model the latent feature interactions, and MLP uses a nonlinear kernel to learn the interaction function from data. NeuMF is a fused model of GMF and MLP that better models the complex user-item interactions, unifying the linearity of MF and the non-linearity of MLP for modeling the user-item latent structures. NeuMF allows GMF and MLP to learn separate embeddings, and combines the two models by concatenating their last hidden layers. [neumf_model.py](neumf_model.py) defines the architecture details.
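As a framework-agnostic sketch of that fusion (names and shapes are illustrative, not taken from neumf_model.py):

```python
import numpy as np

def neumf_score(u, i, gmf_user, gmf_item, mlp_user, mlp_item, mlp_layers, w_out, b_out):
    """Toy NeuMF forward pass for one (user, item) pair."""
    # GMF branch: element-wise product of the GMF embeddings.
    gmf_vec = gmf_user[u] * gmf_item[i]
    # MLP branch: concatenated embeddings through ReLU layers.
    h = np.concatenate([mlp_user[u], mlp_item[i]])
    for W, b in mlp_layers:
        h = np.maximum(W @ h + b, 0.0)
    # Fusion: concatenate the two branches' last layers, then one sigmoid unit.
    z = np.concatenate([gmf_vec, h])
    return 1.0 / (1.0 + np.exp(-(w_out @ z + b_out)))
```

As the paragraph above notes, the two branches keep separate embedding tables; only the final concatenation ties them together.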
# [Dataset](#contents)
The [MovieLens datasets](http://files.grouplens.org/datasets/movielens/) are used for model training and evaluation. Specifically, we use two datasets: **ml-1m** (short for MovieLens 1 million) and **ml-20m** (short for MovieLens 20 million).
## ml-1m
ml-1m dataset contains 1,000,209 anonymous ratings of approximately 3,706 movies made by 6,040 users who joined MovieLens in 2000. All ratings are contained in the file "ratings.dat" without header row, and are in the following format:
```text
UserID::MovieID::Rating::Timestamp
```
- UserIDs range between 1 and 6040.
- MovieIDs range between 1 and 3952.
- Ratings are made on a 5-star scale (whole-star ratings only).
## ml-20m
The ml-20m dataset contains 20,000,263 ratings of 26,744 movies by 138,493 users. All ratings are contained in the file "ratings.csv". Each line of this file after the header row represents one rating of one movie by one user, and has the following format:
```text
userId,movieId,rating,timestamp
```
- The lines within this file are ordered first by userId, then, within user, by movieId.
- Ratings are made on a 5-star scale, with half-star increments (0.5 stars - 5.0 stars).
In both datasets, the timestamp is represented in seconds since midnight Coordinated Universal Time (UTC) of January 1, 1970. Each user has at least 20 ratings.
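For concreteness, a small parsing sketch covering both formats (the file paths are assumptions):

```python
import csv

def load_ml1m(path="ml-1m/ratings.dat"):
    """ml-1m: UserID::MovieID::Rating::Timestamp, no header row."""
    with open(path, encoding="latin-1") as f:
        for line in f:
            user, movie, rating, ts = line.strip().split("::")
            yield int(user), int(movie), float(rating), int(ts)

def load_ml20m(path="ml-20m/ratings.csv"):
    """ml-20m: userId,movieId,rating,timestamp, with a header row."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yield (int(row["userId"]), int(row["movieId"]),
                   float(row["rating"]), int(row["timestamp"]))
```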
## Mixed Precision
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching ‘reduce precision’.
# [Environment Requirements](#contents)
- Hardware(Ascend/GPU)
```bash
sh run_eval.sh
```
# [Script Description](#contents)
## [Script and Sample Code](#contents)
```text
├── ModelZoo_NCF_ME
├── README.md // descriptions about NCF
├── scripts
```text
HR:0.6846,NDCG:0.410
```
# [Model Description](#contents)
## [Performance](#contents)
### Evaluation Performance
| Speed | 1pc: 0.575 ms/step |
| Total time | 1pc: 5 mins |
### Inference Performance
| Parameters | Ascend |
| Accuracy | HR:0.6846,NDCG:0.410 |
## [How to use](#contents)
### Inference
If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/migrate_3rd_scripts.html). Below is a simple example of the steps:
### Mixed Precision
The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching ‘reduce precision’.
## [Environment Requirements](#contents)