diff --git a/model_zoo/official/cv/alexnet/README.md b/model_zoo/official/cv/alexnet/README.md index 0e27ed8ec0..6ec30375be 100644 --- a/model_zoo/official/cv/alexnet/README.md +++ b/model_zoo/official/cv/alexnet/README.md @@ -17,47 +17,46 @@ - [Evaluation Performance](#evaluation-performance) - [ModelZoo Homepage](#modelzoo-homepage) - -# [AlexNet Description](#contents) +## [AlexNet Description](#contents) AlexNet was proposed in 2012, one of the most influential neural networks. It got big success in ImageNet Dataset recognition than other models. [Paper](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf): Krizhevsky A, Sutskever I, Hinton G E. ImageNet Classification with Deep ConvolutionalNeural Networks. *Advances In Neural Information Processing Systems*. 2012. -# [Model Architecture](#contents) +## [Model Architecture](#contents) AlexNet composition consists of 5 convolutional layers and 3 fully connected layers. Multiple convolutional kernels can extract interesting features in images and get more accurate classification. -# [Dataset](#contents) +## [Dataset](#contents) Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below. Dataset used: [CIFAR-10]() - Dataset size:175M,60,000 32*32 colorful images in 10 classes - - Train:146M,50,000 images - - Test:29.3M,10,000 images + - Train:146M,50,000 images + - Test:29.3M,10,000 images - Data format:binary files - - Note:Data will be processed in dataset.py + - Note:Data will be processed in dataset.py - Download the dataset, the directory structure is as follows: -``` +```bash ├─cifar-10-batches-bin │ └─cifar-10-verify-bin ``` -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) - Hardware(Ascend/GPU) - - Prepare hardware environment with Ascend or GPU processor. + - Prepare hardware environment with Ascend or GPU processor. 
- Framework - - [MindSpore](https://www.mindspore.cn/install/en) + - [MindSpore](https://www.mindspore.cn/install/en) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) -# [Quick Start](#contents) +## [Quick Start](#contents) After installing MindSpore via the official website, you can start training and evaluation as follows: @@ -68,11 +67,11 @@ sh run_standalone_train_ascend.sh [DATA_PATH] [CKPT_SAVE_PATH] sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME] ``` -# [Script Description](#contents) +## [Script Description](#contents) -## [Script and Sample Code](#contents) +### [Script and Sample Code](#contents) -``` +```bash ├── cv ├── alexnet ├── README.md // descriptions about alexnet @@ -90,7 +89,7 @@ sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME] ├── eval.py // evaluation script ``` -## [Script Parameters](#contents) +### [Script Parameters](#contents) ```python Major parameters in train.py and config.py as follows: @@ -105,13 +104,13 @@ Major parameters in train.py and config.py as follows: --data_path: Path where the dataset is saved ``` -## [Training Process](#contents) +### [Training Process](#contents) -### Training +#### Training - running on Ascend - ``` + ```bash python train.py --data_path cifar-10-batches-bin --ckpt_path ckpt > log 2>&1 & # or enter script dir, and run the script sh run_standalone_train_ascend.sh cifar-10-batches-bin ckpt @@ -119,7 +118,7 @@ Major parameters in train.py and config.py as follows: After training, the loss value will be achieved as follows: - ``` + ```bash # grep "loss is " log epoch: 1 step: 1, loss is 2.2791853 ... @@ -133,7 +132,7 @@ Major parameters in train.py and config.py as follows: - running on GPU - ``` + ```bash python train.py --device_target "GPU" --data_path cifar-10-batches-bin --ckpt_path ckpt > log 2>&1 & # or enter script dir, and run the script sh run_standalone_train_for_gpu.sh cifar-10-batches-bin ckpt @@ -141,7 +140,7 @@ Major parameters in train.py and config.py as follows: After training, the loss value will be achieved as follows: - ``` + ```bash # grep "loss is " log epoch: 1 step: 1, loss is 2.3125906 ... @@ -150,16 +149,15 @@ Major parameters in train.py and config.py as follows: epoch: 30 step: 1561, loss is 0.103845775 ``` +### [Evaluation Process](#contents) -## [Evaluation Process](#contents) - -### Evaluation +#### Evaluation Before running the command below, please check the checkpoint path used for evaluation. - running on Ascend - ``` + ```bash python eval.py --data_path cifar-10-verify-bin --ckpt_path ckpt/checkpoint_alexnet-1_1562.ckpt > eval_log.txt 2>&1 & # or enter script dir, and run the script sh run_standalone_eval_ascend.sh cifar-10-verify-bin ckpt/checkpoint_alexnet-1_1562.ckpt @@ -167,14 +165,14 @@ Before running the command below, please check the checkpoint path used for eval You can view the results through the file "eval_log". 
The accuracy of the test dataset will be as follows: - ``` + ```bash # grep "Accuracy: " eval_log 'Accuracy': 0.8832 ``` - running on GPU - ``` + ```bash python eval.py --device_target "GPU" --data_path cifar-10-verify-bin --ckpt_path ckpt/checkpoint_alexnet-30_1562.ckpt > eval_log 2>&1 & # or enter script dir, and run the script sh run_standalone_eval_for_gpu.sh cifar-10-verify-bin ckpt/checkpoint_alexnet-30_1562.ckpt @@ -182,16 +180,16 @@ Before running the command below, please check the checkpoint path used for eval You can view the results through the file "eval_log". The accuracy of the test dataset will be as follows: - ``` + ```bash # grep "Accuracy: " eval_log 'Accuracy': 0.88512 ``` -# [Model Description](#contents) +## [Model Description](#contents) -## [Performance](#contents) +### [Performance](#contents) -### Evaluation Performance +#### Evaluation Performance | Parameters | Ascend | GPU | | -------------------------- | ------------------------------------------------------------| -------------------------------------------------| @@ -207,11 +205,12 @@ Before running the command below, please check the checkpoint path used for eval | Speed | 7.3 ms/step | 16.8 ms/step | | Total time | 6 mins | 14 mins | | Checkpoint for Fine tuning | 445M (.ckpt file) | 445M (.ckpt file) | -| Scripts | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/alexnet | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/alexnet | +| Scripts | [AlexNet Script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/alexnet) | [AlexNet Script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/alexnet) | -# [Description of Random Situation](#contents) +## [Description of Random Situation](#contents) In dataset.py, we set the seed inside ```create_dataset``` function. -# [ModelZoo Homepage](#contents) - Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). +## [ModelZoo Homepage](#contents) + +Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). diff --git a/model_zoo/official/cv/alexnet/README_CN.md b/model_zoo/official/cv/alexnet/README_CN.md index f823afb753..2e4342ee17 100644 --- a/model_zoo/official/cv/alexnet/README_CN.md +++ b/model_zoo/official/cv/alexnet/README_CN.md @@ -23,44 +23,44 @@ -# AlexNet描述 +## AlexNet描述 AlexNet是2012年提出的最有影响力的神经网络之一。该网络在ImageNet数据集识别方面取得了显着的成功。 [论文](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-concumulational-neural-networks.pdf): Krizhevsky A, Sutskever I, Hinton G E. ImageNet Classification with Deep ConvolutionalNeural Networks. *Advances In Neural Information Processing Systems*. 2012. 
-# 模型架构 +## 模型架构 AlexNet由5个卷积层和3个全连接层组成。多个卷积核用于提取图像中有趣的特征,从而得到更精确的分类。 -# 数据集 +## 数据集 使用的数据集:[CIFAR-10]() - 数据集大小:175M,共10个类、60,000个32*32彩色图像 - - 训练集:146M,50,000个图像 - - 测试集:29.3M,10,000个图像 + - 训练集:146M,50,000个图像 + - 测试集:29.3M,10,000个图像 - 数据格式:二进制文件 - - 注意:数据在dataset.py中处理。 + - 注意:数据在dataset.py中处理。 - 下载数据集。目录结构如下: -``` +```bash ├─cifar-10-batches-bin │ └─cifar-10-verify-bin ``` -# 环境要求 +## 环境要求 - 硬件(Ascend/GPU) - - 准备Ascend或GPU处理器搭建硬件环境。 + - 准备Ascend或GPU处理器搭建硬件环境。 - 框架 - - [MindSpore](https://www.mindspore.cn/install) + - [MindSpore](https://www.mindspore.cn/install) - 如需查看详情,请参见如下资源: - - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) + - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) -# 快速入门 +## 快速入门 通过官方网站安装MindSpore后,您可以按照如下步骤进行训练和评估: @@ -71,11 +71,11 @@ sh run_standalone_train_ascend.sh [DATA_PATH] [CKPT_SAVE_PATH] sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME] ``` -# 脚本说明 +## 脚本说明 -## 脚本及样例代码 +### 脚本及样例代码 -``` +```bash ├── cv ├── alexnet ├── README.md // AlexNet相关说明 @@ -93,7 +93,7 @@ sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME] ├── eval.py // 评估脚本 ``` -## 脚本参数 +### 脚本参数 ```python train.py和config.py中主要参数如下: @@ -108,13 +108,13 @@ train.py和config.py中主要参数如下: --data_path:数据集所在路径 ``` -## 训练过程 +### 训练过程 -### 训练 +#### 训练 - Ascend处理器环境运行 - ``` + ```bash python train.py --data_path cifar-10-batches-bin --ckpt_path ckpt > log 2>&1 & # 或进入脚本目录,执行脚本 sh run_standalone_train_ascend.sh cifar-10-batches-bin ckpt @@ -122,7 +122,7 @@ train.py和config.py中主要参数如下: 经过训练后,损失值如下: - ``` + ```bash # grep "loss is " log epoch: 1 step: 1, loss is 2.2791853 ... @@ -136,7 +136,7 @@ train.py和config.py中主要参数如下: - GPU环境运行 - ``` + ```bash python train.py --device_target "GPU" --data_path cifar-10-batches-bin --ckpt_path ckpt > log 2>&1 & # 或进入脚本目录,执行脚本 sh run_standalone_train_for_gpu.sh cifar-10-batches-bin ckpt @@ -144,7 +144,7 @@ train.py和config.py中主要参数如下: 经过训练后,损失值如下: - ``` + ```bash # grep "loss is " log epoch: 1 step: 1, loss is 2.3125906 ... 
@@ -153,15 +153,15 @@ train.py和config.py中主要参数如下: epoch: 30 step: 1561, loss is 0.103845775 ``` -## 评估过程 +### 评估过程 -### 评估 +#### 评估 在运行以下命令之前,请检查用于评估的检查点路径。 - Ascend处理器环境运行 - ``` + ```bash python eval.py --data_path cifar-10-verify-bin --ckpt_path ckpt/checkpoint_alexnet-1_1562.ckpt > eval_log.txt 2>&1 & #或进入脚本目录,执行脚本 sh run_standalone_eval_ascend.sh cifar-10-verify-bin ckpt/checkpoint_alexnet-1_1562.ckpt @@ -169,14 +169,14 @@ train.py和config.py中主要参数如下: 可通过"eval_log”文件查看结果。测试数据集的准确率如下: - ``` + ```bash # grep "Accuracy: " eval_log 'Accuracy': 0.8832 ``` - GPU环境运行 - ``` + ```bash python eval.py --device_target "GPU" --data_path cifar-10-verify-bin --ckpt_path ckpt/checkpoint_alexnet-30_1562.ckpt > eval_log 2>&1 & #或进入脚本目录,执行脚本 sh run_standalone_eval_for_gpu.sh cifar-10-verify-bin ckpt/checkpoint_alexnet-30_1562.ckpt @@ -184,16 +184,16 @@ train.py和config.py中主要参数如下: 可通过"eval_log”文件查看结果。测试数据集的准确率如下: - ``` + ```bash # grep "Accuracy: " eval_log 'Accuracy': 0.88512 ``` -# 模型描述 +## 模型描述 -## 性能 +### 性能 -### 评估性能 +#### 评估性能 | 参数 | Ascend | GPU | | -------------------------- | ------------------------------------------------------------| -------------------------------------------------| @@ -209,11 +209,12 @@ train.py和config.py中主要参数如下: | 速度 | 21毫秒/步 | 16.8毫秒/步 | | 总时间 | 17分钟 | 14分钟| | 微调检查点 | 445M (.ckpt文件) | 445M (.ckpt文件) | -| 脚本 | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/alexnet | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/alexnet | +| 脚本 | [AlexNet脚本](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/alexnet) | [AlexNet脚本](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/alexnet) | -# 随机情况说明 +## 随机情况说明 dataset.py中设置了“create_dataset”函数内的种子。 -# ModelZoo主页 - 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 +## ModelZoo主页 + +请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 diff --git a/model_zoo/official/cv/lenet/README.md b/model_zoo/official/cv/lenet/README.md index 3ad0d84436..2043f3f65a 100644 --- a/model_zoo/official/cv/lenet/README.md +++ b/model_zoo/official/cv/lenet/README.md @@ -4,7 +4,7 @@ - [Model Architecture](#model-architecture) - [Dataset](#dataset) - [Environment Requirements](#environment-requirements) -- [Quick Start](#quick-start) +- [Quick Start](#quick-start) - [Script Description](#script-description) - [Script and Sample Code](#script-and-sample-code) - [Script Parameters](#script-parameters) @@ -17,32 +17,31 @@ - [Evaluation Performance](#evaluation-performance) - [ModelZoo Homepage](#modelzoo-homepage) +## [LeNet Description](#contents) -# [LeNet Description](#contents) - -LeNet was proposed in 1998, a typical convolutional neural network. It was used for digit recognition and got big success. +LeNet was proposed in 1998, a typical convolutional neural network. It was used for digit recognition and got big success. [Paper](https://ieeexplore.ieee.org/document/726791): Y.Lecun, L.Bottou, Y.Bengio, P.Haffner. Gradient-Based Learning Applied to Document Recognition. *Proceedings of the IEEE*. 1998. -# [Model Architecture](#contents) +## [Model Architecture](#contents) LeNet is very simple, which contains 5 layers. The layer composition consists of 2 convolutional layers and 3 fully connected layers. -# [Dataset](#contents) +## [Dataset](#contents) Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. 
In the following sections, we will introduce how to run the scripts using the related dataset below. -Dataset used: [MNIST]() +Dataset used: [MNIST]() - Dataset size:52.4M,60,000 28*28 in 10 classes - - Train:60,000 images - - Test:10,000 images + - Train:60,000 images + - Test:10,000 images - Data format:binary files - - Note:Data will be processed in dataset.py + - Note:Data will be processed in dataset.py - The directory structure is as follows: -``` +```bash └─Data ├─test │ t10k-images.idx3-ubyte @@ -53,19 +52,19 @@ Dataset used: [MNIST]() train-labels.idx1-ubyte ``` -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) - Hardware(Ascend/GPU/CPU) - - Prepare hardware environment with Ascend, GPU, or CPU processor. + - Prepare hardware environment with Ascend, GPU, or CPU processor. - Framework - - [MindSpore](https://www.mindspore.cn/install/en) + - [MindSpore](https://www.mindspore.cn/install/en) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) -# [Quick Start](#contents) +## [Quick Start](#contents) -After installing MindSpore via the official website, you can start training and evaluation as follows: +After installing MindSpore via the official website, you can start training and evaluation as follows: ```python # enter script dir, train LeNet @@ -74,28 +73,28 @@ sh run_standalone_train_ascend.sh [DATA_PATH] [CKPT_SAVE_PATH] sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME] ``` -# [Script Description](#contents) +## [Script Description](#contents) -## [Script and Sample Code](#contents) +### [Script and Sample Code](#contents) -``` +```bash ├── cv - ├── lenet + ├── lenet ├── README.md // descriptions about lenet ├── requirements.txt // package needed - ├── scripts - │ ├──run_standalone_train_cpu.sh // train in cpu - │ ├──run_standalone_train_gpu.sh // train in gpu - │ ├──run_standalone_train_ascend.sh // train in ascend - │ ├──run_standalone_eval_cpu.sh // evaluate in cpu - │ ├──run_standalone_eval_gpu.sh // evaluate in gpu - │ ├──run_standalone_eval_ascend.sh // evaluate in ascend - ├── src + ├── scripts + │ ├──run_standalone_train_cpu.sh // train in cpu + │ ├──run_standalone_train_gpu.sh // train in gpu + │ ├──run_standalone_train_ascend.sh // train in ascend + │ ├──run_standalone_eval_cpu.sh // evaluate in cpu + │ ├──run_standalone_eval_gpu.sh // evaluate in gpu + │ ├──run_standalone_eval_ascend.sh // evaluate in ascend + ├── src │ ├──dataset.py // creating dataset │ ├──lenet.py // lenet architecture - │ ├──config.py // parameter configuration - ├── train.py // training script - ├── eval.py // evaluation script + │ ├──config.py // parameter configuration + ├── train.py // training script + ├── eval.py // evaluation script ``` ## [Script Parameters](#contents) @@ -103,23 +102,23 @@ sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME] ```python Major parameters in train.py and config.py as follows: ---data_path: The absolute full path to the train and evaluation datasets. ---epoch_size: Total training epochs. ---batch_size: Training batch size. +--data_path: The absolute full path to the train and evaluation datasets. +--epoch_size: Total training epochs. 
+--batch_size: Training batch size.
 --image_height: Image height used as input to the model.
---image_width: Image width used as input the model. 
+--image_width: Image width used as input to the model.
 --device_target: Device where the code will be implemented. Optional values
- are "Ascend", "GPU", "CPU". 
+ are "Ascend", "GPU", "CPU".
 --checkpoint_path: The absolute full path to the checkpoint file saved after training.
---data_path: Path where the dataset is saved 
+--data_path: Path where the dataset is saved
 ```
 ## [Training Process](#contents)
-### Training 
+### Training
-```
+```bash
 python train.py --data_path Data --ckpt_path ckpt > log.txt 2>&1 &
 # or enter script dir, and run the script
 sh run_standalone_train_ascend.sh Data ckpt
@@ -127,7 +126,7 @@ sh run_standalone_train_ascend.sh Data ckpt
 After training, the loss value will be achieved as follows:
-```
+```bash
 # grep "loss is " log.txt
 epoch: 1 step: 1, loss is 2.2791853
 ...
 epoch: 1 step: 1538, loss is 1.0221305
 ...
 ```
-The model checkpoint will be saved in the current directory. 
+The model checkpoint will be saved in the current directory.
 ## [Evaluation Process](#contents)
@@ -145,7 +144,7 @@ The model checkpoint will be saved in the current directory.
 Before running the command below, please check the checkpoint path used for evaluation.
-```
+```bash
 python eval.py --data_path Data --ckpt_path ckpt/checkpoint_lenet-1_1875.ckpt > log.txt 2>&1 &
 # or enter script dir, and run the script
 sh run_standalone_eval_ascend.sh Data ckpt/checkpoint_lenet-1_1875.ckpt
@@ -153,16 +152,16 @@ sh run_standalone_eval_ascend.sh Data ckpt/checkpoint_lenet-1_1875.ckpt
 You can view the results through the file "log.txt". The accuracy of the test dataset will be as follows:
-```
+```bash
 # grep "Accuracy: " log.txt
-'Accuracy': 0.9842 
+'Accuracy': 0.9842
 ```
-# [Model Description](#contents)
+## [Model Description](#contents)
-## [Performance](#contents)
+### [Performance](#contents)
-### Evaluation Performance
+#### Evaluation Performance
 | Parameters | LeNet |
 | -------------------------- | ----------------------------------------------------------- |
@@ -178,11 +177,12 @@ You can view the results through the file "log.txt". The accuracy of the test da
 | Speed | 1.071 ms/step |
 | Total time | 32.1s | |
 | Checkpoint for Fine tuning | 482k (.ckpt file) |
-| Scripts | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/lenet |
+| Scripts | [LeNet Script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/lenet) |
-# [Description of Random Situation](#contents)
+## [Description of Random Situation](#contents)
 In dataset.py, we set the seed inside ```create_dataset``` function.
-# [ModelZoo Homepage](#contents)
- Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
+## [ModelZoo Homepage](#contents)
+
+Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
diff --git a/model_zoo/official/cv/lenet/README_CN.md b/model_zoo/official/cv/lenet/README_CN.md
index c82a90ff41..a18d133149 100644
--- a/model_zoo/official/cv/lenet/README_CN.md
+++ b/model_zoo/official/cv/lenet/README_CN.md
@@ -1,4 +1,5 @@
 # 目录
+
 - [目录](#目录)
@@ -22,29 +23,29 @@
-# LeNet描述
+## LeNet描述
 LeNet是1998年提出的一种典型的卷积神经网络。它被用于数字识别并取得了巨大的成功。
 [论文](https://ieeexplore.ieee.org/document/726791): Y.Lecun, L.Bottou, Y.Bengio, P.Haffner.Gradient-Based Learning Applied to Document Recognition.*Proceedings of the IEEE*.1998.
-# 模型架构 +## 模型架构 LeNet非常简单,包含5层,由2个卷积层和3个全连接层组成。 -# 数据集 +## 数据集 -使用的数据集:[MNIST]() +使用的数据集:[MNIST]() - 数据集大小:52.4M,共10个类,6万张 28*28图像 - - 训练集:6万张图像 - - 测试集:5万张图像 + - 训练集:6万张图像 + - 测试集:5万张图像 - 数据格式:二进制文件 - - 注:数据在dataset.py中处理。 + - 注:数据在dataset.py中处理。 - 目录结构如下: -``` +```bash └─Data ├─test │ t10k-images.idx3-ubyte @@ -55,19 +56,19 @@ LeNet非常简单,包含5层,由2个卷积层和3个全连接层组成。 train-labels.idx1-ubyte ``` -# 环境要求 +## 环境要求 - 硬件(Ascend/GPU/CPU) - - 使用Ascend、GPU或CPU处理器来搭建硬件环境。 + - 使用Ascend、GPU或CPU处理器来搭建硬件环境。 - 框架 - - [MindSpore](https://www.mindspore.cn/install/en) + - [MindSpore](https://www.mindspore.cn/install/en) - 如需查看详情,请参见如下资源: - - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) + - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) -# 快速入门 +## 快速入门 -通过官方网站安装MindSpore后,您可以按照如下步骤进行训练和评估: +通过官方网站安装MindSpore后,您可以按照如下步骤进行训练和评估: ```python # 进入脚本目录,训练LeNet @@ -76,11 +77,11 @@ sh run_standalone_train_ascend.sh [DATA_PATH] [CKPT_SAVE_PATH] sh run_standalone_eval_ascend.sh [DATA_PATH] [CKPT_NAME] ``` -# 脚本说明 +## 脚本说明 -## 脚本及样例代码 +### 脚本及样例代码 -``` +```bash ├── cv ├── lenet ├── README.md // Lenet描述 @@ -119,7 +120,7 @@ train.py和config.py中主要参数如下: ### 训练 -``` +```bash python train.py --data_path Data --ckpt_path ckpt > log.txt 2>&1 & # or enter script dir, and run the script sh run_standalone_train_ascend.sh Data ckpt @@ -127,7 +128,7 @@ sh run_standalone_train_ascend.sh Data ckpt 训练结束,损失值如下: -``` +```bash # grep "loss is " log.txt epoch:1 step:1, loss is 2.2791853 ... @@ -145,7 +146,7 @@ epoch:1 step:1538, loss is 1.0221305 在运行以下命令之前,请检查用于评估的检查点路径。 -``` +```bash python eval.py --data_path Data --ckpt_path ckpt/checkpoint_lenet-1_1875.ckpt > log.txt 2>&1 & # or enter script dir, and run the script sh run_standalone_eval_ascend.sh Data ckpt/checkpoint_lenet-1_1875.ckpt @@ -153,12 +154,12 @@ sh run_standalone_eval_ascend.sh Data ckpt/checkpoint_lenet-1_1875.ckpt 您可以通过log.txt文件查看结果。测试数据集的准确性如下: -``` +```bash # grep "Accuracy:" log.txt 'Accuracy':0.9842 ``` -# 模型描述 +## 模型描述 ## 性能 @@ -178,11 +179,12 @@ sh run_standalone_eval_ascend.sh Data ckpt/checkpoint_lenet-1_1875.ckpt | 速度 | 1.70毫秒/步 | | 总时长 | 43.1秒 | | | 微调检查点 | 482k (.ckpt文件) | -| 脚本 | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/lenet | +| 脚本 | [LeNet脚本](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/lenet) | -# 随机情况说明 +## 随机情况说明 在dataset.py中,我们设置了“create_dataset”函数内的种子。 -# ModelZoo主页 - 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 +## ModelZoo主页 + +请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 diff --git a/model_zoo/official/cv/lenet_quant/Readme.md b/model_zoo/official/cv/lenet_quant/Readme.md index c6f8d2c0fb..f933c8e3fb 100644 --- a/model_zoo/official/cv/lenet_quant/Readme.md +++ b/model_zoo/official/cv/lenet_quant/Readme.md @@ -17,8 +17,7 @@ - [Evaluation Performance](#evaluation-performance) - [ModelZoo Homepage](#modelzoo-homepage) - -# [LeNet Description](#contents) +## [LeNet Description](#contents) LeNet was proposed in 1998, a typical convolutional neural network. It was used for digit recognition and got big success. @@ -26,23 +25,23 @@ LeNet was proposed in 1998, a typical convolutional neural network. It was used This is the quantitative network of LeNet. 
-# [Model Architecture](#contents) +## [Model Architecture](#contents) LeNet is very simple, which contains 5 layers. The layer composition consists of 2 convolutional layers and 3 fully connected layers. -# [Dataset](#contents) +## [Dataset](#contents) Dataset used: [MNIST]() - Dataset size 52.4M 60,000 28*28 in 10 classes - - Train 60,000 images - - Test 10,000 images + - Train 60,000 images + - Test 10,000 images - Data format binary files - - Note Data will be processed in dataset.py + - Note Data will be processed in dataset.py - The directory structure is as follows: -``` +```bash └─Data ├─test │ t10k-images.idx3-ubyte @@ -53,17 +52,17 @@ Dataset used: [MNIST]() train-labels.idx1-ubyte ``` -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) - Hardware:Ascend - - Prepare hardware environment with Ascend + - Prepare hardware environment with Ascend - Framework - - [MindSpore](https://www.mindspore.cn/install/en) + - [MindSpore](https://www.mindspore.cn/install/en) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) -# [Quick Start](#contents) +## [Quick Start](#contents) After installing MindSpore via the official website, you can start training and evaluation as follows: @@ -76,11 +75,11 @@ python train.py --device_target=Ascend --data_path=[DATA_PATH] --ckpt_path=[CKPT python eval.py --device_target=Ascend --data_path=[DATA_PATH] --ckpt_path=[CKPT_PATH] --dataset_sink_mode=True ``` -# [Script Description](#contents) +## [Script Description](#contents) ## [Script and Sample Code](#contents) -``` +```bash ├── model_zoo ├── README.md // descriptions about all the models ├── lenet_quant @@ -117,13 +116,13 @@ Major parameters in train.py and config.py as follows: ### Training -``` +```bash python train.py --device_target=Ascend --dataset_path=/home/datasets/MNIST --dataset_sink_mode=True > log.txt 2>&1 & ``` After training, the loss value will be achieved as follows: -``` +```bash # grep "Epoch " log.txt Epoch: [ 1/ 10], step: [ 937/ 937], loss: [0.0081], avg loss: [0.0081], time: [11268.6832ms] Epoch time: 11269.352, per step time: 12.027, avg loss: 0.008 @@ -142,22 +141,22 @@ The model checkpoint will be saved in the current directory. Before running the command below, please check the checkpoint path used for evaluation. -``` +```bash python eval.py --data_path Data --ckpt_path ckpt/checkpoint_lenet-1_937.ckpt > log.txt 2>&1 & ``` You can view the results through the file "log.txt". The accuracy of the test dataset will be as follows: -``` +```bash # grep "Accuracy: " log.txt 'Accuracy': 0.9842 ``` -# [Model Description](#contents) +## [Model Description](#contents) -## [Performance](#contents) +### [Performance](#contents) -### Evaluation Performance +#### Evaluation Performance | Parameters | LeNet | | -------------------------- | ----------------------------------------------------------- | @@ -175,9 +174,10 @@ You can view the results through the file "log.txt". 
The accuracy of the test da | Checkpoint for Fine tuning | 482k (.ckpt file) | | Scripts | [scripts](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/lenet) | -# [Description of Random Situation](#contents) +## [Description of Random Situation](#contents) In dataset.py, we set the seed inside “create_dataset" function. -# [ModelZoo Homepage](#contents) - Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). +## [ModelZoo Homepage](#contents) + +Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). diff --git a/model_zoo/official/cv/lenet_quant/Readme_CN.md b/model_zoo/official/cv/lenet_quant/Readme_CN.md index e0ed1fb097..12cb28fea4 100644 --- a/model_zoo/official/cv/lenet_quant/Readme_CN.md +++ b/model_zoo/official/cv/lenet_quant/Readme_CN.md @@ -1,4 +1,5 @@ # 目录 + - [目录](#目录) @@ -22,7 +23,7 @@ -# LeNet描述 +## LeNet描述 LeNet是1998年提出的一种典型的卷积神经网络。它被用于数字识别并取得了巨大的成功。 @@ -30,23 +31,23 @@ LeNet是1998年提出的一种典型的卷积神经网络。它被用于数字 这是LeNet的量化网络。 -# 模型架构 +## 模型架构 LeNet非常简单,包含5层,由2个卷积层和3个全连接层组成。 -# 数据集 +## 数据集 使用的数据集:[MNIST]() - 数据集大小:52.4M,共10个类,6万张 28*28图像 - - 训练集:6万张图像 - - 测试集:1万张图像 + - 训练集:6万张图像 + - 测试集:1万张图像 - 数据格式:二进制文件 - - 注:数据在dataset.py中处理。 + - 注:数据在dataset.py中处理。 - 目录结构如下: -``` +```bash └─Data ├─test │ t10k-images.idx3-ubyte @@ -57,17 +58,17 @@ LeNet非常简单,包含5层,由2个卷积层和3个全连接层组成。 train-labels.idx1-ubyte ``` -# 环境要求 +## 环境要求 - 硬件:Ascend - - 使用Ascend搭建硬件环境 + - 使用Ascend搭建硬件环境 - 框架 - - [MindSpore](https://www.mindspore.cn/install/en) + - [MindSpore](https://www.mindspore.cn/install/en) - 如需查看详情,请参见如下资源: - - [MindSpore教程](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) + - [MindSpore教程](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) -# 快速入门 +## 快速入门 通过官方网站安装MindSpore后,您可以按照如下步骤进行训练和评估: @@ -80,11 +81,11 @@ python train.py --device_target=Ascend --data_path=[DATA_PATH] --ckpt_path=[CKPT python eval.py --device_target=Ascend --data_path=[DATA_PATH] --ckpt_path=[CKPT_PATH] --dataset_sink_mode=True ``` -# 脚本说明 +## 脚本说明 -## 脚本及样例代码 +### 脚本及样例代码 -``` +```bash ├── model_zoo ├── README.md // 所有型号的描述 ├── lenet_quant @@ -100,7 +101,7 @@ python eval.py --device_target=Ascend --data_path=[DATA_PATH] --ckpt_path=[CKPT_ ├── eval.py // 使用Ascend评估LeNet-Quant网络d ``` -## 脚本参数 +### 脚本参数 ```python train.py和config.py中主要参数如下: @@ -115,17 +116,17 @@ train.py和config.py中主要参数如下: --data_path:数据集所在路径 ``` -## 训练过程 +### 训练过程 -### 训练 +#### 训练 -``` +```bash python train.py --device_target=Ascend --dataset_path=/home/datasets/MNIST --dataset_sink_mode=True > log.txt 2>&1 & ``` 训练结束,损失值如下: -``` +```bash # grep "Epoch " log.txt Epoch:[ 1/ 10], step:[ 937/ 937], loss:[0.0081], avg loss:[0.0081], time:[11268.6832ms] Epoch time:11269.352, per step time:12.027, avg loss:0.008 @@ -138,28 +139,28 @@ Epoch:[ 3/ 10], step:[ 937/ 937], loss:[0.0017], avg loss:[0.0017], time:[3085.3 模型检查点保存在当前目录下。 -## 评估过程 +### 评估过程 -### 评估 +#### 评估 在运行以下命令之前,请检查用于评估的检查点路径。 -``` +```bash python eval.py --data_path Data --ckpt_path ckpt/checkpoint_lenet-1_937.ckpt > log.txt 2>&1 & ``` 您可以通过log.txt文件查看结果。测试数据集的准确性如下: -``` +```bash # grep "Accuracy:" log.txt 'Accuracy':0.9842 ``` -# 模型描述 +## 模型描述 -## 性能 +### 性能 -### 评估性能 +#### 评估性能 | 参数 | LeNet | | -------------------------- | ----------------------------------------------------------- | @@ -177,10 
+178,10 @@ python eval.py --data_path Data --ckpt_path ckpt/checkpoint_lenet-1_937.ckpt > l | 微调检查点 | 482k (.ckpt文件) | | 脚本 | [脚本](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/lenet) | -# 随机情况说明 +## 随机情况说明 在dataset.py中,我们设置了“create_dataset”函数内的种子。 -# ModelZoo主页 - 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 +## ModelZoo主页 +请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 diff --git a/model_zoo/official/cv/vgg16/README.md b/model_zoo/official/cv/vgg16/README.md index 1b003b2c50..03d86b4a05 100644 --- a/model_zoo/official/cv/vgg16/README.md +++ b/model_zoo/official/cv/vgg16/README.md @@ -22,34 +22,35 @@ - [Description of Random Situation](#description-of-random-situation) - [ModelZoo Homepage](#modelzoo-homepage) - -# [VGG Description](#contents) +## [VGG Description](#contents) VGG, a very deep convolutional networks for large-scale image recognition, was proposed in 2014 and won the 1th place in object localization and 2th place in image classification task in ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). -[Paper](): Simonyan K, zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J]. arXiv preprint arXiv:1409.1556, 2014. +[Paper](https://arxiv.org/abs/1409.1556): Simonyan K, zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J]. arXiv preprint arXiv:1409.1556, 2014. + +## [Model Architecture](#contents) -# [Model Architecture](#contents) VGG 16 network is mainly consisted by several basic modules (including convolution and pooling layer) and three continuous Dense layer. here basic modules mainly include basic operation like: **3×3 conv** and **2×2 max pooling**. +## [Dataset](#contents) -# [Dataset](#contents) Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below. -#### Dataset used: [CIFAR-10]() +### Dataset used: [CIFAR-10]() - CIFAR-10 Dataset size:175M,60,000 32*32 colorful images in 10 classes - Train:146M,50,000 images - Test:29.3M,10,000 images - - Data format: binary files + - Data format: binary files - Note: Data will be processed in src/dataset.py -#### Dataset used: [ImageNet2012](http://www.image-net.org/) +### Dataset used: [ImageNet2012](http://www.image-net.org/) + - Dataset size: ~146G, 1.28 million colorful images in 1000 classes - Train: 140G, 1,281,167 images - Test: 6.4G, 50, 000 images - - Data format: RGB images + - Data format: RGB images - Note: Data will be processed in src/dataset.py #### Dataset organize way @@ -57,7 +58,8 @@ Note that you can run the scripts based on the dataset mentioned in original pap CIFAR-10 > Unzip the CIFAR-10 dataset to any path you want and the folder structure should be as follows: - > ``` + > + > ```bash > . > ├── cifar-10-batches-bin # train dataset > └── cifar-10-verify-bin # infer dataset @@ -67,39 +69,37 @@ Note that you can run the scripts based on the dataset mentioned in original pap > Unzip the ImageNet2012 dataset to any path you want and the folder should include train and eval dataset as follows: > - > ``` + > ```bash > . 
> └─dataset > ├─ilsvrc # train dataset > └─validation_preprocess # evaluate dataset > ``` +## [Features](#contents) -# [Features](#contents) +### Mixed Precision -## Mixed Precision - -The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware. +The [mixed precision](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/enable_mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware. For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching ‘reduce precision’. - -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) - Hardware(Ascend/GPU) - - Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. + - Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. - Framework - - [MindSpore](https://www.mindspore.cn/install/en) + - [MindSpore](https://www.mindspore.cn/install/en) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) - + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) -# [Quick Start](#contents) +## [Quick Start](#contents) After installing MindSpore via the official website, you can start training and evaluation as follows: - Running on Ascend + ```python # run training example python train.py --data_path=[DATA_PATH] --device_id=[DEVICE_ID] > output.train.log 2>&1 & @@ -110,12 +110,14 @@ sh run_distribute_train.sh [RANL_TABLE_JSON] [DATA_PATH] # run evaluation example python eval.py --data_path=[DATA_PATH] --pre_trained=[PRE_TRAINED] > output.eval.log 2>&1 & ``` + For distributed training, a hccl configuration file with JSON format needs to be created in advance. 
Please follow the instructions in the link below: -https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools + - Running on GPU -``` + +```bash # run training example python train.py --device_target="GPU" --device_id=[DEVICE_ID] --dataset=[DATASET_TYPE] --data_path=[DATA_PATH] > output.train.log 2>&1 & @@ -126,17 +128,16 @@ sh run_distribute_train_gpu.sh [DATA_PATH] python eval.py --device_target="GPU" --device_id=[DEVICE_ID] --dataset=[DATASET_TYPE] --data_path=[DATA_PATH] --pre_trained=[PRE_TRAINED] > output.eval.log 2>&1 & ``` -# [Script Description](#contents) - -## [Script and Sample Code](#contents) +## [Script Description](#contents) +### [Script and Sample Code](#contents) -``` +```bash ├── model_zoo ├── README.md // descriptions about all the models - ├── vgg16 + ├── vgg16 ├── README.md // descriptions about googlenet - ├── scripts + ├── scripts │ ├── run_distribute_train.sh // shell script for distributed training on Ascend │ ├── run_distribute_train_gpu.sh // shell script for distributed training on GPU ├── src @@ -146,7 +147,7 @@ python eval.py --device_target="GPU" --device_id=[DEVICE_ID] --dataset=[DATASET_ │ │ ├── util.py // util function │ │ ├── var_init.py // network parameter init method │ ├── config.py // parameter configuration - │ ├── crossentropy.py // loss caculation + │ ├── crossentropy.py // loss calculation │ ├── dataset.py // creating dataset │ ├── linear_warmup.py // linear leanring rate │ ├── warmup_cosine_annealing_lr.py // consine anealing learning rate @@ -156,10 +157,11 @@ python eval.py --device_target="GPU" --device_id=[DEVICE_ID] --dataset=[DATASET_ ├── eval.py // evaluation script ``` -## [Script Parameters](#contents) +### [Script Parameters](#contents) -### Training -``` +#### Training + +```bash usage: train.py [--device_target TARGET][--data_path DATA_PATH] [--dataset DATASET_TYPE][--is_distributed VALUE] [--device_id DEVICE_ID][--pre_trained PRE_TRAINED] @@ -177,9 +179,9 @@ parameters/options: ``` -### Evaluation +#### Evaluation -``` +```bash usage: eval.py [--device_target TARGET][--data_path DATA_PATH] [--dataset DATASET_TYPE][--pre_trained PRE_TRAINED] [--device_id DEVICE_ID] @@ -192,13 +194,13 @@ parameters/options: --pre_trained the checkpoint file path used to evaluate model. ``` -## [Parameter configuration](#contents) +### [Parameter configuration](#contents) Parameters for both training and evaluation can be set in config.py. - config for vgg16, CIFAR-10 dataset -``` +```bash "num_classes": 10, # dataset class num "lr": 0.01, # learning rate "lr_init": 0.01, # initial learning rate @@ -218,15 +220,15 @@ Parameters for both training and evaluation can be set in config.py. "pad_mode": 'same', # pad mode for conv2d "padding": 0, # padding value for conv2d "has_bias": False, # whether has bias in conv2d -"batch_norm": True, # wether has batch_norm in conv2d +"batch_norm": True, # whether has batch_norm in conv2d "keep_checkpoint_max": 10, # only keep the last keep_checkpoint_max checkpoint "initialize_mode": "XavierUniform", # conv2d init mode -"has_dropout": True # wether using Dropout layer +"has_dropout": True # whether using Dropout layer ``` - config for vgg16, ImageNet2012 dataset -``` +```bash "num_classes": 1000, # dataset class num "lr": 0.01, # learning rate "lr_init": 0.01, # initial learning rate @@ -246,28 +248,31 @@ Parameters for both training and evaluation can be set in config.py. 
"pad_mode": 'pad', # pad mode for conv2d "padding": 1, # padding value for conv2d "has_bias": True, # whether has bias in conv2d -"batch_norm": False, # wether has batch_norm in conv2d +"batch_norm": False, # whether has batch_norm in conv2d "keep_checkpoint_max": 10, # only keep the last keep_checkpoint_max checkpoint "initialize_mode": "KaimingNormal", # conv2d init mode -"has_dropout": True # wether using Dropout layer +"has_dropout": True # whether using Dropout layer ``` -## [Training Process](#contents) +### [Training Process](#contents) -### Training +#### Training -#### Run vgg16 on Ascend +##### Run vgg16 on Ascend - Training using single device(1p), using CIFAR-10 dataset in default + +```bash +python train.py --data_path=your_data_path --device_id=6 > out.train.log 2>&1 & ``` -python train.py --data_path=your_data_path --device_id=6 > out.train.log 2>&1 & -``` + The python command above will run in the background, you can view the results through the file `out.train.log`. After training, you'll get some checkpoint files in specified ckpt_path, default in ./output directory. You will get the loss value as following: -``` + +```bash # grep "loss is " output.train.log epoch: 1 step: 781, loss is 2.093086 epcoh: 2 step: 781, loss is 1.827582 @@ -275,13 +280,16 @@ epcoh: 2 step: 781, loss is 1.827582 ``` - Distributed Training -``` + +```bash sh run_distribute_train.sh rank_table.json your_data_path ``` + The above shell script will run distribute training in the background, you can view the results through the file `train_parallel[X]/log`. You will get the loss value as following: -``` + +```bash # grep "result: " train_parallel*/log train_parallel0/log:epoch: 1 step: 97, loss is 1.9060308 train_parallel0/log:epcoh: 2 step: 97, loss is 1.6003821 @@ -291,37 +299,42 @@ train_parallel1/log:epcoh: 2 step: 97, loss is 1.7133579 ... ... ``` -> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/distributed_training_tutorials.html). +> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/distributed_training_tutorials.html). > **Attention** This will bind the processor cores according to the `device_num` and total processor numbers. If you don't expect to run pretraining with binding processor cores, remove the operations about `taskset` in `scripts/run_distribute_train.sh` -#### Run vgg16 on GPU +##### Run vgg16 on GPU - Training using single device(1p) -``` + +```bash python train.py --device_target="GPU" --dataset="imagenet2012" --is_distributed=0 --data_path=$DATA_PATH > output.train.log 2>&1 & ``` - Distributed Training -``` + +```bash # distributed training(8p) bash scripts/run_distribute_train_gpu.sh /path/ImageNet2012/train" ``` -## [Evaluation Process](#contents) +### [Evaluation Process](#contents) -### Evaluation +#### Evaluation - Do eval as follows, need to specify dataset type as "cifar10" or "imagenet2012" -``` + +```bash # when using cifar10 dataset python eval.py --data_path=your_data_path --dataset="cifar10" --device_target="Ascend" --pre_trained=./*-70-781.ckpt > output.eval.log 2>&1 & # when using imagenet2012 dataset python eval.py --data_path=your_data_path --dataset="imagenet2012" --device_target="GPU" --pre_trained=./*-150-5004.ckpt > output.eval.log 2>&1 & ``` + - The above python command will run in the background, you can view the results through the file `output.eval.log`. 
You will get the accuracy as following: -``` + +```bash # when using cifar10 dataset # grep "result: " output.eval.log result: {'acc': 0.92} @@ -331,11 +344,11 @@ after allreduce eval: top1_correct=36636, tot=50000, acc=73.27% after allreduce eval: top5_correct=45582, tot=50000, acc=91.16% ``` +## [Model Description](#contents) -# [Model Description](#contents) -## [Performance](#contents) +### [Performance](#contents) -### Training Performance +#### Training Performance | Parameters | VGG16(Ascend) | VGG16(GPU) | | -------------------------- | ---------------------------------------------- |------------------------------------| @@ -354,8 +367,7 @@ after allreduce eval: top5_correct=45582, tot=50000, acc=91.16% | Checkpoint for Fine tuning | 1.1G(.ckpt file) |1.1G(.ckpt file) | | Scripts |[vgg16](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/vgg16) | | - -### Evaluation Performance +#### Evaluation Performance | Parameters | VGG16(Ascend) | VGG16(GPU) | ------------------- | --------------------------- |--------------------- @@ -368,9 +380,10 @@ after allreduce eval: top5_correct=45582, tot=50000, acc=91.16% | outputs | probability | probability | | Accuracy | 1pc: 93.4% |1pc: 73.0%; | -# [Description of Random Situation](#contents) +## [Description of Random Situation](#contents) In dataset.py, we set the seed inside “create_dataset" function. We also use random seed in train.py. -# [ModelZoo Homepage](#contents) - Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). +## [ModelZoo Homepage](#contents) + +Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). diff --git a/model_zoo/official/cv/vgg16/README_CN.md b/model_zoo/official/cv/vgg16/README_CN.md index 2a0295fe78..308efbf4d8 100644 --- a/model_zoo/official/cv/vgg16/README_CN.md +++ b/model_zoo/official/cv/vgg16/README_CN.md @@ -31,20 +31,20 @@ -# VGG描述 +## VGG描述 于2014年提出的VGG是用于大规模图像识别的非常深的卷积网络。它在ImageNet大型视觉识别大赛2014(ILSVRC14)中获得了目标定位第一名和图像分类第二名。 [论文](https://arxiv.org/abs/1409.1556): Simonyan K, zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J]. arXiv preprint arXiv:1409.1556, 2014. 
-# 模型架构 +## 模型架构 VGG 16网络主要由几个基本模块(包括卷积层和池化层)和三个连续密集层组成。 这里的基本模块主要包括以下基本操作: **3×3卷积**和**2×2最大池化**。 -# 数据集 +## 数据集 -## 使用的数据集:[CIFAR-10]() +### 使用的数据集:[CIFAR-10]() - CIFAR-10数据集大小:175 MB,共10个类、60,000张32*32彩色图像 - 训练集:146 MB,50,000张图像 @@ -52,7 +52,7 @@ VGG 16网络主要由几个基本模块(包括卷积层和池化层)和三 - 数据格式:二进制文件 - 注:数据在src/dataset.py中处理。 -## 使用的数据集:[ImageNet2012](http://www.image-net.org/) +### 使用的数据集:[ImageNet2012](http://www.image-net.org/) - 数据集大小:约146 GB,共1000个类、128万张彩色图像 - 训练集:140 GB,1,281,167张图像 @@ -60,7 +60,7 @@ VGG 16网络主要由几个基本模块(包括卷积层和池化层)和三 - 数据格式:RGB图像。 - 注:数据在src/dataset.py中处理。 -## 数据集组织方式 +### 数据集组织方式 CIFAR-10 @@ -83,15 +83,15 @@ VGG 16网络主要由几个基本模块(包括卷积层和池化层)和三 > └─validation_preprocess # 评估数据集 > ``` -# 特性 +## 特性 -## 混合精度 +### 混合精度 采用[混合精度](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/enable_mixed_precision.html)的训练方法使用支持单精度和半精度数据来提高深度学习神经网络的训练速度,同时保持单精度训练所能达到的网络精度。混合精度训练提高计算速度、减少内存使用的同时,支持在特定硬件上训练更大的模型或实现更大批次的训练。 以FP16算子为例,如果输入数据类型为FP32,MindSpore后台会自动降低精度来处理数据。用户可打开INFO日志,搜索“reduce precision”查看精度降低的算子。 -# 环境要求 +## 环境要求 - 硬件(Ascend或GPU) - 准备Ascend或GPU处理器搭建硬件环境。如需试用昇腾处理器,请发送[申请表](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx)至ascend@huawei.com,审核通过即可获得资源。 @@ -101,7 +101,7 @@ VGG 16网络主要由几个基本模块(包括卷积层和池化层)和三 - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) -# 快速入门 +## 快速入门 通过官方网站安装MindSpore后,您可以按照如下步骤进行训练和评估: @@ -135,9 +135,9 @@ sh run_distribute_train_gpu.sh [DATA_PATH] python eval.py --device_target="GPU" --device_id=[DEVICE_ID] --dataset=[DATASET_TYPE] --data_path=[DATA_PATH] --pre_trained=[PRE_TRAINED] > output.eval.log 2>&1 & ``` -# 脚本说明 +## 脚本说明 -## 脚本及样例代码 +### 脚本及样例代码 ```bash ├── model_zoo @@ -164,9 +164,9 @@ python eval.py --device_target="GPU" --device_id=[DEVICE_ID] --dataset=[DATASET_ ├── eval.py // 评估脚本 ``` -## 脚本参数 +### 脚本参数 -### 训练 +#### 训练 ```bash 用法:train.py [--device_target TARGET][--data_path DATA_PATH] @@ -186,7 +186,7 @@ python eval.py --device_target="GPU" --device_id=[DEVICE_ID] --dataset=[DATASET_ ``` -### 评估 +#### 评估 ```bash 用法:eval.py [--device_target TARGET][--data_path DATA_PATH] @@ -201,7 +201,7 @@ python eval.py --device_target="GPU" --device_id=[DEVICE_ID] --dataset=[DATASET_ --pre_trained 用于评估模型的检查点文件路径。 ``` -## 参数配置 +### 参数配置 在config.py中可以同时配置训练参数和评估参数。 @@ -261,11 +261,11 @@ python eval.py --device_target="GPU" --device_id=[DEVICE_ID] --dataset=[DATASET_ "has_dropout": True # 是否使用Dropout层 ``` -## 训练过程 +### 训练过程 -### 训练 +#### 训练 -#### Ascend处理器环境运行VGG16 +##### Ascend处理器环境运行VGG16 - 使用单设备(1p)训练,默认使用CIFAR-10数据集 @@ -311,7 +311,7 @@ train_parallel1/log:epcoh: 2 step: 97, loss is 1.7133579 > 关于rank_table.json,可以参考[分布式并行训练](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/distributed_training_tutorials.html)。 > **注意** 将根据`device_num`和处理器总数绑定处理器核。如果您不希望预训练中绑定处理器内核,请在`scripts/run_distribute_train.sh`脚本中移除`taskset`相关操作。 -#### GPU处理器环境运行VGG16 +##### GPU处理器环境运行VGG16 - 单设备训练(1p) @@ -326,9 +326,9 @@ python train.py --device_target="GPU" --dataset="imagenet2012" --is_distributed bash scripts/run_distribute_train_gpu.sh /path/ImageNet2012/train" ``` -## 评估过程 +### 评估过程 -### 评估 +#### 评估 - 评估过程如下,需要指定数据集类型为“cifar10”或“imagenet2012”。 @@ -352,11 +352,11 @@ after allreduce eval: top1_correct=36636, tot=50000, acc=73.27% after allreduce eval: top5_correct=45582, tot=50000, acc=91.16% ``` -# 模型描述 +## 模型描述 -## 性能 +### 性能 -### 
训练性能 +#### 训练性能 | 参数 | VGG16(Ascend) | VGG16(GPU) | | -------------------------- | ---------------------------------------------- |------------------------------------| @@ -375,7 +375,7 @@ after allreduce eval: top5_correct=45582, tot=50000, acc=91.16% | 调优检查点 | 1.1 GB(.ckpt 文件) | 1.1 GB(.ckpt 文件) | | 脚本 |[VGG16](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/vgg16) | | -### 评估性能 +#### 评估性能 | 参数 | VGG16(Ascend) | VGG16(GPU) | ------------------- | --------------------------- |--------------------- @@ -388,10 +388,10 @@ after allreduce eval: top5_correct=45582, tot=50000, acc=91.16% | 输出 | 概率 | 概率 | | 准确率 | 1卡:93.4% |1卡:73.0%; | -# 随机情况说明 +## 随机情况说明 dataset.py中设置了“create_dataset”函数内的种子,同时还使用了train.py中的随机种子。 -# ModelZoo主页 +## ModelZoo主页 - 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 +请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 diff --git a/model_zoo/official/cv/warpctc/README.md b/model_zoo/official/cv/warpctc/README.md index 4ba7a18e4c..f93cddaf60 100644 --- a/model_zoo/official/cv/warpctc/README.md +++ b/model_zoo/official/cv/warpctc/README.md @@ -12,10 +12,10 @@ - [Parameters Configuration](#parameters-configuration) - [Dataset Preparation](#dataset-preparation) - [Training Process](#training-process) - - [Training](#training) - - [Distributed Training](#distributed-training) + - [Training](#training) + - [Distributed Training](#distributed-training) - [Evaluation Process](#evaluation-process) - - [Evaluation](#evaluation) + - [Evaluation](#evaluation) - [Model Description](#model-description) - [Performance](#performance) - [Training Performance](#training-performance) @@ -23,39 +23,38 @@ - [Description of Random Situation](#description-of-random-situation) - [ModelZoo Homepage](#modelzoo-homepage) -# [WarpCTC Description](#contents) +## [WarpCTC Description](#contents) This is an example of training WarpCTC with self-generated captcha image dataset in MindSpore. -# [Model Architecture](#content) +## [Model Architecture](#content) WarpCTC is a two-layer stacked LSTM appending with one-layer FC neural network. See src/warpctc.py for details. -# [Dataset](#content) +## [Dataset](#content) The dataset is self-generated using a third-party library called [captcha](https://github.com/lepture/captcha), which can randomly generate digits from 0 to 9 in image. In this network, we set the length of digits varying from 1 to 4. -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) - Hardware(Ascend/GPU) - - Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. You will be able to have access to related resources once approved. + - Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. You will be able to have access to related resources once approved. 
- Framework - - [MindSpore](https://gitee.com/mindspore/mindspore) + - [MindSpore](https://gitee.com/mindspore/mindspore) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) - -# [Quick Start](#contents) +## [Quick Start](#contents) - Generate dataset. Run the script `scripts/run_process_data.sh` to generate a dataset. By default, the shell script will generate 10000 test images and 50000 train images separately. - - ``` + + ```bash $ cd scripts $ sh run_process_data.sh - + # after execution, you will find the dataset like the follows: . └─warpctc @@ -67,38 +66,41 @@ The dataset is self-generated using a third-party library called [captcha](https - After the dataset is prepared, you may start running the training or the evaluation scripts as follows: - Running on Ascend - ``` + + ```bash # distribute training example in Ascend $ bash run_distribute_train.sh rank_table.json ../data/train - + # evaluation example in Ascend $ bash run_eval.sh ../data/test warpctc-30-97.ckpt Ascend - + # standalone training example in Ascend $ bash run_standalone_train.sh ../data/train Ascend + ``` + For distributed training, a hccl configuration file with JSON format needs to be created in advance. Please follow the instructions in the link below: - https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools. - + . + - Running on GPU - - ``` + + ```bash # distribute training example in GPU $ bash run_distribute_train_for_gpu.sh 8 ../data/train - + # standalone training example in GPU $ bash run_standalone_train.sh ../data/train GPU - + # evaluation example in GPU $ bash run_eval.sh ../data/test warpctc-30-97.ckpt GPU ``` -# [Script Description](#contents) +## [Script Description](#contents) -## [Script and Sample Code](#contents) +### [Script and Sample Code](#contents) ```shell . @@ -124,10 +126,11 @@ The dataset is self-generated using a third-party library called [captcha](https └── train.py # train net ``` -## [Script Parameters](#contents) +### [Script Parameters](#contents) -### Training Script Parameters -``` +#### Training Script Parameters + +```bash # distributed training in Ascend Usage: bash run_distribute_train.sh [RANK_TABLE_FILE] [DATASET_PATH] @@ -138,12 +141,12 @@ Usage: bash run_distribute_train_for_gpu.sh [RANK_SIZE] [DATASET_PATH] Usage: bash run_standalone_train.sh [DATASET_PATH] [PLATFORM] ``` -### Parameters Configuration +#### Parameters Configuration Parameters for both training and evaluation can be set in config.py. -``` -"max_captcha_digits": 4, # max number of digits in each +```bash +"max_captcha_digits": 4, # max number of digits in each "captcha_width": 160, # width of captcha images "captcha_height": 64, # height of capthca images "batch_size": 64, # batch size of input tensor @@ -158,45 +161,50 @@ Parameters for both training and evaluation can be set in config.py. ``` ## [Dataset Preparation](#contents) + - You may refer to "Generate dataset" in [Quick Start](#quick-start) to automatically generate a dataset, or you may choose to generate a captcha dataset by yourself. 
-## [Training Process](#contents) +### [Training Process](#contents) - Set options in `config.py`, including learning rate and other network hyperparameters. Click [MindSpore dataset preparation tutorial](https://www.mindspore.cn/tutorial/training/zh-CN/master/use/data_preparation.html) for more information about dataset. - -### [Training](#contents) + +#### [Training](#contents) + - Run `run_standalone_train.sh` for non-distributed training of WarpCTC model, either on Ascend or on GPU. ``` bash bash run_standalone_train.sh [DATASET_PATH] [PLATFORM] ``` - -### [Distributed Training](#contents) + +##### [Distributed Training](#contents) + - Run `run_distribute_train.sh` for distributed training of WarpCTC model on Ascend. ``` bash bash run_distribute_train.sh [RANK_TABLE_FILE] [DATASET_PATH] ``` - - Run `run_distribute_train_gpu.sh` for distributed training of WarpCTC model on GPU. + ``` bash bash run_distribute_train_gpu.sh [RANK_SIZE] [DATASET_PATH] ``` -## [Evaluation Process](#contents) -### [Evaluation](#contents) +### [Evaluation Process](#contents) + +#### [Evaluation](#contents) - Run `run_eval.sh` for evaluation. + ``` bash bash run_eval.sh [DATASET_PATH] [CHECKPOINT_PATH] [PLATFORM] ``` -# [Model Description](#contents) +## [Model Description](#contents) -## [Performance](#contents) +### [Performance](#contents) -### [Training Performance](#contents) +#### [Training Performance](#contents) | Parameters | Ascend 910 | GPU | | -------------------------- | --------------------------------------------- |---------------------------------- | @@ -216,8 +224,7 @@ bash run_eval.sh [DATASET_PATH] [CHECKPOINT_PATH] [PLATFORM] | Checkpoint for Fine tuning | 20.3M (.ckpt file) | 20.3M (.ckpt file) | | Scripts | [Link](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/warpctc) | [Link](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/warpctc) | - -### [Evaluation Performance](#contents) +#### [Evaluation Performance](#contents) | Parameters | WarpCTC | | ------------------- | --------------------------- | @@ -231,8 +238,10 @@ bash run_eval.sh [DATASET_PATH] [CHECKPOINT_PATH] [PLATFORM] | Accuracy | 99.0% | | Model for inference | 20.3M (.ckpt file) | -# [Description of Random Situation](#contents) +## [Description of Random Situation](#contents) + In dataset.py, we set the seed inside “create_dataset" function. We also use random seed in train.py for weight initialization. -# [ModelZoo Homepage](#contents) -Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). \ No newline at end of file +## [ModelZoo Homepage](#contents) + +Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). 
diff --git a/model_zoo/official/cv/warpctc/README_CN.md b/model_zoo/official/cv/warpctc/README_CN.md index ca24178434..39d0c24202 100644 --- a/model_zoo/official/cv/warpctc/README_CN.md +++ b/model_zoo/official/cv/warpctc/README_CN.md @@ -28,19 +28,19 @@ -# WarpCTC描述 +## WarpCTC描述 以下为MindSpore中用自生成的验证码图像数据集来训练WarpCTC的例子。 -# 模型架构 +## 模型架构 WarpCTC是带有一层FC神经网络的二层堆叠LSTM模型。详细信息请参见src/warpctc.py。 -# 数据集 +## 数据集 该数据集由第三方库[captcha](https://github.com/lepture/captcha)自行生成,可以在图像中随机生成数字0至9。在本网络中,我们设置数字个数为1至4。 -# 环境要求 +## 环境要求 - 硬件(Ascend/GPU) - 使用Ascend或GPU处理器来搭建硬件环境。如需试用昇腾处理器,请发送[申请表](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx)至ascend@huawie,审核通过即可获得资源。 @@ -50,7 +50,7 @@ WarpCTC是带有一层FC神经网络的二层堆叠LSTM模型。详细信息请 - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) -# 快速入门 +## 快速入门 - 生成数据集 @@ -102,9 +102,9 @@ WarpCTC是带有一层FC神经网络的二层堆叠LSTM模型。详细信息请 $ bash run_eval.sh ../data/test warpctc-30-97.ckpt GPU ``` -# 脚本说明 +## 脚本说明 -## 脚本及样例代码 +### 脚本及样例代码 ```text . @@ -130,9 +130,9 @@ WarpCTC是带有一层FC神经网络的二层堆叠LSTM模型。详细信息请 └── train.py # 训练网络 ``` -## 脚本参数 +### 脚本参数 -### 训练脚本参数 +#### 训练脚本参数 ```bash # Ascend分布式训练 @@ -204,11 +204,11 @@ bash run_distribute_train_gpu.sh [RANK_SIZE] [DATASET_PATH] bash run_eval.sh [DATASET_PATH] [CHECKPOINT_PATH] [PLATFORM] ``` -# 模型描述 +## 模型描述 -## 性能 +### 性能 -### 训练性能 +#### 训练性能 | 参数 | Ascend 910 | GPU | | -------------------------- | --------------------------------------------- |---------------------------------- | @@ -228,7 +228,7 @@ bash run_eval.sh [DATASET_PATH] [CHECKPOINT_PATH] [PLATFORM] | 微调检查点 | 20.3M (.ckpt文件) | 20.3M (.ckpt文件) | | 脚本 | [链接](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/warpctc) | [链接](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/warpctc) | -### 评估性能 +#### 评估性能 | 参数 | WarpCTC | | ------------------- | --------------------------- | @@ -242,10 +242,10 @@ bash run_eval.sh [DATASET_PATH] [CHECKPOINT_PATH] [PLATFORM] | 准确率 | 99.0% | | 推理模型 | 20.3M (.ckpt文件) | -# 随机情况说明 +## 随机情况说明 在dataset.py中设置“create_dataset”函数内的种子。使用train.py中的随机种子进行权重初始化。 -# ModelZoo主页 +## ModelZoo主页 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 diff --git a/model_zoo/official/cv/yolov3_darknet53_quant/README.md b/model_zoo/official/cv/yolov3_darknet53_quant/README.md index 83f9c15249..cc97f3da26 100644 --- a/model_zoo/official/cv/yolov3_darknet53_quant/README.md +++ b/model_zoo/official/cv/yolov3_darknet53_quant/README.md @@ -4,7 +4,7 @@ - [Model Architecture](#model-architecture) - [Dataset](#dataset) - [Environment Requirements](#environment-requirements) -- [Quick Start](#quick-start) +- [Quick Start](#quick-start) - [Script Description](#script-description) - [Script and Sample Code](#script-and-sample-code) - [Script Parameters](#script-parameters) @@ -20,13 +20,12 @@ - [Description of Random Situation](#description-of-random-situation) - [ModelZoo Homepage](#modelzoo-homepage) +## [YOLOv3-DarkNet53-Quant Description](#contents) -# [YOLOv3-DarkNet53-Quant Description](#contents) - -You only look once (YOLO) is a state-of-the-art, real-time object detection system. YOLOv3 is extremely fast and accurate. +You only look once (YOLO) is a state-of-the-art, real-time object detection system. YOLOv3 is extremely fast and accurate. 
Prior detection systems repurpose classifiers or localizers to perform detection. They apply the model to an image at multiple locations and scales. High scoring regions of the image are considered detections. - YOLOv3 use a totally different approach. It apply a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities. + YOLOv3 use a totally different approach. It apply a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities. YOLOv3 uses a few tricks to improve training and increase performance, including: multi-scale predictions, a better backbone classifier, and more. The full details are in the paper! @@ -35,43 +34,39 @@ In order to reduce the size of the weight and improve the low-bit computing perf [Paper](https://pjreddie.com/media/files/papers/YOLOv3.pdf): YOLOv3: An Incremental Improvement. Joseph Redmon, Ali Farhadi, University of Washington - -# [Model Architecture](#contents) +## [Model Architecture](#contents) YOLOv3 use DarkNet53 for performing feature extraction, which is a hybrid approach between the network used in YOLOv2, Darknet-19, and that newfangled residual network stuff. DarkNet53 uses successive 3 × 3 and 1 × 1 convolutional layers and has some shortcut connections as well and is significantly larger. It has 53 convolutional layers. +## [Dataset](#contents) -# [Dataset](#contents) Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below. -Dataset used: [COCO2014](https://cocodataset.org/#download) +Dataset used: [COCO2014](https://cocodataset.org/#download) - Dataset size: 19G, 123,287 images, 80 object categories. - - Train:13G, 82,783 images - - Val:6GM, 40,504 images - - Annotations: 241M, Train/Val annotations + - Train:13G, 82,783 images + - Val:6GM, 40,504 images + - Annotations: 241M, Train/Val annotations - Data format:zip files - - Note:Data will be processed in yolo_dataset.py, and unzip files before uses it. - + - Note:Data will be processed in yolo_dataset.py, and unzip files before uses it. -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) - Hardware(Ascend) - - Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. + - Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. 
- Framework - - [MindSpore](https://www.mindspore.cn/install/en) + - [MindSpore](https://www.mindspore.cn/install/en) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) - - + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) -# [Quick Start](#contents) +## [Quick Start](#contents) -After installing MindSpore via the official website, you can start training and evaluation in Ascend as follows: +After installing MindSpore via the official website, you can start training and evaluation in Ascend as follows: -``` -# The yolov3_darknet53_noquant.ckpt in the follow script is got from yolov3-darknet53 training like paper. +```bash +# The yolov3_darknet53_noquant.ckpt in the follow script is got from yolov3-darknet53 training like paper. # The parameter of resume_yolov3 is necessary. # The parameter of training_shape define image shape for network, default is "". # It means use 10 kinds of shape as input shape, or it can be set some kind of shape. @@ -103,17 +98,16 @@ python eval.py \ sh run_eval.sh dataset/coco2014/ checkpoint/yolov3_quant.ckpt 0 ``` +## [Script Description](#contents) -# [Script Description](#contents) - -## [Script and Sample Code](#contents) +### [Script and Sample Code](#contents) -``` +```bash . -└─yolov3_darknet53_quant +└─yolov3_darknet53_quant ├─README.md ├─mindspore_hub_conf.md # config for mindspore hub - ├─scripts + ├─scripts ├─run_standalone_train.sh # launch standalone training(1p) in ascend ├─run_distribute_train.sh # launch distributed training(8p) in ascend └─run_eval.sh # launch evaluating in ascend @@ -134,10 +128,9 @@ sh run_eval.sh dataset/coco2014/ checkpoint/yolov3_quant.ckpt 0 └─train.py # train net ``` +### [Script Parameters](#contents) -## [Script Parameters](#contents) - -``` +```bash Major parameters in train.py as follow. optional arguments: @@ -194,21 +187,19 @@ optional arguments: Resize rate for multi-scale training. Default: None ``` +### [Training Process](#contents) +#### Training on Ascend -## [Training Process](#contents) +##### Distributed Training -### Training on Ascend - -### Distributed Training - -``` +```bash sh run_distribute_train.sh dataset/coco2014 yolov3_darknet53_noquant.ckpt rank_table_8p.json ``` The above shell script will run distribute training in the background. You can view the results through the file `train_parallel[X]/log.txt`. The loss value will be achieved as follows: -``` +```bash # distribute training result(8p) epoch[0], iter[0], loss:483.341675, 0.31 imgs/sec, lr:0.0 epoch[0], iter[100], loss:55.690952, 3.46 imgs/sec, lr:0.0 @@ -232,14 +223,13 @@ epoch[134], iter[86400], loss:35.603033, 142.23 imgs/sec, lr:1.6245529650404933e epoch[134], iter[86500], loss:34.303755, 145.18 imgs/sec, lr:1.6245529650404933e-06 ``` +### [Evaluation Process](#contents) -## [Evaluation Process](#contents) - -### Evaluation on Ascend +#### Evaluation on Ascend Before running the command below. -``` +```bash python eval.py \ --data_dir=./dataset/coco2014 \ --pretrained=0-130_83330.ckpt \ @@ -250,7 +240,7 @@ sh run_eval.sh dataset/coco2014/ checkpoint/0-130_83330.ckpt 0 The above python command will run in the background. You can view the results through the file "log.txt". 
The mAP of the test dataset will be as follows: -``` +```bash # log.txt =============coco eval reulst========= Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.310 @@ -267,11 +257,11 @@ Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.450 Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.558 ``` +## [Model Description](#contents) -# [Model Description](#contents) -## [Performance](#contents) +### [Performance](#contents) -### Evaluation Performance +#### Evaluation Performance | Parameters | Ascend | | -------------------------- | ---------------------------------------------------------------------------------------------- | @@ -279,7 +269,7 @@ Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.558 | Resource | Ascend 910; CPU 2.60GHz, 192cores; Memory, 755G | | uploaded Date | 09/15/2020 (month/day/year) | | MindSpore Version | 1.0.0 | -| Dataset | COCO2014 | +| Dataset | COCO2014 | | Training Parameters | epoch=135, batch_size=16, lr=0.012, momentum=0.9 | | Optimizer | Momentum | | Loss Function | Sigmoid Cross Entropy with logits | @@ -289,10 +279,9 @@ Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.558 | Total time | 8pc: 23.5 hours | | Parameters (M) | 62.1 | | Checkpoint for Fine tuning | 474M (.ckpt file) | -| Scripts | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_darknet53_quant | +| Scripts | [YoloV3-DarkNet53-Quant Script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_darknet53_quant) | - -### Inference Performance +#### Inference Performance | Parameters | Ascend | | ------------------- | --------------------------- | @@ -306,11 +295,10 @@ Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.558 | Accuracy | 8pcs: 31.0% | | Model for inference | 474M (.ckpt file) | +## [Description of Random Situation](#contents) -# [Description of Random Situation](#contents) - -There are random seeds in distributed_sampler.py, transforms.py, yolo_dataset.py files. +There are random seeds in distributed_sampler.py, transforms.py, yolo_dataset.py files. +## [ModelZoo Homepage](#contents) -# [ModelZoo Homepage](#contents) - Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). +Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). 
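As a side note, the headline mAP can be pulled out of a log like the one above programmatically instead of by eye; the sketch below is only an illustration and assumes the log path and line layout shown in the sample output.

```python
# Minimal sketch: extract the overall COCO mAP (IoU=0.50:0.95, area=all) from an
# eval log such as the sample above. The path and exact line layout are assumptions.
import re

def read_map_from_log(log_path="log.txt"):
    pattern = re.compile(
        r"Average Precision\s+\(AP\)\s+@\[ IoU=0\.50:0\.95 \| area=\s*all "
        r"\| maxDets=100 \] = ([0-9.]+)")
    with open(log_path, "r") as f:
        for line in f:
            match = pattern.search(line)
            if match:
                return float(match.group(1))
    return None

if __name__ == "__main__":
    print("mAP(0.50:0.95):", read_map_from_log())
```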
diff --git a/model_zoo/official/cv/yolov3_darknet53_quant/README_CN.md b/model_zoo/official/cv/yolov3_darknet53_quant/README_CN.md index 4f1b1a140b..4998d00425 100644 --- a/model_zoo/official/cv/yolov3_darknet53_quant/README_CN.md +++ b/model_zoo/official/cv/yolov3_darknet53_quant/README_CN.md @@ -25,7 +25,7 @@ -# YOLOv3-DarkNet53-Quant描述 +## YOLOv3-DarkNet53-Quant描述 You only look once(YOLO)是最先进的实时物体检测系统。YOLOv3非常快速和准确。 @@ -38,11 +38,11 @@ YOLOv3使用了一些技巧来改进训练,提高性能,包括多尺度预 [论文](https://pjreddie.com/media/files/papers/YOLOv3.pdf): YOLOv3: An Incremental Improvement.Joseph Redmon, Ali Farhadi, University of Washington -# 模型架构 +## 模型架构 YOLOv3使用DarkNet53执行特征提取,这是YOLOv2中的Darknet-19和残差网络的一种混合方法。DarkNet53使用连续的3×3和1×1卷积层,并且有一些快捷连接,而且DarkNet53明显更大,它有53层卷积层。 -# 数据集 +## 数据集 使用的数据集:[COCO 2014](https://cocodataset.org/#download) @@ -53,7 +53,7 @@ YOLOv3使用DarkNet53执行特征提取,这是YOLOv2中的Darknet-19和残差 - 数据格式:zip文件 - 注:数据将在yolo_dataset.py中处理,并在使用前解压文件。 -# 环境要求 +## 环境要求 - 硬件(Ascend处理器) - 准备Ascend或GPU处理器搭建硬件环境。如需试用Ascend处理器,请发送[申请表](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx)至ascend@huawei.com,审核通过即可获得资源。 @@ -63,7 +63,7 @@ YOLOv3使用DarkNet53执行特征提取,这是YOLOv2中的Darknet-19和残差 - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) -# 快速入门 +## 快速入门 通过官方网站安装MindSpore后,您可以按照如下步骤进行训练和评估: @@ -108,9 +108,9 @@ python eval.py \ sh run_eval.sh dataset/coco2014/ checkpoint/yolov3_quant.ckpt 0 ``` -# 脚本说明 +## 脚本说明 -## 脚本及样例代码 +### 脚本及样例代码 ```text . @@ -138,7 +138,7 @@ sh run_eval.sh dataset/coco2014/ checkpoint/yolov3_quant.ckpt 0 └─train.py # 训练网络 ``` -## 脚本参数 +### 脚本参数 ```text train.py中主要参数如下: @@ -192,11 +192,11 @@ train.py中主要参数如下: 多尺度训练的调整率。默认设置:None。 ``` -## 训练过程 +### 训练过程 -### Ascend上训练 +#### Ascend上训练 -### 分布式训练 +##### 分布式训练 ```shell script sh run_distribute_train.sh dataset/coco2014 yolov3_darknet53_noquant.ckpt rank_table_8p.json @@ -228,9 +228,9 @@ epoch[134], iter[86400], loss:35.603033, 142.23 imgs/sec, lr:1.6245529650404933e epoch[134], iter[86500], loss:34.303755, 145.18 imgs/sec, lr:1.6245529650404933e-06 ``` -## 评估过程 +### 评估过程 -### Ascend评估 +#### Ascend评估 运行以下命令。 @@ -266,11 +266,11 @@ Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.450 Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.558 ``` -# 模型描述 +## 模型描述 -## 性能 +### 性能 -### 评估性能 +#### 评估性能 | 参数 | Ascend | | -------------------------- | ---------------------------------------------------------------------------------------------- | @@ -288,9 +288,9 @@ Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.558 | 总时长 | 8卡:23.5小时 | | 参数 (M) | 62.1 | | 微调检查点 | 474M (.ckpt文件) | -| 脚本 | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_darknet53_quant | +| 脚本 | [YoloV3-DarkNet53-Quant脚本](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/yolov3_darknet53_quant) | -### 推理性能 +#### 推理性能 | 参数 | Ascend | | ------------------- | --------------------------- | @@ -304,10 +304,10 @@ Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.558 | 准确率 | 8pcs:31.0% | | 推理模型 | 474M (.ckpt文件) | -# 随机情况说明 +## 随机情况说明 在distributed_sampler.py、transforms.py、yolo_dataset.py文件中有随机种子。 -# ModelZoo主页 +## ModelZoo主页 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 diff --git a/model_zoo/official/gnn/bgcf/README.md 
b/model_zoo/official/gnn/bgcf/README.md index 178138d487..6cdc9be8a4 100644 --- a/model_zoo/official/gnn/bgcf/README.md +++ b/model_zoo/official/gnn/bgcf/README.md @@ -1,3 +1,5 @@ +# Contents + - [Bayesian Graph Collaborative Filtering](#bayesian-graph-collaborative-filtering) @@ -21,7 +23,7 @@ -# [Bayesian Graph Collaborative Filtering](#contents) +## [Bayesian Graph Collaborative Filtering](#contents) Bayesian Graph Collaborative Filtering(BGCF) was proposed in 2020 by Sun J, Guo W, Zhang D et al. By naturally incorporating the uncertainty in the user-item interaction graph shows excellent performance on Amazon recommendation dataset.This is an example of @@ -29,12 +31,12 @@ training of BGCF with Amazon-Beauty dataset in MindSpore. More importantly, this [Paper](https://dl.acm.org/doi/pdf/10.1145/3394486.3403254): Sun J, Guo W, Zhang D, et al. A Framework for Recommending Accurate and Diverse Items Using Bayesian Graph Convolutional Neural Networks[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020: 2030-2039. -# [Model Architecture](#contents) +## [Model Architecture](#contents) Specially, BGCF contains two main modules. The first is sampling, which produce sample graphs based in node copying. Another module aggregate the neighbors sampling from nodes consisting of mean aggregator and attention aggregator. -# [Dataset](#contents) +## [Dataset](#contents) Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below. @@ -69,13 +71,13 @@ Note that you can run the scripts based on the dataset mentioned in original pap sh run_process_data_ascend.sh [SRC_PATH] ``` -# [Features](#contents) +## [Features](#contents) -## Mixed Precision +### Mixed Precision To ultilize the strong computation power of Ascend chip, and accelerate the training process, the mixed training method is used. MindSpore is able to cope with FP32 inputs and FP16 operators. In BGCF example, the model is set to FP16 mode except for the loss calculation part. -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) - Hardware (Ascend/GPU) - Framework @@ -84,7 +86,7 @@ To ultilize the strong computation power of Ascend chip, and accelerate the trai - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) -# [Quick Start](#contents) +## [Quick Start](#contents) After installing MindSpore via the official website and Dataset is correctly generated, you can start training and evaluation as follows. @@ -108,9 +110,9 @@ After installing MindSpore via the official website and Dataset is correctly gen sh run_eval_gpu.sh 0 dataset_path ``` -# [Script Description](#contents) +## [Script Description](#contents) -## [Script and Sample Code](#contents) +### [Script and Sample Code](#contents) ```shell . @@ -135,7 +137,7 @@ After installing MindSpore via the official website and Dataset is correctly gen └─train.py # Train net ``` -## [Script Parameters](#contents) +### [Script Parameters](#contents) Parameters for both training and evaluation can be set in config.py. @@ -154,9 +156,9 @@ Parameters for both training and evaluation can be set in config.py. config.py for more configuration. 
-## [Training Process](#contents) +### [Training Process](#contents) -### Training +#### Training - running on Ascend @@ -197,9 +199,9 @@ Parameters for both training and evaluation can be set in config.py. Epoch 004 iter 12 loss 21628.908 ``` -## [Evaluation Process](#contents) +### [Evaluation Process](#contents) -### Evaluation +#### Evaluation - Evaluation on Ascend @@ -242,11 +244,11 @@ Parameters for both training and evaluation can be set in config.py. sedp_@10:0.01926, sedp_@20:0.01547, nov_@10:7.60851, nov_@20:7.81969 ``` -# [Model Description](#contents) +## [Model Description](#contents) -## [Performance](#contents) +### [Performance](#contents) -### Training Performance +#### Training Performance | Parameter | BGCF Ascend | BGCF GPU | | ------------------------------ | ------------------------------------------ | ------------------------------------------ | @@ -261,7 +263,7 @@ Parameters for both training and evaluation can be set in config.py. | Training Cost | 25min | 60min | | Scripts | [bgcf script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/gnn/bgcf) | [bgcf script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/gnn/bgcf) | -### Inference Performance +#### Inference Performance | Parameter | BGCF Ascend | BGCF GPU | | ------------------------------ | ---------------------------- | ---------------------------- | @@ -275,10 +277,10 @@ Parameters for both training and evaluation can be set in config.py. | Recall@20 | 0.1534 | 0.15524 | | NDCG@20 | 0.0912 | 0.09249 | -# [Description of random situation](#contents) +## [Description of random situation](#contents) BGCF model contains lots of dropout operations, if you want to disable dropout, set the neighbor_dropout to [0.0, 0.0, 0.0] in src/config.py. -# [ModelZoo Homepage](#contents) +## [ModelZoo Homepage](#contents) Please check the official [homepage](http://gitee.com/mindspore/mindspore/tree/master/model_zoo). diff --git a/model_zoo/official/gnn/bgcf/README_CN.md b/model_zoo/official/gnn/bgcf/README_CN.md index dab2a051e3..8507fd8c81 100644 --- a/model_zoo/official/gnn/bgcf/README_CN.md +++ b/model_zoo/official/gnn/bgcf/README_CN.md @@ -24,17 +24,17 @@ -# 贝叶斯图协同过滤 +## 贝叶斯图协同过滤 贝叶斯图协同过滤(BGCF)是Sun J、Guo W、Zhang D等人于2020年提出的。通过结合用户与物品交互图中的不确定性,显示了Amazon推荐数据集的优异性能。使用MindSpore中的Amazon-Beauty数据集对BGCF进行训练。更重要的是,这是BGCF的第一个开源版本。 [论文](https://dl.acm.org/doi/pdf/10.1145/3394486.3403254): Sun J, Guo W, Zhang D, et al.A Framework for Recommending Accurate and Diverse Items Using Bayesian Graph Convolutional Neural Networks[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.2020: 2030-2039. 
-# 模型架构 +## 模型架构 BGCF包含两个主要模块。首先是抽样,它生成基于节点复制的样本图。另一个为聚合节点的邻居采样,节点包含平均聚合器和注意力聚合器。 -# 数据集 +## 数据集 - 数据集大小: @@ -80,13 +80,13 @@ BGCF包含两个主要模块。首先是抽样,它生成基于节点复制的 ``` -# 特性 +## 特性 -## 混合精度 +### 混合精度 为了充分利用Ascend芯片强大的运算能力,加快训练过程,此处采用混合训练方法。MindSpore能够处理FP32输入和FP16操作符。在BGCF示例中,除损失计算部分外,模型设置为FP16模式。 -# 环境要求 +## 环境要求 - 硬件(Ascend/GPU) - 框架 @@ -95,7 +95,7 @@ BGCF包含两个主要模块。首先是抽样,它生成基于节点复制的 - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) -# 快速入门 +## 快速入门 通过官方网站安装MindSpore,并正确生成数据集后,您可以按照如下步骤进行训练和评估: @@ -123,9 +123,9 @@ BGCF包含两个主要模块。首先是抽样,它生成基于节点复制的 ``` -# 脚本说明 +## 脚本说明 -## 脚本及样例代码 +### 脚本及样例代码 ```shell @@ -151,7 +151,7 @@ BGCF包含两个主要模块。首先是抽样,它生成基于节点复制的 ``` -## 脚本参数 +### 脚本参数 在config.py中可以同时配置训练参数和评估参数。 @@ -173,9 +173,9 @@ BGCF包含两个主要模块。首先是抽样,它生成基于节点复制的 在config.py中以获取更多配置。 -## 训练过程 +### 训练过程 -### 训练 +#### 训练 - Ascend处理器环境运行 @@ -221,9 +221,9 @@ BGCF包含两个主要模块。首先是抽样,它生成基于节点复制的 ``` -## 评估过程 +### 评估过程 -### 评估 +#### 评估 - Ascend评估 @@ -271,9 +271,9 @@ BGCF包含两个主要模块。首先是抽样,它生成基于节点复制的 ``` -# 模型描述 +## 模型描述 -## 性能 +### 性能 | 参数 | BGCF Ascend | BGCF GPU | | -------------------------- | ------------------------------------------ | ------------------------------------------ | @@ -289,10 +289,10 @@ BGCF包含两个主要模块。首先是抽样,它生成基于节点复制的 | 训练成本 | 25min | 60min | | 脚本 | [bgcf脚本](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/gnn/bgcf) | [bgcf脚本](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/gnn/bgcf) | -# 随机情况说明 +## 随机情况说明 BGCF模型中有很多的dropout操作,如果想关闭dropout,可以在src/config.py中将neighbor_dropout设置为[0.0, 0.0, 0.0] 。 -# ModelZoo主页 +## ModelZoo主页 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 diff --git a/model_zoo/official/gnn/gat/README.md b/model_zoo/official/gnn/gat/README.md index 8dafd275d0..5aae4fdf87 100644 --- a/model_zoo/official/gnn/gat/README.md +++ b/model_zoo/official/gnn/gat/README.md @@ -1,38 +1,44 @@ +# Contents + - [Graph Attention Networks Description](#graph-attention-networks-description) - [Model architecture](#model-architecture) - [Dataset](#dataset) - [Features](#features) - - [Mixed Precision](#mixed-precision) + - [Mixed Precision](#mixed-precision) - [Environment Requirements](#environment-requirements) - [Quick Start](#quick-start) - [Script Description](#script-description) - - [Script and Sample Code](#script-and-sample-code) - - [Script Parameters](#script-parameters) - - [Training Process](#training-process) + - [Script and Sample Code](#script-and-sample-code) + - [Script Parameters](#script-parameters) + - [Training Process](#training-process) - [Training](#training) - [Model Description](#model-description) - - [Performance](#performance) - - [Evaluation Performance](#evaluation-performance) - - [Inference Performance](#evaluation-performance) + - [Performance](#performance) + - [Evaluation Performance](#evaluation-performance) + - [Inference Performance](#evaluation-performance) - [Description of random situation](#description-of-random-situation) - [ModelZoo Homepage](#modelzoo-homepage) + -# [Graph Attention Networks Description](#contents) - + +## [Graph Attention Networks Description](#contents) + Graph Attention Networks(GAT) was proposed in 2017 by Petar Veličković et al. By leveraging masked self-attentional layers to address shortcomings of prior graph based method, GAT achieved or matched state of the art performance on both transductive datasets like Cora and inductive dataset like PPI. 
This is an example of training GAT with Cora dataset in MindSpore. [Paper](https://arxiv.org/abs/1710.10903): Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., & Bengio, Y. (2017). Graph attention networks. arXiv preprint arXiv:1710.10903. -# [Model architecture](#contents) +## [Model architecture](#contents) Note that according to whether this attention layer is the output layer of the network or not, the node update function can be concatenate or average. -# [Dataset](#contents) +## [Dataset](#contents) + Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below. + - Dataset size: - Statistics of dataset used are summerized as below: + Statistics of dataset used are summarized as below: | | Cora | Citeseer | | ------------------ | -------------: | -------------: | @@ -46,9 +52,9 @@ Note that you can run the scripts based on the dataset mentioned in original pap | # Test Nodes | 1000 | 1000 | - Data Preparation - - Place the dataset to any path you want, the folder should include files as follows(we use Cora dataset as an example): - - ``` + - Place the dataset to any path you want, the folder should include files as follows(we use Cora dataset as an example): + +```bash . └─data ├─ind.cora.allx @@ -61,58 +67,60 @@ Note that you can run the scripts based on the dataset mentioned in original pap └─ind.cora.y ``` - - Generate dataset in mindrecord format for cora or citeseer. - ```buildoutcfg - cd ./scripts - # SRC_PATH is the dataset file path you downloaded, DATASET_NAME is cora or citeseer - sh run_process_data_ascend.sh [SRC_PATH] [DATASET_NAME] - ``` +- Generate dataset in mindrecord format for cora or citeseer. - - Launch - ``` - #Generate dataset in mindrecord format for cora - ./run_process_data_ascend.sh ./data cora - #Generate dataset in mindrecord format for citeseer - ./run_process_data_ascend.sh ./data citeseer - ``` + ```buildoutcfg + cd ./scripts + # SRC_PATH is the dataset file path you downloaded, DATASET_NAME is cora or citeseer + sh run_process_data_ascend.sh [SRC_PATH] [DATASET_NAME] + ``` + +- Launch + + ```bash + #Generate dataset in mindrecord format for cora + ./run_process_data_ascend.sh ./data cora + #Generate dataset in mindrecord format for citeseer + ./run_process_data_ascend.sh ./data citeseer + ``` -# [Features](#contents) +## [Features](#contents) -## Mixed Precision +### Mixed Precision To ultilize the strong computation power of Ascend chip, and accelerate the training process, the mixed training method is used. MindSpore is able to cope with FP32 inputs and FP16 operators. In GAT example, the model is set to FP16 mode except for the loss calculation part. 
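In MindSpore this FP16-except-loss pattern is commonly expressed by casting the network cell to float16 while keeping the loss cell in float32. The snippet below is a minimal sketch of that idea with a placeholder network, not the GAT source itself.

```python
# Minimal sketch of the FP16-network / FP32-loss pattern described above.
# The tiny Dense layer is a stand-in for the real backbone, not the GAT model.
import mindspore.nn as nn
from mindspore.common import dtype as mstype

net = nn.Dense(16, 7)                                     # placeholder backbone
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True)

net.to_float(mstype.float16)      # run backbone operators in FP16
loss_fn.to_float(mstype.float32)  # keep the loss calculation in FP32
```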
-# [Environment Requirements](#contents) +## [Environment Requirements](#contents) -- Hardward (Ascend) +- Hardware (Ascend) - Framework - - [MindSpore](https://www.mindspore.cn/install/en) + - [MindSpore](https://www.mindspore.cn/install/en) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) -# [Quick Start](#contents) +## [Quick Start](#contents) After installing MindSpore via the official website and Dataset is correctly generated, you can start training and evaluation as follows. - running on Ascend - ``` + ```bash # run training example with cora dataset, DATASET_NAME is cora sh run_train_ascend.sh [DATASET_NAME] ``` -# [Script Description](#contents) +## [Script Description](#contents) ## [Script and Sample Code](#contents) - + ```shell . -└─gat +└─gat ├─README.md - ├─scripts + ├─scripts | ├─run_process_data_ascend.sh # Generate dataset in mindrecord format - | └─run_train_ascend.sh # Launch training + | └─run_train_ascend.sh # Launch training | ├─src | ├─config.py # Training configurations @@ -122,12 +130,12 @@ After installing MindSpore via the official website and Dataset is correctly gen | └─train.py # Train net ``` - + ## [Script Parameters](#contents) - + Parameters for both training and evaluation can be set in config.py. -- config for GAT, CORA dataset +- config for GAT, CORA dataset ```python "learning_rate": 0.005, # Learning rate @@ -146,11 +154,11 @@ Parameters for both training and evaluation can be set in config.py. - running on Ascend - ```python + ```python sh run_train_ascend.sh [DATASET_NAME] ``` - Training result will be stored in the scripts path, whose folder name begins with "train". You can find the result like the + Training result will be stored in the scripts path, whose folder name begins with "train". You can find the result like the followings in log. ```python @@ -169,8 +177,9 @@ Parameters for both training and evaluation can be set in config.py. ... ``` -# [Model Description](#contents) -## [Performance](#contents) +## [Model Description](#contents) + +### [Performance](#contents) | Parameter | GAT | | ------------------------------------ | ----------------------------------------- | @@ -184,12 +193,12 @@ Parameters for both training and evaluation can be set in config.py. | Accuracy | 83.0/72.5 | | Speed | 0.195s/epoch | | Total time | 39s | -| Scripts | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/gnn/gat | +| Scripts | [GAT Script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/gnn/gat) | -# [Description of random situation](#contents) +## [Description of random situation](#contents) GAT model contains lots of dropout operations, if you want to disable dropout, set the attn_dropout and feature_dropout to 0 in src/config.py. Note that this operation will cause the accuracy drop to approximately 80%. -# [ModelZoo Homepage](#contents) +## [ModelZoo Homepage](#contents) Please check the official [homepage](http://gitee.com/mindspore/mindspore/tree/master/model_zoo). 
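To make the concatenate-versus-average rule from the Model architecture section concrete, the following NumPy sketch shows how per-head attention outputs are typically combined; the shapes and head counts are illustrative, and the real layer lives in the repository's source code.

```python
# Minimal NumPy sketch of the multi-head combination rule mentioned above:
# hidden attention layers concatenate the per-head outputs, while the output
# layer averages them. Shapes and head counts are made up for illustration.
import numpy as np

def combine_heads(head_outputs, is_output_layer):
    """head_outputs: list of arrays with shape (num_nodes, out_dim)."""
    stacked = np.stack(head_outputs, axis=0)           # (num_heads, num_nodes, out_dim)
    if is_output_layer:
        return stacked.mean(axis=0)                    # average -> (num_nodes, out_dim)
    return np.concatenate(head_outputs, axis=-1)       # concat  -> (num_nodes, num_heads*out_dim)

heads = [np.random.randn(2708, 8) for _ in range(8)]   # e.g. 8 heads on a Cora-sized graph
print(combine_heads(heads, is_output_layer=False).shape)  # (2708, 64)
print(combine_heads(heads, is_output_layer=True).shape)   # (2708, 8)
```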
diff --git a/model_zoo/official/gnn/gat/README_CN.md b/model_zoo/official/gnn/gat/README_CN.md index 0499c4a5fa..50fb984603 100644 --- a/model_zoo/official/gnn/gat/README_CN.md +++ b/model_zoo/official/gnn/gat/README_CN.md @@ -22,17 +22,17 @@ -# 图注意力网络描述 +## 图注意力网络描述 图注意力网络(GAT)由Petar Veličković等人于2017年提出。GAT通过利用掩蔽自注意层来克服现有基于图的方法的缺点,在Cora等传感数据集和PPI等感应数据集上都达到了最先进的性能。以下是用MindSpore的Cora数据集训练GAT的例子。 [论文](https://arxiv.org/abs/1710.10903): Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., & Bengio, Y. (2017).Graph attention networks. arXiv preprint arXiv:1710.10903. -# 模型架构 +## 模型架构 请注意节点更新函数是级联还是平均,取决于注意力层是否为网络输出层。 -# 数据集 +## 数据集 - 数据集大小: @@ -82,13 +82,13 @@ ./run_process_data_ascend.sh ./data citeseer ``` -# 特性 +## 特性 -## 混合精度 +### 混合精度 为了充分利用Ascend芯片强大的运算能力,加快训练过程,此处采用混合训练方法。MindSpore能够处理FP32输入和FP16操作符。在GAT示例中,除损失计算部分外,模型设置为FP16模式。 -# 环境要求 +## 环境要求 - 硬件(Ascend) - 框架 @@ -97,7 +97,7 @@ - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) -# 快速入门 +## 快速入门 通过官方网站安装MindSpore,并正确生成数据集后,您可以按照如下步骤进行训练和评估: @@ -108,9 +108,9 @@ sh run_train_ascend.sh [DATASET_NAME] ``` -# 脚本说明 +## 脚本说明 -## 脚本及样例代码 +### 脚本及样例代码 ```shell . @@ -129,7 +129,7 @@ └─train.py # 训练网络 ``` -## 脚本参数 +### 脚本参数 在config.py中可以同时配置训练参数和评估参数。 @@ -146,9 +146,9 @@ "feature_dropout":0.6 # 特征层dropout系数 ``` -## 训练过程 +### 训练过程 -### 训练 +#### 训练 - Ascend处理器环境运行 @@ -175,9 +175,9 @@ ... ``` -# 模型描述 +## 模型描述 -## 性能 +### 性能 | 参数 | GAT | | ------------------------------------ | ----------------------------------------- | @@ -193,10 +193,10 @@ | 总时长 | 39s | | 脚本 | | -# 随机情况说明 +## 随机情况说明 GAT模型中有很多的dropout操作,如果想关闭dropout,可以在src/config.py中将attn_dropout和feature_dropout设置为0。注:该操作会导致准确率降低到80%左右。 -# ModelZoo主页 +## ModelZoo主页 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 diff --git a/model_zoo/official/gnn/gcn/README.md b/model_zoo/official/gnn/gcn/README.md index ada0fb0c02..738857484d 100644 --- a/model_zoo/official/gnn/gcn/README.md +++ b/model_zoo/official/gnn/gcn/README.md @@ -14,20 +14,18 @@ - [Description of Random Situation](#description-of-random-situation) - [ModelZoo Homepage](#modelzoo-homepage) - -# [GCN Description](#contents) +## [GCN Description](#contents) GCN(Graph Convolutional Networks) was proposed in 2016 and designed to do semi-supervised learning on graph-structured data. A scalable approach based on an efficient variant of convolutional neural networks which operate directly on graphs was presented. The model scales linearly in the number of graph edges and learns hidden layer representations that encode both local graph structure and features of nodes. [Paper](https://arxiv.org/abs/1609.02907): Thomas N. Kipf, Max Welling. 2016. Semi-Supervised Classification with Graph Convolutional Networks. In ICLR 2016. +## [Model Architecture](#contents) -# [Model Architecture](#contents) - -GCN contains two graph convolution layers. Each layer takes nodes features and adjacency matrix as input, nodes' features are then updated by aggregating neighbours' features. +GCN contains two graph convolution layers. Each layer takes nodes features and adjacency matrix as input, nodes' features are then updated by aggregating neighbours' features. +## [Dataset](#contents) -# [Dataset](#contents) Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. 
In the following sections, we will introduce how to run the scripts using the related dataset below. | Dataset | Type | Nodes | Edges | Classes | Features | Label rate | @@ -35,29 +33,25 @@ Note that you can run the scripts based on the dataset mentioned in original pap | Cora | Citation network | 2708 | 5429 | 7 | 1433 | 0.052 | | Citeseer| Citation network | 3327 | 4732 | 6 | 3703 | 0.036 | - - - -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) - Hardware(Ascend) - - Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. + - Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. - Framework - - [MindSpore](https://gitee.com/mindspore/mindspore) + - [MindSpore](https://gitee.com/mindspore/mindspore) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) - -# [Quick Start](#contents) +## [Quick Start](#contents) - Install [MindSpore](https://www.mindspore.cn/install/en). - Download the dataset Cora or Citeseer provided by /kimiyoung/planetoid from github. - + - Place the dataset to any path you want, the folder should include files as follows(we use Cora dataset as an example): - -``` + +```bash . └─data ├─ind.cora.allx @@ -71,30 +65,33 @@ Note that you can run the scripts based on the dataset mentioned in original pap ``` - Generate dataset in mindrecord format for cora or citeseer. -####Usage + +### Usage + ```buildoutcfg cd ./scripts # SRC_PATH is the dataset file path you downloaded, DATASET_NAME is cora or citeseer sh run_process_data.sh [SRC_PATH] [DATASET_NAME] ``` -####Launch -``` +### Launch + +```bash #Generate dataset in mindrecord format for cora sh run_process_data.sh ./data cora #Generate dataset in mindrecord format for citeseer sh run_process_data.sh ./data citeseer ``` -# [Script Description](#contents) +### [Script Description](#contents) + +### [Script and Sample Code](#contents) -## [Script and Sample Code](#contents) - ```shell . -└─gcn +└─gcn ├─README.md - ├─scripts + ├─scripts | ├─run_process_data.sh # Generate dataset in mindrecord format | └─run_train.sh # Launch training, now only Ascend backend is supported. | @@ -106,12 +103,12 @@ sh run_process_data.sh ./data citeseer | └─train.py # Train net, evaluation is performed after every training epoch. After the verification result converges, the training stops, then testing is performed. ``` - + ## [Script Parameters](#contents) - + Parameters for training can be set in config.py. 
- -``` + +```bash "learning_rate": 0.01, # Learning rate "epochs": 200, # Epoch sizes for training "hidden1": 16, # Hidden size for the first graph convolution layer @@ -120,27 +117,26 @@ Parameters for training can be set in config.py. "early_stopping": 10, # Tolerance for early stopping ``` -## [Training, Evaluation, Test Process](#contents) - +### [Training, Evaluation, Test Process](#contents) + #### Usage -``` +```bash # run train with cora or citeseer dataset, DATASET_NAME is cora or citeseer sh run_train.sh [DATASET_NAME] ``` - + #### Launch - + ```bash sh run_train.sh cora ``` - + #### Result - + Training result will be stored in the scripts path, whose folder name begins with "train". You can find the result like the followings in log. - -``` +```bash Epoch: 0001 train_loss= 1.95373 train_acc= 0.09286 val_loss= 1.95075 val_acc= 0.20200 time= 7.25737 Epoch: 0002 train_loss= 1.94812 train_acc= 0.32857 val_loss= 1.94717 val_acc= 0.34000 time= 0.00438 Epoch: 0003 train_loss= 1.94249 train_acc= 0.47857 val_loss= 1.94337 val_acc= 0.43000 time= 0.00428 @@ -157,8 +153,9 @@ Test set results: cost= 1.00983 accuracy= 0.81300 time= 0.39083 ... ``` -# [Model Description](#contents) -## [Performance](#contents) +## [Model Description](#contents) + +### [Performance](#contents) | Parameters | GCN | | -------------------------- | -------------------------------------------------------------- | @@ -171,20 +168,17 @@ Test set results: cost= 1.00983 accuracy= 0.81300 time= 0.39083 | Loss Function | Softmax Cross Entropy | | Accuracy | 81.5/70.3 | | Parameters (B) | 92160/59344 | -| Scripts | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/gnn/gcn | - +| Scripts | [GCN Script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/gnn/gcn) | - -# [Description of Random Situation](#contents) +## [Description of Random Situation](#contents) There are two random situations: + - Seed is set in train.py according to input argument --seed. - Dropout operations. Some seeds have already been set in train.py to avoid the randomness of weight initialization. If you want to disable dropout, please set the corresponding dropout_prob parameter to 0 in src/config.py. - -# [ModelZoo Homepage](#contents) +## [ModelZoo Homepage](#contents) Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). - diff --git a/model_zoo/official/gnn/gcn/README_CN.md b/model_zoo/official/gnn/gcn/README_CN.md index 1b5035bc2a..49709a6edb 100644 --- a/model_zoo/official/gnn/gcn/README_CN.md +++ b/model_zoo/official/gnn/gcn/README_CN.md @@ -24,24 +24,24 @@ -# 图卷积网络描述 +## 图卷积网络描述 图卷积网络(GCN)于2016年提出,旨在对图结构数据进行半监督学习。它提出了一种基于卷积神经网络有效变体的可扩展方法,可直接在图上操作。该模型在图边缘的数量上线性缩放,并学习隐藏层表示,这些表示编码了局部图结构和节点特征。 [论文](https://arxiv.org/abs/1609.02907): Thomas N. Kipf, Max Welling.2016.Semi-Supervised Classification with Graph Convolutional Networks.In ICLR 2016. 
-# 模型架构 +## 模型架构 GCN包含两个图卷积层。每一层以节点特征和邻接矩阵为输入,通过聚合相邻特征来更新节点特征。 -# 数据集 +## 数据集 | 数据集 | 类型 | 节点 | 边 | 类 | 特征 | 标签率 | | ------- | ---------------:|-----:| ----:| ------:|--------:| ---------:| | Cora | Citation network | 2708 | 5429 | 7 | 1433 | 0.052 | | Citeseer| Citation network | 3327 | 4732 | 6 | 3703 | 0.036 | -# 环境要求 +## 环境要求 - 硬件(Ascend处理器) - 准备Ascend或GPU处理器搭建硬件环境。如需试用昇腾处理器,请发送[申请表](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx)至ascend@huawei,审核通过即可获得资源。 @@ -51,7 +51,7 @@ GCN包含两个图卷积层。每一层以节点特征和邻接矩阵为输入 - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) -# 快速入门 +## 快速入门 - 安装[MindSpore](https://www.mindspore.cn/install) @@ -74,7 +74,7 @@ GCN包含两个图卷积层。每一层以节点特征和邻接矩阵为输入 - 为Cora或Citeseer生成MindRecord格式的数据集 -## 用法 +### 用法 ```buildoutcfg cd ./scripts @@ -82,7 +82,7 @@ cd ./scripts sh run_process_data.sh [SRC_PATH] [DATASET_NAME] ``` -## 启动 +### 启动 ```text # 为Cora生成MindRecord格式的数据集 @@ -91,9 +91,9 @@ sh run_process_data.sh ./data cora sh run_process_data.sh ./data citeseer ``` -# 脚本说明 +## 脚本说明 -## 脚本及样例代码 +### 脚本及样例代码 ```shell . @@ -112,7 +112,7 @@ sh run_process_data.sh ./data citeseer └─train.py # 训练网络,每个训练轮次后评估验证结果收敛后,训练停止,然后进行测试。 ``` -## 脚本参数 +### 脚本参数 训练参数可以在config.py中配置。 @@ -125,22 +125,22 @@ sh run_process_data.sh ./data citeseer "early_stopping": 10, # 早停容限 ``` -## 培训、评估、测试过程 +### 培训、评估、测试过程 -### 用法 +#### 用法 ```text # 使用Cora或Citeseer数据集进行训练,DATASET_NAME为Cora或Citeseer sh run_train.sh [DATASET_NAME] ``` -### 启动 +#### 启动 ```bash sh run_train.sh cora ``` -### 结果 +#### 结果 训练结果将保存在脚本路径下,文件夹名称以“train”开头。您可在日志中找到如下结果: @@ -161,9 +161,9 @@ Test set results: cost= 1.00983 accuracy= 0.81300 time= 0.39083 ... ``` -# 模型描述 +## 模型描述 -## 性能 +### 性能 | 参数 | GCN | | -------------------------- | -------------------------------------------------------------- | @@ -178,7 +178,7 @@ Test set results: cost= 1.00983 accuracy= 0.81300 time= 0.39083 | 参数(B) | 92160/59344 | | 脚本 | | -# 随机情况说明 +## 随机情况说明 以下两种随机情况: @@ -187,6 +187,6 @@ Test set results: cost= 1.00983 accuracy= 0.81300 time= 0.39083 train.py已经设置了一些种子,避免权重初始化的随机性。若需关闭随机失活,将src/config.py中相应的dropout_prob参数设置为0。 -# ModelZoo主页 +## ModelZoo主页 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 diff --git a/model_zoo/official/nlp/bert/scripts/ascend_distributed_launcher/README.md b/model_zoo/official/nlp/bert/scripts/ascend_distributed_launcher/README.md index c821a2b513..ae76f74d05 100644 --- a/model_zoo/official/nlp/bert/scripts/ascend_distributed_launcher/README.md +++ b/model_zoo/official/nlp/bert/scripts/ascend_distributed_launcher/README.md @@ -1,18 +1,20 @@ # Run distribute pretrain -## description +## Description + The number of Ascend accelerators can be automatically allocated based on the device_num set in hccl config file, You don not need to specify that. 
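For reference, the accelerator count can be recovered from the rank table JSON roughly as sketched below; the `device_num` and `server_list`/`device` keys are assumptions about a typical hccl rank table, not a description of the launcher's exact parsing.

```python
# Minimal sketch: infer the accelerator count from an hccl rank table JSON.
# The "device_num" and "server_list"/"device" keys are assumed field names;
# the launcher's own parsing may differ.
import json

def count_devices(rank_table_path):
    with open(rank_table_path, "r") as f:
        table = json.load(f)
    if "device_num" in table:                  # some rank tables carry the count directly
        return int(table["device_num"])
    servers = table.get("server_list", [])
    return sum(len(server.get("device", [])) for server in servers)

if __name__ == "__main__":
    print(count_devices("model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json"))
```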
+## How to use -## how to use For example, if we want to generate the launch command of the distributed training of Bert model on Ascend accelerators, we can run the following command in `/bert/` dir: -``` + +```python python ./scripts/ascend_distributed_launcher/get_distribute_pretrain_cmd.py --run_script_dir ./run_pretrain.py --hyper_parameter_config_dir ./scripts/ascend_distributed_launcher/hyper_parameter_config.ini --data_dir /path/dataset/ --hccl_config_dir model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json ``` output: -``` +```python hccl_config_dir: model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json the number of logical core: 192 avg_core_per_rank: 96 diff --git a/model_zoo/official/recommend/wide_and_deep_multitable/README.md b/model_zoo/official/recommend/wide_and_deep_multitable/README.md index ee5a225245..7a1f493043 100644 --- a/model_zoo/official/recommend/wide_and_deep_multitable/README.md +++ b/model_zoo/official/recommend/wide_and_deep_multitable/README.md @@ -1,71 +1,78 @@ # Contents + - [Wide&Deep Description](#widedeep-description) - [Model Architecture](#model-architecture) - [Dataset](#dataset) - [Environment Requirements](#environment-requirements) - [Quick Start](#quick-start) - [Script Description](#script-description) - - [Script and Sample Code](#script-and-sample-code) - - [Script Parameters](#script-parameters) - - [Training Script Parameters](#training-script-parameters) - - [Training Process](#training-process) - - [SingleDevice](#singledevice) - - [Distribute Training](#distribute-training) - - [Evaluation Process](#evaluation-process) + - [Script and Sample Code](#script-and-sample-code) + - [Script Parameters](#script-parameters) + - [Training Script Parameters](#training-script-parameters) + - [Training Process](#training-process) + - [SingleDevice](#singledevice) + - [Distribute Training](#distribute-training) + - [Evaluation Process](#evaluation-process) - [Model Description](#model-description) - - [Performance](#performance) - - [Training Performance](#training-performance) - - [Evaluation Performance](#evaluation-performance) + - [Performance](#performance) + - [Training Performance](#training-performance) + - [Evaluation Performance](#evaluation-performance) - [Description of Random Situation](#description-of-random-situation) - [ModelZoo Homepage](#modelzoo-homepage) +## [Wide&Deep Description](#contents) -# [Wide&Deep Description](#contents) Wide&Deep model is a classical model in Recommendation and Click Prediction area. This is an implementation of Wide&Deep as described in the [Wide & Deep Learning for Recommender System](https://arxiv.org/pdf/1606.07792.pdf) paper. -# [Model Architecture](#contents) -Wide&Deep model jointly trained wide linear models and deep neural network, which combined the benefits of memorization and generalization for recommender systems. +## [Model Architecture](#contents) + +Wide&Deep model jointly trained wide linear models and deep neural network, which combined the benefits of memorization and generalization for recommender systems. -# [Dataset](#contents) +## [Dataset](#contents) - [1] A dataset used in Click Prediction -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) + - Hardware(Ascend or GPU) - - Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. 
Once approved, you can get the resources. + - Prepare hardware environment with Ascend processor. If you want to try Ascend , please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. - Framework - - [MindSpore](https://gitee.com/mindspore/mindspore) + - [MindSpore](https://gitee.com/mindspore/mindspore) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) - + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) +## [Quick Start](#contents) -# [Quick Start](#contents) +1.Clone the Code -1. Clone the Code ```bash git clone https://gitee.com/mindspore/mindspore.git cd mindspore/model_zoo/official/recommend/wide_and_deep_multitable ``` -2. Download the Dataset - > Please refer to [1] to obtain the download link and data preprocess -3. Start Training +2.Download the Dataset + + > Please refer to [1] to obtain the download link and data preprocess + +3.Start Training Once the dataset is ready, the model can be trained and evaluated on the single device(Ascend) by the command as follows: ```bash python train_and_eval.py --data_path=./data/mindrecord --data_type=mindrecord ``` + To evaluate the model, command as follows: + ```bash python eval.py --data_path=./data/mindrecord --data_type=mindrecord ``` +## [Script Description](#contents) -# [Script Description](#contents) -## [Script and Sample Code](#contents) -``` +### [Script and Sample Code](#contents) + +```bash └── wide_and_deep_multitable ├── eval.py ├── README.md @@ -83,14 +90,13 @@ python eval.py --data_path=./data/mindrecord --data_type=mindrecord └── train_and_eval.py ``` -## [Script Parameters](#contents) - -### [Training Script Parameters](#contents) +### [Script Parameters](#contents) -The parameters is same for ``train_and_eval.py`` and ``train_and_eval_distribute.py`` +#### [Training Script Parameters](#contents) +The parameters is same for ``train_and_eval.py`` and ``train_and_eval_distribute.py`` -``` +```bash usage: train_and_eval.py [-h] [--data_path DATA_PATH] [--epochs EPOCHS] [--batch_size BATCH_SIZE] [--eval_batch_size EVAL_BATCH_SIZE] @@ -115,43 +121,49 @@ optional arguments: --deep_layers_dim The dimension of all deep layers.(Default:[1024,1024,1024,1024]) --deep_layers_act The activation function of all deep layers.(Default:'relu') --keep_prob The keep rate in dropout layer.(Default:1.0) - --adam_lr The learning rate of the deep part. (Default:0.003) - --ftrl_lr The learning rate of the wide part.(Default:0.1) - --l2_coef The coefficient of the L2 pernalty. (Default:0.0) + --adam_lr The learning rate of the deep part. (Default:0.003) + --ftrl_lr The learning rate of the wide part.(Default:0.1) + --l2_coef The coefficient of the L2 pernalty. (Default:0.0) --is_tf_dataset IS_TF_DATASET Whether the input is tfrecords. 
(Default:True) - --dropout_flag Enable dropout.(Default:0) + --dropout_flag Enable dropout.(Default:0) --output_path OUTPUT_PATH Deprecated - --ckpt_path CKPT_PATH The location of the checkpoint file.(Defalut:./checkpoints/) + --ckpt_path CKPT_PATH The location of the checkpoint file.(Default:./checkpoints/) --eval_file_name EVAL_FILE_NAME Eval output file.(Default:eval.og) --loss_file_name LOSS_FILE_NAME Loss output file.(Default:loss.log) ``` -## [Training Process](#contents) -### [SingleDevice](#contents) +### [Training Process](#contents) + +#### [SingleDevice](#contents) To train and evaluate the model, command as follows: -``` + +```bash python train_and_eval.py ``` +#### [Distribute Training](#contents) -### [Distribute Training](#contents) To train the model in data distributed training, command as follows: -``` + +```bash # configure environment path before training -bash run_multinpu_train.sh RANK_SIZE EPOCHS DATASET RANK_TABLE_FILE +bash run_multinpu_train.sh RANK_SIZE EPOCHS DATASET RANK_TABLE_FILE ``` -## [Evaluation Process](#contents) + +### [Evaluation Process](#contents) + To evaluate the model, command as follows: -``` + +```bash python eval.py ``` -# [Model Description](#contents) +## [Model Description](#contents) -## [Performance](#contents) +### [Performance](#contents) -### Training Performance +#### Training Performance | Parameters | Single
Ascend | Data-Parallel-8P | | ------------------------ | ------------------------------- | ------------------------------- | @@ -166,14 +178,12 @@ python eval.py | MAP Score | 0.6608 | 0.6590 | | Speed | 284 ms/step | 331 ms/step | | Loss | wide:0.415,deep:0.415 | wide:0.419, deep: 0.419 | -| Parms(M) | 349 | 349 | +| Params(M) | 349 | 349 | | Checkpoint for inference | 1.1GB(.ckpt file) | 1.1GB(.ckpt file) | - - All executable scripts can be found in [here](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/recommend/wide_and_deep_multitable/script) -### Evaluation Performance +#### Evaluation Performance | Parameters | Wide&Deep | | ----------------- | --------------------------- | @@ -185,14 +195,14 @@ All executable scripts can be found in [here](https://gitee.com/mindspore/mindsp | Outputs | AUC,MAP | | Accuracy | AUC=0.7473,MAP=0.7464 | -# [Description of Random Situation](#contents) +## [Description of Random Situation](#contents) There are three random situations: + - Shuffle of the dataset. - Initialization of some model weights. - Dropout operations. +## [ModelZoo Homepage](#contents) -# [ModelZoo Homepage](#contents) - -Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). \ No newline at end of file +Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). diff --git a/model_zoo/official/recommend/wide_and_deep_multitable/README_CN.md b/model_zoo/official/recommend/wide_and_deep_multitable/README_CN.md index de630b57c6..e599b75bb3 100644 --- a/model_zoo/official/recommend/wide_and_deep_multitable/README_CN.md +++ b/model_zoo/official/recommend/wide_and_deep_multitable/README_CN.md @@ -23,19 +23,19 @@ -# Wide&Deep概述 +## Wide&Deep概述 Wide&Deep模型是推荐和点击预测领域的经典模型。 [Wide&Deep推荐系统学习](https://arxiv.org/pdf/1606.07792.pdf)论文中描述了如何实现Wide&Deep。 -# 模型架构 +## 模型架构 Wide&Deep模型训练了宽线性模型和深度学习神经网络,结合了推荐系统的记忆和泛化的优点。 -# 数据集 +## 数据集 - [1]点击预测中使用的数据集 -# 环境要求 +## 环境要求 - 硬件(Ascend或GPU) - 准备Ascend或GPU处理器搭建硬件环境。如需试用昇腾处理器,请发送[申请表](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx)至ascend@huawei.com,申请通过即可获得资源。 @@ -45,19 +45,19 @@ Wide&Deep模型训练了宽线性模型和深度学习神经网络,结合了 - [MindSpore教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html) - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html) -# 快速入门 +## 快速入门 -1. 克隆代码。 +1.克隆代码。 ```bash git clone https://gitee.com/mindspore/mindspore.git cd mindspore/model_zoo/official/recommend/wide_and_deep_multitable ``` -2. 下载数据集。 - +2.下载数据集。 > 请参考[1]获取下载链接和预处理数据。 -3. 开始训练。 + +3.开始训练。 数据集准备就绪后,即可在Ascend上单机训练和评估模型。 ```bash @@ -129,7 +129,7 @@ optional arguments: --is_tf_dataset IS_TF_DATASET Whether the input is tfrecords. 
(Default:True) --dropout_flag Enable dropout.(Default:0) --output_path OUTPUT_PATH Deprecated - --ckpt_path CKPT_PATH The location of the checkpoint file.(Defalut:./checkpoints/) + --ckpt_path CKPT_PATH The location of the checkpoint file.(Default:./checkpoints/) --eval_file_name EVAL_FILE_NAME Eval output file.(Default:eval.og) --loss_file_name LOSS_FILE_NAME Loss output file.(Default:loss.log) ``` @@ -161,11 +161,11 @@ bash run_multinpu_train.sh RANK_SIZE EPOCHS DATASET RANK_TABLE_FILE python eval.py ``` -# 模型描述 +## 模型描述 -## 性能 +### 性能 -### 训练性能 +#### 训练性能 | 参数 | 单Ascend | 数据并行-8卡 | | ------------------------ | ------------------------------- | ------------------------------- | @@ -185,7 +185,7 @@ python eval.py 所有可执行脚本参见[这里](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/recommend/wide_and_deep/script)。 -### 评估性能 +#### 评估性能 | 参数 | Wide&Deep | | ----------------- | --------------------------- | diff --git a/model_zoo/research/cv/ghostnet/Readme.md b/model_zoo/research/cv/ghostnet/Readme.md index 091c240a4b..e906b6d90e 100644 --- a/model_zoo/research/cv/ghostnet/Readme.md +++ b/model_zoo/research/cv/ghostnet/Readme.md @@ -5,52 +5,52 @@ - [Dataset](#dataset) - [Environment Requirements](#environment-requirements) - [Script Description](#script-description) - - [Script and Sample Code](#script-and-sample-code) - - [Training Process](#training-process) - - [Evaluation Process](#evaluation-process) - - [Evaluation](#evaluation) + - [Script and Sample Code](#script-and-sample-code) + - [Training Process](#training-process) + - [Evaluation Process](#evaluation-process) + - [Evaluation](#evaluation) - [Model Description](#model-description) - - [Performance](#performance) - - [Training Performance](#evaluation-performance) - - [Inference Performance](#evaluation-performance) + - [Performance](#performance) + - [Training Performance](#evaluation-performance) + - [Inference Performance](#evaluation-performance) - [Description of Random Situation](#description-of-random-situation) - [ModelZoo Homepage](#modelzoo-homepage) -# [GhostNet Description](#contents) +## [GhostNet Description](#contents) The GhostNet architecture is based on an Ghost module structure which generate more features from cheap operations. Based on a set of intrinsic feature maps, a series of cheap operations are applied to generate many ghost feature maps that could fully reveal information underlying intrinsic features. [Paper](https://openaccess.thecvf.com/content_CVPR_2020/papers/Han_GhostNet_More_Features_From_Cheap_Operations_CVPR_2020_paper.pdf): Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, Chang Xu. GhostNet: More Features from Cheap Operations. CVPR 2020. -# [Model architecture](#contents) +## [Model architecture](#contents) The overall network architecture of GhostNet is show below: [Link](https://openaccess.thecvf.com/content_CVPR_2020/papers/Han_GhostNet_More_Features_From_Cheap_Operations_CVPR_2020_paper.pdf) -# [Dataset](#contents) +## [Dataset](#contents) Dataset used: [Oxford-IIIT Pet](https://www.robots.ox.ac.uk/~vgg/data/pets/) - Dataset size: 7049 colorful images in 1000 classes - - Train: 3680 images - - Test: 3369 images + - Train: 3680 images + - Test: 3369 images - Data format: RGB images. - - Note: Data will be processed in src/dataset.py + - Note: Data will be processed in src/dataset.py -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) - Hardware(Ascend/GPU) - - Prepare hardware environment with Ascend or GPU. 
If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. + - Prepare hardware environment with Ascend or GPU. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. - Framework - - [MindSpore](https://www.mindspore.cn/install/en) + - [MindSpore](https://www.mindspore.cn/install/en) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) -# [Script description](#contents) +## [Script description](#contents) -## [Script and sample code](#contents) +### [Script and sample code](#contents) ```python ├── GhostNet @@ -67,6 +67,7 @@ Dataset used: [Oxford-IIIT Pet](https://www.robots.ox.ac.uk/~vgg/data/pets/) ``` ## [Training process](#contents) + To Be Done ## [Eval process](#contents) @@ -75,12 +76,11 @@ To Be Done After installing MindSpore via the official website, you can start evaluation as follows: - ### Launch -``` +```bash # infer example - + Ascend: python eval.py --model [ghostnet/ghostnet-600] --dataset_path ~/Pets/test.mindrecord --platform Ascend --checkpoint_path [CHECKPOINT_PATH] GPU: python eval.py --model [ghostnet/ghostnet-600] --dataset_path ~/Pets/test.mindrecord --platform GPU --checkpoint_path [CHECKPOINT_PATH] ``` @@ -89,19 +89,20 @@ After installing MindSpore via the official website, you can start evaluation as ### Result -``` +```bash result: {'acc': 0.8113927500681385} ckpt= ./ghostnet_nose_1x_pets.ckpt result: {'acc': 0.824475333878441} ckpt= ./ghostnet_1x_pets.ckpt result: {'acc': 0.8691741618969746} ckpt= ./ghostnet600M_pets.ckpt ``` -# [Model Description](#contents) +## [Model Description](#contents) -## [Performance](#contents) +### [Performance](#contents) #### Evaluation Performance -###### GhostNet on ImageNet2012 +##### GhostNet on ImageNet2012 + | Parameters | | | | -------------------------- | -------------------------------------- |---------------------------------- | | Model Version | GhostNet |GhostNet-600| @@ -113,6 +114,7 @@ result: {'acc': 0.8691741618969746} ckpt= ./ghostnet600M_pets.ckpt | Accuracy (Top1) | 73.9 |80.2 | ###### GhostNet on Oxford-IIIT Pet + | Parameters | | | | -------------------------- | -------------------------------------- |---------------------------------- | | Model Version | GhostNet |GhostNet-600| @@ -134,10 +136,10 @@ result: {'acc': 0.8691741618969746} ckpt= ./ghostnet600M_pets.ckpt *The latency is measured on Huawei Kirin 990 chip under single-threaded mode with batch size 1. -# [Description of Random Situation](#contents) +## [Description of Random Situation](#contents) In dataset.py, we set the seed inside “create_dataset" function. We also use random seed in train.py. -# [ModelZoo Homepage](#contents) +## [ModelZoo Homepage](#contents) Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). 
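The GhostNet README above notes that seeds are fixed inside the `create_dataset` function and in `train.py`. As a minimal sketch of how such seeding is commonly wired up in a MindSpore script — the helper name, the `create_dataset` signature, and the use of `mindspore.set_seed` / `mindspore.dataset.config.set_seed` are assumptions for illustration, not the repository's actual code:

```python
# Minimal sketch of seed handling for reproducible runs; names and signatures
# below are illustrative, not taken from the GhostNet scripts themselves.
import numpy as np
import mindspore as ms
import mindspore.dataset as ds


def set_reproducible_seeds(seed: int = 1) -> None:
    """Pin the random sources listed in the README: dataset shuffle,
    weight initialization and dropout."""
    np.random.seed(seed)      # NumPy-based shuffling/augmentation
    ms.set_seed(seed)         # global seed used for weight init and dropout
    ds.config.set_seed(seed)  # seed used by dataset shuffling


def create_dataset(mindrecord_path: str, batch_size: int = 32, seed: int = 1):
    """Illustrative dataset constructor that seeds before building the pipeline."""
    set_reproducible_seeds(seed)
    dataset = ds.MindDataset(mindrecord_path, shuffle=True)
    return dataset.batch(batch_size, drop_remainder=True)
```

Pinning all three sources together is what keeps shuffling, initialization and dropout consistent between otherwise identical runs.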
diff --git a/model_zoo/research/cv/ghostnet_quant/Readme.md b/model_zoo/research/cv/ghostnet_quant/Readme.md index 209247869e..4dc319085f 100644 --- a/model_zoo/research/cv/ghostnet_quant/Readme.md +++ b/model_zoo/research/cv/ghostnet_quant/Readme.md @@ -6,56 +6,56 @@ - [Dataset](#dataset) - [Environment Requirements](#environment-requirements) - [Script Description](#script-description) - - [Script and Sample Code](#script-and-sample-code) + - [Script and Sample Code](#script-and-sample-code) - [Training Process](#training-process) - [Evaluation Process](#evaluation-process) - - [Evaluation](#evaluation) + - [Evaluation](#evaluation) - [Model Description](#model-description) - - [Performance](#performance) - - [Training Performance](#evaluation-performance) - - [Inference Performance](#evaluation-performance) + - [Performance](#performance) + - [Training Performance](#evaluation-performance) + - [Inference Performance](#evaluation-performance) - [Description of Random Situation](#description-of-random-situation) - [ModelZoo Homepage](#modelzoo-homepage) -# [GhostNet Description](#contents) +## [GhostNet Description](#contents) The GhostNet architecture is based on an Ghost module structure which generate more features from cheap operations. Based on a set of intrinsic feature maps, a series of cheap operations are applied to generate many ghost feature maps that could fully reveal information underlying intrinsic features. [Paper](https://openaccess.thecvf.com/content_CVPR_2020/papers/Han_GhostNet_More_Features_From_Cheap_Operations_CVPR_2020_paper.pdf): Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, Chang Xu. GhostNet: More Features from Cheap Operations. CVPR 2020. -# [Quantization Description](#contents) +## [Quantization Description](#contents) Quantization refers to techniques for performing computations and storing tensors at lower bitwidths than floating point precision. For 8bit quantization, we quantize the weights into [-128,127] and the activations into [0,255]. We finetune the model a few epochs after post-quantization to achieve better performance. -# [Model architecture](#contents) +## [Model architecture](#contents) The overall network architecture of GhostNet is show below: [Link](https://openaccess.thecvf.com/content_CVPR_2020/papers/Han_GhostNet_More_Features_From_Cheap_Operations_CVPR_2020_paper.pdf) -# [Dataset](#contents) +## [Dataset](#contents) Dataset used: [Oxford-IIIT Pet](https://www.robots.ox.ac.uk/~vgg/data/pets/) - Dataset size: 7049 colorful images in 1000 classes - - Train: 3680 images - - Test: 3369 images + - Train: 3680 images + - Test: 3369 images - Data format: RGB images. - - Note: Data will be processed in src/dataset.py + - Note: Data will be processed in src/dataset.py -# [Environment Requirements](#contents) +## [Environment Requirements](#contents) - Hardware(Ascend/GPU) - - Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. + - Prepare hardware environment with Ascend or GPU processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources. 
- Framework - - [MindSpore](https://www.mindspore.cn/install/en) + - [MindSpore](https://www.mindspore.cn/install/en) - For more information, please check the resources below: - - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) + - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/en/master/index.html) -# [Script description](#contents) +## [Script description](#contents) -## [Script and sample code](#contents) +### [Script and sample code](#contents) ```python ├── GhostNet @@ -72,6 +72,7 @@ Dataset used: [Oxford-IIIT Pet](https://www.robots.ox.ac.uk/~vgg/data/pets/) ``` ## [Training process](#contents) + To Be Done ## [Eval process](#contents) @@ -80,12 +81,11 @@ To Be Done After installing MindSpore via the official website, you can start evaluation as follows: - ### Launch -``` +```bash # infer example - + Ascend: python eval.py --dataset_path ~/Pets/test.mindrecord --platform Ascend --checkpoint_path [CHECKPOINT_PATH] GPU: python eval.py --dataset_path ~/Pets/test.mindrecord --platform GPU --checkpoint_path [CHECKPOINT_PATH] ``` @@ -94,17 +94,18 @@ After installing MindSpore via the official website, you can start evaluation as ### Result -``` +```bash result: {'acc': 0.825} ckpt= ./ghostnet_1x_pets_int8.ckpt ``` -# [Model Description](#contents) +## [Model Description](#contents) -## [Performance](#contents) +### [Performance](#contents) #### Evaluation Performance -###### GhostNet on ImageNet2012 +##### GhostNet on ImageNet2012 + | Parameters | | | | -------------------------- | -------------------------------------- |---------------------------------- | | Model Version | GhostNet |GhostNet-int8| @@ -115,7 +116,8 @@ result: {'acc': 0.825} ckpt= ./ghostnet_1x_pets_int8.ckpt | FLOPs (M) | 142 | / | | Accuracy (Top1) | 73.9 | w/o finetune:72.2, w finetune:73.6 | -###### GhostNet on Oxford-IIIT Pet +##### GhostNet on Oxford-IIIT Pet + | Parameters | | | | -------------------------- | -------------------------------------- |---------------------------------- | | Model Version | GhostNet |GhostNet-int8| @@ -126,11 +128,10 @@ result: {'acc': 0.825} ckpt= ./ghostnet_1x_pets_int8.ckpt | FLOPs (M) | 140 | / | | Accuracy (Top1) | 82.4 | w/o finetune:81.66, w finetune:82.45 | - -# [Description of Random Situation](#contents) +## [Description of Random Situation](#contents) In dataset.py, we set the seed inside “create_dataset" function. We also use random seed in train.py. -# [ModelZoo Homepage](#contents) +## [ModelZoo Homepage](#contents) Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). diff --git a/model_zoo/utils/ascend_distributed_launcher/README.md b/model_zoo/utils/ascend_distributed_launcher/README.md index c821a2b513..45ae72a271 100644 --- a/model_zoo/utils/ascend_distributed_launcher/README.md +++ b/model_zoo/utils/ascend_distributed_launcher/README.md @@ -1,18 +1,20 @@ # Run distribute pretrain ## description -The number of Ascend accelerators can be automatically allocated based on the device_num set in hccl config file, You don not need to specify that. +The number of Ascend accelerators can be automatically allocated based on the device_num set in hccl config file, You don not need to specify that. 
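The description above says the launcher takes the number of Ascend accelerators from the `device_num` recorded in the hccl config file rather than from a command-line flag. A hedged sketch of how such a lookup could work — the JSON keys (`server_list`, `device`) are assumptions about the rank-table layout, not the launcher's actual parsing logic:

```python
# Illustrative only: count the devices recorded in an hccl config (rank table)
# file. The "server_list"/"device" keys are assumptions; rank-table formats
# vary between versions and the launcher may read a dedicated device_num field.
import json


def count_devices(hccl_config_path: str) -> int:
    with open(hccl_config_path, "r") as f:
        rank_table = json.load(f)
    return sum(len(server.get("device", []))
               for server in rank_table.get("server_list", []))


if __name__ == "__main__":
    rank_size = count_devices("model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json")
    print(f"launching {rank_size} training processes")
```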
## how to use + For example, if we want to generate the launch command of the distributed training of Bert model on Ascend accelerators, we can run the following command in `/bert/` dir: -``` + +```python python ./scripts/ascend_distributed_launcher/get_distribute_pretrain_cmd.py --run_script_dir ./run_pretrain.py --hyper_parameter_config_dir ./scripts/ascend_distributed_launcher/hyper_parameter_config.ini --data_dir /path/dataset/ --hccl_config_dir model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json ``` output: -``` +```python hccl_config_dir: model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json the number of logical core: 192 avg_core_per_rank: 96 diff --git a/model_zoo/utils/cv_to_mindrecord/Caltech-UCSD-Birds-200-2011/README.md b/model_zoo/utils/cv_to_mindrecord/Caltech-UCSD-Birds-200-2011/README.md index f79cc16cad..3fc0b33666 100644 --- a/model_zoo/utils/cv_to_mindrecord/Caltech-UCSD-Birds-200-2011/README.md +++ b/model_zoo/utils/cv_to_mindrecord/Caltech-UCSD-Birds-200-2011/README.md @@ -8,16 +8,15 @@ - [Generate MindRecord](#generate-mindrecord) - [Create MindDataset By MindRecord](#create-minddataset-by-mindrecord) - ## What does the example do This example is used to read data from Caltech-UCSD Birds-200-2011 dataset and generate mindrecord. It just transfers the Caltech-UCSD Birds-200-2011 dataset to mindrecord without any data preprocessing. You can modify the example or follow the example to implement your own example. -1. run.sh: generate MindRecord entry script. +1. run.sh: generate MindRecord entry script. - gen_mindrecord.py : read the Caltech-UCSD Birds-200-2011 data and transfer it to mindrecord. -2. run_read.py: create MindDataset by MindRecord entry script. +2. run_read.py: create MindDataset by MindRecord entry script. - create_dataset.py: use MindDataset to read MindRecord to generate dataset. ## How to use the example to generate MindRecord @@ -32,25 +31,30 @@ Download Caltech-UCSD Birds-200-2011 dataset, transfer it to mindrecord, use Min > **2) -> Download -> Segmentations** 2. Unzip the training data to dir example/nlp_to_mindrecord/Caltech-UCSD-Birds-200-2011/data. - ``` + +```bash tar -zxvf CUB_200_2011.tgz -C {your-mindspore}/example/cv_to_mindrecord/Caltech-UCSD-Birds-200-2011/data/ tar -zxvf segmentations.tgz -C {your-mindspore}/example/cv_to_mindrecord/Caltech-UCSD-Birds-200-2011/data/ - ``` - - The unzip should like this: - ``` +``` + + The unzip should like this: + +```bash $ ls {your-mindspore}/example/cv_to_mindrecord/Caltech-UCSD-Birds-200-2011/data/ attributes.txt CUB_200_2011 README.md segmentations - ``` +``` ### Generate MindRecord -1. Run the run.sh script. - ```bash +1.Run the run.sh script. + +```bash bash run.sh - ``` +``` -2. Output like this: - ``` +2.Output like this: + +```bash ... >> begin generate mindrecord >> sample id: 1, filename: data/CUB_200_2011/images/001.Black_footed_Albatross/Black_Footed_Albatross_0046_18.jpg, bbox: [60.0, 27.0, 325.0, 304.0], label: 1, seg_filename: data/segmentations/001.Black_footed_Albatross/Black_Footed_Albatross_0046_18.png, class: 001.Black_footed_Albatross @@ -72,10 +76,11 @@ Download Caltech-UCSD Birds-200-2011 dataset, transfer it to mindrecord, use Min [INFO] MD(11253,python):2020-05-20-16:22:21.964.034 [mindspore/ccsrc/mindrecord/io/shard_index_generator.cc:549] ExecuteTransaction] Insert 11788 rows to index db. [INFO] MD(11253,python):2020-05-20-16:22:21.978.087 [mindspore/ccsrc/mindrecord/io/shard_index_generator.cc:620] DatabaseWriter] Generate index db for shard: 0 successfully. 
[INFO] ME(11253:139923799271232,MainProcess):2020-05-20-16:22:21.979.634 [mindspore/mindrecord/filewriter.py:313] The list of mindrecord files created are: ['output/CUB_200_2011.mindrecord'], and the list of index files are: ['output/CUB_200_2011.mindrecord.db'] - ``` +``` -3. Generate mindrecord files - ``` +3.Generate mindrecord files + +```bash $ ls output/ CUB_200_2011.mindrecord CUB_200_2011.mindrecord.db README.md ``` @@ -83,11 +88,13 @@ Download Caltech-UCSD Birds-200-2011 dataset, transfer it to mindrecord, use Min ### Create MindDataset By MindRecord 1. Run the run_read.sh script. + ```bash bash run_read.sh ``` 2. Output like this: + ``` [INFO] MD(12469,python):2020-05-20-16:26:38.308.797 [mindspore/ccsrc/dataset/util/task.cc:31] operator()] Op launched, OperatorId:0 Thread ID 139702598620928 Started. [INFO] MD(12469,python):2020-05-20-16:26:38.322.433 [mindspore/ccsrc/mindrecord/io/shard_reader.cc:343] ReadAllRowsInShard] Get 11788 records from shard 0 index. @@ -123,6 +130,7 @@ Download Caltech-UCSD Birds-200-2011 dataset, transfer it to mindrecord, use Min >> total rows: 11788 [INFO] MD(12469,python):2020-05-20-16:26:49.582.298 [mindspore/ccsrc/dataset/util/task.cc:128] Join] Watchdog Thread ID 139702607013632 Stopped. ``` + - bbox : coordinate value of the bounding box in the picture. - image: the image bytes which is from like "data/CUB_200_2011/images/001.Black_footed_Albatross/Black_Footed_Albatross_0001_796111.jpg". - image_filename: the image name which is like "Black_Footed_Albatross_0001_796111.jpg" diff --git a/model_zoo/utils/cv_to_mindrecord/ImageNet_Similar_Perf/README.md b/model_zoo/utils/cv_to_mindrecord/ImageNet_Similar_Perf/README.md index d3bd5fdc18..d272d62bcb 100644 --- a/model_zoo/utils/cv_to_mindrecord/ImageNet_Similar_Perf/README.md +++ b/model_zoo/utils/cv_to_mindrecord/ImageNet_Similar_Perf/README.md @@ -9,16 +9,15 @@ - [Implement data generator](#implement-data-generator) - [Run data generator](#run-data-generator) - ## What does the example do This example provides an efficient way to generate MindRecord. Users only need to define the parallel granularity of training data reading and the data reading function of a single task. That is, they can efficiently convert the user's training data into MindRecord. -1. run_template.sh: entry script, users need to modify parameters according to their own training data. -2. writer.py: main script, called by run_template.sh, it mainly reads user training data in parallel and generates MindRecord. -3. template/mr_api.py: uers define their own parallel granularity of training data reading and single task reading function through the template. +1. run_template.sh: entry script, users need to modify parameters according to their own training data. +2. writer.py: main script, called by run_template.sh, it mainly reads user training data in parallel and generates MindRecord. +3. template/mr_api.py: uers define their own parallel granularity of training data reading and single task reading function through the template. ## Example test for ImageNet @@ -29,15 +28,17 @@ This example provides an efficient way to generate MindRecord. Users only need t Store the downloaded ImageNet dataset in a folder. The folder contains all images and a mapping file that records labels of the images. In the mapping file, there are three columns, which are separated by spaces. They indicate image classes and label IDs. 
The following is an example of the mapping file: - ``` - n02119789 0 - n02100735 1 - n02110185 2 - n02096294 3 + + ```bash + n02119789 0 + n02100735 1 + n02110185 2 + n02096294 3 ``` 2. Edit run_imagenet.sh and modify the parameters - ``` + + ```bash --mindrecord_file: output MindRecord file. --mindrecord_partitions: the partitions for MindRecord. --label_file: ImageNet label map file. @@ -45,51 +46,59 @@ This example provides an efficient way to generate MindRecord. Users only need t ``` 3. Run the bash script - ```bash + + ```bash bash run_imagenet.sh ``` 4. Performance result - | Training Data | General API | Current Example | Env | - | ---- | ---- | ---- | ---- | - |ImageNet(140G)| 2h40m | 50m | CPU: Intel Xeon Gold 6130 x 64, Memory: 256G, Storage: HDD | +| Training Data | General API | Current Example | Env | +| ---- | ---- | ---- | ---- | +|ImageNet(140G)| 2h40m | 50m | CPU: Intel Xeon Gold 6130 x 64, Memory: 256G, Storage: HDD | ## How to use the example for other dataset ### Create work space Assume the dataset name is 'xyz' -* Create work space from template + +- Create work space from template + ```shell - cd ${your_mindspore_home}/example/cv_to_mindrecord/ImageNet_Similar_Perf - cp -r template xyz + cd ${your_mindspore_home}/example/cv_to_mindrecord/ImageNet_Similar_Perf + cp -r template xyz ``` ### Implement data generator Edit dictionary data generator. -* Edit file + +- Edit file + ```shell cd ${your_mindspore_home}/example/cv_to_mindrecord/ImageNet_Similar_Perf vi xyz/mr_api.py ``` Two API, 'mindrecord_task_number' and 'mindrecord_dict_data', must be implemented. + - 'mindrecord_task_number()' returns number of tasks. Return 1 if data row is generated serially. Return N if generator can be split into N parallel-run tasks. - 'mindrecord_dict_data(task_id)' yields dictionary data row by row. 'task_id' is 0..N-1, if N is return value of mindrecord_task_number() Tricky for parallel run. + - For ImageNet, one directory can be a task. - For TFRecord with multiple files, each file can be a task. -- For TFRecord with 1 file only, it could also be split into N tasks. Task_id=K means: data row is picked only if (count % N == K) +- For TFRecord with 1 file only, it could also be split into N tasks. Task_id=K means: data row is picked only if (count % N == K) ### Run data generator -* run python script - ```shell - cd ${your_mindspore_home}/example/cv_to_mindrecord/ImageNet_Similar_Perf - python writer.py --mindrecord_script xyz [...] - ``` - > You can put this command in script **run_xyz.sh** for easy execution +- run python script + +```shell +cd ${your_mindspore_home}/example/cv_to_mindrecord/ImageNet_Similar_Perf +python writer.py --mindrecord_script xyz [...] +``` +You can put this command in script **run_xyz.sh** for easy execution diff --git a/model_zoo/utils/graph_to_mindrecord/README.md b/model_zoo/utils/graph_to_mindrecord/README.md index df7ab33444..2fe216dd81 100644 --- a/model_zoo/utils/graph_to_mindrecord/README.md +++ b/model_zoo/utils/graph_to_mindrecord/README.md @@ -9,28 +9,29 @@ - [Implement data generator](#implement-data-generator) - [Run data generator](#run-data-generator) - ## What does the example do This example provides an efficient way to generate MindRecord. Users only need to define the parallel granularity of training data reading and the data reading function of a single task. That is, they can efficiently convert the user's training data into MindRecord. -1. 
write_cora.sh: entry script, users need to modify parameters according to their own training data. -2. writer.py: main script, called by write_cora.sh, it mainly reads user training data in parallel and generates MindRecord. -3. cora/mr_api.py: uers define their own parallel granularity of training data reading and single task reading function through the cora. +1.write_cora.sh: entry script, users need to modify parameters according to their own training data. +2.writer.py: main script, called by write_cora.sh, it mainly reads user training data in parallel and generates MindRecord. +3.cora/mr_api.py: users define their own parallel granularity of training data reading and single task reading function through the cora. ## Example test for Cora 1. Download and prepare the Cora dataset as required. 2. Edit write_cora.sh and modify the parameters - ``` + + ```bash --mindrecord_file: output MindRecord file. --mindrecord_partitions: the partitions for MindRecord. ``` 3. Run the bash script + ```bash bash write_cora.sh ``` @@ -40,31 +41,37 @@ This example provides an efficient way to generate MindRecord. Users only need t ### Create work space Assume the dataset name is 'xyz' -* Create work space from cora - ```shell + +- Create work space from cora + +```shell cd ${your_mindspore_home}/example/graph_to_mindrecord cp -r cora xyz - ``` +``` ### Implement data generator Edit dictionary data generator. -* Edit file - ```shell + +- Edit file + +```shell cd ${your_mindspore_home}/example/graph_to_mindrecord vi xyz/mr_api.py - ``` +``` Two API, 'mindrecord_task_number' and 'mindrecord_dict_data', must be implemented. + - 'mindrecord_task_number()' returns number of tasks. Return 1 if data row is generated serially. Return N if generator can be split into N parallel-run tasks. - 'mindrecord_dict_data(task_id)' yields dictionary data row by row. 'task_id' is 0..N-1, if N is return value of mindrecord_task_number() ### Run data generator -* run python script - ```shell +Run the Python script + +```shell cd ${your_mindspore_home}/example/graph_to_mindrecord python writer.py --mindrecord_script xyz [...] - ``` - > You can put this command in script **write_xyz.sh** for easy execution +``` +> You can put this command in script **write_xyz.sh** for easy execution diff --git a/model_zoo/utils/hccl_tools/README.md b/model_zoo/utils/hccl_tools/README.md index c237827c7e..84c935536a 100644 --- a/model_zoo/utils/hccl_tools/README.md +++ b/model_zoo/utils/hccl_tools/README.md @@ -1,17 +1,19 @@ -# description +# Description -MindSpore distributed training launch helper utilty that will generate hccl config file. +MindSpore distributed training launch helper utility that will generate hccl config file. -# use +## Usage -``` +```bash python hccl_tools.py --device_num "[0,8)" ``` output: -``` + +```bash hccl_[device_num]p_[which device]_[server_ip].json ``` -# Note +## Note + Please note that the Ascend accelerators used must be continuous, such [0,4) means to use four chips 0,1,2,3; [0,1) means to use chip 0; The first four chips are a group, and the last four chips are a group. In addition to the [0,8) chips are allowed, other cross-group such as [3,6) are prohibited.
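The note above restricts `--device_num` to a contiguous half-open range that must not cross the two groups of four chips. A small sketch of how that rule could be validated before a config file is generated (`parse_device_range` is a hypothetical helper for illustration, not part of `hccl_tools.py`):

```python
# Illustrative check for the "[first,last)" range accepted by --device_num.
# Per the note above, a range must stay inside chips 0-3 or 4-7, except for
# the full [0,8); this helper is not the actual hccl_tools.py code.


def parse_device_range(spec: str) -> list:
    first, last = (int(x) for x in spec.strip("[)").split(","))
    if not 0 <= first < last <= 8:
        raise ValueError(f"range {spec} is outside chips 0-7")
    if (first, last) != (0, 8) and not (last <= 4 or first >= 4):
        raise ValueError(f"range {spec} crosses the two 4-chip groups")
    return list(range(first, last))


print(parse_device_range("[0,8)"))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(parse_device_range("[0,4)"))  # [0, 1, 2, 3]
# parse_device_range("[3,6)") raises ValueError
```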
diff --git a/model_zoo/utils/nlp_to_mindrecord/aclImdb_preprocess/README.md b/model_zoo/utils/nlp_to_mindrecord/aclImdb_preprocess/README.md index 66bcf24756..0d9de4cd7c 100644 --- a/model_zoo/utils/nlp_to_mindrecord/aclImdb_preprocess/README.md +++ b/model_zoo/utils/nlp_to_mindrecord/aclImdb_preprocess/README.md @@ -8,16 +8,18 @@ - [Generate MindRecord](#generate-mindrecord) - [Create MindDataset By MindRecord](#create-minddataset-by-mindrecord) - ## What does the example do This example is used to read data from aclImdb dataset, preprocess it and generate mindrecord. The preprocessing process mainly uses vocab file to convert the training set text into dictionary sequence, which can be further used in the subsequent training process. -1. run.sh: generate MindRecord entry script. +1. run.sh: generate MindRecord entry script. + - gen_mindrecord.py : read the aclImdb data, preprocess it and transfer it to mindrecord. -2. run_read.py: create MindDataset by MindRecord entry script. + +2. run_read.py: create MindDataset by MindRecord entry script. + - create_dataset.py: use MindDataset to read MindRecord to generate dataset. ## How to use the example to generate MindRecord @@ -30,19 +32,22 @@ Download aclImdb dataset, transfer it to mindrecord, use MindDataset to read min > [aclImdb dataset download address](http://ai.stanford.edu/~amaas/data/sentiment/) **-> Large Movie Review Dataset v1.0** 2. Unzip the training data to dir example/nlp_to_mindrecord/aclImdb_preprocess/data. - ``` + + ```bash tar -zxvf aclImdb_v1.tar.gz -C {your-mindspore}/example/nlp_to_mindrecord/aclImdb_preprocess/data/ ``` ### Generate MindRecord 1. Run the run.sh script. + ```bash bash run.sh ``` 2. Output like this: - ``` + + ```bash ... >> begin generate mindrecord by train data >> transformed 256 record... @@ -70,7 +75,8 @@ Download aclImdb dataset, transfer it to mindrecord, use MindDataset to read min ``` 3. Generate mindrecord files - ``` + + ```bash $ ls output/ aclImdb_test.mindrecord aclImdb_test.mindrecord.db aclImdb_train.mindrecord aclImdb_train.mindrecord.db README.md ``` @@ -78,12 +84,14 @@ Download aclImdb dataset, transfer it to mindrecord, use MindDataset to read min ### Create MindDataset By MindRecord 1. Run the run_read.sh script. + ```bash bash run_read.sh ``` 2. Output like this: - ``` + + ```bash example 24992: { 'input_ids': array( [ -1, -1, 65, 0, 89, 0, 367, 0, -1, @@ -141,6 +149,7 @@ Download aclImdb dataset, transfer it to mindrecord, use MindDataset to read min 'score': array(7, dtype=int32), 'label': array(0, dtype=int32)} ``` + - id : the id "3219" is from review docs like **3219**_10.txt. - label : indicates whether the review is positive or negative, positive: 0, negative: 1. - score : the score "10" is from review docs like 3219_**10**.txt.
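The aclImdb example above derives the review `id` and `score` from the file name (for example `3219_10.txt`) and maps the review text onto vocabulary indices before writing MindRecord. A rough sketch of that per-review step, assuming a one-token-per-line vocab file, `-1` for unknown or padded positions, and a fixed sequence length — details the actual `gen_mindrecord.py` may handle differently:

```python
# Illustrative per-review preprocessing for the aclImdb -> MindRecord flow.
# The load_vocab helper, the -1 convention and the padding scheme are
# assumptions made for this sketch, not the script's real implementation.
import os
import numpy as np


def load_vocab(vocab_path: str) -> dict:
    with open(vocab_path, "r", encoding="utf-8") as f:
        return {token.strip(): idx for idx, token in enumerate(f)}


def review_to_record(file_path: str, text: str, vocab: dict,
                     label: int, seq_len: int = 512) -> dict:
    # File names look like "3219_10.txt": review id, then the review score.
    review_id, score = os.path.basename(file_path)[:-4].split("_")
    ids = [vocab.get(tok, -1) for tok in text.lower().split()][:seq_len]
    ids += [-1] * (seq_len - len(ids))            # pad to a fixed length
    return {
        "input_ids": np.array(ids, dtype=np.int32),
        "id": np.array(int(review_id), dtype=np.int32),
        "score": np.array(int(score), dtype=np.int32),
        "label": np.array(label, dtype=np.int32),  # positive: 0, negative: 1
    }
```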