
# VGG16 Example

## Description

This example trains and evaluates the VGG16 model on the CIFAR-10 dataset.

## Requirements

- Install [MindSpore](https://www.mindspore.cn/install/en).

- Download the CIFAR-10 binary version dataset.

> Unzip the CIFAR-10 dataset to any path. The folder structure should be as follows:
> ```
> .
> ├── cifar-10-batches-bin  # train dataset
> └── cifar-10-verify-bin   # infer dataset
> ```
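
Before starting a run, it can help to confirm that the dataset path has the layout shown above. The following is a minimal sketch, not part of the example scripts; the helper name `missing_dataset_dirs` is hypothetical, and the two folder names are taken from the tree above.

```python
from pathlib import Path

# Folder names copied from the layout described above.
EXPECTED_DIRS = ("cifar-10-batches-bin", "cifar-10-verify-bin")


def missing_dataset_dirs(data_path):
    """Return the expected sub-folders that are absent under data_path."""
    root = Path(data_path)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_dataset_dirs("your_data_path")` returns an empty list when both folders are present, and otherwise lists the folders that still need to be created or unzipped.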

## Running the Example

### Training

```
python train.py --data_path=your_data_path --device_id=6 > out.train.log 2>&1 & 
```
The Python command above runs in the background. You can view the results in the file `out.train.log`.

After training, you'll get some checkpoint files under the script folder by default.

You will get loss values like the following:
```
# grep "loss is " out.train.log
epoch: 1 step: 781, loss is 2.093086
epoch: 2 step: 781, loss is 1.827582
...
```
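
If you want the loss values as numbers rather than as `grep` output, the log lines can be parsed with a regular expression. This is a minimal sketch that assumes every loss line has the form `epoch: N step: M, loss is X`, as in the sample above; the function name `parse_losses` is hypothetical.

```python
import re

# Matches lines such as "epoch: 1 step: 781, loss is 2.093086".
LOSS_LINE = re.compile(
    r"epoch:\s*(\d+)\s+step:\s*(\d+),\s*loss is\s*(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"
)


def parse_losses(lines):
    """Return a list of (epoch, step, loss) tuples found in the given lines."""
    records = []
    for line in lines:
        match = LOSS_LINE.search(line)
        if match:
            epoch, step, loss = match.groups()
            records.append((int(epoch), int(step), float(loss)))
    return records
```

For example, `parse_losses(open("out.train.log"))` collects one tuple per matching line and skips everything else in the log.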

### Evaluation

```
python eval.py --data_path=your_data_path --device_id=6 --checkpoint_path=./train_vgg_cifar10-70-781.ckpt > out.eval.log 2>&1 & 
```
The Python command above runs in the background. You can view the results in the file `out.eval.log`.

You will get the accuracy as follows:
```
# grep "result: " out.eval.log
result: {'acc': 0.92}
```
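
The result line can also be read programmatically. The sketch below assumes the text after `result: ` is a Python dict literal, as in the sample above, and uses `ast.literal_eval` to parse it safely; the function name `parse_result` is hypothetical.

```python
import ast


def parse_result(line):
    """Return the metrics dict from a "result: {...}" line, or None if absent."""
    prefix = "result: "
    index = line.find(prefix)
    if index == -1:
        return None
    # literal_eval only accepts Python literals, so arbitrary code is never run.
    return ast.literal_eval(line[index + len(prefix):].strip())
```

Applied to the sample line above, `parse_result` returns a dict whose `acc` entry holds the accuracy.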

### Distributed Training
```
sh run_distribute_train.sh rank_table.json your_data_path
```
The shell script above runs distributed training in the background. You can view the results in the file `train_parallel[X]/log`.

You will get loss values like the following:
```
# grep "loss is " train_parallel*/log
train_parallel0/log:epoch: 1 step: 97, loss is 1.9060308
train_parallel0/log:epoch: 2 step: 97, loss is 1.6003821
...
train_parallel1/log:epoch: 1 step: 97, loss is 1.7095519
train_parallel1/log:epoch: 2 step: 97, loss is 1.7133579
...
...
```
> For details about `rank_table.json`, refer to the [distributed training tutorial](https://www.mindspore.cn/tutorial/en/master/advanced_use/distributed_training.html).
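
When several log files are searched at once, `grep` prefixes each match with the file name, so the losses can be grouped per log. The following is a minimal sketch under the assumption that each line looks like `train_parallelN/log:epoch: E step: S, loss is X`; the function name `losses_by_log` is hypothetical.

```python
import re
from collections import defaultdict

# Matches grep output such as
# "train_parallel0/log:epoch: 1 step: 97, loss is 1.9060308".
GREP_LINE = re.compile(
    r"^(?P<log>[^:]+):epoch:\s*(?P<epoch>\d+)\s+step:\s*(?P<step>\d+),"
    r"\s*loss is\s*(?P<loss>\d+(?:\.\d+)?)"
)


def losses_by_log(lines):
    """Group (epoch, loss) pairs by the log file named at the start of each line."""
    grouped = defaultdict(list)
    for line in lines:
        match = GREP_LINE.match(line)
        if match:
            grouped[match.group("log")].append(
                (int(match.group("epoch")), float(match.group("loss")))
            )
    return dict(grouped)
```

Grouping by file makes it easy to compare how the loss evolves on each device, since every `train_parallel[X]/log` gets its own list.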

## Usage

### Training
```
usage: train.py [--device_target TARGET] [--data_path DATA_PATH]
                [--device_id DEVICE_ID]

parameters/options:
  --device_target       the training backend type, default is Ascend.
  --data_path           the storage path of the dataset.
  --device_id           the device used to train the model.
```

### Evaluation

```
usage: eval.py [--device_target TARGET] [--data_path DATA_PATH]
               [--device_id DEVICE_ID] [--checkpoint_path CKPT_PATH]

parameters/options:
  --device_target       the evaluation backend type, default is Ascend.
  --data_path           the storage path of the dataset.
  --device_id           the device used to evaluate the model.
  --checkpoint_path     the path of the checkpoint file used to evaluate the model.
```
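
For readers who want to see how options like these are typically declared, here is a hypothetical `argparse` sketch. It is not the actual `eval.py` code: only the option names and the `Ascend` default come from the usage text above, while the types, the `None` defaults, and the function name `build_parser` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options listed above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="VGG16 evaluation options (sketch)")
    parser.add_argument("--device_target", type=str, default="Ascend",
                        help="the evaluation backend type, default is Ascend")
    # The defaults below are assumptions; the usage text does not state them.
    parser.add_argument("--data_path", type=str, default=None,
                        help="the storage path of the dataset")
    parser.add_argument("--device_id", type=int, default=None,
                        help="the device used to evaluate the model")
    parser.add_argument("--checkpoint_path", type=str, default=None,
                        help="the path of the checkpoint file to evaluate")
    return parser
```

With this sketch, `build_parser().parse_args(["--data_path=your_data_path", "--device_id=6"])` yields a namespace in which `device_target` falls back to `Ascend`.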

### Distributed Training

```
Usage: sh script/run_distribute_train.sh [MINDSPORE_HCCL_CONFIG_PATH] [DATA_PATH]

parameters/options:
  MINDSPORE_HCCL_CONFIG_PATH   HCCL configuration file path.
  DATA_PATH                    the storage path of dataset.
```