Update DNNL QAT document 2.0-alpha (#24494)

release/2.0-alpha
lidanqing 5 years ago committed by GitHub
parent db2b6b6568
commit 8ef3c02e90

@@ -109,10 +109,9 @@ The code snipped shows how the `Qat2Int8MkldnnPass` can be applied to a model gr
 ## 5. Accuracy and Performance benchmark
-This section contain QAT2 MKL-DNN accuracy and performance benchmark results measured on two servers:
+This section contain QAT2 MKL-DNN accuracy and performance benchmark results measured on the following server:
 * Intel(R) Xeon(R) Gold 6271 (with AVX512 VNNI support),
-* Intel(R) Xeon(R) Gold 6148.
 Performance benchmarks were run with the following environment settings:
@@ -144,17 +143,6 @@ Performance benchmarks were run with the following environment settings:
 | VGG16 | 72.08% | 71.73% | -0.35% | 90.63% | 89.71% | -0.92% |
 | VGG19 | 72.57% | 72.12% | -0.45% | 90.84% | 90.15% | -0.69% |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | FP32 Top1 Accuracy | INT8 QAT Top1 Accuracy | Top1 Diff | FP32 Top5 Accuracy | INT8 QAT Top5 Accuracy | Top5 Diff |
-| :----------: | :----------------: | :--------------------: | :-------: | :----------------: | :--------------------: | :-------: |
-| MobileNet-V1 | 70.78% | 70.85% | 0.07% | 89.69% | 89.41% | -0.28% |
-| MobileNet-V2 | 71.90% | 72.08% | 0.18% | 90.56% | 90.66% | +0.10% |
-| ResNet101 | 77.50% | 77.51% | 0.01% | 93.58% | 93.50% | -0.08% |
-| ResNet50 | 76.63% | 76.55% | -0.08% | 93.10% | 92.96% | -0.14% |
-| VGG16 | 72.08% | 71.72% | -0.36% | 90.63% | 89.75% | -0.88% |
-| VGG19 | 72.57% | 72.08% | -0.49% | 90.84% | 90.11% | -0.73% |
 #### Performance
 Image classification models performance was measured using a single thread. The setting is included in the benchmark reproduction commands below.
@@ -164,23 +152,12 @@ Image classification models performance was measured using a single thread. The
 | Model | FP32 (images/s) | INT8 QAT (images/s) | Ratio (INT8/FP32) |
 | :----------: | :-------------: | :-----------------: | :---------------: |
-| MobileNet-V1 | 77.00 | 210.76 | 2.74 |
-| MobileNet-V2 | 88.43 | 182.47 | 2.06 |
-| ResNet101 | 7.20 | 25.88 | 3.60 |
-| ResNet50 | 13.26 | 47.44 | 3.58 |
-| VGG16 | 3.48 | 10.11 | 2.90 |
-| VGG19 | 2.83 | 8.77 | 3.10 |
+| MobileNet-V1 | 74.05 | 196.98 | 2.66 |
+| MobileNet-V2 | 88.60 | 187.67 | 2.12 |
+| ResNet101 | 7.20 | 26.43 | 3.67 |
+| ResNet50 | 13.23 | 47.44 | 3.59 |
+| VGG16 | 3.47 | 10.20 | 2.94 |
+| VGG19 | 2.83 | 8.67 | 3.06 |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | FP32 (images/s) | INT8 QAT (images/s) | Ratio (INT8/FP32) |
-| :----------: | :-------------: | :-----------------: | :---------------: |
-| MobileNet-V1 | 75.23 | 103.63 | 1.38 |
-| MobileNet-V2 | 86.65 | 128.14 | 1.48 |
-| ResNet101 | 6.61 | 10.79 | 1.63 |
-| ResNet50 | 12.42 | 19.65 | 1.58 |
-| VGG16 | 3.31 | 4.74 | 1.43 |
-| VGG19 | 2.68 | 3.91 | 1.46 |
 Notes:
@@ -194,13 +171,8 @@ Notes:
 | Model | FP32 Accuracy | QAT INT8 Accuracy | Accuracy Diff |
 |:------------:|:----------------------:|:----------------------:|:---------:|
-| Ernie | 80.20% | 79.88% | -0.32% |
+| Ernie | 80.20% | 79.44% | -0.76% |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | FP32 Accuracy | QAT INT8 Accuracy | Accuracy Diff |
-| :---: | :-----------: | :---------------: | :-----------: |
-| Ernie | 80.20% | 79.64% | -0.56% |
 #### Performance
@@ -209,17 +181,10 @@ Notes:
 | Model | Threads | FP32 Latency (ms) | QAT INT8 Latency (ms) | Ratio (FP32/INT8) |
 |:------------:|:----------------------:|:-------------------:|:---------:|:---------:|
-| Ernie | 1 thread | 236.72 | 83.70 | 2.82x |
-| Ernie | 20 threads | 27.40 | 15.01 | 1.83x |
+| Ernie | 1 thread | 237.21 | 79.26 | 2.99x |
+| Ernie | 20 threads | 22.08 | 12.57 | 1.76x |
->**Intel(R) Xeon(R) Gold 6148**
-| Model | Threads | FP32 Latency (ms) | QAT INT8 Latency (ms) | Ratio (FP32/INT8) |
-| :---: | :--------: | :---------------: | :-------------------: | :---------------: |
-| Ernie | 1 thread | 248.42 | 169.30 | 1.46 |
-| Ernie | 20 threads | 28.92 | 20.83 | 1.39 |
 ## 6. How to reproduce the results
 The steps below show, taking ResNet50 as an example, how to reproduce the above accuracy and performance results for Image Classification models.
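
Review note: the updated ratio columns can be sanity-checked by recomputing them from the new values in this diff. The sketch below is only an illustration, not part of the commit. The helper names `throughput_ratio` and `latency_ratio` are hypothetical, and rounding to two decimals is an assumption about how the table values were produced. Throughput tables report INT8/FP32 (higher is better), while the Ernie latency table reports FP32/INT8.

```python
# Updated Gold 6271 values copied from the diff above.
# model: (FP32 images/s, INT8 QAT images/s)
throughput = {
    "MobileNet-V1": (74.05, 196.98),
    "MobileNet-V2": (88.60, 187.67),
    "ResNet101": (7.20, 26.43),
    "ResNet50": (13.23, 47.44),
    "VGG16": (3.47, 10.20),
    "VGG19": (2.83, 8.67),
}

# configuration: (FP32 latency in ms, QAT INT8 latency in ms)
latency = {
    "Ernie, 1 thread": (237.21, 79.26),
    "Ernie, 20 threads": (22.08, 12.57),
}


def throughput_ratio(fp32_ips, int8_ips):
    """Speedup for throughput numbers: INT8 / FP32 (hypothetical helper)."""
    return round(int8_ips / fp32_ips, 2)


def latency_ratio(fp32_ms, int8_ms):
    """Speedup for latency numbers: FP32 / INT8 (hypothetical helper)."""
    return round(fp32_ms / int8_ms, 2)


throughput_ratios = {m: throughput_ratio(*v) for m, v in throughput.items()}
latency_ratios = {m: latency_ratio(*v) for m, v in latency.items()}

for name, ratio in {**throughput_ratios, **latency_ratios}.items():
    print(f"{name}: {ratio:.2f}x")
```

The two helpers divide in opposite directions because a higher images/s figure and a lower latency figure both indicate a speedup.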
