* Add support for new QAT models
test=develop
Co-Authored-By: Michał Gallus <michal.gallus@intel.com>
Co-Authored-By: Wojciech Uss <wojciech.uss@intel.com>
* fixed FPS results
test=develop
* fix top5 accuracy drop problem
* updated for new QAT models
* skip quantizing average pooling - dirty but working
* add missing pass
* added missing conv+brelu fuse pass
* removed a call to non-existent pass
test=develop
* renamed pass
test=develop
* Adjust finding pooling scale to newest QAT models
* Remove unnecessary code from quantization_mkldnn_pass
* Copy Pooling input scale to output scale in QAT
* Refactor & remove unused code in QAT
* Incorporate fp32 FC into QAT
test=develop
* Enable graph drawing with debug flag
test=develop
* Add tests for QATv2
* Fix paths for QATv2 models
test=develop
* Add option to save transformed int8 qat model
test=develop
* Remove redundant lines from qat mkldnn pass
test=develop
* Delegate disablement of avg pooling to qat
test=develop
* fix CI bug, test=develop
* Follow Wangzhen's Review, test=develop
* Update API.spec
test=develop
* Name False in (is_unsigned, TensorScale) tuple
test=develop
@@ -6,11 +6,11 @@ This document describes how to use [Paddle Slim](https://github.com/PaddlePaddle
You need to install at least the PaddlePaddle 1.5 Python package: `pip install paddlepaddle==1.5`.
## 1. How to generate INT8 MKL-DNN QAT model
-You can refer to the unit test in [test_quantization_mkldnn_pass.py](test_quantization_mkldnn_pass.py). Users first use the PaddleSlim quantization strategy to get a saved fake QAT model via [QuantizationFreezePass](https://github.com/PaddlePaddle/models/tree/develop/PaddleSlim/quant_low_level_api), then use the `TransformForMkldnnPass` to obtain a graph that can be run with MKL-DNN INT8 kernels. In Paddle Release 1.5, this pass supports only `conv2d` and `depthwise_conv2d` with channel-wise quantization for weights.
+You can refer to the unit test in [test_quantization_mkldnn_pass.py](test_quantization_mkldnn_pass.py). Users first use the PaddleSlim quantization strategy to get a saved fake QAT model via [QuantizationFreezePass](https://github.com/PaddlePaddle/models/tree/develop/PaddleSlim/quant_low_level_api), then use the `FakeQAT2MkldnnINT8KernelPass` to obtain a graph that can be run with MKL-DNN INT8 kernels. In Paddle Release 1.5, this pass supports only `conv2d` and `depthwise_conv2d` with channel-wise quantization for weights.
```python
import paddle.fluid as fluid
-from paddle.fluid.contrib.slim.quantization import TransformForMkldnnPass
+from paddle.fluid.contrib.slim.quantization import FakeQAT2MkldnnINT8KernelPass
from paddle.fluid.framework import IrGraph
from paddle.fluid import core
@@ -18,9 +18,9 @@ You can refer to the unit test in [test_quantization_mkldnn_pass.py](test_quanti
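For context, the body elided by the hunk above applies the renamed pass to an `IrGraph`. Below is a minimal sketch assuming the Paddle 1.5 pass interface, i.e. a constructor taking a scope and a place and an `apply(graph)` method that transforms the graph; the sketch is illustrative, not the exact README content:

```python
import paddle.fluid as fluid
from paddle.fluid.contrib.slim.quantization import FakeQAT2MkldnnINT8KernelPass
from paddle.fluid.framework import IrGraph
from paddle.fluid import core

# Build an IrGraph from a Program holding the saved fake QAT model.
graph = IrGraph(core.Graph(fluid.Program().desc), for_test=False)
place = fluid.CPUPlace()

# Assumption: the constructor takes (scope, place), matching the
# Paddle 1.5 quantization pass interface.
mkldnn_pass = FakeQAT2MkldnnINT8KernelPass(fluid.global_scope(), place)

# Transform the fake QAT graph into one runnable with MKL-DNN INT8 kernels.
mkldnn_pass.apply(graph)
```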