* add data_type register CI test=develop
* add op add test case test=develop
* add test case for register op kernel test=develop
* fix shell script bug test=develop
* fix checkout branch test=develop
* remove test case code test=develop
* fix op_type.spec name test=develop
* fix default grad op maker ci bug, test=develop, test=document_fix
* remove some ops from paddle/fluid/op_use_default_grad_op_maker.spec, test=develop, test=document_fix
* fix the device supported of the op unique and unique_with_counts.
test=develop
test=document_fix
* Fix the precision of test in the op of unique and unique_with_counts.
test=develop
test=document_fix
* add param & grad shape check for sgd op
* add _reshape_inplece interface for dygraph parallel
* refine unittest based paddle/models scripts, test=develop
* add unittest for parallel grad fuse, test=develop
* Commit before merging develop
test=develop
* Backup after working with Huihuang logs
* Commit before deleting Huihuang debug loggings
* Commit before debug
test=develop
* Fix bug commit
test=develop
* Backup of fixing bugs
test=develop
* Clean up code
test=develop
* Fix a bug in sum_op
test=develop
* Add ascending for argsort
* Refine api doc description.
* Refine descending description
* Add int32 logic to speedup when data is small size.
* Remove int32 opt as not support in python
* add ut for comparing FP32 and QAT INT8
* add save qat transformed model python script
test=develop
* updated
* added missing file
* add "with_label"
test=develop
* performance benchmark as unit test
test=develop
* change names of unnecessary thing
* Change CMakeList.txt for model downloading and UT
test=develop
* change names of functions and params for more readable code
test=develop
* Change PADDLE_ENFORCE messages
test=develop
* fix indent problems
test=develop
* indent problems
test=develop
* Implement Int8 FC
* Integrate FC into INT8v2
test=develop
* int8 FC: transpose weights before computing scales
test=develop
* Add support for activation_type string in FC
test=develop
* Disable MKL-DNN's FC in VGG16 and 19
test=develop
* Disable FC quantization when mkldnn FC is disabled
test=develop
* Solve PADDLE_ENFORCES in FC int8
* Fix Paddle enforces and remove const cast
test=develop
* Fix style changes
test=develop
* Fix quantizer_tester test and add fc quantization
test=develop
* Fix FC test fail on CUDA
* Remove unnecessary log from quantize placement pass
test=develop
* Add Thread ID to FC hash key
test=develop
* Add comments to MKL-DNN FC Kernel
test=develop
* Refactor quantizer
test=develop
* Fix linter issues
test=develop
* Fix crash in slim googlenet
test=develop
* Fix PADDLE_ENFORCE messages
test=develop
* Add fc padding to solve mkl performance
test=develop
* fix gpu pass and error information
test=develop
* fix fc_fuse_pass_test
test=develop
* fix error information
test=develop
* fix error information
test=develop
* fix name and add fc op padding test
test=develop
* fix attributes
test=develop
* optimize fc padding
test=develop
* fix test
test=develop
* Refactor MKL-DNN ElementwiseMul
remove manual fallback, remove format attrs
test=develop
* Refine PADDLE_ENFORCEs in eltwise_mul_op.h
test=develop
* Make ElementwiseMulOp inherit from ElementwiseOp
* Change type of simd_width to int
test=develop
* Remove Constructor extensions in ElementwiseOp and ElementwiseMulOp
test=develop
* Restore attributes
test=develop
* Fix test coverage for mkldnn eltwise mul
test=develop
* Conform to new is_run_common_broadcast API
test=develop
* Add UT for AreDimsAndFormatCorrect
test=develop
* Improve argsort performance.
- Give 200000 data to compute argsort on v100,
can speed up ~190x
before opt cost: 0.53s
after opt cost:0.0027s
- Add fp16 support
* Refine error message
* Refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* fix fetch handler problem and refactor
when a user define FetchHandler class, he or she should initialize a handler
with variable dict. the key of a variable dict is a user defined name,
the value of a variable dict is a Varaible generated from python API.
For each fetching, a user should implement handler function in which
fetched_result_dict will be available and the user can access the fetched value
with user defined keys.
* Disable fusion_group pass for windows and mac. We will do some experiments on Linux first.
test=develop
* Print the subgraph when check failed.
test=develop
* add int8 kernel to lookup_table op and add dequantize op test=develop
* change paddle_enforce to paddle_enforce_eq test=develop
* change copyright and change some not suitable code test=develop
* remove debug log test=develop
* replace GetInputType with IndicateVarDataType test=develop
* fix EmptyGradMaker test=develop
* fix diff between cpu and gpu test=develop
* use memcopy when int8_t test=develop
* open dygraph op test, test=develop
* modify to_variable, test=develop
* modify input and output for dygraph, test=develop
* modify input and output for dygraph(fix bug), test=develop
* fix input processing of dygraph op test, test=develop
* fix bug, test=develop
* fix op test, test=develop
* fix forward bug for dygraph, test=develop
* fix mkldnn op test for forward, test=develop
* update nn.py for dygraph, test=develop
* fix crop_tensor_op, test=develop
* fix elementwise_mul_op, test=develop
* fix fill_op, test=develop
* fix some mkldnn op, test=develop
* open backward op test for dygraph, test=develop
* delete log, test=develop
* close backward op test for dygraph, test=develop
* fix bug for edit_distance_op and test_lstm_cudnn_op, test=develop
* fix optest backward bug for dygraph, test=develop
* fix optest backward bug for dygraph, test=develop
* close backward op test for dygraph, test=develop
* close backward op test for dygraph, test=develop
* open dygraph op test, test=develop
* fix op test for dygraph, fix GradOpDescMaker, test=develop
* fix bug for linear_chain_crf_op.h, test=develop
* remove log, test=develop
* remove log, test=develop
* remove log for op_test.py, test=develop
* remove log for op_test.py, test=develop
* fix bug for var_conv_2d_op, change PADDLE_ENFORCE, test=develop
* fix PADDLE_ENFORCE_EQ for hierarchical_sigmoid_op.cc, test=develop
* fix bug for test_increment_ngraph_op.py, test=develop
* fix lod for op test in dygraph, test=develop
* refactor op_test.py to reduce redundant code, test=develop
* fix lod optest, modify InputVar/OutputVar to HasInput/HasOutput, test=develop
* remove debug log, test=develop
* remove redundant code in base.py, test=develop
* fix some error in optest, test=develop
* fix ClearNoNeedBufferInputs function's bug for LoDTensor, test=develop
* refactor op_test.py, test=develop
* remove redundant writing, test=develop
* fix error(get tensor of the grad variable), test=develop
* fix test_concat_mkldnn test_conv2d_mkldnn, test=develop
* fix optest.py for get tensor of LoDTensor, test=develop
* fix optest.py for get tensor of LoDTensor, test=develop
* fix optest.py for get tensor of LoDTensor, test=develop
* fix some redundant code, test=develop
* reslove conflict and rewrite paddle error message, test=develop
* fix the CAPI ZeroCopy shape error and reconstruct the output obtain
* use an anonymous namespace to cover the functor
* fix unit tests because of the output of typeid(T).name() is different from linux and windows, test=develop
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearange of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions.
test=develop
* Improve the loading statements in generated codes.
test=develop
* Remove unused arguments from formal list.
test=develop
* fix auc drop first commit test=develop
* update datanorm op
* update datanorm with enforce test=develop
* update test=develop
* update format test=develop
* update format
* update format test=develop
* add unit test test=develop
* update unit test test=develop
* update format test=develop
* update format test=develop
* update API description test=develop
* update API description test=develop
* update format test=develop
* fix codes as comments test=develop
* fix description as comments test=develop
* fix description as comments test=develop
* update codes.. test=develop
* modified error message for conv and conv_transpose, test=develop
* modified doc of conv and conv_transpose op, test=develop
* modified the expression for error message, test=develop
* modified error message for group_norm op, test=develop
* modified detail of Attr(data_format) or Attr(data_layout)
* add ValueError in API doc for maxout op, test=develop