* rm unused ckpt and sort ckpt
* use max op idx to sort, test=develop
* remove unused code, test=develop
* add testcase, test=develop
* modify test case, test=develop
* test=develop
Add input type and dtype check for sign_op.
* test=develop
Fix the api text format in sign op.
* test=develop
Fix the API examples in sign op and update the API.spec.
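A minimal sketch of the checked usage, assuming the `fluid.layers.sign` API of the Paddle 1.x release line; after this change a non-float input raises an informative type/dtype error instead of failing inside the op:

```python
import numpy as np
import paddle.fluid as fluid

# float32/float64 input passes the new type and dtype check
data = np.array([3.0, 0.0, -1.0], dtype='float32')
out = fluid.layers.sign(data)  # -> [1.0, 0.0, -1.0]
```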
* Update crf_decoding api & example (see the sketch after this commit list)
test=develop
* Update api spec
test=develop
* Fix linear chain crf api
test=develop
* Avoid sharing data pointer with input
test=develop
* Simplify the logic in linear_chain_crf_decoding
* Add unittest for crf_decoding when label & path both are set
test=develop
* Update API spec
test=develop
* Add unittest for layers && correct infer_shape in chunk_eval
test=develop
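A rough sketch of the updated usage, assuming the `fluid.layers.linear_chain_crf` / `crf_decoding` signatures of Paddle 1.x; the shapes and the 'crfw' parameter name are illustrative, and the label semantics below follow the op docs:

```python
import paddle.fluid as fluid

emission = fluid.layers.data(name='emission', shape=[10], dtype='float32',
                             lod_level=1)
label = fluid.layers.data(name='label', shape=[1], dtype='int64', lod_level=1)

# training cost over the shared transition parameters 'crfw'
crf_cost = fluid.layers.linear_chain_crf(
    input=emission, label=label, param_attr=fluid.ParamAttr(name='crfw'))

# decoding; when label is also given (the case the new unittest covers),
# the output marks whether the best path matches the label instead of
# returning the path itself
decoded = fluid.layers.crf_decoding(
    input=emission, param_attr=fluid.ParamAttr(name='crfw'), label=label)
```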
* test=develop, fix docker with paddle nccl problem
* test=develop, Add Variable api and refine dygraph related API (see the sketch after this commit list)
* test=develop, refine test for new api and error info
* test=develop, refine error info and test_layers
* test=develop, add API.spec
* test=develop, fix to_string python2 and python3 compat error and refine doc
* test=develop, add API spec
* test=develop, update API spec
* test=develop, update API spec
* test=develop, invoke ci
* test=develop, fix example code
* test=develop, update API spec
* test=develop, fix auto_prune_error_on_leaf
* test=develop, fix auto prune error on loss stop_gradient
* test=develop, remove useless error check
* test=develop, add more ut for sorted gradient
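A minimal sketch of the refined dygraph flow, assuming the Paddle 1.x dygraph API (`to_variable`, `backward`, `gradient`, `numpy`):

```python
import numpy as np
import paddle.fluid as fluid

with fluid.dygraph.guard():
    x = fluid.dygraph.to_variable(np.ones([2, 2], dtype='float32'))
    y = fluid.layers.reduce_sum(fluid.layers.elementwise_mul(x, x))
    # auto-prune skips branches that do not reach y, including those cut
    # off by stop_gradient, per the fixes above
    y.backward()
    print(y.numpy(), x.gradient())
```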
* fix the error message for reduce_mean and reduce_sum op test=develop
* fix typo test=develop
* fix according to review advice test=develop
* fix the test test=develop
* fix test=develop
* fix the constant error message test=develop
* fix typo test=develop
* fix typo test=develop
* fix code style test=develop
* fix comment and bugs test=develop
* fix the bug test=develop
* fix and add unittest test=develop
* fix the typo test=develop
* add support for the fill_constant op test=develop
* add test for ci coverage test=develop
1. support asymmetric padding;
2. support padding algorithm: "SAME" and "VALID";
3. support channel_last: data_format NHWC and NDHWC;
4. change doc of python API and c++;
test=develop, test=document_preview
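A sketch of the extended interface, assuming these changes land in `fluid.layers.conv2d`/`conv3d` as in the Paddle 1.6 API:

```python
import paddle.fluid as fluid

# channel_last input: NHWC
data = fluid.data(name='img', shape=[None, 32, 32, 3], dtype='float32')

# asymmetric padding: [pad_top, pad_bottom, pad_left, pad_right]
out1 = fluid.layers.conv2d(data, num_filters=8, filter_size=3,
                           padding=[1, 2, 1, 2], data_format='NHWC')

# padding algorithm instead of explicit sizes
out2 = fluid.layers.conv2d(data, num_filters=8, filter_size=3,
                           padding='SAME', data_format='NHWC')
```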
* Writing a custom op needs to follow the framework OP spec.
* Package fluid_framework.so and headers into whl.
* Add paddle.sysconfig.get_include() and paddle.sysconfig.get_lib() to get include dir and lib dir.
* Export some C-APIs to merge OpInfo between core.so and custom_op.so.
* Add unit testing.
* Update API.spec.
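A sketch of how the new helpers can drive an out-of-tree build; the g++ flags and the library name are assumptions for illustration, not the documented command:

```python
import paddle

include_dir = paddle.sysconfig.get_include()
lib_dir = paddle.sysconfig.get_lib()

# hypothetical compile command for a custom op built against the
# packaged headers and shared library:
#   g++ my_op.cc -o my_op.so -shared -fPIC -std=c++11 \
#       -I {include_dir} -L {lib_dir} -lpaddle_framework
print(include_dir, lib_dir)
```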
* test=develop, argument shape supports Tensor and Tensor in list (see the sketch after this commit list)
* test=develop, increase the coverage of CI tests
* test=develop, modify the document and update API.spec
* test=develop, modify the doc and update API.spec
* test=develop, modify the doc and update API.spec
* test=develop, modify the interface of UniformInitializer
* test=develop, modify the interface of XavierInitializer and MSRAInitializer
* test=develop, modify based on review's comments
* test=develop, modify based on review's comments
* test=develop, modify based on review's comments
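A sketch of the Tensor-shaped argument, assuming it applies to `fluid.layers.uniform_random` as the initializer commits above suggest:

```python
import paddle.fluid as fluid

# shape given as a 1-D int64 Tensor
shape = fluid.layers.fill_constant(shape=[2], dtype='int64', value=3)
out1 = fluid.layers.uniform_random(shape, min=-1.0, max=1.0)

# shape given as a list mixing Python ints and shape-[1] Tensors
dim0 = fluid.layers.fill_constant(shape=[1], dtype='int64', value=4)
out2 = fluid.layers.uniform_random([dim0, 5], min=-1.0, max=1.0)
```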
* fix pool2d pool3d:
1. support asymmetric padding;
2. support padding algorithm:"SAME" and "VALID";
3. support channel_last: data_format NHWC and NDHWC;
4. support inferring shape when input with negative dims in compile time;
5. change doc of python API and c++;
6. fix bug in cuda kernel when Attr(adaptive) is true.
test=develop,test=document_preview
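A sketch of the extended pool2d interface under these changes, assuming the Paddle 1.6 signature:

```python
import paddle.fluid as fluid

data = fluid.data(name='img', shape=[None, 32, 32, 3], dtype='float32')

# asymmetric padding: [pad_top, pad_bottom, pad_left, pad_right]
out1 = fluid.layers.pool2d(data, pool_size=3, pool_type='max',
                           pool_padding=[1, 2, 1, 2], data_format='NHWC')

# padding algorithm instead of explicit sizes
out2 = fluid.layers.pool2d(data, pool_size=3, pool_type='avg',
                           pool_padding='SAME', data_format='NHWC')
```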
* fix 'tensors' to 'Tensors'. test=develop,test=document_preview
* add test to cover ValueError. test=develop,test=document_preview
* resolve conflict in test_pool2d. test=develop
* Follow Wangzhen's comment in PR 18970, test=develop
* Review comments, test=develop
* Leave fake quantization around mul
test=develop
* Replace Fake with Real Quantized Mul
test=develop
* Fix bug in quantize placement pass
Nodes in the graph are now checked by type instead of node name when they are to be marked for quantization. test=develop
* test=develop, add compat test and fix inplace compat dict error
* improve error message when passing ndarray with object dtype
* improve message format
* change assert to raise TypeError
* remind user how to locate the irregular data instead of printing
* add unittest for input array type check
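A hypothetical sketch of the check described above; the helper name and message wording are illustrative:

```python
import numpy as np

def _check_feed_array(arr, name):
    # a ragged or mixed-type list silently becomes an object-dtype array;
    # raise TypeError early with a hint, instead of printing the data or
    # failing deep inside the executor
    if arr.dtype == np.object_:
        raise TypeError(
            "Feed '%s' holds elements of unequal lengths or mixed types; "
            "locate irregular rows with: "
            "[i for i, x in enumerate(data) if np.array(x).dtype == object]"
            % name)
```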
* Fix conv2d+dequantize squash for residual fusion
test=develop
* Correct int8 input
test=develop
* Add handling of exclude/include padding in pool2d mkldnn
test=develop
The new "fluid.data" changes old "fluid.layers.data":
1. Add shape and dtype check.
2. Remove "append_batch_size" parameter. We won't offer this in the new data layer because other deep learning platforms don't have this kind of data layer pre-processing. It may confuse users.
3. Remove "stop gradient" parameter because the data layer doesn't do back-propagation
TODO:
Now data layer feeded by executor is checked, will we want to check the feed data of readers in the future?
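A minimal sketch of the new layer, assuming the `fluid.data` signature introduced here:

```python
import paddle.fluid as fluid

# shape and dtype of the fed array are checked against this declaration;
# there is no append_batch_size: use None for the variable batch dimension
image = fluid.data(name='image', shape=[None, 3, 32, 32], dtype='float32')
label = fluid.data(name='label', shape=[None, 1], dtype='int64')
```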
* add kernel for fill_op, test=develop
* modify PADDLE_ENFORCE to PADDLE_ENFORCE_EQ, test=develop
* add op test for fill_op, test=develop
* REGISTER OP CUDA KERNEL, test=develop
* update test_fill_op.py, test=develop
* change FillConstantOpVarTypeInference to FillOpVarTypeInference, test=develop
* fix op test, test=develop
* add head file, test=develop
* add support of matmul with multiple head even different width and height
Original matmul with multiple head supports only the case mat_a.width == mat_b.height;
in that case, mat_b will be horizontally split. In this patch, we extend the
support to mat_a.width != mat_b.height as long as mat_a.width / head_number == mat_b.height;
in this case, mat_b will be vertically split.
One example: A is [3, 8], B is [2, 16], head_number is 4. In this
case, A will be split as [3, 2], B will be (vertically) split as
[2, 4]. The final result will be 4 matrices of [3, 4], i.e. [3, 16].
test=develop
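A NumPy sketch of the semantics described above (the worked example, not the kernel itself):

```python
import numpy as np

head_number = 4
A = np.random.rand(3, 8).astype('float32')   # mat_a
B = np.random.rand(2, 16).astype('float32')  # mat_b: A.shape[1] != B.shape[0]
assert A.shape[1] // head_number == B.shape[0]

A_parts = np.split(A, head_number, axis=1)   # four [3, 2] blocks
B_parts = np.split(B, head_number, axis=1)   # four [2, 4] blocks (vertical split)
out = np.concatenate([a @ b for a, b in zip(A_parts, B_parts)],
                     axis=1)                 # four [3, 4] results -> [3, 16]
```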
* refactor the code of matmul with multiple head even different width and height
test=develop
* Add support for new QAT models
test=develop
Co-Authored-By: Michał Gallus <michal.gallus@intel.com>
Co-Authored-By: Wojciech Uss <wojciech.uss@intel.com>
* fixed fps results
test=develop
* fix top5 accuracy drop problem
* updated for new QAT models
* skip quantizing average pooling - dirty but working
* add missing pass
* added missing conv+brelu fuse pass
* removed a call to non-existent pass
test=develop
* renamed pass
test=develop
* Adjust finding pooling scale to newest QAT models
* Remove unnecessary code from quantization_mkldnn_pass
* Copy Pooling input scale to output scale in QAT
* Refactor & remove unused code in QAT
* Incorporate fp32 FC into QAT
test=develop
* Enable graph drawing with debug flag
test=develop
* Add tests for QATv2
* Fix paths for QATv2 models
test=develop
* Add option to save transformed int8 qat model
test=develop
* Remove redundant lines from qat mkldnn pass
test=develop
* Delegate disablement of avg pooling to qat
test=develop
* fix CI bug, test=develop
* Follow Wangzhen's Review, test=develop
* Update API.spec
test=develop
* Name False in (is_unsigned, TensorScale) tuple
test=develop
* Remove the constraint that the last dimension is forced to be 1 by adding
lookup_table_v2 test=develop
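A sketch of the relaxed input shape, assuming the `fluid.embedding` API this series introduces; the sizes are illustrative:

```python
import paddle.fluid as fluid

# ids no longer need a trailing dimension of 1
ids = fluid.data(name='ids', shape=[None, 20], dtype='int64')
emb = fluid.embedding(input=ids, size=[10000, 64])  # -> [batch, 20, 64]
```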
* modify into PADDLE_ENFORCE_CUDA_SUCCESS test=develop
* Revert "modify into PADDLE_ENFORCE_CUDA_SUCCESS test=develop"
This reverts commit 8a960bfc61e51aa27c3c529df8fb90b93ebd19f9.
* move api into fluid.embedding test=develop
* fix example code test=develop
* move one_hot into fluid.one_hot
* modify api.spec test=develop
* fix loss shape test=develop
1. Support customize eval function instead of eval program.
2. Fix loading checkpoint in quantization strategy.
3. Support saving eval model when saving a checkpoint.
4. Fix decoder of loading context in PaddleSlim.
5. Fix restoring from the checkpoint of uniform prune strategy.
6. Support saving eval model and infer model during training.
7. Add unittest for saving eval model, saving infer model and restoring uniform pruning from the checkpoint.
8. Fix pruning of depthwise_conv_grad op by updating the groups.
* support change shuffle thread num
* support change train thread num
* fix receiving shuffle data of each channel
* data norm stop gradient
* add check of thread_tensor type and root_tensor type when merging metrics
* remove sleep in shuffle, add config
* add config of pslib client to client communication
* fix xbox str
* add data norm op testcase
* add flush in trainer finalize