* add simple attr support and test
* add int, float attr support
* support other attribute
* add custom attrs test in cmake
* polish details
* fix test failed
* add backward test
* update test flags
* add group norm plugin
* fix compile problems
* move concat axis check to trt op teller
* add nbDims for scale and bias nv dims
* add group norm unit test
* fix unittest
* add trt version restriction for group norm op teller
* fix unittest
* fix entry
* fix distributed lookup table fuse case
* fix entry bug at first time
* move entry from paddle.fluid -> paddle.distributed
* fix ut with paddle.enable_static()
Co-authored-by: malin10 <malin10@baidu.com>
* added support for fake_quantize_dequantize_abs_max op in quantization inference pass
* remove const_cast to pass ci
* remove compare operator to pass ci-coverage
* added detailed error message for unregistered tensorrt_subgrah_pass
* add more dispatch marco
* add more dispatch marco
* add more tests
* revert unneeded change
* add timeout for test dispatch
* add float and complex test
* remove and marco
* [static setitem] support the index step > 1. tensor_a[::3] = value
* [static setitem] support the index step < 0. Eg: tensor_a[::-3] = value
* [static setitem] support the index is Tensor. eg: tensor_a[tensor_3:0:-1] = value
* Add op version.
* Add conv transpose BF16
* Share function GetWeightsTz
* Adjust to review and fix op compatibility
* Add bias to unique handler name
* Remove errors related to paddle enforce
* Add conv2d_transpose to bf16 list and kernel refator
* initial commit: simple demo
* polish copyright format
* add grap op simple demo
* adapt uncertain number of argument
* change trait marco name
* add place & dtype support for add kernel
* add dispath and infershape func
* poish code & add notes
* add dynamic_loader dep for paddle_framework
* add new custom op test dir
* polish impl details
* add unittest for new custom op
* fix failed unittest
* Costum op (#1)
* fix compile error
* wrap framework tensor with LoDTensor
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* add CustomTensor default constructor
* add size() for CustomTensor
* make size const for CustomTensor
* refactor place related api to circle the concept
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* make place const
* make Tensor copy
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* remove additional head of framework
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* add gpu test
* merge latest cwh code in
* adjust ut code of custom op
* adjust ut code of custom op
* adjust ut code of custom op
* Remove ShareData from user && Change CustomTensor to Tensor && Support more data type (#2)
* fix compile error
* wrap framework tensor with LoDTensor
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* add CustomTensor default constructor
* add size() for CustomTensor
* make size const for CustomTensor
* refactor place related api to circle the concept
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* make place const
* make Tensor copy
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* remove additional head of framework
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* add gpu test
* merge latest cwh code in
* adjust ut code of custom op
* adjust ut code of custom op
* adjust ut code of custom op
* adjust ut code of custom op
* adjust ut code of custom op
* hid share data from and to
* rename CustomTensor to Tensor
* refactor register design & add test
* change op_funtion to op_meta_info
* split op meta info into .h and .cc
* move get methods into friend class
* move OpMetaInfoHelper into framework space
* move CustomTensorUtils into framework space
* change pybind api name
* move PD C API into op meta info
* add register custom op api
* remove inference cmake change
* refactor copy to api && change Reshape to lowercase && support more dtype && add more test (#3)
* fix compile error
* wrap framework tensor with LoDTensor
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* add CustomTensor default constructor
* add size() for CustomTensor
* make size const for CustomTensor
* refactor place related api to circle the concept
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* make place const
* make Tensor copy
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* remove additional head of framework
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* add gpu test
* merge latest cwh code in
* adjust ut code of custom op
* adjust ut code of custom op
* adjust ut code of custom op
* adjust ut code of custom op
* adjust ut code of custom op
* hid share data from and to
* rename CustomTensor to Tensor
* support multi dtype
* remove lod, make reshape lowercase, add copy test and refactor copy api
* remove lod, make reshape lowercase, add copy test and refactor copy api
* remove lod, make reshape lowercase, add copy test and refactor copy api
* remove lod, make reshape lowercase, add copy test and refactor copy api
* fix copy to error
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* polish detail & error message
* polish test details
* Add cast api && Change copy related api to copy_to && add more test (#4)
* fix compile error
* wrap framework tensor with LoDTensor
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* add CustomTensor default constructor
* add size() for CustomTensor
* make size const for CustomTensor
* refactor place related api to circle the concept
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* fix compile error
* make place const
* make Tensor copy
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* debug CustomTensor core
* remove additional head of framework
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* use back to shared ptr for custom tensor
* add gpu test
* merge latest cwh code in
* adjust ut code of custom op
* adjust ut code of custom op
* adjust ut code of custom op
* adjust ut code of custom op
* adjust ut code of custom op
* hid share data from and to
* rename CustomTensor to Tensor
* support multi dtype
* remove lod, make reshape lowercase, add copy test and refactor copy api
* remove lod, make reshape lowercase, add copy test and refactor copy api
* remove lod, make reshape lowercase, add copy test and refactor copy api
* remove lod, make reshape lowercase, add copy test and refactor copy api
* fix copy to error
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add more test
* add type cast
* add cast and make copy to api
* add cast and make copy to api
* add cast and make copy to api
* add cast and make copy to api
* merge cwh code
* merge cwh code
* merge cwh code
* merge cwh code
* merge cwh code
* add more error log
* add more error log
* polish code
* used for test
* remove test comment
* remove test comment
* fix uint8 type error
* fix lost uint8 type error
* add test for coverage
* polish details by reviewer comments
* add prefix for DISABLE_COPY_AND_ASSIGN
Co-authored-by: Jiabin Yang <360788950@qq.com>
* support xpu inference with analysis predictor, test=develop
* merge the cmake of the xpu toolchain, test=develop
* add c-apis, test=develop
* fix a bug in extern_xpu, test=develop
* rewrite abs op
* rewrite abs op and remove abs in activation
* remove abs register in old codes
* fix abs_grad type error
* fix abs double_grad output name error
* modify abs_grad, abs_grad_grad functor for windows building
* format code style
* fix the bug of result is nan when the divisor is zero
* add missing abs attr and add abs for float16
* add view strategy on squeeze,unsqueeze,reshape,flatten
* add squeeze unittest
* add unittests
* use View strategy as name rather than Reuse Allacation
* fix view api doc
* fix format
* use core.ops when input of reshape2 is Tensor
* fix test_cross_entropy_loss error because of reshape2
* fix test_cross_entropy_loss error because of reshape2
* add inplace strategy
* add elementwise_add sub
* let backward op not use inplace
* grad op do not use inplace
* fix memory increase error and add leaf error message
* delete selected_rows
* change op_function
* little change
* solve HandleViewBetweenInputAndOutput
* add unittest and leaf error message
* merge view error
* optimize op_function_generator format and support sum inplace op
* fix format of basic_engine
* fix format for framework
* little change of variable wrapper
* add reshape, squeeze, unsqueeze, scatter api
* add relu elu tanh softmax inplace api
* fix test_squeeze_op unittest
* fix test_relu_op unittest
* fix comment problems
* delete sample code of inplace api
* add reference of grad_pending_nodes in basic_engine
* fix unittest name
* add inplace apis into wlist
* fix error message
* add PADDLE_ENFORCE for set grad op twice
* fix head file error
* added support for inference using qunatization aware trained dygraph
* added support for inference using qunatization aware trained dygraph
correct boost get usage
* Delete incorrect warning message (#30196)
* fix warning and no grad
* clean redundant API alias in 2.0 - part 2 (#30013)
* delete paddle.nn.functional.assign
* fix dynamic to static error
* just add the op error message for the matmul xpu (#30246)
add the op error message for the matmul xpu
* Add Static Variable Clone (#30208)
Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat
* use wget to replace curl to download the lcov file (#30229)
* use wget to replace curl to download the lcov file
* add cache for lcov
* fix test_pool3d_op timeout issue (#30248)
* Fix unittests bugs. (#30250)
* modify error message based on comments (#30189)
* modify error message based on comments
* edit code according to review.
* Correct spelling according to review.
* Fix bug for 'save mutiple method' (#30218)
* Fix bug for 'save mutiple method'
* To pass coverage.
* edit code to pass coverage.
* edit code to pass coverage.
* add unittest for coverage.
* change for coverage.
* edit for coverage.
* added support for inference using qunatization aware trained dygraph
* Alias from paddle.fluid.layers.auc to paddle.static.auc (#30206)
* add alias from fluid.layers.auc to static.auc
* Update __init__.py
* added support for inference using qunatization aware trained dygraph
correct boost get usage
* corrected boost get usage
* corrected naming issues and enforcing zero check
* correct paddle enforce message
* added more error checkings
* corrected error report message and optimized code
* corrected findvar usage
* corrected paddle_enforce in scope
* correct error messages
* correct error reporting format
Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
Co-authored-by: Bai Yifan <me@ethanbai.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: WeiXin <weixin10@baidu.com>
Co-authored-by: Jiaqi Liu <liujiaqi06@baidu.com>
usleep function in <unistd.h> only takes argument less than 1,000,000. Current call can exceed this limit, we have to fix it. This PR can fix random CI error.
* set expected place in child thread for dataloader
* set device id when set tensor from numpy
* revert tensor_py change
* add compile guard
* fix ci
* fix bug
* add view strategy on squeeze,unsqueeze,reshape,flatten
* add squeeze unittest
* add unittests
* use View strategy as name rather than Reuse Allacation
* fix view api doc
* fix format
* use core.ops when input of reshape2 is Tensor
* fix test_cross_entropy_loss error because of reshape2
* delete selected_rows
* change op_function
* little change
* solve HandleViewBetweenInputAndOutput
* add cast ops before and after unsupported fp16 ops.
* Keep partial net in FP32 pattern.
* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
* Add fp16 support for adam op.
* add multi precision attr for adam.
* Fix the bug of test_multi_precision_fp16_train UT.
* Code format for CI.
* Fix the redefine error about MPTypeTrait on windows.
* fix bugs of the _create_accumulators func in Momentum.
* fix bug when inserting post cast op.
* Add the update_loss_scaling op in allow_set of UnusedVarCheck.
* Update for ci coverage.
* Add some doc for OptimizerWithMixedPrecision.
* Fix the code style.
* Imporve the doc of `amp_init`.
* Change for fp16 testing if users have the infer program defined in separate way.