wuhuanzhou
587d99ae44
update compilation with C++14 ( #31815 )
...
* update compilation with C++14, test=develop
* fix compilation error in eigen, test=develop
4 years ago
tianshuo78520a
b09c1ce09a
fix whl package push pypi ( #31585 )
...
* fix whl package push pypi
* add rst
4 years ago
Thunderbrook
393b3bd6b7
fix split core ( #31892 )
...
* fix split core
* format
4 years ago
wuhuanzhou
3a95a0bc26
update cmake minimum version to 3.15 ( #31807 )
...
* update cmake minimum version to 3.15, test=develop
* fix compilation error on Windows, test=develop
4 years ago
taixiurong
52b05baca3
fix some bug in transformer training in xpu ( #31918 )
4 years ago
Wenyu
5394194e3a
support minus-int idx to LayerList ( #31750 )
...
* support minus-int idx to LayerList
* update layerlist test
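A minimal sketch of what negative-index support in a list-like layer container could look like. `SimpleLayerList` and `_normalize_index` are hypothetical names used only for illustration; this is not Paddle's `LayerList` implementation.

```python
class SimpleLayerList:
    """Hypothetical stand-in for a layer container (not Paddle's LayerList)."""

    def __init__(self, layers=None):
        self._layers = list(layers or [])

    def __len__(self):
        return len(self._layers)

    def _normalize_index(self, idx):
        # Map a negative index onto the range [0, len) and reject anything else.
        n = len(self._layers)
        if not -n <= idx < n:
            raise IndexError(f"index {idx} out of range for length {n}")
        return idx + n if idx < 0 else idx

    def __getitem__(self, idx):
        return self._layers[self._normalize_index(idx)]

    def __setitem__(self, idx, layer):
        self._layers[self._normalize_index(idx)] = layer


# Strings stand in for real layer objects here.
layers = SimpleLayerList(["conv", "bn", "relu"])
last = layers[-1]
```

With this normalization, `layers[-1]` addresses the same slot as `layers[len(layers) - 1]`, and out-of-range negative indices fail loudly instead of wrapping around.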
4 years ago
furnace
ef8323d49e
[ROCM] Add ROCm support for warpctc op ( #31817 )
...
* bugfix for warpctc
* fix warpctc commit id
* fix WARPCTC_WITH_HIP invalid
* Add logs to find out why can not dlopen libwarpctc.so
* fix warpctc commit id
* fix unit test test_warpctc_op
* Optimize failed log for dlopen
* Delete extra changes
* fix warpctc commit id
* fix warpctc commit id
* Add is_compiled_with_rocm for test_warpctc_op
* fix warpctc commit id
* Cancel optimize dlopen failed reason, move to next pr, due to it makes windows ci failed
* fix code style problems
4 years ago
Jiawei Wang
95f808c878
fix stack op grad nullptr ( #31962 )
4 years ago
liym27
57d4288ad4
[dynamic setitem] Fix bug of dynamic setitem: Decrease axes to do right broadcast ( #31960 )
4 years ago
石晓伟
0fa6c8a35c
fix a syntax error, test=develop ( #31930 )
4 years ago
Pei Yang
98e803e04f
map_matmul_to_mul_pass support 3dim ( #31958 )
4 years ago
wuhuanzhou
a37a7f67e1
modify CI recommend information ( #31395 )
4 years ago
jakpiase
6dca7a1de7
Added int8 kernel for oneDNN LSTM op ( #31894 )
4 years ago
Pei Yang
14b7e3cf06
[Paddle-TRT] TRT inference support for BERT/Transformer in paddle 2.0 api ( #31744 )
...
* support multihead_matmul_fuse_pass_v3
* fix compile problems
* embedding_eltwise_ln pass support lookup_table_v2
* support matmul and matmul_v2 in qkv matmul
4 years ago
Zhou Wei
245252b86e
fix bug when dtype of to_tensor is core.VarType ( #31931 )
4 years ago
Zhen Wang
e1f931610e
Fix save/load error in imperative qat UT. ( #31937 )
4 years ago
Yiqun Liu
e50bc2c2a6
Enhance cmake to support specifying CUDA_ARCH_NAME to Ampere. ( #31923 )
4 years ago
Zhou Wei
04a49b097e
[Custom OP]Remove old custom OP and reduce whl package volume ( #31813 )
...
* Remove old custom OP to reduce whl package volume
* [Custom OP]Remove old custom OP to reduce whl package volume
4 years ago
wangguanzhong
fe2848686b
add exclusive for test_conv2d_op, test=develop ( #31936 )
4 years ago
chajchaj
73a6fa3ed0
add deprecated for softmax_with_cross_entropy ( #31722 )
...
* add deprecated for softmax_with_cross_entropy, test=develop
* test for deprecated in english doc, test=develop
* test deprecated for softmax_with_cross_entropy in english doc, test=develop
* fix readme and English doc for cross_entropy, test=develop
* rm test for softmax_with_cross_entropy deprecated, test=develop
* update readme for CrossEntropyLoss, test=develop
* fix readme format, test=develop
* fix readme format, test=develop
* fix readme format for cross_entropy, test=develop
* add softmax_switch and fix softlabel for cross_entropy, test=develop
* 1) recover softmax_with_cross_entropy in fluid 2) change softmax_switch to use_softmax 3) add example for softlabel for cross_entropy, test=develop
* fix Example number for cross_entropy, test=develop
* fix code format, test=develop
* fix for CI-Coverage, test=develop
* fix for CI-Coverage, test=develop
* fix ci-coverage for Non-ASCII character '\xe2' in file, test=develop
* fix ci-coverage for Non-ASCII character '\xe2' in nn.layer.loss.py, test=develop
* update description for doc when use_softmax=False, test=develop
* fix some docs and code example for cross_entropy, test=develop
* delete redundant description for soft_label parameter of cross_entropy, test=develop
* fix some comment for test_cross_entropy_loss.py, test=develop
4 years ago
Shang Zhizhou
8084b7594b
fix batchnorm when input dims < 3 ( #31933 )
...
* fix batchnorm when input dims < 3
* add unittest for batchnorm dims = 2
4 years ago
zlsh80826
64ee255ffd
[Paddle-TRT] yolobox ( #31755 )
...
* yolobox converter and plugin
* yolobox unittest
* add dynamic shape restriction
* fix git merge log
4 years ago
Aurelius84
c4b60efabd
Fix segment Fault from set_value ( #31891 )
...
* Avoid raising warning while import paddle
* fix segment fault of set_value
* fix code style
4 years ago
wuhuanzhou
17030ff28b
fix op benchmark ci error caused by missing test_pr branch, test=document_fix ( #31920 )
4 years ago
niuliling123
a71d72d921
relu forward and backward with vectortype ( #31869 )
4 years ago
tianshuo78520a
8829a309fe
Delete cudnn6 code ( #31835 )
4 years ago
wanghuancoder
b48841ba2e
modify API nn.Bilinear's doc ( #31889 )
...
* modify API nn.Bilinear's doc, test=develop
* modify API nn.Bilinear's doc, test=develop
4 years ago
liym27
525c32e33c
Fix bug of set_value op: Decrease axes to do right broadcast ( #31875 )
4 years ago
ronnywang
123949eb48
[ROCM] added a cudnn switch of conv2d for rocm platform ( #31836 )
4 years ago
Shang Zhizhou
61805d8f0a
fix cmake model path ( #31866 )
...
* fix cmake model path
* update cmake
* fix unittest
* fix unittest
4 years ago
Jiabin Yang
51eb29de18
[CustomOP] Add shape related constructor for Tensor ( #31681 )
...
* give shape related constructor and reshape warning
* change line num to fit ut
* change ut to fit
* remove useless code
* call resize directly in constructor
4 years ago
zlsh80826
e3a38d790a
[Paddle-TRT] roi_align_plugin ( #31732 )
...
* add roi_align_plugin
* add roi align unit_test
* add roi align serialization
* remove roi align static plugin because of batch dim issue
* refine roi align unittest and add fp16/serialization
* add trt roi align condition to op_teller
* refine error message
* remove unnecessary reshape layer
4 years ago
zlsh80826
bfb5cf5567
[Paddle-TRT] trt affine channel converter ( #31628 )
...
* trt affine channel converter
* add trt affine channel base test
* add trt affine channel NHWC
* remove asterisk for python2 compatibility
* fix rebase
* move LodTensor to Tensor
* add dbg info
* affine channel converter only support NCHW
* scale,bias are parameters, use create_parameters api
* reduce test input size to not exceed the timelimit of ci
* refine affine channel unittest and add serialization/dynamic test
* change super to InferencePassTest for python2 compatibility
* change super to InferencePassTest for python2 compatibility
* fix affine channel fp16 serialize setting
4 years ago
cc
b47478efc2
[dygraph qat] Use layer to calculate output scale ( #31861 )
...
* Use layer to calculate output scale
* add backward for moving_average_abs_max_scale and save output scales to op's attr
4 years ago
lilong12
c3974d0e2a
[3D-parallel] Reformat pipeline parallel ( #31786 )
...
* update, test=develop
4 years ago
zlsh80826
01aa252624
[Paddle-TRT] multiclass nms ( #31742 )
...
* add multiclass_nms
* add multiclass_nms unittest
* add default enable_tensorrt_oss option
* refine multiclass nms unittest and add serialization/dynamic test
* change super to InferencePassTest for python2 compatibility
* refine multiclass nms unittest
* move out dynamic shape test due to ci timelimit
4 years ago
Wilber
70b67f1029
fix go api bug. ( #31857 )
4 years ago
tianshuo78520a
e804f08559
delete include framework.pb.h ( #31859 )
...
* delete include framework.pb.h
* fix error
4 years ago
Chengmo
f58cb01864
【Paddle.Fleet】fix dataset zip py3 bug ( #31441 )
...
* fix zip py3 bug
4 years ago
Kaipeng Deng
bf09dcb346
add GPU tensor notice & update default_collate_fn/default_convert_fn. test=develop ( #31763 )
4 years ago
Chen Weihang
27f2d8df8e
Polish two error messages ( #31852 )
...
* polish two error messages
* polish details
4 years ago
Zhou Wei
511e204e62
LRScheduler.get_lr should not update lr in LinearWarmup ( #31843 )
4 years ago
niuliling123
6472d62093
Revert "add relu forward kernel and backward kernel ( #31613 )" ( #31853 )
4 years ago
winter-wang
e7f28d6c0d
fix runtime crash when rnn model inference, test=develop ( #31833 )
4 years ago
parap1uie-s
5d89ec36dc
Update pooling.py ( #31829 )
...
Fix default argument of nn.MaxPool3D()
4 years ago
Huihuang Zheng
649868ffb2
[Dy2stat] Fix the bug that loop_body_func may return single element ( #31806 )
...
The old `loop_body` function could return a single element when `loop_vars` contained only one element, which could cause bugs. The key point of this PR is to force `loop_body` functions to always return a tuple.
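As an illustration of why a tuple return matters, here is a small sketch of a while-loop driver that normalizes whatever the body returns. `to_tuple` and `run_while` are hypothetical helpers written for this example, not the dy2static implementation.

```python
def to_tuple(value):
    # Wrap a bare value so callers can always unpack the result uniformly.
    if isinstance(value, (list, tuple)):
        return tuple(value)
    return (value,)


def run_while(cond, body, loop_vars):
    # Hypothetical driver: the loop state is always carried as a tuple,
    # even when there is only one loop variable.
    loop_vars = to_tuple(loop_vars)
    while cond(*loop_vars):
        loop_vars = to_tuple(body(*loop_vars))
    return loop_vars


# One loop variable: the body returns a bare int, which gets wrapped.
single = run_while(lambda i: i < 3, lambda i: i + 1, [0])

# Two loop variables: the body already returns a tuple.
pair = run_while(lambda i, s: i < 3, lambda i, s: (i + 1, s + i), (0, 0))
```

Without the normalization step, the single-variable case would hand a bare value to `cond(*loop_vars)` on the second iteration and fail to unpack.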
4 years ago
Wojciech Uss
e5f7a834d4
fix cache key in concat oneDNN kernel ( #31820 )
...
* fix cache key in concat oneDNN kernel
* key simplified
4 years ago
Aurelius84
f2cfc0f46d
[CustomOp]Avoid raising warning while import paddle ( #31804 )
4 years ago
cc
84a551380e
[dygraph qat] Refine saving output scale to infer program ( #31784 )
...
* Refine saving output scale to infer program
4 years ago
Chen Weihang
68497e7b39
change trainable to stop_gradient in optimizer ( #31823 )
4 years ago
ronnywang
270699e647
[ROCM] fix test_matmul_v2_op ( #31802 )
4 years ago
Zhou Wei
1eb927f935
Restore the third-party library cache for windows ( #31811 )
4 years ago
Chen Weihang
3f66e7deab
add cmath header for bfloat ( #31792 )
4 years ago
Feiyu Chan
4046f1303a
add coalesce_tensor into white list when checking re-creation of parameters ( #31800 )
4 years ago
Zhou Wei
a70de87d76
Update windows compiler and CI from VS2015 to VS2017 ( #31652 )
...
* modify windows CI to VS2017
4 years ago
Wilber
f4d9212de2
trt plugin upgrade to pluginv2ext ( #31670 )
4 years ago
niuliling123
372ac08a17
add relu forward kernel and backward kernel ( #31613 )
...
* add relu forward kernel and backward kernel
4 years ago
Wojciech Uss
814b38e30f
update scale collection and propagation algorithm ( #31783 )
4 years ago
tianshuo78520a
513641e153
Delete fast_check_nan_inf ( #31788 )
...
* Delete fast_check_nan_inf
* Delete run_fast_nan_inf_debug
4 years ago
Shang Zhizhou
9d04ef7369
fix tensorrt output variable reshape ( #31733 )
...
* fix tensorrt output variable reshape
* move padding shape x 1 x 1 in ernie to qkv and fc
* update layer name
* fix softmax when input is dynamic, fc not padding any more
* fix varlen
* move fc x_dim assert to op_teller
4 years ago
Qi Li
46dd1d4aad
[ROCM] fix reduce_sum nan in ROCM platform, test=develop ( #31780 )
4 years ago
gongweibao
f72d197ec5
fix launch ps ut test=develop ( #31771 )
...
fix launch ps ut test=develop
4 years ago
Tao Luo
032de0bfd0
update approval ( #31782 )
4 years ago
zlsh80826
bfced39eb6
[Paddle-TRT] nearest_interp op ( #31626 )
...
* nearest_interp op converter w/ dynamic/static
* fix data_layout include
* add trt nearest unit_test
* add nearest_interp NHWC test
* update trt nearest interp nhwc testcase
* remove asterisk for python2 compatibility
* add empty line to prevent conflict
* change the priority of out_h, out_w
4 years ago
arlesniak
7ccf6b6030
[oneDNN] Initial bf16 amp integration ( #31093 )
4 years ago
lilong12
a501a7b0ca
[3D-parallel] add 1f1b scheduler for pipeline ( #31566 )
...
* add 1f1b scheduler for pp, test=develop
4 years ago
guofei
ed7956a816
Fix skip_quant in QAT ( #31704 )
...
* Fix skip_quant in QAT
4 years ago
ronnywang
8c19d7aa2f
[ROCM] fix test_conv2d_transpose_op ( #31749 )
4 years ago
Ouyang Chao
a45c8ca69d
fix bug of DepthwiseConvTransposeGradKernel ( #31762 )
4 years ago
Jacek Czaja
25fc2a1fdb
[oneDNN] Added Elementwise Mul grad fp32/bf16 ( #31647 )
4 years ago
Chen Weihang
878e117b6d
[CustomOp] Support float16 in custom op ( #31725 )
...
* support float16 in custom op
* fix failed unittests
4 years ago
ronnywang
c9e1d9dc31
[ROCM] fix test_rnn_op ( #31735 )
4 years ago
zlsh80826
1c67cf0c98
run radix sort of proposals layer on context stream ( #31631 )
4 years ago
Chen Weihang
e429deb0c4
[CustomOp] Support attribute in infershape function ( #31713 )
...
* support attribute in infershape
* polish details
4 years ago
Adam Osewski
a4a2b77def
[oneDNN] lookup_table op with support for BF16 data type. ( #31558 )
4 years ago
zlsh80826
c86e771e94
NMS Performance Optimization ( #31634 )
...
* replace mask vector to raw ptr
* launch nms on context stream
* remove redundant mask declaration
4 years ago
zlsh80826
50cafa0b0c
remove redundant sync, set collect/dist kernel to context stream, sub_lod memcpy opt ( #31641 )
4 years ago
cc
1d197f6c97
[dygraph qat] Refine calculating output scale of dygraph qat ( #31710 )
...
* Refine calculating output scale of dygraph qat, test=develop
4 years ago
ronnywang
420527f0d9
[ROCM] fix layer_norm, norm, p_norm, test_sequence_softmax_op, test_math_op_patch_var_base ( #31709 )
4 years ago
Chen Weihang
87852616aa
[CustomOp] Support complex dtype in custom op ( #31657 )
...
* support custom complex op
* fix detail error
* add inference support
* fix setup windows failed
4 years ago
zlsh80826
fe241fd02f
[Paddle-TRT] gather converter ( #31640 )
...
* trt gather converter
* add trt gather unit_test
4 years ago
zlsh80826
4ea3427865
[Paddle-TRT] support batch axis concatenation when using dynamic shape ( #31627 )
...
* support batch axis concatenation when using dynamic shape
* opteller can't return true early, or some test will not be executed
4 years ago
Zhou Wei
d4282ea97e
fix multi cuda environment bug ( #31694 )
4 years ago
Chengmo
09482ddec4
【Paddle.Fleet】Fix one ps gradient clip ( #31664 )
...
* fix one ps gradient clip
4 years ago
Kaipeng Deng
740359edaf
remove useless import ( #31700 )
...
* remove useless import. test=develop
4 years ago
Zhang Ting
7f50bb7ec1
support NHWC for temporal_shift op ( #31642 )
4 years ago
liym27
402288ad65
In __getitem__, convert integers to int64 Tensor not int32 to be compatible with Lite( #31658 )
4 years ago
Chen Weihang
2fbe9b097a
[CustomOp] Remove Eigen dependencies of float16 ( #31669 )
...
* remove eigen deps of float16
* add cstdlib header
* replace stdlib header by cmath
4 years ago
cc
19592d2b71
Refine dygraph qat, test=develop ( #31680 )
4 years ago
Zhou Wei
4c0c55bba1
support Geforce RTX 30+ GPU ( #31529 )
4 years ago
YUNSHEN XIE
cdc5a55ac1
turn off added ut check on windows ( #31660 )
4 years ago
Qi Li
d9b50f664f
[ROCM] update ci scripts and Dockerfile, test=develop ( #31551 )
4 years ago
YUNSHEN XIE
1a6e3b04cd
Second optimization of retry method ( #31646 )
...
* Second optimization of retry method
* fix repeated execution of show_ut_retry_result
4 years ago
wuhuanzhou
41e9ecfd1f
Optimize compilation with Ninja ( #31449 )
...
* Optimize compilation with Ninja, notest, test=windows_ci, test=windows_op
* no cache on windows ci, notest, test=windows_ci, test=windows_op
* delete /Zc:inline compiled in NVCC, notest, test=windows_ci, test=windows_op
* fix test_warpctc_op, notest, test=windows_ci
* remove test code, test=develop
4 years ago
yiak
c1b1ccfbf5
Update tinyformat.h ( #31612 )
...
Quick fix to https://github.com/PaddlePaddle/Paddle/issues/13860
4 years ago
gongweibao
9c624b16d5
Extend unittest time of ( #31570 )
4 years ago
YUNSHEN XIE
580442ceba
fix wget with no proxy on windows ( #31505 )
...
* fix wget with no proxy on windows
* modified import packages
* fix format error
* fix bug
* fix format error
* fix format error
4 years ago
ronnywang
da10c5cf8b
[ROCM] fix softmax_with_cross_entropy_op, test=develop ( #31629 )
4 years ago
LielinJiang
75433126df
Fix summary bug when calculating output shape ( #31549 )
...
* fix summary bug
4 years ago
ShenLiang
c3634c6b0a
fix amp bug of fleet ( #31532 )
4 years ago
Chen Weihang
027b574a0e
[CustomOp] Remove the dependence of the underlying data types on eigen ( #31602 )
...
* init commit
* move eigen of bfloat16
* add complex header
4 years ago
WangXi
9066b74f58
c_gen_nccl_id add SocketServer to persist server ( #31589 )
4 years ago
Kaipeng Deng
a32e8bf1e7
DataLoader support dict str ( #31481 )
...
* add dict/str/list support for DataLoader. test=develop
4 years ago
Chen Weihang
30a627aaf3
Normalized function parameter writing ( #31588 )
4 years ago
Pei Yang
cac9635a67
[Paddle-TRT] Fix engine key in trt int8 calibration ( #31513 )
...
* fix engine key in trt int8 calibration
* fix unit test
4 years ago
Shang Zhizhou
50ac7dbfd0
Trt elementwise plugin serialize ( #31587 )
...
* add serialize unittest
* fix element_op trt plugin serialize bug
4 years ago
guofei
ef0dd3efed
Support loading parameters from checkpoint to save quantized model ( #31419 )
...
* Support loading parameters from checkpoint to save quantized model
* Fix the unittest test_moving_average_abs_max_scale_op
* Add unittest of save_quantized_model from checkpoint
* Add comments to explain the function
4 years ago
whs
da9dda5c9b
Make CreateProgramDesc more robust ( #31543 )
4 years ago
hong
99dcd66508
try to fix imperative orc unittest error; test=develop ( #31568 )
4 years ago
Qi Li
3d5aa9d10a
[ROCM] fix conv2d and conv3d op, test=develop ( #31553 )
4 years ago
YUNSHEN XIE
f302bb4f8b
help timeout ut debug ( #31500 )
...
* To help timeout_ut debug
* To help timeout_ut debug
* added show information
4 years ago
Chen Weihang
95cceb2dd7
[CustomOp] Support duplicable op input and output ( #31535 )
...
* support duplicable op inout
* add custom concat op test
4 years ago
Aurelius84
def27bc801
[Dy2stat]Fix bug with static_convert_var_shape in locals scope ( #31556 )
...
* Fix bug with static_convert_var_shape
* replace dot with dash
4 years ago
YUNSHEN XIE
49c3d2a97b
modified show_ut_retry_result ( #31528 )
4 years ago
LielinJiang
ac493f2c72
Update comments for API `RandomResizedCrop` ( #31539 )
...
* update comments
4 years ago
lidanqing
0f1e7e3d52
[Bug fix] Different machine generate different binary file, remove md5 check ( #31482 )
...
* Different machine generate different binary file, remove md5 check
* remove unnecessary functions
4 years ago
jiangcheng
9ed6c895f1
optimize range op by place parameters on cpu rather than gpu, test=develop ( #30811 )
4 years ago
Thunderbrook
3789a69923
solve bug in heter mode ( #31531 )
...
* heter bug
* format
* format
4 years ago
chajchaj
6148b87f9d
add softmax_switch for softmax_with_cross_entropy_op, test=develop ( #31428 )
...
* add softmax_switch for softmax_with_cross_entropy_op, test=develop
* delete using EigenMatrix in softmax_with_cross_entropy_op.h, test=develop
* add REGISTER_OP_VERSION for softmax_switch attr of softmax_with_cross_entropy_op, test=develop
4 years ago
Aurelius84
f3959e9ddc
[save/load] Fix bug with input_spec=dict[InputSpec] in jit.save ( #31517 )
...
* fix bug with jit.save
* refine code
4 years ago
WangXi
83a2fb1f08
Add collective async wait op ( #31463 )
4 years ago
lilong12
0205e9f84e
remove the send/recv of tensor size ( #31460 )
...
* remove the send/recv of tensor size, but users have to specify the shape of the received var explicitly.
4 years ago
Aurelius84
c8ae837d52
[CustomOp]Fix setup_install timeout ( #31484 )
4 years ago
furnace
910f377fa5
Bugfix rocm ( #31490 )
...
* bugfix for test_cholesky_op
* bugfix for test_compare_op
* bugfix for lookup_table_op
* bugfix for affine_channel_op
4 years ago
Qi Li
416e47edef
[ROCM] fix softmax with loss nan in HIP platform, test=develop ( #31491 )
4 years ago
Shang Zhizhou
f57739be35
fix ernie_varlen when cutting head ( #31497 )
4 years ago
JamesLim
45c7d90564
Optimization of elementwise CUDA kernel ( #30801 )
4 years ago
YUNSHEN XIE
0b3c229606
Prec on mac ( #31382 )
...
* add precision on mac
* added judge
* match file_ut.json on mac
* fix code format error
* fix code format error
* fix error caused by length of ut_lists exceeds the limit
* fix format error,notest,test=cpu
* fix code format error
* add windows judge on get_pr_ut
4 years ago
Jacek Czaja
23d96cf221
[oneDNN] bumpup onednn 2.2 fixup version ( #31473 )
...
* - introduced fix onednn 2.2 version
* - compilation fix
4 years ago
YUNSHEN XIE
390cebee15
Prec on windows exclude check_added_ut ( #31372 )
...
* add precision test for windows ci exclude check_added_ut
* fix error
* added PRECISION_TEST parameters
* fix format error
4 years ago
Zhou Wei
634a12b368
fix bug of Windows Chinese MSVC ( #31493 )
4 years ago
wangguanzhong
43d6abf0a5
update conv2d, test=develop ( #31480 )
4 years ago
wangguanzhong
50af0c2cbb
fix roi_align, test=develop ( #31479 )
4 years ago
ronnywang
e03e46730c
[ROCM] fix gather_op, sigmoid_cross_entropy_with_logits_op, test=develop ( #31467 )
4 years ago
Qi Li
b85c8e03be
[ROCM] fix reduce op, test=develop ( #31478 )
4 years ago
Jacek Czaja
39a5424ed1
[oneDNN] elementwise add bf16 grad kernel with broadcasting ( #31385 )
4 years ago
石晓伟
5f6213217b
update zero_copy_tensor_test.cc for build of gcc485, test=develop ( #31470 )
4 years ago
Qi Li
133a914bd0
[ROCM] fix test_dist_op ci test, test=develop ( #31468 )
4 years ago
Qi Li
f9377965c4
[ROCM] fix dropout and remove hipcub, test=develop ( #31455 )
4 years ago
Aurelius84
fadabbe9b0
[CustomOp] Automatically specify PADDLE_WITH_MKLDNN & Remove Interpreter argument ( #31391 )
...
* auto specify PADDLE_WITH_MKLDNN and remove Interpreter
* remove print
* fix check abi
* fix windows
* fix compile flags
4 years ago
Leo Chen
ffdd5b7773
Fix cmake of cryptopp to avoid downloading every time ( #31447 )
4 years ago
石晓伟
bc7632be73
upgrade inference tensor apis, test=develop ( #31402 )
4 years ago
JamesLim
8491ae9a02
Creating a CUDA function to find the minimum value in warp or block ( #31191 )
4 years ago
Pei Yang
30717a6cbc
fix trt serialization on windows ( #31438 )
4 years ago
Pei Yang
1321c47950
add more info in trt engine serialization ( #31434 )
4 years ago
liuyuhui
9ebf05b003
[Kunlun]Multi xpu dygraph performance optimization , add distributed.spawn support for multi xpu and some bug-fixes ( #31130 )
4 years ago
Qi Li
4d647ec137
[ROCM] update fluid platform for rocm (part5), test=develop ( #31315 )
4 years ago
liym27
522c91ec67
[Dy2Stat] Remove gast.Index for compatibility of gast 0.4.0 ( #31358 )
4 years ago
YUNSHEN XIE
62289fccc0
fix python full coverage decrease issue ( #31429 )
...
* fix python full coverage decrease issue
* fix
4 years ago
Wilber
c9a7bfec89
prepare remove grad script and update PADDLE_CI_INFERENCE pipeline ( #31149 )
...
prepare remove grad op and kernel script.
update Paddle_CI_Inference pipeline.
4 years ago
Zhang Ting
7d95e598c1
support float16 for temporal_shift op ( #31432 )
4 years ago
YUNSHEN XIE
3a8ef10e09
fix modified_retry_method_only_win ( #31404 )
...
* fix modified_retry_method_only_win
* fix bug
* fix retry bug on windows
4 years ago
Zhang Ting
dcce54ea76
improve performance of depthwise_conv2d ( #31099 )
...
* improve performance of depthwise_conv2d
* add unittest
4 years ago
wuhuanzhou
4d6d2db812
Windows system supports Ninja compilation ( #31161 )
4 years ago
liym27
0fff930667
Fix bug for set_value op when input dtype is not float32 ( #31411 )
4 years ago
Huihuang Zheng
c40b98e068
Fix comment ( #31424 )
...
Fix wrong code comment
4 years ago
Huihuang Zheng
6bf02a1261
[Dy2stat] Fix Read-Only Attribute as while_loop Output ( #31415 )
...
Fix Read-Only Attribute as while_loop Output:
Usually, our convert_while_loop will be like:
```
[a, b, c] = paddle.jit.dy2static.convert_while_loop(
    condition_name, body_name, [a, b, c])
```
where a, b, c are in loop_var_names.
However, if loop_var_names contains a property such as foo.x, we cannot
assign the attribute as an output of convert_while_loop, because a Python
property can be read-only (it may define no setter). To handle this case, we
replace the attributes that are outputs of convert_while_loop with generated
variables; then, if we know at runtime that the attribute is not a property,
we assign the attribute. The created statements look like:
```
[a, b, __attribute_variable_1] = paddle.jit.dy2static.convert_while_loop(
    condition_name, body_name, [a, b, foo.x])
# the attribute name is passed to getattr as a string
if not isinstance(getattr(type(foo), "x", None), property):
    foo.x = __attribute_variable_1
```
```
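The runtime check in the generated statements above can be tried in isolation. The sketch below uses a hypothetical class `Foo` and helper `assign_if_writable`; neither comes from the PR. It only demonstrates the `isinstance(getattr(type(obj), name, None), property)` test on a plain attribute versus a setter-less property.

```python
class Foo:
    def __init__(self):
        self.x = 0      # plain instance attribute, writable
        self._y = 0

    @property
    def y(self):        # property without a setter, read-only
        return self._y


def assign_if_writable(obj, name, value):
    # Look the name up on the class: properties live there, while plain
    # instance attributes do not, so getattr falls back to None for them.
    if not isinstance(getattr(type(obj), name, None), property):
        setattr(obj, name, value)
        return True
    return False


foo = Foo()
wrote_x = assign_if_writable(foo, "x", 5)   # assigned
wrote_y = assign_if_writable(foo, "y", 7)   # skipped, y is a property
```

Checking the class rather than the instance matters here: `getattr(foo, "y")` would invoke the property and return its value, hiding the fact that `y` is a property at all.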
4 years ago
jakpiase
5b4f8aac82
Added LSTM BF16 and fixed GRU BF16 ( #31234 )
4 years ago
Qi Li
7cdf6ea770
[ROCM] update fluid elementwise op for rocm (part10), test=develop ( #31361 )
...
* [ROCM] update fluid elementwise op for rocm (part10), test=develop
* update, test=develop
* address review comments, test=develop
4 years ago
Qi Li
84639b6193
[ROCM] update fluid operators for rocm (part3), test=develop ( #31213 )
...
* [ROCM] update fluid operators for rocm (part3), test=develop
* fix clang format error, test=develop
4 years ago
Qi Li
3b9db17199
[ROCM] update fluid operators for rocm (part7), test=develop ( #31307 )
4 years ago
Qi Li
db50fb6766
[ROCM] fix softmax with loss and update python scripts, test=develop ( #31373 )
4 years ago
Pei Yang
32211fe9c4
TRT conv2d converter support SAME padding ( #31379 )
4 years ago
Qi Li
e312a1ff6e
[ROCM] update fluid operators for rocm (part9), test=develop ( #31338 )
4 years ago
Qi Li
6626c6a6ad
fix bert cu file compiler error, test=develop ( #31389 )
4 years ago
wuhuanzhou
c1bc223695
compile with VS2017, test=develop ( #31388 )
4 years ago
Zhou Wei
13e4280f82
[Custom OP]polish doc of custom OP ( #31369 )
4 years ago
Qi Li
946dbdae8c
[ROCM] update fluid operators for rocm (part6), test=develop ( #31301 )
4 years ago
wangna11BD
1cbccfa594
Add attrs `deformable_groups` for deformable_conv API ( #31335 )
...
* add attrs deformable_groups
4 years ago
Shang Zhizhou
77c44e2f1b
change prelu plugin to tensorRT layer ( #30210 )
4 years ago