18448 Commits (32211fe9c4c22168dfb73f19763b17ac9191341a)

Author SHA1 Message Date
Pei Yang 32211fe9c4
TRT conv2d converter support SAME padding (#31379)
4 years ago
Qi Li e312a1ff6e
[ROCM] update fluid operators for rocm (part9), test=develop (#31338)
4 years ago
Qi Li 6626c6a6ad
fix bert cu file compiler error, test=develop (#31389)
4 years ago
Zhou Wei 13e4280f82
[Custom OP]polish doc of custom OP (#31369)
4 years ago
Qi Li 946dbdae8c
[ROCM] update fluid operators for rocm (part6), test=develop (#31301)
4 years ago
Shang Zhizhou 77c44e2f1b
change prelu plugin to tensorRT layer (#30210)
4 years ago
Qi Li 59940cb383
[ROCM] update fluid operators for rocm (part8), test=develop (#31309)
4 years ago
tangwei12 5d7a8b05f8
fix sycn training error (#31357)
4 years ago
Qi Li ec72f5b235
fix ELU output for nan, test=develop (#31132)
4 years ago
Qi Li 65bcaeb004
[ROCM] update fluid operators for rocm (part5), test=develop (#31258)
4 years ago
YUNSHEN XIE 2111d912d4
Decrease threshold for failed ut retry (#30903)
4 years ago
Pei Yang 2e9e3fad15
add n-d input support for trt scale converter (#31316)
4 years ago
Shang Zhizhou 6404c43814
support trt serialize when load model from memory (#31342)
4 years ago
Gradie d79fdc3d62
lamb_op_xpu;test=kunlun (#31012)
4 years ago
danleifeng d1075df2e8
topo and memory performance for heterps (#30440)
4 years ago
Qi Li 72d99c5dcd
[ROCM] update fluid operators for rocm (part4), test=develop (#31225)
4 years ago
cucuzg 91635de390
opt matmul and matmul_v2 on kunlun, *test=kunlun (#31326)
4 years ago
Wilber e20234094c
Fix xpu compile and cipher symbol problem. (#31271)
4 years ago
wuhuanzhou 30858d8974
fix compilation errors for missing brpc header files, test=develop (#31325)
4 years ago
石晓伟 625482f752
inference modification for custom operator, test=develop (#31312)
4 years ago
wuhuanzhou a13f1d6930
optimize unity build (#31119)
4 years ago
jiangcheng 8f4ac6b525
optimize topk op through limit SortTopK kernel entrance, test=develop (#30403)
4 years ago
alncat bfb8a64234
updated conv bn fuse pass to make it compatible with latest batch_norm op (#31272)
4 years ago
Chen Weihang 5610c1717e
fix dtype unmatched (#31305)
4 years ago
Qi Li 9b016c7cb7
[ROCM] update fluid operators for rocm (part2), test=develop (#31211)
4 years ago
niuliling123 2fd999d979
Optimized the adaptive_avg_pool2d op when output_size == 1 (#31197)
4 years ago
石晓伟 1da3280660
inference modification for custom operator, test=develop (#31283)
4 years ago
Zhou Wei af9066e89c
[Custom OP]add PD_THROW and PD_CHECK for User Error message (#31253)
4 years ago
石晓伟 8c94d8cb4c
[Custom OP] change the user header file format, test=develop (#31274)
4 years ago
Jiabin Yang 038ce70d69
[Custom OP] Support stream set on Custom Op (#31257)
4 years ago
Jiabin Yang 0c38708a90
[Custom Op] Remove unsupport dtypes (#31232)
4 years ago
WangXi b8bce682e0
xpu support fuse allreduce (#31104)
4 years ago
Chen Weihang 126633c50f
[CustomOp] Split build op marco & polish details (#31229)
4 years ago
tangwei12 903235945b
loglevel adjustment for distributed training (#31205)
4 years ago
Qi Li 28b356b9a2
[ROCM] update fluid framework for rocm (part6), test=develop (#31015)
4 years ago
Qi Li c8fac5ee30
[ROCM] update fluid framework for rocm (part5), test=develop (#31014)
4 years ago
Qi Li 580447d019
[ROCM] update fluid framework for rocm (part4), test=develop (#31013)
4 years ago
Wilber 7d91974c91
enable lite ut. (#30890)
4 years ago
Guanghua Yu d18c5e47f3
fix ignore_index check in softmax_with_cross_entropy (#31201)
4 years ago
chentianyu03 ca3b6bcf78
add cache for VariableWrapper (#30880)
4 years ago
wangchaochaohu f114c3f8ca
fix the branch of code choose (#31200)
4 years ago
joanna.wozna.intel d11602481c
Add bf16 gru model test (#31158)
4 years ago
jakpiase 2f1165342b
OneDNN hardswish integration (#30211)
4 years ago
Chen Weihang e8cdb49aa9
[CustomOp] Support attributes as func input in custom op (#31128)
4 years ago
Zhou Wei ffbf71359a
modify custom op dependent from paddle_framework to paddle_custom_op (#31195)
4 years ago
Leo Chen 0f1fde5102
fix the modification of set_expected_place (#31177)
4 years ago
lilong12 dc8dfba35b
align the default value of some configuration for fleet to that of single cards (#30740)
4 years ago
lilong12 a373aa7645
fix the bug in expand_v2 op (#30984)
4 years ago
Thunderbrook c4f279fe8d
support multi node in heterps (#31102)
4 years ago
liu zhengxi ae2be49f40
Add cublas_handle() to expose cublas_handle to ops (#31157)
4 years ago