Commit Graph

30080 Commits (43d6abf0a550faa973fc096acde85a5a2fb23516)

Author SHA1 Message Date
wangguanzhong 43d6abf0a5
update conv2d, test=develop (#31480)
4 years ago
wangguanzhong 50af0c2cbb
fix roi_align, test=develop (#31479)
4 years ago
ronnywang e03e46730c
[ROCM] fix gather_op, sigmoid_cross_entropy_with_logits_op, test=develop (#31467)
4 years ago
Qi Li b85c8e03be
[ROCM] fix reduce op, test=develop (#31478)
4 years ago
Jacek Czaja 39a5424ed1
[oneDNN] elementwise add bf16 grad kernel with broadcasting (#31385)
4 years ago
石晓伟 5f6213217b
update zero_copy_tensor_test.cc for build of gcc485, test=develop (#31470)
4 years ago
Qi Li 133a914bd0
[ROCM] fix test_dist_op ci test, test=develop (#31468)
4 years ago
Qi Li f9377965c4
[ROCM] fix dropout and remove hipcub, test=develop (#31455)
4 years ago
Aurelius84 fadabbe9b0
[CustomOp] Automatically specify PADDLE_WITH_MKLDNN & Remove Interpreter argument (#31391)
4 years ago
Leo Chen ffdd5b7773
Fix cmake of cryptopp to avoid downloading every time (#31447)
4 years ago
石晓伟 bc7632be73
upgrade inference tensor apis, test=develop (#31402)
4 years ago
JamesLim 8491ae9a02
Creating a CUDA function to find the minimum value in warp or block (#31191)
4 years ago
Pei Yang 30717a6cbc
fix trt serialization on windows (#31438)
4 years ago
Pei Yang 1321c47950
add more info in trt engine serialization (#31434)
4 years ago
liuyuhui 9ebf05b003
[Kunlun]Multi xpu dygraph performance optimization , add distributed.spawn support for multi xpu and some bug-fixes (#31130)
4 years ago
Qi Li 4d647ec137
[ROCM] update fluid platform for rocm (part5), test=develop (#31315)
4 years ago
liym27 522c91ec67
[Dy2Stat] Remove gast.Index for compatibility of gast 0.4.0 (#31358)
4 years ago
YUNSHEN XIE 62289fccc0
fix python full coverage decrease issue (#31429)
4 years ago
Wilber c9a7bfec89
prepare remove grad script and update PADDLE_CI_INFERENCE pipeline (#31149)
4 years ago
Zhang Ting 7d95e598c1
support float16 for temporal_shift op (#31432)
4 years ago
YUNSHEN XIE 3a8ef10e09
fix modified_retry_method_only_win (#31404)
4 years ago
Zhang Ting dcce54ea76
improve performance of depthwise_conv2d (#31099)
4 years ago
wuhuanzhou 4d6d2db812
Windows system supports Ninja compilation (#31161)
4 years ago
liym27 0fff930667
Fix bug for set_value op when input dtype is not float32 (#31411)
4 years ago
Huihuang Zheng c40b98e068
Fix comment (#31424)
4 years ago
Huihuang Zheng 6bf02a1261
[Dy2stat] Fix Read-Only Attribute as while_loop Output (#31415)
4 years ago
jakpiase 5b4f8aac82
Added LSTM BF16 and fixed GRU BF16 (#31234)
4 years ago
Qi Li 7cdf6ea770
[ROCM] update fluid elementwise op for rocm (part10), test=develop (#31361)
4 years ago
Qi Li 84639b6193
[ROCM] update fluid operators for rocm (part3), test=develop (#31213)
4 years ago
Qi Li 3b9db17199
[ROCM] update fluid operators for rocm (part7), test=develop (#31307)
4 years ago
Qi Li db50fb6766
[ROCM] fix softmax with loss and update python scripts, test=develop (#31373)
4 years ago
Pei Yang 32211fe9c4
TRT conv2d converter support SAME padding (#31379)
4 years ago
Qi Li e312a1ff6e
[ROCM] update fluid operators for rocm (part9), test=develop (#31338)
4 years ago
Qi Li 6626c6a6ad
fix bert cu file compiler error, test=develop (#31389)
4 years ago
wuhuanzhou c1bc223695
compile with VS2017, test=develop (#31388)
4 years ago
Zhou Wei 13e4280f82
[Custom OP]polish doc of custom OP (#31369)
4 years ago
Qi Li 946dbdae8c
[ROCM] update fluid operators for rocm (part6), test=develop (#31301)
4 years ago
wangna11BD 1cbccfa594
Add attrs `deformable_groups` for deformable_conv API (#31335)
4 years ago
Shang Zhizhou 77c44e2f1b
change prelu plugin to tensorRT layer (#30210)
4 years ago
YUNSHEN XIE 353dd0cd98
Modified retry method on windows (#31363)
4 years ago
Qi Li 59940cb383
[ROCM] update fluid operators for rocm (part8), test=develop (#31309)
4 years ago
tangwei12 5d7a8b05f8
fix sycn training error (#31357)
4 years ago
Qi Li ec72f5b235
fix ELU output for nan, test=develop (#31132)
4 years ago
Qi Li 65bcaeb004
[ROCM] update fluid operators for rocm (part5), test=develop (#31258)
4 years ago
YUNSHEN XIE 2111d912d4
Decrease threshold for failed ut retry (#30903)
4 years ago
Pei Yang 2e9e3fad15
add n-d input support for trt scale converter (#31316)
4 years ago
Shang Zhizhou 6404c43814
support trt serialize when load model from memory (#31342)
4 years ago
chentianyu03 a2c0b60401
remove wlist_temp in wlist.json (#31356)
4 years ago
Gradie d79fdc3d62
lamb_op_xpu;test=kunlun (#31012)
4 years ago
danleifeng d1075df2e8
topo and memory performance for heterps (#30440)
4 years ago