Commit Graph

3196 Commits (5165bc854ac35c22c3d1d8c04629f5420972a23a)

Author SHA1
Message
Date
Zhou Wei 13e4280f82
[Custom OP]polish doc of custom OP (#31369)
4 years ago
Qi Li 65bcaeb004
[ROCM] update fluid operators for rocm (part5), test=develop (#31258)
4 years ago
danleifeng d1075df2e8
topo and memory performance for heterps (#30440)
4 years ago
Wilber e20234094c
Fix xpu compile and cipher symbol problem. (#31271)
4 years ago
alncat bfb8a64234
updated conv bn fuse pass to make it compatible with latest batch_norm op (#31272)
4 years ago
Chen Weihang 5610c1717e
fix dtype unmatched (#31305)
4 years ago
石晓伟 1da3280660
inference modification for custom operator, test=develop (#31283)
4 years ago
石晓伟 8c94d8cb4c
[Custom OP] change the user header file format, test=develop (#31274)
4 years ago
Jiabin Yang 038ce70d69
[Custom OP] Support stream set on Custom Op (#31257)
4 years ago
Jiabin Yang 0c38708a90
[Custom Op] Remove unsupport dtypes (#31232)
4 years ago
WangXi b8bce682e0
xpu support fuse allreduce (#31104)
4 years ago
Chen Weihang 126633c50f
[CustomOp] Split build op marco & polish details (#31229)
4 years ago
Qi Li 28b356b9a2
[ROCM] update fluid framework for rocm (part6), test=develop (#31015)
4 years ago
Qi Li c8fac5ee30
[ROCM] update fluid framework for rocm (part5), test=develop (#31014)
4 years ago
Qi Li 580447d019
[ROCM] update fluid framework for rocm (part4), test=develop (#31013)
4 years ago
chentianyu03 ca3b6bcf78
add cache for VariableWrapper (#30880)
4 years ago
jakpiase 2f1165342b
OneDNN hardswish integration (#30211)
4 years ago
Chen Weihang e8cdb49aa9
[CustomOp] Support attributes as func input in custom op (#31128)
4 years ago
Zhou Wei ffbf71359a
modify custom op dependent from paddle_framework to paddle_custom_op (#31195)
4 years ago
lilong12 dc8dfba35b
align the default value of some configuration for fleet to that of single cards (#30740)
4 years ago
Thunderbrook c4f279fe8d
support multi node in heterps (#31102)
4 years ago
Chen Weihang 1ce96fa118
[CustomOp] Add new paddle custom op so (#31141)
4 years ago
alncat 5d6a8c7b73
added support for fake_quantize_dequantize_abs_max op in quantization… (#30896)
4 years ago
joanna.wozna.intel 781df300d0
Unification of BF16 enablement process (#31034)
4 years ago
Qi Li a60d93fb77
[ROCM] update fluid framework for rocm (part2), test=develop (#31010)
4 years ago
Thunderbrook 565354f676
support save multi sparse table in one path (#31108)
4 years ago
Qi Li 50967135a5
[ROCM] update fluid framework for rocm (part3), test=develop (#31011)
4 years ago
Qi Li 8fe09faf14
[ROCM] update fluid framework for rocm (part1), test=develop (#31009)
4 years ago
Zhou Wei adaec0073d
[2.0Custom OP]Support New Custom OP on Windows (#31063)
4 years ago
Chengmo 6b3371e0c7
Remove PE special profiler (#30886)
4 years ago
TTerror d5323dab41
add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel executor;optimize lookup_table op (#31056)
4 years ago
Jiabin Yang 628451af06
hide useless headers and add complex support (#31074)
4 years ago
joanna.wozna.intel caf9d39839
Add Conv Transpose BF16 (#30877)
4 years ago
Chen Weihang f649442ddd
New custom operator extension mechanism (#30690)
4 years ago
WangXi 14d039e4a1
Fix the problem that the number of ops executed by xpu is wrong (#30961)
4 years ago
Adam Osewski 3ba69809bf
Fix LayerNorm tester for gcc4.8 (#30962)
4 years ago
wanghuancoder aab3a3012e
add include for heterbox_trainer.cc, develop=test (#30910)
4 years ago
Adam Osewski 092a2b1413
More UT for LayerNormFuse pass (#30891)
4 years ago
wanghuancoder 35c5b23f68
use iwyu clean include second time, test=develop (#30829)
4 years ago
Adam Osewski 4f066e316e
Layer normalization fuse pass. (#30721)
4 years ago
Thunderbrook cb66c53c2d
dump to cpu (#30750)
4 years ago
WangXi 31ed9c9eed
Fleet distributed strategy support pure fp16 (#30754)
4 years ago
alncat 5b59499e57
fixed compilation error on gcc 4.8.x due to the usage of isfinite (#30733)
4 years ago
liuyuhui 67abfc1588
[Kunlun] fix dead lock for exec_op_count_ (#30718)
4 years ago
alncat 5ace20fc3f
modified conv+bn fuse pass to fix wrong mask in mask rcnn (#30704)
4 years ago
lilong12 7fbc68a2c0
update, test=develop (#30692)
4 years ago
arlesniak 5bf25d1e8b
More precise mkldnn kernel rules in GetExpectedKernelType (#29840)
4 years ago
Jacek Czaja 173660be7b
[oneDNN] Cache oneDNN stream not to recreate in each oneDNN op (#30358)
4 years ago
Thunderbrook 1bebc09253
solve build gpu task core (#30626)
4 years ago
liuyuhui e5b0d9e1fc
[Kunlun] Add condition_variable and notify() in BindThreadedSSAGraphExecutor (#30586)
4 years ago