5977 Commits (84639b61939ccd68702e6423f50f085af93ede19)

Author SHA1 Message Date
Qi Li 84639b6193
[ROCM] update fluid operators for rocm (part3), test=develop (#31213)
5 years ago
Qi Li 3b9db17199
[ROCM] update fluid operators for rocm (part7), test=develop (#31307)
5 years ago
Qi Li db50fb6766
[ROCM] fix softmax with loss and update python scripts, test=develop (#31373)
5 years ago
Qi Li e312a1ff6e
[ROCM] update fluid operators for rocm (part9), test=develop (#31338)
5 years ago
Qi Li 6626c6a6ad
fix bert cu file compiler error, test=develop (#31389)
5 years ago
Qi Li 946dbdae8c
[ROCM] update fluid operators for rocm (part6), test=develop (#31301)
5 years ago
Qi Li 59940cb383
[ROCM] update fluid operators for rocm (part8), test=develop (#31309)
5 years ago
Qi Li ec72f5b235
fix ELU output for nan, test=develop (#31132)
5 years ago
Qi Li 65bcaeb004
[ROCM] update fluid operators for rocm (part5), test=develop (#31258)
5 years ago
Gradie d79fdc3d62
lamb_op_xpu;test=kunlun (#31012)
5 years ago
Qi Li 72d99c5dcd
[ROCM] update fluid operators for rocm (part4), test=develop (#31225)
5 years ago
cucuzg 91635de390
opt matmul and matmul_v2 on kunlun, *test=kunlun (#31326)
5 years ago
wuhuanzhou 30858d8974
fix compilation errors for missing brpc header files, test=develop (#31325)
5 years ago
wuhuanzhou a13f1d6930
optimize unity build (#31119)
5 years ago
jiangcheng 8f4ac6b525
optimize topk op through limit SortTopK kernel entrance, test=develop (#30403)
5 years ago
Qi Li 9b016c7cb7
[ROCM] update fluid operators for rocm (part2), test=develop (#31211)
5 years ago
niuliling123 2fd999d979
Optimized the adaptive_avg_pool2d op when output_size == 1 (#31197)
5 years ago
WangXi b8bce682e0
xpu support fuse allreduce (#31104)
5 years ago
Qi Li 28b356b9a2
[ROCM] update fluid framework for rocm (part6), test=develop (#31015)
5 years ago
Wilber 7d91974c91
enable lite ut. (#30890)
5 years ago
Guanghua Yu d18c5e47f3
fix ignore_index check in softmax_with_cross_entropy (#31201)
5 years ago
wangchaochaohu f114c3f8ca
fix the branch of code choose (#31200)
5 years ago
jakpiase 2f1165342b
OneDNN hardswish integration (#30211)
5 years ago
lilong12 a373aa7645
fix the bug in expand_v2 op (#30984)
5 years ago
Qi Li ee76ea72de
[ROCM] update fluid collective op for rocm, test=develop (#31075)
5 years ago
yaoxuefeng d8fa65a3a8
fix heter compile (#30518)
5 years ago
Jacek Czaja d3f09ad702
Update of onednn to 2.2 (#31067)
5 years ago
Guanghua Yu 24ba5ee05c
merge develop conflict (#31122)
5 years ago
Qi Li cced930b61
[ROCM] update fluid operators for rocm (part1), test=develop (#31077)
5 years ago
wangchaochaohu 364cfa2686
fix windows for optimization of elementwise_add Op (#31068)
5 years ago
joanna.wozna.intel 781df300d0
Unification of BF16 enablement process (#31034)
5 years ago
Zhong Hui 16fe11d71e
fix softmax cross entropy integer overflow (#30590)
5 years ago
JamesLim b95eb38b8a
fix the bug in backward OP of index_sample. (#31026)
5 years ago
TTerror d5323dab41
add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel executor;optimize lookup_table op (#31056)
5 years ago
liym27 5b367dab44
[static setitem] Support the index is Tensor; step>1; step<0 .(#30949)
5 years ago
Jacek Czaja f7465641c3
Added reshape grad bf16 (#31035)
5 years ago
Wojciech Uss 615d8a2264
Modify relu native implementation 2 (#30996)
5 years ago
Guanghua Yu 5b267474a9
add offset parameter in roi_align,generate_proposals.etc ops (#30864)
5 years ago
Zhang Ting f0ee159280
enable exhaustive_search for forward and backward algos when dtype is float16 (#30959)
5 years ago
joanna.wozna.intel caf9d39839
Add Conv Transpose BF16 (#30877)
5 years ago
Chen Weihang 010f2caa23
try to fix reader and signal test failed (#30960)
5 years ago
liym27 97f7a70c01
Add error message for slice op(#30851)
5 years ago
Jacek Czaja 9e527d9956
[oneDNN] Added basic changes for elementwise_add_grad bf16 (#30925)
5 years ago
liuyuhui 4a8b8b4547
[Kunlun] add gen_bkcl_id_op, support multi XPU cards training using multiprocess (#30858)
5 years ago
taixiurong 24873f4f77
dyngraph (#30892)
5 years ago
Jacek Czaja abfa822650
[oneDNN]Extended adaptive pooling support for oneDNN pool kernel (#30757)
5 years ago
wanghuancoder 35c5b23f68
use iwyu clean include second time, test=develop (#30829)
5 years ago
cucuzg ac2e2e6b7f
add clip_by_norm on kunlun, *test=kunlun (#30862)
5 years ago
wawltor b7560a59ab
fix the broadcast for the large second input (#30818)
5 years ago
JamesLim 6e1e036a75
Implement cuda kernel for index_sample. (#30380)
5 years ago