188 Commits (9d04ef73692f38247e68e121a44bd34f9f28652c)

Author SHA1 Message Date
Jacek Czaja 25fc2a1fdb
[oneDNN] Added Elementwise Mul grad fp32/bf16 (#31647)
4 years ago
JamesLim 45c7d90564
Optimization of elementwise CUDA kernel (#30801)
4 years ago
Jacek Czaja 39a5424ed1
[oneDNN] elementwise add bf16 grad kernel with broadcasting (#31385)
4 years ago
Qi Li 7cdf6ea770
[ROCM] update fluid elementwise op for rocm (part10), test=develop (#31361)
4 years ago
wangchaochaohu f114c3f8ca
fix the branch of code choose (#31200)
4 years ago
wangchaochaohu 364cfa2686
fix windows for optimization of elementwise_add Op (#31068)
4 years ago
Jacek Czaja 9e527d9956
[oneDNN] Added basic changes for elementwise_add_grad bf16 (#30925)
4 years ago
wanghuancoder 35c5b23f68
use iwyu clean include second time, test=develop (#30829)
4 years ago
wawltor b7560a59ab
fix the broadcast for the large second input (#30818)
4 years ago
arlesniak 5bf25d1e8b
More precise mkldnn kernel rules in GetExpectedKernelType (#29840)
4 years ago
Jacek Czaja 173660be7b
[oneDNN] Cache oneDNN stream not to recreate in each oneDNN op (#30358)
4 years ago
Wojciech Uss 88fc7a7d68
fix cache key for inplaced elementwise ops (#30404)
4 years ago
chentianyu03 c7371b7b20
type promotion for grad (#30177)
4 years ago
wangchaochaohu af80859dd6
reduce the occupied size of memory for the fused pattern of elementwise_add Op and activation Op(relu Op for example) (#29885)
4 years ago
QingshuChen 8e1c3ddf15
add aarch64 and sunway kunlun lib (#30027)
5 years ago
wangchaochaohu d0a5620575
fix the compiler error when gcc4 cuda9.0 (#29997)
5 years ago
wawltor b33aaea86c
add the op version check for the elementwise ops, test=op_version (#30010)
5 years ago
Chen Weihang a6072055be
[Complex] Handle complex to real after type promotion (#29855)
5 years ago
chentianyu03 ddfc3d2c2f
change grad elementwise_mul for complex types (#29757)
5 years ago
chentianyu03 2a260d9b0e
change the grad of div when complex types (#29804)
5 years ago
wangchaochaohu 01c37c8e02
refine the compiler error for half2 operation (#29816)
5 years ago
wangchaochaohu f350aa59ff
Fix the compiler error for half type (#29799)
5 years ago
wangchaochaohu 7b2dc4e6b1
optimization for fp16 elementwise add (#29744)
5 years ago
Jacek Czaja 07790ba13e
[oneDNN] Reimplemented elementwise_add grad (#29747)
5 years ago
wangchaochaohu 068d905e1e
fix the shape choose of vectorize for cuda
5 years ago
wangchaochaohu 2e0d1ed00f
delete the code for fp16 optimization because it is not faster than common template code (#29715)
5 years ago
wangchaochaohu eab44e1f32
refine (#29622)
5 years ago
wangchaochaohu 1b69e528d3
optimize for long width for elementwise (#29602)
5 years ago
wangchaochaohu ac4bae8ee9
elementwise_add_grad Op optimization (#29575)
5 years ago
Zhang Ting 560b432349
Revert "improve elementwise_add_grad perf (#29277)" (#29464)
5 years ago
taixiurong ecca6585cd
1. fix elementwise ops'bug 2. fix softmax_with_cross_entropy_op 3. add biliner_interp_op (#29448)
5 years ago
LoveAn 671555ed32
Compiling operator libraries with Unity build (#29130)
5 years ago
Chen Weihang 9ad800ebb2
Support type promote for basic math ops (quantum required) (#29265)
5 years ago
Zhang Ting befd6d5338
improve elementwise_add_grad perf (#29277)
5 years ago
Leo Chen 116305ea4b
Improve performance of elementwise_add grad op (#29187)
5 years ago
QingshuChen 64f29fbb70
update kunlun conv2d/softmax/elementwise implemetation (#29229)
5 years ago
chentianyu03 8f45d14263
add complex64 and complex128 type; add +-*/@ and slice opreator for c… (#29199)
5 years ago
arlesniak bc902044a4
Fixes mkldnn dygraph learning rate scheduler crashes (#28988)
5 years ago
Noel da71173bc9
Fix ops doc for some ops
5 years ago
taixiurong a5aa4dc7a9
add xpu elementwise ops (#29031)
5 years ago
joejiong b04c78ef5e
Update pow (#29000)
5 years ago
joanna.wozna.intel 8c0ea4bffe
Add bf16 matmul, fc, elementwise add and mul (#28729)
5 years ago
Chengmo 5f04875c30
Fix xpu error message (#28061)
5 years ago
Jack Zhou d330cf66cc
Fix xpu enforce (#27978)
5 years ago
Jack Zhou c791df09cf
Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast
5 years ago
QingshuChen 6b727e08b1
support elementwise add, activation, matmul on Baidu Kunlun (#27143)
5 years ago
wanghuancoder df43905f12
use iwyu clean include (#27267)
5 years ago
Leo Chen aba759ba16
[Feature] Enhance inplace addto strategy for gradient accumulation in static graph (#27112)
5 years ago
ShenLiang 9ee77b1f41
Fix elementwise_floordiv op (#27352)
5 years ago
wawltor 4e8582fe5a
update the error message check for the some ops
5 years ago