22 Commits (7cdf6ea77081a4938182b1fdf26bcf341e5588a8)

Author SHA1 Message Date
Qi Li 7cdf6ea770
[ROCM] update fluid elementwise op for rocm (part10), test=develop (#31361)
5 years ago
wangchaochaohu f114c3f8ca
fix the branch of code choose (#31200)
5 years ago
wangchaochaohu 364cfa2686
fix windows for optimization of elementwise_add Op (#31068)
5 years ago
wangchaochaohu d0a5620575
fix the compiler error when gcc4 cuda9.0 (#29997)
5 years ago
wangchaochaohu 01c37c8e02
refine the compiler error for half2 operation (#29816)
5 years ago
wangchaochaohu f350aa59ff
Fix the compiler error for half type (#29799)
5 years ago
wangchaochaohu 7b2dc4e6b1
optimization for fp16 elementwise add (#29744)
5 years ago
wangchaochaohu 068d905e1e
fix the shape choose of vectorize for cuda
5 years ago
wangchaochaohu 2e0d1ed00f
delete the code for fp16 optimization because it is not faster than common template code (#29715)
5 years ago
wangchaochaohu eab44e1f32
refine (#29622)
5 years ago
wangchaochaohu 1b69e528d3
optimize for long width for elementwise (#29602)
5 years ago
wangchaochaohu ac4bae8ee9
elementwise_add_grad Op optimization (#29575)
5 years ago
Zhang Ting 560b432349
Revert "improve elementwise_add_grad perf (#29277)" (#29464)
5 years ago
Zhang Ting befd6d5338
improve elementwise_add_grad perf (#29277)
5 years ago
Leo Chen 116305ea4b
Improve performance of elementwise_add grad op (#29187)
5 years ago
wanghuancoder df43905f12
use iwyu clean include (#27267)
5 years ago
danleifeng b7697f6218
fix broadcast bug;test=develop (#21898)
6 years ago
danleifeng 0e7baabe59
extend elementwise broadcast function (#20957)
6 years ago
danleifeng 425279a57b
Improve elementwise operators performance in same dimensions. (#19763)
6 years ago
Kaipeng Deng bd9bef5a4e
add elementwise_add_grad_grad op (#17366)
7 years ago
Yiqun Liu dcda20233c
Optimize the elementwise op using eigen (#15494)
7 years ago
Wu Yi a2d9b34417
Refine operator cmake (#14413)
7 years ago