19 Commits (824a79d383531e804e7274ef2141c30c7532e2c2)

Author SHA1 Message Date
wangchaochaohu d0a5620575
fix the compiler error when gcc4 cuda9.0 (#29997)
4 years ago
wangchaochaohu 01c37c8e02
refine the compiler error for half2 operation (#29816)
4 years ago
wangchaochaohu f350aa59ff
Fix the compiler error for half type (#29799)
4 years ago
wangchaochaohu 7b2dc4e6b1
optimization for fp16 elementwise add (#29744)
4 years ago
wangchaochaohu 068d905e1e
fix the shape choose of vectorize for cuda
4 years ago
wangchaochaohu 2e0d1ed00f
delete the code for fp16 optimization because it is not faster than common template code (#29715)
4 years ago
wangchaochaohu eab44e1f32
refine (#29622)
4 years ago
wangchaochaohu 1b69e528d3
optimize for long width for elementwise (#29602)
4 years ago
wangchaochaohu ac4bae8ee9
elementwise_add_grad Op optimization (#29575)
4 years ago
Zhang Ting 560b432349
Revert "improve elementwise_add_grad perf (#29277)" (#29464)
4 years ago
Zhang Ting befd6d5338
improve elementwise_add_grad perf (#29277)
4 years ago
Leo Chen 116305ea4b
Improve performance of elementwise_add grad op (#29187)
4 years ago
wanghuancoder df43905f12
use iwyu clean include (#27267)
4 years ago
danleifeng b7697f6218
fix broadcast bug;test=develop (#21898)
5 years ago
danleifeng 0e7baabe59
extend elementwise broadcast function (#20957)
5 years ago
danleifeng 425279a57b
Improve elementwise operators performance in same dimensions. (#19763)
5 years ago
Kaipeng Deng bd9bef5a4e
add elementwise_add_grad_grad op (#17366)
6 years ago
Yiqun Liu dcda20233c
Optimize the elementwise op using eigen (#15494)
6 years ago
Wu Yi a2d9b34417
Refine operator cmake (#14413)
6 years ago