9 Commits (45702951226401a24df501960d7fd9b47152083d)

Author          SHA1        Message                                                Date
zhaoyuchen2018  792443ef23  Refine elementwise kernel. (#16952)                    6 years ago
guoshengCS      5dfce93101  To make CUDA_LAUNCH_KERNEL_HELPER support large size.  6 years ago
Yiqun Liu       3008fa1261  Add the CUDA kernel for beam_search op (#15020)        6 years ago
dzhwinter       2673798ddb  "fix float16 ShuffleDownSync Bug" (#12756)             7 years ago
dzhwinter       39ac9e39c2  float16 type support enhance (#12181)                  7 years ago
chengduoZH      345737d0fe  add sync                                               7 years ago
chengduoZH      d36af62c1e  wrap_shfl_x_sync                                       7 years ago
chengduoZH      e97c1a8ca0  fix __shfl                                             7 years ago
chengduo        4fbde42cdf  Fix __shfl_down_sync_ of cross_entropy (#10345)        7 years ago