10 Commits (2712df42a3738b207d06cb2f1e27026aca5af169)

| Author | SHA1 | Message | Date |
|---|---|---|---|
| chentianyu03 | 8f45d14263 | add complex64 and complex128 type; add +-*/@ and slice opreator for c… (#29199) | 4 years ago |
| zhaoyuchen2018 | 792443ef23 | Refine elementwise kernel. (#16952) | 6 years ago |
| guoshengCS | 5dfce93101 | To make CUDA_LAUNCH_KERNEL_HELPER support large size. | 6 years ago |
| Yiqun Liu | 3008fa1261 | Add the CUDA kernel for beam_search op (#15020) | 6 years ago |
| dzhwinter | 2673798ddb | "fix float16 ShuffleDownSync Bug" (#12756) | 7 years ago |
| dzhwinter | 39ac9e39c2 | float16 type support enhance (#12181) | 7 years ago |
| chengduoZH | 345737d0fe | add sync | 7 years ago |
| chengduoZH | d36af62c1e | wrap_shfl_x_sync | 7 years ago |
| chengduoZH | e97c1a8ca0 | fix __shfl | 7 years ago |
| chengduo | 4fbde42cdf | Fix __shfl_down_sync_ of cross_entropy (#10345) | 7 years ago |