8 Commits (084310f536e0849ad04d8391a5563f438ddf69a2)

Author      SHA1        Message                                                Date
guoshengCS  5dfce93101  To make CUDA_LAUNCH_KERNEL_HELPER support large size.  6 years ago
Yiqun Liu   3008fa1261  Add the CUDA kernel for beam_search op (#15020)        6 years ago
dzhwinter   2673798ddb  "fix float16 ShuffleDownSync Bug" (#12756)             7 years ago
dzhwinter   39ac9e39c2  float16 type support enhance (#12181)                  7 years ago
chengduoZH  345737d0fe  add sync                                               7 years ago
chengduoZH  d36af62c1e  wrap_shfl_x_sync                                       7 years ago
chengduoZH  e97c1a8ca0  fix __shfl                                             7 years ago
chengduo    4fbde42cdf  Fix __shfl_down_sync_ of cross_entropy (#10345)        7 years ago