Commit Graph

620 Commits (3691a46fa36750bb5a3c828d2eaf55305aa88f69)

Author SHA1 Message Date
Qiao Longfei 3691a46fa3 improve communicator
7 years ago
xuezhong 1dad36f6aa Merge pull request #15609 from xuezhong/add_sample_logits_op
7 years ago
tensor-tang ee2321debd Revert 15770 develop a6910f900 gelu mkl opt (#15872)
7 years ago
xuezhong 81870723c6 Merge pull request #15605 from xuezhong/fix_bug_for_lstmp
7 years ago
Yihua Xu 676995c86c Optimze Gelu with MKL Erf function (#15770)
7 years ago
xuezhong f2262d7336 update comment
7 years ago
xuezhong fb261793b9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_sample_logits_op
7 years ago
xuezhong fb9a6a2bc6 pass test for lstm op
7 years ago
xuezhong 2ba256df40 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bug_for_lstmp
7 years ago
peizhilin 061299be87 fix dependency
7 years ago
xuezhong 4028943125 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bug_for_lstmp
7 years ago
tensor-tang a6a1a92ef7 Merge pull request #15586 from tensor-tang/jit/cache
7 years ago
xuezhong 4c98c2ccc3 remove debug print
7 years ago
xuezhong 58ad40cc15 add sample_logits op
7 years ago
xuezhong 880836329d add cell clip and proj clip, fix bug for h0
7 years ago
Yiqun Liu 16d54f7f23 Return parent_idx in beam_search op (#15520)
7 years ago
tensor-tang a18c0d4242 cache fc kernel
7 years ago
tensor-tang 6e1ee7fb57 cache softmax kernel func
7 years ago
tensor-tang d59f733551 refine softmax and use with cache
7 years ago
Yiqun Liu 3008fa1261 Add the CUDA kernel for beam_search op (#15020)
7 years ago
tangwei12 5cfc40dea8 nce add check sample lables, test=develop (#15463)
7 years ago
Dun 9f8f0fc2d3 Memory optimization of depthwise conv op and group norm op (#15313)
7 years ago
zhaozhehao e2ba9668b4 Tree conv op (#15217)
7 years ago
Qiao Longfei 4d15515c40 fix gru_gpu_kernel test=develop
7 years ago
Qiao Longfei 4feae25378 fix build problem test=develop
7 years ago
Qiao Longfei 4c7be265d3 update avx gru grad kernel test=develop
7 years ago
Qiao Longfei 9b16e54064 update gru_grad_op
7 years ago
Qiao Longfei e477d789a1 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gru-add-mode
7 years ago
Wu Yi fd85418329 [Feature] support mix precision training for resnet (#14899)
7 years ago
Qiao Longfei d0e3b24002 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay
7 years ago
tensor-tang 223c61ca5e Merge pull request #15170 from tensor-tang/jit/seqpool
7 years ago
Qiao Longfei c3b9edf958 follow comment test=develop
7 years ago
Qiao Longfei b16e832d4d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay
7 years ago
sneaxiy ed409ac9f4 Revert "Revert "Remove op handle lock""
7 years ago
Zeng Jinle dacfaaa966 Revert "Remove op handle lock"
7 years ago
Zeng Jinle f3a13512fc Merge pull request #15139 from sneaxiy/remove_op_handle_lock
7 years ago
tensor-tang 102d93712e Merge remote-tracking branch 'ups/develop' into jit/seqpool
7 years ago
tensor-tang 0145f40f45 use height from params of jitcode
7 years ago
Qiao Longfei 3e1b914fcb update gru op forward kernel
7 years ago
乔龙飞 Qiao Longfei e1679b8847 Merge pull request #14893 from JiabinYang/feature/add_prefech_hs
7 years ago
tensor-tang c50060bb26 add jitcode impl and use it
7 years ago
tensor-tang 142bb41748 add seqpool jitkernel test and benchmark
7 years ago
tensor-tang e58a569c6c use seqpool jitkernel
7 years ago
sneaxiy d0a8a1e950 remove_op_handle_lock
7 years ago
sneaxiy d25395fc98 remove tensor core lock
7 years ago
Qiao Longfei 25d44d40ac sum op support empty selected rows as input
7 years ago
sneaxiy b56aca82e9 merge develop
7 years ago
chengduo b9fb03cf54 Move GetTensor to tensor_util (#15011)
7 years ago
minqiyang f4e7a47381 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_adam
7 years ago
wopeizl b117a5f208 Merge pull request #14931 from wopeizl/windows/mkl
7 years ago