631 Commits (c34b24ede782612464bc4c7cad47c40661616e9d)

Author SHA1 Message Date
phlrain 1580be5d6c fix sequence pad; test=develop
6 years ago
phlrain 802b33489a remove resize then seq num == 1; test=develop
6 years ago
sneaxiy 5a92e4c097 revert revert 16144
6 years ago
Zeng Jinle a91964c8fe Revert "PaddingRNN model memory optimize"
6 years ago
Zeng Jinle 0b49e43d3a Merge pull request #16144 from sneaxiy/rnn_mem_opt
6 years ago
sneaxiy b26e9bd232 refine code
6 years ago
tensor-tang 6ff230a624 Merge remote-tracking branch 'ups/develop' into refine/jit
6 years ago
tensor-tang 14a764c930 simplify the jitkernel templates and tests
6 years ago
Yiqun Liu 5bde120243 Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106)
6 years ago
tensor-tang 802f362ac4 unify the kernelfuncs cache and add unit test
6 years ago
Yiqun Liu 87248281f7 Fix error in CUDA kernel of beam_search. (#15957)
6 years ago
Yihua Xu 7396788694 Optimize gelu operation with mkl erf.
6 years ago
xuezhong 1dad36f6aa Merge pull request #15609 from xuezhong/add_sample_logits_op
6 years ago
tensor-tang ee2321debd Revert 15770 develop a6910f900 gelu mkl opt (#15872)
6 years ago
xuezhong 81870723c6 Merge pull request #15605 from xuezhong/fix_bug_for_lstmp
6 years ago
Yihua Xu 676995c86c Optimze Gelu with MKL Erf function (#15770)
6 years ago
xuezhong f2262d7336 update comment
6 years ago
xuezhong fb261793b9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_sample_logits_op
6 years ago
xuezhong fb9a6a2bc6 pass test for lstm op
6 years ago
xuezhong 2ba256df40 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bug_for_lstmp
6 years ago
peizhilin 061299be87 fix dependency
6 years ago
xuezhong 4028943125 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bug_for_lstmp
6 years ago
tensor-tang a6a1a92ef7 Merge pull request #15586 from tensor-tang/jit/cache
6 years ago
xuezhong 4c98c2ccc3 remove debug print
6 years ago
xuezhong 58ad40cc15 add sample_logits op
6 years ago
xuezhong 880836329d add cell clip and proj clip, fix bug for h0
6 years ago
Yiqun Liu 16d54f7f23 Return parent_idx in beam_search op (#15520)
6 years ago
tensor-tang a18c0d4242 cache fc kernel
6 years ago
tensor-tang 6e1ee7fb57 cache softmax kernel func
6 years ago
tensor-tang d59f733551 refine softmax and use with cache
6 years ago
Yiqun Liu 3008fa1261 Add the CUDA kernel for beam_search op (#15020)
6 years ago
tangwei12 5cfc40dea8 nce add check sample lables, test=develop (#15463)
6 years ago
Dun 9f8f0fc2d3 Memory optimization of depthwise conv op and group norm op (#15313)
6 years ago
zhaozhehao e2ba9668b4 Tree conv op (#15217)
6 years ago
Qiao Longfei 4d15515c40 fix gru_gpu_kernel test=develop
6 years ago
Qiao Longfei 4feae25378 fix build problem test=develop
6 years ago
Qiao Longfei 4c7be265d3 update avx gru grad kernel test=develop
6 years ago
Qiao Longfei 9b16e54064 update gru_grad_op
6 years ago
Qiao Longfei e477d789a1 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gru-add-mode
6 years ago
Wu Yi fd85418329 [Feature] support mix precision training for resnet (#14899)
6 years ago
Qiao Longfei d0e3b24002 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay
6 years ago
tensor-tang 223c61ca5e Merge pull request #15170 from tensor-tang/jit/seqpool
6 years ago
Qiao Longfei c3b9edf958 follow comment test=develop
6 years ago
Qiao Longfei b16e832d4d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay
6 years ago
sneaxiy ed409ac9f4 Revert "Revert "Remove op handle lock""
6 years ago
Zeng Jinle dacfaaa966 Revert "Remove op handle lock"
6 years ago
Zeng Jinle f3a13512fc Merge pull request #15139 from sneaxiy/remove_op_handle_lock
6 years ago
tensor-tang 102d93712e Merge remote-tracking branch 'ups/develop' into jit/seqpool
6 years ago
tensor-tang 0145f40f45 use height from params of jitcode
6 years ago
Qiao Longfei 3e1b914fcb update gru op forward kernel
6 years ago