Commit Graph

438 Commits (a19b3225a1da8c31fc996bace3ac09e6f5f177ef)

Author SHA1 Message Date
minqiyang 14ebc424d6 Add gpu support for unittest
7 years ago
minqiyang bd5a82e193 Polish unit test code
7 years ago
minqiyang 047fa2f9aa Add unit-test for sequence_pooling functor
7 years ago
sneaxiy 92a2817a2b test=develop
7 years ago
tensor-tang 032c3a07e3 Merge remote-tracking branch 'ups/develop' into refine/jit/gru
7 years ago
tensor-tang 159be8cc63 optimize fusion gru kernel at size 8
7 years ago
chengduo a7497653d0 Refine Split op (#13967)
7 years ago
sneaxiy a9d7a9d720 test=develop
7 years ago
tensor-tang 640e789d3d add fusion gru jit kernel
7 years ago
tensor-tang 23fc896bc2 Merge remote-tracking branch 'ups/develop' into fea/fusion_seqconv_add
7 years ago
tensor-tang e5ce965952 refine and add eltadd_relu unit test
7 years ago
tensor-tang 7cb19a5976 fuse elementwise_add and relu
7 years ago
sneaxiy ac2eba4457 test=develop
7 years ago
tensor-tang b139b687de Merge remote-tracking branch 'ups/develop' into fix/jit/exp
7 years ago
tensor-tang 748435586a clean code exp avx
7 years ago
tensor-tang b4751a34a5 fix illegal instruction of rnn2
7 years ago
tensor-tang 36588b3365 fix illegal instruction of rnn1 and text
7 years ago
tensor-tang e69328c3bc fix warning and mac compile
7 years ago
tensor-tang 6447155dac Merge pull request #13851 from tensor-tang/fea/jitkernel_peephole
7 years ago
sneaxiy 4b4af84e67 test=develop
7 years ago
Qiao Longfei 0225957515 change elementwise_add to elementwise_add_to test=develop
7 years ago
Qiao Longfei b4a32eafdf Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Zeng Jinle 93606c2c2c Merge pull request #13689 from sneaxiy/sparse_rmsprop
7 years ago
sneaxiy 5cedfb60c8 test=develop
7 years ago
Qiao Longfei 936926aadd code optimize
7 years ago
Qiyang Min cab29828a5 Merge pull request #13829 from velconia/accelerate_sequence_pool_op
7 years ago
Qiao Longfei c52ccbc109 clean code
7 years ago
Qiao Longfei 6056d04361 optimize blas call
7 years ago
Qiao Longfei 5db7551317 optimize code
7 years ago
Qiao Longfei eb6d9e3bbe Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei 0170d36c42 fix a bug
7 years ago
Qiyang Min e37c9e6732 Merge pull request #13828 from velconia/accelerate_selected_rows_functor
7 years ago
Qiao Longfei 86e2e686ee fix bug
7 years ago
Qiao Longfei 333fd15204 add gpu test for mrege add
7 years ago
Qiao Longfei ab3e36da80 update MergeAdd for selected_rows_functor.cu
7 years ago
Qiao Longfei d5c64af24f change map to unordered_map
7 years ago
Qiao Longfei 005f1923a2 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
tensor-tang bcb8ea397d Merge remote-tracking branch 'ups/develop' into fea/jitkernel_peephole
7 years ago
tensor-tang 8e182170ba refine and replace lstm peephole kernel
7 years ago
Dun 5f2e837847 optimize depthwise conv by register memory (#13778)
7 years ago
minqiyang 3f6ec90060 Polish code
7 years ago
tensor-tang 7ef2699e18 init peephole runtime kernel
7 years ago
minqiyang 0385b0a1ea Accelerate SequencePool Op on SUM mode
7 years ago
minqiyang 8ec748cfa0 Accelerate SelectedRows Functors:
7 years ago
Qiao Longfei 38568519f7 optimize code
7 years ago
tensor-tang 3ee8f2c6cf thread local jit kernels
7 years ago
tensor-tang 9131a35676 replace the lstm compute with jitkernel
7 years ago
tensor-tang b55c247678 add lstm compute unit test
7 years ago
tensor-tang 2a00969165 optimize lstm jitkernel keq8
7 years ago
tensor-tang f2adaf1c3e add vrelu and lstm kernel
7 years ago
tensor-tang e6d8aca3bf refine code and fix
7 years ago
qiaolongfei 1a59880084 update test_sum_op
7 years ago
qiaolongfei 40d3bd4e81 selected rows merge add support multi input
7 years ago
tensor-tang ea7dc9cbf6 Merge remote-tracking branch 'ups/develop' into fea/jitkernel
7 years ago
tensor-tang 2513b2cc4e fix bug vtanh
7 years ago
tensor-tang cf8c8e72bd add vtanh and unit test
7 years ago
tensor-tang b37fe30417 Merge pull request #13690 from wangguibao/fix_cpu_lstm_compute_cc
7 years ago
dzhwinter 26771f41ba "fix compile error" (#13579)
7 years ago
tensor-tang d10a9df7b8 add vaddbias and unit test
7 years ago
tensor-tang 3c8b651187 add vsigmoid avx implementations and unit test
7 years ago
tensor-tang 55e44761fb refine code and init vsigmoid
7 years ago
wangguibao 1940bc2d83 Avoid multiple definitions of lstm_compute_ctht when linking libpaddle_fluid.so
7 years ago
sneaxiy 584c3f048f fix sparse rmsprop
7 years ago
Dun 161c3e31f7 Optimization of Kernels that related to DeepLabv3+ (#13534)
7 years ago
tensor-tang 2d0ff6a3c2 add vexp and unit test
7 years ago
tensor-tang b3c63f40fa add vscal and unit test
7 years ago
tensor-tang 0987f2b4d9 add vadd unit test
7 years ago
tensor-tang 3d928d4f9d refine and seepdup
7 years ago
tensor-tang 77fc42d2d1 Merge remote-tracking branch 'ups/develop' into fea/jitkernel
7 years ago
tensor-tang 2937314d8e refine vmul and test
7 years ago
tensor-tang 6c986e127a fix macro and add vmul unit test
7 years ago
Yu Yang 0be1582df0 Merge pull request #13525 from reyoung/fix_mixed_vector
7 years ago
tensor-tang 8c69764d12 add vmul unit tests
7 years ago
tensor-tang 084893a9a9 add vadd kernel
7 years ago
tensor-tang eeff268a6c clean and refine kernels
7 years ago
tensor-tang dee5d35c20 refine vmul
7 years ago
tensor-tang 92031968d7 init vmul kernel
7 years ago
tensor-tang b9acbcc8c5 init lstm kernel
7 years ago
tensor-tang c260bf942d init jit kernel
7 years ago
Yu Yang 3043f51b3a Merge pull request #13511 from reyoung/fix_ce
7 years ago
Yu Yang f7af695801 Merge pull request #13505 from reyoung/fix_selected_rows_functor_test
7 years ago
Yu Yang 6d2c6f96f1 Revert "Revert "Merge pull request #13431 from chengduoZH/refine_lod""
7 years ago
Yu Yang a6c8d6b9a2 Revert "Merge pull request #13431 from chengduoZH/refine_lod"
7 years ago
Zeng Jinle 7f1e312677 Merge pull request #13456 from sneaxiy/refine_sparse_adam
7 years ago
Yu Yang b5996fa124 Fix unstable selected_rows_functor_test.cu
7 years ago
sneaxiy a29b4227eb fix sparse gradient clip
7 years ago
Yihua Xu 87086b1386 Refine activation for GRU operator (#13275)
7 years ago
chengduo d402234ba8 Feature/op_fuse_pass (#12440)
7 years ago
Yu Yang 2c31ea9293 Merge pull request #13424 from chengduoZH/refine_seq_concat
7 years ago
Yu Yang 5996e224fa Merge pull request #13430 from chengduoZH/refine_seq_pool
7 years ago
sneaxiy b6f61faf13 fix adam
7 years ago
chengduoZH 6534f8527a Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_lod
7 years ago
chengduoZH 24459501fe Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_seq_concat
7 years ago
chengduoZH f92b07f0b5 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_seq_pool
7 years ago
gongweibao 0c8c0d943f fix macunittest (#13434)
7 years ago
chengduoZH cdb9605bad refine
7 years ago
chengduoZH cacf549e8a refine seq_pool
7 years ago
chengduoZH e7940141ce refine seq_concat
7 years ago
tensor-tang 7c8730824a Merge pull request #13396 from tensor-tang/refine/op/lstm
7 years ago
Tao Luo 40c54db301 Merge pull request #13338 from bingyanghuang/bingyang/seq_pool_memcpy
7 years ago
tensor-tang e09cf031a8 refine src and header
7 years ago
bingyanghuang 76553c5a6d fix travis-ci
7 years ago
tensor-tang bc9971dd6c fix deps
7 years ago
tensor-tang ff858d35ed fix bug and enable on batch mode as well
7 years ago
tensor-tang 8dea07f209 fix comopile
7 years ago
tensor-tang 612ba41aee add simple lstm compute
7 years ago
bingyanghuang 83394bab3e modified by luotao's suggestion
7 years ago
Bai Yifan faf8ad2436 Add ignore_index in cross_entropy op (#13217)
7 years ago
bingyanghuang 1454cd54aa pre-commit check
7 years ago
bingyanghuang 7429067ab3 clean code
7 years ago
bingyanghuang cdbc5e7353 Add some comments
7 years ago
bingyanghuang 53185fde11 Rewrite sequence pooling last and first mode with memcpy and clean code
7 years ago
dzhwinter 379b471ee2 squash commit
7 years ago
dzhwinter f05520060e fix style (#13142)
7 years ago
tensor-tang f38905a6e5 Merge remote-tracking branch 'ups/develop' into optimize/op/fusion_gru
7 years ago
dzhwinter 34757efb8e fix windows compile
7 years ago
dzhwinter dbe90cc0f6 merge develop branch
7 years ago
dzhwinter ab1097cd8e Feature/template (#13093)
7 years ago
tensor-tang 7bdd11d88e Merge branch 'develop' into optimize/op/fusion_gru
7 years ago
tensor-tang b0d36c4c3d add cross vec to speedup gru
7 years ago
chengduo 3bd1d22a7d Enhance fused_elementwise_activation_op (#12837)
7 years ago
tensor-tang 2d0ddf8c41 refine cpu gru batch mode
7 years ago
tensor-tang 70d3981220 add cpu vec bias sub
7 years ago
tensor-tang d941192e74 fix gcc53 on cpu vec (#13020)
7 years ago
tensor-tang 2328a69157 Merge pull request #13012 from tensor-tang/refine/seq2batch
7 years ago
tensor-tang fd4f7c3ab5 refine seq2batch
7 years ago
fengjiayi 7e0c9f50ae Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op
7 years ago
fengjiayi 9cb455fa7d update function
7 years ago
Zeng Jinle ef7bd03a03 Merge pull request #12964 from sneaxiy/fix_concat_sync
7 years ago
qingqing01 1f09bc320c Support data type int8_t . (#12841)
7 years ago
dzhwinter cd8f3e9ed0 operator module is done
7 years ago
chengduo 3e1050a2e8 Add pad_constant_like_op (#12943)
7 years ago
dzhwinter 6cc7870517 fix concat synchronization bug
7 years ago
dzhwinter 2ec589a24e float.h fixed
7 years ago
dzhwinter 7dceb8a080 check some operators
7 years ago
dzhwinter 26dbe35c54 add msvc flags and copy lib done
7 years ago
Qiao Longfei 3c58b87b45 fix auc layer and add check for auc op (#12954)
7 years ago
dzhwinter d7f98f37a7 more platform is done
7 years ago
dzhwinter eca4563e5d operators module (#12938)
7 years ago
dzhwinter a94d4f51a8 fix math_function compile
7 years ago
tensor-tang 7bdaf09664 Merge remote-tracking branch 'ups/develop' into refine/jit
7 years ago
tensor-tang 3462c29940 refine add bias with avx
7 years ago
dzhwinter c1ad52f768 pre-commit
7 years ago
dzhwinter 89f95ea25e merge develop branch
7 years ago
tensor-tang bb9f98e10d add inplace test
7 years ago
tensor-tang f269614bcd further optimize tanh with avx and mkl
7 years ago
luotao1 2b4edacca0 enhance the forward of concat op
7 years ago
dzhwinter 34f8c9b6f5 windows port
7 years ago
tensor-tang 7a4924cd44 further optimize sigmoid with avx and avx512
7 years ago
tensor-tang 6bd89ba5b6 fix typo
7 years ago