Commit Graph

177 Commits (557be6fc58a8fad13a830df33ec77560faaa3d7c)

Author SHA1 Message Date
tensor-tang f0f06992c1 Merge pull request #12878 from tensor-tang/feature/op/attention_lstm
7 years ago
tensor-tang 5ca0bb9aad support more activation type and remove some comments
7 years ago
tensor-tang ec59f0d454 add cpu vec
7 years ago
tensor-tang cf5ea925c3 fix bugs
7 years ago
tensor-tang 3dd66390b2 add blas vexp
7 years ago
tensor-tang 0ec1f65cf1 fix blas dot and add cblas scal
7 years ago
tensor-tang a2203d0466 add cblas dot
7 years ago
tensor-tang f72ab8961e refine blas gemm
7 years ago
Yu Yang 3768677980 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/process_lod_grad
7 years ago
Yu Yang 2a36ad1a96 Handle LoD for concat & seq_softmax ops
7 years ago
Tao Luo d04ef276a5 Merge pull request #12745 from tensor-tang/refine/op/elewise_mul
7 years ago
tensor-tang b090479409 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
dzhwinter 4069262f0e Revert ""cherry picked operators changes" (#12184)" (#12747)
7 years ago
tensor-tang 92890ac258 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tensor-tang 6644ce79a5 add mklml vmul
7 years ago
tensor-tang ff92b6ba81 Merge pull request #12531 from tensor-tang/refine/op/gru
7 years ago
tensor-tang a72f68f223 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tensor-tang f3cd2612ae refine fc and use the fc compute in fusion_lstm
7 years ago
dzhwinter bf3c34960f "cherry picked operators changes" (#12184)
7 years ago
tensor-tang 3bf3e77ac8 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
chengduo 7c8b69c700 Feature/op fusion (#12240)
7 years ago
tensor-tang 54c95e49f0 fix blas
7 years ago
tensor-tang 8c23f7c4f0 fix blas and use packed weight
7 years ago
tensor-tang 43cee33a23 add mkl packed gemm
7 years ago
tensor-tang d8d2dbcfac further optimize im2col using variables
7 years ago
tensor-tang 687a322267 Merge remote-tracking branch 'ups/develop' into refine/im2col
7 years ago
tensor-tang 65d418f060 complete im2col with padding==1 and speedup filter width==1
7 years ago
tensor-tang 52eb86e30f refine im2col benchmark
7 years ago
tensor-tang 3017f46076 add more test cases
7 years ago
tensor-tang 8d6be4fb5f refine im2col test and add benchmark
7 years ago
tensor-tang 507c143047 im2col cfo cpu code clean
7 years ago
tensor-tang 4eeed0b5e4 refine width padding and enable core copy
7 years ago
Wu Yi 73fcfc06ec refine conv cudnn enforce (#12353)
7 years ago
tensor-tang e3131e2d73 enable width padding
7 years ago
tensor-tang 92518c519f reuse sizes saving time
7 years ago
tensor-tang 660df122ce enable padding!=0 and fill height padding with 0
7 years ago
tensor-tang d8e00facf7 reuse im_size
7 years ago
tensor-tang b72befc5cc reuse copy size
7 years ago
tensor-tang 6788af4bf1 refine test cases
7 years ago
tensor-tang b163e601b6 add gtest
7 years ago
tensor-tang aae994fd26 refine im2col no padding
7 years ago
Yan Chunwei 02cf54d331 bugfix lod cpu performance (#12297)
7 years ago
tensor-tang fc2b578842 add gemm_warp test
7 years ago
tensor-tang a916c52579 refine gemm
7 years ago
tensor-tang 961e754c9f mkl split gemm for better perf
7 years ago
tensor-tang f0cd493c0d Merge pull request #11989 from tensor-tang/feature/libxsmm
7 years ago
Guo Sheng da3f766821 Merge pull request #12088 from guoshengCS/complete-hsigmoid
7 years ago
guosheng 4ee069fdba Fix the HierarchicalSigmoidGradOpKernel and refine the codes. Now hsigmoid_op is same with V2 implementation and can pass gradient check.
7 years ago
tensor-tang 1c5d6c5692 disable xsmm with float16
7 years ago
tensor-tang c9ba51ead8 Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago