Commit Graph

647 Commits (0823a7bc8b6c46a866d1e54f8cb96ccaab192bf2)

Author SHA1 Message Date
tensor-tang 7a91271436 Merge branch 'develop' into fea/jit/rnn
7 years ago
minqiyang be04d99fe4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
7 years ago
JiabinYang 2f6b529aff refine code and comments, test=develop
7 years ago
minqiyang 53433d7f2e Revert the changes of VLOG
7 years ago
tensor-tang 1f0291a51e add comments and follow comments
7 years ago
tensor-tang 557229bd39 Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
peizhilin 36cd18b549 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
JiabinYang 02d68051db add sparsed bias grad, test=develop
7 years ago
luotao1 e21edb26f6 add Set/GetCPUNumThreads api
7 years ago
JiabinYang 42470f14b7 test=develop
7 years ago
tensor-tang 6a7f83d45d enable gru jitcode and refine act and lstm jitcode
7 years ago
tensor-tang 686eaf20ba Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
JiabinYang 0fca16847c temp
7 years ago
chengduo 00b9e9a135 Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929)
7 years ago
peizhilin dfbac60398 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin 7c8c9dc9bf fix unit test cases
7 years ago
tensor-tang 0c5ed5f6fc enable peephole jitcode
7 years ago
JiabinYang 3c6102a367 test=develop
7 years ago
tensor-tang e3b61cf52b init gru jitcode and fix lstm jitcode
7 years ago
tensor-tang 0f25446574 Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
wopeizl d9a1f3e58e Windows/online (#14474)
7 years ago
peizhilin bef475c92b Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Jiabin Yang f7b55de9e5 Merge branch 'develop' into enhance_hierachical_sigmod_op
7 years ago
tensor-tang 3562051302 add gru refer code and remove redundant avx code
7 years ago
Zhaolong Xing ad349e770f Merge pull request #14452 from NHZlX/fix_avg_pool_trt_bug
7 years ago
tensor-tang f913860873 jitkernel lstm refer support peephole
7 years ago
tensor-tang 2f9b5f2383 Merge branch 'develop' into fea/jit/rnn
7 years ago
JiabinYang 014e50c284 test=develop
7 years ago
peizhilin 67562a6fcd Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang 10fb4ceefc Merge pull request #14351 from tpatejko/tpatejko/mkldnn-elementwise_mul
7 years ago
tensor-tang ce31deb7e9 refine refer code and add lstm refer code
7 years ago
nhzlx e62872df8b fix conflicts
7 years ago
tensor-tang c2cfb03a72 add lstm jitcode
7 years ago
peizhilin 25adf970b2 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang 7aa3aff338 Merge pull request #14465 from tensor-tang/fea/jit/exp
7 years ago
Tao Luo 1b894e495f Merge pull request #14437 from jczaja/prv-softmax-mkl
7 years ago
peizhilin 3a72a634cf Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yihua Xu f4c869d872 Optimize the layer_norm operator with AVX intrinsic function (#14417)
7 years ago
Yu Yang 98bbfc17be Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
Jacek Czaja 9b0eae3023 - Removing partial specialization of sotmax for inference for GPU
7 years ago
peizhilin a3e952f41d add the jit back
7 years ago
tensor-tang a19b3225a1 fix jitcode small size
7 years ago
tensor-tang 4dbdfa60ef sigmoid and tanh support all size
7 years ago
tensor-tang ccb8963705 refine exp jitcode with all size
7 years ago
tensor-tang d3eae8f61b refine relu and fix addrelu test
7 years ago
tensor-tang 4e67fe6a12 refine act and vxx with all size
7 years ago
tensor-tang ba3eaed7a7 exp support all size
7 years ago
tensor-tang 1ffce8c0ae fix build error on noavx
7 years ago
Michal Gallus c69c41604e MKLDNN elementwise_mul: Move Kernel to KernelPool to avoid segfaults
7 years ago
tensor-tang 7f17e561d7 Merge pull request #14423 from tensor-tang/fea/jit/act
7 years ago
Jacek Czaja 513bb6c151 Squashing MKL based softmax for inference
7 years ago
nhzlx 9b64aac41f add macro for pool2dDirectCUDAFunctor
7 years ago
whs 1722678258 Make nce support more distribution. (#13549)
7 years ago
nhzlx 83f8c403a7 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into fix_avg_pool_trt_bug
7 years ago
nhzlx b969116988 fxi avg pool trt bug and fix cpplint
7 years ago
tensor-tang 1f00723fa3 exp, sigmoid, tanh jitcode support more size
7 years ago
tensor-tang 8cda7b3d20 Merge remote-tracking branch 'ups/develop' into fea/jit/act
7 years ago
tensor-tang e2d6eddd32 remove ComputeDeprecated
7 years ago
tensor-tang 64f7516aee fix lrn on mac (#14426)
7 years ago
Yu Yang c8f6e70ab4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
tensor-tang f65ddff8d1 unify act jitcode of relu, exp, sigmoid and tanh
7 years ago
tensor-tang 6a159071b6 add vtanh jitcode of size 8
7 years ago
tensor-tang 046374bcd1 add vsigmoid jitcode of size 8
7 years ago
JiabinYang ba9ff508e8 temp fix
7 years ago
tensor-tang ee2a7f1b8c refine exp and fix error on avx
7 years ago
tensor-tang 1e06a32a0d add vexp jitcode of size 8
7 years ago
tensor-tang 2354409601 Merge pull request #14374 from tensor-tang/fea/jit/act
7 years ago
Tao Luo 5ef123c778 Merge branch 'develop' into dam_fc
7 years ago
dzhwinter d3aed98d86 Merge pull request #14320 from wopeizl/windows/online
7 years ago
peizhilin be332a13bc Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Jacek Czaja b361579f09 - Softmax for Inference is enabled when ON_INFER is set
7 years ago
Tao Luo e0d4e04bdd fix some compiler warning
7 years ago
JiabinYang a507845a77 test=develop
7 years ago
tensor-tang 1be85d011d add mkl vsqr and vpow
7 years ago
tensor-tang 0043c42b3e add vrelu jitcode
7 years ago
JiabinYang 32e05b01f2 test=develop
7 years ago
sneaxiy d231e55065 merge develop
7 years ago
JiabinYang c8801e100f grad diff problem to be fixed and need api spec change to be done
7 years ago
peizhilin ca60e1d34d Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin 52f7644f53 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Qiyang Min 698698f2fa Merge branch 'develop' into fix_vlog
7 years ago
Yu Yang fdc689142c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
tensor-tang 22125ebaef Merge pull request #14321 from tensor-tang/fea/jit/vscal
7 years ago
minqiyang 87450b9ad4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
7 years ago
peizhilin 41b423d41b remove duplicate
7 years ago
peizhilin dcfab11193 merge from develop
7 years ago
peizhilin 4ffa92d4f0 Merge branch 'develop' into windows/build
7 years ago
chengduo c5b6573a5a Fix input<tensor> (#14208)
7 years ago
Zhaolong Xing ba8b5619a3 Revert "cherry picked windows patches."
7 years ago
minqiyang fcc0452c8b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
7 years ago
minqiyang 0c3227a523 Change the origin VLOG level to 10 times
7 years ago
tensor-tang 5e64244f25 add vaddbias jitcode
7 years ago
tensor-tang 5f7956ae59 Merge remote-tracking branch 'ups/develop' into fea/jit/vscal
7 years ago
peizhilin 869487a2b7 Merge remote-tracking branch 'origin/develop' into windows/build
7 years ago
tensor-tang 3d950a812d combine jitcode of vscal
7 years ago
tensor-tang 03e11f3fc9 add vscal jitcode
7 years ago
dzhwinter 234a1d9248 Merge remote-tracking branch 'origin/develop' into windows/debug
7 years ago
chengduo a270fdf2db Fix SelectedRowsAdd bug (#14309)
7 years ago
tensor-tang 2f0a379af7 Merge pull request #14307 from tensor-tang/fix/mac
7 years ago
Zeng Jinle b2af213009 Merge pull request #14292 from sneaxiy/delete_buggy_selected_rows_functor
7 years ago
tensor-tang 161ba9c9d1 fix mac
7 years ago
tensor-tang e8642c3c1f Merge pull request #14265 from tensor-tang/fea/jit/vadd
7 years ago
tensor-tang 382307b943 refine code
7 years ago
tensor-tang 3319072858 fix jit kernel test on mac
7 years ago
Yu Yang 057a682ee9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
chengduo ffc866159f hot fix log (#14293)
7 years ago
tensor-tang 25e070ecc7 Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
sneaxiy 9518bc8d0a delete buggy selected_rows functor
7 years ago
chengduo a9b5d42dd4 Add fp16 backward support (#14202)
7 years ago
dzhwinter 2835e04409 merge develop branch. test=develop
7 years ago
tensor-tang cb4083b9fa fix compile error
7 years ago
tensor-tang dd343a4971 Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
tensor-tang b81e1b655e fix jit on mac
7 years ago
tensor-tang b68ececb73 add vaddrelu jitcode
7 years ago
tensor-tang bb09e31020 add vadd jitcode
7 years ago
peizhilin 71d7980f69 fix build issue 1
7 years ago
tensor-tang 8465e7876f auto grow the size and fix test
7 years ago
tensor-tang 9255119fd9 refine jit vmul with all size
7 years ago
tensor-tang a9c1824131 refine jit vmul code supporting multiple of 2
7 years ago
tensor-tang 61fdc38e51 Merge pull request #14206 from tensor-tang/fea/jit/gen
7 years ago
peizhilin 9d67c1fb69 cpu build support
7 years ago
Kaipeng Deng daed473d4a Merge pull request #14089 from heavengate/pool_exclude
7 years ago
tensor-tang 85bcb286f5 refine vmul jitcode
7 years ago
tensor-tang a3377f7b0a refine jitcode and add vmul jitcode implementation
7 years ago
dzhwinter 1ace55c8ee merge develop branch
7 years ago
tensor-tang f3badacd97 Merge remote-tracking branch 'ups/develop' into fea/jit/gen
7 years ago
tensor-tang a53b1b0b1b refine and init jitkernel vmul
7 years ago
tensor-tang 2139b9f677 add jit gencode
7 years ago
Tao Luo cdf2579d08 Merge pull request #14053 from jczaja/prv-seqpool-max
7 years ago
dzhwinter 316765839d add back jit simd instructions. stage.
7 years ago
dzhwinter bf2e4cb188 cleard. staged
7 years ago
dzhwinter ebfe5a02b3 merge develop branch
7 years ago
tensor-tang 3c957af139 Merge pull request #14080 from tensor-tang/refine/jit/crf2
7 years ago
Jacek Czaja 458b16f42a Rebase of seqpool-max optimization
7 years ago
dengkaipeng 8f1e398824 move param exclusive to the last in pool2d/pool3d for forward compatibility:. test=develop
7 years ago
Yu Yang c01696f8c2 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
dengkaipeng c93e044ae0 add inclusive/exclusive mode in PoolOp avg pool type
7 years ago
Qiao Longfei 96d5500934 optimize code
7 years ago
Qiao Longfei 748ee35c89 sum op handle empty input update selected_rows_functor.cu
7 years ago
Qiao Longfei dd78b5df93 sum op handle empty input
7 years ago
Qiao Longfei cbe128bbae Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Zeng Jinle 97d47a7d08 Merge pull request #13913 from sneaxiy/seq_reverse
7 years ago
tensor-tang 64d5b4385e fix crf decode avx512
7 years ago
tensor-tang 21487d78bf add crf decode jit kernel
7 years ago
Qiao Longfei de539d72da format
7 years ago
tensor-tang a05fce6544 Merge remote-tracking branch 'ups/develop' into fix/jit/avx
7 years ago
Qiyang Min d0fdcb2f6d Merge pull request #14048 from velconia/change_sequence_pool_to_cpu
7 years ago
tensor-tang d24d282a7a fix avx error
7 years ago
Qiao Longfei 6253b152e6 Merge branch 'optimize-sum-seq-pooling-op' of https://github.com/jacquesqiao/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei 14f5a40898 fix unit test
7 years ago
minqiyang e2a348cd10 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into change_sequence_pool_to_cpu
7 years ago
Qiao Longfei f4e6fe0786 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
minqiyang 2468057da6 Move code to SumSeqPoolGradFunctor
7 years ago
minqiyang 9725db0d40 Fix copy wrong pos bug
7 years ago
minqiyang 9c68709036 Accelerate sequence_pool functor
7 years ago
minqiyang 14ebc424d6 Add gpu support for unittest
7 years ago
minqiyang bd5a82e193 Polish unit test code
7 years ago
minqiyang 047fa2f9aa Add unit-test for sequence_pooling functor
7 years ago
sneaxiy 92a2817a2b test=develop
7 years ago
tensor-tang 032c3a07e3 Merge remote-tracking branch 'ups/develop' into refine/jit/gru
7 years ago
tensor-tang 159be8cc63 optimize fusion gru kernel at size 8
7 years ago
chengduo a7497653d0 Refine Split op (#13967)
7 years ago
sneaxiy a9d7a9d720 test=develop
7 years ago
tensor-tang 640e789d3d add fusion gru jit kernel
7 years ago
Yu Yang 461f71a90b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
tensor-tang 23fc896bc2 Merge remote-tracking branch 'ups/develop' into fea/fusion_seqconv_add
7 years ago
tensor-tang e5ce965952 refine and add eltadd_relu unit test
7 years ago
tensor-tang 7cb19a5976 fuse elementwise_add and relu
7 years ago
sneaxiy ac2eba4457 test=develop
7 years ago
tensor-tang b139b687de Merge remote-tracking branch 'ups/develop' into fix/jit/exp
7 years ago
tensor-tang 748435586a clean code exp avx
7 years ago
tensor-tang b4751a34a5 fix illegal instruction of rnn2
7 years ago
tensor-tang 36588b3365 fix illegal instruction of rnn1 and text
7 years ago
tensor-tang e69328c3bc fix warning and mac compile
7 years ago
tensor-tang 6447155dac Merge pull request #13851 from tensor-tang/fea/jitkernel_peephole
7 years ago
sneaxiy 4b4af84e67 test=develop
7 years ago
Qiao Longfei 0225957515 change elementwise_add to elementwise_add_to test=develop
7 years ago
Qiao Longfei b4a32eafdf Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Zeng Jinle 93606c2c2c Merge pull request #13689 from sneaxiy/sparse_rmsprop
7 years ago
sneaxiy 5cedfb60c8 test=develop
7 years ago
Qiao Longfei 936926aadd code optimize
7 years ago
Qiyang Min cab29828a5 Merge pull request #13829 from velconia/accelerate_sequence_pool_op
7 years ago
Qiao Longfei c52ccbc109 clean code
7 years ago
Qiao Longfei 6056d04361 optimize blas call
7 years ago
Qiao Longfei 5db7551317 optimize code
7 years ago
Qiao Longfei eb6d9e3bbe Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei 0170d36c42 fix a bug
7 years ago
Qiyang Min e37c9e6732 Merge pull request #13828 from velconia/accelerate_selected_rows_functor
7 years ago
Qiao Longfei 86e2e686ee fix bug
7 years ago
Qiao Longfei 333fd15204 add gpu test for mrege add
7 years ago
Qiao Longfei ab3e36da80 update MergeAdd for selected_rows_functor.cu
7 years ago
Qiao Longfei d5c64af24f change map to unordered_map
7 years ago
Qiao Longfei 005f1923a2 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
tensor-tang bcb8ea397d Merge remote-tracking branch 'ups/develop' into fea/jitkernel_peephole
7 years ago
tensor-tang 8e182170ba refine and replace lstm peephole kernel
7 years ago
Dun 5f2e837847 optimize depthwise conv by register memory (#13778)
7 years ago
minqiyang 3f6ec90060 Polish code
7 years ago
tensor-tang 7ef2699e18 init peephole runtime kernel
7 years ago
minqiyang 0385b0a1ea Accelerate SequencePool Op on SUM mode
7 years ago
minqiyang 8ec748cfa0 Accelerate SelectedRows Functors:
7 years ago