Commit Graph

2249 Commits (f3729db6e03d5e290020d3cc74cfb50572902c4c)

Author SHA1 Message Date
guochaorong 76e9227467
Merge pull request #13199 from JiayiFeng/fix_CudnnHolder_bug
7 years ago
Krzysztof Binias 1658958fe6 Reusing converted weights
7 years ago
Yan Xu d117bbc313
Merge pull request #13291 from Yancey1989/reset_vars_on_pserver
7 years ago
qingqing01 a39eba77eb
Implement norm_op by CUDA instead of Eigen. (#13273)
7 years ago
Yancey1989 32b94a7d13 cache var types
7 years ago
Yancey1989 580f55fa0f update by comment
7 years ago
Yang Yu 8331e835a8 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug
7 years ago
Yancey1989 6edfae4234 reset received vars on pserver
7 years ago
tensor-tang 40dbd97f8e Merge remote-tracking branch 'ups/develop' into refine/op/peephole
7 years ago
Qiyang Min b805751598
Merge pull request #13223 from velconia/open_python35_CI
7 years ago
Yu Yang 34e467dcab
Merge pull request #13232 from reyoung/feature/fix_layer_norm
7 years ago
chengduo 886852557f
Refine reshape_grad and transpose_grad (#13074)
7 years ago
tensor-tang 3eb55f0643 Merge remote-tracking branch 'ups/develop' into refine/op/peephole
7 years ago
tensor-tang d7ac1cc836 refine seq when bs is large
7 years ago
tensor-tang 9dd5a177a5 refine batch mode and peephole
7 years ago
Qiao Longfei 6e03f7900f
Add centered mode rmsprop (#13161)
7 years ago
Yan Chunwei 9df2d8b5ba
test/add text-classification test (#13081)
7 years ago
tensor-tang f10710b0ca move seq peephole if out of loop
7 years ago
tensor-tang 2f3b498949 refine fusion seq lstm peephole
7 years ago
tangwei12 d1e2efae6b
reimplement auc in fluid (#13167)
7 years ago
Yu Yang f57d706aa7 Use double to reduce
7 years ago
tensor-tang 5f586e2223 Merge remote-tracking branch 'ups/develop' into refine/op/fusion_lstm
7 years ago
Brian Liu 04272c0d41 Enable lstm peephole (#13160)
7 years ago
fengjiayi 56750e6a3e Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug
7 years ago
Qiao Longfei cdd14f17f1
fix async mode handle COMPLETE_MESSAGE (#13212)
7 years ago
minqiyang 8059445fb5 Fix fake_quantize_op
7 years ago
tensor-tang 78d9ad5712 fusion gru enfore only used
7 years ago
tensor-tang 555083ae2a enforce only used
7 years ago
fengjiayi db5e3dd767 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug
7 years ago
Jiabin Yang d091dd02a0 fix mac compile error 0903 (#13184)
7 years ago
Yu Yang cda7842e26 Revert "Revert "Add Python Callstacks when Op::Run error (#12759)""
7 years ago
qingqing01 9557cc218d
Refine and fix some code for faster-rcnn. (#13135)
7 years ago
fengjiayi 82a1b35b9b Revert "Revert "Add CudnnHolder and use it in Conv and ConvTranspose op""
7 years ago
guochaorong 151e169eb7
Revert "Add CudnnHolder and use it in Conv and ConvTranspose op"
7 years ago
Chen Weihang 3b6090e80b
Merge pull request #12887 from chenwhql/sequence_enumerate_op
7 years ago
tensor-tang 1cc35f3642
Merge pull request #13118 from tensor-tang/optimize/op/fusion_lstm
7 years ago
dzhwinter 6fb28796f5
memory (#13143)
7 years ago
dzhwinter e722f68318
fix windows compile (#13147)
7 years ago
dzhwinter f05520060e
fix style (#13142)
7 years ago
dzhwinter 856c26faef
fix elementwise (#13146)
7 years ago
fengjiayi 653c8ded7d
Merge pull request #13078 from JiayiFeng/dev_CudnnHolder
7 years ago
tensor-tang 20659fc905
Merge pull request #13107 from tensor-tang/optimize/op/fusion_gru
7 years ago
tensor-tang 93c034ee51 Merge remote-tracking branch 'ups/develop' into optimize/op/fusion_lstm
7 years ago
tensor-tang c7adb99ae0 follow comment and refine code
7 years ago
tensor-tang 83f4bc4ecf follow comment and refine code
7 years ago
tensor-tang f38905a6e5 Merge remote-tracking branch 'ups/develop' into optimize/op/fusion_gru
7 years ago
tangwei12 fbdd4f8c0f
Merge pull request #13101 from zenghsh3/develop
7 years ago
tensor-tang 9838bacb35
Merge branch 'develop' into optimize/op/fusion_lstm
7 years ago
qingqing01 9bd933d3fb
Improve and fix fake_quantize_op (#13092)
7 years ago
Tao Luo 3fe0575b62
Merge pull request #13148 from dzhwinter/windows/math_compile
7 years ago
chenweihang 7ddbbcb0b5 doc: refine API and doc
7 years ago
dzhwinter 34757efb8e fix windows compile
7 years ago
tensor-tang c44108803a refine prelu
7 years ago
chenweihang b081363bae Merge branch 'sequence_enumerate_op' of https://github.com/chenwhql/Paddle into sequence_enumerate_op
7 years ago
chenweihang 0b7d82befb doc: refine English description
7 years ago
dzhwinter b11332a07b
"fix style" (#13094)
7 years ago
dzhwinter ab1097cd8e
Feature/template (#13093)
7 years ago
tensor-tang 80edd7ef29 enable run with fuse pass
7 years ago
fengjiayi f79ca23115 fix bugs
7 years ago
tensor-tang a79a77eeb5 refine and clean code
7 years ago
tensor-tang c459fb5be0 add fusion lstm batch mode
7 years ago
whs e10aa80f03
Add pad2d op. (#12950)
7 years ago
tensor-tang 7bdd11d88e Merge branch 'develop' into optimize/op/fusion_gru
7 years ago
fengjiayi 1f36a4c27c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_CudnnHolder
7 years ago
fengjiayi b0aca8824d make CudnnHolder thread safe
7 years ago
tensor-tang 596213906b add gru seq mode forward
7 years ago
zenghsh3 d7495838b3 refine
7 years ago
zenghsh3 04a05d1d58 merged
7 years ago
zenghsh3 08b73b68c4 fix bug of sampling_id_op
7 years ago
tensor-tang b0d36c4c3d add cross vec to speedup gru
7 years ago
tensor-tang 038c16eed2 save intermediate data to out buffer
7 years ago
Xingyuan Bu 0a97d24b41 Faster RCNN Generate Proposal Labels (#12616)
7 years ago
fengjiayi d5f74b7308 use CudnnHolder in conv_transpose_cudnn_op
7 years ago
fengjiayi 407ff0bdbc use CudnnHolder in conv_cudnn_op
7 years ago
chengduo 3bd1d22a7d
Enhance fused_elementwise_activation_op (#12837)
7 years ago
tensor-tang 2d0ddf8c41 refine cpu gru batch mode
7 years ago
tensor-tang 70d3981220 add cpu vec bias sub
7 years ago
jerrywgz 85fe65ae61 modified error info for maxout op
7 years ago
Chen Weihang b98b744067
Merge branch 'develop' into sequence_enumerate_op
7 years ago
Yan Chunwei 902f19b46a
fea/fuse attention lstm simplify.with fusion lstm.with sequnce expand (#13006)
7 years ago
Xingyuan Bu 2ad5d91ef8 Faster RCNN Generate Proposals (#12056)
7 years ago
tensor-tang 89d6d69ce4
Merge pull request #12781 from tensor-tang/feature/op/fusion_gru
7 years ago
tensor-tang d941192e74 fix gcc53 on cpu vec (#13020)
7 years ago
tensor-tang 2328a69157
Merge pull request #13012 from tensor-tang/refine/seq2batch
7 years ago
Xin Pan 2bb15f437c
Merge pull request #12791 from panyx0718/ir3
7 years ago
Qiao Longfei a22309afe8
clean useless check code in auc_op (#13023)
7 years ago
Yu Yang 8965cee89f
Polish PrintOp (#12895)
7 years ago
chengduo 7ad39c4077
Enhance pad_constant_like_op (#12999)
7 years ago
qingqing01 0353eddb51
Improve fake_dequantize_op. (#12877)
7 years ago
Qiao Longfei 11e01d9b2d
Scale support selectedrows (#12960)
7 years ago
fengjiayi 7b84c580e2
Merge pull request #12824 from JiayiFeng/dev_sequence_padding_op
7 years ago
tensor-tang fd4f7c3ab5 refine seq2batch
7 years ago
Wu Yi 0ee6fed05b
Refine dist rpc deps (#12899)
7 years ago
fengjiayi 7e0c9f50ae Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op
7 years ago
Zeng Jinle 599a32641b
Merge pull request #12971 from sneaxiy/unstack_op
7 years ago
Tao Luo 26cac36bfd
Merge pull request #12515 from kbinias/kbinias/bnorm-fwd-reuse
7 years ago
tensor-tang a481c5e98c Merge remote-tracking branch 'ups/develop' into feature/op/fusion_expand_concat_fc
7 years ago
tensor-tang 49c31febb5 fix typo and op test
7 years ago
fengjiayi 9cb455fa7d update function
7 years ago
Krzysztof Binias fb4b4f8d57 Refactor code
7 years ago
Krzysztof Binias 50d3e6e96b Reusing primitives for forward Batch Norm operator
7 years ago
Zeng Jinle ef7bd03a03
Merge pull request #12964 from sneaxiy/fix_concat_sync
7 years ago
sneaxiy 52a480bb98 Merge develop
7 years ago
tensor-tang 02909335e9 rename fusion seq_concat_fc to fusion seqexpand_concat_fc
7 years ago
Xin Pan 1a67061fee graph to program pass
7 years ago
qingqing01 1f09bc320c
Support data type int8_t . (#12841)
7 years ago
chenweihang 00b30b9938 doc: unified infershape format
7 years ago
chenweihang 0c4697f8cd fix: change to enumerate by sentence
7 years ago
tensor-tang c45cee0349 refine infershape and forward
7 years ago
sneaxiy 24264bc0b8 Merge develop
7 years ago
dzhwinter 0153c21d83 add unstack_op
7 years ago
tensor-tang c7c2506733 add forward implementation
7 years ago
jerrywgz 6033c1a278 Add error info & remove data sharing between input and output in rnn_memory_helper_op
7 years ago
chengduo 3e1050a2e8
Add pad_constant_like_op (#12943)
7 years ago
dzhwinter 6cc7870517 fix concat synchronization bug
7 years ago
tensor-tang 954b0e113f init fusion seq expand concat fc op
7 years ago
tensor-tang c488ee96a7 Merge remote-tracking branch 'ups/develop' into refine/op/fusion_lstm
7 years ago
tensor-tang e61cf3214d complete reverse seq
7 years ago
Chen Weihang 4ec12496dd
Merge branch 'develop' into sequence_enumerate_op
7 years ago
tensor-tang 4b28fab8c9 enable more acts
7 years ago
tensor-tang 607c41952e compute gates
7 years ago
Qiao Longfei 3c58b87b45
fix auc layer and add check for auc op (#12954)
7 years ago
jerrywgz 835573bbf2 add error_info prelu_op
7 years ago
Yibing Liu c1488b1796
Merge pull request #12940 from sneaxiy/stack_op
7 years ago
dzhwinter eca4563e5d
operators module (#12938)
7 years ago
tensor-tang 6be273cbdb add seq mode lstm
7 years ago
tensor-tang 36363292c3
Merge pull request #12904 from tensor-tang/refine/jit
7 years ago
jerrywgz bc7503c85e modified error_info for maxout_op
7 years ago
Zeng Jinle d189d4dbab
Merge pull request #12884 from sneaxiy/sequence_mask_op
7 years ago
sneaxiy 3b38e5a4fc speed up stack_op
7 years ago
tensor-tang 7bdaf09664 Merge remote-tracking branch 'ups/develop' into refine/jit
7 years ago
Tao Luo 989cc2a4f4
Merge pull request #12913 from luotao1/concat
7 years ago
Tao Luo 8650f6ffae
Merge pull request #12898 from luotao1/expand
7 years ago
Qiao Longfei 52948a0b50
Merge pull request #12909 from jacquesqiao/fix-sparse-update-bug
7 years ago
tensor-tang ba943d38e3 make runtime avx act
7 years ago
tensor-tang 3462c29940 refine add bias with avx
7 years ago
tangwei12 ef6445ee39
Merge pull request #12908 from seiriosPlus/fill_constant_selectedrows
7 years ago
tensor-tang bb9f98e10d add inplace test
7 years ago
tensor-tang f269614bcd further optimize tanh with avx and mkl
7 years ago
chenweihang 733ea0d29b adjust infershape details
7 years ago
luotao1 e999c74cff Merge branch 'develop' into concat
7 years ago
luotao1 b61cf7ac4f Merge branch 'develop' into expand
7 years ago
luotao1 2b4edacca0 enhance the forward of concat op
7 years ago
Tao Luo 3e3b5f4fda
Merge pull request #12675 from Sand3r-/fix-conv-mkldnn-0.15
7 years ago
tensor-tang 7a4924cd44 further optimize sigmoid with avx and avx512
7 years ago
qiaolongfei fcf20eed0f fix sparse update bug
7 years ago
tangwei12 ca22586818 code optimize
7 years ago
Xin Pan 557be6fc58
Merge pull request #12902 from PaddlePaddle/revert-12736
7 years ago
tensor-tang 6bd89ba5b6 fix typo
7 years ago
Chen Weihang 2969aba14f
Merge branch 'develop' into sequence_enumerate_op
7 years ago
chenweihang 219a2369da feat: wrap sequence enumerate op
7 years ago
tensor-tang e3bb98eb38 optimize relu with avx and avx512
7 years ago
guochaorong 1f270275a6 Revert "Add Python Callstacks when Op::Run error (#12759)"
7 years ago
guochaorong b1fc238694 Revert "Disable in_place in batch_norm API. (#12736)"
7 years ago
tensor-tang 25976fe736 optimize the sigmoid and tanh
7 years ago
tensor-tang 2eb46c2b06 add cpu vec test
7 years ago
sneaxiy 1083e99520 Merge develop
7 years ago
tensor-tang f0f06992c1
Merge pull request #12878 from tensor-tang/feature/op/attention_lstm
7 years ago
luotao1 83f4edabe9 remove broadcast in sequence_expand
7 years ago
sneaxiy 5ea7bf88ba
Merge pull request #12872 from sneaxiy/stack_op
7 years ago
Tao Luo ef2da86b4f
Merge pull request #12885 from luotao1/test_ditu_rnn
7 years ago
sneaxiy e895c98f0a add support to max_len is None
7 years ago
fengjiayi f4a4a4cbd9 add op comment and python layer
7 years ago
tangwei12 acdd95d5ca bug fix
7 years ago
chenweihang d2e5395b97 feat: add sequence enumerate op
7 years ago
luotao1 9c7fde45a7 enhance test_analyzer to profile ditu inference demo
7 years ago
chengduo 8ad9055804
Add is_test for while_op (#12874)
7 years ago
sneaxiy 64464cb1fa Merge develop
7 years ago
qingqing01 79918a8442 add sequence_mask_op for DAM model
7 years ago
Yu Yang b2df17003f
Add Python Callstacks when Op::Run error (#12759)
7 years ago
Yu Yang 17fcc4f5d0
Merge pull request #12864 from reyoung/feature/process_lod_grad
7 years ago
tensor-tang 5ca0bb9aad support more activation type and remove some comments
7 years ago
sneaxiy ba168bd2d2 modify API.spec
7 years ago
tensor-tang d9bf73f3ab Merge remote-tracking branch 'ups/develop' into feature/op/fusion_gru
7 years ago
tensor-tang dd938d0b94 fix bugs and pass op test
7 years ago
tensor-tang ec59f0d454 add cpu vec
7 years ago
tensor-tang cf5ea925c3 fix bugs
7 years ago
tensor-tang 6ed20474d4 refine attention lstm infershape
7 years ago
tensor-tang 508548f897 implement attention lstm cpu forward
7 years ago
tensor-tang 9affc36c89 init attention lstm
7 years ago
tensor-tang 3dd66390b2 add blas vexp
7 years ago
tensor-tang 0ec1f65cf1 fix blas dot and add cblas scal
7 years ago
tensor-tang a2203d0466 add cblas dot
7 years ago
tensor-tang f72ab8961e refine blas gemm
7 years ago
qingqing01 f5d5d7b2d9
Disable in_place in batch_norm API. (#12736)
7 years ago
sneaxiy c73c5ed573 use for_range
7 years ago
Xin Pan b548ecbc2b add stack_op
7 years ago
Yu Yang eb8fd853bc Fix sequence_softmax_cudnn op
7 years ago
Yu Yang 3768677980 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/process_lod_grad
7 years ago
Yu Yang 2a36ad1a96 Handle LoD for concat & seq_softmax ops
7 years ago
Yu Yang 211d81863d Process elemwise grad op's lod. mul_op's lod
7 years ago
Yan Chunwei 9ee698e605
enhance/ditu rnn with fc fuse (#12831)
7 years ago
Xin Pan 78415f326d
Merge pull request #12838 from panyx0718/infer
7 years ago
fengjiayi ce182d9037 bug fix
7 years ago
Xin Pan a2c0e52f3e speed up while_op
7 years ago
tensor-tang 6f78fd7d1e fuse fc in gru
7 years ago
tensor-tang 300180cc26 init fusion gru op
7 years ago
Zhaolong Xing 21ba32b065
Merge pull request #12843 from NHZlX/fix_ssa_bug_for_trt
7 years ago
Michał Gallus cd32ddac12 Fuse Convolution and Eltwise Add into MKLDNN's Conv+Bias (#12669)
7 years ago
nhzlx c999895e93 merge develop
7 years ago
nhzlx 276950291a 1. fix ssa bug with batchnorm, 2. refine the trt
7 years ago
Yan Chunwei 896a37b6e3
fea/link ir to inference analysis and fc fuse support (#12789)
7 years ago
dzhwinter e23ddf6ae4
status (#12764)
7 years ago
Tao Luo d04ef276a5
Merge pull request #12745 from tensor-tang/refine/op/elewise_mul
7 years ago
tangwei12 cbc6e6eb97
Merge pull request #12247 from seiriosPlus/dis_ckpt_fix
7 years ago
Qingsheng Li 3d11d018e0
Fix scatter_op python API (#12742)
7 years ago
Tao Luo 8f9f414a14
Merge pull request #12805 from tensor-tang/fix/op/elewise_add
7 years ago
tensor-tang e955361267
Merge pull request #12737 from tensor-tang/feature/op/fusion_lstm
7 years ago
tensor-tang 82bb9170fb Merge remote-tracking branch 'ups/develop' into fix/op/elewise_add
7 years ago
Chen Weihang 57b34d9196
Merge pull request #12808 from chenwhql/remove_inplace_param_in_squeeze_and_unsqueeze
7 years ago
Yihua Xu 084d4a9e9e Optimize CRF Decoding with AVX/AVX2/AVX512F instruction (#12767)
7 years ago
fengjiayi 34b209cffa Complete sequence_padding GPU kernel
7 years ago
dzhwinter 00463fdfe3
cudnn windows support (#12757)
7 years ago
qingqing01 c62f68cb94
Fix bug in conditional_block_op. (#12246)
7 years ago
chenweihang bc471b6ac4 refactor: remove inplace parameter from squeeze and unsqueeze op
7 years ago
tensor-tang 0507f7bc3c fix SEGV elementwise add at debug mode
7 years ago
tangwei12 ca1e18c04a
Merge pull request #12469 from seiriosPlus/sum_op_dim_fix
7 years ago
Zhaolong Xing e5674f6dde
Merge pull request #12753 from NHZlX/add_benchmark
7 years ago
tensor-tang b090479409 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tangwei12 b4f52b01d0 bug fix when all inputs are empty
7 years ago
tangwei12 3efac174ea Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tangwei12 dbb4f0d35d Merge branch 'develop' of github.com:PaddlePaddle/Paddle into dis_ckpt_fix
7 years ago
Qiao Longfei fd10669ecb
Add dependency to send recv (#12760)
7 years ago
fengjiayi 8d8d48a34f Complete sequence_pad_op and its CPU kernel. Add unittests
7 years ago
tangwei12 7c12c0f865 add sync in load selectedrows
7 years ago
Michal Gallus 4a7f0698e0 Add consts to new MKLDNN integration
7 years ago
Michal Gallus 6588d0e039 Update MKLDNN to 0.15, fix conv integration
7 years ago
tangwei12 9f11db4080 add todo in impl
7 years ago
tangwei12 c24a9263ba Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tangwei12 ac9ae97001 code fix
7 years ago
nhzlx f55e8901c8 merge develop
7 years ago
nhzlx 1600ba86f6 1. change tensorrt op from cpu to gpu
7 years ago
tangwei12 bb9f494740 merge develop
7 years ago
dzhwinter 4069262f0e
Revert ""cherry picked operators changes" (#12184)" (#12747)
7 years ago
Qiao Longfei 653fad08f8
Optimize selected rows for dist lookup table with pthread rwlock (#12635)
7 years ago
fengjiayi 3c749fae43 update CPU sequence_padding functor
7 years ago
tensor-tang 92890ac258 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tangwei12 0749c8822d
Merge pull request #12556 from seiriosPlus/samplingIdOp
7 years ago
tensor-tang a56142c155 optimize elementwise_mul cpu forward
7 years ago
tensor-tang 6644ce79a5 add mklml vmul
7 years ago
tensor-tang ff92b6ba81
Merge pull request #12531 from tensor-tang/refine/op/gru
7 years ago
tangwei12 26b228e405 remove assignment and add vlog
7 years ago
tangwei12 125e9166e1 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tensor-tang a72f68f223 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tensor-tang df28a3b452 fix lod and op test
7 years ago
Qingsheng Li 317e18abd2
Remove Data Sharing between input and output in scatter_op (#12672)
7 years ago
tensor-tang f3cd2612ae refine fc and use the fc compute in fusion_lstm
7 years ago
tangwei12 822496f626 merge cpu and gpu
7 years ago
dzhwinter bf3c34960f
"cherry picked operators changes" (#12184)
7 years ago
tensor-tang 40138c4cd6 add unit test of fusion lstm op
7 years ago
jerrywgz c108376506 Add three modes for prelu_op (#12630)
7 years ago
tangwei12 9f09d68678 add enforce
7 years ago
gongweibao d06849305a
parameter dispather. (#12666)
7 years ago
tensor-tang 852bc6f4aa refine fusion lstm op doc
7 years ago
tensor-tang 8f9132959e fuse fc in lstm
7 years ago
tensor-tang ddb05dffb6 init fusion lstm op
7 years ago
tensor-tang efc5392d97
Merge pull request #12676 from tensor-tang/refine/op/fc
7 years ago
tangwei12 470fb7c5c3 bug fix
7 years ago
tangwei12 60dda7bf9f add gpu Implementation
7 years ago
tangwei12 4661f5589d random optimize
7 years ago
Bai Yifan 9333a62792
Add flatten op interface and enhance APIs about detection to support variable-length image. (#12422)
7 years ago
tensor-tang eee38464dc refine fc op use cpu only
7 years ago
tangwei12 ed937bc6f8 merge
7 years ago
tensor-tang d84a1a0010 fc op use cpu only
7 years ago
fengjiayi a38a8db928 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op
7 years ago
tangwei12 478f73c188 merge header in cc
7 years ago
fengjiayi d6b5302bd6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
tensor-tang c588c64a76 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
tensor-tang 0098a494a2 Merge remote-tracking branch 'ups/develop' into refine/op/fc
7 years ago
fengjiayi 5e7aa8c7e5 code clean
7 years ago
tensor-tang 742300baa8 fix unkown omp pragmas
7 years ago
tensor-tang b9dbb7c5cb fix bias attri in mkldnn fc
7 years ago
tangwei12 59580a7f69 bug fix
7 years ago
tensor-tang 4b5986bb77 enable fc op in normal case
7 years ago
tensor-tang e133df6037 enable native fc forward
7 years ago
tensor-tang 6a2a9a8350
Revert "Refine elementwise_add op"
7 years ago
Yu Yang 8dda526a45
Merge pull request #12659 from sneaxiy/refine_softmax_with_cross_entropy
7 years ago
sneaxiy f6f5cdaa05
Merge pull request #12555 from sneaxiy/refine_layer_norm
7 years ago
sneaxiy c50c537732 fix arithmetic error in backward kernel
7 years ago
tensor-tang 038cbf799d add bias for fc op
7 years ago
whs 9d6243b6fb Fix crop op. (#12603)
7 years ago
Bai Yifan 649f5d74f0
fix mine_hard_example bug (#12664)
7 years ago
sneaxiy 2d9508f8f3
Merge pull request #12554 from sneaxiy/refine_elementwise_add
7 years ago
tensor-tang 171a0e2b42 add some comment
7 years ago
sneaxiy 2c560623d1 fix dependency error
7 years ago
tensor-tang 5377edd282 refine packed condition
7 years ago
tensor-tang 3bf3e77ac8 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
qiaolongfei c0890988da add RPCServerProfiler, replace listen and serv optimizer
7 years ago
tangwei12 64a4925cb4 Merge branch 'Pdv' into samplingIdOp
7 years ago
tangwei12 0bfd62be3d remove gpu supported, will add it later
7 years ago
Tao Luo 5a9ae411e0
Merge pull request #12618 from sfraczek/sfraczek/fix-new-mkldnn-conv-tests
7 years ago
sneaxiy cf799a6a04
Merge pull request #12553 from sneaxiy/refine_softmax_with_cross_entropy
7 years ago
dzhwinter 8499559c42
"fix style" (#12600)
7 years ago
sneaxiy 010883689c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_layer_norm
7 years ago
sneaxiy 5d698589ce Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_elementwise_add
7 years ago
sneaxiy 19ff254d05 Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add
7 years ago
Sylwester Fraczek d74bb6ab9c fix ut for mkldnn 0.15 - added forcing layout NCHW in mkldnn conv tests
7 years ago
fengjiayi 855c9e3311 clean softmax_op code
7 years ago
fengjiayi 24d51de022 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
fengjiayi 27df3a9f2b make cross_entropy_op supporting tensors
7 years ago
fengjiayi 66be53264e
Merge pull request #12592 from JiayiFeng/fix_mac_compile_error
7 years ago
fengjiayi 8e604a10aa fix mac compile error
7 years ago
nhzlx 551c802cdc merge develop
7 years ago
sneaxiy ad45d39222 refine layer_norm
7 years ago
chengduo 7c8b69c700
Feature/op fusion (#12240)
7 years ago
sneaxiy 1b4515f6db refine softmax_with_cross_entropy
7 years ago
nhzlx 3a0caf801f modify trt engine op test
7 years ago
nhzlx e51d045a6d modify trt engine op test
7 years ago
nhzlx e8954a36f5 merge develop
7 years ago
nhzlx 32a9e050bc mapping the variable name inside the subgraph
7 years ago
Wu Yi 2d036c47cd
polish dist unit test code (#12512)
7 years ago
fengjiayi 7834b4a470 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
tangwei12 5bfdefae91 Merge branch 'Pdv' into samplingIdOp
7 years ago
tangwei12 b30bdde15a random optimize
7 years ago
tangwei12 9c63fef63c random optimize
7 years ago
Qiao Longfei 88a607c342
Merge pull request #12541 from jacquesqiao/optimize-profiler
7 years ago
tangwei12 5b9716d1f6 add dims check
7 years ago
tangwei12 4cd504d3b4 bug fix
7 years ago
sneaxiy e57bc4d745 Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add
7 years ago
sneaxiy 222fbbedfb Merge branch 'develop' into refine_elementwise_add
7 years ago
sneaxiy 4b83afff6e
Merge branch 'develop' into refine_elementwise_add
7 years ago
sneaxiy b2d0ee5159 refine elementwise_add op
7 years ago
tangwei12 da2cc99f67 sampling op optimize
7 years ago
fengjiayi 7c55e08c93 stash
7 years ago
tangwei12 4973e07be3 sampling op optimize
7 years ago
tensor-tang 836068569f Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
tensor-tang 18c322c2a1 seperate cpu and gpu implementations for gru kernel compute
7 years ago
tensor-tang 54c95e49f0 fix blas
7 years ago
fengjiayi b656d97e86
Merge pull request #12485 from JiayiFeng/dev_ops_tensor_support
7 years ago
qiaolongfei 1623f1ba4f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-profiler
7 years ago
tangwei12 3206970b77 sampling op rename
7 years ago
Xin Pan 99a77cfc62
Merge pull request #12468 from panyx0718/improve_profiler2
7 years ago
qiaolongfei a3f9d6a38c optimize profiler
7 years ago
tangwei12 e0ab2f7158 new sampling op
7 years ago
tensor-tang 8c23f7c4f0 fix blas and use packed weight
7 years ago
tensor-tang d9cc6b1866 replace gru compute with details
7 years ago
tensor-tang 43cee33a23 add mkl packed gemm
7 years ago
tangwei12 766ac488ac sum_op selectedRows dim bug fix
7 years ago
dzhwinter 595a2c83ae
explicit gradient of elementwise_add/elementwise_sub (#11970)
7 years ago
fengjiayi e7d8e16a66 update softmax_mkldnn_op
7 years ago
Yu Yang 2567afa35d
Merge pull request #12462 from reyoung/feature/fix_cudnn_deterministic
7 years ago
fengjiayi dc111d3476 update softmax_cudnn_op
7 years ago
fengjiayi f7bd0b227b Add unittests for softmax_op
7 years ago
gongweibao 819ac3df0a
Modify style (#12465)
7 years ago
fengjiayi b314a69523 make softmax supporting tensors
7 years ago
fengjiayi b1af7e5d9b Add unittests for lookup_table_op
7 years ago
tangwei12 c4c8f60bec sum_op selectedRows dim bug fix
7 years ago
Xin Pan 486345551d clean
7 years ago
Xin Pan caf10b474f make profiler use thread_id from g_thread_id
7 years ago
Yu Yang 040fc1c39b Fix bug in cudnn_determistic
7 years ago