Commit Graph

2205 Commits (9775e50ca23842c50c1ab741e7fae32c1a1b3609)

Author SHA1 Message Date
tensor-tang 7a4924cd44 further optimize sigmoid with avx and avx512
7 years ago
qiaolongfei fcf20eed0f fix sparse update bug
7 years ago
tangwei12 ca22586818 code optimize
7 years ago
Xin Pan 557be6fc58
Merge pull request #12902 from PaddlePaddle/revert-12736
7 years ago
tensor-tang 6bd89ba5b6 fix typo
7 years ago
Chen Weihang 2969aba14f
Merge branch 'develop' into sequence_enumerate_op
7 years ago
chenweihang 219a2369da feat: wrap sequence enumerate op
7 years ago
tensor-tang e3bb98eb38 optimize relu with avx and avx512
7 years ago
guochaorong 1f270275a6 Revert "Add Python Callstacks when Op::Run error (#12759)"
7 years ago
guochaorong b1fc238694 Revert "Disable in_place in batch_norm API. (#12736)"
7 years ago
tensor-tang 25976fe736 optimize the sigmoid and tanh
7 years ago
tensor-tang 2eb46c2b06 add cpu vec test
7 years ago
sneaxiy 1083e99520 Merge develop
7 years ago
tensor-tang f0f06992c1
Merge pull request #12878 from tensor-tang/feature/op/attention_lstm
7 years ago
luotao1 83f4edabe9 remove broadcast in sequence_expand
7 years ago
sneaxiy 5ea7bf88ba
Merge pull request #12872 from sneaxiy/stack_op
7 years ago
Tao Luo ef2da86b4f
Merge pull request #12885 from luotao1/test_ditu_rnn
7 years ago
sneaxiy e895c98f0a add support to max_len is None
7 years ago
fengjiayi f4a4a4cbd9 add op comment and python layer
7 years ago
tangwei12 acdd95d5ca bug fix
7 years ago
chenweihang d2e5395b97 feat: add sequence enumerate op
7 years ago
luotao1 9c7fde45a7 enhance test_analyzer to profile ditu inference demo
7 years ago
chengduo 8ad9055804
Add is_test for while_op (#12874)
7 years ago
sneaxiy 64464cb1fa Merge develop
7 years ago
qingqing01 79918a8442 add sequence_mask_op for DAM model
7 years ago
Yu Yang b2df17003f
Add Python Callstacks when Op::Run error (#12759)
7 years ago
Yu Yang 17fcc4f5d0
Merge pull request #12864 from reyoung/feature/process_lod_grad
7 years ago
tensor-tang 5ca0bb9aad support more activation type and remove some comments
7 years ago
sneaxiy ba168bd2d2 modify API.spec
7 years ago
tensor-tang d9bf73f3ab Merge remote-tracking branch 'ups/develop' into feature/op/fusion_gru
7 years ago
tensor-tang dd938d0b94 fix bugs and pass op test
7 years ago
tensor-tang ec59f0d454 add cpu vec
7 years ago
tensor-tang cf5ea925c3 fix bugs
7 years ago
tensor-tang 6ed20474d4 refine attention lstm infershape
7 years ago
tensor-tang 508548f897 implement attention lstm cpu forward
7 years ago
tensor-tang 9affc36c89 init attention lstm
7 years ago
tensor-tang 3dd66390b2 add blas vexp
7 years ago
tensor-tang 0ec1f65cf1 fix blas dot and add cblas scal
7 years ago
tensor-tang a2203d0466 add cblas dot
7 years ago
tensor-tang f72ab8961e refine blas gemm
7 years ago
qingqing01 f5d5d7b2d9
Disable in_place in batch_norm API. (#12736)
7 years ago
sneaxiy c73c5ed573 use for_range
7 years ago
Xin Pan b548ecbc2b add stack_op
7 years ago
Yu Yang eb8fd853bc Fix sequence_softmax_cudnn op
7 years ago
Yu Yang 3768677980 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/process_lod_grad
7 years ago
Yu Yang 2a36ad1a96 Handle LoD for concat & seq_softmax ops
7 years ago
Yu Yang 211d81863d Process elemwise grad op's lod. mul_op's lod
7 years ago
Yan Chunwei 9ee698e605
enhance/ditu rnn with fc fuse (#12831)
7 years ago
Xin Pan 78415f326d
Merge pull request #12838 from panyx0718/infer
7 years ago
fengjiayi ce182d9037 bug fix
7 years ago
Xin Pan a2c0e52f3e speed up while_op
7 years ago
tensor-tang 6f78fd7d1e fuse fc in gru
7 years ago
tensor-tang 300180cc26 init fusion gru op
7 years ago
Zhaolong Xing 21ba32b065
Merge pull request #12843 from NHZlX/fix_ssa_bug_for_trt
7 years ago
Michał Gallus cd32ddac12 Fuse Convolution and Eltwise Add into MKLDNN's Conv+Bias (#12669)
7 years ago
nhzlx c999895e93 merge develop
7 years ago
nhzlx 276950291a 1. fix ssa bug with batchnorm, 2. refine the trt
7 years ago
Yan Chunwei 896a37b6e3
fea/link ir to inference analysis and fc fuse support (#12789)
7 years ago
dzhwinter e23ddf6ae4
status (#12764)
7 years ago
Tao Luo d04ef276a5
Merge pull request #12745 from tensor-tang/refine/op/elewise_mul
7 years ago
tangwei12 cbc6e6eb97
Merge pull request #12247 from seiriosPlus/dis_ckpt_fix
7 years ago
Qingsheng Li 3d11d018e0
Fix scatter_op python API (#12742)
7 years ago
Tao Luo 8f9f414a14
Merge pull request #12805 from tensor-tang/fix/op/elewise_add
7 years ago
tensor-tang e955361267
Merge pull request #12737 from tensor-tang/feature/op/fusion_lstm
7 years ago
tensor-tang 82bb9170fb Merge remote-tracking branch 'ups/develop' into fix/op/elewise_add
7 years ago
Chen Weihang 57b34d9196
Merge pull request #12808 from chenwhql/remove_inplace_param_in_squeeze_and_unsqueeze
7 years ago
Yihua Xu 084d4a9e9e Optimize CRF Decoding with AVX/AVX2/AVX512F instruction (#12767)
7 years ago
fengjiayi 34b209cffa Complete sequence_padding GPU kernel
7 years ago
dzhwinter 00463fdfe3
cudnn windows support (#12757)
7 years ago
qingqing01 c62f68cb94
Fix bug in conditional_block_op. (#12246)
7 years ago
chenweihang bc471b6ac4 refactor: remove inplace parameter from squeeze and unsqueeze op
7 years ago
tensor-tang 0507f7bc3c fix SEGV elementwise add at debug mode
7 years ago
tangwei12 ca1e18c04a
Merge pull request #12469 from seiriosPlus/sum_op_dim_fix
7 years ago
Zhaolong Xing e5674f6dde
Merge pull request #12753 from NHZlX/add_benchmark
7 years ago
tensor-tang b090479409 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tangwei12 b4f52b01d0 bug fix when all inputs are empty
7 years ago
tangwei12 3efac174ea Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tangwei12 dbb4f0d35d Merge branch 'develop' of github.com:PaddlePaddle/Paddle into dis_ckpt_fix
7 years ago
Qiao Longfei fd10669ecb
Add dependency to send recv (#12760)
7 years ago
fengjiayi 8d8d48a34f Complete sequence_pad_op and its CPU kernel. Add unittests
7 years ago
tangwei12 7c12c0f865 add sync in load selectedrows
7 years ago
Michal Gallus 4a7f0698e0 Add consts to new MKLDNN integration
7 years ago
Michal Gallus 6588d0e039 Update MKLDNN to 0.15, fix conv integration
7 years ago
tangwei12 9f11db4080 add todo in impl
7 years ago
tangwei12 c24a9263ba Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tangwei12 ac9ae97001 code fix
7 years ago
nhzlx f55e8901c8 merge develop
7 years ago
nhzlx 1600ba86f6 1. change tensorrt op from cpu to gpu
7 years ago
tangwei12 bb9f494740 merge develop
7 years ago
dzhwinter 4069262f0e
Revert ""cherry picked operators changes" (#12184)" (#12747)
7 years ago
Qiao Longfei 653fad08f8
Optimize selected rows for dist lookup table with pthread rwlock (#12635)
7 years ago
fengjiayi 3c749fae43 update CPU sequence_padding functor
7 years ago
tensor-tang 92890ac258 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tangwei12 0749c8822d
Merge pull request #12556 from seiriosPlus/samplingIdOp
7 years ago
tensor-tang a56142c155 optimize elementwise_mul cpu forward
7 years ago
tensor-tang 6644ce79a5 add mklml vmul
7 years ago
tensor-tang ff92b6ba81
Merge pull request #12531 from tensor-tang/refine/op/gru
7 years ago
tangwei12 26b228e405 remove assignment and add vlog
7 years ago
tangwei12 125e9166e1 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tensor-tang a72f68f223 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tensor-tang df28a3b452 fix lod and op test
7 years ago
Qingsheng Li 317e18abd2
Remove Data Sharing between input and output in scatter_op (#12672)
7 years ago
tensor-tang f3cd2612ae refine fc and use the fc compute in fusion_lstm
7 years ago
tangwei12 822496f626 merge cpu and gpu
7 years ago
dzhwinter bf3c34960f
"cherry picked operators changes" (#12184)
7 years ago
tensor-tang 40138c4cd6 add unit test of fusion lstm op
7 years ago
jerrywgz c108376506 Add three modes for prelu_op (#12630)
7 years ago
tangwei12 9f09d68678 add enforce
7 years ago
gongweibao d06849305a
parameter dispather. (#12666)
7 years ago
tensor-tang 852bc6f4aa refine fusion lstm op doc
7 years ago
tensor-tang 8f9132959e fuse fc in lstm
7 years ago
tensor-tang ddb05dffb6 init fusion lstm op
7 years ago
tensor-tang efc5392d97
Merge pull request #12676 from tensor-tang/refine/op/fc
7 years ago
tangwei12 470fb7c5c3 bug fix
7 years ago
tangwei12 60dda7bf9f add gpu Implementation
7 years ago
tangwei12 4661f5589d random optimize
7 years ago
Bai Yifan 9333a62792
Add flatten op interface and enhance APIs about detection to support variable-length image. (#12422)
7 years ago
tensor-tang eee38464dc refine fc op use cpu only
7 years ago
tangwei12 ed937bc6f8 merge
7 years ago
tensor-tang d84a1a0010 fc op use cpu only
7 years ago
fengjiayi a38a8db928 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op
7 years ago
tangwei12 478f73c188 merge header in cc
7 years ago
fengjiayi d6b5302bd6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
tensor-tang c588c64a76 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
tensor-tang 0098a494a2 Merge remote-tracking branch 'ups/develop' into refine/op/fc
7 years ago
fengjiayi 5e7aa8c7e5 code clean
7 years ago
tensor-tang 742300baa8 fix unkown omp pragmas
7 years ago
tensor-tang b9dbb7c5cb fix bias attri in mkldnn fc
7 years ago
tangwei12 59580a7f69 bug fix
7 years ago
tensor-tang 4b5986bb77 enable fc op in normal case
7 years ago
tensor-tang e133df6037 enable native fc forward
7 years ago
tensor-tang 6a2a9a8350
Revert "Refine elementwise_add op"
7 years ago
Yu Yang 8dda526a45
Merge pull request #12659 from sneaxiy/refine_softmax_with_cross_entropy
7 years ago
sneaxiy f6f5cdaa05
Merge pull request #12555 from sneaxiy/refine_layer_norm
7 years ago
sneaxiy c50c537732 fix arithmetic error in backward kernel
7 years ago
tensor-tang 038cbf799d add bias for fc op
7 years ago
whs 9d6243b6fb Fix crop op. (#12603)
7 years ago
Bai Yifan 649f5d74f0
fix mine_hard_example bug (#12664)
7 years ago
sneaxiy 2d9508f8f3
Merge pull request #12554 from sneaxiy/refine_elementwise_add
7 years ago
tensor-tang 171a0e2b42 add some comment
7 years ago
sneaxiy 2c560623d1 fix dependency error
7 years ago
tensor-tang 5377edd282 refine packed condition
7 years ago
tensor-tang 3bf3e77ac8 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
qiaolongfei c0890988da add RPCServerProfiler, replace listen and serv optimizer
7 years ago
tangwei12 64a4925cb4 Merge branch 'Pdv' into samplingIdOp
7 years ago
tangwei12 0bfd62be3d remove gpu supported, will add it later
7 years ago
Tao Luo 5a9ae411e0
Merge pull request #12618 from sfraczek/sfraczek/fix-new-mkldnn-conv-tests
7 years ago
sneaxiy cf799a6a04
Merge pull request #12553 from sneaxiy/refine_softmax_with_cross_entropy
7 years ago
dzhwinter 8499559c42
"fix style" (#12600)
7 years ago
sneaxiy 010883689c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_layer_norm
7 years ago
sneaxiy 5d698589ce Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_elementwise_add
7 years ago
sneaxiy 19ff254d05 Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add
7 years ago
Sylwester Fraczek d74bb6ab9c fix ut for mkldnn 0.15 - added forcing layout NCHW in mkldnn conv tests
7 years ago
fengjiayi 855c9e3311 clean softmax_op code
7 years ago
fengjiayi 24d51de022 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
fengjiayi 27df3a9f2b make cross_entropy_op supporting tensors
7 years ago
fengjiayi 66be53264e
Merge pull request #12592 from JiayiFeng/fix_mac_compile_error
7 years ago
fengjiayi 8e604a10aa fix mac compile error
7 years ago
nhzlx 551c802cdc merge develop
7 years ago
sneaxiy ad45d39222 refine layer_norm
7 years ago
chengduo 7c8b69c700
Feature/op fusion (#12240)
7 years ago
sneaxiy 1b4515f6db refine softmax_with_cross_entropy
7 years ago
nhzlx 3a0caf801f modify trt engine op test
7 years ago
nhzlx e51d045a6d modify trt engine op test
7 years ago
nhzlx e8954a36f5 merge develop
7 years ago
nhzlx 32a9e050bc mapping the variable name inside the subgraph
7 years ago
Wu Yi 2d036c47cd
polish dist unit test code (#12512)
7 years ago
fengjiayi 7834b4a470 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
tangwei12 5bfdefae91 Merge branch 'Pdv' into samplingIdOp
7 years ago
tangwei12 b30bdde15a random optimize
7 years ago
tangwei12 9c63fef63c random optimize
7 years ago
Qiao Longfei 88a607c342
Merge pull request #12541 from jacquesqiao/optimize-profiler
7 years ago
tangwei12 5b9716d1f6 add dims check
7 years ago
tangwei12 4cd504d3b4 bug fix
7 years ago
sneaxiy e57bc4d745 Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add
7 years ago
sneaxiy 222fbbedfb Merge branch 'develop' into refine_elementwise_add
7 years ago
sneaxiy 4b83afff6e
Merge branch 'develop' into refine_elementwise_add
7 years ago
sneaxiy b2d0ee5159 refine elementwise_add op
7 years ago
tangwei12 da2cc99f67 sampling op optimize
7 years ago
fengjiayi 7c55e08c93 stash
7 years ago
tangwei12 4973e07be3 sampling op optimize
7 years ago
tensor-tang 836068569f Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
tensor-tang 18c322c2a1 seperate cpu and gpu implementations for gru kernel compute
7 years ago
tensor-tang 54c95e49f0 fix blas
7 years ago
fengjiayi b656d97e86
Merge pull request #12485 from JiayiFeng/dev_ops_tensor_support
7 years ago
qiaolongfei 1623f1ba4f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-profiler
7 years ago
tangwei12 3206970b77 sampling op rename
7 years ago
Xin Pan 99a77cfc62
Merge pull request #12468 from panyx0718/improve_profiler2
7 years ago
qiaolongfei a3f9d6a38c optimize profiler
7 years ago
tangwei12 e0ab2f7158 new sampling op
7 years ago
tensor-tang 8c23f7c4f0 fix blas and use packed weight
7 years ago
tensor-tang d9cc6b1866 replace gru compute with details
7 years ago
tensor-tang 43cee33a23 add mkl packed gemm
7 years ago
tangwei12 766ac488ac sum_op selectedRows dim bug fix
7 years ago
dzhwinter 595a2c83ae
explicit gradient of elementwise_add/elementwise_sub (#11970)
7 years ago
fengjiayi e7d8e16a66 update softmax_mkldnn_op
7 years ago
Yu Yang 2567afa35d
Merge pull request #12462 from reyoung/feature/fix_cudnn_deterministic
7 years ago
fengjiayi dc111d3476 update softmax_cudnn_op
7 years ago
fengjiayi f7bd0b227b Add unittests for softmax_op
7 years ago
gongweibao 819ac3df0a
Modify style (#12465)
7 years ago
fengjiayi b314a69523 make softmax supporting tensors
7 years ago
fengjiayi b1af7e5d9b Add unittests for lookup_table_op
7 years ago
tangwei12 c4c8f60bec sum_op selectedRows dim bug fix
7 years ago
Xin Pan 486345551d clean
7 years ago
Xin Pan caf10b474f make profiler use thread_id from g_thread_id
7 years ago
Yu Yang 040fc1c39b Fix bug in cudnn_determistic
7 years ago
fengjiayi 7efdf05ac2 make look_up_op supporting tensor ids
7 years ago
Qiao Longfei 690625fe15
Merge pull request #12456 from jacquesqiao/add-profiler-to-pserver
7 years ago
qiaolongfei 7e46a8d172 fix logical bug, optimize code
7 years ago
qiaolongfei 0b62f61d29 add init flag in __init__.py for listen_and_serv_profile_period
7 years ago
dzhwinter 91fb0156ca
Memory/reshape op (#12414)
7 years ago
qiaolongfei 0b861bbca9 add profiler for listen_and_serv op
7 years ago
tensor-tang 059b27840c
Merge pull request #12408 from tensor-tang/refine/im2col
7 years ago
qiaolongfei 147bf00ffe clear mutable rows for the output of split_ids_op
7 years ago
qiaolongfei 91b114a787 change map to unordered_map
7 years ago
tensor-tang d8d2dbcfac further optimize im2col using variables
7 years ago
qiaolongfei 91f63cd401 fix split_ids_op and add unit test
7 years ago
tensor-tang 5373fe29c2 Merge remote-tracking branch 'ups/develop' into refine/im2col
7 years ago
Qiyang Min 7da453630e
Merge pull request #12403 from velconia/fix_hang_up
7 years ago
Tao Luo 5a634786af
Merge pull request #12312 from luotao1/unify
7 years ago
Bai Yifan e12b1d1792 Add flatten op (#12341)
7 years ago
Luo Tao 062556f938 Merge branch 'develop' into unify
7 years ago
chengduo 2409d0f710
Refine regularization for selected_rows (#12369)
7 years ago
tensor-tang 687a322267 Merge remote-tracking branch 'ups/develop' into refine/im2col
7 years ago
tensor-tang 65d418f060 complete im2col with padding==1 and speedup filter width==1
7 years ago
minqiyang 053540e199 Add volatile to stopped_ member
7 years ago
minqiyang b78ffde6d5 Add stopped sign for grpc client
7 years ago
tensor-tang 52eb86e30f refine im2col benchmark
7 years ago
tensor-tang 3017f46076 add more test cases
7 years ago
tensor-tang 8d6be4fb5f refine im2col test and add benchmark
7 years ago
tensor-tang 507c143047 im2col cfo cpu code clean
7 years ago
tensor-tang 4eeed0b5e4 refine width padding and enable core copy
7 years ago
Wu Yi 73fcfc06ec
refine conv cudnn enforce (#12353)
7 years ago
tensor-tang e3131e2d73 enable width padding
7 years ago
Xin Pan d7e08c53c2
Merge pull request #12169 from panyx0718/ir_graph_sort
7 years ago
tensor-tang 92518c519f reuse sizes saving time
7 years ago
tensor-tang 660df122ce enable padding!=0 and fill height padding with 0
7 years ago
tensor-tang d8e00facf7 reuse im_size
7 years ago
tensor-tang 179dd0cb8a
Merge pull request #12337 from tensor-tang/refine/im2col
7 years ago
Luo Tao 5ba4337698 unify libpaddle_inference_api into libpaddle_fluid
7 years ago
tensor-tang b72befc5cc reuse copy size
7 years ago
Yancey 6133efd9ed
Merge pull request #12218 from Yancey1989/rpc_complete_interface
7 years ago
Zhaolong Xing 6169d724b9
Merge pull request #12324 from NHZlX/enhance_for_tensorrt_infer
7 years ago
nhzlx 4d49e61ab8 fix comments
7 years ago
tensor-tang 6788af4bf1 refine test cases
7 years ago
tensor-tang b163e601b6 add gtest
7 years ago
nhzlx bcd67bdd71 add assert for GetOutput
7 years ago
tensor-tang aae994fd26 refine im2col no padding
7 years ago
Yancey1989 fb06ed7bdc Merge branch 'develop' of github.com:PaddlePaddle/Paddle into rpc_complete_interface
7 years ago
Yu Yang 21387e3c2a Tiny refines for lod_tensor_blocking_queue and reshape_op
7 years ago
nhzlx f42ea48996 deal with conflict
7 years ago
nhzlx 940f5dbcac modify the tensorrt engine op to adapt to chage
7 years ago
Yan Chunwei 02cf54d331
bugfix lod cpu performance (#12297)
7 years ago
Qiao Longfei b41f8b9d42
Merge pull request #12295 from jacquesqiao/speedup-reduce-sum-grad-op
7 years ago
fengjiayi eec412b230
Merge pull request #12273 from JiayiFeng/update_py_reader
7 years ago
Xin Pan 21a45420f0 polish and test
7 years ago
Qiao Longfei 95a2b5f56a
fix mac build of sendrecvop_utils (#12272)
7 years ago
qiaolongfei 273f737517 optimize code
7 years ago
Xin Pan 93355cc0d2 fix control deps
7 years ago
fengjiayi ea8a375fa4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into update_py_reader
7 years ago
qiaolongfei 5d718a5886 optimize reduce_sum_grad op
7 years ago
Yancey1989 d4f51218ef Merge branch 'develop' of github.com:PaddlePaddle/Paddle into rpc_complete_interface
7 years ago
qiaolongfei b643473d31 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-mac-build
7 years ago
fengjiayi 060f421797 Some enhancement on readers
7 years ago
qingqing01 873a50ce35
Fix serious bug in nesterov momentum optimizer. (#12231)
7 years ago
Yan Chunwei b42ced8eda
bugfix/tensorrt analysis fix subgraph trigger (#12266)
7 years ago
qiaolongfei 938390b38d fix mac build of sendrecvop_utils
7 years ago
gongweibao 3a6213f493
Change grpc interface to compatible with brpc. (#12164)
7 years ago
Yu Yang b06309381b
Merge pull request #12149 from reyoung/feature/combine_open_files_and_double_buffer
7 years ago
tensor-tang be04fbff42
Merge pull request #12233 from tensor-tang/refine/mkl/gemm
7 years ago
Qiao Longfei 2b58c62aa0
Update auc op (#12199)
7 years ago
Yancey1989 efd5a84986 update executor interface
7 years ago
tensor-tang fc2b578842 add gemm_warp test
7 years ago
tensor-tang a916c52579 refine gemm
7 years ago
tensor-tang 961e754c9f mkl split gemm for better perf
7 years ago
Yancey1989 ade6675490 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into rpc_complete_interface
7 years ago
yuyang18 e9c8d930a5
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/combine_open_files_and_double_buffer
7 years ago
Yancey1989 d0771cf912 update
7 years ago
Yancey1989 7570d8e77c add rpc complete interface
7 years ago
yuyang18 8c70183ba6
Polish function names
7 years ago
yuyang18 b789a3a484
Change code
7 years ago
whs 8284947b82 Fix infershape of im2sequence. (#12183)
7 years ago
yuyang18 401e92f6e3
Change attr comment
7 years ago
yuyang18 be528f9815
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/combine_open_files_and_double_buffer
7 years ago
Tomasz Patejko b2b8b15bfe MKLDNN sum fix: remove in_place condition in loop creating memory primitives for sum
7 years ago
yuyang18 72b78154b2
Polish reader speed
7 years ago
Wu Yi 866fcb0c15
Merge pull request #12171 from typhoonzero/fix_pserver_with_condition_block
7 years ago
typhoonzero 32d81909dc fix pserver with condition block
7 years ago
tensor-tang d24fd2c6b1
Merge pull request #12099 from jczaja/prv-conv-grad-mkldnn-upstream2
7 years ago
yuyang18 e576345f5b
Try to speed up buffered reader
7 years ago
Wu Yi c5619bbcde
fix auc op (#12087)
7 years ago
Yancey 0042ba93c8
Merge pull request #12127 from Yancey1989/enforce_rpc_timeout
7 years ago
yuyang18 61b3a5977f
Refine Python Reader
7 years ago
yuyang18 b048ddf0bd
Merge error
7 years ago
yuyang18 b8975d6842
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/combine_open_files_and_double_buffer
7 years ago
yuyang18 d36e13efd8
Merge branch 'feature/add_pyreader_demo' into feature/combine_open_files_and_double_buffer
7 years ago
yuyang18 1478a5fc0b
Make open_files use buffer
7 years ago
yuyang18 dc34effd35
Extract buffered reader
7 years ago
yuyang18 392318045f
Merge branch 'feature/dctor_all_readers' into feature/combine_open_files_and_double_buffer
7 years ago
yuyang18 fecbe52200
Rewrite open_files
7 years ago
Yu Yang ba997b8ccd
Merge pull request #12097 from reyoung/feature/hide_api_cont
7 years ago
yuyang18 c680bc1d7f
Rewrite DoubleBuffer
7 years ago
yuyang18 c9cf2bdb9c
Dctor cache
7 years ago
yuyang18 ee7d8b4d66
Refine Shutdown Impl
7 years ago
Jacek Czaja 8e20d36bc8 - comment update
7 years ago
Jacek Czaja c981222b3b - Conv MKLDNN grad op reuse of mkldnn primitives
7 years ago
tensor-tang f0cd493c0d
Merge pull request #11989 from tensor-tang/feature/libxsmm
7 years ago
Sylwester Fraczek 4d55aca40e reserve vector space before loop in top-k
7 years ago
Yu Yang ebe3b5e78a
Merge pull request #11853 from sneaxiy/complete_py_reader_python
7 years ago
Yancey1989 4a91a14549 enforce rpc client timeout
7 years ago
Guo Sheng da3f766821
Merge pull request #12088 from guoshengCS/complete-hsigmoid
7 years ago
sneaxiy 31c7f6b968
Merge branch 'develop' into complete_py_reader_python
7 years ago
fengjiayi 6ff7f2380c
Merge pull request #12063 from reyoung/feature/exception_safe_pe
7 years ago
tensor-tang 2f7b09319a Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
guosheng 4ee069fdba Fix the HierarchicalSigmoidGradOpKernel and refine the codes. Now hsigmoid_op is same with V2 implementation and can pass gradient check.
7 years ago
yuyang18 c87e08c28d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/exception_safe_pe
7 years ago
chenweihang 938319bbd2
Merge branch 'develop' into unsqueeze_op
7 years ago
Yibing Liu 092d620187
Merge pull request #11812 from chenwhql/squeeze_op
7 years ago
tensor-tang 1c5d6c5692 disable xsmm with float16
7 years ago
tensor-tang c9ba51ead8 Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
tensor-tang 64a8e6d20e refine the threshold functions
7 years ago
Tao Luo c620c522d7
Merge pull request #12093 from Noplz/fix_warning
7 years ago
lemon34 29145e1e31 change im2sequence for ctc batch inference (#11696)
7 years ago
Noplz cfa4479b06 fix warning
7 years ago
tensor-tang 32822b2a59 Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
chenweihang b8ea7a081a Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
Jacek Czaja fbe25ef510 MKLDNN: Extending Conv MKLDNN op to reuse MKLDNN primitives (#11750)
7 years ago
baiyf be2d9dc2b8 Add prior_box output order control (#12032)
7 years ago
guosheng e7f7ba97fe Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into complete-hsigmoid
7 years ago
guosheng e7a4cfc0ff complete the hsigmoid_op
7 years ago
chenweihang 84a525a38a Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
sneaxiy f85e16f1de Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into complete_py_reader_python
7 years ago
chenweihang 0ea468225b docs: fix some errors of description
7 years ago
chenweihang fbef49e772 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang 3d15968958 docs: fix some errors of description
7 years ago
achao2013 8e4b225fe4 Add fake_quantize_op. (#11359)
7 years ago
Yuan Gao 50aa6ba6f5 add rpn target assign op (#11449)
7 years ago
chenweihang 2bd65dbf71 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
chenweihang fd01a43a3c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
tensor-tang 7bb67b6788 Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
chenweihang cef8dbc1f7 refine some messages and adjust data type
7 years ago
chenweihang 05eafcca73 refine some messages and adjust data type
7 years ago
minqiyang fceaabdd81 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_grpc_destroy_bug
7 years ago
guosheng d695381677 Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into complete-hsigmoid
7 years ago
yuyang18 3aaf798182
Refine size_t and int
7 years ago
fengjiayi 26ae6111d1
Merge pull request #12051 from JiayiFeng/dev_reader_ResetAll
7 years ago
qingqing01 10fbb831ed
Skip BatchNorm when feature only has 1 element. (#11578)
7 years ago
chenweihang 8f2486ca16 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
tensor-tang 6bc1aaaac7 refine the ColMajor replacement
7 years ago
tensor-tang c3862a7519 Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
chenweihang d552b900f0 change the copyright year form 2016 to 2018
7 years ago
qingqing01 ef4895df3b
Make IfElse operator works and fix unit testing. (#11972)
7 years ago
tensor-tang de856da9a6 fix ColMajor and RowMajor replacement
7 years ago
tensor-tang 00ee6c3c17 Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
fengjiayi 6d6f49cd56 Merge remote-tracking branch 'yuyang/feature/decorated_reader_chain' into dev_reader_ResetAll
7 years ago
chenweihang 7526eaaf13 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang 4453473f71 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
chenweihang 1721613f1e simplify construct function
7 years ago
fengjiayi 611716e9bc Merge branch 'dev_reader_shutdown_start' of https://github.com/JiayiFeng/Paddle into dev_reader_shutdown_start
7 years ago
fengjiayi 0e9f1e2790 Make ReaderBase thread safe and remove ThreadedReader
7 years ago
yuyang18 e8ee9dc7f8
Several Polish
7 years ago
chenweihang 5f89272c89 change the bit insert to array insert for understandability
7 years ago
fengjiayi b4f0e57956 fix errors
7 years ago
Tao Luo 436bb4500b
Merge pull request #11699 from pzelazko-intel/pzelazko/workaround-for-missing-mklnn-kernels
7 years ago
fengjiayi 6fc6cc2f4c Some updates on readers
7 years ago
fengjiayi 5528f59900 Split ReInit() to Shutdown() and Start()
7 years ago
fengjiayi de9a411f1c adjust readers' inheritance relationships
7 years ago
yuyang18 c48c586aca
Use weak_ptr to implement DecoratedReaderChain
7 years ago
minqiyang 1377b332bc Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_grpc_destroy_bug
7 years ago
chenweihang fccdc1abea Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang 62a17f5053 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang 80126a7496 small fix based reviewer's advice
7 years ago
yuyang18 8e86721fe7
Fix data balance on single GPU
7 years ago
tensor-tang 21516e5cbe add unit test of smm
7 years ago
tensor-tang c3941745b3 add libxsmm_gemm
7 years ago
minqiyang 2c4fb585db Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_grpc_destroy_bug
7 years ago
minqiyang 0d04545e9c Remove debug info
7 years ago
chenweihang 9ca8db237a Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
tensor-tang 7782a4ab53 fix blas build issue
7 years ago
tensor-tang 17987eb3fc link libxsmm
7 years ago
minqiyang 207d1b81fe Add fixed grpc
7 years ago
tensor-tang 3df99e72ab Merge remote-tracking branch 'ups/develop' into refine/set_num_threads
7 years ago
dzhwinter 4ed0b62476
Move fluid::framework::InitDevices into fluid::platform (#11757)
7 years ago
dzhwinter 99a99ec7e3
"remove lapack" (#11966)
7 years ago
chenweihang a6d94e8dc6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang 49b2cf5fee adjust some code based reviewer's advice
7 years ago
sneaxiy 9b28260029 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into complete_py_reader_python
7 years ago
sneaxiy 739c330914 fix merge conflict
7 years ago
fengjiayi ce16b40b04
Merge pull request #11891 from JiayiFeng/dev_eof_exp
7 years ago
chenweihang 79333fa7b8 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang ca15779394 rewrite, use reshape op in unsqueeze op, test passed
7 years ago
Xin Pan 71b1c397d7
Merge pull request #11874 from panyx0718/move_trainer
7 years ago
Xin Pan d70a38d8ec fix
7 years ago
yuyang18 c31519036b
Merge branch 'squeeze_op' of https://github.com/chenwhql/Paddle into pr/11812
7 years ago
yuyang18 1854814d49
Use reshape_op inside squeeze_op
7 years ago
Xin Pan 94cb59ad09 hide utils to legacy
7 years ago
chenweihang ee760d1c2d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
chenweihang 0cef33a468 adjust the dims range to [1,6] and fix some problem
7 years ago
Yancey f7fd711e3f
Merge pull request #11868 from Yancey1989/dist_pass_barrier
7 years ago
yuyang18 3777f10286
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into pr/11812
7 years ago
Yu Yang 9401b64d61
Merge pull request #11877 from reyoung/feature/fix_reshape_op_size
7 years ago
chenweihang 996c157f61 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang e402496238 complete unsqueeze op and related unittest.
7 years ago
fengjiayi 3fab4f65a4 Add EOFException to represent EOF in C++ reader
7 years ago
minqiyang 1d6ecd3c4e Change grpc version to 1.13.x
7 years ago
yuyang18 550ab8d723
Use single file than multiple files
7 years ago
Paweł Żelazko ac323343a0 typos fix
7 years ago
yuyang18 6038a63120
Fix fc mkldnn op
7 years ago
yuyang18 82866d4a18
Add register kernel functor and shrink reshape op
7 years ago
fengjiayi 58560622bc
Merge pull request #11854 from JiayiFeng/dev_data_balance
7 years ago
yuyang18 1ce478f100
Polish reshape op
7 years ago
Yancey1989 37410a0c75 update by comment
7 years ago
chenweihang 9ca88fa8a5 Adjust squeeze op and code the unittest, test passed
7 years ago
sneaxiy 3f9292c6e6 fix merge conflict
7 years ago
sneaxiy dd70fb4393 fix type comparation bugs
7 years ago
Xin Pan 982dabe293
Merge pull request #11866 from panyx0718/move_func
7 years ago
Xingyuan Bu 5056d3ec56 FasterRCNN Anchor Generator Op (#11218)
7 years ago
Yibing Liu 5f79c7fbb6
Merge pull request #11174 from kuke/argsort_dev
7 years ago
Yancey1989 029425a5f4 update
7 years ago
Yancey1989 c1ab215e26 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into dist_pass_barrier
7 years ago
Yancey1989 1366832a41 add dist pass barrier
7 years ago
Xin Pan a9086bf320 also move a few other dir to legacy/
7 years ago
gongweibao 66c91911cf
Improve brpccmake (#11842)
7 years ago
Yibing Liu 9386ac0a40 Enhance cuda code & unittest for argsort_op
7 years ago
guochaorong c318aa5ffa
Merge pull request #11850 from guochaorong/revert_11496
7 years ago
fengjiayi 49a04d75ee Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_data_balance
7 years ago
fengjiayi 4b950951d3 Add unittests and fix a few bugs
7 years ago
chenweihang a1e7f2d520 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang 70729ad641 Add Unsqueeze Operator Framework, not finshed
7 years ago
guochaorong 6a35899131 Revert "Extend fill_zeros_like_op for zero-filling an LoDTensorArray (#11496)"
7 years ago
chenweihang 298e74da1e add squeeze op c++ part, compile success
7 years ago
fengjiayi 5b4f283069 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_data_balance
7 years ago
fengjiayi b6dc3a59f1 Add DataBalanceOpHandle to MultiDeviceSSAGragh
7 years ago
mozga-intel b8a04c2fa1 Duplicated code was moved to common function
7 years ago
mozga-intel 3b128337a1 The mkldnn batch norm supports other data format
7 years ago
Xin Pan 2ecc56226d small AverageOptimizer enhance. (#11761)
7 years ago
Yan Chunwei 5082642bdb
feature/analysis to support sub-graph for TRT engine (#11538)
7 years ago
Haichao Zhang bc28cf613f Extend fill_zeros_like_op for zero-filling an LoDTensorArray (#11496)
7 years ago
Qiao Longfei 593bbfe392
Merge pull request #11765 from jacquesqiao/fix-adam-op-for-selectedrows
7 years ago
qiaolongfei 20fae68136 adam op handle grad.rows().size == 0 condition
7 years ago
pzelazko-intel 9a15c92317 bnorm+relu fuse for mkldnn (inference) (#11434)
7 years ago
baiyf 778b71fc93
Optimize bipartite_match_op in large scale input (#11730)
7 years ago
qiaolongfei df7a266ae2 fix adam op for selected rows
7 years ago
tensor-tang e3a96300bb move SetNumThreads to platform
7 years ago
qingqing01 b756063ce7
Speed depthwise transposed conv2d. (#11740)
7 years ago
Qingsheng Li 8630ba2eb1
Fix sequence expand op (#11618)
7 years ago
sneaxiy 01fbcb0bbb
Merge pull request #11695 from sneaxiy/complete_py_reader_cpp
7 years ago
Guo Sheng 8df303c09b
Merge pull request #11238 from guoshengCS/fix-beam_search
7 years ago
guosheng d15b2e02c8 Fix copying empty tensor in beam_search_decode_op
7 years ago
sneaxiy d4d946db5a update blocking queue
7 years ago