Commit Graph

3184 Commits (99e6e8b00f9ec3ffede2dceb8b46ee65723fc06d)

Author SHA1 Message Date
JiabinYang 42470f14b7 test=develop
7 years ago
peizhilin 445fff24dc add the bigobj option to NVCC compile
7 years ago
qingqing01 36f08eef3b CUDA kernel for density_prior_box_op. (#14513)
7 years ago
tensor-tang 6a7f83d45d enable gru jitcode and refine act and lstm jitcode
7 years ago
tensor-tang 686eaf20ba Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
peizhilin 81bd7eeff4 rollback the format
7 years ago
Qiao Longfei 1f87f263a2 clean code
7 years ago
Qiao Longfei 361cb0e078 lookup remote table can compile
7 years ago
JiabinYang 0fca16847c temp
7 years ago
JiabinYang e9be3366a9 test=develop
7 years ago
chengduo 00b9e9a135 Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929)
7 years ago
peizhilin dfbac60398 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin 7c8c9dc9bf fix unit test cases
7 years ago
tensor-tang 0c5ed5f6fc enable peephole jitcode
7 years ago
JiabinYang 3c6102a367 test=develop
7 years ago
Qiao Longfei 7c3ce2952d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
7 years ago
Qiao Longfei 60a4f69b3c add lookup remote table op
7 years ago
Qiao Longfei e0b48f7e29 init lookup remote table
7 years ago
tensor-tang e3b61cf52b init gru jitcode and fix lstm jitcode
7 years ago
tensor-tang 0f25446574 Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
Dun ae7d22862b Group Norm (#13843)
7 years ago
wopeizl d9a1f3e58e Windows/online (#14474)
7 years ago
JiabinYang 57a18e32a1 test=develop
7 years ago
peizhilin bef475c92b Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Tao Luo 5d4d117edc Merge pull request #14502 from qingqing01/cudnn5_fix
7 years ago
Jiabin Yang f7b55de9e5 Merge branch 'develop' into enhance_hierachical_sigmod_op
7 years ago
Yu Yang e68c1fcd5a Merge pull request #14522 from reyoung/feature/fix_op_header_deps
7 years ago
tensor-tang 3562051302 add gru refer code and remove redundant avx code
7 years ago
JiabinYang af9a3301da test=develop
7 years ago
Zhaolong Xing ad349e770f Merge pull request #14452 from NHZlX/fix_avg_pool_trt_bug
7 years ago
tensor-tang f913860873 jitkernel lstm refer support peephole
7 years ago
tensor-tang 2f9b5f2383 Merge branch 'develop' into fea/jit/rnn
7 years ago
JiabinYang 014e50c284 test=develop
7 years ago
Yu Yang 3edd32d070 fix(Compile): fix depends error when compile op using cub
7 years ago
Dang Qingqing cda60311f9 Fix compling with cuDNN v5
7 years ago
peizhilin 67562a6fcd Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang 10fb4ceefc Merge pull request #14351 from tpatejko/tpatejko/mkldnn-elementwise_mul
7 years ago
jerrywgz 13e254faed refine code, test=develop
7 years ago
tensor-tang b4c826c548 Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
tensor-tang ce31deb7e9 refine refer code and add lstm refer code
7 years ago
jerrywgz 79cec53111 add ignore index for sigmoid cross entropy with logits op, test=develop
7 years ago
nhzlx e62872df8b fix conflicts
7 years ago
tensor-tang c2cfb03a72 add lstm jitcode
7 years ago
peizhilin 25adf970b2 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Tao Luo 1d3e9bde1e Merge pull request #14488 from yihuaxu/develop_7a64d48f5_stack_opt
7 years ago
tensor-tang 7aa3aff338 Merge pull request #14465 from tensor-tang/fea/jit/exp
7 years ago
Tao Luo 1b894e495f Merge pull request #14437 from jczaja/prv-softmax-mkl
7 years ago
peizhilin 3a72a634cf Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yihua Xu a906a361be Add the macro for NVCC (test=develop)
7 years ago
Yihua Xu d91740acb1 Revert "Remove the remnant code (test=develop)"
7 years ago
Yihua Xu be50670348 Remove the remnant code (test=develop)
7 years ago
qingqing01 9eefd2c766 Modify some infer-shape about detection operators in compile-time. (#14483)
7 years ago
Yihua Xu f4c869d872 Optimize the layer_norm operator with AVX intrinsic function (#14417)
7 years ago
peizhilin ee0fd78c81 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yu Yang f1a392a5fe Merge pull request #13804 from sneaxiy/rewrite_allocation
7 years ago
Yihua Xu f418f552df Merge branch 'develop' into develop_7a64d48f5_stack_opt (test=develop)
7 years ago
peizhilin 8443961a4f add warp_ctc back
7 years ago
qingqing01 fd7e643153 Convolution fusion operator. (#14449)
7 years ago
Yu Yang 98bbfc17be Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
peizhilin 4a6769da84 re-organize the cmake file
7 years ago
dengkaipeng 8ef6280c03 Add operator double support. test=develop
7 years ago
peizhilin 1aff40a4c6 exclude warpctc_op on windows
7 years ago
peizhilin 7d51a0e887 disable DSO by default on windows
7 years ago
peizhilin b967e01cbe Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Wu Yi d7bd0361cb fix dist deps (#14471)
7 years ago
Jacek Czaja 9b0eae3023 - Removing partial specialization of sotmax for inference for GPU
7 years ago
peizhilin a3e952f41d add the jit back
7 years ago
tensor-tang a19b3225a1 fix jitcode small size
7 years ago
Jacek Czaja be80bb4f28 - Fix to GPU
7 years ago
tensor-tang 4dbdfa60ef sigmoid and tanh support all size
7 years ago
tensor-tang ccb8963705 refine exp jitcode with all size
7 years ago
peizhilin 1cc23ef67d merge from paddle:develop
7 years ago
tensor-tang d3eae8f61b refine relu and fix addrelu test
7 years ago
tensor-tang 4e67fe6a12 refine act and vxx with all size
7 years ago
tensor-tang ba3eaed7a7 exp support all size
7 years ago
tensor-tang 1ffce8c0ae fix build error on noavx
7 years ago
Michal Gallus c69c41604e MKLDNN elementwise_mul: Move Kernel to KernelPool to avoid segfaults
7 years ago
Michal Gallus 785066eb8a MKLDNN elementwise_mul: Check if AVX512 is available
7 years ago
Michal Gallus 08f63c4d12 MKLDNN elementwise_mul: Lint changes to UT & integration
7 years ago
Michal Gallus 49b09327f6 MKLDNN elementwise_mul: Reorder on non-nchw input, fallback on non-16 divisable fm
7 years ago
Michal Gallus d14858e4ba MKLDNN elementwise_mul: Parallelize mul
7 years ago
Michal Gallus ed31936ba1 MKLDNN elementwise_mul: Support NCHW, update UT
7 years ago
Tomasz Patejko 700bcbf74f MKLDNN elementwise_mul: h and w loops implemented in xbyak
7 years ago
Tomasz Patejko ad09facafe MKLDNN elementwise_mul: CPU tests initially refactored. MKLDNN mul test for broadcast added
7 years ago
Tomasz Patejko 2d73ad180a MKLDNN elementwise_mul: simple xbyak version for AVX512
7 years ago
Tomasz Patejko 213ec37d6a MKLDNN elementwise_add: simple initial implementation of the operator for MKLDNN format
7 years ago
Wu Yi a2d9b34417 Refine operator cmake (#14413)
7 years ago
peizhilin 764f97deac Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin 8580b7a130 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang 7f17e561d7 Merge pull request #14423 from tensor-tang/fea/jit/act
7 years ago
Jiabin Yang 28bd5b7bad fix space_to_depth_op unicode problem (#14430)
7 years ago
Jacek Czaja 513bb6c151 Squashing MKL based softmax for inference
7 years ago
nhzlx 9b64aac41f add macro for pool2dDirectCUDAFunctor
7 years ago
whs 1722678258 Make nce support more distribution. (#13549)
7 years ago
nhzlx 83f8c403a7 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into fix_avg_pool_trt_bug
7 years ago
nhzlx b969116988 fxi avg pool trt bug and fix cpplint
7 years ago
tensor-tang 1f00723fa3 exp, sigmoid, tanh jitcode support more size
7 years ago
Qiyang Min d971d5b875 Merge pull request #14431 from velconia/fix_expand_op_dim_in_compile_time
7 years ago
Wu Yi b32c13dc20 Add cudnn ctc loss (#12366)
7 years ago
tensor-tang 8cda7b3d20 Merge remote-tracking branch 'ups/develop' into fea/jit/act
7 years ago
tensor-tang e2d6eddd32 remove ComputeDeprecated
7 years ago
peizhilin 6d0d5a76eb Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
dengkaipeng f115eb0d1e enhance api. test=develop
7 years ago
tensor-tang 64f7516aee fix lrn on mac (#14426)
7 years ago
Yu Yang c8f6e70ab4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
tensor-tang f65ddff8d1 unify act jitcode of relu, exp, sigmoid and tanh
7 years ago
tensor-tang 6a159071b6 add vtanh jitcode of size 8
7 years ago
tensor-tang 046374bcd1 add vsigmoid jitcode of size 8
7 years ago
minqiyang 560b29ccb7 Polish code
7 years ago
minqiyang 21d6e8e8c8 Polish code
7 years ago
minqiyang 50b6e4c6bc Fix expand grad op infer shape
7 years ago
Sylwester Fraczek 8a1eeec579 add mkldnn prop_kind phase for inference-only case to pooling and activations (#14278)
7 years ago
peizhilin d1429ac4a5 add recordio support
7 years ago
chengduo 82773477ae Add selu (#14415)
7 years ago
dengkaipeng 95d5060ddd fix abs -> fabs error. test=develop
7 years ago
minqiyang 30147d7f58 Fix expand op incorrect infer shape
7 years ago
JiabinYang ba9ff508e8 temp fix
7 years ago
Yihua Xu 03ccb9a461 Optimize the stack operator
7 years ago
dengkaipeng 2faa2b4048 remove cu file. test=develop
7 years ago
tensor-tang ee2a7f1b8c refine exp and fix error on avx
7 years ago
tensor-tang 1e06a32a0d add vexp jitcode of size 8
7 years ago
tensor-tang 2354409601 Merge pull request #14374 from tensor-tang/fea/jit/act
7 years ago
Tao Luo 5ef123c778 Merge branch 'develop' into dam_fc
7 years ago
dzhwinter d3aed98d86 Merge pull request #14320 from wopeizl/windows/online
7 years ago
Tao Luo d3e63e6e04 Merge pull request #14412 from jczaja/prv-dam-softmax
7 years ago
peizhilin be332a13bc Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Jacek Czaja b361579f09 - Softmax for Inference is enabled when ON_INFER is set
7 years ago
Tao Luo 980a6753a8 fix typo to pass the ci
7 years ago
Tao Luo 8f301f4618 Merge pull request #14381 from qingqing01/manylinux_v5_fix
7 years ago
peizhilin 1a9008c420 code style fix
7 years ago
Tao Luo e0d4e04bdd fix some compiler warning
7 years ago
Tao Luo 8ea13e336a add in_num_col_dims for fc
7 years ago
JiabinYang a507845a77 test=develop
7 years ago
Tao Luo 9eb0ab1db3 Merge pull request #14384 from tensor-tang/refine/lrn
7 years ago
peizhilin 30ddc07a7e Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Qiao Longfei e65cbd3b06 Merge pull request #14387 from jacquesqiao/lookup_sparse_table_add_test_mode
7 years ago
Qiao Longfei 6cf8f24b1b Merge pull request #14389 from jacquesqiao/fix_sgd_op_optimize_sparse_table
7 years ago
Xin Pan 10ab177f89 Merge pull request #14403 from PaddlePaddle/revert-14337-prv-dam-softmax
7 years ago
Yan Chunwei 9f252e0032 Combine Inference Analysis with IR (#13914)
7 years ago
Tao Luo 5b9c62faee Revert "Softmax op optimization for inference "
7 years ago
Tao Luo 6490bb2765 Merge pull request #14337 from jczaja/prv-dam-softmax
7 years ago
chengduo 9f68e9a7fe fix auc op (#14385)
7 years ago
dengkaipeng a0284f6fbc Add backward CPU kernel. test=develop
7 years ago
Dang Qingqing d219818434 Fix compiling in cuDNN v5.
7 years ago
Qiao Longfei efb5c03f60 sgd_op optimize selected rows do not enforce id < height
7 years ago
Qiao Longfei 7aa8b2ccf2 optimize code
7 years ago
Qiao Longfei 8d205c853c add is_test for lookup_sparse_table
7 years ago
tensor-tang b4dfba1779 refine lrn_op cpu forward and speedup
7 years ago
tensor-tang 1be85d011d add mkl vsqr and vpow
7 years ago
JiabinYang f4be1d99d0 polish code and test
7 years ago
ruri 4a55fb5f5b Add density_prior_box_op (#14226)
7 years ago
tensor-tang 0043c42b3e add vrelu jitcode
7 years ago
dengkaipeng 36c46152e1 Add unittest for yolov3_loss. test=develop
7 years ago
dengkaipeng 77c1328fa7 add CPU kernel forward
7 years ago
dengkaipeng 5d0b568ecb Add YOLOv3 loss operator. test=develop
7 years ago
JiabinYang b8ff0972b6 test=develop
7 years ago
JiabinYang 32e05b01f2 test=develop
7 years ago
peizhilin 61fa5218b9 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yibing Liu bd2943788b Fix gather & stack op (#14355)
7 years ago
Yu Yang 8f9bfad246 perf(compile): speed up reduce_op compile by splitting files (#14294)
7 years ago
sneaxiy d231e55065 merge develop
7 years ago
JiabinYang c8801e100f grad diff problem to be fixed and need api spec change to be done
7 years ago
Jacek Czaja 03299ed46c - Fix to linking for GPU builds of softmax inference
7 years ago
Jacek Czaja 0756343767 - Fix GPU compilation
7 years ago
Jacek Czaja d332326847 - Added unit tests for softmax is_test=True op
7 years ago
Jacek Czaja c1fccc29c1 - Noise adding removed for Test phase of softmax
7 years ago
peizhilin 7638f0afb3 simplify the logic
7 years ago
peizhilin d01a26280e Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Xin Pan ff28b1ffc0 Merge pull request #14071 from barrierye/add_similarity_focus_op
7 years ago
li099 688ed60116 Add lod tensor array to tensor op (#13990)
7 years ago
peizhilin e23061e0dc Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
chengduo 6c6e638550 Add InferVarType for some op (#14201)
7 years ago
peizhilin 1eec5a428f Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Kaipeng Deng 0b38822624 Merge pull request #14345 from heavengate/fix_grid_sampler
7 years ago
peizhilin ca60e1d34d Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin 52f7644f53 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Qiyang Min 698698f2fa Merge branch 'develop' into fix_vlog
7 years ago
qingqing01 abe209234f Exhaustive search for cuDNN conv. (#14286)
7 years ago
Yu Yang b59a9bfb7c Clean buffered_allocator
7 years ago
Kaipeng Deng f215534ecf Merge pull request #14205 from heavengate/nearest_interp
7 years ago
dengkaipeng 72108d8dbe fix win compile error: EigenTenor * float unsupport. test=develop
7 years ago
Yu Yang 26fb34c365 Merge develop tiny fix
7 years ago
Yu Yang fdc689142c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
tensor-tang 22125ebaef Merge pull request #14321 from tensor-tang/fea/jit/vscal
7 years ago
Tao Luo 34e9e59f4a Merge pull request #14333 from kbinias/change-hardcoded-format-and-bump-mkldnn-version
7 years ago
minqiyang 87450b9ad4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
7 years ago
peizhilin 41b423d41b remove duplicate
7 years ago
peizhilin dcfab11193 merge from develop
7 years ago
peizhilin 4ffa92d4f0 Merge branch 'develop' into windows/build
7 years ago
chengduo c5b6573a5a Fix input<tensor> (#14208)
7 years ago
Krzysztof Binias f1c1acf1ac Changed hardcoded format to any in convolution and bumped MKL-DNN version to 0.17-rc
7 years ago
Tao Luo 813e54efbd Merge pull request #14328 from PaddlePaddle/revert-14046-windows/debug
7 years ago
minqiyang 3db9fad764 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
7 years ago
Xin Pan b03a44e062 Merge pull request #14026 from JiabinYang/add_reorg_op
7 years ago
Zhaolong Xing ba8b5619a3 Revert "cherry picked windows patches."
7 years ago
minqiyang fcc0452c8b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
7 years ago
minqiyang 0c3227a523 Change the origin VLOG level to 10 times
7 years ago
tensor-tang 5e64244f25 add vaddbias jitcode
7 years ago
tensor-tang 5f7956ae59 Merge remote-tracking branch 'ups/develop' into fea/jit/vscal
7 years ago
peizhilin 869487a2b7 Merge remote-tracking branch 'origin/develop' into windows/build
7 years ago
tensor-tang 3d950a812d combine jitcode of vscal
7 years ago
tensor-tang 03e11f3fc9 add vscal jitcode
7 years ago
dzhwinter 234a1d9248 Merge remote-tracking branch 'origin/develop' into windows/debug
7 years ago
chengduo a270fdf2db Fix SelectedRowsAdd bug (#14309)
7 years ago
tensor-tang 2f0a379af7 Merge pull request #14307 from tensor-tang/fix/mac
7 years ago
Zeng Jinle b2af213009 Merge pull request #14292 from sneaxiy/delete_buggy_selected_rows_functor
7 years ago
tensor-tang 161ba9c9d1 fix mac
7 years ago
tensor-tang e8642c3c1f Merge pull request #14265 from tensor-tang/fea/jit/vadd
7 years ago
dengkaipeng 8b47d90f5d add 'actual_shape' attribute. test=develop
7 years ago
tensor-tang 382307b943 refine code
7 years ago
tensor-tang 3319072858 fix jit kernel test on mac
7 years ago
Yu Yang 057a682ee9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
Qiao Longfei e0c8397426 Merge pull request #14257 from jacquesqiao/optimize-pserver-profiler-thread-pool
7 years ago
chengduo ffc866159f hot fix log (#14293)
7 years ago
Zhaolong Xing 65b61db10a Merge pull request #13927 from NHZlX/fix_googlenet_bug_with_rule
7 years ago
tensor-tang 25e070ecc7 Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
barrierye ef8218be22 update docs test=develop
7 years ago
sneaxiy 9518bc8d0a delete buggy selected_rows functor
7 years ago
chengduo a9b5d42dd4 Add fp16 backward support (#14202)
7 years ago
Qiao Longfei 3b8dd9ebbd optimize code test=develop
7 years ago
Qiao Longfei 2921f8a79c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
7 years ago
dzhwinter 2835e04409 merge develop branch. test=develop
7 years ago
dzhwinter deb4af70ef add test
7 years ago
qingqing01 db8c52da5e Revert " Exhaustive search for cuDNN conv. (#14043)"
7 years ago
qingqing01 ce7d9b0799 Exhaustive search for cuDNN conv. (#14043)
7 years ago
tensor-tang cb4083b9fa fix compile error
7 years ago
tensor-tang dd343a4971 Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
Zeng Jinle fcbe84cb50 Merge pull request #14270 from sneaxiy/fix_rmsprop_enforce_bug
7 years ago
nhzlx 5700fafd0f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_googlenet_bug_with_rule
7 years ago
nhzlx 86b99ac953 fix comments and fix bug
7 years ago
tensor-tang e6cfdf6c74 Merge pull request #14274 from tensor-tang/fix/jit
7 years ago
Zeng Jinle 8ac2242b6e Merge pull request #14075 from sneaxiy/remove_some_locks_in_pe
7 years ago
tensor-tang b81e1b655e fix jit on mac
7 years ago
sneaxiy 11f032a82e fix rmsprop_op enforce bug
7 years ago
tensor-tang b68ececb73 add vaddrelu jitcode
7 years ago
peizhilin 1f12ba6192 gpu support, fix build issue:
7 years ago
Wu Yi 8fc05e0373 fix cpu build test=develop (#14260)
7 years ago
tensor-tang bb09e31020 add vadd jitcode
7 years ago
Qiao Longfei 59fbfbfbf7 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
7 years ago
whs d6a6a13039 Fix build error of affine grid op in mac os. (#14237)
7 years ago
tensor-tang d55481cfeb Merge pull request #14241 from tensor-tang/refine/jit/vmulcode
7 years ago
Qiao Longfei 9e4e9e9b6e clean rpc server profiler
7 years ago
Zeng Jinle 8d930195d9 Merge pull request #14238 from sneaxiy/fix_read_lod_level_bug
7 years ago
Wu Yi 306236c2c0 feature/DC asgd (#12722)
7 years ago
dengkaipeng fef2faa709 limit CUDA kernel parallel threads max number to 4096. test=develop
7 years ago
tensor-tang c3cbf0b8ef Merge pull request #14185 from tpatejko/tpatejko/mkldnn-conv-residual-data-reorder
7 years ago
peizhilin 71d7980f69 fix build issue 1
7 years ago
dengkaipeng 34bfae243a Add Interpolate operation. test=develop
7 years ago
sneaxiy 46d4829dd1 fix lod_level share bug in read_op
7 years ago
tensor-tang 8465e7876f auto grow the size and fix test
7 years ago
tensor-tang 9255119fd9 refine jit vmul with all size
7 years ago
tensor-tang a9c1824131 refine jit vmul code supporting multiple of 2
7 years ago
tensor-tang 61fdc38e51 Merge pull request #14206 from tensor-tang/fea/jit/gen
7 years ago
peizhilin 9d67c1fb69 cpu build support
7 years ago
barrierye 5e7bb6a9bd update docs test=develop
7 years ago
dzhwinter 60f70b174d test=develop
7 years ago
sneaxiy 7ff320f8cc merge develop
7 years ago
dongzhihong 00cf66964f Merge remote-tracking branch 'origin/develop' into fix/sign_op
7 years ago
Kaipeng Deng daed473d4a Merge pull request #14089 from heavengate/pool_exclude
7 years ago
Kaipeng Deng 64f3e3ed8f Merge pull request #14069 from heavengate/grid_sampler
7 years ago
sneaxiy 366ebb93f7 test=develop
7 years ago
dzhwinter eb2f7ed21b refine tests. test=develop
7 years ago
Jiabin Yang 9f65b616b2 Merge branch 'develop' into add_reorg_op
7 years ago
Kaipeng Deng 0b29078201 Merge branch 'develop' into grid_sampler
7 years ago
whs 0c319e0b35 Add affine grid generator op (#12238)
7 years ago
tangwei12 d325e668b8 [1.1] Load vars on PSERVER (#14037)
7 years ago
tensor-tang 85bcb286f5 refine vmul jitcode
7 years ago
tensor-tang a764e900a5 Merge remote-tracking branch 'ups/develop' into fea/jit/gen
7 years ago
tensor-tang a3377f7b0a refine jitcode and add vmul jitcode implementation
7 years ago
dzhwinter 1ace55c8ee merge develop branch
7 years ago
dengkaipeng df4a3544aa nearest neighbor interp add cuda kernel. test=develop
7 years ago
chengduo 2ccf77d1c1 Refine GetTensorFromVar (#14160)
7 years ago
dengkaipeng 9755611938 add unittest for nearest_neighbor_interp_op
7 years ago
dengkaipeng a24691a2a9 add nearest neighbor interpolation operator cpu kernel
7 years ago
JiabinYang 8d3c3e048b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
tensor-tang f3badacd97 Merge remote-tracking branch 'ups/develop' into fea/jit/gen
7 years ago
tensor-tang a53b1b0b1b refine and init jitkernel vmul
7 years ago
tensor-tang 2139b9f677 add jit gencode
7 years ago
Tomasz Patejko 8899d42265 MKLDNN conv residual data: primitive reuse interface used. Reorder done when formats are different
7 years ago
chengduo b73708d20b add int and int64 dtype for gather_op (#14175)
7 years ago
Tomasz Patejko f11934cbe6 MKLDNN conv residual data: residual data is reorder when formats are incorrect
7 years ago
Tao Luo cdf2579d08 Merge pull request #14053 from jczaja/prv-seqpool-max
7 years ago
Kaipeng Deng a3b26e8528 Merge branch 'develop' into grid_sampler
7 years ago
dengkaipeng 7333fe8e55 add math formula for exclusive/inclusive mode in avg pool. test=develop
7 years ago
dzhwinter 316765839d add back jit simd instructions. stage.
7 years ago
Xin Pan eb7ed1b720 Merge pull request #13897 from gmcather/develop
7 years ago
barrierye fc23cc9d30 update paddle/fluid/API.spec
7 years ago
dzhwinter bf2e4cb188 cleard. staged
7 years ago
chengduo 2f639113ee Fix sum_op's GetExpectedKernelType (#14112)
7 years ago
gmcather ba22624d7e position encoding && log loss
7 years ago
dzhwinter ebfe5a02b3 merge develop branch
7 years ago
qingqing01 cb27a9219d Merge pull request #13971 from sefira/FasterOpDoc
7 years ago
sneaxiy 5e5d2223a1 test=develop
7 years ago
tensor-tang 3c957af139 Merge pull request #14080 from tensor-tang/refine/jit/crf2
7 years ago
barrierye 5f3acac9b3 update paddle/fluid/API.spec
7 years ago
Jacek Czaja 458b16f42a Rebase of seqpool-max optimization
7 years ago
dengkaipeng ff6329bd5f fix some inappropriate expressions in api doc for grid_sampler. test=develop
7 years ago
dengkaipeng 8f1e398824 move param exclusive to the last in pool2d/pool3d for forward compatibility:. test=develop
7 years ago
dengkaipeng 593e1b18d7 fix some bugs and add some doc for GridSampleOp
7 years ago
dengkaipeng 0bb0e0c10f add Grid Sampler Operator for STN.
7 years ago
Yu Yang c01696f8c2 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
Qiao Longfei d26ff8cb2d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpu-for-1.1-merge-with-shape
7 years ago
JiabinYang e0a89503f8 test=develop
7 years ago
Wu Yi 26200f2e42 [1.1] [project] train imagenet using large batch size (#13766)
7 years ago
barrierye 8c1e304307 merge nn.py
7 years ago
dengkaipeng c93e044ae0 add inclusive/exclusive mode in PoolOp avg pool type
7 years ago
JiabinYang 9a74c4489f test=develop
7 years ago
barrierye 9dc28179a4 add similarity_focus op
7 years ago
Qiao Longfei 7cd2417fe2 Merge branch 'develop' into cpu-for-1.1-merge-with-shape
7 years ago
dzhwinter c8adc2c6fe cudnn version. staged.
7 years ago
Yan Chunwei ee74be3a49 [1.1] Bugfix/tensorarray (#14044)
7 years ago
Qiyang Min 33b4920d2d Merge pull request #14057 from velconia/continue_hash_op
7 years ago
Qiyang Min 209f24a241 Merge pull request #14051 from velconia/accelerate_embedding_grad
7 years ago
Qiao Longfei 7cfc3c4415 Merge branch 'optimize-sum-seq-pooling-op' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei 72aef6b168 sum selected rows check empty
7 years ago
Qiao Longfei 641369f92b Merge branch 'dist-table-do-not-init-on-trainer' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei d69c820707 Merge branch 'add-flag-to-control-rpc-thread-num' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei 1ed9ef6d70 Merge branch 'shape_int_to_int64' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei da61a5b672 Merge branch 'optimizer-prefetch' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge
7 years ago
tangwei12 5ce3a32e06 Merge branch 'develop' into optimizer-prefetch
7 years ago
seiriosPlus b6590b05fb submit by tangwei12, test=develop
7 years ago
tangwei12 cb1ccc710b fix shape type in uniform_random_op.cu
7 years ago
Qiao Longfei 575f22711d optimize code
7 years ago
Qiao Longfei 96d5500934 optimize code
7 years ago
Qiao Longfei 748ee35c89 sum op handle empty input update selected_rows_functor.cu
7 years ago
Qiao Longfei dd78b5df93 sum op handle empty input
7 years ago
Qiao Longfei cbe128bbae Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei f4df0cb1a2 update the type of shape to int64, format code
7 years ago
Qiao Longfei 7dcb0dc8c6 update year
7 years ago
Qiao Longfei 68aeb4e7e9 add fake init test in test_dist_transpiler
7 years ago
Qiao Longfei a13c788a04 fix a bug
7 years ago
Zeng Jinle 97d47a7d08 Merge pull request #13913 from sneaxiy/seq_reverse
7 years ago
JiabinYang 6e3615422f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
Jiabin Yang a3efba176c Merge pull request #14085 from jerrywgz/fix_generate_proposals_op
7 years ago
dzhwinter 7141debe38 add cudnn back. staged.
7 years ago
Qiao Longfei 0328ffd3ab add fake init op
7 years ago
Hongyu Liu 379d933ae5 Merge pull request #14036 from phlrain/add_dropout_att_new
7 years ago
tangwei12 d8b697357f update height_sections to int64_t
7 years ago
jerrywgz de2f965c9b test=develop
7 years ago
dzhwinter 09409bad4d staged. test speed=49ms in 1080.
7 years ago
tensor-tang 64d5b4385e fix crf decode avx512
7 years ago
tensor-tang 21487d78bf add crf decode jit kernel
7 years ago
sneaxiy 1af3fe8c35 test=develop
7 years ago
Qiao Longfei de539d72da format
7 years ago
sneaxiy 5be6f762d0 remove_lock_in_some_ops
7 years ago
buxingyuan 6c1d74bb47 Merge branch 'develop' into FasterOpDoc
7 years ago
JiabinYang 7bcba47e41 test=develop
7 years ago
barrierye a7f94ec794 add similarity_focus op
7 years ago
minqiyang 0de6811ee0 Change reserve to resize
7 years ago
JiabinYang 9cad409f2a test=develop
7 years ago
minqiyang 5660d6a3ba Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_embedding_grad
7 years ago
tensor-tang a05fce6544 Merge remote-tracking branch 'ups/develop' into fix/jit/avx
7 years ago
JiabinYang bd064c0f44 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
Qiyang Min d0fdcb2f6d Merge pull request #14048 from velconia/change_sequence_pool_to_cpu
7 years ago
Yu Yang 8310ce6007 Fix cluster memory
7 years ago
tensor-tang d24d282a7a fix avx error
7 years ago
tensor-tang 9cb8738f54 Merge pull request #14018 from tensor-tang/refine/jit/gru
7 years ago
Qiao Longfei 6253b152e6 Merge branch 'optimize-sum-seq-pooling-op' of https://github.com/jacquesqiao/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei 14f5a40898 fix unit test
7 years ago
minqiyang 5de4619781 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_embedding_grad
7 years ago
minqiyang 0695c1fbe8 Add remind for code
7 years ago
minqiyang 0c5c4c4a5b Add blas header file
7 years ago
buxingyuan d0ccdf8fc1 follow comments
7 years ago
minqiyang e2a348cd10 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into change_sequence_pool_to_cpu
7 years ago
Qiao Longfei f4e6fe0786 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
minqiyang 40141f749b Implement the unittest for hash op
7 years ago
minqiyang 8a0f26f45f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into continue_hash_op
7 years ago
minqiyang d4f9aa0852 Add hash op implementation
7 years ago
tangwei12 755927d2b0 shape type to int64_t, test=develop
7 years ago
Qiao Longfei 7357d8412e add flags for control the thead num for pserver
7 years ago
minqiyang 1a3b38a432 Polish code
7 years ago
minqiyang 133bac2b10 Accelerate embedding op grad
7 years ago
dzhwinter 597d92179b clean demo_ci
7 years ago
phlrain 201d4f2a85 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_dropout_att_new
7 years ago
phlrain a6e6bc45d6 modify dropout att; test=develop
7 years ago
minqiyang 2468057da6 Move code to SumSeqPoolGradFunctor
7 years ago
minqiyang 9725db0d40 Fix copy wrong pos bug
7 years ago
minqiyang 9c68709036 Accelerate sequence_pool functor
7 years ago
minqiyang 14ebc424d6 Add gpu support for unittest
7 years ago
jerrywgz e906c8e5e7 Merge pull request #14022 from jerrywgz/fix_rpn_target_assign_op
7 years ago
minqiyang bd5a82e193 Polish unit test code
7 years ago
minqiyang 047fa2f9aa Add unit-test for sequence_pooling functor
7 years ago
qingqing01 c7379a7320 Fix top_k op (#14034)
7 years ago
sneaxiy 016bf51e3f test=develop
7 years ago
JiabinYang c13f1ef3c4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
Xin Pan 8837669782 Merge pull request #13982 from panyx0718/fix
7 years ago
dzhwinter dbd0075b68 Merge branch 'windows/support' into lb
7 years ago
dzhwinter c6dcffc61a lb. add debug output
7 years ago
sneaxiy 92a2817a2b test=develop
7 years ago
JiabinYang 8e8e8e66ab Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
phlrain 049c9c7d2a Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_dropout_att_new
7 years ago
phlrain ffb24a73ec add dropout attr; test=develop
7 years ago
wanghaoshuang 5993155d67 Merge remote-tracking branch 'dzhwinter/windows/support' into windows/support
7 years ago
wanghaoshuang f9e7cfb03c save binary file
7 years ago
tensor-tang 032c3a07e3 Merge remote-tracking branch 'ups/develop' into refine/jit/gru
7 years ago
tensor-tang 159be8cc63 optimize fusion gru kernel at size 8
7 years ago
Tao Luo 23da8defc8 Merge pull request #14028 from luotao1/fix_resnet50_test
7 years ago
Yu Yang 71c846ef8a Revert buggy changes
7 years ago
JiabinYang ff07dc315e test=develop
7 years ago
chengduo a7497653d0 Refine Split op (#13967)
7 years ago
Yu Yang dbf9f6f408 Fix distribute compile
7 years ago
jerrywgz e0708e62ba refine code
7 years ago
jerrywgz 1c591c3909 Merge branch 'develop' into fix_rpn_target_assign_op
7 years ago
sneaxiy a9d7a9d720 test=develop
7 years ago
Tao Luo 316bc9bfc9 fix typo and warning in analyzer_resnet50_test
7 years ago
jerrywgz f06c6193d7 fix rpn target assign test=develop
7 years ago
dongzhihong 563e7bca7f "fix op. test=develop"
7 years ago
Xin Pan 8f2116d8fa clean up after the changes have been stopped for so long.
7 years ago
tensor-tang 83dc689877 Merge remote-tracking branch 'ups/develop' into refine/jit/gru
7 years ago
tensor-tang 640e789d3d add fusion gru jit kernel
7 years ago
JiabinYang 39d39775c3 test=develop
7 years ago
JiabinYang 70351de1b5 test=develop
7 years ago
Yu Yang 461f71a90b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
qingqing01 0e24138494 Merge pull request #13991 from qingqing01/refine_generate_proposals_op
7 years ago
gongweibao 58c027cc38 Add rpc profiler flags. (#13989)
7 years ago
Tao Luo 42aa1d409d Merge pull request #13485 from tpatejko/tpatejko/capi-resnet-conv-elementwise-fusion
7 years ago
tensor-tang 664159ad42 Merge pull request #13998 from tensor-tang/fea/fusion_seqconv_add
7 years ago
Qiao Longfei 40d65a1369 optimize code
7 years ago
Qiao Longfei d37b9797ec update test
7 years ago
Qiao Longfei 4051fb36b5 add monitor thread
7 years ago
Qiao Longfei e67783375d code clean
7 years ago
Qiao Longfei 5c65eff6ef update test for ctr data
7 years ago
jerrywgz 765085d297 Merge pull request #13904 from jerrywgz/roialign
7 years ago
Dang Qingqing 56936b9e25 Refine doc for generate_proposals_op.
7 years ago
Tomasz Patejko 4be45af1cc MKLDNN conv + elementwise_add fusion: skip connection attribute renamed. Comments about patterns added.
7 years ago
Michal Gallus f688197182 MKLDNN conv + elementwise_add fusion: Fix output_data to point to the right tensor, also fix transpiler integration
7 years ago
Tomasz Patejko bf95ac36a7 MKLDNN conv + elementwise_add fusion: further reformatting
7 years ago
Tomasz Patejko b8e54ab5cc MKLDNN conv + elementwise_add fusion: parameter name changed to ResidualData
7 years ago
Tomasz Patejko 41f3d78fdf MKLDNN conv + elementwise_add fusion: output and elemwise param share data in conv primitive. Output is properly allocated
7 years ago
Tomasz Patejko 56528531ea MKLDNN conv + elementwis_add fusion: initial work on passing eltwise data to conv primitive
7 years ago
Qiao Longfei 044d2e20bf update test method
7 years ago
Dang Qingqing 4801ee8f97 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_generate_proposals_op
7 years ago
Qiao Longfei 92cbaa41eb add GetTimeInSec
7 years ago
tensor-tang 23fc896bc2 Merge remote-tracking branch 'ups/develop' into fea/fusion_seqconv_add
7 years ago
tensor-tang 339e655aec refine and add seqconv elementwiseadd relu op test
7 years ago
jerrywgz a1d3db031b Merge pull request #13844 from jerrywgz/fix_roi_pool
7 years ago
Dang Qingqing 8e0b9496de Fix unit test
7 years ago
tensor-tang 0a9f5f1790 Merge pull request #13968 from tensor-tang/fix/jit/exp
7 years ago
Yipeng fcb2e8103e Ocr end2end dev (#13889)
7 years ago
tensor-tang e5ce965952 refine and add eltadd_relu unit test
7 years ago
sneaxiy 5a38930660 test=develop
7 years ago
Qiao Longfei dd2dfeb624 add debug information
7 years ago
Qiao Longfei 803e2ed9f4 add ctr_reader_test and fix bug
7 years ago
tensor-tang 7cb19a5976 fuse elementwise_add and relu
7 years ago
Qiao Longfei c8bd521045 add reader thread status
7 years ago
tensor-tang 3c249283af init seqconv eltadd relu op
7 years ago
Qiao Longfei 71cbc8bd24 optimize code
7 years ago
Qiao Longfei 694e8945a2 add a base class for reader
7 years ago
Qiao Longfei d981333e94 add a base class for reader
7 years ago
Qiao Longfei a06173eedc clean code
7 years ago
Qiao Longfei 71c2ad412f complete read thread
7 years ago
sneaxiy ac2eba4457 test=develop
7 years ago
Qiao Longfei 0f3ece775d use gzstream
7 years ago
jerrywgz 553342624e test=develop
7 years ago
jerrywgz 9a14ca91b8 test=develop
7 years ago
tensor-tang 60ff05e312 Merge branch 'luotao1-fix_rnn2_test' into fix/jit/exp
7 years ago
Qiao Longfei a1e0f5abb7 add gzstream.cmake
7 years ago
Tao Luo 7d680be5a3 Merge branch 'develop' into mkldnn_test
7 years ago
buxingyuan 0bb3b099c2 generate_proposal_labels doc
7 years ago
Qiao Longfei 20f181cdc1 init ctr_reader
7 years ago
gongweibao a831ecc75d Add grpc error context. (#13957)
7 years ago
tensor-tang b139b687de Merge remote-tracking branch 'ups/develop' into fix/jit/exp
7 years ago
qingqing01 67a2b5215d Add affine channel op to speed and save memory for faster-rcnn model. (#13919)
7 years ago
tensor-tang 748435586a clean code exp avx
7 years ago
tensor-tang b4751a34a5 fix illegal instruction of rnn2
7 years ago
tensor-tang 30dfbdee7f Merge pull request #13951 from tensor-tang/fix/warning
7 years ago
tensor-tang 36588b3365 fix illegal instruction of rnn1 and text
7 years ago
Tao Luo 6a4e9230ed Merge branch 'develop' into mkldnn_test
7 years ago
gongweibao 078223b3e3 Add rpc timeline. (#13900)
7 years ago
dzhwinter 29382db625 Merge pull request #13874 from dzhwinter/fix/momentum
7 years ago
qingqing01 5dbb2e9986 Small changes for sum_op to avoid zero setting. (#13923)
7 years ago
Tao Luo e47f4186ae fix some compiler warning
7 years ago
dzhwinter 00e8791f66 fix compile in cpu error. test=develop
7 years ago
tensor-tang e69328c3bc fix warning and mac compile
7 years ago
Qiao Longfei d26e4507da init ctr data
7 years ago
dzhwinter d239cf2e15 use binary search. test=develop
7 years ago
dzhwinter a9f5f822e6 use binary search. test=develop
7 years ago
tensor-tang 6447155dac Merge pull request #13851 from tensor-tang/fea/jitkernel_peephole
7 years ago
sneaxiy 4b4af84e67 test=develop
7 years ago
jerrywgz 4c9884e713 refine unittest test=develop
7 years ago
Qiao Longfei 0225957515 change elementwise_add to elementwise_add_to test=develop
7 years ago
Qiao Longfei bd2b6d7f8f sum_op support inplace
7 years ago
Xin Pan 7fb5b66ac2 Merge pull request #13916 from panyx0718/fix2
7 years ago
dzhwinter 3861269594 merge develop branch
7 years ago
jerrywgz 98c3294b85 Merge branch 'roialign' of https://github.com/jerrywgz/Paddle into roialign
7 years ago
tangwei12 fa2ab3346c fill constant add infervarshape, lookuptable clone lr var (#13830)
7 years ago
jerrywgz 8c79071d6a roi_align for gpu
7 years ago
Xin Pan 342e436158 Make Var::GetMutable robust
7 years ago
Yan Chunwei 7a751b83ac fix isfinite_op sprintf (#13850)
7 years ago
Qiyang Min e3a64fca44 Merge pull request #13835 from velconia/fix_reshape_op
7 years ago
Yibing Liu 46b0b7903c Merge pull request #13856 from kuke/seq_unpad_op
7 years ago
Qiao Longfei b4a32eafdf Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
jerrywgz c9d2046f76 roi_align for gpu
7 years ago
jerrywgz 2f5a80174e add roi_align api
7 years ago
dzhwinter e41a3fcd68 fix update to develop hang problem.
7 years ago
Zeng Jinle 93606c2c2c Merge pull request #13689 from sneaxiy/sparse_rmsprop
7 years ago
Qiao Longfei 681226e97c Merge pull request #13864 from jacquesqiao/py-reader-add-test-mode
7 years ago
jerrywgz 90f39b1123 Merge branch 'roialign' of https://github.com/jerrywgz/Paddle into roialign
7 years ago
Xin Pan 288a112ffd Revert "Revert "Revert "Make variable::GetMutable robust"""
7 years ago
sneaxiy 5cedfb60c8 test=develop
7 years ago