Commit Graph

2944 Commits (8d88c5a87d2e2485b7a7f8714e874f9c69c0620a)

Author SHA1 Message Date
Michal Gallus 9455be0ba5 EltwiseMul: Extract StringToFormat to MKLDNN helper
7 years ago
Jacek Czaja 1540df51cf - Fix to test_conv2d_transpose_mkldnn for GPU
7 years ago
JiabinYang eda069068d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into enhance_hierachical_sigmod_op
7 years ago
JiabinYang a08dc83eb0 remove arg 'non_leaf_num', test=develop
7 years ago
chengduo 6648f5ed6f add ShareLoD for dropout_grad (#14616)
7 years ago
JiabinYang c469334cfb polish python code and comment, test=develop
7 years ago
Qiao Longfei 92afbb923c fix compile problem test=develop
7 years ago
Qiao Longfei 97cbec9b74 clean code
7 years ago
Qiao Longfei 1edd435da6 fix ci problem test=develop
7 years ago
JiabinYang 87648f8edf merge develop, test=develop
7 years ago
wopeizl db9284ecde Merge pull request #14617 from wopeizl/windows/online
7 years ago
JiabinYang c3c3c0b33c polish code, test=develop
7 years ago
Jacek Czaja 8bfa1fa9bb - ASUM MKL integration
7 years ago
phlrain 487ee36aec Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
tangwei12 56a4912b76 Make NCE_OP more efficient and support SelectedRows (#14469)
7 years ago
liuhongyu 1ffe41d722 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
Qiao Longfei 9589babe12 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
7 years ago
liuhongyu 05917c3c79 add cudnn lstm; test=develop
7 years ago
Qiao Longfei f35f3fe77a ctr reader can not be used in windows
7 years ago
peizhilin 6a85dd3278 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin 38715e6fd0 minor fix
7 years ago
Qiao Longfei 6bef565dac clean code test=develop
7 years ago
Qiao Longfei e7d1f524f3 change log level
7 years ago
JiabinYang 7e4bd695e6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into enhance_hierachical_sigmod_op
7 years ago
Qiao Longfei fe54adf70c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-ctr-reader
7 years ago
JiabinYang b10df8bcfa refine code and add none bias ut, test=develop
7 years ago
Kaipeng Deng 251a1bb0f4 Merge pull request #14588 from heavengate/revert_interpolate
7 years ago
Qiao Longfei 668ae9083e Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-ctr-reader
7 years ago
Qiyang Min 30e47bce8b Merge branch 'develop' into revert_vlog
7 years ago
Qiao Longfei 87e4edd2ea fix grad_varname in remote prefetch
7 years ago
Qiao Longfei d98c59fd2c support none sliced variable
7 years ago
dengkaipeng bb489d4cc9 add interp_method default bilinear. test=develop
7 years ago
dengkaipeng 78f563917c revert interpolate_op to bilinear_interp_op & nearest_interp_op. test=develop
7 years ago
Jacek Czaja fb24690a58 - conv2d transpose MKL-DNN
7 years ago
tensor-tang 7a91271436 Merge branch 'develop' into fea/jit/rnn
7 years ago
minqiyang be04d99fe4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
7 years ago
JiabinYang 81e145764d refine code and comments, test=develop
7 years ago
Qiao Longfei af2f5fc824 fix some bugs
7 years ago
JiabinYang 2f6b529aff refine code and comments, test=develop
7 years ago
minqiyang 53433d7f2e Revert the changes of VLOG
7 years ago
tensor-tang 1f0291a51e add comments and follow comments
7 years ago
tensor-tang 557229bd39 Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
Qiao Longfei ed9fa4b301 can run
7 years ago
peizhilin 30849d1f20 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
qingqing01 6224e61fd9 Transpose-Flatten-Concat fusion operator. (#14568)
7 years ago
Qiao Longfei 686d15c8e0 update grpc_variable_response
7 years ago
tangwei12 3639d99f99 Fix save and load lookup table/optimizer vars (#14301)
7 years ago
peizhilin 36cd18b549 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Qiao Longfei d827881502 fix pserver and prefetch rpc
7 years ago
Yiqun Liu bf222f197d Use sub scope in tensor_array_to_tensor op. (#14524)
7 years ago
JiabinYang 02d68051db add sparsed bias grad, test=develop
7 years ago
Qiao Longfei 5856c2f332 change Var to FindVar
7 years ago
Qiao Longfei 312b7786d9 clean code
7 years ago
Qiao Longfei 2b6c0c09d6 add unit test
7 years ago
Qiao Longfei 47280ef8b4 lookup table op support prefetch
7 years ago
gongweibao c1bf9664cd Add options to disable SO_REUSEPORT of grpc. (#14269)
7 years ago
Qiao Longfei 4ad5fd8f54 add parameter prefetch
7 years ago
Qiao Longfei 9d276fe8a8 add parameter prefetch
7 years ago
luotao1 e21edb26f6 add Set/GetCPUNumThreads api
7 years ago
Qiao Longfei 9851a53478 add prefetch part in pserver
7 years ago
JiabinYang 42470f14b7 test=develop
7 years ago
peizhilin 445fff24dc add the bigobj option to NVCC compile
7 years ago
qingqing01 36f08eef3b CUDA kernel for density_prior_box_op. (#14513)
7 years ago
tensor-tang 6a7f83d45d enable gru jitcode and refine act and lstm jitcode
7 years ago
tensor-tang 686eaf20ba Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
peizhilin 81bd7eeff4 rollback the format
7 years ago
Qiao Longfei 1f87f263a2 clean code
7 years ago
Qiao Longfei 361cb0e078 lookup remote table can compile
7 years ago
JiabinYang 0fca16847c temp
7 years ago
JiabinYang e9be3366a9 test=develop
7 years ago
chengduo 00b9e9a135 Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929)
7 years ago
peizhilin dfbac60398 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin 7c8c9dc9bf fix unit test cases
7 years ago
tensor-tang 0c5ed5f6fc enable peephole jitcode
7 years ago
JiabinYang 3c6102a367 test=develop
7 years ago
Qiao Longfei 7c3ce2952d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
7 years ago
Qiao Longfei 60a4f69b3c add lookup remote table op
7 years ago
Qiao Longfei e0b48f7e29 init lookup remote table
7 years ago
tensor-tang e3b61cf52b init gru jitcode and fix lstm jitcode
7 years ago
tensor-tang 0f25446574 Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
Dun ae7d22862b Group Norm (#13843)
7 years ago
wopeizl d9a1f3e58e Windows/online (#14474)
7 years ago
JiabinYang 57a18e32a1 test=develop
7 years ago
peizhilin bef475c92b Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Tao Luo 5d4d117edc Merge pull request #14502 from qingqing01/cudnn5_fix
7 years ago
Jiabin Yang f7b55de9e5 Merge branch 'develop' into enhance_hierachical_sigmod_op
7 years ago
Yu Yang e68c1fcd5a Merge pull request #14522 from reyoung/feature/fix_op_header_deps
7 years ago
tensor-tang 3562051302 add gru refer code and remove redundant avx code
7 years ago
JiabinYang af9a3301da test=develop
7 years ago
Zhaolong Xing ad349e770f Merge pull request #14452 from NHZlX/fix_avg_pool_trt_bug
7 years ago
tensor-tang f913860873 jitkernel lstm refer support peephole
7 years ago
tensor-tang 2f9b5f2383 Merge branch 'develop' into fea/jit/rnn
7 years ago
JiabinYang 014e50c284 test=develop
7 years ago
Yu Yang 3edd32d070 fix(Compile): fix depends error when compile op using cub
7 years ago
Dang Qingqing cda60311f9 Fix compling with cuDNN v5
7 years ago
peizhilin 67562a6fcd Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang 10fb4ceefc Merge pull request #14351 from tpatejko/tpatejko/mkldnn-elementwise_mul
7 years ago
jerrywgz 13e254faed refine code, test=develop
7 years ago
tensor-tang b4c826c548 Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
tensor-tang ce31deb7e9 refine refer code and add lstm refer code
7 years ago
jerrywgz 79cec53111 add ignore index for sigmoid cross entropy with logits op, test=develop
7 years ago
nhzlx e62872df8b fix conflicts
7 years ago
tensor-tang c2cfb03a72 add lstm jitcode
7 years ago
peizhilin 25adf970b2 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Tao Luo 1d3e9bde1e Merge pull request #14488 from yihuaxu/develop_7a64d48f5_stack_opt
7 years ago
tensor-tang 7aa3aff338 Merge pull request #14465 from tensor-tang/fea/jit/exp
7 years ago
Tao Luo 1b894e495f Merge pull request #14437 from jczaja/prv-softmax-mkl
7 years ago
peizhilin 3a72a634cf Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yihua Xu a906a361be Add the macro for NVCC (test=develop)
7 years ago
Yihua Xu d91740acb1 Revert "Remove the remnant code (test=develop)"
7 years ago
Yihua Xu be50670348 Remove the remnant code (test=develop)
7 years ago
qingqing01 9eefd2c766 Modify some infer-shape about detection operators in compile-time. (#14483)
7 years ago
Yihua Xu f4c869d872 Optimize the layer_norm operator with AVX intrinsic function (#14417)
7 years ago
peizhilin ee0fd78c81 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yu Yang f1a392a5fe Merge pull request #13804 from sneaxiy/rewrite_allocation
7 years ago
Yihua Xu f418f552df Merge branch 'develop' into develop_7a64d48f5_stack_opt (test=develop)
7 years ago
peizhilin 8443961a4f add warp_ctc back
7 years ago
qingqing01 fd7e643153 Convolution fusion operator. (#14449)
7 years ago
Yu Yang 98bbfc17be Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
peizhilin 4a6769da84 re-organize the cmake file
7 years ago
dengkaipeng 8ef6280c03 Add operator double support. test=develop
7 years ago
peizhilin 1aff40a4c6 exclude warpctc_op on windows
7 years ago
peizhilin 7d51a0e887 disable DSO by default on windows
7 years ago
peizhilin b967e01cbe Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Wu Yi d7bd0361cb fix dist deps (#14471)
7 years ago
Jacek Czaja 9b0eae3023 - Removing partial specialization of sotmax for inference for GPU
7 years ago
peizhilin a3e952f41d add the jit back
7 years ago
tensor-tang a19b3225a1 fix jitcode small size
7 years ago
Jacek Czaja be80bb4f28 - Fix to GPU
7 years ago
tensor-tang 4dbdfa60ef sigmoid and tanh support all size
7 years ago
tensor-tang ccb8963705 refine exp jitcode with all size
7 years ago
peizhilin 1cc23ef67d merge from paddle:develop
7 years ago
tensor-tang d3eae8f61b refine relu and fix addrelu test
7 years ago
tensor-tang 4e67fe6a12 refine act and vxx with all size
7 years ago
tensor-tang ba3eaed7a7 exp support all size
7 years ago
tensor-tang 1ffce8c0ae fix build error on noavx
7 years ago
Michal Gallus c69c41604e MKLDNN elementwise_mul: Move Kernel to KernelPool to avoid segfaults
7 years ago
Michal Gallus 785066eb8a MKLDNN elementwise_mul: Check if AVX512 is available
7 years ago
Michal Gallus 08f63c4d12 MKLDNN elementwise_mul: Lint changes to UT & integration
7 years ago
Michal Gallus 49b09327f6 MKLDNN elementwise_mul: Reorder on non-nchw input, fallback on non-16 divisable fm
7 years ago
Michal Gallus d14858e4ba MKLDNN elementwise_mul: Parallelize mul
7 years ago
Michal Gallus ed31936ba1 MKLDNN elementwise_mul: Support NCHW, update UT
7 years ago
Tomasz Patejko 700bcbf74f MKLDNN elementwise_mul: h and w loops implemented in xbyak
7 years ago
Tomasz Patejko ad09facafe MKLDNN elementwise_mul: CPU tests initially refactored. MKLDNN mul test for broadcast added
7 years ago
Tomasz Patejko 2d73ad180a MKLDNN elementwise_mul: simple xbyak version for AVX512
7 years ago
Tomasz Patejko 213ec37d6a MKLDNN elementwise_add: simple initial implementation of the operator for MKLDNN format
7 years ago
Wu Yi a2d9b34417 Refine operator cmake (#14413)
7 years ago
peizhilin 764f97deac Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin 8580b7a130 Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang 7f17e561d7 Merge pull request #14423 from tensor-tang/fea/jit/act
7 years ago
Jiabin Yang 28bd5b7bad fix space_to_depth_op unicode problem (#14430)
7 years ago
Jacek Czaja 513bb6c151 Squashing MKL based softmax for inference
7 years ago
nhzlx 9b64aac41f add macro for pool2dDirectCUDAFunctor
7 years ago
whs 1722678258 Make nce support more distribution. (#13549)
7 years ago
nhzlx 83f8c403a7 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into fix_avg_pool_trt_bug
7 years ago
nhzlx b969116988 fxi avg pool trt bug and fix cpplint
7 years ago
tensor-tang 1f00723fa3 exp, sigmoid, tanh jitcode support more size
7 years ago
Qiyang Min d971d5b875 Merge pull request #14431 from velconia/fix_expand_op_dim_in_compile_time
7 years ago
Wu Yi b32c13dc20 Add cudnn ctc loss (#12366)
7 years ago
tensor-tang 8cda7b3d20 Merge remote-tracking branch 'ups/develop' into fea/jit/act
7 years ago
tensor-tang e2d6eddd32 remove ComputeDeprecated
7 years ago
peizhilin 6d0d5a76eb Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
dengkaipeng f115eb0d1e enhance api. test=develop
7 years ago
tensor-tang 64f7516aee fix lrn on mac (#14426)
7 years ago
Yu Yang c8f6e70ab4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
tensor-tang f65ddff8d1 unify act jitcode of relu, exp, sigmoid and tanh
7 years ago
tensor-tang 6a159071b6 add vtanh jitcode of size 8
7 years ago
tensor-tang 046374bcd1 add vsigmoid jitcode of size 8
7 years ago
minqiyang 560b29ccb7 Polish code
7 years ago
minqiyang 21d6e8e8c8 Polish code
7 years ago
minqiyang 50b6e4c6bc Fix expand grad op infer shape
7 years ago
Sylwester Fraczek 8a1eeec579 add mkldnn prop_kind phase for inference-only case to pooling and activations (#14278)
7 years ago
peizhilin d1429ac4a5 add recordio support
7 years ago
chengduo 82773477ae Add selu (#14415)
7 years ago
dengkaipeng 95d5060ddd fix abs -> fabs error. test=develop
7 years ago
minqiyang 30147d7f58 Fix expand op incorrect infer shape
7 years ago
JiabinYang ba9ff508e8 temp fix
7 years ago
Yihua Xu 03ccb9a461 Optimize the stack operator
7 years ago
dengkaipeng 2faa2b4048 remove cu file. test=develop
7 years ago
tensor-tang ee2a7f1b8c refine exp and fix error on avx
7 years ago
tensor-tang 1e06a32a0d add vexp jitcode of size 8
7 years ago
tensor-tang 2354409601 Merge pull request #14374 from tensor-tang/fea/jit/act
7 years ago
Tao Luo 5ef123c778 Merge branch 'develop' into dam_fc
7 years ago
dzhwinter d3aed98d86 Merge pull request #14320 from wopeizl/windows/online
7 years ago
Tao Luo d3e63e6e04 Merge pull request #14412 from jczaja/prv-dam-softmax
7 years ago
peizhilin be332a13bc Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Jacek Czaja b361579f09 - Softmax for Inference is enabled when ON_INFER is set
7 years ago
Tao Luo 980a6753a8 fix typo to pass the ci
7 years ago
Tao Luo 8f301f4618 Merge pull request #14381 from qingqing01/manylinux_v5_fix
7 years ago
peizhilin 1a9008c420 code style fix
7 years ago
Tao Luo e0d4e04bdd fix some compiler warning
7 years ago
Tao Luo 8ea13e336a add in_num_col_dims for fc
7 years ago
JiabinYang a507845a77 test=develop
7 years ago
Tao Luo 9eb0ab1db3 Merge pull request #14384 from tensor-tang/refine/lrn
7 years ago
peizhilin 30ddc07a7e Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Qiao Longfei e65cbd3b06 Merge pull request #14387 from jacquesqiao/lookup_sparse_table_add_test_mode
7 years ago
Qiao Longfei 6cf8f24b1b Merge pull request #14389 from jacquesqiao/fix_sgd_op_optimize_sparse_table
7 years ago
Xin Pan 10ab177f89 Merge pull request #14403 from PaddlePaddle/revert-14337-prv-dam-softmax
7 years ago
Yan Chunwei 9f252e0032 Combine Inference Analysis with IR (#13914)
7 years ago
Tao Luo 5b9c62faee Revert "Softmax op optimization for inference "
7 years ago