Commit Graph

11940 Commits (5d4d117edc25efdd9b6b7d0f18fbd70c48118da3)

Author SHA1 Message Date
Jacek Czaja be80bb4f28 - Fix to GPU
6 years ago
tensor-tang 4dbdfa60ef sigmoid and tanh support all size
6 years ago
tensor-tang ccb8963705 refine exp jitcode with all size
6 years ago
tensor-tang d3eae8f61b refine relu and fix addrelu test
6 years ago
tensor-tang 4e67fe6a12 refine act and vxx with all size
6 years ago
tensor-tang ba3eaed7a7 exp support all size
6 years ago
tensor-tang 1ffce8c0ae fix build error on noavx
6 years ago
superjomn 4bf6817cbc fix gpu load model
6 years ago
Michal Gallus c69c41604e MKLDNN elementwise_mul: Move Kernel to KernelPool to avoid segfaults
6 years ago
Michal Gallus 785066eb8a MKLDNN elementwise_mul: Check if AVX512 is available
6 years ago
Michal Gallus 08f63c4d12 MKLDNN elementwise_mul: Lint changes to UT & integration
6 years ago
Michal Gallus 49b09327f6 MKLDNN elementwise_mul: Reorder on non-nchw input, fallback on non-16 divisable fm
6 years ago
Michal Gallus d14858e4ba MKLDNN elementwise_mul: Parallelize mul
6 years ago
Michal Gallus ed31936ba1 MKLDNN elementwise_mul: Support NCHW, update UT
6 years ago
Michal Gallus 4e54ab76ec Add HasAttr method to Operator
6 years ago
Tomasz Patejko 700bcbf74f MKLDNN elementwise_mul: h and w loops implemented in xbyak
6 years ago
Tomasz Patejko ad09facafe MKLDNN elementwise_mul: CPU tests initially refactored. MKLDNN mul test for broadcast added
6 years ago
Tomasz Patejko 2d73ad180a MKLDNN elementwise_mul: simple xbyak version for AVX512
6 years ago
Tomasz Patejko 213ec37d6a MKLDNN elementwise_add: simple initial implementation of the operator for MKLDNN format
6 years ago
Wu Yi a2d9b34417 Refine operator cmake (#14413)
6 years ago
Tomasz Patejko 53da846d1e MKLDNN residual connections fuse pass: initial implementation of fusion for projection pass
6 years ago
tensor-tang 7f17e561d7 Merge pull request #14423 from tensor-tang/fea/jit/act
6 years ago
Jiabin Yang 28bd5b7bad fix space_to_depth_op unicode problem (#14430)
6 years ago
Jacek Czaja 513bb6c151 Squashing MKL based softmax for inference
6 years ago
nhzlx 9b64aac41f add macro for pool2dDirectCUDAFunctor
6 years ago
Tomasz Patejko dbc4fcd722 MKLDNN residual connections fuse pass: unit tests enabled and added
6 years ago
Tomasz Patejko 4224089354 MKLDNN residual connections fuse pass: Maybe removed and boost::optional used where it makes sense
6 years ago
Tomasz Patejko 86fd3b32be MKLDNN residual connections fuse pass: counting statistics added to the pass
6 years ago
Tomasz Patejko ee6f778beb MKLDNN residual connections fuse pass: further refactoring
6 years ago
Tomasz Patejko 7423748e37 MKLDNN residual connections fuse pass:
6 years ago
nhzlx 8f9a8c455a delete unused test code.
6 years ago
whs 1722678258 Make nce support more distribution. (#13549)
6 years ago
nhzlx 83f8c403a7 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into fix_avg_pool_trt_bug
6 years ago
nhzlx b969116988 fxi avg pool trt bug and fix cpplint
6 years ago
tensor-tang 1f00723fa3 exp, sigmoid, tanh jitcode support more size
6 years ago
Yu Yang 19e669a992 Add legacy_allocator
6 years ago
Zhaolong Xing 2f27c048cc Merge pull request #14440 from hjchen2/develop
6 years ago
Yu Yang 1cb7e7dda2 fix(allocation): fix ut
6 years ago
Qiyang Min d971d5b875 Merge pull request #14431 from velconia/fix_expand_op_dim_in_compile_time
6 years ago
hjchen2 6a7b995737 Refine commit message to enable ci, test=develop
6 years ago
Wu Yi b32c13dc20 Add cudnn ctc loss (#12366)
6 years ago
tensor-tang 8cda7b3d20 Merge remote-tracking branch 'ups/develop' into fea/jit/act
6 years ago
tensor-tang e2d6eddd32 remove ComputeDeprecated
6 years ago
Yan Chunwei 7796f65f89 fix inference on gpu out of mem (#14414)
6 years ago
tensor-tang 64f7516aee fix lrn on mac (#14426)
6 years ago
Yu Yang c8f6e70ab4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
6 years ago
hjchen2 413f5948b2 Fix code style
6 years ago
hjchen2 21f33b4274 Complete PRelu plugin and Conv2d transpose op converter
6 years ago
tensor-tang f65ddff8d1 unify act jitcode of relu, exp, sigmoid and tanh
6 years ago
tensor-tang 6a159071b6 add vtanh jitcode of size 8
6 years ago
tensor-tang 046374bcd1 add vsigmoid jitcode of size 8
6 years ago
minqiyang 560b29ccb7 Polish code
6 years ago
minqiyang 21d6e8e8c8 Polish code
6 years ago
minqiyang 50b6e4c6bc Fix expand grad op infer shape
6 years ago
Sylwester Fraczek 8a1eeec579 add mkldnn prop_kind phase for inference-only case to pooling and activations (#14278)
6 years ago
chengduo 82773477ae Add selu (#14415)
6 years ago
Tao Luo 9d29ebc010 Merge pull request #14306 from sfraczek/sfraczek/test-analyzer-mobilenet
6 years ago
minqiyang 30147d7f58 Fix expand op incorrect infer shape
6 years ago
Sylwester Fraczek d318583eb5 rename mobilenet dir to mobilenet_depthwise_conv
6 years ago
Yu Yang e5c4cf6140 Polish allocation
6 years ago
Yihua Xu 03ccb9a461 Optimize the stack operator
6 years ago
tensor-tang ee2a7f1b8c refine exp and fix error on avx
6 years ago
tensor-tang 1e06a32a0d add vexp jitcode of size 8
6 years ago
tensor-tang 2354409601 Merge pull request #14374 from tensor-tang/fea/jit/act
6 years ago
Yu Yang 0d6718fcbd Pass compile
6 years ago
Tao Luo 1d867805b0 rollback analyzer_seq_conv1_tester
6 years ago
Tao Luo 5ef123c778 Merge branch 'develop' into dam_fc
6 years ago
dzhwinter d3aed98d86 Merge pull request #14320 from wopeizl/windows/online
6 years ago
Yiqun Liu 9e6b1c5f97 Refine tester of TensorRT engine (#14390)
6 years ago
Tao Luo d3e63e6e04 Merge pull request #14412 from jczaja/prv-dam-softmax
6 years ago
Zhaolong Xing 77ac30e5fa Merge pull request #14386 from NHZlX/add_trt_plugin
6 years ago
peizhilin 0ef2a37c0e merge from develop
6 years ago
peizhilin be332a13bc Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
Xin Pan 8cfda7ee0c Merge pull request #14382 from panyx0718/fix4
6 years ago
Jacek Czaja b361579f09 - Softmax for Inference is enabled when ON_INFER is set
6 years ago
Tao Luo 980a6753a8 fix typo to pass the ci
6 years ago
Tao Luo 8f301f4618 Merge pull request #14381 from qingqing01/manylinux_v5_fix
6 years ago
nhzlx 15bdb7ef14 delete error uploaded files
6 years ago
Yu Yang d93b2d0365 Refine code
6 years ago
Sylwester Fraczek 2412c27c2b Merge branch 'develop' into sfraczek/test-analyzer-mobilenet
6 years ago
Tao Luo c7b3bfcdf1 Merge pull request #14376 from baojun-nervana/intel/ngraph_fusedop
6 years ago
peizhilin 1a9008c420 code style fix
6 years ago
Tao Luo e0d4e04bdd fix some compiler warning
6 years ago
Yu Yang ea81f8eed2 Clean interface of allocator
6 years ago
Tao Luo 8ea13e336a add in_num_col_dims for fc
6 years ago
Yu Yang 83ddafb515 Splict cicheks jobs and expose anakin options (#14327)
6 years ago
Xin Pan bae3659714 more test
6 years ago
nhzlx ddb120357c Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_trt_plugin
6 years ago
peizhilin 2abee2091b Merge branch 'windows/build' into windows/online
6 years ago
Wu Yi 9f33593910 human readable memory warns (#14361)
6 years ago
Tao Luo 9eb0ab1db3 Merge pull request #14384 from tensor-tang/refine/lrn
6 years ago
peizhilin 08d1dc84a9 fix
6 years ago
peizhilin 42c48c3a82 fix
6 years ago
peizhilin 447bf7c80b test=develop
6 years ago
peizhilin 203ec852cf Merge branch 'windows/build' into windows/online
6 years ago
peizhilin 30ddc07a7e Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
peizhilin a61909ff47 test=develop
6 years ago
Qiao Longfei e65cbd3b06 Merge pull request #14387 from jacquesqiao/lookup_sparse_table_add_test_mode
6 years ago
Qiao Longfei 6cf8f24b1b Merge pull request #14389 from jacquesqiao/fix_sgd_op_optimize_sparse_table
6 years ago
Zeng Jinle 7066b3850a Merge pull request #14395 from sneaxiy/fix_num_threads_in_fast_pe
6 years ago