Commit Graph

4879 Commits (53433d7f2e66a1b1bf66be51b7605dba40ba6336)

Author SHA1 Message Date
nhzlx faeb9b8aa9 fix compile rely problem
7 years ago
chengduo a8d3aaae2a print output log warning (#14497)
7 years ago
Tao Luo eb9b9becdc add warm up in TestMultiThreadPrediction
7 years ago
Tao Luo 5cc7946313 Merge pull request #14499 from luotao1/disable_openblas_test
7 years ago
Houjiang Chen 10ae3ba486 Merge pull request #14493 from hjchen2/develop
7 years ago
nhzlx 2a84054372 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_trt
7 years ago
nhzlx b742d46520 fix demo ci bug on trt
7 years ago
Houjiang Chen 33c65517fd Update CMakeLists.txt test=develop
7 years ago
Tao Luo 1d3e9bde1e Merge pull request #14488 from yihuaxu/develop_7a64d48f5_stack_opt
7 years ago
Houjiang Chen 01bda73116 Update CMakeLists.txt
7 years ago
Tao Luo 09ee266f8e disable two openblas test temporary
7 years ago
hjchen2 2c2a192eb1 Resolve merge conflicts
7 years ago
Yiqun Liu 8bc1c5d2ab Implement the Tensorrt plugin for elementwise op (#14487)
7 years ago
tensor-tang 7aa3aff338 Merge pull request #14465 from tensor-tang/fea/jit/exp
7 years ago
Tao Luo 1b894e495f Merge pull request #14437 from jczaja/prv-softmax-mkl
7 years ago
chengduo a94a7355f0 Refine the GraphNum check (#14144)
7 years ago
Yihua Xu a906a361be Add the macro for NVCC (test=develop)
7 years ago
Yihua Xu d91740acb1 Revert "Remove the remnant code (test=develop)"
7 years ago
Yihua Xu be50670348 Remove the remnant code (test=develop)
7 years ago
hjchen2 1622cb9937 Fix alpha tensor key
7 years ago
hjchen2 a8c077df7c Implement leaky relu tensorRT converter
7 years ago
qingqing01 9eefd2c766 Modify some infer-shape about detection operators in compile-time. (#14483)
7 years ago
Tao Luo cf685f361b Merge pull request #14458 from tpatejko/tpatejko/mkldnn-skip-connections
7 years ago
Yihua Xu f4c869d872 Optimize the layer_norm operator with AVX intrinsic function (#14417)
7 years ago
Houjiang Chen 816b464037 Merge pull request #14486 from hjchen2/develop
7 years ago
Yu Yang f1a392a5fe Merge pull request #13804 from sneaxiy/rewrite_allocation
7 years ago
Yihua Xu f418f552df Merge branch 'develop' into develop_7a64d48f5_stack_opt (test=develop)
7 years ago
hjchen2 2825685f2a Fix tensorrt plugin cmake dependency, test=develop
7 years ago
Superjomn e878a8e885 update
7 years ago
qingqing01 fd7e643153 Convolution fusion operator. (#14449)
7 years ago
Yu Yang 98bbfc17be Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
Yu Yang 7486b0ddec fix(Mac): fix unittest of macos
7 years ago
Yu Yang d424115f9e Clean code
7 years ago
Wu Yi d7bd0361cb fix dist deps (#14471)
7 years ago
Yu Yang b12c77dae2 Fix unittests
7 years ago
Jacek Czaja 9b0eae3023 - Removing partial specialization of sotmax for inference for GPU
7 years ago
tensor-tang a19b3225a1 fix jitcode small size
7 years ago
Jacek Czaja be80bb4f28 - Fix to GPU
7 years ago
tensor-tang 4dbdfa60ef sigmoid and tanh support all size
7 years ago
tensor-tang ccb8963705 refine exp jitcode with all size
7 years ago
tensor-tang d3eae8f61b refine relu and fix addrelu test
7 years ago
tensor-tang 4e67fe6a12 refine act and vxx with all size
7 years ago
tensor-tang ba3eaed7a7 exp support all size
7 years ago
tensor-tang 1ffce8c0ae fix build error on noavx
7 years ago
superjomn 4bf6817cbc fix gpu load model
7 years ago
Michal Gallus c69c41604e MKLDNN elementwise_mul: Move Kernel to KernelPool to avoid segfaults
7 years ago
Michal Gallus 785066eb8a MKLDNN elementwise_mul: Check if AVX512 is available
7 years ago
Michal Gallus 08f63c4d12 MKLDNN elementwise_mul: Lint changes to UT & integration
7 years ago
Michal Gallus 49b09327f6 MKLDNN elementwise_mul: Reorder on non-nchw input, fallback on non-16 divisable fm
7 years ago
Michal Gallus d14858e4ba MKLDNN elementwise_mul: Parallelize mul
7 years ago
Michal Gallus ed31936ba1 MKLDNN elementwise_mul: Support NCHW, update UT
7 years ago
Michal Gallus 4e54ab76ec Add HasAttr method to Operator
7 years ago
Tomasz Patejko 700bcbf74f MKLDNN elementwise_mul: h and w loops implemented in xbyak
7 years ago
Tomasz Patejko ad09facafe MKLDNN elementwise_mul: CPU tests initially refactored. MKLDNN mul test for broadcast added
7 years ago
Tomasz Patejko 2d73ad180a MKLDNN elementwise_mul: simple xbyak version for AVX512
7 years ago
Tomasz Patejko 213ec37d6a MKLDNN elementwise_add: simple initial implementation of the operator for MKLDNN format
7 years ago
Wu Yi a2d9b34417 Refine operator cmake (#14413)
7 years ago
Tomasz Patejko 53da846d1e MKLDNN residual connections fuse pass: initial implementation of fusion for projection pass
7 years ago
tensor-tang 7f17e561d7 Merge pull request #14423 from tensor-tang/fea/jit/act
7 years ago
Jiabin Yang 28bd5b7bad fix space_to_depth_op unicode problem (#14430)
7 years ago
Jacek Czaja 513bb6c151 Squashing MKL based softmax for inference
7 years ago
nhzlx 9b64aac41f add macro for pool2dDirectCUDAFunctor
7 years ago
Tomasz Patejko dbc4fcd722 MKLDNN residual connections fuse pass: unit tests enabled and added
7 years ago
Tomasz Patejko 4224089354 MKLDNN residual connections fuse pass: Maybe removed and boost::optional used where it makes sense
7 years ago
Tomasz Patejko 86fd3b32be MKLDNN residual connections fuse pass: counting statistics added to the pass
7 years ago
Tomasz Patejko ee6f778beb MKLDNN residual connections fuse pass: further refactoring
7 years ago
Tomasz Patejko 7423748e37 MKLDNN residual connections fuse pass:
7 years ago
nhzlx 8f9a8c455a delete unused test code.
7 years ago
whs 1722678258 Make nce support more distribution. (#13549)
7 years ago
nhzlx 83f8c403a7 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into fix_avg_pool_trt_bug
7 years ago
nhzlx b969116988 fxi avg pool trt bug and fix cpplint
7 years ago
tensor-tang 1f00723fa3 exp, sigmoid, tanh jitcode support more size
7 years ago
Yu Yang 19e669a992 Add legacy_allocator
7 years ago
Zhaolong Xing 2f27c048cc Merge pull request #14440 from hjchen2/develop
7 years ago
Yu Yang 1cb7e7dda2 fix(allocation): fix ut
7 years ago
Qiyang Min d971d5b875 Merge pull request #14431 from velconia/fix_expand_op_dim_in_compile_time
7 years ago
hjchen2 6a7b995737 Refine commit message to enable ci, test=develop
7 years ago
Wu Yi b32c13dc20 Add cudnn ctc loss (#12366)
7 years ago
tensor-tang 8cda7b3d20 Merge remote-tracking branch 'ups/develop' into fea/jit/act
7 years ago
tensor-tang e2d6eddd32 remove ComputeDeprecated
7 years ago
Yan Chunwei 7796f65f89 fix inference on gpu out of mem (#14414)
7 years ago
tensor-tang 64f7516aee fix lrn on mac (#14426)
7 years ago
Yu Yang c8f6e70ab4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
hjchen2 413f5948b2 Fix code style
7 years ago
hjchen2 21f33b4274 Complete PRelu plugin and Conv2d transpose op converter
7 years ago
tensor-tang f65ddff8d1 unify act jitcode of relu, exp, sigmoid and tanh
7 years ago
tensor-tang 6a159071b6 add vtanh jitcode of size 8
7 years ago
tensor-tang 046374bcd1 add vsigmoid jitcode of size 8
7 years ago
minqiyang 560b29ccb7 Polish code
7 years ago
minqiyang 21d6e8e8c8 Polish code
7 years ago
minqiyang 50b6e4c6bc Fix expand grad op infer shape
7 years ago
Sylwester Fraczek 8a1eeec579 add mkldnn prop_kind phase for inference-only case to pooling and activations (#14278)
7 years ago
chengduo 82773477ae Add selu (#14415)
7 years ago
Tao Luo 9d29ebc010 Merge pull request #14306 from sfraczek/sfraczek/test-analyzer-mobilenet
7 years ago
minqiyang 30147d7f58 Fix expand op incorrect infer shape
7 years ago
Sylwester Fraczek d318583eb5 rename mobilenet dir to mobilenet_depthwise_conv
7 years ago
Yu Yang e5c4cf6140 Polish allocation
7 years ago
Yihua Xu 03ccb9a461 Optimize the stack operator
7 years ago
tensor-tang ee2a7f1b8c refine exp and fix error on avx
7 years ago
tensor-tang 1e06a32a0d add vexp jitcode of size 8
7 years ago