Tao Luo
5d4d117edc
Merge pull request #14502 from qingqing01/cudnn5_fix
...
Fix compiling with cuDNN v5
7 years ago
Jiabin Yang
f7b55de9e5
Merge branch 'develop' into enhance_hierachical_sigmod_op
7 years ago
Yu Yang
e68c1fcd5a
Merge pull request #14522 from reyoung/feature/fix_op_header_deps
...
fix(Compile): fix depends error when compile op using cub
7 years ago
hjchen2
6eba5bd276
Fix direct copy and refine split ut
...
test=develop
7 years ago
Qiao Longfei
fd290c2580
fix mac compile of analysis
...
test=develop
7 years ago
hjchen2
5857fb3014
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into develop
...
test=develop
7 years ago
tensor-tang
3562051302
add gru refer code and remove redundant avx code
...
test=develop
7 years ago
JiabinYang
af9a3301da
test=develop
7 years ago
hjchen2
3e3599f3d9
Refine split tensorrt plugin
7 years ago
peizhilin
f10e196fc8
fix build issue
7 years ago
Yu Yang
6a128dea32
Merge pull request #14515 from reyoung/feature/fix_macos_build
...
fix(Macos): fix compile on macos
7 years ago
Zhaolong Xing
ad349e770f
Merge pull request #14452 from NHZlX/fix_avg_pool_trt_bug
...
fix avg pool trt bug
7 years ago
tensor-tang
f913860873
jitkernel lstm refer support peephole
...
test=develop
7 years ago
tensor-tang
2f9b5f2383
Merge branch 'develop' into fea/jit/rnn
7 years ago
JiabinYang
014e50c284
test=develop
7 years ago
peizhilin
6e66fadb95
clean up the pre-definitions on windows
7 years ago
Yu Yang
3edd32d070
fix(Compile): fix depends error when compile op using cub
...
some operators depend on cub and xxhash by header. The dependency should be declared explicitly rather than declared to pybind.
test=develop
7 years ago
Dang Qingqing
cda60311f9
Fix compiling with cuDNN v5
...
test=develop
7 years ago
peizhilin
67562a6fcd
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin
703b26e697
add profiler, parallel_executor back
7 years ago
Tao Luo
1d9b2a453c
Merge pull request #14508 from luotao1/warm_up_multi_thread
...
add warm up in TestMultiThreadPrediction
7 years ago
Yu Yang
b3364d4035
fix(Macos): fix compile on macos
...
test=develop
7 years ago
Yu Yang
a685f305f8
Merge pull request #14479 from reyoung/feature/fix_macos_ut
...
fix(Mac): fix unittest of macos
7 years ago
tensor-tang
10fb4ceefc
Merge pull request #14351 from tpatejko/tpatejko/mkldnn-elementwise_mul
...
[MKLDNN][JIT][AVX512] Elementwise Mul
7 years ago
jerrywgz
13e254faed
refine code, test=develop
7 years ago
tensor-tang
b4c826c548
Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
...
test=develop
7 years ago
tensor-tang
ce31deb7e9
refine refer code and add lstm refer code
...
test=develop
7 years ago
jerrywgz
79cec53111
add ignore index for sigmoid cross entropy with logits op, test=develop
7 years ago
nhzlx
e62872df8b
fix conflicts
7 years ago
nhzlx
a4dc1d4292
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_trt
...
test=develop
7 years ago
nhzlx
faeb9b8aa9
fix compile rely problem
7 years ago
chengduo
a8d3aaae2a
print output log warning ( #14497 )
...
test=develop
7 years ago
Tao Luo
eb9b9becdc
add warm up in TestMultiThreadPrediction
...
test=develop
7 years ago
tensor-tang
c2cfb03a72
add lstm jitcode
7 years ago
Tao Luo
5cc7946313
Merge pull request #14499 from luotao1/disable_openblas_test
...
disable two openblas test temporary
7 years ago
Houjiang Chen
10ae3ba486
Merge pull request #14493 from hjchen2/develop
...
Implement leaky relu converter from fluid to tensorRT
7 years ago
nhzlx
2a84054372
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_trt
...
test=develop
7 years ago
nhzlx
b742d46520
fix demo ci bug on trt
7 years ago
peizhilin
25adf970b2
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Houjiang Chen
33c65517fd
Update CMakeLists.txt test=develop
7 years ago
Tao Luo
1d3e9bde1e
Merge pull request #14488 from yihuaxu/develop_7a64d48f5_stack_opt
...
Optimize the stack operator
7 years ago
Houjiang Chen
01bda73116
Update CMakeLists.txt
7 years ago
Tao Luo
09ee266f8e
disable two openblas test temporary
...
test=develop
7 years ago
hjchen2
2c2a192eb1
Resolve merge conflicts
...
test=develop
7 years ago
Yiqun Liu
8bc1c5d2ab
Implement the Tensorrt plugin for elementwise op ( #14487 )
...
* Initialize the elementwise plugin.
* Implement the basic CUDA kernel of elementwise plugin.
test=develop
7 years ago
tensor-tang
7aa3aff338
Merge pull request #14465 from tensor-tang/fea/jit/exp
...
jitcode act support all size
7 years ago
Tao Luo
1b894e495f
Merge pull request #14437 from jczaja/prv-softmax-mkl
...
Introducing MKL to softmax for inference
7 years ago
peizhilin
3f73c0a70d
fix the build issue on windows
7 years ago
chengduo
a94a7355f0
Refine the GraphNum check ( #14144 )
...
* refine GraphCheck
test=develop
* fix ci fail
test=develop
7 years ago
peizhilin
3a72a634cf
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yihua Xu
a906a361be
Add the macro for NVCC (test=develop)
7 years ago
Yihua Xu
d91740acb1
Revert "Remove the remnant code (test=develop)"
...
This reverts commit be50670348.
7 years ago
Yihua Xu
be50670348
Remove the remnant code (test=develop)
7 years ago
hjchen2
1622cb9937
Fix alpha tensor key
7 years ago
hjchen2
a8c077df7c
Implement leaky relu tensorRT converter
7 years ago
qingqing01
9eefd2c766
Modify some infer-shape about detection operators in compile-time. ( #14483 )
...
* Modify some infer-shape in compile-time.
7 years ago
Tao Luo
cf685f361b
Merge pull request #14458 from tpatejko/tpatejko/mkldnn-skip-connections
...
[WIP] Correcting and extending MKLDNN residual connection fuse pass
7 years ago
Yihua Xu
f4c869d872
Optimize the layer_norm operator with AVX intrinsic function ( #14417 )
...
* Optimize layer_norm operator with AVX intrinsic functions
* Revert the wrong modifications
* Implement the jit kernel for layer_norm operator
* Add math header file to fix the compile issue (test=develop)
* Add math header file to fix the compile issue (test=develop)
* Fixed the intrinsic header file issue (test=develop)
* Fix the conflicts (test=develop)
* Revert for CUDA compiler (test=develop)
* Fixed the cuda dependency (test=develop)
* Fix the macro issues (test=develop)
7 years ago
Houjiang Chen
816b464037
Merge pull request #14486 from hjchen2/develop
...
Fix tensorrt plugin cmake dependency, test=develop
7 years ago
peizhilin
81f750a88c
fix the dependency
7 years ago
peizhilin
ee0fd78c81
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yu Yang
f1a392a5fe
Merge pull request #13804 from sneaxiy/rewrite_allocation
...
Rewrite allocation
7 years ago
Yihua Xu
f418f552df
Merge branch 'develop' into develop_7a64d48f5_stack_opt (test=develop)
7 years ago
peizhilin
8443961a4f
add warp_ctc back
7 years ago
hjchen2
2825685f2a
Fix tensorrt plugin cmake dependency, test=develop
7 years ago
Superjomn
e878a8e885
update
...
test=develop
7 years ago
qingqing01
fd7e643153
Convolution fusion operator. ( #14449 )
...
* Convolution fusion operator.
* Clean code
test=develop
7 years ago
Yu Yang
98bbfc17be
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
Yu Yang
7486b0ddec
fix(Mac): fix unittest of macos
...
test=develop
7 years ago
peizhilin
4a6769da84
re-organize the cmake file
7 years ago
dengkaipeng
8ef6280c03
Add operator double support. test=develop
7 years ago
Yu Yang
d424115f9e
Clean code
...
test=develop
7 years ago
peizhilin
1aff40a4c6
exclude warpctc_op on windows
7 years ago
peizhilin
7d51a0e887
disable DSO by default on windows
7 years ago
peizhilin
b967e01cbe
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Wu Yi
d7bd0361cb
fix dist deps ( #14471 )
...
* fix dist deps test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
7 years ago
Yu Yang
b12c77dae2
Fix unittests
...
test=develop
7 years ago
Jacek Czaja
9b0eae3023
- Removing partial specialization of softmax for inference for GPU
...
test=develop
7 years ago
peizhilin
c59d3e83bc
test case fix
7 years ago
peizhilin
a3e952f41d
add the jit back
...
fix compile error on windows
7 years ago
tensor-tang
a19b3225a1
fix jitcode small size
...
test=develop
7 years ago
Jacek Czaja
be80bb4f28
- Fix to GPU
...
test=develop
7 years ago
tensor-tang
4dbdfa60ef
sigmoid and tanh support all size
...
test=develop
7 years ago
tensor-tang
ccb8963705
refine exp jitcode with all size
...
test=develop
7 years ago
peizhilin
1cc23ef67d
merge from paddle:develop
7 years ago
tensor-tang
d3eae8f61b
refine relu and fix addrelu test
7 years ago
tensor-tang
4e67fe6a12
refine act and vxx with all size
7 years ago
tensor-tang
ba3eaed7a7
exp support all size
7 years ago
tensor-tang
1ffce8c0ae
fix build error on noavx
...
test=develop
7 years ago
superjomn
4bf6817cbc
fix gpu load model
...
the parameters will load from CPUPlace, that will keep copying data
between CPU and GPU places.
test=develop
7 years ago
Michal Gallus
c69c41604e
MKLDNN elementwise_mul: Move Kernel to KernelPool to avoid segfaults
...
test=develop
7 years ago
Michal Gallus
785066eb8a
MKLDNN elementwise_mul: Check if AVX512 is available
...
test=develop
7 years ago
Michal Gallus
08f63c4d12
MKLDNN elementwise_mul: Lint changes to UT & integration
...
test=develop
7 years ago
Michal Gallus
49b09327f6
MKLDNN elementwise_mul: Reorder on non-nchw input, fallback on non-16 divisible fm
...
test=develop
7 years ago
Michal Gallus
d14858e4ba
MKLDNN elementwise_mul: Parallelize mul
7 years ago
Michal Gallus
ed31936ba1
MKLDNN elementwise_mul: Support NCHW, update UT
7 years ago
Michal Gallus
4e54ab76ec
Add HasAttr method to Operator
7 years ago
Tomasz Patejko
700bcbf74f
MKLDNN elementwise_mul: h and w loops implemented in xbyak
7 years ago
Tomasz Patejko
ad09facafe
MKLDNN elementwise_mul: CPU tests initially refactored. MKLDNN mul test for broadcast added
7 years ago
Tomasz Patejko
2d73ad180a
MKLDNN elementwise_mul: simple xbyak version for AVX512
7 years ago
Tomasz Patejko
213ec37d6a
MKLDNN elementwise_add: simple initial implementation of the operator for MKLDNN format
7 years ago
Wu Yi
a2d9b34417
Refine operator cmake ( #14413 )
...
* wip simplify operator framework
* wip
* wip
* done test=develop
* clean test=develop
* fix test=develop
* fix deps test=develop
* fix cpu build test=develop
* fix tensorrt build test=develop
* fix tests test=develop
* fix test=develop
* fix cpu build test=develop
7 years ago
Tomasz Patejko
53da846d1e
MKLDNN residual connections fuse pass: initial implementation of fusion for projection pass
...
test=develop
7 years ago
peizhilin
764f97deac
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin
8580b7a130
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang
7f17e561d7
Merge pull request #14423 from tensor-tang/fea/jit/act
...
jitcode act relu, exp, sigmoid, tanh
7 years ago
Jiabin Yang
28bd5b7bad
fix space_to_depth_op unicode problem ( #14430 )
...
* fix space_to_depth_op unicode problem
* test=develop
7 years ago
Jacek Czaja
513bb6c151
Squashing MKL based softmax for inference
...
test=develop
- Added profiling to softmax functors
- MKL based softmax inference op
- Fix to softmax computation via MKL
- cleaning
- Cosmetic fixes to softmax MKL
- Fix to ON_INFER lack of propagation
7 years ago
nhzlx
9b64aac41f
add macro for pool2dDirectCUDAFunctor
...
test=develop
7 years ago
Tomasz Patejko
dbc4fcd722
MKLDNN residual connections fuse pass: unit tests enabled and added
7 years ago
Tomasz Patejko
4224089354
MKLDNN residual connections fuse pass: Maybe removed and boost::optional used where it makes sense
7 years ago
Tomasz Patejko
86fd3b32be
MKLDNN residual connections fuse pass: counting statistics added to the pass
7 years ago
Tomasz Patejko
ee6f778beb
MKLDNN residual connections fuse pass: further refactoring
7 years ago
Tomasz Patejko
7423748e37
MKLDNN residual connections fuse pass:
...
* implements reachability check between identity node and non-identity argument to elementwise_add
* implements handling identity node as x and as y argument to elementwise_add
7 years ago
nhzlx
8f9a8c455a
delete unused test code.
...
test=develop
7 years ago
whs
1722678258
Make nce support more distribution. ( #13549 )
...
* Fix truncated normal.
* Fix.
* Make nce support more distribution.
* Fix API.spec.
* Fix python API.
* Fix.
test=develop
* Fix API.spec
test=develop
* Fix sampler.
* Fix order of arguments in python API.
test=develop
7 years ago
nhzlx
83f8c403a7
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into fix_avg_pool_trt_bug
...
test=develop
7 years ago
nhzlx
b969116988
fix avg pool trt bug and fix cpplint
7 years ago
tensor-tang
1f00723fa3
exp, sigmoid, tanh jitcode support more size
...
test=develop
7 years ago
Yu Yang
19e669a992
Add legacy_allocator
...
test=develop
7 years ago
Zhaolong Xing
2f27c048cc
Merge pull request #14440 from hjchen2/develop
...
Add PRelu tensorRT plugin and Conv2d transpose op converter
7 years ago
Yu Yang
1cb7e7dda2
fix(allocation): fix ut
...
test=develop
7 years ago
Qiyang Min
d971d5b875
Merge pull request #14431 from velconia/fix_expand_op_dim_in_compile_time
...
Fix expand op incorrect infer shape
7 years ago
hjchen2
6a7b995737
Refine commit message to enable ci, test=develop
7 years ago
Wu Yi
b32c13dc20
Add cudnn ctc loss ( #12366 )
...
* add cudnn ctc loss
* wip add test test=develop
* wip
* wip
* done test=develop
* move include cudnn test=develop
* test test=develop
* fix build test=develop
* fix build test=develop
* fix build on cudnn5 test=develop
* fix cudnn5 build test=develop
* fix cudnn5 build test=develop
* merge develop softmax functor change test=develop
7 years ago
peizhilin
d1a1fafc4c
code style
7 years ago
tensor-tang
8cda7b3d20
Merge remote-tracking branch 'ups/develop' into fea/jit/act
...
test=develop
7 years ago
tensor-tang
e2d6eddd32
remove ComputeDeprecated
...
test=develop
7 years ago
peizhilin
6d0d5a76eb
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin
162f2d4109
disable the openblas multi-thread on windows since no support
...
adjust the python script
7 years ago
Yan Chunwei
7796f65f89
fix inference on gpu out of mem ( #14414 )
...
* fix inference on gpu out of mem
the transfer logic in operator.cc will keep creating new scopes.
7 years ago
dengkaipeng
f115eb0d1e
enhance api. test=develop
7 years ago
tensor-tang
64f7516aee
fix lrn on mac ( #14426 )
...
* rename and fix blas vsqr
test=develop
* update
7 years ago
Yu Yang
c8f6e70ab4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
hjchen2
413f5948b2
Fix code style
7 years ago
hjchen2
21f33b4274
Complete PRelu plugin and Conv2d transpose op converter
7 years ago
tensor-tang
f65ddff8d1
unify act jitcode of relu, exp, sigmoid and tanh
7 years ago
tensor-tang
6a159071b6
add vtanh jitcode of size 8
7 years ago
tensor-tang
046374bcd1
add vsigmoid jitcode of size 8
7 years ago
minqiyang
560b29ccb7
Polish code
...
test=develop
7 years ago
minqiyang
21d6e8e8c8
Polish code
...
test=develop
7 years ago
minqiyang
50b6e4c6bc
Fix expand grad op infer shape
...
test=develop
7 years ago
Sylwester Fraczek
8a1eeec579
add mkldnn prop_kind phase for inference-only case to pooling and activations ( #14278 )
...
* add is_test to pooling and activations
add prop_kind support for layers activation. conv and pooling
add a pass that sets is_test to true
add transpiler version of is_test pass
test=develop
* patch test and pass
test=develop
* add pass to analyzer.h
test=develop
* add is_test attr description & pass only on mkldnn
in:
activation_op.cc
batch_norm_op.cc
conv_op.cc
dropout_op.cc
lrn_op.cc
pool_op.cc
sequence_pool_op.cc
softmax_op.cc
* fix is_test handling for activation pool and conv
* change description of is_test for all layers again
* remove GetAttr(use_mkldnn) from pass
* rename correct_mkldnn_test_phase to is_test
and remove dependency on MKLDNN
test=develop
* review fix magic number
* two if(..)s into one
* Check is_test once and pass mkldnn forward prop kind
* dereference shared_ptr with * (without get())
test=develop
* add is_test_pass back
test=develop
7 years ago
peizhilin
d1429ac4a5
add recordio support
7 years ago
chengduo
82773477ae
Add selu ( #14415 )
...
* add selu
* use for range
test=develop
* add API
test=develop
* follow comment
test=develop
* update API.spec
test=develop
7 years ago
dengkaipeng
95d5060ddd
fix abs -> fabs error. test=develop
7 years ago
Tao Luo
9d29ebc010
Merge pull request #14306 from sfraczek/sfraczek/test-analyzer-mobilenet
...
add test_analyzer_mobilenet
7 years ago
minqiyang
30147d7f58
Fix expand op incorrect infer shape
...
test=develop
7 years ago
Sylwester Fraczek
d318583eb5
rename mobilenet dir to mobilenet_depthwise_conv
...
test=develop
7 years ago
JiabinYang
ba9ff508e8
temp fix
7 years ago
Yu Yang
e5c4cf6140
Polish allocation
...
Clean allocation->Deleter
test=develop
7 years ago
Yihua Xu
03ccb9a461
Optimize the stack operator
7 years ago
dengkaipeng
2faa2b4048
remove cu file. test=develop
7 years ago
tensor-tang
ee2a7f1b8c
refine exp and fix error on avx
...
test=develop
7 years ago
tensor-tang
1e06a32a0d
add vexp jitcode of size 8
...
test=develop
7 years ago
tensor-tang
2354409601
Merge pull request #14374 from tensor-tang/fea/jit/act
...
add vrelu jitcode
7 years ago
Yu Yang
0d6718fcbd
Pass compile
7 years ago
Tao Luo
1d867805b0
rollback analyzer_seq_conv1_tester
...
test=develop
7 years ago
Tao Luo
5ef123c778
Merge branch 'develop' into dam_fc
7 years ago
dzhwinter
d3aed98d86
Merge pull request #14320 from wopeizl/windows/online
...
Windows/online
7 years ago
Yiqun Liu
9e6b1c5f97
Refine tester of TensorRT engine ( #14390 )
...
* Refine the tester for MixedRTPredictor.
test=develop
* Enable the profiler in TensorRT engine.
* Support the use of combined inference model in TensorRT unittest, and print the shape of feed targets.
7 years ago
Tao Luo
d3e63e6e04
Merge pull request #14412 from jczaja/prv-dam-softmax
...
Softmax for Inference is enabled when ON_INFER is set
7 years ago
Zhaolong Xing
77ac30e5fa
Merge pull request #14386 from NHZlX/add_trt_plugin
...
add plugin support for paddle-trt
7 years ago
peizhilin
0ef2a37c0e
merge from develop
7 years ago
peizhilin
be332a13bc
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Xin Pan
8cfda7ee0c
Merge pull request #14382 from panyx0718/fix4
...
Refine the pass builder and buildstrategy
7 years ago
Jacek Czaja
b361579f09
- Softmax for Inference is enabled when ON_INFER is set
...
test=develop
7 years ago
Tao Luo
980a6753a8
fix typo to pass the ci
...
test=develop
7 years ago
Tao Luo
8f301f4618
Merge pull request #14381 from qingqing01/manylinux_v5_fix
...
Fix compiling with cuDNN v5.
7 years ago
nhzlx
15bdb7ef14
delete error uploaded files
...
test=develop
7 years ago
Yu Yang
d93b2d0365
Refine code
7 years ago
Sylwester Fraczek
2412c27c2b
Merge branch 'develop' into sfraczek/test-analyzer-mobilenet
7 years ago
Tao Luo
c7b3bfcdf1
Merge pull request #14376 from baojun-nervana/intel/ngraph_fusedop
...
Adding fused operator for ngraph
7 years ago
peizhilin
1a9008c420
code style fix
...
test=develop
7 years ago
Tao Luo
e0d4e04bdd
fix some compiler warning
...
test=develop
7 years ago
Yu Yang
ea81f8eed2
Clean interface of allocator
...
Clean managed/unmanaged allocator
7 years ago
Tao Luo
8ea13e336a
add in_num_col_dims for fc
7 years ago
Xin Pan
bae3659714
more test
...
test=develop
7 years ago
nhzlx
ddb120357c
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_trt_plugin
...
merge develop and fix conflicts
7 years ago
JiabinYang
a507845a77
test=develop
7 years ago
peizhilin
2abee2091b
Merge branch 'windows/build' into windows/online
...
test=develop
7 years ago
Wu Yi
9f33593910
human readable memory warns ( #14361 )
...
* human readable memory warns test=develop
* update test=develop
* refine test=develop
* fix build test=develop
7 years ago
Tao Luo
9eb0ab1db3
Merge pull request #14384 from tensor-tang/refine/lrn
...
Refine lrn cpu forward
7 years ago
peizhilin
08d1dc84a9
fix
7 years ago
peizhilin
42c48c3a82
fix
7 years ago
peizhilin
447bf7c80b
test=develop
7 years ago
peizhilin
203ec852cf
Merge branch 'windows/build' into windows/online
...
test=develop
7 years ago
peizhilin
30ddc07a7e
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin
a61909ff47
test=develop
7 years ago
Qiao Longfei
e65cbd3b06
Merge pull request #14387 from jacquesqiao/lookup_sparse_table_add_test_mode
...
Lookup sparse table add test mode
7 years ago
Qiao Longfei
6cf8f24b1b
Merge pull request #14389 from jacquesqiao/fix_sgd_op_optimize_sparse_table
...
sgd_op optimize selected rows do not enforce id < height
7 years ago
Zeng Jinle
7066b3850a
Merge pull request #14395 from sneaxiy/fix_num_threads_in_fast_pe
...
Fix num_threads settings in fast_pe
7 years ago
Xin Pan
10ab177f89
Merge pull request #14403 from PaddlePaddle/revert-14337-prv-dam-softmax
...
Revert "Softmax op optimization for inference "
7 years ago
Yan Chunwei
9f252e0032
Combine Inference Analysis with IR ( #13914 )
7 years ago
Tao Luo
5b9c62faee
Revert "Softmax op optimization for inference "
7 years ago
baojun-nervana
51a538e055
Fix style and use enum
...
test=develop
7 years ago
Tao Luo
6490bb2765
Merge pull request #14337 from jczaja/prv-dam-softmax
...
Softmax op optimization for inference
7 years ago
nhzlx
0b96268057
fix comments
...
test=develop
7 years ago
Zeng Jinle
38d32c98b8
merge develop
...
test=develop
7 years ago
sneaxiy
eb18d532a5
fix num_threads in fast_pe
...
test=develop
7 years ago
chengduo
9f68e9a7fe
fix auc op ( #14385 )
...
test=develop
7 years ago
dengkaipeng
a0284f6fbc
Add backward CPU kernel. test=develop
7 years ago
Dang Qingqing
d219818434
Fix compiling in cuDNN v5.
...
test=develop
7 years ago
Qiao Longfei
efb5c03f60
sgd_op optimize selected rows do not enforce id < height
...
test=develop
7 years ago
Qiao Longfei
51f3838f96
add log for not exist code
...
test=develop
7 years ago
Qiao Longfei
7aa8b2ccf2
optimize code
7 years ago
JiabinYang
db06568e69
test=develop
7 years ago
nhzlx
e5bf8616f0
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_trt_plugin
...
test=develop
7 years ago
nhzlx
d38fd6a0fc
add plugin support and offer a simple split sample
7 years ago
Qiao Longfei
8d205c853c
add is_test for lookup_sparse_table
7 years ago
tensor-tang
b4dfba1779
refine lrn_op cpu forward and speedup
...
test=develop
7 years ago
tensor-tang
1be85d011d
add mkl vsqr and vpow
7 years ago
JiabinYang
f4be1d99d0
polish code and test
7 years ago
Yibing Liu
6c7b64cc20
Support softmax return in softmax_with_cross_entropy ( #14367 )
...
* Support softmax return in softmax_with_cross_entropy
* Add test for return_softmax=False
test=develop
7 years ago
baojun-nervana
ea3538d8dd
Added fused operator
...
test=develop
7 years ago
Xin Pan
759ffca423
some improvements
...
test=develop
7 years ago
Xin Pan
99dffb91d6
allow to repeatedly share and update BuildStrategy
...
test=develop
7 years ago
Tao Luo
6c32945556
Merge pull request #14372 from luotao1/speedup_analysis
...
speedup DetectPatterns
7 years ago
ruri
4a55fb5f5b
Add density_prior_box_op ( #14226 )
...
Density prior box operator for image detection model.
7 years ago
nhzlx
2d7134bc37
add initial code for plugin
7 years ago
tensor-tang
0043c42b3e
add vrelu jitcode
...
test=develop
7 years ago
dengkaipeng
36c46152e1
Add unittest for yolov3_loss. test=develop
7 years ago
dengkaipeng
77c1328fa7
add CPU kernel forward
7 years ago
dengkaipeng
5d0b568ecb
Add YOLOv3 loss operator. test=develop
7 years ago
Tao Luo
668ae523d2
speedup DetectPatterns
...
test=develop
7 years ago
JiabinYang
b8ff0972b6
test=develop
7 years ago
JiabinYang
32e05b01f2
test=develop
7 years ago
Yu Yang
02631965c8
Refine
7 years ago
Yan Chunwei
9a6e239281
fix mac graph detector sort ( #14356 )
7 years ago
Qiao Longfei
c27554ac33
Merge pull request #14336 from jacquesqiao/add_bilinear_tensor_product_layer
...
add bilinear_tensor_product layer
7 years ago
Tao Luo
991a9f7b72
Merge pull request #14358 from NHZlX/add_serial_and_filter_logs
...
Set serial for tensorrt utest and set unused logs invisible.
7 years ago
peizhilin
bb3f6bd31c
Merge branch 'windows/build' into windows/online
...
test=develop
7 years ago
peizhilin
1b75fd2236
revert
7 years ago
peizhilin
61fa5218b9
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yibing Liu
bd2943788b
Fix gather & stack op ( #14355 )
...
* Add int type support for stack_op
* Improve gather op to support index with shape N x 1
test=develop
* Fix stack_op kernel's registry
test=develop
7 years ago
Yu Yang
8f9bfad246
perf(compile): speed up reduce_op compile by splitting files ( #14294 )
...
test=develop
7 years ago
nhzlx
397de907ed
merge develops
...
test=develop
7 years ago
nhzlx
d6ff006903
add serial to trt test and do not print log for unused trt logs
7 years ago
peizhilin
13bfee1f85
Merge branch 'windows/build' into windows/online
...
test=develop
7 years ago
peizhilin
7840d181c9
fix style issue
7 years ago
peizhilin
dc339b78d7
fix code style
7 years ago
sneaxiy
d231e55065
merge develop
...
test=develop
7 years ago
sneaxiy
cf8d2e67e3
clean buffered_allocator
7 years ago
peizhilin
ef8a7db81e
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
JiabinYang
c8801e100f
grad diff problem to be fixed and need api spec change to be done
7 years ago
Jacek Czaja
03299ed46c
- Fix to linking for GPU builds of softmax inference
...
test=develop
7 years ago
Jacek Czaja
0756343767
- Fix GPU compilation
...
test=develop
7 years ago
Jacek Czaja
d332326847
- Added unit tests for softmax is_test=True op
...
test=develop
7 years ago
Jacek Czaja
c1fccc29c1
- Noise adding removed for Test phase of softmax
7 years ago
Tao Luo
573e68eb40
Merge pull request #14348 from luotao1/speedup_analysis
...
skip mkldnn related pass when use_mkldnn=false to speedup analysis
7 years ago
peizhilin
9b558a8035
Merge branch 'windows/build' into windows/online
...
test=develop
7 years ago
peizhilin
7638f0afb3
simplify the logic
7 years ago
peizhilin
d01a26280e
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Xin Pan
ff28b1ffc0
Merge pull request #14071 from barrierye/add_similarity_focus_op
...
Add similarity focus op
7 years ago
li099
688ed60116
Add lod tensor array to tensor op ( #13990 )
...
* add lod tensor array concat
* add lod tensor array concat
* test=develop
* add lod tensor array concat
test=develop
* Fix API.spec
test=develop
* add lod tensor array concat
test=develop
* revise some bug of lod tensor array concat
test=develop
* add unittest for tensor array concat
test=develop
* change to tensor array to tensor
test=develop
* revise bug
test=develop
* revise a bug
test=develop
* revise a bug
test=develop
* revise a bug of python3
test=develop
7 years ago
peizhilin
6c2b891d87
Merge branch 'windows/build' into windows/online
...
test=develop
7 years ago
peizhilin
e23061e0dc
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
chengduo
6c6e638550
Add InferVarType for some op ( #14201 )
...
* add_infer_var_type
test=develop
* InferVarTypeHelper-> VarTypeInferenceHelper
test=develop
* PassInputTypeAndDTypeOnOutput
test=develop
* follow comment
test=develop
7 years ago
peizhilin
664a4e010c
Merge branch 'windows/build' into windows/online
...
test=develop
7 years ago
peizhilin
1eec5a428f
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Kaipeng Deng
0b38822624
Merge pull request #14345 from heavengate/fix_grid_sampler
...
fix #14344 : win compile error, EigenTenor * float unsupported. test=develop
7 years ago
peizhilin
6f9c70acb7
Merge branch 'windows/build' into windows/online
...
test=develop
7 years ago
peizhilin
ca60e1d34d
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Tao Luo
433fc7c1d4
skip mkldnn related pass when use_mkldnn=false
...
test=develop
7 years ago
peizhilin
4bd0c4c5ee
test=develop
7 years ago
peizhilin
350f1f3971
remove duplicate function definition
7 years ago
peizhilin
4b1f1a8787
fix merge issue
7 years ago
peizhilin
d08334011a
fix merge issue
7 years ago
Yu Yang
6ae0b91b39
Clean LockGuardPtr
...
test=develop
7 years ago
peizhilin
52f7644f53
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yu Yang
1420c3b155
Add enum AllocatorStrategy
...
test=develop
7 years ago
Qiyang Min
698698f2fa
Merge branch 'develop' into fix_vlog
7 years ago
qingqing01
abe209234f
Exhaustive search for cuDNN conv. ( #14286 )
...
* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.
* Fix compiling test=develop
7 years ago
Yu Yang
b59a9bfb7c
Clean buffered_allocator
...
test=develop
7 years ago
Kaipeng Deng
f215534ecf
Merge pull request #14205 from heavengate/nearest_interp
...
Add interpolate operator replace bilinear_interp_op and add nearest neighbor interp mode
7 years ago
dengkaipeng
72108d8dbe
fix win compile error: EigenTenor * float unsupported. test=develop
7 years ago
Yu Yang
26fb34c365
Merge develop tiny fix
7 years ago
Yu Yang
fdc689142c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
Yu Yang
7ffc9fd839
Merge branch 'rewrite_allocation' of https://github.com/sneaxiy/Paddle into rewrite_allocation
7 years ago
tensor-tang
22125ebaef
Merge pull request #14321 from tensor-tang/fea/jit/vscal
...
Fea jitcode vscal vaddbias
7 years ago
Tao Luo
f1046d7e37
Merge pull request #14335 from wojtuss/wojtuss/add-graph-viz
...
added additional call to graph_viz_pass
7 years ago
Tao Luo
34e9e59f4a
Merge pull request #14333 from kbinias/change-hardcoded-format-and-bump-mkldnn-version
...
Changed hardcoded format to any in convolution and bumped MKL-DNN version to 0.17rc
7 years ago
Qiao Longfei
3f91e0f001
update API.spec
...
test=develop
7 years ago
Sylwester Fraczek
b5f617fa9b
make mobilenet test reuse resnet50 test
7 years ago
Sylwester Fraczek
1987d45e75
add comment for depthwise pass
7 years ago
minqiyang
87450b9ad4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
...
test=develop
7 years ago
peizhilin
41b423d41b
remove duplicate
7 years ago
peizhilin
dcfab11193
merge from develop
7 years ago
peizhilin
4ffa92d4f0
Merge branch 'develop' into windows/build
7 years ago
chengduo
c5b6573a5a
Fix input<tensor> ( #14208 )
...
* fix input<tensor>
test=develop
* fix split_ids
test=develop
* ElementwiseMul should not support SelectedRows
* fix scale op
test=develop
* change GetTensorFromVar() method to GetTensorOrSelectedRowsFromVar()
* fix operator
* refine MultiOutput
* fix MultiOutput
test=develop
* disable test_dist_save_load
test=develop
* fix elementwise_op
test=develop
* add get_sparse_as_op
test=develop
* add info for check
test=develop
* rename get_sparse_as_op with extract_rows_as_op.
test=develop
* elementwise doesn't support selected_rows
* fix regularizer
* remove extract_rows_as
test=develop
* fix ci
test=develop
* add test for sum_op
* fix regularizer
test=develop
* test=develop
* fix pserver weight decay multi inputs test=develop
7 years ago
Krzysztof Binias
f1c1acf1ac
Changed hardcoded format to any in convolution and bumped MKL-DNN version to 0.17-rc
...
test=develop
7 years ago
Tao Luo
813e54efbd
Merge pull request #14328 from PaddlePaddle/revert-14046-windows/debug
...
Revert "cherry picked windows patches."
7 years ago
minqiyang
3db9fad764
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
...
test=develop
7 years ago
minqiyang
3da43dcae2
Because anakin does NOT use glog, we revert the anakin-related change
...
test=develop
7 years ago
Tao Luo
387610aae1
Merge pull request #14325 from luotao1/fix_test_analysis_predictor
...
fix test_analysis_predictor
7 years ago
peizhilin
45125ba538
fix share library issue
7 years ago
Xin Pan
b03a44e062
Merge pull request #14026 from JiabinYang/add_reorg_op
...
Add reorg op
7 years ago
Xin Pan
ff6c809bfc
Merge pull request #14251 from panyx0718/fix
...
Make OpHandle/VarHandle and ir::Node work cleaner
7 years ago
Zhaolong Xing
ba8b5619a3
Revert "cherry picked windows patches."
7 years ago
minqiyang
49710960ef
Revert tensor_util.cu
...
test=develop
7 years ago
minqiyang
fcc0452c8b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
...
test=develop
7 years ago
Tao Luo
381bea0a16
fix test_analysis_predictor
...
test=develop
7 years ago
minqiyang
0c3227a523
Change the original VLOG level to 10 times its value
...
Fix code to support cpplint syntax check
test=develop
7 years ago
tensor-tang
5e64244f25
add vaddbias jitcode
...
test=develop
7 years ago
tensor-tang
5f7956ae59
Merge remote-tracking branch 'ups/develop' into fea/jit/vscal
7 years ago
Xin Pan
59c66532e7
add more logs and comments
...
test=develop
7 years ago
dzhwinter
1f4a434302
Merge pull request #14046 from dzhwinter/windows/debug
...
cherry picked windows patches.
7 years ago
peizhilin
869487a2b7
Merge remote-tracking branch 'origin/develop' into windows/build
7 years ago
Wojciech Uss
7fd640b882
added additional call to graph_viz_pass
...
test=develop
7 years ago
tensor-tang
3d950a812d
combine jitcode of vscal
7 years ago
tensor-tang
03e11f3fc9
add vscal jitcode
7 years ago
Qiao Longfei
5b7a9dd7ac
Merge pull request #13815 from jacquesqiao/optimize-pyreader
...
optimize pyreader
7 years ago
dzhwinter
234a1d9248
Merge remote-tracking branch 'origin/develop' into windows/debug
...
test=develop
7 years ago
chengduo
a270fdf2db
Fix SelectedRowsAdd bug (#14309)
...
* fix selected_rows bug
test=develop
* refine cos_sim
test=develop
7 years ago
Qiao Longfei
ce994190ab
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pyreader
...
test=develop
7 years ago
tensor-tang
2f0a379af7
Merge pull request #14307 from tensor-tang/fix/mac
...
fix mac
7 years ago
Zeng Jinle
b2af213009
Merge pull request #14292 from sneaxiy/delete_buggy_selected_rows_functor
...
Delete buggy selected_rows functor
7 years ago
tensor-tang
161ba9c9d1
fix mac
...
test=develop
7 years ago
Sylwester Fraczek
f395075efc
rebased and stuff broke
7 years ago
tensor-tang
e8642c3c1f
Merge pull request #14265 from tensor-tang/fea/jit/vadd
...
add vadd, vaddrelu jitcode
7 years ago
Sylwester Fraczek
a60957f386
add test_analyzer_mobilenet
7 years ago
dengkaipeng
8b47d90f5d
add 'actual_shape' attribute. test=develop
7 years ago
tensor-tang
382307b943
refine code
...
test=develop
7 years ago
tensor-tang
3319072858
fix jit kernel test on mac
...
test=develop
7 years ago
tensor-tang
44cb70c088
Merge remote-tracking branch 'ups/develop' into fix/mac
7 years ago
Yu Yang
c774bcbd2d
Merge device_context
...
test=develop
7 years ago
Yu Yang
057a682ee9
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
Yu Yang
c28beb8a3c
test(Pe): add dry run tests for pe (#14254)
...
Dry run tests skip `Op.Run` and only perform job scheduling, which helps analyze deadlocks in PE.
test=develop
7 years ago
tensor-tang
c9730d33d9
fix run error on mac
...
test=develop
7 years ago
Xin Pan
80132933b7
Merge pull request #14281 from luotao1/face
...
refine analysis_resnet50_tester
7 years ago
Qiao Longfei
e0c8397426
Merge pull request #14257 from jacquesqiao/optimize-pserver-profiler-thread-pool
...
clean rpc server profiler
7 years ago
chengduo
ffc866159f
hot fix log (#14293)
...
test=develop
7 years ago
Zhaolong Xing
65b61db10a
Merge pull request #13927 from NHZlX/fix_googlenet_bug_with_rule
...
Fix googlenet bug with rule
7 years ago
tensor-tang
25e070ecc7
Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
barrierye
ef8218be22
update docs test=develop
7 years ago
Tao Luo
eea36739cc
refine test_helper.h
...
test=develop
7 years ago
Qiao Longfei
6449faec37
Merge pull request #14259 from jacquesqiao/optimize-thread-pool
...
Optimize thread pool
7 years ago
sneaxiy
9518bc8d0a
delete buggy selected_rows functor
...
test=develop
7 years ago
chengduo
a9b5d42dd4
Add fp16 backward support (#14202)
...
* add fp16 backward support
test=develop
* add sum_op fp16 test
* disable test_dist_save_load
test=develop
* add check_grad for sum
* add unit test for softmax_grad fp16
test=develop
* add scale_op unit test
* add mul_grad_op unit test for fp16
* add cross_entropy_grad and eman_grad unit test for fp16
test=develop
* fix cross_entropy unit test
* add pool2d fp16 unit test
* refine conv2d fp16 unit test
test=develop
* refine activation unit test
test=develop
* fix ci
test=develop
* follow zhihong's comment, copy from https://github.com/PaddlePaddle/Paddle/pull/12796
test=develop
7 years ago
Qiao Longfei
3b8dd9ebbd
optimize code test=develop
7 years ago
Tao Luo
2b791f1f63
unify analyzer_face_tester to analyzer_resnet50_tester
...
test=develop
7 years ago
Qiao Longfei
2921f8a79c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
7 years ago
Tao Luo
1ead9318d5
remove unused code in test_helper.h to pass ci
...
test=develop
7 years ago
Qiao Longfei
4062f00f2a
optimize thread pool code
...
test=develop
7 years ago
dzhwinter
2835e04409
merge develop branch. test=develop
7 years ago
dzhwinter
deb4af70ef
add test
7 years ago
qingqing01
db8c52da5e
Revert "Exhaustive search for cuDNN conv. (#14043)"
...
This reverts commit ce7d9b0799.
7 years ago
qingqing01
ce7d9b0799
Exhaustive search for cuDNN conv. (#14043)
...
* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Clean code
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.
7 years ago
tensor-tang
cb4083b9fa
fix compile error
...
test=develop
7 years ago
tensor-tang
dd343a4971
Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago