nhzlx
084310f536
paddle-anakin: concat, split, pool2d converter#16003
6 years ago
flame
be523baad2
Add anakin conv2d/relu/sigmoid/tanh converter ( #15997 )
...
* add activation op
* test conv2d relu sigmoid tanh
6 years ago
Yan Chunwei
d0ce6a9044
fix anakin converter registry ( #15993 )
6 years ago
luotao1
82af8031d9
add runtime_context_cache_pass
...
test=develop
6 years ago
Tao Luo
7d2740db83
Revert "cache runtime_context"
6 years ago
Jacek Czaja
13816dd4ac
[MKL-DNN] Fix to crash of Transformer when mkldnn is to be used ( #16233 )
...
* - Fix to crash of Transformer when mkldnn is to be used
Desc: TensorCopy was not setting MKLDNN primitive descriptor when layout was to be kMKLDNN
test=develop
* - Enable transformer for mkl-dnn
test=develop
* - Compilation fix
test=develop
* - Removed manual selection of MKL-DNN ops to be used in Transformer test
test=develop
6 years ago
Tao Luo
dbb92ee4b1
Merge pull request #16002 from luotao1/runtime_context
...
cache runtime_context
6 years ago
Qiyang Min
8e4ad008fb
Merge pull request #16198 from velconia/imperative_train_speed
...
Improve imperative mode training speed
6 years ago
luotao1
a275fd6e0c
Merge branch 'develop' into runtime_context
6 years ago
Wojciech Uss
2579ade45f
Add cpu_quantize_pass for C-API quantization ( #16127 )
...
* Add cpu_quantize_pass for C-API quantization
test=develop
* add cpu_quantize_pass test
* fix lint: add include memory unordered_map and unordered_set
test=develop
* fuse_relu 1
test=develop
* tuned 2 without squash
* fixes
test=develop
* remove unused vars
test=develop
* refactored
test=develop
* fix lint c-style cast -> C++ style cast
test=develop
* remove QuantMax and c style casts
test=develop
* last usage of QuantMax removed
test=develop
* Fix Analysis Predictor UT
Check if memory_optimize_pass has already been added
to the analysis config before adding a new one, so
that it is not added multiple times.
test=develop
* change map to unordered_map
fix the forgotten part of cpu_quantize_pass_tester.cc
test=develop
* removed quantized attribute
* fixed cpu_quantize_pass_tester and op attr comments
test=develop
* removed redundant line
test=debug
* removed gmock
test=develop
* fix after merge
6 years ago
luotao1
5ecdc49c6b
set enable_runtime_context_cache_ default false
...
test=develop
6 years ago
minqiyang
7355d41834
1. Add imperative gperf profiler
...
2. Add binutils 2.27 in manylinux support
test=develop
6 years ago
minqiyang
98dfb492bb
Release GIL lock
6 years ago
minqiyang
42e96a029f
Accelerate CPU part
6 years ago
luotao1
1510b866b6
turn off runtime_context_cache for tensorrt
...
test=develop
6 years ago
luotao1
d94fd97230
add runtime_context_cache_pass
...
test=develop
6 years ago
fc500110
1c6e72b905
remove visualizer, which can be replaced by python IrGraph draw API
6 years ago
Tao Luo
c49b7855fa
Merge pull request #16120 from Xreki/fix_cmake_compress
...
Change the download and compress command of cmake.
6 years ago
Liu Yiqun
4e052e0ac9
Disable inference download for WIN32 temporary.
...
test=develop
6 years ago
luotao1
1283833395
zero_copy tensor support INT32
...
test=develop
6 years ago
luotao1
31c4e1d9fc
Merge branch 'develop' into zero_copy
6 years ago
luotao1
9e2c7e69fb
simplify the zero_copy tests
...
test=develop
6 years ago
luotao1
aeee4cbe71
add compare between zerocopy and analysis
6 years ago
Liu Yiqun
6bb84b74b2
Change the download and compress command of cmake.
...
test=develop
6 years ago
Tao Luo
25ca2ca001
change init_idx to INT32 in transformer_test
...
test=develop
6 years ago
Tao Luo
e5e7e9b865
Merge branch 'develop' into transformer_ut
6 years ago
Tao Luo
6f2581e4c5
Merge pull request #16090 from lidanqing-intel/paddle-int32
...
Add PaddleDType INT32 support
6 years ago
Zhaolong Xing
3d63aa0a11
Merge pull request #15729 from NHZlX/add_static_model_load_for_trt
...
Four points for enhancing Paddle-TRT
6 years ago
nhzlx
a9ed427749
cannot pass ci
...
add if use static engine for trt
test=develop
6 years ago
luotao1
fad06cb928
unify ZeroCopy in analysis_test
6 years ago
lidanqing
4aeb261da9
Add INT32 support. INT32 in last switch case
...
test=develop
6 years ago
luotao1
06aab1b493
refine SetCpuMathLibraryNumThreads
...
test=develop
6 years ago
nhzlx
3c40cb767b
7 refine zero copy
...
update trt in docker file
test=develop
6 years ago
Yiqun Liu
1616c32acf
Add the include of cudnn.h to enable the use of CUDNN_VERSION. ( #15961 )
...
test=develop
6 years ago
flame
b187e3728e
add anakin fc op converter ( #15965 )
6 years ago
flame
e40d56c3d3
anakin subgraph engine ( #15774 )
...
* add anakin subgraph engine
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* add initial op converter
* update
* update
* fix op register compile error
* update
test=develop
* update
6 years ago
nhzlx
2eff3e26b6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt
6 years ago
nhzlx
06a088a199
fix comments and fix cpplint
...
test=develop
6 years ago
nhzlx
0ed63b2108
6. delete useless predictor id
...
test=develop
6 years ago
nhzlx
1d5ef7c9ee
5. add static trt load model
...
1). add static trt load model
2). fix bug: when device_id is not 0, the trt will have a bug
test=develop
6 years ago
Tao Luo
4774dad806
Merge pull request #15857 from sfraczek/fix-typo
...
Fix few typos
6 years ago
Tao Luo
e3dd6970fc
disable dam temporarily ( #15860 )
...
test=develop
6 years ago
Sylwester Fraczek
1943119fc5
fix typo memeroy->memory
...
test=develop
6 years ago
Sylwester Fraczek
8bc604571f
fix typo seriazlized->serialized
6 years ago
Sylwester Fraczek
543e53db05
fix typo releated->related
6 years ago
Dun
a83e470405
Profiler refine and add CUDA runtime api tracer ( #15301 )
...
* refine profiler && add runtime tracer
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* fix bug && test=develop
* add thread id map && test=develop
* test=develop
* testing
* bug fix
* remove cuda event && refine code && test=develop
* test=develop
* test=develop
* test=develop
* fix windows temp file && test=develop
* test=develop
* fix windows bug && test=develop
* fix start up issue && test=develop
* code polish && test=develop
* remove unused code && test=develop
* add some cupti cbid && test=develop
* add FLAGS_multiple_of_cupti_buffer_size && test=develop
* fix compile error && test=develop
* add keyword && test=develop
* fix && test=develop
* code polish && test=develop
6 years ago
Yiqun Liu
e38dd91f04
Refine cmake's download function. ( #15512 )
...
* Refine cmake's download function.
test=develop
* Set DOWNLOAD_NO_EXTRACT to 1 pure download function.
test=develop
* Fix unpack problem in ExternalProject_Add; it seems the DOWNLOAD_NO_EXTRACT option is not supported in cmake-3.5.
test=develop
6 years ago
tensor-tang
e1c707fe9c
fix warnings ( #15790 )
...
* fix warnings
test=develop
* fix enforce test
test=develop
6 years ago
nhzlx
2070fb246d
4. do the trt_engine optim during init.
...
add simple static mode loading
test=develop
6 years ago
nhzlx
ecc12fb430
3. when running in trt mode, do not allocate memory for parameters in fluid.
...
test=develop
6 years ago
nhzlx
9cc6249cd6
2. TRTEngine using stream only when execute.
6 years ago
Wojciech Uss
daac6a05f5
Removed duplicated code
...
This also fixes linking to libpaddle_fluid.so built in debug mode
test=develop
6 years ago
Yan Chunwei
3a5d6e5e64
move passes to src to avoid different behavior in deployment ( #15705 )
6 years ago
nhzlx
034ba1c291
add static model load for trt
...
1. bind trt input and output to fluid tensors
6 years ago
Yan Chunwei
c00ed19df2
add more comment ( #15603 )
6 years ago
Gabor Buella
da9c94da33
Clang build fixes ( #15628 )
...
* Remove some superfluous std::move calls
The std::move triggered a build error (with -Werror):
```
[ 9%] Building CXX object paddle/fluid/memory/allocation/CMakeFiles/allocator_facade.dir/allocator_facade.cc.o
/home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
[this] { return std::move(CreateAllocatorWithChunk()); }, capacity);
^
/home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: note: remove std::move call here
[this] { return std::move(CreateAllocatorWithChunk()); }, capacity);
^~~~~~~~~~ ~
1 error generated.
```
See: https://reviews.llvm.org/D7633
* Remove a superfluous lambda capture from framework/operator.h
```
[ 10%] Building CXX object paddle/fluid/platform/CMakeFiles/device_context.dir/init.cc.o
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/platform/init.cc:19:
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.h:229:21: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
[this](Variable* var) { return var; });
^~~~
1 error generated.
```
Changing it to `return it->second;`, as is in the function below.
* Rethrow an exception (instead of copying it)
```
[ 11%] Building CXX object paddle/fluid/framework/CMakeFiles/operator.dir/operator.cc.o
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: error: local variable 'exception' will be copied despite being thrown by name [-Werror,-Wreturn-std-move]
throw exception;
^~~~~~~~~
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: note: call 'std::move' explicitly to avoid copying
throw exception;
^~~~~~~~~
std::move(exception)
```
See https://reviews.llvm.org/D43322 for an explanation of this diagnostic message.
* Remove an unused variable
```
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:884:16: error: private field 'scope_' is not used [-Werror,-Wunused-private-field]
const Scope& scope_;
^
```
* struct ComputationOpHandle -> class ComputationOpHandle
```
[ 13%] Building CXX object paddle/fluid/framework/details/CMakeFiles/memory_early_delete_pass.dir/memory_early_delete_pass.cc.o
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/memory_early_delete_pass.cc:21:
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: error: class 'ComputationOpHandle' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
class ComputationOpHandle;
^
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/computation_op_handle.h:29:8: note: previous use is here
struct ComputationOpHandle : public OpHandleBase {
^
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: note: did you mean struct here?
class ComputationOpHandle;
^~~~~
struct
1 error generated.
```
* Fix name() methods under fluid/operators
```
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.cc:15:
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.h:19:
/home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/jitcode.h:71:23: error: 'name' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override]
virtual const char* name() const = 0;
^
/home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen_base.h:31:23: note: overridden virtual function is here
virtual const char* name() const = 0;
^
```
test=develop
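Three of the diagnostics listed above (`-Wpessimizing-move`, rethrowing instead of copying, and `-Winconsistent-missing-override`) can be illustrated with a minimal sketch. It is not the code from this commit: `Allocator`, `CreateAllocator`, `MakeDefaultAllocator`, `ParseOrRethrow`, `GenBase`, `ActCode`, and `RunClangFixChecks` are hypothetical names chosen for illustration, and the bare `throw;` is one assumed way to rethrow without a copy.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an allocator type; not Paddle's real class.
struct Allocator {
  int capacity;
};

std::unique_ptr<Allocator> CreateAllocator(int capacity) {
  return std::unique_ptr<Allocator>(new Allocator{capacity});
}

// -Wpessimizing-move: the call already yields a temporary, so it is returned
// directly. Writing `return std::move(CreateAllocator(64));` would prevent
// copy elision.
std::unique_ptr<Allocator> MakeDefaultAllocator() {
  return CreateAllocator(64);
}

// Rethrowing: a bare `throw;` re-raises the exception currently being handled
// without copying it, and keeps its dynamic type.
int ParseOrRethrow(const std::string& text) {
  try {
    return std::stoi(text);
  } catch (const std::exception&) {
    throw;
  }
}

// -Winconsistent-missing-override: mark overriding methods with `override`.
struct GenBase {
  virtual ~GenBase() = default;
  virtual const char* name() const = 0;
};

struct ActCode : GenBase {
  const char* name() const override { return "act"; }
};

// Small self-check that can be called from main().
void RunClangFixChecks() {
  assert(MakeDefaultAllocator()->capacity == 64);
  assert(ParseOrRethrow("42") == 42);
  bool caught = false;
  try {
    ParseOrRethrow("abc");
  } catch (const std::invalid_argument&) {
    caught = true;  // the original exception type survives the rethrow
  }
  assert(caught);
  ActCode code;
  const GenBase& base = code;
  assert(std::string(base.name()) == "act");
}
```

Building this with Clang and `-Werror -Wpessimizing-move -Winconsistent-missing-override` should stay clean, while the commented-out `std::move` variant would reproduce the first error quoted above.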
6 years ago
Chunwei
d85c2e4e5c
fix anakin compile dependency
...
test=develop
6 years ago
wopeizl
3614dadf23
Merge pull request #15631 from wopeizl/windows/fixci
...
fix ci broken randomly and disable some warnings
6 years ago
peizhilin
061299be87
fix dependency
...
test=develop
6 years ago
Gabor Buella
2bf63f4c33
Fix std::abs usage in memory_optimize_pass.cc ( #15627 )
...
test=develop
size_t is an unsigned integer, with a conversion rank
larger than int, therefore in the following expression
the int value was promoted to size_t, making it a
subtraction of unsigned values. The result of such
a subtraction is also an unsigned value.
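The expression this message refers to is not shown in the log, so the following is only a minimal sketch of the pitfall it describes. `WrappedDifference`, `AbsDifference`, and `RunAbsChecks` are hypothetical helpers, and the signed-cast fix is an assumption, not necessarily what the commit did.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pitfall: mixing size_t and int converts the int to size_t, so the
// subtraction is done on unsigned values and wraps instead of going negative.
// The static_cast spells out the conversion the compiler applies implicitly.
std::size_t WrappedDifference(std::size_t a, int b) {
  return a - static_cast<std::size_t>(b);
}

// One possible fix: convert both operands to a signed 64-bit type first,
// then take the absolute value of the signed difference.
std::int64_t AbsDifference(std::size_t a, int b) {
  const std::int64_t diff =
      static_cast<std::int64_t>(a) - static_cast<std::int64_t>(b);
  return diff < 0 ? -diff : diff;
}

// Small self-check that can be called from main().
void RunAbsChecks() {
  // 3 - 5 wraps around to a huge unsigned value rather than -2.
  assert(WrappedDifference(3, 5) == static_cast<std::size_t>(-2));
  // The signed version gives the intended distance in both directions.
  assert(AbsDifference(3, 5) == 2);
  assert(AbsDifference(5, 3) == 2);
}
```

Taking an absolute value of the wrapped result cannot recover the intended distance, because the value is already non-negative by the time it is passed on.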
6 years ago
peizhilin
3a4110f960
fix ci broken randomly and disable some warnings
...
test=develop
6 years ago
dzhwinter
4f01de6378
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
qingqing01
943d972878
Fix analysis predictor when loading the persistable RAW type variable. ( #15613 )
6 years ago
dzhwinter
9c9ad7d40b
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
...
test=develop
6 years ago
Yan Chunwei
e887d71958
fix ir debug config ( #15571 )
6 years ago
Yan Chunwei
897789b16e
fix save_inference_model bug ( #15365 )
6 years ago
dzhwinter
6f9904e99a
rerun windows ci. test=develop
6 years ago
Tao Luo
3d0ecab41b
add analyzer_transformer_test
...
test=develop
6 years ago
Tao Luo
1a252f4be6
Merge pull request #15587 from luotao1/bert
...
use embedding=128 bert model for test
6 years ago
Jiabin Yang
b4c24f3f7c
Merge pull request #15575 from JiabinYang/feature/imperative
...
test=develop, polish code and fix some wrong change
6 years ago
Zhaolong Xing
90ffe74954
Merge pull request #15546 from NHZlX/fix_trt_utest_random_failed
...
fix trt models utest failed.
6 years ago
luotao1
8f0c2b07f2
use embedding=128 bert model for test
...
test=develop
6 years ago
JiabinYang
16f64b43d4
test=develop, Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/imperative
6 years ago
Tao Luo
245b1f0579
Merge pull request #15570 from luotao1/bert
...
fix compiler error, use len20 dataset for bert
6 years ago
JiabinYang
bb881199f2
test=develop, polish code and fix wrong change in /paddle/fluid/inference/utils/CMakeLists.txt
6 years ago
Jiabin Yang
075df09f86
Merge pull request #15470 from JiabinYang/feature/imperative
...
Add simple RNN in imperative
6 years ago
luotao1
5504425eb3
fix compiler error, use len20 dataset for bert
...
test=develop
6 years ago
Yan Chunwei
655179089f
AnalysisConfig remove contrib namespace ( #15540 )
6 years ago
luotao1
e31aef9f6e
Merge branch 'develop' into fc500110-bert_test
...
test=develop
6 years ago
qingqing01
a6910f900e
Always create variables in analysis_predictor before OptimizeInferenceProgram. ( #15533 )
...
Otherwise, some other persistable variable (like RAW type) will not be created
6 years ago
Yan Chunwei
b62b756b28
add version support ( #15469 )
6 years ago
Yan Chunwei
526790e652
infer get program ( #15511 )
6 years ago
JiabinYang
2e309b11c2
test=develop, merge develop
6 years ago
nhzlx
95b98f27ae
fix trt models utest failed.
...
test=develop
6 years ago
Tao Luo
b919190232
Merge pull request #15531 from jczaja/prv-googlenet-fix
...
Performance and functional fixes to LRN
6 years ago
JiabinYang
53d558cd41
test=develop, polish code and merge develop
6 years ago
Zhaolong Xing
97b76c94c4
Merge pull request #15242 from NHZlX/trt_int8_ultimate_version
...
add trt int8 support
6 years ago
Jacek Czaja
4aa7ef3c13
- Compensation fix to LRN MKL-DNN op
...
test=develop
6 years ago
nhzlx
b43ea40c51
delete the usage of the const_cast
...
test=develop
6 years ago
Yan Chunwei
e2818c8608
add dynamic memory optim ( #15457 )
6 years ago
nhzlx
92cf4a4c6b
fix comments
...
test=develop
6 years ago
JiabinYang
1bf2facecb
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/imperative
6 years ago
JiabinYang
e3a8929cf8
little change
6 years ago
Zhaolong Xing
a7ba07d7ef
Merge pull request #15504 from NHZlX/fix_conv2d_fusion
...
Add check: conv_fusion op runs with cudnn version > 7100 .
6 years ago
nhzlx
0779e35544
fix two bugs:
...
1. graph and program_desc alignment
2. trt stream
test=develop
6 years ago
nhzlx
027d24c831
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_int8_ultimate_version
6 years ago
nhzlx
9641324995
fix comments
...
test=develop
6 years ago
nhzlx
484b3bc801
When cudnn version < 7100, there is a problem with conv_fusion.
...
Add check for it.
test=develop
6 years ago
tensor-tang
5c68dee798
fix debug compile of analysis pass fail
...
test=develop
6 years ago
luotao1
353b5f06a7
refine analyzer_bert_test to pass the ci
...
test=develop
6 years ago
nhzlx
e6218c1d7b
change the input to a smaller value
...
test=develop
6 years ago
fuchang01
4a33a44f45
analyzer bert tester
6 years ago
nhzlx
5b92ddabe2
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_trt_stream_bug
...
test=develop
6 years ago
nhzlx
2f4aee361a
fix comments
...
test=develop
6 years ago
nhzlx
ec213730bc
fix trt stream bug.
...
BUG: After continuing to input different data, the output cannot be aligned
test=develop
6 years ago
flame
d60751fb71
add python inference api ( #15248 )
...
add python inference api
6 years ago
Yan Chunwei
885c4e57ab
fea/infer memory optim2 ( #14953 )
6 years ago
Tao Luo
8f522c15ed
Merge pull request #15408 from luotao1/mm_dnn
...
test_analyzer_mm_dnn runs in serial
6 years ago
Tao Luo
001827c270
test_analyzer_mm_dnn runs in serial
...
test=develop
6 years ago
Tao Luo
140fc1e92c
Merge pull request #15392 from luotao1/pyramid_dnn
...
add pyramid_dnn c++ inference test
6 years ago
Yan Chunwei
c9e5aa19c1
get tensor API add more comments ( #15345 )
6 years ago
Yan Chunwei
e84234b551
make clone thread safe ( #15363 )
6 years ago
Tao Luo
668563088e
add pyramid_dnn c++ inference test
...
test=develop
6 years ago
Zhaolong Xing
236201c222
Merge pull request #15350 from NHZlX/fix_bug_for_precditor
...
fix analysis config bug
6 years ago
nhzlx
8817841c73
fix unit test bug
...
test=develop
6 years ago
Yan Chunwei
e07900d317
cache tensor ptr in ZeroCopyTensor ( #15352 )
6 years ago
Yan Chunwei
b7916440ff
hot fix the Native clone ( #15344 )
6 years ago
Xin Pan
3ecf6bb338
Merge pull request #15028 from yihuaxu/develop_641313ea7_elementwise_mul_mkldnn_bug_fix
...
Fix the exception when tensor format is x
6 years ago
nhzlx
b95f2ff8fe
fix win build bug
...
test=develop
6 years ago
nhzlx
b938324381
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_int8_ultimate_version
...
test=develop
6 years ago
nhzlx
312fe0ece1
add trt int8 calibration support
...
fix comments
test=develop
6 years ago
Yiqun Liu
568cc2ffa8
Optimize while_op for test ( #14764 )
...
* Simplify the compare op for CPU.
* Use asynchronous tensor copy in reshape_op's kernel.
* Optimize while_op for test, avoiding creating variables every time.
test=develop
* Enable the cache of kernel type and kernel function.
test=develop
* Enable profiling with gperftools.
* Remove flags for testing, and fix the linking error.
test=develop
* Delete the codes of ChooseKernel.
test=develop
* Fix bug when preparing ExecutorPrepareContext for while_op.
* Fix missing depending on grpc libraries.
* Remove the redundant print.
test=develop
* Follow comments.
* Remove the codes related to prepare the ExecutorPrepareContext for while_op.
test=develop
6 years ago
nhzlx
b2ba3471fd
fix analysis config bug.
6 years ago
tensor-tang
a7fc3d42a0
Merge pull request #15304 from tensor-tang/fuse/second_order_mul_sub
...
Fuse/second order mul sub and fuse repeated fc relu
6 years ago
bingyanghuang
a152a5c731
Disable conv3d mkldnn in dam ( #15335 )
...
* disable conv3d mkldnn in dam
* Add some comments test=develop
6 years ago
tensor-tang
1a95cd227d
disable seqpool test on mac or without mkl
...
test=develop
6 years ago
Tao Luo
2411ed4286
fix multi-threads in ZeroCopyProfile
...
test=develop
6 years ago
tensor-tang
84b0ecdcce
Merge remote-tracking branch 'ups/develop' into fuse/second_order_mul_sub
...
test=develop
6 years ago
tensor-tang
7035f051a8
adjust acc on mac
6 years ago
Tao Luo
e33427da0d
Merge pull request #15280 from luotao1/random_test
...
fix CompareDeterministic error when test_all_data
6 years ago
tensor-tang
d618e48309
fix fuse square mat order and refine test
...
test=develop
6 years ago
tensor-tang
a5d2a6d1ad
add fuse pass of squared mat sub fusion
6 years ago
tensor-tang
84e023eae5
adjust the acc since the refer result is too large
...
test=develop
6 years ago
tensor-tang
4461a458a5
adjust diff since abs is too large
...
test=develop
6 years ago
tensor-tang
ca6fdc6e33
refine and fix test
...
test=develop
6 years ago
tensor-tang
a89296ac1f
add repeated fc relu pass
6 years ago
tensor-tang
781cd0cf51
add multi threads test of seqpool test ( #15293 )
6 years ago
Tao Luo
cbd1c7c01f
fix CompareDeterministic error when test_all_data
...
test=develop
6 years ago
Zhaolong Xing
98e85f3735
add_transpose_flatten_concat_fuse ( #15121 )
6 years ago
wopeizl
5d9edb4124
Merge pull request #15156 from wopeizl/windows/fixgpuissue
...
fix gpu build issue on windows test=develop
6 years ago
tensor-tang
146e942c65
Merge pull request #15250 from tensor-tang/refine/seqpool/feed
...
Refine/seqpool/feed with infer zerocopytensor
6 years ago
peizhilin
439691f5bd
adjust the shlwapi on windows
...
test=develop
6 years ago
tensor-tang
96786d3716
add compare_determine of seqpool1 test
...
test=develop
6 years ago
tensor-tang
ce909664d8
Merge remote-tracking branch 'ups/develop' into refine/seqpool/feed
6 years ago
peizhilin
e239558e56
remove the mismatched enclosure to avoid warning message test=develop
6 years ago
Tao Luo
7d13d20769
Merge pull request #15245 from luotao1/rnn1_multi_thread
...
reduce threads number to avoid analyzer_rnn1_tester hang in CI
6 years ago
Tao Luo
2b11c710b3
Merge pull request #15249 from NHZlX/fix_trt_demo_ci
...
fix demo ci bug
6 years ago
tensor-tang
54afcb7ec6
add compare zerocopy test with native result
...
test=develop
6 years ago
tensor-tang
137060135e
fix zerocopy size
6 years ago
tensor-tang
7461356723
add zerocopy for seqpool test
6 years ago
tensor-tang
48410b9bfe
Merge pull request #15237 from tensor-tang/fuse/seqpool_concat_2
...
Fuse/seqpool concat 2
6 years ago
nhzlx
e7d83389e6
fix demo ci bug
...
1. trt_demo bug
2. trigger exit when a bug exists
test=develop
6 years ago
Tao Luo
9b41e45584
Merge pull request #15222 from luotao1/native_config
...
fix analyzer_test runs error in native_config
6 years ago
Tao Luo
d43983b61d
reduce threads number to avoid hang in CI
...
test=develop
6 years ago
nhzlx
c1264e99f3
fix win error
...
test=develop
6 years ago
nhzlx
4e3522e5b4
add trt int8 support
...
test=develop
6 years ago
tensor-tang
f8c305b243
Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat_2
...
test=develop
6 years ago
Tao Luo
197d0f2431
fix trt_model_tester to pass the ci
...
test=develop
6 years ago
Tao Luo
feee78d9f0
Merge pull request #15214 from tensor-tang/fix/debug
...
fix debug build error
6 years ago
Tao Luo
71d9097a89
fix analyzer_test runs error in native_config
...
test=develop
6 years ago
Tao Luo
9c02765158
Merge pull request #15210 from Superjomn/fix/analysis_tester_bug
...
fix analysis_tester bug
6 years ago
tensor-tang
72d2a1801e
add seqpool concat fuse pass
...
test=develop
6 years ago
tensor-tang
69fd3fdb52
fix debug build error
...
test=develop
6 years ago
superjomn
23bdd0a223
fix analysis_tester bug
...
test=develop
6 years ago
Yan Chunwei
d09d6eadc0
make inference api work with Doxygen ( #15195 )
6 years ago
Tao Luo
6ca9a4810b
Merge pull request #15196 from luotao1/serial
...
run analyzer_tester serial in multi-thread
6 years ago
Yan Chunwei
6ccf8685f7
refactor tensorrt node teller ( #15181 )
6 years ago
Tao Luo
7dc0181c46
run analyzer_tester serial in multi-thread
...
test=develop
6 years ago
Yan Chunwei
875a07c32d
refactor inference analysis api ( #14634 )
6 years ago
tensor-tang
516fe301ee
add comment in case of empty name
...
test=develop
6 years ago
tensor-tang
b9c645639b
workaround with third party cache
...
test=develop
6 years ago
tensor-tang
dca68cdf97
throw error when name not found
...
test=develop
6 years ago
tensor-tang
484085693e
update url and num_ops
...
test=develop
6 years ago
tensor-tang
cd94df8679
fix load and refine
6 years ago
tensor-tang
8e271896ae
add test data for seqpool1
6 years ago
Zhaolong Xing
4048cfa9da
Merge pull request #15048 from NHZlX/add_affine_channel_fuse
...
Add conv+ affine channel fuse pass
6 years ago
Zeng Jinle
c0bcff00dc
Merge pull request #14962 from sneaxiy/rewrite_variable_type
...
Rewrite variable type
6 years ago
Tao Luo
85471533e0
Merge pull request #15079 from luotao1/analysis_test
...
simplify analysis tests
6 years ago
wopeizl
719ebe3786
Merge pull request #15070 from wopeizl/windows/testcasefix
...
fix test issues on windows
6 years ago
Qiyang Min
0238a3bb4f
Merge pull request #14972 from velconia/accelerate_lstm
...
Accelerate PADDLE_ENFORCE
6 years ago
sneaxiy
c4ce2e7b21
merge develop, solve conflict
...
test=develop
6 years ago
Tao Luo
ecae157edf
simplify some data record in analyzer_tester
...
test=develop
6 years ago
Tao Luo
05f1b65da3
simplify prepare_input in analyzer_test
...
test=develop
6 years ago
nhzlx
02e17396c2
fix comments
...
test=develop
6 years ago
nhzlx
71636e677d
add min_subgraph_size attr to tensorrt config
...
test=develop
6 years ago
peizhilin
01c00b07dd
fix test issues on windows
...
test=develop
6 years ago
nhzlx
a6aa8ea771
faster rcnn input is persistable. (fix it in paddle-trt)
...
test=develop
6 years ago
sneaxiy
dde3afe7b7
Merge develop
...
test=develop
6 years ago
Yihua Xu
0b0acfaa88
Add mkldnn item for profile and compare usage.
...
test=develop
6 years ago
tensor-tang
d46a140dd9
add seq pool inference test
...
test=develop
6 years ago
tensor-tang
d4931a2abc
support more input fake data
6 years ago
nhzlx
73b47df1f4
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_affine_channel_fuse
...
test=develop
6 years ago
nhzlx
ce3782c193
add affine_channel fuse.
...
fix conv+elementwise fuse bug.
6 years ago
Tao Luo
91408e3122
fix analyzer_mm_dnn_tester fails when bs > 1
...
test=develop
6 years ago
Tao Luo
f01c966800
Merge branch 'develop' into mm_dnn
6 years ago
qingqing01
51a9fca323
Async memory copy ( #15013 )
6 years ago
minqiyang
b1d0a14c14
Change the ut back
...
test=develop
6 years ago
minqiyang
7d1533216d
Fix syntax error in unit test
...
test=develop
6 years ago
Tao Luo
22c71398e3
add MM_DNN inference test
...
test=develop
6 years ago
peizhilin
9e60c58666
Merge remote-tracking branch 'upstream/develop' into windows/mkl
...
test=develop
6 years ago
luotao1
13367866cd
add deterministic result unit-test
...
test=develop
6 years ago
sneaxiy
ae6f46a1a9
rewrite variable type
...
test=develop
6 years ago
peizhilin
07c7eaabb4
Merge remote-tracking branch 'upstream/develop' into windows/mkl
...
test=develop
6 years ago
Tao Luo
6aa6b8cfa0
Merge pull request #14918 from luotao1/mobilenet_test
...
add test_analyzer_mobilenet
6 years ago
Zhaolong Xing
a9fb34fad8
Merge pull request #14903 from NHZlX/add_conv_elementwise_pass
...
Add conv + elementwiseAdd pass
6 years ago
Tao Luo
2f55a04ec6
add refer result comparison
...
test=develop
6 years ago
peizhilin
5a6d7fe2ff
add mkl,ctc support for windows
6 years ago
wopeizl
0f085f0a5a
Merge pull request #14892 from wopeizl/windows/port3
...
fix script issue
6 years ago
nhzlx
050a68dde3
fix comments
...
test=develop
6 years ago
Tao Luo
1a6d2cfe39
add test_analyzer_mobilenet
...
test=develop
6 years ago
nhzlx
fcc93d96d5
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_conv_elementwise_pass
...
fix conflicts
test=develop
6 years ago
Yu Yang
bacf1d2399
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/tensor_type
6 years ago
nhzlx
4e4a777243
add conv+elementwiseadd pass
...
test=develop
6 years ago
nhzlx
050e118f3c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_trt_thread_bug
...
test=develop
6 years ago
nhzlx
96216052d5
1. fix trt multi thread bug
6 years ago
Yan Chunwei
a985949be9
Fea/fuse conv elementwise add fuse ( #14669 )
6 years ago
Yu Yang
04a570b463
Fix ut
...
test=develop
6 years ago
peizhilin
23dec78772
fix script issue
...
test=develop
6 years ago
Yu Yang
aa38fc4ce5
Fix compile
...
test=develop
6 years ago
Yu Yang
194e66f785
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/tensor_type
6 years ago
Tao Luo
322bb8d5c5
Merge pull request #14825 from NHZlX/add_benchmark_for_trt
...
a sample of recording benchmarks to a file for trt
6 years ago
flame
dd3aca3b96
Merge pull request #14824 from Superjomn/fix/visualizer
...
fix visualizer
6 years ago
Yu Yang
9bd70a1e04
Change tensor uses proto::VarType::type
...
test=develop
6 years ago
nhzlx
644c13a387
fix compile error
6 years ago
nhzlx
a5bfed3776
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_benchmark_for_trt
...
test=develop
6 years ago
nhzlx
afc51e6f82
add benchmark for trt
6 years ago
Zhaolong Xing
bc6d0a3427
Merge pull request #14762 from NHZlX/fix_bug_of_trt_pool
...
fix bug of trt pool2d converter
6 years ago
superjomn
edd1f5a92b
fix visualizer
...
test=develop
6 years ago
nhzlx
019e8bbed2
fix comments test=develop
6 years ago
bingyanghuang
943ad4781f
One possible solution to add flexibility for mkldnn placement pass ( #14768 )
...
* Choose to turn on use_mkldnn attribute v1
* Fix mkldnn_op empty bug
* format change test=develop
* fix ci test=develop
* fix ci test and add test in dam test=develop
* add example to dam compare test test=develop
* review changes test=develop
6 years ago
Yihua Xu
3821fc3950
Merge branch 'develop' into develop_4f71a6ee2_conv3d_bias_fusion_mkldnn_impl
...
test=develop
6 years ago
Tao Luo
cf66133857
Merge pull request #14734 from luotao1/memory_load
...
support loading from memory
6 years ago
Tao Luo
743cb840f1
update with comments
...
test=develop
6 years ago
flame
f6a877bc57
add tool to visualize inference model ( #14621 )
6 years ago
Tao Luo
42359e88a4
clean code
...
test=develop
6 years ago
Tao Luo
923b18877e
Merge branch 'develop' into memory_load
...
test=develop
6 years ago
Tao Luo
405b2486db
support loading from memory
...
test=develop
6 years ago
Houjiang Chen
c6b39a0099
Merge pull request #14714 from NHZlX/add_prelu_gpu
...
add prelu cuda kernel for inference.
6 years ago
nhzlx
722b0a805f
fix bug of trt pool
...
test=develop
6 years ago
Xin Pan
0591ba96ec
fix hack
...
test=develop
6 years ago
nhzlx
e7abe6b654
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_prelu_gpu
...
test=develop
6 years ago
nhzlx
f75815b78c
add prelu gpu inference
6 years ago
Xin Pan
7e0801d4ed
Merge pull request #14441 from baojun-nervana/intel/ngraph_op
...
Implementing ngraph engine
6 years ago
Yihua Xu
82eefceabe
Add the profile_mkldnn flag for profile function(test=develop)
6 years ago
Yihua Xu
64e261c6cd
Implement the fusion of convolution and bias for mkldnn
...
(test=develop)
6 years ago
Tao Luo
2af5762cf8
Merge pull request #14668 from wzzju/use_small_dam
...
support the small dam model. test=develop
6 years ago
ZhenWang
6e48e47406
test=develop
6 years ago
ZhenWang
e1da6cd754
add the normal dam and the small dam
6 years ago
ZhenWang
d5947b0ed7
test=develop
6 years ago
ZhenWang
33b4963505
unify the normal and small dam model.
6 years ago
Yan Chunwei
4b7617740e
fix container not cleared ( #14231 )
6 years ago
ZhenWang
8f2e556e65
support the small dam model. test=develop
6 years ago
nhzlx
49c28b8c52
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_params_sync_pass
...
test=develop
6 years ago
nhzlx
3c83a2f720
fix comments
6 years ago
Sang Ik Lee
24e70920db
Refactor some build settings.
...
test=develop
6 years ago
Sang Ik Lee
d6125a5eec
Include ngraph in inference demo build.
...
test=develop
6 years ago
Tao Luo
b4de023ee1
Merge pull request #14636 from Superjomn/fix/word2vec
...
fix word2vec bug
6 years ago
nhzlx
d3e140a572
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_params_sync_pass
...
test=develop
6 years ago
nhzlx
d666c8eb1d
fix benchmark
6 years ago
nhzlx
900fbb83f9
add params sync pass
6 years ago
superjomn
9c665c81ae
update
...
test=develop
6 years ago
minqiyang
a02ce58f2c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
...
test=develop
6 years ago
Tao Luo
e8ef14d2a7
Merge pull request #14610 from Superjomn/revert/cache_fix
...
Revert "fix transfer cache thread_local bug (#14581 )"
6 years ago
Yiqun Liu
726f2cefe3
Fix bug of referencing a temporary variable. ( #14614 )
...
test=develop
6 years ago
peizhilin
38715e6fd0
minor fix
6 years ago
superjomn
4babc6b06c
update
...
test=develop
6 years ago
minqiyang
be04d99fe4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
...
test=develop
6 years ago
minqiyang
53433d7f2e
Revert the changes of VLOG
...
test=develop
6 years ago
peizhilin
36cd18b549
Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
qingqing01
39ec80def4
Remove the memory copy of feeding data in C++ inference API ( #14577 )
...
* Remove the memory copy for feeding data in C++ inference API
* Fix compiling dependence
* Fix compiling in ONLY_CPU mode
6 years ago
peizhilin
1afa9492af
Recover the profiler
6 years ago
Yiqun Liu
bf222f197d
Use sub scope in tensor_array_to_tensor op. ( #14524 )
...
test=develop
6 years ago
dzhwinter
840c1b29ad
test=develop ( #14562 )
...
* test=develop
remove code.
* test=develop
6 years ago
Yan Chunwei
923c8e3332
add benchmark for inference ( #14571 )
6 years ago
Tao Luo
e90afec47b
Merge pull request #14543 from luotao1/threads
...
add thread related inference api
6 years ago
Zhaolong Xing
e52d90a35e
Merge pull request #14527 from hjchen2/develop
...
Refine split TensorRT plugin
6 years ago
luotao1
116979a40a
refine api name
...
test=develop
6 years ago
luotao1
e66b4c6bff
adjust tester_helper to make multi-instance multi-thread work
...
test=develop
6 years ago
luotao1
a5c4b463c9
add SetMKLDNNThreadId api
6 years ago
luotao1
e21edb26f6
add Set/GetCPUNumThreads api
6 years ago
peizhilin
7c8c9dc9bf
fix unit test cases
6 years ago
hjchen2
1adda8e06c
Add more unit tests for split plugin
...
test=develop
6 years ago
wopeizl
d9a1f3e58e
Windows/online ( #14474 )
...
* add recordio support
* disable the openblas multi-thread on windows since no support
adjust the python script
* code style
* code style
test=develop
* add create_recordio_file_reader back
* fix code style
test=develop
* fix the gtest.cmake on windows
* fix cc_test on windows
* fix the win build
test=develop
* remove fused compile support on windows
test=develop
* add the jit support
test=develop
* add the jit support, test=develop
* add the jit support, test=develop
* add the jit back
fix compile error on windows
* rollback test=develop
* test case fix
* disable DSO by default on windows
* exclude warpctc_op on windows
* exclude the dynload_warpctc out on windows
test=develop
* fix the scripts error
test=develop
* disable avx on windows by default
test=develop
* re-organize the cmake file
* disable mkl on windows by default
* add warp_ctc back
* fix the dependency
* fix the dependency
* fix the build issue on windows
* remove unsupported flag on windows
* code style
* code style
test=develop
* fix issue
* add profiler, parallel_executor back
* clean up the pre-definitions on windows
* fix build issue
* test=develop
6 years ago
peizhilin
bef475c92b
Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
hjchen2
6eba5bd276
Fix direct copy and refine split ut
...
test=develop
6 years ago
Qiao Longfei
fd290c2580
fix mac compile of analysis
...
test=develop
6 years ago
hjchen2
5857fb3014
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into develop
...
test=develop
6 years ago
hjchen2
3e3599f3d9
Refine split tensorrt plugin
6 years ago
peizhilin
f10e196fc8
fix build issue
6 years ago
Zhaolong Xing
ad349e770f
Merge pull request #14452 from NHZlX/fix_avg_pool_trt_bug
...
fix avg pool trt bug
6 years ago
peizhilin
6e66fadb95
clean up the pre-definitions on windows
6 years ago
Tao Luo
1d9b2a453c
Merge pull request #14508 from luotao1/warm_up_multi_thread
...
add warm up in TestMultiThreadPrediction
6 years ago
nhzlx
e62872df8b
fix conflicts
6 years ago
nhzlx
a4dc1d4292
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_trt
...
test=develop
6 years ago
nhzlx
faeb9b8aa9
fix compile rely problem
6 years ago
Tao Luo
eb9b9becdc
add warm up in TestMultiThreadPrediction
...
test=develop
6 years ago
Tao Luo
5cc7946313
Merge pull request #14499 from luotao1/disable_openblas_test
...
disable two openblas test temporary
6 years ago
nhzlx
2a84054372
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_trt
...
test=develop
6 years ago
nhzlx
b742d46520
fix demo ci bug on trt
6 years ago
Houjiang Chen
33c65517fd
Update CMakeLists.txt test=develop
6 years ago
Houjiang Chen
01bda73116
Update CMakeLists.txt
6 years ago
Tao Luo
09ee266f8e
disable two openblas test temporary
...
test=develop
6 years ago
hjchen2
2c2a192eb1
Resolve merge conflicts
...
test=develop
6 years ago
Yiqun Liu
8bc1c5d2ab
Implement the Tensorrt plugin for elementwise op ( #14487 )
...
* Initialize the elementwise plugin.
* Implement the basic CUDA kernel of elementwise plugin.
test=develop
6 years ago
hjchen2
1622cb9937
Fix alpha tensor key
6 years ago
hjchen2
a8c077df7c
Implement leaky relu tensorRT converter
6 years ago
hjchen2
2825685f2a
Fix tensorrt plugin cmake dependency, test=develop
6 years ago
Superjomn
e878a8e885
update
...
test=develop
6 years ago
superjomn
4bf6817cbc
fix gpu load model
...
the parameters will load from CPUPlace, that will keep copying data
between CPU and GPU places.
test=develop
6 years ago
Wu Yi
a2d9b34417
Refine operator cmake ( #14413 )
...
* wip simplify operator framework
* wip
* wip
* done test=develop
* clean test=develop
* fix test=develop
* fix deps test=develop
* fix cpu build test=develop
* fix tensorrt build test=develop
* fix tests test=develop
* fix test=develop
* fix cpu build test=develop
6 years ago
nhzlx
8f9a8c455a
delete unused test code.
...
test=develop
6 years ago
nhzlx
83f8c403a7
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into fix_avg_pool_trt_bug
...
test=develop
6 years ago
nhzlx
b969116988
fix avg pool trt bug and fix cpplint
6 years ago
Zhaolong Xing
2f27c048cc
Merge pull request #14440 from hjchen2/develop
...
Add PRelu tensorRT plugin and Conv2d transpose op converter
6 years ago
hjchen2
6a7b995737
Refine commit message to enable ci, test=develop
6 years ago
hjchen2
413f5948b2
Fix code style
6 years ago
hjchen2
21f33b4274
Complete PRelu plugin and Conv2d transpose op converter
6 years ago
Sylwester Fraczek
8a1eeec579
add mkldnn prop_kind phase for inference-only case to pooling and activations ( #14278 )
...
* add is_test to pooling and activations
add prop_kind support for layers activation, conv and pooling
add a pass that sets is_test to true
add transpiler version of is_test pass
test=develop
* patch test and pass
test=develop
* add pass to analyzer.h
test=develop
* add is_test attr description & pass only on mkldnn
in:
activation_op.cc
batch_norm_op.cc
conv_op.cc
dropout_op.cc
lrn_op.cc
pool_op.cc
sequence_pool_op.cc
softmax_op.cc
* fix is_test handling for activation pool and conv
* change description of is_test for all layers again
* remove GetAttr(use_mkldnn) from pass
* rename correct_mkldnn_test_phase to is_test
and remove dependency on MKLDNN
test=develop
* review fix magic number
* two if(..)s into one
* Check is_test once and pass mkldnn forward prop kind
* dereference shared_ptr with * (without get())
test=develop
* add is_test_pass back
test=develop
6 years ago
Tao Luo
9d29ebc010
Merge pull request #14306 from sfraczek/sfraczek/test-analyzer-mobilenet
...
add test_analyzer_mobilenet
6 years ago
Sylwester Fraczek
d318583eb5
rename mobilenet dir to mobilenet_depthwise_conv
...
test=develop
6 years ago
Tao Luo
1d867805b0
rollback analyzer_seq_conv1_tester
...
test=develop
6 years ago
Tao Luo
5ef123c778
Merge branch 'develop' into dam_fc
6 years ago
dzhwinter
d3aed98d86
Merge pull request #14320 from wopeizl/windows/online
...
Windows/online
6 years ago
Yiqun Liu
9e6b1c5f97
Refine tester of TensorRT engine ( #14390 )
...
* Refine the tester for MixedRTPredictor.
test=develop
* Enable the profiler in TensorRT engine.
* Support the use of combined inference model in TensorRT unittest, and print the shape of feed targets.
6 years ago
peizhilin
0ef2a37c0e
merge from develop
6 years ago
nhzlx
15bdb7ef14
delete erroneously uploaded files
...
test=develop
6 years ago
Sylwester Fraczek
2412c27c2b
Merge branch 'develop' into sfraczek/test-analyzer-mobilenet
6 years ago
peizhilin
1a9008c420
code style fix
...
test=develop
6 years ago
Tao Luo
e0d4e04bdd
fix some compiler warning
...
test=develop
6 years ago
Tao Luo
8ea13e336a
add in_num_col_dims for fc
6 years ago
nhzlx
ddb120357c
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_trt_plugin
...
merge develop and fix conflicts
6 years ago
peizhilin
447bf7c80b
test=develop
6 years ago
peizhilin
30ddc07a7e
Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
Yan Chunwei
9f252e0032
Combine Inference Analysis with IR ( #13914 )
6 years ago
nhzlx
0b96268057
fix comments
...
test=develop
6 years ago
nhzlx
e5bf8616f0
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_trt_plugin
...
test=develop
6 years ago
nhzlx
d38fd6a0fc
add plugin support and offer a simple split sample
6 years ago
nhzlx
2d7134bc37
add initial code for plugin
6 years ago
nhzlx
397de907ed
merge develops
...
test=develop
6 years ago
nhzlx
d6ff006903
add serial to trt test and do not print log for unused trt logs
6 years ago
peizhilin
ef8a7db81e
Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
peizhilin
ca60e1d34d
Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
Tao Luo
433fc7c1d4
skip mkldnn related pass when use_mkldnn=false
...
test=develop
6 years ago
peizhilin
350f1f3971
remove duplicate function definition
6 years ago
peizhilin
4b1f1a8787
fix merge issue
6 years ago
peizhilin
52f7644f53
Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
Qiyang Min
698698f2fa
Merge branch 'develop' into fix_vlog
6 years ago
qingqing01
abe209234f
Exhaustive search for cuDNN conv. ( #14286 )
...
* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.
* Fix compiling test=develop
6 years ago
Tao Luo
f1046d7e37
Merge pull request #14335 from wojtuss/wojtuss/add-graph-viz
...
added additional call to graph_viz_pass
6 years ago
Sylwester Fraczek
b5f617fa9b
make mobilenet test reuse resnet50 test
6 years ago