Commit Graph

617 Commits (e50bc2c2a6fcfec748d5bb991588a2fdc2ab4caf)

Author SHA1 Message Date
石晓伟 2bb135825e
fix analysis_predictor when func is called multiple times, test=release/1.6 (#21665)
5 years ago
joanna.wozna.intel d419b859c0 Add reshape int8 mkldnn op (#21428)
5 years ago
Zhaolong Xing fbbd94a6ce
there is bug for inference using auto grwoth allocator (#21621)
5 years ago
rensilin 7f5d532a9c fix: fail to call ZeroCopyTensor::mutable_data() when device_id is no… (#21461)
5 years ago
Pei Yang 20d61414b4
fix glog warning, test=develop (#21573)
5 years ago
Pei Yang 122b37ce62
make config option DisableGlogInfo() able to mute all inference logs (#21318)
5 years ago
Zhaolong Xing b39c011637
specify the auto growth allocator for inference. (#21448)
5 years ago
Lv Mengsi 37f3e56dea
Fix transpose conv (#21406)
5 years ago
Zhaolong Xing d1a6e112e6 fix C++ multicard inference bug. (#20955)
5 years ago
Michał Gallus 5d7d548275 INT8 Fully-connected (#17641)
5 years ago
silingtong123 45c1e7bb7b add prediction demo and script on windows (#21248)
5 years ago
zhouwei25 c0dcb090a3 Determine whether to copy and link inference lib by ON_INFER (#20931)
5 years ago
Zhaolong Xing 65f7052554
TRT int8: refine trt int8 for dynamic range set (#21112)
5 years ago
joanna.wozna.intel 77c2083586 Add transpose2 INT8 for mkl-dnn (#19424)
5 years ago
Pei Yang e89c16b90d
Bug Fix: Paddle-TRT cannot handle adaptive pooling in pool2d op converter and "num" attribute in split op converter (#20733)
5 years ago
石晓伟 e742760f8e
optimize version error, test=develop (#20715)
5 years ago
石晓伟 d8f4f4239d
Ensure backward compatibility with the anakin interface, test=develop (#20691)
5 years ago
Pei Yang 443f604c3b
add DisableGlogInfo() to AnalysisConfig, test=develop (#20581)
5 years ago
zhaoyuchen2018 b8333edef6
Add Multihead matmul fuse pass (#20167)
5 years ago
Adam 7faa3e9555 Add ConvTranspose + BatchNorm fuse pass (#20161)
5 years ago
石晓伟 2c28e3283a
fix analysis_predictor ci, test=release/1.6 (#20141)
5 years ago
石晓伟 01b9d07963
update operator compatible info, test=develop (#19978)
5 years ago
Zhaolong Xing e89b12884a
FIx C++ inference BUG: When open memory optim and enable trt subgraph at the same time, there is a bug (#19969)
5 years ago
Yiqun Liu 3cd985a669
Add a pass to fuse fc+elementwise_add+layernorm (#19776)
5 years ago
石晓伟 71b2ed61bc
support MLU nums, test=develop (#19372)
5 years ago
Pei Yang 9cbc1eff2d
zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)
5 years ago
Yiqun Liu a65c728e5d
Implement the GPU kernel of fc operator (#19687)
6 years ago
Tao Luo f05d2c519d paddle::framework::vectorize() templatization [PART3] (#19643)
6 years ago
Yiqun Liu c5548178b0
A a pass to enable the use of cudnn (#19346)
6 years ago
liuwei1031 d6cb1a4122
add dynamic C runtime support on windows, test=develop (#19502)
6 years ago
Yiqun Liu fcec365d29
Add a pass to replace dropout_op with scale_op when is_test is true (#19297)
6 years ago
lidanqing 9240e5325c add local user data conversion into full_pascalvoc_test_preprocess.py (#19283)
6 years ago
Adam 97d1db1874 Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237)
6 years ago
Zhaolong Xing 76c95af000
Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213)
6 years ago
Zeng Jinle 5b6673c44d
merge develop to solve conflict, also fix API doc, test=develop (#18823)
6 years ago
Adam b837689e97 Add generalized Conv+Activation MKLDNN fuse pass creation (#19072)
6 years ago
wopeizl 80b7ef6fc8
add tensorrt support for windows (#19084)
6 years ago
Tao Luo 741ce8bb1a
inference_shared_library support profile (#16275)
6 years ago
silingtong123 fd3b666d8c test=develop,Synchronize the contents of develop with release1.5 (#18937)
6 years ago
石晓伟 ee2f296ef8
Fusion: seqpool_cvm_concat (#18471)
6 years ago
liuwei1031 0d99690809
fix several security bugs reported by security team (#18831)
6 years ago
Zhaolong Xing 61238d31f7
Trt fp16 support (#18860)
6 years ago
Zhaolong Xing 26ae6d49e4
Update trt5 for paddle-trt (#18645)
6 years ago
石晓伟 25d8079140
Fix Bitmain Predictor::Clone() (#18599)
6 years ago
Tao Luo 076f833110
add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy (#18580)
6 years ago
Jiabin Yang 667f88f9a6
Fix/gcc 4.8 ubt link error (#18558)
6 years ago
Zhaolong Xing 88b52a27fe
Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532)
6 years ago
石晓伟 1529154821
Support Bitmain Anakin (#18542)
6 years ago
石晓伟 047bba855b
Remove the obsolete cmake options (#18481)
6 years ago
Tao Luo 3123d18787
remove unused AnalysisPredictor::SetMkldnnThreadID() (#18444)
6 years ago
Michał Gallus 7023a86c3a Fix Pooling output scale (#18186)
6 years ago
Michał Gallus 8409693272 Reset DeviceContext after quantization warmup (#18182)
6 years ago
Sylwester Fraczek 9252e8fa08 add int8 mkldnn prior_box (#17242)
6 years ago
wopeizl daa32d5383
fix package generation for inference test=develop (#18220)
6 years ago
翟飞跃 802ea50956 fix spelling errors (#17941)
6 years ago
石晓伟 04ea7cb069
modify the access level of anakin engine (#18015)
6 years ago
石晓伟 bce259e5bf
Update the Anakin interfaces for content-dnn and MLU (#17890)
6 years ago
石晓伟 d008260fa8
update the initialization of anakin subgraph (#17880)
6 years ago
Zhaolong Xing ae576f3c68
fix: when use the load model from memory mode, the RAM occupy is high (#17788)
6 years ago
翟飞跃 993c703bcc INT8 MKL-DNN v2 integrate to slim (#17634)
6 years ago
Tao Luo e089e454a1
make omp thread num default 1 after inference run (#17801)
6 years ago
mozga-intel 5eb81fe595 Capi for a ngraph engine (#17037)
6 years ago
lidanqing 04b6c29ee0 Improve mobilenetv2 INT8 performance by using INT8 relu as post-op (#17570)
6 years ago
Jacek Czaja 6d8075ecef [MKL-DNN] conv_transpose mkldnn bias pass (#17644)
6 years ago
Sylwester Fraczek 96845d2168 add Concat quantization (#17448)
6 years ago
Zhen Wang 8bd651b7ed
Fix the bug in the AnalysisPredictor and add more directions about io APIs. (#17639)
6 years ago
Zeng Jinle 4aa931dd85
Code clean of Allocator (#17602)
6 years ago
Zhaolong Xing 61221ebc28
TRT: Support set dynamic range in int8 mode. (#17524)
6 years ago
Michał Gallus 0c39b97b4e [MKL-DNN] Add Fully Connected Op for inference only(#15226)
6 years ago
Sylwester Fraczek 5b2a3c4b12 Conv concat relu quantization (#17466)
6 years ago
guomingz 2281ebf0f3 Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130)
6 years ago
liuwei1031 ba70cc499e
fix security bugs : (#17464)
6 years ago
Tao Luo 32da5e9c3d
remove unused expected_kernel_cache_pass (#17486)
6 years ago
wopeizl ca3ba378c7
fix the random compilation failure on windows test=develop (#17475)
6 years ago
Zhen Wang 4a1b7fec96
Add setting Scope function for the graph class (#17417)
6 years ago
Zhaolong Xing 7a3bb061d8
fix: (#17279)
6 years ago
Wojciech Uss 984aa90583 improved unit test output (#17266)
6 years ago
石晓伟 a72dbe9abf
Cherry-pick benchmark related changes from release/1.4 (#17156)
6 years ago
Leo Zhao 54636a1982 call SetNumThreads everytime to avoid missing omp thread setting (#17224)
6 years ago
wopeizl 83c4f7721f
use two GPUs to run the exclusive test test=develop (#17187)
6 years ago
luotao1 490e746269 fix runtime_context_cache bug when gpu model has an op runs only on cpu
6 years ago
wopeizl d9991dccdd
add parallel build script to ci … (#16901)
6 years ago
Tao Luo aa7b975bf6 disable runtime_context_cache pass by default
6 years ago
Tao Luo ca8b8fa0bd
Merge pull request #16830 from Superjomn/fix/tmp-memory-optim
6 years ago
lijianshe02 de26df440b add SaveOptimModel interface in analysis_predictor.h and test it in a… (#16441)
6 years ago
superjomn f58c3ec189 fix memory optim temporarily
6 years ago
liuwei1031 85363848a1
Security issue (#16774)
6 years ago
tensor-tang d6c1b5a73b disable seqpool concat pass by default saving CI time
6 years ago
luotao1 226596a296 Merge branch 'develop' into core_opt_choose_kernel
6 years ago
luotao1 bd636a9ea6 test_analyzer_int8 tests use default pass order
6 years ago
Yan Chunwei 044ae2497d
fix identity temporarily (#15942)
6 years ago
Wojciech Uss ec2750b3c2 fix repeating passes (#16606)
6 years ago
Wojciech Uss 9b6a029666 fix dataset reading and add support for full dataset (#16559)
6 years ago
石晓伟 5dea0bdd1b
Merge pull request #16498 from Shixiaowei02/feature/anakin-engine
6 years ago
Shixiaowei02 bddb2cd315 resolve conflicts with the develop branch test=develop
6 years ago
nhzlx d065b5bf2b Anakin ssd support
6 years ago
Michał Gallus 2d8b7b3a76 Refine default MKL-DNN Pass order (#16490)
6 years ago
Wojciech Uss 09dfc7a2aa
C-API quantization core 2 (#16396)
6 years ago
nhzlx 953bdde058 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
6 years ago
nhzlx 45b3766fdf fix comments
6 years ago
liuwei1031 de3b70a101
fix cdn issue, test=develop (#16423)
6 years ago
nhzlx 3df7b98a0f Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
6 years ago
nhzlx f3a2e4b3d8 1. Add ANAKIN_ROOT compile option
6 years ago
luotao1 056599a738 add expected_kernel_cache_pass
6 years ago
Wojciech Uss cbe2dbf0db Add enabling quantization (#16326)
6 years ago
nhzlx 4f4daa4b66 cherry-pick from feature/anakin-engine: add data type for zero copy #16313
6 years ago
nhzlx 07dcf2856c git cherry-pick from feature/anakin-engine: update anakin subgraph #16278
6 years ago
nhzlx c407dfa3cb cherry-pick from feature/anakin-engine: refine paddle-anakin to new interface. #16276
6 years ago
nhzlx a25331bc26 cherry-pick from feature/anakin-engine: deal the changing shape when using anakin #16189
6 years ago
nhzlx c79f06d3d8 cherry-pick from feature/anakin-engine: add batch interface for pd-anakin #16178
6 years ago
nhzlx 69d37f81d7 cherry-pick from feature/anakin-engine: refine anakin subgraph. #16157
6 years ago
nhzlx a1d200a5de cherry-pick from feature/anakin-engine: Anakin support facebox #16111
6 years ago
nhzlx b21770a2aa cherry-pick from feature/anakin-engine: Add subgraph fuse support and anakin engine #16018
6 years ago
luotao1 82af8031d9 add runtime_context_cache_pass
6 years ago
Tao Luo 7d2740db83
Revert "cache runtime_context"
6 years ago
luotao1 a275fd6e0c Merge branch 'develop' into runtime_context
6 years ago
Wojciech Uss 2579ade45f Add cpu_quantize_pass for C-API quantization (#16127)
6 years ago
luotao1 5ecdc49c6b set enable_runtime_context_cache_ default false
6 years ago
luotao1 1510b866b6 turn off runtime_context_cache for tensorrt
6 years ago
luotao1 d94fd97230 add runtime_context_cache_pass
6 years ago
luotao1 1283833395 zero_copy tensor support INT32
6 years ago
luotao1 31c4e1d9fc Merge branch 'develop' into zero_copy
6 years ago
Tao Luo e5e7e9b865 Merge branch 'develop' into transformer_ut
6 years ago
Tao Luo 6f2581e4c5
Merge pull request #16090 from lidanqing-intel/paddle-int32
6 years ago
Zhaolong Xing 3d63aa0a11
Merge pull request #15729 from NHZlX/add_static_model_load_for_trt
6 years ago
nhzlx a9ed427749 cant not pass ci
6 years ago
luotao1 fad06cb928 unify ZeroCopy in analysis_test
6 years ago
lidanqing 4aeb261da9 Add INT32 support. INT32 in last switch case
6 years ago
luotao1 06aab1b493 refine SetCpuMathLibraryNumThreads
6 years ago
nhzlx 3c40cb767b 7 refine zero copy
6 years ago
Yiqun Liu 1616c32acf
Add the include of cudnn.h to enable the use of CUDNN_VERSION. (#15961)
6 years ago
nhzlx 2eff3e26b6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt
6 years ago
nhzlx 06a088a199 fix comments and fix cpplint
6 years ago
nhzlx 0ed63b2108 6. delete useless predictor id
6 years ago
Sylwester Fraczek 1943119fc5 fix typo memeroy->memory
6 years ago
Sylwester Fraczek 8bc604571f fix typo seriazlized->serialized
6 years ago
Sylwester Fraczek 543e53db05 fix typo releated->related
6 years ago
tensor-tang e1c707fe9c
fix warnings (#15790)
6 years ago
nhzlx 2070fb246d 4. do the trt_engine optim during init.
6 years ago
Yan Chunwei 3a5d6e5e64
move passes to src to avoid different behavior in deployment (#15705)
6 years ago
Yan Chunwei c00ed19df2
add more comment (#15603)
6 years ago
Gabor Buella da9c94da33 Clang build fixes (#15628)
6 years ago
Chunwei d85c2e4e5c fix anakin compile dependency
6 years ago
qingqing01 943d972878
Fix analysis predictor when loading the persistable RAW type variable. (#15613)
6 years ago
Yan Chunwei e887d71958
fix ir debug config (#15571)
6 years ago
Yan Chunwei 897789b16e
fix save_inferece_model bug (#15365)
6 years ago
Tao Luo 3d0ecab41b add analyzer_transformer_test
6 years ago
Yan Chunwei 655179089f
AnalysisConfig remove contrib namespace (#15540)
6 years ago
qingqing01 a6910f900e
Always create variables in analysis_predictor before OptimizeInferenceProgram. (#15533)
6 years ago
Yan Chunwei b62b756b28
add version support (#15469)
6 years ago
Yan Chunwei 526790e652
infer get program (#15511)
6 years ago
Zhaolong Xing 97b76c94c4
Merge pull request #15242 from NHZlX/trt_int8_ultimate_version
6 years ago
Yan Chunwei e2818c8608
add dynamic memory optim (#15457)
6 years ago
nhzlx 92cf4a4c6b fix comments
6 years ago
nhzlx 027d24c831 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_int8_ultimate_version
6 years ago
nhzlx 9641324995 fix comments
6 years ago
nhzlx 484b3bc801 When cudnn version < 7100, there is problem with conv_fusion.
6 years ago
flame d60751fb71
add python inference api (#15248)
6 years ago
Yan Chunwei 885c4e57ab
fea/infer memory optim2 (#14953)
6 years ago
Yan Chunwei c9e5aa19c1
get tensor API add more comments (#15345)
6 years ago
Yan Chunwei e84234b551
make clone thread safe (#15363)
6 years ago
Zhaolong Xing 236201c222
Merge pull request #15350 from NHZlX/fix_bug_for_precditor
6 years ago
Yan Chunwei e07900d317
cache tensor ptr in ZeroCopyTensor (#15352)
6 years ago
Yan Chunwei b7916440ff
hot fix the Native clone (#15344)
6 years ago
nhzlx b95f2ff8fe fix win build bug
6 years ago
nhzlx b938324381 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_int8_ultimate_version
6 years ago
nhzlx 312fe0ece1 add trt int8 calibration support
6 years ago
Yiqun Liu 568cc2ffa8
Optimize while_op for test (#14764)
6 years ago
nhzlx b2ba3471fd fix analysis config bug.
6 years ago
tensor-tang a5d2a6d1ad add fuse pass of sequared mat sub fusion
6 years ago
tensor-tang a89296ac1f add repeated fc relu pass
6 years ago
Zhaolong Xing 98e85f3735 add_transpose_flatten_concat_fuse (#15121)
6 years ago
wopeizl 5d9edb4124
Merge pull request #15156 from wopeizl/windows/fixgpuissue
6 years ago
tensor-tang 146e942c65
Merge pull request #15250 from tensor-tang/refine/seqpool/feed
6 years ago
peizhilin 439691f5bd adjust the shlwapi on windows
6 years ago
tensor-tang ce909664d8 Merge remote-tracking branch 'ups/develop' into refine/seqpool/feed
6 years ago
peizhilin e239558e56 remove the dismatch enclosure to avoid warning message test=develop
6 years ago
Tao Luo 2b11c710b3
Merge pull request #15249 from NHZlX/fix_trt_demo_ci
6 years ago
tensor-tang 137060135e fix zerocopy size
6 years ago
nhzlx e7d83389e6 fix demo ci bug
6 years ago
nhzlx 4e3522e5b4 add trt int8 support
6 years ago
tensor-tang 72d2a1801e add seqpool concat fuse pass
6 years ago
Yan Chunwei d09d6eadc0
make inference api work with Doxygen (#15195)
6 years ago
Yan Chunwei 875a07c32d
refactor inference analysis api (#14634)
6 years ago
tensor-tang 516fe301ee add comment in case of empty name
6 years ago
tensor-tang dca68cdf97 throw error when name not find
6 years ago
tensor-tang cd94df8679 fix load and refine
6 years ago
Zhaolong Xing 4048cfa9da
Merge pull request #15048 from NHZlX/add_affine_channel_fuse
6 years ago
Zeng Jinle c0bcff00dc
Merge pull request #14962 from sneaxiy/rewrite_variable_type
6 years ago
Tao Luo 05f1b65da3 simplify prepere_input in analyzer_test
6 years ago
nhzlx 02e17396c2 fix comments
6 years ago
nhzlx 71636e677d add min_subgraph_size attr to tensorrt config
6 years ago
sneaxiy dde3afe7b7 Merge develop
6 years ago
nhzlx 73b47df1f4 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_affine_channel_fuse
6 years ago
nhzlx ce3782c193 add affine_channel fuse.
6 years ago
qingqing01 51a9fca323
Async memory copy (#15013)
6 years ago
sneaxiy ae6f46a1a9 rewrite variable type
6 years ago
peizhilin 07c7eaabb4 Merge remote-tracking branch 'upstream/develop' into windows/mkl
6 years ago
Zhaolong Xing a9fb34fad8
Merge pull request #14903 from NHZlX/add_conv_elementwise_pass
6 years ago
peizhilin 5a6d7fe2ff add mkl,ctc support for windows
6 years ago