Commit Graph

1050 Commits (0823a7bc8b6c46a866d1e54f8cb96ccaab192bf2)

Author SHA1 Message Date
Wojciech Uss 09dfc7a2aa C-API quantization core 2 (#16396)
6 years ago
Yihua Xu 57dc3c1943 Disable compare for Issue#16316 (#16466)
6 years ago
nhzlx 953bdde058 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
6 years ago
nhzlx 45b3766fdf fix comments
6 years ago
Wojciech Uss 46677fb080 Move cpu_quantize_* passes into mkldnn subfolder
6 years ago
liuwei1031 de3b70a101 fix cdn issue, test=develop (#16423)
6 years ago
nhzlx 3df7b98a0f Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
6 years ago
nhzlx f3a2e4b3d8 1. Add ANAKIN_ROOT compile option
6 years ago
Tao Luo 294cdf6f48 Merge pull request #16177 from fc500110/remove_visualizer
6 years ago
luotao1 056599a738 add expected_kernel_cache_pass
6 years ago
Wojciech Uss cbe2dbf0db Add enabling quantization (#16326)
6 years ago
nhzlx 4f4daa4b66 cherry-pick from feature/anakin-engine: add data type for zero copy #16313
6 years ago
nhzlx 07dcf2856c git cherry-pick from feature/anakin-engine: update anakin subgraph #16278
6 years ago
nhzlx c407dfa3cb cherry-pick from feature/anakin-engine: refine paddle-anakin to new interface. #16276
6 years ago
nhzlx a25331bc26 cherry-pick from feature/anakin-engine: deal the changing shape when using anakin #16189
6 years ago
nhzlx c79f06d3d8 cherry-pick from feature/anakin-engine: add batch interface for pd-anakin #16178
6 years ago
nhzlx 69d37f81d7 cherry-pick from feature/anakin-engine: refine anakin subgraph. #16157
6 years ago
nhzlx a1d200a5de cherry-pick from feature/anakin-engine: Anakin support facebox #16111
6 years ago
flame a32d420043 cherry-pick from feature/anakin-engine: batch norm (#16110)
6 years ago
flame 0945b97f07 cherry-pick feature/anakin-engine: add anakin softmax/transpose/batch_norm/flatten/reshape op (#16020)
6 years ago
nhzlx b21770a2aa cherry-pick from feature/anakin-engine: Add subgraph fuse support and anakin engine #16018
6 years ago
nhzlx 084310f536 paddle-anakin: concat, split, pool2d converter#16003
6 years ago
flame be523baad2 Add anakin conv2d/relu/sigmoid/tanh converter (#15997)
6 years ago
Yan Chunwei d0ce6a9044 fix anakin converter registry (#15993)
6 years ago
luotao1 82af8031d9 add runtime_context_cache_pass
6 years ago
Tao Luo 7d2740db83 Revert "cache runtime_context"
6 years ago
Jacek Czaja 13816dd4ac [MKL-DNN] Fix to crash of Transformer when mkldnn is to be used (#16233)
6 years ago
Tao Luo dbb92ee4b1 Merge pull request #16002 from luotao1/runtime_context
6 years ago
Qiyang Min 8e4ad008fb Merge pull request #16198 from velconia/imperative_train_speed
6 years ago
luotao1 a275fd6e0c Merge branch 'develop' into runtime_context
6 years ago
Wojciech Uss 2579ade45f Add cpu_quantize_pass for C-API quantization (#16127)
6 years ago
luotao1 5ecdc49c6b set enable_runtime_context_cache_ default false
6 years ago
minqiyang 7355d41834 1. Add imperative gperf profiler
6 years ago
minqiyang 98dfb492bb Release GIL lock
6 years ago
minqiyang 42e96a029f Accelerate CPU part
6 years ago
luotao1 1510b866b6 turn off runtime_context_cache for tensorrt
6 years ago
luotao1 d94fd97230 add runtime_context_cache_pass
6 years ago
fc500110 1c6e72b905 remove visualizer, which can be replaced by python IrGraph draw API
6 years ago
Tao Luo c49b7855fa Merge pull request #16120 from Xreki/fix_cmake_compress
6 years ago
Liu Yiqun 4e052e0ac9 Disable inference download for WIN32 temporary.
6 years ago
luotao1 1283833395 zero_copy tensor support INT32
6 years ago
luotao1 31c4e1d9fc Merge branch 'develop' into zero_copy
6 years ago
luotao1 9e2c7e69fb simplify the zero_copy tests
6 years ago
luotao1 aeee4cbe71 add compare between zerocopy and analysis
6 years ago
Liu Yiqun 6bb84b74b2 Change the download and compress command of cmake.
6 years ago
Tao Luo 25ca2ca001 change init_idx to INT32 in transformer_test
6 years ago
Tao Luo e5e7e9b865 Merge branch 'develop' into transformer_ut
6 years ago
Tao Luo 6f2581e4c5 Merge pull request #16090 from lidanqing-intel/paddle-int32
6 years ago
Zhaolong Xing 3d63aa0a11 Merge pull request #15729 from NHZlX/add_static_model_load_for_trt
6 years ago
nhzlx a9ed427749 cant not pass ci
6 years ago
luotao1 fad06cb928 unify ZeroCopy in analysis_test
6 years ago
lidanqing 4aeb261da9 Add INT32 support. INT32 in last switch case
6 years ago
luotao1 06aab1b493 refine SetCpuMathLibraryNumThreads
6 years ago
nhzlx 3c40cb767b 7 refine zero copy
6 years ago
Yiqun Liu 1616c32acf Add the include of cudnn.h to enable the use of CUDNN_VERSION. (#15961)
6 years ago
flame b187e3728e add anakin fc op converter (#15965)
6 years ago
flame e40d56c3d3 anakin subgraph engine (#15774)
6 years ago
nhzlx 2eff3e26b6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt
6 years ago
nhzlx 06a088a199 fix comments and fix cpplint
6 years ago
nhzlx 0ed63b2108 6. delete useless predictor id
6 years ago
nhzlx 1d5ef7c9ee 5. add static trt load model
6 years ago
Tao Luo 4774dad806 Merge pull request #15857 from sfraczek/fix-typo
6 years ago
Tao Luo e3dd6970fc disable dam temporarily (#15860)
6 years ago
Sylwester Fraczek 1943119fc5 fix typo memeroy->memory
6 years ago
Sylwester Fraczek 8bc604571f fix typo seriazlized->serialized
6 years ago
Sylwester Fraczek 543e53db05 fix typo releated->related
6 years ago
Dun a83e470405 Profiler refine and add CUDA runtime api tracer (#15301)
6 years ago
Yiqun Liu e38dd91f04 Refine cmake's download function. (#15512)
6 years ago
tensor-tang e1c707fe9c fix warnings (#15790)
6 years ago
nhzlx 2070fb246d 4. do the trt_engine optim during init.
6 years ago
nhzlx ecc12fb430 3. when runing in trt mode, do not allocate memory for parameters in fluid.
6 years ago
nhzlx 9cc6249cd6 2. TRTEngine using stream only when execute.
6 years ago
Wojciech Uss daac6a05f5 Removed duplicated code
6 years ago
Yan Chunwei 3a5d6e5e64 move passes to src to avoid different behavior in deployment (#15705)
6 years ago
nhzlx 034ba1c291 add static model load for trt
6 years ago
Yan Chunwei c00ed19df2 add more comment (#15603)
6 years ago
Gabor Buella da9c94da33 Clang build fixes (#15628)
6 years ago
Chunwei d85c2e4e5c fix anakin compile dependency
6 years ago
wopeizl 3614dadf23 Merge pull request #15631 from wopeizl/windows/fixci
6 years ago
peizhilin 061299be87 fix dependency
6 years ago
Gabor Buella 2bf63f4c33 Fix std::abs usage in memory_optimize_pass.cc (#15627)
6 years ago
peizhilin 3a4110f960 fix ci broken randomly and disable some warnings
6 years ago
dzhwinter 4f01de6378 Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
qingqing01 943d972878 Fix analysis predictor when loading the persistable RAW type variable. (#15613)
6 years ago
dzhwinter 9c9ad7d40b Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
Yan Chunwei e887d71958 fix ir debug config (#15571)
6 years ago
Yan Chunwei 897789b16e fix save_inferece_model bug (#15365)
6 years ago
dzhwinter 6f9904e99a rerun windows ci. test=develop
6 years ago
Tao Luo 3d0ecab41b add analyzer_transformer_test
6 years ago
Tao Luo 1a252f4be6 Merge pull request #15587 from luotao1/bert
6 years ago
Jiabin Yang b4c24f3f7c Merge pull request #15575 from JiabinYang/feature/imperative
6 years ago
Zhaolong Xing 90ffe74954 Merge pull request #15546 from NHZlX/fix_trt_utest_random_failed
6 years ago
luotao1 8f0c2b07f2 use embedding=128 bert model for test
6 years ago
JiabinYang 16f64b43d4 test=develop, Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/imperative
6 years ago
Tao Luo 245b1f0579 Merge pull request #15570 from luotao1/bert
6 years ago
JiabinYang bb881199f2 test=develop, polish code and fix wrong change in /paddle/fluid/inference/utils/CMakeLists.txt
6 years ago
Jiabin Yang 075df09f86 Merge pull request #15470 from JiabinYang/feature/imperative
6 years ago
luotao1 5504425eb3 fix compiler error, use len20 dataset for bert
6 years ago
Yan Chunwei 655179089f AnalysisConfig remove contrib namespace (#15540)
6 years ago
luotao1 e31aef9f6e Merge branch 'develop' into fc500110-bert_test
6 years ago