Commit Graph

1070 Commits (5eb81fe595a95758ee01450f600850273c97a197)

Author SHA1 Message Date
Wojciech Uss 9b6a029666 fix dataset reading and add support for full dataset (#16559)
6 years ago
lidanqing 2ca0de3cd4 fix preprocess script with processbar, integrity check and logs (#16608)
6 years ago
Tao Luo ce18710421 enhance analyzer_tests download
6 years ago
石晓伟 5dea0bdd1b Merge pull request #16498 from Shixiaowei02/feature/anakin-engine
6 years ago
Shixiaowei02 7b9fc71076 update tensorrt subgraph_util test=develop
6 years ago
Wojciech Uss 2498395132 remove profiling from int8 test
6 years ago
Zhaolong Xing 3e6aa498d6 Merge pull request #16526 from NHZlX/refine_trt_anakin
6 years ago
Tao Luo 8f7b5883b8 Merge pull request #16529 from lidanqing-intel/lidanqing/preprocess-data
6 years ago
Tao Luo 5b24002389 Merge pull request #16399 from sfraczek/sfraczek/analyzer_int8_resnet50_test
6 years ago
Shixiaowei02 bddb2cd315 resolve conflicts with the develop branch test=develop
6 years ago
lidanqing 0d656996bf fix some bugs of unzip and reading val list
6 years ago
nhzlx d065b5bf2b Anakin ssd support
6 years ago
lidanqing b46e467abc add wget and unzip part and change data_dir
6 years ago
lidanqing 894aa9b235 change script file name and data_dir location
6 years ago
lidanqing 57f51e5b08 preprocess with PIL the full val dataset and save binary
6 years ago
chengduo ed61d67c73 Fix the interface of Pass::Apply (#16484)
6 years ago
Sylwester Fraczek 8ece7a9708 fixed url to dataset
6 years ago
gongweibao eb83abeac3 Add DGC(Deep Gradient Compression) interface. (#15841)
6 years ago
Sylwester Fraczek fe21578a44 create test for quantized resnet50
6 years ago
Michał Gallus 2d8b7b3a76 Refine default MKL-DNN Pass order (#16490)
6 years ago
Wojciech Uss 09dfc7a2aa C-API quantization core 2 (#16396)
6 years ago
Yihua Xu 57dc3c1943 Disable compare for Issue#16316 (#16466)
6 years ago
nhzlx 953bdde058 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
6 years ago
nhzlx 45b3766fdf fix comments
6 years ago
Wojciech Uss 46677fb080 Move cpu_quantize_* passes into mkldnn subfolder
6 years ago
liuwei1031 de3b70a101 fix cdn issue, test=develop (#16423)
6 years ago
nhzlx 3df7b98a0f Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
6 years ago
nhzlx f3a2e4b3d8 1. Add ANAKIN_ROOT compile option
6 years ago
Tao Luo 294cdf6f48 Merge pull request #16177 from fc500110/remove_visualizer
6 years ago
luotao1 056599a738 add expected_kernel_cache_pass
6 years ago
Wojciech Uss cbe2dbf0db Add enabling quantization (#16326)
6 years ago
nhzlx 4f4daa4b66 cherry-pick from feature/anakin-engine: add data type for zero copy #16313
6 years ago
nhzlx 07dcf2856c git cherry-pick from feature/anakin-engine: update anakin subgraph #16278
6 years ago
nhzlx c407dfa3cb cherry-pick from feature/anakin-engine: refine paddle-anakin to new interface. #16276
6 years ago
nhzlx a25331bc26 cherry-pick from feature/anakin-engine: deal the changing shape when using anakin #16189
6 years ago
nhzlx c79f06d3d8 cherry-pick from feature/anakin-engine: add batch interface for pd-anakin #16178
6 years ago
nhzlx 69d37f81d7 cherry-pick from feature/anakin-engine: refine anakin subgraph. #16157
6 years ago
nhzlx a1d200a5de cherry-pick from feature/anakin-engine: Anakin support facebox #16111
6 years ago
flame a32d420043 cherry-pick from feature/anakin-engine: batch norm (#16110)
6 years ago
flame 0945b97f07 cherry-pick feature/anakin-engine: add anakin softmax/transpose/batch_norm/flatten/reshape op (#16020)
6 years ago
nhzlx b21770a2aa cherry-pick from feature/anakin-engine: Add subgraph fuse support and anakin engine #16018
6 years ago
nhzlx 084310f536 paddle-anakin: concat, split, pool2d converter#16003
6 years ago
flame be523baad2 Add anakin conv2d/relu/sigmoid/tanh converter (#15997)
6 years ago
Yan Chunwei d0ce6a9044 fix anakin converter registry (#15993)
6 years ago
luotao1 82af8031d9 add runtime_context_cache_pass
6 years ago
Tao Luo 7d2740db83 Revert "cache runtime_context"
6 years ago
Jacek Czaja 13816dd4ac [MKL-DNN] Fix to crash of Transformer when mkldnn is to be used (#16233)
6 years ago
Tao Luo dbb92ee4b1 Merge pull request #16002 from luotao1/runtime_context
6 years ago
Qiyang Min 8e4ad008fb Merge pull request #16198 from velconia/imperative_train_speed
6 years ago
luotao1 a275fd6e0c Merge branch 'develop' into runtime_context
6 years ago
Wojciech Uss 2579ade45f Add cpu_quantize_pass for C-API quantization (#16127)
6 years ago
luotao1 5ecdc49c6b set enable_runtime_context_cache_ default false
6 years ago
minqiyang 7355d41834 1. Add imperative gperf profiler
6 years ago
minqiyang 98dfb492bb Release GIL lock
6 years ago
minqiyang 42e96a029f Accelerate CPU part
6 years ago
luotao1 1510b866b6 turn off runtime_context_cache for tensorrt
6 years ago
luotao1 d94fd97230 add runtime_context_cache_pass
6 years ago
fc500110 1c6e72b905 remove visualizer, which can be replaced by python IrGraph draw API
6 years ago
Tao Luo c49b7855fa Merge pull request #16120 from Xreki/fix_cmake_compress
6 years ago
Liu Yiqun 4e052e0ac9 Disable inference download for WIN32 temporary.
6 years ago
luotao1 1283833395 zero_copy tensor support INT32
6 years ago
luotao1 31c4e1d9fc Merge branch 'develop' into zero_copy
6 years ago
luotao1 9e2c7e69fb simplify the zero_copy tests
6 years ago
luotao1 aeee4cbe71 add compare between zerocopy and analysis
6 years ago
Liu Yiqun 6bb84b74b2 Change the download and compress command of cmake.
6 years ago
Tao Luo 25ca2ca001 change init_idx to INT32 in transformer_test
6 years ago
Tao Luo e5e7e9b865 Merge branch 'develop' into transformer_ut
6 years ago
Tao Luo 6f2581e4c5 Merge pull request #16090 from lidanqing-intel/paddle-int32
6 years ago
Zhaolong Xing 3d63aa0a11 Merge pull request #15729 from NHZlX/add_static_model_load_for_trt
6 years ago
nhzlx a9ed427749 cant not pass ci
6 years ago
luotao1 fad06cb928 unify ZeroCopy in analysis_test
6 years ago
lidanqing 4aeb261da9 Add INT32 support. INT32 in last switch case
6 years ago
luotao1 06aab1b493 refine SetCpuMathLibraryNumThreads
6 years ago
nhzlx 3c40cb767b 7 refine zero copy
6 years ago
Yiqun Liu 1616c32acf Add the include of cudnn.h to enable the use of CUDNN_VERSION. (#15961)
6 years ago
flame b187e3728e add anakin fc op converter (#15965)
6 years ago
flame e40d56c3d3 anakin subgraph engine (#15774)
6 years ago
nhzlx 2eff3e26b6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt
6 years ago
nhzlx 06a088a199 fix comments and fix cpplint
6 years ago
nhzlx 0ed63b2108 6. delete useless predictor id
6 years ago
nhzlx 1d5ef7c9ee 5. add static trt load model
6 years ago
Tao Luo 4774dad806 Merge pull request #15857 from sfraczek/fix-typo
6 years ago
Tao Luo e3dd6970fc disable dam temporarily (#15860)
6 years ago
Sylwester Fraczek 1943119fc5 fix typo memeroy->memory
6 years ago
Sylwester Fraczek 8bc604571f fix typo seriazlized->serialized
6 years ago
Sylwester Fraczek 543e53db05 fix typo releated->related
6 years ago
Dun a83e470405 Profiler refine and add CUDA runtime api tracer (#15301)
6 years ago
Yiqun Liu e38dd91f04 Refine cmake's download function. (#15512)
6 years ago
tensor-tang e1c707fe9c fix warnings (#15790)
6 years ago
nhzlx 2070fb246d 4. do the trt_engine optim during init.
6 years ago
nhzlx ecc12fb430 3. when runing in trt mode, do not allocate memory for parameters in fluid.
6 years ago
nhzlx 9cc6249cd6 2. TRTEngine using stream only when execute.
6 years ago
Wojciech Uss daac6a05f5 Removed duplicated code
6 years ago
Yan Chunwei 3a5d6e5e64 move passes to src to avoid different behavior in deployment (#15705)
6 years ago
nhzlx 034ba1c291 add static model load for trt
6 years ago
Yan Chunwei c00ed19df2 add more comment (#15603)
6 years ago
Gabor Buella da9c94da33 Clang build fixes (#15628)
6 years ago
Chunwei d85c2e4e5c fix anakin compile dependency
6 years ago
wopeizl 3614dadf23 Merge pull request #15631 from wopeizl/windows/fixci
6 years ago
peizhilin 061299be87 fix dependency
6 years ago