Paddle

Commit Graph

Author	SHA1	Message	Date
Wojciech Uss	46677fb080	Move cpu_quantize_* passes into mkldnn subfolder test=develop	6 years ago
liuwei1031	de3b70a101	fix cdn issue, test=develop (#16423 ) * fix cdn issue, test=develop * fix cdn issue, test=develop	6 years ago
Tao Luo	294cdf6f48	Merge pull request #16177 from fc500110/remove_visualizer remove graph visualizer tool, which can be replaced by python IrGraph draw API	6 years ago
Wojciech Uss	cbe2dbf0db	Add enabling quantization (#16326 ) * Add enabling quantization test=develop * remove unused (here) function	6 years ago
luotao1	82af8031d9	add runtime_context_cache_pass test=develop	6 years ago
Tao Luo	7d2740db83	Revert "cache runtime_context"	6 years ago
Jacek Czaja	13816dd4ac	[MKL-DNN] Fix to crash of Transformer when mkldnn is to be used (#16233 ) * - Fix to crash of Transformer when mkldnn is to be used Desc: TensorCopy was not setting MKLDNN primitive descriptor when layout was to be kMKLDNN test=develop * - Enable transformer for mkl-dnn test=develo * - Compilation fix test=develop * - Removed manual selection of MKL-DNN ops to be used in Transformer test test=develop	6 years ago
Tao Luo	dbb92ee4b1	Merge pull request #16002 from luotao1/runtime_context cache runtime_context	6 years ago
Qiyang Min	8e4ad008fb	Merge pull request #16198 from velconia/imperative_train_speed Improve imperative mode training speed	6 years ago
luotao1	a275fd6e0c	Merge branch 'develop' into runtime_context	6 years ago
Wojciech Uss	2579ade45f	Add cpu_quantize_pass for C-API quantization (#16127 ) * Add cpu_quantize_pass for C-API quantization test=develop * add cpu_quantize_pass test * fix lint: add include memory unorderd_map and unordered_set test=develop * fuse_relu 1 test=develop * tuned 2 without squash * fixes test=develop * remove unused vars test=develop * refactored test=develop * fix lint c-style cast -> C++ style cast test=develop * remove QuantMax and c style casts test=develop * last usage of QuantMax removed test=develop * Fix Analysis Predictor UT Check if memory_optimize_pass has already been added to the analysis config before adding a new one, so that it is not added multiple times. test=develop * change map to unordered_map fix the forgotten part of cpu_quantize_pass_tester.cc test=develop * removed quantized attribute * fixed cpu_quantize_pass_tester and op attr comments test=develop * removed redundant line test=debug * removed gmock test=develop * fix after merge	6 years ago
luotao1	5ecdc49c6b	set enable_runtime_context_cache_ default false test=develop	6 years ago
minqiyang	7355d41834	1. Add imperative gperf profiler 2. Add binutils 2.27 in manylinux support test=develop	6 years ago
minqiyang	98dfb492bb	Release GIL lock	6 years ago
minqiyang	42e96a029f	Accelerate CPU part	6 years ago
luotao1	1510b866b6	turn off runtime_context_cache for tensorrt test=develop	6 years ago
luotao1	d94fd97230	add runtime_context_cache_pass test=develop	6 years ago
fc500110	1c6e72b905	remove visualizer, which can be replaced by python IrGraph draw API	6 years ago
Tao Luo	c49b7855fa	Merge pull request #16120 from Xreki/fix_cmake_compress Change the download and compress command of cmake.	6 years ago
Liu Yiqun	4e052e0ac9	Disable inference download for WIN32 temporary. test=develop	6 years ago
luotao1	1283833395	zero_copy tensor support INT32 test=develop	6 years ago
luotao1	31c4e1d9fc	Merge branch 'develop' into zero_copy	6 years ago
luotao1	9e2c7e69fb	simplify the zero_copy tests test=develop	6 years ago
luotao1	aeee4cbe71	add compare between zerocopy and analysis	6 years ago
Liu Yiqun	6bb84b74b2	Change the download and compress command of cmake. test=develop	6 years ago
Tao Luo	25ca2ca001	change init_idx to INT32 in transformer_test test=develop	6 years ago
Tao Luo	e5e7e9b865	Merge branch 'develop' into transformer_ut	6 years ago
Tao Luo	6f2581e4c5	Merge pull request #16090 from lidanqing-intel/paddle-int32 Add PaddleDType INT32 support	6 years ago
Zhaolong Xing	3d63aa0a11	Merge pull request #15729 from NHZlX/add_static_model_load_for_trt Four points for enhancing Paddle-TRT	6 years ago
nhzlx	a9ed427749	cant not pass ci add if use static engine for trt test=develop	6 years ago
luotao1	fad06cb928	unify ZeroCopy in analysis_test	6 years ago
lidanqing	4aeb261da9	Add INT32 support. INT32 in last switch case test=develop	6 years ago
luotao1	06aab1b493	refine SetCpuMathLibraryNumThreads test=develop	6 years ago
nhzlx	3c40cb767b	7 refine zero copy update trt in docker file test=develop	6 years ago
Yiqun Liu	1616c32acf	Add the include of cudnn.h to enable the use of CUDNN_VERSION. (#15961 ) test=develop	6 years ago
flame	b187e3728e	add anakin fc op converter (#15965 )	6 years ago
flame	e40d56c3d3	anakin subgraph engine (#15774 ) * add anakin subgraph engine * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * add initial op converter * update * update * fix op register compile error * update test=develop * update	6 years ago
nhzlx	2eff3e26b6	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt	6 years ago
nhzlx	06a088a199	fix comments and fix cpplint test=develop	6 years ago
nhzlx	0ed63b2108	6. delete useless predictor id test=develop	6 years ago
nhzlx	1d5ef7c9ee	5. add static trt load model 1). add static trt load model 2). fix bug: when device_id is not 0, the trt will have a bug test=develop	6 years ago
Tao Luo	4774dad806	Merge pull request #15857 from sfraczek/fix-typo Fix few typos	6 years ago
Tao Luo	e3dd6970fc	disable dam temporarily (#15860 ) test=develop	6 years ago
Sylwester Fraczek	1943119fc5	fix typo memeroy->memory test=develop	6 years ago
Sylwester Fraczek	8bc604571f	fix typo seriazlized->serialized	6 years ago
Sylwester Fraczek	543e53db05	fix typo releated->related	6 years ago
Dun	a83e470405	Profiler refine and add CUDA runtime api tracer (#15301 ) * refine profiler && add runtime tracer * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * fix bug && test=develop * add thread id map && test=develop * test=develop * testing * bug fix * remove cuda event && refine code && test=develop * test=develop * test=develop * test=develop * fix windows temp file && test=develop * test=develop * fix windows bug && test=develop * fix start up issue && test=develop * code polish && test=develop * remove unused code && test=develop * add some cupti cbid && test=develop * add FLAGS_multiple_of_cupti_buffer_size && test=develop * fix compile error && test=develop * add keyword && test=develop * fix && test=develop * code polish && test=develop	6 years ago
Yiqun Liu	e38dd91f04	Refine cmake's download function. (#15512 ) * Refine cmake's download function. test=develop * Set DOWNLOAD_NO_EXTRACT to 1 pure download function. test=develop * Fix unpack problem in ExternalProject_Add, and it seem DOWNLOAD_NO_EXTRACT option is not support in cmake-3.5. test=develop	6 years ago
tensor-tang	e1c707fe9c	fix warnings (#15790 ) * fix warnings test=develop * fix enforce test test=develop	6 years ago
nhzlx	2070fb246d	4. do the trt_engine optim during init. add simple static mode loading test=develop	6 years ago

1 2 3 4 5 ...

980 Commits (c34b24ede782612464bc4c7cad47c40661616e9d)