Paddle

Commit Graph

Author	SHA1	Message	Date
gongweibao	29d8781240	Polish fleet API to support cuda collective mode and nccl2 mode. (#18966 )	6 years ago
Tao Luo	076f833110	add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy (#18580 ) * add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy test=develop * enhance MkldnnPostReset test=develop * add comments for mkldnn_cache_capacity field test=develop	6 years ago
Tao Luo	fe32879d2a	add mkldnn shapeblob cache clear strategy (#18513 ) * add mkldnn shapeblob cache clear strategy test=develop * refine with comments test=develop * make cache clear strategy more safey test=develop * add lock for GetShapeBlobSize test=develop	6 years ago
Tao Luo	3f3112ceb0	add shape_blob for cache mkldnn primitive (#18454 ) test=develop	6 years ago
Leo Zhao	8f5fffca0a	rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id() (#18453 ) * rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id() test=develop * update session id definition and adjust logic for default behavior test=develop * reset logic in mkldnn reuse as most of cases work in default. test=develop	6 years ago
Michał Gallus	8409693272	Reset DeviceContext after quantization warmup (#18182 ) test=develop	6 years ago
chengduo	4978db2c10	Remove nccl dep when the number of GPU is 1 (#18158 ) * remove nccl dep when the number of GPU is 1 test=develop	6 years ago
Zeng Jinle	3ece61f71e	Remove attribute in Allocator::Allocate (#17878 ) * remove attribute in Allocator::Allocate, test=develop * fix travis ci error, test=develop	6 years ago
Zeng Jinle	3925bd81e8	Fix cuda/cudnn version detection error (#17853 ) * fix cuda/cudnn version detection error, test=develop * fix again, test=develop	6 years ago
gongweibao	eb83abeac3	Add DGC(Deep Gradient Compression) interface. (#15841 )	6 years ago
nhzlx	a1d11bb175	fix ci bug: cudnn handler in multi card test=develop	6 years ago
nhzlx	07dcf2856c	git cherry-pick from feature/anakin-engine: update anakin subgraph #16278	6 years ago
qingqing01	86e912c544	Fix windows compiling (#16230 ) test=develop	6 years ago
qingqing01	8ad672a287	Support sync batch norm. (#16121 ) * Support Sync Batch Norm. * Note, do not enable it in one device. Usage: build_strategy = fluid.BuildStrategy() build_strategy.sync_batch_norm = True binary = fluid.compiler.CompiledProgram(tp).with_data_parallel( loss_name=loss_mean.name, build_strategy=build_strategy)	6 years ago
Sylwester Fraczek	74672d1aff	Change (smart_ptr.get()) -> smart_ptr reason: dereferencing smart pointer is the same as the underlying pointer test=develop	6 years ago
sneaxiy	209b355762	fix many warning test=develop	6 years ago
minqiyang	315b133e67	Add single GPU support to imperative	6 years ago
chengduo	064512aa47	Remove workspace_handle in conv_cudnn (#15186 ) * remove workspace_handle in conv2d_cudnn test=develop * remove workspace_handle test=develop * fix bug test=develop * make test_conv2d_op SERIAL test=develop * save memory in conv_cudnn test=develop * enhance thread safety test=develop * enhance temporary allocator test=develop * Add excess fraction test=develop * follow comments test=develop * fix bug and code refine test=develop * fix memory size check test=develop * rename reuse_tmp_allocation_excess_fraction test=develop	6 years ago
Zeng Jinle	e29f10d315	Merge pull request #15207 from sneaxiy/remove_op_handle_lock_and_fix_var Remove op handle lock and fix var	6 years ago
Zeng Jinle	c562be20d9	Merge pull request #15193 from sneaxiy/fix_cudnn_compatible_check Fix cudnn compatible check	6 years ago
sneaxiy	ed409ac9f4	Revert "Revert "Remove op handle lock"" test=develop	6 years ago
Zeng Jinle	dacfaaa966	Revert "Remove op handle lock" test=develop	6 years ago
sneaxiy	9793a0b6a6	fix_cudnn_compatible_check	6 years ago
sneaxiy	d0a8a1e950	remove_op_handle_lock test=develop	6 years ago
sneaxiy	d25395fc98	remove tensor core lock test=develop	6 years ago
chengduo	b9fb03cf54	Move GetTensor to tensor_util (#15011 ) * refine tensor test=develop * refine tensor test=develop * fix device_context log test=develop	6 years ago
chengduo	79bd6dfa18	[Feature] Add Temporary Allocator (#14875 ) * Add Temporal Allocator * add Temporay Allocator to DeviceContext test=develop * code refine test=develop * fix mean_iou test=develop * Add DeviceTemporaryAllocator test=develop * fix conv_op bug test=develop * small fix test=develop * code refine test=develop * log refine test=develop * fix unit test test=develop * move double check * refine concat_and_split test=develop * add limit_of_temporary_allocation test=develop * fix name test=develop	6 years ago
Yan Chunwei	a985949be9	Fea/fuse conv elementwise add fuse (#14669 )	6 years ago
sneaxiy	66182abda6	add cuda cudnn version check test=develop	6 years ago
sneaxiy	0f96c2e80f	fix thread-safety bug test=develop	6 years ago
sneaxiy	900765224c	fix deallocate bug test=develop	6 years ago
Yu Yang	d93b2d0365	Refine code	6 years ago
sneaxiy	d231e55065	merge develop test=develop	6 years ago
qingqing01	abe209234f	Exhaustive search for cuDNN conv. (#14286 ) * exhaustive search for cuDNN conv. * Refine code and add unit testing. * Fix model load in fluid/inference and unit testing in conv2d * Follow comments. * Fix compiling test=develop	6 years ago
Zhaolong Xing	ba8b5619a3	Revert "cherry picked windows patches."	6 years ago
Yu Yang	c774bcbd2d	Merge device_context test=develop	6 years ago
Yu Yang	057a682ee9	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation	6 years ago
dzhwinter	2835e04409	merge develop branch. test=develop	6 years ago
qingqing01	db8c52da5e	Revert " Exhaustive search for cuDNN conv. (#14043 )" This reverts commit `ce7d9b0799`.	6 years ago
qingqing01	ce7d9b0799	Exhaustive search for cuDNN conv. (#14043 ) * exhaustive search for cuDNN conv. * Refine code and add unit testing. * Clean code * Fix model load in fluid/inference and unit testing in conv2d * Follow comments.	6 years ago
Zeng Jinle	8ac2242b6e	Merge pull request #14075 from sneaxiy/remove_some_locks_in_pe Remove some locks in ParallelExecutor	6 years ago
sneaxiy	faac8a76ce	remove unnecessary codes test=develop	6 years ago
Yu Yang	ff9e531bd9	style(platform): disable warning when cuda cc not matched (#14029 ) Warning only at first when CUDA CC not matched. test=develop	6 years ago
sneaxiy	7ff320f8cc	merge develop	6 years ago
dzhwinter	1ace55c8ee	merge develop branch	6 years ago
Yu Yang	90d9e5aee8	feat(platform): lazy initialization of devicecontext in pool (#14067 ) * feat(platform): lazy initialization of devicecontext in pool Use std::async(deferer, []{...}) to lazy initialize DeviceContext in Pool test=develop * Add future includes test=develop	6 years ago
dzhwinter	bf2e4cb188	cleard. staged	6 years ago
dzhwinter	ebfe5a02b3	merge develop branch	6 years ago
Yu Yang	c01696f8c2	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation test=develop	6 years ago
Sylwester Fraczek	2098b42584	review fixes (Teamcity fails) test=develop	6 years ago

89 Commits (43a82d83fa60be0524a752a10e522429783cbf67)