Commit Graph

46 Commits (162b4d6c13f6f38a234423bc984fb41710796475)

Author SHA1 Message Date
Leo Chen      bbc84e0fe0  Refine error msg in paddle/fluid/framework/details [part 1] (#25631)  4 years ago
Chen Weihang  d1062d5278  Replace all errors thrown by LOG(FATAL) with PADDLE_THROW (#24759)  5 years ago
Chen Weihang  aa0f254fbe  Add macro BOOST_GET to enrich the error information of boost :: get (#24175)  5 years ago
Wilber        a90fa54092  Compile without nccl deps. [1/2] (#22509)  5 years ago
Wilber        7bc4b09500  add WITH_NCCL option for cmake. (#22384)  5 years ago
WangXi        17299b8d21  fix batch_norm_grad infer shape=0 & add allreduce enforce shape, test=develop (#21801)  5 years ago
WangXi        507afa8a8a  Fix dgc nan by stripping nccl from sparseReduce. (#20630)  5 years ago
chengduo      5866a7a5fe  Enable fused_all_reduce_op_handle support GPU and CPU Gradients (#19418)  6 years ago
Zeng Jinle    708bd9798d  move_flags_to_unified_files_for_management, test=develop (#19224)  6 years ago
gongweibao    29d8781240  Polish fleet API to support cuda collective mode and nccl2 mode. (#18966)  6 years ago
Zeng Jinle    d3003a1620  Feature/buffer_shared_inplace (#17911)  6 years ago
gongweibao    f5caf3443c  Fix reinitialized ncclid error! (#18025)  6 years ago
gongweibao    fbbdc9ccad  Add backward and optimizer operator dependency pass. (#17746)  6 years ago
gongweibao    65bbf950ee  Add multi-ncclcomm and 2D ncclallreduce support. (#17263)  6 years ago
gongweibao    cbdb8a17b1  Polish DGC code (#16818)  6 years ago
gongweibao    8b793d0efd  Fix DGC bug. (#16697)  6 years ago
chengduo      ea2a2f778a  Fix the bug of AllReduceDepPass (#16393)  6 years ago
gongweibao    eb83abeac3  Add DGC(Deep Gradient Compression) interface. (#15841)  6 years ago
chengduo      a6a3b2fbbc  [Speed]Refine ParallelExecutor (#16190)  6 years ago
Dun           a83e470405  Profiler refine and add CUDA runtime api tracer (#15301)  6 years ago
gongweibao    7cd4dd7ce4  Hide varhandle members. (#15382)  6 years ago
Yancey1989    4ad9de74dd  disable sync nccl by default test=develop  6 years ago
Yancey1989    0a885ac12a  Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode  6 years ago
Yancey1989    845bfd5807  cleanup code  6 years ago
peizhilin     1e7f83e60a  add cuda dso support for windows  6 years ago
Yancey1989    41a64f6a2a  Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode  6 years ago
Yancey1989    06936a2ff5  fix 1gpu test=develop  6 years ago
Yancey1989    d3a4da5cf6  fix comment test=develop  6 years ago
Yancey1989    49870f507d  delete unused code test=develop  6 years ago
Yancey1989    4a4ccac1d0  update by comment test=develop  6 years ago
Yu Yang       9bd70a1e04  Change tensor uses proto::VarType::type  6 years ago
Yancey1989    47740ace28  fix performance  6 years ago
Yancey1989    cb8a24be14  clean code  6 years ago
Yancey1989    c9de6f1b05  init parallel graph mode  6 years ago
Wu Yi         29d9fb53fc  [Feature] multi process multi gpu dist training, boost v100 performance by 20% (#14661)  6 years ago
peizhilin     7c8c9dc9bf  fix unit test cases  6 years ago
chengduo      ed087f8232  refine op_handle (#14178)  6 years ago
Yancey1989    1e1b6622fd  update by comment  7 years ago
Yancey1989    5ce1a960a5  move bcast op into pass  7 years ago
Xin Pan       caf10b474f  make profiler use thread_id from g_thread_id  7 years ago
Xin Pan       37e514432b  op compose node and update nodes.  7 years ago
chengduoZH    7b723839ef  Add cpu test for parallel_executor_crf executor_fetch_feed, and enable these tests  7 years ago
chengduoZH    d24e046c1e  fix allReduce bug  7 years ago
chengduoZH    a57e8a4338  add cpu test  7 years ago
chengduoZH    495368c243  ADD CPU_NUM  7 years ago
chengduoZH    27073c284d  nccl_all_reduce_op_handle => all_reduce_op_handle  7 years ago