chengduo
a6a3b2fbbc
[Speed]Refine ParallelExecutor ( #16190 )
...
* refine parallelExecutor
test=develop
* Polish op_handle
test=develop
* Remove unnecessary op_handle
test=develop
* Fix Travis CI
test=develop
* Fix fetch bug
test=develop
* Remove WaitInputVarGenerated
* Fix OpHandleBase::Run
test=develop
* debug
test=develop
* use origin fetch_op_handle
test=develop
* Revert op_handle_base.cc
test=develop
* Polish code
test=develop
* Fix OpHandleBase::Run
test=develop
* code refine
* test CI and CE
test=develop
* fix OpHandle::Run
test=develop
* refine AllReduceOpHandle
test=develop
* Polish code
test=develop
6 years ago
chengduo
f26ba5bddd
Fuse AllReduce ( #15921 )
...
* fuse all_reduce
test=develop
* add fuse_parameter_groups_size
test=develop
* Polish code
test=develop
* Fix travis-ci
test=develop
* Add SetGroupAccordingToLayers and SetGroupAccordingToGroupSize
test=develop
* Add SetGroupAccordingToMemorySize
test=develop
* fix multi_devices_graph
test=develop
* reset params_grads
test=develop
* Polish code
test=develop
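Note on the grouping steps above: the commit names SetGroupAccordingToLayers, SetGroupAccordingToGroupSize and SetGroupAccordingToMemorySize but does not show their logic. As a rough illustration only, the sketch below shows one plausible way to bucket gradients by a memory budget so that each bucket can be reduced in a single fused all-reduce call. The function name `group_by_memory_size`, the gradient names, and the byte sizes are all hypothetical and are not taken from the Paddle source.

```python
from typing import List, Tuple


def group_by_memory_size(
    grad_sizes: List[Tuple[str, int]], limit_bytes: int
) -> List[List[str]]:
    """Greedily pack gradients, in order, into groups of at most limit_bytes.

    Hypothetical helper for illustration; a single gradient larger than the
    limit still gets a group of its own.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name, nbytes in grad_sizes:
        # Close the current group when adding this gradient would exceed the budget.
        if current and current_bytes + nbytes > limit_bytes:
            groups.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += nbytes
    if current:
        groups.append(current)
    return groups


# Made-up gradient names and sizes (bytes), for illustration only.
grads = [
    ("fc1.w@GRAD", 400),
    ("fc1.b@GRAD", 40),
    ("fc2.w@GRAD", 800),
    ("fc2.b@GRAD", 20),
]
groups = group_by_memory_size(grads, limit_bytes=1000)
```

Each resulting group would then be handed to one all-reduce instead of one call per gradient; the actual pass may order or split groups differently.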
6 years ago
qingqing01
8ad672a287
Support sync batch norm. ( #16121 )
...
* Support Sync Batch Norm.
* Note: do not enable it when running on a single device.
Usage:
build_strategy = fluid.BuildStrategy()
build_strategy.sync_batch_norm = True
binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(
    loss_name=loss_mean.name,
    build_strategy=build_strategy)
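For context on what the flag above changes: synchronized batch norm computes the batch mean and variance over the samples on all devices rather than per device. The sketch below illustrates that aggregation in plain Python, assuming each device contributes a partial sum, a sum of squares, and a count. It is not Paddle's implementation, and the helper names `local_stats` and `synced_mean_var` are hypothetical.

```python
from typing import List, Tuple


def local_stats(batch: List[float]) -> Tuple[float, float, int]:
    """Return (sum, sum of squares, count) for one device's mini-batch."""
    return sum(batch), sum(x * x for x in batch), len(batch)


def synced_mean_var(per_device_batches: List[List[float]]) -> Tuple[float, float]:
    """Combine per-device partial sums into a global mean and biased variance."""
    total = 0.0
    total_sq = 0.0
    count = 0
    for batch in per_device_batches:
        s, sq, n = local_stats(batch)
        total += s
        total_sq += sq
        count += n
    mean = total / count
    var = total_sq / count - mean * mean  # E[x^2] - (E[x])^2
    return mean, var


# Two hypothetical devices, each holding half of the global batch.
devices = [[1.0, 2.0], [3.0, 4.0]]
mean, var = synced_mean_var(devices)
```

In a real multi-device run the partial sums would be exchanged with a collective operation such as all-reduce; here they are simply summed in a loop.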
6 years ago
Xin Pan
a6e3cd5eb7
Merge pull request #15425 from panyx0718/api
...
Pass graph to parallel executor instead of program
6 years ago
乔龙飞 Qiao Longfei
ec8e878200
Merge pull request #15840 from jacquesqiao/revert-15684-revert-15661-fix-cpu-broadcast
...
fix cpu broadcast
6 years ago
Qiao Longfei
2b7931d5c9
refine code test=develop
6 years ago
Xin Pan
19d78f6797
polish
...
test=develop
6 years ago
Xin Pan
32d5a16036
resolve conflicts
...
test=develop
6 years ago
Xin Pan
26e32e095a
allow compiler to use graph
...
test=develop
6 years ago
Xin Pan
6019054cdd
Merge pull request #15716 from Yancey1989/refine_pg
...
Refine ParallelGraph Execution
6 years ago
Yancey1989
d5090c892d
polish code test=develop
6 years ago
Yancey1989
0f8bd73cc9
cleanup code test=develop
6 years ago
dzhwinter
d376cf71b7
polish code for reading. test=develop
6 years ago
Yancey1989
73005ee00d
cleanup code test=develop
6 years ago
Yancey1989
88d3dc949e
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into refine_pg
...
test=develop
6 years ago
Yancey1989
f3463ecb6e
refine pg execution
6 years ago
乔龙飞 Qiao Longfei
45b19cbc9a
Revert "Revert "cpu reduce mode did not need to broadcast params test=develop""
6 years ago
乔龙飞 Qiao Longfei
6e0e706198
Revert "cpu reduce mode did not need to broadcast params test=develop"
6 years ago
Qiao Longfei
97b143fb49
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-cpu-broadcast
...
test=develop
6 years ago
Qiao Longfei
fbadd4b60c
follow comment test=develop
6 years ago
dzhwinter
04e9776aef
add details. test=develop
6 years ago
Qiao Longfei
76072261f8
fix compiler
...
test=develop
6 years ago
dzhwinter
e537634d16
delete graph print pass. test=develop
6 years ago
dzhwinter
0a63234c85
follow comments. test=develop
6 years ago
dzhwinter
32a2014939
refine build strategy. test=develop
6 years ago
dzhwinter
ee3aae56cd
merge develop branch. test=develop
6 years ago
dzhwinter
2739096eec
compatible with python side mem_opt
6 years ago
WangZhen
3ce6172052
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into graph_quantization
6 years ago
dzhwinter
8f3b252392
squash commits. test=develop
6 years ago
Dun
9f8f0fc2d3
Memory optimization of depthwise conv op and group norm op ( #15313 )
...
* mem opt
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine with cub test=develop
* fix mkldnn test && remove comments && test=develop
* polish code && test=develop
* add only_forward test && test=develop
6 years ago
WangZhen
e2ff300b02
add UT for quantization.
6 years ago
WangZhen
451896fce4
init quantization.
6 years ago
minqiyang
68a07328fa
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_pyramid_dnn_support
...
test=develop
6 years ago
minqiyang
4bfa110fd8
Add no lock optimize pass
...
test=develop
6 years ago
chengduo
eabb2105fa
Refactor MultiDevSSAGraphBuilder ( #15090 )
...
* Refactor ParallelExecutor
test=develop
* extract Reduce and AllReduce mode from MultiDevSSAGraphBuilder
test=develop
* Refactor MultiDevSSAGraphBuilder
test=develop
* Remove enable_data_balance
test=develop
* code refine
test=develop
* remove data balance
test=develop
* refine ScaleLossGradOp
test=develop
* remove unnecessary file
test=develop
* code refine
test=develop
* modify function name
test=develop
* follow comments
test=develop
* add is_distribution field
test=develop
* set is_distribution
test=develop
* fix DistSSAGraphBuilder
test=develop
6 years ago
Yancey1989
e65436103f
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
...
test=develop
6 years ago
Wu Yi
227e0c4518
fix nccl2 mode startup test=develop ( #15132 )
6 years ago
Yancey1989
ca8c77d966
select execution according to strategy test=develop
6 years ago
Yancey1989
1a4f79a7de
fix unittest test=develop
6 years ago
Yancey1989
845bfd5807
cleanup code
6 years ago
Yancey1989
41a64f6a2a
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
chengduo
550e7e410b
Code Clean parallel_executor.py ( #14849 )
...
* refine parallel_executor
* remove unnecessary code
test=develop
6 years ago
dzhwinter
7cd24b1318
add ir memory optimize. ( #14530 )
...
* follow comments. test=develop
* Fix typo
* fix compile error. test=develop
* merge develop branch. test=develop
* Remove set_equal
* Polish code
* Delete unused functions
test=develop
* polish code. test=develop
* follow comment
* polish code.
* fix windows compile error. test=develop
* fix op handle.
* rerun ci. test=develop
* rerun ci. test=develop
* rerun macci. test=develop
* polish code. test=develop
* rewrite sort code. test=develop
* remove unused code. test=develop
* fix tests. test=develop
* fix conflict. test=develop
* follow comment. test=develop
* merge develop branch. test=develop
* fix tests. test=develop
* remove ToTypeIndex. test=develop
* rerun ci. test=develop
6 years ago
Yancey1989
fd144954ed
redefine api test=develop
6 years ago
gongweibao
f1fb64b17f
Add reduce sparse tensor feature. ( #14757 )
6 years ago
Wu Yi
29d9fb53fc
[Feature] multi process multi gpu dist training, boost v100 performance by 20% ( #14661 )
...
* wip multi process multi gpu dist training
* workable for p2p
* update test=develop
* change back env name test=develop
* fix alloc init
* fix cpu build test=develop
* fix mac tests test=develop
* refine code
* refine test=develop
6 years ago
gongweibao
867c312bc4
Fix allreduce dependency order. ( #14586 )
6 years ago
peizhilin
7c8c9dc9bf
fix unit test cases
6 years ago
Xin Pan
759ffca423
some improvements
...
test=develop
6 years ago
Xin Pan
99dffb91d6
allow to repeatedly share and update BuildStrategy
...
test=develop
6 years ago