Wu Yi
22db82c053
fix tangwei merge issue test=develop (#15506)
6 years ago
Xin Pan
24bb6a6aec
expose CompiledProgram
test=develop
6 years ago
Xin Pan
7b73fc9e1a
Merge pull request #15089 from panyx0718/api
try unify Executor and ParallelExecutor
6 years ago
Xin Pan
c4b09a713f
polish
test=develop
6 years ago
chengduo
eabb2105fa
Refactor MultiDevSSAGraphBuilder (#15090)
* Refactor ParallelExecutor
test=develop
* extract Reduce and AllReduce mode from MultiDevSSAGraphBuilder
test=develop
* Refactor MultiDevSSAGraphBuilder
test=developt
* Remove enable_data_balance
test=develop
* code refine
test=develop
* remove data balance
test=develop
* refine ScaleLossGradOp
test=develop
* remove uncessary file
test=develop
* code refine
test=develop
* modify function name
test=develop
* follow comments
test=develop
* add is_distribution field
test=develop
* set is_distribution
test=develop
* fix DistSSAGraphBuilder
test=develop
6 years ago
Xin Pan
cb1891f97b
polish
test=develop
6 years ago
Xin Pan
5e928e579a
try unify Executor and ParallelExecutor
test=develop
6 years ago
chengduo
fe8495a758
[WIP] Refine MultiDevSSAGraph (#15040)
* refine parallel_exe
test=develop
* rename shared_var_device
* code refine
* add test_weight_decay
* remove Sort
test=develop
* Add SortForReduce
test=develop
* code refine
test=develop
* follow comment
test=develop
6 years ago
chengduo
550e7e410b
Code Clean parallel_executor.py (#14849)
* refine parallel_executor
* remove uncessary code
test=develop
6 years ago
gongweibao
f1fb64b17f
Add reduce sparse tensor feature. (#14757)
6 years ago
Wu Yi
29d9fb53fc
[Feature] multi process multi gpu dist training, boost v100 performance by 20% (#14661)
* wip multi process multi gpu dist training
* workable for p2p
* update test=develop
* change back env name test=develop
* fix alloc init
* fix cpu build test=devlop
* fix mac tests test=develop
* refine code
* refine test=develop
6 years ago
gongweibao
867c312bc4
Fix allreduce dependency order. (#14586)
6 years ago
chengduo
2562eb92b8
Add strategy doc (#13849)
* add strategy doc
test=develop
* fix doc
test=develop
* add ParallelExecutor arg doc
test=develop
7 years ago
Xin Pan
a2a94f602e
clean a few more kwargs
7 years ago
Yancey1989
4778c6e21c
delete unused py codes
7 years ago
Yancey1989
b084dfab7e
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_bcast
7 years ago
Yancey1989
5ce1a960a5
move bcast op into pass
7 years ago
gongweibao
392ae69650
Set parallel executor thread num under nccl2 distributed env (#13207)
7 years ago
Wu Yi
f90c7865f0
Benchmark tool for imgnet (#12305)
* support test using executor without reader
* run imgnet
* update fluid benchmark
* wip
* update
* update all models
* support pyreader
* update
* clean up
* make profile batches contollable
* update API.spec
* update scripts
* clean dockerfile
* update
* clean comments
* add scope argument docstring
* use num_trainers to determine nccl init comms
7 years ago
minqiyang
1ef5f2c3e8
Make flowers reader and parallel_executor more efficient
7 years ago
minqiyang
e0d5f8a820
Move compat module to python/paddle
7 years ago
minqiyang
5338417b47
Polish code style
7 years ago
minqiyang
ae39709e59
Polish code
7 years ago
minqiyang
1800fef142
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
Wu Yi
8b77448d5f
hide misc APIs (#12540)
* hide misc APIs
* update
* fix transformer test
* update API.spec
7 years ago
minqiyang
6abe819f07
Fix pybind11 problem
Fix str and bytes problem
Fix sorted problem
Fix math problem
Fix CI problem
7 years ago
minqiyang
1f618c4ff9
Fix the overfix of 2to3 in xrange
7 years ago
minqiyang
a58dd3e557
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
chengduo
7a495a5871
add threads in thread pool for CPU (#12475)
7 years ago
minqiyang
559d36328c
Apply 2to3 to current paddle main python code
7 years ago
Wu Yi
db67d60e31
Remove block api (#12107)
* remove block api
* remove clone_variable
* hide block inner apis
* update
* fix tests
7 years ago
chengduo
c6e36e7738
Change return_numpy [ParallelExecutor] default value (#11713)
* change return_numpy[PE] default value
* Remove convert to numpy in unit test
7 years ago
chengduo
a64844ad00
enable PE return numpy (#11704)
7 years ago
Yancey
2fdbc1ce65
hidden bcast_params call in dist train (#11575)
7 years ago
chengduoZH
73f224d091
add Doc parallel exe
7 years ago
chengduoZH
aadaadf735
replace use_event with use_cuda, because use_event means the program running with CUDA, so use_cuda maybe more intuitive.
7 years ago
chengduoZH
d24e046c1e
fix allReduce bug
7 years ago
chengduoZH
a57e8a4338
add cpu test
7 years ago
chengduoZH
495368c243
ADD CPU_NUM
7 years ago
chengduoZH
a56dcf5159
fix parallel_executor.py and xx_mnist.py
7 years ago
yuyang18
7c777dd549
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/exec_strategy
7 years ago
yuyang18
08295f9877
Add build strategy
7 years ago
typhoonzero
7b0c0273f4
update by comments
7 years ago
yuyang18
e5281b3c2d
Clean code & add execution strategy
7 years ago
typhoonzero
928418a9ac
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gen_nccl_id_op
7 years ago
typhoonzero
f5840d8925
follow comments
7 years ago
chengduoZH
97cb5479ae
change PE strategy
7 years ago
typhoonzero
17009d0627
workable version
7 years ago
typhoonzero
a529d790b6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gen_nccl_id_op
7 years ago
typhoonzero
3667578ec2
testing
7 years ago