185 Commits (e429deb0c42c51a647bfc9d90e41be0382bded8a)

Author SHA1 Message Date
Chengmo 09482ddec4
【Paddle.Fleet】Fix one ps gradient clip (#31664)
4 years ago
Thunderbrook 3789a69923
solve bug in heter mode (#31531)
4 years ago
Thunderbrook c4f279fe8d
support multi node in heterps (#31102)
4 years ago
tangwei12 ebbdf52557
fix entry (#31079)
4 years ago
Thunderbrook 565354f676
support save multi sparse table in one path (#31108)
4 years ago
Chengmo 528e03fc08
【Paddle.Fleet】Fix tensor table (#30075)
4 years ago
Thunderbrook 0b8e1fadc5
add topo-aware in heter-ps (#30087)
4 years ago
tangwei12 032414ca2a
[Feature] one ps (3/4) (#29604)
5 years ago
Thunderbrook 09b6e71928
heter box (#29734)
5 years ago
Zhen Wang be3777a50a
Add pure fp16 training with master weights. (#27712)
5 years ago
123malin b5c6342336
Update ps gpu (#29209)
5 years ago
123malin 92817f8005
test=develop, rm pathlib (#28658)
5 years ago
yaoxuefeng 545df287fc
add user_define_dump (#28596)
5 years ago
Leo Chen 3815d7aa40
Upgrade string literals to raw string (#28989)
5 years ago
Thunderbrook 0073f9bdb0
support ps-gpu (#28752)
5 years ago
Chengmo 4dc8c44ba1
【Paddle.Fleet】Fix fleetrun heter (#28252)
5 years ago
MRXLT 55098b975e
fleet support paddle.optimzier (#28026)
5 years ago
123malin aa3b4ed717
【paddle.fleet】geo send sparse optimize (#27719)
5 years ago
Chengmo 328cb289ed
【paddle.fleet】fix sparse load (#27680)
5 years ago
zhang wenhui 5a83496c8d
Multi task (#26002)
5 years ago
Chengmo c5f2802d56
【paddle.fleet】Update fleetrun & ps-heter (#27472)
5 years ago
123malin cc780b1977
test=develop, optimize geo communicator (#26857)
5 years ago
lilong12 bbc2add703
Initialize gloo for low level collective apis (#27672)
5 years ago
yaoxuefeng 780140599f
【paddle.distributed.fleet】add data_generator in distributed.fleet.dataset (#27345)
5 years ago
lilong12 36c0410223
Revert "Initialize gloo for low level collective apis (#27356)", test=document_fix (#27665)
5 years ago
123malin 6822307745
test=develop, rm netifaces (#27581)
5 years ago
lilong12 fa73e4a284
Initialize gloo for low level collective apis (#27356)
5 years ago
tangwei12 bc5f0246a8
large scale kv speedup (#26510)
5 years ago
tangwei12 d6b54de467
【paddle.fleet】Fix/role maker api fix (#27326)
5 years ago
123malin f36b9a7f79
【Fleet2.0 Util】 add documents (#26698)
5 years ago
gongweibao 11bcf0e21c
Cleanup redundant code files (#27319)
5 years ago
123malin f2d68d3ed5
【paddle.fleet】parameter_server_optimizer support auto_strategy (#26838)
5 years ago
Chengmo c4846196b8
fix Heter Ps multi thread (#26876)
5 years ago
Chengmo d0962abd20
supplement bug fix of parameter server (#26217)
5 years ago
yaoxuefeng a47d92d868
fleet add save with whitelist test=develop (#23376)
5 years ago
Chengmo 7f2aa2db3c
【paddle.fleet】Support Heter Parameter Server (#25998)
5 years ago
tangwei12 8e4ed662d1
fix decay global counter (#26387)
5 years ago
Chengmo eeeef957c7
Fix ps gpu (#26218)
5 years ago
Dong Daxiang 50a5bcfc9d
【paddle.fleet】paddle.fleet -> paddle.distributed.fleet. (#26186)
5 years ago
gongweibao a7c5210051
Fix test_hdfs bug. (#26068)
5 years ago
gongweibao 0067a2e4ec
Save checkpoint automatically (#25917)
5 years ago
tangwei12 3755564ae1
Fix/large scale fix (#25999)
5 years ago
123malin 2191a08317
【paddle.fleet】fleet_util move to paddle.fleet (#25805)
5 years ago
Thunderbrook 0cb60c700d
add heter ps mode (#25682)
5 years ago
gentelyang 6773fcc1ba
fix stratege.set_program_config (#25864)
5 years ago
tangwei12 caa90a6510
Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) (#22957)
5 years ago
gongweibao 80f1c50738
Fix typo in interface. (#24779)
5 years ago
xujiaqi01 3016a4ac27
add mock barrier all (#24786)
5 years ago
xujiaqi01 1034ca316f
add timeout and http store in communication (#23436)
5 years ago
mapingshuo f0e743f136
fix AMP and recompute (#23551)
5 years ago