126 Commits (24a063f6ac0ba1122b5b6bec524c6ec659197e5f)

Author SHA1 Message Date
gongweibao 24a063f6ac
Add fleet checkpoint on local fs and remote fs(such as hdfs) for EDL (#22586)
5 years ago
xujiaqi01 21f59779ea
fix dump slot in strategy (#23398)
5 years ago
xujiaqi01 3a45767d49
add fleet pslib pull and push sparse op and push dense op (#23139)
5 years ago
xujiaqi01 c8f9e66b71
fix no_cvm in config_fleet (#22818)
5 years ago
xujiaqi01 68ea1ad55b
add clear one table (#23089)
5 years ago
yaoxuefeng 995a6376f7
add pslib SparseDoubleTable test=develop (#23053)
5 years ago
tangwei12 c4a6a0e2e4
Revert "Integrated API of Parameter Server (#22710)" test=develop (#23071)
5 years ago
tangwei12 66fce9e824
Integrated API of Parameter Server (#22710)
5 years ago
yaoxuefeng c5cbe7f07b
fix add grad bug test=develop (#22924)
5 years ago
123malin 0f9d40816e
test=develop, optimize distributedstrategy (#22677)
5 years ago
tianshuo78520a 433cef03e5
fix typo word (#22784)
5 years ago
tianshuo78520a d2ba91aad1
fix typo words (#22653)
5 years ago
tangwei12 66a3150135
SYNC with communicaotor (#22344)
5 years ago
123malin 00594c1c88
support dumping params/grads in transpiler mode (#22490)
5 years ago
123malin e59463efc7
test=develop, add distributed tools (#22623)
5 years ago
tangwei12 1aab3e61c9
add texttable for pretty flag output (#22584)
5 years ago
tangwei12 b0675c8193
fix bug with compiledProgram (#22495)
5 years ago
yaoxuefeng 2235ee1a5e
multi-loss optimization by adding a DownpourOpt worker (#22025)
5 years ago
xujiaqi01 6e4f39a061
add hdfs ls retry time and sleep time, fix save inference (#22433)
5 years ago
tangwei12 7e2665c58b
fix bug with half (#22378)
5 years ago
xujiaqi01 371f377bea
add GeneralRoleMaker (#22295)
5 years ago
tangwei12 82bc814a57
integrated HALF_ASYNC to communicator (#21869)
5 years ago
123malin 7fb817d447
add distributed_strategy (#21710)
5 years ago
WangXi 3ec289a6a3
fix sync_batch_norm hang in fleet (#21838)
5 years ago
lilong12 da75ac8b6c
bugfix: construct a DistributedStrategy instance if the passed one is None (#21545)
5 years ago
xujiaqi01 f1178e9d79
fix fleet save bug (#21362)
5 years ago
Zhen Wang be2e3e67d9
Fix some typos in AMP. (#21354)
5 years ago
Thunderbrook 9a7832f8be
print table stat info for pslib (#21296)
5 years ago
xujiaqi01 319d2ba925
fix fs_client_param bug (#21212)
5 years ago
Thunderbrook 0d17c1b816
solve pslib core in stop worker (#21263)
5 years ago
xujiaqi01 eca66f317e
fix fleet util bug (#21254)
5 years ago
Thunderbrook 349e82d669
support general embedding params (#21217)
5 years ago
Dong Daxiang ccbdd7aad0
update worker_num for MPISymetricRoleMaker (#20798)
5 years ago
xujiaqi01 23876de55b
fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21052)
5 years ago
xujiaqi01 9e045170c0
add copy table (#21086)
5 years ago
lilong12 53148e0696
modify the implementation of save_persistables and save_inference_model for fleet collective mode (#20802)
5 years ago
Thunderbrook 5970e8ac5e
find lookup table in order (#20932)
5 years ago
Chengmo 16596f6498
Fix Paddle Cloud role maker (#20860)
5 years ago
Bai Yifan ac87d4e6e1
fix hdfs.download, test=develop (#20907)
5 years ago
Thunderbrook 59bcdc8a19
support dump param of model into afs (#20302)
5 years ago
xujiaqi01 48669aa8f0
fix several sparse table issuses (#20686)
5 years ago
xujiaqi01 5223b0dd9d
add check nan / inf in downpour worker (#20694)
5 years ago
Chengmo 940c6ff1c8
Fix communicator slow bug & fix communicator stop bug (#20366)
5 years ago
WangXi cadc6a9704
fix dgc test and bug when not set trainers_endpoints_, test=develop (#20617)
5 years ago
mapingshuo f55d1c6867
Fleet: deal with special case: strategy is None (#20359)
5 years ago
Thunderbrook f76a32df4a
dump fix dov vec file num (#20539)
5 years ago
zhang wenhui b521992041
fix converter , test=develop (#20522)
5 years ago
zhang wenhui b82e6520e1
fix pslib datanorm double bug (#20297)
5 years ago
zhang wenhui b28d4a824f
fix fleet_desc delete_after_unseen_day bug in node.py (#20091)
5 years ago
Chengmo 728ec1b43d
Add GEO-SGD distribute training algorithm (#20018)
5 years ago