49 Commits (2101dfd2b3552defe4e6e14f7eb96321ffe41fc2)

Author SHA1 Message Date
Dong Daxiang 50a5bcfc9d
【paddle.fleet】paddle.fleet -> paddle.distributed.fleet. (#26186)
6 years ago
123malin 2191a08317
【paddle.fleet】fleet_util move to paddle.fleet (#25805)
6 years ago
Thunderbrook 0cb60c700d
add heter ps mode (#25682)
6 years ago
tangwei12 caa90a6510
Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) (#22957)
6 years ago
xujiaqi01 3016a4ac27
add mock barrier all (#24786)
6 years ago
xujiaqi01 1034ca316f
add timeout and http store in communication (#23436)
6 years ago
Chengmo ddd604446b
Fix judge pslib transpiler (#23720)
6 years ago
tangwei12 c4a6a0e2e4
Revert "Integrated API of Parameter Server (#22710)" test=develop (#23071)
6 years ago
tangwei12 66fce9e824
Integrated API of Parameter Server (#22710)
6 years ago
tianshuo78520a d2ba91aad1
fix typo words (#22653)
6 years ago
xujiaqi01 371f377bea
add GeneralRoleMaker (#22295)
6 years ago
Zhen Wang be2e3e67d9
Fix some typos in AMP. (#21354)
7 years ago
Dong Daxiang ccbdd7aad0
update worker_num for MPISymetricRoleMaker (#20798)
7 years ago
Chengmo 16596f6498
Fix Paddle Cloud role maker (#20860)
7 years ago
tangwei12 278dd00322
paddle cloud role maker fix (#19646)
7 years ago
gongweibao 6c2bc29cc0
Fix float16 optimizer. (#19682)
7 years ago
123malin a25a716e87
Optimize fleet API: add input check for some interfaces (#18971)
7 years ago
gongweibao 86f0591175
Remove node_num function. (#19167)
7 years ago
gongweibao 29d8781240
Polish fleet API to support cuda collective mode and nccl2 mode. (#18966)
7 years ago
jiaqi 02c370c3dc
support filelist size < trainer num && fix pull dense (#18956)
7 years ago
guru4elephant 30562e371b
refine launch_ps and role_maker (#18795)
7 years ago
tangwei12 d845848341
do some odd jobs (#18641)
7 years ago
guru4elephant 9c17a899d7
upgrade collective fleet api (#18533)
7 years ago
guru4elephant 1f1cc2221f
add random port (#18504)
7 years ago
guru4elephant 357311fdb7
make fleet support mpi job submit directly (#18441)
7 years ago
HaoRen b7128bac5f
supports collective communicated training (#18175)
7 years ago
guru4elephant ff399fd720
fix paddle cloud role maker bug (#18269)
7 years ago
Qiao Longfei 23f8a4b1c3
assign role_maker before use (#18137)
7 years ago
guru4elephant 58f3e1bad7
add paddle cloud role maker for customized usage, note this is only for industrial users that have cloud environment pre-configuration (#18121)
7 years ago
tangwei12 101f74cb19
fix save/load in fleet (#17675)
7 years ago
lilong12 b5c35ae3e7
add UserDefinedCollectiveRoleMaker for collective mode (#17898)
7 years ago
Qiao Longfei 58f7695ab2
Async exe support communicator (#17386)
7 years ago
jiaqi 34369944f5
support config file, cvm, load, save, shrink (#17319)
7 years ago
tangwei12 565d309501
Reformat fleet API (#17135)
7 years ago
tangwei12 1a4a51db2b
Fleet unify distributed training (#16791)
7 years ago
jiaqi 7968887fae
Merge branch 'develop' into dataset_merge_develop
7 years ago
dongdaxiang ceac9df87a
fix code style for incubator
7 years ago
xjqbest d5ee580c5c
move split filelist from trainer.py to fleet & fix error
7 years ago
xjqbest a99c8d0c29
fix client to client communication bug
7 years ago
xjqbest a38b98cb32
fix code style & runtime error
7 years ago
dongdaxiang 17790188d0
make role maker and distributed optimizer private
7 years ago
xujiaqi01 f5c6a14b54
fix runtime error
7 years ago
xujiaqi01 a5b1a0e12b
support multi dataset && add init model && fix bug
7 years ago
dongdaxiang 3c65cc1bbd
add document for role_maker and fleet parameter, data_generator
7 years ago
dongdaxiang 3e38d1db46
add trainfileswithprofiler for downpour worker
7 years ago
dongdaxiang 6af697adb0
add trainfileswithprofiler for downpour worker
7 years ago
dongdaxiang ea5851fa69
add comment for MPI Symetric role maker
7 years ago
dongdaxiang f612877797
add incubate for unified API
7 years ago
dongdaxiang 3641a78b01
add incubate for unified API
7 years ago