49 Commits (ec1000cca97a9c0270f7f12fc084f0e800e21503)

Author SHA1 Message Date
guru4elephant 9c17a899d7 upgrade collective fleet api (#18533)
6 years ago
guru4elephant 1f1cc2221f add random port (#18504)
6 years ago
guru4elephant 357311fdb7 make fleet support mpi job submit directly (#18441)
6 years ago
guru4elephant e83f902b98 add MultiSlotStringDataGenerator for speedup of string based user inp… (#18390)
6 years ago
tangwei12 999d9a59a5 fix communicator with pyreader (#18350)
6 years ago
HaoRen b7128bac5f supports collective communicated training (#18175)
6 years ago
guru4elephant ff399fd720 fix paddle cloud role maker bug (#18269)
6 years ago
songhao 432fda51aa fix bug in Class MultiSlotDataGenerator's function _gen_str, test=develop (#18222)
6 years ago
Qiao Longfei 23f8a4b1c3 assign role_maker before use (#18137)
6 years ago
guru4elephant 58f3e1bad7 add paddle cloud role maker for customized usage, note this is only for industrial users that have cloud environment pre-configuration (#18121)
6 years ago
tangwei12 4c735f24ea fix bug in fleet, test=develop (#18058)
6 years ago
tangwei12 101f74cb19 fix save/load in fleet (#17675)
6 years ago
Kaipeng Deng 96ee528e3e fix logging basicConfig cannot be setting after import paddle (#17786)
6 years ago
lilong12 b5c35ae3e7 add UserDefinedCollectiveRoleMaker for collective mode (#17898)
6 years ago
Qiao Longfei 58f7695ab2 Async exe support communicator (#17386)
6 years ago
jiaqi 05df39ac06 support sparse table get shard_num from TableParameter (#17443)
6 years ago
jiaqi 34369944f5 support config file, cvm, load, save, shrink (#17319)
6 years ago
tangwei12 565d309501 Reformat fleet API (#17135)
6 years ago
tangwei12 1a4a51db2b Fleet unify distributed training (#16791)
6 years ago
jiaqi 7968887fae Merge branch 'develop' into dataset_merge_develop
6 years ago
dongdaxiang ceac9df87a fix code style for incubator
6 years ago
xjqbest e784884e70 add Example in doc string of split_filelist
6 years ago
xjqbest 1c0ef929f9 fix code style
6 years ago
xujiaqi01 1938132936 fix code style
6 years ago
xjqbest d5ee580c5c move split filelist from trainer.py to fleet & fix error
6 years ago
xjqbest 126d2a2f9d fix init_worker bug
6 years ago
xjqbest 7a759d76cd fix code style
6 years ago
xjqbest 5e5139283b fix runtime error
6 years ago
xjqbest a99c8d0c29 fix client to client communication bug
6 years ago
xjqbest a38b98cb32 fix code style & runtime error
6 years ago
dongdaxiang 17790188d0 make role maker and distributed optimizer private
6 years ago
xjqbest d52586a97d add doc string
6 years ago
dongdaxiang dc8cf36e4b add more example on datagenerator
6 years ago
xjqbest b7940c2918 fix bug of gen_worker_desc and set_filelist, add some doc
6 years ago
xujiaqi01 20b76f3deb init model support multi programs
6 years ago
xujiaqi01 f5c6a14b54 fix runtime error
6 years ago
xujiaqi01 a5b1a0e12b support multi dataset && add init model && fix bug
6 years ago
dongdaxiang 3c65cc1bbd add document for role_maker and fleet parameter, data_generator
6 years ago
dongdaxiang 73b1f396d7 add data_generator into paddle.fluid.incubate.data_generator, add op run log in hogwild_device_worker and downpour_device_worker
6 years ago
dongdaxiang a58df687a8 only allow fleet to be initialized once
6 years ago
dongdaxiang 3e38d1db46 add trainfileswithprofiler for downpour worker
6 years ago
dongdaxiang 6af697adb0 add trainfileswithprofiler for downpour worker
6 years ago
dongdaxiang 2644b88685 add comment for MPI Symetric role maker
6 years ago
dongdaxiang ea5851fa69 add comment for MPI Symetric role maker
6 years ago
dongdaxiang b7a202aa38 add distributed optimizer factory
6 years ago
dongdaxiang fd3adf58a3 add distributed optimizer factory
6 years ago
dongdaxiang f612877797 add incubate for unified API
6 years ago
dongdaxiang 317eb0aad3 add incubate for unified API
6 years ago
dongdaxiang 3641a78b01 add incubate for unified API
6 years ago