Commit Graph

250 Commits (4aa9099067ba62110d4db0dcdf63ad9c57c3a59b)

Author SHA1 Message Date
Chen Weihang aa0f254fbe
Add macro BOOST_GET to enrich the error information of boost :: get (#24175)
5 years ago
qingqing01 6162cf2f2e
Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO. (#23426)
5 years ago
tangwei12 ad9c8f6d2d
fix communicator when break under pyreder mode (#22911)
5 years ago
tangwei12 07e13b84cd
remove vlog, test=develop (#22898)
5 years ago
tianshuo78520a 433cef03e5
fix typo word (#22784)
5 years ago
tangwei12 66a3150135
SYNC with communicaotor (#22344)
5 years ago
Wilber de009152a7
Compile without nccl deps. [2/2] (#22484)
5 years ago
Chengmo 8f36c39537
Fix GEO-SGD init & send Bug (#22375)
5 years ago
tangwei12 82bc814a57
integrated HALF_ASYNC to communicator (#21869)
5 years ago
123malin 985bceac53
Bug fix for sparse recorder (#21969)
5 years ago
123malin 7fb817d447
add distributed_strategy (#21710)
5 years ago
zhouwei25 a01663ca1f
remove patch command and file of cares to Improved quality of Paddle Repo (#21776)
5 years ago
Chengmo a86f11b5f5
Speed GEO dense calc & communication (#21579)
5 years ago
tangwei12 9ad940fdfe
memory leak for cpu (#21174)
5 years ago
Tao Luo 70eb397677
remove unused snappy/snappystream depends in distributed codes (#21484)
5 years ago
Tao Luo 01fa4ead61
fix -Wno-error=sign-compare warning in gcc8 (#21434)
5 years ago
Tao Luo c0656dcb1a
remove -Wno-error=sign-compare, make warning as error (#21358)
5 years ago
Chengmo bc8e600ce5
Fix rpc not wait in GEO communicator (#20967)
5 years ago
123malin 20cdff0e02
Optimize decay (#20816)
5 years ago
Chengmo 16596f6498
Fix Paddle Cloud role maker (#20860)
5 years ago
123malin 95e90aa102
test=develop, add communicator_is_sgd_optimizer flag (#20677)
5 years ago
gongweibao c1710e91b2
Disable GRPC_ARG_ALLOW_REUSEPORT to avoid potencial problem. (#20690)
5 years ago
tangwei12 04384502a8
fix bug with heart beat , test=develop (#20654)
5 years ago
gongweibao f3f52fc1e2
Retry when failed to bind address. (#20642)
5 years ago
Chengmo 940c6ff1c8
Fix communicator slow bug & fix communicator stop bug (#20366)
5 years ago
123malin b4a3b75002
bug fix: invalid learning rate decay in pserver async mode (#20325)
5 years ago
Chengmo eb05db7104
Speed GEO-SGD (#20158)
5 years ago
tangwei12 c9139c3db3
trainer from dataset fetch targets (#19760)
5 years ago
tangwei12 b5a410466c
Trainer heartbeat for async mode (#19600)
5 years ago
Chengmo 728ec1b43d
Add GEO-SGD distribute training algorithm (#20018)
5 years ago
tangwei12 8f0b3c0516
the integrated communicator (#19849)
5 years ago
123malin 1bc285a53a
add retry function to try to solve grpc error code 14 (#19661)
6 years ago
Tao Luo bcddbc78d4
remove -Wmaybe-uninitialized warning (#19653)
6 years ago
123malin 2f037c3189
fix the diff between async mode and async_half mode (#19535)
6 years ago
tangwei12 f45cb1c2ca
fix bug of communicator flag, test=develop (#19635)
6 years ago
tangwei12 65c7368400
Fix the correctness of async mode at distributed training (#18863)
6 years ago
gongweibao fd4b15a2f6
Unset unittests http_proxy env to avoid timeout. (#19269)
6 years ago
Zeng Jinle 708bd9798d
move_flags_to_unified_files_for_management, test=develop (#19224)
6 years ago
gongweibao 29d8781240
Polish fleet API to support cuda collective mode and nccl2 mode. (#18966)
6 years ago
tangwei12 999d9a59a5
fix communicator with pyreader (#18350)
6 years ago
Qiao Longfei 0e08e91c18
optimize communicator merge sparse gradient test=develop (#18159)
6 years ago
tangwei12 101f74cb19
fix save/load in fleet (#17675)
6 years ago
Zeng Jinle 3ece61f71e
Remove attribute in Allocator::Allocate (#17878)
6 years ago
Qiao Longfei 58f7695ab2
Async exe support communicator (#17386)
6 years ago
Tao Luo 3d19f44a89
remove unused SERIAL compiler option (#17500)
6 years ago
Qiao Longfei 287de41c04
Optimize communicator flags (#17494)
6 years ago
Qiao Longfei d831f1b0ba
fix brpc code
6 years ago
Qiao Longfei 8b8a0487c7
fix compile test=develop
6 years ago
Qiao Longfei a541c25ab6
fix cpplint test=develop
6 years ago
Qiao Longfei 0608f8ca56
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async_sparse_param_update_recorder
6 years ago