344 Commits (9ee288ac87c5bb204830f1ebee51be3cfefc4da0)

Author SHA1 Message Date
tangwei12 a010d883b4
doc fix, test=develop, test=document_fix (#20239)
5 years ago
Chengmo 494d6cf252
Fix transpiler en doc (#20149)
5 years ago
Chengmo eb05db7104
Speed GEO-SGD (#20158)
5 years ago
tangwei12 b5a410466c
Trainer heartbeat for async mode (#19600)
5 years ago
Chengmo 728ec1b43d
Add GEO-SGD distribute training algorithm (#20018)
5 years ago
Zeng Jinle 5f2290ab84
Add deprecated memory optimize doc (#20111)
5 years ago
123malin 6c74e7387f
fix APIs, test=document_preview (#19954)
5 years ago
tangwei12 6a1db2044c
fix sync_with_distributed_lookup_table, test=develop (#19737)
5 years ago
123malin a25a716e87
Optimize fleet API: add input check for some interfaces (#18971)
6 years ago
Yi Liu 4ef6b8457a
adapte fleet api for localsgd and support nccl comm configuration in executor (#19443)
6 years ago
tangwei12 65c7368400
Fix the correctness of async mode at distributed training (#18863)
6 years ago
tangwei12 19dac67e9f
fix distribute transpiler GRPC error code 4, RPC Deadline (#18984)
6 years ago
Tao Luo 2f8c7e021f
remove unused inference_transpiler unit-tests (#19130)
6 years ago
gongweibao 29d8781240
Polish fleet API to support cuda collective mode and nccl2 mode. (#18966)
6 years ago
Zeng Jinle c194b0c835
Try to deprecate unstable python memory optimize (#18983)
6 years ago
Zeng Jinle 8008ab4e6b
Remove legacy C++ memory optimization codes (#18834)
6 years ago
Yi Liu 157211c4e1
supports distributed classification (#18690)
6 years ago
tangwei12 d845848341
do some odd jobs (#18641)
6 years ago
gongweibao c0a82748cf
Polish backwards optimizer dependency codes and use more default values. (#18255)
6 years ago
Yi Liu a873fa84ce
supports collective training with programs (#18392)
6 years ago
HaoRen b7128bac5f
supports collective communicated training (#18175)
6 years ago
chengduo e06c69c788
Fix default value of fluid.memory_optimize (#18295)
6 years ago
tangwei12 659b72a97c
fix document of python api get_startup_program() (#17764)
6 years ago
yaoxuefeng ac92e4c066
fix distributed_transpiler.py api test=develop (#17668)
6 years ago
gongweibao 0d561ef442
fix 2dconn test=develop (#17681)
6 years ago
tangwei12 0d3c48e0a8
fix doc in transpiler, test=develop (#17313)
6 years ago
gongweibao 65bbf950ee
Add multi-ncclcomm and 2D ncclallreduce support. (#17263)
6 years ago
Michał Gallus 0c39b97b4e
[MKL-DNN] Add Fully Connected Op for inference only(#15226)
6 years ago
Qiao Longfei 92e7d5d7cc
fix distribute doc test=develop (#17318)
6 years ago
Qiao Longfei 58f7695ab2
Async exe support communicator (#17386)
6 years ago
liuwei1031 f82e4d75e7
improve the doc of paddle.fluid.memory_optimize, test=develop (#17473)
6 years ago
liuwei1031 6a53fa95e7
improve the API Sample of DataFeeder, memory_optimize and release_memory (#17374)
6 years ago
tangwei12 7330cd639c
truncated_gaussian_random supported in distributed training, test=develop (#17091)
6 years ago
tangwei12 1a4a51db2b
Fleet unify distributed training (#16791)
6 years ago
Qiao Longfei 0608f8ca56
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async_sparse_param_update_recorder
6 years ago
Qiao Longfei d640c6cfa9
fix pylint
6 years ago
Qiao Longfei 542b52fac3
fix trainer_id
6 years ago
Qiao Longfei de65398cb8
update transpiler and listen and serv op
6 years ago
Qiao Longfei fab1b54d99
Merge branch 'add-communicator' of ssh://github.com/jacquesqiao/Paddle into add-async-ssa-graph-executor-communicator
6 years ago
Xin Pan 0c277ac6e9
polish
6 years ago
Xin Pan 840cf780e4
add deprecation warning.
6 years ago
Qiao Longfei b8491bfd4e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
6 years ago
tangwei12 971f3bc9b0
fix params with only 1 dim (#15828)
6 years ago
dzhwinter 84f067be94
update. test=develop
6 years ago
dzhwinter d453b0dcf7
add details. test=develop
6 years ago
Qiao Longfei 8bda4ab213
parameter recv can run
6 years ago
Qiao Longfei fbd186bd5d
complete recv op
6 years ago
Qiao Longfei 4356f186b4
complete parameter_send
6 years ago
dzhwinter 9c9ad7d40b
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
dzhwinter 0a63234c85
follow comments. test=develop
6 years ago