qingqing01
24509f4af9
Fix the grammar in copyright. ( #8403 )
7 years ago
gongweibao
8c9119afcd
add logs and fix a bug ( #5074 )
add logs and fix a python path bug
8 years ago
Helin Wang
60238a1bfb
Go master, pserver, trainer: switch to log15, away from logrus
8 years ago
Helin Wang
05176bd1bb
master server will wait etcd forever
8 years ago
Helin Wang
5270585e10
fix according to comment
8 years ago
Helin Wang
da7a1f2f6c
master client: retry connecting to etcd
8 years ago
武毅
c10121e13c
[Done] Sync master client between passes and fix recordio split ( #2948 )
* fix recordio split and task passes
* update for pre commit
* update
* update, still need to sync client wait for pass end.
* able to sync passes for task dispatching
* update to comment
* update
* fix yapf check
* why local pre-commit fails? version is the same
* fix race condition
* update
* fix race condition
* this still have duplicate problem in unit test
* update
* update
* update by comment
* update
8 years ago
Helin Wang
3ff0a9fbb1
Implement distributed training save model, improve master.NewClient interface
8 years ago
dongzhihong
e1e7309789
boring copyright
8 years ago
Helin Wang
2b1cac4113
Handle all unchecked errors
Unchecked errors could be handled by: cd go; gometalinter --vendor --disable-all --enable errcheck $(glide nv)
8 years ago
武毅
23b8346072
Fault tolerant distributed training, just work version, with etcd ( #2849 )
* using etcd as fault tolerant training
* update
* workable version, ft not tested
* small fix
* update
* remove TODO
8 years ago
gongweibao
d05d19ba03
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into taskfail
8 years ago
gongweibao
b64c7a635d
fix by helin's comments
8 years ago
gongweibao
a40a7a5cb1
fix by helin's comments
8 years ago
gongweibao
a94d217487
add TaskID
8 years ago
gongweibao
e25c155f39
add taskfail interface
8 years ago
gongweibao
af5ac2c474
merge with upstream develop
8 years ago
gongweibao
b3c5808e13
rm cloud EOF
8 years ago
gongweibao
0fa409246b
fix bugs
8 years ago
Yancey
9af8d86b7c
Trainer library discover master by etcd ( #2551 )
* add trainer library
* modify file name
* move trainer to master client
* update
* update
* modify monitor master to receive a chan
* update
* use etcd client from etcd_client.go
* update
* update
* remove etcd client without lock
* update
* update the comment
* update comments
8 years ago
gongweibao
4874810ba5
fix bugs
8 years ago
gongweibao
fc3d031425
first add
8 years ago
Helin Wang
6cd1441df6
add bufSize parameter for creating master client
8 years ago
Helin Wang
094106adfa
use logrus for logging
8 years ago
Helin Wang
4970484d37
improve comment, fix build error
8 years ago
Helin Wang
7b9080ef56
Implement master client, cgo and Python part
8 years ago
Helin Wang
fa5c3f1f73
implement master client, Go part
8 years ago
Helin Wang
0bebaa05be
fix according to comments
8 years ago
Helin Wang
54e8263cae
implement master server client, remove unnecessary dummy variable
8 years ago
Helin Wang
72a73ab6d2
implement master server client, RPC part.
8 years ago