Commit Graph

27 commits (head: 24a063f6ac0ba1122b5b6bec524c6ec659197e5f)

Author | SHA1 | Message | Date
xujiaqi01 | 93ea9dd27a | fix stat var in hogwild worker (#23367) | 5 years ago
yaoxuefeng | 2235ee1a5e | multi-loss optimization by adding a DownpourOpt worker (#22025) | 5 years ago
tangwei12 | 82bc814a57 | integrated HALF_ASYNC to communicator (#21869) | 5 years ago
Thunderbrook | 349e82d669 | support general embedding params (#21217) | 5 years ago
xujiaqi01 | 9e045170c0 | add copy table (#21086) | 5 years ago
Thunderbrook | 59bcdc8a19 | support dump param of model into afs (#20302) | 5 years ago
xujiaqi01 | 5223b0dd9d | add check nan / inf in downpour worker (#20694) | 5 years ago
Thunderbrook | f76a32df4a | dump fix dov vec file num (#20539) | 5 years ago
yaoxuefeng | 10ca3f9609 | add thread scope stat accurate metrics test=develop (#19480) | 6 years ago
Thunderbrook | 1fe468d319 | support debug each output of each ins (#19004) | 6 years ago
jiaqi | 768059b3a0 | adjust ins weight according to nid slot (#18784) | 6 years ago
fuyinno4 | c167a4b4dd | Fix shrink-dense and add scale-datanorm (#18746) | 6 years ago
Thunderbrook | d8396281ef | add slot to sparse table (#18686) | 6 years ago
hutuxian | 969e6378b9 | Pipeline Concurrency (#17402) | 6 years ago
jiaqi | 66d51206b1 | add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118) | 6 years ago
dongdaxiang | 3c2d236815 | remove all warnings | 6 years ago
dongdaxiang | 88880d9b69 | fix import trainer_desc_pb2 error | 6 years ago
dongdaxiang | 5687f234bf | fix trainer_desc.proto error | 6 years ago
dongdaxiang | b95b80bc76 | add doc string for executor and update API.spec | 6 years ago
dongdaxiang | 6bf796df14 | refine print fetch list | 6 years ago
dongdaxiang | 68d7bf3de5 | add fetch var function | 6 years ago
dongdaxiang | 2644b88685 | add comment for MPI Symetric role maker | 6 years ago
heqiaozhi | 9bca1926c1 | refactor & fix bug | 6 years ago
dongdaxiang | cf1360643f | add printer for fetch variable | 6 years ago
dongdaxiang | c165012031 | refine device_worker and trainer code | 6 years ago
dongdaxiang | 8a335b50be | add downpour device_worker pb configuration | 6 years ago
dongdaxiang | 855bf579d2 | add dist_multi_trainer for distributed training, add trainer_factory and device_worker_factory so that we can easily extend new training mode, add pull dense worker which is a singleton for parameter fetching | 6 years ago