Commit Graph

56 Commits (8c19d7aa2f89a38b3a68e53c73d88af16a3de8ce)

Author SHA1 Message Date
Thunderbrook 3789a69923
solve bug in heter mode (#31531)
4 years ago
Thunderbrook c4f279fe8d
support multi node in heterps (#31102)
4 years ago
Qi Li a60d93fb77
[ROCM] update fluid framework for rocm (part2), test=develop (#31010)
4 years ago
Thunderbrook 565354f676
support save multi sparse table in one path (#31108)
4 years ago
wanghuancoder 35c5b23f68
use iwyu clean include second time, test=develop (#30829)
4 years ago
Thunderbrook 0b8e1fadc5
add topo-aware in heter-ps (#30087)
4 years ago
Thunderbrook 09b6e71928
heter box (#29734)
4 years ago
Thunderbrook 0073f9bdb0
support ps-gpu (#28752)
4 years ago
wanghuancoder 41aad9bfcd
revert 4 files, from clear include by iwyu, test=develop (#27895)
4 years ago
Thunderbrook 6f69a4cb05
add xpu in heter mode (#27000)
4 years ago
wanghuancoder df43905f12
use iwyu clean include (#27267)
4 years ago
Thunderbrook 5205748481
fix eigen in push sparse; fix hadoop command (#26872)
5 years ago
yaoxuefeng a47d92d868
fleet add save with whitelist test=develop (#23376)
5 years ago
Thunderbrook 0cb60c700d
add heter ps mode (#25682)
5 years ago
Chen Weihang d1062d5278
Replace all errors thrown by LOG(FATAL) with PADDLE_THROW (#24759)
5 years ago
xujiaqi01 1034ca316f
add timeout and http store in communication (#23436)
5 years ago
xujiaqi01 d98084e7ec
add save with prefix (#23449)
5 years ago
xujiaqi01 3a45767d49
add fleet pslib pull and push sparse op and push dense op (#23139)
5 years ago
xujiaqi01 68ea1ad55b
add clear one table (#23089)
5 years ago
yaoxuefeng 2235ee1a5e
multi-loss optimization by adding a DownpourOpt worker (#22025)
5 years ago
xujiaqi01 e3a457d34b
add collective communication library in fleet (#22211)
5 years ago
Thunderbrook c3cf42d0f7
add table id in cache shuffle (#21585)
5 years ago
xujiaqi01 c05706fe73
fix code style of fleet_wrapper (#21639)
5 years ago
Thunderbrook 9a7832f8be
print table stat info for pslib (#21296)
5 years ago
Thunderbrook 0d17c1b816
solve pslib core in stop worker (#21263)
5 years ago
Thunderbrook 349e82d669
support general embedding params (#21217)
5 years ago
xujiaqi01 23876de55b
fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21052)
5 years ago
xujiaqi01 9e045170c0
add copy table (#21086)
5 years ago
xujiaqi01 48669aa8f0
fix several sparse table issues (#20686)
5 years ago
xujiaqi01 cedc04775c
support change shuffle and train thread num (#19841)
5 years ago
xujiaqi01 6bf298bf09
support preload thread, optimize hdfs log, fix master+patch bug (#19695)
5 years ago
jiaqi b104ea0684
add get_last_save_xbox_base/get_last_save_xbox (#19122)
6 years ago
yaoxuefeng 9150cf50fc
add save cache model api in fleet& add slots shuffle in dataset module & add metric op to calculate ctr related metrics (#18871)
6 years ago
Thunderbrook 52c1431eee
add clear_model interface in fleetwrapper (#18815)
6 years ago
fuyinno4 c167a4b4dd
Fix shrink-dense and add scale-datanorm (#18746)
6 years ago
Thunderbrook d8396281ef
add slot to sparse table (#18686)
6 years ago
jiaqi d18aabb472
support patch data, add load_one_table, fix bug (#18509)
6 years ago
jiaqi 66d51206b1
add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118)
6 years ago
xjqbest 5e5139283b
fix runtime error
6 years ago
dongdaxiang 718ea6dbd5
fix fleet code style
6 years ago
xjqbest a99c8d0c29
fix client to client communication bug
6 years ago
dongdaxiang 98dda08a85
fix pull sparse slow problem
6 years ago
dongdaxiang d4514949bf
remove local random engine in fleet with rand_r()
6 years ago
dongdaxiang 365be5d559
support win32 flag in io.cc shell.cc, fix code style problem in fleet_wrapper, fix lodtensor_printer_test problem
6 years ago
xjqbest b7940c2918
fix bug of gen_worker_desc and set_filelist, add some doc
6 years ago
xujiaqi01 f5c6a14b54
fix runtime error
6 years ago
xujiaqi01 a5b1a0e12b
support multi dataset && add init model && fix bug
6 years ago
dongdaxiang b7a202aa38
add distributed optimizer factory
6 years ago
xujiaqi01 d25389fefd
add some log && fix error
6 years ago
dongdaxiang 317eb0aad3
add incubate for unified API
6 years ago