Commit Graph

29774 Commits (5cb20f30fcaaf76ff782957f71a4e53198e40eaa)

Author SHA1 Message Date
Leo Chen 5cb20f30fc
add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
5 years ago
gongweibao c687edecd8
Fix reshape on GE graph. (#31084)
5 years ago
xiayanming a6edbc478b
support parsing ascend rank table file (#31000)
5 years ago
Leo Chen 1201cd2ef2
[feature] support npu allocator, part 2 (#30972)
5 years ago
Leo Chen 7e049108c5
[feature] support npu operator (#30951)
5 years ago
Leo Chen 81138239db
[feature] support npu allocator (#30840)
5 years ago
gongweibao ebef6601d5
Destroy session first. (#30954)
5 years ago
Leo Chen 500f28ec37
pass cxx_flags to gloo cmake (#30857)
5 years ago
gongweibao de42d19336
Add paddle ascend distribution training supported (#30796)
5 years ago
OleNet ebb5d181e8
Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug (#30797)
5 years ago
dingsiyu 4a26729540
Merge ascend_optimizer and ascend_parser. (#30776)
5 years ago
gongweibao 636fefd9f8
code style (#30781)
5 years ago
Leo Chen 88dfd067bf
Dev/fix ascend string (#30749)
5 years ago
Leo Chen 6eabbc8076
fix compilation on ascend-20.1 (#30722)
5 years ago
Void Main 904cc44349
[Feature] Build parser to support distributed training (#30658)
5 years ago
gongweibao 5b77b259d8
cleanup (#30646)
5 years ago
gongweibao 7158061a29
Add startup bash files of test_ascend_group. (#30645)
5 years ago
gongweibao e4287ca60b
Add Hccl program group (#30642)
5 years ago
gongweibao f5aca8fbb4
Pass device_ids info from launch to trainer. (#30632)
5 years ago
Void Main d2404da768
Build praser for Hcom* operators (#30627)
5 years ago
gongweibao f9c97dd728
Add distribution supported (#30578)
5 years ago
gongweibao 1882f2ce2d
Fix compilcation on CANN20.1 and older (#30494)
5 years ago
hutuxian 6dd52c5b25
Ascend rc (#30483)
5 years ago
石晓伟 715d862868
export global google flags to users, test=develop (#30448)
5 years ago
Wojciech Uss 88fc7a7d68
fix cache key for inplaced elementwise ops (#30404)
5 years ago
WeiXin e5bb4edb2c
perfect 'var_list' of static.load/fluid.load (#30457)
5 years ago
123malin 05f06d9ae1
test=develop, fix fleet.metric (#30438)
5 years ago
wawltor 3d49882e2c
fix the rnn mask memory bug for out of read (#30459)
5 years ago
tianshuo78520a f090066e85
Clean dockerfiles (#30401)
5 years ago
taixiurong 6a3c8725b0
support transformer v2.0 (#30381)
5 years ago
ShenLiang e85be1b1b2
fix flatten api grad (#30426)
5 years ago
Zhou Wei c94a4b9468
Separate AVX and NO_AVX compilation, enhance installation error message (#30413)
5 years ago
yaoxuefeng 6e0da01c61
Heter ps new (#30198)
5 years ago
Shang Zhizhou 49e79cad39
fix jetson compile error (#30378)
5 years ago
Jiaqi Liu e395bcd1e0
add auc into 'all' list (#30310)
5 years ago
Chengmo 859431aadb
fix ps init(#30397)
5 years ago
123malin 2a98e9323a
test=develop, add distributed_infer (#30300)
5 years ago
Wilber 96784ed6c8
fix compile error on ARM (#30398)
5 years ago
Chen Weihang ae1f32091a
fix prune input bug (#30384)
5 years ago
QingshuChen cf786d22ec
fix bug that cann't find mkldnn(kunlun) (#30394)
5 years ago
WeiXin 5ff4f1ad5e
move 'load_op_library','LayerHelper' to 'paddle/incubate' (#30339)
5 years ago
Huihuang Zheng cd5f11b822
Decrease Batch Size for Windows CI, test=develop (#30331)
5 years ago
cc 8e3a294045
skip quantizing ops in cpu inference (#30342)
5 years ago
Bai Yifan ad6fee2fa8
fix quantize error in speical naming model (#30354)
5 years ago
alncat 7bbf3ac5ab
Added support for inference using quantization aware trained dygraph (#30288)
5 years ago
GaoWei8 180877e988
Softmax backward optimize (#30249)
5 years ago
huangxu96 342d62de60
add amp example document (#30314)
5 years ago
Zhou Wei b1d8ff45d7
running unit test sigle GPU parallely on Linux/windows GPU (#29523)
5 years ago
Zhang Jun 10a8f3e5c3
fix bug on compiling inference shared lib with crypto;test=develop (#30269)
5 years ago
Huihuang Zheng 28e156c27f
Fix Sleep Error in enforce.h (#30335)
5 years ago