Commit Graph

29769 Commits (81138239db4dbb37cf659ec5688d24ce33f7ab57)

Author SHA1 Message Date
Leo Chen 81138239db
[feature] support npu allocator (#30840)
4 years ago
gongweibao ebef6601d5
Destroy session first. (#30954)
4 years ago
Leo Chen 500f28ec37
pass cxx_flags to gloo cmake (#30857)
4 years ago
gongweibao de42d19336
Add paddle ascend distribution training supported (#30796)
4 years ago
OleNet ebb5d181e8
Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug (#30797)
4 years ago
dingsiyu 4a26729540
Merge ascend_optimizer and ascend_parser. (#30776)
4 years ago
gongweibao 636fefd9f8
code style (#30781)
4 years ago
Leo Chen 88dfd067bf
Dev/fix ascend string (#30749)
4 years ago
Leo Chen 6eabbc8076
fix compilation on ascend-20.1 (#30722)
4 years ago
Void Main 904cc44349
[Feature] Build parser to support distributed training (#30658)
4 years ago
gongweibao 5b77b259d8
cleanup (#30646)
4 years ago
gongweibao 7158061a29
Add startup bash files of test_ascend_group. (#30645)
4 years ago
gongweibao e4287ca60b
Add Hccl program group (#30642)
4 years ago
gongweibao f5aca8fbb4
Pass device_ids info from launch to trainer. (#30632)
4 years ago
Void Main d2404da768
Build parser for Hcom* operators (#30627)
4 years ago
gongweibao f9c97dd728
Add distribution supported (#30578)
4 years ago
gongweibao 1882f2ce2d
Fix compilation on CANN20.1 and older (#30494)
4 years ago
hutuxian 6dd52c5b25
Ascend rc (#30483)
4 years ago
石晓伟 715d862868
export global google flags to users, test=develop (#30448)
4 years ago
Wojciech Uss 88fc7a7d68
fix cache key for inplaced elementwise ops (#30404)
4 years ago
WeiXin e5bb4edb2c
perfect 'var_list' of static.load/fluid.load (#30457)
4 years ago
123malin 05f06d9ae1
test=develop, fix fleet.metric (#30438)
4 years ago
wawltor 3d49882e2c
fix the rnn mask memory bug for out of read (#30459)
4 years ago
tianshuo78520a f090066e85
Clean dockerfiles (#30401)
4 years ago
taixiurong 6a3c8725b0
support transformer v2.0 (#30381)
4 years ago
ShenLiang e85be1b1b2
fix flatten api grad (#30426)
4 years ago
Zhou Wei c94a4b9468
Separate AVX and NO_AVX compilation, enhance installation error message (#30413)
4 years ago
yaoxuefeng 6e0da01c61
Heter ps new (#30198)
4 years ago
Shang Zhizhou 49e79cad39
fix jetson compile error (#30378)
4 years ago
Jiaqi Liu e395bcd1e0
add auc into 'all' list (#30310)
4 years ago
Chengmo 859431aadb
fix ps init (#30397)
4 years ago
123malin 2a98e9323a
test=develop, add distributed_infer (#30300)
4 years ago
Wilber 96784ed6c8
fix compile error on ARM (#30398)
4 years ago
Chen Weihang ae1f32091a
fix prune input bug (#30384)
4 years ago
QingshuChen cf786d22ec
fix bug that can't find mkldnn(kunlun) (#30394)
4 years ago
WeiXin 5ff4f1ad5e
move 'load_op_library','LayerHelper' to 'paddle/incubate' (#30339)
4 years ago
Huihuang Zheng cd5f11b822
Decrease Batch Size for Windows CI, test=develop (#30331)
4 years ago
cc 8e3a294045
skip quantizing ops in cpu inference (#30342)
4 years ago
Bai Yifan ad6fee2fa8
fix quantize error in special naming model (#30354)
4 years ago
alncat 7bbf3ac5ab
Added support for inference using quantization aware trained dygraph (#30288)
4 years ago
GaoWei8 180877e988
Softmax backward optimize (#30249)
4 years ago
huangxu96 342d62de60
add amp example document (#30314)
4 years ago
Zhou Wei b1d8ff45d7
running unit test single GPU in parallel on Linux/windows GPU (#29523)
4 years ago
Zhang Jun 10a8f3e5c3
fix bug on compiling inference shared lib with crypto;test=develop (#30269)
4 years ago
Huihuang Zheng 28e156c27f
Fix Sleep Error in enforce.h (#30335)
4 years ago
Huihuang Zheng 017a534888
Decrease Mac Input Size Because of CI Short Memory (#30330)
4 years ago
Leo Chen 3d015f1cf5
Set expected place in child thread for dataloader to avoid costing cuda memory on other card (#30338)
4 years ago
QingshuChen 2c1bba02e4
optimize memcpy perf for kunlun (#30291)
4 years ago
cnn 10ae31579b
update error information (#30277)
4 years ago
huangxu96 ee623bff64
Implemented AddQuantDequantPass in imperative quantization. (#26692)
4 years ago