Commit Graph

12433 Commits (ascendrc)

Author SHA1 Message Date
zhang wenhui 581e5460a0
[NPU] add relu op for npu (#31515)
5 years ago
oyjxer cfeeb4bc95
[NPU] Support npu op elementwise_max (#31574)
5 years ago
oyjxer e15ccafb84
[NPU] Support npu op elementwise_mul and elementwise_mul_grad (#31571)
5 years ago
zhang wenhui 29d50d2049
[NPU] Support npu kernel for matmul op (#31544)
5 years ago
xiayanming f400ce9f51
[NPU] Support npu kernel for reduceany op (#31422)
5 years ago
zhang wenhui 7524ac9345
[NPU] support npu kernel for fill_constant op (#31521)
5 years ago
zhang wenhui 9df84bd693
[NPU] add scale op for npu (#31499)
5 years ago
xiayanming e19195f795
Support npu kernel for gather op (#31458)
5 years ago
zhang wenhui b3c88e961c
[NPU] Support npu kernel for shape op (#31427)
5 years ago
Leo Chen ac3d821bc0
[NPU] add npu kernel for equal op (#31393)
5 years ago
Leo Chen 0310945f5c
[NPU] Support npu op layer_norm and layer_norm_grad (#31310)
5 years ago
Leo Chen 5618f14047
fix reading flags from env (#31329)
5 years ago
liym27 a1ddff81e3
[NPU] Support npu op: (1) slice (2) slice_grad (#31275)
5 years ago
liym27 77a0c41cb2
Fix pow npu fp16 test (#31256)
5 years ago
liym27 187248f568
[NPU] Support npu op pow and pow grad (#31247)
5 years ago
xiayanming 821c2f4ef8
add ascend unittest (#31249)
5 years ago
xiayanming 387c1db4f1
Ascendrc (#31065)
5 years ago
Leo Chen ff4654e216
refactor npu device manager (#31154)
5 years ago
liym27 1435b4c096
[NPU] Support executor with NPU (#31057)
5 years ago
gongweibao c687edecd8
Fix reshape on GE graph. (#31084)
5 years ago
xiayanming a6edbc478b
support parsing ascend rank table file (#31000)
5 years ago
gongweibao ebef6601d5
Destroy session first. (#30954)
5 years ago
gongweibao de42d19336
Add paddle ascend distribution training supported (#30796)
5 years ago
OleNet ebb5d181e8
Ascendrc add converted op: [range/equal/range/uniform_random/expand/squeeze], fix cast op bug (#30797)
5 years ago
dingsiyu 4a26729540
Merge ascend_optimizer and ascend_parser. (#30776)
5 years ago
gongweibao 636fefd9f8
code style (#30781)
5 years ago
Void Main 904cc44349
[Feature] Build parser to support distributed training (#30658)
5 years ago
gongweibao 5b77b259d8
cleanup (#30646)
5 years ago
gongweibao 7158061a29
Add startup bash files of test_ascend_group. (#30645)
5 years ago
gongweibao e4287ca60b
Add Hccl program group (#30642)
5 years ago
gongweibao f5aca8fbb4
Pass device_ids info from launch to trainer. (#30632)
5 years ago
Void Main d2404da768
Build parser for Hcom* operators (#30627)
5 years ago
gongweibao f9c97dd728
Add distribution supported (#30578)
5 years ago
hutuxian 6dd52c5b25
Ascend rc (#30483)
5 years ago
WeiXin e5bb4edb2c
perfect 'var_list' of static.load/fluid.load (#30457)
5 years ago
123malin 05f06d9ae1
test=develop, fix fleet.metric (#30438)
5 years ago
taixiurong 6a3c8725b0
support transformer v2.0 (#30381)
5 years ago
Zhou Wei c94a4b9468
Separate AVX and NO_AVX compilation, enhance installation error message (#30413)
5 years ago
Jiaqi Liu e395bcd1e0
add auc into 'all' list (#30310)
5 years ago
Chengmo 859431aadb
fix ps init (#30397)
5 years ago
123malin 2a98e9323a
test=develop, add distributed_infer (#30300)
5 years ago
Wilber 96784ed6c8
fix compile error on ARM (#30398)
5 years ago
Chen Weihang ae1f32091a
fix prune input bug (#30384)
5 years ago
WeiXin 5ff4f1ad5e
move 'load_op_library', 'LayerHelper' to 'paddle/incubate' (#30339)
5 years ago
Huihuang Zheng cd5f11b822
Decrease Batch Size for Windows CI, test=develop (#30331)
5 years ago
cc 8e3a294045
skip quantizing ops in cpu inference (#30342)
5 years ago
Bai Yifan ad6fee2fa8
fix quantize error in special naming model (#30354)
5 years ago
huangxu96 342d62de60
add amp example document (#30314)
5 years ago
Huihuang Zheng 017a534888
Decrease Mac Input Size Because of CI Short Memory (#30330)
5 years ago
Leo Chen 3d015f1cf5
Set expected place in child thread for dataloader to avoid costing cuda memory on other card (#30338)
5 years ago
QingshuChen 2c1bba02e4
optimize memcpy perf for kunlun (#30291)
5 years ago
cnn 10ae31579b
update error information (#30277)
5 years ago
huangxu96 ee623bff64
Implemented AddQuantDequantPass in imperative quantization. (#26692)
5 years ago
ShenLiang a60f17b89d
Support unused parameters in dynamic graph distributed (#30224)
5 years ago
JZ-LIANG 75936d838f
Recompute Offload (#30233)
5 years ago
houj04 dc12b5eedf
resolve #30141 (#30145)
5 years ago
lidanqing a238298659
Skip some conv2d_int8 tests in windows (#30128)
5 years ago
Wojciech Uss fc42faffc2
Wojtuss/upgrade one dnn 2.0 (#30295)
5 years ago
tangwei12 5e839e4da5
add sparse embedding & load vars for 2.0 & gloo bug fix (#30306)
5 years ago
YUNSHEN XIE da3ab010e0
disable test_pipeline (#30204)
5 years ago
tangwei12 25f80fd304
Fix/distributed proto (#29981)
5 years ago
Chengmo d479ae1725
[Paddle.Fleet] Support local save sparse param (#30175)
5 years ago
chajchaj 113810c557
fix bug of celoss when using ignore_index and reduction (#30180)
5 years ago
Double_V 231501fefc
fix elugradgrad test fail & error message opt (#30171)
5 years ago
Zhen Wang fb49ea388e
Fix the accuracy problem of allclose op when using float64 data type in static mode. (#29890)
5 years ago
furnace 77051cc9f0
add fp16 support for tril_triu op (#30186)
5 years ago
LielinJiang 86d81af5ef
reduce unittest time of test_datasets (#30275)
5 years ago
liym27 b4989fb744
Support vector<double> as type of op attribute and op set_value support vector<double> as value (#30126)
5 years ago
furnace c6296b2b0e
fix empty op unit test fail sometimes (#30225)
5 years ago
AshburnLee 924aac2216
Add tf32 switch for cuDNN (#29192)
5 years ago
chentianyu03 c7371b7b20
type promotion for grad (#30177)
5 years ago
YUNSHEN XIE 42a6442a08
disable ut test_tsm on windows (#30017)
5 years ago
Jiaqi Liu b7335b4db7
Alias from paddle.fluid.layers.auc to paddle.static.auc (#30206)
5 years ago
WeiXin edafb5465a
Fix bug for 'save multiple method' (#30218)
5 years ago
gongweibao 8700a7bd90
Fix unittests bugs. (#30250)
5 years ago
Bai Yifan dd6f591991
fix test_pool3d_op timeout issue (#30248)
5 years ago
Huihuang Zheng c372a76303
Add Static Variable Clone (#30208)
5 years ago
XiaoguangHu 6bfdef727e
clean redundant API alias in 2.0 - part 2 (#30013)
5 years ago
LielinJiang e6a1e8757d
Delete incorrect warning message (#30196)
5 years ago
wangchaochaohu af80859dd6
reduce the occupied size of memory for the fused pattern of elementwise_add Op and activation Op (relu Op for example) (#29885)
5 years ago
pangyoki da16b33f2e
add View (reuse allocation) strategy on squeeze, unsqueeze, reshape, flatten op (#29913)
5 years ago
huangxu96 be5c2e6050
fix windows bug (#29993)
5 years ago
Chen Weihang 3016ba852e
remove distributed prepare context (#30219)
5 years ago
Zhen Wang 7f7dfccf20
Support pure fp16 training for AMP API. (#29544)
5 years ago
Leo Chen 8696335f86
Fix dtype of ungenerated grad var (#28511)
5 years ago
Aurelius84 03e072736e
Skip convert tensor shape while using Paddle.shape (#30223)
5 years ago
liym27 49411a20da
In creation.assign, reuse implementation code of layers.tensor.assign to avoid maintaining two copies of code (#30227)
5 years ago
littletomatodonkey e03171b7c7
fix pad (#30222)
5 years ago
liym27 31ed9a5ed3
[Dy2Stat] Use Paddle2.0 api paddle.tensor.array_* (#30156)
5 years ago
liym27 ad55f609d5
[Dy2Stat] Don't convert to paddle.shape if var_x.shape is not negative (#29965)
5 years ago
Leo Chen 1f97d61c68
Add callback after TensorCopy (#30123)
5 years ago
liym27 b2483d78a8
Fix test_slice: avoid unnecessary copying of TensorArray from subblock to parent block (#30168)
5 years ago
Chengmo 528e03fc08
[Paddle.Fleet] Fix tensor table (#30075)
5 years ago
guofei 1bdf924217
Quantization supports 2.0 APIs (#30036)
5 years ago
Chen Weihang d0fb06b27f
[Complex] Simplify prepared op impl to improve performance (#30153)
5 years ago
Chen Weihang e503470700
try multi times for sys.exit (#30188)
5 years ago
WangXi 619c62bb48
fix adamw apply gradient (#30130)
5 years ago
LutaoChu 1ff69f58b6
fix paddle.pow doc, test=document_fix (#30159)
5 years ago
wangchaochaohu 7dd551e08b
refine the paddle place support using str (#28769)
5 years ago
Chen Weihang 8020e34e7c
Simplify the options of spawn based on fleetrun (#30144)
5 years ago