Commit Graph

4979 Commits (f302bb4f8bfe9bd5c2b5fbb944e79601ac88bf72)

Author SHA1 Message Date
WeiXin 3491acfb1e
Split unittest. (#30727)
5 years ago
liu zhengxi fef3654b4e
upgrade gather_tree to core.ops (#30697)
5 years ago
jakpiase f8da5536ed
REUPLOAD Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30719)
5 years ago
liym27 13ef444fa6
[Dy2Stat] Fix error message when the message has more than one lines. (#30714)
5 years ago
Tao Luo 824a79d383
Revert "Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30661)" (#30708)
5 years ago
jakpiase d834f4e6e8
Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30661)
5 years ago
Leo Chen 1a13626f5f
polish printing dtype (#30682)
5 years ago
WangXi a28a202603
fix test_gen_nccl_id_op failed (#30686)
5 years ago
chentianyu03 fb7fbc7a5d
fix abs bug and add abs test case (#30637)
5 years ago
ShenLiang 9514b4aa5f
Fix scatter grad bug (#30604)
5 years ago
Qi Li 1f5841c2a0
[ROCM] update cmake and dockerfile, test=develop (#30598)
5 years ago
Zhen Wang 4a9de931a2
Fix the bug in fleet amp_init. (#30606)
5 years ago
TTerror 10271ddfc4
support reduce_max op on kunlun (#30581)
5 years ago
WeiXin ca33821475
Extend the timeout of unit test 'test_static_save_load' (#30599)
5 years ago
chentianyu03 358106fcb0
make abs op support complex types (#30375)
5 years ago
huangxu96 138620084c
Add fleet amp_init() (#30572)
5 years ago
lilong12 8126a41d73
fix the bug of all_reduce pipeline gradient multiple times (#30437)
5 years ago
Aurelius84 621bc4f771
[Dy2static]Fix paddle prefix in is_paddle_api (#30569)
5 years ago
Aurelius84 5067e3a8d2
[Dy2Static]Enhance check of TracedLayers out vars (#30576)
5 years ago
liym27 ff25c5b36f
Fix bug: GetAttrValue should deal with attr with attrType vector<double> (#30536)
5 years ago
WangXi 572c466d19
[Prepare for MultiProcess xpu] unified gen nccl id, refine imperative reducer (#30455)
5 years ago
ykkk2333 549855ac20
add rmsprop_op_xpu test=kunlun (#30493)
5 years ago
Leo Chen 7043b8cfc6
support layer_norm fp16 in dygraph amp (#30430)
5 years ago
Zhang Ting 66c514ce83
[2.0 API] device guard (#30307)
5 years ago
WangXi 7a0a576e51
fix adamw lr_to_coeff is fixed when dygraph (#30526)
5 years ago
WeiXin c0fb03a0dc
Supplement PR29988(https://github.com/PaddlePaddle/Paddle/pull/29988) (#30507)
5 years ago
hutuxian 40ede12631
Ascend Framework Part1: OP & Wrapper (#30281)
5 years ago
gongweibao bdae7ed326
Fix potential port conflicts. (#30508)
5 years ago
QingshuChen 8489d4f76f
optimize batch_norm & pool op for kunlun (#30490)
5 years ago
taixiurong 5e5c2827a3
fix range op crash in dygraph xpu place (#30469)
5 years ago
WeiXin 18ecd433f5
Avoid bug on 'MAC python3.5/6'. (#30485)
5 years ago
JZ-LIANG 16ba0abc79
Recompute Offload: fixed bug in memcpy (#30484)
5 years ago
lijianshe02 d8a9ba56ef
fix random seed in nll_loss unittest test=develop (#30468)
5 years ago
guofei 11e78ebaa3
Modify the calculation logic of LambOptimizer (#29313)
5 years ago
pangyoki 13d757362c
Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103)
5 years ago
WeiXin e5bb4edb2c
perfect 'var_list' of static.load/fluid.load (#30457)
5 years ago
123malin 05f06d9ae1
test=develop, fix fleet.metric (#30438)
5 years ago
taixiurong 6a3c8725b0
support transformer v2.0 (#30381)
5 years ago
123malin 2a98e9323a
test=develop, add distributed_infer (#30300)
5 years ago
Chen Weihang ae1f32091a
fix prune input bug (#30384)
5 years ago
Huihuang Zheng cd5f11b822
Decrease Batch Size for Windows CI, test=develop (#30331)
5 years ago
Huihuang Zheng 017a534888
Decrease Mac Input Size Because of CI Short Memory (#30330)
5 years ago
QingshuChen 2c1bba02e4
optimize memcpy perf for kunlun (#30291)
5 years ago
ShenLiang a60f17b89d
Support unused parameters in dynamic graph distributed (#30224)
5 years ago
JZ-LIANG 75936d838f
Recompute Offload (#30233)
5 years ago
lidanqing a238298659
Skip some conv2d_int8 tests in windows (#30128)
5 years ago
Wojciech Uss fc42faffc2
Wojtuss/upgrade one dnn 2.0 (#30295)
5 years ago
YUNSHEN XIE da3ab010e0
disable test_pipeline (#30204)
5 years ago
chajchaj 113810c557
fix bug of celoss when using ignore_index and reduction (#30180)
5 years ago
Double_V 231501fefc
fix elugradgrad test fail & error message opt (#30171)
5 years ago
Zhen Wang fb49ea388e
Fix the accuracy problem of allclose op when using float64 data type in static mode. (#29890)
5 years ago
furnace 77051cc9f0
add fp16 support for tril_triu op (#30186)
5 years ago
liym27 b4989fb744
Support vector<double> as type of op attribute and op set_value suppport vector<double> as value (#30126)
5 years ago
furnace c6296b2b0e
fix empty op unit test fail sometimes (#30225)
5 years ago
AshburnLee 924aac2216
Add tf32 switch for cuDNN (#29192)
5 years ago
chentianyu03 c7371b7b20
type promotion for grad (#30177)
5 years ago
YUNSHEN XIE 42a6442a08
disable ut test_tsm on windows (#30017)
5 years ago
WeiXin edafb5465a
Fix bug for 'save mutiple method' (#30218)
5 years ago
gongweibao 8700a7bd90
Fix unittests bugs. (#30250)
5 years ago
Bai Yifan dd6f591991
fix test_pool3d_op timeout issue (#30248)
5 years ago
Huihuang Zheng c372a76303
Add Static Variable Clone (#30208)
5 years ago
XiaoguangHu 6bfdef727e
clean redundant API alias in 2.0 - part 2 (#30013)
5 years ago
wangchaochaohu af80859dd6
reduce the occupied size of memory for the fused pattern of elementwise_add Op and activation Op(relu Op for example) (#29885)
5 years ago
pangyoki da16b33f2e
add View(reuse allocation) strategy on squeeze, unsqueeze, reshape, flatten op (#29913)
5 years ago
huangxu96 be5c2e6050
fix windows bug (#29993)
5 years ago
Chen Weihang 3016ba852e
remove distributed prepare context (#30219)
5 years ago
Leo Chen 8696335f86
Fix dtype of ungenerated grad var (#28511)
5 years ago
Aurelius84 03e072736e
Skip convert tensor shape while using Paddle.shape (#30223)
5 years ago
liym27 ad55f609d5
[Dy2Stat] Don't convert to paddle.shape if var_x.shape is not negetive (#29965)
5 years ago
Leo Chen 1f97d61c68
Add callback after TensorCopy (#30123)
5 years ago
liym27 b2483d78a8
Fix test_slice: avoid unnecessary copying of TensorArray from subblock to parent block(#30168)
5 years ago
Chengmo 528e03fc08
【Paddle.Fleet】Fix tensor table (#30075)
5 years ago
Chen Weihang d0fb06b27f
[Complex] Simplify prepared op impl to improve performance (#30153)
5 years ago
Chen Weihang e503470700
try multi times for sys.exit (#30188)
5 years ago
WangXi 619c62bb48
fix adamw apply gradient (#30130)
5 years ago
wangchaochaohu 7dd551e08b
refine the paddle place support using str (#28769)
5 years ago
Chen Weihang 8020e34e7c
Simplify the options of spawn based on fleetrun (#30144)
5 years ago
tangwei12 4763e6bc4e
pre padding in dygraph (#30163)
5 years ago
123malin 198fbdfb60
Add Lookahead and ModelAverage Optimizer (#30004)
5 years ago
ceci3 6a19e41f1f
fix syncbn convert (#30158)
5 years ago
Leo Chen adac38c506
add dispenable input for core.ops.reshape2/expand/slice (#30072)
5 years ago
Zhou Wei 30888ca343
Polish and Optimize the print/repr information of Layer (#29998)
5 years ago
WeiXin f3a2392662
Extend the timeout for the (#30151)
5 years ago
Zhou Wei 9c99d37906
fix unittest failed on windows (#29837)
5 years ago
liym27 9922bd4125
Fix bug: In dynamic mode, if start or end is negetive, __getitem__ return wrong result(#30003)
5 years ago
ceci3 334247791a
add attribute for batch_norm (#29950)
5 years ago
Jiaqi Liu 2e8425b693
Fix beam search bug (#29824)
5 years ago
WeiXin f43e1d8c57
Support storage of large parameters (#29988)
5 years ago
chentianyu03 666e665132
change the kron gradient when complex types (#29995)
5 years ago
WangXi ab04997846
[fleet] combine amp and gradient merge, test=develop (#30086)
5 years ago
WangXi ee16006b5d
Optimization grad merge performance (#29784)
5 years ago
lilong12 9e51e3833f
update, test=develop (#30047)
5 years ago
chentianyu03 e012930aa3
complex gradient matmul (#29966)
5 years ago
lilong12 b0bd93de00
Disable gloo by default (#29805)
5 years ago
ShenLiang b6fd262951
fix gather nd for untest (#30037)
5 years ago
lilong12 2bc5121da8
add the paddle.distributed.split api (#29970)
5 years ago
cc c3c064a8fc
Add mkldnn nearest_interp and bilinear_interp op (#30016)
5 years ago
tangwei12 ed856d254e
fix ut (#29989)
5 years ago
Chen Weihang a1d9a14e89
support grad accumulated across batch (#29942)
5 years ago
liuyuhui bb20dcfc1a
[Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor (#29961)
5 years ago
XiaoguangHu 726c78f293
clean redundant API alias in 2.0 - part 1 (#29928)
5 years ago
liym27 14bd77f941
[Windows CI test] Enable unittest test_optimizer_in_control_flow and remove unnecessay code (#29851)
5 years ago
littletomatodonkey 5c162fe66e
fix reg api ut fail (#29921)
5 years ago
Leo Chen a4b9daf97c
fix optimizer dtype (#29917)
5 years ago
liuyuhui 4427df37cf
[Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574)
5 years ago
LielinJiang 0b74428db8
Fix Conv2DTanspose bug when padding='same' (#29915)
5 years ago
lilong12 01950ceb42
fix the bug in pipeline data parallelism (#29731)
5 years ago
YUNSHEN XIE 2a01756bf3
remove duplicate ut names (#29809)
5 years ago
Chen Weihang a6072055be
[Complex] Handle complex to real after type promotion (#29855)
5 years ago
Chen Weihang 1a304e6c06
[Complex] Add support for complex grad accumulated (#29889)
5 years ago
guofei 80eb77788f
Skip Windows Multi-GPU test of test_fetch_lod_tensor_array (#29508)
5 years ago
Leo Chen 6b258317cb
fix TransferInplaceBack (#29830)
5 years ago
QingshuChen 59b47f3b32
feat: support check_nan_inf for kunlun/xpu device (#29694)
5 years ago
wawltor 7498df2587
add the cumsum unit test for the develop (#29881)
5 years ago
wanghuancoder 26f9ab70f7
if PR have no .py files, do not use 'python coverage run', to speedup unit test (#29739)
5 years ago
Tao Luo 5d130d5670
Revert "fix conv2d int8 windows UT (#29528)" (#29869)
5 years ago
tangwei12 032414ca2a
[Feature] one ps (3/4) (#29604)
5 years ago
jakpiase edc06c6a1b
Added fc + activation fuse pass (currently only gelu, sigmoid and tanh are supported) (#29772)
5 years ago
Chen Weihang 0e0bb1b97d
replace exit method (#29862)
5 years ago
lidanqing 067d7f1d0d
fix conv2d int8 windows UT (#29528)
5 years ago
liym27 97e75ad0f5
[setitem] Support Tensor setitem in static mode (#29708)
5 years ago
YUNSHEN XIE 24ce051a84
remove duplicate ut reload (#29810)
5 years ago
ceci3 c4eb5d0378
fix unittest timeout (#29820)
5 years ago
chentianyu03 ddfc3d2c2f
change grad elementwise_mul for complex types (#29757)
5 years ago
chentianyu03 2a260d9b0e
change the grad of div when complex types (#29804)
5 years ago
Guo Sheng 356efd36fa
Remove test_rnn_decode_api from disable list. (#29814)
5 years ago
TTerror 82aa01c373
add nearest_interp_v2 on kunlun (#29725)
5 years ago
whs 82630408b4
Support double backward rsqrt (#29589)
5 years ago
xiaoting 55725cd2e1
fix for timeout, test=develop (#29788)
5 years ago
LielinJiang a94c3cbbf3
register cudnn conv double grad for depthwise conv (#29807)
5 years ago
liym27 0cc42e34c6
Migrate 4 APIs about array to paddle.tensor.* (#29565)
5 years ago
liym27 41a7b07159
[Dy2Stat] Fix bug for loop: a variable is used and created in loop, but used before created (#29769)
5 years ago
LielinJiang e5af650b71
Add double grad for conv_transpose (#29706)
5 years ago
Wojciech Uss 6ef8129dcc
upgrade oneDNN with GRU INT8 optimizations (#28420)
5 years ago
Huihuang Zheng dfffee8a5d
[Dy2stat] Enable jit.save to Save Without Running (#29579)
5 years ago
liym27 a0b60716f1
[Dy2Stat] Support grammar: for ele in var[idx] (#29541)
5 years ago
chentianyu03 b59b6d7ae6
Complex op test (#29753)
5 years ago
liym27 096c048b45
Fix unitest test_slice (#29740)
5 years ago
Huihuang Zheng 2e788bd81e
Reduce batch size ot fix CPU memory, test=develop (#29736)
5 years ago
chentianyu03 71063b8137
add conj op for complex types (#29527)
5 years ago
Chen Weihang 6cfa59de1b
[Complex] Add real & imag op and api for complex tensor (#29672)
5 years ago
TTerror af8ded773a
update activation op on kunlun (#29577)
5 years ago
ceci3 cc387159f3
add pad and concat double grad (#29549)
5 years ago
liuyuhui f13c3a9cd7
[Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)
5 years ago
YUNSHEN XIE d0b789d27f
disable ut test_cumsum_op (#29613)
5 years ago
Jack Zhou 84bae27779
fix wmt14 doc, remove backward, add bidirect direction in rnn api (#29633)
5 years ago
YUNSHEN XIE 2926e74326
New UT should not exceed 15s (#29492)
5 years ago
Chen Weihang f02aece1f0
Add complex dtype op (add) test example (#29603)
5 years ago
AshburnLee efea540ca9
Add tf32 support for A100 tensor core acceleration for cuBLAS (#28732)
5 years ago
lijianshe02 7779768b53
add transpose double grad test=develop (#29600)
5 years ago