Commit Graph

29479 Commits (1cbb282d7774539a809d32f45bb9b443f56485a7)
 

Author SHA1 Message Date
Huihuang Zheng 1cbb282d77
Add Retry Logic to CublasHandlerHolder
4 years ago
yukavio 96934b7430
fix flops (#29758)
4 years ago
liym27 41a7b07159
[Dy2Stat] Fix bug for loop: a variable is used and created in loop, but used before created (#29769)
4 years ago
LielinJiang e5af650b71
Add double grad for conv_transpose (#29706)
4 years ago
Leo Chen 224f3bcbb1
format code (#29714)
4 years ago
huangxu96 97e29411eb
fix a bug in multi_precision_fp16 unittest. (#29756)
4 years ago
LoveAn 2e5b4a216c
Optimize compilation time with Unity Build (#29733)
4 years ago
Zhang Jun 0c23ba95d8
enable MakeCiper api for inference;test=develop (#29692)
4 years ago
wangchaochaohu 7b2dc4e6b1
optimization for fp16 elementwise add (#29744)
4 years ago
chalsliu 27bdbec7fc
Refine precision test print message
4 years ago
chalsliu e63a68feac
Retry when download failed for precision test
4 years ago
Jacek Czaja 07790ba13e
[oneDNN] Reimplemented elementwise_add grad (#29747)
4 years ago
Wojciech Uss 6ef8129dcc
upgrade oneDNN with GRU INT8 optimizations (#28420)
4 years ago
Huihuang Zheng dfffee8a5d
[Dy2stat] Enable jit.save to Save Without Running (#29579)
4 years ago
Aurelius84 17c8e3adfe
Polish code in gpu_launch_config.h (#29730)
4 years ago
wangchaochaohu 068d905e1e
fix the shape choose of vectorize for cuda
4 years ago
liym27 a0b60716f1
[Dy2Stat] Support grammar: for ele in var[idx] (#29541)
4 years ago
chentianyu03 b59b6d7ae6
Complex op test (#29753)
4 years ago
liym27 096c048b45
Fix unitest test_slice (#29740)
4 years ago
syyxsxx 7c2affaa26
fix isfinite_v2_op OpProtoAndCheckerMaker AddComment bug (#29626)
4 years ago
Huihuang Zheng 2e788bd81e
Reduce batch size to fix CPU memory, test=develop (#29736)
4 years ago
石晓伟 8bd2879ef7
update the operator registration for incompatible upgrade, test=develop (#29720)
4 years ago
LielinJiang 10edfb6f21
Update en docs of to_tensor (#29718)
4 years ago
chentianyu03 71063b8137
add conj op for complex types (#29527)
4 years ago
Wilber b593d588aa
[Inference] EnableUseGpu has higher priority than flags (#29697)
4 years ago
WangXi 9cbcc6cadc
fleet sync build strategy, test=develop (#29732)
4 years ago
tianshuo78520a 638ccaabf4
fix ubuntu docker error (#29719)
4 years ago
wanghuancoder 0c59ad2a1a
Windows generate pdb and dump, for debug (#29628)
4 years ago
Huihuang Zheng 4c4d4ba5e0
Modify CublasHandleHolder to Fix Random Unittest Failure. test=develop (#29617)
4 years ago
Chen Weihang 6cfa59de1b
[Complex] Add real & imag op and api for complex tensor (#29672)
4 years ago
Jacek Czaja 9eff1a674f
Added missing format of oneDNN (#29670)
4 years ago
LiuChiachi 572810eecb
Update EarlyStopping sample code (#29723)
4 years ago
wangchaochaohu 2e0d1ed00f
delete the code for fp16 optimization because it is not faster than common template code (#29715)
4 years ago
LoveAn bb5a7854f3
Add approval monitor for unity_build_rule.cmake (#29701)
4 years ago
Qi Li 7684b91817
[GO] add two cgo api, test=develop (#29659)
4 years ago
TTerror af8ded773a
update activation op on kunlun (#29577)
4 years ago
ceci3 cc387159f3
add pad and concat double grad (#29549)
4 years ago
liuyuhui f13c3a9cd7
[Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)
4 years ago
Y_Xuan 76738504ad
Add ROCm platform support code (#29342)
4 years ago
huangxu96 b96dada4f0
add static.amp into setup.py.in (#29621)
4 years ago
Zhang Ting 1e9127f688
improve dropout grad (#29605)
4 years ago
wangchaochaohu eab44e1f32
refine (#29622)
4 years ago
YUNSHEN XIE d0b789d27f
disable ut test_cumsum_op (#29613)
4 years ago
Jack Zhou 84bae27779
fix wmt14 doc, remove backward, add bidirect direction in rnn api (#29633)
4 years ago
WangXi 613c46bc07
fix gen_nccl_id_op_helper compile failed, test=develop (#29614)
4 years ago
chen zhiyu f5f8809c1a
1. add python version selection 2.add dynamic flags setting. (#29612)
4 years ago
YUNSHEN XIE 2926e74326
New UT should not exceed 15s (#29492)
4 years ago
Chen Weihang f02aece1f0
Add complex dtype op (add) test example (#29603)
4 years ago
AshburnLee efea540ca9
Add tf32 support for A100 tensor core acceleration for cuBLAS (#28732)
4 years ago
lijianshe02 7779768b53
add transpose double grad test=develop (#29600)
4 years ago