3122 Commits (88e6dc4ac5a5f0a4ed0c54365e4210528da6f3ab)

Author SHA1 Message Date
liuyuhui 254ad61959
fix xpu pe sync, test=notest (#30095)
4 years ago
Thunderbrook 0b8e1fadc5
add topo-aware in heter-ps (#30087)
4 years ago
WangXi ee16006b5d
Optimization grad merge performance (#29784)
4 years ago
Shang Zhizhou 08dc5bc27e
fix op version checker of pass bug (#30028)
4 years ago
cc c3c064a8fc
Add mkldnn nearest_interp and bilinear_interp op (#30016)
4 years ago
wawltor cc2f94620c
add the support the op version check for matmul, test=op_version (#30011)
4 years ago
wawltor b33aaea86c
add the op version check for the elementwise ops, test=op_version (#30010)
4 years ago
Leo Chen 47d10c55d5
Enhance debugging (#30001)
4 years ago
wawltor 8f49f9d5c9
change the elementwise ops version check, test=op_version
4 years ago
Thunderbrook 0ca6de171f
add include (#29952)
4 years ago
cc 6a0102b038
map matmul/squeeze2+matmul/reshape2+matmul to mul (#29911)
4 years ago
Jack Zhou 5a4e42ca9a
add gru op_register_version; test=op_version; (#29931)
4 years ago
Wilber 2b1d796cd0
[Inference] Solve 2.0 trt performance reduce compare 1.8. (#29925)
4 years ago
石晓伟 181ea1870b
flush denormals to zero, test=develop (#29924)
4 years ago
liuyuhui 3d1741b794
[Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor (#29926)
4 years ago
liym27 9602a182b2
[Dynamic Inplace] Support ShareInplaceVersionCounterWith for C++ Tensor (#29842)
4 years ago
liuyuhui 4427df37cf
[Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574)
4 years ago
YUNSHEN XIE 2a01756bf3
remove duplicate ut names (#29809)
4 years ago
Chen Weihang a6072055be
[Complex] Handle complex to real after type promotion (#29855)
4 years ago
Leo Chen 6b258317cb
fix TransferInplaceBack (#29830)
4 years ago
QingshuChen 59b47f3b32
feat: support check_nan_inf for kunlun/xpu device (#29694)
4 years ago
tangwei12 032414ca2a
[Feature] one ps (3/4) (#29604)
4 years ago
jakpiase edc06c6a1b
Added fc + activation fuse pass (currently only gelu, sigmoid and tanh are supported) (#29772)
4 years ago
YUNSHEN XIE 24ce051a84
remove duplicate ut reload (#29810)
4 years ago
Thunderbrook 09b6e71928
heter box (#29734)
4 years ago
Jacek Czaja 7b33720c90
[oneDNN] Tensor copy fix to oneDNN tensors (#29771)
4 years ago
Leo Chen 224f3bcbb1
format code (#29714)
4 years ago
石晓伟 8bd2879ef7
update the operator registration for incompatible upgrade, test=develop (#29720)
4 years ago
WangXi 9cbcc6cadc
fleet sync build strategy, test=develop (#29732)
4 years ago
Chen Weihang 6cfa59de1b
[Complex] Add real & imag op and api for complex tensor (#29672)
4 years ago
liuyuhui f13c3a9cd7
[Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)
4 years ago
lilong12 ff6a145011
update, test=develop (#29559)
4 years ago
Jacek Czaja f6cca62575
[oneDNN] Making ThreadID info in caching key optional (#29272)
4 years ago
JZ-LIANG d33d468f02
[Sharding] add hybrid-dp feature (#29518)
4 years ago
LoveAn b5d4a1f33d
Add the strategy of skipping cc/cu test compilation and execution in CI (#29499)
4 years ago
Aurelius84 2a42250699
Polish hash function of executor cache key (#29556)
4 years ago
jakpiase 57a4f16d9e
added internal and external reorders to profiler (#29443)
4 years ago
LoveAn 03b42d9fa7
fix unittest on windows, test=develop (#29365)
4 years ago
cc a623ce044f
Use different name_scope for different conv type, test=develop (#29355)
4 years ago
liym27 b10ecd9d3a
[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows (#29267)
4 years ago
Chen Weihang 9ad800ebb2
Support type promote for basic math ops (quantum required) (#29265)
4 years ago
Aurelius84 67c700b479
[Dy2Stat] Add cache for Executor and Context in run_program_op (#28421)
4 years ago
Chen Weihang 1de32f823d
Hot fix complle failed in gcc4.8 caused by complex impl (#29254)
4 years ago
GeminiCarrie 642abe2a48
Fix a bug when running on an operating system without "bash." (#29131)
4 years ago
ShenLiang 46b73e6cd9
Change the api of DataParallel and Fleet (#29224)
4 years ago
chentianyu03 8f45d14263
add complex64 and complex128 type; add +-*/@ and slice opreator for c… (#29199)
4 years ago
liym27 865a45984f
Check whether there is any inplace operation affecting gradient calculation. (#27901)
4 years ago
Chen Weihang 0b032faeee
Polish unittests details and execution conditions to adapt to MUSL (#29044)
4 years ago
Wojciech Uss 4fd4095d1b
Add quantization of multi_gru op and tests (#28615)
4 years ago
yaoxuefeng 545df287fc
add user_define_dump (#28596)
4 years ago