10661 Commits (7779768b534943742fc355a6f07bd8152ca0570b)

Author SHA1 Message Date
lijianshe02 7779768b53
add transpose double grad test=develop (#29600)
5 years ago
wangchaochaohu 1b69e528d3
optimize for long width for elementwise (#29602)
5 years ago
Wilber 78dad78610
fix none-contiguous bug for python api. (#29615)
5 years ago
ShenLiang 1efef8baed
Fix bug of matmul_v2 for broadcast case (#29599)
5 years ago
qingqing01 8d549fc85d
Add clip double grad (#29590)
5 years ago
wangchaochaohu ac4bae8ee9
elementwise_add_grad Op optimization (#29575)
5 years ago
arlesniak 62d4483649
Added verbose oneDNN lib version (#29378)
5 years ago
lilong12 ff6a145011
update, test=develop (#29559)
5 years ago
WangXi 467c716963
gen nccl id use socket (#29431)
5 years ago
tangwei12 0034273b7e
add service (#29560)
5 years ago
Leo Chen c0163837a5
Fix compile problem when cuda_arch < 6000 (#29576)
5 years ago
QingshuChen 79a41a9ed6
support roi_align & affine_channel for kunlun (#29561)
5 years ago
Jacek Czaja f6cca62575
[oneDNN] Making ThreadID info in caching key optional (#29272)
5 years ago
Wilber 740c0d58c3
update for xpu ci. (#29568)
5 years ago
JZ-LIANG d33d468f02
[Sharding] add hybrid-dp feature (#29518)
5 years ago
Leo Chen 1e72e03217
remove duplicated macro (#29563)
5 years ago
Zhang Ting 6702040e94
improve dropout (#29465)
5 years ago
Zhang Ting 30d9589afe
add cast cuda kernel (#29352)
5 years ago
LoveAn b5d4a1f33d
Add the strategy of skipping cc/cu test compilation and execution in CI (#29499)
5 years ago
Aurelius84 2a42250699
Polish hash function of executor cache key (#29556)
5 years ago
taixiurong 760d015c14
add xpu ops for training transformer in kunlun (#29539)
5 years ago
Jacek Czaja 83a693ee55
[oneDNN] Added Unit Test for Multiple instances prediction (#29501)
5 years ago
Zhong Hui 60bfd308ab
fix p_norm with empty shape (#29500)
5 years ago
Leo Chen 9f926eb720
Layernorm opt (#29522)
5 years ago
tangwei12 ae3f7a7100
add ps table (#29463)
5 years ago
ShenLiang d8391a1983
fix error message of gather nd (#29521)
5 years ago
Zhen Wang 5ac71b36fb
Remove tensor copy in the update_loss_scaling op. (#29426)
5 years ago
Zhou Wei e74e1a226c
support deepcopy for Layer/Tensor/Paramerbase (#29387)
5 years ago
joejiong 87e75a77c2
Add tangent operator (#29207)
5 years ago
zlsh80826 95e334810a
Softmax vectorization (#29404)
5 years ago
ShenLiang 2ef9e0e23c
Rebuild group automatically in dynamic graph distributed (#29255)
5 years ago
procr 3a0558339d
support mobilenet for kunlun (#29458)
5 years ago
Huihuang Zheng a1909affc6
Fix Unit Test: Add Sleep Time for CUDA Retry (#29442)
5 years ago
Leo Chen e5e522493d
make gelu fp16 computing more robust (#29484)
5 years ago
Zhang Ting 560b432349
Revert "improve elementwise_add_grad perf (#29277)" (#29464)
5 years ago
jakpiase 57a4f16d9e
added internal and external reorders to profiler (#29443)
5 years ago
Pei Yang 2480bdef6c
change hard_swish from plugin to layer (#29177)
5 years ago
taixiurong ecca6585cd
1. fix elementwise ops'bug 2. fix softmax_with_cross_entropy_op 3. add biliner_interp_op (#29448)
5 years ago
LoveAn 03b42d9fa7
fix unittest on windows, test=develop (#29365)
5 years ago
TTerror a5fcc4b545
update reduce_sum op on xpu (#29367)
5 years ago
Jack Zhou c7cada8571
Fix gru performace decline in 1.8.5 (#29455)
5 years ago
Zhang Ting 6296f4ed09
revert cast eigen kernel (#29427)
5 years ago
Leo Chen a040c055a5
fix layer_norm accuracy (#29434)
5 years ago
Zhou Wei 24ba9ed436
fix that parameters'grad has grad var (#29408)
5 years ago
Leo Chen 4e19ce1df5
refine reshape grad and double grad kernel, use tensor copy async (#29128)
5 years ago
Shang Zhizhou 225a9c4ed8
Fix unittest (#29412)
5 years ago
Pei Yang f860de4af7
support clip op trt converter (#29411)
5 years ago
Jack Zhou 1dd7b97b66
fix rnn_op bug in cudnn_version>= 8 (#29406)
5 years ago
LoveAn 671555ed32
Compiling operator libraries with Unity build (#29130)
5 years ago
cc a623ce044f
Use different name_scope for different conv type, test=develop (#29355)
5 years ago