Commit Graph

6033 Commits (develop)

Author SHA1 / Message / Date
wuhuanzhou 587d99ae44
update compilation with C++14 (#31815)
5 years ago
Thunderbrook 393b3bd6b7
fix split core (#31892)
5 years ago
taixiurong 52b05baca3
fix some bug in transformer training in xpu (#31918)
5 years ago
furnace ef8323d49e
[ROCM] Add ROCm support for warpctc op (#31817)
5 years ago
Jiawei Wang 95f808c878
fix stack op grad nullptr (#31962)
5 years ago
jakpiase 6dca7a1de7
Added int8 kernel for oneDNN LSTM op (#31894)
5 years ago
niuliling123 a71d72d921
relu forward and backward with vectortype (#31869)
5 years ago
tianshuo78520a 8829a309fe
Delete cudnn6 code (#31835)
5 years ago
liym27 525c32e33c
Fix bug of set_value op:Decerease axes to do right broadcast (#31875)
5 years ago
cc b47478efc2
[dygraph qat] Use layer to calculate output scale (#31861)
5 years ago
tianshuo78520a e804f08559
delete include framework.pb.h (#31859)
5 years ago
Chen Weihang 27f2d8df8e
Polish two error messages (#31852)
5 years ago
niuliling123 6472d62093
Revert "add relu forward kernel and backward kernel (#31613)" (#31853)
5 years ago
winter-wang e7f28d6c0d
fix runtime crash when rnn model inference, test=develop (#31833)
5 years ago
Wojciech Uss e5f7a834d4
fix cache key in concat oneDNN kernel (#31820)
5 years ago
ronnywang 270699e647
[ROCM] fix test_matmul_v2_op (#31802)
5 years ago
niuliling123 372ac08a17
add relu forward kernel and backward kernel (#31613)
5 years ago
Qi Li 46dd1d4aad
[ROCM] fix reduce_sum nan in ROCM platform, test=develop (#31780)
5 years ago
arlesniak 7ccf6b6030
[oneDNN] Initial bf16 amp integration (#31093)
5 years ago
ronnywang 8c19d7aa2f
[ROCM] fix test_conv2d_transpose_op (#31749)
5 years ago
Ouyang Chao a45c8ca69d
fix bug of DepthwiseConvTransposeGradKernel (#31762)
5 years ago
Jacek Czaja 25fc2a1fdb
[oneDNN] Added Elementwise Mul grad fp32/bf16 (#31647)
5 years ago
ronnywang c9e1d9dc31
[ROCM] fix test_rnn_op (#31735)
5 years ago
zlsh80826 1c67cf0c98
run radix sort of proposals layer on context stream (#31631)
5 years ago
Adam Osewski a4a2b77def
[oneDNN] lookup_table op with support for BF16 data type. (#31558)
5 years ago
zlsh80826 c86e771e94
NMS Performance Optimization (#31634)
5 years ago
zlsh80826 50cafa0b0c
remove redundant sync, set collect/dist kernel to context stream, sub_lod memcpy opt (#31641)
5 years ago
ronnywang 420527f0d9
[ROCM] fix layer_norm, norm, p_norm, test_sequence_softmax_op, test_math_op_patch_var_base (#31709)
5 years ago
Zhang Ting 7f50bb7ec1
support NHWC for temporal_shift op (#31642)
5 years ago
ronnywang da10c5cf8b
[ROCM] fix softmax_with_cross_entropy_op, test=develop (#31629)
5 years ago
WangXi 9066b74f58
c_gen_nccl_id add SocketServer to persit server (#31589)
5 years ago
Kaipeng Deng a32e8bf1e7
DataLoader supprot dict str (#31481)
5 years ago
Pei Yang cac9635a67
[Paddle-TRT] Fix engine key in trt int8 calibration (#31513)
5 years ago
Qi Li 3d5aa9d10a
[ROCM] fix conv2d and conv3d op, test=develop (#31553)
5 years ago
jiangcheng 9ed6c895f1
optimize range op by place parameters on cpu rather than gpu, test=develop (#30811)
5 years ago
chajchaj 6148b87f9d
add softmax_switch for softmax_with_cross_entropy_op, test=develop (#31428)
5 years ago
WangXi 83a2fb1f08
Add collective async wait op (#31463)
5 years ago
lilong12 0205e9f84e
remove the send/recv of tensor size (#31460)
5 years ago
furnace 910f377fa5
Bugfix rocm (#31490)
5 years ago
Qi Li 416e47edef
[ROCM] fix softmax with loss nan in HIP platform, test=develop (#31491)
5 years ago
JamesLim 45c7d90564
Optimization of elementwise CUDA kernel (#30801)
5 years ago
Jacek Czaja 23d96cf221
[oneDNN] bumpup onednn 2.2 fixup version (#31473)
5 years ago
wangguanzhong 43d6abf0a5
update conv2d, test=develop (#31480)
5 years ago
wangguanzhong 50af0c2cbb
fix roi_align, test=develop (#31479)
5 years ago
ronnywang e03e46730c
[ROCM] fix gather_op, sigmoid_cross_entropy_with_logits_op, test=develop (#31467)
5 years ago
Qi Li b85c8e03be
[ROCM] fix reduce op, test=develop (#31478)
5 years ago
Jacek Czaja 39a5424ed1
[oneDNN] elementwise add bf16 grad kernel with broadcasting (#31385)
5 years ago
Qi Li 133a914bd0
[ROCM] fix test_dist_op ci test, test=develop (#31468)
5 years ago
Qi Li f9377965c4
[ROCM] fix dropout and remove hipcub, test=develop (#31455)
5 years ago
JamesLim 8491ae9a02
Creating a CUDA function to find the minimum value in warp or block (#31191)
5 years ago