30211 Commits (587d99ae443c684faa25d1fd261eb81d37cb32e4)
 

Author SHA1 Message Date
ronnywang 270699e647
[ROCM] fix test_matmul_v2_op (#31802)
4 years ago
Zhou Wei 1eb927f935
Restore the third-party library cache for windows (#31811)
4 years ago
Chen Weihang 3f66e7deab
add cmath header for bfloat (#31792)
4 years ago
Feiyu Chan 4046f1303a
add coalesce_tensor into white list when checking re-creation of parameters (#31800)
4 years ago
Zhou Wei a70de87d76
Update windows compiler and CI from VS2015 to VS2017 (#31652)
4 years ago
Wilber f4d9212de2
trt plugin upgrade to pluginv2ext (#31670)
4 years ago
niuliling123 372ac08a17
add relu forward kernel and backward kernel (#31613)
4 years ago
Wojciech Uss 814b38e30f
update scale collection and propagation algorithm (#31783)
4 years ago
tianshuo78520a 513641e153
Delete fast_check_nan_inf (#31788)
4 years ago
Shang Zhizhou 9d04ef7369
fix tensorrt output variable reshape (#31733)
4 years ago
Qi Li 46dd1d4aad
[ROCM] fix reduce_sum nan in ROCM platform, test=develop (#31780)
4 years ago
gongweibao f72d197ec5
fix launch ps ut test=develop (#31771)
4 years ago
Tao Luo 032de0bfd0
update approval (#31782)
4 years ago
zlsh80826 bfced39eb6
[Paddle-TRT] nearest_interp op (#31626)
4 years ago
arlesniak 7ccf6b6030
[oneDNN] Initial bf16 amp integration (#31093)
4 years ago
lilong12 a501a7b0ca
[3D-parallel] add 1f1b scheduler for pipeline (#31566)
4 years ago
guofei ed7956a816
Fix skip_quant in QAT (#31704)
4 years ago
ronnywang 8c19d7aa2f
[ROCM] fix test_conv2d_transpose_op (#31749)
4 years ago
Ouyang Chao a45c8ca69d
fix bug of DepthwiseConvTransposeGradKernel (#31762)
4 years ago
Jacek Czaja 25fc2a1fdb
[oneDNN] Added Elementwise Mul grad fp32/bf16 (#31647)
4 years ago
Chen Weihang 878e117b6d
[CustomOp] Support float16 in custom op (#31725)
4 years ago
ronnywang c9e1d9dc31
[ROCM] fix test_rnn_op (#31735)
4 years ago
zlsh80826 1c67cf0c98
run radix sort of proposals layer on context stream (#31631)
4 years ago
Chen Weihang e429deb0c4
[CustomOp] Support attribute in infershape function (#31713)
4 years ago
Adam Osewski a4a2b77def
[oneDNN] lookup_table op with support for BF16 data type. (#31558)
4 years ago
zlsh80826 c86e771e94
NMS Performance Optimization (#31634)
4 years ago
zlsh80826 50cafa0b0c
remove redundant sync, set collect/dist kernel to context stream, sub_lod memcpy opt (#31641)
4 years ago
cc 1d197f6c97
[dgraph qat] Refine calculating output scale of dygraph qat (#31710)
4 years ago
ronnywang 420527f0d9
[ROCM] fix layer_norm, norm, p_norm, test_sequence_softmax_op, test_math_op_patch_var_base (#31709)
4 years ago
Chen Weihang 87852616aa
[CustomOp] Support complex dtype in custom op (#31657)
4 years ago
zlsh80826 fe241fd02f
[Paddle-TRT] gather converter (#31640)
4 years ago
zlsh80826 4ea3427865
[Paddle-TRT] support batch axis concatenation when using dynamic shape (#31627)
4 years ago
Zhou Wei d4282ea97e
fix multi cuda environment bug (#31694)
4 years ago
Chengmo 09482ddec4
【Paddle.Fleet】Fix one ps gradient clip (#31664)
4 years ago
Kaipeng Deng 740359edaf
remove useless import (#31700)
4 years ago
Zhang Ting 7f50bb7ec1
support NHWC for temporal_shift op (#31642)
4 years ago
liym27 402288ad65
In __getitem__, convert integers to int64 Tensor not int32 to be compatible with Lite (#31658)
4 years ago
Chen Weihang 2fbe9b097a
[CustomOp] Remove Eigen dependencies of float16 (#31669)
4 years ago
cc 19592d2b71
Refine dygraph qat, test=develop (#31680)
4 years ago
Zhou Wei 4c0c55bba1
support Geforce RTX 30+ GPU (#31529)
4 years ago
YUNSHEN XIE cdc5a55ac1
turn off added ut check on windows (#31660)
4 years ago
Qi Li d9b50f664f
[ROCM] update ci scripts and dockerfile, test=develop (#31551)
4 years ago
YUNSHEN XIE 1a6e3b04cd
Second optimization of retry method (#31646)
4 years ago
wuhuanzhou 41e9ecfd1f
Optimize compilation with Ninja (#31449)
4 years ago
yiak c1b1ccfbf5
Update tinyformat.h (#31612)
4 years ago
gongweibao 9c624b16d5
Extend unittest time of (#31570)
4 years ago
YUNSHEN XIE 580442ceba
fix wget with no proxy on windows (#31505)
4 years ago
ronnywang da10c5cf8b
[ROCM] fix softmax_with_cross_entropy_op, test=develop (#31629)
4 years ago
LielinJiang 75433126df
Fix summary bug when calculating output shape (#31549)
4 years ago
ShenLiang c3634c6b0a
fix amp bug of fleet (#31532)
4 years ago