673 Commits (3d015f1cf529915ab52cb8aef7c475f67fb128b5)

Author SHA1 Message Date
JZ-LIANG 75936d838f
Recompute Offload (#30233)
4 years ago
wangchaochaohu af80859dd6
reduce the occupied size of memory for the fused pattern of elementwise_add Op and activation Op(relu Op for example) (#29885)
4 years ago
liuyuhui 15fac5e7fa
fix assign_op_xpu concat_op_xpu warning (#30120)
4 years ago
liuyuhui 254ad61959
fix xpu pe sync, test=notest (#30095)
4 years ago
WangXi ee16006b5d
Optimization grad merge performance (#29784)
4 years ago
Shang Zhizhou 08dc5bc27e
fix op version checker of pass bug (#30028)
4 years ago
cc c3c064a8fc
Add mkldnn nearest_interp and bilinear_interp op (#30016)
4 years ago
wawltor cc2f94620c
add the support the op version check for matmul, test=op_version (#30011)
4 years ago
wawltor b33aaea86c
add the op version check for the elementwise ops, test=op_version (#30010)
4 years ago
wawltor 8f49f9d5c9
change the elementwise ops version check, test=op_version
4 years ago
cc 6a0102b038
map matmul/squeeze2+matmul/reshape2+matmul to mul (#29911)
4 years ago
Jack Zhou 5a4e42ca9a
add gru op_register_version; test=op_version; (#29931)
4 years ago
Wilber 2b1d796cd0
[Inference] Solve 2.0 trt performance reduce compare 1.8. (#29925)
4 years ago
liuyuhui 3d1741b794
[Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor (#29926)
4 years ago
liuyuhui 4427df37cf
[Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574)
4 years ago
jakpiase edc06c6a1b
Added fc + activation fuse pass (currently only gelu, sigmoid and tanh are supported) (#29772)
4 years ago
YUNSHEN XIE 24ce051a84
remove duplicate ut reload (#29810)
4 years ago
liuyuhui f13c3a9cd7
[Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)
4 years ago
LoveAn b5d4a1f33d
Add the strategy of skipping cc/cu test compilation and execution in CI (#29499)
4 years ago
cc a623ce044f
Use different name_scope for different conv type, test=develop (#29355)
4 years ago
Wojciech Uss 4fd4095d1b
Add quantization of multi_gru op and tests (#28615)
4 years ago
joanna.wozna.intel b0d1ac161e
Add bf16 pool2d and unify bf16 unit tests (#29039)
4 years ago
joanna.wozna.intel fddea67445
Fix cpu_bfloat16_pass (#28730)
4 years ago
Wojciech Uss 7b5a8e46de
Add multi_gru_fuse_pass and tests (#28601)
4 years ago
Wojciech Uss 991345b368
Add multi_gru_seq_fuse_pass and tests (#28604)
4 years ago
joanna.wozna.intel 8c0ea4bffe
Add bf16 matmul, fc, elementwise add and mul (#28729)
4 years ago
Wojciech Uss efc3b182f0
a fix for the fc_lstm_fuse_pass (#28709)
4 years ago
Jacek Czaja 6d8d3d4c22
[oneDNN] Layer norm bf16 kernel (#28619)
4 years ago
joanna.wozna.intel 2cb71c0cde
Add checkpoint to quantize (#28612)
4 years ago
lidanqing 804271cff9
Op version python mkldnn_inplace test (#28354)
4 years ago
Leo Chen 90805e2df7
Register op_version for new attribute use_addto (#28463)
4 years ago
Shang Zhizhou 8699f38d08
Add trt support for pruned transformer models; fix the bug where tensorRT does not support DeletePass (#28517)
4 years ago
lidanqing 0fc181dbd0
[Fix bug] If the pass name is not found, IsCompatible should return false (#28475)
4 years ago
wangchaochaohu d7cfee9b31
Checkout point add (#28488)
4 years ago
Pei Yang 75196cda40
Paddle-TRT int8 support mul op channelwise quant (#28422)
4 years ago
YUNSHEN XIE 369605be1d
fix cmake error when execute build_inference_lib (#28503)
4 years ago
YUNSHEN XIE 1e698c600e
fix cmake error when setting ut timeout property (#28492)
4 years ago
YUNSHEN XIE ba0756325a
exec ut no more than 15s 1 (#28439)
4 years ago
joanna.wozna.intel 7821759d48
Add bfloat16 softmax and gelu (#28394)
4 years ago
Jacek Czaja ca41541472
[oneDNN]Sum bf16 kernel (#28382)
4 years ago
lidanqing 12b9587be5
Add conv_bias pass version python test (#28278)
4 years ago
石晓伟 21a63f6f90
enhance the op_version_registry, test=develop (#28347)
4 years ago
joanna.wozna.intel 571a63e7ec
Add bf16 transpose2, reshape2, concat ops (#28195)
4 years ago
Zhang Ting fdc06f2158
add Fuse bn add act pass (#28196)
4 years ago
Adam Osewski 7db747d9e8
oneDNN BatchNorm + Act fusion pass. (#27912)
4 years ago
Chen Weihang 2babd6ff67
Add compile limit for PADDLE_ENFORCE without error message (#28221)
4 years ago
lidanqing 7cb4a8b8f2
[oneDNN] Conv dilation support (#27914)
4 years ago
guofei 6bbb6e7f45
Implement the function of OutScaleForTraining/OutScaleForInference in dygraph (#26601)
4 years ago
Jacek Czaja 606611d351
[oneDNN] GRU BF16 kernel (#27731)
4 years ago
Wojciech Uss 966447e338
Added support for quantization of fusion_gru (#27518)
4 years ago