Commit Graph

111 Commits (0536b5263d188c54069765c4168bdba91ad250c7)

Author SHA1 Message Date
Jacek Czaja 2bb1b0e89e
[DNNL] Added MKL-DNN inplace pass for C-API inference (#23315)
5 years ago
Tao Luo c00d427d52
simplify the cmake log of ir/CMakeLists.txt (#23262)
5 years ago
Wilber 95b356a069
update embedding_eltwise_layernorm fuse and kernel. test=develop (#23114)
5 years ago
Wilber ff3ddbb502
add skip_layernorm pass. test=develop (#22895)
5 years ago
Zhaolong Xing 8d6dc102fe
[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494)
5 years ago
Wilber 9a8203aa25
fix fc_lstm_fuse when multi sub-graph use same fc_bias. test=develop (#22551)
5 years ago
Yiqun Liu dcfb603897
Enable the detection of subgraph composed of grad ops (#21223)
5 years ago
石晓伟 e1b0d7cbb1
remove anakin from code, test=develop (#22420)
5 years ago
Zhen Wang 46189b166d Add bn and relu fuse pass (#22048)
5 years ago
Yiqun Liu b1401fb74d Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#22094)
5 years ago
Yiqun Liu c918788ba9 Disable fusion_group pass for windows and mac. We will do some experiments on Linux first. (#21310)
5 years ago
Yiqun Liu b5f3be8330
Implement a pass detect fusion group of elementwise op (#19884)
5 years ago
wangchaochaohu ba45dce35d
fix codetest for windows make test=develop (#20796)
5 years ago
zhaoyuchen2018 b8333edef6
Add Multihead matmul fuse pass (#20167)
5 years ago
Adam 7faa3e9555 Add ConvTranspose + BatchNorm fuse pass (#20161)
5 years ago
wangchaochaohu c9ea317b36
codegen code for reconstruction (#19728)
5 years ago
Yiqun Liu 3cd985a669
Add a pass to fuse fc+elementwise_add+layernorm (#19776)
6 years ago
Yiqun Liu c67c8758cb
Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733)
6 years ago
wangchaochaohu ed8f44ea21
codegen for fused elementwise operation (#19520)
6 years ago
Yiqun Liu c5548178b0
A a pass to enable the use of cudnn (#19346)
6 years ago
Yiqun Liu fcec365d29
Add a pass to replace dropout_op with scale_op when is_test is true (#19297)
6 years ago
Zhaolong Xing 76c95af000
Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213)
6 years ago
Adam b837689e97 Add generalized Conv+Activation MKLDNN fuse pass creation (#19072)
6 years ago
石晓伟 ee2f296ef8
Fusion: seqpool_cvm_concat (#18471)
6 years ago
chengduo fd3aad6cb3
Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664)
6 years ago
石晓伟 bce259e5bf
Update the Anakin interfaces for content-dnn and MLU (#17890)
6 years ago
mozga-intel 5eb81fe595 Capi for a ngraph engine (#17037)
6 years ago
Zhaolong Xing 61221ebc28
TRT: Support set dynamic range in int8 mode. (#17524)
6 years ago
Michał Gallus 0c39b97b4e [MKL-DNN] Add Fully Connected Op for inference only(#15226)
6 years ago
Sylwester Fraczek 5b2a3c4b12 Conv concat relu quantization (#17466)
6 years ago
guomingz 2281ebf0f3 Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130)
6 years ago
Tao Luo 32da5e9c3d
remove unused expected_kernel_cache_pass (#17486)
6 years ago
chengduo 04bd413acb
Code Clean: Move all pass to paddle::framework::ir (#17228)
6 years ago
石晓伟 a72dbe9abf
Cherry-pick benchmark related changes from release/1.4 (#17156)
6 years ago
luotao1 226596a296 Merge branch 'develop' into core_opt_choose_kernel
6 years ago
nhzlx d065b5bf2b Anakin ssd support
6 years ago
nhzlx 953bdde058 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
6 years ago
Wojciech Uss 46677fb080 Move cpu_quantize_* passes into mkldnn subfolder
6 years ago
luotao1 056599a738 add expected_kernel_cache_pass
6 years ago
nhzlx c407dfa3cb cherry-pick from feature/anakin-engine: refine paddle-anakin to new interface. #16276
6 years ago
nhzlx a25331bc26 cherry-pick from feature/anakin-engine: deal the changing shape when using anakin #16189
6 years ago
nhzlx a1d200a5de cherry-pick from feature/anakin-engine: Anakin support facebox #16111
6 years ago
luotao1 82af8031d9 add runtime_context_cache_pass
6 years ago
Tao Luo 7d2740db83
Revert "cache runtime_context"
6 years ago
Wojciech Uss af03008890 Add cpu_quantize_placement_pass for C-API quantization (#16265)
6 years ago
luotao1 a275fd6e0c Merge branch 'develop' into runtime_context
6 years ago
Wojciech Uss 2579ade45f Add cpu_quantize_pass for C-API quantization (#16127)
6 years ago
qingqing01 86e912c544 Fix windows compiling (#16230)
6 years ago
luotao1 1b59bed989 Merge branch 'develop' into runtime_context
6 years ago
luotao1 6ce25c99a0 Merge branch 'develop' into runtime_context
6 years ago