2444 Commits (b4b169467ba27d04e6d4bd5c6bc3f7abbba04c65)

Author SHA1 Message Date
gongweibao 0d561ef442 fix 2dconn test=develop (#17681)
6 years ago
mozga-intel 5eb81fe595 Capi for a ngraph engine (#17037)
6 years ago
Jacek Czaja 6d8075ecef [MKL-DNN] conv_transpose mkldnn bias pass (#17644)
6 years ago
Sylwester Fraczek 96845d2168 add Concat quantization (#17448)
6 years ago
gongweibao 65bbf950ee Add multi-ncclcomm and 2D ncclallreduce support. (#17263)
6 years ago
Zeng Jinle 4aa931dd85 Code clean of Allocator (#17602)
6 years ago
Zhaolong Xing 61221ebc28 TRT: Support set dynamic range in int8 mode. (#17524)
6 years ago
Michał Gallus 0c39b97b4e [MKL-DNN] Add Fully Connected Op for inference only(#15226)
6 years ago
wopeizl 6724a652f3 add __str__ method for tensor and lodtensor to support print test=dev… (#17588)
6 years ago
Sylwester Fraczek 5b2a3c4b12 Conv concat relu quantization (#17466)
6 years ago
Sylwester Fraczek bccb0ba49a fix quantize_squash_pass segfault when no tensor linked to Bias (#17292)
6 years ago
guru4elephant 7f8bc49d00 polish_executor_and_add_ctx_cache (#17536)
6 years ago
Zeng Jinle c6189637cd Fix allocator bug (#16712)
6 years ago
Qiao Longfei 58f7695ab2 Async exe support communicator (#17386)
6 years ago
guomingz 2281ebf0f3 Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130)
6 years ago
liuwei1031 c3949f5699 remove two useless flags: enable_subgraph_optimize, memory_optimize_debug, test=develop (#17491)
6 years ago
Tao Luo 32da5e9c3d remove unused expected_kernel_cache_pass (#17486)
6 years ago
chengduo 5a6ab38013 Add record event And remove CSP (#17447)
6 years ago
Qiao Longfei 728bbaa4e3 add cache_update_mutex_ for operator test=develop (#17124)
6 years ago
guru4elephant 43c9561e9a add inductive shape index (#17435)
6 years ago
Zeng Jinle 712bfb17cb fix recurrent_op,test=develop (#17433)
6 years ago
Tao Luo 5babcd02dd Revert "remove unnecessary prepare_data (#17080)" (#17432)
6 years ago
chengduo e336dc86bb [Speed] Refine the Executor when the num_thread=1 (#17405)
6 years ago
Zhen Wang 4a1b7fec96 Add setting Scope function for the graph class (#17417)
6 years ago
jiaqi 66d51206b1 add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118)
6 years ago
Tao Luo 68ec0a6f74 make parallel_executor support FLAGS_use_mkldnn (#17341)
6 years ago
chengduo bc833945a4 Add DropLocalExeScopes in ParallelExecutor (#17297)
6 years ago
qingqing01 e32c9888f5 Double backward of conv2d. (#17211)
6 years ago
Zeng Jinle 5e5e7b3305 fix data_type error message (#17312)
6 years ago
guru4elephant 5d6a1fcf16 fix infer_from_dataset and train_from_dataset (#17243)
6 years ago
chengduo 516317cf91 use sync copy (#17291)
6 years ago
Hongyu Liu c3195de522 Fix concat shape check (#17247)
6 years ago
chengduo 04bd413acb Code Clean: Move all pass to paddle::framework::ir (#17228)
6 years ago
Zeng Jinle 4f8594088d Enhance inplace/mem-opt pass and enhance softmax_with_cross_entropy op inplace (#17225)
6 years ago
songhao c2e20e2a29 fix build warning like 'comparison between signed and unsigned (#17240)
6 years ago
石晓伟 a72dbe9abf Cherry-pick benchmark related changes from release/1.4 (#17156)
6 years ago
Zeng Jinle ee2028a110 Add use_cuda to inplace pass (#17205)
6 years ago
chengduo 950aec55fd It doesn't need sync when fetch_list nit not empty (#17201)
6 years ago
tensor-tang 79ed1c76cd fix bn fuse vardesc and add model saver (#17143)
6 years ago
Zeng Jinle 4e1bc6e805 Rewrite inplace pass and fix gc bug (#17126)
6 years ago
chengduo 794a195881 fix fuse optimizer ops (#17102)
6 years ago
Tao Luo aca60e9a20 remove unnecessary prepare_data (#17080)
6 years ago
Zeng Jinle 842ded14b0 fix reference_count_pass,test=develop (#17060)
6 years ago
Tao Luo d9cd989825 Merge pull request #17048 from luotao1/fix_runtime_cache_bug
6 years ago
chengduo cc31681687 use fast executor as default (#17044)
6 years ago
chengduo a2be4b4d91 Add fuse momenutum ops (#16745)
6 years ago
luotao1 490e746269 fix runtime_context_cache bug when gpu model has an op runs only on cpu
6 years ago
wopeizl 51a0243a56 fix nccl wrapper on windows
6 years ago
Zeng Jinle 1202d3fc74 Refine model gpu memory (#16993)
6 years ago
Yibing Liu 3c375751f8 Support seq len equal to 0 in sequence ops (#16935)
6 years ago