Commit Graph

2424 Commits (712bfb17cb6d10658d804748c84b67badf47f0d7)

Author SHA1 Message Date
Zeng Jinle 712bfb17cb
fix recurrent_op,test=develop (#17433)
6 years ago
Tao Luo 5babcd02dd
Revert "remove unnecessary prepare_data (#17080)" (#17432)
6 years ago
chengduo e336dc86bb
[Speed] Refine the Executor when the num_thread=1 (#17405)
6 years ago
Zhen Wang 4a1b7fec96
Add setting Scope function for the graph class (#17417)
6 years ago
jiaqi 66d51206b1
add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118)
6 years ago
Tao Luo 68ec0a6f74
make parallel_executor support FLAGS_use_mkldnn (#17341)
6 years ago
chengduo bc833945a4
Add DropLocalExeScopes in ParallelExecutor (#17297)
6 years ago
qingqing01 e32c9888f5
Double backward of conv2d. (#17211)
6 years ago
Zeng Jinle 5e5e7b3305
fix data_type error message (#17312)
6 years ago
guru4elephant 5d6a1fcf16
fix infer_from_dataset and train_from_dataset (#17243)
6 years ago
chengduo 516317cf91
use sync copy (#17291)
6 years ago
Hongyu Liu c3195de522
Fix concat shape check (#17247)
6 years ago
chengduo 04bd413acb
Code Clean: Move all pass to paddle::framework::ir (#17228)
6 years ago
Zeng Jinle 4f8594088d
Enhance inplace/mem-opt pass and enhance softmax_with_cross_entropy op inplace (#17225)
6 years ago
songhao c2e20e2a29
fix build warning like 'comparison between signed and unsigned (#17240)
6 years ago
石晓伟 a72dbe9abf
Cherry-pick benchmark related changes from release/1.4 (#17156)
6 years ago
Zeng Jinle ee2028a110
Add use_cuda to inplace pass (#17205)
6 years ago
chengduo 950aec55fd
It doesn't need sync when fetch_list nit not empty (#17201)
6 years ago
tensor-tang 79ed1c76cd
fix bn fuse vardesc and add model saver (#17143)
6 years ago
Zeng Jinle 4e1bc6e805
Rewrite inplace pass and fix gc bug (#17126)
6 years ago
chengduo 794a195881
fix fuse optimizer ops (#17102)
6 years ago
Tao Luo aca60e9a20
remove unnecessary prepare_data (#17080)
6 years ago
Zeng Jinle 842ded14b0
fix reference_count_pass,test=develop (#17060)
6 years ago
Tao Luo d9cd989825
Merge pull request #17048 from luotao1/fix_runtime_cache_bug
6 years ago
chengduo cc31681687
use fast executor as default (#17044)
6 years ago
chengduo a2be4b4d91
Add fuse momenutum ops (#16745)
6 years ago
luotao1 490e746269
fix runtime_context_cache bug when gpu model has an op runs only on cpu
6 years ago
wopeizl 51a0243a56
fix nccl wrapper on windows
6 years ago
Zeng Jinle 1202d3fc74
Refine model gpu memory (#16993)
6 years ago
Yibing Liu 3c375751f8
Support seq len equal to 0 in sequence ops (#16935)
6 years ago
jiaqi 8bcba3db84
Merge pull request #16896 from xjqbest/develop
6 years ago
guru4elephant bbc6c5714f
Merge pull request #16887 from guru4elephant/add_nccl_context_pybind
6 years ago
gongweibao cbdb8a17b1
Polish DGC code (#16818)
6 years ago
dongdaxiang 2ab2869c2d
fix GPU compile error problem
6 years ago
dongdaxiang 466d177d09
add pybind dependency
6 years ago
xjqbest 10991e00a9
fix bug of num > INT_MAX
6 years ago
xjqbest 241120d94d
fix bug of num > INT_MAX
6 years ago
xjqbest dac70ad4c5
fix bug of num > INT_MAX
6 years ago
xjqbest 74471397cf
fix bug of num > INT_MAX
6 years ago
dongdaxiang b091139049
add nccl wrapper for python API
6 years ago
dongdaxiang fff795e5c8
add nccl_wrapper
6 years ago
乔龙飞 Qiao Longfei 82cff5ec42
Merge pull request #16762 from jacquesqiao/add-async_sparse_param_update_recorder
6 years ago
Yibing Liu 4267a81afc
Correct the lod level of compiled time in lod_reset (#16790)
6 years ago
chengduo e9409665f7
Refine Fuse Optimize Ops (#16810)
6 years ago
chengduo d105c06b50
Replace ThreadedExecutor with FastThreadedExecutor (#16650)
6 years ago
Qiao Longfei 1526a3e4da
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async_sparse_param_update_recorder
6 years ago
Yihua Xu 93cedfdb9c
Fix the order while sorting the operators (#16756)
6 years ago
Qiao Longfei afc56949c1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async_sparse_param_update_recorder
6 years ago
liuwei1031 85363848a1
Security issue (#16774)
6 years ago
guru4elephant aa46caf3d9
Merge pull request #16765 from guru4elephant/gpu_dataset_train
6 years ago