2629 Commits (6e6eab07e80d287fb10f6033a01f15650b36fcdb)

Author SHA1 Message Date
Zeng Jinle 3f87464e9c
refine executor_gc_helper codes, test=develop (#19814)
5 years ago
Zeng Jinle 3fd3b663a8
fix gc bug in controlflow ops, test=develop (#19827)
5 years ago
Zeng Jinle db26de8389
[Bug fix] Disable memory reuse on feeded variables (#19835)
5 years ago
Thunderbrook 40c66f8df9
rm return in vfork (#19734)
5 years ago
xujiaqi01 6bf298bf09
support preload thread, optimize hdfs log, fix master+patch bug (#19695)
5 years ago
Jiabin Yang cc311bdf95
Feature/add transform data dygraph (#19707)
5 years ago
Zeng Jinle 754fd57ed7
disable memory optimization passes when FLAGS_use_ngraph=True, test=develop (#19778)
5 years ago
chengduo 8281497030
Fix warning info of build_strategy (#19805)
5 years ago
Yiqun Liu c67c8758cb
Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733)
5 years ago
Chen Weihang 00d5375e0c
Add prune_backward function to cover complicated test_program.clone situation (#19772)
5 years ago
Adam d4413a54bc
Add common CreateKey for mkldnn handlers (#19767)
6 years ago
chengduo 056fdedde3
Open fuse all reduce option (#19765)
6 years ago
Huihuang Zheng 12542320c5
Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989)
6 years ago
Zeng Jinle 0daa5c9772
Make leaky relu inplacable (#19676)
6 years ago
chengduo e506c99c20
Open fuse broadcast option (#18833)
6 years ago
Yiqun Liu a65c728e5d
Implement the GPU kernel of fc operator (#19687)
6 years ago
chengduo 5866a7a5fe
Enable fused_all_reduce_op_handle support GPU and CPU Gradients (#19418)
6 years ago
Tao Luo ec9bc1bd9f
paddle::framework::vectorize() templatization (#19730)
6 years ago
Zeng Jinle bb4f8dee83
add logs to left var memory size, test=develop (#19722)
6 years ago
wangguanzhong 25dcd74d34
merge empty lod tensor, test=develop (#19228)
6 years ago
Zeng Jinle 713c05dd60
refine tensor.mutable_data, test=develop (#19680)
6 years ago
hutuxian 1ca6ea0318
fix cmakelist deps (#19668)
6 years ago
Tao Luo bcddbc78d4
remove -Wmaybe-uninitialized warning (#19653)
6 years ago
wangchaochaohu ed8f44ea21
codegen for fused elementwise operation (#19520)
6 years ago
mapingshuo dca9b6c5b0
add feed_var_names to Prune interface (#19589)
6 years ago
Tao Luo 3ae939e48a
unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631)
6 years ago
tensor-tang e3e98ed678
fix scope lock bug on infer (#19624)
6 years ago
Tao Luo 0a46d34538
refine some PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19607)
6 years ago
baojun a3a4b6e570
Enable ngraph through build_strategy (#19266)
6 years ago
Adam 8d6d95cc2b
paddle::framework::vectorize() templatization (#19611)
6 years ago
Tao Luo 75d1571995
refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603)
6 years ago
Yiqun Liu c5548178b0
A a pass to enable the use of cudnn (#19346)
6 years ago
Adam e94b26daf5
using MKLDNNMemoryFormat = mkldnn::memory::format changes (#19568)
6 years ago
gongweibao abaf87be2b
Change backward_guard to optimize_guard to maximize the allreduce overlap. (#19506)
6 years ago
Zeng Jinle 19474019c2
fix fast pe to run highest priority ops first, test=develop (#19575)
6 years ago
Zeng Jinle 0af8549750
fix seg fault of share lod, test=develop (#19573)
6 years ago
hutuxian c756b5d231
Paddlebox Framework (#18982)
6 years ago
Jacek Czaja ecd9f330c9
[MKL-DNN] Fix to face model on AVX512 platforms (#19282)
6 years ago
yaoxuefeng 10ca3f9609
add thread scope stat accurate metrics test=develop (#19480)
6 years ago
Tao Luo 02270b3eb1
remove unused assert.h (#19529)
6 years ago
chengduo e340df013e
Support feed single persistable variable to PE (#19417)
6 years ago
Yiqun Liu fcec365d29
Add a pass to replace dropout_op with scale_op when is_test is true (#19297)
6 years ago
Thunderbrook 1fe468d319
support debug each output of each ins (#19004)
6 years ago
Zeng Jinle 5c8f210ce3
refine inplace inference registry, test=develop (#19032)
6 years ago
chengduo b6d1d8901f
Increase num_iteration_per_drop_scope (#19075)
6 years ago
tangwei12 65c7368400
Fix the correctness of async mode at distributed training (#18863)
6 years ago
joanna.wozna.intel 2e3ec66be0
Add conv dequant squash for int8 (#18905)
6 years ago
Tao Luo c82280e445
remove unused conv_elementwise_add2_act_fuse.cc (#19344)
6 years ago
Leo Chen a9d5fc5142
Enhance OpTest to check the consistency of operators when using and not using inplace (#19101)
6 years ago
Tao Luo e3c68bde78
stronger the error message of tensor's mutable_data (#19303)
6 years ago
Adam 97d1db1874
Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237)
6 years ago
Zhaolong Xing 76c95af000
Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213)
6 years ago
Zeng Jinle 5b6673c44d
merge develop to solve conflict, also fix API doc, test=develop (#18823)
6 years ago
liuwei1031 50582071dc
fix compilation issue in windows vs2017 (#19183)
6 years ago
juncaipeng 5368b36512
remove the warning for reminding user to avoid using the OriginProgram method, test=develop (#19244)
6 years ago
chengduo 8a89ca94ce
Fix REGISTER_OP_WITHOUT_GRADIENT (#19251)
6 years ago
Zeng Jinle 708bd9798d
move_flags_to_unified_files_for_management, test=develop (#19224)
6 years ago
Adam b837689e97
Add generalized Conv+Activation MKLDNN fuse pass creation (#19072)
6 years ago
Yiqun Liu 77572b70cb
Enhance the error message when GrapOpMaker is null. (#19070)
6 years ago
chengduo c70a97f46e
Use CUDAPinnedPlace in buffered_reader (#19112)
6 years ago
jiaqi b104ea0684
add get_last_save_xbox_base/get_last_save_xbox (#19122)
6 years ago
joanna.wozna.intel 492a00f53e
Add conv reqantize squash (#18754)
6 years ago
joanna.wozna.intel bce72c7fea
Replace Relu with bounded Relu in MobileNetV2 quantization (#18988)
6 years ago
chengduo e044e84264
open fuse_all_optimizer_ops (#19087)
6 years ago
gongweibao 29d8781240
Polish fleet API to support cuda collective mode and nccl2 mode. (#18966)
6 years ago
yaoxuefeng 9150cf50fc
add save cache model api in fleet& add slots shuffle in dataset module & add metric op to calculate ctr related metrics (#18871)
6 years ago
hutuxian 5a80cc8431
Datafeed support reading to cuda place directly. (#19071)
6 years ago
chengduo 17d62ab220
Enhance fuse optimization op pass (#19010)
6 years ago
chengduo 21440b4d69
Add call stack info during compile time (#19067)
6 years ago
jiaqi fc038da749
fix QueueDataset queue size (#19016)
6 years ago
Leo Chen 8f53735437
Fix memory overwriting of tensors returned by executor (#19030)
6 years ago
Zeng Jinle 2175d19993
fix memory_reuse_pass memory_size calculation error, test=develop (#19020)
6 years ago
Zeng Jinle 7ac748adb4
Open gc by default (#18836)
6 years ago
jiaqi 02c370c3dc
support filelist size < trainer num && fix pull dense (#18956)
6 years ago
chengduo e7da0940f9
Disable fuse optimization option (#18924)
6 years ago
石晓伟 ee2f296ef8
Fusion: seqpool_cvm_concat (#18471)
6 years ago
jiaqi 768059b3a0
adjust ins weight according to nid slot (#18784)
6 years ago
Leo Zhao 10eeed93d1
Revert "use static variable to do cache instead of thread local in thread frequent switching case (#18428)" (#18879)
6 years ago
Zeng Jinle 8008ab4e6b
Remove legacy C++ memory optimization codes (#18834)
6 years ago
Thunderbrook 52c1431eee
add clear_model interface in fleetwrapper (#18815)
6 years ago
chengduo 4140fe11a4
Open fuse optimization ops (#18741)
6 years ago
Zeng Jinle a802da650b
Feature/mem opt pass refactor (#18735)
6 years ago
fuyinno4 c167a4b4dd
Fix shrink-dense and add scale-datanorm (#18746)
6 years ago
Zhaolong Xing 26ae6d49e4
Update trt5 for paddle-trt (#18645)
6 years ago
Thunderbrook d8396281ef
add slot to sparse table (#18686)
6 years ago
jiaqi d18aabb472
support patch data, add load_one_table, fix bug (#18509)
6 years ago
chengduo fd3aad6cb3
Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664)
6 years ago
Huihuang Zheng 89bc3fd841
Support memory eager deletion on recurrent OP (#17710)
6 years ago
Zeng Jinle ae58afc546
Feature/auto_growth_allocator (#18561)
6 years ago
guru4elephant d714bf037c
remove async executor and add data_feed.proto to the deps of train demo (#18659)
6 years ago
chengduo a6d468a265
fix PE fetch bug (#18644)
6 years ago
Leo Zhao ff77dea969
not use transferscope cache in cpu case (#18578)
6 years ago
123malin b414645a65
fix #17430: unexpected training behavior with int64-type attr (#18264)
6 years ago
gongweibao c0a82748cf
Polish backwards optimizer dependency codes and use more default values. (#18255)
6 years ago
Zeng Jinle d3003a1620
Feature/buffer_shared_inplace (#17911)
6 years ago
Zeng Jinle be24e5b391
Clean unused code of dim and place (#18565)
6 years ago
Jiabin Yang 667f88f9a6
Fix/gcc 4.8 ubt link error (#18558)
6 years ago
Zhaolong Xing 88b52a27fe
Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532)
6 years ago
Leo Zhao ce38bb5341
use static variable to do cache instead of thread local in thread frequent switching case (#18428)
6 years ago
gongweibao 160ddc980c
Regroup fusion by date type. (#18496)
6 years ago