Commit Graph

793 Commits (742cbe666024415f856ec77721b19a8a58c8e44f)

Author SHA1 Message Date
Jiabin Yang 454254115e
Feature/auto prune in dygraph (#19757)
5 years ago
Pei Yang 9cbc1eff2d
zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)
5 years ago
xujiaqi01 6bf298bf09
support preload thread, optimize hdfs log, fix master+patch bug (#19695)
5 years ago
Chen Weihang 00d5375e0c
Add prune_backward function to cover complicated test_program.clone situation (#19772)
5 years ago
chengduo 056fdedde3
Open fuse all reduce option (#19765)
6 years ago
Tao Luo f05d2c519d paddle::framework::vectorize() templatization [PART3] (#19643)
6 years ago
Jiabin Yang e9233d1c1e Refactor dygraph (#19107)
6 years ago
mapingshuo dca9b6c5b0 add feed_var_names to Prune interface (#19589)
6 years ago
hutuxian c756b5d231
Paddlebox Framework (#18982)
6 years ago
Thunderbrook 1fe468d319
support debug each output of each ins (#19004)
6 years ago
Leo Chen 6fb310ae29 Fix bug of getting bool Flags from os.environ (#19349)
6 years ago
liu zhengxi 32598ffd8f
Python infer api update and add unit test (#19353)
6 years ago
Leo Chen a9d5fc5142 Enhance OpTest to check the consistency of operators when using and not using inplace (#19101)
6 years ago
Zeng Jinle 5b6673c44d
merge develop to solve conflict, also fix API doc, test=develop (#18823)
6 years ago
Tao Luo 5f5648a8ff
Revert "Python inference API support numpy (#19009)" (#19160)
6 years ago
flame b7e1a1d7e7 Python inference API support numpy (#19009)
6 years ago
yaoxuefeng 9150cf50fc
add save cache model api in fleet& add slots shuffle in dataset module & add metric op to calculate ctr related metrics (#18871)
6 years ago
Zeng Jinle 88f111f885
remove unused inplace act codes, test=develop (#19079)
6 years ago
jiaqi a99bc64c63
add fleet util, add some interface in hdfs util (#18752)
6 years ago
Leo Chen 8f53735437 Fix memory overwriting of tensors returned by executor (#19030)
6 years ago
liuwei1031 a43a763b54
fix warpctc.dll not found issue (#18761)
6 years ago
flame 65d987527d
python inference enable_memory_optim(#18817)
6 years ago
Zhaolong Xing 61238d31f7
Trt fp16 support (#18860)
6 years ago
chengduo 20859c08e8
[DyGraph] Make multi-card program faster (#18892)
6 years ago
Zeng Jinle 8008ab4e6b
Remove legacy C++ memory optimization codes (#18834)
6 years ago
Thunderbrook 52c1431eee
add clear_model interface in fleetwrapper (#18815)
6 years ago
chengduo 292dfbce63
fix build strategy doc (#18725)
6 years ago
jiaqi d18aabb472
support patch data, add load_one_table, fix bug (#18509)
6 years ago
chengduo fd3aad6cb3
Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664)
6 years ago
Zeng Jinle ae58afc546
Feature/auto_growth_allocator (#18561)
6 years ago
guru4elephant d714bf037c
remove async executor and add data_feed.proto to the deps of train demo (#18659)
6 years ago
123malin b414645a65
fix #17430: unexpected training behavior with int64-type attr (#18264)
6 years ago
gongweibao c0a82748cf
Polish backwards optimizer dependency codes and use more default values. (#18255)
6 years ago
Zeng Jinle d3003a1620
Feature/buffer_shared_inplace (#17911)
6 years ago
Zhaolong Xing 88b52a27fe
Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532)
6 years ago
Yi Liu a873fa84ce
supports collective training with programs (#18392)
6 years ago
xsrobin 47e2ef38e9
add "import paddle.fluid as fluid" to examples lack of it
6 years ago
lujun fd6631ef2f
Fix dygraph show style (#18297)
6 years ago
tangwei12 999d9a59a5
fix communicator with pyreader (#18350)
6 years ago
HaoRen b7128bac5f supports collective communicated training (#18175)
6 years ago
Zeng Jinle 5826b72e06
Refine CUDAPlace error message. (#18343)
6 years ago
jiaqi 3f8031e256
dataset (#17973)
6 years ago
chengduo 25f3cd6486
Update execution_strategy option default value (#18183)
6 years ago
Zeng Jinle 25ab23be28
Fix dygraph mem leak (#18082)
6 years ago
Sylwester Fraczek accb132f0f fix slim int8 mkldnn multithreading issue (#18009)
6 years ago
tensor-tang 5c06bff222
combine noavx and avx package (#17889)
6 years ago
Jiabin Yang 4d5f6937c3
Feature/refine api for dygraph (#17907)
6 years ago
gongweibao fbbdc9ccad
Add backward and optimizer operator dependency pass. (#17746)
6 years ago
wopeizl 453a49b1bc
Make ParallelExecutor support Windows GPU (#17787)
6 years ago
翟飞跃 993c703bcc INT8 MKL-DNN v2 integrate to slim (#17634)
6 years ago
wopeizl 841553e13f
use pyreader to read data in dygraph mode (#17314)
6 years ago
Zeng Jinle 674e0ce2d6
Use Python C-API to speed up dygraph trace (#17837)
6 years ago
Jiabin Yang 3b70f870e2
Using Smart pointer to optimizer memory usage of dyGraph (#17768)
6 years ago
guru4elephant d52391094d
fix prepare context redundant code problem, optimize executor by cach… (#17743)
6 years ago
Zeng Jinle 432ac70124
clean code of py_layer in dygraph mode,test=develop (#17661)
6 years ago
gongweibao 65bbf950ee
Add multi-ncclcomm and 2D ncclallreduce support. (#17263)
6 years ago
Zhaolong Xing 61221ebc28
TRT: Support set dynamic range in int8 mode. (#17524)
6 years ago
wopeizl 6724a652f3
add __str__ method for tensor and lodtensor to support print test=dev… (#17588)
6 years ago
guru4elephant 326bf8291a
add Run Prepared Ctx (#17616)
6 years ago
flame 2280f185d7
BuildStrategy api comment (#17348)
6 years ago
guru4elephant 7f8bc49d00
polish_executor_and_add_ctx_cache (#17536)
6 years ago
Zeng Jinle c6189637cd
Fix allocator bug (#16712)
6 years ago
Qiao Longfei 92e7d5d7cc
fix distribute doc test=develop (#17318)
6 years ago
Qiao Longfei 58f7695ab2
Async exe support communicator (#17386)
6 years ago
Tao Luo 32da5e9c3d
remove unused expected_kernel_cache_pass (#17486)
6 years ago
Yan Xu 0217555530 polish parallel dygraph code (#17164)
6 years ago
Jiabin Yang d7df4e5e5b
Fix/Fix memory leak in dygraph (#17394)
6 years ago
Zhen Wang 4a1b7fec96
Add setting Scope function for the graph class (#17417)
6 years ago
jiaqi 66d51206b1
add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118)
6 years ago
Tao Luo 68ec0a6f74
make parallel_executor support FLAGS_use_mkldnn (#17341)
6 years ago
Jiabin Yang 4624d7c642
test=develop, add gradient sort backward strategy (#17125)
6 years ago
chengduo bc833945a4
Add DropLocalExeScopes in ParallelExecutor (#17297)
6 years ago
qingqing01 e32c9888f5
Double backward of conv2d. (#17211)
6 years ago
lujun e388a1fb66
Repair api example (#17221)
6 years ago
chengduo 04bd413acb
Code Clean: Move all pass to paddle::framework::ir (#17228)
6 years ago
Zeng Jinle f2fa3f7300
fix api doc,test=develop (#17241)
6 years ago
石晓伟 a72dbe9abf
Cherry-pick benchmark related changes from release/1.4 (#17156)
6 years ago
Zeng Jinle c5eeecca7c
Fix tensor_py.h (#17195)
6 years ago
Zeng Jinle 5dfe2ab9e8
Fix mem leak when converting Tensor to numpy array (#17182)
6 years ago
Yan Xu 0b07eef118
ParallelDyGraph with GPU collective mode (#16827)
6 years ago
guru4elephant 03d469ad98
Merge pull request #17005 from wopeizl/fix_ncclwrapper_win1
6 years ago
liuwei1031 a770ce0615
add doc for memory_optimize, test=develop (#17010)
6 years ago
qingqing01 ea42e431f8
Speed unit testing. (#16978)
6 years ago
wopeizl 51a0243a56 fix nccl wrapper on windows
6 years ago
Zeng Jinle 1202d3fc74
Refine model gpu memory (#16993)
6 years ago
guru4elephant bbc6c5714f
Merge pull request #16887 from guru4elephant/add_nccl_context_pybind
6 years ago
gongweibao cbdb8a17b1
Polish DGC code (#16818)
6 years ago
dongdaxiang 466d177d09 add pybind dependency
6 years ago
dongdaxiang 4aa6f679b5 add pybind dependency
6 years ago
dongdaxiang b091139049 add nccl wrapper for python API
6 years ago
Yiqun Liu 112f16143b
Add an option to enable the cache of expected kernel in train phase. (#16724)
6 years ago
chengduo 55b15db5af
Add unit test for fuse all_reduce ops (#16699)
6 years ago
Yiqun Liu 3fe8cb0dd7
Enable the runtime_context_cache pass in train phase (#16640)
6 years ago
guru4elephant 7d653f0aed
Merge pull request #16652 from xjqbest/dataset_merge_develop
6 years ago
xjqbest 6a57e8075a remove trainer_id in datafeed and dataset
6 years ago
Yan Xu b4c3a6aa0b
[Imperative] implement imperative NCCLParallelContext (#16477)
6 years ago
xjqbest 271b7147cc fix dataset bug
6 years ago
chengduo b75a69bad6
Add Stream for fetch op handle (#16600)
6 years ago
乔龙飞 Qiao Longfei 21622ca30b
Merge pull request #16172 from jacquesqiao/add-async-ssa-graph-executor-communicator
6 years ago
sneaxiy 10249c0b78 Merge develop
6 years ago
Qiao Longfei adf272bcec Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator
6 years ago
xjqbest 9b84e8e66b fix code style
6 years ago
xjqbest a99c8d0c29 fix client to client communication bug
6 years ago
sneaxiy 33473890f3 Merge develop
6 years ago
dongdaxiang 720647e17f rebase current develop and fix conflict
6 years ago
dongdaxiang 45eb6f0765 run pre-commit check files and fix code style problem
6 years ago
xjqbest e95cafd9a7 fix code style & add dataset testcase
6 years ago
xjqbest be74de2c61 fix code style & fix register bug & add release_memory
6 years ago
xujiaqi01 a5b1a0e12b support multi dataset && add init model && fix bug
6 years ago
dongdaxiang b7a202aa38 add distributed optimizer factory
6 years ago
dongdaxiang f612877797 add incubate for unified API
6 years ago
dongdaxiang 317eb0aad3 add incubate for unified API
6 years ago
xujiaqi01 ecfc7df913 add dataset factory && fix style
6 years ago
xujiaqi01 3cea00bd52 store memory data in Dataset && fix bug
6 years ago
dongdaxiang cc4def6ba5 fix some conflict for compilation
6 years ago
heqiaozhi 9bca1926c1 refactor & fix bug
6 years ago
xjqbest 2e9a836c6f add DataSet and InMemoryDataFeed, support load data into memory and shuffle data
6 years ago
dongdaxiang e36bbcc871 fix some typo and CMakefile.txt
6 years ago
xjqbest 824b84d185 add DataSet and InMemoryDataFeed, support load data into memory and shuffle data
6 years ago
dongdaxiang be757096da add pybind for fleet
6 years ago
Qiao Longfei d8974e6da0 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator
6 years ago
chengduo 1096746cbf
Fuse Adam And SGD ops (#15933)
6 years ago
sneaxiy 2c836ff914 check default grad maker
6 years ago
Zeng Jinle 69cb9792ea
Merge pull request #16506 from sneaxiy/revert-16424-fix_allocator_bug
6 years ago
chengduo ed61d67c73
Fix the interface of Pass::Apply (#16484)
6 years ago
Zeng Jinle 174d0d0b90 Revert "Fix allocator bug"
6 years ago
gongweibao eb83abeac3
Add DGC(Deep Gradient Compression) interface. (#15841)
6 years ago
Zeng Jinle 644e8af4cf
Merge pull request #16424 from sneaxiy/fix_allocator_bug
6 years ago
Zeng Jinle c7c6eeb44e
Merge pull request #16409 from sneaxiy/feature/advance_gc
6 years ago
wopeizl c300b1ba69
Tensor index (#16223)
6 years ago
Xin Pan f8c279b11c
Merge pull request #16454 from panyx0718/imperative2
6 years ago
Qiao Longfei 30618409db Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator
6 years ago
chengduo 4f2278f032
Add doc for CPUPlace CUDAPlace CUDAPinPlace (#16442)
6 years ago
sneaxiy 78fb3a62e0 fix env variable settting bug
6 years ago
sneaxiy 2d92b6be98 merge develop
6 years ago
Xin Pan fd24ab47ab polish
6 years ago
sneaxiy a7d0ac50b8 Merge develop
6 years ago
sneaxiy 7000ec85d9 fix some op grad maker
6 years ago
sneaxiy f8ed2c229e try to fix ci error
6 years ago
sneaxiy c20db6357b split PR
6 years ago
sneaxiy 2f54d9f995 Merge develop
6 years ago
sneaxiy a93a9eef8f add op registry type
6 years ago
sneaxiy 953214ad97 add more unittest
6 years ago
chengduo f26ba5bddd
Fuse AllReduce (#15921)
6 years ago
Tao Luo 7d2740db83
Revert "cache runtime_context"
6 years ago
sneaxiy fd23262e0c merge develop, fix conflict
6 years ago
Qiyang Min c7f1f3ed0c
Merge pull request #16214 from velconia/imperative_infer_var_type
6 years ago
Tao Luo dbb92ee4b1
Merge pull request #16002 from luotao1/runtime_context
6 years ago
sneaxiy 161b8ddcaa Merge develop
6 years ago
minqiyang b40e41fbd1 Polish code style
6 years ago
Qiyang Min 8e4ad008fb
Merge pull request #16198 from velconia/imperative_train_speed
6 years ago
minqiyang 36dce65bb3 Take DataType and VarType apart
6 years ago
minqiyang 438bca9c3d Implement Runtime Var Type Inference
6 years ago
luotao1 1b59bed989 Merge branch 'develop' into runtime_context
6 years ago
qingqing01 8ad672a287
Support sync batch norm. (#16121)
6 years ago
minqiyang 7355d41834 1. Add imperative gperf profiler
6 years ago
luotao1 b2898c0f57 Merge branch 'develop' into runtime_context
6 years ago
minqiyang 98dfb492bb Release GIL lock
6 years ago
sneaxiy ac0e0f5181 merge develop
6 years ago
minqiyang 42e96a029f Accelerate CPU part
6 years ago
sneaxiy 682f2dbf29 merge develop
6 years ago
sneaxiy 2c4fcaa683 merge develop
6 years ago
luotao1 d94fd97230 add runtime_context_cache_pass
6 years ago
Yan Xu 30568473ec
fix broadcast on mp mode (#15951)
6 years ago
baojun e3c37bd564 remove const_cast and refactor ngraph engine code (#15925)
6 years ago
Zhen Wang ac6ef06ffa Add the Clone method in Graph. test=develop
6 years ago
Zhen Wang 01eddf125c Not add graph copy construction method. test=develop
6 years ago
Zhen Wang 1b9c8d5f06 add clone function for IrGraph. test=develop
6 years ago
Qiyang Min 1f4aa7a202 Imperative remove all descs (#16045)
6 years ago
Zeng Jinle 472f16b5aa
Merge pull request #16063 from sneaxiy/enhance_gc
6 years ago
wopeizl a38db3cb99
Fixrecordio (#16124)
6 years ago
sneaxiy b80d76f784 merge develop
6 years ago
sneaxiy 732fa00eaf disable gc in recurrent_op currently
6 years ago
Tao Luo 6f2581e4c5
Merge pull request #16090 from lidanqing-intel/paddle-int32
6 years ago
Zhaolong Xing 3d63aa0a11
Merge pull request #15729 from NHZlX/add_static_model_load_for_trt
6 years ago
nhzlx a9ed427749 cant not pass ci
6 years ago
lidanqing 4aeb261da9 Add INT32 support. INT32 in last switch case
6 years ago
sneaxiy 2a639d5c2a add allocator chain to fix bug
6 years ago
Qiao Longfei 8744f9a083 fix parallel executor async mode
6 years ago
Qiao Longfei e70b1727ef Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
6 years ago
Qiao Longfei 847e4f4e85 pure async mode train
6 years ago
sneaxiy 3334c279d0 add sample_generator
6 years ago
Qiyang Min 187cffd019
Merge pull request #15928 from velconia/imperative_backward_hooks
6 years ago
minqiyang ac88c62a5b Reset output var's pre_op pointer when op was destructed
6 years ago
sneaxiy 69b1ebdfa5 merge develop
6 years ago
mozga-intel 68a9ead17a The flag of mkldnn is enabled iff it is necessary
6 years ago
Zhen Wang e00c7a2e26
Merge pull request #15830 from wzzju/add_ir_node_encapsulation
6 years ago
Qiao Longfei f768fbf715 support multi graph
6 years ago
minqiyang efb2f2baf8 Fix bugs
6 years ago
Qiao Longfei cf0511f21e Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
6 years ago
Zhen Wang 548931456c update some functions' names according to the suggestion. test=develop
6 years ago
sneaxiy c545f1ed8f unify API
6 years ago
minqiyang b420ec3a92 invoke backward_hooks after reduce op's depcounts map
6 years ago
Qiyang Min 4bd28b304b
Merge pull request #15831 from velconia/imperative_engine
6 years ago
sneaxiy b17541a9c1 fix hang bug
6 years ago
minqiyang 84bf4d7b06 Move ClearBlock into OpBase and VarBase's destructor
6 years ago
minqiyang 2b3510bc50 Add imperative python tracer
6 years ago
minqiyang a15a3fc314 Polish code
6 years ago
sneaxiy 1e4c0a6f72 merge develop
6 years ago
minqiyang 9dc64edfd9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_engine
6 years ago
Xin Pan 32d5a16036 resolve conflicts
6 years ago
Xin Pan 26e32e095a allow compiler to use graph
6 years ago
minqiyang 8fe0c0c52c implement backward refs
6 years ago
Qiao Longfei cc71e89499 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
6 years ago
minqiyang 74551758cc Polish code
6 years ago
minqiyang f53e1d5c4b implement ClearBlock
6 years ago
sneaxiy 7160cb0f32 decoupled reader
6 years ago
sneaxiy d331e97af8 fix compiler place compare
6 years ago
sneaxiy e6ff549849 small fix doc
6 years ago
sneaxiy 796e221efc fix api arg0
6 years ago
minqiyang 52e5ee60bd Add debug info
6 years ago
Zhen Wang bc95a4ccfe
Merge branch 'develop' into quantization_inference_passes
6 years ago
Gabor Buella da9c94da33 Clang build fixes (#15628)
6 years ago
dzhwinter 381f2015a5
Merge pull request #15665 from dzhwinter/experiment/refactor_memory
6 years ago
xuezhong eeaa2066e5 add device info to tensor
6 years ago
dzhwinter 04e9776aef add details. test=develop
6 years ago
Qiao Longfei 16af1dbc7b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
6 years ago
peizhilin 3a4110f960 fix ci broken randomly and disable some warnings
6 years ago
dzhwinter 4f01de6378 Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
liuwei1031 6e84eb131f expose peak gpu memory API to python test=develop (#15529)
6 years ago
WangZhen 2175292634 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into quantization_inference_passes
6 years ago
dzhwinter 06f2448848 Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
Yan Chunwei 655179089f
AnalysisConfig remove contrib namespace (#15540)
6 years ago
Qiao Longfei d6c0dcaa16 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
6 years ago
WangZhen c67b29c178 fix some bugs of graph.to_program and get_pass.
6 years ago
dzhwinter ee3aae56cd merge develop branch. test=develop
6 years ago
Zhaolong Xing 97b76c94c4
Merge pull request #15242 from NHZlX/trt_int8_ultimate_version
6 years ago
WangZhen c8095eeb82 add freeze pass, and UT is passed.
6 years ago
Qiao Longfei ada43e89c3 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
6 years ago
乔龙飞 Qiao Longfei c58555067e
Merge pull request #14731 from jacquesqiao/optimize-cpp-reader
6 years ago
nhzlx 36abc964df fix pybind problem: add an enum to AnalysisConfig
6 years ago
Zeng Jinle 2480a3df7d
Merge pull request #15496 from sneaxiy/lazy_allocator2
6 years ago
WangZhen dde19a0ff8 add quantization freeze pass.
6 years ago
Zeng Jinle dec89bd7ed
Merge pull request #15460 from sneaxiy/try_to_turn_on_remove_unnecessary_lock
6 years ago
Xin Pan 58cb18d9d9
Merge pull request #15322 from velconia/imperative_resnet
6 years ago
sneaxiy 51227bd447 lazy_allocator
6 years ago
minqiyang c8965dc1ab Polish code
6 years ago
sneaxiy ef788603d4 merge develop
6 years ago
Zhen Wang 58727e8e6d
Merge pull request #15455 from wzzju/graph_quantization
6 years ago
Tao Luo fef3fd6d62
Merge pull request #15452 from luotao1/legacy_option
6 years ago
Paddle CI 289aba750a Polish code
6 years ago
WangZhen b913463e83 Update according to the reviewers' suggestion. test=develop
6 years ago
sneaxiy d8568acd19 turn on remove_unnecessary_lock
6 years ago
WangZhen 3ce6172052 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into graph_quantization
6 years ago
WangZhen 59e5cc51d6 Add quantization transform pass and UT.
6 years ago
flame d60751fb71
add python inference api (#15248)
6 years ago
dzhwinter 8f3b252392 squash commits. test=develop
6 years ago
Tao Luo cf29ea1592 remove legacy ANDROID option
6 years ago
Qiao Longfei 45578c1b48 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader
6 years ago
minqiyang 8ce198b2e1 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_resnet
6 years ago