Commit Graph

443 Commits (f9c97dd7287119f7546c90d9813ec825b3c956d2)

Author SHA1 Message Date
Zhen Wang 89cfa49156
Unmerged fetch list (#22635)
5 years ago
Chen Weihang 7d8d573453
Speed up dygraph DataLoader based on shared memory and LoDTensor serialization (#22541)
5 years ago
Leo Chen b2c1be851a
support cond in clone, test=develop (#22657)
5 years ago
hutuxian 175954d894
PaddleBox Framework Part2 (#22466)
5 years ago
wangchaochaohu c65c6ae534
add flag to control profile level in python API (#22319)
5 years ago
tangwei12 b0675c8193
fix bug with compiledProgram (#22495)
5 years ago
Wilber de009152a7
Compile without nccl deps. [2/2] (#22484)
5 years ago
Yiqun Liu dcfb603897
Enable the detection of subgraph composed of grad ops (#21223)
5 years ago
Wilber 7bc4b09500
add WITH_NCCL option for cmake. (#22384)
5 years ago
Yiqun Liu b7cac50b64
Implement a common python unittest to test the ir passes. (#22209)
5 years ago
xujiaqi01 e3a457d34b
add collective communication library in fleet (#22211)
6 years ago
Zhen Wang 46189b166d
Add bn and relu fuse pass (#22048)
6 years ago
Huihuang Zheng dd4361568e
Add ParallelExecutor Test for Cond API and Fix PE Checks Shape Bug (#22029)
6 years ago
Huihuang Zheng 557bce77da
Fix Backward Bugs in Conditional Block (#21809)
6 years ago
mapingshuo 686f0ecb6a
add `no_need_buffer_slots` interface to pybind (#21575)
6 years ago
liym27 9da7e6b4d4
add file check_op_desc.py and add interface to get default value. (#21530)
6 years ago
Zeng Jinle 3a7caf481c
add grad maker assert, test=develop (#21564)
6 years ago
Leo Chen cdd46d7e02
Split VarBase from Python Variable for Dygraph (#21359)
6 years ago
Aurelius84 54382ce497
Add get_all_kernels api of registered data_type in pybind.cc (#21499)
6 years ago
Youwei Song d5ff79e55e
Support numpy bridge (enabled by default in dygraph mode) (#20983)
6 years ago
Zeng Jinle b9f8ae8494
Add global value getter setter (#21285)
6 years ago
Dong Daxiang 691ced87c0
Refactor fetch handler (#21264)
6 years ago
Zeng Jinle 5fdfbe3413
Add friendly dygraph trace API (#21091)
6 years ago
Leo Chen 9974e40787
Update Tensor.set() to support float16 (#19964)
6 years ago
Yiqun Liu 16e4d02675
Refine the cache of program, context and scope in executor. (#18483)
6 years ago
hong ff0886a92a
save load problem fix and new feature add (#20823)
6 years ago
WangXi 507afa8a8a
Fix dgc nan by stripping nccl from sparseReduce. (#20630)
6 years ago
633WHU 12e4be0382
Dlpack support (#20039)
6 years ago
Zeng Jinle 40effc61af
Refine py_reader exit (#20331)
6 years ago
liu zhengxi f855a86c93
update the api en doc of BuildStrategy (#20445)
6 years ago
tangwei12 a010d883b4
doc fix, test=develop, test=document_fix (#20239)
6 years ago
Leo Chen 5a7142ac4e
Update en APIs of LoDTensor (#20115)
6 years ago
hong fa43e80e19
New save load interface (#20148)
6 years ago
Leo Chen f4c56e9f51
Polish en doc of LoDTensorArray, test=document_fix (#19972)
6 years ago
Youwei Song 20f68916ed
refine CUDA CPU places en doc (#20243)
6 years ago
tangwei12 c9139c3db3
trainer from dataset fetch targets (#19760)
6 years ago
qingqing01 1a3eef026c
Enable users to create custom cpp op outside framework. (#19256)
6 years ago
石晓伟 01b9d07963
update operator compatible info, test=develop (#19978)
6 years ago
Yang Zhang cde73a7bbf
Expose `mutable_data` as python binding (#19932)
6 years ago
Wojciech Uss 4286a6270d
Add support for new QAT models (#18970)
6 years ago
Chen Weihang 00d5375e0c
Add prune_backward function to cover complicated test_program.clone situation (#19772)
6 years ago
chengduo 056fdedde3
Open fuse all reduce option (#19765)
6 years ago
mapingshuo dca9b6c5b0
add feed_var_names to Prune interface (#19589)
6 years ago
hutuxian c756b5d231
Paddlebox Framework (#18982)
6 years ago
Leo Chen 6fb310ae29
Fix bug of getting bool Flags from os.environ (#19349)
6 years ago
Leo Chen a9d5fc5142
Enhance OpTest to check the consistency of operators when using and not using inplace (#19101)
6 years ago
Zeng Jinle 5b6673c44d
merge develop to solve conflict, also fix API doc, test=develop (#18823)
6 years ago
Zeng Jinle 88f111f885
remove unused inplace act codes, test=develop (#19079)
6 years ago
Leo Chen 8f53735437
Fix memory overwriting of tensors returned by executor (#19030)
6 years ago
liuwei1031 a43a763b54
fix warpctc.dll not found issue (#18761)
6 years ago
chengduo 20859c08e8
[DyGraph] Make multi-card program faster (#18892)
6 years ago
Zeng Jinle 8008ab4e6b
Remove legacy C++ memory optimization codes (#18834)
6 years ago
chengduo 292dfbce63
fix build strategy doc (#18725)
6 years ago
chengduo fd3aad6cb3
Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664)
6 years ago
Zeng Jinle ae58afc546
Feature/auto_growth_allocator (#18561)
6 years ago
guru4elephant d714bf037c
remove async executor and add data_feed.proto to the deps of train demo (#18659)
6 years ago
gongweibao c0a82748cf
Polish backwards optimizer dependency codes and use more default values. (#18255)
6 years ago
Zeng Jinle d3003a1620
Feature/buffer_shared_inplace (#17911)
6 years ago
xsrobin 47e2ef38e9
add "import paddle.fluid as fluid" to examples lack of it
6 years ago
Zeng Jinle 5826b72e06
Refine CUDAPlace error message. (#18343)
6 years ago
chengduo 25f3cd6486
Update execution_strategy option default value (#18183)
6 years ago
Sylwester Fraczek accb132f0f
fix slim int8 mkldnn multithreading issue (#18009)
6 years ago
tensor-tang 5c06bff222
combine noavx and avx package (#17889)
6 years ago
gongweibao fbbdc9ccad
Add backward and optimizer operator dependency pass. (#17746)
6 years ago
wopeizl 453a49b1bc
Make ParallelExecutor support Windows GPU (#17787)
6 years ago
guru4elephant d52391094d
fix prepare context redundant code problem, optimize executor by cach… (#17743)
6 years ago
Zeng Jinle 432ac70124
clean code of py_layer in dygraph mode,test=develop (#17661)
6 years ago
gongweibao 65bbf950ee
Add multi-ncclcomm and 2D ncclallreduce support. (#17263)
6 years ago
wopeizl 6724a652f3
add __str__ method for tensor and lodtensor to support print test=dev… (#17588)
6 years ago
guru4elephant 326bf8291a
add Run Prepared Ctx (#17616)
6 years ago
flame 2280f185d7
BuildStrategy api comment (#17348)
6 years ago
guru4elephant 7f8bc49d00
polish_executor_and_add_ctx_cache (#17536)
6 years ago
Zeng Jinle c6189637cd
Fix allocator bug (#16712)
6 years ago
Qiao Longfei 92e7d5d7cc
fix distribute doc test=develop (#17318)
6 years ago
Qiao Longfei 58f7695ab2
Async exe support communicator (#17386)
6 years ago
Tao Luo 32da5e9c3d
remove unused expected_kernel_cache_pass (#17486)
6 years ago
Yan Xu 0217555530
polish parallel dygraph code (#17164)
6 years ago
Jiabin Yang d7df4e5e5b
Fix/Fix memory leak in dygraph (#17394)
6 years ago
Tao Luo 68ec0a6f74
make parallel_executor support FLAGS_use_mkldnn (#17341)
6 years ago
Jiabin Yang 4624d7c642
test=develop, add gradient sort backward strategy (#17125)
6 years ago
chengduo bc833945a4
Add DropLocalExeScopes in ParallelExecutor (#17297)
6 years ago
lujun e388a1fb66
Repair api example (#17221)
6 years ago
chengduo 04bd413acb
Code Clean: Move all pass to paddle::framework::ir (#17228)
6 years ago
Zeng Jinle f2fa3f7300
fix api doc,test=develop (#17241)
6 years ago
Zeng Jinle 5dfe2ab9e8
Fix mem leak when converting Tensor to numpy array (#17182)
6 years ago
Yan Xu 0b07eef118
ParallelDyGraph with GPU collective mode (#16827)
6 years ago
guru4elephant 03d469ad98
Merge pull request #17005 from wopeizl/fix_ncclwrapper_win1
6 years ago
liuwei1031 a770ce0615
add doc for memory_optimize, test=develop (#17010)
6 years ago
wopeizl 51a0243a56
fix nccl wrapper on windows
6 years ago
dongdaxiang b091139049
add nccl wrapper for python API
6 years ago
Yiqun Liu 112f16143b
Add an option to enable the cache of expected kernel in train phase. (#16724)
6 years ago
chengduo 55b15db5af
Add unit test for fuse all_reduce ops (#16699)
6 years ago
Yiqun Liu 3fe8cb0dd7
Enable the runtime_context_cache pass in train phase (#16640)
6 years ago
Yan Xu b4c3a6aa0b
[Imperative] implement imperative NCCLParallelContext (#16477)
6 years ago
chengduo b75a69bad6
Add Stream for fetch op handle (#16600)
6 years ago
乔龙飞 Qiao Longfei 21622ca30b
Merge pull request #16172 from jacquesqiao/add-async-ssa-graph-executor-communicator
6 years ago
sneaxiy 10249c0b78
Merge develop
6 years ago
Qiao Longfei adf272bcec
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator
6 years ago
sneaxiy 33473890f3
Merge develop
6 years ago
dongdaxiang f612877797
add incubate for unified API
6 years ago