Commit Graph

539 Commits (b4b169467ba27d04e6d4bd5c6bc3f7abbba04c65)

Author | SHA1 | Message | Date
Zeng Jinle | 432ac70124 | clean code of py_layer in dygraph mode,test=develop (#17661) | 6 years ago
gongweibao | 65bbf950ee | Add multi-ncclcomm and 2D ncclallreduce support. (#17263) | 6 years ago
Zhaolong Xing | 61221ebc28 | TRT: Support set dynamic range in int8 mode. (#17524) | 6 years ago
wopeizl | 6724a652f3 | add __str__ method for tensor and lodtensor to support print test=dev… (#17588) | 6 years ago
guru4elephant | 326bf8291a | add Run Prepared Ctx (#17616) | 6 years ago
flame | 2280f185d7 | BuildStrategy api comment (#17348) | 6 years ago
guru4elephant | 7f8bc49d00 | polish_executor_and_add_ctx_cache (#17536) | 6 years ago
Zeng Jinle | c6189637cd | Fix allocator bug (#16712) | 6 years ago
Qiao Longfei | 92e7d5d7cc | fix distribute doc test=develop (#17318) | 6 years ago
Qiao Longfei | 58f7695ab2 | Async exe support communicator (#17386) | 6 years ago
Tao Luo | 32da5e9c3d | remove unused expected_kernel_cache_pass (#17486) | 6 years ago
Yan Xu | 0217555530 | polish parallel dygraph code (#17164) | 6 years ago
Jiabin Yang | d7df4e5e5b | Fix/Fix memory leak in dygraph (#17394) | 6 years ago
Zhen Wang | 4a1b7fec96 | Add setting Scope function for the graph class (#17417) | 6 years ago
jiaqi | 66d51206b1 | add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118) | 6 years ago
Tao Luo | 68ec0a6f74 | make parallel_executor support FLAGS_use_mkldnn (#17341) | 6 years ago
Jiabin Yang | 4624d7c642 | test=develop, add gradient sort backward strategy (#17125) | 6 years ago
chengduo | bc833945a4 | Add DropLocalExeScopes in ParallelExecutor (#17297) | 6 years ago
qingqing01 | e32c9888f5 | Double backward of conv2d. (#17211) | 6 years ago
lujun | e388a1fb66 | Repair api example (#17221) | 6 years ago
chengduo | 04bd413acb | Code Clean: Move all pass to paddle::framework::ir (#17228) | 6 years ago
Zeng Jinle | f2fa3f7300 | fix api doc,test=develop (#17241) | 6 years ago
石晓伟 | a72dbe9abf | Cherry-pick benchmark related changes from release/1.4 (#17156) | 6 years ago
Zeng Jinle | c5eeecca7c | Fix tensor_py.h (#17195) | 6 years ago
Zeng Jinle | 5dfe2ab9e8 | Fix mem leak when converting Tensor to numpy array (#17182) | 6 years ago
Yan Xu | 0b07eef118 | ParallelDyGraph with GPU collective mode (#16827) | 6 years ago
guru4elephant | 03d469ad98 | Merge pull request #17005 from wopeizl/fix_ncclwrapper_win1 | 6 years ago
liuwei1031 | a770ce0615 | add doc for memory_optimize, test=develop (#17010) | 6 years ago
qingqing01 | ea42e431f8 | Speed unit testing. (#16978) | 6 years ago
wopeizl | 51a0243a56 | fix nccl wrapper on windows | 6 years ago
Zeng Jinle | 1202d3fc74 | Refine model gpu memory (#16993) | 6 years ago
guru4elephant | bbc6c5714f | Merge pull request #16887 from guru4elephant/add_nccl_context_pybind | 6 years ago
gongweibao | cbdb8a17b1 | Polish DGC code (#16818) | 6 years ago
dongdaxiang | 466d177d09 | add pybind dependency | 6 years ago
dongdaxiang | 4aa6f679b5 | add pybind dependency | 6 years ago
dongdaxiang | b091139049 | add nccl wrapper for python API | 6 years ago
Yiqun Liu | 112f16143b | Add an option to enable the cache of expected kernel in train phase. (#16724) | 6 years ago
chengduo | 55b15db5af | Add unit test for fuse all_reduce ops (#16699) | 6 years ago
Yiqun Liu | 3fe8cb0dd7 | Enable the runtime_context_cache pass in train phase (#16640) | 6 years ago
guru4elephant | 7d653f0aed | Merge pull request #16652 from xjqbest/dataset_merge_develop | 6 years ago
xjqbest | 6a57e8075a | remove trainer_id in datafeed and dataset | 6 years ago
Yan Xu | b4c3a6aa0b | [Imperative] implement imperative NCCLParallelContext (#16477) | 6 years ago
xjqbest | 271b7147cc | fix dataset bug | 6 years ago
chengduo | b75a69bad6 | Add Stream for fetch op handle (#16600) | 6 years ago
乔龙飞 Qiao Longfei | 21622ca30b | Merge pull request #16172 from jacquesqiao/add-async-ssa-graph-executor-communicator | 6 years ago
sneaxiy | 10249c0b78 | Merge develop | 6 years ago
Qiao Longfei | adf272bcec | Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator | 6 years ago
xjqbest | 9b84e8e66b | fix code style | 6 years ago
xjqbest | a99c8d0c29 | fix client to client communication bug | 6 years ago
sneaxiy | 33473890f3 | Merge develop | 6 years ago