264 Commits (8c81d9949eea828acb76079c685402f6c26c2059)

Author SHA1 Message Date
minqiyang 0c3227a523 Change the origin VLOG level to 10 times
7 years ago
Yu Yang c28beb8a3c test(Pe): add dry run tests for pe (#14254)
7 years ago
Yu Yang 90d9e5aee8 feat(platform): lazy initialization of devicecontext in pool (#14067)
7 years ago
Qiao Longfei d26ff8cb2d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpu-for-1.1-merge-with-shape
7 years ago
Wu Yi 26200f2e42 [1.1] [project] train imagenet using large batch size (#13766)
7 years ago
Qiao Longfei 641369f92b Merge branch 'dist-table-do-not-init-on-trainer' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Wu Yi 9da9b1926b [1.1] fix graph num hang (#14072)
7 years ago
Qiao Longfei fad42fe7cc broadcast handle not inited parameter
7 years ago
Xin Pan 726fd438cd avoid blocking everyone
7 years ago
chengduo e943f4508b add graph number check (#14025)
7 years ago
chengduozh 82d2903b63 Fix fast ParallelExe bug
7 years ago
sneaxiy d3ed070e10 test=develop
7 years ago
sneaxiy fb6201e93e test=develop
7 years ago
sneaxiy 9606b37ce4 test=develop
7 years ago
chengduo d6747a9ac2 make check_graph choosable (#13674)
7 years ago
chengduo 5175b3cb2b Add GraphChecker (#13580)
7 years ago
Xin Pan 36c2a9af27 pass builder allow cutomize pass in python.
7 years ago
chengduo d402234ba8 Feature/op_fuse_pass (#12440)
7 years ago
Xin Pan a83a4fab5c Merge pull request #13441 from panyx0718/ir2
7 years ago
Xin Pan ec6ee0a293 simplify and hide bcast_params
7 years ago
sneaxiy 612e1a3155 modification
7 years ago
sneaxiy d0b2453ecd merge develop
7 years ago
sneaxiy 24ea39c4c6 feature/eager_delete_tensor
7 years ago
minqiyang dc863aac7e Add kids exists detection in Scope
7 years ago
minqiyang 681514e15f Make all scope pointer to shared
7 years ago
yuyang18 05cadf1b24 Add FastExecutor
7 years ago
Xin Pan 626abfc33a code clean up and renaming
7 years ago
Xin Pan 99c0c20468 add pass test
7 years ago
Xin Pan ab72d28a5e clean up and correctness check
7 years ago
Xin Pan aa1085ddc5 all passes
7 years ago
Xin Pan e4d7d7ae8f pass refactoring
7 years ago
Xin Pan 142e832d21 pass registration
7 years ago
Xin Pan 5b183557f3 graph viz pass
7 years ago
Xin Pan c3f6e0e8a2 add namespace to Graph
7 years ago
Xin Pan 64eaa4c829 clean
7 years ago
Xin Pan 2fa8df1caf separate graph building pass and graph-based pe builder
7 years ago
Xin Pan 9605fcd124 all graphs
7 years ago
Xin Pan af79b19207 add a simple program to graph
7 years ago
Xin Pan 68aa500451 polish attrs
7 years ago
Yancey 0042ba93c8 Merge pull request #12127 from Yancey1989/enforce_rpc_timeout
7 years ago
chengduo 325fbc4f1b Add learning rate decay test (#12124)
7 years ago
chengduo 86b0a72576 Refine multi thread cpu parallel exe (#11406)
7 years ago
Yancey1989 d14afcedeb polish function name
7 years ago
Yancey1989 1effba3312 fix pe with cpu place
7 years ago
chengduo 8d76cf397d Fix TensorCopy bug (#11822)
7 years ago
chengduo 6711b7b5f1 fix FeedAndSplitTensorIntoLocalScopes (#11817)
7 years ago
yi.wu 8d04d0e2a3 update
7 years ago
yi.wu 6f0107126a fix broadcast bug
7 years ago
yi.wu 8e48c77b54 wip
7 years ago
yi.wu 3d69a82b83 fix dist train broadcasting bug
7 years ago
fengjiayi 964f515e9a fix mac compile
7 years ago
Yancey1989 7e6518e8ca fix compile warning
7 years ago
Yancey1989 7d1b146939 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into overlap_memcpy_with_dist
7 years ago
Qiyang Min 046bb5c8cb Fix NCCLBcast hang up bug in Parallel Executor (#11377)
7 years ago
Yancey1989 6d752bafd8 use get_appropriate_dev to schedule rpc op
7 years ago
Yancey1989 4444e79e46 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into overlap_memcpy_with_dist
7 years ago
chengduoZH 173d72b481 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into enable_cpu_on_pe
7 years ago
chengduoZH aadaadf735 replace use_event with use_cuda, because use_event means the program running with CUDA, so use_cuda maybe more intuitive.
7 years ago
chengduoZH 1e731f5964 small fix
7 years ago
chengduoZH 5a3c8bf813 fix in c++ side
7 years ago
chengduoZH 0c851cab22 add SSA graph checker
7 years ago
Yancey1989 d5a88b9340 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into overlap_memcpy_with_dist
7 years ago
chengduoZH 8291b916d6 replace graph_builder_factory with ssa_graph_builder_factory
7 years ago
Yancey1989 23433def4b Merge branch 'develop' of github.com:PaddlePaddle/Paddle into overlap_memcpy_with_dist
7 years ago
yuyang18 d9af153232 SSA Graph Builder Factory
7 years ago
Yancey1989 e533a4b4ab Merge branch 'develop' of github.com:PaddlePaddle/Paddle into overlap_memcpy_with_dist
7 years ago
Yancey1989 cb3861538d fix compile failed with CPU
7 years ago
Yancey1989 93401c98e1 overlap rpc op memcpy in distributed training
7 years ago
yuyang18 86a61c177f Add ScopeBufferedSSAGraphExecutor
8 years ago
yuyang18 7c777dd549 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/exec_strategy
8 years ago
yuyang18 08295f9877 Add build strategy
8 years ago
yuyang18 e5281b3c2d Clean code & add execution strategy
8 years ago
typhoonzero 928418a9ac Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gen_nccl_id_op
8 years ago
typhoonzero f5840d8925 follow comments
8 years ago
chengduoZH 97cb5479ae change PE strategy
8 years ago
typhoonzero a529d790b6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gen_nccl_id_op
8 years ago
typhoonzero d9320dcd94 complete code
8 years ago
chengduoZH c891189568 update sparse gradient parameter with reduce and broadcast
8 years ago
chengduoZH 5ff1ef36ee update sparse parameter
8 years ago
yangyaming 82571deb89 Change `customize_loss_grad` to `use_default_grad_scale`.
8 years ago
Yu Yang 54ada9449e Add demo for recordio train/test and parallel executor
8 years ago
Yu Yang 7a395881d4 Add customize_loss_grad option to PE
8 years ago
Yu Yang 5305c5f845 Correctly implement destructor of ParallelExecutor
8 years ago
fengjiayi fbe562478d Merge pull request #9994 from reyoung/feature/debug
8 years ago
Yu Yang 06fb055a2f New group
8 years ago
Yu Yang 71a2e6b73c Reverse create var
8 years ago
Yu Yang 89728f8e66 update
8 years ago
Yu Yang eb2e4eeade Debug
8 years ago
Yu Yang b4aaa00a8a Polish logic of ParallelExecutor
8 years ago
Yu Yang ad73b331c7 Eagerly drop local scope in iteration (#9838)
8 years ago
fengjiayi 90084a25d2 Merge pull request #9743 from JiayiFeng/modify_readers_to_fit_parallel_executor
8 years ago
wanghaoshuang 19c1a68ee9 Fix lost of LoD while splitting tensor in parallel executor.
8 years ago
JiayiFeng ee178d5aeb fix bugs
8 years ago
chengduoZH 7e7611d067 when the number of samples of current batch is less than the count of devices, let it crash.
8 years ago
qingqing01 2b7e5bd366 Support testing during training by ParallelExecutor. (#9738)
8 years ago
Xin Pan 4bbfa9eccb Add feed to ParallelExecutor
8 years ago
Xin Pan b123ce88a1 Add enable/disable for delayed ops
8 years ago
Xin Pan d0ac92531d Improve ParallelExecutor performance
8 years ago
qiaolongfei 9a101cfc08 clean code
8 years ago
qiaolongfei 997e9a1fd2 fix mac compile
8 years ago