Commit Graph

52 Commits (32211fe9c4c22168dfb73f19763b17ac9191341a)

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| Qi Li | 34f1628ce8 | [ROCM] update fluid platform for rocm39 (part2), test=develop (#30774) | 4 years ago |
| WangXi | 572c466d19 | [Prepare for MultiProcess xpu] unified gen nccl id, refine imperative reducer (#30455) | 4 years ago |
| Leo Chen | 81217a94d8 | unify calling cudaSetDevice (#30470) | 4 years ago |
| Huihuang Zheng | acc11c2a62 | Retry CUDA Initialization to Fix Random Failure, test=develop (#28323) | 4 years ago |
| GaoWei8 | c10dcff12d | refine PADDLE_ENFORCE (#25456) | 5 years ago |
| GaoWei8 | ea7e532598 | Refine PADDLE_ENFORCE (#25369) | 5 years ago |
| Chen Weihang | aa0f254fbe | Add macro BOOST_GET to enrich the error information of boost::get (#24175) | 5 years ago |
| Yi Liu | 2169e6fb58 | Initialize global nccl_comm in PE (#23275) | 5 years ago |
| Wilber | de009152a7 | Compile without nccl deps. [2/2] (#22484) | 5 years ago |
| Zeng Jinle | cdb3d27985 | Fix warn of gcc8 (#21205) | 5 years ago |
| Tao Luo | 75d1571995 | refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603) | 6 years ago |
| gongweibao | c0a82748cf | Polish backwards optimizer dependency codes and use more default values. (#18255) | 6 years ago |
| gongweibao | f5caf3443c | Fix reinitialized ncclid error! (#18025) | 6 years ago |
| gongweibao | 0d561ef442 | fix 2dconn test=develop (#17681) | 6 years ago |
| gongweibao | 65bbf950ee | Add multi-ncclcomm and 2D ncclallreduce support. (#17263) | 6 years ago |
| Wu Yi | 6382b62f6b | Collective ops (#15572) | 6 years ago |
| qingqing01 | 8ad672a287 | Support sync batch norm. (#16121) | 6 years ago |
| sneaxiy | ba4f43fd62 | fix compile error in distributed mode | 6 years ago |
| Yancey1989 | 86bb583881 | Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode | 6 years ago |
| Yancey1989 | 41a64f6a2a | Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode | 6 years ago |
| Wu Yi | 856f0da0fe | Fp16 training (#14992) | 6 years ago |
| typhoonzero | da87f7a698 | Revert "[Feature] Fp16 training for resnet50 (#14850)" | 6 years ago |
| Wu Yi | 3d750f9c5a | [Feature] Fp16 training for resnet50 (#14850) | 6 years ago |
| Yancey1989 | 4a4ccac1d0 | update by comment test=develop | 6 years ago |
| Yu Yang | 9bd70a1e04 | Change tensor uses proto::VarType::type | 6 years ago |
| Yancey1989 | cb8a24be14 | clean code | 6 years ago |
| Yancey1989 | c9de6f1b05 | init parallel graph mode | 6 years ago |
| Wu Yi | 29d9fb53fc | [Feature] multi process multi gpu dist training, boost v100 performance by 20% (#14661) | 6 years ago |
| minqiyang | 53433d7f2e | Revert the changes of VLOG | 6 years ago |
| peizhilin | 7840d181c9 | fix style issue | 6 years ago |
| peizhilin | ca60e1d34d | Merge remote-tracking branch 'upstream/develop' into windows/build | 6 years ago |
| minqiyang | 0c3227a523 | Change the origin VLOG level to 10 times | 6 years ago |
| peizhilin | 9d67c1fb69 | cpu build support | 6 years ago |
| Wu Yi | f90c7865f0 | Benchmark tool for imgnet (#12305) | 7 years ago |
| Qiyang Min | 046bb5c8cb | Fix NCCLBcast hang up bug in Parallel Executor (#11377) | 7 years ago |
| gongweibao | 4fb7cc7f5e | Move sync_mode device ctx from grpc server (#10881) | 7 years ago |
| yuyang18 | 7c777dd549 | Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/exec_strategy | 7 years ago |
| yuyang18 | 08295f9877 | Add build strategy | 7 years ago |
| typhoonzero | 7b0c0273f4 | update by comments | 7 years ago |
| typhoonzero | f5840d8925 | follow comments | 7 years ago |
| typhoonzero | 17009d0627 | workable version | 7 years ago |
| typhoonzero | 3667578ec2 | testing | 7 years ago |
| typhoonzero | d9320dcd94 | complete code | 7 years ago |
| Yu Yang | 093d227a77 | Use mutex to stablize ncclCtxMap | 7 years ago |
| Yu Yang | c64190ecbb | Polish NCCLHelper | 7 years ago |
| qingqing01 | 129859e732 | Support data type int64 in NCCL. (#9818) | 7 years ago |
| Yu Yang | 7dcb217e31 | Refine allreduce op | 7 years ago |
| Yu Yang | c0c2e15920 | NCCL AllReduce | 7 years ago |
| Yu Yang | fe7ed285d1 | Extract NCCLCtxMap | 7 years ago |
| Yu Yang | 6ebc6bf533 | ReorganizeCode | 7 years ago |