24 Commits (767bf0c8d3d8eeec2aae36fda2ef12e88af021ca)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Wu Yi | 9ffd5eecef | test fix fetch bar place for ce (#16406) | 6 years ago |
| chengduo | a6a3b2fbbc | [Speed]Refine ParallelExecutor (#16190) | 6 years ago |
| chengduo | ed087f8232 | refine op_handle (#14178) | 6 years ago |
| yuyang18 | d49763a87d | Stash | 7 years ago |
| Xin Pan | 37e514432b | op compose node and update nodes. | 7 years ago |
| yuyang18 | 2d0e5592b5 | Use std::map for Place <--> DeviceContext | 7 years ago |
| fengjiayi | ff4317cee9 | follow comments | 7 years ago |
| fengjiayi | 47388020a2 | fix bugs | 7 years ago |
| chengduo | da556ed6d4 | enhance ParallelExecutor stable (#11637) | 7 years ago |
| chengduoZH | c99fca5f90 | Add No Mutex | 7 years ago |
| chengduoZH | aadaadf735 | replace use_event with use_cuda, because use_event means the program running with CUDA, so use_cuda maybe more intuitive. | 7 years ago |
| chengduoZH | a584bc86dd | add fuse var op handle | 7 years ago |
| chengduoZH | a89cd46700 | Wait VarDummyHandle generated | 7 years ago |
| chengduoZH | 9eec2c7509 | refine pe | 7 years ago |
| Yu Yang | 4452ff76b7 | Fix CPU compile | 7 years ago |
| Yu Yang | 79be06045c | Support CPU/GPU mixture for ParallelExecutor | 7 years ago |
| Yu Yang | 6b20b35589 | Fix Transformer Hang Problem | 7 years ago |
| Yu Yang | 084cdd1f4f | Rename code | 7 years ago |
| Yu Yang | 76570c2e96 | Wait fetch op | 7 years ago |
| Yu Yang | b6ca3711b4 | Get error | 7 years ago |
| Yu Yang | f385228f05 | Add Paddle Enforce | 7 years ago |
| Yu Yang | 54bd17fe7b | Complete Flowers | 7 years ago |
| Yu Yang | 15f5f10ed5 | AddInput/AddOutput for OpHandle | 7 years ago |
| Yu Yang | fe7ed285d1 | Extract NCCLCtxMap | 7 years ago |