Commit Graph

347 Commits (9e9f5d8080995e71b3a7ef8fd20a0a02f33f107f)

Author SHA1 Message Date
Yu Yang aba46f077b Disable P2P
7 years ago
chengduoZH ab601c19c3 Add CUDAPinnedPlace
7 years ago
Abhinav Arora 65534c4762 Fluid channels should match the semantics of Go Channels (#9265)
7 years ago
chengduoZH 158d6c4d19 add unit test
7 years ago
Luo Tao ccfec1bcb1 remove vars when remove ops
7 years ago
chengduoZH 18eb77303d add CUDAPinnedPlace
7 years ago
typhoonzero 1ab4fcb5e7 prepare pserver executor
7 years ago
chengduoZH a0e2cf03e4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/add_pinned_memory
7 years ago
Yu Yang 9dd64d83f3 WMT Model
7 years ago
chengduoZH 39004080f4 replace use_pinned with is_pinned
7 years ago
Yu Yang cb40c33137 Update unittest
7 years ago
Yu Yang 3aa2a8ffcf Follow comments
7 years ago
Yu Yang 02aaecca35 Fix CPU compile
7 years ago
Yu Yang 54bd17fe7b Complete Flowers
7 years ago
Yu Yang 50e7e25db3 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpp_parallel_executor
7 years ago
Yu Yang 5c7a523326 Add Graphviz output
7 years ago
Qiao Longfei 8ccc61f334 support empty tensor (#9338)
7 years ago
Yu Yang edfd741e3a Add simple python wrapper for ParallelExecutor
7 years ago
Yu Yang a7b0d5bd26 Clean code
7 years ago
Yu Yang e3144393e3 Extract Executors to indie modules
7 years ago
Yu Yang c70b60dd70 Make executor steal graph inside
7 years ago
Yu Yang 4c3361cda8 Extract GraphExecutor
7 years ago
Yu Yang b123e43bf9 extract multi devices graph builder
7 years ago
Varun Arora 76ae540f8e Move Select to concurrency.py; incorporate outputs (#9136)
7 years ago
dzhwinter 13f1050ab0 "fix mixed_vector bug" (#9319)
7 years ago
Liu Yiqun 5419da6e7a Fix bug caused by block_id.
7 years ago
Yu Yang dd73d18bb7 Extract SSAGraph
7 years ago
Yu Yang 79989c9025 Add SSA builder
7 years ago
Yu Yang 64d7a30271 Extract SSAGraph
7 years ago
Yu Yang 8dec4ad7a1 Use int not Place for vars
7 years ago
Yu Yang 3181501013 Rerange code
7 years ago
Yu Yang f28ae6e4b1 Reorganize Code
7 years ago
Yu Yang 5c333e4143 Add dctor for dev_ctx
7 years ago
Liu Yiqun 0968753454 Enable the test of not creating variables every time.
7 years ago
Yu Yang 15f5f10ed5 AddInput/AddOutput for OpHandle
7 years ago
Yu Yang 5368e50d84 Reorganize code
7 years ago
Yu Yang fe7ed285d1 Extract NCCLCtxMap
7 years ago
Yu Yang 6ebc6bf533 ReorganizeCode
7 years ago
Yiqun Liu 7bb4ea9c13 Add an argument in Executor.Run to allow users to choose whether to create and destroy variables every time. (#9242)
7 years ago
Yu Yang a478a11e0b NCCL Guard for bcast
7 years ago
Yu Yang f2685bed81 Clean code
7 years ago
Yu Yang 41ad632341 Add NCCL Group Guard
7 years ago
Yu Yang 99fe83a020 Move nccl helper
7 years ago
Yu Yang 90f980167d Do not wait computation stream
7 years ago
Yu Yang 7ac969b88c Debug
7 years ago
Qiao Longfei 37a272e670 add executor.prepare (#9022)
7 years ago
chengduoZH eaa90d38ad add use_pinned
7 years ago
Yu Yang 599f7a87ba Refine code
7 years ago
Yu Yang 43e54079a8 Debug code
7 years ago
Yu Yang e335f01826 Add more logs
7 years ago
Yu Yang 82693e7227 Wait nccl all reduce
7 years ago
Yu Yang eb0a580e78 Add enforce
7 years ago
Yu Yang 65bc7d17d5 Add mtx to ncclAllReduce
7 years ago
Yu Yang ba227df941 Expose num_threads
7 years ago
Yu Yang 1533bf12df Use event and single thread
7 years ago
Yu Yang 95a0d7c7c1 Illegal memory access
7 years ago
Yu Yang 798e6907b4 Change mem order
7 years ago
Yu Yang 1c2b6100b0 Add
7 years ago
Yu Yang 4e43b71377 Add wait log
7 years ago
Yu Yang dbed123382 Debug
7 years ago
Yu Yang e53b6aba63 Use no thread
7 years ago
Yu Yang a8bd7b9809 Add log
7 years ago
Yu Yang 3c9cea597e Add more log
7 years ago
Yu Yang f8f1a963d9 Add debug code
7 years ago
Yu Yang fbbcedda01 Fix bug
7 years ago
Yu Yang 7643c2cbab Add flag for use event
7 years ago
Yu Yang ca4b3d2532 Use 12 threads
7 years ago
Yu Yang f251a58e85 Use base class manage events
7 years ago
Yu Yang 1dd216dc3b Wait bcast param
7 years ago
Yu Yang 4185dd48e4 Disable multi-thread
7 years ago
Yu Yang 631aa3d10a Wait all inputs ready
7 years ago
Yu Yang 9b1f4d5d62 After nccl add event
7 years ago
Yu Yang feb569f8ea Add log
7 years ago
Yu Yang 260cfe3b86 Stop Wait NCCL Stream
7 years ago
Yu Yang e025e284c6 Exchange wait op
7 years ago
Yu Yang 3238ce0672 Add wait
7 years ago
Yu Yang 8a9de67e17 Remove wait
7 years ago
Yu Yang d2cb3790e9 Wait all evernts
7 years ago
Yu Yang 4137bb4eda Add wait
7 years ago
Yu Yang 3da4159f88 Add run iter
7 years ago
Yu Yang d3c82c356e Wait multiple stream
7 years ago
Yu Yang c18c2f6ab0 Sync all computation streams at the end of run
7 years ago
chengduo 597ba3f3f2 add more times close test (#9215)
7 years ago
Yu Yang c372ce2885 Add event for computational op
7 years ago
Yu Yang b94ffacbd7 SetDev
7 years ago
Yu Yang 99f85a9fbc Set dev
7 years ago
Yu Yang d26f093f9d Log
7 years ago
Yu Yang d55a03d916 Scale loss on place
7 years ago
Yu Yang 932364a275 Sync dev
7 years ago
Yu Yang dad7bdabd4 Add setDev
7 years ago
Yu Yang 7fd0d24e0c Add lgo
7 years ago
Yu Yang bade579826 Wait code
7 years ago
Yu Yang 4a330094f9 Add log
7 years ago
Yu Yang 9824e8f311 Scale loss op use event
7 years ago
Yu Yang 071043c388 Add paddle enforce
7 years ago
Yu Yang 8af57706e2 Only wait same device
7 years ago
Yu Yang 29cc9f308d SetDev for nccl
7 years ago
Yu Yang d7badb3ed2 Use event to sync stream
7 years ago
Yu Yang 1f53193a63 Use atomic code
7 years ago
Yu Yang c7beac1426 Add dummy var
7 years ago
Yu Yang 5fa535b717 Wait all thread done
7 years ago
Xin Pan 898e0ffa21 Merge pull request #9190 from panyx0718/p2p
7 years ago
Yu Yang 7bff02b2ca Change to pending op
7 years ago
Yu Yang 866f6f1be0 Debug
7 years ago
Yu Yang a5ba704de0 Counter
7 years ago
Yu Yang a87ce91c4b Use mtx
7 years ago
Yu Yang ea11a0a853 Use volitie
7 years ago
Xin Pan ce55975bb5 fix
7 years ago
Xin Pan 18ac6947d0 Enable P2P memory copy
7 years ago
Yu Yang 515e516e77 Add more log
7 years ago
Yu Yang 1f063d0900 Memorder
7 years ago
Yu Yang b1cb8bbd40 Debug
7 years ago
Yu Yang b57b880b05 Debug
7 years ago
Yu Yang f3e983e499 Memory order
7 years ago
Yu Yang 36e0415220 Single Thread
7 years ago
Yu Yang 5957f28b86 Debug
7 years ago
Yu Yang f52714d391 Debug
7 years ago
Yu Yang 0023c3bcf5 Use atomic bool
7 years ago
Yu Yang 09935ab936 Debug
7 years ago
Yu Yang f8141d90c8 Debug
7 years ago
Yu Yang e18a269705 Add debug code
7 years ago
Yu Yang 9cb8f50302 Complete fetch op
7 years ago
Yu Yang 254d7ff4f5 Refactor local_scopes
7 years ago
Yu Yang b2c7a9b828 Wait by stream
7 years ago
Yu Yang e8a7e5d1e6 Update
7 years ago
Yu Yang 8f0590e7c5 Add ncclAllReduce
7 years ago
Yu Yang c15d2c9edc Update
7 years ago
Yu Yang d470763f6c Stash
7 years ago
Yu Yang 9fc0b596a9 Test more
7 years ago
Yu Yang 0ef9edf566 Stash
7 years ago
Liu Yiqun 253ba6672f Merge branch 'develop' into core_inference_remove_clone
7 years ago
Yu Yang 5e87cd7574 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpp_parallel_executor
7 years ago
Yu Yang 8c9cd369dc Polish code style
7 years ago
Yu Yang 6f0dfd89a4 Single GPU ParallelExecutor complete
7 years ago
Xin Pan 21e2c42a46 Merge pull request #9141 from panyx0718/develop
7 years ago
Tao Luo 20be8e7e33 Merge pull request #9104 from ranqiu92/doc_dir
7 years ago
Xin Pan 1ca1e1c384 Fix a program copy regression.
7 years ago
ranqiu 64775126f3 change the dir of docs
7 years ago
Yu Yang d84ddcf123 Stash
7 years ago
Yu Yang 193c0a7e43 Handle var hazard
7 years ago
Thuan Nguyen 1e4c504e60 Implement Select OP (#9088)
7 years ago
Yu Yang 35744e7b36 Polish code
7 years ago
Xin Pan d284cf88e5 Merge pull request #9037 from panyx0718/develop
7 years ago
Yu Yang ae88fdefb7 Use thread pool
7 years ago
Abhinav Arora 41894da145 Add changes to channel that are needed for select op (#9084)
7 years ago
Yu Yang 692a0f7425 Better name
7 years ago
Yu Yang baef1124fb ParallelExecutor And dependency engine
7 years ago
Yibing Liu 90afbd2856 Move back operator's event to RunImpl()
7 years ago
Xin Pan 4840c49b27 Better timeline
7 years ago
Liu Yiqun 8ecad98578 Add the bool variable to decide whether to have a copy of the program in ExecutorPrepareContext.
7 years ago