chengduo
00b9e9a135
Refine cublas to support CUBLAS_TENSOR_OP_MATH ( #13929 )
...
* refine cublas
test=develop
* code refine
* refine cublas
* add GEMM_EX
* add enable_cublas_tensor_op_math doc and add cublasCall
test=develop
* fix CublasCall for cuda version
test=develop
* fix error
test=develop
* fix GEMM_EX to be compatible with gcc 4.8
test=develop
* add GEMM_EX
test=develop
* make compatible with gcc 4.8
test=develop
6 years ago
Yu Yang
0d6718fcbd
Pass compile
6 years ago
Yu Yang
c774bcbd2d
Merge device_context
...
test=develop
7 years ago
sneaxiy
faac8a76ce
remove unnecessary code
...
test=develop
7 years ago
sneaxiy
7ff320f8cc
merge develop
7 years ago
Yu Yang
90d9e5aee8
feat(platform): lazy initialization of devicecontext in pool ( #14067 )
...
* feat(platform): lazy initialization of devicecontext in pool
Use std::async(deferred, []{...}) to lazily initialize DeviceContext in Pool
test=develop
* Add future includes
test=develop
7 years ago
Sylwester Fraczek
2098b42584
review fixes (Teamcity fails)
...
test=develop
7 years ago
sneaxiy
5be6f762d0
remove_lock_in_some_ops
...
test=develop
7 years ago
Brian Liu
a53e8a8da6
Update MKLDNN integration framework to support Paddle multi-instances
...
Make all blob info saved in global device context to be thread based.
Meanwhile save thread id in thread local storage in ParallelDo
7 years ago
chengduozh
82d2903b63
Fix fast ParallelExe bug
...
test=develop
7 years ago
chengduo
2c9839c847
add cuda version display ( #13885 )
...
test=develop
7 years ago
typhoonzero
a4f7696a18
Revert "Some trivial optimization ( #13530 )"
...
This reverts commit 1d91a49d2f.
7 years ago
chengduo
1d91a49d2f
Some trivial optimization ( #13530 )
...
* some trivial opt
* remove the fix of lod_tensor and shrink_rnn_memory_op
* refine ShrinkRNNMemoryOp
test=develop
7 years ago
sneaxiy
612e1a3155
modification
7 years ago
sneaxiy
d0b2453ecd
merge develop
7 years ago
sneaxiy
24ea39c4c6
feature/eager_delete_tensor
7 years ago
fengjiayi
82a1b35b9b
Revert "Revert "Add CudnnHolder and use it in Conv and ConvTranspose op""
...
This reverts commit 151e169eb7.
7 years ago
guochaorong
151e169eb7
Revert "Add CudnnHolder and use it in Conv and ConvTranspose op"
7 years ago
fengjiayi
1f36a4c27c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_CudnnHolder
7 years ago
fengjiayi
b0aca8824d
make CudnnHolder thread safe
7 years ago
luotao1
7169f9378c
fix mkldnn include format
7 years ago
fengjiayi
15cc9128be
fix compile error
7 years ago
fengjiayi
04bfd5c10c
add CudnnHolder to manage cudnn_handle and workspace
7 years ago
yuyang18
2d0e5592b5
Use std::map for Place <--> DeviceContext
7 years ago
chengduo
da556ed6d4
enhance ParallelExecutor stability ( #11637 )
7 years ago
chengduoZH
c99fca5f90
Add No Mutex
7 years ago
yuyang18
a1254a86ba
Add lock to record_event.
7 years ago
yuyang18
7cf8b656a2
Remove lock in device context
7 years ago
Yu Yang
6b20b35589
Fix Transformer Hang Problem
7 years ago
Yi Wang
8dbd9c394e
Fix part of the cpplint errors in fluid/platform ( #9802 )
7 years ago
chengduoZH
ab601c19c3
Add CUDAPinnedPlace
7 years ago
chengduoZH
18eb77303d
add CUDAPinnedPlace
7 years ago
Yu Yang
1d8fe2a220
Enhance device context pool ( #9293 )
7 years ago
Kexin Zhao
c88f58dbd8
add comment
7 years ago
Kexin Zhao
3b44b849d3
address comments
7 years ago
chengduo
84aea8a8a1
Merge pull request #8669 from chengduoZH/feature/concat_op
...
Refine concat_op
7 years ago
pzelazko-intel
8c71adaa8c
MKLDNN conv2d kernel added ( #8451 )
...
* MKLDNN conv2d OP kernel added
* TODOs added
* mkldnn conv2d OP refactor
* CanCUDNNBeUsed and CanMKLDNNBeUsed moved
7 years ago
chengduoZH
131ec276ed
fix bug for big number; float->double and code refine
7 years ago
qingqing01
24509f4af9
Fix the grammar in copyright. ( #8403 )
7 years ago
Yi Wang
fc374821dd
Correct #include path
7 years ago
Yi Wang
90648f336d
Move file to fluid/; Edit CMakeLists.txt
7 years ago