274 Commits (adc26dffa9dac81bd93c88d70f0ab66fcdcc81f0)

Author SHA1 Message Date
dangqingqing 521db98bc9 Refine CUDA profiler and delete the test file.
7 years ago
dangqingqing f266284d9f Fix the compiling for only CPU mode.
7 years ago
dangqingqing 10622ba3cf Resolve conflicts.
7 years ago
dangqingqing 9d73950ec9 Add profiling tools for fluid.
7 years ago
QI JUN 93a2d9c59d add more place test and rename Cudnn to CUDNN (#6621)
7 years ago
Yu Yang 1b0c7d7c7a Simplize system_allocator and fix GPU_INFO (#6653)
7 years ago
Yu Yang d5cab4f07c Fix compile on CUDA9.1 & MacOS (#6642)
7 years ago
tensor-tang bf269d67b3 fix place_test on MKLDNNPlace
7 years ago
tensor-tang a92f057ed1 fix conflict of Place
7 years ago
tensor-tang 7728c53448 Merge remote-tracking branch 'upstream/develop' into fluid
7 years ago
tensor-tang f271210595 fix undefined issue when with_gpu
7 years ago
tensor-tang e0c3317646 add MKLDNNPlace
7 years ago
dzhwinter 0e9b393b34 "derived cudnnDevice context" (#6585)
7 years ago
QI JUN 61ec0b9516 Refine device context (#6433)
7 years ago
qingqing01 5ba231d80b Merge pull request #6374 from reyoung/feature/remove_device_context_finish
7 years ago
Yang Yu 6b9567e0ac Remove DeviceContext::Finish
7 years ago
Yu Yang f291abfc53 Add HasCUDNN to detect if CUDNN is installed or not (#6349)
7 years ago
QI JUN 96a5f96cc1 fix bug in gpu default memory allocating policy (#6268)
7 years ago
QI JUN d066b07f14 change GPU memory allocating policy (#6159)
7 years ago
chengduo e50f35706a code refine (#6164)
7 years ago
Yu Yang 8ac02279f2 Fix the proformance problem of enforce (#6085)
7 years ago
武毅 4ecbab42d8 Fix compile on cudnn7 (#5982)
7 years ago
dangqingqing 696b0253e5 Refine paddle/v2/fluid/profiler.py.
7 years ago
dangqingqing 623f62a7dc Add cuda profiler tools and expose it in Python.
7 years ago
dangqingqing 322d69f209 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into nvprof
7 years ago
dangqingqing 6cf2dcbc1f Add cuda profiler tools.
7 years ago
武毅 a06bec1287 Conv cudnn 3d (#5783)
7 years ago
Qiao Longfei c9172c1cb3 Make enforce target (#5889)
7 years ago
Yu Yang c077a6d57c Feature/support int64 for sum (#5832)
7 years ago
chengduoZH dec61ab6df Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_pool3d
7 years ago
chengduoZH 0bc2f41da9 remove conflict
7 years ago
chengduoZH 7e91da41e7 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_pool3d
7 years ago
wanghaox 0968c7cd6b Update code and fix conflicts.
7 years ago
dzhwinter e97b89873a "fix accuracy kernel bug" (#5673)
7 years ago
chengduoZH 74912c7d4e fix data layout
7 years ago
dangqingqing 884ce5d5a2 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cmake_speed
7 years ago
chengduoZH ec1e2fc938 add cudnn_pool3d unit test
7 years ago
chengduoZH a93a59ec7d add cudnn 3d unit test
7 years ago
Yang Yu 174050277a Fix GPU Compile on Linux
7 years ago
dangqingqing 524ccba4fe Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cmake_speed
7 years ago
dangqingqing f5e367655e Use G++ to compile some cu operators.
7 years ago
emailweixu 2378679a9e Fix a dead lock bug for dyload/nccl.h when nccl lib cannot be loaded (#5533)
7 years ago
Yang Yu 3187451ae7 CompareOp's kernel device type is decided by input tensor place
7 years ago
qingqing01 58db07b7bb Check errors for the cuda kernel calls. (#5436)
7 years ago
QI JUN afd1e844fd remove unused code (#5219)
7 years ago
Dong Zhihong 16a39d24f3 fix conflict
7 years ago
Qiao Longfei 56b723c40d Cudnn batch norm op (#5067)
7 years ago
Dong Zhihong 0990c87bf6 checkin nccl operator
7 years ago
Yu Yang 94e741d6f0 Use external project for NCCL (#5028)
7 years ago
Yu Yang 43c6ff212e Feature/nccl dso (#5001)
7 years ago
Markus Kliegl 164898277c MatMul operator (#4856)
7 years ago
武毅 a3ccbdb3b6 Cudnn conv op (#4195)
7 years ago
Yang Yang(Tony) c3bf332666 Merge pull request #4537 from QiJune/executor_impl
7 years ago
Luo Tao 871a3f6e76 remove unused PADDLE_ONLY_CPU comment
7 years ago
Yang Yang e51557130e clean up for review
7 years ago
qijun 1f5192a27b fix executor gpu unittest
7 years ago
qijun 39f75a13a4 Merge remote-tracking branch 'baidu/develop' into executor_impl
7 years ago
Yi Wang 880b874b47 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into paddle_only_cpu
7 years ago
Yi Wang 2b204f048b Rename platform::GetDeviceCount into platform::GetCUDADeviceCount
7 years ago
qijun e02cc571cf Merge remote-tracking branch 'baidu/develop' into executor_impl
7 years ago
qijun fe10e86dd5 fix gpu build error
7 years ago
Yi Wang 4558807c48 Use PADDLE_WITH_CUDA instead of PADDLE_WITH_GPU
7 years ago
Yu Yang 84500f9487 Change `PADDLE_ONLY_CPU` to `PADDLE_WITH_GPU`
7 years ago
qijun cb198fa7b6 merge baidu/develop
7 years ago
qijun 395051512d remove device context manager
7 years ago
qijun 6c4d1f551d refine codes
7 years ago
qijun 023ed5eb39 merge baidu/develop
7 years ago
qijun b5dbe88b5a follow comments
7 years ago
dzhwinter 8acc010691 Merge branch 'develop' into macro
7 years ago
dongzhihong 5423cb3e57 format
7 years ago
Yu Yang 8fd845e0fa Unify Map in OpDescBind
7 years ago
chengduoZH df59889984 remove conflict
7 years ago
qijun b611a479fc fix gpu build error
7 years ago
qijun 7a6fcc7d30 move EigenDeviceConverter to device_context.h
7 years ago
Yu Yang f2feb33384 Follow comments
7 years ago
Yu Yang 3a5693e0a8 Add Skeleton of Double support
7 years ago
chengduoZH 3c0f079333 remove conflict and fix InferShape function
7 years ago
Yu Yang bc30ba19ed Merge pull request #4375 from reyoung/feature/use_bool_for_enforce
7 years ago
chengduoZH 30a586df0c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into Add_pool_op
7 years ago
Qiao Longfei d0ad82cff1 fix nv_library (#4370)
8 years ago
Yu Yang 699dbe3be9 Use `bool` for PADDLE_ENFORCE, not int
8 years ago
Yu Yang ba1f5b5c58 Sync computation when Python invoke `run`
8 years ago
chengduoZH 0417e4e4bf fix framework::LoDTensor => Tensor
8 years ago
dangqingqing 41a2321a0e Refine platform::Transform function and fix prelu_op testing.
8 years ago
Yu Yang 87e4e25db1 Change Transform API
8 years ago
Yu Yang 847fe47310 Merge branch 'develop' of github.com:baidu/Paddle into feature/remove_lazy_init_in_dev_ctx
8 years ago
Yu Yang 81d56ca86b Remove lazy-initialization in device_context
8 years ago
武毅 8580dce308 Refine accuracy_op CUDA kernel (#4097)
8 years ago
Yu Yang 9d3b920d75 Merge pull request #3981 from reyoung/feature/transform_api
8 years ago
liaogang 59d661b9a9 Fix enforce test failed
8 years ago
Yu Yang f8c6792aa3 Extract DevPtrCast to device_ptr_cast.h
8 years ago
Yu Yang 54d88d4472 Merge branch 'develop' of github.com:baidu/Paddle into feature/transform_api
8 years ago
Yu Yang 6fbf097bcc Mark thrust::device_ptr in transform
8 years ago
Yu Yang dad5421afe Remove enforce demangle
8 years ago
Yu Yang c5fa417c62 Host and device transform API
8 years ago
Yu Yang ed346f1dcd Pass CI
8 years ago
dangqingqing 8c048aa099 Remove cudnn_helper.cc
8 years ago
dangqingqing 207132226c Add unit testing for cuDNN wrapper.
8 years ago
dangqingqing c20a01d67d Add cuDNN Wrapper.
8 years ago
dangqingqing f188e22b33 Remove set functor and add comapre_grad test
8 years ago