79 Commits (8af85922d0facbe1480da7059369f38ffed93f15)

Author SHA1 Message Date
石晓伟 5c59d2139e reverts the commit 23177, test=develop (#23363)
5 years ago
石晓伟 75ebb48a91 supports thread-binding stream, test=develop (#23177)
5 years ago
Wilber 7bc4b09500 add WITH_NCCL option for cmake. (#22384)
5 years ago
zhaoyuchen2018 3d4f2aa689 Refine stack op to improve xlnet performance, test=develop (#22142)
5 years ago
Jacek Czaja cd43c4440e [MKL-DNN] LRN and Pool2d (FWD) NHWC support (#21375)
5 years ago
Zeng Jinle cdb3d27985 Fix warn of gcc8 (#21205)
5 years ago
zhaoyuchen2018 b93870e696 Improve topk performance. (#21087)
5 years ago
qingqing01 1a3eef026c Enable users to create custom cpp op outside framework. (#19256)
5 years ago
Zeng Jinle 37f76407b0 fix cuda dev_ctx allocator cmake deps, test=develop (#19953)
5 years ago
Zeng Jinle c7f36e7c00 Add lock to cudnn handle calls (#19845)
5 years ago
Zeng Jinle 5eb381a3e2 refine reallocate of workspace size, test=develop (#19843)
5 years ago
Huihuang Zheng 12542320c5 Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989)
6 years ago
Tao Luo 75d1571995 refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603)
6 years ago
Tao Luo fe32879d2a add mkldnn shapeblob cache clear strategy (#18513)
6 years ago
Tao Luo 3f3112ceb0 add shape_blob for cache mkldnn primitive (#18454)
6 years ago
Leo Zhao 8f5fffca0a rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id() (#18453)
6 years ago
Michał Gallus 8409693272 Reset DeviceContext after quantization warmup (#18182)
6 years ago
Huihuang Zheng b9494058b3 Use CudnnWorkspaceHandle in exhaustive search (#17082)
6 years ago
Zeng Jinle 1202d3fc74 Refine model gpu memory (#16993)
6 years ago
nhzlx a1d11bb175 fix ci bug: cudnn handler in multi card
6 years ago
nhzlx 3df7b98a0f Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
6 years ago
Wu Yi b7baeed7bb fix win gpu build test=develop (#16334)
6 years ago
nhzlx 07dcf2856c git cherry-pick from feature/anakin-engine: update anakin subgraph #16278
6 years ago
Wu Yi 6382b62f6b Collective ops (#15572)
6 years ago
qingqing01 86e912c544 Fix windows compiling (#16230)
6 years ago
qingqing01 8ad672a287 Support sync batch norm. (#16121)
6 years ago
chengduo 46d01d798e Revert "Revert "Remove workspace_handle in conv_cudnn (#15186)"" (#15290)
6 years ago
chengduozh 358e657f68 Revert "Remove workspace_handle in conv_cudnn (#15186)"
6 years ago
chengduo 064512aa47 Remove workspace_handle in conv_cudnn (#15186)
6 years ago
sneaxiy ed409ac9f4 Revert "Revert "Remove op handle lock""
6 years ago
Zeng Jinle dacfaaa966 Revert "Remove op handle lock"
6 years ago
sneaxiy d0a8a1e950 remove_op_handle_lock
6 years ago
sneaxiy d25395fc98 remove tensor core lock
6 years ago
chengduo b9fb03cf54 Move GetTensor to tensor_util (#15011)
6 years ago
chengduo 79bd6dfa18 [Feature] Add Temporary Allocator (#14875)
6 years ago
sneaxiy ca84c2ca8f merge develop
6 years ago
Yu Yang 7604b1ad51 Fix Eigen macro when using GPU
6 years ago
sneaxiy c47c451a00 fix bug
6 years ago
chengduo 00b9e9a135 Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929)
6 years ago
Yu Yang 0d6718fcbd Pass compile
6 years ago
Yu Yang c774bcbd2d Merge device_context
6 years ago
sneaxiy faac8a76ce remove unnecessary codes
6 years ago
sneaxiy 7ff320f8cc merge develop
6 years ago
Yu Yang 90d9e5aee8 feat(platform): lazy initialization of devicecontext in pool (#14067)
6 years ago
Sylwester Fraczek 2098b42584 review fixes (Teamcity fails)
6 years ago
sneaxiy 5be6f762d0 remove_lock_in_some_ops
6 years ago
Brian Liu a53e8a8da6 Update MKLDNN integration framework to support Paddle multi-instances
6 years ago
chengduozh 82d2903b63 Fix fast ParallelExe bug
6 years ago
chengduo 2c9839c847 add cuda version display (#13885)
6 years ago
typhoonzero a4f7696a18 Revert "Some trivial optimization (#13530)"
6 years ago