54 Commits (3fbe9c34f7c7952d2ea3daae70a5912a91707e92)

Author SHA1 Message Date
Kexin Zhao 92913027fc fix unused var error (#9908)
7 years ago
Kexin Zhao 617e790a59 fix cuda 7.5 compile error (#9885)
7 years ago
Kexin Zhao 7ed457e77a Fix cuda 7.5 error with cublas GEMM (#9811)
7 years ago
Kexin Zhao b2a1c9e8b7 Add float16 support to non-cudnn softmax op on GPU (#9686)
7 years ago
Kexin Zhao d00bd9eb72 Update the cuda API and enable tensor core for GEMM (#9622)
7 years ago
chengduoZH e099b18045 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/add_CUDAPinnedPlace
7 years ago
Yang Yu af230d9bef Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpp_parallel_executor
7 years ago
dzhwinter 8425c2c859 Speed/sequence op1 (#9217)
7 years ago
Yang Yu b0775588c0 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpp_parallel_executor
7 years ago
chengduoZH ab601c19c3 Add CUDAPinnedPlace
7 years ago
Luo Tao 6332bd1ed8 Merge branch 'develop' into infer_mkl
7 years ago
Yu Yang 50e7e25db3 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpp_parallel_executor
7 years ago
chengduoZH aca9180a76 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/fix_concat
7 years ago
chengduoZH 750aff10ce code refine
7 years ago
chengduoZH 043f47b27f fix concat op
7 years ago
Luo Tao ae820a34bc Merge branch 'develop' into infer_mkl
7 years ago
Tao Luo 9126e626fc Merge pull request #9165 from ROCmSoftwarePlatform/amd_cmake_01
7 years ago
Kexin Zhao 4eaa789730 resolve conflict
7 years ago
Kexin Zhao ed2bc194c5 Merge pull request #9176 from kexinzhao/batch_norm_fp16
7 years ago
Kexin Zhao 70e7122785 initial commit
7 years ago
sabreshao e50205e744 CMake refine for HIP support.
7 years ago
Yang yaming 381c6a026d Merge pull request #9100 from pkuyym/fix-9049
7 years ago
yangyaming 2f2c5f5e60 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-9049
7 years ago
Xi Chen 9eae086e39 add math_function to softmax's dep list
7 years ago
Yu Yang 9cb8f50302 Complete fetch op
7 years ago
Kexin Zhao 39c676e208 initial commit
7 years ago
xuwei06 ab3543e35e Fix compilation for gcc5.4
7 years ago
yangyaming bf3f56e899 Finish adaption for backward.
7 years ago
sabreshao 45c988d86a Demostration of cmake refine for HIP support.
7 years ago
Tao Luo a448fbe9e1 Merge pull request #9134 from putcn/fix-selected-row-dep
7 years ago
qingqing01 7c1a0b77a0 Delete the detection_output_op, which had been split into several operators. (#9121)
7 years ago
Xi Chen d20c6eb6de add math_function to selected_rows_functor dependency list
7 years ago
dzhwinter 128adf53cb [Speed]implement cudnn sequence softmax cudnn (#8978)
7 years ago
Luo Tao de13f0eb4e Merge branch 'develop' into infer_mkl
7 years ago
Kexin Zhao 3b44b849d3 address comments
7 years ago
Kexin Zhao 95de7617eb fix bug
7 years ago
Kexin Zhao 1998d5afa2 add gpu info func to get compute cap
7 years ago
Kexin Zhao d400b4192d fix math function arch mismatch for older GPU
7 years ago
kexinzhao 90215b7844 Add float16 GEMM math function on GPU (#8695)
7 years ago
Luo Tao bc0cfb2283 remove PADDLE_USE_ATLAS
7 years ago
Luo Tao 49f3f1db07 add back framework_proto depends
7 years ago
Luo Tao 3ddc997182 rename concat_functor to concat, refine CMakeLists based on comments
7 years ago
Luo Tao 1ef97fa7b1 Merge branch 'develop' into math_function
7 years ago
chengduo 84aea8a8a1 Merge pull request #8669 from chengduoZH/feature/concat_op
7 years ago
kexinzhao 266ccaa843 Integrate float16 into data_type_transform (#8619)
7 years ago
chengduoZH 131ec276ed fix bug for big number; float->double and code refine
7 years ago
chengduoZH 82bd82c186 follow comments and refine code
7 years ago
chengduoZH 00e596edbe get max threads of GPU
7 years ago
Luo Tao f67275a920 refine operator/math/CMakeLists.txt, seperate im2col from math_function
7 years ago
chengduoZH 60e7ee0611 refine concat_op
7 years ago