44 Commits (d93b2d0365355430f3db723dc3e278851b7a88b4)

Author SHA1 Message Date
Qiyang Min 698698f2fa Merge branch 'develop' into fix_vlog
6 years ago
qingqing01 abe209234f Exhaustive search for cuDNN conv. (#14286)
6 years ago
minqiyang 0c3227a523 Change the origin VLOG level to 10 times
6 years ago
qingqing01 db8c52da5e Revert " Exhaustive search for cuDNN conv. (#14043)"
6 years ago
qingqing01 ce7d9b0799 Exhaustive search for cuDNN conv. (#14043)
6 years ago
whs 0c319e0b35 Add affine grid generator op (#12238)
6 years ago
dzhwinter 2d00e65819 namespace issue (#13543)
6 years ago
JiabinYang e322fc4e0e add error info for nccl not found
7 years ago
dzhwinter d361624c1d platform module (#12932)
7 years ago
tensor-tang 3dd66390b2 add blas vexp
7 years ago
tensor-tang 0ec1f65cf1 fix blas dot and add cblas scal
7 years ago
tensor-tang a2203d0466 add cblas dot
7 years ago
dzhwinter e23ddf6ae4 status (#12764)
7 years ago
Tao Luo d04ef276a5 Merge pull request #12745 from tensor-tang/refine/op/elewise_mul
7 years ago
dzhwinter 00463fdfe3 cudnn windows support (#12757)
7 years ago
tensor-tang 6644ce79a5 add mklml vmul
7 years ago
tensor-tang 43cee33a23 add mkl packed gemm
7 years ago
dzhwinter 99a99ec7e3 "remove lapack" (#11966)
7 years ago
Tao Luo 2dae8a4631 Merge pull request #11596 from tensor-tang/refine/mklml/dyload
7 years ago
Yi Wang 2625178add No NCCL on macOS (#11652)
7 years ago
tensor-tang 28a0ef9522 remove usr local lib when dynamic load lib
7 years ago
tensor-tang 3e73a7a924 add usr local lib to dynamic search path
7 years ago
tensor-tang f503f12925 enable dynamic load mklml lib on fluid
7 years ago
Xin Pan d2afd21021 Remove cuptiFinalize.
7 years ago
yuyang18 53dab95b75 Static DSO handle
7 years ago
yuyang18 c5115950a8 Use static for dlsym
7 years ago
Yu Yang 3d53631bad Make dyload strictly use the same ABI in header
7 years ago
Luo Tao d4682247e1 auto find tensorrt library
7 years ago
Yan Chunwei 186659798f add tensorrt build support(#9891)
7 years ago
Kexin Zhao 7ed457e77a Fix cuda 7.5 error with cublas GEMM (#9811)
7 years ago
Yi Wang 47a4ec0672 Remove call_once.h (#9764)
7 years ago
Yi Wang e185502ebe Fix cpplint errors with paddle/fluid/platform/dynload (#9715)
7 years ago
Kexin Zhao d00bd9eb72 Update the cuda API and enable tensor core for GEMM (#9622)
7 years ago
Kexin Zhao 9ba36604d8 fix cpplint error
7 years ago
Kexin Zhao 187ba08789 enable tensor core for conv cudnn
7 years ago
kexinzhao 90215b7844 Add float16 GEMM math function on GPU (#8695)
7 years ago
Xin Pan f3cbfc021c Add MEMCPY information
7 years ago
Yu Yang 22b5c07a7d Fix the compilation on CUDA 9.1/GCC 5.3
7 years ago
Xin Pan 9bbce49353 Fix version date.
7 years ago
Xin Pan b9ec24c6e9 Extend current profiler for timeline and more features.
7 years ago
Yang Yang(Tony) 87f4311a88 compile with nccl2 (#8411)
7 years ago
qingqing01 24509f4af9 Fix the grammar in copyright. (#8403)
7 years ago
Yi Wang fc374821dd Correct #include path
7 years ago
Yi Wang 90648f336d Move file to fluid/; Edit CMakeLists.txt
7 years ago