77 Commits (2246f7c133e3dc3cfd9f2779fd2f4cc2778c7ea7)

Author SHA1 Message Date
HaoRen b7128bac5f supports collective communicated training (#18175)
6 years ago
wangchaochaohu c10157a5df revise the cudnn conv choose algorithm to improve the performance(mask rcnn benchmark) (#17753)
6 years ago
chengduo 863c75168c polish error doc (#17772)
6 years ago
Tao Luo ff1661f12a remove unused FLAGS_warpctc_dir (#17162)
6 years ago
Chen Weihang 0b2aec14b6 Revert "Model data cryption link all lib (#16555)"
6 years ago
Chen Weihang c38c7c5619 Model data cryption link all lib (#16555)
6 years ago
Tao Luo 4efdebc6f6 Merge pull request #15931 from yihuaxu/develop_2c5c7b2a7_gelu_mkl_opt
6 years ago
dzhwinter 225c11a91f polish cudnn related code and fix bug. (#15164)
6 years ago
Yihua Xu 7396788694 Optimize gelu operation with mkl erf.
6 years ago
tensor-tang ee2321debd Revert 15770 develop a6910f900 gelu mkl opt (#15872)
6 years ago
Yihua Xu 676995c86c Optimze Gelu with MKL Erf function (#15770)
6 years ago
tensor-tang 8117725852 add jit kernel hsum, hmax and softmax refer code
6 years ago
peizhilin 1e7f83e60a add cuda dso support for windows
6 years ago
peizhilin 40a94a138f remove irrelevant fix for mkl
6 years ago
peizhilin ed5bd5e586 test=develop
6 years ago
Yu Yang 7b10bf0e60 Use mkl
6 years ago
liuhongyu 8daf67f90f fix bugs; test=develop
6 years ago
liuhongyu 968dd3c078 add cudnn 5 support; test=develop
6 years ago
phlrain cf1fe61004 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
6 years ago
Tao Luo ea47685f91 Merge pull request #14646 from jczaja/prv-softmax-mkl-sasum
6 years ago
Jacek Czaja 8bfa1fa9bb - ASUM MKL integration
6 years ago
liuhongyu 05917c3c79 add cudnn lstm; test=develop
6 years ago
minqiyang be04d99fe4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
6 years ago
minqiyang 53433d7f2e Revert the changes of VLOG
6 years ago
peizhilin 36cd18b549 Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
chengduozh f7847ca6a3 fix cublas warp error
6 years ago
chengduo 00b9e9a135 Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929)
6 years ago
peizhilin 7c8c9dc9bf fix unit test cases
6 years ago
wopeizl d9a1f3e58e Windows/online (#14474)
6 years ago
peizhilin 6e66fadb95 clean up the pre-definitions on windows
6 years ago
qingqing01 fd7e643153 Convolution fusion operator. (#14449)
6 years ago
Wu Yi b32c13dc20 Add cudnn ctc loss (#12366)
6 years ago
tensor-tang 1be85d011d add mkl vsqr and vpow
6 years ago
Qiyang Min 698698f2fa Merge branch 'develop' into fix_vlog
6 years ago
qingqing01 abe209234f Exhaustive search for cuDNN conv. (#14286)
6 years ago
minqiyang 0c3227a523 Change the origin VLOG level to 10 times
6 years ago
qingqing01 db8c52da5e Revert " Exhaustive search for cuDNN conv. (#14043)"
6 years ago
qingqing01 ce7d9b0799 Exhaustive search for cuDNN conv. (#14043)
6 years ago
whs 0c319e0b35 Add affine grid generator op (#12238)
6 years ago
dzhwinter 2d00e65819 namespace issue (#13543)
6 years ago
JiabinYang e322fc4e0e add error info for nccl not found
7 years ago
dzhwinter d361624c1d platform module (#12932)
7 years ago
tensor-tang 3dd66390b2 add blas vexp
7 years ago
tensor-tang 0ec1f65cf1 fix blas dot and add cblas scal
7 years ago
tensor-tang a2203d0466 add cblas dot
7 years ago
dzhwinter e23ddf6ae4 status (#12764)
7 years ago
Tao Luo d04ef276a5 Merge pull request #12745 from tensor-tang/refine/op/elewise_mul
7 years ago
dzhwinter 00463fdfe3 cudnn windows support (#12757)
7 years ago
tensor-tang 6644ce79a5 add mklml vmul
7 years ago
tensor-tang 43cee33a23 add mkl packed gemm
7 years ago