89 Commits (03479469a700ce30edea0fe80a7c14982a6082db)

Author SHA1 Message Date
Jie Fang 5e813b53c5 nhwc optimization for batchnorm (#21090)
5 years ago
danleifeng 425279a57b Improve elementwise operators performance in same dimensions. (#19763)
5 years ago
qingqing01 1a3eef026c Enable users to create custom cpp op outside framework. (#19256)
5 years ago
liym27 24010472d4 fix pool2d pool3d,support asymmetric padding and channel_last (#19739)
5 years ago
Yihua Xu 0d6ea52958 Fix the definition issue when used mkl_scsrmm and mkl_dcsrmm functions. (#19774)
6 years ago
Yiqun Liu 42b5bec6f9 Integrate NVRTC to support compiling CUDA kernel at runtime (#19422)
6 years ago
zhouwei25 84c728013c fix the compilation issue on windows caused by mkl_CSRMM (#19533)
6 years ago
Yihua Xu b920395842 Use sparse matrix to implement fused emb_seq_pool operator (#19064)
6 years ago
wopeizl 80b7ef6fc8 add tensorrt support for windows (#19084)
6 years ago
liuwei1031 a43a763b54 fix warpctc.dll not found issue (#18761)
6 years ago
Huihuang Zheng 0d3f16f53e Try to modify external gflags to solve CI compilation (#18872)
6 years ago
Huihuang Zheng cfce4994cf Merge cuda 9/10 dockerfile with root dockerfile (#18693)
6 years ago
HaoRen b7128bac5f supports collective communicated training (#18175)
6 years ago
wangchaochaohu c10157a5df revise the cudnn conv choose algorithm to improve the performance(mask rcnn benchmark) (#17753)
6 years ago
chengduo 863c75168c polish error doc (#17772)
6 years ago
Tao Luo ff1661f12a remove unused FLAGS_warpctc_dir (#17162)
6 years ago
Chen Weihang 0b2aec14b6 Revert "Model data cryption link all lib (#16555)"
6 years ago
Chen Weihang c38c7c5619 Model data cryption link all lib (#16555)
6 years ago
Tao Luo 4efdebc6f6 Merge pull request #15931 from yihuaxu/develop_2c5c7b2a7_gelu_mkl_opt
6 years ago
dzhwinter 225c11a91f polish cudnn related code and fix bug. (#15164)
6 years ago
Yihua Xu 7396788694 Optimize gelu operation with mkl erf.
6 years ago
tensor-tang ee2321debd Revert 15770 develop a6910f900 gelu mkl opt (#15872)
6 years ago
Yihua Xu 676995c86c Optimze Gelu with MKL Erf function (#15770)
6 years ago
tensor-tang 8117725852 add jit kernel hsum, hmax and softmax refer code
6 years ago
peizhilin 1e7f83e60a add cuda dso support for windows
6 years ago
peizhilin 40a94a138f remove irrelevant fix for mkl
6 years ago
peizhilin ed5bd5e586 test=develop
6 years ago
Yu Yang 7b10bf0e60 Use mkl
6 years ago
liuhongyu 8daf67f90f fix bugs; test=develop
6 years ago
liuhongyu 968dd3c078 add cudnn 5 support; test=develop
6 years ago
phlrain cf1fe61004 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
6 years ago
Tao Luo ea47685f91 Merge pull request #14646 from jczaja/prv-softmax-mkl-sasum
6 years ago
Jacek Czaja 8bfa1fa9bb - ASUM MKL integration
6 years ago
liuhongyu 05917c3c79 add cudnn lstm; test=develop
6 years ago
minqiyang be04d99fe4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
6 years ago
minqiyang 53433d7f2e Revert the changes of VLOG
6 years ago
peizhilin 36cd18b549 Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
chengduozh f7847ca6a3 fix cublas warp error
6 years ago
chengduo 00b9e9a135 Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929)
6 years ago
peizhilin 7c8c9dc9bf fix unit test cases
6 years ago
wopeizl d9a1f3e58e Windows/online (#14474)
6 years ago
peizhilin 6e66fadb95 clean up the pre-definitions on windows
6 years ago
qingqing01 fd7e643153 Convolution fusion operator. (#14449)
6 years ago
Wu Yi b32c13dc20 Add cudnn ctc loss (#12366)
6 years ago
tensor-tang 1be85d011d add mkl vsqr and vpow
6 years ago
Qiyang Min 698698f2fa Merge branch 'develop' into fix_vlog
6 years ago
qingqing01 abe209234f Exhaustive search for cuDNN conv. (#14286)
6 years ago
minqiyang 0c3227a523 Change the origin VLOG level to 10 times
6 years ago
qingqing01 db8c52da5e Revert " Exhaustive search for cuDNN conv. (#14043)"
6 years ago
qingqing01 ce7d9b0799 Exhaustive search for cuDNN conv. (#14043)
6 years ago