111 Commits (8645591d664f9e059113900281a715f8f83ae93c)

Author SHA1 Message Date
GaoWei8 1fbee267d4
remove scope in cudnn lstm (#25188)
5 years ago
Pei Yang beb0ca5fab
Fix TRT plugin registry without TRT lib (#25982)
5 years ago
Zhaolong Xing 358bc06c72
[CUDNN8 support] : support CUDNN8 (#25664)
5 years ago
Pei Yang b717895f64
Fix registering trt plugin (#25744)
5 years ago
Chen Weihang a6abd92dfd
Polish install error hint message (#25531)
5 years ago
GaoWei8 c10dcff12d
refine PADDLE_ENFORCE (#25456)
5 years ago
Chen Weihang 172d4ecb6c
remove WITH_DSO compile option (#25444)
5 years ago
Zhen Wang bb45af02ac
add the c++ part of Imperative QAT. test=develop (#25446)
5 years ago
GaoWei8 ea7e532598
Refine PADDLE_ENFORCE (#25369)
5 years ago
GaoWei8 fb70682f00
fix PADDLE_ENFORCE (#25297)
5 years ago
Chen Weihang 5a959f6e6e
Refactor dynamic dso search functions (#25214)
5 years ago
Chen Weihang 353ea9e8ad
Add default cudnn lib path (#25175)
5 years ago
Chen Weihang 4a702ef361
Support SelelctedRows allreduce in multi-cards imperative mode (#24690)
5 years ago
Yiqun Liu 560c815390
Add some check for CUDA Driver API and NVRTC (#22719)
5 years ago
Guo Sheng 4a5de14426
Remove cusolver potrfBatched support on Windows. (#24338)
5 years ago
Guo Sheng 1fc6cc502a
Fix cusolver loader for Windows (#24157)
5 years ago
Yiqun Liu ecfddebbef
Add the implementation of inverse (#23310)
5 years ago
Guo Sheng a8c0fb4e86
Add cholesky_op (#23543)
5 years ago
littletomatodonkey 1c08a2136e
test=develop, add addmm op (#23384)
5 years ago
Tao Luo e4f1b1c5e1
solve mklml memory leak (#23557)
5 years ago
Wilber 7bc4b09500
add WITH_NCCL option for cmake. (#22384)
5 years ago
Yiqun Liu d48320777e
Add the first implememtation of fusion_group op (#19621)
5 years ago
Jie Fang 5e813b53c5
nhwc optimization for batchnorm (#21090)
5 years ago
danleifeng 425279a57b
Improve elementwise operators performance in same dimensions. (#19763)
5 years ago
qingqing01 1a3eef026c
Enable users to create custom cpp op outside framework. (#19256)
5 years ago
liym27 24010472d4
fix pool2d pool3d,support asymmetric padding and channel_last (#19739)
5 years ago
Yihua Xu 0d6ea52958
Fix the definition issue when used mkl_scsrmm and mkl_dcsrmm functions. (#19774)
5 years ago
Yiqun Liu 42b5bec6f9
Integrate NVRTC to support compiling CUDA kernel at runtime (#19422)
6 years ago
zhouwei25 84c728013c
fix the compilation issue on windows caused by mkl_CSRMM (#19533)
6 years ago
Yihua Xu b920395842
Use sparse matrix to implement fused emb_seq_pool operator (#19064)
6 years ago
wopeizl 80b7ef6fc8
add tensorrt support for windows (#19084)
6 years ago
liuwei1031 a43a763b54
fix warpctc.dll not found issue (#18761)
6 years ago
Huihuang Zheng 0d3f16f53e
Try to modify external gflags to solve CI compilation (#18872)
6 years ago
Huihuang Zheng cfce4994cf
Merge cuda 9/10 dockerfile with root dockerfile (#18693)
6 years ago
HaoRen b7128bac5f
supports collective communicated training (#18175)
6 years ago
wangchaochaohu c10157a5df
revise the cudnn conv choose algorithm to improve the performance(mask rcnn benchmark) (#17753)
6 years ago
chengduo 863c75168c
polish error doc (#17772)
6 years ago
Tao Luo ff1661f12a
remove unused FLAGS_warpctc_dir (#17162)
6 years ago
Chen Weihang 0b2aec14b6
Revert "Model data cryption link all lib (#16555)"
6 years ago
Chen Weihang c38c7c5619
Model data cryption link all lib (#16555)
6 years ago
Tao Luo 4efdebc6f6
Merge pull request #15931 from yihuaxu/develop_2c5c7b2a7_gelu_mkl_opt
6 years ago
dzhwinter 225c11a91f
polish cudnn related code and fix bug. (#15164)
6 years ago
Yihua Xu 7396788694
Optimize gelu operation with mkl erf.
6 years ago
tensor-tang ee2321debd
Revert 15770 develop a6910f900 gelu mkl opt (#15872)
6 years ago
Yihua Xu 676995c86c
Optimze Gelu with MKL Erf function (#15770)
6 years ago
tensor-tang 8117725852
add jit kernel hsum, hmax and softmax refer code
6 years ago
peizhilin 1e7f83e60a
add cuda dso support for windows
6 years ago
peizhilin 40a94a138f
remove irrelevant fix for mkl
6 years ago
peizhilin ed5bd5e586
test=develop
6 years ago
Yu Yang 7b10bf0e60
Use mkl
6 years ago