Commit Graph

769 Commits (760d015c14d9c35b0271c3a90898d52f39596190)

Author SHA1 Message Date
Jack Zhou c7cada8571
Fix gru performance decline in 1.8.5 (#29455)
5 years ago
LoveAn 671555ed32
Compiling operator libraries with Unity build (#29130)
5 years ago
chentianyu03 8f45d14263
add complex64 and complex128 type; add +-*/@ and slice operator for c… (#29199)
5 years ago
Jack Zhou bc6033f86b
fix gru gcc7.4 bug for the gru compile
5 years ago
Jack Zhou 085260f3de
Add eigen gru and fix the dropout bug in the rnn
5 years ago
Shang Zhizhou b9e76a0103
detect tensorRT plugin fp16 in runtime (#27933)
5 years ago
wawltor b2c8a00745
remove eigen threadpool for the speed up
5 years ago
Jack Zhou 9362d85e0e
Add LSTM, Simple RNN and GRU CPU kernel (#28577)
5 years ago
QingshuChen 30ef3815b3
adjust kunlun header file (#28536)
5 years ago
YUNSHEN XIE ba0756325a
exec ut no more than 15s 1 (#28439)
5 years ago
Wilber 09fd2b2aab
Paddle support compile on sw (#27858)
5 years ago
Leo Chen 6115c14fca
Pool2d cuda kernel supports fp16 (#28316)
5 years ago
Double_V 5289b72acc
fix Wmaybe-uninitialized warning in pooling.cc, test=develop (#28126)
5 years ago
wangchaochaohu 463c72c2d9
refine gpu kernel config for Paddle (#28085)
5 years ago
wangchaochaohu c5fcc96d5b
xpu support for fill_constant Op (#27675)
5 years ago
Double_V f6ad2375be
fix pool3d bug, test=develop (#27718)
5 years ago
Li Fuchen 1501a80f74
add support to float64 input of warpctc op. (#27399)
5 years ago
Zhong Hui a85592bcbf
fix cpplint error for the atomic max/min
5 years ago
ShenLiang 6fc74bbaf6
add fp16 for matmul (#27523)
5 years ago
wanghuancoder df43905f12
use iwyu clean include (#27267)
5 years ago
Zhong Hui 4a9d21de49
Add GPU Kernels of Segment Ops, support sum, max, min, mean
5 years ago
Zhong Hui f4c750d721
Add the cpu version of segment sum mean max min op
5 years ago
wawltor b6a4349dd4
fix the error message for the math dir
5 years ago
Jack Zhou 63203c4abc
enhance reduce op which can reduce tensor with arbitrary rank
5 years ago
Jack Zhou 6e29c2da05
Error description optimize for the math dir
5 years ago
Zhong Hui bbad3414e8
Enhance the error messages for files in operators/math
5 years ago
Jack Zhou 9437ce36c4
Error description optimize for math dir
5 years ago
Steffy-zxf 50e60e8779
update error info for selected_rows_functor
5 years ago
wangchaochaohu c71d79b1d2
[cuda11 support] change the CMakeLists to support the cuda11 (#27124)
5 years ago
kinghuin ed292695c5
optimize the error message for math dir
5 years ago
kinghuin 1b102dd552
optimize the error message for unpooling.cc
5 years ago
joanna.wozna.intel 95e1434bb2
Add bfloat16 data type (#25402)
5 years ago
Leo Chen 844583c8fd
Refine paddle.manual_seed (#26496)
5 years ago
Bai Yifan 8986a82131
fix adaptive gpu grad bug, add doc refine (#26660)
5 years ago
yaoxuefeng efee426742
support generator seed in related kernels test=develop (#26495)
5 years ago
ShenLiang c609066074
Add Matmul op (#26411)
5 years ago
QingshuChen 138ecf24aa
support Baidu Kunlun AI Accelerator (#25959)
5 years ago
Pei Yang b717895f64
Fix registering trt plugin (#25744)
5 years ago
Zhang Ting 6486fe8a94
improve GPU performance of transpose, test=develop (#25862)
5 years ago
ShenLiang bca303165a
fix inverse bug (#25641)
5 years ago
joanna.wozna.intel e5bbffa84c
Add NOMINMAX define due to windows.h max/min macro conflict (#25637)
5 years ago
Zhang Ting 30d1ff3bb4
call cublasGemmStridedBatchedEx when using fp16, test=develop (#25553)
5 years ago
Chen Weihang 0b54d54fd8
Fix index overflow bug of the CUDA kernel loop increment (#25435)
5 years ago
zlsh80826 e528392de9
[Paddle-TRT] SkipLayernorm vectorized memory optimization (#25117)
5 years ago
zhupengyang 6de75082cb
fix test_hsigmoid windows ci (#25311)
5 years ago
Leo Chen fa657b3dbb
fix bug of prelu when rank not equal 4, test=develop (#25067)
5 years ago
zlsh80826 479c8834f7
[Paddle-TRT] Fixes #24731, opt for SoftmaxKernelWithEltadd kernel, test=develop (#24834)
5 years ago
ceci3 8db66fc3f6
fix cos_sim, test=develop (#25017)
5 years ago
Chen Weihang d1062d5278
Replace all errors thrown by LOG(FATAL) with PADDLE_THROW (#24759)
5 years ago
Leo Chen b67ded04f2
Support gradient accumulation of fp16 in imperative mode (#24823)
5 years ago