Commit Graph

78 Commits (e804f08559d96a87b8c7eb50120eef68402e4313)

Author SHA1 Message Date
Qi Li 72d99c5dcd [ROCM] update fluid operators for rocm (part4), test=develop (#31225)
5 years ago
Y_Xuan 76738504ad Add ROCm platform support code (#29342)
5 years ago
LoveAn b5d4a1f33d Add the strategy of skipping cc/cu test compilation and execution in CI (#29499)
5 years ago
LoveAn 671555ed32 Compiling operator libraries with Unity build (#29130)
5 years ago
YUNSHEN XIE ba0756325a exec ut no more than 15s 1 (#28439)
5 years ago
Zhong Hui f4c750d721 Add the cpu version of segment sum mean max min op
5 years ago
wangchaochaohu c71d79b1d2 [cuda11 support] change the CMakeLists to support the cuda11 (#27124)
5 years ago
Yiqun Liu ecfddebbef Add the implementation of inverse (#23310)
6 years ago
Zeng Jinle ab2e284235 fix compilation failure (#24091)
6 years ago
Zhaolong Xing 430b0099c9 [Paddle-TRT]: Ernie Dynamic shape support. (#23138)
6 years ago
Yiqun Liu a65c728e5d Implement the GPU kernel of fc operator (#19687)
6 years ago
Tao Luo 3ae939e48a unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631)
6 years ago
xuezhong fb261793b9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_sample_logits_op
7 years ago
peizhilin 061299be87 fix dependency
7 years ago
xuezhong 58ad40cc15 add sample_logits op
7 years ago
tensor-tang d59f733551 refine softmax and use with cache
7 years ago
Yiqun Liu 3008fa1261 Add the CUDA kernel for beam_search op (#15020)
7 years ago
zhaozhehao e2ba9668b4 Tree conv op (#15217)
7 years ago
tensor-tang e58a569c6c use seqpool jitkernel
7 years ago
tensor-tang 64a90b2f1c use vadd, vaddrelu, lstm and gru jitkernel
7 years ago
tensor-tang fab0ee8757 Merge remote-tracking branch 'ups/develop' into refine/jitkernel
7 years ago
tensor-tang 77236e33fc init jitkernel
7 years ago
nhzlx f75815b78c add prelu gpu inference
7 years ago
wopeizl d9a1f3e58e Windows/online (#14474)
7 years ago
Yihua Xu f4c869d872 Optimize the layer_norm operator with AVX intrinsic function (#14417)
7 years ago
Yu Yang 98bbfc17be Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
whs 1722678258 Make nce support more distribution. (#13549)
7 years ago
Yu Yang c8f6e70ab4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
peizhilin 41b423d41b remove duplicate
7 years ago
peizhilin dcfab11193 merge from develop
7 years ago
peizhilin 4ffa92d4f0 Merge branch 'develop' into windows/build
7 years ago
peizhilin 869487a2b7 Merge remote-tracking branch 'origin/develop' into windows/build
7 years ago
Yu Yang 057a682ee9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
dzhwinter 2835e04409 merge develop branch. test=develop
7 years ago
tensor-tang b81e1b655e fix jit on mac
7 years ago
peizhilin 9d67c1fb69 cpu build support
7 years ago
tensor-tang a3377f7b0a refine jitcode and add vmul jitcode implementation
7 years ago
tensor-tang a53b1b0b1b refine and init jitkernel vmul
7 years ago
tensor-tang 2139b9f677 add jit gencode
7 years ago
dzhwinter 316765839d add back jit simd instructions. stage.
7 years ago
dzhwinter bf2e4cb188 cleard. staged
7 years ago
Yu Yang c01696f8c2 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
tensor-tang 21487d78bf add crf decode jit kernel
7 years ago
Qiyang Min d0fdcb2f6d Merge pull request #14048 from velconia/change_sequence_pool_to_cpu
7 years ago
minqiyang e2a348cd10 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into change_sequence_pool_to_cpu
7 years ago
minqiyang 047fa2f9aa Add unit-test for sequence_pooling functor
7 years ago
tensor-tang 032c3a07e3 Merge remote-tracking branch 'ups/develop' into refine/jit/gru
7 years ago
chengduo a7497653d0 Refine Split op (#13967)
7 years ago
tensor-tang 640e789d3d add fusion gru jit kernel
7 years ago
Yu Yang 461f71a90b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago