tensor-tang
bcb8ea397d
Merge remote-tracking branch 'ups/develop' into fea/jitkernel_peephole
...
test=develop
7 years ago
Qiyang Min
f99ea99e36
Merge pull request #13720 from velconia/fix_grad_clip
...
Merge selected_rows for clip_by_norm op
7 years ago
tensor-tang
9131a35676
replace the lstm compute with jitkernel
...
test=develop
7 years ago
minqiyang
bcd8c2ccc3
Add unit test
7 years ago
minqiyang
67308822f8
Add selected_rows merge for clip_by_norm op
...
test=develop
7 years ago
dzhwinter
26771f41ba
"fix compile error" ( #13579 )
...
* "fix compile error"
* "fix ci"
* rerun ci
test=develop
* test=develop
rerun ci
7 years ago
Xin Pan
425a882165
Merge pull request #13643 from panyx0718/ir2
...
clean up channel
7 years ago
Dun
161c3e31f7
Optimization of Kernels that related to DeepLabv3+ ( #13534 )
...
* refine reduce by cub
* optimize KernelDepthwiseConvFilterGrad
* optimize depthwise conv and reduce mean and reduce sum
* fix bug: dilation
* cuda arch and cuda 8 compatible
7 years ago
Xin Pan
ddd60581b7
clean up channel
...
test=develop
7 years ago
chengduo
6757a31552
[Accelerate] Refine seq_softmax_op ( #13421 )
...
* refine seq_softmax_op
* fix seq_softmax
* use cub in seq_softmax
7 years ago
tensor-tang
612ba41aee
add simple lstm compute
7 years ago
qingqing01
9bd933d3fb
Improve and fix fake_quantize_op ( #13092 )
...
* Improve and fix fake_quantize_op.
7 years ago
fengjiayi
7e0c9f50ae
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op
7 years ago
dzhwinter
0153c21d83
add unstack_op
7 years ago
dzhwinter
eca4563e5d
operators module ( #12938 )
7 years ago
dzhwinter
e23ddf6ae4
status ( #12764 )
7 years ago
fengjiayi
34b209cffa
Complete sequence_padding GPU kernel
7 years ago
dzhwinter
00463fdfe3
cudnn windows support ( #12757 )
...
* cudnn windows
* "add comment"
* "windows support"
* "fix cmake error"
7 years ago
nhzlx
f55e8901c8
merge develop
7 years ago
nhzlx
1600ba86f6
1. change tensorrt op from cpu to gpu
7 years ago
tensor-tang
eee38464dc
refine fc op use cpu only
7 years ago
tensor-tang
d84a1a0010
fc op use cpu only
7 years ago
tensor-tang
0098a494a2
Merge remote-tracking branch 'ups/develop' into refine/op/fc
7 years ago
tensor-tang
4b5986bb77
enable fc op in normal case
7 years ago
Yu Yang
8dda526a45
Merge pull request #12659 from sneaxiy/refine_softmax_with_cross_entropy
...
Fix 'softmax_with_cross_entropy_op' dependency error
7 years ago
sneaxiy
c50c537732
fix arithmetic error in backward kernel
7 years ago
sneaxiy
2c560623d1
fix dependency error
7 years ago
Bai Yifan
e12b1d1792
Add flatten op ( #12341 )
...
* add flatten op
7 years ago
chengduo
2409d0f710
Refine regularization for selected_rows ( #12369 )
...
* refine regularization for selected_rows
* clean lookup_table
* refine rpc_server_test
* temporarily disable rpc_server_test
* fix rpc_server_test
* add unit test
7 years ago
Xin Pan
93355cc0d2
fix control deps
7 years ago
Yan Chunwei
b42ced8eda
bugfix/tensorrt analysis fix subgraph trigger ( #12266 )
7 years ago
Guo Sheng
da3f766821
Merge pull request #12088 from guoshengCS/complete-hsigmoid
...
Complete hsigmoid_op
7 years ago
chenweihang
938319bbd2
Merge branch 'develop' into unsqueeze_op
7 years ago
guosheng
d695381677
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into complete-hsigmoid
7 years ago
chenweihang
ca15779394
rewrite, use reshape op in unsqueeze op, test passed
7 years ago
yuyang18
1854814d49
Use reshape_op inside squeeze_op
...
* also convert tab to space
7 years ago
gongweibao
66c91911cf
Improve brpccmake ( #11842 )
7 years ago
Yan Chunwei
5082642bdb
feature/analysis to support sub-graph for TRT engine ( #11538 )
7 years ago
tangwei12
e589005229
merge
7 years ago
Yancey1989
712adc786f
polish dist cmake
7 years ago
Yancey1989
1ef6cdb60e
move dist codes from operators/detail to operators/distributed
7 years ago
tangwei12
1c2e9bdd49
fix cmakelist
7 years ago
weixing02
8bd148dc00
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into hsigmoid_op
7 years ago
gongweibao
d9de6b8621
Add brpc support. ( #11263 )
7 years ago
gongweibao
627d7a64f8
Clean `sendop` `recv` operator. ( #11309 )
7 years ago
dzhwinter
d48172f22a
split reduce op into multiple libraries, accelerate the compiling ( #11029 )
...
* "split into multiple .ccl"
* "refine file structure"
* "refine files"
* "remove the cmakelist"
* "fix typo"
* "fix typo"
* fix ci
7 years ago
Yan Chunwei
4f95bc9463
feature/trt engine op test ( #11182 )
7 years ago
weixing02
3e46ec41a9
add hsigmoid
7 years ago
Luo Tao
aa4f685b66
fix compiler error when do not have TensorRT library
7 years ago
Yan Chunwei
211e131525
feature/tensorrt engine op ( #11001 )
7 years ago