bingyanghuang
76553c5a6d
fix travis-ci
7 years ago
tensor-tang
bc9971dd6c
fix deps
7 years ago
Xingyuan Bu
9e2e893f59
Enhance generate_proposal_labels_op and fix some bug. ( #13239 )
...
* Enhance generate_proposal_labels_op
* Fix bug in generate_proposals_op
* Refine rpn_target_assign_op.
* by Bu Xingyuan, Wang Guanzhong and Dang Qingqing
7 years ago
tensor-tang
ff858d35ed
fix bug and enable on batch mode as well
7 years ago
tensor-tang
8dea07f209
fix compile
7 years ago
tensor-tang
612ba41aee
add simple lstm compute
7 years ago
Zhaolong Xing
c9995289f1
Merge pull request #13124 from NHZlX/fix_subgraph_bug
...
Fix tensorrt subgraph bug
7 years ago
Tao Luo
24e61d305b
Merge pull request #13378 from chuanqi129/group_conv
...
Support grouped convolution layer with mkldnn.
7 years ago
chuanqiw
1052a793bc
support group convolution layer with mkldnn.
7 years ago
velconia
bb9ec4b25f
Polish code
7 years ago
gongweibao
3a3f28f99b
add ( #13377 )
7 years ago
velconia
926f5f43a9
fix redundant args of lambda and remove exception of destructor
7 years ago
nhzlx
329a8c5283
merge develop
7 years ago
nhzlx
49bafc05bf
fix comments and set name for trt layer and ITensor
7 years ago
Bai Yifan
e69d9c845b
code fix ( #13365 )
7 years ago
tensor-tang
b0b5f515a9
Merge remote-tracking branch 'ups/develop' into refine/infershape
7 years ago
gongweibao
8cee9f6176
Fix rpcclient's wait action in async env. ( #13307 )
7 years ago
tensor-tang
43d30547c5
Merge remote-tracking branch 'ups/develop' into refine/infershape
7 years ago
tensor-tang
8bb824bb93
refine infershape hasinput and hasoutput
7 years ago
Jacek Czaja
dfbd1cc3c1
Merge pull request #13209 from Sand3r-/mgallus/conv-relu-fuse
...
[MKLDNN] Fuse Conv+BatchNorm + ReLU
7 years ago
Krzysztof Binias
2ed7982d09
Merge pull request #13327 from kbinias/kbinias/conv-weights-converted-once
...
[MKLDNN] Reusing once reordered convolution weights in test mode
7 years ago
tensor-tang
c4394bc543
Merge remote-tracking branch 'ups/develop' into refine/infershape
7 years ago
tensor-tang
8a1abe54d7
clean fusion infershape code
7 years ago
tensor-tang
916f42bcbf
refine fusion gru infershape
7 years ago
tensor-tang
a5556d4417
refine attentionlstm infershape
7 years ago
Krzysztof Binias
accdecc681
Correcting Lint errors
7 years ago
bingyanghuang
83394bab3e
modified by luotao's suggestion
7 years ago
Michal Gallus
5d34ef61cb
Fuse MKLDNN's Conv + ReLU
7 years ago
nhzlx
49b5b3c5b3
merge develop
7 years ago
nhzlx
03ff4f6892
fix subgraph bug!
7 years ago
tensor-tang
e0436ad8bb
refine fusion lstm infershape
7 years ago
Krzysztof Binias
1ce9e9dc30
Renaming decision variable
7 years ago
chengduoZH
cc18fffb90
add nest while_op
7 years ago
Bai Yifan
faf8ad2436
Add ignore_index in cross_entropy op ( #13217 )
...
* add ignore index
* update api.spec
* enhance softmax_with_cross_entropy
7 years ago
bingyanghuang
1454cd54aa
pre-commit check
7 years ago
bingyanghuang
7429067ab3
clean code
7 years ago
bingyanghuang
cdbc5e7353
Add some comments
7 years ago
bingyanghuang
53185fde11
Rewrite sequence pooling last and first mode with memcpy and clean code
7 years ago
guochaorong
76e9227467
Merge pull request #13199 from JiayiFeng/fix_CudnnHolder_bug
...
Fix cudnn holder bug
7 years ago
Krzysztof Binias
1658958fe6
Reusing converted weights
7 years ago
Yan Xu
d117bbc313
Merge pull request #13291 from Yancey1989/reset_vars_on_pserver
...
reset received vars on pserver
7 years ago
qingqing01
a39eba77eb
Implement norm_op by CUDA instead of Eigen. ( #13273 )
...
* Implement norm_op by CUDA instead of Eigen.
* Remove the commented code.
7 years ago
Yancey1989
32b94a7d13
cache var types
7 years ago
Yancey1989
580f55fa0f
update by comment
7 years ago
Yang Yu
8331e835a8
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug
7 years ago
Yancey1989
6edfae4234
reset received vars on pserver
7 years ago
tensor-tang
40dbd97f8e
Merge remote-tracking branch 'ups/develop' into refine/op/peephole
7 years ago
Qiyang Min
b805751598
Merge pull request #13223 from velconia/open_python35_CI
...
Open python35 ci
7 years ago
Yu Yang
34e467dcab
Merge pull request #13232 from reyoung/feature/fix_layer_norm
...
Use double to reduce
7 years ago
chengduo
886852557f
Refine reshape_grad and transpose_grad ( #13074 )
...
* Add intermediate
* fix flatten/squeeze/unsqueeze
* Considering compatibility issues, we could not fix the origin op
* follow comment
* reset the shape of XShape
7 years ago
tensor-tang
3eb55f0643
Merge remote-tracking branch 'ups/develop' into refine/op/peephole
7 years ago
tensor-tang
d7ac1cc836
refine seq when bs is large
7 years ago
tensor-tang
9dd5a177a5
refine batch mode and peephole
7 years ago
Qiao Longfei
6e03f7900f
Add centered mode rmsprop ( #13161 )
...
* rmsprop optimizer support v1 mode
* typo
* optimize code
* refine code
* optimize unit test
* update test_rmsprop_op.py
* update formula of rmsprop
* optimize document
* update API.spec for RMSPropOptimizer
* add default value to check_output_with_place equal_nan
7 years ago
Yan Chunwei
9df2d8b5ba
test/add text-classification test ( #13081 )
7 years ago
tensor-tang
f10710b0ca
move seq peephole if out of loop
7 years ago
tensor-tang
2f3b498949
refine fusion seq lstm peephole
7 years ago
tangwei12
d1e2efae6b
reimplement auc in fluid ( #13167 )
...
* reimplement auc in python
* reimplement auc in fluid
* add auc unittest
* replace new auc in layers
* add batch Auc in Fluid
* name formatted
7 years ago
Yu Yang
f57d706aa7
Use double to reduce
7 years ago
tensor-tang
5f586e2223
Merge remote-tracking branch 'ups/develop' into refine/op/fusion_lstm
7 years ago
Brian Liu
04272c0d41
Enable lstm peephole ( #13160 )
...
* Refine fusion lstm op code for better readability
* Enable peephole in fusion lstm op (seq_mode part) and add unit test
* Enable peephole in fused lstop op (batch_mode part)
Set batch_mode as default as well
* Use pre-commit to clean format
* Follow up review comments as well as adding more unit tests for seq mode
7 years ago
fengjiayi
56750e6a3e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug
7 years ago
Qiao Longfei
cdd14f17f1
fix async mode handle COMPLETE_MESSAGE ( #13212 )
7 years ago
minqiyang
8059445fb5
Fix fake_quantize_op
7 years ago
tensor-tang
78d9ad5712
fusion gru enforce only used
7 years ago
tensor-tang
555083ae2a
enforce only used
7 years ago
fengjiayi
db5e3dd767
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug
7 years ago
Jiabin Yang
d091dd02a0
fix mac compile error 0903 ( #13184 )
7 years ago
Yu Yang
cda7842e26
Revert "Revert "Add Python Callstacks when Op::Run error ( #12759 )""
...
This reverts commit 1f270275a6.
7 years ago
qingqing01
9557cc218d
Refine and fix some code for faster-rcnn. ( #13135 )
...
* Fix bug in generate_proposals_op.
* Fix data type for RoIs.
* Refine and fix rpn_target_assign_op.
* Add the missing file bbox_util.h
* Rename BoxEncoder to BoxToDelta
7 years ago
fengjiayi
82a1b35b9b
Revert "Revert "Add CudnnHolder and use it in Conv and ConvTranspose op""
...
This reverts commit 151e169eb7.
7 years ago
guochaorong
151e169eb7
Revert "Add CudnnHolder and use it in Conv and ConvTranspose op"
7 years ago
Chen Weihang
3b6090e80b
Merge pull request #12887 from chenwhql/sequence_enumerate_op
...
Feat: add sequence enumerate op
7 years ago
tensor-tang
1cc35f3642
Merge pull request #13118 from tensor-tang/optimize/op/fusion_lstm
...
Optimize fusion lstm batch mode
7 years ago
dzhwinter
6fb28796f5
memory ( #13143 )
7 years ago
dzhwinter
e722f68318
fix windows compile ( #13147 )
7 years ago
dzhwinter
f05520060e
fix style ( #13142 )
7 years ago
dzhwinter
856c26faef
fix elementwise ( #13146 )
7 years ago
fengjiayi
653c8ded7d
Merge pull request #13078 from JiayiFeng/dev_CudnnHolder
...
Add CudnnHolder and use it in Conv and ConvTranspose op
7 years ago
tensor-tang
20659fc905
Merge pull request #13107 from tensor-tang/optimize/op/fusion_gru
...
Optimize fusion gru
7 years ago
tensor-tang
93c034ee51
Merge remote-tracking branch 'ups/develop' into optimize/op/fusion_lstm
7 years ago
tensor-tang
c7adb99ae0
follow comment and refine code
7 years ago
tensor-tang
83f4bc4ecf
follow comment and refine code
7 years ago
tensor-tang
f38905a6e5
Merge remote-tracking branch 'ups/develop' into optimize/op/fusion_gru
7 years ago
tangwei12
fbdd4f8c0f
Merge pull request #13101 from zenghsh3/develop
...
Fix bug of sampling_id op
7 years ago
tensor-tang
9838bacb35
Merge branch 'develop' into optimize/op/fusion_lstm
7 years ago
qingqing01
9bd933d3fb
Improve and fix fake_quantize_op ( #13092 )
...
* Improve and fix fake_quantize_op.
7 years ago
Tao Luo
3fe0575b62
Merge pull request #13148 from dzhwinter/windows/math_compile
...
cuda math port
7 years ago
chenweihang
7ddbbcb0b5
doc: refine API and doc
7 years ago
dzhwinter
34757efb8e
fix windows compile
7 years ago
tensor-tang
c44108803a
refine prelu
7 years ago
chenweihang
b081363bae
Merge branch 'sequence_enumerate_op' of https://github.com/chenwhql/Paddle into sequence_enumerate_op
7 years ago
chenweihang
0b7d82befb
doc: refine English description
7 years ago
dzhwinter
b11332a07b
"fix style" ( #13094 )
7 years ago
dzhwinter
ab1097cd8e
Feature/template ( #13093 )
...
* remove template operator
* "fix compile"
* "fix ci"
* "fix ci"
7 years ago
tensor-tang
80edd7ef29
enable run with fuse pass
7 years ago
fengjiayi
f79ca23115
fix bugs
7 years ago
tensor-tang
a79a77eeb5
refine and clean code
7 years ago
tensor-tang
c459fb5be0
add fusion lstm batch mode
7 years ago
whs
e10aa80f03
Add pad2d op. ( #12950 )
...
* Add pad2d op.
* Add unittest and python api.
* Fix cuda op kernel.
* Fix python api.
* Fix python api.
* Update API.spec.
* Fix python api
7 years ago
tensor-tang
7bdd11d88e
Merge branch 'develop' into optimize/op/fusion_gru
7 years ago
fengjiayi
1f36a4c27c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_CudnnHolder
7 years ago
fengjiayi
b0aca8824d
make CudnnHolder thread safe
7 years ago
tensor-tang
596213906b
add gru seq mode forward
7 years ago
zenghsh3
d7495838b3
refine
7 years ago
zenghsh3
04a05d1d58
merged
7 years ago
zenghsh3
08b73b68c4
fix bug of sampling_id_op
7 years ago
tensor-tang
b0d36c4c3d
add cross vec to speedup gru
7 years ago
tensor-tang
038c16eed2
save intermediate data to out buffer
7 years ago
Xingyuan Bu
0a97d24b41
Faster RCNN Generate Proposal Labels ( #12616 )
...
* Add generate_proposal_labels for Faster-RCNN.
7 years ago
fengjiayi
d5f74b7308
use CudnnHolder in conv_transpose_cudnn_op
7 years ago
fengjiayi
407ff0bdbc
use CudnnHolder in conv_cudnn_op
7 years ago
chengduo
3bd1d22a7d
Enhance fused_elementwise_activation_op ( #12837 )
...
* Enhance the function of fused_elementwise_activation_op
* enhance unit test
* Clean Code And Add Doc
* Add compound functors
* Fix doc and enhance unit test
* define Dx and Dy for d_binary_func
* add mul_scale
* add mul_scale
* add elementwise_mul
* code refine
* code refine
* add doc
* add AsIntermediate
7 years ago
tensor-tang
2d0ddf8c41
refine cpu gru batch mode
7 years ago
tensor-tang
70d3981220
add cpu vec bias sub
7 years ago
jerrywgz
85fe65ae61
modified error info for maxout op
7 years ago
Chen Weihang
b98b744067
Merge branch 'develop' into sequence_enumerate_op
7 years ago
Yan Chunwei
902f19b46a
fea/fuse attention lstm simplify.with fusion lstm.with sequence expand ( #13006 )
7 years ago
Xingyuan Bu
2ad5d91ef8
Faster RCNN Generate Proposals ( #12056 )
...
* Add proposals generation operator for Faster-RCNN.
7 years ago
tensor-tang
89d6d69ce4
Merge pull request #12781 from tensor-tang/feature/op/fusion_gru
...
add fusion gru
7 years ago
tensor-tang
d941192e74
fix gcc53 on cpu vec ( #13020 )
7 years ago
tensor-tang
2328a69157
Merge pull request #13012 from tensor-tang/refine/seq2batch
...
refine seq2batch
7 years ago
Xin Pan
2bb15f437c
Merge pull request #12791 from panyx0718/ir3
...
graph to program pass
7 years ago
Qiao Longfei
a22309afe8
clean useless check code in auc_op ( #13023 )
7 years ago
Yu Yang
8965cee89f
Polish PrintOp ( #12895 )
...
* Polish PrintOp
* Polish PrintOp
* Polish PrintOp
* Refine test_print_op
7 years ago
chengduo
7ad39c4077
Enhance pad_constant_like_op ( #12999 )
...
* enhance pad_constant_like_op
* add API
* add API
7 years ago
qingqing01
0353eddb51
Improve fake_dequantize_op. ( #12877 )
...
* Improve fake_dequantize_op.
* Follow comments.
7 years ago
Qiao Longfei
11e01d9b2d
Scale support selectedrows ( #12960 )
...
* add ScaleOpVarTypeInference for scale op
* scale op support scale selected rows
* optimize code
* use FindVar
* use FindVarRecursive in ScaleOpVarTypeInference
7 years ago
fengjiayi
7b84c580e2
Merge pull request #12824 from JiayiFeng/dev_sequence_padding_op
...
Sequence pad op
7 years ago
tensor-tang
fd4f7c3ab5
refine seq2batch
7 years ago
Wu Yi
0ee6fed05b
Refine dist rpc deps ( #12899 )
...
* refine dist train RPC deps
* clean up
* clean up
* fix ut
* remove input for fetch_barrier
* follow comments
7 years ago
fengjiayi
7e0c9f50ae
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op
7 years ago
Zeng Jinle
599a32641b
Merge pull request #12971 from sneaxiy/unstack_op
...
Add unstack op
7 years ago
Tao Luo
26cac36bfd
Merge pull request #12515 from kbinias/kbinias/bnorm-fwd-reuse
...
Reusing primitives for forward Batch Norm operator
7 years ago
tensor-tang
a481c5e98c
Merge remote-tracking branch 'ups/develop' into feature/op/fusion_expand_concat_fc
7 years ago
tensor-tang
49c31febb5
fix typo and op test
7 years ago
fengjiayi
9cb455fa7d
update function
7 years ago
Krzysztof Binias
fb4b4f8d57
Refactor code
7 years ago
Krzysztof Binias
50d3e6e96b
Reusing primitives for forward Batch Norm operator
7 years ago
Zeng Jinle
ef7bd03a03
Merge pull request #12964 from sneaxiy/fix_concat_sync
...
Fix concat bug
7 years ago
sneaxiy
52a480bb98
Merge develop
7 years ago
tensor-tang
02909335e9
rename fusion seq_concat_fc to fusion seqexpand_concat_fc
7 years ago
Xin Pan
1a67061fee
graph to program pass
...
fix a few other things
7 years ago
qingqing01
1f09bc320c
Support data type int8_t . ( #12841 )
...
* Support int8 type.
7 years ago
chenweihang
00b30b9938
doc: unified infershape format
7 years ago
chenweihang
0c4697f8cd
fix: change to enumerate by sentence
7 years ago
tensor-tang
c45cee0349
refine infershape and forward
7 years ago
sneaxiy
24264bc0b8
Merge develop
7 years ago
dzhwinter
0153c21d83
add unstack_op
7 years ago
tensor-tang
c7c2506733
add forward implementation
7 years ago
jerrywgz
6033c1a278
Add error info & remove data sharing between input and output in rnn_memory_helper_op
7 years ago
chengduo
3e1050a2e8
Add pad_constant_like_op ( #12943 )
...
* Add pad_constant_batch_size_like
* refine pad_op
* optimize memory
7 years ago
dzhwinter
6cc7870517
fix concat synchronization bug
7 years ago
tensor-tang
954b0e113f
init fusion seq expand concat fc op
7 years ago
tensor-tang
c488ee96a7
Merge remote-tracking branch 'ups/develop' into refine/op/fusion_lstm
7 years ago
tensor-tang
e61cf3214d
complete reverse seq
7 years ago
Chen Weihang
4ec12496dd
Merge branch 'develop' into sequence_enumerate_op
7 years ago
tensor-tang
4b28fab8c9
enable more acts
7 years ago
tensor-tang
607c41952e
compute gates
7 years ago
Qiao Longfei
3c58b87b45
fix auc layer and add check for auc op ( #12954 )
...
* fix auc layer and add check for auc op
* use input to check if states are inited
* optimize code
7 years ago
jerrywgz
835573bbf2
add error_info prelu_op
7 years ago
Yibing Liu
c1488b1796
Merge pull request #12940 from sneaxiy/stack_op
...
Speedup stack_op
7 years ago
dzhwinter
eca4563e5d
operators module ( #12938 )
7 years ago
tensor-tang
6be273cbdb
add seq mode lstm
7 years ago
tensor-tang
36363292c3
Merge pull request #12904 from tensor-tang/refine/jit
...
optimize cpu vec activations
7 years ago
jerrywgz
bc7503c85e
modified error_info for maxout_op
7 years ago
Zeng Jinle
d189d4dbab
Merge pull request #12884 from sneaxiy/sequence_mask_op
...
Add sequence_mask_op for DAM model
7 years ago
sneaxiy
3b38e5a4fc
speed up stack_op
7 years ago
tensor-tang
7bdaf09664
Merge remote-tracking branch 'ups/develop' into refine/jit
7 years ago
Tao Luo
989cc2a4f4
Merge pull request #12913 from luotao1/concat
...
enhance the forward of concat op
7 years ago
Tao Luo
8650f6ffae
Merge pull request #12898 from luotao1/expand
...
remove broadcast in sequence_expand
7 years ago
Qiao Longfei
52948a0b50
Merge pull request #12909 from jacquesqiao/fix-sparse-update-bug
...
fix sparse update bug
7 years ago
tensor-tang
ba943d38e3
make runtime avx act
7 years ago
tensor-tang
3462c29940
refine add bias with avx
7 years ago
tangwei12
ef6445ee39
Merge pull request #12908 from seiriosPlus/fill_constant_selectedrows
...
add SelectedRows support in fill_constant_op
7 years ago
tensor-tang
bb9f98e10d
add inplace test
7 years ago
tensor-tang
f269614bcd
further optimize tanh with avx and mkl
7 years ago
chenweihang
733ea0d29b
adjust infershape details
7 years ago
luotao1
e999c74cff
Merge branch 'develop' into concat
7 years ago
luotao1
b61cf7ac4f
Merge branch 'develop' into expand
7 years ago
luotao1
2b4edacca0
enhance the forward of concat op
7 years ago
Tao Luo
3e3b5f4fda
Merge pull request #12675 from Sand3r-/fix-conv-mkldnn-0.15
...
Update MKLDNN to 0.15, fix convolution integration
7 years ago
tensor-tang
7a4924cd44
further optimize sigmoid with avx and avx512
7 years ago
qiaolongfei
fcf20eed0f
fix sparse update bug
7 years ago
tangwei12
ca22586818
code optimize
...
(cherry picked from commit 587cca7)
7 years ago
Xin Pan
557be6fc58
Merge pull request #12902 from PaddlePaddle/revert-12736
...
Revert "Disable in_place in batch_norm API. (#12736 )"
7 years ago
tensor-tang
6bd89ba5b6
fix typo
7 years ago
Chen Weihang
2969aba14f
Merge branch 'develop' into sequence_enumerate_op
7 years ago
chenweihang
219a2369da
feat: wrap sequence enumerate op
7 years ago
tensor-tang
e3bb98eb38
optimize relu with avx and avx512
7 years ago
guochaorong
1f270275a6
Revert "Add Python Callstacks when Op::Run error ( #12759 )"
...
This reverts commit b2df17003f.
7 years ago
guochaorong
b1fc238694
Revert "Disable in_place in batch_norm API. ( #12736 )"
...
This reverts commit f5d5d7b2d9.
7 years ago
tensor-tang
25976fe736
optimize the sigmoid and tanh
7 years ago
tensor-tang
2eb46c2b06
add cpu vec test
7 years ago
sneaxiy
1083e99520
Merge develop
7 years ago
tensor-tang
f0f06992c1
Merge pull request #12878 from tensor-tang/feature/op/attention_lstm
...
Add attention lstm cpu forward
7 years ago
luotao1
83f4edabe9
remove broadcast in sequence_expand
7 years ago
sneaxiy
5ea7bf88ba
Merge pull request #12872 from sneaxiy/stack_op
...
Add stack_op for DAM model
7 years ago
Tao Luo
ef2da86b4f
Merge pull request #12885 from luotao1/test_ditu_rnn
...
enhance test_analyzer to profile ditu inference demo
7 years ago
sneaxiy
e895c98f0a
add support to max_len is None
7 years ago