tensor-tang
7a4924cd44
further optimize sigmoid with avx and avx512
7 years ago
qiaolongfei
fcf20eed0f
fix sparse update bug
7 years ago
tangwei12
ca22586818
code optimize
...
(cherry picked from commit 587cca7)
7 years ago
Xin Pan
557be6fc58
Merge pull request #12902 from PaddlePaddle/revert-12736
...
Revert "Disable in_place in batch_norm API. (#12736 )"
7 years ago
tensor-tang
6bd89ba5b6
fix typo
7 years ago
Chen Weihang
2969aba14f
Merge branch 'develop' into sequence_enumerate_op
7 years ago
chenweihang
219a2369da
feat: wrap sequence enumerate op
7 years ago
tensor-tang
e3bb98eb38
optimize relu with avx and avx512
7 years ago
guochaorong
1f270275a6
Revert "Add Python Callstacks when Op::Run error ( #12759 )"
...
This reverts commit b2df17003f.
7 years ago
guochaorong
b1fc238694
Revert "Disable in_place in batch_norm API. ( #12736 )"
...
This reverts commit f5d5d7b2d9.
7 years ago
tensor-tang
25976fe736
optimize the sigmoid and tanh
7 years ago
tensor-tang
2eb46c2b06
add cpu vec test
7 years ago
sneaxiy
1083e99520
Merge develop
7 years ago
tensor-tang
f0f06992c1
Merge pull request #12878 from tensor-tang/feature/op/attention_lstm
...
Add attention lstm cpu forward
7 years ago
luotao1
83f4edabe9
remove broadcast in sequence_expand
7 years ago
sneaxiy
5ea7bf88ba
Merge pull request #12872 from sneaxiy/stack_op
...
Add stack_op for DAM model
7 years ago
Tao Luo
ef2da86b4f
Merge pull request #12885 from luotao1/test_ditu_rnn
...
enhance test_analyzer to profile ditu inference demo
7 years ago
sneaxiy
e895c98f0a
add support to max_len is None
7 years ago
fengjiayi
f4a4a4cbd9
add op comment and python layer
7 years ago
tangwei12
acdd95d5ca
bug fix
7 years ago
chenweihang
d2e5395b97
feat: add sequence enumerate op
7 years ago
luotao1
9c7fde45a7
enhance test_analyzer to profile ditu inference demo
7 years ago
chengduo
8ad9055804
Add is_test for while_op ( #12874 )
...
* add is_test for while_op
* Change API
7 years ago
sneaxiy
64464cb1fa
Merge develop
7 years ago
qingqing01
79918a8442
add sequence_mask_op for DAM model
7 years ago
Yu Yang
b2df17003f
Add Python Callstacks when Op::Run error ( #12759 )
...
* Add Python Callstacks when Op::Run error
* Skip op with sub-block
* refactor: refine callstack info's format
* Reshape only support matrix
* Polish Python code
* Fix UT
* Fix Py3
7 years ago
Yu Yang
17fcc4f5d0
Merge pull request #12864 from reyoung/feature/process_lod_grad
...
Feature/process lod grad
7 years ago
tensor-tang
5ca0bb9aad
support more activation type and remove some comments
7 years ago
sneaxiy
ba168bd2d2
modify API.spec
7 years ago
tensor-tang
d9bf73f3ab
Merge remote-tracking branch 'ups/develop' into feature/op/fusion_gru
7 years ago
tensor-tang
dd938d0b94
fix bugs and pass op test
7 years ago
tensor-tang
ec59f0d454
add cpu vec
7 years ago
tensor-tang
cf5ea925c3
fix bugs
7 years ago
tensor-tang
6ed20474d4
refine attention lstm infershape
7 years ago
tensor-tang
508548f897
implement attention lstm cpu forward
7 years ago
tensor-tang
9affc36c89
init attention lstm
7 years ago
tensor-tang
3dd66390b2
add blas vexp
7 years ago
tensor-tang
0ec1f65cf1
fix blas dot and add cblas scal
7 years ago
tensor-tang
a2203d0466
add cblas dot
7 years ago
tensor-tang
f72ab8961e
refine blas gemm
7 years ago
qingqing01
f5d5d7b2d9
Disable in_place in batch_norm API. ( #12736 )
...
* Disable in_place in batch_norm API.
7 years ago
sneaxiy
c73c5ed573
use for_range
7 years ago
Xin Pan
b548ecbc2b
add stack_op
7 years ago
Yu Yang
eb8fd853bc
Fix sequence_softmax_cudnn op
7 years ago
Yu Yang
3768677980
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/process_lod_grad
7 years ago
Yu Yang
2a36ad1a96
Handle LoD for concat & seq_softmax ops
7 years ago
Yu Yang
211d81863d
Process elemwise grad op's lod. mul_op's lod
7 years ago
Yan Chunwei
9ee698e605
enhance/ditu rnn with fc fuse ( #12831 )
...
* make fc fuse work with ditu rnn
* add ditu rnn data download to CMAKE
7 years ago
Xin Pan
78415f326d
Merge pull request #12838 from panyx0718/infer
...
speed up while_op
7 years ago
fengjiayi
ce182d9037
bug fix
7 years ago
Xin Pan
a2c0e52f3e
speed up while_op
7 years ago
tensor-tang
6f78fd7d1e
fuse fc in gru
7 years ago
tensor-tang
300180cc26
init fusion gru op
7 years ago
Zhaolong Xing
21ba32b065
Merge pull request #12843 from NHZlX/fix_ssa_bug_for_trt
...
fix ssa bug with batch_norm and refine the trt
7 years ago
Michał Gallus
cd32ddac12
Fuse Convolution and Eltwise Add into MKLDNN's Conv+Bias ( #12669 )
...
* Fuse Convolution and Eltwise Add into Conv+Bias
* Reduce bias branching at conv_mkldnn_op
* Add MKLDNN build checks for Conv Bias
* Conv-bias: check if bias input exist before assignment
* Conv-bias: Remove Bias dim check from infershape
It was causing conv3d test to crash upon calling HasInput(Bias)
7 years ago
nhzlx
c999895e93
merge develop
7 years ago
nhzlx
276950291a
1. fix ssa bug with batchnorm, 2. refine the trt
7 years ago
Yan Chunwei
896a37b6e3
fea/link ir to inference analysis and fc fuse support ( #12789 )
...
* link IR graph to analysis graph
* add clean code and update
* add infer_clean_pass
* add ir_pass_manager
* support fc fuse executation
* fix ir circle
7 years ago
dzhwinter
e23ddf6ae4
status ( #12764 )
7 years ago
Tao Luo
d04ef276a5
Merge pull request #12745 from tensor-tang/refine/op/elewise_mul
...
Refine elementwise mul cpu forward
7 years ago
tangwei12
cbc6e6eb97
Merge pull request #12247 from seiriosPlus/dis_ckpt_fix
...
add load slice_vars in io.py
7 years ago
Qingsheng Li
3d11d018e0
Fix scatter_op python API ( #12742 )
...
* Fix scatter_op python API and remove inconsistency between implementation and doc
* API spec change
* Change as review comment
7 years ago
Tao Luo
8f9f414a14
Merge pull request #12805 from tensor-tang/fix/op/elewise_add
...
fix SEGV element wise add at debug mode
7 years ago
tensor-tang
e955361267
Merge pull request #12737 from tensor-tang/feature/op/fusion_lstm
...
add fusion lstm
7 years ago
tensor-tang
82bb9170fb
Merge remote-tracking branch 'ups/develop' into fix/op/elewise_add
7 years ago
Chen Weihang
57b34d9196
Merge pull request #12808 from chenwhql/remove_inplace_param_in_squeeze_and_unsqueeze
...
Refactor: remove inplace parameter from squeeze and unsqueeze op
7 years ago
Yihua Xu
084d4a9e9e
Optimize CRF Decoding with AVX/AVX2/AVX512F instruction ( #12767 )
...
* Optimize CRF decoding with AVX/AVX2 instruction
* Enable the AVX2 flags for compiling
* Clean the code and decrease the count of multiply calculation
* Add the support of AVX512 instruction to optimize CRF Decoding
* Clean the code
* Enable the AVX512f flags for compiling
* Clean the code for the invaluable switch
* Fixed the issue to check AVX512F status
* Clean the code
* Add some explanation of the key points
7 years ago
fengjiayi
34b209cffa
Complete sequence_padding GPU kernel
7 years ago
dzhwinter
00463fdfe3
cudnn windows support ( #12757 )
...
* cudnn widndows
* "add comment"
* "windows support"
* "fix cmake error"
7 years ago
qingqing01
c62f68cb94
Fix bug in conditional_block_op. ( #12246 )
...
* Fix bug in conditional_block_op.
* Fix bug and add comments.
* Rename arguments.
7 years ago
chenweihang
bc471b6ac4
refactor: remove inplace parameter from squeeze and unsqueeze op
7 years ago
tensor-tang
0507f7bc3c
fix SEGV elementwise add at debug mode
7 years ago
tangwei12
ca1e18c04a
Merge pull request #12469 from seiriosPlus/sum_op_dim_fix
...
sum_op selectedRows dim bug fix
7 years ago
Zhaolong Xing
e5674f6dde
Merge pull request #12753 from NHZlX/add_benchmark
...
modify tensorrt engine op from cpu mode to gpu
7 years ago
tensor-tang
b090479409
Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tangwei12
b4f52b01d0
bug fix when all inputs are empty
7 years ago
tangwei12
3efac174ea
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tangwei12
dbb4f0d35d
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into dis_ckpt_fix
7 years ago
Qiao Longfei
fd10669ecb
Add dependency to send recv ( #12760 )
...
Add dependency to send recv
7 years ago
fengjiayi
8d8d48a34f
Complete sequence_pad_op and its CPU kernel. Add unittests
7 years ago
tangwei12
7c12c0f865
add sync in load selectedrows
7 years ago
Michal Gallus
4a7f0698e0
Add consts to new MKLDNN integration
...
Also replace memory types from int64_t to size_t
7 years ago
Michal Gallus
6588d0e039
Update MKLDNN to 0.15, fix conv integration
7 years ago
tangwei12
9f11db4080
add todo in impl
7 years ago
tangwei12
c24a9263ba
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tangwei12
ac9ae97001
code fix
7 years ago
nhzlx
f55e8901c8
merge develop
7 years ago
nhzlx
1600ba86f6
1. change tensorrt op from cpu to gpu
7 years ago
tangwei12
bb9f494740
merge develop
7 years ago
dzhwinter
4069262f0e
Revert ""cherry picked operators changes" ( #12184 )" ( #12747 )
...
This reverts commit bf3c34960f.
7 years ago
Qiao Longfei
653fad08f8
Optimize selected rows for dist lookup table with pthread rwlock ( #12635 )
...
Optimize selected rows for dist lookup table with rwlock
7 years ago
fengjiayi
3c749fae43
update CPU sequence_padding functor
7 years ago
tensor-tang
92890ac258
Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tangwei12
0749c8822d
Merge pull request #12556 from seiriosPlus/samplingIdOp
...
Sampling id op
7 years ago
tensor-tang
a56142c155
optimize elementwise_mul cpu forward
7 years ago
tensor-tang
6644ce79a5
add mklml vmul
7 years ago
tensor-tang
ff92b6ba81
Merge pull request #12531 from tensor-tang/refine/op/gru
...
Refine gru cpu forward
7 years ago
tangwei12
26b228e405
remove assignment and add vlog
7 years ago
tangwei12
125e9166e1
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tensor-tang
a72f68f223
Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tensor-tang
df28a3b452
fix lod and op test
7 years ago
Qingsheng Li
317e18abd2
Remove Data Sharing between input and output in scatter_op ( #12672 )
...
* Remove Data Sharing between input and output in scatter_op
* Removed data sharing in backward op
7 years ago
tensor-tang
f3cd2612ae
refine fc and use the fc compute in fusion_lstm
7 years ago
tangwei12
822496f626
merge cpu and gpu
7 years ago
dzhwinter
bf3c34960f
"cherry picked operators changes" ( #12184 )
...
* "cherry picked operators changes"
* "remove duplicated code"
* "add constant setter"
* "add get expected kernel"
* "fix ci"
* "add fill constant"
7 years ago
tensor-tang
40138c4cd6
add unit test of fusion lstm op
7 years ago
jerrywgz
c108376506
Add three modes for prelu_op ( #12630 )
...
* Add three modes for prelu_op.
7 years ago
tangwei12
9f09d68678
add enforce
7 years ago
gongweibao
d06849305a
parameter dispather. ( #12666 )
7 years ago
tensor-tang
852bc6f4aa
refine fusion lstm op doc
7 years ago
tensor-tang
8f9132959e
fuse fc in lstm
7 years ago
tensor-tang
ddb05dffb6
init fusion lstm op
7 years ago
tensor-tang
efc5392d97
Merge pull request #12676 from tensor-tang/refine/op/fc
...
refine fc op
7 years ago
tangwei12
470fb7c5c3
bug fix
7 years ago
tangwei12
60dda7bf9f
add gpu Implementation
7 years ago
tangwei12
4661f5589d
random optimize
7 years ago
Bai Yifan
9333a62792
Add flatten op interface and enhance APIs about detection to support variable-length image. ( #12422 )
...
* add flatten api&enhance detection api
* unify shape_op data type
* update API.spec
7 years ago
tensor-tang
eee38464dc
refine fc op use cpu only
7 years ago
tangwei12
ed937bc6f8
merge
7 years ago
tensor-tang
d84a1a0010
fc op use cpu only
7 years ago
fengjiayi
a38a8db928
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op
7 years ago
tangwei12
478f73c188
merge header in cc
7 years ago
fengjiayi
d6b5302bd6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
tensor-tang
c588c64a76
Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
tensor-tang
0098a494a2
Merge remote-tracking branch 'ups/develop' into refine/op/fc
7 years ago
fengjiayi
5e7aa8c7e5
code clean
7 years ago
tensor-tang
742300baa8
fix unknown omp pragmas
7 years ago
tensor-tang
b9dbb7c5cb
fix bias attri in mkldnn fc
7 years ago
tangwei12
59580a7f69
bug fix
7 years ago
tensor-tang
4b5986bb77
enable fc op in normal case
7 years ago
tensor-tang
e133df6037
enable native fc forward
7 years ago
tensor-tang
6a2a9a8350
Revert "Refine elementwise_add op"
7 years ago
Yu Yang
8dda526a45
Merge pull request #12659 from sneaxiy/refine_softmax_with_cross_entropy
...
Fix 'softmax_with_cross_entropy_op' dependency error
7 years ago
sneaxiy
f6f5cdaa05
Merge pull request #12555 from sneaxiy/refine_layer_norm
...
Refine layer_norm op
7 years ago
sneaxiy
c50c537732
fix arithmetic error in backward kernel
7 years ago
tensor-tang
038cbf799d
add bias for fc op
7 years ago
whs
9d6243b6fb
Fix crop op. ( #12603 )
...
* Fix infer shape of crop op.
* Speed crop op.
7 years ago
Bai Yifan
649f5d74f0
fix mine_hard_example bug ( #12664 )
7 years ago
sneaxiy
2d9508f8f3
Merge pull request #12554 from sneaxiy/refine_elementwise_add
...
Refine elementwise_add op
7 years ago
tensor-tang
171a0e2b42
add some comment
7 years ago
sneaxiy
2c560623d1
fix dependency error
7 years ago
tensor-tang
5377edd282
refine packed condition
7 years ago
tensor-tang
3bf3e77ac8
Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
qiaolongfei
c0890988da
add RPCServerProfiler, replace listen and serv optimizer
7 years ago
tangwei12
64a4925cb4
Merge branch 'Pdv' into samplingIdOp
7 years ago
tangwei12
0bfd62be3d
remove gpu supported, will add it later
7 years ago
Tao Luo
5a9ae411e0
Merge pull request #12618 from sfraczek/sfraczek/fix-new-mkldnn-conv-tests
...
fix UT for mkldnn 0.15
7 years ago
sneaxiy
cf799a6a04
Merge pull request #12553 from sneaxiy/refine_softmax_with_cross_entropy
...
Refine softmax_with_cross_entropy op
7 years ago
dzhwinter
8499559c42
"fix style" ( #12600 )
7 years ago
sneaxiy
010883689c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_layer_norm
7 years ago
sneaxiy
5d698589ce
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_elementwise_add
7 years ago
sneaxiy
19ff254d05
Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add
7 years ago
Sylwester Fraczek
d74bb6ab9c
fix ut for mkldnn 0.15 - added forcing layout NCHW in mkldnn conv tests
7 years ago
fengjiayi
855c9e3311
clean softmax_op code
7 years ago
fengjiayi
24d51de022
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
fengjiayi
27df3a9f2b
make cross_entropy_op supporting tensors
7 years ago
fengjiayi
66be53264e
Merge pull request #12592 from JiayiFeng/fix_mac_compile_error
...
fix mac compile error
7 years ago
fengjiayi
8e604a10aa
fix mac compile error
7 years ago
nhzlx
551c802cdc
merge develop
7 years ago
sneaxiy
ad45d39222
refine layer_norm
7 years ago
chengduo
7c8b69c700
Feature/op fusion ( #12240 )
...
* Add Preface
* Add demo code
* Save file
* Refine code
* seems can work
* use elementwise strategy
* Use ElementwiseComputeEx
* Add comments
* extract functions from operator
* Refine code
* Follow comment
* code refine
* follow comments
* follow comments
7 years ago
sneaxiy
1b4515f6db
refine softmax_with_cross_entropy
7 years ago
nhzlx
3a0caf801f
modify trt engine op test
7 years ago
nhzlx
e51d045a6d
modify trt engine op test
7 years ago
nhzlx
e8954a36f5
merge develop
7 years ago
nhzlx
32a9e050bc
mapping the variable name inside the subgraph
7 years ago
Wu Yi
2d036c47cd
polish dist unit test code ( #12512 )
...
* polish dist se resnext ut
* update
* update
* update
* avoid cpu initializer differ
* change to use executor for now
* update by comment
* remove lr decay use para exe, should fix para exe bug later
* update by comment
7 years ago
fengjiayi
7834b4a470
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
tangwei12
5bfdefae91
Merge branch 'Pdv' into samplingIdOp
7 years ago
tangwei12
b30bdde15a
random optimize
7 years ago
tangwei12
9c63fef63c
random optimize
7 years ago
Qiao Longfei
88a607c342
Merge pull request #12541 from jacquesqiao/optimize-profiler
...
optimize profiler
7 years ago
tangwei12
5b9716d1f6
add dims check
7 years ago
tangwei12
4cd504d3b4
bug fix
7 years ago
sneaxiy
e57bc4d745
Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add
7 years ago
sneaxiy
222fbbedfb
Merge branch 'develop' into refine_elementwise_add
7 years ago
sneaxiy
4b83afff6e
Merge branch 'develop' into refine_elementwise_add
7 years ago
sneaxiy
b2d0ee5159
refine elementwise_add op
7 years ago
tangwei12
da2cc99f67
sampling op optimize
7 years ago
fengjiayi
7c55e08c93
stash
7 years ago
tangwei12
4973e07be3
sampling op optimize
7 years ago
tensor-tang
836068569f
Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
tensor-tang
18c322c2a1
separate cpu and gpu implementations for gru kernel compute
7 years ago
tensor-tang
54c95e49f0
fix blas
7 years ago
fengjiayi
b656d97e86
Merge pull request #12485 from JiayiFeng/dev_ops_tensor_support
...
Make lookup_table_op and softmax_op supporting high rank tensor
7 years ago
qiaolongfei
1623f1ba4f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-profiler
7 years ago
tangwei12
3206970b77
sampling op rename
7 years ago
Xin Pan
99a77cfc62
Merge pull request #12468 from panyx0718/improve_profiler2
...
Improve profiler
7 years ago
qiaolongfei
a3f9d6a38c
optimize profiler
7 years ago
tangwei12
e0ab2f7158
new sampling op
7 years ago
tensor-tang
8c23f7c4f0
fix blas and use packed weight
7 years ago
tensor-tang
d9cc6b1866
replace gru compute with details
7 years ago
tensor-tang
43cee33a23
add mkl packed gemm
7 years ago
tangwei12
766ac488ac
sum_op selectedRows dim bug fix
7 years ago
dzhwinter
595a2c83ae
explicit gradient of elementwise_add/elementwise_sub ( #11970 )
...
* "add gradient register"
* "make some enhance"
* "better format"
* "fix typo"
* "fix reuse"
* "fix get expected kernel"
* "change the mkldnn code"
* "fix mkldnn"
* "fix mkldnn failed test"
* "add comment"
7 years ago
fengjiayi
e7d8e16a66
update softmax_mkldnn_op
7 years ago
Yu Yang
2567afa35d
Merge pull request #12462 from reyoung/feature/fix_cudnn_deterministic
...
Fix bug in cudnn_determistic
7 years ago
fengjiayi
dc111d3476
update softmax_cudnn_op
7 years ago
fengjiayi
f7bd0b227b
Add unittests for softmax_op
7 years ago
gongweibao
819ac3df0a
Modify style ( #12465 )
7 years ago
fengjiayi
b314a69523
make softmax supporting tensors
7 years ago
fengjiayi
b1af7e5d9b
Add unittests for lookup_table_op
7 years ago
tangwei12
c4c8f60bec
sum_op selectedRows dim bug fix
7 years ago
Xin Pan
486345551d
clean
7 years ago
Xin Pan
caf10b474f
make profiler use thread_id from g_thread_id
...
Add a few more RecordEvent.
Cleanup
7 years ago
Yu Yang
040fc1c39b
Fix bug in cudnn_determistic
...
* Introduced by #11205
7 years ago
fengjiayi
7efdf05ac2
make look_up_op supporting tensor ids
7 years ago
Qiao Longfei
690625fe15
Merge pull request #12456 from jacquesqiao/add-profiler-to-pserver
...
Add profiler to pserver
7 years ago
qiaolongfei
7e46a8d172
fix logical bug, optimize code
7 years ago
qiaolongfei
0b62f61d29
add init flag in __init__.py for listen_and_serv_profile_period
7 years ago
dzhwinter
91fb0156ca
Memory/reshape op ( #12414 )
...
* "remove inplace in single op"
* "fix ci"
* "add transpiler case"
* fix conflict
* "fix reshape"
* "delete reshape inplace attr"
* "follo the comments"
* "rerun ci"
7 years ago
qiaolongfei
0b861bbca9
add profiler for listen_and_serv op
7 years ago
tensor-tang
059b27840c
Merge pull request #12408 from tensor-tang/refine/im2col
...
Refine CPU im2col padding with 1
7 years ago
qiaolongfei
147bf00ffe
clear mutable rows for the output of split_ids_op
7 years ago
qiaolongfei
91b114a787
change map to unordered_map
7 years ago
tensor-tang
d8d2dbcfac
further optimize im2col using variables
7 years ago
qiaolongfei
91f63cd401
fix split_ids_op and add unit test
7 years ago
tensor-tang
5373fe29c2
Merge remote-tracking branch 'ups/develop' into refine/im2col
7 years ago
Qiyang Min
7da453630e
Merge pull request #12403 from velconia/fix_hang_up
...
Fix grpc destroy bug
7 years ago
Tao Luo
5a634786af
Merge pull request #12312 from luotao1/unify
...
unify libpaddle_inference_api and libpaddle_fluid
7 years ago
Bai Yifan
e12b1d1792
Add flatten op ( #12341 )
...
* add flatten op
7 years ago
Luo Tao
062556f938
Merge branch 'develop' into unify
7 years ago
chengduo
2409d0f710
Refine regularization for selected_rows ( #12369 )
...
* refine regularization for selected_rows
* clean lookup_table
* refine rpc_server_test
* temporally disable rpc_server_test
* fix rpc_server_test
* add unit test
7 years ago
tensor-tang
687a322267
Merge remote-tracking branch 'ups/develop' into refine/im2col
7 years ago
tensor-tang
65d418f060
complete im2col with padding==1 and speedup filter width==1
7 years ago
minqiyang
053540e199
Add volatile to stopped_ member
7 years ago
minqiyang
b78ffde6d5
Add stopped sign for grpc client
7 years ago
tensor-tang
52eb86e30f
refine im2col benchmark
7 years ago
tensor-tang
3017f46076
add more test cases
7 years ago
tensor-tang
8d6be4fb5f
refine im2col test and add benchmark
7 years ago
tensor-tang
507c143047
im2col cfo cpu code clean
7 years ago
tensor-tang
4eeed0b5e4
refine width padding and enable core copy
7 years ago
Wu Yi
73fcfc06ec
refine conv cudnn enforce ( #12353 )
...
* refine conv cudnn enforce
* update
* update all cudnn ops
* fix
7 years ago
tensor-tang
e3131e2d73
enable width padding
7 years ago
Xin Pan
d7e08c53c2
Merge pull request #12169 from panyx0718/ir_graph_sort
...
construct a SSAGraph at the beginning.
7 years ago
tensor-tang
92518c519f
reuse sizes saving time
7 years ago
tensor-tang
660df122ce
enable padding!=0 and fill height padding with 0
7 years ago
tensor-tang
d8e00facf7
reuse im_size
7 years ago
tensor-tang
179dd0cb8a
Merge pull request #12337 from tensor-tang/refine/im2col
...
refine cpu im2col no padding
7 years ago
Luo Tao
5ba4337698
unify libpaddle_inference_api into libpaddle_fluid
7 years ago
tensor-tang
b72befc5cc
reuse copy size
7 years ago
Yancey
6133efd9ed
Merge pull request #12218 from Yancey1989/rpc_complete_interface
...
Add rpc complete interface
7 years ago
Zhaolong Xing
6169d724b9
Merge pull request #12324 from NHZlX/enhance_for_tensorrt_infer
...
Enhance for tensorrt infer
7 years ago
nhzlx
4d49e61ab8
fix comments
7 years ago
tensor-tang
6788af4bf1
refine test cases
7 years ago
tensor-tang
b163e601b6
add gtest
7 years ago
nhzlx
bcd67bdd71
add assert for GetOutput
7 years ago
tensor-tang
aae994fd26
refine im2col no padding
7 years ago
Yancey1989
fb06ed7bdc
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into rpc_complete_interface
7 years ago
Yu Yang
21387e3c2a
Tiny refines for lod_tensor_blocking_queue and reshape_op
7 years ago
nhzlx
f42ea48996
deal with conflict
7 years ago
nhzlx
940f5dbcac
modify the tensorrt engine op to adapt to change
7 years ago
Yan Chunwei
02cf54d331
bugfix lod cpu performance ( #12297 )
7 years ago
Qiao Longfei
b41f8b9d42
Merge pull request #12295 from jacquesqiao/speedup-reduce-sum-grad-op
...
Speedup reduce sum grad op
7 years ago
fengjiayi
eec412b230
Merge pull request #12273 from JiayiFeng/update_py_reader
...
Some enhancement on readers
7 years ago
Xin Pan
21a45420f0
polish and test
7 years ago
Qiao Longfei
95a2b5f56a
fix mac build of sendrecvop_utils ( #12272 )
7 years ago
qiaolongfei
273f737517
optimize code
7 years ago
Xin Pan
93355cc0d2
fix control deps
7 years ago
fengjiayi
ea8a375fa4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into update_py_reader
7 years ago
qiaolongfei
5d718a5886
optimize reduce_sum_grad op
7 years ago
Yancey1989
d4f51218ef
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into rpc_complete_interface
7 years ago
qiaolongfei
b643473d31
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-mac-build
7 years ago
fengjiayi
060f421797
Some enhancement on readers
...
1. Make the feeding thread of py_reader a daemon thread.
2. Update buffer_reader's destructor, fixing a bug.
3. Make pyreader demo script supporting CPU environment.
7 years ago
qingqing01
873a50ce35
Fix serious bug in nesterov momentum optimizer. ( #12231 )
...
* Fix serious bug in nesterov momentum optimizer.
7 years ago
Yan Chunwei
b42ced8eda
bugfix/tensorrt analysis fix subgraph trigger ( #12266 )
7 years ago
qiaolongfei
938390b38d
fix mac build of sendrecvop_utils
7 years ago
gongweibao
3a6213f493
Change grpc interface to compatible with brpc. ( #12164 )
7 years ago
Yu Yang
b06309381b
Merge pull request #12149 from reyoung/feature/combine_open_files_and_double_buffer
...
Change and polish readers
7 years ago
tensor-tang
be04fbff42
Merge pull request #12233 from tensor-tang/refine/mkl/gemm
...
add option split mkl gemm
7 years ago
Qiao Longfei
2b58c62aa0
Update auc op ( #12199 )
...
fix AUC op
optimize it's test
7 years ago
Yancey1989
efd5a84986
update executor interface
7 years ago
tensor-tang
fc2b578842
add gemm_warp test
7 years ago
tensor-tang
a916c52579
refine gemm
7 years ago
tensor-tang
961e754c9f
mkl split gemm for better perf
7 years ago
Yancey1989
ade6675490
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into rpc_complete_interface
7 years ago
yuyang18
e9c8d930a5
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/combine_open_files_and_double_buffer
7 years ago
Yancey1989
d0771cf912
update
7 years ago
Yancey1989
7570d8e77c
add rpc complete interface
7 years ago
yuyang18
8c70183ba6
Polish function names
7 years ago
yuyang18
b789a3a484
Change code
7 years ago
whs
8284947b82
Fix infershape of im2sequence. ( #12183 )
7 years ago
yuyang18
401e92f6e3
Change attr comment
7 years ago
yuyang18
be528f9815
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/combine_open_files_and_double_buffer
7 years ago
Tomasz Patejko
b2b8b15bfe
MKLDNN sum fix: remove in_place condition in loop creating memory primitives for sum
7 years ago
yuyang18
72b78154b2
Polish reader speed
7 years ago
Wu Yi
866fcb0c15
Merge pull request #12171 from typhoonzero/fix_pserver_with_condition_block
...
fix pserver with condition block
7 years ago
typhoonzero
32d81909dc
fix pserver with condition block
7 years ago
tensor-tang
d24fd2c6b1
Merge pull request #12099 from jczaja/prv-conv-grad-mkldnn-upstream2
...
MKLDNN: Extending Conv grad MKLDNN op with reusing MKLDNN primitives
7 years ago
yuyang18
e576345f5b
Try to speed up buffered reader
7 years ago
Wu Yi
c5619bbcde
fix auc op ( #12087 )
...
* fix auc
* update
* update
* fix compile
* fix param name
* add doc string
* fix test
7 years ago
Yancey
0042ba93c8
Merge pull request #12127 from Yancey1989/enforce_rpc_timeout
...
Enforce rpc timeout
7 years ago
yuyang18
61b3a5977f
Refine Python Reader
7 years ago
yuyang18
b048ddf0bd
Merge error
7 years ago
yuyang18
b8975d6842
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/combine_open_files_and_double_buffer
7 years ago
yuyang18
d36e13efd8
Merge branch 'feature/add_pyreader_demo' into feature/combine_open_files_and_double_buffer
7 years ago
yuyang18
1478a5fc0b
Make open_files use buffer
7 years ago
yuyang18
dc34effd35
Extract buffered reader
7 years ago
yuyang18
392318045f
Merge branch 'feature/dctor_all_readers' into feature/combine_open_files_and_double_buffer
7 years ago
yuyang18
fecbe52200
Rewrite open_files
7 years ago
Yu Yang
ba997b8ccd
Merge pull request #12097 from reyoung/feature/hide_api_cont
...
Hide internal API of LoDTensors, Clipping, etc.
7 years ago
yuyang18
c680bc1d7f
Rewrite DoubleBuffer
7 years ago
yuyang18
c9cf2bdb9c
Dctor cache
7 years ago
yuyang18
ee7d8b4d66
Refine Shutdown Impl
7 years ago
Jacek Czaja
8e20d36bc8
- comment update
7 years ago
Jacek Czaja
c981222b3b
- Conv MKLDNN grad op reuse of mkldnn primitives
7 years ago
tensor-tang
f0cd493c0d
Merge pull request #11989 from tensor-tang/feature/libxsmm
...
introduce libxsmm
7 years ago
Sylwester Fraczek
4d55aca40e
reserve vector space before loop in top-k
7 years ago
Yu Yang
ebe3b5e78a
Merge pull request #11853 from sneaxiy/complete_py_reader_python
...
Add Python Reader Op (Python side and unittests)
7 years ago
Yancey1989
4a91a14549
enforce rpc client timeout
7 years ago
Guo Sheng
da3f766821
Merge pull request #12088 from guoshengCS/complete-hsigmoid
...
Complete hsigmoid_op
7 years ago
sneaxiy
31c7f6b968
Merge branch 'develop' into complete_py_reader_python
7 years ago
fengjiayi
6ff7f2380c
Merge pull request #12063 from reyoung/feature/exception_safe_pe
...
Make scope_buffered_ssa_graph_executor Exception safe
7 years ago
tensor-tang
2f7b09319a
Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
guosheng
4ee069fdba
Fix the HierarchicalSigmoidGradOpKernel and refine the codes. Now hsigmoid_op is same with V2 implementation and can pass gradient check.
7 years ago
yuyang18
c87e08c28d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/exception_safe_pe
7 years ago
chenweihang
938319bbd2
Merge branch 'develop' into unsqueeze_op
7 years ago
Yibing Liu
092d620187
Merge pull request #11812 from chenwhql/squeeze_op
...
Add squeeze operator and unit testing
7 years ago
tensor-tang
1c5d6c5692
disable xsmm with float16
7 years ago
tensor-tang
c9ba51ead8
Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
tensor-tang
64a8e6d20e
refine the threshold functions
7 years ago
Tao Luo
c620c522d7
Merge pull request #12093 from Noplz/fix_warning
...
fix warning
7 years ago
lemon34
29145e1e31
change im2sequence for ctc batch inference ( #11696 )
...
* change im2sequence for ctc batch inference
* Update im2sequence_op.cc
* change im2sequence for ctc batch inference
* update
* change PR by comment
* fix ocr test error
* fix test_im2sequence
* modify the old name to standard name
* fix test_layers failed
7 years ago
Noplz
cfa4479b06
fix warning
7 years ago
tensor-tang
32822b2a59
Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
chenweihang
b8ea7a081a
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
Jacek Czaja
fbe25ef510
MKLDNN: Extending Conv MKLDNN op to reuse MKLDNN primitives ( #11750 )
...
* - Rebase of conv reuse
- clag formatter fixes
- Fix to conv reuse
- Yet another fix
- Fix
- Fix
- clagn format
* - comment update
7 years ago
baiyf
be2d9dc2b8
Add prior_box output order control ( #12032 )
...
* Add flag to set prior_box output order.
7 years ago
guosheng
e7f7ba97fe
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into complete-hsigmoid
7 years ago
guosheng
e7a4cfc0ff
complete the hsigmoid_op
7 years ago
chenweihang
84a525a38a
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
sneaxiy
f85e16f1de
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into complete_py_reader_python
7 years ago
chenweihang
0ea468225b
docs: fix some errors of description
7 years ago
chenweihang
fbef49e772
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang
3d15968958
docs: fix some errors of description
7 years ago
achao2013
8e4b225fe4
Add fake_quantize_op. ( #11359 )
...
* Add a fake_quantize_op, which quantize an input tensor to a tensor with lower bits.
7 years ago
Yuan Gao
50aa6ba6f5
add rpn target assign op ( #11449 )
...
* Add region proposal network (RPN) target assign operator and Python API for Faster-RCNN.
7 years ago
chenweihang
2bd65dbf71
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
chenweihang
fd01a43a3c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
tensor-tang
7bb67b6788
Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
chenweihang
cef8dbc1f7
refine some messages and adjust data type
7 years ago
chenweihang
05eafcca73
refine some messages and adjust data type
7 years ago
minqiyang
fceaabdd81
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_grpc_destroy_bug
7 years ago
guosheng
d695381677
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into complete-hsigmoid
7 years ago
yuyang18
3aaf798182
Refine size_t and int
7 years ago
fengjiayi
26ae6111d1
Merge pull request #12051 from JiayiFeng/dev_reader_ResetAll
...
[WIP] Dev reader reset all
7 years ago
qingqing01
10fbb831ed
Skip BatchNorm when feature only has 1 element. ( #11578 )
...
* Fix batch norm when there is only 1 element in the normalized dimension during training.
7 years ago
chenweihang
8f2486ca16
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
tensor-tang
6bc1aaaac7
refine the ColMajor replacement
7 years ago
tensor-tang
c3862a7519
Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
chenweihang
d552b900f0
change the copyright year from 2016 to 2018
7 years ago
qingqing01
ef4895df3b
Make IfElse operator work and fix unit testing. ( #11972 )
...
1. Fix bug when only true or false branch works.
2. Fix bug in unit testing.
7 years ago
tensor-tang
de856da9a6
fix ColMajor and RowMajor replacement
7 years ago
tensor-tang
00ee6c3c17
Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
fengjiayi
6d6f49cd56
Merge remote-tracking branch 'yuyang/feature/decorated_reader_chain' into dev_reader_ResetAll
7 years ago
chenweihang
7526eaaf13
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang
4453473f71
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
chenweihang
1721613f1e
simplify construct function
7 years ago
fengjiayi
611716e9bc
Merge branch 'dev_reader_shutdown_start' of https://github.com/JiayiFeng/Paddle into dev_reader_shutdown_start
7 years ago
fengjiayi
0e9f1e2790
Make ReaderBase thread safe and remove ThreadedReader
7 years ago
yuyang18
e8ee9dc7f8
Several Polish
7 years ago
chenweihang
5f89272c89
change the bit insert to array insert for understandability
7 years ago
fengjiayi
b4f0e57956
fix errors
7 years ago
Tao Luo
436bb4500b
Merge pull request #11699 from pzelazko-intel/pzelazko/workaround-for-missing-mklnn-kernels
...
workaround for no MKLDNN kernel
7 years ago
fengjiayi
6fc6cc2f4c
Some updates on readers
...
1. Shrink DoubleBufferReader's buffer size to 3.
2. Add BatchReader an option to discard leftover instances.
3. Fix a MultiPassReader bug on pass count.
7 years ago
fengjiayi
5528f59900
Split ReInit() to Shutdown() and Start()
7 years ago
fengjiayi
de9a411f1c
adjust readers' inheritance relationships
...
1. Make PyReader and RandomDataGenerator inherited from FileReader.
2. Remove the member variable 'dims_' and related checks in FileReader.
7 years ago
yuyang18
c48c586aca
Use weak_ptr to implement DecoratedReaderChain
7 years ago
minqiyang
1377b332bc
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_grpc_destroy_bug
7 years ago
chenweihang
fccdc1abea
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang
62a17f5053
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang
80126a7496
small fix based on reviewer's advice
7 years ago
yuyang18
8e86721fe7
Fix data balance on single GPU
7 years ago
tensor-tang
21516e5cbe
add unit test of smm
7 years ago
tensor-tang
c3941745b3
add libxsmm_gemm
7 years ago
minqiyang
2c4fb585db
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_grpc_destroy_bug
7 years ago
minqiyang
0d04545e9c
Remove debug info
7 years ago
chenweihang
9ca8db237a
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
tensor-tang
7782a4ab53
fix blas build issue
7 years ago
tensor-tang
17987eb3fc
link libxsmm
7 years ago
minqiyang
207d1b81fe
Add fixed grpc
7 years ago
tensor-tang
3df99e72ab
Merge remote-tracking branch 'ups/develop' into refine/set_num_threads
...
fix conflicts
7 years ago
dzhwinter
4ed0b62476
Move fluid::framework::InitDevices into fluid::platform ( #11757 )
...
* move to platform
* "move init from framework to platform"
* "remove used init"
* "fix ci"
* "fix ci"
* "fix generic"
* "fix ci"
* "fix ci"
* "fix ci"
* "disable fragile test"
7 years ago
dzhwinter
99a99ec7e3
"remove lapack" ( #11966 )
7 years ago
chenweihang
a6d94e8dc6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang
49b2cf5fee
adjust some code based on reviewer's advice
7 years ago
sneaxiy
9b28260029
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into complete_py_reader_python
7 years ago
sneaxiy
739c330914
fix merge conflict
7 years ago
fengjiayi
ce16b40b04
Merge pull request #11891 from JiayiFeng/dev_eof_exp
...
Add EOFException to represent EOF in C++ reader
7 years ago
chenweihang
79333fa7b8
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang
ca15779394
rewrite, use reshape op in unsqueeze op, test passed
7 years ago
Xin Pan
71b1c397d7
Merge pull request #11874 from panyx0718/move_trainer
...
Move trainer and utils api
7 years ago
Xin Pan
d70a38d8ec
fix
7 years ago
yuyang18
c31519036b
Merge branch 'squeeze_op' of https://github.com/chenwhql/Paddle into pr/11812
7 years ago
yuyang18
1854814d49
Use reshape_op inside squeeze_op
...
* also convert tab to space
7 years ago
Xin Pan
94cb59ad09
hide utils to legacy
7 years ago
chenweihang
ee760d1c2d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into squeeze_op
7 years ago
chenweihang
0cef33a468
adjust the dims range to [1,6] and fix some problem
7 years ago
Yancey
f7fd711e3f
Merge pull request #11868 from Yancey1989/dist_pass_barrier
...
add dist pass barrier
7 years ago
yuyang18
3777f10286
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into pr/11812
7 years ago
Yu Yang
9401b64d61
Merge pull request #11877 from reyoung/feature/fix_reshape_op_size
...
User can register a standard C++ functor as Kernel
7 years ago
chenweihang
996c157f61
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang
e402496238
complete unsqueeze op and related unittest.
7 years ago
fengjiayi
3fab4f65a4
Add EOFException to represent EOF in C++ reader
7 years ago
minqiyang
1d6ecd3c4e
Change grpc version to 1.13.x
7 years ago
yuyang18
550ab8d723
Use a single file rather than multiple files
7 years ago
Paweł Żelazko
ac323343a0
typos fix
7 years ago
yuyang18
6038a63120
Fix fc mkldnn op
7 years ago
yuyang18
82866d4a18
Add register kernel functor and shrink reshape op
...
* Shrink reshape_op library size
* User can register a standard C++ functor as a op kernel
7 years ago
fengjiayi
58560622bc
Merge pull request #11854 from JiayiFeng/dev_data_balance
...
Data balance for the ParallelExecutor
7 years ago
yuyang18
1ce478f100
Polish reshape op
7 years ago
Yancey1989
37410a0c75
update by comment
7 years ago
chenweihang
9ca88fa8a5
Adjust squeeze op and code the unittest, test passed
7 years ago
sneaxiy
3f9292c6e6
fix merge conflict
7 years ago
sneaxiy
dd70fb4393
fix type comparison bugs
7 years ago
Xin Pan
982dabe293
Merge pull request #11866 from panyx0718/move_func
...
Move some v2 codes to a legacy directory.
7 years ago
Xingyuan Bu
5056d3ec56
FasterRCNN Anchor Generator Op ( #11218 )
...
* Add anchor generator operator for Faster-RCNN.
* Add unittest testing.
* Add Python API.
7 years ago
Yibing Liu
5f79c7fbb6
Merge pull request #11174 from kuke/argsort_dev
...
Add the argsort operator
7 years ago
Yancey1989
029425a5f4
update
7 years ago
Yancey1989
c1ab215e26
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into dist_pass_barrier
7 years ago
Yancey1989
1366832a41
add dist pass barrier
7 years ago
Xin Pan
a9086bf320
also move a few other dir to legacy/
7 years ago
gongweibao
66c91911cf
Improve brpccmake ( #11842 )
7 years ago
Yibing Liu
9386ac0a40
Enhance cuda code & unittest for argsort_op
7 years ago
guochaorong
c318aa5ffa
Merge pull request #11850 from guochaorong/revert_11496
...
Revert "Extend fill_zeros_like_op for zero-filling an LoDTensorArray …
7 years ago
fengjiayi
49a04d75ee
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_data_balance
7 years ago
fengjiayi
4b950951d3
Add unittests and fix a few bugs
7 years ago
chenweihang
a1e7f2d520
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into unsqueeze_op
7 years ago
chenweihang
70729ad641
Add Unsqueeze Operator Framework, not finished
7 years ago
guochaorong
6a35899131
Revert "Extend fill_zeros_like_op for zero-filling an LoDTensorArray ( #11496 )"
...
This reverts commit bc28cf613f.
7 years ago
chenweihang
298e74da1e
add squeeze op c++ part, compile success
7 years ago
fengjiayi
5b4f283069
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_data_balance
7 years ago
fengjiayi
b6dc3a59f1
Add DataBalanceOpHandle to MultiDeviceSSAGraph
7 years ago
mozga-intel
b8a04c2fa1
Duplicated code was moved to common function
7 years ago
mozga-intel
3b128337a1
The mkldnn batch norm supports other data format
7 years ago
Xin Pan
2ecc56226d
small AverageOptimizer enhance. ( #11761 )
...
* small AverageOptimizer enhance.
* clean
* clean
7 years ago
Yan Chunwei
5082642bdb
feature/analysis to support sub-graph for TRT engine ( #11538 )
7 years ago
Haichao Zhang
bc28cf613f
Extend fill_zeros_like_op for zero-filling an LoDTensorArray ( #11496 )
...
* Add fill_zeros_array op. This op is used for zero-filling an LoDTensorArray.
* merge fill_zeros_array_op with fill_zeros_like_op
* add unit_test for fill_zeros_like for array
7 years ago
Qiao Longfei
593bbfe392
Merge pull request #11765 from jacquesqiao/fix-adam-op-for-selectedrows
...
fix adam op for selected rows
7 years ago
qiaolongfei
20fae68136
adam op handle grad.rows().size == 0 condition
7 years ago
pzelazko-intel
9a15c92317
bnorm+relu fuse for mkldnn (inference) ( #11434 )
...
* bnorm+relu fuse for mkldnn
* separate fuse_relu function
* bug fix
* proper while range in inference_transpiler
* description fix
* review fix
* review fix
* unit test for fwd batch norm+relu MKLDNN fuse
7 years ago
baiyf
778b71fc93
Optimize bipartite_match_op in large scale input ( #11730 )
...
* optimize bipartite_match_op in large scale input
7 years ago
qiaolongfei
df7a266ae2
fix adam op for selected rows
7 years ago
tensor-tang
e3a96300bb
move SetNumThreads to platform
7 years ago
qingqing01
b756063ce7
Speed depthwise transposed conv2d. ( #11740 )
...
* Speed depthwise transposed conv2d.
7 years ago
Qingsheng Li
8630ba2eb1
Fix sequence expand op ( #11618 )
...
* Set zero outside functor
7 years ago
sneaxiy
01fbcb0bbb
Merge pull request #11695 from sneaxiy/complete_py_reader_cpp
...
Add Python Reader Op (CPP side)
7 years ago
Guo Sheng
8df303c09b
Merge pull request #11238 from guoshengCS/fix-beam_search
...
Fix and enhance beam_search_op and beam_search_decode_op
7 years ago
guosheng
d15b2e02c8
Fix copying empty tensor in beam_search_decode_op
7 years ago
sneaxiy
d4d946db5a
update blocking queue
7 years ago