fengjiayi
ce182d9037
bug fix
7 years ago
Xin Pan
a2c0e52f3e
speed up while_op
7 years ago
typhoonzero
dd7a79158b
add scope info in graphviz debug
7 years ago
tensor-tang
6f78fd7d1e
fuse fc in gru
7 years ago
tensor-tang
300180cc26
init fusion gru op
7 years ago
Zhaolong Xing
21ba32b065
Merge pull request #12843 from NHZlX/fix_ssa_bug_for_trt
...
fix ssa bug with batch_norm and refine the trt
7 years ago
Michał Gallus
cd32ddac12
Fuse Convolution and Eltwise Add into MKLDNN's Conv+Bias ( #12669 )
...
* Fuse Convolution and Eltwise Add into Conv+Bias
* Reduce bias branching at conv_mkldnn_op
* Add MKLDNN build checks for Conv Bias
* Conv-bias: check if bias input exists before assignment
* Conv-bias: Remove Bias dim check from infershape
It was causing conv3d test to crash upon calling HasInput(Bias)
7 years ago
nhzlx
c999895e93
merge develop
7 years ago
nhzlx
276950291a
1. fix ssa bug with batchnorm, 2. refine the trt
7 years ago
Yan Chunwei
896a37b6e3
fea/link ir to inference analysis and fc fuse support ( #12789 )
...
* link IR graph to analysis graph
* add clean code and update
* add infer_clean_pass
* add ir_pass_manager
* support fc fuse execution
* fix ir circle
7 years ago
dzhwinter
e23ddf6ae4
status ( #12764 )
7 years ago
Tao Luo
d04ef276a5
Merge pull request #12745 from tensor-tang/refine/op/elewise_mul
...
Refine elementwise mul cpu forward
7 years ago
tangwei12
cbc6e6eb97
Merge pull request #12247 from seiriosPlus/dis_ckpt_fix
...
add load slice_vars in io.py
7 years ago
Qiyang Min
72965226e6
Merge pull request #12818 from velconia/fix_python3_CI_job
...
Fix python3 CI job
7 years ago
minqiyang
656c77e712
Resume cicheck
7 years ago
minqiyang
e1492f19e1
Change the sequence of ci check
7 years ago
tangwei12
44bade8b17
fix api spec
7 years ago
Zhaolong Xing
470335e8c4
Merge pull request #12786 from NHZlX/add_batch_norm_trt_converter
...
Add batch norm trt converter
7 years ago
Qingsheng Li
3d11d018e0
Fix scatter_op python API ( #12742 )
...
* Fix scatter_op python API and remove inconsistency between implementation and doc
* API spec change
* Change as review comment
7 years ago
nhzlx
ff052c0e6f
merge develop
7 years ago
nhzlx
c6a5c4b0c0
add comments for execute in ut_helper
7 years ago
minqiyang
50d66a0790
Fix prelu_op
7 years ago
minqiyang
beb93bb901
Fix ut bug for graph_test
...
Port dist_transpiler new added codes
Port ut for clone desc
7 years ago
Tao Luo
8f9f414a14
Merge pull request #12805 from tensor-tang/fix/op/elewise_add
...
fix SEGV element wise add at debug mode
7 years ago
tensor-tang
e955361267
Merge pull request #12737 from tensor-tang/feature/op/fusion_lstm
...
add fusion lstm
7 years ago
tensor-tang
82bb9170fb
Merge remote-tracking branch 'ups/develop' into fix/op/elewise_add
7 years ago
tangwei12
99f74be561
Merge pull request #12802 from seiriosPlus/inference_teeny_mistakes
...
fix some teeny mistakes
7 years ago
Tao Luo
2ae885e224
Merge pull request #12811 from luotao1/tensorrt_compiler_bug
...
fix tensorrt compiler bug
7 years ago
Chen Weihang
57b34d9196
Merge pull request #12808 from chenwhql/remove_inplace_param_in_squeeze_and_unsqueeze
...
Refactor: remove inplace parameter from squeeze and unsqueeze op
7 years ago
Xin Pan
daf464af68
Merge pull request #12807 from panyx0718/fix
...
fix program_desc constructor
7 years ago
luotao1
808e5b1748
fix tensorrt compiler bug
7 years ago
Yihua Xu
084d4a9e9e
Optimize CRF Decoding with AVX/AVX2/AVX512F instruction ( #12767 )
...
* Optimize CRF decoding with AVX/AVX2 instruction
* Enable the AVX2 flags for compiling
* Clean the code and decrease the count of multiply calculation
* Add the support of AVX512 instruction to optimize CRF Decoding
* Clean the code
* Enable the AVX512f flags for compiling
* Clean the code for the invaluable switch
* Fixed the issue to check AVX512F status
* Clean the code
* Add some explanation of the key points
7 years ago
fengjiayi
34b209cffa
Complete sequence_padding GPU kernel
7 years ago
dzhwinter
00463fdfe3
cudnn windows support ( #12757 )
...
* cudnn windows
* "add comment"
* "windows support"
* "fix cmake error"
7 years ago
Xin Pan
4a4c469f61
add test
7 years ago
qingqing01
c62f68cb94
Fix bug in conditional_block_op. ( #12246 )
...
* Fix bug in conditional_block_op.
* Fix bug and add comments.
* Rename arguments.
7 years ago
nhzlx
1bf9d9e90c
fix comments
7 years ago
chenweihang
bc471b6ac4
refactor: remove inplace parameter from squeeze and unsqueeze op
7 years ago
Xin Pan
7473d5f735
fix program_desc constructor
7 years ago
tensor-tang
0507f7bc3c
fix SEGV elementwise add at debug mode
7 years ago
tangwei12
cfb12f09bf
fix some teeny mistakes
7 years ago
Yu Yang
c6af7201e9
Merge pull request #12692 from reyoung/feature/fast_executor
...
Feature/fast executor
7 years ago
Xin Pan
e525aa232e
Merge pull request #12780 from panyx0718/ir4
...
fix ProgramToGraph
7 years ago
Tao Luo
7decbaaa13
Merge pull request #12762 from luotao1/anakin_cuda_env
...
disable anakin when cuda < 8.0 or cudnn < 7.0
7 years ago
nhzlx
324dd16816
merge develop
7 years ago
yuyang18
b8029fd650
Follow comments
7 years ago
tangwei12
ca1e18c04a
Merge pull request #12469 from seiriosPlus/sum_op_dim_fix
...
sum_op selectedRows dim bug fix
7 years ago
Xin Pan
1d3343240e
fix
7 years ago
nhzlx
144b20c160
add batch norm op converter
7 years ago
nhzlx
14311bb094
merge develop
7 years ago
Zhaolong Xing
e5674f6dde
Merge pull request #12753 from NHZlX/add_benchmark
...
modify tensorrt engine op from cpu mode to gpu
7 years ago
Zhaolong Xing
310708726b
Merge pull request #12761 from NHZlX/global_pooling_trt
...
Add support for global pooling for trt
7 years ago
tensor-tang
b090479409
Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
nhzlx
1e92baf746
fix comments
7 years ago
Xin Pan
17b88811e0
fix ProgramToGraph
...
when while_grad, it writes multiple @EMPTY@ with no VarDesc.
7 years ago
tangwei12
b4f52b01d0
bug fix when all inputs are empty
7 years ago
tangwei12
3efac174ea
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tangwei12
dbb4f0d35d
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into dis_ckpt_fix
7 years ago
Qiao Longfei
fd10669ecb
Add dependency to send recv ( #12760 )
...
Add dependency to send recv
7 years ago
nhzlx
ce7f361a80
fix comments
7 years ago
Xin Pan
a9217031ba
small fix
7 years ago
fengjiayi
8d8d48a34f
Complete sequence_pad_op and its CPU kernel. Add unittests
7 years ago
nhzlx
df9cbabcee
add pool2d test for global_pooling true
7 years ago
dzhwinter
2673798ddb
"fix float16 ShuffleDownSync Bug" ( #12756 )
...
* "fix bug"
* "add test case"
7 years ago
Yan Chunwei
6fe5547db7
switch NodeAttr to boost::variant ( #12539 )
7 years ago
Chen Weihang
535a6e9206
Merge pull request #12509 from JiabinYang/scripts0802
...
fix the paddle script causes 'command not found' error
7 years ago
nhzlx
133ec69625
add batch norm trt converter
7 years ago
tangwei12
7c12c0f865
add sync in load selectedrows
7 years ago
luotao1
413bf9d494
disable anakin when cuda < 8.0 or cudnn < 7.0
7 years ago
Michal Gallus
4a7f0698e0
Add consts to new MKLDNN integration
...
Also replace memory types from int64_t to size_t
7 years ago
Michal Gallus
6588d0e039
Update MKLDNN to 0.15, fix conv integration
7 years ago
tangwei12
9f11db4080
add todo in impl
7 years ago
tangwei12
40febec402
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into dis_ckpt_fix
7 years ago
tangwei12
c24a9263ba
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
Qiao Longfei
03d4c7efd3
add rw lock test ( #12752 )
...
* add rw lock test
* optimize read_write and write_read test
7 years ago
dzhwinter
f36818d532
"windows testing easier" ( #12739 )
7 years ago
nhzlx
2bdd20be22
add support for global pooling for trt
7 years ago
tangwei12
ac9ae97001
code fix
7 years ago
nhzlx
f55e8901c8
merge develop
7 years ago
nhzlx
1600ba86f6
1. change tensorrt op from cpu to gpu
7 years ago
tangwei12
bb9f494740
merge develop
7 years ago
tangwei12
eba7177475
add unit test and code fix
7 years ago
dzhwinter
4069262f0e
Revert ""cherry picked operators changes" ( #12184 )" ( #12747 )
...
This reverts commit bf3c34960f.
7 years ago
Qiao Longfei
653fad08f8
Optimize selected rows for dist lookup table with pthread rwlock ( #12635 )
...
Optimize selected rows for dist lookup table with rwlock
7 years ago
Qiao Longfei
64d48f4d6a
fix mac compile ( #12751 )
7 years ago
fengjiayi
3c749fae43
update CPU sequence_padding functor
7 years ago
tensor-tang
92890ac258
Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tangwei12
0749c8822d
Merge pull request #12556 from seiriosPlus/samplingIdOp
...
Sampling id op
7 years ago
Qiyang Min
340a104c58
Merge pull request #12658 from velconia/port_pybind11
...
Port pybind11 and python code to support py3 CI test
7 years ago
tensor-tang
a56142c155
optimize elementwise_mul cpu forward
7 years ago
tensor-tang
6644ce79a5
add mklml vmul
7 years ago
tensor-tang
ff92b6ba81
Merge pull request #12531 from tensor-tang/refine/op/gru
...
Refine gru cpu forward
7 years ago
tangwei12
26b228e405
remove assignment and add vlog
7 years ago
Chen Weihang
d4d8f83137
Merge pull request #12633 from chenwhql/demangle_type_name
...
Error message refine: add demangle api to attribute type
7 years ago
Chen Weihang
1e961b145c
Merge pull request #12591 from chenwhql/enforce_msg_polish
...
polish high frequency enforce error message
7 years ago
tangwei12
125e9166e1
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tensor-tang
a72f68f223
Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tensor-tang
df28a3b452
fix lod and op test
7 years ago
Tao Luo
17da113c87
Merge pull request #12693 from luotao1/anakin_bug
...
fix specific cudnn include and library path
7 years ago
Qingsheng Li
317e18abd2
Remove Data Sharing between input and output in scatter_op ( #12672 )
...
* Remove Data Sharing between input and output in scatter_op
* Removed data sharing in backward op
7 years ago
tensor-tang
f3cd2612ae
refine fc and use the fc compute in fusion_lstm
7 years ago
qingqing01
c44fb00371
Add name in relu and log API. ( #12438 )
7 years ago
luotao1
9f3789944c
use latest anakin commit
7 years ago
tangwei12
822496f626
merge cpu and gpu
7 years ago
dzhwinter
bf3c34960f
"cherry picked operators changes" ( #12184 )
...
* "cherry picked operators changes"
* "remove duplicated code"
* "add constant setter"
* "add get expected kernel"
* "fix ci"
* "add fill constant"
7 years ago
tensor-tang
40138c4cd6
add unit test of fusion lstm op
7 years ago
jerrywgz
c108376506
Add three modes for prelu_op ( #12630 )
...
* Add three modes for prelu_op.
7 years ago
tangwei12
9f09d68678
add enforce
7 years ago
gongweibao
d06849305a
parameter dispatcher. ( #12666 )
7 years ago
tensor-tang
852bc6f4aa
refine fusion lstm op doc
7 years ago
tensor-tang
8f9132959e
fuse fc in lstm
7 years ago
tensor-tang
ddb05dffb6
init fusion lstm op
7 years ago
tensor-tang
efc5392d97
Merge pull request #12676 from tensor-tang/refine/op/fc
...
refine fc op
7 years ago
minqiyang
a32ce8c444
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
Yan Chunwei
5d2834fcf7
fea/ir support fuse, based on graph pattern detection helper ( #12636 )
7 years ago
tangwei12
470fb7c5c3
bug fix
7 years ago
minqiyang
0d7047ca79
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
yuyang18
d1d825ee02
Hide unnecessary API
7 years ago
yuyang18
265302edea
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/fast_executor
7 years ago
tangwei12
60dda7bf9f
add gpu Implementation
7 years ago
tangwei12
4661f5589d
random optimize
7 years ago
Wu Yi
bd87f67f0e
Dist transpile can pass startup program by argument ( #12606 )
...
* dist transpile can pass startup program by argument
* update API.spec
7 years ago
Bai Yifan
9333a62792
Add flatten op interface and enhance APIs about detection to support variable-length image. ( #12422 )
...
* add flatten api&enhance detection api
* unify shape_op data type
* update API.spec
7 years ago
tensor-tang
eee38464dc
refine fc op use cpu only
7 years ago
tangwei12
ed937bc6f8
merge
7 years ago
fengjiayi
f276006f0c
Merge pull request #12694 from JiayiFeng/dev_op_tensor_support
...
Make cross_entropy_op supporting tensor
7 years ago
Yu Yang
a197737c02
Merge pull request #12690 from reyoung/feature/better_exception_holder
...
Polish API of exception holder
7 years ago
Yan Chunwei
e765dead86
add profiler to fluid inference ( #12707 )
7 years ago
tensor-tang
d84a1a0010
fc op use cpu only
7 years ago
tensor-tang
fbc164047d
Merge remote-tracking branch 'ups/develop' into refine/op/fc
7 years ago
Xin Pan
d96ee24f0b
Merge pull request #12697 from panyx0718/ir2
...
test and doc IR Graph
7 years ago
minqiyang
77f12e000f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
tangwei12
f56102505a
add pserver_endpoints args in load_inference_model
7 years ago
fengjiayi
a38a8db928
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op
7 years ago
tangwei12
478f73c188
merge header in cc
7 years ago
fengjiayi
d6b5302bd6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
Yan Chunwei
0a641ba326
add ratio to profiler ( #12701 )
7 years ago
tensor-tang
c588c64a76
Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
tensor-tang
0098a494a2
Merge remote-tracking branch 'ups/develop' into refine/op/fc
7 years ago
gongweibao
842fb021b3
Fix clone() bug. ( #12583 )
7 years ago
Qiao Longfei
5d579e1a96
add export_for_deployment flag to save_inference_model ( #12582 )
...
add export_for_deployment flag to save_inference_model
7 years ago
chenweihang
7797e55f42
use paddle::platform::demangle
7 years ago
minqiyang
e0d5f8a820
Move compat module to python/paddle
7 years ago
chenweihang
da39d84a48
refine by reviewer's advice
7 years ago
Xin Pan
891c3c0f9a
test and doc IR Graph
7 years ago
minqiyang
7e0f66e99a
Polish code
7 years ago
minqiyang
5338417b47
Polish code style
7 years ago
minqiyang
ae39709e59
Polish code
7 years ago
minqiyang
55d7f55c63
Revert the changes to attribute.h
7 years ago
fengjiayi
5e7aa8c7e5
code clean
7 years ago
chenweihang
21d5b94228
error message refine: add demangle api to attribute type
7 years ago
minqiyang
1800fef142
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
tensor-tang
742300baa8
fix unknown omp pragmas
7 years ago
yuyang18
05cadf1b24
Add FastExecutor
7 years ago
tensor-tang
b9dbb7c5cb
fix bias attri in mkldnn fc
7 years ago
yuyang18
c6eb7a89ff
Merge branch 'feature/better_exception_holder' into feature/fast_executor
7 years ago
yuyang18
aac80ef4cc
Polish API of exception holder
7 years ago
yuyang18
d49763a87d
Stash
7 years ago
tangwei12
59580a7f69
bug fix
7 years ago
Zhaolong Xing
83c85f34e8
Merge pull request #12598 from NHZlX/add_tensorrt_softmax
...
Add tensorrt softmax
7 years ago
Tao Luo
1e1974c998
Merge pull request #12563 from luotao1/anakin_test
...
* make inference_anakin_test SERIAL
* add anakin compiler from github source code
* fix inference_lib_dist error
* add comment
* update anakin.cmake
* fix anakin-NOTFOUND compiler error
* modify the anakin_model download dir
7 years ago
tensor-tang
a85bf42ae4
Merge pull request #12681 from PaddlePaddle/revert-12554-refine_elementwise_add
...
Revert "Refine elementwise_add op"
7 years ago
tensor-tang
4b5986bb77
enable fc op in normal case
7 years ago
Wu Yi
8b77448d5f
hide misc APIs ( #12540 )
...
* hide misc APIs
* update
* fix transformer test
* update API.spec
7 years ago
tensor-tang
e133df6037
enable native fc forward
7 years ago
tensor-tang
6a2a9a8350
Revert "Refine elementwise_add op"
7 years ago
minqiyang
68b221401d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
Yu Yang
8dda526a45
Merge pull request #12659 from sneaxiy/refine_softmax_with_cross_entropy
...
Fix 'softmax_with_cross_entropy_op' dependency error
7 years ago
sneaxiy
f6f5cdaa05
Merge pull request #12555 from sneaxiy/refine_layer_norm
...
Refine layer_norm op
7 years ago
sneaxiy
c50c537732
fix arithmetic error in backward kernel
7 years ago
tensor-tang
038cbf799d
add bias for fc op
7 years ago
whs
9d6243b6fb
Fix crop op. ( #12603 )
...
* Fix infer shape of crop op.
* Speed crop op.
7 years ago
Bai Yifan
649f5d74f0
fix mine_hard_example bug ( #12664 )
7 years ago
Tao Luo
51cc80cca0
Merge pull request #12662 from tensor-tang/fix/xbyak
...
fix missing macro condition
7 years ago
sneaxiy
2d9508f8f3
Merge pull request #12554 from sneaxiy/refine_elementwise_add
...
Refine elementwise_add op
7 years ago
tensor-tang
171a0e2b42
add some comment
7 years ago
tensor-tang
1ab1d03c62
fix missing macro condition
7 years ago
Xin Pan
6b45c5a134
Merge pull request #12605 from panyx0718/ir
...
code clean up and renaming
7 years ago
sneaxiy
2c560623d1
fix dependency error
7 years ago
Qiao Longfei
331151f065
Merge pull request #12647 from jacquesqiao/add-RPCServerProfiler
...
add RPCServerProfiler, replace listen and serv optimizer
7 years ago
Qiao Longfei
e8fcb71bed
Merge pull request #12620 from jacquesqiao/timeline-support-pure-cpu
...
Timeline support pure cpu
7 years ago
minqiyang
e4057d071b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
tensor-tang
5377edd282
refine packed condition
7 years ago
tensor-tang
3bf3e77ac8
Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
qiaolongfei
5a6c3cd9e0
fix profiler dead lock
7 years ago
Tao Luo
16b65c559d
Merge pull request #12646 from tensor-tang/feature/jit/xbyak
...
introduce xbyak
7 years ago
chengduo
64824ac73f
Add write after write dependence ( #12632 )
...
* Add write after write
* follow comment
7 years ago
qiaolongfei
c0890988da
add RPCServerProfiler, replace listen and serv optimizer
7 years ago
tensor-tang
a50889f523
introduce xbyak
7 years ago
qiaolongfei
3f2aa91970
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu
7 years ago
tangwei12
64a4925cb4
Merge branch 'Pdv' into samplingIdOp
7 years ago
tangwei12
0bfd62be3d
remove gpu supported, will add it later
7 years ago
luotao1
2ea110cd4a
Merge branch 'develop' into anakin_test
7 years ago
luotao1
a222d336ca
modify the anakin_model download dir
7 years ago
luotao1
22bc328951
fix anakin-NOTFOUND compiler error
7 years ago
luotao1
b2367f3661
update anakin.cmake
7 years ago
Qiyang Min
29fac3c092
Merge pull request #12390 from velconia/port_python3_syntax
...
Apply 2to3 to current paddle main python code
7 years ago
Tao Luo
5a9ae411e0
Merge pull request #12618 from sfraczek/sfraczek/fix-new-mkldnn-conv-tests
...
fix UT for mkldnn 0.15
7 years ago
qiaolongfei
d080d3e694
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu
7 years ago
sneaxiy
cf799a6a04
Merge pull request #12553 from sneaxiy/refine_softmax_with_cross_entropy
...
Refine softmax_with_cross_entropy op
7 years ago
xzl
29ad9794bb
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_softmax
7 years ago
luotao1
f4bcee1d6f
Merge branch 'develop' into anakin_test
7 years ago
luotao1
94042ccd2d
add comment
7 years ago
dzhwinter
8499559c42
"fix style" ( #12600 )
7 years ago
sneaxiy
010883689c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_layer_norm
7 years ago
minqiyang
bc12c2c616
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
sneaxiy
5d698589ce
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_elementwise_add
7 years ago
sneaxiy
19ff254d05
Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add
7 years ago
qiaolongfei
e008600b08
optimize code
7 years ago
Yan Chunwei
7555cfe33a
fix inference double free bug ( #12613 )
7 years ago
Zhaolong Xing
5dc57b71ee
Merge pull request #12593 from NHZlX/filter_redundant_output
...
filter redundant output
7 years ago
Luo Tao
64c0ba288a
fix inference_lib_dist error
7 years ago
qiaolongfei
7c649e06c3
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu
7 years ago
minqiyang
09103084d3
Polish compat.py and add unittest for it
7 years ago
Sylwester Fraczek
d74bb6ab9c
fix ut for mkldnn 0.15 - added forcing layout NCHW in mkldnn conv tests
7 years ago
Xin Pan
626abfc33a
code clean up and renaming
...
Reduce one level of inheritance.
7 years ago
Qiao Longfei
c1446342ff
Merge pull request #12577 from jacquesqiao/optimize-vlog-before-and-after-op-run
...
optimize vlog before and after op run, move into op.run
7 years ago
minqiyang
c3fdf3aee4
Fix divide problem in CI
...
Fix pb_protobuf2 FromString problem
7 years ago
fengjiayi
855c9e3311
clean softmax_op code
7 years ago
fengjiayi
24d51de022
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
fengjiayi
27df3a9f2b
make cross_entropy_op supporting tensors
7 years ago
Chen Weihang
b2435a3a11
Merge pull request #12374 from chenwhql/py_calc_memory
...
Add memory usage estimate API
7 years ago
fengjiayi
66be53264e
Merge pull request #12592 from JiayiFeng/fix_mac_compile_error
...
fix mac compile error
7 years ago
chenweihang
b1dd4149b9
adjust enforce test cases
7 years ago
Yu Yang
cb79b0233e
Merge pull request #12595 from reyoung/fix_scale_loss_with_memopt
...
Fix bug when memopt optimize loss.grad and use ParallelExecutor
7 years ago
nhzlx
641f32da8c
add softmax op converter
7 years ago
nhzlx
943950c190
refine graph draw
7 years ago
Yu Yang
c4f8afa258
Fix bug when memopt optimize loss.grad and use ParallelExecutor
7 years ago
nhzlx
7a019cd608
merge develop
7 years ago
nhzlx
e823ce68bb
filter redundant output
7 years ago
fengjiayi
8e604a10aa
fix mac compile error
7 years ago
nhzlx
551c802cdc
merge develop
7 years ago
nhzlx
c69ae865db
fix comments
7 years ago
Luo Tao
e8aa6d1283
add anakin compiler from github source code
7 years ago
chenweihang
61052cdbc6
polish high frequency enforce error message
7 years ago
sneaxiy
ad45d39222
refine layer_norm
7 years ago
chengduo
7c8b69c700
Feature/op fusion ( #12240 )
...
* Add Preface
* Add demo code
* Save file
* Refine code
* seems can work
* use elementwise strategy
* Use ElementwiseComputeEx
* Add comments
* extract functions from operator
* Refine code
* Follow comment
* code refine
* follow comments
* follow comments
7 years ago
sneaxiy
1b4515f6db
refine softmax_with_cross_entropy
7 years ago
nhzlx
8f9e704f94
merge develop
7 years ago
nhzlx
3a0caf801f
modify trt engine op test
7 years ago
nhzlx
e51d045a6d
modify trt engine op test
7 years ago
Luo Tao
21b4d90ab9
Merge branch 'develop' into anakin_test
7 years ago
qiaolongfei
b4d48531e4
optimize vlog before and after op run, move into op.run
7 years ago
Qiao Longfei
88e47e1e2d
Merge pull request #12570 from jacquesqiao/add-flag-to-disable-inference
...
add WITH_INFERENCE flag
7 years ago
nhzlx
e8954a36f5
merge develop
7 years ago
nhzlx
32a9e050bc
mapping the variable name inside the subgraph
7 years ago
minqiyang
6abe819f07
Fix pybind11 problem
...
Fix str and bytes problem
Fix sorted problem
Fix math problem
Fix CI problem
7 years ago
Wu Yi
2d036c47cd
polish dist unit test code ( #12512 )
...
* polish dist se resnext ut
* update
* update
* update
* avoid cpu initializer differ
* change to use executor for now
* update by comment
* remove lr decay use para exe, should fix para exe bug later
* update by comment
7 years ago
qiaolongfei
9331ba752f
add WITH_INFERENCE flag
7 years ago
chengduo
97a77512b4
Fix the order of sum ( #12562 )
...
* fix the order of sum
* add doc
* check whether need to copy
* follow comments
7 years ago
fengjiayi
7834b4a470
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
Luo Tao
cf74473244
make inference_anakin_test SERIAL
7 years ago
Jeff Wang
4713f0a9e4
Simplify the travis script. ( #12557 )
...
* Simplify the travis script.
Now use docker to deploy documentations
* Check for the pull request
* Update paddle_build.sh
* Update paddle_build.sh
7 years ago
tangwei12
5bfdefae91
Merge branch 'Pdv' into samplingIdOp
7 years ago
tangwei12
b30bdde15a
random optimize
7 years ago
tangwei12
9c63fef63c
random optimize
7 years ago
Qiao Longfei
88a607c342
Merge pull request #12541 from jacquesqiao/optimize-profiler
...
optimize profiler
7 years ago
tangwei12
5b9716d1f6
add dims check
7 years ago
tangwei12
4cd504d3b4
bug fix
7 years ago
sneaxiy
e57bc4d745
Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add
7 years ago
qiaolongfei
954d680b40
fix test_parallel_do.py
7 years ago
sneaxiy
222fbbedfb
Merge branch 'develop' into refine_elementwise_add
7 years ago
sneaxiy
4b83afff6e
Merge branch 'develop' into refine_elementwise_add
7 years ago
sneaxiy
b2d0ee5159
refine elementwise_add op
7 years ago
tangwei12
da2cc99f67
sampling op optimize
7 years ago
Tao Luo
0fd2f713a4
Merge pull request #12548 from Superjomn/bugfix/disable-anakin-test
...
Bugfix/disable anakin test
7 years ago
fengjiayi
7c55e08c93
stash
7 years ago
superjomn
ebe1920626
add comment
7 years ago
superjomn
3c5e15de03
disable anakin test
7 years ago
tangwei12
4973e07be3
sampling op optimize
7 years ago
tensor-tang
836068569f
Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
tensor-tang
18c322c2a1
separate cpu and gpu implementations for gru kernel compute
7 years ago
tensor-tang
54c95e49f0
fix blas
7 years ago
fengjiayi
b656d97e86
Merge pull request #12485 from JiayiFeng/dev_ops_tensor_support
...
Make lookup_table_op and softmax_op supporting high rank tensor
7 years ago
qiaolongfei
52576c5f38
revert inference
7 years ago
qiaolongfei
1623f1ba4f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-profiler
7 years ago
tangwei12
3206970b77
sampling op rename
7 years ago
qiaolongfei
903b2c0162
optimize code
7 years ago
Xin Pan
99a77cfc62
Merge pull request #12468 from panyx0718/improve_profiler2
...
Improve profiler
7 years ago
qiaolongfei
4c5bcd7859
add guard to profiler
7 years ago
qiaolongfei
d553e2ff3f
revert inference
7 years ago
qiaolongfei
a3f9d6a38c
optimize profiler
7 years ago
tangwei12
e0ab2f7158
new sampling op
7 years ago
minqiyang
a58dd3e557
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
tensor-tang
8c23f7c4f0
fix blas and use packed weight
7 years ago
tensor-tang
d9cc6b1866
replace gru compute with details
7 years ago
tensor-tang
43cee33a23
add mkl packed gemm
7 years ago
minqiyang
f9ef0ee8a9
Polish code
7 years ago
minqiyang
c4d000a990
Make code more efficient
7 years ago
JiabinYang
4af5d3e3d3
fix the paddle script causes 'command not found' error
7 years ago
minqiyang
9812bb8b48
Fix pserver can NOT start with DebugString problem
7 years ago
tangwei12
766ac488ac
sum_op selectedRows dim bug fix
7 years ago
Zhaolong Xing
d7dd0868db
Merge pull request #12449 from NHZlX/add_tensorrt_elementwise_add
...
Add tensorrt elementwise add
7 years ago
nhzlx
d50f776b27
merge develop
7 years ago
Bai Yifan
900d61dd98
Clean python api ( #12406 )
...
* api clean
* update API.spec
7 years ago
dzhwinter
0c8fde7dce
"cherry picked cpp tests" ( #12182 )
...
* "cherry picked cpp tests"
* "cherry picked"
* "cherry picked tests"
* "merge develop branch"
7 years ago
dzhwinter
595a2c83ae
explicit gradient of elementwise_add/elementwise_sub ( #11970 )
...
* "add gradient register"
* "make some enhance"
* "better format"
* "fix typo"
* "fix reuse"
* "fix get expected kernel"
* "change the mkldnn code"
* "fix mkldnn"
* "fix mkldnn failed test"
* "add comment"
7 years ago
nhzlx
64a08f840f
increase the test batch
7 years ago
Zhaolong Xing
f37f875f1f
Merge pull request #12349 from NHZlX/add_tensorrt_conv2d_converter
...
add conv2d trt converter
7 years ago
Zhaolong Xing
7e6bac3ea6
Merge pull request #12479 from NHZlX/fix_gtest_test_eq_warning
...
fix warning
7 years ago
fengjiayi
e7d8e16a66
update softmax_mkldnn_op
7 years ago
nhzlx
c7e6a11bc1
merge develop
7 years ago
nhzlx
0015df1b12
modify op converter for conv2d
7 years ago
Yu Yang
2567afa35d
Merge pull request #12462 from reyoung/feature/fix_cudnn_deterministic
...
Fix bug in cudnn_determistic
7 years ago
fengjiayi
dc111d3476
update softmax_cudnn_op
7 years ago
nhzlx
66406619ec
merge develop
7 years ago
nhzlx
a2749adf5d
fix warning
7 years ago
fengjiayi
f7bd0b227b
Add unittests for softmax_op
7 years ago
gongweibao
819ac3df0a
Modify style ( #12465 )
7 years ago
cuichaowen
046de2acdb
Improve anakin feature ( #11961 )
7 years ago
fengjiayi
b314a69523
make softmax supporting tensors
7 years ago
fengjiayi
b1af7e5d9b
Add unittests for lookup_table_op
7 years ago
tangwei12
c4c8f60bec
sum_op selectedRows dim bug fix
7 years ago
Xin Pan
486345551d
clean
7 years ago
Xin Pan
caf10b474f
make profiler use thread_id from g_thread_id
...
Add a few more RecordEvent.
Cleanup
7 years ago
nhzlx
c13efe02d9
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_elementwise_add
7 years ago
nhzlx
a5c96af33c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_conv2d_converter
7 years ago
Yu Yang
040fc1c39b
Fix bug in cudnn_determistic
...
* Introduced by #11205
7 years ago
fengjiayi
7efdf05ac2
make look_up_op supporting tensor ids
7 years ago
Tao Luo
baff71d504
Merge pull request #12460 from luotao1/small_tgz
...
compress the fluid.tgz
7 years ago
Yan Chunwei
dcfbc6a661
inference analyzer as bin ( #12450 )
7 years ago
Yan Chunwei
31a2c87688
fea/lightly support lod ( #12451 )
7 years ago
fengjiayi
38863a2c9d
Merge pull request #12454 from JiayiFeng/dev_exception_holder
...
Exception Holder
7 years ago
Qiao Longfei
690625fe15
Merge pull request #12456 from jacquesqiao/add-profiler-to-pserver
...
Add profiler to pserver
7 years ago
dzhwinter
6d3da458a7
Fix/float16 style ( #12446 )
...
* "rewrite the test case"
* "follow comment"
7 years ago
yuyang18
59c900e1e9
Update API.spec
7 years ago
Luo Tao
5e6f7bc569
compress the fluid.tgz
7 years ago
fengjiayi
bc1b7b96ec
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_exception_holder
7 years ago
qiaolongfei
7e46a8d172
fix logical bug, optimize code
7 years ago
qiaolongfei
d04dca3798
revert cmakelist
7 years ago
qiaolongfei
0b62f61d29
add init flag in __init__.py for listen_and_serv_profile_period
7 years ago
dzhwinter
91fb0156ca
Memory/reshape op ( #12414 )
...
* "remove inplace in single op"
* "fix ci"
* "add transpiler case"
* fix conflict
* "fix reshape"
* "delete reshape inplace attr"
* "follow the comments"
* "rerun ci"
7 years ago
qiaolongfei
b4496ee442
Merge branch 'fix-mac-build-graph_executor' of ssh://github.com/jacquesqiao/Paddle into add-profiler-to-pserver
7 years ago
qiaolongfei
c8c8c01a23
fix mac build of graph_executor
7 years ago
qiaolongfei
0b861bbca9
add profiler for listen_and_serv op
7 years ago
Zhaolong Xing
7ae73e33da
Merge pull request #12432 from Superjomn/fea/analysis-ssa
...
inference analysis supports SSA
7 years ago
fengjiayi
3e4083ed1f
Make exception handling of threaded_ssa_graph_executor an independent class
7 years ago
tensor-tang
059b27840c
Merge pull request #12408 from tensor-tang/refine/im2col
...
Refine CPU im2col padding with 1
7 years ago
Superjomn
15c2f1abb3
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fea/analysis-ssa
7 years ago
nhzlx
b241a47e8e
merge develop
7 years ago
nhzlx
5fcdd81da7
tiny modify
7 years ago
minqiyang
ce4eba3b0d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
qiaolongfei
236fc1bd38
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-test-for-split-ids-op
7 years ago
qingqing01
f372f27e3f
Hidden APIs for While, StaticRNN, ParallelDo. ( #12332 )
...
* Hidden APIs for While, StaticRNN, ParallelDo.
7 years ago
minqiyang
000ba1ac5f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
Xin Pan
4b8ae523c4
Merge pull request #12367 from panyx0718/ir_pass
...
Ir pass
7 years ago
nhzlx
f05c7fb8ae
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_conv2d_converter
7 years ago
nhzlx
6f6d552790
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_conv2d_converter
7 years ago
Qiao Longfei
297cbeb1c6
Merge pull request #12439 from jacquesqiao/CheckTensorNANOrInf-support-selectedrows
...
CheckTensorNANOrInf support checking SelectedRows
7 years ago
dzhwinter
39ac9e39c2
float16 type support enhance ( #12181 )
...
* cherry picked
* "cherry picked platform"
* "add comment"
* "fix ci"
7 years ago
qiaolongfei
3033841b4a
CheckTensorNANOrInf support checking SelectedRows
7 years ago
qiaolongfei
147bf00ffe
clear mutable rows for the output of split_ids_op
7 years ago
qiaolongfei
91b114a787
change map to unordered_map
7 years ago
tensor-tang
d8d2dbcfac
further optimize im2col using variables
7 years ago
Superjomn
4d2405d851
inference analysis support ssa
7 years ago
qiaolongfei
91f63cd401
fix split_ids_op and add unit test
7 years ago
tensor-tang
5373fe29c2
Merge remote-tracking branch 'ups/develop' into refine/im2col
7 years ago
Xin Pan
02c31458bb
Merge pull request #12417 from panyx0718/add_dist_deps
...
properly set up dep of concat and fetch_bar
7 years ago
Xin Pan
25706d0868
properly set up dep of concat and fetch_bar
7 years ago
minqiyang
e96fef2cf7
Fix inference api impl deps
7 years ago
Xin Pan
4abcb1b8e7
Merge pull request #12409 from panyx0718/add_dist_deps
...
add distributed training deps.
7 years ago
Qiyang Min
7da453630e
Merge pull request #12403 from velconia/fix_hang_up
...
Fix grpc destroy bug
7 years ago
Xin Pan
398cfb47b1
disable dist_se_resnext since it's not stable yet.
...
fix fluid_benchmark.py
7 years ago
Tao Luo
5a634786af
Merge pull request #12312 from luotao1/unify
...
unify libpaddle_inference_api and libpaddle_fluid
7 years ago
Bai Yifan
e12b1d1792
Add flatten op ( #12341 )
...
* add flatten op
7 years ago
Luo Tao
062556f938
Merge branch 'develop' into unify
7 years ago
Xin Pan
5fff8d7a55
add distributed training deps.
7 years ago
nhzlx
98948b975e
wrong added file
7 years ago
nhzlx
830aa12c1a
add elementwise init code
7 years ago
chengduo
2409d0f710
Refine regularization for selected_rows ( #12369 )
...
* refine regularization for selected_rows
* clean lookup_table
* refine rpc_server_test
* temporally disable rpc_server_test
* fix rpc_server_test
* add unit test
7 years ago
Zhaolong Xing
85c4912755
Merge pull request #12355 from NHZlX/add_tensorrt_pooling_converter
...
Add tensorrt pooling converter
7 years ago
tensor-tang
5bea9c148c
Merge pull request #12397 from tensor-tang/refine/num_threads
...
refine num_threads control
7 years ago
tensor-tang
687a322267
Merge remote-tracking branch 'ups/develop' into refine/im2col
7 years ago
tensor-tang
65d418f060
complete im2col with padding==1 and speedup filter width==1
7 years ago
minqiyang
053540e199
Add volatile to stopped_ member
7 years ago
tensor-tang
4f0383f52e
fix unknown flag
7 years ago
minqiyang
0c7d6eb8b2
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
minqiyang
b78ffde6d5
Add stopped sign for grpc client
7 years ago
fengjiayi
ec4c6e1f7c
Merge pull request #12384 from JiayiFeng/dev_update_save_inference_model
...
update inference_optimize() to support program with readers
7 years ago
tensor-tang
9788e5ab87
add flags to control num_threads
7 years ago
tensor-tang
10a1c2bb86
control omp num_threads
7 years ago
Xin Pan
99c0c20468
add pass test
7 years ago
tensor-tang
52eb86e30f
refine im2col benchmark
7 years ago
tensor-tang
3017f46076
add more test cases
7 years ago
typhoonzero
54e9fd3f61
fix cudnn enforce
7 years ago
tensor-tang
8d6be4fb5f
refine im2col test and add benchmark
7 years ago
minqiyang
559d36328c
Apply 2to3 to current paddle main python code
7 years ago
tensor-tang
507c143047
im2col cfo cpu code clean
7 years ago
fengjiayi
604bd85a45
update inference_optimize()
7 years ago
Xin Pan
12e9bf6c17
clean up
7 years ago
Xin Pan
ab72d28a5e
clean up and correctness check
7 years ago
tensor-tang
4eeed0b5e4
refine width padding and enable core copy
7 years ago
Tao Luo
3ade95d0db
Merge pull request #12379 from luotao1/demo_ci_fix
...
fix manylinux1 Failed to publish artifacts
7 years ago
fengjiayi
0d43594d16
Merge pull request #12364 from JiayiFeng/dev_add_FLAG_free_idle_memory
...
add flag to prevent unnessary memory free
7 years ago
Wu Yi
73fcfc06ec
refine conv cudnn enforce ( #12353 )
...
* refine conv cudnn enforce
* update
* update all cudnn ops
* fix
7 years ago
Xin Pan
aa1085ddc5
all passes
...
add doc
7 years ago
nhzlx
fb204fbfbe
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_pooling_converter
7 years ago
nhzlx
4f71a3b12b
fix a bug
7 years ago
Luo Tao
83e59257d0
fix manylinux1 Failed to publish artifacts
7 years ago
Xin Pan
e4d7d7ae8f
pass refactoring
7 years ago
tensor-tang
e3131e2d73
enable width padding
7 years ago
Xin Pan
142e832d21
pass registration
7 years ago
Xin Pan
5b183557f3
graph viz pass
7 years ago
qiaolongfei
64e7902530
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into parallel-executor-support-prefetch
7 years ago
Xin Pan
d7e08c53c2
Merge pull request #12169 from panyx0718/ir_graph_sort
...
construct a SSAGraph at the beginning.
7 years ago
tensor-tang
92518c519f
reuse sizes saving time
7 years ago
tensor-tang
660df122ce
enable padding!=0 and fill height padding with 0
7 years ago
tensor-tang
d8e00facf7
reuse im_size
7 years ago
tensor-tang
179dd0cb8a
Merge pull request #12337 from tensor-tang/refine/im2col
...
refine cpu im2col no padding
7 years ago
nhzlx
c8adfb3451
add paddle_enforce
7 years ago
nhzlx
5533400720
fix comments
7 years ago
fengjiayi
fd2d2c66e9
add flag to prevent unnessary memory free
7 years ago
qiaolongfei
e7eeb19f90
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into parallel-executor-support-prefetch
7 years ago
Qiao Longfei
2d21aa76c7
Merge pull request #12331 from jacquesqiao/fix-mixed-tensor
...
fix mixed tensor compile and add cpu unit test
7 years ago
Luo Tao
5ba4337698
unify libpaddle_inference_api into libpaddle_fluid
7 years ago
nhzlx
01566fb61b
1. support mutil batch utest 2. support pool op
7 years ago
qiaolongfei
754e96a30c
distribute lookup table work with parallel executor
7 years ago
qiaolongfei
65e5aebd43
fix mixed_vector_test
7 years ago
nhzlx
990741aa85
add weight's dim assert
7 years ago
nhzlx
21890ca0cf
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_pooling_converter
7 years ago
qiaolongfei
da035fc674
remove explicit for compile problem
7 years ago
tensor-tang
7b63b85086
fix mismatch of infer api ( #12342 )
7 years ago
tensor-tang
b72befc5cc
reuse copy size
7 years ago
Yancey
6133efd9ed
Merge pull request #12218 from Yancey1989/rpc_complete_interface
...
Add rpc complete interface
7 years ago
qiaolongfei
c6fb163571
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-mixed-tensor
7 years ago
nhzlx
fc41eb40b1
add conv2d trt converter
7 years ago
qingqing01
24bea40116
Hiden some LoDTensor ralated ops' Python wrapper. ( #12230 )
...
* Hiden some LoDTensor ralatted ops' Python wrapper.
7 years ago
Zhaolong Xing
6169d724b9
Merge pull request #12324 from NHZlX/enhance_for_tensorrt_infer
...
Enhance for tensorrt infer
7 years ago
nhzlx
4d49e61ab8
fix comments
7 years ago
qiaolongfei
18d539e82a
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-mixed-tensor
7 years ago
Wu Yi
9f0d9dffe6
hide variable API ( #12307 )
...
* hide variable API
* edit API.spec
7 years ago
tensor-tang
6788af4bf1
refine test cases
7 years ago
tensor-tang
b163e601b6
add gtest
7 years ago
Yu Yang
7c046ae772
Merge pull request #12323 from reyoung/feature/polish_reshape_and_lod_tensor_blocking_queue
...
Feature/polish reshape and lod tensor blocking queue
7 years ago
nhzlx
bcd67bdd71
add assert for GetOutput
7 years ago
qiaolongfei
5022b14de8
fix mixed tensor compile and add cpu unit test
7 years ago
tensor-tang
aae994fd26
refine im2col no padding
7 years ago
Yancey1989
fb06ed7bdc
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into rpc_complete_interface
7 years ago
nhzlx
7382f98600
1. set ut batch > 1 2. readd the mul op(utest will be added later)
7 years ago
nhzlx
bd64979fe9
the argument should not be a const one
7 years ago
Yu Yang
21387e3c2a
Tiny refines for lod_tensor_blocking_queue and reshape_op
7 years ago
nhzlx
f42ea48996
deal with conflict
7 years ago
nhzlx
940f5dbcac
modify the tensorrt engine op to adapt to chage
7 years ago
nhzlx
82527696e7
1. we delelte mul op, 2.modify fc and action op 3. modify the test inferface
7 years ago
nhzlx
2372daff1d
there is no batchsize concept in tensorrt's tensor
7 years ago
qiaolongfei
35d09abd01
add profiler for demo_trainer
7 years ago
qiaolongfei
a6d30a8607
profiler support cpu
7 years ago
Yan Chunwei
02cf54d331
bugfix lod cpu performance ( #12297 )
7 years ago
Qiao Longfei
b41f8b9d42
Merge pull request #12295 from jacquesqiao/speedup-reduce-sum-grad-op
...
Speedup reduce sum grad op
7 years ago
Xin Pan
5173a53c8a
fix reorder issue.
7 years ago