chengduoZH
7b723839ef
Add cpu test for parallel_executor_crf executor_fetch_feed, and enable these tests
7 years ago
chengduoZH
d24e046c1e
fix allReduce bug
7 years ago
chengduoZH
a57e8a4338
add cpu test
7 years ago
chengduoZH
1e731f5964
small fix
7 years ago
chengduoZH
495368c243
ADD CPU_NUM
7 years ago
chengduoZH
d09fd1f6f0
test seresnext
7 years ago
chengduoZH
27073c284d
nccl_all_reduce_op_handle => all_reduce_op_handle
7 years ago
chengduoZH
2d94697a82
code refine
7 years ago
chengduoZH
5a3c8bf813
fix in c++ side
7 years ago
chengduoZH
a56dcf5159
fix parallel_executor.py and xx_mnist.py
7 years ago
mozga-intel
3ff9ba0e6b
Mkldnn layout ( #11040 )
* Add MKLDNN layout support in Paddle
Add MKLDNN layout in Paddle so that an MKLDNN-friendly memory layout
can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW
was hardcoded in all MKLDNN op kernels. As a result,
a non-optimized execution path was selected in the MKLDNN primitive, which
hurt performance.
Besides the framework change, three MKLDNN OP kernels were updated
to use the new MKLDNN layout: conv, pool2d, and batch_norm.
Other MKLDNN OP kernels also need to be updated in a similar way to
achieve the best performance.
* Add MKLDNN layout support in activation OP
* Don't populate layout from input to output when kMKLDNN in
* Refine pool mkldnn op kernel
* MKLDNN layout
* Remove the inheritance from tensor file
* MKLDNN layout: refactoring
* Remove additional #define to register new operator
* Prepare mkldnn tests to work with layout
7 years ago
fengjiayi
a1e046bfc0
Merge pull request #11270 from JiayiFeng/fix_a_error_on_max
fix a compile error on Mac
7 years ago
Yu Yang
03073df182
Merge pull request #11237 from chengduoZH/add_fuse_var_op_handle
[Feature] Add fuse vars op handle
7 years ago
Tao Luo
6d80dd5a50
Merge pull request #11222 from luotao1/trt
rewrite unittest of trt_activation_op
7 years ago
fengjiayi
2f5e310167
fix a compile error
7 years ago
yuyang18
8149b0a9aa
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_fuse_var_op_handle
7 years ago
Luo Tao
f6fb51a164
add test_mode in trt/activation_op
7 years ago
fengjiayi
50104f18c7
Merge pull request #11230 from JiayiFeng/add_broadcast_test
Add unittests to check channelwise add
7 years ago
fengjiayi
65a94be1a1
Merge pull request #11223 from JiayiFeng/dev_reverse_op
Add reverse op
7 years ago
Luo Tao
c73977af03
Merge branch 'develop' into trt
7 years ago
Tao Luo
f40fc24974
Merge pull request #11260 from luotao1/gtk
install libgtk2.0-dev in latest images
7 years ago
Yu Yang
ff9b1a0f95
Merge pull request #11234 from reyoung/feature/refine_code
SSA Graph Builder Factory
7 years ago
Yu Yang
08823146ec
Merge pull request #11232 from reyoung/feature/extract_tensor
Extract method from tensor_impl.h to tensor.cc
7 years ago
Luo Tao
08220d39e7
install libgtk2.0-dev in latest images
7 years ago
weixing
d02b318c19
Merge pull request #11201 from weixing02/format
Fix formula display error
7 years ago
gongweibao
ddd95022c1
add link ( #11255 )
7 years ago
dzhwinter
d050360a74
"reduce test input data size to accelerate ci" ( #10831 )
7 years ago
Xin Pan
53a509daaf
make benchmark really working ( #11215 )
7 years ago
gongweibao
2028a8ef6d
Add rpc_client interface. ( #11154 )
7 years ago
Xin Pan
ca2d6d3c66
Merge pull request #11224 from dzhwinter/fix/cudnn
fix cudnn version issue
7 years ago
tensor-tang
3a294042c8
Merge pull request #11233 from tensor-tang/multithreads
Fix abort issue in cpu multi-threads
7 years ago
Yan Chunwei
4f95bc9463
feature/trt engine op test ( #11182 )
7 years ago
Tao Luo
fdf2d6fd9d
Merge pull request #11242 from luotao1/opencv
add python-opencv in paddlepaddle/paddle:lastest images
7 years ago
qingqing01
e0a32074bd
Fix PADDLE_ASSERT. ( #10981 )
* Enable assertions in CUDA.
* Fix PADDLE_ASSERT.
7 years ago
tensor-tang
4b7b17a84f
fix conflicts
Merge remote-tracking branch 'ups/develop' into multithreads
7 years ago
yuyang18
d9af153232
SSA Graph Builder Factory
* Use Builder Chain to decorate new builders. It is easy to extend
builders.
* Make graphviz path as a build strategy, not a FLAGS.
7 years ago
yuyang18
b6c8701e45
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/extract_tensor
7 years ago
chengduoZH
a584bc86dd
add fuse var op handle
7 years ago
Xin Pan
106ee9d1cc
Merge pull request #11243 from panyx0718/scope
small clean up and document pointer ownership.
7 years ago
tensor-tang
64323b1caf
Merge remote-tracking branch 'ups/develop' into multithreads
7 years ago
Luo Tao
4fac15c6f1
add python-opencv in paddlepaddle/paddle:lastest images
7 years ago
Luo Tao
e8e8ad0491
Merge branch 'develop' into trt
7 years ago
dzhwinter
44c662b4e1
Merge remote-tracking branch 'origin/develop' into fix/cudnn
7 years ago
fengjiayi
ba773fc9f8
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_broadcast_test
7 years ago
fengjiayi
ea73fb8416
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_reverse_op
7 years ago
Xin Pan
73aa5d230b
small clean up and document pointer ownership.
7 years ago
Tao Luo
ea408d5521
Merge pull request #11226 from typhoonzero/disable_unitest
disable failed tests
7 years ago
tensor-tang
4ae935e2cf
refine the lock in scope
7 years ago
Yu Yang
c36dd3b338
Merge pull request #11114 from reyoung/feature/yep
Try to speed up parallel executor
7 years ago
fengjiayi
42d7174778
refine API
7 years ago