luotao1
fad06cb928
unify ZeroCopy in analysis_test
6 years ago
lidanqing
4aeb261da9
Add INT32 support. INT32 in last switch case
...
test=develop
6 years ago
Yiqun Liu
36e2d3241e
Enhance the op benchmark: ( #16066 )
...
- Support setting attr in config
- Support setting dtype and initializer for input in config
test=develop
6 years ago
tensor-tang
9be825a982
polish the cast op doc ( #16078 )
...
* polish the cast op doc
test=develop
* follow comments
test=develop
* fix api.spec
test=develop
6 years ago
jerrywgz
847bb6a279
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fpn_ops
6 years ago
Wu Yi
5e92eb3f25
add parallel graph dist test ( #16076 )
...
* add parallel graph dist test=develop
* update test=develop
* update style test=develop
6 years ago
jerrywgz
e5759d6c38
refine doc, test=develop
6 years ago
jerrywgz
2b41743791
fix doc, test=develop
6 years ago
jerrywgz
c2eda2325b
refine code, test=develop
6 years ago
jerrywgz
9eb6d35f59
fix API.spec,test=develop
6 years ago
jerrywgz
a2e83d1d7b
add box_coder_and_assign, test=develop
6 years ago
Wu Yi
d206582337
add parallel graph dist test ( #16076 )
...
* add parallel graph dist test=develop
* update test=develop
* update style test=develop
6 years ago
jerrywgz
893789a0d1
Merge pull request #16050 from jerrywgz/add_box_decoder_and_assign
...
Add box decoder and assign
6 years ago
liuwei1031
1b5768c33b
fix a code bug which cause crash when empty variable is used, test=develop ( #16080 )
6 years ago
liuwei1031
045e5911bf
fix a code bug which cause crash when empty variable is used, test=develop ( #16080 )
6 years ago
ceci3
c109e6b3aa
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into npair_loss0
6 years ago
ceci3
7613918e23
test=develop, change labels name
6 years ago
xiaolil1
a177d48217
Add Requantize OP ( #15318 )
...
* Enable INT8 ReQuantize OP
test=develop
* Clean code
test=develop
* Add comments
test=develop
* Revert "Clean code"
test=develop
This reverts commit a7a49b8aa214f9730cb84e11ea96da564fe4b4d9.
* Modify requantize op test
test=develop
* fix requantize UT by moving public function to public test file.
test=develop
* Fix test fail due to file address change.
test=develop
* Change file address for requantize op.
test=develop
6 years ago
chengduo
6fe7478ba8
Refine recurrent_op ( #16027 )
...
* refine recurrent_op
test=develop
* remove unnecessary code
test=develop
6 years ago
chengduo
f5a3751845
Refine recurrent_op ( #16027 )
...
* refine recurrent_op
test=develop
* remove unnecessary code
test=develop
6 years ago
sneaxiy
7b608396fe
fix travis-ci format check
...
test=develop
6 years ago
ceci3
dc57952b7f
test=develop, add random to testfile
6 years ago
chengduo
84e3adbe60
Fix reshape bug ( #16069 )
...
* In some cases, the input may have more than one negative value.
test=develop
* fix matmul bug
test=develop
6 years ago
wopeizl
eb367f990c
remove the ignored from is_empty and less_than test=develop ( #15971 )
...
* remove the ignored from is_empty and less_than test=develop
* fix api spec test=develop
* fix the api spec test=develop
* test=develop
6 years ago
liuwei1031
9cc6f4009f
add IfElse test case for ir memory optimize ( #15998 )
...
* add ir memory optimize test case for IfElse op, test=develop
* fix some unittest failure by force using the python memory_optimize, test=develop
* tweak comments, test=develop
* fix unittest, test=develop
* fix unittest, test=develop
6 years ago
luotao1
503efa8b86
refine SetCpuMathLibraryNumThreads
...
test=develop
6 years ago
baojun
9f85876885
fix tanh typo test=develop ( #16049 )
6 years ago
whs
bd9669003f
Make sequence_erase op support for input with multi-level LoD. ( #15982 )
...
test=develop
6 years ago
Tao Luo
1301dc1a27
remove legacy function in ExecutionContext
...
test=develop
6 years ago
lidanqing
21156b8d4c
MKLDNN: Add UT for conv_transpose_mkldnn op. ( #16030 )
...
* MKLDNN: Add UT for conv_transpose_mkldnn op.
test=develop
* MKLDNN: Add fuse_bias check UT for conv_transpose_mkldnn op.
test=develop
6 years ago
dengkaipeng
b1a49e873f
fix statement. test=develop
6 years ago
dengkaipeng
0e0a2d046d
fix API.spec. test=develop
6 years ago
dengkaipeng
dbb8d07886
fix doc statement. test=develop
6 years ago
dengkaipeng
eeeebdd006
refine doc. test=develop
6 years ago
dengkaipeng
8ee866bf19
fix format. test=develop
6 years ago
dengkaipeng
9c47f36d1b
fix spectral_norm doc. test=develop
6 years ago
dengkaipeng
12416a24d2
add doc and test_layers. test=develop
6 years ago
dengkaipeng
63d322f07c
fix attr dim calc. test=develop
6 years ago
dengkaipeng
ca1502c7f5
add grad kernel for spectral_norm. test=develop
6 years ago
dengkaipeng
8956a59637
add unittest for spectral_norm. test=develop
6 years ago
dengkaipeng
fd66089d23
add spectral_norm forward kernel
6 years ago
tensor-tang
cab46b62f8
refine vbroadcast jitcode
...
test=develop
6 years ago
tensor-tang
6010361c7a
add vbroadcast mkl code and jitcode
...
test=develop
6 years ago
tensor-tang
2e96da453a
add vbroadcast jitkernel refer code and use it
...
test=develop
6 years ago
tensor-tang
020540948f
add jitkernel vcopy and speedup unit test time
...
test=develop
6 years ago
tensor-tang
6057f36208
Merge pull request #15996 from tensor-tang/op/embgrad
...
refine embeddingseqpool grad
6 years ago
chengduo
c67afb0f76
Fix reshape bug ( #16069 )
...
* In some cases, the input may have more than one negative value.
test=develop
* fix matmul bug
test=develop
6 years ago
Tao Luo
14b4337663
Merge pull request #16062 from luotao1/num_threads
...
refine SetCpuMathLibraryNumThreads
6 years ago
sneaxiy
33138a421d
remove match check
...
test=develop
6 years ago
wopeizl
7fbf52daa3
remove the ignored from is_empty and less_than test=develop ( #15971 )
...
* remove the ignored from is_empty and less_than test=develop
* fix api spec test=develop
* fix the api spec test=develop
* test=develop
6 years ago
Zhen Wang
8063b31e2d
Reduce redundant code for channel wise dequant op. test=develop
6 years ago
Tao Luo
6375fe45d7
Merge pull request #16039 from luotao1/execution_context
...
remove legacy function in ExecutionContext
6 years ago
Zhen Wang
e8f9dac7ab
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into channel_wise_quant_op
...
test=develop
6 years ago
Zhen Wang
806832e091
update the input format of channel wise dequantize op.
6 years ago
jerrywgz
f0177a1ed1
refine doc, test=develop
6 years ago
jerrywgz
072eca348a
refine doc, test=develop
6 years ago
Kaipeng Deng
6d8771b55c
Merge pull request #15864 from heavengate/spectral_norm
...
Add spectral norm op
6 years ago
sneaxiy
814a759061
merge develop
...
test=develop
6 years ago
sneaxiy
597dc65e76
enhance gc
...
test=develop
6 years ago
liuwei1031
caadd0581d
add IfElse test case for ir memory optimize ( #15998 )
...
* add ir memory optimize test case for IfElse op, test=develop
* fix some unittest failure by force using the python memory_optimize, test=develop
* tweak comments, test=develop
* fix unittest, test=develop
* fix unittest, test=develop
6 years ago
luotao1
06aab1b493
refine SetCpuMathLibraryNumThreads
...
test=develop
6 years ago
baojun
da45fbdaf5
fix tanh typo test=develop ( #16049 )
6 years ago
whs
0f99d24083
Make sequence_erase op support for input with multi-level LoD. ( #15982 )
...
test=develop
6 years ago
Zhen Wang
89dee160d1
add channel wise dequantize op.
6 years ago
Tao Luo
f4587789d8
remove legacy function in ExecutionContext
...
test=develop
6 years ago
luotao1
c0b240aa43
try to fix distributed unit-test
...
test=develop
6 years ago
jerrywgz
b4f5180299
fix doc, test=develop
6 years ago
jerrywgz
21e0d35ce3
fix formula, test=develop
6 years ago
jerrywgz
d1901f27bc
refine doc
6 years ago
jerrywgz
a1ef7df865
refine code, test=develop
6 years ago
tensor-tang
12eb9aecde
Merge remote-tracking branch 'ups/develop' into op/embgrad
...
test=develop
6 years ago
jerrywgz
e64921c79a
fix API.spec,test=develop
6 years ago
jerrywgz
d497bd9079
resolve conflict, test=develop
6 years ago
jerrywgz
41471d28ac
add box_coder_and_assign, test=develop
6 years ago
lidanqing
02c106c717
MKLDNN: Add UT for conv_transpose_mkldnn op. ( #16030 )
...
* MKLDNN: Add UT for conv_transpose_mkldnn op.
test=develop
* MKLDNN: Add fuse_bias check UT for conv_transpose_mkldnn op.
test=develop
6 years ago
sneaxiy
7e5a4a3d63
test=develop
6 years ago
luotao1
784826a4f5
enhance cache runtime_context for different scope
...
test=develop
6 years ago
dengkaipeng
3eab9e4b95
fix statement. test=develop
6 years ago
dengkaipeng
e37f5ab5b1
fix API.spec. test=develop
6 years ago
dengkaipeng
54bbbfa71f
fix doc statement. test=develop
6 years ago
dengkaipeng
c1a69e3ea0
refine doc. test=develop
6 years ago
dengkaipeng
65d375a09f
fix format. test=develop
6 years ago
dengkaipeng
82d514345c
fix spectral_norm doc. test=develop
6 years ago
dengkaipeng
2ea5843cbf
add doc and test_layers. test=develop
6 years ago
dengkaipeng
037855f42d
fix attr dim calc. test=develop
6 years ago
dengkaipeng
70dbd59839
add grad kernel for spectral_norm. test=develop
6 years ago
dengkaipeng
72509ec3bd
add unittest for spectral_norm. test=develop
6 years ago
dengkaipeng
3bf1ae9b59
add spectral_norm forward kernel
6 years ago
Zhen Wang
545247d7b4
add channel wise quantize op.
6 years ago
sneaxiy
f0634da4b5
test=develop
6 years ago
ceci3
44a4ac0f8c
fix API.spec and testfile
6 years ago
tensor-tang
b16dabd7e0
refine vbroadcast jitcode
...
test=develop
6 years ago
tensor-tang
c2e56e6bbc
Merge remote-tracking branch 'ups/develop' into op/embgrad
6 years ago
ceci3
3b96aa0839
conflict fix
6 years ago
ceci3
06d8e1a15d
test=develop
6 years ago
chengduo
92438f6132
Revert "Add Event for TensorCopy" ( #16022 )
...
* Revert "Add Event for TensorCopy (#15953 )"
This reverts commit 7235fd662b.
test=develop
* fix CI
test=develop
6 years ago
baojun
742839f8f4
fix cpplint test=develop ( #16028 )
6 years ago
chengduo
d4b461eb10
Unified ParallelExecutor and Compiler ( #15970 )
...
* Unified ParallelExecutor and Compiler
6 years ago
chengduo
06f3c8575d
Add Event for TensorCopy ( #15953 )
...
Add Event for TensorCopy
6 years ago
Tink_Y
8949a94691
refine image_resize annotation ( #15976 )
...
* fix image_resize annotation
test=develop
* fix some typo
* Update nn.py
* Update interpolate_op.cc
test=develop
6 years ago
tangwei12
7b0875e9f8
add op type in check nan/inf ( #15986 )
...
* add op name in check nan/inf, test=develop
6 years ago
Yiqun Liu
2bdf44641c
Add the include of cudnn.h to enable the use of CUDNN_VERSION. ( #15961 )
...
test=develop
6 years ago
Yiqun Liu
b94307a919
Revert "Optimize while_op when is_test is true. ( #15811 )" ( #15968 )
...
test=develop
6 years ago
flame
eeb70edd9a
add anakin fc op converter ( #15965 )
6 years ago
minqiyang
ab5a648481
Add missing headers
...
test=develop
6 years ago
minqiyang
94c8ce3f13
reduce ut time
...
test=develop
6 years ago
Yiqun Liu
c90b82a637
Fix error in CUDA kernel of beam_search. ( #15957 )
...
test=develop
6 years ago
minqiyang
3723dcc301
Polish code
...
test=develop
6 years ago
flame
afc3fcd509
anakin subgraph engine ( #15774 )
...
* add anakin subgraph engine
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* add initial op converter
* update
* update
* fix op register compile error
* update
test=develop
* update
6 years ago
minqiyang
212242c4e4
Polish code
...
test=develop
6 years ago
Yiqun Liu
1b10a7843c
Optimize while_op when is_test is true. ( #15811 )
...
test=develop
6 years ago
xiaolil1
91838c3214
Optimize Quantize Op with primitive reuse. ( #15929 )
...
test=develop
6 years ago
luotao1
1c58eee9b2
refine infershape of sequence_enumerate, hash and fuse_emb_seq_pool
...
test=develop
6 years ago
minqiyang
3f4aeed57f
Polish code
...
test=develop
6 years ago
minqiyang
b754bf30fb
Reset output var's pre_op pointer when op was destructed
6 years ago
baojun
ac72bcd065
Added adam op test=develop ( #15710 )
6 years ago
mozga-intel
b29acec815
Register sum operator ( #15889 )
...
test=develop
6 years ago
dzhwinter
4449e85528
polish cudnn related code and fix bug. ( #15164 )
...
* staged.
* polish code
* polish code. test=develop
* polish code. test=develop
* api change. test=develop
* fix default value. test=develop
* fix default value. test=develop
6 years ago
Xin Pan
8e094f7117
polish
...
test=develop
6 years ago
Xin Pan
90b17d28ec
have no time for cmake/external
...
test=develop
6 years ago
mozga-intel
06a7f741f0
The flag of mkldnn is enabled iff it is necessary
...
test=develop
6 years ago
baojun-nervana
320b27988c
added concat op test=develop
6 years ago
minqiyang
b71af29fb4
Remove var op deps in imperative mode
...
test=develop
6 years ago
Tao Luo
690be0bb09
fix cpplint error of async_executor.h
...
test=develop
6 years ago
Tao Luo
6e87843e26
enable cpplint, remove go_fmt
6 years ago
tensor-tang
0eefad0a2d
fix jitcodekey and refine test
...
test=develop
6 years ago
tensor-tang
ce4cc482a4
add sgd jitcode and op test
...
test=develop
6 years ago
tensor-tang
1bfc565ffe
add benchmark and mkl sgd implement
...
test=develop
6 years ago
shippingwang
a0834044fc
add API.spec. test=develop
6 years ago
shippingwang
7d4feb2fc5
fix api.spec, test=develop
6 years ago
minqiyang
9035887bc9
Add gperftools into imperative tracer
...
test=develop
6 years ago
Yihua Xu
b48d56e87f
Optimize gelu operation with mkl erf.
...
test=develop
6 years ago
xiaoli.liu@intel.com
f8cbc4f34b
Optimize INT8 DeQuantize Op with primitive reuse.
...
test=develop
6 years ago
minqiyang
701af43958
Fix bugs
...
test=develop
6 years ago
baojun-nervana
dea34134e8
Update ngraph version to v0.14 test=develop
6 years ago
minqiyang
f1a2d20430
invoke backward_hooks after reduce op's depcounts map
...
test=develop
6 years ago
minqiyang
e0a2b472f4
Move ClearBlock into OpBase and VarBase's destructor
...
test=develop
6 years ago
minqiyang
9abf40c9e2
Add imperative python tracer
6 years ago
tensor-tang
92f3cf42cb
enable sgd jitkernel refer code and test
...
test=develop
6 years ago
shippingwang
13e891516b
add cosine decay op, test=develop
6 years ago
jerrywgz
b2ce832021
change default option related to softmax, test=develop
6 years ago
chengduo
e2da3a5b22
Revert "Add Event for TensorCopy" ( #16022 )
...
* Revert "Add Event for TensorCopy (#15953 )"
This reverts commit 7235fd662b.
test=develop
* fix CI
test=develop
6 years ago
luotao1
2fb38c108c
Merge branch 'develop' into runtime_context
6 years ago
sneaxiy
a9ea99d700
merge develop
6 years ago
baojun
9aaea38c0a
fix cpplint test=develop ( #16028 )
6 years ago
tianshuo78520a
26e3842d40
Update detection API add new check document ( #15848 )
...
* Update detection API add new check document
* update API.spec
* test=develop;add shanyi15 approved API.spec
* test=develop;update PM check API.spec
* check api.spec
* test=develop
* update API.spec
* test=develop;update API.spec
* update API.spec
* cat API.spec
* update document in api.spec
* check python35 api.spec
* update print_signatures md5 function
* test=develop
* update API.spec
* test=develop;fix python3 API.spec diff
* test=develop
* test=develop
* test=develop
6 years ago
chengduo
ae37f82964
Unified ParallelExecutor and Compiler ( #15970 )
...
* Unified ParallelExecutor and Compiler
6 years ago
chengduo
7235fd662b
Add Event for TensorCopy ( #15953 )
...
Add Event for TensorCopy
6 years ago
luotao1
82b0bb9d72
fix cpplint error
...
test=develop
6 years ago
luotao1
9773f38f99
cache runtime_context
...
test=develop
6 years ago
Tink_Y
31d830de9f
refine image_resize annotation ( #15976 )
...
* fix image_resize annotation
test=develop
* fix some typo
* Update nn.py
* Update interpolate_op.cc
test=develop
6 years ago
nhzlx
3c40cb767b
7 refine zero copy
...
update trt in docker file
test=develop
6 years ago
tensor-tang
641b3cccce
add vbroadcast mkl code and jitcode
...
test=develop
6 years ago
tensor-tang
41a1270856
add vbroadcast jitkernel refer code and use it
...
test=develop
6 years ago
tensor-tang
867e93b21a
add jitkernel vcopy and speedup unit test time
...
test=develop
6 years ago
tangwei12
6d5a04c1e7
add op type in check nan/inf ( #15986 )
...
* add op name in check nan/inf, test=develop
6 years ago
Qiyang Min
187cffd019
Merge pull request #15928 from velconia/imperative_backward_hooks
...
Imperative backward hooks
6 years ago
Yiqun Liu
1616c32acf
Add the include of cudnn.h to enable the use of CUDNN_VERSION. ( #15961 )
...
test=develop
6 years ago
jerrywgz
c31da7899a
refine code, test=develop
6 years ago
Yiqun Liu
798925453e
Revert "Optimize while_op when is_test is true. ( #15811 )" ( #15968 )
...
test=develop
6 years ago
flame
b187e3728e
add anakin fc op converter ( #15965 )
6 years ago
minqiyang
e5f3435dd5
Add missing headers
...
test=develop
6 years ago
minqiyang
fa1ff1d2f1
reduce ut time
...
test=develop
6 years ago
Yiqun Liu
87248281f7
Fix error in CUDA kernel of beam_search. ( #15957 )
...
test=develop
6 years ago
Tao Luo
c494f64a0f
Merge pull request #15941 from mozga-intel/mozga-intel/enable_mkldnn_framework
...
The flag of mkldnn engine is enabled iff it is necessary
6 years ago
jerrywgz
e8a8fe07e7
fix code for windows CI, test=develop
6 years ago
jerrywgz
149411762a
add gpu kernel, test=develop
6 years ago
Tao Luo
4efdebc6f6
Merge pull request #15931 from yihuaxu/develop_2c5c7b2a7_gelu_mkl_opt
...
Optimize gelu operation with mkl erf
6 years ago
tensor-tang
e5f9d3a47c
Merge pull request #15892 from tensor-tang/jit/sgd
...
refine sgd op
6 years ago
Tao Luo
e6bab55f1b
Merge pull request #15959 from luotao1/infershape_refine
...
refine infershape of sequence_enumerate, hash and fuse_emb_seq_pool
6 years ago
minqiyang
50639fafdb
Polish code
...
test=develop
6 years ago
ruri
72efef6358
Merge pull request #15887 from shippingwang/cosine_decay_op
...
add cosine decay op, test=develop
6 years ago
flame
e40d56c3d3
anakin subgraph engine ( #15774 )
...
* add anakin subgraph engine
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* add initial op converter
* update
* update
* fix op register compile error
* update
test=develop
* update
6 years ago
minqiyang
fe406b98c9
Polish code
...
test=develop
6 years ago
Yiqun Liu
613d9d0756
Optimize while_op when is_test is true. ( #15811 )
...
test=develop
6 years ago
xiaolil1
1abddd8d97
Optimize Quantize Op with primitive reuse. ( #15929 )
...
test=develop
6 years ago
Tao Luo
7ec97a0a7e
Merge pull request #15930 from xiaolil1/dequantize-reuse
...
Optimize INT8 DeQuantize Op with primitive reuse.
6 years ago
nhzlx
2eff3e26b6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt
6 years ago
nhzlx
06a088a199
fix comments and fix cpplint
...
test=develop
6 years ago
luotao1
34404f9c31
refine infershape of sequence_enumerate, hash and fuse_emb_seq_pool
...
test=develop
6 years ago
Xin Pan
a63e2a0a4f
Merge pull request #15948 from panyx0718/api2
...
Add deprecation warning
6 years ago
minqiyang
f469bb6b36
Polish code
...
test=develop
6 years ago
minqiyang
ac88c62a5b
Reset output var's pre_op pointer when op was destructed
6 years ago
baojun
f285191fb3
Added adam op test=develop ( #15710 )
6 years ago
jerrywgz
b92ef45fe9
Merge pull request #15678 from jerrywgz/refine_softmax_with_cross_entropy
...
change default option related to softmax, test=develop
6 years ago
mozga-intel
558f94cd77
Register sum operator ( #15889 )
...
test=develop
6 years ago
tensor-tang
58b8231338
added concat op test=develop ( #15946 )
6 years ago
Tao Luo
47d36b2008
Merge pull request #15924 from baojun-nervana/ngraph_v14
...
Update ngraph version to v0.14
6 years ago
Qiyang Min
1c9cfb01df
Merge pull request #15934 from velconia/imperative_gperftools
...
Add gperftools into imperative tracer
6 years ago
jerrywgz
0f652f304c
add distribute fpn proposals op, test=develop
6 years ago
dzhwinter
225c11a91f
polish cudnn related code and fix bug. ( #15164 )
...
* staged.
* polish code
* polish code. test=develop
* polish code. test=develop
* api change. test=develop
* fix default value. test=develop
* fix default value. test=develop
6 years ago
Tao Luo
6e3624442e
Merge pull request #15939 from luotao1/pre_commit2
...
enable cpplint, remove go_fmt
6 years ago
Xin Pan
0c277ac6e9
polish
...
test=develop
6 years ago
ceci3
4b7bf06e1f
test=develop
6 years ago
Xin Pan
4d80db838a
have no time for cmake/external
...
test=develop
6 years ago
Yiqun Liu
454f4f2140
Rewrite is_empty op to avoid unnecessary data transform. ( #15509 )
...
* Rewrite is_empty op to avoid unnecessary data transform.
test=develop
* Add the implementation of InferShape and InferVarType for is_empty op.
test=develop
* Rewrite is_empty op to avoid directly inherit OperatorBase.
test=develop
6 years ago
xiaolil1
6724be2b0d
INT8 Pool kernel Key Creation Optimization. ( #15883 )
...
* Optimize key creation of INT8 pool kernel to improve the performance of ResNet-50 and MobileNet, especially for latency.
test=develop
* Optimize key creation of pool fp32 grad.
test=develop
6 years ago
xiaoli.liu@intel.com
c4187dbd7c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dequantize-reuse
6 years ago
Tao Luo
d5a888e15c
Merge pull request #15943 from kbinias/kbinias/add-placement-pass-tester
...
MKL-DNN: Add placement pass tester
6 years ago
Tao Luo
ba90e05281
Merge pull request #15917 from jczaja/prv-tensor-mkldnn-ops
...
[MKL-DNN] Adjusting ops to Tensor modifications
6 years ago
mozga-intel
68a9ead17a
The flag of mkldnn is enabled iff it is necessary
...
test=develop
6 years ago
baojun-nervana
e4ab40a7b9
added concat op test=develop
6 years ago
Krzysztof Binias
72253391b6
Add MKL-DNN placement pass tester
...
test=develop
6 years ago
minqiyang
cb85ee987b
Remove var op deps in imperative mode
...
test=develop
6 years ago
colourful-tree
7d8f639883
Merge pull request #15902 from colourful-tree/new_develop
...
remove mkldnn & fix commit
6 years ago
Tao Luo
436dfbb342
fix cpplint error of async_executor.h
...
test=develop
6 years ago
Tao Luo
28680c65d9
enable cpplint, remove go_fmt
6 years ago
Tao Luo
effec86600
Merge pull request #15913 from liangan1/func_coverage
...
Enable function coverage for U8/S8 ConvMKLDNNOpKernel
6 years ago
Zhen Wang
e00c7a2e26
Merge pull request #15830 from wzzju/add_ir_node_encapsulation
...
add IrNode&IrVarNode&IrOpNode. test=develop
6 years ago
tensor-tang
8bc6381546
fix jitcodekey and refine test
...
test=develop
6 years ago
tensor-tang
7044cfa7c7
add sgd jitcode and op test
...
test=develop
6 years ago
tensor-tang
8e04133719
add benchmark and mkl sgd implement
...
test=develop
6 years ago
tensor-tang
07efdb5139
Merge remote-tracking branch 'ups/develop' into jit/sgd
6 years ago
Jacek Czaja
c63f6b2039
- MKL-DNN pooling updated to set_prim_desc
...
- MKLDNN ops revisited
- disabled softmax modifications
- disabled elementwise_add
- reverted LRN modifications
- reverted SUM primitive
- Partial reviing of softmax
- Enable softmax
- Softmax changes
- LRN is back
- LRN partially disabled
- LRN is back
- LRN fix
- compilation fixes
- Sum fixed(hopefully)
- Enabling (partially) elementwise_add
- Fixes to elementwise_add
- Lint fixes
quantize fix
- compilation fix
test=develop
Disabling pooling
- Disabled quantize op
test=develop
6 years ago
shippingwang
3398293272
add API.spec. test=develop
6 years ago
shippingwang
5ce46c637a
fix api.spec, test=develop
6 years ago
qingqing01
8e439ccfff
Fix bug in fake_quantize_op and add more unit testing ( #15912 )
6 years ago
qingqing01
f4846bf3dc
loosely check in the InferShape of cross_entropy_op. ( #15863 )
...
* loosely check in cross_entropy_op when soft_label is True
* Add Runtime assertion in backward infer_shape check.
* Skip InferShape check when the input dimensions are unknown
6 years ago
minqiyang
28077c4da6
Add gperftools into imperative tracer
...
test=develop
6 years ago
Yihua Xu
7396788694
Optimize gelu operation with mkl erf.
...
test=develop
6 years ago
nhzlx
0ed63b2108
6. delete useless predictor id
...
test=develop
6 years ago
xiaoli.liu@intel.com
70759d181b
Optimize INT8 DeQuantize Op with primitive reuse.
...
test=develop
6 years ago
minqiyang
efb2f2baf8
Fix bugs
...
test=develop
6 years ago
Yiqun Liu
f4634d76d7
Optimize the CUDA implementation of sequence_expand op by reduce the times of copying lod data from CPU to GPU. ( #15493 )
...
* Optimize the CUDA implementation of sequence_expand op by reduce the times of copying lod data from CPU to GPU.
test=develop
* Refine the op benchmark to support setting lod in config.
test=develop
6 years ago
Tao Luo
60546b78cc
Merge pull request #15923 from Sand3r-/mgallus/conv-residual-ut
...
Add Conv Residual Connection UT for Projection
6 years ago
guomingz
630c1e8317
This PR improve performance of prior_box op about 1.25x faster on CPU. ( #15909 )
...
* This PR improve performance of prior_box op about 1.25x faster on CPU.
* Test Env:SKX 8180 with fake data on 28 threads(bs=1).
* The below table shows the ~25% improvement which generated by [eval_tp_fake_data.py](https://github.com/PaddlePaddle/Paddle/issues/15618#issuecomment-464613976 ).
| Type |Event | Calls | Total | Min. | Max. | Ave. | Ratio.|
| ---------------- | ------------------ | ---- | ------- | -------- | -------- | ------------ | -------- |
| w/ optimization | thread0::prior_box | 6000 | 921.201 | 0.110572 | 0.383402 | **0.153533** | 0.084585 |
| w/o optimization | thread0::prior_box | 6000 | 1151.85 | 0.102276 | 0.426702 | **0.191976** | 0.103337 |
test=develop
* Fix the style issue.
test=develop
6 years ago
Tao Luo
9c05421c97
Merge pull request #15914 from Sand3r-/mgallus/mkldnn-sum-code-reuse
...
Refactor MKL-DNN Sum to use reference version on fallback
6 years ago
chengduo
7ca8553d4e
Add alloc_continuous_space_op ( #15900 )
...
* add alloc_continuous_space_op
test=develop
* Polish code
test=develop
* follow comment
test=develop
6 years ago
wopeizl
2192c46436
Merge pull request #15916 from wopeizl/win/fixevent1
...
fix build issue for cudaEvent_t
6 years ago
baojun-nervana
2ffacdebc2
Update ngraph version to v0.14 test=develop
6 years ago
Michal Gallus
6a2bc9a275
Add Conv Residual Connection UT for Projection
...
test=develop
6 years ago
Zhen Wang
548931456c
update some functions' names according to the suggestion. test=develop
6 years ago
Michal Gallus
6ebe9877bb
Improve code reuse at MKL-DNN sum
...
test=develop
6 years ago
dzhwinter
660e410655
Merge pull request #15855 from dzhwinter/fix/nightly_test
...
accelerate memory optimize process
6 years ago
peizhilin
c6472579c0
test=develop
6 years ago
peizhilin
b5d6e38b05
fix build issue for cudaEvent_t
...
test=develop
6 years ago
minqiyang
b420ec3a92
invoke backward_hooks after reduce op's depcounts map
...
test=develop
6 years ago
Qiyang Min
4bd28b304b
Merge pull request #15831 from velconia/imperative_engine
...
Imperative training network to the end
6 years ago
Xin Pan
a6e3cd5eb7
Merge pull request #15425 from panyx0718/api
...
Pass graph to parallel executor instead of program
6 years ago
liangan1
4acc522087
Enable function coverage for U8/S8 ConvMKLDNNOpKernel
...
test=develop
6 years ago
wopeizl
3ccd8964a4
Merge pull request #15905 from wopeizl/win/fix_eigen
...
fix build issue on windows for sample prop op
6 years ago
chengduo
8e904d322f
Remove unnecessary dependence for profiler ( #15899 )
...
* refine profiler
test=develop
* follow comment
test=develop
6 years ago
Zhen Wang
9261cf39db
update with develop. test=develop
6 years ago
Zhen Wang
0bf809c9b3
add set_attr for IrOpNode. test=develop
6 years ago
qingqing01
d8128930ef
Refine doc of uniform_random and fix dtype ( #15873 )
...
* Refine doc of uniform_random and fix dtype
* Update default value in the arguments
6 years ago
Xin Pan
44e7fcddc5
Merge pull request #15844 from panyx0718/infer
...
add per kernel config and remove const_cast.
6 years ago
dzhwinter
a71f2fbe4f
fix default value. test=develop
6 years ago
Jacek Czaja
dec9cf53c8
[MKL-DNN] MKL-DNN specific Tensor modification ( #15429 )
...
* - Implemented draft of primitive desc keeping in Tensor
test=develop
- TransposeMKLDNNHandler::AcquireSrcMemory was reimplemented
- Added nchw and nc formats setting for sake of compatibility
Fixed unit tests
- Workaround for problem with 5D data in conv
- Added 3D and 1D MKL-DNN formats for name handles for tensor
test=develop
- Fix to UTs
test=develop
- Conv fp32 op was updated
Cosmetic fixes
test=develop
- tensor mkldnn cosmetics
test=develop
- Moved most of mkl-dnn specific code from Tensor to mkl-dnn utils
* - Lint fixes
test=develop
* - setting prim desc in Tensor also sets layout to kMKLDNN
test=develop
* - Moved creation of prim desc totally out of Tensor
test=develop
* - Cosmetic fixes after review
test=develop
6 years ago
heqiaozhi
08c96d1b48
remove mkldnn & fix commit
...
test=develop
6 years ago
minqiyang
84bf4d7b06
Move ClearBlock into OpBase and VarBase's destructor
...
test=develop
6 years ago