tink2123
837ad7f86f
Add the inverse trigonometric function
...
test=develop
6 years ago
tensor-tang
14a764c930
simplify the jitkernel templates and tests
...
test=develop
6 years ago
Yiqun Liu
5bde120243
Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. ( #16106 )
...
test=develop
6 years ago
Zhaolong Xing
3d63aa0a11
Merge pull request #15729 from NHZlX/add_static_model_load_for_trt
...
Four points for enhancing Paddle-TRT
6 years ago
jerrywgz
b0e3c02410
Merge pull request #15952 from jerrywgz/fpn_ops
...
add distribute fpn proposals op, test=develop
6 years ago
tensor-tang
802f362ac4
unify the kernelfuncs cache and add unit test
...
test=develop
6 years ago
Yiqun Liu
36e2d3241e
Enhance the op benchmark: ( #16066 )
...
- Support setting attr in config
- Support setting dtype and initializer for input in config
test=develop
6 years ago
tensor-tang
9be825a982
polish the cast op doc ( #16078 )
...
* polish the cast op doc
test=develop
* follow comments
test=develop
* fix api.spec
test=develop
6 years ago
jerrywgz
847bb6a279
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fpn_ops
6 years ago
jerrywgz
893789a0d1
Merge pull request #16050 from jerrywgz/add_box_decoder_and_assign
...
Add box decoder and assign
6 years ago
xiaolil1
a177d48217
Add Requantize OP ( #15318 )
...
* Enable INT8 ReQuantize OP
test=develop
* Clean code
test=develop
* Add comments
test=develop
* Revert "Clean code"
test=develop
This reverts commit a7a49b8aa214f9730cb84e11ea96da564fe4b4d9.
* Modify requantize op test
test=develop
* fix requantize UT by moving public function to public test file.
test=develop
* Fix test fail due to file address change.
test=develop
* Change file address for requantize op.
test=develop
6 years ago
chengduo
f5a3751845
Refine recurrent_op ( #16027 )
...
* refine recurrent_op
test=develop
* remove unnecessary code
test=develop
6 years ago
sneaxiy
7b608396fe
fix travis-ci format check
...
test=develop
6 years ago
tensor-tang
6057f36208
Merge pull request #15996 from tensor-tang/op/embgrad
...
refine embeddingseqpool grad
6 years ago
chengduo
c67afb0f76
Fix reshape bug ( #16069 )
...
* In some cases, the input may have more than one negative value.
test=develop
* fix matmul bug
test=develop
6 years ago
sneaxiy
33138a421d
remove match check
...
test=develop
6 years ago
Zhen Wang
8063b31e2d
Reduce redundant code for channel wise dequant op. test=develop
6 years ago
Zhen Wang
e8f9dac7ab
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into channel_wise_quant_op
...
test=develop
6 years ago
Zhen Wang
806832e091
update the input format of channel wise dequantize op.
6 years ago
jerrywgz
072eca348a
refine doc, test=develop
6 years ago
Kaipeng Deng
6d8771b55c
Merge pull request #15864 from heavengate/spectral_norm
...
Add spectral norm op
6 years ago
sneaxiy
814a759061
merge develop
...
test=develop
6 years ago
sneaxiy
597dc65e76
enhance gc
...
test=develop
6 years ago
baojun
da45fbdaf5
fix tanh typo test=develop ( #16049 )
6 years ago
whs
0f99d24083
Make sequence_erase op support input with multi-level LoD. ( #15982 )
...
test=develop
6 years ago
Zhen Wang
89dee160d1
add channel wise dequantize op.
6 years ago
jerrywgz
b4f5180299
fix doc, test=develop
6 years ago
jerrywgz
a1ef7df865
refine code, test=develop
6 years ago
tensor-tang
12eb9aecde
Merge remote-tracking branch 'ups/develop' into op/embgrad
...
test=develop
6 years ago
jerrywgz
d497bd9079
resolve conflict, test=develop
6 years ago
jerrywgz
41471d28ac
add box_coder_and_assign, test=develop
6 years ago
lidanqing
02c106c717
MKLDNN: Add UT for conv_transpose_mkldnn op. ( #16030 )
...
* MKLDNN: Add UT for conv_transpose_mkldnn op.
test=develop
* MKLDNN: Add fuse_bias check UT for conv_transpose_mkldnn op.
test=develop
6 years ago
dengkaipeng
3eab9e4b95
fix statement. test=develop
6 years ago
dengkaipeng
e37f5ab5b1
fix API.spec. test=develop
6 years ago
dengkaipeng
54bbbfa71f
fix doc statement. test=develop
6 years ago
dengkaipeng
c1a69e3ea0
refine doc. test=develop
6 years ago
dengkaipeng
65d375a09f
fix format. test=develop
6 years ago
dengkaipeng
82d514345c
fix spectral_norm doc. test=develop
6 years ago
dengkaipeng
2ea5843cbf
add doc and test_layers. test=develop
6 years ago
dengkaipeng
037855f42d
fix attr dim calc. test=develop
6 years ago
dengkaipeng
70dbd59839
add grad kernel for spectral_norm. test=develop
6 years ago
dengkaipeng
72509ec3bd
add unittest for spectral_norm. test=develop
6 years ago
dengkaipeng
3bf1ae9b59
add spectral_norm forward kernel
6 years ago
Zhen Wang
545247d7b4
add channel wise quantize op.
6 years ago
tensor-tang
b16dabd7e0
refine vbroadcast jitcode
...
test=develop
6 years ago
tensor-tang
c2e56e6bbc
Merge remote-tracking branch 'ups/develop' into op/embgrad
6 years ago
chengduo
e2da3a5b22
Revert "Add Event for TensorCopy" ( #16022 )
...
* Revert "Add Event for TensorCopy (#15953 )"
This reverts commit 7235fd662b.
test=develop
* fix CI
test=develop
6 years ago
baojun
9aaea38c0a
fix cpplint test=develop ( #16028 )
6 years ago
chengduo
7235fd662b
Add Event for TensorCopy ( #15953 )
...
Add Event for TensorCopy
6 years ago
Tink_Y
31d830de9f
refine image_resize annotation ( #15976 )
...
* fix image_resize annotation
test=develop
* fix some typo
* Update nn.py
* Update interpolate_op.cc
test=develop
6 years ago
tensor-tang
641b3cccce
add vbroadcast mkl code and jitcode
...
test=develop
6 years ago
tensor-tang
41a1270856
add vbroadcast jitkernel refer code and use it
...
test=develop
6 years ago
tensor-tang
867e93b21a
add jitkernel vcopy and speedup unit test time
...
test=develop
6 years ago
sneaxiy
3334c279d0
add sample_generator
...
test=develop
6 years ago
jerrywgz
c31da7899a
refine code, test=develop
6 years ago
Yiqun Liu
798925453e
Revert "Optimize while_op when is_test is true. ( #15811 )" ( #15968 )
...
test=develop
6 years ago
Yiqun Liu
87248281f7
Fix error in CUDA kernel of beam_search. ( #15957 )
...
test=develop
6 years ago
jerrywgz
e8a8fe07e7
fix code for windows CI, test=develop
6 years ago
jerrywgz
149411762a
add gpu kernel, test=develop
6 years ago
Tao Luo
4efdebc6f6
Merge pull request #15931 from yihuaxu/develop_2c5c7b2a7_gelu_mkl_opt
...
Optimize gelu operation with mkl erf
6 years ago
tensor-tang
e5f9d3a47c
Merge pull request #15892 from tensor-tang/jit/sgd
...
refine sgd op
6 years ago
Tao Luo
e6bab55f1b
Merge pull request #15959 from luotao1/infershape_refine
...
refine infershape of sequence_enumerate, hash and fuse_emb_seq_pool
6 years ago
Yiqun Liu
613d9d0756
Optimize while_op when is_test is true. ( #15811 )
...
test=develop
6 years ago
xiaolil1
1abddd8d97
Optimize Quantize Op with primitive reuse. ( #15929 )
...
test=develop
6 years ago
Tao Luo
7ec97a0a7e
Merge pull request #15930 from xiaolil1/dequantize-reuse
...
Optimize INT8 DeQuantize Op with primitive reuse.
6 years ago
nhzlx
2eff3e26b6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt
6 years ago
nhzlx
06a088a199
fix comments and fix cpplint
...
test=develop
6 years ago
luotao1
34404f9c31
refine infershape of sequence_enumerate, hash and fuse_emb_seq_pool
...
test=develop
6 years ago
baojun
f285191fb3
Added adam op test=develop ( #15710 )
6 years ago
jerrywgz
b92ef45fe9
Merge pull request #15678 from jerrywgz/refine_softmax_with_cross_entropy
...
change default option related to softmax, test=develop
6 years ago
mozga-intel
558f94cd77
Register sum operator ( #15889 )
...
test=develop
6 years ago
tensor-tang
58b8231338
added concat op test=develop ( #15946 )
6 years ago
Tao Luo
47d36b2008
Merge pull request #15924 from baojun-nervana/ngraph_v14
...
Update ngraph version to v0.14
6 years ago
jerrywgz
0f652f304c
add distribute fpn proposals op, test=develop
6 years ago
dzhwinter
225c11a91f
polish cudnn related code and fix bug. ( #15164 )
...
* staged.
* polish code
* polish code. test=develop
* polish code. test=develop
* api change. test=develop
* fix default value. test=develop
* fix default value. test=develop
6 years ago
sneaxiy
69b1ebdfa5
merge develop
...
test=develop
6 years ago
Yiqun Liu
454f4f2140
Rewrite is_empty op to avoid unnecessary data transform. ( #15509 )
...
* Rewrite is_empty op to avoid unnecessary data transform.
test=develop
* Add the implementation of InferShape and InferVarType for is_empty op.
test=develop
* Rewrite is_empty op to avoid directly inheriting from OperatorBase.
test=develop
6 years ago
xiaolil1
6724be2b0d
INT8 Pool kernel Key Creation Optimization. ( #15883 )
...
* Optimize key creation of INT8 pool kernel to improve the performance of ResNet-50 and MobileNet, especially for latency.
test=develop
* Optimize key creation of pool fp32 grad.
test=develop
6 years ago
xiaoli.liu@intel.com
c4187dbd7c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dequantize-reuse
6 years ago
Tao Luo
ba90e05281
Merge pull request #15917 from jczaja/prv-tensor-mkldnn-ops
...
[MKL-DNN] Adjusting ops to Tensor modifications
6 years ago
baojun-nervana
e4ab40a7b9
added concat op test=develop
6 years ago
colourful-tree
7d8f639883
Merge pull request #15902 from colourful-tree/new_develop
...
remove mkldnn & fix commit
6 years ago
Tao Luo
effec86600
Merge pull request #15913 from liangan1/func_coverage
...
Enable function coverage for U8/S8 ConvMKLDNNOpKernel
6 years ago
tensor-tang
8bc6381546
fix jitcodekey and refine test
...
test=develop
6 years ago
tensor-tang
7044cfa7c7
add sgd jitcode and op test
...
test=develop
6 years ago
tensor-tang
8e04133719
add benchmark and mkl sgd implement
...
test=develop
6 years ago
tensor-tang
07efdb5139
Merge remote-tracking branch 'ups/develop' into jit/sgd
6 years ago
Jacek Czaja
c63f6b2039
- MKL-DNN pooling updated to set_prim_desc
...
- MKLDNN ops revisited
- disabled softmax modifications
- disabled elementwise_add
- reverted LRN modifications
- reverted SUM primitive
- Partial reviving of softmax
- Enable softmax
- Softmax changes
- LRN is back
- LRN partially disabled
- LRN is back
- LRN fix
- compilation fixes
- Sum fixed(hopefully)
- Enabling (partially) elementwise_add
- Fixes to elementwise_add
- Lint fixes
quantize fix
- compilation fix
test=develop
Disabling pooling
- Disabled quantize op
test=develop
6 years ago
qingqing01
8e439ccfff
Fix bug in fake_quantize_op and add more unit testing ( #15912 )
6 years ago
qingqing01
f4846bf3dc
Loosely check in the InferShape of cross_entropy_op. ( #15863 )
...
* Loosely check in cross_entropy_op when soft_label is True
* Add runtime assertion in backward infer_shape check.
* Skip InferShape check when the input dimensions are unknown
6 years ago
Yihua Xu
7396788694
Optimize gelu operation with mkl erf.
...
test=develop
6 years ago
nhzlx
0ed63b2108
6. delete useless predictor id
...
test=develop
6 years ago
xiaoli.liu@intel.com
70759d181b
Optimize INT8 DeQuantize Op with primitive reuse.
...
test=develop
6 years ago
Yiqun Liu
f4634d76d7
Optimize the CUDA implementation of sequence_expand op by reducing the number of times lod data is copied from CPU to GPU. ( #15493 )
...
* Optimize the CUDA implementation of sequence_expand op by reducing the number of times lod data is copied from CPU to GPU.
test=develop
* Refine the op benchmark to support setting lod in config.
test=develop
6 years ago
guomingz
630c1e8317
This PR improves performance of prior_box op by about 1.25x on CPU. ( #15909 )
...
* This PR improves performance of prior_box op by about 1.25x on CPU.
* Test Env: SKX 8180 with fake data on 28 threads (bs=1).
* The table below shows the ~25% improvement, as measured by [eval_tp_fake_data.py](https://github.com/PaddlePaddle/Paddle/issues/15618#issuecomment-464613976 ).
| Type |Event | Calls | Total | Min. | Max. | Ave. | Ratio.|
| ---------------- | ------------------ | ---- | ------- | -------- | -------- | ------------ | -------- |
| w/ optimization | thread0::prior_box | 6000 | 921.201 | 0.110572 | 0.383402 | **0.153533** | 0.084585 |
| w/o optimization | thread0::prior_box | 6000 | 1151.85 | 0.102276 | 0.426702 | **0.191976** | 0.103337 |
test=develop
* Fix the style issue.
test=develop
6 years ago
Tao Luo
9c05421c97
Merge pull request #15914 from Sand3r-/mgallus/mkldnn-sum-code-reuse
...
Refactor MKL-DNN Sum to use reference version on fallback
6 years ago
chengduo
7ca8553d4e
Add alloc_continuous_space_op ( #15900 )
...
* add alloc_continuous_space_op
test=develop
* Polish code
test=develop
* follow comment
test=develop
6 years ago
baojun-nervana
2ffacdebc2
Update ngraph version to v0.14 test=develop
6 years ago
Michal Gallus
6ebe9877bb
Improve code reuse at MKL-DNN sum
...
test=develop
6 years ago
sneaxiy
c545f1ed8f
unify API
...
test=develop
6 years ago
liangan1
4acc522087
Enable function coverage for U8/S8 ConvMKLDNNOpKernel
...
test=develop
6 years ago
sneaxiy
a8c4324d3c
fix hang bug
6 years ago
Xin Pan
44e7fcddc5
Merge pull request #15844 from panyx0718/infer
...
add per kernel config and remove const_cast.
6 years ago
Jacek Czaja
dec9cf53c8
[MKL-DNN] MKL-DNN specific Tensor modification ( #15429 )
...
* - Implemented draft of primitive desc keeping in Tensor
test=develop
- TransposeMKLDNNHandler::AcquireSrcMemory was reimplemented
- Added nchw and nc formats setting for sake of compatibility
Fixed unit tests
- Workaround to problem with 5D data in conv
- Added 3D and 1D MKL-DNN formats for name handles for tensor
test=develop
- Fix to UTs
test=develop
- Conv fp32 op was updated
Cosmetic fixes
test=develop
- tensor mkldnn cosmetics
test=develop
- Moved most of mkl-dnn specific code from Tensor to mkl-dnn utils
* - Lint fixes
test=develop
* - Setting prim desc in Tensor also sets layout to kMKLDNN
test=develop
* - Moved creation of prim desc totally out of Tensor
test=develop
* - Cosmetic fixes after review
test=develop
6 years ago
heqiaozhi
08c96d1b48
remove mkldnn & fix commit
...
test=develop
6 years ago
Xin Pan
5dd281f738
polish
...
test=develop
6 years ago
heqiaozhi
fab09ac0b8
Merge branch 'new_develop' of https://github.com/colourful-tree/Paddle into new_develop
6 years ago
heqiaozhi
da4f5a2f18
remove mkl & fix commit
...
test=develop
6 years ago
colourful-tree
f2d6473ef8
Merge branch 'develop' into new_develop
6 years ago
heqiaozhi
04f876f5bc
remove mkl & fix commit
6 years ago
dengkaipeng
373cfb0ccf
use kernel size in global_pooling. test=develop
6 years ago
dengkaipeng
60305196b8
fix spell mistakes. test=develop
6 years ago
Tao Luo
8a7efc78f1
Merge pull request #15882 from sfraczek/unique_ptr_dereference
...
Change *(smart_ptr.get()) -> *smart_ptr
6 years ago
tensor-tang
a0c37662b9
enable sgd jitkernel refer code and test
...
test=develop
6 years ago
xuezhong
1dad36f6aa
Merge pull request #15609 from xuezhong/add_sample_logits_op
...
add sample_logits and sampled_softmax_with_cross_entropy op
6 years ago
Kaipeng Deng
9e524a7b51
Merge pull request #15870 from heavengate/fix_adaptive_pool_doc
...
fix adaptive pool doc.test=develop
6 years ago
sneaxiy
1e4c0a6f72
merge develop
6 years ago
dengkaipeng
14df92fe8f
fix spell error. test=develop
6 years ago
dengkaipeng
144016fcfc
fix adaptive_pool and yolov3_loss. test=develop
6 years ago
Sylwester Fraczek
74672d1aff
Change *(smart_ptr.get()) -> *smart_ptr
...
reason: dereferencing smart pointer is the same as the underlying pointer
test=develop
6 years ago
tensor-tang
ee2321debd
Revert 15770 develop a6910f900 gelu mkl opt ( #15872 )
...
* Revert "Optimize Gelu with MKL Erf function (#15770 )"
This reverts commit 676995c86c.
* test=develop
6 years ago
xuezhong
81870723c6
Merge pull request #15605 from xuezhong/fix_bug_for_lstmp
...
Fix bug for lstmp
6 years ago
dengkaipeng
eb65b4e47d
\frac -> \frac. test=develop
6 years ago
nhzlx
1d5ef7c9ee
5. add static trt load model
...
1). add static trt load model
2). fix bug: when device_id is not 0, the trt will have a bug
test=develop
6 years ago
dengkaipeng
8167588f14
add blank after math::. test=develop
6 years ago
dengkaipeng
d9ec605873
use math:: instead of 29. test=develop
6 years ago
dengkaipeng
19292ac6a1
fix adaptive pool doc.test=develop
6 years ago
Yiqun Liu
7d96c74ab2
Initialize the benchmark tester for operator. ( #15772 )
...
* Initialize the benchmark tester for operator.
test=develop
* Rearrange the codes.
test=develop
6 years ago
Yihua Xu
676995c86c
Optimize Gelu with MKL Erf function ( #15770 )
...
* Optimize for gelu operator
* Set up the low accuracy mode of MKL ERF function.
test=develop
* Only enable MKLML ERF when OS is linux
* Use the special mklml version that includes the vmsErf function to verify the gelu mkl kernel.
test=develop
* Add the CUDA macro to avoid NVCC's compile issue.
test=develop
* Add the TODO comments for mklml library modification.
test=develop
* Clean Code
test=develop
* Add the comment of macro for NVCC compiler.
test=develop
6 years ago
mozga-intel
5d132ecf83
Auto-cmake generator, auto-fill map ( #15402 )
...
test=develop
6 years ago
Krzysztof Binias
1578c60bdd
Add new ut and remove unnecessary code
...
test=develop
6 years ago
Xin Pan
5eb87506bc
add per kernel config and remove const_cast.
...
test=develop
6 years ago
Dun
a83e470405
Profiler refine and add CUDA runtime api tracer ( #15301 )
...
* refine profiler && add runtime tracer
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* fix bug && test=develop
* add thread id map && test=develop
* test=develop
* testing
* bug fix
* remove cuda event && refine code && test=develop
* test=develop
* test=develop
* test=develop
* fix windows temp file && test=develop
* test=develop
* fix windows bug && test=develop
* fix start up issue && test=develop
* code polish && test=develop
* remove unused code && test=develop
* add some cupti cbid && test=develop
* add FLAGS_multiple_of_cupti_buffer_size && test=develop
* fix compile error && test=develop
* add keyword && test=develop
* fix && test=develop
* code polish && test=develop
6 years ago
sneaxiy
7160cb0f32
decoupled reader
...
test=develop
6 years ago
mozga-intel
13ec2d331b
Enable momentum operator for a ngraph engine ( #15673 )
...
* Enable momentum operator for a ngraph engine
test=develop
* Update tests
test=develop
* Removed an unnecessary line of code, as intended
test=develop
6 years ago
xuezhong
eb7bc3e7ea
remove non-ascii character
...
test=develop
6 years ago
tensor-tang
e1c707fe9c
fix warnings ( #15790 )
...
* fix warnings
test=develop
* fix enforce test
test=develop
6 years ago
xuezhong
d328660304
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bug_for_lstmp
6 years ago
xuezhong
f2262d7336
update comment
...
test=develop
6 years ago
Tao Luo
6402424f7a
Merge pull request #15773 from chengduoZH/fix_shape_api_doc
...
Fix shape api doc
6 years ago
xuezhong
d12252e6a6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_sample_logits_op
...
test=develop
6 years ago
xuezhong
c5360a3f6b
refine code
6 years ago
tensor-tang
5aea2cd2e0
Merge pull request #15652 from tensor-tang/refine/pyramiddnn
...
refine fused emb seq pool
6 years ago
mozga-intel
df23a6f894
Enable cross_entropy operator for a ngraph engine ( #15674 )
...
* Enable cross_entropy operator for a ngraph engine
test=develop
* Update tests
test=develop
* Added PADDLE_ENFORCE for the batch_norm operator
test=develop
* Update the message about which formats are supported right now
test=develop
6 years ago
Yiqun Liu
56a5039e24
Correct the doc in Python API ( #15725 )
...
* Correct the comment in control_flow.py.
* Correct the argument list of ops.
test=develop
* Update API.spec.
test=develop
* Skip op_callstack attr for all op apis.
test=develop
* Remove use_mkldnn and is_test from python api.
test=develop
* Remove use_mkldnn and is_test from op_proto_maker and hard-coding them in python when generating doc string.
test=develop
6 years ago
baojun
72061b0ac0
Add ngraph op coverage ( #15721 )
6 years ago
chengduozh
d79d2f686c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_shape_api_doc
...
test=develop
6 years ago
xuezhong
4424021623
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_sample_logits_op
6 years ago
nhzlx
2070fb246d
4. do the trt_engine optim during init.
...
add simple static mode loading
test=develop
6 years ago
Yihua Xu
685a20ef56
Add JIT CRF_decoding and Layer_norm unit-test ( #15699 )
...
* Add the CRFDecoding and LayerNorm's test case
test=develop
* Fix the size checking issue
test=develop
* Remove the remnant code
test=develop
* Add TestAllImpls and double support
test=develop
* Clean Code
test=develop
* Add benchmark test for LayerNorm & CRFDecoding
test=develop
6 years ago
tensor-tang
75fc792d40
fix when table width larger than 64
...
test=develop
6 years ago
tensor-tang
40402d5e68
add emb seqpool jitcode
...
test=develop
6 years ago
tensor-tang
2ccbcb157d
Merge remote-tracking branch 'ups/develop' into refine/pyramiddnn
6 years ago
chengduozh
3ce12b1b8e
fix shape api doc
...
test=develop
6 years ago
Dun
5e6834d891
inplace group_norm ( #15754 )
...
* inplace group
* test=develop
6 years ago
Hongyu Liu
8c0292dead
Merge pull request #15717 from phlrain/fix_leak
...
Fix lstm possible leak
6 years ago
Tao Luo
4da291c6a3
Merge pull request #15726 from qingqing01/fix_api_doc
...
Fix row_conv doc
6 years ago
nhzlx
ecc12fb430
3. when running in trt mode, do not allocate memory for parameters in fluid.
...
test=develop
6 years ago
Dun
e4b9fcdbd2
Stricter check for load_combine_op. ( #15479 )
...
* fix && test=develop
* fix && test=develop
* test=develop
6 years ago
qingqing01
48a5cccbcd
Fix debug mode in prior_box_op ( #15702 )
...
* Fix debug mode in prior_box_op
* Refine code
6 years ago
Dang Qingqing
2868232556
Fix row_conv doc
...
test=develop
6 years ago
tensor-tang
a3a3d3d861
add embseqpool jitkernel mkl impl and use it
...
test=develop
6 years ago
tensor-tang
15da2f9a0d
add embseqpool jitkernel refer code, test and benchmark
...
test=develop
6 years ago
tensor-tang
c2ccf14590
Merge remote-tracking branch 'ups/develop' into refine/pyramiddnn
6 years ago
qingqing01
abcefe7211
Fix debug mode in fake_quantize_op ( #15693 )
...
* Fix debug mode in fake_quantize_op
* Remove template specialization
6 years ago
liuhongyu
029be5fda9
fix lstmp bug; test=develop
6 years ago
nhzlx
9cc6249cd6
2. TRTEngine uses stream only when executing.
6 years ago
liuhongyu
393fa6021e
set lstm lstmp unused pointer to nullptr; test=develop
6 years ago
liuhongyu
869f00ffc6
set lstm lstmp unused pointer to null
6 years ago
nhzlx
034ba1c291
add static model load for trt
...
1. bind trt input and output to fluid tensors
6 years ago
jerrywgz
6f11f35abe
Merge pull request #15703 from jerrywgz/enhance_expand_op
...
support multiple var types for expand op
6 years ago
Tao Luo
3086502522
Merge pull request #15704 from Sand3r-/mgallus/old-fc-mkldnn-branch-fix-develop
...
Fix old FC backward weights descriptor creation
6 years ago
baojun
c47e258ea4
Add ngraph sum, sigmoid, relu_grad and tanh_grad op ( #15642 )
...
* Added ngraph sum op test=develop
* Added sigmoid, relu_grad and tanh_grad test=develop
* remove duplicates test=develop
6 years ago
tensor-tang
33d0cebbff
Merge pull request #15695 from tensor-tang/fix/name
...
fix jitcode name, use after free
6 years ago
Michal Gallus
7a8eff36a6
Fix old FC backward weights descriptor creation
...
test=develop
6 years ago
chengduo
ad61e1b22c
fix potential bug ( #15688 )
...
test=develop
6 years ago
dzhwinter
f9ac88e1a0
Merge pull request #15694 from liuwei1031/fix_security_issue
...
Fix security issue
6 years ago
jerrywgz
8fc0fc314a
support multiple var types for expand op, test=develop
6 years ago
tensor-tang
fb2a7b2300
fix aligned-new error in jitkernel ( #15626 )
...
* fix aligned-new error in jitkernel
test=develop
* override genbase new to fix mis-align
test=develop
6 years ago
乔龙飞 Qiao Longfei
08ad72d0b9
Merge pull request #15679 from jacquesqiao/update-lookup_table_grad-padding-index
...
lookup_table_grad kernel should consider padding_idx test=develop
6 years ago
Tao Luo
d9270e34d1
Merge pull request #15691 from luotao1/activation_doc
...
fix generate doc error in activation ops
6 years ago
tensor-tang
15d7220f94
fix jitcode name
...
test=develop
6 years ago
tensor-tang
31fd8ce1e1
Merge pull request #15375 from mozga-intel/mozga-intel/batch_norm_ngraph_operator
...
Enable batch_norm operator for a ngraph engine
6 years ago
liuwei1031
b1f97a6fa9
fix security issue 27, 38 test=develop
6 years ago
Tao Luo
882e7ec480
fix generate doc error in activation ops
...
test=develop
6 years ago
Gabor Buella
da9c94da33
Clang build fixes ( #15628 )
...
* Remove some superfluous std::move calls
The std::move triggered a build error (with -Werror):
```
[ 9%] Building CXX object paddle/fluid/memory/allocation/CMakeFiles/allocator_facade.dir/allocator_facade.cc.o
/home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
[this] { return std::move(CreateAllocatorWithChunk()); }, capacity);
^
/home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: note: remove std::move call here
[this] { return std::move(CreateAllocatorWithChunk()); }, capacity);
^~~~~~~~~~ ~
1 error generated.
```
See: https://reviews.llvm.org/D7633
* Remove a superfluous lambda capture from framework/operator.h
```
[ 10%] Building CXX object paddle/fluid/platform/CMakeFiles/device_context.dir/init.cc.o
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/platform/init.cc:19:
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.h:229:21: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
[this](Variable* var) { return var; });
^~~~
1 error generated.
```
Changing it to `return it->second;`, as is in the function below.
* Rethrow an exception (instead of copying it)
```
[ 11%] Building CXX object paddle/fluid/framework/CMakeFiles/operator.dir/operator.cc.o
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: error: local variable 'exception' will be copied despite being thrown by name [-Werror,-Wreturn-std-move]
throw exception;
^~~~~~~~~
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: note: call 'std::move' explicitly to avoid copying
throw exception;
^~~~~~~~~
std::move(exception)
```
See https://reviews.llvm.org/D43322 for an explanation of this diagnostic message.
* Remove an unused variable
```
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:884:16: error: private field 'scope_' is not used [-Werror,-Wunused-private-field]
const Scope& scope_;
^
```
* struct ComputationOpHandle -> class ComputationOpHandle
```
[ 13%] Building CXX object paddle/fluid/framework/details/CMakeFiles/memory_early_delete_pass.dir/memory_early_delete_pass.cc.o
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/memory_early_delete_pass.cc:21:
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: error: class 'ComputationOpHandle' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
class ComputationOpHandle;
^
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/computation_op_handle.h:29:8: note: previous use is here
struct ComputationOpHandle : public OpHandleBase {
^
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: note: did you mean struct here?
class ComputationOpHandle;
^~~~~
struct
1 error generated.
```
* Fix name() methods under fluid/operators
```
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.cc:15:
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.h:19:
/home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/jitcode.h:71:23: error: 'name' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override]
virtual const char* name() const = 0;
^
/home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen_base.h:31:23: note: overridden virtual function is here
virtual const char* name() const = 0;
^
```
test=develop
6 years ago
Qiao Longfei
76c1378a70
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into update-lookup_table_grad-padding-index
...
test=develop
6 years ago
Qiao Longfei
29a4b21bc8
fix problem test=develop
6 years ago
Qiao Longfei
7b673bce6a
lookup_table_grad kernel should consider padding_idx test=develop
6 years ago
jerrywgz
5ce48220f1
change default option related to softmax, test=develop
6 years ago
xuezhong
9b24ac34dd
remove debug print
...
test=develop
6 years ago
xuezhong
50b48400bb
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bug_for_lstmp
...
test=develop
6 years ago
dzhwinter
b80bcbb4fd
Merge pull request #15660 from dzhwinter/enhance/memory
...
add elementwise_xxx_grad for inplace optimize
6 years ago
mozga-intel
1198ccae6b
Enable batch_norm operator for a ngraph engine
...
test=develop
6 years ago
xuezhong
58101e6d4d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bug_for_lstmp
...
test=develop
6 years ago
xuezhong
4921c2cd02
add api spec change
...
test=develop
6 years ago
baojun
f4a0e68481
Fix ngraph compile WITH_DISTRIBUTE=ON ( #15636 )
...
* fix compile issue with_distribute test=develop
* simplified logic test=develop
* use ngraph dependency test=develop
* set cpu only test=develop
* update test and eliminate fp16 test test=develop
6 years ago
xuezhong
fb261793b9
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_sample_logits_op
...
test=develop
6 years ago
xuezhong
fb9a6a2bc6
pass test for lstm op
...
test=develop
6 years ago
xuezhong
1abb0d835e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bug_for_lstmp
...
test=develop
6 years ago