ceci3
20f30dd604
add benchmark flag for conv_transpose ( #22389 )
5 years ago
Leo Chen
b96c7c9a7a
polish code, test=develop ( #22380 )
...
remove unnecessary template.
5 years ago
Chengmo
8f36c39537
Fix GEO-SGD init & send Bug ( #22375 )
...
* test=develop, fix geo Send & Init
5 years ago
zhupengyang
c6f888e5a5
update unittest accuracy to float64 for relu, prelu, maxout ( #22273 )
5 years ago
wangchaochaohu
0d8b222b79
Optimize the depthwise op test=develop ( #22265 )
5 years ago
Leo Chen
aaa4fe491a
use function instead of lambda, test=develop ( #22348 )
...
* use function instead of lambda, test=develop
* follow comments, test=develop
5 years ago
Adam
e7a9f6bbb7
[Bugfix] Preserve shape in inplace operators ( #22360 )
5 years ago
qingqing01
2d20869c94
Fix infer_shape in compiling for elementwise_op ( #22291 )
5 years ago
Yiqun Liu
b7cac50b64
Implement a common python unittest to test the ir passes. ( #22209 )
...
* Implement a common python unittest to test the ir passes.
test=develop
* Save the results in np.array and support to startup on CPU.
test=develop
* Fix the unittest.
test=develop
* Add check_program to check whether the optimized program is different from the origin one.
test=develop
* Remove the interface all_ops.
test=develop
* Add exception test in pass_test.
test=develop
5 years ago
tangwei12
82bc814a57
integrated HALF_ASYNC to communicator ( #21869 )
...
* add half_async in the communicator
* fix DistributedStrategy
5 years ago
wangchaochaohu
1e932eccfa
remove unused code test=develop ( #22327 )
5 years ago
Leo Chen
3e5744aa65
Remove unused inputs for some operators ( #22284 )
...
* remove unused inputs, test=develop
* remove unused inputs, test=develop
* update dtype, test=develop
* remove unused inputs, test=develop
* update op_use_default_grad_op_maker, test=develop
* resolve conflicts, test=develop
* follow comments, test=develop
* update center_loss_grad, test=develop
5 years ago
zhangchunle
805328e13b
fix typo in error message ( #22312 )
5 years ago
lidanqing
895f8da7d6
change std::cout to log(INFO), vlog ( #22316 )
5 years ago
石晓伟
8cb04664b9
revert paddle_fluid.map, test=develop ( #22236 )
5 years ago
Chen Weihang
35efbe6d95
Speeding up dygraph DataLoader with multiprocessing ( #21762 )
...
* add multiprocess for dygraph data loader, test=develop
* polish code & add safe guard, test=develop
* refactor dygraph dataloader & add signal handler, test=develop
* fix member initializer compile error on ci, test=develop
* fix member initializer compile error one more, test=develop
* remove useless config, test=develop
* skip windows incompatible problem, test=develop
* add unittest for coverage, test=coverage
* add more exception unittest case, test=develop
* deal with signal handler coverage, test=develop
* polish code & add signal handler tests, test=develop
* deal with coverage ci problem, test=develop
* split data loader test & coverage ci fix, test=develop
* remove test_imperative_data_loader_with_exception, test=develop
* remove signal process except test case, test=develop
* add exception tests again & remove sample list test, test=develop
* split normal and exception unittests to diff class, test=develop
* polish doc for use_multiprocess effect in static mode, test=develop
5 years ago
Zeng Jinle
9435533adf
remove op_use_default_grad_op_maker.spec, test=develop, test=document_fix ( #22300 )
5 years ago
wangchaochaohu
7b76a76495
fix the conda build conflict test=develop ( #22279 )
5 years ago
Zeng Jinle
5e601a92ad
polish grad op check ( #22290 )
...
* polish grad op check, test=develop, test=document_fix
* keep op_use_default_grad_maker.spec to avoid conflict, test=develop, test=document_fix
5 years ago
Bai Yifan
faba4b116a
Remove disable flag in test_fsp_op.py ( #22171 )
...
* fix fsp_op, test=develop
* fix fsp grad op maker, test=develop
* update op_use_default_grad_op_maker.spec, test=develop
5 years ago
Zhen Wang
e40cfb1010
fix the bug of assert_is_op_output. test=develop ( #22262 )
5 years ago
Wojciech Uss
d3a6647372
improve placement pass tests code coverage ( #22197 )
5 years ago
liu zhengxi
07afc29e90
Make api.cc malloc consistent with paddle_api.h for PaddleBuf ( #22255 )
5 years ago
silingtong123
4f1da4adcb
remove the useless third_party library from C++ inference library ( #22021 )
...
* remove the useless third_party library from C++ inference library
* revert removing the install directory
5 years ago
zhouwei25
549e6de7ac
faster build by reducing by-products, reducing linked libraries, and fixing compile warning of std=c++11 ( #22164 )
5 years ago
xujiaqi01
e3a457d34b
add collective communication library in fleet ( #22211 )
...
* add collective communication library in fleet to replace mpi
* test=develop
5 years ago
Zhen Wang
f2522e91c4
fix the type error caused by setting bool attr in OpDesc. test=develop ( #22257 )
5 years ago
songyouwei
0ba1d140d4
Add CI check for sequence ops' unittests ( #21615 )
5 years ago
Zeng Jinle
1b76e789cf
remove cuda allocator ctor, test=develop ( #22212 )
5 years ago
Adam
9942d9ed5c
Add caching mechanism to requantize_mkldnn_op ( #22223 )
5 years ago
Wilber
1230c110cb
[fluid-lite] adjust to relative error ( #22232 )
...
- replace the fluid vs. lite accuracy comparison with a relative-error check
5 years ago
123malin
985bceac53
Bug fix for sparse recorder ( #21969 )
...
* test=develop, bug fix for sparse recorder
5 years ago
Chen Weihang
fc0b21e17b
Polish fetch error message of parallel executor ( #22206 )
...
* polish error message of parallel executor, test=develop
* change PADDLE_ENFORCE, test=develop
5 years ago
Wojciech Uss
2e90c4eb0a
improve mkldnn_quantizer_config test code coverage ( #22216 )
5 years ago
Wilber
5750152e80
support fluid-lite subgraph run resnet test=develop ( #22191 )
...
- add a unit test that runs resnet through the fluid-lite subgraph
- update the git commit id of the Lite dependency
5 years ago
wangchaochaohu
621d3e0b66
fix the bug of profile update ( #22207 )
...
* fix the bug of profile update test=develop
5 years ago
FlyingQianMM
443a713c9e
add backward gradient computation for op argsort ( #22203 )
...
* add backward gradient computation for op argsort test=develop
* use pre-commit test=develop
5 years ago
Zhen Wang
46189b166d
Add bn and relu fuse pass ( #22048 )
...
* add bn and relu fuse pass
* add op attr assert and dtype assert
* fix some inputs&&outputs bugs for the fused op and pattern.
* add the unittest for fuse_bn_act_pass. test=develop
* use normative enforce statements. test=develop
* add the cpu test. test=develop
* add the support of batch_size=1 for the bn with relu op. test=develop
* add the error type for paddle throws. test=develop
* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unused_vars_white_list. test=develop
5 years ago
zhouwei25
2f3e2a84af
fix ci rule to show Shell variables ( #22177 )
5 years ago
baojun
298ee7d28a
Improve ngraph file line coverage ( #22155 )
5 years ago
zhongpu
d0f0a2520c
test Optimizer in dygraph ( #21949 )
...
* test Optimizer in dygraph, test=develop
* add optest for Optimizer in dygraph, test=develop
* fix adagrad optimizer, test=develop
* fix dpsgd optimizer, test=develop
* fix test_optimizer.py, test=develop
* fix dpsgd optimizer, this op only support cpu, test=develop
* add optest for optimizer, test=develop
* add description for dpsgd, test=develop
* add rmsprop to white_list in unused_var_check.cc, test=develop
* polish code style, test=develop
* polish code style, test=develop
* delete seed attribute for DpsgdOptimizer, test=develop
* change testing to debugging, test=develop
5 years ago
石晓伟
ad0dfb17c1
[Feature] Lite subgraph ( #22114 )
5 years ago
joanna.wozna.intel
5b2e98aa17
Add multiple quantize operators fuse ( #22062 )
5 years ago
Yiqun Liu
96980c2244
Polish the PADDLE_ENFORCE in fusion_group pass related codes. ( #22144 )
...
* Polish the PADDLE_ENFORCE in fusion_group pass related codes.
test=develop
* Correct the unittest because of the change to relu_grad's formula.
test=develop
5 years ago
wangchaochaohu
c3876cf82d
add support for nested profiling event and printing in different level ( #22061 )
...
* add support for nested profiling event and printing in different level
5 years ago
Zeng Jinle
c3bcd3c1e2
fix dygraph non zero gpu bug, test=develop ( #22165 )
5 years ago
zhaoyuchen2018
3d4f2aa689
Refine stack op to improve xlnet performance, test=develop ( #22142 )
...
stack's wait costs a lot of CPU time; using a CUDA kernel to do the memory copy
reduces CPU time.
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
5 years ago
zhongpu
cf475f95df
Remove FC in dygraph, modify FC to Linear in sample code ( #22082 )
...
* modify fc to linear in sample code, test=develop
* remove FC, test=develop
* remove warnings, test=develop
* drop fluid/imperative/README.md , test=develop
* change fc to linear, test=develop
* polish code style, test=develop
5 years ago
liu zhengxi
64a4044292
add double register op_data_type of pad2d and fix compile error, test=develop ( #22075 )
5 years ago
Liu Xudong
7ba7acd197
Add coverage tools ( #21975 )
...
Add coverage data processing tools.
5 years ago
Double_V
6ea3809143
Support prroi_pool_op with Tensor and LoDTensor rois ( #20649 )
...
1. Add a new input named batch_roi_nums for prroi_pool_op. batch_roi_nums includes the number of roi for each image in batch when rois is Tensor. This information is saved in rois's lod when rois is LoDTensor.
2. add grad check to prroi_pool_op and solve abnormal X grad diff in CPU.
5 years ago
Pei Yang
d8a9b134e3
fix trt instance_norm serialize bug. test=develop ( #22152 )
5 years ago
zhongpu
cc1a9f4238
fix sample code in paddle/fluid/imperative/README.md ( #22141 )
...
* fix sample code, test=develop
* polish code style, test=develop
5 years ago
Zeng Jinle
4c2df8e4d4
fix allocator strategy comment, test=develop, test=document_fix ( #22121 )
5 years ago
bingyanghuang
7872d06ff4
Add explanation on conv grad for dims<3 ( #22125 )
5 years ago
liu zhengxi
724b13e459
fix xception precision problem, test=develop ( #22124 )
5 years ago
Yiqun Liu
b1401fb74d
Remove subgraph_detector from inference/analysis to the common framework/ir directory. ( #22094 )
...
test=develop
5 years ago
Pei Yang
50bee83f71
add TRT support for instance_norm op ( #21928 )
...
* add TRT support for instance_norm op
5 years ago
zhaoyuchen2018
3dbd4087fe
Fix windows build no kernel issue, test=develop ( #22105 )
...
windows conv_fusion failed as no kernel, explicitly declare lambda
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
5 years ago
Chengmo
418abc92f4
Update pyramid related OP ( #21372 )
...
* add special way to add distribute vars, Update Pyramid hash op
5 years ago
bingyanghuang
4b4a9cc88f
fix format in operator.cc ( #22101 )
5 years ago
Feiyu Chan
14aebc7a95
add erf op ( #21785 )
...
* add erf op and python interface.
* add fp16 support for erf op.
* add unittests for erf op and its python interface.
5 years ago
Chen Weihang
ba8414d3a5
replace CUDNN_ENFORCE with PADDLE_ENFORCE_CUDA_SUCCESS, test=develop ( #22109 )
5 years ago
silingtong123
6c20e7c4e6
test=develop, remove unused parameter from class RuntimeInferShapeContext constructors ( #22046 )
5 years ago
Double_V
fab4b0765a
support elu_op double grad ( #21822 )
...
* support elu activation double grad,test=develop
* delete the code commit in .cc,test=develop
* fix relu test unpass, test=develop
* add elu double grad kernel and unit test
* add calculate dX in elu double grad functor, test=develop
* update the commit code,test=develop
5 years ago
Pei Yang
0a51098a71
Add TRT support for BERT ( #21135 )
...
* add gelu plugin
* align trt bert with gpu
* add support for fused fc with relu
* add unittest for bert trt
5 years ago
Jacek Czaja
b0b27ff699
[MKL-DNN] Conv grad and Batch Norm grad NHWC support ( #22088 )
5 years ago
Huihuang Zheng
dd4361568e
Add ParallelExecutor Test for Cond API and Fix PE Checks Shape Bug ( #22029 )
5 years ago
Zeng Jinle
9587249442
polish allocator strategy doc, test=develop, test=document_fix ( #22095 )
5 years ago
Zeng Jinle
d9f5d1eb29
ag allocator by default, test=develop ( #21837 )
5 years ago
123malin
7fb817d447
add distributed_strategy ( #21710 )
...
* add distributed_strategy
5 years ago
Jacek Czaja
ad8a9cb82c
[MKL-DNN] Pool & LRN Grad Ops NHWC support ( #21747 )
5 years ago
Kaipeng Deng
34c57120eb
polish cross_entropy ENFORCE ( #22056 )
5 years ago
SunAhong1993
7f4abaf2f5
register int/int64_t/float16 in pow/square kernel,test=develop ( #22023 )
...
* register int/int64_t/float16 in pow/square kernel,test=develop
* add abs/square/exp type,test=develop
5 years ago
Leo Chen
3f653c8323
register NoNeedBufferVarsInference for max_pool_grad_op, test=develop ( #22055 )
...
* fix test_conv2d_ngraph for grad diff, test=develop
* register NoNeedBufferVarsInference for max_pool_grad_op, test=develop
* refine error message, test=develop
* fix numpy, test=develop
* disable test conv2d_ngraph_op, test=develop
Co-authored-by: Zhang Ting <709968123@qq.com>
5 years ago
Yiqun Liu
d48320777e
Add the first implementation of fusion_group op ( #19621 )
...
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop
* Disable for mac and windows.
test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop
* Refine the CUDA kernel to support large dims.
test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test.
test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop
* Disable fusion_group op for mac and windows.
test=develop
* Make the compiling of device code return status instead of hanging up.
test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names.
test=develop
* Add the check of CUDA driver library in unittest.
test=develop
* Refine the calling of PADDLE_ENFORCE.
test=develop
5 years ago
Michał Gallus
6192108408
[DNNL] 3D Fully-Connected ( #21746 )
5 years ago
FDInSky
aa2ed0dcc6
fix generate_proposal_labels op ( #21793 )
...
* test=develop fix generate_proposal_labels op
5 years ago
ceci3
95d79b6d00
update error log for batch_norm_grad ( #22017 )
...
* update error information about batch_norm_grad
* update bn,test=develop
5 years ago
Aurelius84
c53b62eb8e
fix integer overflow in match_matrix ( #22036 )
...
* fix integer overflow in match_matrix test=develop
* fix integer overflow in match_matrix test=develop
* fix typo test=develop
5 years ago
Chen Weihang
2e9082250d
polish default error msg & cublas error hint, test=develop ( #22032 )
5 years ago
wangchaochaohu
64baee4144
polish code test=develop ( #22014 )
5 years ago
Chen Weihang
35ff1568e9
Add error message for cublas initialize failed ( #21995 )
5 years ago
Chen Weihang
fbb42173a9
fix no hint problem when use ENFORCE for cuda, test=develop ( #21994 )
5 years ago
zhouwei25
e66f92d1ae
Modify demo_ci to support Windows, prepare for PR_Windows_Inference ( #21873 )
5 years ago
danleifeng
b7697f6218
fix broadcast bug;test=develop ( #21898 )
5 years ago
liu zhengxi
196e20dfbb
Fix multi-threads memory out of bounds error for passes ( #21920 )
...
* fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop
* fix attention_lstm_fuse_pass during multi-threads inference, test=develop
* fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop
* fix fc_lstm_fuse_pass during multi-threads inference, test=develop
* fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop
5 years ago
zhaoyuchen2018
8859ddd6cf
Refine multihead kernel, align block to 32 ( #21961 )
...
* Refine multihead kernel, align block to 32
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine log comments
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
5 years ago
silingtong123
fd9b00df4b
test=develop, remove unused variable ( #21974 )
5 years ago
zhoushiyu
cee2ccb078
add shuffle batch op ( #21674 )
...
* add shuffle batch op, test=develop, test=document_preview
* fix size_t conflict and check_output test=develop, test=document_preview
* fix bug test=develop, test=document_preview
* add unittest of shuffle_batch layer test=develop, test=document_preview
* fix py coverage and op input type, test=develop, test=document_preview
* fix py coverage, test=develop
* fix en doc, test=develop
* move to contrib test=develop
* add unique_name test=develop
* invoke shuffle_batch in contrib.layers test=develop
5 years ago
mapingshuo
c3e1954918
make reverse op support negative axis ( #21925 )
...
* make reverse op support negative axis
5 years ago
石晓伟
03479469a7
fix multi-thread error of fc_gru_fuse_pass.cc, test=develop ( #21841 )
...
* fix multi-thread error of fc_gru_fuse_pass.cc, test=develop
* export FLAGS and GLOG symbols, test=develop
5 years ago
wangchaochaohu
de9ba01f11
add conda build python script test=develop ( #21943 )
...
* add script for conda package build
5 years ago
Aurelius84
10d6846900
Remove double registered dataType in Pad2d ( #21942 )
...
* fix compile error in CUDA10 test=develop
* remove double in pad2d test=develop
5 years ago
zhouwei25
2df4be5d35
Fix openblas bug to support compile on windows when WITH_MKL=OFF ( #21902 )
...
* Fix openblas to support compile on Windows when WITH_MKL=OFF
5 years ago
hutuxian
27decacb8a
fix aucop stat shape ( #21846 )
...
* fix stat shape back in global auc scenario
* add UT to cover global auc
5 years ago
Pei Yang
3e5008ad01
fix trt calib not working bug, test=develop ( #21934 )
5 years ago
Aurelius84
5cb2c74127
add register op_data_type of pad/expand_as et al. ( #21718 )
...
* add register op_data_type test=develop
* fix register bug in isfinite op test=develop
* rm int int64_t in pad2d gradKernel test=develop
5 years ago
qingqing01
2066745847
Pack imperative/layer into paddle_framework.so ( #21921 )
...
* Pack imperative/layer into paddle_framework.so
5 years ago
hong
30d000f8c2
fix matmul error message; test=develop ( #21885 )
5 years ago
zhouwei25
a01663ca1f
remove patch command and file of cares to improve quality of Paddle Repo ( #21776 )
5 years ago
flame
2bbc0d7d60
python zero copy inference, delete pass ( #21897 )
...
* python zero copy inference
* support delete inference pass
5 years ago
Aurelius84
51a86d2b6b
Optimize adam speed ( #21777 )
...
* optimize adam speed by removing _finish_update test=develop
* fix SparseAdamFunctor param list test=develop
* Remove scale_op in expect_list of adam_op test=develop
* fix test optimizer loss assert error test=develop
* fix test optimizer loss assert error test=develop
* modify PADDLE_ENFORCE usage test=develop
* fix op_type in lamb_op.cc test=develop
* fix errors ostream format bug test=develop
* add betaPowOut in ngraph op test=develop
* fix ngraph::op api for gcc8 test=develop
* clean code test=develop
* modify struct into class test=develop
* remove code of beta1Tensor in lamb_op test=develop
5 years ago
Leo Chen
310edc0d0c
Update layers used in ptb model to use auto-generated op functions in dygraph mode ( #21724 )
...
* update layers, test=develop
* fix input numpy, test=develop
* fix bugs, test=develop
* follow comments, test=develop
* update getitem, test=develop
5 years ago
lidanqing
9dff56e8e2
change qat_performance with mobilenet, change batch_size of qat2_resnet50 ( #21895 )
...
test=develop
5 years ago
FDInSky
6b9fbcf3ad
Update iou_similarity op to support non-normalized bbox ( #21671 )
...
Update iou_similarity op to support non-normalized bbox
5 years ago
guofei
46f9184aff
Modify the while_loop API ( #21844 )
5 years ago
Guo Sheng
7689b6aaa4
Fix default label dim of label_smooth_op. test=develop ( #21862 )
5 years ago
zhouwei25
13e4756f18
change ci check rule of deleting unit-test ( #21876 )
5 years ago
GaoWei8
d4dda8628e
optimize fc jit ( #21878 )
...
test=develop
5 years ago
zhouwei25
013225bb68
fix Execution order of ci_check_unittest, and add it to Linux_py35 ( #21640 )
5 years ago
Chen Weihang
2b941736f3
fix softmax_with_cross_entropy_fix bug, test=develop ( #21810 )
5 years ago
Thunderbrook
c3cf42d0f7
add table id in cache shuffle ( #21585 )
...
* general table
* add sparse table
test=develop
* no cvm
test=develop
* add no_cvm
test=develop
* add note
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* add key of optimizer
test=develop
* solve pslib stop core
test=develop
* barrier
test=develop
* add notes
test=develop
* add table id in cache shuffle
test=develop
* table id
test=develop
* code style
test=develop
5 years ago
Michał Gallus
253e664275
Disable memory opt pass when DNNL is on ( #21826 )
...
* Disable memory opt pass when DNNL is on
* Refine comment above mem optimization pass enablement
test=develop
5 years ago
Chengmo
a86f11b5f5
Speed GEO dense calc & communication ( #21579 )
...
* test=develop, speed dense calc & communication
5 years ago
Wojciech Uss
666c3bb9b0
handle multi-inputs with empty inputs for mkldnn_concat_op ( #21827 )
...
test=develop
5 years ago
Zeng Jinle
aa4d6a5d6c
Add some debug flags to auto growth allocator ( #21766 )
...
* add some debug flags to auto growth allocator, test=develop
* add comments about auto growth, test=develop
5 years ago
guofei
8b7c50f49a
Make While Op runnable on GPU place and add while_loop unittest ( #21672 )
...
1. Make while_op accept GPU conditional data
2. Add more complex test cases for while_loop API
5 years ago
WangXi
17299b8d21
fix batch_norm_grad infer shape=0 & add allreduce enforce shape, test=develop ( #21801 )
5 years ago
Huihuang Zheng
557bce77da
Fix Backward Bugs in Conditional Block ( #21809 )
...
The fixed bugs:
1. The condition sub-graph is not pruned
2. When backward graph is extremely simple, the whole backward ops are pruned.
5 years ago
xujiaqi01
0eb4d990c4
fix compiled error when with_pslib=on ( #21769 )
...
* fix compiled error of butil when with_pslib=on and with_testing=on
* test=develop
5 years ago
Huihuang Zheng
0677a1c1c1
Fix That conditional_block_op Doesn't Have InferShape ( #21733 )
5 years ago
zhaoyuchen2018
a5a8d14414
Fix softmax cuda bug ( #21720 )
...
* Fix softmax cuda bug
* Refine multihead log and softmax logic
5 years ago
Kaipeng Deng
943a44492b
yolo_box OP add Attr(clip_bbox). ( #21620 )
...
* yolo_box OP add Attr(clip_bbox). test=develop
5 years ago
Michał Gallus
a5159d8480
Re-enable vgg and resnet101 models download ( #21713 )
...
test=develop
5 years ago
Leo Chen
7181afd75c
Fix elementwise_pow bug on CUDA place with integer ( #21675 )
...
* fix elementwise_pow bug on integer, test=develop
* use llrint to support elementwise_pow_grad, test=develop
* add some tests, test=develop
* revert grad functor, test=develop
5 years ago
石晓伟
2bb135825e
fix analysis_predictor when func is called multiple times, test=release/1.6 ( #21665 )
5 years ago
lidanqing
d3a96632fa
Add fc-dequantize squash in cpu_quantize_squash_pass for ernie model ( #21714 )
...
* fc-dequantize squash
test=develop
* change according to reviews
test=develop
* change PADDLE_ENFORCE
test=develop
* add second test when fc-dequant do not fuse
test=develop
* change all related PADDLE_ENFORCE
test=develop
5 years ago
Chen Weihang
1fd1f06f11
Rename paddle throw error macro ( #21657 )
...
* rename paddle throw error macro, test=develop
* fix new error use case, test=develop
5 years ago
WangXi
8754cbd1f2
fix std::min type in nan_inf, test=develop ( #21725 )
5 years ago
Leo Chen
fbe3ac217e
polish cmake, test=develop ( #21681 )
...
* polish cmake, test=develop
* add current directory to LD_LIBRARY_PATH, test=develop
5 years ago
joanna.wozna.intel
d419b859c0
Add reshape int8 mkldnn op ( #21428 )
...
* Add reshape int8 op
test=develop
* Change test to CPUPlace
test=develop
* Correct tests
test=develop
5 years ago
WangXi
8a0f611b64
Rewrite check nan inf tools ( #21076 )
5 years ago
tangwei12
9ad940fdfe
memory leak for cpu ( #21174 )
...
* add fake init for the trainer, fix large memory hold in the trainer
* do not merge recv vars from a remote endpoint, test=develop
* add recv and save op, merge slice var in one op, save memory
* remove hsigmoid with pull sparse, test=develop
5 years ago
Zhaolong Xing
fbbd94a6ce
there is a bug for inference using auto growth allocator ( #21621 )
...
test=develop
5 years ago
Zeng Jinle
73461a7ae6
Make OperatorWithKernel::InferShape abstract ( #21633 )
...
* make OperatorWithKernel::InferShape virtual, test=develop
* fix test_prepare_op by relu, test=develop
5 years ago
mapingshuo
686f0ecb6a
add `no_need_buffer_slots` interface to pybind ( #21575 )
...
* add no_need_buffer_slots interface to pybind
5 years ago
Zeng Jinle
6828f3684b
fix op_registry, add ignore op_function_impl.h, test=develop ( #21654 )
5 years ago
GaoWei8
5af0c7ba89
Modify padding strategy: remove weight copy in fc padding ( #21650 )
...
test=develop
5 years ago
Chen Weihang
d96acc3363
Refine dygraph DataLoader implementation ( #21634 )
...
* refine dygraph dataloader & polish related code, test=develop
* refine code based review comment, test=develop
5 years ago
wangchaochaohu
5eec8cf5af
fix the mean grad OP performance improvement test=develop ( #21658 )
5 years ago
Zeng Jinle
29f64c8c9e
refine some grad op makers, test=develop ( #21629 )
5 years ago
mapingshuo
e2d849b989
Dropout with seed ( #21590 )
...
* add seed op
5 years ago
Adam
e81f0228df
MKL-DNN 1.0 Update ( #20162 )
...
* MKLDNN v1.0 rebase to Paddle 1.6
test=develop
* Add hacky paddle::string::to_string() implementation
* vectorize<int64-t>() -> vectorize() cleanup
test=develop
* PADDLE_ENFORCE and void_cast fixes
test=develop
* Rebase changes
test=develop
* Cosmetics
test=develop
* Delete MKL from mkldnn.cmake
test=develop
* CMake debug commands
test=develop
* Delete MKLDNN_VERBOSE and rebase fixes
test=develop
* Rebase fixes
test=develop
* Temporarily disable int8 resnet101 vgg16 and vgg19 tests
test=develop
* Add libmkldnn.so.1 to python setup
test=develop
* Add libmkldnn.so.1 to inference_lib cmake after rebase
test=develop
* Post rebase fixes + FC int8 changes
test=develop
* Fix LRN NHWC
test=develop
* Fix NHWC conv3d
test=develop
* Windows build fix + next conv3d fix
test=develop
* Fix conv2d on AVX2 machines
test=develop
5 years ago
rensilin
7f5d532a9c
fix: fail to call ZeroCopyTensor::mutable_data() when device_id is no… ( #21461 )
...
* ZeroCopyTensor::mutable_data in the right device, test=develop
* add unittest for zerocopy, test=develop
5 years ago
xujiaqi01
f404157205
fix master patch when slot is dense ( #21580 )
...
* fix master patch when slot is dense
* test=develop
5 years ago
xujiaqi01
c05706fe73
fix code style of fleet_wrapper ( #21639 )
...
* fix code style of fleet_wrapper
* test=develop
5 years ago
wangchaochaohu
95b95a284b
Mean gpu optimize ( #21643 )
...
* accelerate mean op test=develop
5 years ago
Leo Chen
48600d7f17
Add op function generator for dygraph ( #21569 )
...
* add op function generator, test=develop
* add unittest, test=develop
* follow comments, test=develop
* fix windows compilation problem, test=develop
5 years ago
lidanqing
fbf9eca0d3
QAT Int8 document ( #21360 )
...
* update benchmark for int8v2, QAT1, QAT2 accuracy and performance
test=document_fix
* change according to reviews
test=develop test=document_fix
* improve some descriptions and some models
test=develop test=document_fix
* update models benchmark data
test=develop test=document_fix
* update int8v2 and qat2 performance
test=develop test=document_fix
5 years ago
liym27
be6a639655
Add CI for checking Input/Output/Attr of modified Ops ( #21522 )
...
* add shell scripts. test=develop
* rename test_pybind_inference to test_pybind_interface and print repeat process in check_op_desc.py. test=develop
* add approval RD. test=develop
5 years ago
Leo Chen
4f81d1bd5f
Refine VarBase init function ( #21587 )
...
* refine init function, test=develop
* add tests, test=develop
* remove extern, which may cause symbol error in gcc-4.8, test=develop
5 years ago
Leo Chen
84b7267100
dygraph_grad_maker supports varbase without grad_var ( #21524 )
...
* dygraph_grad_maker supports varbase without grad_var, test=develop
* fix compile, test=develop
* fix test_tracer, test=develop
* follow comments, test=develop
5 years ago
xujiaqi01
88960684aa
rm optimize_for in framework.proto ( #21571 )
...
* remove optimize_for in framework.proto
* test=develop
5 years ago
Zeng Jinle
0f8888360e
Polish op registry codes ( #21561 )
...
* polish infer shape registry, test=develop
* modify some operators registry, test=develop
5 years ago
Aurelius84
3d9dee575e
Set lod_level of Out in compile time of sequence_pool_op ( #21604 )
5 years ago
zhouwei25
346705967d
monitor changes of unittests; deleting a unittest requires approval ( #21377 )
5 years ago
Zeng Jinle
97e76cb96d
refine dev_ctx.Wait() exception throw, test=develop ( #21600 )
5 years ago
Huihuang Zheng
1dcf6a7212
Add More Complex Tests and Fix Bugs for Control Flow cond API ( #21532 )
...
Add tests that use dy/dx to make sure the gradient values calculated by the control flow backward are correct. Also fixed bugs detected by those tests.
Fix bugs:
1. Unlike sum_op, optimizer ops don't allow uninitialized input tensor. But in conditional_block_grad_op, since the conditional_block may not run, the output gradient tensor may be uninitialized, which will cause the optimizer op error. To fix it, we should let optimizer ops support uninitialized input like sum_op or assign the uninitialized gradient to 0 when the conditional_block_grad_op doesn't run. I found there are about 10+ optimizer ops. **To be simpler, I just assign output gradient of the conditional_block_grad_op to 0 in this PR**. But it can be further explored whether we can make optimizer ops like sum_op to support uninitialized input tensor because theoretically we can speed up without the assigning in conditional_block_grad_op.
2. Infer parameter shapes during append_backward. I didn't know that all our parameters are in global block. When op_desc is inferring shapes at the sub-block, it may not know the shape of gradients of parameters whose shape information is at global block. I fixed it by inferring shapes of gradients from forward var.
This PR also did some code clean up:
1. Print the var name when sgd_op catches shape error so that it is easier to debug
2. Fix a typo: dicta -> dict
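As an illustration only (not code from this PR), a minimal Python/NumPy sketch of the first fix might look like the following. The names `conditional_block_grad` and `sgd_step` are hypothetical stand-ins, not Paddle APIs: when the branch did not run, the gradient is set to zeros shaped like the forward variable, so the optimizer step always receives an initialized tensor.

```python
import numpy as np


def conditional_block_grad(branch_ran, forward_var, upstream_grad):
    """Hypothetical stand-in for conditional_block_grad_op (not a Paddle API)."""
    if branch_ran:
        # The branch executed, so pass the real gradient through.
        return np.asarray(upstream_grad, dtype=forward_var.dtype)
    # The branch was skipped: emit zeros shaped like the forward variable
    # instead of leaving the gradient uninitialized.
    return np.zeros_like(forward_var)


def sgd_step(name, param, grad, lr=0.1):
    """Toy SGD update that names the variable when shapes disagree."""
    if grad.shape != param.shape:
        raise ValueError(
            f"{name}: grad shape {grad.shape} != param shape {param.shape}")
    return param - lr * grad


w = np.ones((2, 3), dtype=np.float32)

# Branch skipped: the gradient is all zeros, so the parameter is unchanged.
g_skipped = conditional_block_grad(False, w, None)
w_after_skip = sgd_step("w", w, g_skipped)

# Branch ran: the real gradient is applied.
g_ran = conditional_block_grad(True, w, np.full((2, 3), 2.0))
w_after_run = sgd_step("w", w, g_ran)
```

Zero-filling costs an extra write whenever the branch is skipped, which is the trade-off the message above describes against teaching every optimizer op to accept uninitialized inputs.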
5 years ago
hutuxian
c5aec2fe68
Paddlebox Related to Framework ( #21586 )
...
* Add a single_process_multi_thread transpiler.
* Add some UTs.
* Fix some API description.
5 years ago
liym27
9da7e6b4d4
add file check_op_desc.py and add interface to get default value. ( #21530 )
...
* add file check_op_desc.py and add interface to get default value. test=develop
* add test for c++ coverage rate. test=develop
* Correct typo. test=develop
5 years ago
Jacek Czaja
8f5a93a07b
- Fix to regression in performance of ResNet-50 training ( #21588 )
...
test=develop
5 years ago
Jacek Czaja
9ce0e29dc3
[MKL-DNN] Batch norm mkl-dnn NHWC support ( #21553 )
...
* - Batch norm mkl-dnn NHWC
test=develop
- compilation fix
test=develop
- UT fix
- cosmetics
test=develop
- Fix to Batch Norm MKL-DNN NHWC UT
test=develop
Conflicts:
paddle/fluid/operators/batch_norm_op.h
* - Lint fixes
test=develop
5 years ago
Zeng Jinle
3a7caf481c
add grad maker assert, test=develop ( #21564 )
5 years ago
Huihuang Zheng
b241c7329c
Refine a Warning Which Can Occur Not Only During Init ( #21546 )
...
As the title
5 years ago
Pei Yang
20d61414b4
fix glog warning, test=develop ( #21573 )
5 years ago
wangchaochaohu
932aca162d
Add Branch to avoid CPU profiler warning print ( #21556 )
...
* fix profiler warning message in cpu profile mode test=develop
5 years ago
Leo Chen
cdd46d7e02
Split VarBase from Python Variable for Dygraph ( #21359 )
...
* test=develop, fix docker with paddle nccl problem
* don't expose numerous Tensor.set(), test=develop
* fix condition, test=develop
* fix float16 bug, test=develop
* feed should be Tensor or np.array, not Variable or number, test=develop
* use forcecast to copy numpy slice to new array, test=develop
* remove float16-uint16 hacking, test=develop
* add variable method to varbase and refactor to_variable to support return varbase
* support kwargs in varbase constructor
* add VarBase constructor to support default python args
* refine varbase initial method
* reset branch
* fix ut for change VarBase error info to PaddleEnforce
* cherry is parameter change before
* overload isinstance to replace too many change of is_variable
* rm useless files
* rm useless code merged by git
* test=develop, fix some ut failed error
* test=develop, fix test_graph_wrapper
* add some tests, test=develop
* refine __getitem__, test=develop
* add tests, test=develop
* fix err_msg, test=develop
5 years ago
Youwei Song
cdba41af4d
dygraph Embedding layer use lookuptable v2 ( #21209 )
...
* dygraph Embedding layer use lookuptable v2
test=develop
* fix test_nce
test=develop
5 years ago
Pei Yang
122b37ce62
make config option DisableGlogInfo() able to mute all inference logs ( #21318 )
...
* make DisableGlogInfo able to mute all logs in inference.
5 years ago
wangchaochaohu
4c9b3dafa7
fill_constant_batch_size_like OP precision problem fix ( #21337 )
...
* fix fill_constant_batch_size_like_op precision problem test=develop
5 years ago
Aurelius84
fa7cff1fee
Add CI for checking registered data_type of new Op ( #21488 )
...
* add data_type register CI test=develop
* add op add test case test=develop
* add test case for register op kernel test=develop
* fix shell script bug test=develop
* fix checkout branch test=develop
* remove test case code test=develop
* fix op_type.spec name test=develop
5 years ago
Zhaolong Xing
da7748c53d
add conv, depthwise_conv, pooling ( #20966 )
...
test=develop
5 years ago
WangXi
768f9242e9
Fix dgc clip & rampup step, test=develop ( #21491 )
5 years ago
hong
0b75a0c10b
add override for virtual function to avoid warning ( #21503 )
...
* add override for virtual function; test=develop
* fix layer.h OutputName bug; test=develop
5 years ago
Aurelius84
54382ce497
Add get_all_kernels api of registered data_type in pybind.cc ( #21499 )
...
* add _get_all_register_op_kernels api test=develop
* refine usage of check_op_register_type test=develop
* add import in core test=develop
5 years ago
Zeng Jinle
3662fb71a7
remove eval() calls in Eigen, test=develop ( #21498 )
5 years ago
Jacek Czaja
18a5d30754
[MKL-DNN] Conv2d and Conv2d transpose MKL-DNN NHWC support ( #21466 )
5 years ago
GaoWei8
250a192181
Add ernie large c++ inference test ( #21365 )
...
* add ernie-large test
test=develop
* add ernie large c++ inference test
test=develop
5 years ago
zhongpu
6ebf0f47b8
support SelectedRows in dygraph, test=develop ( #21078 )
...
* support SelectedRows in dygraph, test=develop
* fix bug of _grad_ivar interface, test=develop
* add optest for SelectedRows support, test=develop
* fix bug for gradient_accumulator in GPU mode, test=develop
* fix error when SelectedRows is added to LoDTensor in sorted_gradient mode in dygraph, test=develop
* refine and simplify gradient accumulator code, test=develop
* add optest, test=develop
* add optest and simplify code, test=develop
* fix bug for test_imperative_selected_rows, test=develop
* add optest for Coverage, test=develop
* fix gradient interface and simplify code, test=develop
* update api for gradient, test=develop
* fix ShareDim's bug in DygraphExecutionContext class, test=develop
* add optest, test=develop
5 years ago
Tao Luo
70eb397677
remove unused snappy/snappystream depends in distributed codes ( #21484 )
...
test=develop
5 years ago
lilong12
0bc8bdf724
set dim[0] to -1 if dim[0] < 0 during compiling for c_allgather op ( #21402 )
...
* set dim[0] to -1 if dim[0] < 0 and remove assertion to runtime, test=develop
* modify ENFORCE message, test=develop
* add validation for x.shape[0] > 0, test=develop
* add ut, test=develop
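A rough sketch of the shape rule these bullets describe, under stated assumptions: the function name, the `is_runtime` flag, and the rule that the output's first dimension is the input's first dimension times `nranks` are illustrative, not taken from the Paddle source.

```python
def infer_allgather_shape(x_shape, nranks, is_runtime):
    """Hypothetical shape inference for an all-gather along dim 0.

    At compile time a negative dim[0] means "unknown batch size", so the
    output dim[0] is reported as -1 instead of a meaningless product.
    At runtime dim[0] must be positive, so the check is enforced there.
    """
    out = list(x_shape)
    if is_runtime and out[0] <= 0:
        raise ValueError(
            "x.shape[0] must be > 0 at runtime, got %d" % out[0])
    out[0] = out[0] * nranks
    if out[0] < 0:
        # Unknown at compile time: normalize to -1.
        out[0] = -1
    return out


compile_shape = infer_allgather_shape([-1, 8], nranks=4, is_runtime=False)
runtime_shape = infer_allgather_shape([2, 8], nranks=4, is_runtime=True)
```

Moving the positivity check to runtime lets a program with an unknown batch dimension be built, while still rejecting invalid input once real data arrives.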
5 years ago
Zhaolong Xing
c5f0293cf3
NV jetson(nano, tx2, xavier) inference compile support ( #21393 )
...
* add jetson compile support
test=develop
* refine the cmake
test=develop
5 years ago
Zhaolong Xing
b39c011637
specify the auto growth allocator for inference. ( #21448 )
...
test=develop
5 years ago
tangwei12
0bddb951c2
fix async mode, test=develop ( #21367 )
5 years ago
Zeng Jinle
81ef8b7f8f
Fix CI DefaultGradOpMaker check ( #21482 )
...
* fix default grad op maker ci bug, test=develop, test=document_fix
* remove some ops from paddle/fluid/op_use_default_grad_op_maker.spec, test=develop, test=document_fix
5 years ago
Huihuang Zheng
a71f53d7ac
Add warning message when initialize GLOG failed. ( #21487 )
...
Add warning message when initialize GLOG failed
5 years ago
Leo Chen
b3090ad406
fix synchronization problem in softmax_with_cross_entropy_op, test=develop ( #21480 )
5 years ago
Tao Luo
01fa4ead61
fix -Wno-error=sign-compare warning in gcc8 ( #21434 )
...
* fix -Wno-error=sign-compare warning in gcc8
test=develop
* fix warning in distributed codes
test=develop
5 years ago
Lv Mengsi
37f3e56dea
Fix transpose conv ( #21406 )
...
* fix transpose conv,test=develop
* fix comments
test=develop
5 years ago
hutuxian
7e68bc896b
refactor AUC OP and add its CUDA Kernel ( #21336 )
...
* refactor AUC OP and add its CUDA Kernel
* the layout of global auc doesn't change
5 years ago
wawltor
dbbe6e9cb6
fix the device supported of the op unique and unique_with_counts. ( #21395 )
...
* fix the device supported of the op unique and unique_with_counts.
test=develop
test=document_fix
* Fix the precision of test in the op of unique and unique_with_counts.
test=develop
test=document_fix
5 years ago
wangchaochaohu
d4776ec027
fix the correctness of memcpy profiling result test=develop ( #21458 )
5 years ago
wangguanzhong
379e3febf2
fix shape check in density_prior_box, test=develop ( #21414 )
...
* fix shape check in density_prior_box, test=develop
5 years ago
Adam
76b55da15a
Fix bug in UpdatePadding for int64_t type ( #21465 )
...
test=develop
5 years ago
Pei Yang
7b28d938bf
show shape diff in wrong trt input shape errmsg, test=develop ( #21451 )
5 years ago
Jie Fang
5e813b53c5
nhwc optimization for batchnorm ( #21090 )
5 years ago
Leo Chen
e0c9d856fb
add unused input vars check for OpWithKernel, test=develop ( #21169 )
...
* add unused input vars check for OpWithKernel, test=develop
* remove unused vars in some ops, test=develop
* fix batch_norm, test=develop
* add white list, test=develop
* add CI check for white list, test=develop
* move white list to c++, test=develop
* solve failure of CI, test=develop
* add unittest for unused_var_check, test=develop
* refine code, enable check in operator_test, test=develop
* skip mkldnn, test=develop
* extend white list, test=develop
* refine condition of mkldnn, test=develop
* fix paddle_build, test=develop
* follow comments, test=develop
* fix GetExpectedKernelType
* add wiki ref to err_msg, test=develop
* follow comment, test=develop
5 years ago
Chen Weihang
664f958a02
Fix optimizer op infershape failed in dygraph multi-cards mode ( #21374 )
...
* add param & grad shape check for sgd op
* add _reshape_inplece interface for dygraph parallel
* refine unittest based paddle/models scripts, test=develop
* add unittest for parallel grad fuse, test=develop
5 years ago
Huihuang Zheng
630be31952
Fix Cond Bug for Nested Control Flow ( #21340 )
...
* Commit before merging develop
test=develop
* Backup after working with Huihuang logs
* Commit before deleting Huihuang debug loggings
* Commit before debug
test=develop
* Fix bug commit
test=develop
* Backup of fixing bugs
test=develop
* Clean up code
test=develop
* Fix a bug in sum_op
test=develop
5 years ago