Paddle

Commit Graph

Author	SHA1	Message	Date
joanna.wozna.intel	17f2c0899f	Add dequant-scale squash (#22409 ) * Add dequant scale squash test=develop * Correct dequant-scale squash test test=develop	5 years ago
Wilber	7bc4b09500	add WITH_NCCL option for cmake. (#22384 ) cmake选项中添加了WITH_NCCL，显示指定是否编译NCCL的部分代码，WITH_NCCL默认打开，但如果WITH_GPU为OFF，则关闭WITH_NCCL 添加了PADDLE_WITH_NCCL定义单机单卡能够关闭NCCL编译，多卡的话需要默认打开NCCL，如果关闭NCCL，则只能使用单卡 Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	5 years ago
石晓伟	e1b0d7cbb1	remove anakin from code, test=develop (#22420 )	5 years ago
Michał Gallus	269db0d1d1	[DNNL] Fix accuracy in INT8 FC (#22404 ) * Enable quantize to reorder to nchw as well * Correct FC MKL-DNN input dim requirements to accept 3D * Improve DNNL FC format, error and 3D input handling test=develop * Improve error checking in FC test=develop * Improve PADDLE_ENFORCE messages in fc-related files * Remove data layout attribute from obligatory pass args test=develop * Fix message in fc_mkldnn_pass to be logically correct test=develop	5 years ago
joanna.wozna.intel	3099d9d47c	Restore requantize squash (#22399 )	5 years ago
Yiqun Liu	b7cac50b64	Implement a common python unittest to test the ir passes. (#22209 ) * Implement a common python unittest to test the ir passes. test=develop * Save the results in np.array and support to startup on CPU. test=develop * Fix the unittest. test=develop * Add check_program to check whether the optimized program is different from the origin one. test=develop * Remove the inferface all_ops. test=develop * Add exception test in pass_test. test=develop	5 years ago
lidanqing	895f8da7d6	change std::cout to log(INFO), vlog (#22316 )	5 years ago
Zhen Wang	e40cfb1010	fix the bug of assert_is_op_output. test=develop (#22262 )	5 years ago
Wojciech Uss	d3a6647372	improve placement pass tests code coverage (#22197 )	5 years ago
Zhen Wang	46189b166d	Add bn and relu fuse pass (#22048 ) * add bn and relu fuse pass * add op attr assert and dtype assert * fix some inputs&&outputs bugs for the fused op and pattern. * add the unittest for fuse_bn_act_pass. test=develop * use normative enforce statements. test=develop * add the cpu test. test=develop * add the support of batch_size=1 for the bn with relu op. test=develop * add the error type for paddle throws. test=develop * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop	5 years ago
joanna.wozna.intel	5b2e98aa17	Add multiple quantize operators fuse (#22062 )	5 years ago
Yiqun Liu	96980c2244	Polish the PADDLE_ENFORCE in fusion_group pass related codes. (#22144 ) * Polish the PADDLE_ENFORCE in fusion_group pass related codes. test=develop * Correct the unittest because of the change relu_grad's formula. test=develop	5 years ago
liu zhengxi	724b13e459	fix xception precision problem, test=develop (#22124 )	5 years ago
Yiqun Liu	b1401fb74d	Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#22094 ) test=develop	5 years ago
Yiqun Liu	d48320777e	Add the first implememtation of fusion_group op (#19621 ) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop * Add DeviceCodePool to manage all device codes. * Add the first implementation fusion_group op. * Add unit-test for fusion_group op. * Add the check of result. * Add the check of nvrtc in unit-test. test=develop * Add comment to explain the inputs, outputs and features of fusion_group op. test=develop * Disable fusion_group op for mac and windows. test=develop * Make the compiling of device code return status instead of hanging up. test=develop * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API. * Unify fusion_group_op's input and output names. test=develop * Add the check of CUDA driver library in unittest. test=develop * Refine the calling of PADDLE_ENFORCE. test=develop	5 years ago
Michał Gallus	6192108408	[DNNL] 3D Fully-Connected (#21746 )	5 years ago
liu zhengxi	196e20dfbb	Fix multi-threads memory out of bounds error for passes (#21920 ) * fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop * fix attention_lstm_fuse_pass during multi-threads inference, test=develop * fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop * fix fc_lstm_fuse_pass during multi-threads inference, test=develop * fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop	5 years ago
石晓伟	03479469a7	fix multi-thread error of fc_gru_fuse_pass.cc, test=develop (#21841 ) * fix multi-thread error of fc_gru_fuse_pass.cc, test=develop * export FLAGS and GLOG symbols, test=develop	5 years ago
Pei Yang	3e5008ad01	fix trt calib not working bug, test=develop (#21934 )	5 years ago
Aurelius84	51a86d2b6b	Optimize adam speed (#21777 ) * optimize adam speed by removing _finish_update test=develop * fix SparseAdamFunctor param list test=develop * Remove scale_op in expect_list of adam_op test=develop * fix test optimizer loss assert error test=develop * fix test optimizer loss assert error test=develop * modify PADDLE_ENFORCE usage test=develop * fix op_type in lamb_op.cc test=develop * fix errors ostream format bug test=develop * add betaPowOut in ngraph op test=develop * fix ngraph::op api for gcc8 test=develop * clean code test=develop * modify struct into class test=develop * remove code of beta1Tensor in lamb_op test=develop	5 years ago
lidanqing	d3a96632fa	Add fc-dequantize squash in cpu_quantize_squash_pass for ernie model (#21714 ) * fc-dequantize squash test=develop * change according to reviews test=develop * change PADDLE_ENFORCE test=develop * add second test when fc-dequant do not fuse test=develop * change all related PADDLE_ENFORCE test=develop	5 years ago
joanna.wozna.intel	d419b859c0	Add reshape int8 mkldnn op (#21428 ) * Add reshape int8 op test=develop * Change test to CPUPlace test=develop * Correct tests test=develop	5 years ago
Tao Luo	01fa4ead61	fix -Wno-error=sign-compare warning in gcc8 (#21434 ) * fix -Wno-error=sign-compare warning in gcc8 test=develop * fix warning in distributed codes test=develop	5 years ago
Zeng Jinle	89966525f1	Polish reference count pass (#21324 ) * fix ref_cnt pass, test=develop * add cpp unittests to reference_count_pass, test=develop * follow comments, test=develop	5 years ago
GaoWei8	8493f20ebc	Polish the codes of fc when needs padding (#21378 ) test=develop	5 years ago
Michał Gallus	5d7d548275	INT8 Fully-connected (#17641 ) * Implement Int8 FC * Integrate FC into INT8v2 test=develop * int8 FC: transpose weights before computing scales test=develop * Add support for activation_type string in FC test=develop * Disable MKL-DNN's FC in VGG16 and 19 test=develop * Disable FC quantization when mkldnn FC is disabled test=develop * Solve PADDLE_ENFORCES in FC int8 * Fix Paddle enforces and remove const cast test=develop * Fix style changes test=develop * Fix quantizer_tester test and add fc quantization test=develop * Fix FC test fail on CUDA * Remove unnecessary log from quantize placement pass test=develop * Add Thread ID to FC hash key test=develop * Add comments to MKL-DNN FC Kernel test=develop * Refactor quantizer test=develop * Fix linter issues test=develop * Fix crash in slim googlenet test=develop * Fix PADDLE_ENFORCE messages test=develop	5 years ago
GaoWei8	234060f88f	Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972 ) * Add fc padding to solve mkl performance test=develop * fix gpu pass and error information test=develop * fix fc_fuse_pass_test test=develop * fix error information test=develop * fix error information test=develop * fix name and add fc op padding test test=develop * fix attributes test=develop * optimize fc padding test=develop * fix test test=develop	5 years ago
zhouwei25	345b67b5e2	remove warning LNK4006 and warning LNK4221 (#21226 )	5 years ago
Yiqun Liu	c918788ba9	Disable fusion_group pass for windows and mac. We will do some experiments on Linux first. (#21310 ) * Disable fusion_group pass for windows and mac. We will do some experiments on Linux first. test=develop * Print the subgraph when check failed. test=develop	5 years ago
Chen Weihang	952508527a	Polish some PE code details (#21274 ) * polish code details, test=develop * futher polish hint msg, test=develop	5 years ago
Yiqun Liu	6b1e1f0dda	Enable generating code for a given subgraph. (#21126 ) * Enable generating code for a given subgraph. * Support sorting the subgraph. * Remove the rearange of expressions because we use the sorted subgraph directly. * Enable generating code for a subgraph which is composed of grad ops. * Use expression information to check the accuracy in unittest. * Separate load and store from computation expressions. test=develop * Improve the loading statements in generated codes. test=develop * Remove unused arguments from formal list. test=develop	5 years ago
Zeng Jinle	cdb3d27985	Fix warn of gcc8 (#21205 ) * fix warnings oof gcc 8 compilation, test=develop * fix boost::bad_get, test=develop * refine PADDLE_ENFORCE, test=develop	5 years ago
Zhaolong Xing	65f7052554	TRT int8: refine trt int8 for dynamic range set (#21112 ) * refine trt int8 for dynamic range set test=develop * refine trt int8 test=develop	5 years ago
Chen Weihang	8da0cd537a	Add examples for error message writing specification - NotFound, OutOfRange, AlreadyExists, PermissionDenied (#21134 ) * add examples for error msg spec, test=develop * change ENFORCE to ENFORCE_*, test=develop add more already exists examples, test=develop	5 years ago
Chen Weihang	8414575b78	Add examples for error message writing specification - PreconditionNotMet, Unimplemented, Unavailable (#21137 ) * add examples for error spec, test=develop * change ENFORCE to ENFORCE_**, test=develop	5 years ago
WangXi	de5d3ff688	Fix dgc buffer illegal & reuse velocity (#21012 )	5 years ago
Chen Weihang	826254f664	Add pre-condition check for fuse optimizer op pass (#21005 ) * add pre condition check for fuse optimizer op pass, test=develop * add log & set init to zero, test=develop * fix test_fuse_all_reduce_pass failed, test=develop * polish details, test=develop * refine PADDLE_ENFORCE & remove needless VLOG, test=develop * refactor op check method, test=develop	5 years ago
Yiqun Liu	9091f8cdf9	Support generating code for grad_op (#21066 ) * Add the definition of operation in fusion_group. * Use operations in OperationMap to detect fusion_group of elementwise pattern. * Add namespace fusion_group in code_generator. * Use operations recorded in OperationMap to generate code. * Remove implementation codes to .cc file. * Refine Operation and CodeGenerator to make it easier to generate code for grad_op. Refine the unittest for better reuse. * Avoid recording the template's keyword in a array. * Support the generating of code for grad_op and add unittest. test=develop * Remove replaced_element_in_order and use use number instead. test=develop	5 years ago
joanna.wozna.intel	77c2083586	Add transpose2 INT8 for mkl-dnn (#19424 ) * Add transpose2 INT8 for mkl-dnn test=develop * Fix test_transpose_int8_mkldnn test=develop * Revert "Merge branch 'develop' into transpose_int8_mkldnn_2" This reverts commit 34011bdba4c859abb945e062ab13124f70508054, reversing changes made to 2ce6473f144da298aba4a43d46918f27d463cf7c. * Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"" This reverts commit 23754dd78ca47ae56881161172b2aacd349aba90. * Add template to TransposeMKLDNNHandler test=develop * Resolve conflict test=develop * Restore get_size and refactor test=develop	5 years ago
Zeng Jinle	878a40f57d	Support NoNeedBufferVarsInference in dygraph backward (#20868 ) * support no need buffer vars in dygraph, test=develop * fix inference compilation error, test=develop * update no_need_buffer_vars_inference, test=develop * add unittests for no_need_buffer_vars_context, test=develop * refine no_need_buffer_vars by return ref, test=develop * polish some codes, test=develop	5 years ago
Wilber	c534149642	fix squared_mat_sub_fuse_pass when elementwise_op input is from persistable param test=develop (#20960 ) fix squared_mat_sub_fuse_pass when elementwise_op input is from persistable param	5 years ago
WangXi	eec4fa9099	And Enforce to fuse pass for DGC doesn't support fuse for now, test=develop (#20935 )	5 years ago
hong	8c4573a3cb	GradMaker for dygraph (#19706 ) * refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * optimize grad maker; test=develop * optimize grad maker * test * grad make optim; test=develop * fix unittest bugs; test=develop * add dygraph grad op maker and split_op * grad op maker refactor; test=develop * add dygraph grad maker; test=develop * fix op deformable_conv_v1_op bug; test=develop * fix deformable_conv prroi pool bugs; * fix new op grad op maker bug; test=develop * fix split by ref bug; test=develop * fix dygraph auto prune bug; test=develop * fix test_trace bug; test=develop * fix fused emb seq pool bug; test=develop * remove useless code in op_desc file; test=develop * remove useless code, StrVarBaseNode; test=develop * fix review issues; test=develop * fix rank_loss grad maker; test=develop * remove flag in VarBase; test=develop * fix distributed_notify_op compile bug ; test=develop * fix reshape op double grad; test=develop * fix expand as op; test=develop * add impertive type_defs.h for demo_train; test=develop * fix inference lib cmake; test=develop * fix inference lib; test=develop * fix infernce_lib; test=develop * fix inference cmake; test=develop * fix inference lib; test=develop * fix inference lib; test=develop * remove condition dygraph grad maker, modify local name; test=develop * fix split grad maker bug; test=develop * fix pyramid_op bug; test=develop * change travis time out limit; test=develop * restore travis; test=develop * change timeout limit; test=develop	5 years ago
Yiqun Liu	b5f3be8330	Implement a pass detect fusion group of elementwise op (#19884 ) * Add fusion_group_pass and elementwise pattern. * Rewrite the detector of elementwise group. test=develop * Add a comment in codegen. * Add more unittest cases. test=develop * Move code_generator related code to fusion_group directory. * Correct the including path. * Add the definition of SubGraph and finish the insert of fusion_group op in pass. * Insert graph_vis_pass in tester to visualize the graph for debug.	5 years ago
wangchaochaohu	ba45dce35d	fix codetest for windows make test=develop (#20796 )	5 years ago
石晓伟	48a774c713	fix ts_sort's bug, test=develop (#20720 )	5 years ago
wopeizl	9e5948230e	add support to gcc8, add docker env test=develop (#19807 ) * add support to gcc8, add docker env test=develop	5 years ago
WangXi	cadc6a9704	fix dgc test and bug when not set trainers_endpoints_, test=develop (#20617 )	5 years ago
Pei Yang	443f604c3b	add DisableGlogInfo() to AnalysisConfig, test=develop (#20581 )	5 years ago
zhaoyuchen2018	b8333edef6	Add Multihead matmul fuse pass (#20167 ) * Add multihead fuse pass for ernie opt * Refine softmax test=develop * Refine cuda kernel * Refine cuda version * Refine cmake test=develop * refine header file * refine test case and pass * refine comments	5 years ago
Adam	7faa3e9555	Add ConvTranspose + BatchNorm fuse pass (#20161 ) * Add ConvTranspose + BatchNorm fuse pass test=develop * Add tests for conv+bn and conv_transpose+bn passes test=develop	5 years ago
bingyanghuang	9de6772510	Follow comment of Merged QAT PR 18970 (#19979 ) * Follow Wangzhen's comment in PR 18970, test=develop * Review comments, test=develop * Leave fake quantization around mul test=develop * Replace Fake with Real Quantized Mul test=develop * Fix bug in quantize placement pass Nodes in the graph now have checked type instead of node name when they are to be marked for quantization test=develop	5 years ago
joanna.wozna.intel	f5221ac19f	Disable conv requant squash (#20041 ) * Fix conv2d+dequantize squash for residual fusion test=develop * Disable conv-requant squash test=develop	5 years ago
wangchaochaohu	c9ea317b36	codegen code for reconstruction (#19728 ) * codegen code for reconstruction test=develop * fix the cmake test=develop * fix review advice test=develop	5 years ago
chengduo	101a2b610a	Add dtype for coalesce_tensor_op (#20016 ) Add dtype for coalesce_tensor_op	5 years ago
joanna.wozna.intel	3f1d0234ae	Fix conv2d+dequantize squash for residual fusion (#19545 ) * Fix conv2d+dequantize squash for residual fusion test=develop * Change condition test=develop	5 years ago
Yiqun Liu	3cd985a669	Add a pass to fuse fc+elementwise_add+layernorm (#19776 ) * Add fc_elementwise_layernorm_fuse pass and unittest. * Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop * Apply fc_elementwise_layernorm_fuse_pass to GPU inference. * Add the setting of attrs in the definition of binary_op. test=develop * Add comment. * Implement the unittest. test=develop * Change the unittest name of layer_norm. test=develop	5 years ago
Zeng Jinle	3fd3b663a8	fix gc bug in controlflow ops, test=develop (#19827 )	5 years ago
Zeng Jinle	db26de8389	[Bug fix] Disable memory reuse on feeded variables (#19835 ) * fix memory reuse bug on feeding variables, test=develop * add comments to reference count members, test=develop	5 years ago
chengduo	8281497030	Fix warning info of build_strategy (#19805 ) * fix warning info test=develop * fix bug of all_reduce_deps_pass test=develop	5 years ago
Yiqun Liu	c67c8758cb	Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733 ) * Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop * Enhance fc_fuse_pass to enable fusing relu. * Allow print the shapes of var_desc in graph. test=develop * Enhance fc_fuse_pass_tester. * Remove the use of PADDLE_ENFORCE. test=develop * Correct the number of ops after fusing. test=develop * Fix a typo. test=develop * Set activation_type to null when there is no relu in fc. test=develop * Refine fc_fuse_pass's codes. * Enable the set of shape for tensor. * Refine repeated_fc_relu_pass and add unittest. test=develop	5 years ago
chengduo	056fdedde3	Open fuse all reduce option (#19765 ) * Open fuse all reduce op test=develop * Add Fuse optimization op log * Add log in fuse_optimizer op pass and fuse all_reduce op pass * replace with boost::optional<bool> test=develop * Polish code test=develop * fix code coverage test=develop	5 years ago
chengduo	e506c99c20	Open fuse broadcast option (#18833 ) * fix vlog level and fuse option type test=develop	6 years ago
Yiqun Liu	a65c728e5d	Implement the GPU kernel of fc operator (#19687 ) * Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop	6 years ago
chengduo	5866a7a5fe	Enable fused_all_reduce_op_handle support GPU and CPU Gradients (#19418 ) * Enable fused_all_reduce_op_handle support GPU and CPU Gradients	6 years ago
wangchaochaohu	ed8f44ea21	codegen for fused elementwise operation (#19520 ) * test=develop codegen for fused elementwise operation * fix test=develop	6 years ago
baojun	a3a4b6e570	Enable ngraph through build_strategy (#19266 ) * enable ngraph throught build_strategy test=develop * add unittest test=develop * put use_ngraph unconditional test=develop * remove paddle_enforce test=develop * remove paddle_enforce test=develop * fix copyright test=develop * limit for ngraph only test=develop	6 years ago
Tao Luo	75d1571995	refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603 ) test=develop	6 years ago
Yiqun Liu	c5548178b0	A a pass to enable the use of cudnn (#19346 ) * Add a interface to enable cudnn for inference. * Add cudnn_placement_pass. test=develop * Set the default value of cudnn_enabled_op_types to null. test=develop * Write the common basic class, placement_pass_base, to refine the codes. test=develop * Call EnableCUDNN in unittest. test=develop * Refine cudnn_placement_pass tester. * Enable the testing of cudnn_placement_pass in inference's unittest. test=develop * Add the check of op kernels. test=develop	6 years ago
Yiqun Liu	fcec365d29	Add a pass to replace dropout_op with scale_op when is_test is true (#19297 ) * Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true. test=develop * Delete dropout_op directly when upscale_in_train is true. test=develop * Improve the debug string, adding the print of op_desc information. * Fix the case when dropout's input x is reused as the next op's output. * Add the pass to inference. test=develop * Change the log level. test=develop * Add unittest for inplace case. * Add comment to explain the pass. * Apply the pass for CPU inference. test=develop * Fix the typo. test=develop * Add the check of AttrType. test=develop	6 years ago
tangwei12	65c7368400	Fix the correctness of async mode at distributed training (#18863 ) * fix correctness of the communicator * fix a bug in send thread when sending var context is empty, test=develop * add lookup_table_prefetch_op and prefetch optimize, test=develop * remove remote prefetch GPU supported * word2vec force with CPU, test=develop * test dist remote lookup table force with CPU, test=develop	6 years ago
joanna.wozna.intel	2e3ec66be0	Add conv dequant squash for int8 (#18905 )	6 years ago
Tao Luo	c82280e445	remove unused conv_elementwise_add2_act_fuse.cc (#19344 ) test=develop	6 years ago
Adam	97d1db1874	Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237 ) * Add generalized Conv+Activation MKLDNN fuse pass creation Part2 test=develop * Undefined behaviour of GetAttrIfExists<> FIX test=develop	6 years ago
Zhaolong Xing	76c95af000	Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213 ) * fix mask rcnn bug: 1. affine channel fuse (diff) 2. condition block op (memory leak) 3. merge lod tensor op (diff) 4. memroy optim (diff) test=develop * fix ci aboud PADDLE_ENFOCE fix merge lod infer op ut test=develop	6 years ago
liuwei1031	50582071dc	fix compilation issue in windows vs2017 (#19183 ) * fix compilation issue in windows vs2017, test=develop * fix gtest lib not found issue, test=develop	6 years ago
juncaipeng	5368b36512	remove the warning for reminding user to avoid using the OriginProgram method, test=develop (#19244 ) This log information may annoy users who don't need to care about it.	6 years ago
Adam	b837689e97	Add generalized Conv+Activation MKLDNN fuse pass creation (#19072 ) test=develop	6 years ago
joanna.wozna.intel	492a00f53e	Add conv reqantize squash (#18754 ) * Add requantize squash test=develop * Add more precise tests test=develop * REname and REfactor tester test=develop	6 years ago
joanna.wozna.intel	bce72c7fea	Replace Relu with bounded Relu in MobileNetV2 quantization (#18988 ) test=develop	6 years ago
chengduo	e044e84264	open fuse_all_optimizer_ops (#19087 ) test=develop	6 years ago
chengduo	17d62ab220	Enhance fuse optimization op pass (#19010 ) * Enhance fuse optimization op pass test=develop	6 years ago
Zeng Jinle	2175d19993	fix memory_reuse_pass memory_size calculation error, test=develop (#19020 )	6 years ago
Zeng Jinle	7ac748adb4	Open gc by default (#18836 ) * open gc by default, test=develop * fix test_train_recognize_digits and disable gc when ngraph is enabled, test=develop * fix conditional_block op eager deletion bug, test=develop * add some comments to reviewers, test=develop	6 years ago
石晓伟	ee2f296ef8	Fusion: seqpool_cvm_concat (#18471 ) * add fusion_seqpool_cvm_concat test=develop * simplify pass, test=develop * fix code style, test=develop	6 years ago
Zeng Jinle	8008ab4e6b	Remove legacy C++ memory optimization codes (#18834 ) * remove legacy memory optimization codes, test=develop * follow huihuang's comments,test=develop * follow luotao's comments, test=develop	6 years ago
chengduo	4140fe11a4	Open fuse optimization ops (#18741 ) * open fuse optimization ops test=develop	6 years ago
Zeng Jinle	a802da650b	Feature/mem opt pass refactor (#18735 ) * first version memory optimize pass, test=develop * remove move_tensor_sharing_pass, test=develop * refine code comments, add unittests, test=develop * turn off memory_optimize by default, test=develop * follow huihuang's comments, test=develop * follow chengduoZH's comments, test=develop * fix grammar error, add const qualifier, fix pass_test exception message, test=develop * follow chengduoZH's comments 2nd, test=develop	6 years ago
Zhaolong Xing	26ae6d49e4	Update trt5 for paddle-trt (#18645 ) * update paddle-trt for: 1. fix bug: when batch > 2, core in split plugin. 2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.) 3. add new attr to dropout. 4. shuffle channel, swish, relu6 support test=develop * 1. fix ci test=develop	6 years ago
chengduo	fd3aad6cb3	Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664 ) * support sparse gradients test=develop	6 years ago
Huihuang Zheng	89bc3fd841	Support memory eager deletion on recurrent OP (#17710 ) Test PaddingRNN on V100 GPU device. Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU. GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR) Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)	6 years ago
Zeng Jinle	d3003a1620	Feature/buffer_shared_inplace (#17911 ) * feature/buffer_shared_inplace, test=develop * refine code, test=develop * fix elementwise_add op cpu inplace and sum inplace bug, test=develop * add unittest and debug log, test=develop * fix parallel_executor scope bug, polish code, test=develop * fix sum op, activation op, single_in_place_inference bug, test=develop * remove kLocalExecScopeName, test=develop * fix unittest,test=develop * fix out_var first version bug, test=develop * follow comments,test=develop	6 years ago
Zhaolong Xing	88b52a27fe	Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532 ) * Fix Mask rcnn predictor 1. refine memory optim algorithm to support the model with the block op. 2. output diff : modify the affine channel fuse 3. add condition_block_infer op add interface for setting trt calib table dir test=develop * add the missing files. test=develop	6 years ago
gongweibao	160ddc980c	Regroup fusion by date type. (#18496 )	6 years ago
chengduo	7453857324	Make fuse_all_reduce_op_pass support mix_precision (#17652 )	6 years ago
Michał Gallus	7023a86c3a	Fix Pooling output scale (#18186 ) * Int8: Fix Pooling output scale test=develop * Update scales quantization for certain operators These include: concat, transpose, pool and reshape. test=develop * Move concat minimum scale finding to quantizer test=develop	6 years ago
Sylwester Fraczek	9252e8fa08	add int8 mkldnn prior_box (#17242 ) add prior_box quantization code add scale algo rules for prior box test=develop	6 years ago
chengduo	14e1e165df	update alloc_continuous_space_for_grad_pass (#18287 ) test=develop	6 years ago
gongweibao	f5caf3443c	Fix reinitialized ncclid error! (#18025 )	6 years ago
gongweibao	da9143c1cc	Polish codes of old prs. (#17938 )	6 years ago
石晓伟	bce259e5bf	Update the Anakin interfaces for content-dnn and MLU (#17890 ) * update anakin-engine interfaces for content-dnn test=develop * support only-gpu mode of Anakin modify eltwise parse test=develop * modification for thread-safe test=develop * Integrated template instance test=develop * increase template parameters test=develop * support MLU predictor test=develop * update anakin cmake files test=develop * update TargetWrapper::set_device * update the initialization of anakin subgraph test=develop * use the default constructor of base class test=develop	6 years ago
Zeng Jinle	3ece61f71e	Remove attribute in Allocator::Allocate (#17878 ) * remove attribute in Allocator::Allocate, test=develop * fix travis ci error, test=develop	6 years ago
gongweibao	972c54cd70	Fix FLAGS_fuse_parameter_memory_size unit from Bytes to MBytes. (#17924 )	6 years ago
gongweibao	fbbdc9ccad	Add backward and optimizer operator dependency pass. (#17746 )	6 years ago
Yiqun Liu	8fd39f3e99	Enhance fused_elementwise_activation op and add python api in contrib.layers (#17236 ) * Enhance fused_elementwise_activation op. test=develop * Move the api fused_elementwise_activation to contrib. test=develop * Add including files. test=develop * Add the support of sigmoid in fused_elementwise_activetion op. * Update API.spec. test=develop	6 years ago
mozga-intel	5eb81fe595	Capi for a ngraph engine (#17037 )	6 years ago
Jacek Czaja	6d8075ecef	[MKL-DNN] conv_transpose mkldnn bias pass (#17644 ) * - changes to graph detector - Changes to pass - Added ut for new pass - use_pass - Added pass to mkldnn passes - fix to registration - improved verbose messaging for conv bias passes - Lint fixes test=develop * - Lint fixes test=develop	6 years ago
Sylwester Fraczek	96845d2168	add Concat quantization (#17448 ) * add Concat quantization add unit test for quantizing concat fix for wrong value when the input is not in map of calculated scales add use_quantizer to concat_op.cc add scale_algo rules for concat test=develop * missing fix for multiple inputs quantize-squash * wojtuss review fix: adding comment test=develop	6 years ago
gongweibao	65bbf950ee	Add multi-ncclcomm and 2D ncclallreduce support. (#17263 )	6 years ago
Zhaolong Xing	61221ebc28	TRT: Support set dynamic range in int8 mode. (#17524 ) * fluid int8 train and trt int8 predict align. trt int8 predict init op converter * 2. align fluid int8 train and trt int8 inference. enhance quant dequant fuse pass enhance op converter, trt engine, trt engine op, trt subgraph pass. * 3. add delete_quant_dequant_pass for trt test=develop * 4. add the missing file test=develop * 5. i modify the c++ interface, but forget to modify the pybind code fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter test=develop	6 years ago
Michał Gallus	0c39b97b4e	[MKL-DNN] Add Fully Connected Op for inference only(#15226 ) * fuse mul and elementwise add to fc * Reimplement the FC forward operator * Fix FC MKLDNN integration by transposing weights * Add FC MKLDNN Pass test=develop * FC MKLDNN Pass: change memcpy to std::copy * Fix MKLDNN FC handling of mismatch input and weights dims * Lower tolerance for MKL-DNN in resnet50 test test=develop * Adjust FC to support MKLDNN Op placement test=develop * Adjust Placement Op to set use_mkldnn attribute for graph test=develop * MKLDNN FC: fix weights format so that gemm version is called test=develop * FC MKLDNN: Remove tolerance decrease from tester_helper * FC MKL-DNN: Refactor the code, change input reorder to weight reorder * MKL-DNN FC: Introduce operator caching test=develop * FC MKL-DNN: Fix the tensor type in ExpectedKernelType test=develop * FC MKL-DNN: fix style changes test=develop * FC MKL-DNN: fallback to native on non-supported dim sizes test=develop * FC MKLDNN: fix CMake paths test=develop * FC MKLDNN: Refine placement pass graph mkldnn attribute test=develop * Fix Transpiler error for fuse_conv_eltwise test=develop * Fix missing STL includes in files test=develop * FC MKL-DNN: Enable new output size computation Also, refine pass to comply with newest interface. test=develop * FC MKL-DNN: enable only when fc_mkldnn_pass is enabled * FC MKL-DNN: Allow Weights to use oi or io format * FC MKL-DNN: Adjust UT to work with correct dims test=develop * Enable MKL DEBUG for resnet50 analyzer test=develop * FC MKL-DNN: Improve Hashing function test=develop * FC MKL-DNN: Fix shape for fc weights in transpiler * FC MKL-DNN: Update input pointer in re-used fc primitive * Add log for not handling fc fuse for unsupported dims test=develop * FC MKL-DNN: Move transpose from pass to Op Kernel test=develop * FC MKL-DNN: Disable transpose in unit test test=develop * FC MKL-DNN: Remove fc_mkldnn_pass from default list * Correct Flag for fake data analyzer tests test=develop * FC MKL-DNN: Add comment about fc mkldnn pass disablement test=develop * FC MKL-DNN: Disable fc in int8 tests test=develop	6 years ago
Sylwester Fraczek	5b2a3c4b12	Conv concat relu quantization (#17466 ) * add conv_concat_relu fuse test=develop * add test code test=develop * added missing include with unordered_map test=develop * review fixes for wojtuss test=develop * remove 'should (not) be fused' comment statements one of them was invalid anyway test=develop	6 years ago
Sylwester Fraczek	bccb0ba49a	fix quantize_squash_pass segfault when no tensor linked to Bias (#17292 ) * fix quantize_squash_pass segfault when there is no tensor linked do Bias input test=develop * add googlenet test test=develop * fix concat CreateKey not using input format test=develop	6 years ago
Qiao Longfei	58f7695ab2	Async exe support communicator (#17386 ) Async exe support communicator	6 years ago
guomingz	2281ebf0f3	Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130 ) * Relu6 is the bottleneck op for Mobilenet-v2. As the mkldnn supports the conv/relu6 fusion, we implement it fusion via cpass way. Due to the int8 enabling for this fusion will be supported in MKLDNN v0.20, so this PR is focused on the fp32 optimization. Below table shows the benchmark(FPS) which measured on skx-8180(28 cores) Batch size \| with fusion \| without fusion -- \| -- \| -- 1 \| 214.7 \| 53.4 50 \| 1219.727 \| 137.280 test=develop * Fix the format issue test=develop * Add the missing nolint comments. test=develop * Fix the typos. test=develop * Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine. test=develop * Adjust the indentation. test=develop * Add the test_conv_brelu_mkldnn_fuse_pass case. test=develop * Slightly update the code per Baidu comments. Let the parameter definition embedded into the code. That's will make the code easy to understand. test=develop	6 years ago
liuwei1031	c3949f5699	remove two useless flags: enable_subgraph_optimize, memory_optimize_debug, test=develop (#17491 )	6 years ago
Tao Luo	32da5e9c3d	remove unused expected_kernel_cache_pass (#17486 ) test=develop	6 years ago
Zeng Jinle	712bfb17cb	fix recurrent_op,test=develop (#17433 )	6 years ago
Zhen Wang	4a1b7fec96	Add setting Scope function for the graph class (#17417 ) * add set_not_owned function for graph * add scope set. test=develop * add scope_ptr enforce not null before setting.test=develop	6 years ago
chengduo	04bd413acb	Code Clean: Move all pass to paddle::framework::ir (#17228 ) * move pass to ir * polish code test=develop * fix dependency test=develop	6 years ago
Zeng Jinle	4f8594088d	Enhance inplace/mem-opt pass and enhance softmax_with_cross_entropy op inplace (#17225 ) * add use_cuda to inplace pass,test=develop * add test softmax_with_xe_inplace test,test=develop * fix potential inplace bug test=develop * add more skip vars in mem opt pass,test=develop * follow comment,test=develop * follow comments,move duplicate out arg check to program->graph,test=develop	6 years ago
石晓伟	a72dbe9abf	Cherry-pick benchmark related changes from release/1.4 (#17156 ) * cherry-pick commit from `8877054` * cherry-pick commit from `3f0b97d` * cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn (cherry picked from commit `8643dbc233`) * Cherry-Pick from 16662 : Anakin subgraph cpu support (cherry picked from commit `7ad182e16c`) * Cherry-pick from 1662, 16797.. : add anakin int8 support (cherry picked from commit `e14ab180fe`) * Cherry-pick from 16813 : change singleton to graph RegistBlock test=release/1.4 (cherry picked from commit `4b9fa42307`) * Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2 Support ShuffleNet and MobileNet-v2, test=release/1.4 (cherry picked from commit `a6fb066f90`) * Cherry-pick : anakin subgraph add opt config layout argument #16846 test=release/1.4 (cherry picked from commit `8121b3eccb`) * 1. add shuffle_channel_detect (cherry picked from commit `6efdea8997`) * update shuffle_channel op convert, test=release/1.4 (cherry picked from commit `e4726a066f`) * Modify symbol export rules test=develop	6 years ago
tensor-tang	79ed1c76cd	fix bn fuse vardesc and add model saver (#17143 ) * fix bn fuse vardesc and add model saver test=develop * unify save model in test helper test=develop * fix mkdir on windows test=develop * remove magic number use bn bias var desc test=develop	6 years ago
Zeng Jinle	4e1bc6e805	Rewrite inplace pass and fix gc bug (#17126 ) * fix op graph view test=develop * rewrite inplace pass and fix reference count pass bug test=develop * fix unittest failed test=develop * follow comments, test=develop	6 years ago
Yihua Xu	93cedfdb9c	Fix the order while sorting the operators (#16756 ) * Fix the order when sorting operators. test=develop * Enable transfomer compare test item. test=develop * Use set to replace vector. test=develop	6 years ago
Yiqun Liu	112f16143b	Add an option to enable the cache of expected kernel in train phase. (#16724 ) * Add an option to enable the cache of expected kernel in train phase. test=develop * Change the default value of cache_expected_kernel to true.	6 years ago
Tao Luo	ad4a1bd13c	Merge pull request #16339 from luotao1/core_opt_choose_kernel Cache the chosen kernel of operators	6 years ago
Yiqun Liu	3fe8cb0dd7	Enable the runtime_context_cache pass in train phase (#16640 ) * Try to enable the runtime_context_cache pass in train phase. * Put the append of runtime_context_cache pass ahead of multi_dev passes. test=develop	6 years ago
luotao1	695f2db6a0	update expected_kernel_cache_pass test=develop	6 years ago
luotao1	226596a296	Merge branch 'develop' into core_opt_choose_kernel	6 years ago
gongweibao	423bc515da	fix batch merge bug (#16601 )	6 years ago
Qiao Longfei	baf02328b2	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator test=develop	6 years ago
Qiao Longfei	d8974e6da0	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator test=develop	6 years ago
nhzlx	d065b5bf2b	Anakin ssd support refine trt first run add quant dequant fuse pass omit simplify_anakin_priorbox_detection template omit transpose_flatten_concat_fuse template test=develop	6 years ago
chengduo	ed61d67c73	Fix the interface of Pass::Apply (#16484 ) * modify the interface of Pass::Allay test=develop * Polish code test=develop * Fix Travis CI test=develop * fix Pass::Apply interface test=develop * Fix Travis CI test=develop	6 years ago
Qiao Longfei	392e97aae5	fix cpplint test=develop	6 years ago
Qiao Longfei	30618409db	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator	6 years ago
nhzlx	953bdde058	Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD test=develop	6 years ago
Wojciech Uss	46677fb080	Move cpu_quantize_* passes into mkldnn subfolder test=develop	6 years ago
nhzlx	3df7b98a0f	Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD	6 years ago
luotao1	056599a738	add expected_kernel_cache_pass test=develop	6 years ago
Wojciech Uss	cbe2dbf0db	Add enabling quantization (#16326 ) * Add enabling quantization test=develop * remove unused (here) function	6 years ago
Tao Luo	9a05859179	Merge pull request #16322 from wojtuss/wojtuss/fix_cpu_quantize_pass fix pattern maching conv2d with(out) ResidualData	6 years ago
nhzlx	c407dfa3cb	cherry-pick from feature/anakin-engine: refine paddle-anakin to new interface. #16276	6 years ago
nhzlx	a25331bc26	cherry-pick from feature/anakin-engine: deal the changing shape when using anakin #16189	6 years ago
nhzlx	69d37f81d7	cherry-pick from feature/anakin-engine: refine anakin subgraph. #16157 support change input size	6 years ago
nhzlx	a1d200a5de	cherry-pick from feature/anakin-engine: Anakin support facebox #16111	6 years ago
Wojciech Uss	104a9f1e27	fix pattern maching conv2d with(out) ResidualData test=develop	6 years ago
luotao1	82af8031d9	add runtime_context_cache_pass test=develop	6 years ago
Tao Luo	7d2740db83	Revert "cache runtime_context"	6 years ago
Qiyang Min	c7f1f3ed0c	Merge pull request #16214 from velconia/imperative_infer_var_type Implement imperative infer var type	6 years ago
Wojciech Uss	af03008890	Add cpu_quantize_placement_pass for C-API quantization (#16265 ) * Add cpu_quantize_placement_pass for C-API quantization test=develop * added a comment on required pass attributes test=develop	6 years ago
minqiyang	b40e41fbd1	Polish code style test=develop	6 years ago
minqiyang	36dce65bb3	Take DataType and VarType apart test=develop	6 years ago
luotao1	cc0ae1f1a1	refine with comments test=develop	6 years ago
luotao1	a275fd6e0c	Merge branch 'develop' into runtime_context	6 years ago
Wojciech Uss	2579ade45f	Add cpu_quantize_pass for C-API quantization (#16127 ) * Add cpu_quantize_pass for C-API quantization test=develop * add cpu_quantize_pass test * fix lint: add include memory unorderd_map and unordered_set test=develop * fuse_relu 1 test=develop * tuned 2 without squash * fixes test=develop * remove unused vars test=develop * refactored test=develop * fix lint c-style cast -> C++ style cast test=develop * remove QuantMax and c style casts test=develop * last usage of QuantMax removed test=develop * Fix Analysis Predictor UT Check if memory_optimize_pass has already been added to the analysis config before adding a new one, so that it is not added multiple times. test=develop * change map to unordered_map fix the forgotten part of cpu_quantize_pass_tester.cc test=develop * removed quantized attribute * fixed cpu_quantize_pass_tester and op attr comments test=develop * removed redundant line test=debug * removed gmock test=develop * fix after merge	6 years ago
qingqing01	86e912c544	Fix windows compiling (#16230 ) test=develop	6 years ago
luotao1	1b59bed989	Merge branch 'develop' into runtime_context	6 years ago
luotao1	6ce25c99a0	Merge branch 'develop' into runtime_context	6 years ago
qingqing01	8ad672a287	Support sync batch norm. (#16121 ) * Support Sync Batch Norm. * Note, do not enable it in one device. Usage: build_strategy = fluid.BuildStrategy() build_strategy.sync_batch_norm = True binary = fluid.compiler.CompiledProgram(tp).with_data_parallel( loss_name=loss_mean.name, build_strategy=build_strategy)	6 years ago
minqiyang	ca392c7e97	Implement infer var type context	6 years ago
Wojciech Uss	b9252f3df8	Add cpu_quantize_squash_pass for C-API quantization (#16128 ) * Add cpu_quantize_squash_pass for C-API quantization test=develop * add cpu_quantize_squash_pass teste * fix lint: add include memory unorderd_map and unordered_set test=develop * lint fix 2 * fixes test=develop * refactored test=develop * fix windows ci test=develop	6 years ago
luotao1	d94fd97230	add runtime_context_cache_pass test=develop	6 years ago
Zhen Wang	41b8cf0bae	Merge pull request #16162 from wzzju/fix_nan_static_quant Fix NaN bugs for static quantization strategy (mutil-cards train).	6 years ago
Zhen Wang	5685a48c23	Add some fixme. test=develop	6 years ago
Zhen Wang	ac6ef06ffa	Add the Clone method in Graph. test=develop	6 years ago
Zhen Wang	01eddf125c	Not add graph copy construction method. test=develop	6 years ago
Zhen Wang	1b9c8d5f06	add clone function for IrGraph. test=develop	6 years ago
Yihua Xu	40f1dd818b	Fix the node's order issue when the content of graph is changed (#16088 ) * Fix the node's sort issue when the graph is changed. test=develop * Clean code test=develop	6 years ago
Qiao Longfei	fab1b54d99	Merge branch 'add-communicator' of ssh://github.com/jacquesqiao/Paddle into add-async-ssa-graph-executor-communicator	6 years ago
nhzlx	2eff3e26b6	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt	6 years ago
nhzlx	06a088a199	fix comments and fix cpplint test=develop	6 years ago
Krzysztof Binias	72253391b6	Add MKL-DNN placement pass tester test=develop	6 years ago
Qiao Longfei	49f2f4f91d	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator	6 years ago
Michal Gallus	6a2bc9a275	Add Conv Residual Connection UT for Projection test=develop	6 years ago
Xin Pan	a6e3cd5eb7	Merge pull request #15425 from panyx0718/api Pass graph to parallel executor instead of program	6 years ago
Qiao Longfei	b8491bfd4e	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator	6 years ago
Xin Pan	0362ef75f4	fix test=develop	6 years ago
Xin Pan	12a0e2ed9d	polish codes test=develop	6 years ago
Xin Pan	19d78f6797	polish test=develop	6 years ago
Xin Pan	32d5a16036	resolve conflicts test=develop	6 years ago
Michał Gallus	c4faf36e7a	MKL-DNN: Add test for conv bias fuse pass (#15824 ) * MKL-DNN: Add test for conv bias fuse pass test=develop * Remove const cast from Conv Bias Pass Test * Add conv with bias test case for conv+bias fuse ut test=develop	6 years ago
Xin Pan	26e32e095a	allow compiler to use graph test=develop	6 years ago
Sylwester Fraczek	0b926114c0	add override to ApplyImpl and #pragma once in edited headers add #include<string> in edited headers test=develop	6 years ago
Xin Pan	6019054cdd	Merge pull request #15716 from Yancey1989/refine_pg Refine ParallelGraph Execution	6 years ago
tensor-tang	e1c707fe9c	fix warnings (#15790 ) * fix warnings test=develop * fix enforce test test=develop	6 years ago
Yancey1989	4b193db14c	polish code test=develop	6 years ago
Yancey1989	642fd68ce0	update by comment test=develop	6 years ago
Yan Chunwei	077d12b939	fix scale cleaner (#15742 )	6 years ago
nhzlx	ecc12fb430	3. when runing in trt mode, do not allocate memory for parameters in fluid. test=develop	6 years ago
Yancey1989	ecdd1166b8	cleanup code test=develop	6 years ago
Yancey1989	73005ee00d	cleanup code test=develop	6 years ago
Yancey1989	88d3dc949e	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into refine_pg test=develop	6 years ago
Yancey1989	f3463ecb6e	refine pg execution	6 years ago
dzhwinter	f9ac88e1a0	Merge pull request #15694 from liuwei1031/fix_security_issue Fix security issue	6 years ago
tensor-tang	e49706c80e	Merge pull request #15659 from GBuella/add_to_string Tests - add some missing to_string calls	6 years ago
liuwei1031	b1f97a6fa9	fix security issue 27, 38 test=develop	6 years ago
Gabor Buella	da9c94da33	Clang build fixes (#15628 ) * Remove some superfluous std::move calls The std:move triggered a build error (with -Werror): ``` [ 9%] Building CXX object paddle/fluid/memory/allocation/CMakeFiles/allocator_facade.dir/allocator_facade.cc.o /home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move] [this] { return std::move(CreateAllocatorWithChunk()); }, capacity); ^ /home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: note: remove std::move call here [this] { return std::move(CreateAllocatorWithChunk()); }, capacity); ^~~~~~~~~~ ~ 1 error generated. ``` See: https://reviews.llvm.org/D7633 * Remove a superfluous lambda capture from framework/operator.h ``` [ 10%] Building CXX object paddle/fluid/platform/CMakeFiles/device_context.dir/init.cc.o In file included from /home/tej/code/gbuella_paddle/paddle/fluid/platform/init.cc:19: /home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.h:229:21: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] [this](Variable* var) { return var; }); ^~~~ 1 error generated. ``` Changing it to `return it->second;`, as is in the function below. * Rethrow an exception (instead of copying it) ``` [ 11%] Building CXX object paddle/fluid/framework/CMakeFiles/operator.dir/operator.cc.o /home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: error: local variable 'exception' will be copied despite being thrown by name [-Werror,-Wreturn-std-move] throw exception; ^~~~~~~~~ /home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: note: call 'std::move' explicitly to avoid copying throw exception; ^~~~~~~~~ std::move(exception) ``` See https://reviews.llvm.org/D43322 for an explanation of this diagnostic message. * Remove an unused variable ``` /home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:884:16: error: private field 'scope_' is not used [-Werror,-Wunused-private-field] const Scope& scope_; ^ ``` * struct ComputationOpHandle -> class ComputationOpHandle ``` [ 13%] Building CXX object paddle/fluid/framework/details/CMakeFiles/memory_early_delete_pass.dir/memory_early_delete_pass.cc.o In file included from /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/memory_early_delete_pass.cc:21: /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: error: class 'ComputationOpHandle' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags] class ComputationOpHandle; ^ /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/computation_op_handle.h:29:8: note: previous use is here struct ComputationOpHandle : public OpHandleBase { ^ /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: note: did you mean struct here? class ComputationOpHandle; ^~~~~ struct 1 error generated. ``` * Fix name() methods under fluid/operators ``` In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.cc:15: In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.h:19: /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/jitcode.h:71:23: error: 'name' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override] virtual const char* name() const = 0; ^ /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen_base.h:31:23: note: overridden virtual function is here virtual const char* name() const = 0; ^ ``` test=develop	6 years ago
Dun Liang	1905f1a108	bug fix && test=develop	6 years ago

... 2 3 4 5 6 ...

673 Commits (3d015f1cf529915ab52cb8aef7c475f67fb128b5)