Paddle

Commit Graph

Author	SHA1	Message	Date
Adam	e81f0228df	MKL-DNN 1.0 Update (#20162 ) * MKLDNN v1.0 rebase to Paddle 1.6 test=develop * Add hacky paddle::string::to_string() implementation * vectorize<int64-t>() -> vectorize() cleanup test=develop * PADDLE_ENFORCE and void_cast fixes test=develop * Rebase changes test=develop * Cosmetics test=develop * Delete MKL from mkldnn.cmake test=develop * CMake debug commands test=develop * Delete MKLDNN_VERBOSE and rebase fixes test=develop * Rebase fixes test=develop * Temporarily disable int8 resnet101 vgg16 and vgg19 tests test=develop * Add libmkldnn.so.1 to python setup test=develop * Add libmkldnn.so.1 to inference_lib cmake after rebase test=develop * Post rebase fixes + FC int8 changes test=develop * Fix LRN NHWC test=develop * Fix NHWC conv3d test=develop * Windows build fix + next conv3d fix test=develop * Fix conv2d on AVX2 machines test=develop	5 years ago
wangchaochaohu	95b95a284b	Mean gpu optimize (#21643 ) * accelerate mean op test=develop	5 years ago
Zeng Jinle	0f8888360e	Polish op registry codes (#21561 ) * polish infer shape registry, test=develop * modify some operators registry, test=develop	5 years ago
Aurelius84	3d9dee575e	Set lod_level of Out in compile time of sequence_pool_op (#21604 )	5 years ago
Huihuang Zheng	1dcf6a7212	Add Much Complex Test and Fix Bugs for Control Flow cond API (#21532 ) Add tests to use dy/dx to make sure the gradient values calculated by the control flow backward is correct. Also fixed bugs detected by those tests. Fix bugs: 1. Unlike sum_op, optimizer ops don't allow uninitialized input tensor. But in conditional_block_grad_op, since the conditional_block may not run, the output gradient tensor may be uninitialized, which will cause the optimizer op error. To fix it, we should let optimizer ops support uninitialized input like sum_op or assign the uninitialized gradient to 0 when the conditional_block_grad_op doesn't run. I found there are about 10+ optimizer ops. To be simpler, I just assign output gradient of the conditional_block_grad_op to 0 in this PR. But it can be further explored whether we can make optimizer ops like sum_op to support uninitialized input tensor because theoretically we can speed up without the assigning in conditional_block_grad_op. 2. Infer parameter shapes during append_backward. I didn't know that all our parameters are in global block. When op_desc is inferring shapes at the sub-block, it may not know the shape of gradients of parameters whose shape information is at global block. I fixed it by inferring shapes of gradients from forward var. This PR also did some code clean up: 1. Print the var name when sgd_op catches shape error so that it is easier to debug 2. Fix a typo: dicta -> dict	5 years ago
Jacek Czaja	8f5a93a07b	- Fix to regression in performance of ResNet-50 training (#21588 ) test=develop	5 years ago
Jacek Czaja	9ce0e29dc3	[MKL-DNN] Batch norm mkl-dnn NHWC support (#21553 ) * - BAtch norm mkl-dnn NHWC test=develop - compilation fix test=develop - UT fix - cosmetics test=develop - Fix to Batch Norm MKL-DNN NHWC UT test=develop Conflicts: paddle/fluid/operators/batch_norm_op.h * - Lint fixes test=develop	5 years ago
Youwei Song	cdba41af4d	dygraph Embedding layer use lookuptable v2 (#21209 ) * dygraph Embedding layer use lookuptable v2 test=develop * fix test_nce test=develop	5 years ago
wangchaochaohu	4c9b3dafa7	fill_constant_batch_size_like OP precious problem fix (#21337 ) * fix fill_constant_batch_size_like_op precious problem test=develop	5 years ago
WangXi	768f9242e9	Fix dgc clip & rampup step, test=develop (#21491 )	5 years ago
Zeng Jinle	3662fb71a7	remove eval() calls in Eigen, test=develop (#21498 )	5 years ago
Jacek Czaja	18a5d30754	[MKL-DNN] Conv2d and Conv2d transpose MKL-DNN NHWC support (#21466 )	5 years ago
Tao Luo	70eb397677	remove unused snappy/snappystream depends in distributed codes (#21484 ) test=develop	5 years ago
lilong12	0bc8bdf724	set dim[0] to -1 if dim[0] < 0 during compiling for c_allgather op (#21402 ) * set dim[0] to -1 if dim[0] < 0 and remove assertion to runtime, test=develop * modify ENFORCE message, test=develop * add validation for x.shape[0] > 0, test=develop * add ut, test=develop	5 years ago
tangwei12	0bddb951c2	fix async mode, test=develop (#21367 )	5 years ago
Leo Chen	b3090ad406	fix synchronization problem in softmax_with_cross_entropy_op, test=develop (#21480 )	5 years ago
Tao Luo	01fa4ead61	fix -Wno-error=sign-compare warning in gcc8 (#21434 ) * fix -Wno-error=sign-compare warning in gcc8 test=develop * fix warning in distributed codes test=develop	5 years ago
Lv Mengsi	37f3e56dea	Fix transpose conv (#21406 ) * fix transpose conv,test=develop * fix comments test=develop	5 years ago
hutuxian	7e68bc896b	refactor AUC OP and add its CUDA Kernel (#21336 ) * refactor AUC OP and add its CUDA Kernel * the layout of global auc doesn't change	5 years ago
wawltor	dbbe6e9cb6	fix the device supported of the op unique and unique_with_counts. (#21395 ) * fix the device supported of the op unique and unique_with_counts. test=develop test=document_fix * Fix the precision of test in the op of unique and unique_with_counts. test=develop test=document_fix	5 years ago
wangguanzhong	379e3febf2	fix shape check in density_prior_box, test=develop (#21414 ) * fix shape check in density_prior_box, test=develop	5 years ago
Adam	76b55da15a	Fix bug in UpdatePadding for int64_t type (#21465 ) test=develop	5 years ago
Pei Yang	7b28d938bf	show shape diff in wrong trt input shape errmsg, test=develop (#21451 )	5 years ago
Jie Fang	5e813b53c5	nhwc optimization for batchnorm (#21090 )	5 years ago
Leo Chen	e0c9d856fb	add unused input vars check for OpWithKernel, test=develop (#21169 ) * add unused input vars check for OpWithKernel, test=develop * remove unused vars in some ops, test=develop * fix batch_norm, test=develop * add white list, test=develop * add CI check for white list, test=develop * :ove white list to c++, test=develop * solve failure of CI, test=develop * add unittest for unused_var_check, test=develop * refine code, enable check in operator_test, test=develop * skip mkldnn, test=develop * extend white list, test=develop * refine condition of mkldnn, test=develop * fix paddle_build, test=develop * follow comments, test=develop * fix GetExpectedKernelType * add wiki ref to err_msg, test=develop * follow comment, test=develop	5 years ago
Chen Weihang	664f958a02	Fix optimizer op infershape failed in dygraph multi-cards mode (#21374 ) * add param & grad shape check for sgd op * add _reshape_inplece interface for dygraph parallel * refine unittest based paddle/models scripts, test=develop * add unittest for parallel grad fuse, test=develop	5 years ago
Huihuang Zheng	630be31952	Fix Cond Bug for Nested Control Flow (#21340 ) * Commit before merging develop test=develop * Backup after working with Huihuang logs * Commit before deleting Huihuang debug loggings * Commit before debug test=develop * Fix bug commit test=develop * Backup of fixing bugs test=develop * Clean up code test=develop * Fix a bug in sum_op test=develop	5 years ago
Jacek Czaja	cd43c4440e	[MKL-DNN] LRN and Pool2d (FWD) NHWC support (#21375 )	5 years ago
Leo Chen	add62acfd1	remove kDepXOut for abs_grad op, test=develop (#21407 )	5 years ago
Adam	9107bf209f	Add template version of UpdatePadding (#21426 ) test=develop	5 years ago
zhaoyuchen2018	b16274556a	Add dscending for argsort (#21400 ) * Add ascending for argsort * Refine api doc description. * Refine descending description * Add int32 logic to speedup when data is small size. * Remove int32 opt as not support in python	5 years ago
hong	ac8546701d	Add dygraph execution context (#20157 ) * add_dygraph_execution_context * add dygraph infershape context and execution context; test=develop * fix imperative bug; test=develop * remove inputs outputs interface from execution context, because it have same function with inputNames; test=develop * remove tracer_test ctest; test=develop * fix split op bug; test=develop * fix unitests bug; test=develop * fix distribute test bug; test=develop * fix ngraph compile bug; test=develop * fix grad maker bug; test=develop * fix load op bugs; test=develop * fix operator.cc construct bug; test=develop * remove useless name find in operator; test=develop * add tracer_test; test=develop * fix concat, split bug; test=develop * remove tracer_test unitest; test=develop * fix attribute check bug; test=develop * add test code to fix converage; test=develop * remove useless code, change check backward input in engin; test=develop * unlock var type infer shape;test=develop * add ShareAllLoD api; test=develop * add dygraph infershape context unitest; test=develop * remove increase and decrease lod in dygraph; test=develop * addd override; test=develop * fix increase descrease lod; test=develop * fix paddle_enforce; test=develop * disable lod op dygraph check; test=develop * fix paddle enforce error; test=develop * add comment for op_registry and OperatorBase; test=develop * optimize the comment of op_registry; test=develop * fix format of comment; test=develop * fix format of comment; test=develop * optimize the format of comment; test=develop * optimize the format of the comment; test=develop * optimize comment of op_registry; test=develop	5 years ago
hutuxian	a6b089c614	add macro to ban windows (#21422 ) remove nccl related code in windows	5 years ago
Kaipeng Deng	ebfb720a63	add Adam beta1/beta2 support Variable (#21234 ) * add Adam beta1/beta2 support Variable. test=develop	5 years ago
Zeng Jinle	09696d5df8	Use system allocator in OpTest (#21335 ) * use system allocator in unittests, test=develop * fix op bugs, test=develop * fix tensor copy bug when src and dst are the same, test=develop	5 years ago
Kaipeng Deng	67c836fb5c	batch_norm momentum support variable (#21246 ) * batch_norm momentum support variable. test=develop * fix format. test=develop * add batch_norm momentum variable example. test=develop * move MomentumTensor to training branch. test=develop * split example. test=develop * fix doc. test=develop * fix PADDLE_ENFORCE ci. test=develop * fix format. test=develop	5 years ago
ShenLiang	e2c6f434ec	Add Lod information for gather_nd & scatter_nd (#21404 ) * add lod information, test=develop * add lod, test=develop * fix lod, test=develop * fix lod, test=develop	5 years ago
Tao Luo	c0656dcb1a	remove -Wno-error=sign-compare, make warning as error (#21358 ) * remove -Wno-error=sign-compare, make warning as error test=develop test=document_fix * fix exist compile warning test=develop	5 years ago
Zeng Jinle	b97fc16d21	fix lod_reset bug, test=develop (#21392 )	5 years ago
Zeng Jinle	89966525f1	Polish reference count pass (#21324 ) * fix ref_cnt pass, test=develop * add cpp unittests to reference_count_pass, test=develop * follow comments, test=develop	5 years ago
hutuxian	47a82e38e3	Support data_norm gpu kernel (#21325 ) * support data_norm_op run in CUDA * add two parameters sync_stats & summary_decay_rate * add UT	5 years ago
GaoWei8	8493f20ebc	Polish the codes of fc when needs padding (#21378 ) test=develop	5 years ago
Michał Gallus	5d7d548275	INT8 Fully-connected (#17641 ) * Implement Int8 FC * Integrate FC into INT8v2 test=develop * int8 FC: transpose weights before computing scales test=develop * Add support for activation_type string in FC test=develop * Disable MKL-DNN's FC in VGG16 and 19 test=develop * Disable FC quantization when mkldnn FC is disabled test=develop * Solve PADDLE_ENFORCES in FC int8 * Fix Paddle enforces and remove const cast test=develop * Fix style changes test=develop * Fix quantizer_tester test and add fc quantization test=develop * Fix FC test fail on CUDA * Remove unnecessary log from quantize placement pass test=develop * Add Thread ID to FC hash key test=develop * Add comments to MKL-DNN FC Kernel test=develop * Refactor quantizer test=develop * Fix linter issues test=develop * Fix crash in slim googlenet test=develop * Fix PADDLE_ENFORCE messages test=develop	5 years ago
Zeng Jinle	b639a882c3	fix syn bn grad maker, test=develop, test=document_fix (#21317 )	5 years ago
Youwei Song	4d0f5ab1a8	add axis check for concat op (#21288 ) * add axis check for concat op test=develop * fix PADDLE_ENFORCE format test=develop * move to ComputeAxis for InferShape check test=develop	5 years ago
zhaoyuchen2018	afb134847d	Fix ernie python infer diff (#21311 ) * Fix ernie pythoin infer diff * Refine mask test=develop	5 years ago
Lv Mengsi	b6ce4f8b2f	Fix mistake of batch norm op (#21237 ) * fix_bn * revert unittest,test=develop	5 years ago
lilong12	41d13209d7	add the framework support for distfc (#21197 ) * add the framework support for distfc and ut, test=develop * fix the implementation of shard_index_op, test=develop	5 years ago
GaoWei8	234060f88f	Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972 ) * Add fc padding to solve mkl performance test=develop * fix gpu pass and error information test=develop * fix fc_fuse_pass_test test=develop * fix error information test=develop * fix error information test=develop * fix name and add fc op padding test test=develop * fix attributes test=develop * optimize fc padding test=develop * fix test test=develop	5 years ago
Jacek Czaja	f4cf028a8c	[MKL-DNN] Error throwing for NHWC layout for MKL-DNN ops (#21207 )	5 years ago
Michał Gallus	ed9ceb9f98	Refactor MKL-DNN ElementwiseMul (#21061 ) * Refactor MKL-DNN ElementwiseMul remove manual fallback, remove format attrs test=develop * Refine PADDLE_ENFORCEs in eltwise_mul_op.h test=develop * Make ElementwiseMulOp inherit from ElementwiseOp * Change type of simd_width to int test=develop * Remove Constructor extensions in ElementwiseOp and ElementwiseMulOp test=develop * Restore attributes test=develop * Fix test coverage for mkldnn eltwise mul test=develop * Conform to new is_run_common_broadcast API test=develop * Add UT for AreDimsAndFormatCorrect test=develop	5 years ago
zhouwei25	345b67b5e2	remove warning LNK4006 and warning LNK4221 (#21226 )	5 years ago
wangchaochaohu	6514f52e46	fix the fill_constant op precious problem (#21322 ) * fix the fill_constant op precious problem test=develop	5 years ago
zhaoyuchen2018	08c19c585d	Improve argsort performance. (#21267 ) * Improve argsort performance. - Give 200000 data to compute argsort on v100, can speed up ~190x before opt cost: 0.53s after opt cost:0.0027s - Add fp16 support * Refine error message * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
WangXi	8ac7687e36	Fix dgc accuracy by mv regularization to local (#21278 )	5 years ago
Leo Zhao	b19e1a1b56	use prefetch to load next mem into cache (#21206 ) * use prefetch to load next mem into cache test=develop * remove hard code memcpy om pyramid_hash_ff test=develop	5 years ago
gongweibao	ed2a185248	optimize nhwc for tensor core in ConvOp and ConvGradOp (#20597 )	5 years ago
Yihua Xu	69dd5152cf	Fix the crash issue when scale or bias was null-pointer. (#21284 ) * Fix the crash issue when scale or bias was null-pointer. test=develop * Add the error message for passing CI. test=develop	5 years ago
Zhang Ting	698b8b73ad	optimize lod_reset op to avoid data transform	5 years ago
Liufang Sang	f0b1518438	add dequantize_abs_max op and modify lookup_table op (#20899 ) * add int8 kernel to lookup_table op and add dequantize op test=develop * change paddle_enforce to paddle_enforce_eq test=develop * change copyright and change some not suitable code test=develop * remove debug log test=develop * replace GetInputType with IndicateVarDataType test=develop * fix EmptyGradMaker test=develop * fix diff between cpu and gpu test=develop * use memcopy when int8_t test=develop	5 years ago
hutuxian	a6ce2306f9	support cvm_op run in gpu (#21300 ) Previously, CVM OP was only able to run in CPU. This PR implements its GPU kernel. What's more, we improve the UTs about CVM OP.	5 years ago
Yihua Xu	b085ecc258	Avoid the string as the key of map to improve the jit performance (#21292 ) * Avoid the string as the key of map to improve the jit performance. test=develop * Use map to replace unordered_map. test=develop	5 years ago
zhongpu	c4ede95c74	open dygraph op test, test=develop (#19787 ) * open dygraph op test, test=develop * modify to_variable, test=develop * modify input and output for dygraph, test=develop * modify input and output for dygraph(fix bug), test=develop * fix input processing of dygraph op test, test=develop * fix bug, test=develop * fix op test, test=develop * fix forward bug for dygraph, test=develop * fix mkldnn op test for forward, test=develop * update nn.py for dygraph, test=develop * fix crop_tensor_op, test=develop * fix elementwise_mul_op, test=develop * fix fill_op, test=develop * fix some mkldnn op, test=develop * open backward op test for dygraph, test=develop * delete log, test=develop * close backward op test for dygraph, test=develop * fix bug for edit_distance_op and test_lstm_cudnn_op, test=develop * fix optest backward bug for dygraph, test=develop * fix optest backward bug for dygraph, test=develop * close backward op test for dygraph, test=develop * close backward op test for dygraph, test=develop * open dygraph op test, test=develop * fix op test for dygraph, fix GradOpDescMaker, test=develop * fix bug for linear_chain_crf_op.h, test=develop * remove log, test=develop * remove log, test=develop * remove log for op_test.py, test=develop * remove log for op_test.py, test=develop * fix bug for var_conv_2d_op, change PADDLE_ENFORCE, test=develop * fix PADDLE_ENFORCE_EQ for hierarchical_sigmoid_op.cc, test=develop * fix bug for test_increment_ngraph_op.py, test=develop * fix lod for op test in dygraph, test=develop * refactor op_test.py to reduce redundant code, test=develop * fix lod optest, modify InputVar/OutputVar to HasInput/HasOutput, test=develop * remove debug log, test=develop * remove redundant code in base.py, test=develop * fix some error in optest, test=develop * fix ClearNoNeedBufferInputs function's bug for LoDTensor, test=develop * refactor op_test.py, test=develop * remove redundant writing, test=develop * fix error(get tensor of the grad variable), test=develop * fix test_concat_mkldnn test_conv2d_mkldnn, test=develop * fix optest.py for get tensor of LoDTensor, test=develop * fix optest.py for get tensor of LoDTensor, test=develop * fix optest.py for get tensor of LoDTensor, test=develop * fix some redundant code, test=develop * reslove conflict and rewrite paddle error message, test=develop	5 years ago
danleifeng	6fc3e8ec84	edit elementwise_mul doublegrad inplace (#21245 )	5 years ago
zhaoyuchen2018	3ff5cc2d5e	Fix topk compile failed on windows (#21243 ) * Fix topk compile failed on windows * Use explicit cast for assign data	5 years ago
Zhang Ting	01a9646323	optimize assign op to avoid copy data from GPU to GPU (#21181 ) * optimize assign op to avoid copy data from GPU to GPU, test=develop * modified GetkernelTypeForVar and just avoid device transform, test=develop	5 years ago
danleifeng	0e7baabe59	extend elementwise broadcast function (#20957 )	5 years ago
Adam	d623e863c9	Fix GELU grad error (#21204 ) test=develop	5 years ago
yaoxuefeng	b5d8ba8394	fix data_norm op to avoid impractical normalization result test=develop (#21152 ) * fix auc drop first commit test=develop * update datanorm op * update datanorm with enforce test=develop * update test=develop * update format test=develop * update format * update format test=develop * add unit test test=develop * update unit test test=develop * update format test=develop * update format test=develop * update API description test=develop * update API description test=develop * update format test=develop * fix codes as comments test=develop * fix description as comments test=develop * fix description as comments test=develop * update codes.. test=develop	5 years ago
Zhang Ting	9cbe7bccba	modified error message and API doc for channel_last supported Op (#21002 ) * modified error message for conv and conv_transpose, test=develop * modified doc of conv and conv_transpose op, test=develop * modified the expression for error message, test=develop * modified error message for group_norm op, test=develop * modified detail of Attr(data_format) or Attr(data_layout) * add ValueError in API doc for maxout op, test=develop	5 years ago
guofei	56b5d14704	Fix the error of init variable in StaticRNN when stop_gradient=ON (#21118 )	5 years ago
WangXi	3c98ec90ce	Fix INF bug of softmax_cross_entropy_op (#21165 )	5 years ago
Yihua Xu	eec9c9cbe7	Fix jit tls issue (#21151 )	5 years ago
ruri	aeb887911f	Refine edit distance cn (#21121 )	5 years ago
Kaipeng Deng	98b59cb82c	fix elementwise_mod float point kernel. test=develop (#21183 )	5 years ago
whs	cfdd1fc2cd	Fix warpctc in padding mode. (#21033 )	5 years ago
Chen Weihang	8da0cd537a	Add examples for error message writing specification - NotFound, OutOfRange, AlreadyExists, PermissionDenied (#21134 ) * add examples for error msg spec, test=develop * change ENFORCE to ENFORCE_*, test=develop add more already exists examples, test=develop	5 years ago
zhaoyuchen2018	b93870e696	Improve topk performance. (#21087 ) * Improve topk performance. give 200000 data to compute topk, before opt: cost 1s after opt: cost 0.0028s. * Refine return value. * Add cuda util funtions. * Fix ComputeBlockSize bug & refine comments. Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
Chen Weihang	8414575b78	Add examples for error message writing specification - PreconditionNotMet, Unimplemented, Unavailable (#21137 ) * add examples for error spec, test=develop * change ENFORCE to ENFORCE_**, test=develop	5 years ago
Chen Weihang	7e5f74b825	Add examples for error message writing specification - InvalidArgument (#21132 ) * add examples for error msg spec, test=develop * change ENFORCE to ENFORCE_*, test=develop fix error, test=develop	5 years ago
zhaoyuchen2018	4a544762a2	Add Asypadding for conv fusion. (#21041 ) * Add Asypadding for conv fusion. test=develop reference: pr/20042 * Fix eigen build link error * Change back file mode * Use math function & add more checks.	5 years ago
WangXi	de5d3ff688	Fix dgc buffer illegal & reuse velocity (#21012 )	5 years ago
ceci3	f62a929151	fix instance norm (#21042 ) * fix instance norm * update unitest,test=develop	5 years ago
lilong12	e249d9a3e2	fix the computation for dx (grad for x) for prelu operation. (#20949 ) * set the default value of alpha for prelu to 0.25, test=develop * add the call to __syncthreads(), test=develop * fix the implementation of cpu prelu, test=develop * repair the implementation of element mode prelu, test=develop * modify test_prelu_op.py, test=develop	5 years ago
Zhang Ting	e0285eae64	add check for input channels and Attr(groups), test=develop (#21095 )	5 years ago
Yiqun Liu	35f17ae28f	Add the check of lod_level between compile-time and runtime. (#20961 ) * Add the check of lod_level between compile-time and runtime. test=develop * Fix bug in check_compile_vs_runtime. test=develop * Fix the check of output when it is dispensiable or intermediate. test=develop * Share lod of x to out in match_matrix_tensor op in compile-time. * Implement GetLoDLevel in InferShapeContext. * Set the default value of check_compile_vs_runtime to False and enable it in test_sequence_pad_op. test=develop * Enable check_compile_vs_runtime in test_match_matrix_tensor. * Add the implementation of SetLoDLevel in InferShapeContext. * Remove the implementation of IncreaseLoDLevel and call Get/SetLoDLevel instead. * Remove the implementation of DecreaseLoDLevel and call Set/GetLoDLevel instead. * Refine some ops and unittests. test=develop * Fix a typo. test=develop * Remove the check of var type, and change int to int32_t. test=develop * Add unittest for Get/SetLoDLevel. test=develop	5 years ago
Chen Weihang	826254f664	Add pre-condition check for fuse optimizer op pass (#21005 ) * add pre condition check for fuse optimizer op pass, test=develop * add log & set init to zero, test=develop * fix test_fuse_all_reduce_pass failed, test=develop * polish details, test=develop * refine PADDLE_ENFORCE & remove needless VLOG, test=develop * refactor op check method, test=develop	5 years ago
Aurelius84	1cd6721873	Optimizer mmcpy if _rand_len=16 and remove data copy in GradKernel (#21099 )	5 years ago
joanna.wozna.intel	77c2083586	Add transpose2 INT8 for mkl-dnn (#19424 ) * Add transpose2 INT8 for mkl-dnn test=develop * Fix test_transpose_int8_mkldnn test=develop * Revert "Merge branch 'develop' into transpose_int8_mkldnn_2" This reverts commit 34011bdba4c859abb945e062ab13124f70508054, reversing changes made to 2ce6473f144da298aba4a43d46918f27d463cf7c. * Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"" This reverts commit 23754dd78ca47ae56881161172b2aacd349aba90. * Add template to TransposeMKLDNNHandler test=develop * Resolve conflict test=develop * Restore get_size and refactor test=develop	5 years ago
LielinJiang	06063b7001	add op locality_aware_nms, test=develop (#20976 )	5 years ago
wangchaochaohu	fc385777e4	fix the compile cost long time test=develop (#21064 )	5 years ago
Chen Weihang	2f27b10331	Add dependency for error_codes.proto (#21084 ) * fix activation_functions deps, test=develop, test=document_fix * add error_codes_proto deps, test=develop, test=document_fix * try delete enforce.h, test=develop, test=document_fix	5 years ago
wangchaochaohu	149a1e3124	Expand refine (#21063 ) * fix the expand op compile time cost long time test=develop * add tag for just copy test=develop	5 years ago
Wojciech Uss	af3ff422cc	Fix dst memory allocation in elementwise_add (#21059 ) test=develop	5 years ago
liym27	26a6e27afe	fix bug in pool/conv/conv_transpose: UpdatePaddingAndDilation, _get_padding_with_SAME and conv2dtranspose_forward_naive. (#20997 ) * fix bug in pool/conv/conv_transpose: 1. It should be stride[i] not stride[0] in UpdatePaddingAndDilation; 2. fix bug of func _get_padding_with_SAME in test_conv/conv_transpose_op.py; 3. fix bug of the computation process in function conv2dtranspose_forward_naive. test=develop * change test to make the data of different dimensions different. test=develop	5 years ago
Chen Weihang	7ee25189c3	Enrich the type of error and declare the error type interfaces (#21024 ) * Enrich the type of error and declare the error type interfaces, test=develop * adjust tests to adapt new form, test=develop * add inference deps with error_codes.pb.h, test=develop * restore stack iter start pos, test=develop * polish code based review comments, test=develop	5 years ago
Adam	3fda695bb0	Add support for asymetric padding in MKLDNN pool, conv and conv_transpose (#21062 ) * Add asymetric padding support for mkldnn pooling test=develop * Add asymetric padding support for mkldnn conv test=develop * Add asymetric padding support for mkldnn conv_transpose test=develop	5 years ago
Huihuang Zheng	1957192f05	Add select_input_op and select_output_op (#21016 ) These ops are useful in control flow.	5 years ago
Liufang Sang	e5e699ecc0	set lod level for compile time test=develop (#21022 )	5 years ago
liym27	f0e95a6049	Polish error messages of pool_2d/3d and add Raises in English document. test=develop (#21017 )	5 years ago
zhaoyuchen2018	0059404e77	Fix ce ocr_recognition test fails (#20987 ) ocr_recognition fails, so add a path to handle small frame_size. test=develop	5 years ago
Chengmo	bc8e600ce5	Fix rpc not wait in GEO communicator (#20967 ) * test=develop,fix rpc not wait in geo	5 years ago
Tao Luo	25ffa8445d	refine murmurhash3_x64_128 for bloom_filter (#20996 ) test=develop	5 years ago
Zeng Jinle	878a40f57d	Support NoNeedBufferVarsInference in dygraph backward (#20868 ) * support no need buffer vars in dygraph, test=develop * fix inference compilation error, test=develop * update no_need_buffer_vars_inference, test=develop * add unittests for no_need_buffer_vars_context, test=develop * refine no_need_buffer_vars by return ref, test=develop * polish some codes, test=develop	5 years ago
wangchaochaohu	bf379fef96	refine code for code reuse test=develop (#20988 )	5 years ago
Zhang Ting	de9bec607e	lrn supports channel_last input, test=develop (#20954 )	5 years ago
Liufang Sang	9b666cae67	fix diff in dequantize op between cpu and gpu test=develop (#20953 )	5 years ago
Zhang Ting	f4f85831d3	fix the bug of conv_transpose cudnn kernel, test=develop (#20958 ) fix the bug of conv_transpose cudnn kernel: before version 1.6, the data_format is AnyLayout in inference model. When use version 1.6 and load the model which is saved by previous version, the error occurs. This is because the cudnn kernel in version 1.6 is not compitable with Anylayout setting.	5 years ago
zhaoyuchen2018	7f3a445e9a	Fix gru as small frame_size has error. (#20922 ) seems shuffle_sync cannot handle small size test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
123malin	20cdff0e02	Optimize decay (#20816 ) * update pserver decay blocks * update distributed notify handler	5 years ago
Chengmo	16596f6498	Fix Paddle Cloud role maker (#20860 ) * fix PaddleCloud Role maker & add warning in distribute transpiler & change rpc_retry_times	5 years ago
liym27	59de8e1214	Compatible int32 and int64 for attr in concat/split/unsqueeze. test=develop (#20912 )	5 years ago
Zhang Ting	8d1e9f0f7e	maxout supports channel_last input (#20846 ) * maxout support channel_last input, test=develop * modified details of Input(X) and Attr(groups, axis) in doc, test=develop	5 years ago
Yihua Xu	b6260f3866	Optimize the kernel implementation of layernorm with openmp (#20895 )	5 years ago
hong	8c4573a3cb	GradMaker for dygraph (#19706 ) * refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * optimize grad maker; test=develop * optimize grad maker * test * grad make optim; test=develop * fix unittest bugs; test=develop * add dygraph grad op maker and split_op * grad op maker refactor; test=develop * add dygraph grad maker; test=develop * fix op deformable_conv_v1_op bug; test=develop * fix deformable_conv prroi pool bugs; * fix new op grad op maker bug; test=develop * fix split by ref bug; test=develop * fix dygraph auto prune bug; test=develop * fix test_trace bug; test=develop * fix fused emb seq pool bug; test=develop * remove useless code in op_desc file; test=develop * remove useless code, StrVarBaseNode; test=develop * fix review issues; test=develop * fix rank_loss grad maker; test=develop * remove flag in VarBase; test=develop * fix distributed_notify_op compile bug ; test=develop * fix reshape op double grad; test=develop * fix expand as op; test=develop * add impertive type_defs.h for demo_train; test=develop * fix inference lib cmake; test=develop * fix inference lib; test=develop * fix infernce_lib; test=develop * fix inference cmake; test=develop * fix inference lib; test=develop * fix inference lib; test=develop * remove condition dygraph grad maker, modify local name; test=develop * fix split grad maker bug; test=develop * fix pyramid_op bug; test=develop * change travis time out limit; test=develop * restore travis; test=develop * change timeout limit; test=develop	5 years ago
Chen Weihang	768551b25d	Add parameter init check add run_startup_progrom error message for fc(mul) (#20906 )	5 years ago
Zhang Ting	c18f1bd716	fix the bug of conv_transpose:compatible with Anylayout setting, test=develop (#20897 )	5 years ago
Wilber	b489760099	fix jit_matmul bug test=develop (#20886 ) * fix jit_matmul bug * update jit matmul and add test	5 years ago
Yiqun Liu	03ba0fdae6	Move the codes of fused operators to operators/fused directory. (#20881 ) * Move the codes of fused operators to operators/fused directory. test=develop * Correct the op name in cmake. * Change the use of PADDLE_ENFORCE. test=develop	5 years ago
zhang wenhui	d428912503	fix select_rows mergeadd bug, test=develop (#20876 )	5 years ago
liym27	6802539a2e	support Tensor for split and concat, support -1 in num_or_sections, add check num_or_sections (#20780 ) * improve split and concat op: 1. support Tensor for argument 'dim' in split op. 2. support Tensor for argument 'axis' in concat op. test=develop * redefine function GetDataFromTensor and set unknown output shape to - 1. test=develop * add check: Attr(sections) match Input(X). test=develop * support Tensor for attr(sections) and attr(sections) can contain -1. add check for attr(sections). test=develop * modify error message for concat and call Resize only when necessary. test=develop	5 years ago
wangchaochaohu	28ca2e5ffa	strided_slice perforamnce improvement test=develop (#20852 )	5 years ago
Yiqun Liu	6fcfd32e6c	Check and correct the output's lod_level in DynamicRNN related operators (#19144 ) * Refine the InferShape of ReadFrom and WriteTo op, and add comment to explain why not call ShareLoD for runtime. test=develop * Add comment for ReorderLoDTensorByRank op. * Add comment for lod_tensor_to_tensor_array op to explain why only call DecreaseLoDLevel for compile time. test=develop * ShrinkRNNMemory op should call ShareLoD for compile time. test=develop * Add the implementation of IncreaseLoDLevel and add the compile-time check of lod_level in InferShape of sequence_pool. test=develop * Refine the unittest of DynamicRNN. test=develop * Change PADDLE_ENFORCE to PADDLE_ENFORCE_NE. test=develop	5 years ago
liym27	84d221b667	improve unsqueeze op to support int, Tensor for argument axes (#20824 ) * improve unsqueeze op to support int, Tensor and Tensor list for argument axes. test=develop * call Resize only when necessary. test=develop	5 years ago
silingtong123	03d7f3ddb2	Make shape tensor support int32 (#20757 ) * Make shape tensor support int32	5 years ago
Huihuang Zheng	95ba4bd2ab	Add shape and type check at read_op (#20754 )	5 years ago
Aurelius84	aacd16dbb4	add pyramid_hash_op (#20698 )	5 years ago
Chen Weihang	8b59ac3ad0	delete paddle infershape enforce marco (#20832 )	5 years ago
whs	c8e49be2f1	Fix roi_perspective_transform op (#20764 )	5 years ago
Chen Weihang	26cc1fe508	Replace risky GetInputType method with secure IndicateVarDataType interface (#20668 ) * replace part of the old implementation, test=develop * restore concat op, test=develop * update all ops implemention & delete GetDataTypeOfVar func, test=develop	5 years ago
Yamei-Lee	cf717fd6dd	fix bug in reshape: (#20781 ) consider the situation that shape of input can contain more than one -1. test=develop	5 years ago
Zhang Ting	5a8d885d72	All elements in attr(shape) of crop_tensor can be -1 and int32/64 kernel registered (#20756 ) * All elements in attr(shape) of crop_tensor can be -1, test=develop, test=document_preview * fix the bug that attr(offsets) should be initialized, test=develop	5 years ago
danleifeng	9171f73714	fix fp16 grid_size for size=1; test=develop (#20812 )	5 years ago
Tao Luo	efbdad0596	make search_compute support avx default (#20779 ) * make search_compute support avx only * clean search_compute.h * rename sse_axpy to avx_axpy test=develop * update CMakeLists.txt test=develop	5 years ago
WangXi	250e72d254	Fix DGC algorithm flow to make it the same as paper (#20758 )	5 years ago
zhaoyuchen2018	6e6eab07e8	Fix multihead op bug. (#20783 ) The op should handle k=1024 test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
lvmengsi	dfa0549f87	Revert "fix_depthwise_conv_cudnn, test=develop (#20712 )" (#20782 ) This reverts commit `dc229b4195`.	5 years ago
whs	4c7d196d83	Add norm_by_time for warpctc op in padding mode. (#17580 )	5 years ago
Pei Yang	e89c16b90d	Bug Fix: Paddle-TRT cannot handle adaptive pooling in pool2d op converter and "num" attribute in split op converter (#20733 ) * fix pool2d trt converter, test=develop * add fix for split op converter, test=develop	5 years ago
石晓伟	37cd43545a	update the infer shape of matmul, test=develop (#20717 ) * update the infer shape of matmul, test=release/1.6 * add unittests of matmul, test=release/1.6 * change func names, test=develop	5 years ago
Adam	67b59ddb38	Minor MKL-DNN conv int8 performance fixes (#20753 ) test=develop	5 years ago
wangchaochaohu	0687bcd64f	Refine getitem of Variable (#20729 ) * add support for __get_item__ of Variable test=develop	5 years ago
danleifeng	79e08ecebf	add assertions on whether elementwise_div divison is zero (#20618 )	5 years ago
123malin	95e90aa102	test=develop, add communicator_is_sgd_optimizer flag (#20677 ) * test=develop, communicator_is_sgd_optimizer flags	5 years ago
Aurelius84	74a28f5ea4	fix fill_constant shape with -1 and enhance cross_entropy test=develop (#20722 )	5 years ago
lvmengsi	dc229b4195	fix_depthwise_conv_cudnn, test=develop (#20712 )	5 years ago
gongweibao	c1710e91b2	Disable GRPC_ARG_ALLOW_REUSEPORT to avoid potencial problem. (#20690 )	5 years ago
lidanqing	46e93f7c86	Revert "Refactor conv computeINT8" (#20640 ) * Revert "Refactor conv computeINT8 (#19574)" This reverts commit `2c32c2d649`. test=develop * replace PADDLE_ENFORCE test=develop	5 years ago
Zeng Jinle	ab575de725	Fix op run log when memory optimization strategy is enabled (#20695 )	5 years ago
Jacek Czaja	a1cd27f13f	[MKL-DNN] Added mkl-dnn cache clearing when creating Executor instance (#20241 ) * - Flushing mkl-dnn cache test=develop - Disabled clearing cache for LoadModel - Added clearing of mkl-dnn cache when Executor is created test=develop - Do not clear for GPU places test=develop - compilation fix test=develop * - Moved clearing of mkl-dnn cache in destructor of executor test=develop * - Compilation fix test=develop - Reverted conditional clearing of mkl-dnn cache in Executors's destructor test=develop - compilation fix	5 years ago
Zeng Jinle	10505faf4e	polish codes, test=develop (#20672 )	5 years ago
Zeng Jinle	34e3adaece	Refine reduce codes to save compiling time and binary size (#20676 ) * refine reduce code to save compiling time and binary sizes, test=develop * add reduce rank check to avoid bug, test=develop	5 years ago
whs	a3e641e93c	Fix infer shape of warpctc op. (#20653 ) test=develop	5 years ago
Zeng Jinle	4922eb6da5	make_conv_workspace_size_configurable, test=develop (#20662 )	5 years ago
zhongpu	efa10937bd	fix elementwise_floordiv_op and elementwise_mod_op (#20534 ) * fix elementwise_floordiv_op and elementwise_mod_op, test=develop * fix API.spec, test=develop * fix API.spec, test=develop	5 years ago
tangwei12	04384502a8	fix bug with heart beat , test=develop (#20654 )	5 years ago
wangchaochaohu	7783d3bd43	Conv refine (#20644 ) * add condition judgement for performance improvement test=develop * add condition judgement for performance improvement test=develop * refine code style test=develop	5 years ago
Chen Weihang	003f369bb2	Add IndicateVarDataType interface to block tensor is not initialized problem in OP GetExceptedKernelType (#20044 ) * add indicate_var_data_type inferface, test=develop * add unittests & polish error message, test=develop * remove needless include, test=develop * extract public function & polish message, test=develop * delete empty var check, test=develop * change data_type to pointer parameter, test=develop * polish details, test=develop	5 years ago
gongweibao	f3f52fc1e2	Retry when failed to bind address. (#20642 )	5 years ago
qingqing01	01eddc1a04	Support fp16 in GPU impl of fused_elemwise_activation_op. (#20636 ) * Support fp16 in fused_elemwise_activation_op. * Fix unit testing in ONLY-CPU mode.	5 years ago
Chengmo	940c6ff1c8	Fix communicator slow bug & fix communicator stop bug (#20366 ) * test=develop,Fix communicator slow bug * test=develop, delete if() in stop_worker() * test=develop * fix UT, test=develop * fix bug in fetch handler, test=develop * fix bug in fetch handler, test=develop * test=develop, fix fetch barrier bug * test=develop, bug fix * test=develop, bug fix * test=develop, fix bug	5 years ago
zhaoyuchen2018	8314e64a8b	Fix sum op fails as no memory in tensor(#20602 ) test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
Yibing Liu	ee2869cae9	Remove redundant infershape in linear chain crf grad, test=develop (#20629 )	5 years ago
123malin	b4a3b75002	bug fix: invalid learning rate decay in pserver async mode (#20325 ) * bug fix: invalid learning rate decay in pserver async mode	5 years ago
石晓伟	a4753f3a79	Optimize error message of mean_op and matmul_op (#20413 ) * add data type check, test=develop * polish error messages, test=develop * polish error messages, test=develop * Remove support for the CPU architecture matmul, test=develop * fix syntax bug, test=develop	5 years ago
Leo Chen	d6c1d6ca56	update class name, test=develop (#20578 )	5 years ago
Double_V	0b39218749	memory optimizer for reshape op,test=develop (#20569 )	5 years ago
chengduo	36c85ef492	Add sub-scope check in RecurrentOp (#20468 ) * fix recurrent bug test=develop	5 years ago
JesseyXujin	2ff18e537f	add expand_as op, test=develop (#20565 ) * add expand_as op, test=develop * add expand_as op,test=develop * add expand_as op,test=develop * add nn.py, test=develop * delele paddle_enforce, test=develop	5 years ago
Zeng Jinle	40effc61af	Refine py_reader exit (#20331 ) * refine py_reader exit, test=develop * fix multiprocess_reader exception unittest, test=develop * increase code coverage for legacy fluid.layers.py_reader, test=develop	5 years ago
Zhang Ting	78910480c1	fix conv_transpose's bug: compatible with Anylayout setting, test=develop (#20589 )	5 years ago
Yuan Shuai	172e91c008	Refine error message of transpose_op (#20437 ) * Refine error message of transpose. * Fix transpose, multiplex, unsqueeze, unstack. test=develop, test=document_preview, test=document_fix	5 years ago
liym27	fc6ec3b9f6	fill_constant support Tensor; (#20521 ) 2. fix bug in backward.py: using fill_constant instead of fill_constant_batch_size_like 3. fix bug in ExpandGradOp. test=develop	5 years ago
Zhang Ting	0130cc969c	fixed group_norm's bug and modified unittest (#20506 ) * modified group_norm's unittest for pass statement, test=develop * fix group_norm's bug: scale or bias is None which causes segmentation fault, test=develop	5 years ago
zhaoyuchen2018	8fb569e5b9	Fix api doc example bug and polish square doc (#20491 ) * Refine create_array api en doc test=develop test=document_fix * Fix api doc example bug and polish square test=develop test=document_fix * Refine comment test=develop test=document_fix * refine API.spec test=develop test=document_fix	5 years ago
Guo Sheng	dfd1eee7f7	Add seq2seq api related code (#19820 )	5 years ago
lvmengsi	2384589383	Fix conv_grad_grad (#20469 ) * fix_conv_grad_grad * fix_bug, test=develop	5 years ago
Double_V	8299203370	Support reshape_op double gradient (#20304 ) * support reshape doubel grad, test=develop * fix reshape double grad, pass converage, test=develop * fix review, test=develop	5 years ago
hong19860320	4d0d5e4cc7	refine eng doc for hard_sigmoid op (#20442 ) * refine eng doc for hard_sigmoid op test=develop test=document_fix * refine the description of hard_sigmoid test=develop test=document_fix * update API.spec test=document_fix * Refine the decription of parameters of HardSigmoid op test=develop, test=document_fix * Update API.spec for hard_sigmoid op test=develop, test=document_fix	5 years ago
Aurelius84	22823df2e2	enhance embedding error message test=develop (#20246 ) * enhance embedding error message test=develop * enforce .h error test=develop * fix unittest code test=develop * Fix fp16 dtype in embedding test=develop * add import warnings test=develop	5 years ago
zhupengyang	3997743a5b	add input type and dtype check, enhance shape error message for concat_op (#20101 ) * add input type and dtype check, enhance shape error message for concat_op test=develop * enhance shape check test=develop * improve coverage test=develop	5 years ago
zhupengyang	95524a4d30	fix APIs: relu, relu6, hash (#20416 ) * fix APIs: relu, relu6, hash test=develop test=document_fix * fix relu6 doc test=develop test=document_fix * fix API.spec test=develop test=document_fix * add description link for hash test=develop test=document_fix	5 years ago
JesseyXujin	843bdbaae1	add input type and dtype check for accuracy_op (#20399 ) * add input type and dtype check for accuracy_op * add input type and dtype check for accuracy_op * modify python error on accuracy_op,add test=develop * modify details on accuracy_op, test=develop * test float16, test=develop * add warning, test=develop	5 years ago
lijianshe02	211f5b0319	enhance mul_op input error message test=develop (#20414 ) * enhance mul_op input error message test=develop	5 years ago
GaoWei8	5ea2cc6733	fix API:cos, exp, ceil, elu, brelu English doc (#20032 ) * fix API:cos, exp, ceil, elu, brelu English doc test=develop test=document_fix	5 years ago
wopeizl	3044a62f2a	fix the precise roi poop op test=develop (#20126 ) * fix the precise roi poop op test=develop add roi backward implementation, fix the output-channel	5 years ago
Wilber	2893cd1ae0	modify english api (#20159 ) * modify english api test=develop test=document_fix - leaky_relu - less_than - log - logical_and - logical_or - logical_xor - logical_not	5 years ago
zhouwei25	b1218d056b	fix English Doc of API:layers.py_func/sum (#20329 ) * fix English Doc of API:layers.py_func/sum	5 years ago
qingqing01	63194d6e67	Enhance InferShape in deformable_conv and prior_box op (#20372 )	5 years ago
tangwei12	a010d883b4	doc fix, test=develop, test=document_fix (#20239 ) * doc fix, test=develop, test=document_fix	5 years ago
huzhiqiang	6a8e54047f	fix reorder_lod_tensor_by_rank doc en (#20256 ) fix reorder_lod_tensor_by_rank doc en	5 years ago
Yibing Liu	899ab30df0	Fix several api docs (#20282 ) * Fix several api docs test=develop, test=document_fix	5 years ago
wangchaochaohu	1288ac2983	fix expand bug (#20340 ) * fix expand bug test=develop * fix style test=develop * fix style test=develop * fix style test=develop * fix style test=develop	5 years ago
SunGaofeng	a73e1f68b4	fix document of 11 APIs (#20278 ) * modify document of 11 APIs test=develop test=document_fix * fix dtype to data type and description of name parameter	5 years ago
Pei Yang	057d782d51	fix en api doc of [round, sin, sqrt], test=develop, test=document_fix (#20296 )	5 years ago
Kaipeng Deng	3833b511a6	refine en API doc (#20206 ) * refine en doc. test=develop. test=document_fix	5 years ago
wangchaochaohu	bc6126dd07	fix the reduce bug test=develop (#20102 )	5 years ago
FDInSky	e2c7b6821a	test=develop enhance uniform_random op python api (#20295 )	5 years ago
danleifeng	3a0f93b3f9	fix error message for elementwise_add/mul (#20283 )	5 years ago
liym27	670937e11d	add input type and dtype check for reshape op. (#20099 ) enhance shape error messages for reshape op. test=develop	5 years ago

4994 Commits (05c00af5f16da64d1e8953711c647512121ef3d2)