Paddle

Commit Graph

Author	SHA1	Message	Date
123malin	6c74e7387f	fix APIs, test=document_preview (#19954 ) * fix DistributeTranspilerConfig document, test=develop	5 years ago
wangchaochaohu	3409db950c	fix reduce bug test=develop (#19971 )	5 years ago
whs	3ea2b661c0	Make PaddleSlim support PyReader (#19995 ) * Make PaddleSlim support PyReader. * Fix unittest of sensitive pruning. * Add some assert.	5 years ago
Adam	4b65af7719	MKLDNN BatchNorm operator refactor (#20012 ) test=develop	5 years ago
joanna.wozna.intel	1d32897c5c	Fix test pool2d int8 mkldnn (#19976 ) * Fix conv2d+dequantize squash for residual fusion test=develop * Correct int8 input test=develop * Add if exclude or include padding in pool2d mkldnn test=develop	5 years ago
chengduo	2450d15b78	disable fuse_all_optimizer_ops (#19966 ) test=develop	5 years ago
Aurelius84	f58c8db668	Require x.dims=label.dims in huber_loss (#20017 ) * x.dims == y.dims test=develop * refine comment	5 years ago
Yang Zhang	cde73a7bbf	Expose `mutable_data` as python binding (#19932 ) * Expose `mutable_data` as python binding test=develop * Add test for device pointer binding test=develop * Make test compatible with python 2	5 years ago
Aurelius84	137e6336ef	Remove constraint that last dimension is forced to be 1 in rank_loss (#19997 ) * fix input shape check test=develop * move PADDLE_ENFORCE test=develop	5 years ago
chengduo	101a2b610a	Add dtype for coalesce_tensor_op (#20016 ) Add dtype for coalesce_tensor_op	5 years ago
Zhaolong Xing	f04f2b232a	fix if else error info (#19974 ) test=develop test=document_fix	5 years ago
gongweibao	a7512db2bc	Polish elementwise max min pow document to add more examples. (#19946 ) Polish elementwise max min pow document to add more examples	5 years ago
Aurelius84	2b5b4b3c5e	fix dataType in C++ comment in embedding op (#20004 )	5 years ago
Tao Luo	bcb2903e60	enhance shape error message of mul_op (#19998 ) test=develop	5 years ago
mapingshuo	d62360fe5f	fix doc of apply_optimize (#19965 ) * fix doc of apply_optimize test=document_fix test=document_preview * modify doc of backward test=develop test=document_fix * modify document hash test=develop test=document_preview	5 years ago
Chen Weihang	1409586eaa	Add LoD empty check for all related sequence ops (#19980 ) * add lod check for sequence op, test=develop * delete unnecessary check in expend op, test=develop	5 years ago
Huihuang Zheng	88af4ab650	Add new data layer (#19916 ) The new "fluid.data" changes old "fluid.layers.data": 1. Add shape and dtype check. 2. Remove "append_batch_size" parameter. We won't offer this in the new data layer because other deep learning platforms don't have this kind of data layer pre-processing. It may confuse users. 3. Remove "stop gradient" parameter because the data layer doesn't do back-propagation TODO： Now data layer feeded by executor is checked, will we want to check the feed data of readers in the future?	5 years ago
xujiaqi01	f50e701b3b	fix memory leak in HogwildWorker (#19956 ) fix memory leak in HogwildWorker, whose ops are explicitly deleted in destructor	5 years ago
Zeng Jinle	b8aff5e5e9	fix buddy_allocator_test, test=develop (#19967 )	5 years ago
Zeng Jinle	4a5ce4feb1	Add AdadeltaOptimizer doc (#19875 ) * add AdadeltaOptimizer doc, test=develop * refine doc,test=develop * folllow lanxiang's comments, test=develop, test=document_fix	5 years ago
Zeng Jinle	7912e6caa1	Expose set_gradient_clip API (#19869 ) * expose set_gradient_clip, test=develop, test=document_preview, test=preview * expose gradient clip, test=develop, test=document_fix * refine doc, test=develop * follow lanxiang's comments, test=develop, test=document_fix	5 years ago
chengjuntao	0099e54924	refine deformable roi pooling doc (#19944 ) * refine doc, test=develop, test=document_preview	5 years ago
zhongpu	b1bb23841e	add kernel for fill_op, test=develop (#19719 ) * add kernel for fill_op, test=develop * modify PADDLE_ENFORCE to PADDLE_ENFORCE_EQ, test=develop * add op test for fill_op, test=develop * REGISTER COP CUDA KERNEL, test=develop * update test_fill_op.py, test=develop * change FillConstantOpVarTypeInference to FillOpVarTypeInference, test=develop * fix op test, test=develop * add head file, test=develop	5 years ago
wangchaochaohu	382d099dcb	add support tensor and tensorlist for strided_slice OP (#19929 ) * add support tensor and tensorlist for strided_slice OP test=develop * fix the commnet test=develop * fix test=develop * fix the bug test=develop * delete log test=develop * fix API.spec test=develop * fix test=develop	5 years ago
lvmengsi	fe218df326	Fix ssdloss num and batch norm format and conv2d (#19754 ) * update API.spec	5 years ago
lvmengsi	619a241bd0	Fix OpTest of bn (#19062 ) * fix bn	5 years ago
Bob Zhu	c670058a8d	add support of matmul with multiple head even different width and height (#19708 ) * add support of matmul with multiple head even different width and height Original matmul with multiple head supports only the mat_a.width == mat_b.height, in that case, mat_b will be horizontally split. In this patch, we extend the support when mat_a.width != mat_b.height but mat_a.width/head_number == mat_b.height, in this case, mab_b will be vertically split. One example is A is [3, 8], B is [2, 16], head_number is 4. In this case, A will be split as [3, 2], B will be (vertically) split as [2, 4]. The final result will be 4 matrix of 4 matrix of [3,4], i.e. [3, 16] test=develop * add support of matmul with multiple head even different width and height Original matmul with multiple head supports only the mat_a.width == mat_b.height, in that case, mat_b will be horizontally split. In this patch, we extend the support when mat_a.width != mat_b.height but mat_a.width/head_number == mat_b.height, in this case, mab_b will be vertically split. One example is A is [3, 8], B is [2, 16], head_number is 4. In this case, A will be split as [3, 2], B will be (vertically) split as [2, 4]. The final result will be 4 matrix of 4 matrix of [3,4], i.e. [3, 16] test=develop * refactor the code of matmul with multiple head even different width and height test=develop	5 years ago
Liufang Sang	6884dc800a	refine ctc align op with padding (#19926 ) * refine ctc align op with padding * refine api sample code	5 years ago
Zhaolong Xing	e89b12884a	FIx C++ inference BUG: When open memory optim and enable trt subgraph at the same time, there is a bug (#19969 ) * fix memory optimization type test=develop * 1. fix BUG: open trt and memory optim will trigger bug. 2. Clean memory optim bug. test=develop	5 years ago
Wojciech Uss	4286a6270d	Add support for new QAT models (#18970 ) * Add support for new QAT models test=develop Co-Authored-By: Michał Gallus <michal.gallus@intel.com> Co-Authored-By: Wojciech Uss <wojciech.uss@intel.com> * fixed fps results test=develop * fix top5 accuracy drop problem * updated for new QAT models * skip quantizing average pooling - dirty but working * add missing pass * added missing conv+brelu fuse pass * removed a call to non-existent pass test=develop * renamed pass test=develop * Adjust finding pooling scale to newest QAT models * Remove unnecessary code from quantization_mkldnn_pass * Copy Pooling input scale to output scale in QAT * Refactor & remove unused code in QAT * Incorporate fp32 FC into QAT test=develop * Enable graph drawing with debug flag test=develop * Add tests for QATv2 * Fix paths for QATv2 models test=develop * Add option to save transformed int8 qat model test=develop * Remove redundant lines from qat mkldnn pass test=develop * Delegate disablement of avg pooling to qat test=develop * fix CI bug, test=develop * Follow Wangzhen's Review, test=develop * Update API.spec test=develop * Name False in (is_unsigned, TensorScale) tuple test=develop	5 years ago
Aurelius84	99a9615a4b	Removing length dims constraints of seq_pad and seq_unpad (#19497 ) * Removing last dims constraints of seq_pad and seq_unpad test=develop * fix test_layer api code test=develop * fix sequence_pad_op.cc conflict test=develop * remove test_analyzer_mm_dnn test=develop * fix vectorize bug test=develop * fix vectorize<int> test=develop	5 years ago
chengduo	cca26f5c42	polish multi process warning info (#19961 ) test=develop	5 years ago
Yi Liu	2efdf0ef35	update en document of shard_index_op (#19963 ) test=develop test=document_fix	5 years ago
jhjiangcs	766bd529d1	add optimizer:dpsgd,test=develop (#19915 )	5 years ago
Zeng Jinle	37f76407b0	fix cuda dev_ctx allocator cmake deps, test=develop (#19953 )	5 years ago
Yang Zhang	ebff68fa74	Add float16 support to `sync_batch_norm_op` (#19681 ) * Add float16 support to `sync_batch_norm_op` test=develop * Add test for sync_bn with FP16 input test=develop	5 years ago
Aurelius84	039b9710d5	Remove constraint that last dimension is forced to be 1 by adding lookup_table_v2 (#19735 ) * Remove constraint that last dimension is forced to be 1 by add lookup_table_v2 test=develop * modify into PADDLE_ENFORCE_CUDA_SUCCESS test=develop * Revert "modify into PADDLE_ENFORCE_CUDA_SUCCESS test=develop" This reverts commit 8a960bfc61e51aa27c3c529df8fb90b93ebd19f9. * move api into fluid.embedding test=develop * fix example code test=develop * move one_hot into fluid.one_hot * modify api.spec test=develop * fix loss shape test=develop	5 years ago
Zeng Jinle	80e0f547bb	fix allocator ut,test=develop (#19945 )	5 years ago
whs	bdb3e376d0	[PaddleSlim] Enhence compressor api in PaddleSlim (#19894 ) 1. Support customize eval function instead of eval program. 2. Fix loading checkpoint in quantization strategy. 3. Support saving eval model when saving a checkpoint. 4. Fix decoder of loading context in PaddleSlim. 5. Fix restoring from the checkpoint of uniform prune strategy. 6. Support saving eval model and infer model during training. 7. Add ‘unitest’ for saving eval model, saving infer model and uniform pruning restoring from the checkpoint. 8. Fix pruning of depthwise_conv_grad op by updating the groups.	5 years ago
xujiaqi01	cedc04775c	support change shuffle and train thread num (#19841 ) * support change shuffle thread num * support change train thread num * fix receive shuffle data of each channel * data norm stop gradient * add check thread_tensor type and root_tensor type when merge metric * remove sleep in shuffle, add config * add config of pslib client to client communication * fix xbox str * add data norm op testcase * add flush in trainer finalize	5 years ago
Kaipeng Deng	14625ffe9e	add elementwise mod support float/double. test=develop (#19570 )	5 years ago
Jacek Czaja	5b07ca9cdd	- ReImplemented pooling fwd mkldnn (#19911 ) - First implementation of BWD and FWD of pooling mkl-dnn - Compilation fix - Fix - Fix - Fix - Fix to crash - Compilation fix - Combined AcquireBacward with Fwd test=develop	5 years ago
Zeng Jinle	b1e83b33b0	fix huber loss op attr type, test=develop (#19937 )	5 years ago
Zeng Jinle	cc157d5990	add inplace to assign op, test=develop (#19927 )	5 years ago
chengduo	55ce696986	clean tensor array (#19930 ) test=develop	5 years ago
Leo Chen	57606205f5	Make OpTest check grad inplace even if forward has no inplace (#19847 ) * make OpTest check grad inplace even if forward has no inplace, test=develop * do not run PE when enable_inplace is False, test=develop * add conv3d cuda kernel for float16 type, test=develop * refactor OpTest for inplace, test=develop * add comments, test=develop	5 years ago
Zhang Ting	cb8f3c03a7	resize Ops support data_layout:channel_last, test=develop, test=document_preview (#19914 )	5 years ago
mapingshuo	9901f69677	Forward recompute3 (#19913 ) * add recompute based checkpoints methods for large batch training test=develop * add append_backward_with_forward_recomputation test=develop * refine optimizer test=develop * update backward and optimizer test=develop * make Variable usable test=develop * add recompute code * refine optimizer test=develop * refine addup _append_backward_ops_with_checkpoints_ 1) for recompute part, just cache the grad_op_desc without appending to block 2) before appending grad_op_desc to backward part, addup_repetitive_vars, remove unused branch test=develop * make method private * add recompute strategy into DistributedStrategy test=develop * checkpoint version3 test=develop * remove some print information test=develop * remove unused sumop test=develop * try to fix recompute with graph building modules * add input names to vars should be held * add memory debug tool * backup backward * Fix bugs * add backward desc for op not in any segments * add exception info for sub_block test=develop * modify code style test=develop * modify code style test=develop * remove print functions test=develop * add API spec test=develop test=document_preview * make Recompute a child class of Optimizer test=develop test=document_preview * add API spec test=develop test=document_preview * modify API spec test=develop test=document_preview * add document for Recompute test=develop test=document_preview * change API doc of Rcompute test=develop test=document_preview * code cleaning test=develop test=document_preview * modify API spec * fix bugs when segments hold no element * add testcase for Recompute Optimizer test=develop test=document_preview * add test for apply_gradient, and code cleaning test=develop test=document_preview * add test case for load function * enable CI test=develop test=document * add test case test=develop test=document_preview * add sample code for 4 function of recompute optimizer test=develop test=document_preview	5 years ago
chengduo	d7251a8e1e	Delete local execution scopes (#19749 ) * Add RecordHistoryLocalExecScopes test=develop	5 years ago
wopeizl	5452b6a152	remove the useless warning for user to avoid confuse test=develop (#19871 ) * remove the useless warning for user to avoid confuse test=develop	5 years ago
ruri	d31c92a2cd	add mse_loss (#19759 ) * add mse_loss op	5 years ago
hong	85b398f171	Add op compatible information (#19910 ) * add op compatible infomation; test=develop * add enum type * add enum type; test=develop	5 years ago
Kaipeng Deng	3f021781a1	fix softmax CE time limit check failed (#19846 ) * fix softmax ce time limit check failed. test=develop * refine softmax calc. test=develop	5 years ago
Tao Luo	a4919d3688	move tree_conv to fluid.contrib.layers (#19918 ) * move tree_conv to fluid.contrib.layers test=develop * update API.spec for tree_conv test=develop * update tree_conv api to increase unit coverage test=develop	5 years ago
石晓伟	30adea0a23	tensor_array_to_tensor_op.cc, test=develop (#19289 )	5 years ago
Zeng Jinle	0436efd6a3	Unify DataLoader APIs (#19305 ) * unify DataLoader APIs, test=develop * integrate iterable CPU Dataset, test=develop add GPU dataset supporting, test=develop * add unittests for dataset, test=develop * add more docs to dataloader apis, test=develop, test=document_preview * refine doc, test=develop * refine doc again, test=develop * increase coverage, test=develop	5 years ago
lvmengsi	4155e62559	add instance norm (#19500 ) * add instance norm op	5 years ago
Zeng Jinle	c7f36e7c00	Add lock to cudnn handle calls (#19845 ) * refine reallocate of workspace size, test=develop * add lock to cudnn handle calls, test=develop	5 years ago
pawelpiotrowicz	2c5c636514	Add two extra flags for test_analyzer_int8_image_classification to disable fp32/int8 (#19840 ) test=develop	5 years ago
Adam	cb65439da8	Add support for other axes in MKLDNN softmax op (#19907 ) * Initial, functional commit * Clean commit related files test=develop	5 years ago
Jiabin Yang	454254115e	Feature/auto prune in dygraph (#19757 ) * refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * test=develop, refoctor name to make it easier to understand * test=develop, refoctor name to make it easier to understand * test=develop, fix multi-gpu failed problem , add Tracer tests, change PADDLEENFORCE to PADDLEENFORCE_EQ * test=develop, fix ut failed on parallel se-resnext * test=develop, change one more PADDLE_ENFORCE * support auto prune in dygraph mode * test=develop, support auto prune * test=develop, merge develop conflict * test=develop, fix test_layer and test_tracer ut * test=develop, fix bug which may cause stop_gradient disabled with a list of backward inputs	5 years ago
Aurelius84	418a0967f3	move match_matrix var_conv2d et.al api into fluid.contrib test=develop (#19859 )	5 years ago
Pei Yang	baccd7e2ca	Add TRT input shape check between model and runtime (#19864 ) * add TRT shape check, test=develop * model_input_shape == runtime_input_shape, refine message, test=develop	5 years ago
Pei Yang	74812d1c90	Fix BUGS: paddle-TRT repeatedly sets weight_map and overdeletes repetitive_params (#19825 ) * fix trt bugs when sharing params, test=develop * add unittest for cascade_rcnn	5 years ago
Zeng Jinle	747d44980a	Refine err msg of out of gpu memory (#19779 ) * refine err msg of out of gpu memory, test=develop * refine err msg again, test=develop * refine errog message again, test=develop * follow reviewer's comments, test=develop	5 years ago
Aurelius84	fcf53e55ff	support 2-level lod of input in sequence_pool (#19839 ) * support 2-level lod of input in sequence_pool test=develop * fix lod level bug in .cu test=develop	5 years ago
Zeng Jinle	b25d1e758d	remove enforce.h file written, test=develop (#19897 )	5 years ago
Zhang Ting	93364b45c1	group_norm support data_layout:NHWC, test=develop, test=document_preview (#19614 ) 1. group_norm support data_layout=NHWC 2. modified doc of group_norm	5 years ago
Huihuang Zheng	e117114289	Set states of recurrent op as dependent vars in prune (#19865 ) * Set states of recurrent op as dependent vars in prune of save inference model This PR will fix the save/load inference model problem of RNN models. The reason of the bug is that save_inferenc_model will prune OPs that doesn't contribute to Output. But in recurrent_op, States are not Output, OPs refers States will be pruned. This fix adds States of recurrent_op as dependent var so that OPs referring States won't be pruned.	5 years ago
石晓伟	d004a0f50e	fix multi-thread exec of trt, test=develop (#19338 )	5 years ago
Zeng Jinle	b754700fb5	fix reduce and broadcast to avoid multi-stream, test=develop (#19889 )	5 years ago
Zeng Jinle	8359b415e4	add free chunks to auto growth allocator, test=develop (#19890 )	5 years ago
Jacek Czaja	619c797a7f	[MKL-DNN] LRN refactoring (#19798 ) - LRN mkl-dnn kernel refactor test=develop - compilation fix - Another compilation fix - Compilation fix - another compilation fix - compilation fix - Crash fix - optional LRN mkldnn workspace - Added mid allocation - Workaround for tests - Removed gradient from is_test ut - Removed mid for inference - Reverted LRN mid removal for is_test - PADDLE_ENFORCE adjusted - Rebase to templatization commit - Compilation fix - compilation fix test=develop - lint test=develop - Fix to crash - Rebase to recent codebase - lin - lint - compilation fix	5 years ago
Zhang Ting	439d95e157	modified interpolate op to support tensor attribute, test=develop, test=document_preview (#19287 ) modified interpolate_op to support tensor attribute 1. the parameter out_shape of image_resize、resize_nearest/bilinear/trilinear can be a list or a 1-D tensor variable. If a list, each element can be an integer or a tensor variable with shape: [1]. 2. the parameter scale of above Ops can be a 1-D tensor variable. modified document of image_resize, resize_nearest, resize_bilinear, resize_trilinear and add some code example.	5 years ago
Zhang Ting	b38889413d	add crop_tensor_op, test=develop, test=document_preview (#19314 ) add crop_tensor op. The main difference with crop is : 1. If the argument shape is a list, each element is an integer or a tensor variable with shape: [1]. This way is suitable for the case that the shape may be changed each iteration. 2. If the argument shape is a variable. Its rank must be 1. In crop op, the rank of shape must be the same as x offsets can be a list, in which each element is an integer or a tensor variavle with shape: [1].	5 years ago
lidanqing	2c32c2d649	Refactor conv computeINT8 (#19574 ) * fix conflicts test=develop * change mask_bias_reorder test=develop * add ComputeMask function to make code clear test=develop * change according to reviews test=develop * change according to reviews test=develop	5 years ago
joanna.wozna.intel	3f1d0234ae	Fix conv2d+dequantize squash for residual fusion (#19545 ) * Fix conv2d+dequantize squash for residual fusion test=develop * Change condition test=develop	5 years ago
Huihuang Zheng	a35557d8f4	Fix deps of prune (#19876 ) Add boost as dependency of prune fix #19862	5 years ago
Adam	c7e688921b	Add template functions for Acquire primitive/primitive_desc (#19867 ) * Add template functions for Acquire primitive/primitive_desc test=develop * Move acquire primitive descriptor to protected section test=develop	5 years ago
flame	fe18cfdb4f	hide with inference optim API (#17355 )	5 years ago
Leo Chen	578a2f5da3	fix SplitLodTensor when batch_size = 0, test=develop (#19866 )	5 years ago
Aurelius84	b125e327aa	Remove constraint that last dimension is forced to be 1 in cross_entropy (#19606 ) * Remove constraint that last dimension is forced to be 1 in cross_entropy test=develop * modify labels last dims test=develop	5 years ago
wopeizl	a7c440d303	add precise roi pooling op test=develop (#18960 ) * add precise roi pooling op test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * detail the description test=develop * test=develop * elaborate the doc for return type test=develop * test=develop	5 years ago
Yiqun Liu	3cd985a669	Add a pass to fuse fc+elementwise_add+layernorm (#19776 ) * Add fc_elementwise_layernorm_fuse pass and unittest. * Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop * Apply fc_elementwise_layernorm_fuse_pass to GPU inference. * Add the setting of attrs in the definition of binary_op. test=develop * Add comment. * Implement the unittest. test=develop * Change the unittest name of layer_norm. test=develop	5 years ago
Jie Fang	d9db94d752	Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus. (#19714 ) Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus	5 years ago
wangchaochaohu	47af618f70	Strided slice (#19642 ) * strided_slice op basic function test=develop * test=develop rewrite and fix * fix bug test=develop * fix for the PADDLE_ENFORCE usage * add some unit testw * fix for the aip test and copright and fix test=develop * fix API.spec test=develop * fix API.spec test=develop * add axis parameter test=develop * fix for the build error test=develop * fix python api test=develop * fix the build test=develop * fix build test=develop * fix API spec test=develop * test=develop add some comment and single op test * fix API spece test=develop * fix test=develop * fix test=develop * fix api test=develop * fix api test=develop * fix API.spec test=develop * fix typo test=develop * fix API.spec test=develop * fix API typo test=develop * fix doc and API.spec test=develop	5 years ago
Zeng Jinle	13ca364ceb	remove some flags and add comments to some flags, test=develop (#19813 )	5 years ago
123malin	1bc285a53a	add retry function to try to solve grpc error code 14 (#19661 ) * rpc retry for asycsend/get/prefetch * test=develop, change retry vlog level to 3 * test=develop, set default grpc_retry_times is 3	5 years ago
Zeng Jinle	5eb381a3e2	refine reallocate of workspace size, test=develop (#19843 )	5 years ago
石晓伟	71b2ed61bc	support MLU nums, test=develop (#19372 )	5 years ago
Zeng Jinle	3f87464e9c	refine executor_gc_helper codes, test=develop (#19814 )	5 years ago
LielinJiang	6d72a86b14	fix_roi_transform_bug (#19785 )	5 years ago
Zeng Jinle	3fd3b663a8	fix gc bug in controlflow ops, test=develop (#19827 )	5 years ago
Leo Chen	982e61f5ff	Update elementwise double grad to save gpu memory (#19509 ) * update elementwise double grad to save gpu memory, test=develop * update elementwise_mul/div_grad_grad to save memory, test=develop * remove eval function in eigen statement to save memory, test=develop * add unittest for elementwise_div_grad_grad without dout, test=develop * add unittest for elementwise_add_grad_grad without ddx, test=develop * add float16 cuda kernel for elementwise double grad op, test=develop	5 years ago
Zeng Jinle	db26de8389	[Bug fix] Disable memory reuse on feeded variables (#19835 ) * fix memory reuse bug on feeding variables, test=develop * add comments to reference count members, test=develop	5 years ago
Adam	dfdd73cbc0	Add MKLDNNhandlerT templatized class (#19801 ) test=develop	5 years ago
Zeng Jinle	cabb9501bd	fix leaky_relu op when alpha is zero, test=develop (#19833 )	5 years ago
Pei Yang	9cbc1eff2d	zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822 )	5 years ago
chengjuntao	00efd1d8a9	add deformable conv v1 op and cpu version of deformable conv v2 (#18500 ) * add deformable conv v1 op, test=develop	5 years ago
Thunderbrook	40c66f8df9	rm return in vfork (#19734 ) * rm return in vfork * rm return in vfork test=develop	5 years ago
Zhaolong Xing	110be57c1b	fix memory optimization type (#19781 ) test=develop	5 years ago
liym27	677e714425	fix pow op, support tensor for agument factor. (#19313 ) improve pow op according to reviews: 1. Delete unnecessary judgement statements in PowGradOpDescMaker; 2. Improve test of test_api; overload GetKernelTypeForVar add stop_gradient=True when attr(factor) is tensor Variable, change examples in API pow. test=develop,test=document_preview	5 years ago
liym27	bd89a27308	add tensor support for argument shape in reshape op; (#19268 ) add support parameter inference when argument shape is a list containing integer and tensor variable; test=develop fix reshape op according to reviews: 1. improve or message; 2. improve test of test_api. test=develop,test=document_preview fix reshape op: Add error message in nn.py, test=develop add stop_gradient=True when attr(shape) is tensor Variable. change examples in API reshape. test=develop,test=document_preview	5 years ago
liym27	88628016b2	add tensor(tensor and tensor in list) support for argument starts and ends in slice op; (#19208 ) add support parameter inference when arguments starts or ends is a list containing integer and tensor variable; test=develop,test=document_preview improve slice op according to review(from hongyu). test=develop fix slice op according to review: infer_flags, test=develop fix slice op: improve overload operator __getitem__ to support attrs(starts and ends) are Variable. test=develop,test=document_preview fix test_slice_op: add TestSliceOp_decs_dim_6 to resolve conflict with test_slice_ngraph_op. test=develop add stop_gradient=True when attr(starts) or attr(ends) is tensor Variable. test=develop,test=document_preview	5 years ago
liym27	e9e3c08777	fix expand op: (#19302 ) 1. add tensor support for argument expand_times in expand op; 2. add support parameter inference when argument expand_times is a list containing integer and tensor variable; improve expand op according to reviews: 1. add doc of ExpandTimes in expand_op.cc; 2. improve the test of test_api. add stop_gradient=True when attr(expand_times) is tensor Variable, change code examples. test=develop,test=document_preview	5 years ago
xujiaqi01	6bf298bf09	support preload thread, optimize hdfs log, fix master+patch bug (#19695 ) * support preload thread * sleep before fleet wrapper exit for pslib core dump * optimize hdfs log * fix master+patch bug	5 years ago
Huihuang Zheng	a0d80754c5	Add comments for CUDA Device Context Allocator related stuff (#19809 )	5 years ago
Jiabin Yang	cc311bdf95	Feature/add transform data dygraph (#19707 ) * refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * add transform_data to dygraph * test=develop, refoctor name to make it easier to understand * test=develop, refoctor name to make it easier to understand * add test and change input to const ref for safety * test=develop, fix multi-gpu failed problem , add Tracer tests, change PADDLEENFORCE to PADDLEENFORCE_EQ * add ut for data transform * refine ut for data_transform * test=develop, fix ut failed on parallel se-resnext * test=develop, change one more PADDLE_ENFORCE * add test_tracer on multiple devices * test=develop, change place to mutable for data transform * test=develop, add transform data on same place test and remove useless log * test=develop, Add to do for data layout and and ut for conv2d with no bias	5 years ago
lvmengsi	b76343c3b7	cpu Conv double grad (#19672 ) * cpu conv_grad_grad	5 years ago
Zeng Jinle	754fd57ed7	disable memory optimization passes when FLAGS_use_ngraph=True, test=develop (#19778 )	5 years ago
翟飞跃	93c85c930a	Implement FusedEmbeddingSeqPoolGradKernel with cblas_saxpy (#19770 ) * Implement the operator with sprase matrix multiply * Update the URL of mklml library. test=develop * Disable MKLML implematation when using no-linux. test=develop * optimize bp with mkl sparse matrix test=develop * tmp add fused_emb_seq layer * Add the support of padding_idx attribute. test=develop * add padding_idx support test=develop * implement grad refer lego test=develop	5 years ago
chengduo	8281497030	Fix warning info of build_strategy (#19805 ) * fix warning info test=develop * fix bug of all_reduce_deps_pass test=develop	5 years ago
Zeng Jinle	b34933d9ee	fix retry allocator bug, test=develop (#19794 )	5 years ago
Yiqun Liu	c67c8758cb	Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733 ) * Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop * Enhance fc_fuse_pass to enable fusing relu. * Allow print the shapes of var_desc in graph. test=develop * Enhance fc_fuse_pass_tester. * Remove the use of PADDLE_ENFORCE. test=develop * Correct the number of ops after fusing. test=develop * Fix a typo. test=develop * Set activation_type to null when there is no relu in fc. test=develop * Refine fc_fuse_pass's codes. * Enable the set of shape for tensor. * Refine repeated_fc_relu_pass and add unittest. test=develop	5 years ago
Zeng Jinle	32b1151f5e	reduce default value of cudnn workspace size, test=develop (#19780 )	5 years ago
zhongpu	52673956de	add kernel for squeeze_op, test=develop (#19656 ) * add kernel for squeeze_op, test=develop * delete comment, test=develop	5 years ago
zhongpu	2a81c3679a	add kernel for unstack_op, test=develop (#19538 ) * add kernel for unstack_op, test=develop * add kernel for unstack_op, test=develop * add kernel for unstack_op, test=develop * adjust the code format, test=develop * modify some comment, test=develop	5 years ago
Chen Weihang	00d5375e0c	Add prune_backward function to cover complicated test_program.clone situation (#19772 )	5 years ago
Kaipeng Deng	99c78b772a	fix softmax axis!=-1. test=develop (#19800 )	5 years ago
Adam	d4413a54bc	Add common CreateKey for mkldnn handlers (#19767 ) test=develop	6 years ago
Yihua Xu	0d6ea52958	Fix the definition issue when used mkl_scsrmm and mkl_dcsrmm functions. (#19774 ) test=develop	6 years ago
chengduo	056fdedde3	Open fuse all reduce option (#19765 ) * Open fuse all reduce op test=develop * Add Fuse optimization op log * Add log in fuse_optimizer op pass and fuse all_reduce op pass * replace with boost::optional<bool> test=develop * Polish code test=develop * fix code coverage test=develop	6 years ago
Aurelius84	8c7e411908	Remove constraint that last dimension is forced to be 1 by adding one_hot_v2 (#19716 ) * add one_hot_v2_op to remove last_dims==1 test=develop * add api unittest code for CI_Coverage test=develop * improve CI_Coverage rate by adding test_with_depth test=develop	6 years ago
JesseyXujin	e352467c1c	modify activation op API, delete use_cudnn args, test=develop, (#19758 )	6 years ago
Jacek Czaja	9e4c958552	Refactoring activation mkldnn op (#19748 ) test=develop - fix to BWD test=develop	6 years ago
Huihuang Zheng	12542320c5	Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989 ) TemporaryAllocator is a singleton used for allocating memory for Cudnn. Since it is a singleton, we can delete it for better performance in memory. We replace TemporaryAllocator by CUDADeviceContextAllocator and CUDADeviceContextAllocation, which uses stream callback to delete the memory allocated for the stream to avoid singleton. Also added data_feed_proto to operator to fix CI in CPU compilation	6 years ago
Zeng Jinle	0daa5c9772	Make leaky relu inplacable (#19676 ) * make leaky relu inplacable, test=develop * force add unittests to pass coverage, test=develop	6 years ago
Zeng Jinle	078a678219	refine math_op_patch, test=develop (#19727 )	6 years ago
chengduo	e506c99c20	Open fuse broadcast option (#18833 ) * fix vlog level and fuse option type test=develop	6 years ago
Jacek Czaja	47f670d58c	- Softmax mkl-dnn refactoring (#19615 ) test=develop - Cosmetic fixes test=develop	6 years ago
Yiqun Liu	a65c728e5d	Implement the GPU kernel of fc operator (#19687 ) * Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop	6 years ago
Aurelius84	22301115d0	Remove constraint that last dimension is forced to be 1 in huber_loss op (#19562 ) * Remove constraint that last dimension is forced to be 1 in huber_loss test=develop * add y[rank-1] == 1 when x_rank=y_rank test=develop * modify into contain_unknown_dim test=develop	6 years ago
chengduo	5866a7a5fe	Enable fused_all_reduce_op_handle support GPU and CPU Gradients (#19418 ) * Enable fused_all_reduce_op_handle support GPU and CPU Gradients	6 years ago
Youwei Song	3e5fb6361b	fix api-doc error for dygraph and backward (#19721 ) * update dygraph api-doc and backward api-doc, test=develop * update dygraph api-doc and backward api-doc, update api.spec, test=develop * update dygraph api-doc and backward api-doc, update api.spec, test=develop * update API.spec, test=develop	6 years ago
Tao Luo	ec9bc1bd9f	paddle::framework::vectorize() templatization (#19730 ) remove unused accuracy-diff warpctc-cudnn implementation test=develop	6 years ago
Zeng Jinle	bb4f8dee83	add logs to left var memory size, test=develop (#19722 )	6 years ago
Adam	428b2b9e17	MKLDNN handler cleanup (#19713 ) * MKLDNN handler cleanup * MKLDNN handler cleanup test=develop	6 years ago
XiaoguangHu	27235cf222	Add document annotations for FLAGS that need to be open to external developers test=develop (#19692 ) Add document annotations for FLAGS that need to be open to external developers	6 years ago
Zeng Jinle	1c25c88aba	refine memory usage of some operators, test=develop (#19700 )	6 years ago
wangguanzhong	25dcd74d34	merge empty lod tensor, test=develop (#19228 ) * merge_empty_lod_tensor, test=develop * fix multiclass_nms, test=develop * refine API.spec, test=develop * add unittest case for fetch, test=develop * add lod tensor test, test=develop * return index for multiclass_nms, test=develop * add api for multiclass_nms2 * update API.spc, test=develop * refine api doc, test=develop * fix test_detection.py, test=develop * polish code, test=develop * add more unittest case, test=develop	6 years ago
yaoxuefeng	c6756ed225	fix instag op (#19591 ) * fix instag op * fix instag bug: Some tiny logical error, occurring when ins_tag (2nd input) is multiple. test=develop	6 years ago
gongweibao	6c2bc29cc0	Fix float16 optimizer. (#19682 ) Fix float16 optimizer	6 years ago
Zeng Jinle	713c05dd60	refine tensor.mutable_data, test=develop (#19680 )	6 years ago
Chen Weihang	c78a4781bf	Fix train error when test_program.clone is executed after optimizer.minimize (#19397 ) * add prune when test_program.clone is executed after optimizer.minimize * add unittest, test=develop * add resnet and transformer test case, test=develop * add regularization for optimizer & program compare function, test=develop * add lstm unittest, test=develop * polish code based on review comment, test=develop * adapt to interface change in framework._prune, test=develop * update API.spec, test=develop	6 years ago
zhongpu	5f627488db	add kernel for unsqueeze_op and Add unsqueezed op test, test=develop (#19436 ) * add kernel for unsqueeze_op, test=develop * add kernel for unsqueeze_op, test=develop * add kernel for unsqueeze_op, test=develop	6 years ago
Zeng Jinle	a7691603a5	add gpu_allocator_try_time config, test=develop (#19675 )	6 years ago
JesseyXujin	0b06db9413	delete transmission args in linear_chain_crf op (#19619 ) * delete args on linear_chain_crf_op doc * delete args on linear_chain_crf_op doc * delete args on linear_chain_crf_op doc * add code example * fix api doc * fix doc of crf * fix doc of crf * add test=develop * modify API.spec, test=develop	6 years ago
Tao Luo	f05d2c519d	paddle::framework::vectorize() templatization [PART3] (#19643 ) * paddle::framework::vectorize() templatization test=develop * update pybind/imperative.cc test=develop * revert update on unsqueeze_op.cc and warpctc_cudnn_op.cu.cc test=develop	6 years ago
hutuxian	1ca6ea0318	fix cmakelist deps (#19668 ) fix cmakelist deps: remove unnecessary deps and add proper op deps	6 years ago
Tao Luo	bcddbc78d4	remove -Wmaybe-uninitialized warning (#19653 ) * remove -Wmaybe-uninitialized warning test=develop * remove uninitialized op_handle_ in scale_loss_grad_op_handle.cc test=develop	6 years ago
Zeng Jinle	2db40d9f60	reduce thread num of retry_allocator_test,test=develop (#19638 )	6 years ago
wangchaochaohu	4440d7ced0	test=develop cuda realization of label smooth op (#19175 )	6 years ago
chengduo	31c5a5ee26	Remove linear_chain_crf_op.cu (#19645 ) test=develop	6 years ago
123malin	a25a716e87	Optimize fleet API: add input check for some interfaces (#18971 ) * fleet api add input check, test=develop	6 years ago
wangchaochaohu	ed8f44ea21	codegen for fused elementwise operation (#19520 ) * test=develop codegen for fused elementwise operation * fix test=develop	6 years ago
Chen Weihang	73daa3d6c0	Code Cleanup: delete three useless raw variables in Conv2D (#19644 ) * delete useless raw variables in Conv2D, test=develop * adjust the vars number in test_graph_wrapper to pass unittest, test=develop	6 years ago
123malin	2f037c3189	fix the diff between async mode and async_half mode (#19535 ) * test=develop, communicator merge add => merge average	6 years ago
Jiabin Yang	e9233d1c1e	Refactor dygraph (#19107 ) * refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * test=develop, refoctor name to make it easier to understand * test=develop, refoctor name to make it easier to understand * test=develop, fix multi-gpu failed problem , add Tracer tests, change PADDLEENFORCE to PADDLEENFORCE_EQ * test=develop, fix ut failed on parallel se-resnext * test=develop, change one more PADDLE_ENFORCE	6 years ago
mapingshuo	dca9b6c5b0	add feed_var_names to Prune interface (#19589 ) * Fix bug: add feed_vars to the prune function	6 years ago
tangwei12	f45cb1c2ca	fix bug of communicator flag, test=develop (#19635 )	6 years ago
Yiqun Liu	42b5bec6f9	Integrate NVRTC to support compiling CUDA kernel at runtime (#19422 ) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop	6 years ago
Tao Luo	3ae939e48a	unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631 ) * remove assert.h * change PADDLE_ASSERT_MSG to PADDLE_ENFORCE test=develop * fix tensorrt paddle_enforce test=develop	6 years ago
Leo Chen	af692c9140	update reduce_sum and reduce_mean to save memory, test=develop (#19608 )	6 years ago
tensor-tang	e3e98ed678	fix scope lock bug on infer (#19624 )	6 years ago
Aurelius84	6364ebc4dd	Add distributions of Categorical and MultivariateNormal (#18263 ) * add_distributions_of_normal_and_uniform * paddle/fluid/API.spec * modify API.spec * modified paddle/fluid/API.spec, test=develop * modify paddle/fluid/API.spec, test=develop * modify paddle/fluid/API.spec, test=develop * fix some comment, test=develop * modify API.spec, test=develop * Add distributions of Categorical and MultivariateNormal test=develop * fix pylint codestyle test=develop * fix conflict file test=develop * edit API.spec test=develop * improve sample code test=develop * modify api.spec test=develop	6 years ago
Zeng Jinle	710767d894	Enable inplace support for some ops (#19612 ) * enable inplace for affine_channel op, dropout op, test=develop * remove dropout inplace for ngraph fails, test=develop	6 years ago
FDInSky	a18cf5e119	add a argument for softshrink python api (#19396 ) * test=develop add a argument for softshrink python api * test=develop fix doc format test=develop fix doc format * test=develop fix API.spec test=develop fix API.spec	6 years ago
Tao Luo	d6c85c96dc	paddle::framework::vectorize() templatization (#19627 ) test=develop	6 years ago
danleifeng	8672e15363	elementwise broadcast function enhancement (#19536 ) elementwise broadcast function enhancement	6 years ago
Chen Weihang	8cb54ede8c	Add user-friendly error message in optimizer ops to give a hint about the position sensitive problem of run(startup_program) (#19605 ) * add extra error message hint in optimizer ops * polish format & delete useless change, test=develop * extract init judue from shape compare, test=develop	6 years ago
zhongpu	118bb897cf	add kernel for flatten_op, test=develop (#19472 ) * add kernel for flatten_op, test=develop * add kernel for flatten_op, test=develop * fix the license and remove redundant code, test=develop	6 years ago
Tao Luo	0a46d34538	refine some PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19607 ) test=develop	6 years ago
baojun	a3a4b6e570	Enable ngraph through build_strategy (#19266 ) * enable ngraph throught build_strategy test=develop * add unittest test=develop * put use_ngraph unconditional test=develop * remove paddle_enforce test=develop * remove paddle_enforce test=develop * fix copyright test=develop * limit for ngraph only test=develop	6 years ago
ShenLiang	2cd3fa3e9a	add scatter_nd op and scatter_nd_add op (#19571 ) * add scatter_nd op, test=document_preview test=develop * fixed the document, test=document_preview test=develop * modify the notes, test=document_preview test=develop * remove the ShareDataWith, test=develop	6 years ago
wawltor	364c44422e	Add the support the int64 data type of `scatter_op` input Index(#18804 ) (#19508 ) * test=develop Fix the scatter op bug when use the add mode, and support the int64 data type of scatter_op Index(#18804). * test=develop Remove the PADDLE_ENFORCE and use PADDLE_ENFORCE_EQ * test=develop Remove the fix bug of scatter_add, and just add the support of int64 in scatter_add * test=develop Add the test case for scatter op, the test case just for index int64	6 years ago
Adam	8d6d95cc2b	paddle::framework::vectorize() templatization (#19611 ) test=develop	6 years ago
Tao Luo	75d1571995	refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603 ) test=develop	6 years ago
Yiqun Liu	c5548178b0	A a pass to enable the use of cudnn (#19346 ) * Add a interface to enable cudnn for inference. * Add cudnn_placement_pass. test=develop * Set the default value of cudnn_enabled_op_types to null. test=develop * Write the common basic class, placement_pass_base, to refine the codes. test=develop * Call EnableCUDNN in unittest. test=develop * Refine cudnn_placement_pass tester. * Enable the testing of cudnn_placement_pass in inference's unittest. test=develop * Add the check of op kernels. test=develop	6 years ago
Adam	e94b26daf5	using MKLDNNMemoryFormat = mkldnn::memory::format changes (#19568 ) * using MKLDNNMemoryFormat = mkldnn::memory::format changes test=develop * PADDLE_ENFORCE update test=develop	6 years ago
Zeng Jinle	e045aadf9a	fix retry_allocator_test by removing glog envs, test=develop (#19596 )	6 years ago
baojun	f2ad30c4dd	Some ngraph op and unittest fix (#19515 ) * update ngraph ops test=develop * update unittest test=develop * increase coverage test=develop	6 years ago
Tao Luo	49523ea189	replace PADDLE_ASSERT with PADDLE_ASSERT_MSG (#19586 ) * remove unused PADDLE_ASSERT(_IS_NOT_ERROR) * replace PADDLE_ASSERT with PADDLE_ASSERT_MSG test=develop	6 years ago
gongweibao	abaf87be2b	Change backward_guard to optimize_guard to maximize the allreduce overlap. (#19506 ) Change backward_guard to optimize_guard to maximize the allreduce overlap	6 years ago
Zeng Jinle	578cccd48c	fix parallel compilation error of allocator (#19581 )	6 years ago
Zeng Jinle	f4562c3468	fix typo of allocator, test=develop (#19578 )	6 years ago
xiaoting	7a86706309	modified multiclass_nms example (#19553 ) test=develop, test=document_preview	6 years ago
gongweibao	57f0f0f2dc	Delete pserver complete file before executor running. (#19468 )	6 years ago
JesseyXujin	4a7e6deb63	add padding in linear_chain_crf op (#19583 ) * add padding in linear_chain_crf op * modify API.spec * add linear_chain_crf_op.cc and linear_chain_crf_op.h * remove useless unit test , test=develop * modify API.spec, test=develop * remove some blanks in nn.py , test=develop * fix some bugs on nn.py and API.spec ,test=develop * fix nn.py, test=develop * fix API.spec ,test=develop * fix bug of CI test in test_linear_chain_crf_op.py * fix bug of CI test in test_linear_chain_crf_op.py, test=develop * remove paddle_enforce, test=develop * remove paddle_enforce, test=develop * remove paddle_enforce, test=develop * remove paddle_enforce, test=develop * remove paddle_enforce, test=develop * remove paddle_enforce, test=develop * modify nn.py, test=develop * fix API.spec, test=develop * fix unittest bug, test=develop	6 years ago
Zeng Jinle	19474019c2	fix fast pe to run highest priority ops first, test=develop (#19575 )	6 years ago
zhouwei25	84c728013c	fix the compilation issue on windows caused by mkl_CSRMM (#19533 )	6 years ago
mapingshuo	f4ee60b7d0	Imdb train demo2 (#19572 ) * add place to reader, remove snappy dependency * remove snappy dependency from train demo test=develop	6 years ago
Zeng Jinle	0af8549750	fix seg fault of share lod, test=develop (#19573 )	6 years ago
Jacek Czaja	cef95ee30d	[MKL-DNN] Refactoring Softmax (#19312 ) * - First set of modifications - Compilation fixes - compilation fix - Another compilation fix - Moved AcquireSoftmaxPrimitiveDescriptor call into handler - MKL-DNN Softmax PD refactor test=develop - Compilation fix test=develop - another compilation fix - cosmetcis test=develop - Compilation fix - Fix to crash when softmax backward is created * - Fixes after review of softmax refactoring test=develop	6 years ago
Zeng Jinle	0a73f7202a	Add retry_allocator for gpu (#19409 ) * add retry_allocator for gpu, test=develop * follow chengduoZH's comments, test=develop * follow huihuang's comments,test=develop * change f,l in enforce.h to be file,line, test=develop * increase code coverage by adding unittests, test=develop * fix CMakeLists.txt, test=develop	6 years ago
hutuxian	c756b5d231	Paddlebox Framework (#18982 ) * Support looking up embeddings from BoxPS. * Add a _pull_box_sparse op, for now this op is not exposed to users. * Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on. * Add 'BoxPSDataset' in python code. * Add a compile options WITH_BOX_PS and a MACRO PADDLE_WITH_BOX_PS. * Add UT. * More concrete information pls refer to: https://github.com/PaddlePaddle/Paddle/pull/18982	6 years ago
Zeng Jinle	5dce1da680	remove reset recordio usage (#19519 )	6 years ago
ShenLiang	85914f7a88	add gather_nd op and unit test (#19366 ) * fixed the code for coverage * fixed the document,test=document_preview test=develop	6 years ago
Jacek Czaja	ecd9f330c9	[MKL-DNN] Fix to face model on AVX512 platforms (#19282 ) - Refactor step 1 - Compilation fix - Yet another compilation fix - Even more compilation fix - Lint fixes test=develop - Removed deprectaed PADDLE_ENFORCE occurance test=develop - Candidate fix to BN forward - Lint fixes test=develop - Refactoring in data_layout_transform - compilation fix - Another comppilation fix - Step further into darkness - Yet another compilation fix - Yet another compilation fix - missing header - compilation fix - Added MKLDNN -> Paddle conversion in fetch op test=develop - Compilation fix test=develop - Lint test=develop - Mul fix - Fix to MKLDNN MUL op and Elementwise MUL UT test=develop - Workaround for diffrent weights with groups representation Paddle vs MKL-DNN. test=develop - Candidate fix for 5D convolution with groups - Refactor of fix for conv3d and conv2d in fetch op test=develop - Compilation fix - Still same compilation fix - Compilation fix - Compilation fix - Reverted refactoring of fixes - Adapted test_conv2d_int8_mkldnn so it exects data in NCHW format not NHWC test=develop - minor fix in UT test=develop - Lint fixes test=develop	6 years ago
GaoWei8	e8405e5c61	Modify the dropout op to multi-thread (#19504 ) * Modify the dropout op to multi-thread test=develop * define parallel test=develop	6 years ago
Huihuang Zheng	2916caa2c4	Change ugly PADDLE_ENFORCE_EQ in recurrent_op.cc (#19470 ) test=develop	6 years ago
Liufang Sang	9dde564097	change var name padding_num to padding_value (#19498 )	6 years ago
Aurelius84	5b5379b32a	Add sequence_topk_avg_pooling Op (#19442 ) * add topk_avg_pooling * refine api doc and modify api.spec test=develop	6 years ago
yaoxuefeng	10ca3f9609	add thread scope stat accurate metrics test=develop (#19480 ) * add thread scope stat accurate metrics test=develop * fix style * fix style * fix style * fix style test=develop * fix style test=develop * fix style test=develop * fix style test=develop * fix style test=develop * fix style test=develop * fix style test=develop * fix conflict * fix style * fix style test=develop * fix error test=develop * fix error test=develop	6 years ago
liuwei1031	d6cb1a4122	add dynamic C runtime support on windows, test=develop (#19502 )	6 years ago
Bai Yifan	6d99842bb8	fix mean_iou api example, test=develop, test=document_preview (#19503 ) Fix mean_iou api misleading example	6 years ago
Tao Luo	02270b3eb1	remove unused assert.h (#19529 ) test=develop	6 years ago
chengduo	e340df013e	Support feed single persistable variable to PE (#19417 ) * update executor feed	6 years ago
Yiqun Liu	fcec365d29	Add a pass to replace dropout_op with scale_op when is_test is true (#19297 ) * Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true. test=develop * Delete dropout_op directly when upscale_in_train is true. test=develop * Improve the debug string, adding the print of op_desc information. * Fix the case when dropout's input x is reused as the next op's output. * Add the pass to inference. test=develop * Change the log level. test=develop * Add unittest for inplace case. * Add comment to explain the pass. * Apply the pass for CPU inference. test=develop * Fix the typo. test=develop * Add the check of AttrType. test=develop	6 years ago
hong	e169538886	fix kernel config bug in dygraph mode; test=develop (#19532 )	6 years ago
Zeng Jinle	c2c5b1b941	remove signal raise msg, test=develop (#19527 )	6 years ago
lidanqing	ba368bf696	clean up intel labeled TODOs (#19476 ) test=develop	6 years ago
Zeng Jinle	11f2f78458	fix sofmax seg fault in AVX, test=develop (#19487 )	6 years ago
Thunderbrook	1fe468d319	support debug each output of each ins (#19004 ) * dump slot * test * proto * dump slot * test * proto * code style * code style * code style * style * add delete after unseen days * add unseen days * code style * conflict solve test=develop * add clear model * code style test=develop * code style test=develop * support debug tensor of each ins test=develop * support debug tensor of each ins test=develop * learning rate * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style test=develop * code style test=develop * unitest * style * style * multi phase * add channel * code style * style * style * unitest * style * define * define test=develop * style test=develop * rm define test=develop * linux * linux test=develop * style test=develop * output format test=develop * windows ci test=develop	6 years ago
Zeng Jinle	5c8f210ce3	refine inplace inference registry, test=develop (#19032 )	6 years ago
chengduo	b6d1d8901f	Increase num_iteration_per_drop_scope (#19075 ) * increase num_iteration_per_drop_scope test=develop * Fix bug of while_op test=develop * fix bug of whileOp test=develop	6 years ago
Double_V	1d0f04315a	fix row_conv_op to force it support lodtensor and tensor input simultaneously, test=develop (#19412 ) Support Tensor input for row_conv_op	6 years ago
Jiabin Yang	1ce0a09e60	fix con2d transpose bias by create and init it in build_once (#18968 ) * fix con2d transpose bias by create and init it in build_onee * fix API spec * test=develop, invoke ci * fix bias_attr and act has no effect error on layer norm, conv2dTranpose, billinearTensorProduct, sequece_conv. fix original_mode not used error on GRUunit. fix sample_weight not set error on NCE. Add ut for all thoese layer * test=develop, change success standard for conv2dTranspose * test=develop, fix test_layers to invoke some error branch * test=develop, fix sample code * test=develop, fix BilinearTensorProduct failed in dygraph mode * test=develop, fix test_layers segment fault error	6 years ago
tangwei12	65c7368400	Fix the correctness of async mode at distributed training (#18863 ) * fix correctness of the communicator * fix a bug in send thread when sending var context is empty, test=develop * add lookup_table_prefetch_op and prefetch optimize, test=develop * remove remote prefetch GPU supported * word2vec force with CPU, test=develop * test dist remote lookup table force with CPU, test=develop	6 years ago
baojun	6421c61ae2	Update ngraph engine for multiple threading (#19155 ) * update for multiple threading test=develop * remove PADDLE_ENFORCE test=develop	6 years ago
Zeng Jinle	caf59d0f3f	Add signal message to stderr (#19421 ) * add signal message to stderr, test=develop * add unittests for ugly SignalHandle, test=develop	6 years ago
Yi Liu	efb05ba258	supports multiple NCCL communicators preserved in NCCLCommContext (#19407 ) * supports multiple NCCL communicators preserved in NCCLCommContext test=develop * add ut for c_comm_init_all operator and fix cuda resource release problem test=develop	6 years ago
Huihuang Zheng	56dd76538c	Delete useless ex-scope in recurrent op (#19426 )	6 years ago
wopeizl	b8aa37d529	save the callstack information to file when exception throws test=dev… (#19324 ) * save the callstack information to file when exception throws test=develop	6 years ago
Aurelius84	a9cd513680	improve sequence_conv api doc (#19316 ) * improve sequence_conv api doc test=develop * add warning for padding param test=develop modify into deprecated	6 years ago
joanna.wozna.intel	2e3ec66be0	Add conv dequant squash for int8 (#18905 )	6 years ago
vincentXiyu	482ce818bb	Support Tensor input with padding for warpctc op (#19322 ) * support tensor input with padding for warpctc op * merge with develop * test=develop * modified python API examples test=develop * nn.py is modified for code coverage test=develop * update documents info about warpctc op in API.spec test=develop * add test_warpctc_with_padding in test_layers test=develop * add warning log for cuda_version back to warpctc_op.cc * modify API.spec for warpctc op test=develop * modify API.spec * update warpctc test to new CompiledProgram API test=develop * modify code examples for warpctc op test=develop * modify API.spec for warpctc op test=develop * modify API.spec for warpctc op test=develop	6 years ago
Leo Chen	6fb310ae29	Fix bug of getting bool Flags from os.environ (#19349 ) * fix bug of getting bool Flags from os.environ, test=develop * add empty loss_name in CompiledProgram for inplace grad test, test=develop	6 years ago
liu zhengxi	32598ffd8f	Python infer api update and add unit test (#19353 ) * python inference api supports numpy and add unit test, fix unit test fail in test_slim_int8_googlenet and test_slim_int8_mobilenet	6 years ago
mapingshuo	d5ac87ec22	Lookahead optimizer (#19386 ) * Add lookahead optimizer * add unittest for lookahead optimizer test=develop * add doc string for LookaheadOptimizer test=develop test=document_preview * add API spec for lookahead test=develop test=document_preview * modify api spec test=develop test=document_preview * modified doc string * modify the test file test=develop test=document_preview * modify doc string test=develop test=document_preview	6 years ago
Huihuang Zheng	12d29f4d2a	Change TensorCopy in recurrent_op to ShareDataWith (#19319 )	6 years ago
tangwei12	19dac67e9f	fix distribute transpiler GRPC error code 4, RPC Deadline (#18984 ) * fix sync mode hang in transpiler * remove sync mode in send/recv * replace PADDLE_ENFORCE with PADDLE_ENFORCE_NE	6 years ago
Yibing Liu	5d1575cfe8	Fix arg do_model_average in param_attr (#19376 ) * Fix arg do_model_average in param_attr test=develop * Update api spec test=develop	6 years ago
Tao Luo	c82280e445	remove unused conv_elementwise_add2_act_fuse.cc (#19344 ) test=develop	6 years ago
chengduo	4278518fb0	Update CompiledProgram (#18919 ) * use PE for compiler test=develop	6 years ago
lidanqing	9240e5325c	add local user data conversion into full_pascalvoc_test_preprocess.py (#19283 ) * add local user data conversion into full_pascalvoc_test_preprocess.py test=develop * change PADDLE_ENFORCE to PADDLE_ENFORCE_GE test=develop * change according to reviews test=develop	6 years ago
翟飞跃	2e3ee57954	Use sparse matrix to implement FusedEmbeddingSeqPoolGradKernel (#19153 ) * Implement the operator with sprase matrix multiply * Update the URL of mklml library. test=develop * Disable MKLML implematation when using no-linux. test=develop * optimize bp with mkl sparse matrix test=develop	6 years ago
Leo Chen	a9d5fc5142	Enhance OpTest to check the consistency of operators when using and not using inplace (#19101 ) * add pybind interface to get all inplace ops, test=develop * enhance OpTest to check whether the consistency of operator when using and not using inplace, test=develop * handle corner cases in op_test, test=develop * support outputs without tensor holder_, like XShape in reshape_op, test=develop * fix bug, some op has GradOpMaker, but actually no grad_op in OpInfoMap, test=develop * use reshape_grad instead of reshape in FlattenGradOp, test=develop * fix error debug dims info for variables like XShape, test=develop * change computational order in sum_op to relieve computation difference using inplace, test=develop * add inplace_atol to check group_norm, and skip inplace_grad for mkldnn, test=develop * follow sneaxiy's comments, test=develop * remove unused DefaultGradOpDescMaker in mkldnn op, test=develop	6 years ago
Aurelius84	0d29cf18f4	Supports diagonal initialization in uniform_random op (#19299 ) * add diag init in Uniform_random op test=develop * modify api.spec test=develop * fix unform_batch_size_like maker test=develop * add diag_num and diag_step assert check test=develop	6 years ago
Tao Luo	e3c68bde78	stronger the error message of tensor's mutable_data (#19303 ) * stronger the error message of tensor's mutable_data test=develop * update error message test=develop	6 years ago
chengduo	a8a9823dae	add memory profiler (#19320 ) test=develop	6 years ago
Adam	97d1db1874	Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237 ) * Add generalized Conv+Activation MKLDNN fuse pass creation Part2 test=develop * Undefined behaviour of GetAttrIfExists<> FIX test=develop	6 years ago
wangguanzhong	37428952c6	fix generate mask fpn, test=develop (#19301 )	6 years ago
zhaoyuchen2018	5296294dae	Fix elementwise performance poor issue (#19278 ) For small case use 1D block is better than 2D block. Refer to this issue: #19275	6 years ago
Tao Luo	6527a7df67	replace part of PADDLE_ASSERT to PADDLE_ENFORCE (#19285 ) * replace part of PADDLE_ASSERT to PADDLE_ENFORCE test=develop * remove unused fallback_alloc_size_ * add unit-test of CUDAPinnedAllocator test=develop	6 years ago
xiaoting	62facc7e47	fix yolo_box python example (#18925 ) test=develop, test=document_preview	6 years ago
Yihua Xu	b920395842	Use sparse matrix to implement fused emb_seq_pool operator (#19064 ) * Implement the operator with sprase matrix multiply * Update the URL of mklml library. test=develop * Disable MKLML implematation when using no-linux. test=develop * Ignore the deprecated status for windows test=develop	6 years ago
wangchaochaohu	6e326ca2c6	optimize the realization of cuda dropout (#19136 ) * cuda optimie for dropout * remove tmp swp file * fix compile error test=develop * test=develop optimize the cuda realization of dropout op * remove unsed code test=develop * remove tmp file test=develop	6 years ago
Zhaolong Xing	76c95af000	Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213 ) * fix mask rcnn bug: 1. affine channel fuse (diff) 2. condition block op (memory leak) 3. merge lod tensor op (diff) 4. memroy optim (diff) test=develop * fix ci aboud PADDLE_ENFOCE fix merge lod infer op ut test=develop	6 years ago
qingqing01	5fc8de449a	Remove warning in batch_norm_op (#19260 )	6 years ago
lvmengsi	d08d5ab519	Fix the mistake of convolution (#19274 )	6 years ago
Aurelius84	78a3d837f8	Add match_matrix_tensor op (#18525 ) * add matrch_matrix_tensor op test=develop * fix ignore unittest if with_mkl=off test=develop * clean code and rm is_test param test=develop * modify API.spec test=develop * rm useless code in search_compute.h test=develop * modify api.spec test=develop * modify default_grad.spec test=develop * Add API test code test=develop * clean code in search_computer.h * modify PADDLE_ENFORCE and clean search_compute.h test=develop * fix code style test=develop	6 years ago
Zeng Jinle	5b6673c44d	merge develop to solve conflict, also fix API doc, test=develop (#18823 )	6 years ago
liuwei1031	50582071dc	fix compilation issue in windows vs2017 (#19183 ) * fix compilation issue in windows vs2017, test=develop * fix gtest lib not found issue, test=develop	6 years ago
zhang wenhui	539c870753	add fl_listen_and_serv &fl_transpiler,test=develop (#19091 ) add fl_listen_and_serv op for Federated_learning and fl_distribute_transpiler add this op to pserver program . This op just listen the endpoint and sum&scale.	6 years ago
juncaipeng	5368b36512	remove the warning for reminding user to avoid using the OriginProgram method, test=develop (#19244 ) This log information may annoy users who don't need to care about it.	6 years ago
silingtong123	af0fbd9012	change PADDLE_ENFORCE to PADDLE_ENFORCE_CUDA_SUCCESS (#19205 ) * print error code if cuda related API fails	6 years ago
Zeng Jinle	91a0911ca3	Make PADDLE_ENFORCE_EQ support types that cannot be converted to std::string (#19243 ) * make PADDLE_ENFORCE_EQ support cannot to string types, test=develop * follow huihuang's comments, test=develop	6 years ago
chengduo	8a89ca94ce	Fix REGISTER_OP_WITHOUT_GRADIENT (#19251 ) * fix REGISTER_OP_WITHOUT_GRADIENT test=develop	6 years ago
gongweibao	fd4b15a2f6	Unset unittests http_proxy env to avoid timeout. (#19269 ) Unset unittests http_proxy env to avoid timeout.	6 years ago
silingtong123	a94a25867d	imporve the doc of decorate_reader API (#19206 ) * imporve the doc of decorate_reader API, test=develop * udpate API.spec, test=develop	6 years ago
Kaipeng Deng	2848cb791e	fix temporal_shift OP PADDLE_ENFORCE. test=develop (#19161 ) * fix temporal_shift OP PADDLE_ENFORCE. test=develop * fix HasInput/HasOutpu ENFORECE. test=develop	6 years ago
Tao Luo	2f8c7e021f	remove unused inference_transpiler unit-tests (#19130 ) * remove unused inference_transpiler unit-tests test=develop * remove InferenceTranspiler usage in quantize_transpiler.py test=develop	6 years ago
Zeng Jinle	708bd9798d	move_flags_to_unified_files_for_management, test=develop (#19224 )	6 years ago
Zeng Jinle	002f325dcd	add PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#19211 )	6 years ago
lidanqing	07a4d8f8d6	Fix mAP problem in unit test of int8 object detection test (#18946 ) * change the top1 comparison to mAP comparison test=develop * change the mobilenet-ssd tester demo data and batch_size settings test=develop	6 years ago
Adam	b837689e97	Add generalized Conv+Activation MKLDNN fuse pass creation (#19072 ) test=develop	6 years ago
Yibing Liu	50b1cab122	Add padding support for crf_decoding (#19057 ) * Add padding support for crf_decoding * Fixes in comupte kernel test=develop * Update API Spec test=develop * Update API.spec test=develop * Avoid using paddle_enforce test=develop * Fix enforce test=develop	6 years ago
Aurelius84	45fb031f6b	remove is_test param of FC test=develop (#19209 ) Remove is_test parameter of FC op. The parameter is_test is not used anywhere.	6 years ago
liym27	c8cdef37b2	change the default value of summarize from -1 to 20 in Print API to improve ease of use (#18738 ) * change the default value of summarize from -1 to 20 in Print op to improve ease of use, test=develop * change the doc of API Print to make the document easier to understand, test=develop	6 years ago
Yiqun Liu	77572b70cb	Enhance the error message when GrapOpMaker is null. (#19070 ) * Enhance the error message when GrapOpMaker is null. test=develop * Call Proto() instead of directly using proto_ pointer. test=develop * Rollback to use proto_ directly, because some sepecial grad ops, such some double grad ops, donot have proto. test=develop	6 years ago
lvmengsi	c6f163cd7a	add description of sync_bn (#19056 )	6 years ago
chengduo	b5ba801ef0	Fix gather op bug (#19168 ) * fix gather op bug test=develop	6 years ago
Zeng Jinle	0f9b33954a	move python reader api to fluid.io module, test=develop (#19143 )	6 years ago
Leo Chen	80eab822c1	Remove unused DefaultGradOpDescMaker in REGISTER_OPERATOR() (#19166 ) * remove unused DefaultGradOpDescMaker in REGISTER_OPERATOR(), test=develop * remove SplitIdsOpGradMaker since it is buggy and not tested, update spec file, test=develop	6 years ago
chengduo	c70a97f46e	Use CUDAPinnedPlace in buffered_reader (#19112 ) Use CUDAPinnedPlace in buffered_reader	6 years ago
jiaqi	b104ea0684	add get_last_save_xbox_base/get_last_save_xbox (#19122 ) * add get_last_save_xbox_base/get_last_save_xbox * fix fleet_util bug of load paddle model * add doc string in fleet api	6 years ago
joanna.wozna.intel	492a00f53e	Add conv reqantize squash (#18754 ) * Add requantize squash test=develop * Add more precise tests test=develop * REname and REfactor tester test=develop	6 years ago
Jiawei Wang	6ac32d0981	Instag Implemention (#18394 ) * instag lod tensor impl * First PR for instag * First PR for instag * Before adding Selection Rows. * Change name from instag to filter_instag, add upgrade the impl of filter_instag * Change name from instag to filter_instag, add upgrade the impl of filter_instag * Fix yapf error in gradient_checker.py to pass Travis-CI * Fix Filter Instag Grad test=develop * Fix Filter Instag Grad test=develop * 1) Fix API.spec, add filter_instag Op. 2) Add Vector Support for CUDA. test=develop * Impl Loss_weight and empty output handler * change Loss Weight datatype to Float32, and add Loss Weight as 2nd output * 1) Support Tensor Input(without LOD) 2) Add Unit test * Filter By Instag Final test=develop * Update API.spec for filter_by_instag test=develop * Update API.spec for filter_by_instag 2 test=develop * Add Filter By Instag Coverage * code format of test_layers.py * code format test_layers.py test=develop * Make API args more readable test=develop * Make API args more readable and pass code format test=develop * Filter By Instag Op, Rename Map to Index Map test=develop * Filter By Instag Op, code format err in filter_by_instag_op.cc test=develop * Filter by instag op: code format of cpp files test=develop * Filter by instag Op: Api spec modification test=develop * Filter by instag Op: Api spec doc id modification test=develop * Filter by instag Op: Api spec and doc preview test=develop test=document_preview * Filter By Instag Op, fix doc erro test=document_preview test=develop * Filter By Instag Op, fix doc err and Api spec test=document_preview test=develop * Filter By Instag Op, fix Api spec test=document_preview test=develop * Filter By Instag Op, fix Paddle Encoforce deprecated warning test=document_preview test=develop * Filter By Instag Op, fix Paddle Encoforce deprecated and code format warning test=document_preview test=develop	6 years ago
Tao Luo	5f5648a8ff	Revert "Python inference API support numpy (#19009 )" (#19160 ) test=develop	6 years ago
wawltor	0019eb376a	Fix the error of op `ones_like` document，change the output variable test=document_preview test=develop Fix the error of op `ones_like` document, change the output variable from x to out.	6 years ago
huangjun12	20f18930ae	Add hard swish op (new op) (#19001 ) * add hard_swish activation op (new op) test=develop * remove redundancy files * modify document content of HardSwish OP * add API test in test_layers.py * add dynamic_graph for test_hard_swish	6 years ago
joanna.wozna.intel	bce72c7fea	Replace Relu with bounded Relu in MobileNetV2 quantization (#18988 ) test=develop	6 years ago
chengduo	e044e84264	open fuse_all_optimizer_ops (#19087 ) test=develop	6 years ago
wangguanzhong	1fc242a7ed	refine infer shape in box decoder and assign op, test=develop (#19118 )	6 years ago
gongweibao	29d8781240	Polish fleet API to support cuda collective mode and nccl2 mode. (#18966 ) Polish fleet API to support cuda collective mode and nccl2 mode	6 years ago
flame	b7e1a1d7e7	Python inference API support numpy (#19009 ) test=develop	6 years ago
wopeizl	80b7ef6fc8	add tensorrt support for windows (#19084 ) * add tensorrt support for windows	6 years ago
Kevin	744279fe68	Refine embedding Api doc (#18820 ) * fix overflow by int32 mul test=develop * fix reference nullptr * fix codestyle test=develop * modify to point in ContextProjectFunctor test=develop * modify to point in ContextProjectFunctor test=develop * modify . to -> test=develop * refine embedding padding_idx doc test=develop * fix math:padding_idx preview bug test=develop * modify API.spec test=develop * fix spell error test=develop * refine dtype parm desc test=develop	6 years ago
Kevin	945f3cf631	fix code too big test=develop (#19111 ) Fix seq_pool failed when input dims is too large. Resolve issue #3023	6 years ago
yaoxuefeng	9150cf50fc	add save cache model api in fleet& add slots shuffle in dataset module & add metric op to calculate ctr related metrics (#18871 ) * add ctr related metric layer test=develop * add save cache and slots shuffle test=develop * add save cache and slots shuffle test=develop * fix error * fix error * fix style for ci * fix for comments * change SlotsShuffle input to std::strinf for generality * fix style * fix style * fix style * fix style * fix style * fix style * fix stylr * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * change non-const reference to pointer * fix style * fix style * fix style test=develop * fix style test=develop * add return ins num in ctr metric op * change dtype to float in metric_op.py * fix error test=develop * fix style test=develop * fix API spec * fix API spec * fix API spec test=develop * add UT test=develop	6 years ago
hutuxian	5a80cc8431	Datafeed support reading to cuda place directly. (#19071 ) * add a place field in DataFeed to denote which place it will feed data to. * abstract the copy process in CopyToFeedTensor function * add UT for float32 type and for CUDAPlace	6 years ago
Zeng Jinle	88f111f885	remove unused inplace act codes, test=develop (#19079 )	6 years ago
ShenLiang	4397cb318e	add eye op, kernel and unitest test=develop (#18980 ) * add eye op,test=document_preview test=develop * fix the API.spec, test=develop * fix the document, test=document_preview test=develop * add unitest for CI coverage, test=develop	6 years ago
Kaipeng Deng	f86fead693	Add trilinear_interp OP (#18711 ) * add trilinear interp. test=develop * fix unittest. test=develop * add python api and test_layers. test=develop * refine API.spec. test=develop * fix format. test=develop * add python API test. test=develop * format code. test=develop * refine code strcuture. test=develop * fix format * fix doc. test=develop * fix converage. test=develop * fix format. test=develop	6 years ago
Zhang Ting	c2063217e7	optimize error message for "embedding" and "cross_entropy" OP (#18765 ) * optimize error message, test=develop * optimize error message, test=develop	6 years ago
Tao Luo	741ce8bb1a	inference_shared_library support profile (#16275 ) test=develop	6 years ago
chengduo	17d62ab220	Enhance fuse optimization op pass (#19010 ) * Enhance fuse optimization op pass test=develop	6 years ago
chengduo	21440b4d69	Add call stack info during compile time (#19067 ) * Add call stack info during runtime and compile time test=develop * Rename operator_call_stack test=develop * Add unit test test=develop * follow comment test=develop	6 years ago
jiaqi	a99bc64c63	add fleet util, add some interface in hdfs util (#18752 ) * add fleet util (fleet/utils/fleet_util.py): functions for users' convenience * add some interface in hdfs util : hdfs is_file、hdfs cat	6 years ago
mapingshuo	4ad7c9d5a7	[WIP] Add Imdb train demo (#18895 ) * add train demo for imdb text classification task * make inference library release data_feed dataset dataset_factory data_feed_factory * add String Data Generator * new feature of train demo: save model params * New feature of train demo: set training config using gflags * change code style for CI * add readme and dataset for imdb demo trainer	6 years ago

... 4 5 6 7 8 ...

8864 Commits (6e6eab07e80d287fb10f6033a01f15650b36fcdb)