Paddle

Commit Graph

Author	SHA1	Message	Date
tangwei12	981fc2bdba	fix bug in merge_ids (#15503 ) * fix mistakes in merge_ids, test=develop	7 years ago
baojun	efce25673c	Adding ngraph_engine_op (#14948 ) * enable ngraph_engine_op test=develop * merge develop test=develop * avoid const_cast test=develop * rm ngraph_operator test=develop * Added TODO to move EnableNgraph test=develop * Add TODO to remove const_cast test=develop	7 years ago
chengduo	f8f91fb4b3	Revert conv transpose cudnn (#15514 ) * Revert "set constant for loss" This reverts commit 167933f678ccbb3563e949710279efe004a27731. * Revert "remove workspace_handle" test=develop This reverts commit b4aca8ede9e685bce1dfb1c59e63919f33432572.	7 years ago
tensor-tang	b67584a6e9	jit benchmark use tensor test=develop	7 years ago
Yiqun Liu	3008fa1261	Add the CUDA kernel for beam_search op (#15020 ) * Refine the beam_search op and test. * A basic CUDA implementation of beam_search for small batch_size. * Implement CUDA kernel for beam_search_op. * Use multiple CUDA threads in the same block to select the top beam. * Update the python api of beam_search op. * Enable extend function in CPU kernel of beam_search op. * Unify the CUDA codes. test=develop * Unify the CPU kernel of beam_search op. * Ensure the selected items of beam_search_op's CPU kernel sorted by scores. * Update the description of beam_search in API.spec. * Enable the use of CUDA kernel in beam_search op. * Exclude the beam_search's CUDA unittest when there is no CUDA gpu, and delete some debugging statements. test=develop * Follow comments. test=develop * Call the CPU kernel for beam_search op when batch_size > 4. test=develop * Remove the except of is_empty op in PrepareData. test=develop	7 years ago
tink2123	78145c7dff	modified some comments test=develop	7 years ago
nhzlx	027d24c831	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_int8_ultimate_version	7 years ago
chengduo	bf91d11ed5	Clean elementwise_op_function (#15502 ) test=develop	7 years ago
tangwei12	5cfc40dea8	nce add check sample labels, test=develop (#15463 ) * nce add check sample labels, test=develop	7 years ago
tink2123	e448bdb298	modified some comments test=develop	7 years ago
tink2123	88744e4ab8	fixed some errors test=develop	7 years ago
jerrywgz	9eb2d7b3e1	refine code, test=develop	7 years ago
jerrywgz	6dfd789bfc	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_nms	7 years ago
jerrywgz	6928f8318f	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_axis_for_boxcoder	7 years ago
jerrywgz	e60c8438fc	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_clip_op	7 years ago
tink2123	48cc484643	add align_corners and align_mode for image_resize test=develop	7 years ago
jerrywgz	11f1baa406	refine code, test=develop	7 years ago
Zhaolong Xing	b7b68f2a8c	Merge pull request #15461 from NHZlX/fix_trt_stream_bug fix trt stream bug.	7 years ago
tangwei12	8b50ad80ff	checkpoint at distributed training (#14854 ) checkpoint for distributed training.	7 years ago
jerrywgz	57e5f61ec8	add gpu kernel, test=develop	7 years ago
jerrywgz	cc53453057	add comment and refine code, test=develop	7 years ago
qingqing01	07dc5a1506	Add generate_mask_labels_op to support Mask-RCNN and refine some code. (#15371 ) * Add generate_mask_labels_op to support Mask-RCNN. * Refine sigmoid_cross_entropy to support normalize mode. * Fix generator_proposals_label. * Use DeviceTemporaryAllocator in roi_pool and roi_align. * Remove shape check in data_feeder.	7 years ago
Yiqun Liu	eaad3e4c3d	Add check of input in sequence_expand op. (#15466 ) * Add check of input in sequence_expand op. test=develop * Correct the unittest of sequence_expand op. test=develop	7 years ago
gongweibao	f4dec5cdee	Check collective server's data. (#15449 )	7 years ago
jerrywgz	c12a969bd4	refine comment and unittest, test=develop	7 years ago
chengduo	5a8bd82c0c	Remove workspace_handle (#15376 ) * remove workspace_handle test=develop * set constant for loss test=develop	7 years ago
jerrywgz	1c558ad388	add gpu kernel for box clip, test=develop	7 years ago
nhzlx	5b92ddabe2	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_trt_stream_bug test=develop	7 years ago
nhzlx	2f4aee361a	fix comments test=develop	7 years ago
nhzlx	ec213730bc	fix trt stream bug. BUG: After continuing to input different data, the output cannot be aligned test=develop	7 years ago
wopeizl	a8aa79130b	Merge pull request #15453 from wopeizl/fix15313 fix pr 15313	7 years ago
gongweibao	7f8b40f68d	Fix brpc compilation error. (#15451 )	7 years ago
jerrywgz	0d4b60ab8b	add lod for slice op, test=develop	7 years ago
dzhwinter	8f3b252392	squash commits. test=develop	7 years ago
peizhilin	e6a3a3a31a	fix pr 15313 test=develop	7 years ago
jerrywgz	66bb5dd760	refine infer shape, test=develop	7 years ago
tensor-tang	266e625d2e	Merge pull request #15399 from tensor-tang/refine/seqpool/fc fix cpu jitkernel test and refine benchmark test	7 years ago
Qiao Longfei	45578c1b48	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader	7 years ago
Yan Chunwei	885c4e57ab	fea/infer memory optim2 (#14953 )	7 years ago
jerrywgz	0d91507859	fix share lod, test=develop	7 years ago
Tao Luo	6597ccb01f	Merge pull request #15413 from luotao1/legacy_code remove legacy code	7 years ago
Dun	9f8f0fc2d3	Memory optimization of depthwise conv op and group norm op (#15313 ) * mem opt * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * refine code test=develop * refine code test=develop * refine code test=develop * refine code test=develop * refine with cub test=develop * fix mkldnn test && remove comments && test=develop * polish code && test=develop * add only_forward test && test=develop	7 years ago
jerrywgz	5246285e34	test=develop	7 years ago
jerrywgz	b10d84bc5a	fix bug when run on GPU, test=develop	7 years ago
whs	530869f829	Share LoD from Input(Rois). (#15420 ) test=develop	7 years ago
gongweibao	7ab4af2716	Fix brpc compilation. (#15417 )	7 years ago
Dun Liang	e5004f3c1c	fix ci && test=develop	7 years ago
tensor-tang	316e44b1b7	fix unused warnings test=develop	7 years ago
Wu Yi	7e651a38dd	fix mac cmake version 3.13 build (#15386 ) * fix mac cmake version 3.13 test=develop * fix again test=develop	7 years ago
jerrywgz	b62a17bbae	add nms api	7 years ago
tensor-tang	579d758254	fix jitkernel tests and refine benchmark test=develop	7 years ago
jerrywgz	f660553d77	enhance nms for mask rcnn, test=develop	7 years ago
shippingwang	14f2a1060d	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into shufflechannel	7 years ago
jerrywgz	88ee56d0b2	enhance nms for mask rcnn	7 years ago
zhaozhehao	e2ba9668b4	Tree conv op (#15217 ) * refactor tree2col operator with new memory mechanism test=develop * test=develop * test=develop * Modified API according to panyx0718 test=develop * fix API change according to heavengate test=develop * Modify API comment test=develop	7 years ago
Tao Luo	3ede8b67e6	update CMakeLists.txt	7 years ago
Yiqun Liu	f413b6892b	Revert the modification of while_op in #14764 . (#15372 ) * Revert the modification of while_op in #14764. test=develop * Remove the dependency of GRPC_DEPS. test=develop	7 years ago
jerrywgz	ab9d6a4f39	add comments, test=develop	7 years ago
jerrywgz	10dd3b37ad	add axis for box coder op	7 years ago
乔龙飞 Qiao Longfei	adba4384ec	Merge pull request #15161 from jacquesqiao/gru-add-mode gru add origin mode	7 years ago
nhzlx	8817841c73	fix unit test bug test=develop	7 years ago
jerrywgz	5fb2856584	test_develop	7 years ago
Xin Pan	3ecf6bb338	Merge pull request #15028 from yihuaxu/develop_641313ea7_elementwise_mul_mkldnn_bug_fix Fix the exception when tensor format is x	7 years ago
jerrywgz	af448373c7	test=develop	7 years ago
nhzlx	b938324381	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_int8_ultimate_version test=develop	7 years ago
nhzlx	312fe0ece1	add trt int8 calibration support fix comments test=develop	7 years ago
wopeizl	994e73f685	Merge pull request #15351 from wopeizl/fixbuildissue disable the parallel mode for adam op on windows test=develop	7 years ago
jerrywgz	481d8bce2f	add box clip op	7 years ago
Yiqun Liu	568cc2ffa8	Optimize while_op for test (#14764 ) * Simplify the compare op for CPU. * Use asynchronous tensor copy in reshape_op's kernel. * Optimize while_op for test, avoiding creating variables every time. test=develop * Enable the cache of kernel type and kernel function. test=develop * Enable profiling with gperftools. * Remove flags for testing, and fix the linking error. test=develop * Delete the codes of ChooseKernel. test=develop * Fix bug when preparing ExecutorPrepareContext for while_op. * Fix missing depending on grpc libraries. * Remove the redundant print. test=develop * Follow comments. * Remove the codes related to prepare the ExecutorPrepareContext for while_op. test=develop	7 years ago
tensor-tang	3759c1db8c	Merge pull request #14805 from mozga-intel/mozga-intel/element_wise_operator_ngraph Enable element_wise_add operator for a ngraph engine	7 years ago
tensor-tang	904a39239d	Merge pull request #15254 from mozga-intel/mozga-intel/softmax_operator_ngraph Enable softmax operator for a ngraph engine	7 years ago
peizhilin	cd562f8fb7	disable the parallel mode for adam op on windows test=develop	7 years ago
Xin Pan	16cb3ebd68	Merge pull request #15268 from xiaolil1/pool-int8 Enhance key generation for Pool INT8 test	7 years ago
tensor-tang	a7fc3d42a0	Merge pull request #15304 from tensor-tang/fuse/second_order_mul_sub Fuse/second order mul sub and fuse repeated fc relu	7 years ago
mozga-intel	cba729404d	Enable softmax operator for a ngraph engine test=develop	7 years ago
Qiao Longfei	cd31b90a46	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader test=develop	7 years ago
Qiao Longfei	8c516a24e5	remove min_row_size_to_use_multithread in adam interface test=develop	7 years ago
Qiao Longfei	9b4fe283e1	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam test=develop	7 years ago
Qiyang Min	3f687765e6	Merge pull request #15281 from velconia/fix_expand_op_compile_time Fix expand op compile time bug	7 years ago
minqiyang	c4cf5967db	Change backward op infershape test=develop	7 years ago
tensor-tang	84b0ecdcce	Merge remote-tracking branch 'ups/develop' into fuse/second_order_mul_sub test=develop	7 years ago
chengduo	46d01d798e	Revert "Revert "Remove workspace_handle in conv_cudnn (#15186 )"" (#15290 ) test=develop This reverts commit `358e657f68`.	7 years ago
Qiao Longfei	4d15515c40	fix gru_gpu_kernel test=develop	7 years ago
tensor-tang	93e75c5ae5	refine jitcode of vsub and vsquare test=develop	7 years ago
tensor-tang	d618e48309	fix fuse square mat order and refine test test=develop	7 years ago
Qiao Longfei	4feae25378	fix build problem test=develop	7 years ago
tensor-tang	38de1ff472	add fusion squared mat sub op	7 years ago
Qiao Longfei	e641ffe77b	change interface and api spec for dynamic_gru test=develop	7 years ago
tensor-tang	09c5786e22	add square jitkernel	7 years ago
Qiao Longfei	4c7be265d3	update avx gru grad kernel test=develop	7 years ago
Qiao Longfei	9b16e54064	update gru_grad_op test=develop	7 years ago
Qiao Longfei	e477d789a1	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gru-add-mode	7 years ago
tensor-tang	f347d6e4a1	add repeated fc relu unit test test=develop	7 years ago
tensor-tang	99010e6eae	init repeated fc relu op	7 years ago
tensor-tang	266a5d2f52	implement matmul refer and mkl kernel	7 years ago
tensor-tang	c5623c87a3	init jit matmul kernel	7 years ago
Xin Pan	a1bfb35dd6	try fix py2 test=develop	7 years ago
Dun Liang	a900015c03	add async copy and pinned place	7 years ago
colourful-tree	576c740d5d	Merge pull request #14964 from colourful-tree/data_norm add data norm op	7 years ago
colourful-tree	d5a8909131	Merge pull request #14950 from colourful-tree/develop add teacher student sigmoid loss	7 years ago
minqiyang	bc3e0d6e01	Fix expand op compile time bug test=develop	7 years ago
chengduozh	358e657f68	Revert "Remove workspace_handle in conv_cudnn (#15186 )" test=develop This reverts commit `064512aa47`.	7 years ago
tensor-tang	fc9fbab6a0	Merge pull request #15271 from tensor-tang/fix/typo fix typo and refine	7 years ago
chengduo	064512aa47	Remove workspace_handle in conv_cudnn (#15186 ) * remove workspace_handle in conv2d_cudnn test=develop * remove workspace_handle test=develop * fix bug test=develop * make test_conv2d_op SERIAL test=develop * save memory in conv_cudnn test=develop * enhance thread safety test=develop * enhance temporary allocator test=develop * Add excess fraction test=develop * follow comments test=develop * fix bug and code refine test=develop * fix memory size check test=develop * rename reuse_tmp_allocation_excess_fraction test=develop	7 years ago
tensor-tang	c3a9f3c4b2	fix typo and refine test=develop	7 years ago
tensor-tang	146e942c65	Merge pull request #15250 from tensor-tang/refine/seqpool/feed Refine/seqpool/feed with infer zerocopytensor	7 years ago
xiaolil1	8f17c714de	Conv int8 residual (#15145 ) * Enable basic MKL-DNN INT8 Conv OP test=develop * Modify test case test=develop * Clean unittest code test=develop * Fix test test=develop * Modify test test=develop * Enable MKL-DNN INT8 Conv with Relu Fusion OP test=develop * Enable INT8 Conv with residual fusion OP test=develop * Modify code. test=develop * Modify basic INT8 Conv test=develop * Modify Conv. test=develop * fix style test=develop * Fix style test=develop * Fix test test=develop * Modify code. test=develop * Fix test test=develop	7 years ago
xiaoli.liu@intel.com	f34e779f4d	Enhance key generation for INT8 test. test=develop	7 years ago
Wu Yi	fd85418329	[Feature] support mix precision training for resnet (#14899 ) * clip softmax for fp16 * updates * fuse xent support fp16 test=develop * wip * wip * add simple row reduce * wip fp16 accurate softmax * add accurate softmax kernel for fp16 test=develop * update test=develop * fix cpu build test=develop * update api.spec test=develop * follow comments test=develop * fix build test=develop * fix trt build test=develop * fix inference build test=develop * fix merge test=develop * update test=develop * try fix build test=develop * fix build test=develop * rename real_exp test=develop * fortest * remove hacky kernels test=develop * clean up test=develop	7 years ago
tensor-tang	ce909664d8	Merge remote-tracking branch 'ups/develop' into refine/seqpool/feed	7 years ago
乔龙飞 Qiao Longfei	5e74c4e88f	Merge pull request #15100 from jacquesqiao/fix-dist-sparse-decay fix dist sparse l2 decay	7 years ago
tensor-tang	8e086a8521	follow comment and fix typo test=develop	7 years ago
Qiao Longfei	653cd31971	remove unused code	7 years ago
Qiao Longfei	0a79d7a404	fix merge	7 years ago
Qiao Longfei	422449a945	fix style	7 years ago
Qiao Longfei	edad60e612	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader	7 years ago
nhzlx	4e3522e5b4	add trt int8 support test=develop	7 years ago
Qiao Longfei	d0e3b24002	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay test=develop	7 years ago
tensor-tang	f8c305b243	Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat_2 test=develop	7 years ago
tensor-tang	223c61ca5e	Merge pull request #15170 from tensor-tang/jit/seqpool refine seqpool op	7 years ago
Qiao Longfei	c3b9edf958	follow comment test=develop	7 years ago
Zeng Jinle	e29f10d315	Merge pull request #15207 from sneaxiy/remove_op_handle_lock_and_fix_var Remove op handle lock and fix var	7 years ago
mozga-intel	eff90eb941	PADDLE_WITH_NGRAPH was removed from the code test=develop	7 years ago
mozga-intel	a42f8f4f6f	Enable element_wise_add operator for a ngraph test=develop	7 years ago
mozga-intel	e4184008a4	PADDLE_WITH_NGRAPH was removed from the code test=develop	7 years ago
Qiao Longfei	3ace486ebd	fix sum_op selected rows test=develop	7 years ago
tensor-tang	f702f8fd10	Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat	7 years ago
Qiao Longfei	b16e832d4d	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay	7 years ago
sneaxiy	ed409ac9f4	Revert "Revert "Remove op handle lock"" test=develop	7 years ago
Tao Luo	4d9aa1745a	Merge pull request #14806 from mozga-intel/mozga-intel/scale_operator_ngraph Enable scale operator for a ngraph engine	7 years ago
Tao Luo	dc0c221426	Merge pull request #14803 from mozga-intel/mozga-intel/mean_operator_ngraph Enable mean operator for a ngraph engine	7 years ago
Zeng Jinle	dacfaaa966	Revert "Remove op handle lock" test=develop	7 years ago
Qiyang Min	317840d3ba	Merge pull request #14277 from velconia/add_fused_emb_seq_pool_op Add fused emb seq pool op	7 years ago
tensor-tang	2dd331cc21	Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat test=develop	7 years ago
tensor-tang	316636404f	add seqpool concat unit test	7 years ago
xiaolil1	c8f101e5da	Conv int8 relu (#15130 ) * Enable basic MKL-DNN INT8 Conv OP test=develop * Modify test case test=develop * Clean unittest code test=develop * Fix test test=develop * Modify test test=develop * Enable MKL-DNN INT8 Conv with Relu Fusion OP test=develop * Modify basic INT8 Conv test=develop * fix type test=develop * Modify test test=develop	7 years ago
tensor-tang	7923d7271f	add fusion seqpool concat op	7 years ago
Zeng Jinle	f3a13512fc	Merge pull request #15139 from sneaxiy/remove_op_handle_lock Remove op handle lock	7 years ago
Qiao Longfei	44b300556d	change min_row_size_to_use_multithread to parameter of adam test=develop	7 years ago
Qiao Longfei	87b4eb1da4	change min_param_size_to_use_multithread to min_row_size_to_use_multithread	7 years ago
minqiyang	0f94c1ac14	Polish code test=develop	7 years ago
minqiyang	c09a379015	remove const_cast test=develop	7 years ago
tensor-tang	102d93712e	Merge remote-tracking branch 'ups/develop' into jit/seqpool test=develop	7 years ago
tensor-tang	123b98f417	refine height and codesize and support all pool test=develop	7 years ago
tensor-tang	0145f40f45	use height from params of jitcode	7 years ago
tensor-tang	e0591deebc	enhance seqpool jitcode	7 years ago
Zeng Jinle	99e6e8b00f	Merge pull request #15179 from sneaxiy/fix_crf_grad_lod Fix crf grad lod share	7 years ago
minqiyang	db8eb9b688	Polish code test=develop	7 years ago
minqiyang	39b98709b1	Move fused ops to fused dir test=develop	7 years ago
minqiyang	920d4a8b78	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_fused_emb_seq_pool_op test=develop	7 years ago

3501 Commits (16ec4b8c8bd5c95c38c39d2a2528027b4a1930b6)