Paddle

Commit Graph

Author	SHA1	Message	Date
qingqing01	86e912c544	Fix windows compiling (#16230 ) test=develop	6 years ago
qingqing01	8ad672a287	Support sync batch norm. (#16121 ) * Support Sync Batch Norm. * Note, do not enable it in one device. Usage: build_strategy = fluid.BuildStrategy(); build_strategy.sync_batch_norm = True; binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(loss_name=loss_mean.name, build_strategy=build_strategy)	6 years ago
xuezhong	1dad36f6aa	Merge pull request #15609 from xuezhong/add_sample_logits_op add sample_logits and sampled_softmax_with_cross_entropy op	6 years ago
Yiqun Liu	7d96c74ab2	Initialize the benchmark tester for operator. (#15772 ) * Initialize the benchmark tester for operator. test=develop * Rearrange the codes. test=develop	6 years ago
xuezhong	58ad40cc15	add sample_logits op	6 years ago
baojun	efce25673c	Adding ngraph_engine_op (#14948 ) * enable ngraph_engine_op test=develop * merge develop test=develop * avoid const_cast test=develop * rm ngraph_operator test=develop * Added TODO to move EnableNgraph test=develop * Add TODO to remove const_cast test=develop	6 years ago
Yiqun Liu	3008fa1261	Add the CUDA kernel for beam_search op (#15020 ) * Refine the beam_search op and test. * A basic CUDA implementation of beam_search for small batch_size. * Implement CUDA kernel for beam_search_op. * Use multiple CUDA threads in the same block to select the top beam. * Update the python api of beam_search op. * Enable extend function in CPU kernel of beam_search op. * Unify the CUDA codes. test=develop * Unify the CPU kernel of beam_search op. * Ensure the seletced items of beam_search_op's CPU kernel sorted by scores. * Update the description of beam_search in API.spec. * Enable the use of CUDA kernel in beam_search op. * Exclude the beam_search's CUDA unittest when there is no CUDA gpu, and delete some debuging statements. test=develop * Follow comments. test=develop * Call the CPU kernel for beam_search op when batch_size > 4. test=develop * Remove the except of is_empty op in PrepareData. test=develop	6 years ago
zhaozhehao	e2ba9668b4	Tree conv op (#15217 ) * refactor tree2col operator with new memory mechanism test=develop * test=develop * test=develop * Modified API according to panyx0718 test=develop * fix API change according to heavengate test=develop * Modify API comment test=develop	6 years ago
peizhilin	dba009dbbf	fix script issue test=develop	7 years ago
peizhilin	01c00b07dd	fix test issues on windows test=develop	7 years ago
tensor-tang	693e5e65ce	Merge pull request #14958 from tensor-tang/refine/jit enhance jit	7 years ago
Zeng Jinle	95cbe07c40	Merge pull request #14836 from sneaxiy/feature/py_func Featue/py_func op	7 years ago
tensor-tang	20392be001	Merge remote-tracking branch 'ups/develop' into refine/jit fix conflicts test=develop	7 years ago
peizhilin	19ebd8b4cf	add ctc support for windows	7 years ago
sneaxiy	deb0d41cea	fix cmake fix cmake again test=develop	7 years ago
sneaxiy	8760d23c7d	featue/py_func	7 years ago
tensor-tang	53709e7e61	refine names	7 years ago
tensor-tang	fab0ee8757	Merge remote-tracking branch 'ups/develop' into refine/jitkernel	7 years ago
tensor-tang	77236e33fc	init jitkernel	7 years ago
nhzlx	f75815b78c	add prelu gpu inference	7 years ago
Qiao Longfei	b9d3d75fc4	fix prefetch dependency test=develop	7 years ago
Qiao Longfei	47280ef8b4	lookup table op support prefetch	7 years ago
wopeizl	d9a1f3e58e	Windows/online (#14474 ) * add recordio support * disable the openblas multi-thread on windows since no support adjust the python script * code style * code style test=develop * add create_recordio_file_reader back * fix code style test=develop * fix the gtest.cmake on windows * fix cc_test on windows * fix the win build test=develop * remove fused compile support on windows test=develop * add the jit support test=develop * add the jit support, test=develop * add the jit support, test=develop * add the jit back fix compile error on windows * rollback test=develop * test case fix * disable DSO by default on windows * exclude warpctc_op on windows * exclude the dynload_warpctc out on windows test=develop * fix the scripts error test=develop * disable avx on windows by default test=develop * re-organize the cmake file * disable mkl on windows by default * add warp_ctc back * fix the dependency * fix the dependency * fix the build issue on windows * remove unsupported flag on windows * code style * code style test=develop * fix issue * add profiler, parallel_executor back * clean up the pre-definitions on windows * fix build issue * test=develop	7 years ago
Tao Luo	5d4d117edc	Merge pull request #14502 from qingqing01/cudnn5_fix Fix compling with cuDNN v5	7 years ago
Yu Yang	3edd32d070	fix(Compile): fix depends error when compile op using cub some operators depend on cub and xxhash by header. The dependency should be declared explicitly rather than declared to pybind. test=develop	7 years ago
Dang Qingqing	cda60311f9	Fix compling with cuDNN v5 test=develop	7 years ago
Yu Yang	f1a392a5fe	Merge pull request #13804 from sneaxiy/rewrite_allocation Rewrite allocation	7 years ago
qingqing01	fd7e643153	Convolution fusion operator. (#14449 ) * Convolution fusion operator. * Clean code test=develop	7 years ago
Yu Yang	98bbfc17be	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation test=develop	7 years ago
Wu Yi	a2d9b34417	Refine operator cmake (#14413 ) * wip simplify operator framework * wip * wip * done test=develop * clean test=develop * fix test=develop * fix deps test=develop * fix cpu build test=develop * fix tensorrt build test=develop * fix tests test=develop * fix test=develop * fix cpu build test=develop	7 years ago
whs	1722678258	Make nce support more distribution. (#13549 ) * Fix truncated normal. * Fix. * Make nce support more distribution. * Fix API.spec. * Fix python API. * Fix. test=develop * Fix API.spec test=develop * Fix sampler. * Fix order of arguments in python API. test=develop	7 years ago
Wu Yi	b32c13dc20	Add cudnn ctc loss (#12366 ) * add cudnn ctc loss * wip add test test=develop * wip * wip * done test=develop * move include cudnn test=develop * test test=develop * fix build test=develop * fix build test=develop * fix build on cudnn5 test=develop * fix cudnn5 build test=develop * fix cudnn5 build test=develop * merge develop softmax functor change test=develop	7 years ago
Yu Yang	c8f6e70ab4	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation test=develop	7 years ago
peizhilin	61fa5218b9	Merge remote-tracking branch 'upstream/develop' into windows/build	7 years ago
Yu Yang	8f9bfad246	perf(compile): speed up reduce_op compile by splitting files (#14294 ) test=develop	7 years ago
sneaxiy	d231e55065	merge develop test=develop	7 years ago
peizhilin	7638f0afb3	simplify the logic	7 years ago
peizhilin	d01a26280e	Merge remote-tracking branch 'upstream/develop' into windows/build	7 years ago
li099	688ed60116	Add lod tensor array to tensor op (#13990 ) * add lod tensor array concat * add lod tensor array concat * test=develop * add lod tensor array concat test=develop * Fix API.spec test=develop * add lod tensor array concat test=develop * revise some bug of lod tensor array concat test=develop * add unittest for tensor array concat test=develop * change to tensor array to tensor test=develop * revise bug test=develop * revise a bug test=develop * revise a bug test=develop * revise a bug of python3 test=develop	7 years ago
peizhilin	52f7644f53	Merge remote-tracking branch 'upstream/develop' into windows/build	7 years ago
Yu Yang	fdc689142c	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation test=develop	7 years ago
chengduo	c5b6573a5a	Fix input<tensor> (#14208 ) * fix input<tensor> test=develop * fix split_ids test=develop * ElementwiseMul should not support SelectedRows * fix scale op test=develop * change GetTensorFromVar() method to GetTensorOrSelectedRowsFromVar() * fix operator * refine MultiOutput * fix MultiOutput test=develop * disable test_dist_save_load test=develop * fix elementwise_op test=develop * add get_sparse_as_op test=develop * add info for check test=develop * rename get_sparse_as_op with extract_rows_as_op. test=develop * elementwise doesn't support selected_rows * fix regularizer * remove extract_rows_as test=develop * fix ci test=develop * add test for sum_op * fix regularizer test=develop * test=develop * fix pserver weight decay multi inputs test=develop	7 years ago
Zhaolong Xing	ba8b5619a3	Revert "cherry picked windows patches."	7 years ago
Yu Yang	057a682ee9	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation	7 years ago
peizhilin	9d67c1fb69	cpu build support	7 years ago
dzhwinter	60f70b174d	test=develop	7 years ago
dzhwinter	316765839d	add back jit simd instructions. stage.	7 years ago
dzhwinter	bf2e4cb188	cleard. staged	7 years ago
dzhwinter	ebfe5a02b3	merge develop branch	7 years ago
tensor-tang	3c957af139	Merge pull request #14080 from tensor-tang/refine/jit/crf2 Refine/jit/crf decoding	7 years ago
Yu Yang	c01696f8c2	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation test=develop	7 years ago
tensor-tang	21487d78bf	add crf decode jit kernel	7 years ago
minqiyang	8a0f26f45f	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into continue_hash_op	7 years ago
minqiyang	d4f9aa0852	Add hash op implementation	7 years ago
chengduo	a7497653d0	Refine Split op (#13967 ) * speedup split_op test=develop * speedup split_op test=develop * rename ConcatGrad to Split * refine concat and split test=develop * fix compile error	7 years ago
Yu Yang	461f71a90b	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation	7 years ago
tensor-tang	3c249283af	init seqconv eltadd relu op	7 years ago
qingqing01	67a2b5215d	Add affine channel op to speed and save memory for faster-rcnn model. (#13919 ) * Add affine channel op. * Update code and add Python API. test=develop * Update API.spec test=develop	7 years ago
tensor-tang	bcb8ea397d	Merge remote-tracking branch 'ups/develop' into fea/jitkernel_peephole test=develop	7 years ago
Qiyang Min	f99ea99e36	Merge pull request #13720 from velconia/fix_grad_clip Merge selected_rows for clip_by_norm op	7 years ago
tensor-tang	9131a35676	replace the lstm compute with jitkernel test=develop	7 years ago
sneaxiy	4c672ab1a2	Merge reyoung:rewrite_allocation	7 years ago
minqiyang	bcd8c2ccc3	Add unit test	7 years ago
minqiyang	67308822f8	Add selected_rows merge for clip_by_norm op test=develop	7 years ago
dzhwinter	26771f41ba	"fix compile error" (#13579 ) * "fix compile error" * "fix ci" * rerun ci test=develop * test=develop rerun ci	7 years ago
Xin Pan	425a882165	Merge pull request #13643 from panyx0718/ir2 clean up channel	7 years ago
Yu Yang	5cf395beaf	Fix bug in uts	7 years ago
Dun	161c3e31f7	Optimization of Kernels that related to DeepLabv3+ (#13534 ) * refine reduce by cub * optimize KernelDepthwiseConvFilterGrad * optimize depthwise conv and reduce mean and reduce sum * fix bug: dilation * cuda arch and cuda 8 compatible	7 years ago
Xin Pan	ddd60581b7	clean up channel test=develop	7 years ago
chengduo	6757a31552	[Accelerate] Refine seq_softmax_op (#13421 ) * refine seq_softmax_op * fix seq_softmax * use cub in seq_softmax	7 years ago
tensor-tang	612ba41aee	add simple lstm compute	7 years ago
dzhwinter	c3e1fb5a3e	add demo	7 years ago
dzhwinter	379b471ee2	squash commit	7 years ago
qingqing01	9bd933d3fb	Improve and fix fake_quantize_op (#13092 ) * Improve and fix fake_quantize_op.	7 years ago
dzhwinter	52d60f8f3e	merge conclit	7 years ago
dzhwinter	dbe90cc0f6	merge develop branch	7 years ago
dzhwinter	b74af56bbc	cpu compile is done	7 years ago
fengjiayi	7e0c9f50ae	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op	7 years ago
dzhwinter	cd8f3e9ed0	operator module is done	7 years ago
dzhwinter	0153c21d83	add unstack_op	7 years ago
dzhwinter	7dceb8a080	check some operators	7 years ago
dzhwinter	26dbe35c54	add msvc flags and copy lib done	7 years ago
dzhwinter	eca4563e5d	operators module (#12938 )	7 years ago
dzhwinter	488a2dd2e8	with ir node	7 years ago
dzhwinter	cfbf1ba305	add source	7 years ago
dzhwinter	89f95ea25e	merge develop branch	7 years ago
dzhwinter	34f8c9b6f5	windows port	7 years ago
dzhwinter	e23ddf6ae4	status (#12764 )	7 years ago
fengjiayi	34b209cffa	Complete sequence_padding GPU kernel	7 years ago
dzhwinter	00463fdfe3	cudnn windows support (#12757 ) * cudnn widndows * "add comment" * "windows support" * "fix cmake error"	7 years ago
dzhwinter	5c88cd2af5	remove werror in windows	7 years ago
dzhwinter	64ce1210aa	"windows support"	7 years ago
nhzlx	f55e8901c8	merge develop	7 years ago
nhzlx	1600ba86f6	1. change tensorrt op from cpu to gpu	7 years ago
tensor-tang	eee38464dc	refine fc op use cpu only	7 years ago
tensor-tang	d84a1a0010	fc op use cpu only	7 years ago
tensor-tang	0098a494a2	Merge remote-tracking branch 'ups/develop' into refine/op/fc	7 years ago
tensor-tang	4b5986bb77	enable fc op in normal case	7 years ago
Yu Yang	8dda526a45	Merge pull request #12659 from sneaxiy/refine_softmax_with_cross_entropy Fix 'softmax_with_cross_entropy_op' dependency error	7 years ago
sneaxiy	c50c537732	fix arithmetic error in backward kernel	7 years ago

227 Commits (671555ed322d24a19a1e14e5568dcfb9024c9028)