Paddle

Commit Graph

Author	SHA1	Message	Date
tensor-tang	36588b3365	fix illegal instruction of rnn1 and text	7 years ago
tensor-tang	e69328c3bc	fix warning and mac compile test=develop	7 years ago
tensor-tang	6447155dac	Merge pull request #13851 from tensor-tang/fea/jitkernel_peephole Fea jitkernel lstm peephole	7 years ago
sneaxiy	4b4af84e67	test=develop	7 years ago
Qiao Longfei	0225957515	change elementwise_add to elementwise_add_to test=develop	7 years ago
Qiao Longfei	b4a32eafdf	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op test=develop	7 years ago
Zeng Jinle	93606c2c2c	Merge pull request #13689 from sneaxiy/sparse_rmsprop Fix sparse rmsprop	7 years ago
sneaxiy	5cedfb60c8	test=develop	7 years ago
Qiao Longfei	936926aadd	code optimize test=develop	7 years ago
Qiyang Min	cab29828a5	Merge pull request #13829 from velconia/accelerate_sequence_pool_op Accelerate SequencePool Op on SUM mode of CPU	7 years ago
Qiao Longfei	c52ccbc109	clean code	7 years ago
Qiao Longfei	6056d04361	optimize blas call	7 years ago
Qiao Longfei	5db7551317	optimize code	7 years ago
Qiao Longfei	eb6d9e3bbe	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op	7 years ago
Qiao Longfei	0170d36c42	fix a bug	7 years ago
Qiyang Min	e37c9e6732	Merge pull request #13828 from velconia/accelerate_selected_rows_functor Accelerate SelectedRows Functors:	7 years ago
Qiao Longfei	86e2e686ee	fix bug	7 years ago
Qiao Longfei	333fd15204	add gpu test for mrege add	7 years ago
Qiao Longfei	ab3e36da80	update MergeAdd for selected_rows_functor.cu	7 years ago
Qiao Longfei	d5c64af24f	change map to unordered_map	7 years ago
Qiao Longfei	005f1923a2	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op	7 years ago
tensor-tang	bcb8ea397d	Merge remote-tracking branch 'ups/develop' into fea/jitkernel_peephole test=develop	7 years ago
tensor-tang	8e182170ba	refine and replace lstm peephole kernel	7 years ago
Dun	5f2e837847	optimize depthwise conv by register memory (#13778 ) * optimize depthwise conv by register memory * test=develop	7 years ago
minqiyang	3f6ec90060	Polish code test=develop	7 years ago
tensor-tang	7ef2699e18	init peephole runtime kernel	7 years ago
minqiyang	0385b0a1ea	Accelerate SequencePool Op on SUM mode test=develop	7 years ago
minqiyang	8ec748cfa0	Accelerate SelectedRows Functors: 1. Accelerate SelectedRows MergeAdd functor 2. Add SelectedRowsSumTo functor to support MergeAdd multiple SelectedRows into one test=develop	7 years ago
Qiao Longfei	38568519f7	optimize code	7 years ago
tensor-tang	3ee8f2c6cf	thread local jit kernels test=develop	7 years ago
tensor-tang	9131a35676	replace the lstm compute with jitkernel test=develop	7 years ago
tensor-tang	b55c247678	add lstm compute unit test	7 years ago
tensor-tang	2a00969165	optimize lstm jitkernel keq8 test=develop	7 years ago
tensor-tang	f2adaf1c3e	add vrelu and lstm kernel test=develop	7 years ago
tensor-tang	e6d8aca3bf	refine code and fix	7 years ago
qiaolongfei	1a59880084	update test_sum_op	7 years ago
qiaolongfei	40d3bd4e81	selected rows merge add support multi input	7 years ago
tensor-tang	ea7dc9cbf6	Merge remote-tracking branch 'ups/develop' into fea/jitkernel test=develop	7 years ago
tensor-tang	2513b2cc4e	fix bug vtanh	7 years ago
tensor-tang	cf8c8e72bd	add vtanh and unit test	7 years ago
tensor-tang	b37fe30417	Merge pull request #13690 from wangguibao/fix_cpu_lstm_compute_cc Avoid multiple definitions of lstm_compute_ctht when linking libpaddle_fluid.so	7 years ago
dzhwinter	26771f41ba	"fix compile error" (#13579 ) * "fix compile error" * "fix ci" * rerun ci test=develop * test=develop rerun ci	7 years ago
tensor-tang	d10a9df7b8	add vaddbias and unit test	7 years ago
tensor-tang	3c8b651187	add vsigmoid avx implementations and unit test	7 years ago
tensor-tang	55e44761fb	refine code and init vsigmoid	7 years ago
wangguibao	1940bc2d83	Avoid multiple definitions of lstm_compute_ctht when linking libpaddle_fluid.so test=develop	7 years ago
sneaxiy	584c3f048f	fix sparse rmsprop	7 years ago
Dun	161c3e31f7	Optimization of Kernels that related to DeepLabv3+ (#13534 ) * refine reduce by cub * optimize KernelDepthwiseConvFilterGrad * optimize depthwise conv and reduce mean and reduce sum * fix bug: dilation * cuda arch and cuda 8 compatible	7 years ago
tensor-tang	2d0ff6a3c2	add vexp and unit test	7 years ago
tensor-tang	b3c63f40fa	add vscal and unit test	7 years ago
tensor-tang	0987f2b4d9	add vadd unit test	7 years ago
tensor-tang	3d928d4f9d	refine and seepdup	7 years ago
tensor-tang	77fc42d2d1	Merge remote-tracking branch 'ups/develop' into fea/jitkernel	7 years ago
tensor-tang	2937314d8e	refine vmul and test	7 years ago
tensor-tang	6c986e127a	fix macro and add vmul unit test	7 years ago
Yu Yang	0be1582df0	Merge pull request #13525 from reyoung/fix_mixed_vector Fix mixed vector	7 years ago
tensor-tang	8c69764d12	add vmul unit tests	7 years ago
tensor-tang	084893a9a9	add vadd kernel	7 years ago
tensor-tang	eeff268a6c	clean and refine kernels	7 years ago
tensor-tang	dee5d35c20	refine vmul	7 years ago
tensor-tang	92031968d7	init vmul kernel	7 years ago
tensor-tang	b9acbcc8c5	init lstm kernel	7 years ago
tensor-tang	c260bf942d	init jit kernel	7 years ago
Yu Yang	3043f51b3a	Merge pull request #13511 from reyoung/fix_ce Revert "Merge pull request #13431 from chengduoZH/refine_lod"	7 years ago
Yu Yang	f7af695801	Merge pull request #13505 from reyoung/fix_selected_rows_functor_test Fix unstable selected_rows_functor_test.cu	7 years ago
Yu Yang	6d2c6f96f1	Revert "Revert "Merge pull request #13431 from chengduoZH/refine_lod"" This reverts commit `a6c8d6b9a2`.	7 years ago
Yu Yang	a6c8d6b9a2	Revert "Merge pull request #13431 from chengduoZH/refine_lod" This reverts commit `bd79e04667`, reversing changes made to `6b4d290c18`.	7 years ago
Zeng Jinle	7f1e312677	Merge pull request #13456 from sneaxiy/refine_sparse_adam Fix sparse Adam and Gradient clip of SelectedRows	7 years ago
Yu Yang	b5996fa124	Fix unstable selected_rows_functor_test.cu	7 years ago
sneaxiy	a29b4227eb	fix sparse gradient clip	7 years ago
Yihua Xu	87086b1386	Refine activation for GRU operator (#13275 ) * Optimize GRU with AVX instruction * Clean code * Add the Unitest and fix the align issue * Remove the remanent part of the unitest part * Code clean * Fix the parameters length issue for fusion_gru to pass CI * Change the default type as float32	7 years ago
chengduo	d402234ba8	Feature/op_fuse_pass (#12440 ) * Add Preface * Add demo code * Save file * Refine code * seems can work * use elementwise strategy * Use ElementwiseComputeEx * Add comments * extract functions from operator * Refine code * Follow comment * code refine * add op_fuse pass * add backward * code refine * use TopologySortOperations * follow comments * refine IsFusible * code enhance * fix op_fusion_pass * refine code * refine fuse_elemwise_act_op * adjust the input and output * refine logic * add intermediate_edge * disable inplace * follow comments * refine logic * follow comments * Remove the removable IntermediateOut * change strategy * code refine * enable fuse backward * code refine * code refine * rename unit test * follow comments	7 years ago
Yu Yang	2c31ea9293	Merge pull request #13424 from chengduoZH/refine_seq_concat Refine seq_concat	7 years ago
Yu Yang	5996e224fa	Merge pull request #13430 from chengduoZH/refine_seq_pool Refine seq_pool	7 years ago
sneaxiy	b6f61faf13	fix adam	7 years ago
chengduoZH	6534f8527a	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_lod	7 years ago
chengduoZH	24459501fe	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_seq_concat	7 years ago
chengduoZH	f92b07f0b5	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_seq_pool	7 years ago
gongweibao	0c8c0d943f	fix macunittest (#13434 )	7 years ago
chengduoZH	cdb9605bad	refine	7 years ago
chengduoZH	cacf549e8a	refine seq_pool	7 years ago
chengduoZH	e7940141ce	refine seq_concat	7 years ago
tensor-tang	7c8730824a	Merge pull request #13396 from tensor-tang/refine/op/lstm Refine/op/lstm	7 years ago
Tao Luo	40c54db301	Merge pull request #13338 from bingyanghuang/bingyang/seq_pool_memcpy Use memcpy to rewrite the sequence pooling LAST and FIRST mode	7 years ago
tensor-tang	e09cf031a8	refine src and header	7 years ago
bingyanghuang	76553c5a6d	fix travis-ci	7 years ago
tensor-tang	bc9971dd6c	fix deps	7 years ago
tensor-tang	ff858d35ed	fix bug and enable on batch mode as well	7 years ago
tensor-tang	8dea07f209	fix comopile	7 years ago
tensor-tang	612ba41aee	add simple lstm compute	7 years ago
bingyanghuang	83394bab3e	modified by luotao's suggestion	7 years ago
Bai Yifan	faf8ad2436	Add ignore_index in cross_entropy op (#13217 ) * add ignore index * update api.spec * enhance softmax_with_cross_entropy	7 years ago
bingyanghuang	1454cd54aa	pre-commit check	7 years ago
bingyanghuang	7429067ab3	clean code	7 years ago
bingyanghuang	cdbc5e7353	Add some comments	7 years ago
bingyanghuang	53185fde11	Rewrite sequence pooling last and first mode with memcpy and clean code	7 years ago
dzhwinter	379b471ee2	squash commit	7 years ago
dzhwinter	f05520060e	fix style (#13142 )	7 years ago
tensor-tang	f38905a6e5	Merge remote-tracking branch 'ups/develop' into optimize/op/fusion_gru	7 years ago
dzhwinter	34757efb8e	fix windows compile	7 years ago
dzhwinter	dbe90cc0f6	merge develop branch	7 years ago
dzhwinter	ab1097cd8e	Feature/template (#13093 ) * remove template operator * "fix compile" * "fix ci" * "fix ci"	7 years ago
tensor-tang	7bdd11d88e	Merge branch 'develop' into optimize/op/fusion_gru	7 years ago
tensor-tang	b0d36c4c3d	add cross vec to speedup gru	7 years ago
chengduo	3bd1d22a7d	Enhance fused_elementwise_activation_op (#12837 ) * Enhance the function of fused_elementwise_activation_op * enhance unit test * Clean Code And Add Doc * Add compound functors * Fix doc and enhance unit test * define Dx and Dy for d_binary_func * add mul_scale * add mul_scale * add elementwise_mul * code refine * code refine * add doc * add AsIntermediate	7 years ago
tensor-tang	2d0ddf8c41	refine cpu gru batch mode	7 years ago
tensor-tang	70d3981220	add cpu vec bias sub	7 years ago
tensor-tang	d941192e74	fix gcc53 on cpu vec (#13020 )	7 years ago
tensor-tang	2328a69157	Merge pull request #13012 from tensor-tang/refine/seq2batch refine seq2batch	7 years ago
tensor-tang	fd4f7c3ab5	refine seq2batch	7 years ago
fengjiayi	7e0c9f50ae	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op	7 years ago
fengjiayi	9cb455fa7d	update function	7 years ago
Zeng Jinle	ef7bd03a03	Merge pull request #12964 from sneaxiy/fix_concat_sync Fix concat bug	7 years ago
qingqing01	1f09bc320c	Support data type int8_t . (#12841 ) * Support int8 type.	7 years ago
dzhwinter	cd8f3e9ed0	operator module is done	7 years ago
chengduo	3e1050a2e8	Add pad_constant_like_op (#12943 ) * Add pad_constant_batch_size_like * refine pad_op * optimize memory	7 years ago
dzhwinter	6cc7870517	fix concat synchronization bug	7 years ago
dzhwinter	2ec589a24e	float.h fixed	7 years ago
dzhwinter	7dceb8a080	check some operators	7 years ago
dzhwinter	26dbe35c54	add msvc flags and copy lib done	7 years ago
Qiao Longfei	3c58b87b45	fix auc layer and add check for auc op (#12954 ) * fix auc layer and add check for auc op * use input to check if states are inited * optimize code	7 years ago
dzhwinter	d7f98f37a7	more platform is done	7 years ago
dzhwinter	eca4563e5d	operators module (#12938 )	7 years ago
dzhwinter	a94d4f51a8	fix math_function compile	7 years ago
tensor-tang	7bdaf09664	Merge remote-tracking branch 'ups/develop' into refine/jit	7 years ago
tensor-tang	3462c29940	refine add bias with avx	7 years ago
dzhwinter	c1ad52f768	pre-commit	7 years ago
dzhwinter	89f95ea25e	merge develop branch	7 years ago
tensor-tang	bb9f98e10d	add inplace test	7 years ago
tensor-tang	f269614bcd	further optimize tanh with avx and mkl	7 years ago
luotao1	2b4edacca0	enhance the forward of concat op	7 years ago
dzhwinter	34f8c9b6f5	windows port	7 years ago
tensor-tang	7a4924cd44	further optimize sigmoid with avx and avx512	7 years ago
tensor-tang	6bd89ba5b6	fix typo	7 years ago
tensor-tang	e3bb98eb38	optimize relu with avx and avx512	7 years ago
tensor-tang	25976fe736	optimize the sigmoid and tanh	7 years ago
tensor-tang	2eb46c2b06	add cpu vec test	7 years ago
tensor-tang	f0f06992c1	Merge pull request #12878 from tensor-tang/feature/op/attention_lstm Add attention lstm cpu forward	7 years ago
fengjiayi	f4a4a4cbd9	add op comment and python layer	7 years ago
tensor-tang	5ca0bb9aad	support more activation type and remove some comments	7 years ago
tensor-tang	ec59f0d454	add cpu vec	7 years ago
tensor-tang	cf5ea925c3	fix bugs	7 years ago
tensor-tang	3dd66390b2	add blas vexp	7 years ago
tensor-tang	0ec1f65cf1	fix blas dot and add cblas scal	7 years ago
tensor-tang	a2203d0466	add cblas dot	7 years ago
tensor-tang	f72ab8961e	refine blas gemm	7 years ago
Yu Yang	3768677980	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/process_lod_grad	7 years ago
Yu Yang	2a36ad1a96	Handle LoD for concat & seq_softmax ops	7 years ago
fengjiayi	ce182d9037	bug fix	7 years ago
Tao Luo	d04ef276a5	Merge pull request #12745 from tensor-tang/refine/op/elewise_mul Refine elementwise mul cpu forward	7 years ago

422 Commits (e878a8e885ecc6be6b151dbf2f26fadf01abe6da)