Paddle

Commit Graph

Author	SHA1	Message	Date
tensor-tang	0043c42b3e	add vrelu jitcode test=develop	7 years ago
JiabinYang	32e05b01f2	test=develop	7 years ago
sneaxiy	d231e55065	merge develop test=develop	7 years ago
JiabinYang	c8801e100f	grad diff problem to be fixed and need api spec change to be done	7 years ago
peizhilin	ca60e1d34d	Merge remote-tracking branch 'upstream/develop' into windows/build	7 years ago
peizhilin	52f7644f53	Merge remote-tracking branch 'upstream/develop' into windows/build	7 years ago
Qiyang Min	698698f2fa	Merge branch 'develop' into fix_vlog	7 years ago
Yu Yang	fdc689142c	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation test=develop	7 years ago
tensor-tang	22125ebaef	Merge pull request #14321 from tensor-tang/fea/jit/vscal Fea jitcode vscal vaddbias	7 years ago
minqiyang	87450b9ad4	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog test=develop	7 years ago
peizhilin	41b423d41b	remove duplicate	7 years ago
peizhilin	dcfab11193	merge from develop	7 years ago
peizhilin	4ffa92d4f0	Merge branch 'develop' into windows/build	7 years ago
chengduo	c5b6573a5a	Fix input<tensor> (#14208 ) * fix input<tensor> test=develop * fix split_ids test=develop * ElementwiseMul should not support SelectedRows * fix scale op test=develop * change GetTensorFromVar() method to GetTensorOrSelectedRowsFromVar() * fix operator * refine MultiOutput * fix MultiOutput test=develop * disable test_dist_save_load test=develop * fix elementwise_op test=develop * add get_sparse_as_op test=develop * add info for check test=develop * rename get_sparse_as_op with extract_rows_as_op. test=develop * elementwise doesn't support selected_rows * fix regularizer * remove extract_rows_as test=develop * fix ci test=develop * add test for sum_op * fix regularizer test=develop * test=develop * fix pserver weight decay multi inputs test=develop	7 years ago
Zhaolong Xing	ba8b5619a3	Revert "cherry picked windows patches."	7 years ago
minqiyang	fcc0452c8b	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog test=develop	7 years ago
minqiyang	0c3227a523	Change the origin VLOG level to 10 times Fix code to support cpplint syntax check test=develop	7 years ago
tensor-tang	5e64244f25	add vaddbias jitcode test=develop	7 years ago
tensor-tang	5f7956ae59	Merge remote-tracking branch 'ups/develop' into fea/jit/vscal	7 years ago
peizhilin	869487a2b7	Merge remote-tracking branch 'origin/develop' into windows/build	7 years ago
tensor-tang	3d950a812d	combine jitcode of vscal	7 years ago
tensor-tang	03e11f3fc9	add vscal jitcode	7 years ago
dzhwinter	234a1d9248	Merge remote-tracking branch 'origin/develop' into windows/debug test=develop	7 years ago
chengduo	a270fdf2db	Fix SelectedRowsAdd bug (#14309 ) * fix selected_rows bug test=develop * refine cos_sim test=develop	7 years ago
tensor-tang	2f0a379af7	Merge pull request #14307 from tensor-tang/fix/mac fix mac	7 years ago
Zeng Jinle	b2af213009	Merge pull request #14292 from sneaxiy/delete_buggy_selected_rows_functor Delete buggy selected_rows functor	7 years ago
tensor-tang	161ba9c9d1	fix mac test=develop	7 years ago
tensor-tang	e8642c3c1f	Merge pull request #14265 from tensor-tang/fea/jit/vadd add vadd, vaddrelu jitcode	7 years ago
tensor-tang	382307b943	refine code test=develop	7 years ago
tensor-tang	3319072858	fix jit kernel test on mac test=develop	7 years ago
Yu Yang	057a682ee9	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation	7 years ago
chengduo	ffc866159f	hot fix log (#14293 ) test=develop	7 years ago
tensor-tang	25e070ecc7	Merge remote-tracking branch 'ups/develop' into fea/jit/vadd	7 years ago
sneaxiy	9518bc8d0a	delete buggy selected_rows functor test=develop	7 years ago
chengduo	a9b5d42dd4	Add fp16 backward support (#14202 ) * add fp16 backward support test=develop * add sum_op fp16 test * disable test_dist_save_load test=develop * add check_grad for sum * add unit test for softmax_grad fp16 test=develop * add scale_op unit test * add mul_grad_op unit test for fp16 * add cross_entropy_grad and eman_grad unit test for fp16 test=develop * fix cross_entropy unit test * add pool2d fp16 unit test * refine conv2d fp16 unit test test=develop * refine activation unit test test=develop * fix ci test=develop * follow zhihong's comment, copy from https://github.com/PaddlePaddle/Paddle/pull/12796 test=develop	7 years ago
dzhwinter	2835e04409	merge develop branch. test=develop	7 years ago
tensor-tang	cb4083b9fa	fix compile error test=develop	7 years ago
tensor-tang	dd343a4971	Merge remote-tracking branch 'ups/develop' into fea/jit/vadd	7 years ago
tensor-tang	b81e1b655e	fix jit on mac test=develop	7 years ago
tensor-tang	b68ececb73	add vaddrelu jitcode test=develop	7 years ago
tensor-tang	bb09e31020	add vadd jitcode test=develop	7 years ago
peizhilin	71d7980f69	fix build issue 1	7 years ago
tensor-tang	8465e7876f	auto grow the size and fix test test=develop	7 years ago
tensor-tang	9255119fd9	refine jit vmul with all size	7 years ago
tensor-tang	a9c1824131	refine jit vmul code supporting multiple of 2	7 years ago
tensor-tang	61fdc38e51	Merge pull request #14206 from tensor-tang/fea/jit/gen Fea/jit/gen	7 years ago
peizhilin	9d67c1fb69	cpu build support	7 years ago
Kaipeng Deng	daed473d4a	Merge pull request #14089 from heavengate/pool_exclude add inclusive/exclusive mode in avg pool	7 years ago
tensor-tang	85bcb286f5	refine vmul jitcode test=develop	7 years ago
tensor-tang	a3377f7b0a	refine jitcode and add vmul jitcode implementation	7 years ago
dzhwinter	1ace55c8ee	merge develop branch	7 years ago
tensor-tang	f3badacd97	Merge remote-tracking branch 'ups/develop' into fea/jit/gen	7 years ago
tensor-tang	a53b1b0b1b	refine and init jitkernel vmul	7 years ago
tensor-tang	2139b9f677	add jit gencode	7 years ago
Tao Luo	cdf2579d08	Merge pull request #14053 from jczaja/prv-seqpool-max Max Sequence pool optimization	7 years ago
dzhwinter	316765839d	add back jit simd instructions. stage.	7 years ago
dzhwinter	bf2e4cb188	cleard. staged	7 years ago
dzhwinter	ebfe5a02b3	merge develop branch	7 years ago
tensor-tang	3c957af139	Merge pull request #14080 from tensor-tang/refine/jit/crf2 Refine/jit/crf decoding	7 years ago
Jacek Czaja	458b16f42a	Rebase of seqpool-max optimization test=develop - Added rough profiling - Profiled maxpool itself - First draft of max seqpool optimization (is_test added) - Added unit tests to seqpool - Cosmetic fixes - Fix to UT of Seq pool Disabled grad checking for sequence max pool when is_test is set to True -Cosmetic fix to comment test=develop - Fix to GPU build test=develop - yet another GPU fix for sequence max pool - Fix to comment test=develop - Change to API of sequence_pool test=develop - Yet another API spec change test=develop	7 years ago
dengkaipeng	8f1e398824	move param exclusive to the last in pool2d/pool3d for forward compatibility:. test=develop	7 years ago
Yu Yang	c01696f8c2	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation test=develop	7 years ago
dengkaipeng	c93e044ae0	add inclusive/exclusive mode in PoolOp avg pool type	7 years ago
Qiao Longfei	96d5500934	optimize code	7 years ago
Qiao Longfei	748ee35c89	sum op handle empty input update selected_rows_functor.cu	7 years ago
Qiao Longfei	dd78b5df93	sum op handle empty input	7 years ago
Qiao Longfei	cbe128bbae	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op	7 years ago
Zeng Jinle	97d47a7d08	Merge pull request #13913 from sneaxiy/seq_reverse Add sequence_reverse_op	7 years ago
tensor-tang	64d5b4385e	fix crf decode avx512	7 years ago
tensor-tang	21487d78bf	add crf decode jit kernel	7 years ago
Qiao Longfei	de539d72da	format test=develop	7 years ago
tensor-tang	a05fce6544	Merge remote-tracking branch 'ups/develop' into fix/jit/avx test=develop	7 years ago
Qiyang Min	d0fdcb2f6d	Merge pull request #14048 from velconia/change_sequence_pool_to_cpu Accelerate Sequence Pool Grad Op	7 years ago
tensor-tang	d24d282a7a	fix avx error test=develop	7 years ago
Qiao Longfei	6253b152e6	Merge branch 'optimize-sum-seq-pooling-op' of https://github.com/jacquesqiao/Paddle into optimize-sum-seq-pooling-op	7 years ago
Qiao Longfei	14f5a40898	fix unit test	7 years ago
minqiyang	e2a348cd10	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into change_sequence_pool_to_cpu	7 years ago
Qiao Longfei	f4e6fe0786	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op	7 years ago
minqiyang	2468057da6	Move code to SumSeqPoolGradFunctor test=develop	7 years ago
minqiyang	9725db0d40	Fix copy wrong pos bug test=develop	7 years ago
minqiyang	9c68709036	Accelerate sequence_pool functor	7 years ago
minqiyang	14ebc424d6	Add gpu support for unittest	7 years ago
minqiyang	bd5a82e193	Polish unit test code	7 years ago
minqiyang	047fa2f9aa	Add unit-test for sequence_pooling functor	7 years ago
sneaxiy	92a2817a2b	test=develop	7 years ago
tensor-tang	032c3a07e3	Merge remote-tracking branch 'ups/develop' into refine/jit/gru test=develop	7 years ago
tensor-tang	159be8cc63	optimize fusion gru kernel at size 8	7 years ago
chengduo	a7497653d0	Refine Split op (#13967 ) * speedup split_op test=develop * speedup split_op test=develop * rename ConcatGrad to Split * refine concat and split test=develop * fix compile error	7 years ago
sneaxiy	a9d7a9d720	test=develop	7 years ago
tensor-tang	640e789d3d	add fusion gru jit kernel	7 years ago
Yu Yang	461f71a90b	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation	7 years ago
tensor-tang	23fc896bc2	Merge remote-tracking branch 'ups/develop' into fea/fusion_seqconv_add test=develop	7 years ago
tensor-tang	e5ce965952	refine and add eltadd_relu unit test	7 years ago
tensor-tang	7cb19a5976	fuse elementwise_add and relu	7 years ago
sneaxiy	ac2eba4457	test=develop	7 years ago
tensor-tang	b139b687de	Merge remote-tracking branch 'ups/develop' into fix/jit/exp test=develop	7 years ago
tensor-tang	748435586a	clean code exp avx	7 years ago
tensor-tang	b4751a34a5	fix illegal instruction of rnn2	7 years ago
tensor-tang	36588b3365	fix illegal instruction of rnn1 and text	7 years ago
tensor-tang	e69328c3bc	fix warning and mac compile test=develop	7 years ago
tensor-tang	6447155dac	Merge pull request #13851 from tensor-tang/fea/jitkernel_peephole Fea jitkernel lstm peephole	7 years ago
sneaxiy	4b4af84e67	test=develop	7 years ago
Qiao Longfei	0225957515	change elementwise_add to elementwise_add_to test=develop	7 years ago
Qiao Longfei	b4a32eafdf	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op test=develop	7 years ago
Zeng Jinle	93606c2c2c	Merge pull request #13689 from sneaxiy/sparse_rmsprop Fix sparse rmsprop	7 years ago
sneaxiy	5cedfb60c8	test=develop	7 years ago
Qiao Longfei	936926aadd	code optimize test=develop	7 years ago
Qiyang Min	cab29828a5	Merge pull request #13829 from velconia/accelerate_sequence_pool_op Accelerate SequencePool Op on SUM mode of CPU	7 years ago
Qiao Longfei	c52ccbc109	clean code	7 years ago
Qiao Longfei	6056d04361	optimize blas call	7 years ago
Qiao Longfei	5db7551317	optimize code	7 years ago
Qiao Longfei	eb6d9e3bbe	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op	7 years ago
Qiao Longfei	0170d36c42	fix a bug	7 years ago
Qiyang Min	e37c9e6732	Merge pull request #13828 from velconia/accelerate_selected_rows_functor Accelerate SelectedRows Functors:	7 years ago
Qiao Longfei	86e2e686ee	fix bug	7 years ago
Qiao Longfei	333fd15204	add gpu test for mrege add	7 years ago
Qiao Longfei	ab3e36da80	update MergeAdd for selected_rows_functor.cu	7 years ago
Qiao Longfei	d5c64af24f	change map to unordered_map	7 years ago
Qiao Longfei	005f1923a2	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op	7 years ago
tensor-tang	bcb8ea397d	Merge remote-tracking branch 'ups/develop' into fea/jitkernel_peephole test=develop	7 years ago
tensor-tang	8e182170ba	refine and replace lstm peephole kernel	7 years ago
Dun	5f2e837847	optimize depthwise conv by register memory (#13778 ) * optimize depthwise conv by register memory * test=develop	7 years ago
minqiyang	3f6ec90060	Polish code test=develop	7 years ago
tensor-tang	7ef2699e18	init peephole runtime kernel	7 years ago
minqiyang	0385b0a1ea	Accelerate SequencePool Op on SUM mode test=develop	7 years ago
minqiyang	8ec748cfa0	Accelerate SelectedRows Functors: 1. Accelerate SelectedRows MergeAdd functor 2. Add SelectedRowsSumTo functor to support MergeAdd multiple SelectedRows into one test=develop	7 years ago
Qiao Longfei	38568519f7	optimize code	7 years ago
tensor-tang	3ee8f2c6cf	thread local jit kernels test=develop	7 years ago
tensor-tang	9131a35676	replace the lstm compute with jitkernel test=develop	7 years ago
tensor-tang	b55c247678	add lstm compute unit test	7 years ago
sneaxiy	4c672ab1a2	Merge reyoung:rewrite_allocation	7 years ago
tensor-tang	2a00969165	optimize lstm jitkernel keq8 test=develop	7 years ago
tensor-tang	f2adaf1c3e	add vrelu and lstm kernel test=develop	7 years ago
tensor-tang	e6d8aca3bf	refine code and fix	7 years ago
qiaolongfei	1a59880084	update test_sum_op	7 years ago
qiaolongfei	40d3bd4e81	selected rows merge add support multi input	7 years ago
tensor-tang	ea7dc9cbf6	Merge remote-tracking branch 'ups/develop' into fea/jitkernel test=develop	7 years ago
tensor-tang	2513b2cc4e	fix bug vtanh	7 years ago
tensor-tang	cf8c8e72bd	add vtanh and unit test	7 years ago
tensor-tang	b37fe30417	Merge pull request #13690 from wangguibao/fix_cpu_lstm_compute_cc Avoid multiple definitions of lstm_compute_ctht when linking libpaddle_fluid.so	7 years ago
dzhwinter	26771f41ba	"fix compile error" (#13579 ) * "fix compile error" * "fix ci" * rerun ci test=develop * test=develop rerun ci	7 years ago
tensor-tang	d10a9df7b8	add vaddbias and unit test	7 years ago
tensor-tang	3c8b651187	add vsigmoid avx implementations and unit test	7 years ago
tensor-tang	55e44761fb	refine code and init vsigmoid	7 years ago
wangguibao	1940bc2d83	Avoid multiple definitions of lstm_compute_ctht when linking libpaddle_fluid.so test=develop	7 years ago
sneaxiy	584c3f048f	fix sparse rmsprop	7 years ago
Yu Yang	8e3fdc6e65	Fix SetDevice on init	7 years ago
Yu Yang	524f6e9b36	Refine code	7 years ago
Dun	161c3e31f7	Optimization of Kernels that related to DeepLabv3+ (#13534 ) * refine reduce by cub * optimize KernelDepthwiseConvFilterGrad * optimize depthwise conv and reduce mean and reduce sum * fix bug: dilation * cuda arch and cuda 8 compatible	7 years ago
tensor-tang	2d0ff6a3c2	add vexp and unit test	7 years ago
tensor-tang	b3c63f40fa	add vscal and unit test	7 years ago
tensor-tang	0987f2b4d9	add vadd unit test	7 years ago
tensor-tang	3d928d4f9d	refine and seepdup	7 years ago
tensor-tang	77fc42d2d1	Merge remote-tracking branch 'ups/develop' into fea/jitkernel	7 years ago
tensor-tang	2937314d8e	refine vmul and test	7 years ago
tensor-tang	6c986e127a	fix macro and add vmul unit test	7 years ago
Yu Yang	0be1582df0	Merge pull request #13525 from reyoung/fix_mixed_vector Fix mixed vector	7 years ago
tensor-tang	8c69764d12	add vmul unit tests	7 years ago
tensor-tang	084893a9a9	add vadd kernel	7 years ago
tensor-tang	eeff268a6c	clean and refine kernels	7 years ago
tensor-tang	dee5d35c20	refine vmul	7 years ago
tensor-tang	92031968d7	init vmul kernel	7 years ago
tensor-tang	b9acbcc8c5	init lstm kernel	7 years ago
tensor-tang	c260bf942d	init jit kernel	7 years ago
Yu Yang	3043f51b3a	Merge pull request #13511 from reyoung/fix_ce Revert "Merge pull request #13431 from chengduoZH/refine_lod"	7 years ago
Yu Yang	f7af695801	Merge pull request #13505 from reyoung/fix_selected_rows_functor_test Fix unstable selected_rows_functor_test.cu	7 years ago
Yu Yang	6d2c6f96f1	Revert "Revert "Merge pull request #13431 from chengduoZH/refine_lod"" This reverts commit `a6c8d6b9a2`.	7 years ago
Yu Yang	a6c8d6b9a2	Revert "Merge pull request #13431 from chengduoZH/refine_lod" This reverts commit `bd79e04667`, reversing changes made to `6b4d290c18`.	7 years ago
Zeng Jinle	7f1e312677	Merge pull request #13456 from sneaxiy/refine_sparse_adam Fix sparse Adam and Gradient clip of SelectedRows	7 years ago
Yu Yang	b5996fa124	Fix unstable selected_rows_functor_test.cu	7 years ago
sneaxiy	a29b4227eb	fix sparse gradient clip	7 years ago
Yihua Xu	87086b1386	Refine activation for GRU operator (#13275 ) * Optimize GRU with AVX instruction * Clean code * Add the Unitest and fix the align issue * Remove the remanent part of the unitest part * Code clean * Fix the parameters length issue for fusion_gru to pass CI * Change the default type as float32	7 years ago
chengduo	d402234ba8	Feature/op_fuse_pass (#12440 ) * Add Preface * Add demo code * Save file * Refine code * seems can work * use elementwise strategy * Use ElementwiseComputeEx * Add comments * extract functions from operator * Refine code * Follow comment * code refine * add op_fuse pass * add backward * code refine * use TopologySortOperations * follow comments * refine IsFusible * code enhance * fix op_fusion_pass * refine code * refine fuse_elemwise_act_op * adjust the input and output * refine logic * add intermediate_edge * disable inplace * follow comments * refine logic * follow comments * Remove the removable IntermediateOut * change strategy * code refine * enable fuse backward * code refine * code refine * rename unit test * follow comments	7 years ago
Yu Yang	2c31ea9293	Merge pull request #13424 from chengduoZH/refine_seq_concat Refine seq_concat	7 years ago
Yu Yang	5996e224fa	Merge pull request #13430 from chengduoZH/refine_seq_pool Refine seq_pool	7 years ago
sneaxiy	b6f61faf13	fix adam	7 years ago
chengduoZH	6534f8527a	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_lod	7 years ago
chengduoZH	24459501fe	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_seq_concat	7 years ago
chengduoZH	f92b07f0b5	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_seq_pool	7 years ago
gongweibao	0c8c0d943f	fix macunittest (#13434 )	7 years ago
chengduoZH	cdb9605bad	refine	7 years ago
chengduoZH	cacf549e8a	refine seq_pool	7 years ago
chengduoZH	e7940141ce	refine seq_concat	7 years ago
tensor-tang	7c8730824a	Merge pull request #13396 from tensor-tang/refine/op/lstm Refine/op/lstm	7 years ago
Tao Luo	40c54db301	Merge pull request #13338 from bingyanghuang/bingyang/seq_pool_memcpy Use memcpy to rewrite the sequence pooling LAST and FIRST mode	7 years ago
tensor-tang	e09cf031a8	refine src and header	7 years ago
bingyanghuang	76553c5a6d	fix travis-ci	7 years ago
tensor-tang	bc9971dd6c	fix deps	7 years ago
tensor-tang	ff858d35ed	fix bug and enable on batch mode as well	7 years ago
tensor-tang	8dea07f209	fix comopile	7 years ago
tensor-tang	612ba41aee	add simple lstm compute	7 years ago
bingyanghuang	83394bab3e	modified by luotao's suggestion	7 years ago
Bai Yifan	faf8ad2436	Add ignore_index in cross_entropy op (#13217 ) * add ignore index * update api.spec * enhance softmax_with_cross_entropy	7 years ago
bingyanghuang	1454cd54aa	pre-commit check	7 years ago
bingyanghuang	7429067ab3	clean code	7 years ago
bingyanghuang	cdbc5e7353	Add some comments	7 years ago
bingyanghuang	53185fde11	Rewrite sequence pooling last and first mode with memcpy and clean code	7 years ago
dzhwinter	379b471ee2	squash commit	7 years ago
dzhwinter	f05520060e	fix style (#13142 )	7 years ago
tensor-tang	f38905a6e5	Merge remote-tracking branch 'ups/develop' into optimize/op/fusion_gru	7 years ago
dzhwinter	34757efb8e	fix windows compile	7 years ago
dzhwinter	dbe90cc0f6	merge develop branch	7 years ago
dzhwinter	ab1097cd8e	Feature/template (#13093 ) * remove template operator * "fix compile" * "fix ci" * "fix ci"	7 years ago
tensor-tang	7bdd11d88e	Merge branch 'develop' into optimize/op/fusion_gru	7 years ago
tensor-tang	b0d36c4c3d	add cross vec to speedup gru	7 years ago
chengduo	3bd1d22a7d	Enhance fused_elementwise_activation_op (#12837 ) * Enhance the function of fused_elementwise_activation_op * enhance unit test * Clean Code And Add Doc * Add compound functors * Fix doc and enhance unit test * define Dx and Dy for d_binary_func * add mul_scale * add mul_scale * add elementwise_mul * code refine * code refine * add doc * add AsIntermediate	7 years ago
tensor-tang	2d0ddf8c41	refine cpu gru batch mode	7 years ago
tensor-tang	70d3981220	add cpu vec bias sub	7 years ago
tensor-tang	d941192e74	fix gcc53 on cpu vec (#13020 )	7 years ago
tensor-tang	2328a69157	Merge pull request #13012 from tensor-tang/refine/seq2batch refine seq2batch	7 years ago
tensor-tang	fd4f7c3ab5	refine seq2batch	7 years ago
fengjiayi	7e0c9f50ae	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op	7 years ago
fengjiayi	9cb455fa7d	update function	7 years ago
Zeng Jinle	ef7bd03a03	Merge pull request #12964 from sneaxiy/fix_concat_sync Fix concat bug	7 years ago
qingqing01	1f09bc320c	Support data type int8_t . (#12841 ) * Support int8 type.	7 years ago
dzhwinter	cd8f3e9ed0	operator module is done	7 years ago
chengduo	3e1050a2e8	Add pad_constant_like_op (#12943 ) * Add pad_constant_batch_size_like * refine pad_op * optimize memory	7 years ago
dzhwinter	6cc7870517	fix concat synchronization bug	7 years ago
dzhwinter	2ec589a24e	float.h fixed	7 years ago
dzhwinter	7dceb8a080	check some operators	7 years ago
dzhwinter	26dbe35c54	add msvc flags and copy lib done	7 years ago
Qiao Longfei	3c58b87b45	fix auc layer and add check for auc op (#12954 ) * fix auc layer and add check for auc op * use input to check if states are inited * optimize code	7 years ago
dzhwinter	d7f98f37a7	more platform is done	7 years ago
dzhwinter	eca4563e5d	operators module (#12938 )	7 years ago
dzhwinter	a94d4f51a8	fix math_function compile	7 years ago
tensor-tang	7bdaf09664	Merge remote-tracking branch 'ups/develop' into refine/jit	7 years ago
tensor-tang	3462c29940	refine add bias with avx	7 years ago
dzhwinter	c1ad52f768	pre-commit	7 years ago
dzhwinter	89f95ea25e	merge develop branch	7 years ago
tensor-tang	bb9f98e10d	add inplace test	7 years ago
tensor-tang	f269614bcd	further optimize tanh with avx and mkl	7 years ago
luotao1	2b4edacca0	enhance the forward of concat op	7 years ago
dzhwinter	34f8c9b6f5	windows port	7 years ago
tensor-tang	7a4924cd44	further optimize sigmoid with avx and avx512	7 years ago
tensor-tang	6bd89ba5b6	fix typo	7 years ago
tensor-tang	e3bb98eb38	optimize relu with avx and avx512	7 years ago
tensor-tang	25976fe736	optimize the sigmoid and tanh	7 years ago
tensor-tang	2eb46c2b06	add cpu vec test	7 years ago
tensor-tang	f0f06992c1	Merge pull request #12878 from tensor-tang/feature/op/attention_lstm Add attention lstm cpu forward	7 years ago
fengjiayi	f4a4a4cbd9	add op comment and python layer	7 years ago
tensor-tang	5ca0bb9aad	support more activation type and remove some comments	7 years ago
tensor-tang	ec59f0d454	add cpu vec	7 years ago
tensor-tang	cf5ea925c3	fix bugs	7 years ago
tensor-tang	3dd66390b2	add blas vexp	7 years ago
tensor-tang	0ec1f65cf1	fix blas dot and add cblas scal	7 years ago
tensor-tang	a2203d0466	add cblas dot	7 years ago
tensor-tang	f72ab8961e	refine blas gemm	7 years ago
Yu Yang	3768677980	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/process_lod_grad	7 years ago
Yu Yang	2a36ad1a96	Handle LoD for concat & seq_softmax ops	7 years ago
fengjiayi	ce182d9037	bug fix	7 years ago
Tao Luo	d04ef276a5	Merge pull request #12745 from tensor-tang/refine/op/elewise_mul Refine elementwise mul cpu forward	7 years ago
fengjiayi	34b209cffa	Complete sequence_padding GPU kernel	7 years ago
tensor-tang	b090479409	Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm	7 years ago
fengjiayi	8d8d48a34f	Complete sequence_pad_op and its CPU kernel. Add unittests	7 years ago
dzhwinter	4069262f0e	Revert ""cherry picked operators changes" (#12184 )" (#12747 ) This reverts commit `bf3c34960f`.	7 years ago
fengjiayi	3c749fae43	update CPU sequence_padding functor	7 years ago
tensor-tang	92890ac258	Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm	7 years ago
tensor-tang	6644ce79a5	add mklml vmul	7 years ago
tensor-tang	ff92b6ba81	Merge pull request #12531 from tensor-tang/refine/op/gru Refine gru cpu forward	7 years ago
tensor-tang	a72f68f223	Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm	7 years ago
tensor-tang	f3cd2612ae	refine fc and use the fc compute in fusion_lstm	7 years ago
dzhwinter	bf3c34960f	"cherry picked operators changes" (#12184 ) * "cherry picked operators changes" * "remove duplicated code" * "add constant setter" * "add get expected kernel" * "fix ci" * "add fill constant"	7 years ago
fengjiayi	a38a8db928	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op	7 years ago
tensor-tang	3bf3e77ac8	Merge remote-tracking branch 'ups/develop' into refine/op/gru	7 years ago
chengduo	7c8b69c700	Feature/op fusion (#12240 ) * Add Preface * Add demo code * Save file * Refine code * seems can work * use elementwise strategy * Use ElementwiseComputeEx * Add comments * extract functions from operator * Refine code * Follow comment * code refine * follow comments * follow comments	7 years ago
tensor-tang	54c95e49f0	fix blas	7 years ago
tensor-tang	8c23f7c4f0	fix blas and use packed weight	7 years ago
tensor-tang	43cee33a23	add mkl packed gemm	7 years ago
tensor-tang	d8d2dbcfac	further optimize im2col using variables	7 years ago
tensor-tang	687a322267	Merge remote-tracking branch 'ups/develop' into refine/im2col	7 years ago
tensor-tang	65d418f060	complete im2col with padding==1 and speedup filter width==1	7 years ago
tensor-tang	52eb86e30f	refine im2col benchmark	7 years ago
tensor-tang	3017f46076	add more test cases	7 years ago
tensor-tang	8d6be4fb5f	refine im2col test and add benchmark	7 years ago
tensor-tang	507c143047	im2col cfo cpu code clean	7 years ago
tensor-tang	4eeed0b5e4	refine width padding and enable core copy	7 years ago
Wu Yi	73fcfc06ec	refine conv cudnn enforce (#12353 ) * refine conv cudnn enforce * update * update all cudnn ops * fix	7 years ago
tensor-tang	e3131e2d73	enable width padding	7 years ago
tensor-tang	92518c519f	reuse sizes saving time	7 years ago
tensor-tang	660df122ce	enable padding!=0 and fill height padding with 0	7 years ago
tensor-tang	d8e00facf7	reuse im_size	7 years ago
tensor-tang	b72befc5cc	reuse copy size	7 years ago
tensor-tang	6788af4bf1	refine test cases	7 years ago
tensor-tang	b163e601b6	add gtest	7 years ago
tensor-tang	aae994fd26	refine im2col no padding	7 years ago
Yan Chunwei	02cf54d331	bugfix lod cpu performance (#12297 )	7 years ago
tensor-tang	fc2b578842	add gemm_warp test	7 years ago
tensor-tang	a916c52579	refine gemm	7 years ago
tensor-tang	961e754c9f	mkl split gemm for better perf	7 years ago
tensor-tang	f0cd493c0d	Merge pull request #11989 from tensor-tang/feature/libxsmm introduce libxsmm	7 years ago
Guo Sheng	da3f766821	Merge pull request #12088 from guoshengCS/complete-hsigmoid Complete hsigmoid_op	7 years ago
guosheng	4ee069fdba	Fix the HierarchicalSigmoidGradOpKernel and refine the codes. Now hsigmoid_op is same with V2 implementation and can pass gradient check.	7 years ago
tensor-tang	1c5d6c5692	disable xsmm with float16	7 years ago
tensor-tang	c9ba51ead8	Merge remote-tracking branch 'ups/develop' into feature/libxsmm	7 years ago
tensor-tang	64a8e6d20e	refine the threshold functions	7 years ago
lemon34	29145e1e31	change im2sequence for ctc batch inference (#11696 ) * change im2sequence for ctc batch inference * Update im2sequence_op.cc * change im2sequence for ctc batch inference * update * change PR by comment * fix ocr test error * fix test_im2sequence * modify the old name to standard name * fix test_layers failed	7 years ago
guosheng	e7a4cfc0ff	complete the hsigmoid_op	7 years ago
guosheng	d695381677	Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into complete-hsigmoid	7 years ago
tensor-tang	6bc1aaaac7	refine the ColMajor replacement	7 years ago
tensor-tang	de856da9a6	fix ColMajor and RowMajor replacement	7 years ago

673 Commits (b34933d9ee3b61dbbd642fd02f244c36d0d14550)