Paddle

Commit Graph

Author	SHA1	Message	Date
guochaorong	76e9227467	Merge pull request #13199 from JiayiFeng/fix_CudnnHolder_bug Fix cudnn holder bug	7 years ago
Krzysztof Binias	1658958fe6	Reusing converted weights	7 years ago
Yan Xu	d117bbc313	Merge pull request #13291 from Yancey1989/reset_vars_on_pserver reset received vars on pserver	7 years ago
qingqing01	a39eba77eb	Implement norm_op by CUDA instead of Eigen. (#13273 ) * Implement norm_op by CUDA instead of Eigen. * Remove the commented code.	7 years ago
Yancey1989	32b94a7d13	cache var types	7 years ago
Yancey1989	580f55fa0f	update by comment	7 years ago
Yang Yu	8331e835a8	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug	7 years ago
Yancey1989	6edfae4234	reset received vars on pserver	7 years ago
tensor-tang	40dbd97f8e	Merge remote-tracking branch 'ups/develop' into refine/op/peephole	7 years ago
Qiyang Min	b805751598	Merge pull request #13223 from velconia/open_python35_CI Open python35 ci	7 years ago
Yu Yang	34e467dcab	Merge pull request #13232 from reyoung/feature/fix_layer_norm Use double to reduce	7 years ago
chengduo	886852557f	Refine reshape_grad and transpose_grad (#13074 ) * Add intermediate * fix flatten/squeeze/unsqueeze * Considering compatibility issues, we could not fix the origin op * follow comment * reset the shape of XShape	7 years ago
tensor-tang	3eb55f0643	Merge remote-tracking branch 'ups/develop' into refine/op/peephole	7 years ago
tensor-tang	d7ac1cc836	refine seq when bs is large	7 years ago
tensor-tang	9dd5a177a5	refine batch mode and peephole	7 years ago
Qiao Longfei	6e03f7900f	Add centered mode rmsprop (#13161 ) * rmsprop optimizer support v1 mode * typo * optimize code * refine code * optimize unit test * update test_rmsprop_op.py * update formula of rmsprop * optimize document * update API.spec for RMSPropOptimizer * add default value to check_output_with_place equal_nan	7 years ago
Yan Chunwei	9df2d8b5ba	test/add text-classification test (#13081 )	7 years ago
tensor-tang	f10710b0ca	move seq peephole if out of loop	7 years ago
tensor-tang	2f3b498949	refine fusion seq lstm peephole	7 years ago
tangwei12	d1e2efae6b	reimplement auc in fluid (#13167 ) * reimplement auc in pyton * reimplement auc in fluid * add auc unittest * replace new auc in layers * add batch Auc in Fluid * name formated	7 years ago
Yu Yang	f57d706aa7	Use double to reduce	7 years ago
tensor-tang	5f586e2223	Merge remote-tracking branch 'ups/develop' into refine/op/fusion_lstm	7 years ago
Brian Liu	04272c0d41	Enable lstm peephole (#13160 ) * Refine fusion lstm op code for better readability * Enable peephole in fusion lstm op (seq_mode part) and add unit test * Enable peephole in fused lstop op (batch_mode part) Set batch_mode as default as well * Use pre-commit to clean format * Follow up review comments as well as adding more unit tests for seq mode	7 years ago
fengjiayi	56750e6a3e	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug	7 years ago
Qiao Longfei	cdd14f17f1	fix async mode handle COMPLETE_MESSAGE (#13212 )	7 years ago
minqiyang	8059445fb5	Fix fake_quantize_op	7 years ago
tensor-tang	78d9ad5712	fusion gru enfore only used	7 years ago
tensor-tang	555083ae2a	enforce only used	7 years ago
fengjiayi	db5e3dd767	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug	7 years ago
Jiabin Yang	d091dd02a0	fix mac compile error 0903 (#13184 )	7 years ago
Yu Yang	cda7842e26	Revert "Revert "Add Python Callstacks when Op::Run error (#12759 )"" This reverts commit `1f270275a6`.	7 years ago
qingqing01	9557cc218d	Refine and fix some code for faster-rcnn. (#13135 ) * Fix bug in generate_proposals_op. * Fix data type for RoIs. * Refine and fix rpn_target_assign_op. * Add the missing file bbox_util.h * Rename BoxEncoder to BoxToDelta	7 years ago
fengjiayi	82a1b35b9b	Revert "Revert "Add CudnnHolder and use it in Conv and ConvTranspose op"" This reverts commit `151e169eb7`.	7 years ago
guochaorong	151e169eb7	Revert "Add CudnnHolder and use it in Conv and ConvTranspose op"	7 years ago
Chen Weihang	3b6090e80b	Merge pull request #12887 from chenwhql/sequence_enumerate_op Feat: add sequence enumerate op	7 years ago
tensor-tang	1cc35f3642	Merge pull request #13118 from tensor-tang/optimize/op/fusion_lstm Optimize fusion lstm batch mode	7 years ago
dzhwinter	6fb28796f5	memory (#13143 )	7 years ago
dzhwinter	e722f68318	fix windows compile (#13147 )	7 years ago
dzhwinter	f05520060e	fix style (#13142 )	7 years ago
dzhwinter	856c26faef	fix elementwise (#13146 )	7 years ago
fengjiayi	653c8ded7d	Merge pull request #13078 from JiayiFeng/dev_CudnnHolder Add CudnnHolder and use it in Conv and ConvTranspose op	7 years ago
tensor-tang	20659fc905	Merge pull request #13107 from tensor-tang/optimize/op/fusion_gru Optimize fusion gru	7 years ago
tensor-tang	93c034ee51	Merge remote-tracking branch 'ups/develop' into optimize/op/fusion_lstm	7 years ago
tensor-tang	c7adb99ae0	follow comment and refine code	7 years ago
tensor-tang	83f4bc4ecf	follow comment and refine code	7 years ago
tensor-tang	f38905a6e5	Merge remote-tracking branch 'ups/develop' into optimize/op/fusion_gru	7 years ago
tangwei12	fbdd4f8c0f	Merge pull request #13101 from zenghsh3/develop Fix bug of sampling_id op	7 years ago
tensor-tang	9838bacb35	Merge branch 'develop' into optimize/op/fusion_lstm	7 years ago
qingqing01	9bd933d3fb	Improve and fix fake_quantize_op (#13092 ) * Improve and fix fake_quantize_op.	7 years ago
Tao Luo	3fe0575b62	Merge pull request #13148 from dzhwinter/windows/math_compile cuda math port	7 years ago
chenweihang	7ddbbcb0b5	doc: refine API and doc	7 years ago
dzhwinter	34757efb8e	fix windows compile	7 years ago
tensor-tang	c44108803a	refine prelu	7 years ago
chenweihang	b081363bae	Merge branch 'sequence_enumerate_op' of https://github.com/chenwhql/Paddle into sequence_enumerate_op	7 years ago
chenweihang	0b7d82befb	doc: refine English description	7 years ago
dzhwinter	b11332a07b	"fix style" (#13094 )	7 years ago
dzhwinter	ab1097cd8e	Feature/template (#13093 ) * remove template operator * "fix compile" * "fix ci" * "fix ci"	7 years ago
tensor-tang	80edd7ef29	enable run with fuse pass	7 years ago
fengjiayi	f79ca23115	fix bugs	7 years ago
tensor-tang	a79a77eeb5	refine and clean code	7 years ago
tensor-tang	c459fb5be0	add fusion lstm batch mode	7 years ago
whs	e10aa80f03	Add pad2d op. (#12950 ) * Add pad2d op. * Add unitest and python api. * Fix cuda op kernel. * Fix python api. * Fix python api. * Update API.spec. * Fix python api	7 years ago
tensor-tang	7bdd11d88e	Merge branch 'develop' into optimize/op/fusion_gru	7 years ago
fengjiayi	1f36a4c27c	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_CudnnHolder	7 years ago
fengjiayi	b0aca8824d	make CudnnHolder thread safe	7 years ago
tensor-tang	596213906b	add gru seq mode forward	7 years ago
zenghsh3	d7495838b3	refine	7 years ago
zenghsh3	04a05d1d58	merged	7 years ago
zenghsh3	08b73b68c4	fix bug of sampling_id_op	7 years ago
tensor-tang	b0d36c4c3d	add cross vec to speedup gru	7 years ago
tensor-tang	038c16eed2	save intermediate data to out buffer	7 years ago
Xingyuan Bu	0a97d24b41	Faster RCNN Generate Proposal Labels (#12616 ) * Add generate_proposal_labels for Faster-RCNN.	7 years ago
fengjiayi	d5f74b7308	use CudnnHolder in conv_transpose_cudnn_op	7 years ago
fengjiayi	407ff0bdbc	use CudnnHolder in conv_cudnn_op	7 years ago
chengduo	3bd1d22a7d	Enhance fused_elementwise_activation_op (#12837 ) * Enhance the function of fused_elementwise_activation_op * enhance unit test * Clean Code And Add Doc * Add compound functors * Fix doc and enhance unit test * define Dx and Dy for d_binary_func * add mul_scale * add mul_scale * add elementwise_mul * code refine * code refine * add doc * add AsIntermediate	7 years ago
tensor-tang	2d0ddf8c41	refine cpu gru batch mode	7 years ago
tensor-tang	70d3981220	add cpu vec bias sub	7 years ago
jerrywgz	85fe65ae61	modified error info for maxout op	7 years ago
Chen Weihang	b98b744067	Merge branch 'develop' into sequence_enumerate_op	7 years ago
Yan Chunwei	902f19b46a	fea/fuse attention lstm simplify.with fusion lstm.with sequnce expand (#13006 )	7 years ago
Xingyuan Bu	2ad5d91ef8	Faster RCNN Generate Proposals (#12056 ) * Add proposals generation operator for Faster-RCNN.	7 years ago
tensor-tang	89d6d69ce4	Merge pull request #12781 from tensor-tang/feature/op/fusion_gru add fusion gru	7 years ago
tensor-tang	d941192e74	fix gcc53 on cpu vec (#13020 )	7 years ago
tensor-tang	2328a69157	Merge pull request #13012 from tensor-tang/refine/seq2batch refine seq2batch	7 years ago
Xin Pan	2bb15f437c	Merge pull request #12791 from panyx0718/ir3 graph to program pass	7 years ago
Qiao Longfei	a22309afe8	clean useless check code in auc_op (#13023 )	7 years ago
Yu Yang	8965cee89f	Polish PrintOp (#12895 ) * Polish PrintOp * Polish PrintOp * Polish PrintOp * Refine test_print_op	7 years ago
chengduo	7ad39c4077	Enhance pad_constant_like_op (#12999 ) * enhance pad_constant_like_op * add API * add API	7 years ago
qingqing01	0353eddb51	Improve fake_dequantize_op. (#12877 ) * Improve fake_dequantize_op. * Follow comments.	7 years ago
Qiao Longfei	11e01d9b2d	Scale support selectedrows (#12960 ) * add ScaleOpVarTypeInference for scale op * scale op support scale selected rows * optimize code * use FindVar * use FindVarRecursive in ScaleOpVarTypeInference	7 years ago
fengjiayi	7b84c580e2	Merge pull request #12824 from JiayiFeng/dev_sequence_padding_op Sequence pad op	7 years ago
tensor-tang	fd4f7c3ab5	refine seq2batch	7 years ago
Wu Yi	0ee6fed05b	Refine dist rpc deps (#12899 ) * refine dist train RPC deps * clean up * clean up * fix ut * remove input for fetch_barrier * follow comments	7 years ago
fengjiayi	7e0c9f50ae	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op	7 years ago
Zeng Jinle	599a32641b	Merge pull request #12971 from sneaxiy/unstack_op Add unstack op	7 years ago
Tao Luo	26cac36bfd	Merge pull request #12515 from kbinias/kbinias/bnorm-fwd-reuse Reusing primitives for forward Batch Norm operator	7 years ago
tensor-tang	a481c5e98c	Merge remote-tracking branch 'ups/develop' into feature/op/fusion_expand_concat_fc	7 years ago
tensor-tang	49c31febb5	fix typo and op test	7 years ago
fengjiayi	9cb455fa7d	update function	7 years ago
Krzysztof Binias	fb4b4f8d57	Refactor code	7 years ago
Krzysztof Binias	50d3e6e96b	Reusing primitives for forward Batch Norm operator	7 years ago
Zeng Jinle	ef7bd03a03	Merge pull request #12964 from sneaxiy/fix_concat_sync Fix concat bug	7 years ago
sneaxiy	52a480bb98	Merge develop	7 years ago
tensor-tang	02909335e9	rename fusion seq_concat_fc to fusion seqexpand_concat_fc	7 years ago
Xin Pan	1a67061fee	graph to program pass fix a few other things	7 years ago
qingqing01	1f09bc320c	Support data type int8_t . (#12841 ) * Support int8 type.	7 years ago
chenweihang	00b30b9938	doc: unified infershape format	7 years ago
chenweihang	0c4697f8cd	fix: change to enumerate by sentence	7 years ago
tensor-tang	c45cee0349	refine infershape and forward	7 years ago
sneaxiy	24264bc0b8	Merge develop	7 years ago
dzhwinter	0153c21d83	add unstack_op	7 years ago
tensor-tang	c7c2506733	add forward implementation	7 years ago
jerrywgz	6033c1a278	Add error info & remove data sharing between input and output in rnn_memory_helper_op	7 years ago
chengduo	3e1050a2e8	Add pad_constant_like_op (#12943 ) * Add pad_constant_batch_size_like * refine pad_op * optimize memory	7 years ago
dzhwinter	6cc7870517	fix concat synchronization bug	7 years ago
tensor-tang	954b0e113f	init fusion seq expand concat fc op	7 years ago
tensor-tang	c488ee96a7	Merge remote-tracking branch 'ups/develop' into refine/op/fusion_lstm	7 years ago
tensor-tang	e61cf3214d	complete reverse seq	7 years ago
Chen Weihang	4ec12496dd	Merge branch 'develop' into sequence_enumerate_op	7 years ago
tensor-tang	4b28fab8c9	enable more acts	7 years ago
tensor-tang	607c41952e	compute gates	7 years ago
Qiao Longfei	3c58b87b45	fix auc layer and add check for auc op (#12954 ) * fix auc layer and add check for auc op * use input to check if states are inited * optimize code	7 years ago
jerrywgz	835573bbf2	add error_info prelu_op	7 years ago
Yibing Liu	c1488b1796	Merge pull request #12940 from sneaxiy/stack_op Speedup stack_op	7 years ago
dzhwinter	eca4563e5d	operators module (#12938 )	7 years ago
tensor-tang	6be273cbdb	add seq mode lstm	7 years ago
tensor-tang	36363292c3	Merge pull request #12904 from tensor-tang/refine/jit optimize cpu vec activations	7 years ago
jerrywgz	bc7503c85e	modified error_info for maxout_op	7 years ago
Zeng Jinle	d189d4dbab	Merge pull request #12884 from sneaxiy/sequence_mask_op Add sequence_mask_op for DAM model	7 years ago
sneaxiy	3b38e5a4fc	speed up stack_op	7 years ago
tensor-tang	7bdaf09664	Merge remote-tracking branch 'ups/develop' into refine/jit	7 years ago
Tao Luo	989cc2a4f4	Merge pull request #12913 from luotao1/concat enhance the forward of concat op	7 years ago
Tao Luo	8650f6ffae	Merge pull request #12898 from luotao1/expand remove broadcast in sequence_expand	7 years ago
Qiao Longfei	52948a0b50	Merge pull request #12909 from jacquesqiao/fix-sparse-update-bug fix sparse update bug	7 years ago
tensor-tang	ba943d38e3	make runtime avx act	7 years ago
tensor-tang	3462c29940	refine add bias with avx	7 years ago
tangwei12	ef6445ee39	Merge pull request #12908 from seiriosPlus/fill_constant_selectedrows add SelectedRows support in fill_constant_op	7 years ago
tensor-tang	bb9f98e10d	add inplace test	7 years ago
tensor-tang	f269614bcd	further optimize tanh with avx and mkl	7 years ago
chenweihang	733ea0d29b	adjust infershape details	7 years ago
luotao1	e999c74cff	Merge branch 'develop' into concat	7 years ago
luotao1	b61cf7ac4f	Merge branch 'develop' into expand	7 years ago
luotao1	2b4edacca0	enhance the forward of concat op	7 years ago
Tao Luo	3e3b5f4fda	Merge pull request #12675 from Sand3r-/fix-conv-mkldnn-0.15 Update MKLDNN to 0.15, fix convolution integration	7 years ago
tensor-tang	7a4924cd44	further optimize sigmoid with avx and avx512	7 years ago
qiaolongfei	fcf20eed0f	fix sparse update bug	7 years ago
tangwei12	ca22586818	code optimize (cherry picked from commit 587cca7)	7 years ago
Xin Pan	557be6fc58	Merge pull request #12902 from PaddlePaddle/revert-12736 Revert "Disable in_place in batch_norm API. (#12736)"	7 years ago
tensor-tang	6bd89ba5b6	fix typo	7 years ago
Chen Weihang	2969aba14f	Merge branch 'develop' into sequence_enumerate_op	7 years ago
chenweihang	219a2369da	feat: wrap sequence enumerate op	7 years ago
tensor-tang	e3bb98eb38	optimize relu with avx and avx512	7 years ago
guochaorong	1f270275a6	Revert "Add Python Callstacks when Op::Run error (#12759 )" This reverts commit `b2df17003f`.	7 years ago
guochaorong	b1fc238694	Revert "Disable in_place in batch_norm API. (#12736 )" This reverts commit `f5d5d7b2d9`.	7 years ago
tensor-tang	25976fe736	optimize the sigmoid and tanh	7 years ago
tensor-tang	2eb46c2b06	add cpu vec test	7 years ago
sneaxiy	1083e99520	Merge develop	7 years ago
tensor-tang	f0f06992c1	Merge pull request #12878 from tensor-tang/feature/op/attention_lstm Add attention lstm cpu forward	7 years ago
luotao1	83f4edabe9	remove broadcast in sequence_expand	7 years ago
sneaxiy	5ea7bf88ba	Merge pull request #12872 from sneaxiy/stack_op Add stack_op for DAM model	7 years ago
Tao Luo	ef2da86b4f	Merge pull request #12885 from luotao1/test_ditu_rnn enhance test_analyzer to profile ditu inference demo	7 years ago
sneaxiy	e895c98f0a	add support to max_len is None	7 years ago
fengjiayi	f4a4a4cbd9	add op comment and python layer	7 years ago
tangwei12	acdd95d5ca	bug fix	7 years ago
chenweihang	d2e5395b97	feat: add sequence enumerate op	7 years ago
luotao1	9c7fde45a7	enhance test_analyzer to profile ditu inference demo	7 years ago
chengduo	8ad9055804	Add is_test for while_op (#12874 ) * add is_test for while_op * Change API	7 years ago
sneaxiy	64464cb1fa	Merge develop	7 years ago
qingqing01	79918a8442	add sequence_mask_op for DAM model	7 years ago
Yu Yang	b2df17003f	Add Python Callstacks when Op::Run error (#12759 ) * Add Python Callstacks when Op::Run error * Skip op with sub-block * refactor: refine callstack info's format * Reshape only support matrix * Polish Python code * Fix UT * Fix Py3	7 years ago
Yu Yang	17fcc4f5d0	Merge pull request #12864 from reyoung/feature/process_lod_grad Feature/process lod grad	7 years ago
tensor-tang	5ca0bb9aad	support more activation type and remove some comments	7 years ago
sneaxiy	ba168bd2d2	modify API.spec	7 years ago
tensor-tang	d9bf73f3ab	Merge remote-tracking branch 'ups/develop' into feature/op/fusion_gru	7 years ago
tensor-tang	dd938d0b94	fix bugs and pass op test	7 years ago
tensor-tang	ec59f0d454	add cpu vec	7 years ago
tensor-tang	cf5ea925c3	fix bugs	7 years ago
tensor-tang	6ed20474d4	refine attention lstm infershape	7 years ago
tensor-tang	508548f897	implement attention lstm cpu forward	7 years ago
tensor-tang	9affc36c89	init attention lstm	7 years ago
tensor-tang	3dd66390b2	add blas vexp	7 years ago
tensor-tang	0ec1f65cf1	fix blas dot and add cblas scal	7 years ago
tensor-tang	a2203d0466	add cblas dot	7 years ago
tensor-tang	f72ab8961e	refine blas gemm	7 years ago
qingqing01	f5d5d7b2d9	Disable in_place in batch_norm API. (#12736 ) * Disable in_place in batch_norm API.	7 years ago
sneaxiy	c73c5ed573	use for_range	7 years ago
Xin Pan	b548ecbc2b	add stack_op	7 years ago
Yu Yang	eb8fd853bc	Fix sequence_softmax_cudnn op	7 years ago
Yu Yang	3768677980	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/process_lod_grad	7 years ago
Yu Yang	2a36ad1a96	Handle LoD for concat & seq_softmax ops	7 years ago
Yu Yang	211d81863d	Process elemwise grad op's lod. mul_op's lod	7 years ago
Yan Chunwei	9ee698e605	enhance/ditu rnn with fc fuse (#12831 ) * make fc fuse work with ditu rnn * add ditu rnn data download to CMAKE	7 years ago
Xin Pan	78415f326d	Merge pull request #12838 from panyx0718/infer speed up while_op	7 years ago
fengjiayi	ce182d9037	bug fix	7 years ago
Xin Pan	a2c0e52f3e	speed up while_op	7 years ago
tensor-tang	6f78fd7d1e	fuse fc in gru	7 years ago
tensor-tang	300180cc26	init fusion gru op	7 years ago
Zhaolong Xing	21ba32b065	Merge pull request #12843 from NHZlX/fix_ssa_bug_for_trt fix ssa bug with batch_norm and refine the trt	7 years ago
Michał Gallus	cd32ddac12	Fuse Convolution and Eltwise Add into MKLDNN's Conv+Bias (#12669 ) * Fuse Convolution and Eltwise Add into Conv+Bias * Reduce bias branching at conv_mkldnn_op * Add MKLDNN build checks for Conv Bias * Conv-bias: check if bias input exist befor assignment * Conv-bias: Remove Bias dim check from infershape It was causing conv3d test to crash upon calling HasInput(Bias)	7 years ago
nhzlx	c999895e93	merge develop	7 years ago
nhzlx	276950291a	1. fix ssa bug with batchnorm, 2. refine the trt	7 years ago
Yan Chunwei	896a37b6e3	fea/link ir to inference analysis and fc fuse support (#12789 ) * link IR graph to analysis graph * add clean code and update * add infer_clean_pass * add ir_pass_manager * support fc fuse executation * fix ir circle	7 years ago
dzhwinter	e23ddf6ae4	status (#12764 )	7 years ago
Tao Luo	d04ef276a5	Merge pull request #12745 from tensor-tang/refine/op/elewise_mul Refine elementwise mul cpu forward	7 years ago
tangwei12	cbc6e6eb97	Merge pull request #12247 from seiriosPlus/dis_ckpt_fix add load slice_vars in io.py	7 years ago
Qingsheng Li	3d11d018e0	Fix scatter_op python API (#12742 ) * Fix scatter_op python API and remove inconsistency between implementation and doc * API spec change * Change as review comment	7 years ago
Tao Luo	8f9f414a14	Merge pull request #12805 from tensor-tang/fix/op/elewise_add fix SEGV element wise add at debug mode	7 years ago
tensor-tang	e955361267	Merge pull request #12737 from tensor-tang/feature/op/fusion_lstm add fusion lstm	7 years ago
tensor-tang	82bb9170fb	Merge remote-tracking branch 'ups/develop' into fix/op/elewise_add	7 years ago
Chen Weihang	57b34d9196	Merge pull request #12808 from chenwhql/remove_inplace_param_in_squeeze_and_unsqueeze Refactor: remove inplace parameter from squeeze and unsqueeze op	7 years ago
Yihua Xu	084d4a9e9e	Optimize CRF Decoding with AVX/AVX2/AVX512F instruction (#12767 ) * Optimize CRF decoding with AVX/AVX2 instruction * Enable the AVX2 flags for compiling * Clean the code and decrease the count of multiply calculation * Add the support of AVX512 instruction to optimize CRF Decoding * Clean the code * Enable the AVX512f flags for compiling * Clean the code for the invaluable switch * Fixed the issue to check AVX512F status * Clean the code * Add some explanation of the key points	7 years ago
fengjiayi	34b209cffa	Complete sequence_padding GPU kernel	7 years ago
dzhwinter	00463fdfe3	cudnn windows support (#12757 ) * cudnn widndows * "add comment" * "windows support" * "fix cmake error"	7 years ago
qingqing01	c62f68cb94	Fix bug in conditional_block_op. (#12246 ) * Fix bug in conditional_block_op. * Fix bug and add comments. * Rename arguments.	7 years ago
chenweihang	bc471b6ac4	refactor: remove inplace parameter from squeeze and unsqueeze op	7 years ago
tensor-tang	0507f7bc3c	fix SEGV elementwise add at debug mode	7 years ago
tangwei12	ca1e18c04a	Merge pull request #12469 from seiriosPlus/sum_op_dim_fix sum_op selectedRows dim bug fix	7 years ago
Zhaolong Xing	e5674f6dde	Merge pull request #12753 from NHZlX/add_benchmark modify tensorrt engine op from cpu mode to gpu	7 years ago
tensor-tang	b090479409	Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm	7 years ago
tangwei12	b4f52b01d0	bug fix when all inputs are empty	7 years ago
tangwei12	3efac174ea	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix	7 years ago
tangwei12	dbb4f0d35d	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into dis_ckpt_fix	7 years ago
Qiao Longfei	fd10669ecb	Add dependency to send recv (#12760 ) Add dependency to send recv	7 years ago
fengjiayi	8d8d48a34f	Complete sequence_pad_op and its CPU kernel. Add unittests	7 years ago
tangwei12	7c12c0f865	add sync in load selectedrows	7 years ago
Michal Gallus	4a7f0698e0	Add consts to new MKLDNN integration Also replace memory types from int64_t to size_t	7 years ago
Michal Gallus	6588d0e039	Update MKLDNN to 0.15, fix conv integration	7 years ago
tangwei12	9f11db4080	add todo in impl	7 years ago
tangwei12	c24a9263ba	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix	7 years ago
tangwei12	ac9ae97001	code fix	7 years ago
nhzlx	f55e8901c8	merge develop	7 years ago
nhzlx	1600ba86f6	1. change tensorrt op from cpu to gpu	7 years ago
tangwei12	bb9f494740	merge develop	7 years ago
dzhwinter	4069262f0e	Revert ""cherry picked operators changes" (#12184 )" (#12747 ) This reverts commit `bf3c34960f`.	7 years ago
Qiao Longfei	653fad08f8	Optimize selected rows for dist lookup table with pthread rwlock (#12635 ) Optimize selected rows for dist lookup table with rwlock	7 years ago
fengjiayi	3c749fae43	update CPU sequence_padding functor	7 years ago
tensor-tang	92890ac258	Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm	7 years ago
tangwei12	0749c8822d	Merge pull request #12556 from seiriosPlus/samplingIdOp Sampling id op	7 years ago
tensor-tang	a56142c155	optimize elementwise_mul cpu forward	7 years ago
tensor-tang	6644ce79a5	add mklml vmul	7 years ago
tensor-tang	ff92b6ba81	Merge pull request #12531 from tensor-tang/refine/op/gru Refine gru cpu forward	7 years ago
tangwei12	26b228e405	remove assignment and add vlog	7 years ago
tangwei12	125e9166e1	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix	7 years ago
tensor-tang	a72f68f223	Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm	7 years ago
tensor-tang	df28a3b452	fix lod and op test	7 years ago
Qingsheng Li	317e18abd2	Remove Data Sharing between input and output in scatter_op (#12672 ) * Remove Data Sharing between input and output in scatter_op * Removed data sharing in backward op	7 years ago
tensor-tang	f3cd2612ae	refine fc and use the fc compute in fusion_lstm	7 years ago
tangwei12	822496f626	merge cpu and gpu	7 years ago
dzhwinter	bf3c34960f	"cherry picked operators changes" (#12184 ) * "cherry picked operators changes" * "remove duplicated code" * "add constant setter" * "add get expected kernel" * "fix ci" * "add fill constant"	7 years ago
tensor-tang	40138c4cd6	add unit test of fusion lstm op	7 years ago
jerrywgz	c108376506	Add three modes for prelu_op (#12630 ) * Add three modes for prelu_op.	7 years ago
tangwei12	9f09d68678	add enforce	7 years ago
gongweibao	d06849305a	parameter dispather. (#12666 )	7 years ago
tensor-tang	852bc6f4aa	refine fusion lstm op doc	7 years ago
tensor-tang	8f9132959e	fuse fc in lstm	7 years ago
tensor-tang	ddb05dffb6	init fusion lstm op	7 years ago
tensor-tang	efc5392d97	Merge pull request #12676 from tensor-tang/refine/op/fc refine fc op	7 years ago
tangwei12	470fb7c5c3	bug fix	7 years ago
tangwei12	60dda7bf9f	add gpu Implementation	7 years ago
tangwei12	4661f5589d	random optimize	7 years ago
Bai Yifan	9333a62792	Add flatten op interface and enhance APIs about detection to support variable-length image. (#12422 ) * add flatten api&enhance detection api * unify shape_op data type * update API.spec	7 years ago
tensor-tang	eee38464dc	refine fc op use cpu only	7 years ago
tangwei12	ed937bc6f8	merge	7 years ago
tensor-tang	d84a1a0010	fc op use cpu only	7 years ago
fengjiayi	a38a8db928	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op	7 years ago
tangwei12	478f73c188	merge header in cc	7 years ago
fengjiayi	d6b5302bd6	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support	7 years ago
tensor-tang	c588c64a76	Merge remote-tracking branch 'ups/develop' into refine/op/gru	7 years ago
tensor-tang	0098a494a2	Merge remote-tracking branch 'ups/develop' into refine/op/fc	7 years ago
fengjiayi	5e7aa8c7e5	code clean	7 years ago
tensor-tang	742300baa8	fix unkown omp pragmas	7 years ago
tensor-tang	b9dbb7c5cb	fix bias attri in mkldnn fc	7 years ago
tangwei12	59580a7f69	bug fix	7 years ago
tensor-tang	4b5986bb77	enable fc op in normal case	7 years ago
tensor-tang	e133df6037	enable native fc forward	7 years ago
tensor-tang	6a2a9a8350	Revert "Refine elementwise_add op"	7 years ago
Yu Yang	8dda526a45	Merge pull request #12659 from sneaxiy/refine_softmax_with_cross_entropy Fix 'softmax_with_cross_entropy_op' dependency error	7 years ago
sneaxiy	f6f5cdaa05	Merge pull request #12555 from sneaxiy/refine_layer_norm Refine layer_norm op	7 years ago
sneaxiy	c50c537732	fix arithmetic error in backward kernel	7 years ago
tensor-tang	038cbf799d	add bias for fc op	7 years ago
whs	9d6243b6fb	Fix crop op. (#12603 ) * Fix infer shape of crop op. * Speed crop op.	7 years ago
Bai Yifan	649f5d74f0	fix mine_hard_example bug (#12664 )	7 years ago
sneaxiy	2d9508f8f3	Merge pull request #12554 from sneaxiy/refine_elementwise_add Refine elementwise_add op	7 years ago
tensor-tang	171a0e2b42	add some comment	7 years ago
sneaxiy	2c560623d1	fix dependency error	7 years ago
tensor-tang	5377edd282	refine packed condition	7 years ago
tensor-tang	3bf3e77ac8	Merge remote-tracking branch 'ups/develop' into refine/op/gru	7 years ago
qiaolongfei	c0890988da	add RPCServerProfiler, replace listen and serv optimizer	7 years ago
tangwei12	64a4925cb4	Merge branch 'Pdv' into samplingIdOp	7 years ago
tangwei12	0bfd62be3d	remove gpu supported, will add it later	7 years ago
Tao Luo	5a9ae411e0	Merge pull request #12618 from sfraczek/sfraczek/fix-new-mkldnn-conv-tests fix UT for mkldnn 0.15	7 years ago
sneaxiy	cf799a6a04	Merge pull request #12553 from sneaxiy/refine_softmax_with_cross_entropy Refine softmax_with_cross_entropy op	7 years ago
dzhwinter	8499559c42	"fix style" (#12600 )	7 years ago
sneaxiy	010883689c	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_layer_norm	7 years ago
sneaxiy	5d698589ce	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_elementwise_add	7 years ago
sneaxiy	19ff254d05	Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add	7 years ago
Sylwester Fraczek	d74bb6ab9c	fix ut for mkldnn 0.15 - added forcing layout NCHW in mkldnn conv tests	7 years ago
fengjiayi	855c9e3311	clean softmax_op code	7 years ago
fengjiayi	24d51de022	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support	7 years ago
fengjiayi	27df3a9f2b	make cross_entropy_op supporting tensors	7 years ago
fengjiayi	66be53264e	Merge pull request #12592 from JiayiFeng/fix_mac_compile_error fix mac compile error	7 years ago
fengjiayi	8e604a10aa	fix mac compile error	7 years ago
nhzlx	551c802cdc	merge develop	7 years ago
sneaxiy	ad45d39222	refine layer_norm	7 years ago
chengduo	7c8b69c700	Feature/op fusion (#12240 ) * Add Preface * Add demo code * Save file * Refine code * seems can work * use elementwise strategy * Use ElementwiseComputeEx * Add comments * extract functions from operator * Refine code * Follow comment * code refine * follow comments * follow comments	7 years ago
sneaxiy	1b4515f6db	refine softmax_with_cross_entropy	7 years ago
nhzlx	3a0caf801f	modify trt engine op test	7 years ago
nhzlx	e51d045a6d	modify trt engine op test	7 years ago
nhzlx	e8954a36f5	merge develop	7 years ago
nhzlx	32a9e050bc	mapping the variable name inside the subgraph	7 years ago
Wu Yi	2d036c47cd	polish dist unit test code (#12512 ) * polish dist se resnext ut * update * update * update * avoid cpu initializer differ * change to use executor for now * update by comment * remove lr decay use para exe, should fix para exe bug later * update by comment	7 years ago
fengjiayi	7834b4a470	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support	7 years ago
tangwei12	5bfdefae91	Merge branch 'Pdv' into samplingIdOp	7 years ago
tangwei12	b30bdde15a	random optimize	7 years ago
tangwei12	9c63fef63c	random optimize	7 years ago
Qiao Longfei	88a607c342	Merge pull request #12541 from jacquesqiao/optimize-profiler optimize profiler	7 years ago
tangwei12	5b9716d1f6	add dims check	7 years ago
tangwei12	4cd504d3b4	bug fix	7 years ago
sneaxiy	e57bc4d745	Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add	7 years ago
sneaxiy	222fbbedfb	Merge branch 'develop' into refine_elementwise_add	7 years ago
sneaxiy	4b83afff6e	Merge branch 'develop' into refine_elementwise_add	7 years ago
sneaxiy	b2d0ee5159	refine elementwise_add op	7 years ago
tangwei12	da2cc99f67	sampling op optimize	7 years ago
fengjiayi	7c55e08c93	stash	7 years ago
tangwei12	4973e07be3	sampling op optimize	7 years ago
tensor-tang	836068569f	Merge remote-tracking branch 'ups/develop' into refine/op/gru	7 years ago
tensor-tang	18c322c2a1	seperate cpu and gpu implementations for gru kernel compute	7 years ago
tensor-tang	54c95e49f0	fix blas	7 years ago
fengjiayi	b656d97e86	Merge pull request #12485 from JiayiFeng/dev_ops_tensor_support Make lookup_table_op and softmax_op supporting high rank tensor	7 years ago
qiaolongfei	1623f1ba4f	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-profiler	7 years ago
tangwei12	3206970b77	sampling op rename	7 years ago
Xin Pan	99a77cfc62	Merge pull request #12468 from panyx0718/improve_profiler2 Improve profiler	7 years ago
qiaolongfei	a3f9d6a38c	optimize profiler	7 years ago
tangwei12	e0ab2f7158	new sampling op	7 years ago
tensor-tang	8c23f7c4f0	fix blas and use packed weight	7 years ago
tensor-tang	d9cc6b1866	replace gru compute with details	7 years ago
tensor-tang	43cee33a23	add mkl packed gemm	7 years ago
tangwei12	766ac488ac	sum_op selectedRows dim bug fix	7 years ago
dzhwinter	595a2c83ae	explicit gradient of elementwise_add/elementwise_sub (#11970 ) * "add gradient register" * "make some enhance" * "better format" * "fix typo" * "fix reuse" * "fix get expected kernel" * "change the mkldnn code" * "fix mkldnn" * "fix mkldnn failed test" * "add comment"	7 years ago
fengjiayi	e7d8e16a66	update softmax_mkldnn_op	7 years ago
Yu Yang	2567afa35d	Merge pull request #12462 from reyoung/feature/fix_cudnn_deterministic Fix bug in cudnn_determistic	7 years ago
fengjiayi	dc111d3476	update softmax_cudnn_op	7 years ago
fengjiayi	f7bd0b227b	Add unittests for softmax_op	7 years ago
gongweibao	819ac3df0a	Modify style (#12465 )	7 years ago
fengjiayi	b314a69523	make softmax supporting tensors	7 years ago
fengjiayi	b1af7e5d9b	Add unittests for lookup_table_op	7 years ago
tangwei12	c4c8f60bec	sum_op selectedRows dim bug fix	7 years ago
Xin Pan	486345551d	clean	7 years ago
Xin Pan	caf10b474f	make profiler use thread_id from g_thread_id Add a few more RecordEvent. Cleanup	7 years ago
Yu Yang	040fc1c39b	Fix bug in cudnn_determistic * Introduced by #11205	7 years ago

2249 Commits (f3729db6e03d5e290020d3cc74cfb50572902c4c)