Paddle

Commit Graph

Author	SHA1	Message	Date
tensor-tang	e09a7c793d	remove the warning log since do not have avx2, avx512 flags test=develop	6 years ago
tensor-tang	f524c1b62b	throw error when mismatch cpu version test=develop	6 years ago
peizhilin	9d67c1fb69	cpu build support	6 years ago
dzhwinter	60f70b174d	test=develop	6 years ago
sneaxiy	7ff320f8cc	merge develop	6 years ago
Kaipeng Deng	daed473d4a	Merge pull request #14089 from heavengate/pool_exclude add inclusive/exclusive mode in avg pool	6 years ago
whs	0c319e0b35	Add affine grid generator op (#12238 ) * Add affine grid generator. * fix ffine grid. * Add unitest. * Add CPU kernel and fix unitest. * Fix CPU kernel. * Refine code. test=develop * Fix python api. test=develop * Update python api. test=develop * Fix comment. test=develop * Rename affine_grid_generator to affine_grid and enhence unitest. test=develop * Fix unitest. test=develop	6 years ago
dzhwinter	0a180584e6	clean cmake. test=develop	6 years ago
dzhwinter	1ace55c8ee	merge develop branch	6 years ago
Tomasz Patejko	8899d42265	MKLDNN conv residual data: primitive reuse interface used. Reorder done when formats are different test=develop	6 years ago
Yu Yang	90d9e5aee8	feat(platform): lazy initialization of devicecontext in pool (#14067 ) * feat(platform): lazy initialization of devicecontext in pool Use std::async(deferer, []{...}) to lazy initialize DeviceContext in Pool test=develop * Add future includes test=develop	6 years ago
dzhwinter	316765839d	add back jit simd instructions. stage.	6 years ago
dzhwinter	bf2e4cb188	cleard. staged	6 years ago
dzhwinter	ebfe5a02b3	merge develop branch	6 years ago
Yu Yang	c01696f8c2	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation test=develop	6 years ago
dengkaipeng	c93e044ae0	add inclusive/exclusive mode in PoolOp avg pool type	6 years ago
dzhwinter	7141debe38	add cudnn back. staged.	6 years ago
Sylwester Fraczek	2098b42584	review fixes (Teamcity fails) test=develop	6 years ago
sneaxiy	5be6f762d0	remove_lock_in_some_ops test=develop	6 years ago
Brian Liu	a53e8a8da6	Update MKLDNN integration framework to support Paddle multi-instances Make all blob info saved in global device context to be thread based. Meanwhile save thread id in thread local storage in ParallelDo	6 years ago
Yu Yang	1d4d4e73ab	Remove place hash test=develop	6 years ago
Yu Yang	461f71a90b	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation	6 years ago
gongweibao	58c027cc38	Add rpc profiler flags. (#13989 ) Add rpc profiler flags	6 years ago
Xin Pan	d10e54c460	Merge pull request #14003 from chengduoZH/fix_fast_parallel_exe_bug Fix test_parallel_executor_mnist.py randomly hang.	6 years ago
chengduozh	82d2903b63	Fix fast ParallelExe bug test=develop	6 years ago
sneaxiy	2002e71da8	fix pinned allocator	6 years ago
sneaxiy	21fdf8e87d	add unittest for allocator_facade.cc	6 years ago
gongweibao	078223b3e3	Add rpc timeline. (#13900 ) Add rpc timeline	6 years ago
tensor-tang	6447155dac	Merge pull request #13851 from tensor-tang/fea/jitkernel_peephole Fea jitkernel lstm peephole	6 years ago
Yibing Liu	6b795d424c	Merge pull request #13901 from kuke/seq_slice_py Add py api for sequence_slice_op	6 years ago
dzhwinter	e41a3fcd68	fix update to develop hang problem.	6 years ago
Qiao Longfei	681226e97c	Merge pull request #13864 from jacquesqiao/py-reader-add-test-mode reader block queue add test mode	6 years ago
Yibing Liu	16b2c6dc78	Add py api for sequence_slice_op test=develop	6 years ago
chengduo	2c9839c847	add cuda version display (#13885 ) test=develop	6 years ago
wanghaoshuang	3ae9645084	compile in linux	6 years ago
Qiao Longfei	8686f7c68e	add reader_queue_speed_test_mode flag for speed test	6 years ago
tensor-tang	bcb8ea397d	Merge remote-tracking branch 'ups/develop' into fea/jitkernel_peephole test=develop	6 years ago
Qiao Longfei	5428cb9908	Profiler support merge data of all thread (#13811 ) * profiler infor merge thread statistic information * update profiler * fix bug * add merge thread msg to report * optimize report * statistic the time of ops in each thread but not all * optimize report format * optimize profile report * optimize profile report test=develop	6 years ago
sneaxiy	4c672ab1a2	Merge reyoung:rewrite_allocation	6 years ago
tensor-tang	ea7dc9cbf6	Merge remote-tracking branch 'ups/develop' into fea/jitkernel test=develop	6 years ago
Xin Pan	ab798a2832	clarify the fraction_of_gpu_memory flag test=develop	6 years ago
Yu Yang	15076c325e	Add comments and polish code style	6 years ago
Yu Yang	29f66c2408	Polish code	6 years ago
Yu Yang	8e3fdc6e65	Fix SetDevice on init	6 years ago
Yu Yang	524f6e9b36	Refine code	6 years ago
Yu Yang	5cf395beaf	Fix bug in uts	6 years ago
dzhwinter	2d00e65819	namespace issue (#13543 ) * flags * "follow comment"	6 years ago
Yu Yang	58ed412f68	refactor(memory): rewrite memory allocation and make it extentable Use OO style to rewrite memory allocation.	6 years ago
typhoonzero	a4f7696a18	Revert "Some trivial optimization (#13530 )" This reverts commit `1d91a49d2f`.	6 years ago
tensor-tang	dee5d35c20	refine vmul	6 years ago
chengduo	1d91a49d2f	Some trivial optimization (#13530 ) * some trivial opt * remove the fix of lod_tensor and shrink_rnn_memory_op * refine ShrinkRNNMemoryOp test=develop	6 years ago
dzhwinter	7806c5625f	fix enforce (#13544 )	6 years ago
dzhwinter	97636a9fcf	"fix link error" (#13545 )	6 years ago
sneaxiy	612e1a3155	modification	7 years ago
sneaxiy	d0b2453ecd	merge develop	7 years ago
sneaxiy	24ea39c4c6	feature/eager_delete_tensor	7 years ago
dzhwinter	85f8dd1c77	debug version	7 years ago
dzhwinter	e1999538eb	debug the device context	7 years ago
dzhwinter	372caf4000	windows staff	7 years ago
dzhwinter	c3e1fb5a3e	add demo	7 years ago
Krzysztof Binias	2ed7982d09	Merge pull request #13327 from kbinias/kbinias/conv-weights-converted-once [MKLDNN] Reusing once reordered convolution weights in test mode	7 years ago
Krzysztof Binias	accdecc681	Correcting Lint errors	7 years ago
Krzysztof Binias	1ce9e9dc30	Renaming decision variable	7 years ago
Krzysztof Binias	1658958fe6	Reusing converted weights	7 years ago
Yang Yu	8331e835a8	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug	7 years ago
Wu Yi	f90c7865f0	Benchmark tool for imgnet (#12305 ) * support test using executor without reader * run imgnet * update fluid benchmark * wip * update * update all models * support pyreader * update * clean up * make profile batches contollable * update API.spec * update scripts * clean dockerfile * update * clean comments * add scope argument docstring * use num_trainers to determine nccl init comms	7 years ago
JiabinYang	e322fc4e0e	add error info for nccl not found	7 years ago
fengjiayi	7b577b92e0	fix a memory bug in CudnnHolder	7 years ago
fengjiayi	82a1b35b9b	Revert "Revert "Add CudnnHolder and use it in Conv and ConvTranspose op"" This reverts commit `151e169eb7`.	7 years ago
guochaorong	151e169eb7	Revert "Add CudnnHolder and use it in Conv and ConvTranspose op"	7 years ago
dzhwinter	6fb28796f5	memory (#13143 )	7 years ago
dzhwinter	f05520060e	fix style (#13142 )	7 years ago
fengjiayi	0236966b68	follow commits	7 years ago
fengjiayi	5398e1a3a6	fix bugs	7 years ago
dzhwinter	dbe90cc0f6	merge develop branch	7 years ago
fengjiayi	f79ca23115	fix bugs	7 years ago
fengjiayi	c501826f42	use framework::RWLock	7 years ago
fengjiayi	1f36a4c27c	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_CudnnHolder	7 years ago
fengjiayi	b0aca8824d	make CudnnHolder thread safe	7 years ago
luotao1	7169f9378c	fix mkldnn include format	7 years ago
fengjiayi	15cc9128be	fix compile error	7 years ago
fengjiayi	407ff0bdbc	use CudnnHolder in conv_cudnn_op	7 years ago
fengjiayi	04bfd5c10c	add CudnnHolder to manage cudnn_handle and workspace	7 years ago
Yan Chunwei	902f19b46a	fea/fuse attention lstm simplify.with fusion lstm.with sequnce expand (#13006 )	7 years ago
dzhwinter	b78394ea57	done	7 years ago
dzhwinter	b74af56bbc	cpu compile is done	7 years ago
dzhwinter	78aab05b71	fix more op errors	7 years ago
dzhwinter	cd8f3e9ed0	operator module is done	7 years ago
dzhwinter	d361624c1d	platform module (#12932 ) * platform module * Update profiler.h	7 years ago
dzhwinter	2ec589a24e	float.h fixed	7 years ago
dzhwinter	7dceb8a080	check some operators	7 years ago
dzhwinter	d7f98f37a7	more platform is done	7 years ago
dzhwinter	efd0884fa9	add op registry	7 years ago
dzhwinter	eca4563e5d	operators module (#12938 )	7 years ago
dzhwinter	488a2dd2e8	with ir node	7 years ago
dzhwinter	cfbf1ba305	add source	7 years ago
dzhwinter	c1ad52f768	pre-commit	7 years ago
dzhwinter	89f95ea25e	merge develop branch	7 years ago
dzhwinter	34f8c9b6f5	windows port	7 years ago
tensor-tang	0d46f518ae	refine avx condition and warning	7 years ago
tensor-tang	4e538db14d	refine jit space	7 years ago
tensor-tang	ec59f0d454	add cpu vec	7 years ago
tensor-tang	3dd66390b2	add blas vexp	7 years ago
tensor-tang	0ec1f65cf1	fix blas dot and add cblas scal	7 years ago
tensor-tang	a2203d0466	add cblas dot	7 years ago
Michał Gallus	cd32ddac12	Fuse Convolution and Eltwise Add into MKLDNN's Conv+Bias (#12669 ) * Fuse Convolution and Eltwise Add into Conv+Bias * Reduce bias branching at conv_mkldnn_op * Add MKLDNN build checks for Conv Bias * Conv-bias: check if bias input exist befor assignment * Conv-bias: Remove Bias dim check from infershape It was causing conv3d test to crash upon calling HasInput(Bias)	7 years ago
dzhwinter	e23ddf6ae4	status (#12764 )	7 years ago
Tao Luo	d04ef276a5	Merge pull request #12745 from tensor-tang/refine/op/elewise_mul Refine elementwise mul cpu forward	7 years ago
dzhwinter	00463fdfe3	cudnn windows support (#12757 ) * cudnn widndows * "add comment" * "windows support" * "fix cmake error"	7 years ago
dzhwinter	17602eab94	windows port of malloc	7 years ago
dzhwinter	2673798ddb	"fix float16 ShuffleDownSync Bug" (#12756 ) * "fix bug" * "add test case"	7 years ago
dzhwinter	5c88cd2af5	remove werror in windows	7 years ago
dzhwinter	64ce1210aa	"windows support"	7 years ago
dzhwinter	36878d78cc	comment out backtarce	7 years ago
dzhwinter	335398f18b	dlfnh	7 years ago
tensor-tang	6644ce79a5	add mklml vmul	7 years ago
tensor-tang	ff92b6ba81	Merge pull request #12531 from tensor-tang/refine/op/gru Refine gru cpu forward	7 years ago
Chen Weihang	1e961b145c	Merge pull request #12591 from chenwhql/enforce_msg_polish polish high frequency enforce error message	7 years ago
Yan Chunwei	0a641ba326	add ratio to profiler (#12701 )	7 years ago
tensor-tang	c588c64a76	Merge remote-tracking branch 'ups/develop' into refine/op/gru	7 years ago
chenweihang	da39d84a48	refine by reviewer's advice	7 years ago
tensor-tang	1ab1d03c62	fix missing macro condition	7 years ago
Qiao Longfei	e8fcb71bed	Merge pull request #12620 from jacquesqiao/timeline-support-pure-cpu Timeline support pure cpu	7 years ago
tensor-tang	3bf3e77ac8	Merge remote-tracking branch 'ups/develop' into refine/op/gru	7 years ago
qiaolongfei	5a6c3cd9e0	fix profiler dead lock	7 years ago
tensor-tang	a50889f523	introduce xbyak	7 years ago
qiaolongfei	3f2aa91970	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu	7 years ago
qiaolongfei	e008600b08	optimize code	7 years ago
qiaolongfei	7c649e06c3	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu	7 years ago
Sylwester Fraczek	d74bb6ab9c	fix ut for mkldnn 0.15 - added forcing layout NCHW in mkldnn conv tests	7 years ago
chenweihang	b1dd4149b9	adjust enforce test cases	7 years ago
chenweihang	61052cdbc6	polish high frequency enforce error message	7 years ago
qiaolongfei	954d680b40	fix test_parallel_do.py	7 years ago
tensor-tang	836068569f	Merge remote-tracking branch 'ups/develop' into refine/op/gru	7 years ago
qiaolongfei	1623f1ba4f	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-profiler	7 years ago
qiaolongfei	4c5bcd7859	add guard to profiler	7 years ago
tensor-tang	43cee33a23	add mkl packed gemm	7 years ago
Xin Pan	caf10b474f	make profiler use thread_id from g_thread_id Add a few more RecordEvent. Cleanup	7 years ago
dzhwinter	6d3da458a7	Fix/float16 style (#12446 ) * "rewrite the test case" * "follow comment"	7 years ago
dzhwinter	39ac9e39c2	float16 type support enhance (#12181 ) * cherry picked * "cherry picked platform" * "add comment" * "fix ci"	7 years ago
tensor-tang	4f0383f52e	fix unknown flag	7 years ago
tensor-tang	9788e5ab87	add flags to control num_threads	7 years ago
tensor-tang	10a1c2bb86	control omp num_threads	7 years ago
typhoonzero	54e9fd3f61	fix cudnn enforce	7 years ago
qiaolongfei	a6d30a8607	profiler support cpu	7 years ago
Xin Pan	7781297c70	variants	7 years ago
Tao Luo	e568acbee2	Merge pull request #12092 from velconia/add_deps_to_device_ctx Add framework_proto to device context deps	7 years ago
minqiyang	2cc6ca43a0	Add framework_proto to device context deps	7 years ago
Jacek Czaja	fbe25ef510	MKLDNN: Extending Conv MKLDNN op to reuse MKLDNN primitives (#11750 ) * - Rebase of conv reuse - clag formatter fixes - Fix to conv reuse - Yet another fix - Fix - Fix - clagn format * - comment update	7 years ago
tensor-tang	2e418a5227	fix conflicts	7 years ago

457 Commits (eab47459658b10ec799a5dccbfd9bf8f45b9771a)