Paddle

Commit Graph

Author	SHA1	Message	Date
peizhilin	6ccdb1b947	fix build issue on windows for sample prop op test=develop	6 years ago
Dun	c6bd434ffe	add memset CUPTI && test=develop (#15868)	6 years ago
Sylwester Fraczek	74672d1aff	Change (smart_ptr.get()) -> smart_ptr reason: dereferencing smart pointer is the same as the underlying pointer test=develop	6 years ago
tensor-tang	ee2321debd	Revert 15770 develop `a6910f900` gelu mkl opt (#15872) * Revert "Optimze Gelu with MKL Erf function (#15770)" This reverts commit `676995c86c`. * test=develop	6 years ago
chengduo	3b08c9abf4	enhance profiler (#15842) test=develop	6 years ago
Yihua Xu	676995c86c	Optimze Gelu with MKL Erf function (#15770) * Optimize for gelu operator * Set up the low accuracy mode of MKL ERF function. test=develop * Only enable MKLML ERF when OS is linux * Use the speical mklml version included vmsErf function to verify gelu mkl kernel. test=develop * Add the CUDA macro to avoid NVCC's compile issue. test=develop * Add the TODO comments for mklml library modification. test=develop * Clean Code test=develop * Add the comment of marco for NVCC compiler. test=develop	6 years ago
Tao Luo	e3dd6970fc	disable dam temporarily (#15860) test=develop	6 years ago
Dun Liang	35a90e06bf	test=develop	6 years ago
Dun Liang	c9080f516b	test=develop	6 years ago
Dun Liang	1c7bb0e40c	test=develop	6 years ago
Xin Pan	5eb87506bc	add per kernel config and remove const_cast. test=develop	6 years ago
Dun	a83e470405	Profiler refine and add CUDA runtime api tracer (#15301) * refine profiler && add runtime tracer * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * fix bug && test=develop * add thread id map && test=develop * test=develop * testing * bug fix * remove cuda event && refine code && test=develop * test=develop * test=develop * test=develop * fix windows temp file && test=develop * test=develop * fix windows bug && test=develop * fix start up issue && test=develop * code polish && test=develop * remove unused code && test=develop * add some cupti cbid && test=develop * add FLAGS_multiple_of_cupti_buffer_size && test=develop * fix compile error && test=develop * add keyword && test=develop * fix && test=develop * code polish && test=develop	6 years ago
mozga-intel	13ec2d331b	Enable momentum operator for a ngraph engine (#15673) * Enable momentum operator for a ngraph engine test=develop * Update tests test=develop * Unnecessary line of the code as intended was removed test=develop	6 years ago
Tao Luo	c797a1f050	remove legacy any.cmake	6 years ago
Tao Luo	bd2fa73620	Merge pull request #15794 from sneaxiy/fix-warnings Fix compile warning	6 years ago
tensor-tang	e1c707fe9c	fix warnings (#15790) * fix warnings test=develop * fix enforce test test=develop	6 years ago
sneaxiy	9b8e0e2f17	fix enforce_test test=develop	6 years ago
sneaxiy	209b355762	fix many warning test=develop	6 years ago
Zeng Jinle	fc87ef741b	Merge pull request #15687 from sneaxiy/fix_enforce fix enforce	6 years ago
sneaxiy	f0590947c3	fix enforce test=develop	6 years ago
tensor-tang	31fd8ce1e1	Merge pull request #15375 from mozga-intel/mozga-intel/batch_norm_ngraph_operator Enable batch_norm operator for a ngraph engine	6 years ago
dzhwinter	04e9776aef	add details. test=develop	6 years ago
mozga-intel	1198ccae6b	Enable batch_norm operator for a ngraph engine test=develop	6 years ago
peizhilin	883d22093a	fix the lib_any dependency test=develop	6 years ago
wopeizl	3614dadf23	Merge pull request #15631 from wopeizl/windows/fixci fix ci broken randomly and disable some warnings	6 years ago
peizhilin	061299be87	fix dependency test=develop	6 years ago
baojun	ac4cde009d	Enable accuracy op for ngraph engine (#15592) * Added accuracy ngraph op test=develop * fixed name type test=develop	6 years ago
dzhwinter	ce0394bcd0	merge develop branch. test=develop	6 years ago
guoshengCS	b6c3b69af8	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-beam-search-size test=develop	6 years ago
liuwei1031	6e84eb131f	expose peak gpu memory API to python test=develop (#15529) * expose peak gpu memory API to python test=develop * add unittest for peak gpu memory monitoring test=develop * add pybind change test=develop * add mutex to gpu mem usage monitor test=develop * update benchmark flag definition file test=develop * tweak unittest for memory monitoring test=develop	6 years ago
guoshengCS	5dfce93101	To make CUDA_LAUNCH_KERNEL_HELPER support large size. test=develop	6 years ago
tensor-tang	8117725852	add jit kernel hsum, hmax and softmax refer code test=develop	6 years ago
sneaxiy	ba4f43fd62	fix compile error in distributed mode test=develop	6 years ago
Yiqun Liu	3008fa1261	Add the CUDA kernel for beam_search op (#15020) * Refine the beam_search op and test. * A basic CUDA implementation of beam_search for small batch_size. * Implement CUDA kernel for beam_search_op. * Use multiple CUDA threads in the same block to select the top beam. * Update the python api of beam_search op. * Enable extend function in CPU kernel of beam_search op. * Unify the CUDA codes. test=develop * Unify the CPU kernel of beam_search op. * Ensure the seletced items of beam_search_op's CPU kernel sorted by scores. * Update the description of beam_search in API.spec. * Enable the use of CUDA kernel in beam_search op. * Exclude the beam_search's CUDA unittest when there is no CUDA gpu, and delete some debuging statements. test=develop * Follow comments. test=develop * Call the CPU kernel for beam_search op when batch_size > 4. test=develop * Remove the except of is_empty op in PrepareData. test=develop	6 years ago
Zeng Jinle	2480a3df7d	Merge pull request #15496 from sneaxiy/lazy_allocator2 Fix bug when user set CUDA_VISIBLE_DEVICES be empty and run CPU-only models	6 years ago
sneaxiy	9c360cc798	test=develop	6 years ago
Xin Pan	58cb18d9d9	Merge pull request #15322 from velconia/imperative_resnet Imperative Resnet	6 years ago
sneaxiy	51227bd447	lazy_allocator test=develop	6 years ago
tangwei12	8b50ad80ff	checkpoint at distributed training (#14854) checkpoint for distributed training.	6 years ago
minqiyang	8ce198b2e1	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_resnet test=develop	6 years ago
minqiyang	315b133e67	Add single GPU support to imperative	7 years ago
tensor-tang	3759c1db8c	Merge pull request #14805 from mozga-intel/mozga-intel/element_wise_operator_ngraph Enable element_wise_add operator for a ngraph engine	7 years ago
peizhilin	eea75a1d93	fix issue when type is invalid test=develop	7 years ago
peizhilin	9adb158e5b	Merge remote-tracking branch 'upstream/develop' into debug/support	7 years ago
chengduo	46d01d798e	Revert "Revert "Remove workspace_handle in conv_cudnn (#15186)"" (#15290) test=develop This reverts commit `358e657f68`.	7 years ago
Wojciech Uss	cb2ba58458	Fix performance drop when with MKL-DNN test=develop	7 years ago
chengduozh	c4eced9881	fix thread safe bug test=develop	7 years ago
chengduozh	358e657f68	Revert "Remove workspace_handle in conv_cudnn (#15186)" test=develop This reverts commit `064512aa47`.	7 years ago
wopeizl	5d9edb4124	Merge pull request #15156 from wopeizl/windows/fixgpuissue fix gpu buils issue on windows test=develop	7 years ago
chengduo	064512aa47	Remove workspace_handle in conv_cudnn (#15186) * remove workspace_handle in conv2d_cudnn test=develop * remove workspace_handle test=develop * fix bug test=develop * make test_conv2d_op SERIAL test=develop * save memory in conv_cudnn test=develop * enhance thread safety test=develop * enhance temporary allocator test=develop * Add excess fraction test=develop * follow comments test=develop * fix bug and code refine test=develop * fix memory size check test=develop * rename reuse_tmp_allocation_excess_fraction test=develop	7 years ago
xiaolil1	8f17c714de	Conv int8 residual (#15145) * Enable basic MKL-DNN INT8 Conv OP test=develop * Modify test case test=develop * Clean unittest code test=develop * Fix test test=develop * Modify test test=develop * Enable MKL-DNN INT8 Conv with Relu Fusion OP test=develop * Enable INT8 Conv with residual fusion OP test=develop * Modify code. test=develop * Modify basic INT8 Conv test=develop * Modify Conv. test=develop * fix style test=develop * Fix style test=develop * Fix test test=develop * Modify code. test=develop * Fix test test=develop	7 years ago
peizhilin	439691f5bd	adjust the shlwapi on windows test=develop	7 years ago
peizhilin	92da467c99	Merge remote-tracking branch 'upstream/develop' into windows/fixgpuissue	7 years ago
peizhilin	c1235c935f	add the enable_debug flag test=develop	7 years ago
Zeng Jinle	e29f10d315	Merge pull request #15207 from sneaxiy/remove_op_handle_lock_and_fix_var Remove op handle lock and fix var	7 years ago
mozga-intel	a42f8f4f6f	Enable element_wise_add operator for a ngraph test=develop	7 years ago
Zeng Jinle	c562be20d9	Merge pull request #15193 from sneaxiy/fix_cudnn_compatible_check Fix cudnn compatible check	7 years ago
peizhilin	1cd95d8a0b	use thread local instance test=develop	7 years ago
sneaxiy	ed409ac9f4	Revert "Revert "Remove op handle lock"" test=develop	7 years ago
peizhilin	d54133ea85	not include the numeric under linux test=develop	7 years ago
peizhilin	a6f5ceee74	add the python callstack for debug support test=develop	7 years ago
Zeng Jinle	dacfaaa966	Revert "Remove op handle lock" test=develop	7 years ago
xiaolil1	c8f101e5da	Conv int8 relu (#15130) * Enable basic MKL-DNN INT8 Conv OP test=develop * Modify test case test=develop * Clean unittest code test=develop * Fix test test=develop * Modify test test=develop * Enable MKL-DNN INT8 Conv with Relu Fusion OP test=develop * Modify basic INT8 Conv test=develop * fix type test=develop * Modify test test=develop	7 years ago
sneaxiy	9793a0b6a6	fix_cudnn_compatible_check	7 years ago
Zeng Jinle	ccb322d6a5	merge develop	7 years ago
Zeng Jinle	f3a13512fc	Merge pull request #15139 from sneaxiy/remove_op_handle_lock Remove op handle lock	7 years ago
xiaolil1	bbc9336878	Enable basic MKL-DNN INT8 Conv OP (#15124) * Enable basic MKL-DNN INT8 Conv OP test=develop * Modify test case test=develop * Clean unittest code test=develop * Fix test test=develop * Modify test test=develop * Modify basic INT8 Conv test=develop	7 years ago
peizhilin	c919b2f31d	Merge remote-tracking branch 'upstream/develop' into windows/fixgpuissue	7 years ago
peizhilin	fd4f4d0e5f	fix build issue test=develop	7 years ago
Yan Xu	a1e60ab19b	Merge pull request #14791 from Yancey1989/parallel_graph_mode [Feature] Add ParallelGraph executor mode in parallelexecutor to improve performance	7 years ago
peizhilin	9ae50dd07d	fix gpu buils issue on windows test=develop	7 years ago
sneaxiy	d0a8a1e950	remove_op_handle_lock test=develop	7 years ago
Yancey1989	e65436103f	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode test=develop	7 years ago
sneaxiy	6f06e6cdac	Merge remote origin test=develop	7 years ago
Xin Pan	9186451f60	hide GetTensor test=develop	7 years ago
sneaxiy	d25395fc98	remove tensor core lock test=develop	7 years ago
Yancey1989	82b42e31f0	polish unittest test=develop	7 years ago
Yancey1989	0a885ac12a	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode test=develop	7 years ago
peizhilin	813c2ce539	fix timer test=develop	7 years ago
wopeizl	7ab501264d	Merge pull request #15069 from wopeizl/windows/dsosupport add cuda dso support for windows	7 years ago
guru4elephant	ff739449ab	Merge pull request #15018 from guru4elephant/add_timer Add debug thread function for async executor	7 years ago
Yancey1989	4743c9cd5d	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode	7 years ago
wopeizl	719ebe3786	Merge pull request #15070 from wopeizl/windows/testcasefix fix test issues on windows	7 years ago
Qiyang Min	0238a3bb4f	Merge pull request #14972 from velconia/accelerate_lstm Accelerate PADDLE_ENFORCE	7 years ago
Yancey1989	86bb583881	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode	7 years ago
peizhilin	01c00b07dd	fix test issues on windows test=develop	7 years ago
peizhilin	1e7f83e60a	add cuda dso support for windows test=develop	7 years ago
Yancey1989	41a64f6a2a	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode	7 years ago
Wu Yi	856f0da0fe	Fp16 training (#14992) * wip * wip * wip * wip for test * add fp16 tests test=develop * fix cpu build test=develop * fix test=develop * fix py3 tests test=develop * fix lr_scheduler dtype test=develop * fix test=dvelop * test fix ci compile test=develop * fix build and merge test=develop * fallback momentumop change to general test=develop * make fp16 lr schedule simple test=develop * fix ut test=develop * fix tests test=develop * remove fp16 learning rate cast test=develop	7 years ago
chengduo	b9fb03cf54	Move GetTensor to tensor_util (#15011) * refine tensor test=develop * refine tensor test=develop * fix device_context log test=develop	7 years ago
dongdaxiang	ab2abfc5b2	Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer test=develop	7 years ago
dongdaxiang	4cb833d2de	Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer test=develop	7 years ago
tensor-tang	f0e02a65ed	Merge pull request #14974 from xiaolil1/quantize Add Quantize OP	7 years ago
dongdaxiang	68a2d1f3d7	Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer add timer_test test=develop	7 years ago
dongdaxiang	2e5ebc4594	Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer test=develop	7 years ago
dongdaxiang	5dfd9c9aa9	Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer test=develop	7 years ago
dongdaxiang	d0a5159946	Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer test=develop	7 years ago
dongdaxiang	f9b8168508	Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer test=develop	7 years ago
minqiyang	52b4821a6e	Fix Sprintf problem test=develop	7 years ago
minqiyang	010f657b33	Polish code test=develop	7 years ago
minqiyang	45acfbd011	1. Add specific condition for one or no arg in PADDLE_ENFORCE 2. Add unit test for new enforce feature test=develop	7 years ago
dongdaxiang	2dee8f6cd5	add TrainFilesWithTimer in async_executor	7 years ago
xiaoli.liu@intel.com	d83d0f33fd	extract templated function test=develop	7 years ago
wopeizl	b117a5f208	Merge pull request #14931 from wopeizl/windows/mkl add mkl support for windows	7 years ago
dongdaxiang	cf6188a823	add a linux timer	7 years ago
chengduo	79bd6dfa18	[Feature] Add Temporary Allocator (#14875) * Add Temporal Allocator * add Temporay Allocator to DeviceContext test=develop * code refine test=develop * fix mean_iou test=develop * Add DeviceTemporaryAllocator test=develop * fix conv_op bug test=develop * small fix test=develop * code refine test=develop * log refine test=develop * fix unit test test=develop * move double check * refine concat_and_split test=develop * add limit_of_temporary_allocation test=develop * fix name test=develop	7 years ago
minqiyang	e4719eb462	Fix bug in Windows VC 2010 test=develop	7 years ago
minqiyang	5a5c577529	Polish code test=develop	7 years ago
minqiyang	099186cd41	Support one argument PADDLE_ENFORCE test=develop	7 years ago
minqiyang	4af97c6946	Polish code	7 years ago
minqiyang	41b81293ab	Polish code test=develop	7 years ago
peizhilin	9e60c58666	Merge remote-tracking branch 'upstream/develop' into windows/mkl test=develop	7 years ago
minqiyang	bc66401566	Polish code test=develop	7 years ago
minqiyang	53619a79b4	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_lstm	7 years ago
peizhilin	b06ce129bc	some not so useful adjust test=develop	7 years ago
minqiyang	679d1a9e0b	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_lstm	7 years ago
Jacek Czaja	709d9e3cb7	- Added reusing MKL-DNN primitives for Transpose MKL-DNN op test=develop	7 years ago
peizhilin	40a94a138f	remove irrelevant fix for mkl test=develop	7 years ago
mozga-intel	9035bb81fe	Enable mul operator for a ngraph engine (#14801) * Enable mul operator for a ngraph test=develop * Enable activation ops test test=develop * Remove unused line test=develop	7 years ago
peizhilin	07c7eaabb4	Merge remote-tracking branch 'upstream/develop' into windows/mkl test=develop	7 years ago
peizhilin	ed5bd5e586	test=develop	7 years ago
peizhilin	19ebd8b4cf	add ctc support for windows	7 years ago
minqiyang	a3fa3f85d7	Polish code test=develop	7 years ago
Yu Yang	2803cf5776	Merge pull request #14868 from reyoung/feature/refine_w2v Feature/refine w2v	7 years ago
peizhilin	b601f2de8d	include the mkl fix only test=develop	7 years ago
peizhilin	5a6d7fe2ff	add mkl,ctc support for windows	7 years ago
wopeizl	0f085f0a5a	Merge pull request #14892 from wopeizl/windows/port3 fix script issue	7 years ago
Zeng Jinle	36a1d021a4	Merge pull request #14927 from sneaxiy/fix_cuda_stream_callback_in_cuda10 Fix stream_callback_manager bug in CUDA 10	7 years ago
wopeizl	fa78fc60be	Merge pull request #14907 from wopeizl/windows/avx add avx support for windows	7 years ago
sneaxiy	2373aeb5e8	fix bug test=develop	7 years ago
minqiyang	aa41ee75a1	Accelerate PADDLE_ENFORCE	7 years ago
peizhilin	41456e1723	Remove the useless definition test=develop	7 years ago
Yu Yang	740e1626ce	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/refine_w2v test=develop	7 years ago
Yancey1989	a760a550b0	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode	7 years ago
peizhilin	d519fd6944	test=develop	7 years ago
Yu Yang	bacf1d2399	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/tensor_type	7 years ago
Yan Chunwei	a985949be9	Fea/fuse conv elementwise add fuse (#14669)	7 years ago
Yancey1989	4a4ccac1d0	update by comment test=develop	7 years ago
peizhilin	23dec78772	fix script issue test=develop	7 years ago
Yancey1989	c722b1dcb6	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode test=develop	7 years ago
Yu Yang	4ecdb6f486	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/tensor_type test=develop	7 years ago
Yu Yang	7b10bf0e60	Use mkl	7 years ago
sneaxiy	ca84c2ca8f	merge develop test=develop	7 years ago
Yu Yang	81520a24cf	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/refine_eigen_tensor	7 years ago
Yu Yang	9bd70a1e04	Change tensor uses proto::VarType::type test=develop	7 years ago
Yu Yang	8175983ef9	Merge pull request #14814 from reyoung/feature/gprof Add gperftools supports for PE	7 years ago
Yu Yang	5e60906996	Fix compile error test=develop	7 years ago
Yu Yang	7604b1ad51	Fix Eigen macro when using GPU The macro should be defined by compiler rather than by source. test=develop	7 years ago
sneaxiy	7923042365	merge develop test=develop	7 years ago
Yu Yang	b22d638d8f	Speed up SizeOfType test=develop	7 years ago
Yancey1989	2dda19f756	Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode	7 years ago
sneaxiy	66182abda6	add cuda cudnn version check test=develop	7 years ago
Zeng Jinle	add98c9e7d	Merge pull request #14745 from sneaxiy/fix_eigen_deallocate Fix eigen deallocate bug	7 years ago
Yancey1989	cb8a24be14	clean code	7 years ago
Tao Luo	54fcafb5f6	Merge pull request #14707 from yihuaxu/develop_4f71a6ee2_conv3d_mkldnn_opt Implement conv3d with mkldnn library	7 years ago
Yancey1989	c9de6f1b05	init parallel graph mode	7 years ago
sneaxiy	0f96c2e80f	fix thread-safety bug test=develop	7 years ago
Yihua Xu	65dbc7cca4	Merge branch 'develop' into develop_4f71a6ee2_conv3d_mkldnn_opt	7 years ago
tensor-tang	4a93db9288	remove jit namespace test=develop	7 years ago
sneaxiy	900765224c	fix deallocate bug test=develop	7 years ago
liuhongyu	773dc73fbf	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_5_support	7 years ago
liuhongyu	8daf67f90f	fix bugs; test=develop	7 years ago
Xin Pan	052cc5f538	Merge pull request #14725 from ZongwuYang/my-cool-stuff My cool stuff	7 years ago
Wu Yi	29d9fb53fc	[Feature] multi process multi gpu dist training, boost v100 performance by 20% (#14661) * wip multi process multi gpu dist training * workable for p2p * update test=develop * change back env name test=develop * fix alloc init * fix cpu build test=devlop * fix mac tests test=develop * refine code * refine test=develop	7 years ago
liuhongyu	968dd3c078	add cudnn 5 support; test=develop	7 years ago
ZongwuYang	1560eb4a6d	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into my-cool-stuff	7 years ago
ZongwuYang	deb04809bd	test=develop Fix the bug that profiler cannot trace the nccl allreduce operator	7 years ago
sneaxiy	35a2578426	fix bug test=develop	7 years ago
sneaxiy	64ad051b9a	merge develop test=develop	7 years ago
sneaxiy	c47c451a00	fix bug	7 years ago
Yihua Xu	669191c9cc	Implement conv3d with mkldnn library (test=develop)	7 years ago
Hongyu Liu	4f71a6ee2c	Merge pull request #14622 from PaddlePaddle/add_cudnn_lstm Add cudnn lstm	7 years ago
Yibing Liu	c7382df80f	Print assert failure id in lookup_table_op (#14698)	7 years ago
sneaxiy	096673f675	refactor eager deletion test=develop	7 years ago
phlrain	cf1fe61004	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm	7 years ago
Tao Luo	20120d9c97	Merge pull request #14608 from jczaja/prv-conv2d-transpose-mkldnn [MKL-DNN]conv2d transpose	7 years ago
Tao Luo	ea47685f91	Merge pull request #14646 from jczaja/prv-softmax-mkl-sasum Softmax for inference MKL further changes	7 years ago
minqiyang	a02ce58f2c	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog test=develop	7 years ago
Tao Luo	4ec9de0122	Merge pull request #14628 from Sand3r-/mgallus/mkldnn-elementwise_mul EltwiseMul: Changes from previous PR	7 years ago
Clementine	6c71c1f8f9	Add activation gelu (#14569)	7 years ago
Michal Gallus	9455be0ba5	EltwiseMul: Extract StringToFormat to MKLDNN helper test=develop	7 years ago
Jacek Czaja	8bfa1fa9bb	- ASUM MKL integration	7 years ago
liuhongyu	05917c3c79	add cudnn lstm; test=develop	7 years ago
peizhilin	38715e6fd0	minor fix	7 years ago
Jacek Czaja	fb24690a58	- conv2d transpose MKL-DNN test=develop - Added new header for MKLDNN reuse functionality - Extended conv2d_transpose GetExpectedKernelType for MKL-DNN supporrt - Buildable conv transpose mkldnn and conv mkldnn using conv template - Conv2d transpose roughlt implemented and buildable - Added modifications conv2d transpose MKLDNN unit tests - Fix to UT of conv2d transpose mkldnn op - Wrong type of MKLDNN primitive was chosen for conv2d transpose - HAcks for conv2d transpose - UT enalbed - Replaced copying loop with memcpy - Draft of passing lambda into AcquireMemory - Made reorder (IOHW->OIHW) to be called only once	7 years ago
minqiyang	be04d99fe4	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog test=develop	7 years ago
minqiyang	53433d7f2e	Revert the changes of VLOG test=develop	7 years ago
peizhilin	36cd18b549	Merge remote-tracking branch 'upstream/develop' into windows/build	7 years ago
peizhilin	b2f8d4183d	Given the different fraction_of_gpu_memory_to_use depends on platform	7 years ago
Yu Yang	26af9cf90c	Merge pull request #14565 from chengduoZH/fix_cublas_warp_error Fix cublas warp error	7 years ago
chengduozh	f7847ca6a3	fix cublas warp error test=develop	7 years ago
luotao1	e21edb26f6	add Set/GetCPUNumThreads api	7 years ago
peizhilin	445fff24dc	add the bigobj option to NVCC compile fix code style	7 years ago
chengduo	00b9e9a135	Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929) * refine cublase test=develop * code refine * refine cublas * add GEMME_EX * add enable_cublas_tensor_op_math doc and add cublasCall test=develop * fix CublasCall for cuda version test=develop * fix error test=develop * fix GEMM_EX to be compatible with gcc 4.8 test=develop * add GEMM_EX test=develop * to compatiable with gcc4.8 test=develop	7 years ago
peizhilin	7c8c9dc9bf	fix unit test cases	7 years ago
wopeizl	d9a1f3e58e	Windows/online (#14474) * add recordio support * disable the openblas multi-thread on windows since no support adjust the python script * code style * code style test=develop * add create_recordio_file_reader back * fix code style test=develop * fix the gtest.cmake on windows * fix cc_test on windows * fix the win build test=develop * remove fused compile support on windows test=develop * add the jit support test=develop * add the jit support, test=develop * add the jit support, test=develop * add the jit back fix compile error on windows * rollback test=develop * test case fix * disable DSO by default on windows * exclude warpctc_op on windows * exclude the dynload_warpctc out on windows test=develop * fix the scripts error test=develop * disable avx on windows by default test=develop * re-organize the cmake file * disable mkl on windows by default * add warp_ctc back * fix the dependency * fix the dependency * fix the build issue on windows * remove unsupported flag on windows * code style * code style test=develop * fix issue * add profiler, parallel_executor back * clean up the pre-definitions on windows * fix build issue * test=develop	7 years ago
peizhilin	6e66fadb95	clean up the pre-definitions on windows	7 years ago
peizhilin	67562a6fcd	Merge remote-tracking branch 'upstream/develop' into windows/build	7 years ago
peizhilin	703b26e697	add profiler, parallel_executor back	7 years ago
chengduo	a8d3aaae2a	print output log warning (#14497) test=develop	7 years ago

764 Commits (dfdd73cbc029a000468fc502f33376b627a6e482)