Paddle

Commit Graph

Author	SHA1	Message	Date
tensor-tang	8117725852	add jit kernel hsum, hmax and softmax refer code test=develop	6 years ago
peizhilin	1e7f83e60a	add cuda dso support for windows test=develop	6 years ago
peizhilin	40a94a138f	remove irrelevant fix for mkl test=develop	6 years ago
peizhilin	ed5bd5e586	test=develop	6 years ago
Yu Yang	7b10bf0e60	Use mkl	6 years ago
liuhongyu	8daf67f90f	fix bugs; test=develop	6 years ago
liuhongyu	968dd3c078	add cudnn 5 support; test=develop	6 years ago
phlrain	cf1fe61004	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm	6 years ago
Tao Luo	ea47685f91	Merge pull request #14646 from jczaja/prv-softmax-mkl-sasum Softmax for inference MKL further changes	6 years ago
Jacek Czaja	8bfa1fa9bb	- ASUM MKL integration	6 years ago
liuhongyu	05917c3c79	add cudnn lstm; test=develop	6 years ago
minqiyang	be04d99fe4	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog test=develop	6 years ago
minqiyang	53433d7f2e	Revert the changes of VLOG test=develop	6 years ago
peizhilin	36cd18b549	Merge remote-tracking branch 'upstream/develop' into windows/build	6 years ago
chengduozh	f7847ca6a3	fix cublas warp error test=develop	6 years ago
chengduo	00b9e9a135	Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929 ) * refine cublase test=develop * code refine * refine cublas * add GEMME_EX * add enable_cublas_tensor_op_math doc and add cublasCall test=develop * fix CublasCall for cuda version test=develop * fix error test=develop * fix GEMM_EX to be compatible with gcc 4.8 test=develop * add GEMM_EX test=develop * to compatiable with gcc4.8 test=develop	6 years ago
peizhilin	7c8c9dc9bf	fix unit test cases	6 years ago
wopeizl	d9a1f3e58e	Windows/online (#14474 ) * add recordio support * disable the openblas multi-thread on windows since no support adjust the python script * code style * code style test=develop * add create_recordio_file_reader back * fix code style test=develop * fix the gtest.cmake on windows * fix cc_test on windows * fix the win build test=develop * remove fused compile support on windows test=develop * add the jit support test=develop * add the jit support, test=develop * add the jit support, test=develop * add the jit back fix compile error on windows * rollback test=develop * test case fix * disable DSO by default on windows * exclude warpctc_op on windows * exclude the dynload_warpctc out on windows test=develop * fix the scripts error test=develop * disable avx on windows by default test=develop * re-organize the cmake file * disable mkl on windows by default * add warp_ctc back * fix the dependency * fix the dependency * fix the build issue on windows * remove unsupported flag on windows * code style * code style test=develop * fix issue * add profiler, parallel_executor back * clean up the pre-definitions on windows * fix build issue * test=develop	6 years ago
peizhilin	6e66fadb95	clean up the pre-definitions on windows	6 years ago
qingqing01	fd7e643153	Convolution fusion operator. (#14449 ) * Convolution fusion operator. * Clean code test=develop	6 years ago
Wu Yi	b32c13dc20	Add cudnn ctc loss (#12366 ) * add cudnn ctc loss * wip add test test=develop * wip * wip * done test=develop * move include cudnn test=develop * test test=develop * fix build test=develop * fix build test=develop * fix build on cudnn5 test=develop * fix cudnn5 build test=develop * fix cudnn5 build test=develop * merge develop softmax functor change test=develop	6 years ago
tensor-tang	1be85d011d	add mkl vsqr and vpow	6 years ago
Qiyang Min	698698f2fa	Merge branch 'develop' into fix_vlog	6 years ago
qingqing01	abe209234f	Exhaustive search for cuDNN conv. (#14286 ) * exhaustive search for cuDNN conv. * Refine code and add unit testing. * Fix model load in fluid/inference and unit testing in conv2d * Follow comments. * Fix compiling test=develop	6 years ago
minqiyang	0c3227a523	Change the origin VLOG level to 10 times Fix code to support cpplint syntax check test=develop	6 years ago
qingqing01	db8c52da5e	Revert " Exhaustive search for cuDNN conv. (#14043 )" This reverts commit `ce7d9b0799`.	6 years ago
qingqing01	ce7d9b0799	Exhaustive search for cuDNN conv. (#14043 ) * exhaustive search for cuDNN conv. * Refine code and add unit testing. * Clean code * Fix model load in fluid/inference and unit testing in conv2d * Follow comments.	6 years ago
whs	0c319e0b35	Add affine grid generator op (#12238 ) * Add affine grid generator. * fix ffine grid. * Add unitest. * Add CPU kernel and fix unitest. * Fix CPU kernel. * Refine code. test=develop * Fix python api. test=develop * Update python api. test=develop * Fix comment. test=develop * Rename affine_grid_generator to affine_grid and enhence unitest. test=develop * Fix unitest. test=develop	6 years ago
dzhwinter	2d00e65819	namespace issue (#13543 ) * flags * "follow comment"	6 years ago
JiabinYang	e322fc4e0e	add error info for nccl not found	7 years ago
dzhwinter	d361624c1d	platform module (#12932 ) * platform module * Update profiler.h	7 years ago
tensor-tang	3dd66390b2	add blas vexp	7 years ago
tensor-tang	0ec1f65cf1	fix blas dot and add cblas scal	7 years ago
tensor-tang	a2203d0466	add cblas dot	7 years ago
dzhwinter	e23ddf6ae4	status (#12764 )	7 years ago
Tao Luo	d04ef276a5	Merge pull request #12745 from tensor-tang/refine/op/elewise_mul Refine elementwise mul cpu forward	7 years ago
dzhwinter	00463fdfe3	cudnn windows support (#12757 ) * cudnn widndows * "add comment" * "windows support" * "fix cmake error"	7 years ago
tensor-tang	6644ce79a5	add mklml vmul	7 years ago
tensor-tang	43cee33a23	add mkl packed gemm	7 years ago
dzhwinter	99a99ec7e3	"remove lapack" (#11966 )	7 years ago
Tao Luo	2dae8a4631	Merge pull request #11596 from tensor-tang/refine/mklml/dyload enable dynamic load mklml lib on fluid	7 years ago
Yi Wang	2625178add	No NCCL on macOS (#11652 ) * Make paddle no longer depend on boost * Update enforce.h	7 years ago
tensor-tang	28a0ef9522	remove usr local lib when dynamic load lib	7 years ago
tensor-tang	3e73a7a924	add usr local lib to dynamic search path	7 years ago
tensor-tang	f503f12925	enable dynamic load mklml lib on fluid	7 years ago
Xin Pan	d2afd21021	Remove cuptiFinalize. In cupti samples, only cuptiFlush is used. I can't find any places calling cuptiFinalize and this API can error out as not_implemented in some cuda installation.	7 years ago
yuyang18	53dab95b75	Static DSO handle	7 years ago
yuyang18	c5115950a8	Use static for dlsym	7 years ago
Yu Yang	3d53631bad	Make dyload strictly use the same ABI in header	7 years ago
Luo Tao	d4682247e1	auto find tensorrt library	7 years ago

66 Commits (b5ebca47a352412b01692d01aff7b6f4f371b685)