Paddle

Commit Graph

Author	SHA1	Message	Date
Qiyang Min	698698f2fa	Merge branch 'develop' into fix_vlog	6 years ago
qingqing01	abe209234f	Exhaustive search for cuDNN conv. (#14286) * exhaustive search for cuDNN conv. * Refine code and add unit testing. * Fix model load in fluid/inference and unit testing in conv2d * Follow comments. * Fix compiling test=develop	6 years ago
minqiyang	0c3227a523	Change the origin VLOG level to 10 times Fix code to support cpplint syntax check test=develop	6 years ago
qingqing01	db8c52da5e	Revert "Exhaustive search for cuDNN conv. (#14043)" This reverts commit `ce7d9b0799`.	6 years ago
qingqing01	ce7d9b0799	Exhaustive search for cuDNN conv. (#14043) * exhaustive search for cuDNN conv. * Refine code and add unit testing. * Clean code * Fix model load in fluid/inference and unit testing in conv2d * Follow comments.	6 years ago
whs	0c319e0b35	Add affine grid generator op (#12238) * Add affine grid generator. * fix ffine grid. * Add unitest. * Add CPU kernel and fix unitest. * Fix CPU kernel. * Refine code. test=develop * Fix python api. test=develop * Update python api. test=develop * Fix comment. test=develop * Rename affine_grid_generator to affine_grid and enhence unitest. test=develop * Fix unitest. test=develop	6 years ago
dzhwinter	2d00e65819	namespace issue (#13543) * flags * "follow comment"	6 years ago
JiabinYang	e322fc4e0e	add error info for nccl not found	7 years ago
dzhwinter	d361624c1d	platform module (#12932) * platform module * Update profiler.h	7 years ago
tensor-tang	3dd66390b2	add blas vexp	7 years ago
tensor-tang	0ec1f65cf1	fix blas dot and add cblas scal	7 years ago
tensor-tang	a2203d0466	add cblas dot	7 years ago
dzhwinter	e23ddf6ae4	status (#12764)	7 years ago
Tao Luo	d04ef276a5	Merge pull request #12745 from tensor-tang/refine/op/elewise_mul Refine elementwise mul cpu forward	7 years ago
dzhwinter	00463fdfe3	cudnn windows support (#12757) * cudnn widndows * "add comment" * "windows support" * "fix cmake error"	7 years ago
tensor-tang	6644ce79a5	add mklml vmul	7 years ago
tensor-tang	43cee33a23	add mkl packed gemm	7 years ago
dzhwinter	99a99ec7e3	"remove lapack" (#11966)	7 years ago
Tao Luo	2dae8a4631	Merge pull request #11596 from tensor-tang/refine/mklml/dyload enable dynamic load mklml lib on fluid	7 years ago
Yi Wang	2625178add	No NCCL on macOS (#11652) * Make paddle no longer depend on boost * Update enforce.h	7 years ago
tensor-tang	28a0ef9522	remove usr local lib when dynamic load lib	7 years ago
tensor-tang	3e73a7a924	add usr local lib to dynamic search path	7 years ago
tensor-tang	f503f12925	enable dynamic load mklml lib on fluid	7 years ago
Xin Pan	d2afd21021	Remove cuptiFinalize. In cupti samples, only cuptiFlush is used. I can't find any places calling cuptiFinalize and this API can error out as not_implemented in some cuda installation.	7 years ago
yuyang18	53dab95b75	Static DSO handle	7 years ago
yuyang18	c5115950a8	Use static for dlsym	7 years ago
Yu Yang	3d53631bad	Make dyload strictly use the same ABI in header	7 years ago
Luo Tao	d4682247e1	auto find tensorrt library	7 years ago
Yan Chunwei	186659798f	add tensorrt build support(#9891)	7 years ago
Kexin Zhao	7ed457e77a	Fix cuda 7.5 error with cublas GEMM (#9811) * fix gemm error for cuda 7.5 * fix version number	7 years ago
Yi Wang	47a4ec0672	Remove call_once.h (#9764) * Remove call_once.h * "fix ci"	7 years ago
Yi Wang	e185502ebe	Fix cpplint errors with paddle/fluid/platform/dynload (#9715) * Update source files. * Update headers * Update * Update * Update * Update * Fix a CMake dependency	7 years ago
Kexin Zhao	d00bd9eb72	Update the cuda API and enable tensor core for GEMM (#9622) * change from hgemm to gemmEx * fix cpplint	7 years ago
Kexin Zhao	9ba36604d8	fix cpplint error	7 years ago
Kexin Zhao	187ba08789	enable tensor core for conv cudnn	7 years ago
kexinzhao	90215b7844	Add float16 GEMM math function on GPU (#8695) * test cpu float16 data transform * add isnan etc * small fix * fix containsNAN test error * add data_type transform GPU test * add float16 GPU example * fix error * fix GPU test error * initial commit * fix error * small fix * add more gemm fp16 tests * fix error * add utility function	7 years ago
Xin Pan	f3cbfc021c	Add MEMCPY information	7 years ago
Yu Yang	22b5c07a7d	Fix the compilation on CUDA 9.1/GCC 5.3 * Make CUPTI_LIB_PATH not passing by macro. * Add missing header	7 years ago
Xin Pan	9bbce49353	Fix version date.	7 years ago
Xin Pan	b9ec24c6e9	Extend current profiler for timeline and more features.	7 years ago
Yang Yang(Tony)	87f4311a88	compile with nccl2 (#8411) * compile with nccl2 * add ncclGroup; it is necessary in nccl2 * add back libnccl-dev	7 years ago
qingqing01	24509f4af9	Fix the grammar in copyright. (#8403)	7 years ago
Yi Wang	fc374821dd	Correct #include path	7 years ago
Yi Wang	90648f336d	Move file to fluid/; Edit CMakeLists.txt	7 years ago

44 Commits (d93b2d0365355430f3db723dc3e278851b7a88b4)