Paddle

Commit Graph

Author	SHA1	Message	Date
qingqing01	58db07b7bb	Check errors for the cuda kernel calls. (#5436)	8 years ago
QI JUN	afd1e844fd	remove unused code (#5219) * remove unused code * fix cmake file * fix build error	8 years ago
Dong Zhihong	16a39d24f3	fix conflict	8 years ago
Qiao Longfei	56b723c40d	Cudnn batch norm op (#5067) * init cudnn batch norm op * rename batch_norm_cudnn_op.cc batch_norm_op.cu * correct name style * add ExtractNCWHD, simplify code * fix ExtractNCWHD * use CUDNN_ENFORCE instead of PADDLE_ENFORCE	8 years ago
Dong Zhihong	0990c87bf6	checkin nccl operator	8 years ago
Yu Yang	94e741d6f0	Use external project for NCCL (#5028)	8 years ago
Yu Yang	43c6ff212e	Feature/nccl dso (#5001) * "add nccl enforce" * Dev * Update comment * Add nccl test * Follow comments	8 years ago
Markus Kliegl	164898277c	MatMul operator (#4856) * initial matmul operator Similar to np.matmul, but also has transpose_X and transpose_Y flags, and only supports tensors from rank 1 to 3 inclusive. For GPU, uses cublas?gemmStridedBatched. For CPU, uses cblas_?gemm_batch if available via MKL; otherwise a simple serial implementation that loops over the batch dimension is employed for now.	8 years ago
武毅	a3ccbdb3b6	Cudnn conv op (#4195) * add cudnn_conv_op * WIP * update * update * fix grad check * use platform::memory * add support group for cudnn * update * follow comments * fix onlycpu build * update cuda define * follow comments * follow comments * merge with updates * fix compile error * follow comments * follow comments	8 years ago
Yang Yang(Tony)	c3bf332666	Merge pull request #4537 from QiJune/executor_impl Executor interface design and implementation	8 years ago
Luo Tao	871a3f6e76	remove unused PADDLE_ONLY_CPU comment	8 years ago
Yang Yang	e51557130e	clean up for review	8 years ago
qijun	1f5192a27b	fix executor gpu unittest	8 years ago
qijun	39f75a13a4	Merge remote-tracking branch 'baidu/develop' into executor_impl	8 years ago
Yi Wang	880b874b47	Merge branch 'develop' of https://github.com/paddlepaddle/paddle into paddle_only_cpu	8 years ago
Yi Wang	2b204f048b	Rename platform::GetDeviceCount into platform::GetCUDADeviceCount	8 years ago
qijun	e02cc571cf	Merge remote-tracking branch 'baidu/develop' into executor_impl	8 years ago
qijun	fe10e86dd5	fix gpu build error	8 years ago
Yi Wang	4558807c48	Use PADDLE_WITH_CUDA instead of PADDLE_WITH_GPU	8 years ago
Yu Yang	84500f9487	Change `PADDLE_ONLY_CPU` to `PADDLE_WITH_GPU` By shell command ```bash sed -i 's#ifdef PADDLE_ONLY_CPU#ifndef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'` sed -i 's#ifndef PADDLE_ONLY_CPU#ifdef PADDLE_WITH_GPU#g' `find ./paddle/ -name '*.h' -o -name '*.cc' -o -name '*.cpp' -o -name '*.c' -o -name '*.cu'` ```	8 years ago
qijun	cb198fa7b6	merge baidu/develop	8 years ago
qijun	395051512d	remove device context manager	8 years ago
qijun	6c4d1f551d	refine codes	8 years ago
qijun	023ed5eb39	merge baidu/develop	8 years ago
qijun	b5dbe88b5a	follow comments	8 years ago
dzhwinter	8acc010691	Merge branch 'develop' into macro	8 years ago
dongzhihong	5423cb3e57	format	8 years ago
Yu Yang	8fd845e0fa	Unify Map in OpDescBind	8 years ago
chengduoZH	df59889984	remove conflict	8 years ago
qijun	b611a479fc	fix gpu build error	8 years ago
qijun	7a6fcc7d30	move EigenDeviceConverter to device_context.h	8 years ago
Yu Yang	f2feb33384	Follow comments	8 years ago
Yu Yang	3a5693e0a8	Add Skeleton of Double support	8 years ago
chengduoZH	3c0f079333	remove conflict and fix InferShape function	8 years ago
Yu Yang	bc30ba19ed	Merge pull request #4375 from reyoung/feature/use_bool_for_enforce Use `bool` for PADDLE_ENFORCE, not int	8 years ago
chengduoZH	30a586df0c	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into Add_pool_op	8 years ago
Qiao Longfei	d0ad82cff1	fix nv_library (#4370) * fix nv_library * fix symbol in gpu_info.h	8 years ago
Yu Yang	699dbe3be9	Use `bool` for PADDLE_ENFORCE, not int * If stat is an integer, bool value will implicit cast to int before pass to PADDLE_ENFORCE	8 years ago
Yu Yang	ba1f5b5c58	Sync computation when Python invoke `run` * Since GPU is an async device by default. We should sync computation when Python invoke `run`. So Python can get the correct computation result	8 years ago
chengduoZH	0417e4e4bf	fix framework::LoDTensor => Tensor	8 years ago
dangqingqing	41a2321a0e	Refine platform::Transform function and fix prelu_op testing.	8 years ago
Yu Yang	87e4e25db1	Change Transform API Using DeviceContext, not Place to get stream	8 years ago
Yu Yang	847fe47310	Merge branch 'develop' of github.com:baidu/Paddle into feature/remove_lazy_init_in_dev_ctx	8 years ago
Yu Yang	81d56ca86b	Remove lazy-initialization in device_context * Also use `const DeviceContext&` all the time, to prevent `const_cast` Fix #4169 Fix #3468 Fix #3475	8 years ago
武毅	8580dce308	Refine accuracy_op CUDA kernel (#4097) * refind accuracy_op * follow comments * follow comments	8 years ago
Yu Yang	9d3b920d75	Merge pull request #3981 from reyoung/feature/transform_api Host and device transform API	8 years ago
liaogang	59d661b9a9	Fix enforce test failed Note: If no symbol with a suitable value is found, both this field and dli_saddr shall be set to NULL.	8 years ago
Yu Yang	f8c6792aa3	Extract DevPtrCast to device_ptr_cast.h	8 years ago
Yu Yang	54d88d4472	Merge branch 'develop' of github.com:baidu/Paddle into feature/transform_api	8 years ago
Yu Yang	6fbf097bcc	Mark thrust::device_ptr in transform Fix TravisCI	8 years ago

181 Commits (5f99ae908b5fac433df28cc806d5514a6054b26c)