Paddle

Commit Graph

Author	SHA1	Message	Date
zhaoyuchen2018	b93870e696	Improve topk performance. (#21087) * Improve topk performance. give 200000 data to compute topk, before opt: cost 1s after opt: cost 0.0028s. * Refine return value. * Add cuda util funtions. * Fix ComputeBlockSize bug & refine comments. Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
Zeng Jinle	a710ccc0cb	refine error message of allocator again, test=develop (#21023)	5 years ago
Huihuang Zheng	ea6ee76fa9	GPU allocation uses fraction of available memory (#18896) GPU allocation uses fraction of available memory, also fix the GetUsed without lock	6 years ago
zhhsplendid	22715487dc	add allocator flags test=develop	6 years ago
Wu Yi	29d9fb53fc	[Feature] multi process multi gpu dist training, boost v100 performance by 20% (#14661) * wip multi process multi gpu dist training * workable for p2p * update test=develop * change back env name test=develop * fix alloc init * fix cpu build test=devlop * fix mac tests test=develop * refine code * refine test=develop	6 years ago
chengduo	00b9e9a135	Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929) * refine cublase test=develop * code refine * refine cublas * add GEMME_EX * add enable_cublas_tensor_op_math doc and add cublasCall test=develop * fix CublasCall for cuda version test=develop * fix error test=develop * fix GEMM_EX to be compatible with gcc 4.8 test=develop * add GEMM_EX test=develop * to compatiable with gcc4.8 test=develop	6 years ago
chengduo	2c9839c847	add cuda version display (#13885) test=develop	6 years ago
typhoonzero	a4f7696a18	Revert "Some trivial optimization (#13530)" This reverts commit `1d91a49d2f`.	6 years ago
chengduo	1d91a49d2f	Some trivial optimization (#13530) * some trivial opt * remove the fix of lod_tensor and shrink_rnn_memory_op * refine ShrinkRNNMemoryOp test=develop	6 years ago
fengjiayi	9f11da5931	Add synchronous TensorCopy and use it in double buffer	7 years ago
Yi Wang	535646cf25	Update (#9717)	7 years ago
Yi Wang	0c43a376e2	Fix cpplint errors with paddle/fluid/platform/gpu_info.* (#9710) * Fix cpplint errors with paddle/fluid/platform/gpu_info.* * Update	7 years ago
Kexin Zhao	1998d5afa2	add gpu info func to get compute cap	7 years ago
chengduoZH	00e596edbe	get max threads of GPU	7 years ago
qingqing01	24509f4af9	Fix the grammar in copyright. (#8403)	7 years ago
Yi Wang	90648f336d	Move file to fluid/; Edit CMakeLists.txt	7 years ago

16 Commits (03479469a700ce30edea0fe80a7c14982a6082db)