Paddle

Commit Graph

Author	SHA1	Message	Date
wawltor	b6a4349dd4	fix the error message for the math dir https://github.com/PaddlePaddle/Paddle/pull/27332	5 years ago
ShenLiang	c609066074	Add Matmul op (#26411) * add matmul_v2	5 years ago
Guo Sheng	a8c0fb4e86	Add cholesky_op (#23543) * Add cholesky_op forward part. test=develop * Complete cholesky_op forward part. test=develop * Add cholesky_op backward part. test=develop * Complete cholesky_op backward part. test=develop * Refine cholesky_op error check and docs. test=develop * Add grad_check unit test for cholesky_op. test=develop * Fix sample code in cholesky doc. test=develop * Refine some error messages of cholesky_op. test=develop * Refine some error messages of cholesky_op. test=develop * Remove unused input in cholesky_grad. test=develop * Remove unused input in cholesky_grad. test=develop * Fix stream for cusolverDnSetStream. test=develop * Update PADDLE_ENFORCE_CUDA_SUCCESS from cholesky_op to adapt to latest code. test=develop * Add CUSOLVER ERROR in enforce.h test=develop * Fix the missing return value in cholesky. test=develop	5 years ago
tianshuo78520a	433cef03e5	fix typo word (#22784)	5 years ago
Liufang Sang	f0b1518438	add dequantize_abs_max op and modify lookup_table op (#20899) * add int8 kernel to lookup_table op and add dequantize op test=develop * change paddle_enforce to paddle_enforce_eq test=develop * change copyright and change some not suitable code test=develop * remove debug log test=develop * replace GetInputType with IndicateVarDataType test=develop * fix EmptyGradMaker test=develop * fix diff between cpu and gpu test=develop * use memcopy when int8_t test=develop	6 years ago
danleifeng	425279a57b	Improve elementwise operators performance in same dimensions. (#19763) Improve elementwise operators performance in same dimensions	6 years ago
Bob Zhu	c670058a8d	add support of matmul with multiple head even different width and height (#19708) * add support of matmul with multiple heads even with different width and height. The original matmul with multiple heads supports only mat_a.width == mat_b.height; in that case, mat_b is horizontally split. This patch extends the support to the case where mat_a.width != mat_b.height but mat_a.width/head_number == mat_b.height; in this case, mat_b is vertically split. For example, if A is [3, 8], B is [2, 16], and head_number is 4, A is split into 4 matrices of [3, 2] and B is (vertically) split into 4 matrices of [2, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16]. test=develop * refactor the code of matmul with multiple head even different width and height test=develop	6 years ago
Tao Luo	0a46d34538	refine some PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19607) test=develop	6 years ago
zhouwei25	84c728013c	fix the compilation issue on windows caused by mkl_CSRMM (#19533)	6 years ago
Yihua Xu	b920395842	Use sparse matrix to implement fused emb_seq_pool operator (#19064) * Implement the operator with sparse matrix multiply * Update the URL of mklml library. test=develop * Disable MKLML implementation when not using Linux. test=develop * Ignore the deprecated status for windows test=develop	6 years ago
Bob Zhu	220eef602e	Extend Matmul to support matrix multiplication with multiple heads (#18570) * extend matmul op to support multiple head multiplication With the support of multiple heads, the multiplication of two big matrices is split into the multiplication of several (head_number) small matrices. E.g., if Mat A is [3, 24] and Mat B is [24, 4], when multiplying A and B with head_number as 4, Mat A is split into 4 matrices of [3, 6] and Mat B into 4 matrices of [6, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].	6 years ago
Zeng Jinle	f5641000bb	Add a unittest to inplace elementwise_add (#18385) * add_elementwise_add_inplace_test,test=develop * rename file, test=develop	6 years ago
Yihua Xu	7396788694	Optimize gelu operation with mkl erf. test=develop	6 years ago
tensor-tang	ee2321debd	Revert 15770 develop `a6910f900` gelu mkl opt (#15872) * Revert "Optimize Gelu with MKL Erf function (#15770)" This reverts commit `676995c86c`. * test=develop	6 years ago
Yihua Xu	676995c86c	Optimize Gelu with MKL Erf function (#15770) * Optimize for gelu operator * Set up the low accuracy mode of MKL ERF function. test=develop * Only enable MKLML ERF when OS is linux * Use the special mklml version that includes the vmsErf function to verify gelu mkl kernel. test=develop * Add the CUDA macro to avoid NVCC's compile issue. test=develop * Add the TODO comments for mklml library modification. test=develop * Clean Code test=develop * Add the comment of macro for NVCC compiler. test=develop	6 years ago
Yu Yang	7b10bf0e60	Use mkl	7 years ago
Jacek Czaja	48e1b97e8e	- Coding style fixes test=develop	7 years ago
Jacek Czaja	cf40daee58	- Building fix to softmax for inference	7 years ago
Jacek Czaja	8bfa1fa9bb	- ASUM MKL integration	7 years ago
tensor-tang	64f7516aee	fix lrn on mac (#14426) * rename and fix blas vsqr test=develop * update	7 years ago
tensor-tang	1be85d011d	add mkl vsqr and vpow	7 years ago
tensor-tang	cf5ea925c3	fix bugs	7 years ago
tensor-tang	3dd66390b2	add blas vexp	7 years ago
tensor-tang	0ec1f65cf1	fix blas dot and add cblas scal	7 years ago
tensor-tang	a2203d0466	add cblas dot	7 years ago
tensor-tang	f72ab8961e	refine blas gemm	7 years ago
tensor-tang	6644ce79a5	add mklml vmul	7 years ago
tensor-tang	54c95e49f0	fix blas	7 years ago
tensor-tang	43cee33a23	add mkl packed gemm	7 years ago
tensor-tang	a916c52579	refine gemm	7 years ago
tensor-tang	961e754c9f	mkl split gemm for better perf	7 years ago
tensor-tang	1c5d6c5692	disable xsmm with float16	7 years ago
tensor-tang	64a8e6d20e	refine the threshold functions	7 years ago
tensor-tang	6bc1aaaac7	refine the ColMajor replacement	7 years ago
tensor-tang	de856da9a6	fix ColMajor and RowMajor replacement	7 years ago
tensor-tang	c3941745b3	add libxsmm_gemm	7 years ago
tensor-tang	f503f12925	enable dynamic load mklml lib on fluid	7 years ago
Tomasz Patejko	e43c8f33cd	MKL elementwise add: elementwise_add uses vAdd VML function when MKL is used	7 years ago
yuyang18	66590a0b88	Fix typo in blas_impl.h	7 years ago
Yu Yang	0a13d3c67a	Move MatMul to blas_impl.h Rename MatDim to MatDescriptor	7 years ago
Yu Yang	ef6ea790dc	Clean and extract blas	7 years ago
Yu Yang	815d888468	Clean MatMul	7 years ago
Yu Yang	4db43c6c9f	Naive implement cblas	7 years ago
Yu Yang	c888e01660	Refactor GEMM in blas	7 years ago

44 Commits (b6a4349dd40eee17e485e149e09af4b29caa3d66)
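
The shape arithmetic described in commits 220eef602e and c670058a8d can be checked with a small sketch. This is not Paddle's implementation: the function names below (`matmul`, `column_slice`, `multihead_matmul`) are hypothetical, the code uses plain Python lists instead of BLAS calls, and laying the per-head results side by side is an assumption drawn from the [3, 16] result shape stated in the commit messages.

```python
def matmul(a, b):
    """Plain matrix product of two row-major lists of lists."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def column_slice(m, start, width):
    """Return columns [start, start + width) of matrix m."""
    return [row[start:start + width] for row in m]


def multihead_matmul(a, b, head_number):
    """Hypothetical sketch of the multi-head split described in the commits.

    Case 1: a.width == b.height. A is split by columns, B by rows.
    Case 2: a.width / head_number == b.height. A and B are both split by columns.
    Per-head products are concatenated along the column axis (an assumption).
    """
    a_rows, a_width = len(a), len(a[0])
    b_height, b_width = len(b), len(b[0])
    assert a_width % head_number == 0, "a.width must be divisible by head_number"
    sub_k = a_width // head_number

    out = [[] for _ in range(a_rows)]
    for h in range(head_number):
        a_h = column_slice(a, h * sub_k, sub_k)
        if a_width == b_height:
            # Case 1: take a band of sub_k rows from B.
            b_h = b[h * sub_k:(h + 1) * sub_k]
        elif sub_k == b_height:
            # Case 2: take a band of columns from B.
            assert b_width % head_number == 0, "b.width must be divisible by head_number"
            sub_n = b_width // head_number
            b_h = column_slice(b, h * sub_n, sub_n)
        else:
            raise ValueError("shapes are not compatible with head_number")
        block = matmul(a_h, b_h)
        for out_row, block_row in zip(out, block):
            out_row.extend(block_row)
    return out


def shape(m):
    return [len(m), len(m[0])]


# Example from 220eef602e: A is [3, 24], B is [24, 4], head_number is 4.
r1 = multihead_matmul([[1] * 24 for _ in range(3)], [[1] * 4 for _ in range(24)], 4)
# Example from c670058a8d: A is [3, 8], B is [2, 16], head_number is 4.
r2 = multihead_matmul([[1] * 8 for _ in range(3)], [[1] * 16 for _ in range(2)], 4)
print(shape(r1), shape(r2))
```

Both examples produce a [3, 16] result, matching the shapes given in the commit messages.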