136 Commits (02cf54d331f4e926c1a5a4a7548a0d03dcacfab6)

Author SHA1 Message Date
Yan Chunwei 02cf54d331 bugfix lod cpu performance (#12297)
7 years ago
tensor-tang fc2b578842 add gemm_warp test
7 years ago
tensor-tang a916c52579 refine gemm
7 years ago
tensor-tang 961e754c9f mkl split gemm for better perf
7 years ago
tensor-tang f0cd493c0d Merge pull request #11989 from tensor-tang/feature/libxsmm
7 years ago
Guo Sheng da3f766821 Merge pull request #12088 from guoshengCS/complete-hsigmoid
7 years ago
guosheng 4ee069fdba Fix the HierarchicalSigmoidGradOpKernel and refine the codes. Now hsigmoid_op is same with V2 implementation and can pass gradient check.
7 years ago
tensor-tang 1c5d6c5692 disable xsmm with float16
7 years ago
tensor-tang c9ba51ead8 Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
tensor-tang 64a8e6d20e refine the threshold functions
7 years ago
lemon34 29145e1e31 change im2sequence for ctc batch inference (#11696)
7 years ago
guosheng e7a4cfc0ff complete the hsigmoid_op
7 years ago
guosheng d695381677 Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into complete-hsigmoid
7 years ago
tensor-tang 6bc1aaaac7 refine the ColMajor replacement
7 years ago
tensor-tang de856da9a6 fix ColMajor and RowMajor replacement
7 years ago
tensor-tang 21516e5cbe add unit test of smm
7 years ago
tensor-tang c3941745b3 add libxsmm_gemm
7 years ago
tensor-tang 7782a4ab53 fix blas build issue
7 years ago
tensor-tang 17987eb3fc link libxsmm
7 years ago
tensor-tang 3df99e72ab Merge remote-tracking branch 'ups/develop' into refine/set_num_threads
7 years ago
dzhwinter 4ed0b62476 Move fluid::framework::InitDevices into fluid::platform (#11757)
7 years ago
dzhwinter 99a99ec7e3 "remove lapack" (#11966)
7 years ago
Xin Pan a9086bf320 also move a few other dir to legacy/
7 years ago
tensor-tang e3a96300bb move SetNumThreads to platform
7 years ago
tensor-tang 1f09ddf806 Merge remote-tracking branch 'ups/develop' into refine/mklml/dyload
7 years ago
Tao Luo bfe5dc6312 Merge pull request #11607 from chengduoZH/fix_concat_warning
7 years ago
chengduoZH 804c767107 fix concat warning
7 years ago
tensor-tang f503f12925 enable dynamic load mklml lib on fluid
7 years ago
fengjiayi 12619fcf90 fix a compile error
7 years ago
qiaolongfei 762160bd8c fix concat grad kernel
7 years ago
qingqing01 9c90dc9728 Make the CUDA kernel of concat correct and fix unit tests. (#11541)
7 years ago
qiaolongfei ad1ad738d8 add gpu support for concat
7 years ago
qiaolongfei 9c128fe656 concat support data as input
7 years ago
weixing02 ee13b396f2 fix some errors
7 years ago
weixing02 8bd148dc00 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into hsigmoid_op
7 years ago
tensor-tang 9169b3b802 Merge pull request #10789 from Xreki/core_fix_openblas_threads
7 years ago
guochaorong 04b8d3d03c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into paddle_fix
7 years ago
guochaorong 0fec9469f9 fix some bugs introduced by unfreed memory
7 years ago
weixing02 3e46ec41a9 add hsigmoid
7 years ago
qingqing01 3ba75d4a69 Check label range in cross entropy calculation. (#10954)
7 years ago
Tomasz Patejko e43c8f33cd MKL elementwise add: elementwise_add uses vAdd VML function when MKL is used
7 years ago
Liu Yiqun 50ba205d79 Merge branch 'develop' into core_fix_openblas_threads
7 years ago
Liu Yiqun 39eb871ddf Add an interface to set the number of threads for math function, and set the default value to 1 for inference.
7 years ago
yuyang18 fd2b4b478e Make tensor support uint8
7 years ago
Yiqun Liu b7026f79a9 Fix a bug related to dispensable inputs and refine the inference unittest (#10527)
7 years ago
yuyang18 66590a0b88 Fix typo in blas_impl.h
7 years ago
yuyang18 27197290dc matmul support float16/double
7 years ago
Yu Yang fcd31d6161 Follow comments and polish code names
7 years ago
Yu Yang 0a13d3c67a Move MatMul to blas_impl.h
7 years ago
Yu Yang 3dd01823a8 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/clean_matmul
7 years ago