Commit Graph

177 Commits (557be6fc58a8fad13a830df33ec77560faaa3d7c)

Author SHA1 Message Date
tensor-tang 64a8e6d20e refine the threshold functions
7 years ago
lemon34 29145e1e31 change im2sequence for ctc batch inference (#11696)
7 years ago
guosheng e7a4cfc0ff complete the hsigmoid_op
7 years ago
guosheng d695381677 Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into complete-hsigmoid
7 years ago
tensor-tang 6bc1aaaac7 refine the ColMajor replacement
7 years ago
tensor-tang de856da9a6 fix ColMajor and RowMajor replacement
7 years ago
tensor-tang 21516e5cbe add unit test of smm
7 years ago
tensor-tang c3941745b3 add libxsmm_gemm
7 years ago
tensor-tang 7782a4ab53 fix blas build issue
7 years ago
tensor-tang 17987eb3fc link libxsmm
7 years ago
tensor-tang 3df99e72ab Merge remote-tracking branch 'ups/develop' into refine/set_num_threads
7 years ago
dzhwinter 4ed0b62476 Move fluid::framework::InitDevices into fluid::platform (#11757)
7 years ago
dzhwinter 99a99ec7e3 "remove lapack" (#11966)
7 years ago
Xin Pan a9086bf320 also move a few other dir to legacy/
7 years ago
tensor-tang e3a96300bb move SetNumThreads to platform
7 years ago
tensor-tang 1f09ddf806 Merge remote-tracking branch 'ups/develop' into refine/mklml/dyload
7 years ago
Tao Luo bfe5dc6312 Merge pull request #11607 from chengduoZH/fix_concat_warning
7 years ago
chengduoZH 804c767107 fix concat warning
7 years ago
tensor-tang f503f12925 enable dynamic load mklml lib on fluid
7 years ago
fengjiayi 12619fcf90 fix a compile error
7 years ago
qiaolongfei 762160bd8c fix concat grad kernel
7 years ago
qingqing01 9c90dc9728 Make the CUDA kernel of concat correct and fix unit tests. (#11541)
7 years ago
qiaolongfei ad1ad738d8 add gpu support for concat
7 years ago
qiaolongfei 9c128fe656 concat support data as input
7 years ago
weixing02 ee13b396f2 fix some errors
7 years ago
weixing02 8bd148dc00 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into hsigmoid_op
7 years ago
tensor-tang 9169b3b802 Merge pull request #10789 from Xreki/core_fix_openblas_threads
7 years ago
guochaorong 04b8d3d03c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into paddle_fix
7 years ago
guochaorong 0fec9469f9 fix some bugs introduced by unfreed memory
7 years ago
weixing02 3e46ec41a9 add hsigmoid
7 years ago
qingqing01 3ba75d4a69 Check label range in cross entropy calculation. (#10954)
7 years ago
Tomasz Patejko e43c8f33cd MKL elementwise add: elementwise_add uses vAdd VML function when MKL is used
7 years ago
Liu Yiqun 50ba205d79 Merge branch 'develop' into core_fix_openblas_threads
7 years ago
Liu Yiqun 39eb871ddf Add an interface to set the number of threads for math function, and set the default value to 1 for inference.
7 years ago
yuyang18 fd2b4b478e Make tensor support uint8
7 years ago
Yiqun Liu b7026f79a9 Fix a bug related to dispensable inputs and refine the inference unittest (#10527)
7 years ago
yuyang18 66590a0b88 Fix typo in blas_impl.h
7 years ago
yuyang18 27197290dc matmul support float16/double
7 years ago
Yu Yang fcd31d6161 Follow comments and polish code names
7 years ago
Yu Yang 0a13d3c67a Move MatMul to blas_impl.h
7 years ago
Yu Yang 3dd01823a8 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/clean_matmul
7 years ago
Yu Yang c6a6d87f96 Rewrite Matmul, make code cleaner
7 years ago
fengjiayi b708ec0ae1 Merge pull request #10412 from JiayiFeng/correct_TensorCopy_misuse
7 years ago
Darcy 8f8a4768dc adding device_context to blas deps list (#10420)
7 years ago
fengjiayi 0c99cd7bbb fix errors in sequence_padding_test
7 years ago
Siddharth Goyal b65282168c Fix cpplint errors in lstm kernel (#10394)
7 years ago
fengjiayi e309f42293 fix errors in concat_test
7 years ago
Yu Yang 0285a2b95d Merge pull request #10371 from reyoung/refine_code
7 years ago
Abhinav Arora c9f55dfafc Fix CPPLint issues in /math/detail/gru_kernel.h (#10390)
7 years ago
Yu Yang ef6ea790dc Clean and extract blas
7 years ago
Yu Yang 815d888468 Clean MatMul
7 years ago
Yu Yang bc8160350b Fix compile
7 years ago
Yu Yang a6edeb39b3 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/clean_blas
7 years ago
chengduo 4fbde42cdf Fix __shfl_down_sync_ of cross_entropy (#10345)
7 years ago
Yu Yang caa4027d9d Follow comments
7 years ago
Abhinav Arora 1945b729b6 Fix CPPLint issues with math/sequence_padding (#10317)
7 years ago
chengduo 9bcd9f661b fix cpplint error (#10329)
7 years ago
Yu Yang 4db43c6c9f Naive implement cblas
7 years ago
Yu Yang 60d6348e69 Revert develop
7 years ago
Yu Yang 86af6bdc81 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/clean_blas
7 years ago
Yu Yang 49dedfad17 Polish code and tests
7 years ago
Abhinav Arora 738585476d Fix more CPPLint issues in fluid/operators/math (#10276)
7 years ago
dzhwinter eb6f9dd5de Feature/cuda9 cudnn7 (#10140)
7 years ago
Yu Yang c888e01660 Refactor GEMM in blas
7 years ago
Abhinav Arora e735359631 Fix more CPPlint issues in fluid/operators/math (#10249)
7 years ago
fengjiayi 71fa3ca9c4 Merge pull request #10232 from JiayiFeng/fix_unittests
7 years ago
fengjiayi 30f9dc92e5 fix errors
7 years ago
fengjiayi 330fa95cbd Follow comments
7 years ago
Abhinav Arora 83b1a8f6bf Pending more CPPLint errors in fluid/operators/math (#10243)
7 years ago
fengjiayi bcf260e1e8 fix several unit tests
7 years ago
Abhinav Arora f457d5da06 Fix more CPPLint errors (#10218)
7 years ago
Yu Yang 580dad0c2c Fix compile when there is no mkl
7 years ago
Yu Yang 2a06e307d0 Fix batch_gemm bugs
7 years ago
Kexin Zhao 92913027fc fix unused var error (#9908)
7 years ago
Kexin Zhao 617e790a59 fix cuda 7.5 compile error (#9885)
7 years ago
Kexin Zhao 7ed457e77a Fix cuda 7.5 error with cublas GEMM (#9811)
7 years ago
Kexin Zhao b2a1c9e8b7 Add float16 support to non-cudnn softmax op on GPU (#9686)
7 years ago
Kexin Zhao d00bd9eb72 Update the cuda API and enable tensor core for GEMM (#9622)
7 years ago
chengduoZH e099b18045 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/add_CUDAPinnedPlace
7 years ago
Yang Yu af230d9bef Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpp_parallel_executor
7 years ago
dzhwinter 8425c2c859 Speed/sequence op1 (#9217)
7 years ago
Yang Yu b0775588c0 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpp_parallel_executor
7 years ago
chengduoZH ab601c19c3 Add CUDAPinnedPlace
7 years ago
Luo Tao 6332bd1ed8 Merge branch 'develop' into infer_mkl
7 years ago
Yu Yang 50e7e25db3 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpp_parallel_executor
7 years ago
chengduoZH aca9180a76 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/fix_concat
7 years ago
chengduoZH 750aff10ce code refine
7 years ago
chengduoZH 043f47b27f fix concat op
7 years ago
Luo Tao ae820a34bc Merge branch 'develop' into infer_mkl
7 years ago
Tao Luo 9126e626fc Merge pull request #9165 from ROCmSoftwarePlatform/amd_cmake_01
7 years ago
Kexin Zhao 4eaa789730 resolve conflict
7 years ago
Kexin Zhao ed2bc194c5 Merge pull request #9176 from kexinzhao/batch_norm_fp16
7 years ago
Kexin Zhao 70e7122785 initial commit
7 years ago
sabreshao e50205e744 CMake refine for HIP support.
7 years ago
Yang yaming 381c6a026d Merge pull request #9100 from pkuyym/fix-9049
7 years ago
yangyaming 2f2c5f5e60 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-9049
7 years ago
Xi Chen 9eae086e39 add math_function to softmax's dep list
7 years ago
Yu Yang 9cb8f50302 Complete fetch op
7 years ago
Kexin Zhao 39c676e208 initial commit
7 years ago
xuwei06 ab3543e35e Fix compilation for gcc5.4
7 years ago