tensor-tang
54c95e49f0
fix blas
7 years ago
tensor-tang
8c23f7c4f0
fix blas and use packed weight
7 years ago
tensor-tang
43cee33a23
add mkl packed gemm
7 years ago
tensor-tang
d8d2dbcfac
further optimize im2col using variables
7 years ago
tensor-tang
687a322267
Merge remote-tracking branch 'ups/develop' into refine/im2col
7 years ago
tensor-tang
65d418f060
complete im2col with padding==1 and speedup filter width==1
7 years ago
tensor-tang
52eb86e30f
refine im2col benchmark
7 years ago
tensor-tang
3017f46076
add more test cases
7 years ago
tensor-tang
8d6be4fb5f
refine im2col test and add benchmark
7 years ago
tensor-tang
507c143047
im2col cfo cpu code clean
7 years ago
tensor-tang
4eeed0b5e4
refine width padding and enable core copy
7 years ago
Wu Yi
73fcfc06ec
refine conv cudnn enforce (#12353)
* refine conv cudnn enforce
* update
* update all cudnn ops
* fix
7 years ago
tensor-tang
e3131e2d73
enable width padding
7 years ago
tensor-tang
92518c519f
reuse sizes saving time
7 years ago
tensor-tang
660df122ce
enable padding!=0 and fill height padding with 0
7 years ago
tensor-tang
d8e00facf7
reuse im_size
7 years ago
tensor-tang
b72befc5cc
reuse copy size
7 years ago
tensor-tang
6788af4bf1
refine test cases
7 years ago
tensor-tang
b163e601b6
add gtest
7 years ago
tensor-tang
aae994fd26
refine im2col no padding
7 years ago
Yan Chunwei
02cf54d331
bugfix lod cpu performance (#12297)
7 years ago
tensor-tang
fc2b578842
add gemm_warp test
7 years ago
tensor-tang
a916c52579
refine gemm
7 years ago
tensor-tang
961e754c9f
mkl split gemm for better perf
7 years ago
tensor-tang
f0cd493c0d
Merge pull request #11989 from tensor-tang/feature/libxsmm
introduce libxsmm
7 years ago
Guo Sheng
da3f766821
Merge pull request #12088 from guoshengCS/complete-hsigmoid
Complete hsigmoid_op
7 years ago
guosheng
4ee069fdba
Fix the HierarchicalSigmoidGradOpKernel and refine the code. hsigmoid_op now matches the V2 implementation and passes the gradient check.
7 years ago
tensor-tang
1c5d6c5692
disable xsmm with float16
7 years ago
tensor-tang
c9ba51ead8
Merge remote-tracking branch 'ups/develop' into feature/libxsmm
7 years ago
tensor-tang
64a8e6d20e
refine the threshold functions
7 years ago
lemon34
29145e1e31
change im2sequence for ctc batch inference (#11696)
* change im2sequence for ctc batch inference
* Update im2sequence_op.cc
* change im2sequence for ctc batch inference
* update
* change PR by comment
* fix ocr test error
* fix test_im2sequence
* modify the old name to standard name
* fix test_layers failed
7 years ago
guosheng
e7a4cfc0ff
complete the hsigmoid_op
7 years ago
guosheng
d695381677
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into complete-hsigmoid
7 years ago
tensor-tang
6bc1aaaac7
refine the ColMajor replacement
7 years ago
tensor-tang
de856da9a6
fix ColMajor and RowMajor replacement
7 years ago
tensor-tang
21516e5cbe
add unit test of smm
7 years ago
tensor-tang
c3941745b3
add libxsmm_gemm
7 years ago
tensor-tang
7782a4ab53
fix blas build issue
7 years ago
tensor-tang
17987eb3fc
link libxsmm
7 years ago
tensor-tang
3df99e72ab
Merge remote-tracking branch 'ups/develop' into refine/set_num_threads
fix conflicts
7 years ago
dzhwinter
4ed0b62476
Move fluid::framework::InitDevices into fluid::platform (#11757)
* move to platform
* "move init from framework to platform"
* "remove used init"
* "fix ci"
* "fix ci"
* "fix generic"
* "fix ci"
* "fix ci"
* "fix ci"
* "disable fragile test"
7 years ago
dzhwinter
99a99ec7e3
"remove lapack" (#11966)
7 years ago
Xin Pan
a9086bf320
also move a few other dir to legacy/
7 years ago
tensor-tang
e3a96300bb
move SetNumThreads to platform
7 years ago
tensor-tang
1f09ddf806
Merge remote-tracking branch 'ups/develop' into refine/mklml/dyload
7 years ago
Tao Luo
bfe5dc6312
Merge pull request #11607 from chengduoZH/fix_concat_warning
Fix concat compile warning
7 years ago
chengduoZH
804c767107
fix concat warning
7 years ago
tensor-tang
f503f12925
enable dynamic load mklml lib on fluid
7 years ago
fengjiayi
12619fcf90
fix a compile error
7 years ago
qiaolongfei
762160bd8c
fix concat grad kernel
7 years ago