JiabinYang
42470f14b7
test=develop
7 years ago
peizhilin
445fff24dc
add the bigobj option to NVCC compile
...
fix code style
7 years ago
qingqing01
36f08eef3b
CUDA kernel for density_prior_box_op. ( #14513 )
...
* CUDA kernel for density_prior_box_op.
* Support flatten to 2D.
7 years ago
tensor-tang
6a7f83d45d
enable gru jitcode and refine act and lstm jitcode
...
test=develop
7 years ago
tensor-tang
686eaf20ba
Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
peizhilin
81bd7eeff4
rollback the format
7 years ago
Qiao Longfei
1f87f263a2
clean code
7 years ago
Qiao Longfei
361cb0e078
lookup remote table can compile
7 years ago
JiabinYang
0fca16847c
temp
7 years ago
JiabinYang
e9be3366a9
test=develop
7 years ago
chengduo
00b9e9a135
Refine cublas to support CUBLAS_TENSOR_OP_MATH ( #13929 )
...
* refine cublas
test=develop
* code refine
* refine cublas
* add GEMME_EX
* add enable_cublas_tensor_op_math doc and add cublasCall
test=develop
* fix CublasCall for cuda version
test=develop
* fix error
test=develop
* fix GEMM_EX to be compatible with gcc 4.8
test=develop
* add GEMM_EX
test=develop
* to be compatible with gcc4.8
test=develop
7 years ago
peizhilin
dfbac60398
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin
7c8c9dc9bf
fix unit test cases
7 years ago
tensor-tang
0c5ed5f6fc
enable peephole jitcode
...
test=develop
7 years ago
JiabinYang
3c6102a367
test=develop
7 years ago
Qiao Longfei
7c3ce2952d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
7 years ago
Qiao Longfei
60a4f69b3c
add lookup remote table op
7 years ago
Qiao Longfei
e0b48f7e29
init lookup remote table
7 years ago
tensor-tang
e3b61cf52b
init gru jitcode and fix lstm jitcode
...
test=develop
7 years ago
tensor-tang
0f25446574
Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
Dun
ae7d22862b
Group Norm ( #13843 )
...
Add group normalization operator.
7 years ago
wopeizl
d9a1f3e58e
Windows/online ( #14474 )
...
* add recordio support
* disable the openblas multi-thread on windows since no support
adjust the python script
* code style
* code style
test=develop
* add create_recordio_file_reader back
* fix code style
test=develop
* fix the gtest.cmake on windows
* fix cc_test on windows
* fix the win build
test=develop
* remove fused compile support on windows
test=develop
* add the jit support
test=develop
* add the jit support, test=develop
* add the jit support, test=develop
* add the jit back
fix compile error on windows
* rollback test=develop
* test case fix
* disable DSO by default on windows
* exclude warpctc_op on windows
* exclude the dynload_warpctc out on windows
test=develop
* fix the scripts error
test=develop
* disable avx on windows by default
test=develop
* re-organize the cmake file
* disable mkl on windows by default
* add warp_ctc back
* fix the dependency
* fix the dependency
* fix the build issue on windows
* remove unsupported flag on windows
* code style
* code style
test=develop
* fix issue
* add profiler, parallel_executor back
* clean up the pre-definitions on windows
* fix build issue
* test=develop
7 years ago
JiabinYang
57a18e32a1
test=develop
7 years ago
peizhilin
bef475c92b
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Tao Luo
5d4d117edc
Merge pull request #14502 from qingqing01/cudnn5_fix
...
Fix compiling with cuDNN v5
7 years ago
Jiabin Yang
f7b55de9e5
Merge branch 'develop' into enhance_hierachical_sigmod_op
7 years ago
Yu Yang
e68c1fcd5a
Merge pull request #14522 from reyoung/feature/fix_op_header_deps
...
fix(Compile): fix depends error when compile op using cub
7 years ago
tensor-tang
3562051302
add gru refer code and remove redundant avx code
...
test=develop
7 years ago
JiabinYang
af9a3301da
test=develop
7 years ago
Zhaolong Xing
ad349e770f
Merge pull request #14452 from NHZlX/fix_avg_pool_trt_bug
...
fix avg pool trt bug
7 years ago
tensor-tang
f913860873
jitkernel lstm refer support peephole
...
test=develop
7 years ago
tensor-tang
2f9b5f2383
Merge branch 'develop' into fea/jit/rnn
7 years ago
JiabinYang
014e50c284
test=develop
7 years ago
Yu Yang
3edd32d070
fix(Compile): fix depends error when compile op using cub
...
some operators depend on cub and xxhash by header. The dependency should be declared explicitly rather than declared to pybind.
test=develop
7 years ago
Dang Qingqing
cda60311f9
Fix compiling with cuDNN v5
...
test=develop
7 years ago
peizhilin
67562a6fcd
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang
10fb4ceefc
Merge pull request #14351 from tpatejko/tpatejko/mkldnn-elementwise_mul
...
[MKLDNN][JIT][AVX512] Elementwise Mul
7 years ago
jerrywgz
13e254faed
refine code, test=develop
7 years ago
tensor-tang
b4c826c548
Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
...
test=develop
7 years ago
tensor-tang
ce31deb7e9
refine refer code and add lstm refer code
...
test=develop
7 years ago
jerrywgz
79cec53111
add ignore index for sigmoid cross entropy with logits op, test=develop
7 years ago
nhzlx
e62872df8b
fix conflicts
7 years ago
tensor-tang
c2cfb03a72
add lstm jitcode
7 years ago
peizhilin
25adf970b2
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Tao Luo
1d3e9bde1e
Merge pull request #14488 from yihuaxu/develop_7a64d48f5_stack_opt
...
Optimize the stack operator
7 years ago
tensor-tang
7aa3aff338
Merge pull request #14465 from tensor-tang/fea/jit/exp
...
jitcode act support all size
7 years ago
Tao Luo
1b894e495f
Merge pull request #14437 from jczaja/prv-softmax-mkl
...
Introducing MKL to softmax for inference
7 years ago
peizhilin
3a72a634cf
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yihua Xu
a906a361be
Add the macro for NVCC (test=develop)
7 years ago
Yihua Xu
d91740acb1
Revert "Remove the remnant code (test=develop)"
...
This reverts commit be50670348.
7 years ago
Yihua Xu
be50670348
Remove the remnant code (test=develop)
7 years ago
qingqing01
9eefd2c766
Modify some infer-shape about detection operators in compile-time. ( #14483 )
...
* Modify some infer-shape in compile-time.
7 years ago
Yihua Xu
f4c869d872
Optimize the layer_norm operator with AVX intrinsic function ( #14417 )
...
* Optimize layer_norm operator with AVX intrinsic functions
* Revert the wrong modifications
* Implement the jit kernel for layer_norm operator
* Add math headfile to fix the compile issue (test=develop)
* Add math headfile to fix the compile issue (test=develop)
* Fixed the intrinsic headfile issue (test=develop)
* Fix the conflicts (test=develop)
* Revert for CUDA compiler (test=develop)
* Fixed the cuda dependency (test=develop)
* Fix the macro issues (test=develop)
7 years ago
peizhilin
ee0fd78c81
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yu Yang
f1a392a5fe
Merge pull request #13804 from sneaxiy/rewrite_allocation
...
Rewrite allocation
7 years ago
Yihua Xu
f418f552df
Merge branch 'develop' into develop_7a64d48f5_stack_opt (test=develop)
7 years ago
peizhilin
8443961a4f
add warp_ctc back
7 years ago
qingqing01
fd7e643153
Convolution fusion operator. ( #14449 )
...
* Convolution fusion operator.
* Clean code
test=develop
7 years ago
Yu Yang
98bbfc17be
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
peizhilin
4a6769da84
re-organize the cmake file
7 years ago
dengkaipeng
8ef6280c03
Add operator double support. test=develop
7 years ago
peizhilin
1aff40a4c6
exclude warpctc_op on windows
7 years ago
peizhilin
7d51a0e887
disable DSO by default on windows
7 years ago
peizhilin
b967e01cbe
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Wu Yi
d7bd0361cb
fix dist deps ( #14471 )
...
* fix dist deps test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
7 years ago
Jacek Czaja
9b0eae3023
- Removing partial specialization of softmax for inference for GPU
...
test=develop
7 years ago
peizhilin
a3e952f41d
add the jit back
...
fix compile error on windows
7 years ago
tensor-tang
a19b3225a1
fix jitcode small size
...
test=develop
7 years ago
Jacek Czaja
be80bb4f28
- Fix to GPU
...
test=develop
7 years ago
tensor-tang
4dbdfa60ef
sigmoid and tanh support all size
...
test=develop
7 years ago
tensor-tang
ccb8963705
refine exp jitcode with all size
...
test=develop
7 years ago
peizhilin
1cc23ef67d
merge from paddle:develop
7 years ago
tensor-tang
d3eae8f61b
refine relu and fix addrelu test
7 years ago
tensor-tang
4e67fe6a12
refine act and vxx with all size
7 years ago
tensor-tang
ba3eaed7a7
exp support all size
7 years ago
tensor-tang
1ffce8c0ae
fix build error on noavx
...
test=develop
7 years ago
Michal Gallus
c69c41604e
MKLDNN elementwise_mul: Move Kernel to KernelPool to avoid segfaults
...
test=develop
7 years ago
Michal Gallus
785066eb8a
MKLDNN elementwise_mul: Check if AVX512 is available
...
test=develop
7 years ago
Michal Gallus
08f63c4d12
MKLDNN elementwise_mul: Lint changes to UT & integration
...
test=develop
7 years ago
Michal Gallus
49b09327f6
MKLDNN elementwise_mul: Reorder on non-nchw input, fallback on non-16 divisible fm
...
test=develop
7 years ago
Michal Gallus
d14858e4ba
MKLDNN elementwise_mul: Parallelize mul
7 years ago
Michal Gallus
ed31936ba1
MKLDNN elementwise_mul: Support NCHW, update UT
7 years ago
Tomasz Patejko
700bcbf74f
MKLDNN elementwise_mul: h and w loops implemented in xbyak
7 years ago
Tomasz Patejko
ad09facafe
MKLDNN elementwise_mul: CPU tests initially refactored. MKLDNN mul test for broadcast added
7 years ago
Tomasz Patejko
2d73ad180a
MKLDNN elementwise_mul: simple xbyak version for AVX512
7 years ago
Tomasz Patejko
213ec37d6a
MKLDNN elementwise_add: simple initial implementation of the operator for MKLDNN format
7 years ago
Wu Yi
a2d9b34417
Refine operator cmake ( #14413 )
...
* wip simplify operator framework
* wip
* wip
* done test=develop
* clean test=develop
* fix test=develop
* fix deps test=develop
* fix cpu build test=develop
* fix tensorrt build test=develop
* fix tests test=develop
* fix test=develop
* fix cpu build test=develop
7 years ago
peizhilin
764f97deac
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin
8580b7a130
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang
7f17e561d7
Merge pull request #14423 from tensor-tang/fea/jit/act
...
jitcode act relu, exp, sigmoid, tanh
7 years ago
Jiabin Yang
28bd5b7bad
fix space_to_depth_op unicode problem ( #14430 )
...
* fix space_to_depth_op unicode problem
* test=develop
7 years ago
Jacek Czaja
513bb6c151
Squashing MKL based softmax for inference
...
test=develop
- Added profiling to softmax functors
- MKL based softmax inference op
- Fix to softmax computation via MKL
- cleaning
- Cosmetic fixes to softmax MKL
- Fix to ON_INFER lack of propagation
7 years ago
nhzlx
9b64aac41f
add macro for pool2dDirectCUDAFunctor
...
test=develop
7 years ago
whs
1722678258
Make nce support more distribution. ( #13549 )
...
* Fix truncated normal.
* Fix.
* Make nce support more distribution.
* Fix API.spec.
* Fix python API.
* Fix.
test=develop
* Fix API.spec
test=develop
* Fix sampler.
* Fix order of arguments in python API.
test=develop
7 years ago
nhzlx
83f8c403a7
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into fix_avg_pool_trt_bug
...
test=develop
7 years ago
nhzlx
b969116988
fix avg pool trt bug and fix cpplint
7 years ago
tensor-tang
1f00723fa3
exp, sigmoid, tanh jitcode support more size
...
test=develop
7 years ago
Qiyang Min
d971d5b875
Merge pull request #14431 from velconia/fix_expand_op_dim_in_compile_time
...
Fix expand op incorrect infer shape
7 years ago
Wu Yi
b32c13dc20
Add cudnn ctc loss ( #12366 )
...
* add cudnn ctc loss
* wip add test test=develop
* wip
* wip
* done test=develop
* move include cudnn test=develop
* test test=develop
* fix build test=develop
* fix build test=develop
* fix build on cudnn5 test=develop
* fix cudnn5 build test=develop
* fix cudnn5 build test=develop
* merge develop softmax functor change test=develop
7 years ago
tensor-tang
8cda7b3d20
Merge remote-tracking branch 'ups/develop' into fea/jit/act
...
test=develop
7 years ago
tensor-tang
e2d6eddd32
remove ComputeDeprecated
...
test=develop
7 years ago
peizhilin
6d0d5a76eb
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
dengkaipeng
f115eb0d1e
enhance api. test=develop
7 years ago
tensor-tang
64f7516aee
fix lrn on mac ( #14426 )
...
* rename and fix blas vsqr
test=develop
* update
7 years ago
Yu Yang
c8f6e70ab4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
tensor-tang
f65ddff8d1
unify act jitcode of relu, exp, sigmoid and tanh
7 years ago
tensor-tang
6a159071b6
add vtanh jitcode of size 8
7 years ago
tensor-tang
046374bcd1
add vsigmoid jitcode of size 8
7 years ago
minqiyang
560b29ccb7
Polish code
...
test=develop
7 years ago
minqiyang
21d6e8e8c8
Polish code
...
test=develop
7 years ago
minqiyang
50b6e4c6bc
Fix expand grad op infer shape
...
test=develop
7 years ago
Sylwester Fraczek
8a1eeec579
add mkldnn prop_kind phase for inference-only case to pooling and activations ( #14278 )
...
* add is_test to pooling and activations
add prop_kind support for layers activation. conv and pooling
add a pass that sets is_test to true
add transpiler version of is_test pass
test=develop
* patch test and pass
test=develop
* add pass to analyzer.h
test=develop
* add is_test attr description & pass only on mkldnn
in:
activation_op.cc
batch_norm_op.cc
conv_op.cc
dropout_op.cc
lrn_op.cc
pool_op.cc
sequence_pool_op.cc
softmax_op.cc
* fix is_test handling for activation pool and conv
* change description of is_test for all layers again
* remove GetAttr(use_mkldnn) from pass
* rename correct_mkldnn_test_phase to is_test
and remove dependency on MKLDNN
test=develop
* review fix magic number
* two if(..)s into one
* Check is_test once and pass mkldnn forward prop kind
* dereference shared_ptr with * (without get())
test=develop
* add is_test_pass back
test=develop
7 years ago
peizhilin
d1429ac4a5
add recordio support
7 years ago
chengduo
82773477ae
Add selu ( #14415 )
...
* add selu
* use for range
test=develop
* add API
test=develop
* follow comment
test=develop
* update API.spec
test=develop
7 years ago
dengkaipeng
95d5060ddd
fix abs -> fabs error. test=develop
7 years ago
minqiyang
30147d7f58
Fix expand op incorrect infer shape
...
test=develop
7 years ago
JiabinYang
ba9ff508e8
temp fix
7 years ago
Yihua Xu
03ccb9a461
Optimize the stack operator
7 years ago
dengkaipeng
2faa2b4048
remove cu file. test=develop
7 years ago
tensor-tang
ee2a7f1b8c
refine exp and fix error on avx
...
test=develop
7 years ago
tensor-tang
1e06a32a0d
add vexp jitcode of size 8
...
test=develop
7 years ago
tensor-tang
2354409601
Merge pull request #14374 from tensor-tang/fea/jit/act
...
add vrelu jitcode
7 years ago
Tao Luo
5ef123c778
Merge branch 'develop' into dam_fc
7 years ago
dzhwinter
d3aed98d86
Merge pull request #14320 from wopeizl/windows/online
...
Windows/online
7 years ago
Tao Luo
d3e63e6e04
Merge pull request #14412 from jczaja/prv-dam-softmax
...
Softmax for Inference is enabled when ON_INFER is set
7 years ago
peizhilin
be332a13bc
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Jacek Czaja
b361579f09
- Softmax for Inference is enabled when ON_INFER is set
...
test=develop
7 years ago
Tao Luo
980a6753a8
fix typo to pass the ci
...
test=develop
7 years ago
Tao Luo
8f301f4618
Merge pull request #14381 from qingqing01/manylinux_v5_fix
...
Fix compiling with cuDNN v5.
7 years ago
peizhilin
1a9008c420
code style fix
...
test=develop
7 years ago
Tao Luo
e0d4e04bdd
fix some compiler warning
...
test=develop
7 years ago
Tao Luo
8ea13e336a
add in_num_col_dims for fc
7 years ago
JiabinYang
a507845a77
test=develop
7 years ago
Tao Luo
9eb0ab1db3
Merge pull request #14384 from tensor-tang/refine/lrn
...
Refine lrn cpu forward
7 years ago
peizhilin
30ddc07a7e
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Qiao Longfei
e65cbd3b06
Merge pull request #14387 from jacquesqiao/lookup_sparse_table_add_test_mode
...
Lookup sparse table add test mode
7 years ago
Qiao Longfei
6cf8f24b1b
Merge pull request #14389 from jacquesqiao/fix_sgd_op_optimize_sparse_table
...
sgd_op optimize selected rows do not enforce id < height
7 years ago
Xin Pan
10ab177f89
Merge pull request #14403 from PaddlePaddle/revert-14337-prv-dam-softmax
...
Revert "Softmax op optimization for inference "
7 years ago
Yan Chunwei
9f252e0032
Combine Inference Analysis with IR ( #13914 )
7 years ago
Tao Luo
5b9c62faee
Revert "Softmax op optimization for inference "
7 years ago
Tao Luo
6490bb2765
Merge pull request #14337 from jczaja/prv-dam-softmax
...
Softmax op optimization for inference
7 years ago
chengduo
9f68e9a7fe
fix auc op ( #14385 )
...
test=develop
7 years ago
dengkaipeng
a0284f6fbc
Add backward CPU kernel. test=develop
7 years ago
Dang Qingqing
d219818434
Fix compiling in cuDNN v5.
...
test=develop
7 years ago
Qiao Longfei
efb5c03f60
sgd_op optimize selected rows do not enforce id < height
...
test=develop
7 years ago
Qiao Longfei
7aa8b2ccf2
optimize code
7 years ago
Qiao Longfei
8d205c853c
add is_test for lookup_sparse_table
7 years ago
tensor-tang
b4dfba1779
refine lrn_op cpu forward and speedup
...
test=develop
7 years ago
tensor-tang
1be85d011d
add mkl vsqr and vpow
7 years ago
JiabinYang
f4be1d99d0
polish code and test
7 years ago
ruri
4a55fb5f5b
Add density_prior_box_op ( #14226 )
...
Density prior box operator for image detection model.
7 years ago
tensor-tang
0043c42b3e
add vrelu jitcode
...
test=develop
7 years ago
dengkaipeng
36c46152e1
Add unittest for yolov3_loss. test=develop
7 years ago
dengkaipeng
77c1328fa7
add CPU kernel forward
7 years ago
dengkaipeng
5d0b568ecb
Add YOLOv3 loss operator. test=develop
7 years ago
JiabinYang
b8ff0972b6
test=develop
7 years ago
JiabinYang
32e05b01f2
test=develop
7 years ago
peizhilin
61fa5218b9
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yibing Liu
bd2943788b
Fix gather & stack op ( #14355 )
...
* Add int type support for stack_op
* Improve gather op to support index with shape N x 1
test=develop
* Fix stack_op kernel's registry
test=develop
7 years ago
Yu Yang
8f9bfad246
perf(compile): speed up reduce_op compile by splitting files ( #14294 )
...
test=develop
7 years ago
sneaxiy
d231e55065
merge develop
...
test=develop
7 years ago
JiabinYang
c8801e100f
grad diff problem to be fixed and need api spec change to be done
7 years ago
Jacek Czaja
03299ed46c
- Fix to linking for GPU builds of softmax inference
...
test=develop
7 years ago
Jacek Czaja
0756343767
- Fix GPU compilation
...
test=develop
7 years ago
Jacek Czaja
d332326847
- Added unit tests for softmax is_test=True op
...
test=develop
7 years ago
Jacek Czaja
c1fccc29c1
- Noise adding removed for Test phase of softmax
7 years ago
peizhilin
7638f0afb3
simplify the logic
7 years ago
peizhilin
d01a26280e
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Xin Pan
ff28b1ffc0
Merge pull request #14071 from barrierye/add_similarity_focus_op
...
Add similarity focus op
7 years ago
li099
688ed60116
Add lod tensor array to tensor op ( #13990 )
...
* add lod tensor array concat
* add lod tensor array concat
* test=develop
* add lod tensor array concat
test=develop
* Fix API.spec
test=develop
* add lod tensor array concat
test=develop
* revise some bug of lod tensor array concat
test=develop
* add unittest for tensor array concat
test=develop
* change to tensor array to tensor
test=develop
* revise bug
test=develop
* revise a bug
test=develop
* revise a bug
test=develop
* revise a bug of python3
test=develop
7 years ago
peizhilin
e23061e0dc
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
chengduo
6c6e638550
Add InferVarType for some op ( #14201 )
...
* add_infer_var_type
test=develop
* InferVarTypeHelper-> VarTypeInferenceHelper
test=develop
* PassInputTypeAndDTypeOnOutput
test=develop
* follow comment
test=develop
7 years ago
peizhilin
1eec5a428f
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Kaipeng Deng
0b38822624
Merge pull request #14345 from heavengate/fix_grid_sampler
...
fix #14344 : win compile error, EigenTensor * float unsupported. test=develop
7 years ago
peizhilin
ca60e1d34d
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin
52f7644f53
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Qiyang Min
698698f2fa
Merge branch 'develop' into fix_vlog
7 years ago
qingqing01
abe209234f
Exhaustive search for cuDNN conv. ( #14286 )
...
* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.
* Fix compiling test=develop
7 years ago
Yu Yang
b59a9bfb7c
Clean buffered_allocator
...
test=develop
7 years ago
Kaipeng Deng
f215534ecf
Merge pull request #14205 from heavengate/nearest_interp
...
Add interpolate operator replace bilinear_interp_op and add nearest neighbor interp mode
7 years ago
dengkaipeng
72108d8dbe
fix win compile error: EigenTensor * float unsupported. test=develop
7 years ago
Yu Yang
26fb34c365
Merge develop tiny fix
7 years ago
Yu Yang
fdc689142c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
tensor-tang
22125ebaef
Merge pull request #14321 from tensor-tang/fea/jit/vscal
...
Fea jitcode vscal vaddbias
7 years ago
Tao Luo
34e9e59f4a
Merge pull request #14333 from kbinias/change-hardcoded-format-and-bump-mkldnn-version
...
Changed hardcoded format to any in convolution and bumped MKL-DNN version to 0.17rc
7 years ago
minqiyang
87450b9ad4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
...
test=develop
7 years ago
peizhilin
41b423d41b
remove duplicate
7 years ago
peizhilin
dcfab11193
merge from develop
7 years ago
peizhilin
4ffa92d4f0
Merge branch 'develop' into windows/build
7 years ago
chengduo
c5b6573a5a
Fix input<tensor> ( #14208 )
...
* fix input<tensor>
test=develop
* fix split_ids
test=develop
* ElementwiseMul should not support SelectedRows
* fix scale op
test=develop
* change GetTensorFromVar() method to GetTensorOrSelectedRowsFromVar()
* fix operator
* refine MultiOutput
* fix MultiOutput
test=develop
* disable test_dist_save_load
test=develop
* fix elementwise_op
test=develop
* add get_sparse_as_op
test=develop
* add info for check
test=develop
* rename get_sparse_as_op with extract_rows_as_op.
test=develop
* elementwise doesn't support selected_rows
* fix regularizer
* remove extract_rows_as
test=develop
* fix ci
test=develop
* add test for sum_op
* fix regularizer
test=develop
* test=develop
* fix pserver weight decay multi inputs test=develop
7 years ago
Krzysztof Binias
f1c1acf1ac
Changed hardcoded format to any in convolution and bumped MKL-DNN version to 0.17-rc
...
test=develop
7 years ago
Tao Luo
813e54efbd
Merge pull request #14328 from PaddlePaddle/revert-14046-windows/debug
...
Revert "cherry picked windows patches."
7 years ago
minqiyang
3db9fad764
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
...
test=develop
7 years ago
Xin Pan
b03a44e062
Merge pull request #14026 from JiabinYang/add_reorg_op
...
Add reorg op
7 years ago
Zhaolong Xing
ba8b5619a3
Revert "cherry picked windows patches."
7 years ago
minqiyang
fcc0452c8b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
...
test=develop
7 years ago
minqiyang
0c3227a523
Change the origin VLOG level to 10 times
...
Fix code to support cpplint syntax check
test=develop
7 years ago
tensor-tang
5e64244f25
add vaddbias jitcode
...
test=develop
7 years ago
tensor-tang
5f7956ae59
Merge remote-tracking branch 'ups/develop' into fea/jit/vscal
7 years ago
peizhilin
869487a2b7
Merge remote-tracking branch 'origin/develop' into windows/build
7 years ago
tensor-tang
3d950a812d
combine jitcode of vscal
7 years ago
tensor-tang
03e11f3fc9
add vscal jitcode
7 years ago
dzhwinter
234a1d9248
Merge remote-tracking branch 'origin/develop' into windows/debug
...
test=develop
7 years ago
chengduo
a270fdf2db
Fix SelectedRowsAdd bug ( #14309 )
...
* fix selected_rows bug
test=develop
* refine cos_sim
test=develop
7 years ago
tensor-tang
2f0a379af7
Merge pull request #14307 from tensor-tang/fix/mac
...
fix mac
7 years ago
Zeng Jinle
b2af213009
Merge pull request #14292 from sneaxiy/delete_buggy_selected_rows_functor
...
Delete buggy selected_rows functor
7 years ago
tensor-tang
161ba9c9d1
fix mac
...
test=develop
7 years ago
tensor-tang
e8642c3c1f
Merge pull request #14265 from tensor-tang/fea/jit/vadd
...
add vadd, vaddrelu jitcode
7 years ago
dengkaipeng
8b47d90f5d
add 'actual_shape' attribute. test=develop
7 years ago
tensor-tang
382307b943
refine code
...
test=develop
7 years ago
tensor-tang
3319072858
fix jit kernel test on mac
...
test=develop
7 years ago
Yu Yang
057a682ee9
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
Qiao Longfei
e0c8397426
Merge pull request #14257 from jacquesqiao/optimize-pserver-profiler-thread-pool
...
clean rpc server profiler
7 years ago
chengduo
ffc866159f
hot fix log ( #14293 )
...
test=develop
7 years ago
Zhaolong Xing
65b61db10a
Merge pull request #13927 from NHZlX/fix_googlenet_bug_with_rule
...
Fix googlenet bug with rule
7 years ago
tensor-tang
25e070ecc7
Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
barrierye
ef8218be22
update docs test=develop
7 years ago
sneaxiy
9518bc8d0a
delete buggy selected_rows functor
...
test=develop
7 years ago
chengduo
a9b5d42dd4
Add fp16 backward support ( #14202 )
...
* add fp16 backward support
test=develop
* add sum_op fp16 test
* disable test_dist_save_load
test=develop
* add check_grad for sum
* add unit test for softmax_grad fp16
test=develop
* add scale_op unit test
* add mul_grad_op unit test for fp16
* add cross_entropy_grad and eman_grad unit test for fp16
test=develop
* fix cross_entropy unit test
* add pool2d fp16 unit test
* refine conv2d fp16 unit test
test=develop
* refine activation unit test
test=develop
* fix ci
test=develop
* follow zhihong's comment, copy from https://github.com/PaddlePaddle/Paddle/pull/12796
test=develop
7 years ago
Qiao Longfei
3b8dd9ebbd
optimize code test=develop
7 years ago
Qiao Longfei
2921f8a79c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
7 years ago
dzhwinter
2835e04409
merge develop branch. test=develop
7 years ago
dzhwinter
deb4af70ef
add test
7 years ago
qingqing01
db8c52da5e
Revert " Exhaustive search for cuDNN conv. ( #14043 )"
...
This reverts commit ce7d9b0799.
7 years ago
qingqing01
ce7d9b0799
Exhaustive search for cuDNN conv. ( #14043 )
...
* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Clean code
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.
7 years ago
tensor-tang
cb4083b9fa
fix compile error
...
test=develop
7 years ago
tensor-tang
dd343a4971
Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
Zeng Jinle
fcbe84cb50
Merge pull request #14270 from sneaxiy/fix_rmsprop_enforce_bug
...
Fix rmsprop_op enforce bug
7 years ago
nhzlx
5700fafd0f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_googlenet_bug_with_rule
...
test=develop
7 years ago
nhzlx
86b99ac953
fix comments and fix bug
7 years ago
tensor-tang
e6cfdf6c74
Merge pull request #14274 from tensor-tang/fix/jit
...
fix jit on mac
7 years ago
Zeng Jinle
8ac2242b6e
Merge pull request #14075 from sneaxiy/remove_some_locks_in_pe
...
Remove some locks in ParallelExecutor
7 years ago
tensor-tang
b81e1b655e
fix jit on mac
...
test=develop
7 years ago
sneaxiy
11f032a82e
fix rmsprop_op enforce bug
...
test=develop
7 years ago
tensor-tang
b68ececb73
add vaddrelu jitcode
...
test=develop
7 years ago
peizhilin
1f12ba6192
gpu support, fix build issue:
...
1. Non-UTF-8 characters within comments of OPs may cause protobuf to fail in parse_from_string
2. comment out some ops which are not supported on windows
3. cuda libs may not be correctly linked to target on windows
7 years ago
Wu Yi
8fc05e0373
fix cpu build test=develop ( #14260 )
7 years ago
tensor-tang
bb09e31020
add vadd jitcode
...
test=develop
7 years ago
Qiao Longfei
59fbfbfbf7
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
...
test=develop
7 years ago
whs
d6a6a13039
Fix build error of affine grid op in mac os. ( #14237 )
...
* Fix build error of affine grid op in mac os.
test=develop
* Make function return reference.
test=develop
7 years ago
tensor-tang
d55481cfeb
Merge pull request #14241 from tensor-tang/refine/jit/vmulcode
...
Refine/jit/vmulcode
7 years ago
Qiao Longfei
9e4e9e9b6e
clean rpc server profiler
7 years ago
Zeng Jinle
8d930195d9
Merge pull request #14238 from sneaxiy/fix_read_lod_level_bug
...
Fix lod_level share bug in read_op
7 years ago
Wu Yi
306236c2c0
feature/DC asgd ( #12722 )
...
* wip
* add ref_by_trainer_id op
* ready to test
* fix ref inputs
* refine rpc_op_handle
* fix merge bug
7 years ago
dengkaipeng
fef2faa709
limit CUDA kernel parallel threads max number to 4096. test=develop
7 years ago
tensor-tang
c3cbf0b8ef
Merge pull request #14185 from tpatejko/tpatejko/mkldnn-conv-residual-data-reorder
...
Residual data reorder in MKLDNN convolution
7 years ago
peizhilin
71d7980f69
fix build issue 1
7 years ago
dengkaipeng
34bfae243a
Add Interpolate operation. test=develop
7 years ago
sneaxiy
46d4829dd1
fix lod_level share bug in read_op
...
test=develop
7 years ago
tensor-tang
8465e7876f
auto grow the size and fix test
...
test=develop
7 years ago
tensor-tang
9255119fd9
refine jit vmul with all size
7 years ago
tensor-tang
a9c1824131
refine jit vmul code supporting multiple of 2
7 years ago
tensor-tang
61fdc38e51
Merge pull request #14206 from tensor-tang/fea/jit/gen
...
Fea/jit/gen
7 years ago
peizhilin
9d67c1fb69
cpu build support
7 years ago
barrierye
5e7bb6a9bd
update docs test=develop
7 years ago
dzhwinter
60f70b174d
test=develop
7 years ago
sneaxiy
7ff320f8cc
merge develop
7 years ago
dongzhihong
00cf66964f
Merge remote-tracking branch 'origin/develop' into fix/sign_op
...
test=develop
7 years ago
Kaipeng Deng
daed473d4a
Merge pull request #14089 from heavengate/pool_exclude
...
add inclusive/exclusive mode in avg pool
7 years ago
Kaipeng Deng
64f3e3ed8f
Merge pull request #14069 from heavengate/grid_sampler
...
Grid sampler operator for spatial transformer network.
7 years ago
sneaxiy
366ebb93f7
test=develop
7 years ago
dzhwinter
eb2f7ed21b
refine tests. test=develop
7 years ago
Jiabin Yang
9f65b616b2
Merge branch 'develop' into add_reorg_op
7 years ago
Kaipeng Deng
0b29078201
Merge branch 'develop' into grid_sampler
7 years ago
whs
0c319e0b35
Add affine grid generator op ( #12238 )
...
* Add affine grid generator.
* Fix affine grid.
* Add unit test.
* Add CPU kernel and fix unit test.
* Fix CPU kernel.
* Refine code.
test=develop
* Fix python api.
test=develop
* Update python api.
test=develop
* Fix comment.
test=develop
* Rename affine_grid_generator to affine_grid and enhance unit test.
test=develop
* Fix unit test.
test=develop
7 years ago
tangwei12
d325e668b8
[1.1] Load vars on PSERVER ( #14037 )
...
* fix dim0 in _load_slice_up_vars
* fix dim0 in _load_slice_up_vars, fix innershape in delete_var_op
* Revert "fix lookuptable in reduce strategy"
This reverts commit 0e722c5
* add unit test for dist
* add unit test for dist, test=develop
* cancel revert, test=develop
7 years ago
tensor-tang
85bcb286f5
refine vmul jitcode
...
test=develop
7 years ago
tensor-tang
a764e900a5
Merge remote-tracking branch 'ups/develop' into fea/jit/gen
...
test=develop
7 years ago
tensor-tang
a3377f7b0a
refine jitcode and add vmul jitcode implementation
7 years ago
dzhwinter
1ace55c8ee
merge develop branch
7 years ago
dengkaipeng
df4a3544aa
nearest neighbor interp add cuda kernel. test=develop
7 years ago
chengduo
2ccf77d1c1
Refine GetTensorFromVar ( #14160 )
...
* fix GetTensorFromVar
test=release/1.1
* refine GetTensorFromVar
test=develop
7 years ago
dengkaipeng
9755611938
add unittest for nearest_neighbor_interp_op
7 years ago
dengkaipeng
a24691a2a9
add nearest neighbor interpolation operator cpu kernel
7 years ago
JiabinYang
8d3c3e048b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
tensor-tang
f3badacd97
Merge remote-tracking branch 'ups/develop' into fea/jit/gen
7 years ago
tensor-tang
a53b1b0b1b
refine and init jitkernel vmul
7 years ago
tensor-tang
2139b9f677
add jit gencode
7 years ago
Tomasz Patejko
8899d42265
MKLDNN conv residual data: primitive reuse interface used. Reorder done when formats are different
...
test=develop
7 years ago
chengduo
b73708d20b
add int and int64 dtype for gather_op ( #14175 )
...
test=develop
7 years ago
Tomasz Patejko
f11934cbe6
MKLDNN conv residual data: residual data is reordered when formats are incorrect
7 years ago
Tao Luo
cdf2579d08
Merge pull request #14053 from jczaja/prv-seqpool-max
...
Max Sequence pool optimization
7 years ago
Kaipeng Deng
a3b26e8528
Merge branch 'develop' into grid_sampler
7 years ago
dengkaipeng
7333fe8e55
add math formula for exclusive/inclusive mode in avg pool. test=develop
7 years ago
dzhwinter
316765839d
add back jit simd instructions. stage.
7 years ago
Xin Pan
eb7ed1b720
Merge pull request #13897 from gmcather/develop
...
1.add position encoding 2.logloss in nn.py
7 years ago
barrierye
fc23cc9d30
update paddle/fluid/API.spec
...
test=develop
7 years ago
dzhwinter
bf2e4cb188
cleard. staged
7 years ago
chengduo
2f639113ee
Fix sum_op's GetExpectedKernelType ( #14112 )
...
* fix sum_op's GetExpectedKernelType
test=develop
* fix ci fail
test=develop
7 years ago
gmcather
ba22624d7e
position encoding && log loss
...
test=develop
7 years ago
dzhwinter
ebfe5a02b3
merge develop branch
7 years ago
qingqing01
cb27a9219d
Merge pull request #13971 from sefira/FasterOpDoc
...
generate proposal labels doc
7 years ago
sneaxiy
5e5d2223a1
test=develop
7 years ago
tensor-tang
3c957af139
Merge pull request #14080 from tensor-tang/refine/jit/crf2
...
Refine/jit/crf decoding
7 years ago
barrierye
5f3acac9b3
update paddle/fluid/API.spec
...
test=develop
7 years ago
Jacek Czaja
458b16f42a
Rebase of seqpool-max optimization
...
test=develop
- Added rough profiling
- Profiled maxpool itself
- First draft of max seqpool optimization (is_test added)
- Added unit tests to seqpool
- Cosmetic fixes
- Fix to UT of Seq pool
Disabled grad checking for sequence max pool when is_test is set to True
-Cosmetic fix to comment
test=develop
- Fix to GPU build
test=develop
- yet another GPU fix for sequence max pool
- Fix to comment
test=develop
- Change to API of sequence_pool
test=develop
- Yet another API spec change
test=develop
7 years ago
dengkaipeng
ff6329bd5f
fix some inappropriate expressions in api doc for grid_sampler. test=develop
7 years ago
dengkaipeng
8f1e398824
move param exclusive to the last in pool2d/pool3d for forward compatibility. test=develop
7 years ago
dengkaipeng
593e1b18d7
fix some bugs and add some doc for GridSampleOp
7 years ago
dengkaipeng
0bb0e0c10f
add Grid Sampler Operator for STN.
7 years ago
Yu Yang
c01696f8c2
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
Qiao Longfei
d26ff8cb2d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpu-for-1.1-merge-with-shape
7 years ago
JiabinYang
e0a89503f8
test=develop
7 years ago
Wu Yi
26200f2e42
[1.1] [project] train imagenet using large batch size ( #13766 )
...
* fix nccl2 lars dist support
* put lars in momentum op
* add tests lars
* fix ci
* fix cpu kernel
* soft warning
* remove lars in test_recognize_digits.py
* move to another op
* add file
* update api.spec test=develop
* update test=develop
* fix api.spec test=develop
* wip
* wip, finish grad merge ops
* wip, finish graph build
* wip test running
* work on 1 gpu
* workable version
* update
* fix tests
* fuse broadcast op
* fix compile failed
* refine
* add batch merge test mnist
* fix CI test=develop
* fix build
* use independent bn params for batch merge test=develop
* update api.spec
* follow comments and for test
* wip
* refine tests test=develop
* follow comments test=develop
* remove startup bn modify test=develop
* follow comments test=develop
* fix merge test=develop
7 years ago
barrierye
8c1e304307
merge nn.py
7 years ago
dengkaipeng
c93e044ae0
add inclusive/exclusive mode in PoolOp avg pool type
7 years ago
JiabinYang
9a74c4489f
test=develop
7 years ago
barrierye
9dc28179a4
add similarity_focus op
7 years ago
Qiao Longfei
7cd2417fe2
Merge branch 'develop' into cpu-for-1.1-merge-with-shape
...
test=develop
7 years ago
dzhwinter
c8adc2c6fe
cudnn version. staged.
7 years ago
Yan Chunwei
ee74be3a49
[1.1] Bugfix/tensorarray ( #14044 )
7 years ago
Qiyang Min
33b4920d2d
Merge pull request #14057 from velconia/continue_hash_op
...
[1.1] Add hash_op implementation
7 years ago
Qiyang Min
209f24a241
Merge pull request #14051 from velconia/accelerate_embedding_grad
...
[1.1] Accelerate sparse embedding grad op in CPU device
7 years ago
Qiao Longfei
7cfc3c4415
Merge branch 'optimize-sum-seq-pooling-op' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
72aef6b168
sum selected rows check empty
7 years ago
Qiao Longfei
641369f92b
Merge branch 'dist-table-do-not-init-on-trainer' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
d69c820707
Merge branch 'add-flag-to-control-rpc-thread-num' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
1ed9ef6d70
Merge branch 'shape_int_to_int64' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
da61a5b672
Merge branch 'optimizer-prefetch' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge
7 years ago
tangwei12
5ce3a32e06
Merge branch 'develop' into optimizer-prefetch
7 years ago
seiriosPlus
b6590b05fb
submit by tangwei12, test=develop
7 years ago
tangwei12
cb1ccc710b
fix shape type in uniform_random_op.cu
7 years ago
Qiao Longfei
575f22711d
optimize code
...
test=develop
7 years ago
Qiao Longfei
96d5500934
optimize code
7 years ago
Qiao Longfei
748ee35c89
sum op handle empty input update selected_rows_functor.cu
7 years ago
Qiao Longfei
dd78b5df93
sum op handle empty input
7 years ago
Qiao Longfei
cbe128bbae
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei
f4df0cb1a2
update the type of shape to int64, format code
7 years ago
Qiao Longfei
7dcb0dc8c6
update year
7 years ago
Qiao Longfei
68aeb4e7e9
add fake init test in test_dist_transpiler
7 years ago
Qiao Longfei
a13c788a04
fix a bug
7 years ago
Zeng Jinle
97d47a7d08
Merge pull request #13913 from sneaxiy/seq_reverse
...
Add sequence_reverse_op
7 years ago
JiabinYang
6e3615422f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
Jiabin Yang
a3efba176c
Merge pull request #14085 from jerrywgz/fix_generate_proposals_op
...
[1.1] fix erase end in generate proposals op
7 years ago
dzhwinter
7141debe38
add cudnn back. staged.
7 years ago
Qiao Longfei
0328ffd3ab
add fake init op
7 years ago
Hongyu Liu
379d933ae5
Merge pull request #14036 from phlrain/add_dropout_att_new
...
Add dropout att new 1.1 merge
7 years ago
tangwei12
d8b697357f
update height_sections to int64_t
7 years ago
jerrywgz
de2f965c9b
test=develop
7 years ago
dzhwinter
09409bad4d
staged. test speed=49ms in 1080.
7 years ago
tensor-tang
64d5b4385e
fix crf decode avx512
7 years ago
tensor-tang
21487d78bf
add crf decode jit kernel
7 years ago
sneaxiy
1af3fe8c35
test=develop
7 years ago
Qiao Longfei
de539d72da
format
...
test=develop
7 years ago
sneaxiy
5be6f762d0
remove_lock_in_some_ops
...
test=develop
7 years ago
buxingyuan
6c1d74bb47
Merge branch 'develop' into FasterOpDoc
...
test=develop
7 years ago
JiabinYang
7bcba47e41
test=develop
7 years ago
barrierye
a7f94ec794
add similarity_focus op
7 years ago
minqiyang
0de6811ee0
Change reserve to resize
...
test=develop
7 years ago
JiabinYang
9cad409f2a
test=develop
7 years ago
minqiyang
5660d6a3ba
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_embedding_grad
7 years ago
tensor-tang
a05fce6544
Merge remote-tracking branch 'ups/develop' into fix/jit/avx
...
test=develop
7 years ago
JiabinYang
bd064c0f44
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
Qiyang Min
d0fdcb2f6d
Merge pull request #14048 from velconia/change_sequence_pool_to_cpu
...
Accelerate Sequence Pool Grad Op
7 years ago
Yu Yang
8310ce6007
Fix cluster memory
...
test=develop
7 years ago
tensor-tang
d24d282a7a
fix avx error
...
test=develop
7 years ago
tensor-tang
9cb8738f54
Merge pull request #14018 from tensor-tang/refine/jit/gru
...
Refine/jit/gru
7 years ago
Qiao Longfei
6253b152e6
Merge branch 'optimize-sum-seq-pooling-op' of https://github.com/jacquesqiao/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei
14f5a40898
fix unit test
7 years ago
minqiyang
5de4619781
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_embedding_grad
7 years ago
minqiyang
0695c1fbe8
Add remind for code
...
test=develop
7 years ago
minqiyang
0c5c4c4a5b
Add blas header file
...
test=develop
7 years ago
buxingyuan
d0ccdf8fc1
follow comments
...
test=develop
7 years ago
minqiyang
e2a348cd10
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into change_sequence_pool_to_cpu
7 years ago
Qiao Longfei
f4e6fe0786
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
minqiyang
40141f749b
Implement the unittest for hash op
...
test=develop
7 years ago
minqiyang
8a0f26f45f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into continue_hash_op
7 years ago
minqiyang
d4f9aa0852
Add hash op implementation
7 years ago
tangwei12
755927d2b0
shape type to int64_t, test=develop
7 years ago
Qiao Longfei
7357d8412e
add flags to control the thread num for pserver
7 years ago
minqiyang
1a3b38a432
Polish code
...
test=develop
7 years ago
minqiyang
133bac2b10
Accelerate embedding op grad
...
test=develop
7 years ago
dzhwinter
597d92179b
clean demo_ci
7 years ago
phlrain
201d4f2a85
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_dropout_att_new
7 years ago
phlrain
a6e6bc45d6
modify dropout att; test=develop
7 years ago
minqiyang
2468057da6
Move code to SumSeqPoolGradFunctor
...
test=develop
7 years ago
minqiyang
9725db0d40
Fix copy wrong pos bug
...
test=develop
7 years ago
minqiyang
9c68709036
Accelerate sequence_pool functor
7 years ago
minqiyang
14ebc424d6
Add gpu support for unittest
7 years ago
jerrywgz
e906c8e5e7
Merge pull request #14022 from jerrywgz/fix_rpn_target_assign_op
...
fix random fail in rpn target assign
7 years ago
minqiyang
bd5a82e193
Polish unit test code
7 years ago
minqiyang
047fa2f9aa
Add unit-test for sequence_pooling functor
7 years ago
qingqing01
c7379a7320
Fix top_k op ( #14034 )
...
1. Fix CUDA kernel when height is larger than 2048.
2. Support input with more than 2D.
3. Fix unit test when k is larger than 1.
4. Enhance unit testing.
test=develop
7 years ago
sneaxiy
016bf51e3f
test=develop
7 years ago
JiabinYang
c13f1ef3c4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
Xin Pan
8837669782
Merge pull request #13982 from panyx0718/fix
...
Clean up Reuse
7 years ago
dzhwinter
dbd0075b68
Merge branch 'windows/support' into lb
7 years ago
dzhwinter
c6dcffc61a
lb. add debug output
7 years ago
sneaxiy
92a2817a2b
test=develop
7 years ago
JiabinYang
8e8e8e66ab
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
phlrain
049c9c7d2a
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_dropout_att_new
7 years ago
phlrain
ffb24a73ec
add dropout attr; test=develop
7 years ago
wanghaoshuang
5993155d67
Merge remote-tracking branch 'dzhwinter/windows/support' into windows/support
7 years ago
wanghaoshuang
f9e7cfb03c
save binary file
7 years ago
tensor-tang
032c3a07e3
Merge remote-tracking branch 'ups/develop' into refine/jit/gru
...
test=develop
7 years ago
tensor-tang
159be8cc63
optimize fusion gru kernel at size 8
7 years ago
Tao Luo
23da8defc8
Merge pull request #14028 from luotao1/fix_resnet50_test
...
fix typo and warning in analyzer_resnet50_test
7 years ago
Yu Yang
71c846ef8a
Revert buggy changes
...
test=develop
7 years ago
JiabinYang
ff07dc315e
test=develop
7 years ago
chengduo
a7497653d0
Refine Split op ( #13967 )
...
* speedup split_op
test=develop
* speedup split_op
test=develop
* rename ConcatGrad to Split
* refine concat and split
test=develop
* fix compile error
7 years ago
Yu Yang
dbf9f6f408
Fix distribute compile
...
test=develop
7 years ago
jerrywgz
e0708e62ba
refine code
7 years ago
jerrywgz
1c591c3909
Merge branch 'develop' into fix_rpn_target_assign_op
7 years ago
sneaxiy
a9d7a9d720
test=develop
7 years ago
Tao Luo
316bc9bfc9
fix typo and warning in analyzer_resnet50_test
...
test=develop
7 years ago
jerrywgz
f06c6193d7
fix rpn target assign test=develop
7 years ago
dongzhihong
563e7bca7f
"fix op. test=develop"
7 years ago
Xin Pan
8f2116d8fa
clean up after the changes have been stopped for so long.
...
test=develop
7 years ago
tensor-tang
83dc689877
Merge remote-tracking branch 'ups/develop' into refine/jit/gru
...
test=develop
7 years ago
tensor-tang
640e789d3d
add fusion gru jit kernel
7 years ago
JiabinYang
39d39775c3
test=develop
7 years ago
JiabinYang
70351de1b5
test=develop
7 years ago
Yu Yang
461f71a90b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
qingqing01
0e24138494
Merge pull request #13991 from qingqing01/refine_generate_proposals_op
...
Refine generate proposals op
7 years ago
gongweibao
58c027cc38
Add rpc profiler flags. ( #13989 )
...
Add rpc profiler flags
7 years ago
Tao Luo
42aa1d409d
Merge pull request #13485 from tpatejko/tpatejko/capi-resnet-conv-elementwise-fusion
...
MKLDNN conv+elementwise_add fusion for residual connections in Resnet
7 years ago
tensor-tang
664159ad42
Merge pull request #13998 from tensor-tang/fea/fusion_seqconv_add
...
Fea/fusion seqconv eltadd relu
7 years ago
Qiao Longfei
40d65a1369
optimize code
7 years ago
Qiao Longfei
d37b9797ec
update test
7 years ago
Qiao Longfei
4051fb36b5
add monitor thread
7 years ago
Qiao Longfei
e67783375d
code clean
7 years ago
Qiao Longfei
5c65eff6ef
update test for ctr data
7 years ago
jerrywgz
765085d297
Merge pull request #13904 from jerrywgz/roialign
...
Add RoI align operator.
7 years ago
Dang Qingqing
56936b9e25
Refine doc for generate_proposals_op.
...
test=develop
7 years ago
Tomasz Patejko
4be45af1cc
MKLDNN conv + elementwise_add fusion: skip connection attribute renamed. Comments about patterns added.
...
test=develop
7 years ago
Michal Gallus
f688197182
MKLDNN conv + elementwise_add fusion: Fix output_data to point to the right tensor, also fix transpiler integration
7 years ago
Tomasz Patejko
bf95ac36a7
MKLDNN conv + elementwise_add fusion: further reformatting
7 years ago
Tomasz Patejko
b8e54ab5cc
MKLDNN conv + elementwise_add fusion: parameter name changed to ResidualData
7 years ago
Tomasz Patejko
41f3d78fdf
MKLDNN conv + elementwise_add fusion: output and elemwise param share data in conv primitive. Output is properly allocated
7 years ago
Tomasz Patejko
56528531ea
MKLDNN conv + elementwise_add fusion: initial work on passing eltwise data to conv primitive
7 years ago
Qiao Longfei
044d2e20bf
update test method
7 years ago
Dang Qingqing
4801ee8f97
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_generate_proposals_op
7 years ago
Qiao Longfei
92cbaa41eb
add GetTimeInSec
7 years ago
tensor-tang
23fc896bc2
Merge remote-tracking branch 'ups/develop' into fea/fusion_seqconv_add
...
test=develop
7 years ago
tensor-tang
339e655aec
refine and add seqconv elementwiseadd relu op test
7 years ago
jerrywgz
a1d3db031b
Merge pull request #13844 from jerrywgz/fix_roi_pool
...
fix roi pool register
7 years ago
Dang Qingqing
8e0b9496de
Fix unit test
...
test=develop
7 years ago
tensor-tang
0a9f5f1790
Merge pull request #13968 from tensor-tang/fix/jit/exp
...
Fix jit exp
7 years ago
Yipeng
fcb2e8103e
Ocr end2end dev ( #13889 )
...
* add detect and end2end code
* update the scale for coordinates restore
* fix merge bug with dev.
* fix merge bug with dev.
* test=develop
* fix code style test=develop
* fix code style test=develop
* test=develop
* test=develop
* test=develop
7 years ago
tensor-tang
e5ce965952
refine and add eltadd_relu unit test
7 years ago
sneaxiy
5a38930660
test=develop
7 years ago
Qiao Longfei
dd2dfeb624
add debug information
7 years ago
Qiao Longfei
803e2ed9f4
add ctr_reader_test and fix bug
7 years ago
tensor-tang
7cb19a5976
fuse elementwise_add and relu
7 years ago
Qiao Longfei
c8bd521045
add reader thread status
7 years ago
tensor-tang
3c249283af
init seqconv eltadd relu op
7 years ago
Qiao Longfei
71cbc8bd24
optimize code
7 years ago
Qiao Longfei
694e8945a2
add a base class for reader
7 years ago
Qiao Longfei
d981333e94
add a base class for reader
7 years ago
Qiao Longfei
a06173eedc
clean code
7 years ago
Qiao Longfei
71c2ad412f
complete read thread
7 years ago
sneaxiy
ac2eba4457
test=develop
7 years ago
Qiao Longfei
0f3ece775d
use gzstream
7 years ago
jerrywgz
553342624e
test=develop
7 years ago
jerrywgz
9a14ca91b8
test=develop
7 years ago
tensor-tang
60ff05e312
Merge branch 'luotao1-fix_rnn2_test' into fix/jit/exp
...
test=develop
7 years ago
Qiao Longfei
a1e0f5abb7
add gzstream.cmake
7 years ago
Tao Luo
7d680be5a3
Merge branch 'develop' into mkldnn_test
7 years ago
buxingyuan
0bb3b099c2
generate_proposal_labels doc
7 years ago
Qiao Longfei
20f181cdc1
init ctr_reader
7 years ago
gongweibao
a831ecc75d
Add grpc error context. ( #13957 )
...
Add grpc error context
7 years ago
tensor-tang
b139b687de
Merge remote-tracking branch 'ups/develop' into fix/jit/exp
...
test=develop
7 years ago
qingqing01
67a2b5215d
Add affine channel op to speed and save memory for faster-rcnn model. ( #13919 )
...
* Add affine channel op.
* Update code and add Python API.
test=develop
* Update API.spec
test=develop
7 years ago
tensor-tang
748435586a
clean code exp avx
7 years ago
tensor-tang
b4751a34a5
fix illegal instruction of rnn2
7 years ago
tensor-tang
30dfbdee7f
Merge pull request #13951 from tensor-tang/fix/warning
...
fix warning and mac compile
7 years ago
tensor-tang
36588b3365
fix illegal instruction of rnn1 and text
7 years ago
Tao Luo
6a4e9230ed
Merge branch 'develop' into mkldnn_test
7 years ago
gongweibao
078223b3e3
Add rpc timeline. ( #13900 )
...
Add rpc timeline
7 years ago
dzhwinter
29382db625
Merge pull request #13874 from dzhwinter/fix/momentum
...
add sparse update momentum. test=develop
7 years ago
qingqing01
5dbb2e9986
Small changes for sum_op to avoid zero setting. ( #13923 )
7 years ago
Tao Luo
e47f4186ae
fix some compiler warning
7 years ago
dzhwinter
00e8791f66
fix compile in cpu error. test=develop
7 years ago
tensor-tang
e69328c3bc
fix warning and mac compile
...
test=develop
7 years ago
Qiao Longfei
d26e4507da
init ctr data
7 years ago
dzhwinter
d239cf2e15
use binary search. test=develop
7 years ago
dzhwinter
a9f5f822e6
use binary search. test=develop
7 years ago
tensor-tang
6447155dac
Merge pull request #13851 from tensor-tang/fea/jitkernel_peephole
...
Fea jitkernel lstm peephole
7 years ago
sneaxiy
4b4af84e67
test=develop
7 years ago
jerrywgz
4c9884e713
refine unittest test=develop
7 years ago
Qiao Longfei
0225957515
change elementwise_add to elementwise_add_to test=develop
7 years ago
Qiao Longfei
bd2b6d7f8f
sum_op support inplace
7 years ago
Xin Pan
7fb5b66ac2
Merge pull request #13916 from panyx0718/fix2
...
Make Var::GetMutable robust
7 years ago
dzhwinter
3861269594
merge develop branch
7 years ago
jerrywgz
98c3294b85
Merge branch 'roialign' of https://github.com/jerrywgz/Paddle into roialign
7 years ago
tangwei12
fa2ab3346c
fill constant add infervarshape, lookuptable clone lr var ( #13830 )
...
* fill constant add infervarshape, lookuptable clone lr var
* test=develop
* add lookuptable ut, test=develop
* bug fix in transpiler about async with lookup table
* test=develop
7 years ago
jerrywgz
8c79071d6a
roi_align for gpu
7 years ago
Xin Pan
342e436158
Make Var::GetMutable robust
...
test=develop
7 years ago
Yan Chunwei
7a751b83ac
fix isfinite_op sprintf ( #13850 )
...
test=develop
7 years ago
Qiyang Min
e3a64fca44
Merge pull request #13835 from velconia/fix_reshape_op
...
Fix Reshape op when input is the same with output
7 years ago
Yibing Liu
46b0b7903c
Merge pull request #13856 from kuke/seq_unpad_op
...
Add sequence unpad op
7 years ago
Qiao Longfei
b4a32eafdf
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
...
test=develop
7 years ago
jerrywgz
c9d2046f76
roi_align for gpu
7 years ago
jerrywgz
2f5a80174e
add roi_align api
7 years ago
dzhwinter
e41a3fcd68
fix update to develop hang problem.
7 years ago
Zeng Jinle
93606c2c2c
Merge pull request #13689 from sneaxiy/sparse_rmsprop
...
Fix sparse rmsprop
7 years ago
Qiao Longfei
681226e97c
Merge pull request #13864 from jacquesqiao/py-reader-add-test-mode
...
reader block queue add test mode
7 years ago
jerrywgz
90f39b1123
Merge branch 'roialign' of https://github.com/jerrywgz/Paddle into roialign
7 years ago
Xin Pan
288a112ffd
Revert "Revert "Revert "Make variable::GetMutable robust"""
7 years ago
sneaxiy
5cedfb60c8
test=develop
7 years ago