sneaxiy
faac8a76ce
remove unnecessary codes
...
test=develop
7 years ago
Yu Yang
ff9e531bd9
style(platform): disable warning when cuda cc not matched ( #14029 )
...
Warning only at first when CUDA CC not matched.
test=develop
7 years ago
Qiao Longfei
59fbfbfbf7
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
...
test=develop
7 years ago
Qiao Longfei
fe4cd50286
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-thread-pool
...
test=develop
7 years ago
whs
d6a6a13039
Fix build error of affine grid op in mac os. ( #14237 )
...
* Fix build error of affine grid op in mac os.
test=develop
* Make function return reference.
test=develop
7 years ago
Qiao Longfei
ac415c0094
change lock_guard to unique_lock
7 years ago
Qiao Longfei
f4a76078d0
optimize thread pool
7 years ago
tensor-tang
d55481cfeb
Merge pull request #14241 from tensor-tang/refine/jit/vmulcode
...
Refine/jit/vmulcode
7 years ago
Qiao Longfei
9e4e9e9b6e
clean rpc server profiler
7 years ago
Zeng Jinle
8d930195d9
Merge pull request #14238 from sneaxiy/fix_read_lod_level_bug
...
Fix lod_level share bug in read_op
7 years ago
Wu Yi
306236c2c0
feature/DC asgd ( #12722 )
...
* wip
* add ref_by_trainer_id op
* ready to test
* fix ref inputs
* refine rpc_op_handle
* fix merge bug
7 years ago
dengkaipeng
fef2faa709
limit CUDA kernel parallel threads max number to 4096. test=develop
7 years ago
tensor-tang
c3cbf0b8ef
Merge pull request #14185 from tpatejko/tpatejko/mkldnn-conv-residual-data-reorder
...
Residual data reorder in MKLDNN convolution
7 years ago
peizhilin
71d7980f69
fix build issue 1
7 years ago
tensor-tang
6b49ee42c3
Merge pull request #14239 from tensor-tang/fix/avx
...
Fix avx illegal instructions
7 years ago
tensor-tang
ef9c10927d
Merge pull request #14233 from tensor-tang/fix/guide
...
throw error when mismatch cpu avx version
7 years ago
dengkaipeng
34bfae243a
Add Interpolate operation. test=develop
7 years ago
sneaxiy
46d4829dd1
fix lod_level share bug in read_op
...
test=develop
7 years ago
tensor-tang
8465e7876f
auto grow the size and fix test
...
test=develop
7 years ago
tensor-tang
9255119fd9
refine jit vmul with all size
7 years ago
tensor-tang
a9c1824131
refine jit vmul code supporting multiple of 2
7 years ago
tensor-tang
61fdc38e51
Merge pull request #14206 from tensor-tang/fea/jit/gen
...
Fea/jit/gen
7 years ago
tensor-tang
e09a7c793d
remove the warning log since do not have avx2, avx512 flags
...
test=develop
7 years ago
tensor-tang
f524c1b62b
throw error when mismatch cpu version
...
test=develop
7 years ago
peizhilin
9d67c1fb69
cpu build support
7 years ago
barrierye
5e7bb6a9bd
update docs test=develop
7 years ago
Xin Pan
c2d70fca30
fix to only check block 0
...
test=develop
7 years ago
minqiyang
e46f03e19d
Add TESTING_DEBUG_MODE to support debug info in daily CI test
...
test=develop
7 years ago
dzhwinter
baf0ff4510
Merge pull request #14020 from dzhwinter/fix/sign_op
...
"fix sign op"
7 years ago
dzhwinter
60f70b174d
test=develop
7 years ago
sneaxiy
7ff320f8cc
merge develop
7 years ago
Xin Pan
d0459ac8d0
Merge pull request #14223 from panyx0718/fix5
...
add more debug info.
7 years ago
dongzhihong
00cf66964f
Merge remote-tracking branch 'origin/develop' into fix/sign_op
...
test=develop
7 years ago
Kaipeng Deng
daed473d4a
Merge pull request #14089 from heavengate/pool_exclude
...
add inclusive/exclusive mode in avg pool
7 years ago
Kaipeng Deng
64f3e3ed8f
Merge pull request #14069 from heavengate/grid_sampler
...
Grid sampler operator for spatial transformer network.
7 years ago
Xin Pan
aaeedd0ff3
make it warn
...
test=develop
7 years ago
Zeng Jinle
b316437a50
Merge pull request #14087 from sneaxiy/add_use_cudnn_in_softmax_with_xe
...
Add numeric_stable_mode parameters to softmax_with_xe op
7 years ago
Xin Pan
ddd2225b56
add more debug info.
...
test=develop
7 years ago
sneaxiy
bbc818a5a1
test=develop
7 years ago
sneaxiy
366ebb93f7
test=develop
7 years ago
sneaxiy
203027ca86
test=develop
7 years ago
Tao Luo
d2a56f7909
Merge pull request #14159 from sfraczek/sfraczek/depthwise-conv-mkldnn-pass
...
add depthwise conv mkldnn pass
7 years ago
dzhwinter
cc02353d10
test=develop
7 years ago
dzhwinter
eb2f7ed21b
refine tests. test=develop
7 years ago
Jiabin Yang
9f65b616b2
Merge branch 'develop' into add_reorg_op
7 years ago
Xin Pan
08d22cf7e1
Merge pull request #14091 from panyx0718/fix2
...
add program check
7 years ago
Wu Yi
91b2851cdc
enable pyreader use pin memory ( #14066 )
...
* enable pyreader use pin memory
* add py reader pin memory test test=develop
7 years ago
Kaipeng Deng
0b29078201
Merge branch 'develop' into grid_sampler
7 years ago
whs
0c319e0b35
Add affine grid generator op ( #12238 )
...
* Add affine grid generator.
* Fix affine grid.
* Add unit test.
* Add CPU kernel and fix unit test.
* Fix CPU kernel.
* Refine code.
test=develop
* Fix python api.
test=develop
* Update python api.
test=develop
* Fix comment.
test=develop
* Rename affine_grid_generator to affine_grid and enhance unit test.
test=develop
* Fix unit test.
test=develop
7 years ago
sneaxiy
cf1944af2a
test=develop
7 years ago
tangwei12
d325e668b8
[1.1] Load vars on PSERVER ( #14037 )
...
* fix dim0 in _load_slice_up_vars
* fix dim0 in _load_slice_up_vars, fix innershape in delete_var_op
* Revert "fix lookuptable in reduce strategy"
This reverts commit 0e722c5
* add unit test for dist
* add unit test for dist, test=develop
* cancel revert, test=develop
7 years ago
dengkaipeng
e99da0b583
api change: create_variable_for_type_inference. test=develop
7 years ago
Tao Luo
2eaa291e91
Merge pull request #14197 from luotao1/remove_with_fast_bundle_test
...
remove unused WITH_FAST_BUNDLE_TEST option
7 years ago
Yan Chunwei
f76fee644c
fix graph pattern detector ( #14186 )
7 years ago
Tao Luo
fe8f178582
fix word2vec related inference unit-tests ( #14203 )
7 years ago
chengduo
e1742050ea
fix merge lod_tensor bug ( #14199 )
...
test=develop
7 years ago
dzhwinter
0a180584e6
clean cmake. test=develop
7 years ago
tensor-tang
85bcb286f5
refine vmul jitcode
...
test=develop
7 years ago
tensor-tang
a764e900a5
Merge remote-tracking branch 'ups/develop' into fea/jit/gen
...
test=develop
7 years ago
tensor-tang
a3377f7b0a
refine jitcode and add vmul jitcode implementation
7 years ago
dzhwinter
1ace55c8ee
merge develop branch
7 years ago
dzhwinter
9da7b33515
details
7 years ago
dengkaipeng
df4a3544aa
nearest neighbor interp add cuda kernel. test=develop
7 years ago
Xin Pan
913b569903
Merge pull request #14151 from panyx0718/fix
...
add a small test to verify tensor type
7 years ago
sneaxiy
c7305fbe2f
buffered_allocator: add unittest and fix bug
...
test=develop
7 years ago
dengkaipeng
da8ee1fbaa
fix API.spec not add defaults. test=develop
7 years ago
chengduo
2ccf77d1c1
Refine GetTensorFromVar ( #14160 )
...
* fix GetTensorFromVar
test=release/1.1
* refine GetTensorFromVar
test=develop
7 years ago
Tao Luo
5ac575cf62
remove unused WITH_FAST_BUNDLE_TEST option
...
test=develop
7 years ago
dengkaipeng
9755611938
add unittest for nearest_neighbor_interp_op
7 years ago
dengkaipeng
a24691a2a9
add nearest neighbor interpolation operator cpu kernel
7 years ago
sneaxiy
e3fc544cf7
merge develop
7 years ago
sneaxiy
2bef0ca346
add buffered_allocator
...
remove Free() method in UnmanagedAllocator
7 years ago
JiabinYang
8d3c3e048b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
Yan Xu
d10b8efcc0
Merge pull request #14152 from Yancey1989/add_fused_broadcast_unittest
...
add fused broadcast op unit test
7 years ago
Yu Yang
c21597cf07
fix(PE): use shared_ptr<BlockingQueue> for cross thread communication ( #14136 )
...
It seems that the blocking queue might be destroyed before the Run
method completes, possibly because the Run method throws an unhandled
exception. In any case, a resource accessed by multiple threads should
be held by shared_ptr, so BlockingQueue is now held by a shared_ptr.
test=develop
7 years ago
tensor-tang
f3badacd97
Merge remote-tracking branch 'ups/develop' into fea/jit/gen
7 years ago
tensor-tang
a53b1b0b1b
refine and init jitkernel vmul
7 years ago
tensor-tang
2139b9f677
add jit gencode
7 years ago
Yan Chunwei
06e508ab58
fix simple_on_word2vec random fail ( #14171 )
7 years ago
Tomasz Patejko
8899d42265
MKLDNN conv residual data: primitive reuse interface used. Reorder done when formats are different
...
test=develop
7 years ago
chengduo
b73708d20b
add int and int64 dtype for gather_op ( #14175 )
...
test=develop
7 years ago
Tomasz Patejko
f11934cbe6
MKLDNN conv residual data: residual data is reordered when formats are incorrect
7 years ago
Yan Chunwei
62a0fe0860
fix tensor array bug ( #14166 )
...
remove the optimized but buggy implementation
7 years ago
chengduo
ed087f8232
refine op_handle ( #14178 )
...
test=develop
7 years ago
Tao Luo
cdf2579d08
Merge pull request #14053 from jczaja/prv-seqpool-max
...
Max Sequence pool optimization
7 years ago
Kaipeng Deng
a3b26e8528
Merge branch 'develop' into grid_sampler
7 years ago
dengkaipeng
7333fe8e55
add math formula for exclusive/inclusive mode in avg pool. test=develop
7 years ago
Xin Pan
35915fc543
Merge pull request #14147 from luotao1/remove_with_inference
...
remove with_inference option
7 years ago
Yu Yang
90d9e5aee8
feat(platform): lazy initialization of devicecontext in pool ( #14067 )
...
* feat(platform): lazy initialization of devicecontext in pool
Use std::async(std::launch::deferred, []{...}) to lazily initialize DeviceContext in Pool
test=develop
* Add future includes
test=develop
7 years ago
dzhwinter
316765839d
add back jit simd instructions. stage.
7 years ago
Xin Pan
eb7ed1b720
Merge pull request #13897 from gmcather/develop
...
1.add position encoding 2.logloss in nn.py
7 years ago
Sylwester Fraczek
4e2aaf01bc
add depthwise conv mkldnn pass
...
Added a depthwise conv mkldnn pass which, for MKLDNN, changes the depthwise_conv operator to the conv operator, because for mkldnn this is the same API.
test=develop
7 years ago
barrierye
fc23cc9d30
update paddle/fluid/API.spec
...
test=develop
7 years ago
Yancey1989
6bfa6a0a33
add fused broadcast op unit test, test=develop
7 years ago
Xin Pan
e2db0b9bf3
add a small test to verify tensor type
...
test=develop
7 years ago
dzhwinter
bf2e4cb188
cleard. staged
7 years ago
Yan Chunwei
70ce6dcd67
fix api_impl ci error ( #14140 )
7 years ago
Xin Pan
eb37ed4c16
Merge pull request #14141 from JiabinYang/fix_inference_model_latest
...
Fix inference model not found on Mac CI
7 years ago
Xin Pan
a943134a97
fix a few more tests
...
test=develop
7 years ago
chengduo
2f639113ee
Fix sum_op's GetExpectedKernelType ( #14112 )
...
* fix sum_op's GetExpectedKernelType
test=develop
* fix ci fail
test=develop
7 years ago
Xin Pan
5839e3236b
add program check
...
test=develop
7 years ago
gmcather
ba22624d7e
position encoding && log loss
...
test=develop
7 years ago
Tao Luo
3a96d41d72
remove with_inference option
...
test=develop
7 years ago
sneaxiy
2494ca83ab
test=develop
7 years ago
dzhwinter
ebfe5a02b3
merge develop branch
7 years ago
JiabinYang
7c45e77c41
test=develop
7 years ago
barrierye
b5f78ce42d
update paddle/fluid/API.spec
...
test=develop
7 years ago
qingqing01
cb27a9219d
Merge pull request #13971 from sefira/FasterOpDoc
...
generate proposal labels doc
7 years ago
sneaxiy
5e5d2223a1
test=develop
7 years ago
tensor-tang
3c957af139
Merge pull request #14080 from tensor-tang/refine/jit/crf2
...
Refine/jit/crf decoding
7 years ago
Xin Pan
aa87a989ec
Merge pull request #14119 from Superjomn/fix/api-impl-tester
...
disable some tests
7 years ago
barrierye
5f3acac9b3
update paddle/fluid/API.spec
...
test=develop
7 years ago
Xin Pan
9ef19d4919
Merge pull request #14106 from luotao1/fix_cmake_warning
...
[1.1] fix cmake warning when ON_INFER=false
7 years ago
sneaxiy
f2eed667c0
test=develop
7 years ago
Xin Pan
16dfedb8b8
Merge pull request #14103 from jacquesqiao/cpu-for-1.1-merge-with-shape
...
[1.1] Cpu for 1.1 merge with shape
7 years ago
sneaxiy
cef8cc81db
merge develop
7 years ago
Jacek Czaja
458b16f42a
Rebase of seqpool-max optimization
...
test=develop
- Added rough profiling
- Profiled maxpool itself
- First draft of max seqpool optimization (is_test added)
- Added unit tests to seqpool
- Cosmetic fixes
- Fix to UT of Seq pool
Disabled grad checking for sequence max pool when is_test is set to True
- Cosmetic fix to comment
test=develop
- Fix to GPU build
test=develop
- yet another GPU fix for sequence max pool
- Fix to comment
test=develop
- Change to API of sequence_pool
test=develop
- Yet another API spec change
test=develop
7 years ago
superjomn
5f7fda0b07
disable some tests
...
test=develop
7 years ago
dengkaipeng
ff6329bd5f
fix some inappropriate expressions in api doc for grid_sampler. test=develop
7 years ago
Tao Luo
d3534d2b14
refine warning message
...
test=develop
7 years ago
Xin Pan
177720a737
Merge pull request #14116 from chengduoZH/release/1.1.0
...
[1.1]Fix op_role value
7 years ago
chengduozh
acec4cb8ca
[1.1]fix op_role value
...
test=release/1.1
7 years ago
barrierye
73671379cd
update paddle/fluid/API.spec
...
test=develop
7 years ago
dengkaipeng
8f1e398824
move param exclusive to the last in pool2d/pool3d for forward compatibility. test=develop
7 years ago
dengkaipeng
593e1b18d7
fix some bugs and add some doc for GridSampleOp
7 years ago
dengkaipeng
0bb0e0c10f
add Grid Sampler Operator for STN.
7 years ago
Qiao Longfei
3d4e050802
fix compile, optimize code test=develop
7 years ago
Yu Yang
c01696f8c2
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
Qiao Longfei
d26ff8cb2d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpu-for-1.1-merge-with-shape
7 years ago
JiabinYang
e0a89503f8
test=develop
7 years ago
JiabinYang
0e3038680b
test=develop
7 years ago
Tao Luo
79da263b11
Merge pull request #14032 from sfraczek/sfraczek/fix-test-multithreading-mkldnn
...
fix test resnet50 multi-threading on mkldnn
7 years ago
Wu Yi
26200f2e42
[1.1] [project] train imagenet using large batch size ( #13766 )
...
* fix nccl2 lars dist support
* put lars in momentum op
* add tests lars
* fix ci
* fix cpu kernel
* soft warning
* remove lars in test_recognize_digits.py
* move to another op
* add file
* update api.spec test=develop
* update test=develop
* fix api.spec test=develop
* wip
* wip, finish grad merge ops
* wip, finish graph build
* wip test running
* work on 1 gpu
* workable version
* update
* fix tests
* fuse broadcast op
* fix compile failed
* refine
* add batch merge test mnist
* fix CI test=develop
* fix build
* use independent bn params for batch merge test=develop
* update api.spec
* follow comments and for test
* wip
* refine tests test=develop
* follow comments test=develop
* remove startup bn modify test=develop
* follow comments test=develop
* fix merge test=develop
7 years ago
sneaxiy
2414f92f54
test=develop
7 years ago
barrierye
8c1e304307
merge nn.py
7 years ago
sneaxiy
45559d042c
move to pass
...
test=develop
7 years ago
dengkaipeng
c93e044ae0
add inclusive/exclusive mode in PoolOp avg pool type
7 years ago
JiabinYang
9a74c4489f
test=develop
7 years ago
barrierye
9dc28179a4
add similarity_focus op
7 years ago
Qiao Longfei
7cd2417fe2
Merge branch 'develop' into cpu-for-1.1-merge-with-shape
...
test=develop
7 years ago
Xin Pan
0a80f06ec4
Merge pull request #14086 from panyx0718/fix6
...
delete unused codes.
7 years ago
sneaxiy
a314a80cdb
merge develop
7 years ago
Tao Luo
4928ff32a9
fix cmake warning when ON_INFER=false
...
test=develop
7 years ago
dzhwinter
c8adc2c6fe
cudnn version. staged.
7 years ago
Qiao Longfei
06ffbc4f28
Merge branch 'shape_int_to_int64' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge-with-shape
...
test=develop
7 years ago
seiriosPlus
06de824ba8
fix shape in floats
7 years ago
Yan Chunwei
ee74be3a49
[1.1] Bugfix/tensorarray ( #14044 )
7 years ago
Qiyang Min
33b4920d2d
Merge pull request #14057 from velconia/continue_hash_op
...
[1.1] Add hash_op implementation
7 years ago
Qiyang Min
209f24a241
Merge pull request #14051 from velconia/accelerate_embedding_grad
...
[1.1] Accelerate sparse embedding grad op in CPU device
7 years ago
minqiyang
2fec8c5d9a
Polish code
...
test=develop
7 years ago
minqiyang
7f7af5d412
Add xxhash deps to inference demo and trainer demo
...
test=develop
7 years ago
Qiao Longfei
7cfc3c4415
Merge branch 'optimize-sum-seq-pooling-op' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
72aef6b168
sum selected rows check empty
7 years ago
minqiyang
fe18adfbaa
Add fluid inference support
...
test=develop
7 years ago
seiriosPlus
c34610f86d
Fix lookup table at CPU Reduce strategy, test=develop
7 years ago
Qiao Longfei
641369f92b
Merge branch 'dist-table-do-not-init-on-trainer' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
d69c820707
Merge branch 'add-flag-to-control-rpc-thread-num' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
1ed9ef6d70
Merge branch 'shape_int_to_int64' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
f1a3fb041b
Merge branch 'fix_lookuptable_in_reduce' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
da61a5b672
Merge branch 'optimizer-prefetch' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge
7 years ago
tangwei12
5ce3a32e06
Merge branch 'develop' into optimizer-prefetch
7 years ago
seiriosPlus
b6590b05fb
submit by tangwei12, test=develop
7 years ago
Wu Yi
9da9b1926b
[1.1] fix graph num hang ( #14072 )
...
* fix graph num hang test=develop
* re-enable tests test=develop
* re-enable graph num check test=develop
* fix multi device pass role check test=develop
7 years ago
tangwei12
cb1ccc710b
fix shape type in uniform_random_op.cu
7 years ago
Qiao Longfei
575f22711d
optimize code
...
test=develop
7 years ago
Qiao Longfei
96d5500934
optimize code
7 years ago
Qiao Longfei
748ee35c89
sum op handle empty input update selected_rows_functor.cu
7 years ago
Qiao Longfei
dd78b5df93
sum op handle empty input
7 years ago
Qiao Longfei
cbe128bbae
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei
f4df0cb1a2
update the type of shape to int64, format code
7 years ago
Qiao Longfei
fad42fe7cc
broadcast handle not inited parameter
7 years ago
Qiao Longfei
7dcb0dc8c6
update year
7 years ago
Qiao Longfei
68aeb4e7e9
add fake init test in test_dist_transpiler
7 years ago
Tao Luo
5ed3e6f3f6
Merge pull request #14042 from luotao1/remove_unused_code
...
[1.1] remove unused code in paddle_inference_api.h
7 years ago
Qiao Longfei
a13c788a04
fix a bug
7 years ago
Zeng Jinle
97d47a7d08
Merge pull request #13913 from sneaxiy/seq_reverse
...
Add sequence_reverse_op
7 years ago
JiabinYang
6e3615422f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
Jiabin Yang
a3efba176c
Merge pull request #14085 from jerrywgz/fix_generate_proposals_op
...
[1.1] fix erase end in generate proposals op
7 years ago
dzhwinter
7141debe38
add cudnn back. staged.
7 years ago
Guo Sheng
b9ae1c49f8
Merge pull request #13994 from guoshengCS/add-reshape-reuse-input
...
[1.1] Make reshape_op reuse input.
7 years ago
Zeng Jinle
60058180cb
Merge pull request #13945 from sneaxiy/unify_mixed_vector_api
...
Unify API of mixed_vector in GPU and CPU
7 years ago
Qiao Longfei
0328ffd3ab
add fake init op
7 years ago
Xin Pan
bcc9126e7b
Merge pull request #14056 from panyx0718/fix
...
Fix threadpool
7 years ago
Sylwester Fraczek
2098b42584
review fixes (Teamcity fails)
...
test=develop
7 years ago
Tao Luo
961baea16c
Merge pull request #14063 from wojtuss/wojtuss/remove-unused-EnableMKLDNN
...
remove unused method from naive executor
7 years ago
Hongyu Liu
379d933ae5
Merge pull request #14036 from phlrain/add_dropout_att_new
...
Add dropout att new 1.1 merge
7 years ago
tangwei12
d8b697357f
update height_sections to int64_t
7 years ago
minqiyang
a2820b9899
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_embedding_grad
7 years ago
Xin Pan
bba0c4a9f2
delete unused codes.
...
test=develop
7 years ago
jerrywgz
de2f965c9b
test=develop
7 years ago
guosheng
cc0e23973d
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-reshape-reuse-input
...
test=develop
7 years ago
tangwei12
318ba99124
revert changes in protobuf.cc and type_defs
7 years ago
tangwei12
aa6dc82f4b
revert changes in protobuf.cc and type_defs
7 years ago
dzhwinter
09409bad4d
staged. test speed=49ms in 1080.
7 years ago
tensor-tang
64d5b4385e
fix crf decode avx512
7 years ago
tensor-tang
21487d78bf
add crf decode jit kernel
7 years ago
sneaxiy
b1fd62f39e
test=develop
7 years ago
guosheng
3cfaeac288
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-reshape-reuse-input
...
test=develop
7 years ago
sneaxiy
1af3fe8c35
test=develop
7 years ago
Xin Pan
d5d09672c8
better fix
...
test=develop
7 years ago
Qiao Longfei
de539d72da
format
...
test=develop
7 years ago
sneaxiy
5be6f762d0
remove_lock_in_some_ops
...
test=develop
7 years ago
buxingyuan
6c1d74bb47
Merge branch 'develop' into FasterOpDoc
...
test=develop
7 years ago
Xin Pan
726fd438cd
avoid blocking everyone
...
please fix offline
7 years ago
JiabinYang
7bcba47e41
test=develop
7 years ago
barrierye
a7f94ec794
add similarity_focus op
7 years ago
Tao Luo
8ab953e37c
auto insert infer_graph_clean_pass as the default first one
...
test=develop
7 years ago
Tao Luo
d70c7fb9b3
Merge branch 'develop' into remove_unused_code
7 years ago
Tao Luo
ea2bdd192d
Merge branch 'develop' into remove_unused_code
7 years ago
minqiyang
0de6811ee0
Change reserve to resize
...
test=develop
7 years ago
tangwei12
b58957d9d7
Revert "fix lookuptable in reduce strategy"
...
This reverts commit 0e722c5
7 years ago
JiabinYang
9cad409f2a
test=develop
7 years ago
tangwei12
2761eafb92
shape type to int64_t, test=develop
7 years ago
tangwei12
d4a8967c1e
add const in &, test=develop
7 years ago
minqiyang
5660d6a3ba
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_embedding_grad
7 years ago
guosheng
1f92c30565
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-reshape-reuse-input
...
test=develop
7 years ago
tensor-tang
a05fce6544
Merge remote-tracking branch 'ups/develop' into fix/jit/avx
...
test=develop
7 years ago
JiabinYang
bd064c0f44
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
tangwei12
0e25e397bd
shape type to int64_t, test=develop
7 years ago
Qiyang Min
d0fdcb2f6d
Merge pull request #14048 from velconia/change_sequence_pool_to_cpu
...
Accelerate Sequence Pool Grad Op
7 years ago
tangwei12
d1e85e33d7
shape type to int64_t, test=develop
7 years ago
Yu Yang
8310ce6007
Fix cluster memory
...
test=develop
7 years ago
tensor-tang
d24d282a7a
fix avx error
...
test=develop
7 years ago
tensor-tang
9cb8738f54
Merge pull request #14018 from tensor-tang/refine/jit/gru
...
Refine/jit/gru
7 years ago
Xin Pan
70effddfc1
fix
...
test=develop
7 years ago
Xin Pan
64e7688ade
clean more APIs
...
test=develop
7 years ago
Xin Pan
c891bc22f5
clarify Reset
...
test=develop
7 years ago
Qiao Longfei
6253b152e6
Merge branch 'optimize-sum-seq-pooling-op' of https://github.com/jacquesqiao/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei
14f5a40898
fix unit test
7 years ago
minqiyang
447a680a2b
Add API.spec
...
test=develop
7 years ago
minqiyang
5de4619781
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_embedding_grad
7 years ago
minqiyang
0695c1fbe8
Add remind for code
...
test=develop
7 years ago
minqiyang
0c5c4c4a5b
Add blas header file
...
test=develop
7 years ago
guosheng
aac426444f
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-reshape-reuse-input
...
test=develop
7 years ago
buxingyuan
d0ccdf8fc1
follow comments
...
test=develop
7 years ago
minqiyang
e2a348cd10
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into change_sequence_pool_to_cpu
7 years ago
Qiao Longfei
f4e6fe0786
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Xin Pan
4f59690b4c
clean unused codes
...
test=develop
7 years ago
Xin Pan
784a19ecd0
fix some thread-safety issues and simplify threadpool
...
test=develop
7 years ago
Wojciech Uss
be58997443
remove unused method from naive executor
...
test=develop
7 years ago
minqiyang
40141f749b
Implement the unittest for hash op
...
test=develop
7 years ago
Sylwester Fraczek
741cb33bd9
test multithreading
7 years ago
Brian Liu
a53e8a8da6
Update MKLDNN integration framework to support Paddle multi-instances
...
Make all blob info saved in global device context to be thread based.
Meanwhile save thread id in thread local storage in ParallelDo
7 years ago
minqiyang
8a0f26f45f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into continue_hash_op
7 years ago
minqiyang
d4f9aa0852
Add hash op implementation
7 years ago
dzhwinter
468467f391
update real incnet tester
7 years ago
tangwei12
39b3bf24d0
shape type to int64_t, test=develop
7 years ago
tangwei12
755927d2b0
shape type to int64_t, test=develop
7 years ago
Qiao Longfei
7357d8412e
add flags to control the thread num for pserver
7 years ago
phlrain
a4ad286e6b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_dropout_att_new
7 years ago
phlrain
469bdb9e55
modify api.spec; test=develop
7 years ago
minqiyang
1a3b38a432
Polish code
...
test=develop
7 years ago
dzhwinter
b154e0b492
clean demo_ci
7 years ago
minqiyang
133bac2b10
Accelerate embedding op grad
...
test=develop
7 years ago
Zhaolong Xing
2256fae45d
Merge pull request #13938 from NHZlX/ocr_attention_support
...
ceil pool mode support for ocr attention model.
7 years ago
dzhwinter
abe8e207c4
clean demo_ci
7 years ago
dzhwinter
597d92179b
clean demo_ci
7 years ago
phlrain
201d4f2a85
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_dropout_att_new
7 years ago
phlrain
a6e6bc45d6
modify dropout att; test=develop
7 years ago
minqiyang
2468057da6
Move code to SumSeqPoolGradFunctor
...
test=develop
7 years ago
minqiyang
9725db0d40
Fix copy wrong pos bug
...
test=develop
7 years ago
minqiyang
9c68709036
Accelerate sequence_pool functor
7 years ago
minqiyang
14ebc424d6
Add gpu support for unittest
7 years ago
jerrywgz
e906c8e5e7
Merge pull request #14022 from jerrywgz/fix_rpn_target_assign_op
...
fix random fail in rpn target assign
7 years ago
minqiyang
bd5a82e193
Polish unit test code
7 years ago
minqiyang
047fa2f9aa
Add unit-test for sequence_pooling functor
7 years ago
qingqing01
c7379a7320
Fix top_k op ( #14034 )
...
1. Fix CUDA kernel when height is larger than 2048.
2. Support input with more than 2D.
3. Fix unit test when k is larger than 1.
4. Enhance unit testing.
test=develop
7 years ago
sneaxiy
016bf51e3f
test=develop
7 years ago
Tao Luo
f7bbcfa913
remove unused code in paddle_inference_api.h
...
test=develop
7 years ago
JiabinYang
c056328563
test=develop
7 years ago
nhzlx
11f189bacf
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_demo_ci_trt
...
test=develop
7 years ago
tangwei12
8b7f45a889
add longs in framework
7 years ago
JiabinYang
c13f1ef3c4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
tangwei12
f3729db6e0
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into Pdv
7 years ago
Xin Pan
8837669782
Merge pull request #13982 from panyx0718/fix
...
Clean up Reuse
7 years ago
dzhwinter
dbd0075b68
Merge branch 'windows/support' into lb
7 years ago
dzhwinter
c6dcffc61a
lb. add debug output
7 years ago
wanghaoshuang
78cf76a1ca
fix linux compile
7 years ago
tangwei12
770e2a1881
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into Pdv
7 years ago
chengduo
e943f4508b
add graph number check ( #14025 )
...
test=develop
7 years ago
sneaxiy
92a2817a2b
test=develop
7 years ago
JiabinYang
8e8e8e66ab
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
nhzlx
ae8f26072d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_demo_ci_trt
...
test=develop
7 years ago
phlrain
049c9c7d2a
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_dropout_att_new
7 years ago
phlrain
ffb24a73ec
add dropout attr; test=develop
7 years ago
sneaxiy
8f07f60915
test=develop
7 years ago
wanghaoshuang
5993155d67
Merge remote-tracking branch 'dzhwinter/windows/support' into windows/support
7 years ago
wanghaoshuang
f9e7cfb03c
save binary file
7 years ago
tensor-tang
032c3a07e3
Merge remote-tracking branch 'ups/develop' into refine/jit/gru
...
test=develop
7 years ago
tensor-tang
159be8cc63
optimize fusion gru kernel at size 8
7 years ago
dzhwinter
607080e888
windows static library
7 years ago
Tao Luo
23da8defc8
Merge pull request #14028 from luotao1/fix_resnet50_test
...
fix typo and warning in analyzer_resnet50_test
7 years ago
Yu Yang
71c846ef8a
Revert buggy changes
...
test=develop
7 years ago
JiabinYang
ff07dc315e
test=develop
7 years ago
chengduo
a7497653d0
Refine Split op ( #13967 )
...
* speedup split_op
test=develop
* speedup split_op
test=develop
* rename ConcatGrad to Split
* refine concat and split
test=develop
* fix compile error
7 years ago
Yu Yang
dbf9f6f408
Fix distribute compile
...
test=develop
7 years ago
guosheng
3099a8f3aa
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-reshape-reuse-input
...
test=develop
7 years ago
jerrywgz
e0708e62ba
refine code
7 years ago
jerrywgz
1c591c3909
Merge branch 'develop' into fix_rpn_target_assign_op
7 years ago
sneaxiy
a9d7a9d720
test=develop
7 years ago
Tao Luo
316bc9bfc9
fix typo and warning in analyzer_resnet50_test
...
test=develop
7 years ago
guosheng
6447b69aec
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-reshape-reuse-input
...
test=develop
7 years ago
jerrywgz
f06c6193d7
fix rpn target assign test=develop
7 years ago
Yu Yang
1d4d4e73ab
Remove place hash
...
test=develop
7 years ago
dongzhihong
563e7bca7f
"fix op. test=develop"
7 years ago
Xin Pan
4625f83f92
better handle var type inference
...
avoid the default one that usually overwrites manually set ones
test=develop
7 years ago
Xin Pan
8f2116d8fa
clean up after the changes have been stopped for so long.
...
test=develop
7 years ago
tensor-tang
83dc689877
Merge remote-tracking branch 'ups/develop' into refine/jit/gru
...
test=develop
7 years ago
tensor-tang
640e789d3d
add fusion gru jit kernel
7 years ago
JiabinYang
39d39775c3
test=develop
7 years ago
JiabinYang
70351de1b5
test=develop
7 years ago
Yu Yang
461f71a90b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
qingqing01
0e24138494
Merge pull request #13991 from qingqing01/refine_generate_proposals_op
...
Refine generate proposals op
7 years ago
gongweibao
58c027cc38
Add rpc profiler flags. ( #13989 )
...
Add rpc profiler flags
7 years ago
Xin Pan
d10e54c460
Merge pull request #14003 from chengduoZH/fix_fast_parallel_exe_bug
...
Fix test_parallel_executor_mnist.py randomly hang.
7 years ago
Tao Luo
42aa1d409d
Merge pull request #13485 from tpatejko/tpatejko/capi-resnet-conv-elementwise-fusion
...
MKLDNN conv+elementwise_add fusion for residual connections in Resnet
7 years ago
Yu Yang
9dcddf92f2
Polish best_fit_allocator
7 years ago
tensor-tang
664159ad42
Merge pull request #13998 from tensor-tang/fea/fusion_seqconv_add
...
Fea/fusion seqconv eltadd relu
7 years ago
Yu Yang
0c25da39a0
Refine auto_increment_allocator
7 years ago
Yu Yang
ab87a88200
Polish retry allocator
7 years ago
guosheng
6d3b030bb5
Refine the api of reshape to be compatible.
...
test=develop
7 years ago
chengduozh
82d2903b63
Fix fast ParallelExe bug
...
test=develop
7 years ago
Tomasz Patejko
aa35aaa1ab
MKLDNN conv + elementwise_add fusion: fixing formatting
...
test=develop
7 years ago
jerrywgz
765085d297
Merge pull request #13904 from jerrywgz/roialign
...
Add RoI align operator.
7 years ago
Dang Qingqing
56936b9e25
Refine doc for generate_proposals_op.
...
test=develop
7 years ago
Tomasz Patejko
ce2464fd98
MKLDNN conv + elementwise_add fusion: UT for missing bias added. UTs refactored. Some minor changes in the pass
7 years ago
Tomasz Patejko
4e72ab411e
MKLDNN conv + elementwise_add fusion: fix for crash when bias is not present
7 years ago
Tomasz Patejko
415b261555
MKLDNN conv + elementwise_add fusion: fusion options added
7 years ago
Tomasz Patejko
1676094697
MKLDNN conv + elementwise_add fusion: turn on residual connection pass when CAPI is used.
...
test=develop
7 years ago
Tomasz Patejko
0fe3079c46
MKLDNN conv + elementwise_add fusion: fix for order of parameters in elementwise_add in resnet50
...
test=develop
7 years ago
Tomasz Patejko
b73b868366
MKLDNN conv + elementwise_add fusion: bias in tests made persistent.
...
test=develop
7 years ago
Tomasz Patejko
a1fa203287
MKLDNN conv + elementwise_add fusion: name of the pass reused with name_scope_
7 years ago
Tomasz Patejko
2c43419db1
MKLDNN conv + elementwise_add fusion: comment explaining CorrectGraphEdges added
7 years ago
Tomasz Patejko
8fb29b2ca9
MKLDNN conv + elementwise_add fusion: new nodes marked as input or output
...
test=develop
7 years ago
Tomasz Patejko
cc1c8e37c1
MKLDNN conv + elementwise_add fusion: attributes in new conv op copied from old op
7 years ago
Tomasz Patejko
a27a8c5da8
MKLDNN conv + elementwise_add fusion: bias in test marked as persistable
7 years ago
Tomasz Patejko
af8c71317c
MKLDNN conv + elementwise_add fusion: CorrectGraphEdges refactored
7 years ago
Tomasz Patejko
3e033087f1
MKLDNN conv + elementwise_add fusion: LinkNodes function removed and
...
macro used.
test=develop
7 years ago
Tomasz Patejko
4be45af1cc
MKLDNN conv + elementwise_add fusion: skip connection attribute renamed. Comments about patterns added.
...
test=develop
7 years ago
Tomasz Patejko
9a335e0277
MKLDNN conv + elementwise_add fusion: changed a name of a formal argument in ElementwiseAdd pattern
7 years ago
Tomasz Patejko
fb7a50b230
MKLDNN conv + elementwise_add fusion: removed commented code. Internal functions marked as static.
...
test=develop
7 years ago
Michal Gallus
f688197182
MKLDNN conv + elementwise_add fusion: Fix output_data to point to the right tensor, also fix transpiler integration
7 years ago
Tomasz Patejko
efd76614fb
MKLDNN conv + elementwise_add fusion: implementation changed to conform with Paddle API
7 years ago
Tomasz Patejko
347bf90412
MKLDNN conv + elementwise_add fusion: bias is also handled
7 years ago
Tomasz Patejko
bf95ac36a7
MKLDNN conv + elementwise_add fusion: further reformatting
7 years ago
Tomasz Patejko
cbe122ae2e
MKLDNN conv + elementwise_add fusion: correcting formatting
7 years ago
Tomasz Patejko
2a251bbf27
MKLDNN conv + elementwise_add fusion: some refactoring: consts, function calls instead of constant values
7 years ago
Tomasz Patejko
b8e54ab5cc
MKLDNN conv + elementwise_add fusion: parameter name changed to ResidualData
7 years ago
Tomasz Patejko
27573ece03
MKLDNN conv + elementwise_add fusion: trailing spaces removed
7 years ago
Tomasz Patejko
7f5c8a95e8
MKLDNN conv + elementwise_add fusion: arguments are replaced for many parameters in operator
7 years ago