tensor-tang
2f0a379af7
Merge pull request #14307 from tensor-tang/fix/mac
...
fix mac
7 years ago
Zeng Jinle
b2af213009
Merge pull request #14292 from sneaxiy/delete_buggy_selected_rows_functor
...
Delete buggy selected_rows functor
7 years ago
tensor-tang
161ba9c9d1
fix mac
...
test=develop
7 years ago
Sylwester Fraczek
f395075efc
rebased and stuff broke
7 years ago
tensor-tang
e8642c3c1f
Merge pull request #14265 from tensor-tang/fea/jit/vadd
...
add vadd, vaddrelu jitcode
7 years ago
Sylwester Fraczek
a60957f386
add test_analyzer_mobilenet
7 years ago
dengkaipeng
8b47d90f5d
add 'actual_shape' attribute. test=develop
7 years ago
tensor-tang
382307b943
refine code
...
test=develop
7 years ago
tensor-tang
3319072858
fix jit kernel test on mac
...
test=develop
7 years ago
tensor-tang
44cb70c088
Merge remote-tracking branch 'ups/develop' into fix/mac
7 years ago
Yu Yang
c774bcbd2d
Merge device_context
...
test=develop
7 years ago
Yu Yang
057a682ee9
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
Yu Yang
c28beb8a3c
test(Pe): add dry run tests for pe ( #14254 )
...
Dry-run tests skip `Op.Run` and only perform job scheduling, which helps analyze deadlocks in PE.
test=develop
7 years ago
tensor-tang
c9730d33d9
fix run error on mac
...
test=develop
7 years ago
Xin Pan
80132933b7
Merge pull request #14281 from luotao1/face
...
refine analysis_resnet50_tester
7 years ago
Qiao Longfei
e0c8397426
Merge pull request #14257 from jacquesqiao/optimize-pserver-profiler-thread-pool
...
clean rpc server profiler
7 years ago
chengduo
ffc866159f
hot fix log ( #14293 )
...
test=develop
7 years ago
Zhaolong Xing
65b61db10a
Merge pull request #13927 from NHZlX/fix_googlenet_bug_with_rule
...
Fix googlenet bug with rule
7 years ago
tensor-tang
25e070ecc7
Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
barrierye
ef8218be22
update docs test=develop
7 years ago
Tao Luo
eea36739cc
refine test_helper.h
...
test=develop
7 years ago
Qiao Longfei
6449faec37
Merge pull request #14259 from jacquesqiao/optimize-thread-pool
...
Optimize thread pool
7 years ago
sneaxiy
9518bc8d0a
delete buggy selected_rows functor
...
test=develop
7 years ago
chengduo
a9b5d42dd4
Add fp16 backward support ( #14202 )
...
* add fp16 backward support
test=develop
* add sum_op fp16 test
* disable test_dist_save_load
test=develop
* add check_grad for sum
* add unit test for softmax_grad fp16
test=develop
* add scale_op unit test
* add mul_grad_op unit test for fp16
* add cross_entropy_grad and eman_grad unit test for fp16
test=develop
* fix cross_entropy unit test
* add pool2d fp16 unit test
* refine conv2d fp16 unit test
test=develop
* refine activation unit test
test=develop
* fix ci
test=develop
* follow zhihong's comment, copy from https://github.com/PaddlePaddle/Paddle/pull/12796
test=develop
7 years ago
Qiao Longfei
3b8dd9ebbd
optimize code test=develop
7 years ago
Tao Luo
2b791f1f63
unify analyzer_face_tester to analyzer_resnet50_tester
...
test=develop
7 years ago
Qiao Longfei
2921f8a79c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
7 years ago
Tao Luo
1ead9318d5
remove unused code in test_helper.h to pass ci
...
test=develop
7 years ago
Qiao Longfei
4062f00f2a
optimize thread pool code
...
test=develop
7 years ago
dzhwinter
2835e04409
merge develop branch. test=develop
7 years ago
dzhwinter
deb4af70ef
add test
7 years ago
qingqing01
db8c52da5e
Revert " Exhaustive search for cuDNN conv. ( #14043 )"
...
This reverts commit ce7d9b0799.
7 years ago
qingqing01
ce7d9b0799
Exhaustive search for cuDNN conv. ( #14043 )
...
* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Clean code
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.
7 years ago
tensor-tang
cb4083b9fa
fix compile error
...
test=develop
7 years ago
tensor-tang
dd343a4971
Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
Zeng Jinle
fcbe84cb50
Merge pull request #14270 from sneaxiy/fix_rmsprop_enforce_bug
...
Fix rmsprop_op enforce bug
7 years ago
Tao Luo
7a2887d212
add analyzer_face_tester
...
test=develop
7 years ago
Tao Luo
2ec65ae0db
download face_model in CMakeLists.txt
...
test=develop
7 years ago
Tao Luo
2f9a5a2e0a
add analyzer_face_tester
7 years ago
Xin Pan
cb2d33a851
resolve conflict
...
test=develop
7 years ago
nhzlx
5700fafd0f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_googlenet_bug_with_rule
...
test=develop
7 years ago
nhzlx
86b99ac953
fix comments and fix bug
7 years ago
tensor-tang
e6cfdf6c74
Merge pull request #14274 from tensor-tang/fix/jit
...
fix jit on mac
7 years ago
peizhilin
a37918c31f
fix python package issue
7 years ago
Xin Pan
25123a3b7e
add tests
...
test=develop
7 years ago
Xin Pan
8c11d3fed6
clean up
7 years ago
Xin Pan
0a89650507
fix more tests
...
test=develop
7 years ago
Xin Pan
a3b27e3237
fix
...
test=develop
7 years ago
Xin Pan
f25eb9a71d
fix some tests.
...
test=develop
7 years ago
Xin Pan
adf5615e54
clean kGraphOp
...
test=develop
7 years ago
Xin Pan
fb576cb5cb
allow to compare type
...
test=develop
7 years ago
Xin Pan
ead94bfc6c
fix destructor
...
test=develop
7 years ago
Xin Pan
2e14999942
clean1
...
test=develop
7 years ago
Xin Pan
34b401fc6c
clean up a global graph attr.
7 years ago
Zeng Jinle
8ac2242b6e
Merge pull request #14075 from sneaxiy/remove_some_locks_in_pe
...
Remove some locks in ParallelExecutor
7 years ago
tensor-tang
b81e1b655e
fix jit on mac
...
test=develop
7 years ago
sneaxiy
11f032a82e
fix rmsprop_op enforce bug
...
test=develop
7 years ago
tensor-tang
b68ececb73
add vaddrelu jitcode
...
test=develop
7 years ago
sneaxiy
8684553633
stream callback support in cuda 10
...
test=develop
7 years ago
peizhilin
1f12ba6192
gpu support, fix build issue:
...
1. Non-UTF-8 characters in OP comments may cause protobuf to fail in parse_from_string
2. Comment out some ops that are not supported on Windows
3. CUDA libs may not be linked correctly to the target on Windows
7 years ago
Wu Yi
8fc05e0373
fix cpu build test=develop ( #14260 )
7 years ago
Zhen Wang
4dbc01841d
Nlp dam ( #14248 )
...
* add dam test
* update fuse_statis
* use separated dam model.
* Revert "use separated dam model."
This reverts commit 13e775c86f909b164b7cc1d35a8a24b964ec622e.
* test=develop
* modify the cmake file about infer test, test=develop.
* remove one comment, test=develop.
7 years ago
tensor-tang
bb09e31020
add vadd jitcode
...
test=develop
7 years ago
sneaxiy
faac8a76ce
remove unnecessary codes
...
test=develop
7 years ago
Yu Yang
ff9e531bd9
style(platform): disable warning when cuda cc not matched ( #14029 )
...
Warning only at first when CUDA CC not matched.
test=develop
7 years ago
Qiao Longfei
59fbfbfbf7
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
...
test=develop
7 years ago
Qiao Longfei
fe4cd50286
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-thread-pool
...
test=develop
7 years ago
whs
d6a6a13039
Fix build error of affine grid op in mac os. ( #14237 )
...
* Fix build error of affine grid op in mac os.
test=develop
* Make function return reference.
test=develop
7 years ago
Qiao Longfei
ac415c0094
change lock_guard to unique_lock
7 years ago
Qiao Longfei
f4a76078d0
optimize thread pool
7 years ago
tensor-tang
d55481cfeb
Merge pull request #14241 from tensor-tang/refine/jit/vmulcode
...
Refine/jit/vmulcode
7 years ago
Qiao Longfei
9e4e9e9b6e
clean rpc server profiler
7 years ago
Zeng Jinle
8d930195d9
Merge pull request #14238 from sneaxiy/fix_read_lod_level_bug
...
Fix lod_level share bug in read_op
7 years ago
Wu Yi
306236c2c0
feature/DC asgd ( #12722 )
...
* wip
* add ref_by_trainer_id op
* ready to test
* fix ref inputs
* refine rpc_op_handle
* fix merge bug
7 years ago
dengkaipeng
fef2faa709
limit CUDA kernel parallel threads max number to 4096. test=develop
7 years ago
tensor-tang
c3cbf0b8ef
Merge pull request #14185 from tpatejko/tpatejko/mkldnn-conv-residual-data-reorder
...
Residual data reorder in MKLDNN convolution
7 years ago
peizhilin
71d7980f69
fix build issue 1
7 years ago
tensor-tang
6b49ee42c3
Merge pull request #14239 from tensor-tang/fix/avx
...
Fix avx illegal instructions
7 years ago
tensor-tang
ef9c10927d
Merge pull request #14233 from tensor-tang/fix/guide
...
throw error when mismatch cpu avx version
7 years ago
dengkaipeng
34bfae243a
Add Interpolate operation. test=develop
7 years ago
sneaxiy
46d4829dd1
fix lod_level share bug in read_op
...
test=develop
7 years ago
tensor-tang
8465e7876f
auto grow the size and fix test
...
test=develop
7 years ago
tensor-tang
9255119fd9
refine jit vmul with all size
7 years ago
tensor-tang
a9c1824131
refine jit vmul code supporting multiple of 2
7 years ago
tensor-tang
61fdc38e51
Merge pull request #14206 from tensor-tang/fea/jit/gen
...
Fea/jit/gen
7 years ago
tensor-tang
e09a7c793d
remove the warning log since do not have avx2, avx512 flags
...
test=develop
7 years ago
tensor-tang
f524c1b62b
throw error when mismatch cpu version
...
test=develop
7 years ago
peizhilin
9d67c1fb69
cpu build support
7 years ago
barrierye
5e7bb6a9bd
update docs test=develop
7 years ago
Xin Pan
c2d70fca30
fix to only check block 0
...
test=develop
7 years ago
dzhwinter
baf0ff4510
Merge pull request #14020 from dzhwinter/fix/sign_op
...
"fix sign op"
7 years ago
dzhwinter
60f70b174d
test=develop
7 years ago
sneaxiy
7ff320f8cc
merge develop
7 years ago
Xin Pan
d0459ac8d0
Merge pull request #14223 from panyx0718/fix5
...
add more debug info.
7 years ago
dongzhihong
00cf66964f
Merge remote-tracking branch 'origin/develop' into fix/sign_op
...
test=develop
7 years ago
Kaipeng Deng
daed473d4a
Merge pull request #14089 from heavengate/pool_exclude
...
add inclusive/exclusive mode in avg pool
7 years ago
Kaipeng Deng
64f3e3ed8f
Merge pull request #14069 from heavengate/grid_sampler
...
Grid sampler operator for spatial transformer network.
7 years ago
Xin Pan
aaeedd0ff3
make it warn
...
test=develop
7 years ago
Zeng Jinle
b316437a50
Merge pull request #14087 from sneaxiy/add_use_cudnn_in_softmax_with_xe
...
Add numeric_stable_mode parameters to softmax_with_xe op
7 years ago
Xin Pan
ddd2225b56
add more debug info.
...
test=develop
7 years ago
sneaxiy
bbc818a5a1
test=develop
7 years ago
sneaxiy
366ebb93f7
test=develop
7 years ago
sneaxiy
203027ca86
test=develop
7 years ago
Tao Luo
d2a56f7909
Merge pull request #14159 from sfraczek/sfraczek/depthwise-conv-mkldnn-pass
...
add depthwise conv mkldnn pass
7 years ago
dzhwinter
cc02353d10
test=develop
7 years ago
dzhwinter
eb2f7ed21b
refine tests. test=develop
7 years ago
Jiabin Yang
9f65b616b2
Merge branch 'develop' into add_reorg_op
7 years ago
Xin Pan
08d22cf7e1
Merge pull request #14091 from panyx0718/fix2
...
add program check
7 years ago
Wu Yi
91b2851cdc
enable pyreader use pin memory ( #14066 )
...
* enable pyreader use pin memory
* add py reader pin memory test test=develop
7 years ago
Kaipeng Deng
0b29078201
Merge branch 'develop' into grid_sampler
7 years ago
whs
0c319e0b35
Add affine grid generator op ( #12238 )
...
* Add affine grid generator.
* Fix affine grid.
* Add unitest.
* Add CPU kernel and fix unitest.
* Fix CPU kernel.
* Refine code.
test=develop
* Fix python api.
test=develop
* Update python api.
test=develop
* Fix comment.
test=develop
* Rename affine_grid_generator to affine_grid and enhance unit test.
test=develop
* Fix unitest.
test=develop
7 years ago
sneaxiy
cf1944af2a
test=develop
7 years ago
tangwei12
d325e668b8
[1.1] Load vars on PSERVER ( #14037 )
...
* fix dim0 in _load_slice_up_vars
* fix dim0 in _load_slice_up_vars, fix innershape in delete_var_op
* Revert "fix lookuptable in reduce strategy"
This reverts commit 0e722c5
* add unit test for dist
* add unit test for dist, test=develop
* cancel revert, test=develop
7 years ago
dengkaipeng
e99da0b583
api change: create_variable_for_type_inference. test=develop
7 years ago
Yan Chunwei
f76fee644c
fix graph pattern detector ( #14186 )
7 years ago
Tao Luo
fe8f178582
fix word2vec related inference unit-tests ( #14203 )
7 years ago
chengduo
e1742050ea
fix merge lod_tensor bug ( #14199 )
...
test=develop
7 years ago
dzhwinter
0a180584e6
clean cmake. test=develop
7 years ago
tensor-tang
85bcb286f5
refine vmul jitcode
...
test=develop
7 years ago
tensor-tang
a764e900a5
Merge remote-tracking branch 'ups/develop' into fea/jit/gen
...
test=develop
7 years ago
tensor-tang
a3377f7b0a
refine jitcode and add vmul jitcode implementation
7 years ago
dzhwinter
1ace55c8ee
merge develop branch
7 years ago
dzhwinter
9da7b33515
details
7 years ago
dengkaipeng
df4a3544aa
nearest neighbor interp add cuda kernel. test=develop
7 years ago
Xin Pan
913b569903
Merge pull request #14151 from panyx0718/fix
...
add a small test to verify tensor type
7 years ago
sneaxiy
c7305fbe2f
buffered_allocator: add unittest and fix bug
...
test=develop
7 years ago
dengkaipeng
da8ee1fbaa
fix API.spec not add defaults. test=develop
7 years ago
chengduo
2ccf77d1c1
Refine GetTensorFromVar ( #14160 )
...
* fix GetTensorFromVar
test=release/1.1
* refine GetTensorFromVar
test=develop
7 years ago
dengkaipeng
9755611938
add unittest for nearest_neighbor_interp_op
7 years ago
dengkaipeng
a24691a2a9
add nearest neighbor interpolation operator cpu kernel
7 years ago
sneaxiy
e3fc544cf7
merge develop
7 years ago
sneaxiy
2bef0ca346
add buffered_allocator
...
remove Free() method in UnmanagedAllocator
7 years ago
JiabinYang
8d3c3e048b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
Yan Xu
d10b8efcc0
Merge pull request #14152 from Yancey1989/add_fused_broadcast_unittest
...
add fused broadcast op unit test
7 years ago
Yu Yang
c21597cf07
fix(PE): use shared_ptr<BlockingQueue> for cross thread communication ( #14136 )
...
It seems that the blocking queue might be destroyed before the Run
method completes, possibly because Run throws an unhandled exception.
In any case, a resource accessed from multiple threads should be held
through a shared_ptr, so BlockingQueue is now held by a shared_ptr.
test=develop
7 years ago
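The ownership pattern described in the commit above can be sketched roughly as follows. This is a minimal illustration, not Paddle's code: `SimpleBlockingQueue` and `StartProducer` are hypothetical names introduced here. The worker's lambda captures the `shared_ptr` by value, so the queue cannot be destroyed while the worker is still using it.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Minimal illustrative queue (hypothetical; not Paddle's BlockingQueue).
template <typename T>
class SimpleBlockingQueue {
 public:
  void Push(T value) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      items_.push_back(std::move(value));
    }
    cv_.notify_one();
  }

  // Blocks until an item is available, then returns it.
  T Pop() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return !items_.empty(); });
    T value = std::move(items_.front());
    items_.pop_front();
    return value;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<T> items_;
};

// Hypothetical helper: starts a worker that pushes 0..count-1.
// The lambda holds its own shared_ptr copy, so the queue stays alive
// until the worker is done, even if the caller releases its reference.
inline std::thread StartProducer(
    std::shared_ptr<SimpleBlockingQueue<int>> queue, int count) {
  return std::thread([queue, count] {
    for (int i = 0; i < count; ++i) queue->Push(i);
  });
}
```

A caller would create the queue with `std::make_shared`, pass it to `StartProducer`, pop the results, and join the returned thread. With a raw pointer or reference instead, an early exit from the owning scope could leave the worker with a dangling queue.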
tensor-tang
f3badacd97
Merge remote-tracking branch 'ups/develop' into fea/jit/gen
7 years ago
tensor-tang
a53b1b0b1b
refine and init jitkernel vmul
7 years ago
tensor-tang
2139b9f677
add jit gencode
7 years ago
Yan Chunwei
06e508ab58
fix simple_on_word2vec random fail ( #14171 )
7 years ago
Tomasz Patejko
8899d42265
MKLDNN conv residual data: primitive reuse interface used. Reorder done when formats are different
...
test=develop
7 years ago
chengduo
b73708d20b
add int and int64 dtype for gather_op ( #14175 )
...
test=develop
7 years ago
Tomasz Patejko
f11934cbe6
MKLDNN conv residual data: residual data is reorder when formats are incorrect
7 years ago
Yan Chunwei
62a0fe0860
fix tensor array bug ( #14166 )
...
remove the optimized but buggy implementation
7 years ago
chengduo
ed087f8232
refine op_handle ( #14178 )
...
test=develop
7 years ago
Tao Luo
cdf2579d08
Merge pull request #14053 from jczaja/prv-seqpool-max
...
Max Sequence pool optimization
7 years ago
Kaipeng Deng
a3b26e8528
Merge branch 'develop' into grid_sampler
7 years ago
dengkaipeng
7333fe8e55
add math formula for exclusive/inclusive mode in avg pool. test=develop
7 years ago
Xin Pan
35915fc543
Merge pull request #14147 from luotao1/remove_with_inference
...
remove with_inference option
7 years ago
Yu Yang
90d9e5aee8
feat(platform): lazy initialization of devicecontext in pool ( #14067 )
...
* feat(platform): lazy initialization of devicecontext in pool
Use std::async(std::launch::deferred, []{...}) to lazily initialize DeviceContext in Pool
test=develop
* Add future includes
test=develop
7 years ago
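The lazy-initialization idea in the commit above can be sketched with a deferred `std::async` task per device. This is a single-threaded sketch under stated assumptions: `FakeContext` and `LazyPool` are hypothetical stand-ins, not Paddle's `DeviceContext` or its pool, and the real implementation may differ.

```cpp
#include <cassert>
#include <future>
#include <map>
#include <memory>
#include <utility>

// Hypothetical stand-in for a device context (not Paddle's class).
struct FakeContext {
  explicit FakeContext(int id) : device_id(id) { ++constructed; }
  int device_id;
  static int constructed;  // counts contexts that were actually built
};
int FakeContext::constructed = 0;

// Hypothetical pool: each entry is a deferred task, so a context is built
// on the first Get() for its device rather than when the pool is created.
class LazyPool {
 public:
  explicit LazyPool(int num_devices) {
    for (int id = 0; id < num_devices; ++id) {
      // std::launch::deferred: the lambda runs on the first get()/wait(),
      // in the thread that makes that call.
      contexts_.emplace(
          id, std::async(std::launch::deferred, [id] {
                return std::make_unique<FakeContext>(id);
              }).share());
    }
  }

  FakeContext* Get(int id) {
    // shared_future::get() can be called repeatedly; the task runs once.
    return contexts_.at(id).get().get();
  }

 private:
  std::map<int, std::shared_future<std::unique_ptr<FakeContext>>> contexts_;
};
```

Constructing `LazyPool pool(2)` builds no contexts; the first `pool.Get(1)` builds only the context for device 1, and later calls return the same pointer. Devices that are never requested are never initialized.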
dzhwinter
316765839d
add back jit simd instructions. stage.
7 years ago
Xin Pan
eb7ed1b720
Merge pull request #13897 from gmcather/develop
...
1.add position encoding 2.logloss in nn.py
7 years ago
Sylwester Fraczek
4e2aaf01bc
add depthwise conv mkldnn pass
...
Added a depthwise conv MKLDNN pass that rewrites the depthwise_conv operator to the conv operator when MKLDNN is used, because MKLDNN handles both through the same API.
test=develop
7 years ago
barrierye
fc23cc9d30
update paddle/fluid/API.spec
...
test=develop
7 years ago
Yancey1989
6bfa6a0a33
add fused broadcast op unit test, test=develop
7 years ago
Xin Pan
e2db0b9bf3
add a small test to verify tensor type
...
test=develop
7 years ago
dzhwinter
bf2e4cb188
cleard. staged
7 years ago
Yan Chunwei
70ce6dcd67
fix api_impl ci error ( #14140 )
7 years ago
Xin Pan
eb37ed4c16
Merge pull request #14141 from JiabinYang/fix_inference_model_latest
...
Fix inference model not found on Mac CI
7 years ago
Xin Pan
a943134a97
fix a few more tests
...
test=develop
7 years ago
chengduo
2f639113ee
Fix sum_op's GetExpectedKernelType ( #14112 )
...
* fix sum_op's GetExpectedKernelType
test=develop
* fix ci fail
test=develop
7 years ago
Xin Pan
5839e3236b
add program check
...
test=develop
7 years ago
gmcather
ba22624d7e
position encoding && log loss
...
test=develop
7 years ago
Tao Luo
3a96d41d72
remove with_inference option
...
test=develop
7 years ago
sneaxiy
2494ca83ab
test=develop
7 years ago
dzhwinter
ebfe5a02b3
merge develop branch
7 years ago
JiabinYang
7c45e77c41
test=develop
7 years ago
barrierye
b5f78ce42d
update paddle/fluid/API.spec
...
test=develop
7 years ago
qingqing01
cb27a9219d
Merge pull request #13971 from sefira/FasterOpDoc
...
generate proposal labels doc
7 years ago
sneaxiy
5e5d2223a1
test=develop
7 years ago
tensor-tang
3c957af139
Merge pull request #14080 from tensor-tang/refine/jit/crf2
...
Refine/jit/crf decoding
7 years ago
Xin Pan
aa87a989ec
Merge pull request #14119 from Superjomn/fix/api-impl-tester
...
disable some tests
7 years ago
barrierye
5f3acac9b3
update paddle/fluid/API.spec
...
test=develop
7 years ago
Xin Pan
9ef19d4919
Merge pull request #14106 from luotao1/fix_cmake_warning
...
[1.1] fix cmake warning when ON_INFER=false
7 years ago
sneaxiy
f2eed667c0
test=develop
7 years ago
Xin Pan
16dfedb8b8
Merge pull request #14103 from jacquesqiao/cpu-for-1.1-merge-with-shape
...
[1.1] Cpu for 1.1 merge with shape
7 years ago
sneaxiy
cef8cc81db
merge develop
7 years ago
Jacek Czaja
458b16f42a
Rebase of seqpool-max optimization
...
test=develop
- Added rough profiling
- Profiled maxpool itself
- First draft of max seqpool optimization (is_test added)
- Added unit tests to seqpool
- Cosmetic fixes
- Fix to UT of Seq pool
Disabled grad checking for sequence max pool when is_test is set to True
-Cosmetic fix to comment
test=develop
- Fix to GPU build
test=develop
- yet another GPU fix for sequence max pool
- Fix to comment
test=develop
- Change to API of sequence_pool
test=develop
- Yet another API spec change
test=develop
7 years ago
superjomn
5f7fda0b07
disable some tests
...
test=develop
7 years ago
dengkaipeng
ff6329bd5f
fix some inappropriate expressions in api doc for grid_sampler. test=develop
7 years ago
Tao Luo
d3534d2b14
refine warning message
...
test=develop
7 years ago
Xin Pan
177720a737
Merge pull request #14116 from chengduoZH/release/1.1.0
...
[1.1]Fix op_role value
7 years ago
chengduozh
acec4cb8ca
[1.1]fix op_role value
...
test=release/1.1
7 years ago
barrierye
73671379cd
update paddle/fluid/API.spec
...
test=develop
7 years ago
dengkaipeng
8f1e398824
move param exclusive to the last in pool2d/pool3d for forward compatibility:. test=develop
7 years ago
dengkaipeng
593e1b18d7
fix some bugs and add some doc for GridSampleOp
7 years ago
dengkaipeng
0bb0e0c10f
add Grid Sampler Operator for STN.
7 years ago
Qiao Longfei
3d4e050802
fix compile, optimize code test=develop
7 years ago
Yu Yang
c01696f8c2
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
Qiao Longfei
d26ff8cb2d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpu-for-1.1-merge-with-shape
7 years ago
JiabinYang
e0a89503f8
test=develop
7 years ago
JiabinYang
0e3038680b
test=develop
7 years ago
Tao Luo
79da263b11
Merge pull request #14032 from sfraczek/sfraczek/fix-test-multithreading-mkldnn
...
fix test resnet50 multi-threading on mkldnn
7 years ago
Wu Yi
26200f2e42
[1.1] [project] train imagenet using large batch size ( #13766 )
...
* fix nccl2 lars dist support
* put lars in momentum op
* add tests lars
* fix ci
* fix cpu kernel
* soft warning
* remove lars in test_recognize_digits.py
* move to another op
* add file
* update api.spec test=develop
* update test=develop
* fix api.spec test=develop
* wip
* wip, finish grad merge ops
* wip, finish graph build
* wip test running
* work on 1 gpu
* workable version
* update
* fix tests
* fuse broadcast op
* fix compile failed
* refine
* add batch merge test mnist
* fix CI test=develop
* fix build
* use independent bn params for batch merge test=develop
* update api.spec
* follow comments and for test
* wip
* refine tests test=develop
* follow comments test=develop
* remove startup bn modify test=develop
* follow comments test=develop
* fix merge test=develop
7 years ago
sneaxiy
2414f92f54
test=develop
7 years ago
barrierye
8c1e304307
merge nn.py
7 years ago
sneaxiy
45559d042c
move to pass
...
test=develop
7 years ago
dengkaipeng
c93e044ae0
add inclusive/exclusive mode in PoolOp avg pool type
7 years ago
JiabinYang
9a74c4489f
test=develop
7 years ago
barrierye
9dc28179a4
add similarity_focus op
7 years ago
Qiao Longfei
7cd2417fe2
Merge branch 'develop' into cpu-for-1.1-merge-with-shape
...
test=develop
7 years ago
Xin Pan
0a80f06ec4
Merge pull request #14086 from panyx0718/fix6
...
delete unused codes.
7 years ago
sneaxiy
a314a80cdb
merge develop
7 years ago
Tao Luo
4928ff32a9
fix cmake warning when ON_INFER=false
...
test=develop
7 years ago
dzhwinter
c8adc2c6fe
cudnn version. staged.
7 years ago
Qiao Longfei
06ffbc4f28
Merge branch 'shape_int_to_int64' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge-with-shape
...
test=develop
7 years ago
seiriosPlus
06de824ba8
fix shape in floats
7 years ago
Yan Chunwei
ee74be3a49
[1.1] Bugfix/tensorarray ( #14044 )
7 years ago
Qiyang Min
33b4920d2d
Merge pull request #14057 from velconia/continue_hash_op
...
[1.1] Add hash_op implementation
7 years ago
Qiyang Min
209f24a241
Merge pull request #14051 from velconia/accelerate_embedding_grad
...
[1.1] Accelerate sparse embedding grad op in CPU device
7 years ago
minqiyang
7f7af5d412
Add xxhash deps to inference demo and trainer demo
...
test=develop
7 years ago
Qiao Longfei
7cfc3c4415
Merge branch 'optimize-sum-seq-pooling-op' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
72aef6b168
sum selected rows check empty
7 years ago
minqiyang
fe18adfbaa
Add fluid inference support
...
test=develop
7 years ago
seiriosPlus
c34610f86d
Fix lookup table at CPU Reduce strategy, test=develop
7 years ago
Qiao Longfei
641369f92b
Merge branch 'dist-table-do-not-init-on-trainer' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
d69c820707
Merge branch 'add-flag-to-control-rpc-thread-num' of ssh://github.com/jacquesqiao/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
1ed9ef6d70
Merge branch 'shape_int_to_int64' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
f1a3fb041b
Merge branch 'fix_lookuptable_in_reduce' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge
7 years ago
Qiao Longfei
da61a5b672
Merge branch 'optimizer-prefetch' of https://github.com/seiriosPlus/Paddle into cpu-for-1.1-merge
7 years ago
tangwei12
5ce3a32e06
Merge branch 'develop' into optimizer-prefetch
7 years ago
seiriosPlus
b6590b05fb
submit by tangwei12, test=develop
7 years ago
Wu Yi
9da9b1926b
[1.1] fix graph num hang ( #14072 )
...
* fix graph num hang test=develop
* re-enable tests test=develop
* re-enable graph num check test=develop
* fix multi device pass role check test=develop
7 years ago
tangwei12
cb1ccc710b
fix shape type in uniform_random_op.cu
7 years ago
Qiao Longfei
575f22711d
optimize code
...
test=develop
7 years ago
Qiao Longfei
96d5500934
optimize code
7 years ago
Qiao Longfei
748ee35c89
sum op handle empty input update selected_rows_functor.cu
7 years ago
Qiao Longfei
dd78b5df93
sum op handle empty input
7 years ago
Qiao Longfei
cbe128bbae
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei
f4df0cb1a2
update the type of shape to int64, format code
7 years ago
Qiao Longfei
fad42fe7cc
broadcast handle not inited parameter
7 years ago
Qiao Longfei
7dcb0dc8c6
update year
7 years ago
Qiao Longfei
68aeb4e7e9
add fake init test in test_dist_transpiler
7 years ago
Tao Luo
5ed3e6f3f6
Merge pull request #14042 from luotao1/remove_unused_code
...
[1.1] remove unused code in paddle_inference_api.h
7 years ago
Qiao Longfei
a13c788a04
fix a bug
7 years ago
Zeng Jinle
97d47a7d08
Merge pull request #13913 from sneaxiy/seq_reverse
...
Add sequence_reverse_op
7 years ago
JiabinYang
6e3615422f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
Jiabin Yang
a3efba176c
Merge pull request #14085 from jerrywgz/fix_generate_proposals_op
...
[1.1] fix erase end in generate proposals op
7 years ago
dzhwinter
7141debe38
add cudnn back. staged.
7 years ago
Guo Sheng
b9ae1c49f8
Merge pull request #13994 from guoshengCS/add-reshape-reuse-input
...
[1.1] Make reshape_op reuse input.
7 years ago
Zeng Jinle
60058180cb
Merge pull request #13945 from sneaxiy/unify_mixed_vector_api
...
Unify API of mixed_vector in GPU and CPU
7 years ago
Qiao Longfei
0328ffd3ab
add fake init op
7 years ago
Xin Pan
bcc9126e7b
Merge pull request #14056 from panyx0718/fix
...
Fix threadpool
7 years ago
Sylwester Fraczek
2098b42584
review fixes (Teamcity fails)
...
test=develop
7 years ago
Tao Luo
961baea16c
Merge pull request #14063 from wojtuss/wojtuss/remove-unused-EnableMKLDNN
...
remove unused method from naive executor
7 years ago
Hongyu Liu
379d933ae5
Merge pull request #14036 from phlrain/add_dropout_att_new
...
Add dropout att new 1.1 merge
7 years ago
tangwei12
d8b697357f
update height_sections to int64_t
7 years ago
minqiyang
a2820b9899
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_embedding_grad
7 years ago
Xin Pan
bba0c4a9f2
delete unused codes.
...
test=develop
7 years ago
jerrywgz
de2f965c9b
test=develop
7 years ago
guosheng
cc0e23973d
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-reshape-reuse-input
...
test=develop
7 years ago
tangwei12
318ba99124
revert changes in protobuf.cc and type_defs
7 years ago
tangwei12
aa6dc82f4b
revert changes in protobuf.cc and type_defs
7 years ago
dzhwinter
09409bad4d
staged. test speed=49ms in 1080.
7 years ago
tensor-tang
64d5b4385e
fix crf decode avx512
7 years ago
tensor-tang
21487d78bf
add crf decode jit kernel
7 years ago
sneaxiy
b1fd62f39e
test=develop
7 years ago
guosheng
3cfaeac288
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-reshape-reuse-input
...
test=develop
7 years ago
sneaxiy
1af3fe8c35
test=develop
7 years ago
Xin Pan
d5d09672c8
better fix
...
test=develop
7 years ago
Qiao Longfei
de539d72da
format
...
test=develop
7 years ago
sneaxiy
5be6f762d0
remove_lock_in_some_ops
...
test=develop
7 years ago
buxingyuan
6c1d74bb47
Merge branch 'develop' into FasterOpDoc
...
test=develop
7 years ago
Xin Pan
726fd438cd
avoid blocking everyone
...
please fix offline
7 years ago
JiabinYang
7bcba47e41
test=develop
7 years ago
barrierye
a7f94ec794
add similarity_focus op
7 years ago
Tao Luo
8ab953e37c
auto insert infer_graph_clean_pass as the default first one
...
test=develop
7 years ago
Tao Luo
d70c7fb9b3
Merge branch 'develop' into remove_unused_code
7 years ago
Tao Luo
ea2bdd192d
Merge branch 'develop' into remove_unused_code
7 years ago
minqiyang
0de6811ee0
Change reserve to resize
...
test=develop
7 years ago
tangwei12
b58957d9d7
Revert "fix lookuptable in reduce strategy"
...
This reverts commit 0e722c5
7 years ago
JiabinYang
9cad409f2a
test=develop
7 years ago
tangwei12
2761eafb92
shape type to int64_t, test=develop
7 years ago
tangwei12
d4a8967c1e
add const in &, test=develop
7 years ago
minqiyang
5660d6a3ba
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_embedding_grad
7 years ago
guosheng
1f92c30565
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-reshape-reuse-input
...
test=develop
7 years ago
tensor-tang
a05fce6544
Merge remote-tracking branch 'ups/develop' into fix/jit/avx
...
test=develop
7 years ago
JiabinYang
bd064c0f44
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
tangwei12
0e25e397bd
shape type to int64_t, test=develop
7 years ago
Qiyang Min
d0fdcb2f6d
Merge pull request #14048 from velconia/change_sequence_pool_to_cpu
...
Accelerate Sequence Pool Grad Op
7 years ago
tangwei12
d1e85e33d7
shape type to int64_t, test=develop
7 years ago
Yu Yang
8310ce6007
Fix cluster memory
...
test=develop
7 years ago
tensor-tang
d24d282a7a
fix avx error
...
test=develop
7 years ago
tensor-tang
9cb8738f54
Merge pull request #14018 from tensor-tang/refine/jit/gru
...
Refine/jit/gru
7 years ago
Xin Pan
70effddfc1
fix
...
test=develop
7 years ago
Xin Pan
64e7688ade
clean more APIs
...
test=develop
7 years ago
Xin Pan
c891bc22f5
clarify Reset
...
test=develop
7 years ago
Qiao Longfei
6253b152e6
Merge branch 'optimize-sum-seq-pooling-op' of https://github.com/jacquesqiao/Paddle into optimize-sum-seq-pooling-op
7 years ago
Qiao Longfei
14f5a40898
fix unit test
7 years ago
minqiyang
447a680a2b
Add API.spec
...
test=develop
7 years ago
minqiyang
5de4619781
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_embedding_grad
7 years ago
minqiyang
0695c1fbe8
Add remind for code
...
test=develop
7 years ago
minqiyang
0c5c4c4a5b
Add blas header file
...
test=develop
7 years ago
guosheng
aac426444f
Merge branch 'develop' of https://github.com/PaddlePaddle/paddle into add-reshape-reuse-input
...
test=develop
7 years ago
buxingyuan
d0ccdf8fc1
follow comments
...
test=develop
7 years ago
minqiyang
e2a348cd10
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into change_sequence_pool_to_cpu
7 years ago
Qiao Longfei
f4e6fe0786
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-sum-seq-pooling-op
7 years ago
Xin Pan
4f59690b4c
clean unused codes
...
test=develop
7 years ago
Xin Pan
784a19ecd0
fix some thread-safety issues and simplify threadpool
...
test=develop
7 years ago
Wojciech Uss
be58997443
remove unused method from naive executor
...
test=develop
7 years ago
minqiyang
40141f749b
Implement the unittest for hash op
...
test=develop
7 years ago
Sylwester Fraczek
741cb33bd9
test multithreading
7 years ago
Brian Liu
a53e8a8da6
Update MKLDNN integration framework to support Paddle multi-instances
...
Make all blob info saved in global device context to be thread based.
Meanwhile save thread id in thread local storage in ParallelDo
7 years ago
minqiyang
8a0f26f45f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into continue_hash_op
7 years ago
minqiyang
d4f9aa0852
Add hash op implementation
7 years ago
dzhwinter
468467f391
update real incnet tester
7 years ago
tangwei12
39b3bf24d0
shape type to int64_t, test=develop
7 years ago
tangwei12
755927d2b0
shape type to int64_t, test=develop
7 years ago
Qiao Longfei
7357d8412e
add flags to control the thread num for pserver
7 years ago
phlrain
a4ad286e6b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_dropout_att_new
7 years ago
phlrain
469bdb9e55
modify api.spec; test=develop
7 years ago
minqiyang
1a3b38a432
Polish code
...
test=develop
7 years ago
dzhwinter
b154e0b492
clean demo_ci
7 years ago
minqiyang
133bac2b10
Accelerate embedding op grad
...
test=develop
7 years ago
Zhaolong Xing
2256fae45d
Merge pull request #13938 from NHZlX/ocr_attention_support
...
ceil pool mode support for ocr attention model.
7 years ago
dzhwinter
abe8e207c4
clean demo_ci
7 years ago
dzhwinter
597d92179b
clean demo_ci
7 years ago
phlrain
201d4f2a85
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_dropout_att_new
7 years ago
phlrain
a6e6bc45d6
modify dropout att; test=develop
7 years ago
minqiyang
2468057da6
Move code to SumSeqPoolGradFunctor
...
test=develop
7 years ago
minqiyang
9725db0d40
Fix copy wrong pos bug
...
test=develop
7 years ago
minqiyang
9c68709036
Accelerate sequence_pool functor
7 years ago
minqiyang
14ebc424d6
Add gpu support for unittest
7 years ago
jerrywgz
e906c8e5e7
Merge pull request #14022 from jerrywgz/fix_rpn_target_assign_op
...
fix random fail in rpn target assign
7 years ago
minqiyang
bd5a82e193
Polish unit test code
7 years ago
minqiyang
047fa2f9aa
Add unit-test for sequence_pooling functor
7 years ago
qingqing01
c7379a7320
Fix top_k op ( #14034 )
...
1. Fix CUDA kernel when height is larger than 2048.
2. Support input with more than 2 dimensions.
3. Fix unit test when k is larger than 1.
4. Enhance unit testing.
test=develop
7 years ago
sneaxiy
016bf51e3f
test=develop
7 years ago
Tao Luo
f7bbcfa913
remove unused code in paddle_inference_api.h
...
test=develop
7 years ago
JiabinYang
c056328563
test=develop
7 years ago
nhzlx
11f189bacf
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_demo_ci_trt
...
test=develop
7 years ago
tangwei12
8b7f45a889
add longs in framework
7 years ago
JiabinYang
c13f1ef3c4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
tangwei12
f3729db6e0
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into Pdv
7 years ago
Xin Pan
8837669782
Merge pull request #13982 from panyx0718/fix
...
Clean up Reuse
7 years ago
dzhwinter
dbd0075b68
Merge branch 'windows/support' into lb
7 years ago
dzhwinter
c6dcffc61a
lb. add debug output
7 years ago
wanghaoshuang
78cf76a1ca
fix linux compile
7 years ago
tangwei12
770e2a1881
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into Pdv
7 years ago
chengduo
e943f4508b
add graph number check ( #14025 )
...
test=develop
7 years ago
sneaxiy
92a2817a2b
test=develop
7 years ago
JiabinYang
8e8e8e66ab
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
nhzlx
ae8f26072d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_demo_ci_trt
...
test=develop
7 years ago
phlrain
049c9c7d2a
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_dropout_att_new
7 years ago
phlrain
ffb24a73ec
add dropout attr; test=develop
7 years ago
sneaxiy
8f07f60915
test=develop
7 years ago
wanghaoshuang
5993155d67
Merge remote-tracking branch 'dzhwinter/windows/support' into windows/support
7 years ago
wanghaoshuang
f9e7cfb03c
save binary file
7 years ago
tensor-tang
032c3a07e3
Merge remote-tracking branch 'ups/develop' into refine/jit/gru
...
test=develop
7 years ago
tensor-tang
159be8cc63
optimize fusion gru kernel at size 8
7 years ago
dzhwinter
607080e888
windows static library
7 years ago