minqiyang
ac80273686
Change definitions to PADDLE_WITH_JEMALLOC
7 years ago
minqiyang
c8965dc1ab
Polish code
...
test=develop
7 years ago
tensor-tang
5c68dee798
fix debug compile of analysis pass fail
...
test=develop
7 years ago
乔龙飞 Qiao Longfei
d243e555eb
Merge pull request #15080 from jacquesqiao/optimize-assign
...
Optimize assign
7 years ago
jerrywgz
11f1baa406
refine code, test=develop
7 years ago
Zhaolong Xing
b7b68f2a8c
Merge pull request #15461 from NHZlX/fix_trt_stream_bug
...
fix trt stream bug.
7 years ago
luotao1
353b5f06a7
refine analyzer_bert_test to pass the ci
...
test=develop
7 years ago
tangwei12
8b50ad80ff
checkpoint at distributed training ( #14854 )
...
checkpoint for distributed training.
7 years ago
luotao1
cc618934c0
Merge branch 'bert_test' of https://github.com/fc500110/Paddle into fc500110-bert_test
7 years ago
jerrywgz
57e5f61ec8
add gpu kernel, test=develop
7 years ago
jerrywgz
cc53453057
add comment and refine code, test=develop
7 years ago
nhzlx
e6218c1d7b
change the input to a smaller value
...
test=develop
7 years ago
qingqing01
07dc5a1506
Add generate_mask_labels_op to support Mask-RCNN and refine some code. ( #15371 )
...
* Add generate_mask_labels_op to support Mask-RCNN.
* Refine sigmoid_cross_entropy to support normalize mode.
* Fix generator_proposals_label.
* Use DeviceTemporaryAllocator in roi_pool and roi_align.
* Remove shape check in data_feeder.
7 years ago
gongweibao
9f5108a673
Add cicheck_brpc ( #15468 )
7 years ago
Qiao Longfei
6833ec06dc
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-assign
...
test=develop
7 years ago
Yiqun Liu
eaad3e4c3d
Add check of input in sequence_expand op. ( #15466 )
...
* Add check of input in sequence_expand op.
test=develop
* Correct the unittest of sequence_expand op.
test=develop
7 years ago
sneaxiy
ef788603d4
merge develop
...
test=develop
7 years ago
gongweibao
f4dec5cdee
Check collective server's data. ( #15449 )
7 years ago
Zhen Wang
58727e8e6d
Merge pull request #15455 from wzzju/graph_quantization
...
Graph quantization pass. TODO(Add public API comments.)
7 years ago
jerrywgz
f44b1507f0
revised API spec, test=develop
7 years ago
jerrywgz
b449f8ff2f
revised API spec, test=develop
7 years ago
fuchang01
4a33a44f45
analyzer bert tester
7 years ago
Tao Luo
fef3fd6d62
Merge pull request #15452 from luotao1/legacy_option
...
remove legacy compiler option
7 years ago
Paddle CI
289aba750a
Polish code
...
test=develop
7 years ago
jerrywgz
c12a969bd4
refine comment and unittest, test=develop
7 years ago
chengduo
5a8bd82c0c
Remove workspace_handle ( #15376 )
...
* remove workspace_handle
test=develop
* set constant for loss
test=develop
7 years ago
jerrywgz
1c558ad388
add gpu kernel for box clip, test=develop
7 years ago
JiabinYang
266e0b63cd
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/imperative
...
simple rnn
7 years ago
JiabinYang
e686818aed
simple RNN
7 years ago
tianshuo78520a
3308e3c4cb
update python_list;test=develop
7 years ago
WangZhen
4e91d8d291
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into graph_quantization
...
test=develop
7 years ago
nhzlx
5b92ddabe2
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_trt_stream_bug
...
test=develop
7 years ago
nhzlx
2f4aee361a
fix comments
...
test=develop
7 years ago
WangZhen
c6f99a1645
Update API.spec. test=develop
7 years ago
WangZhen
b913463e83
Update according to the reviewers' suggestion. test=develop
7 years ago
sneaxiy
d8568acd19
turn on remove_unnecessary_lock
...
test=develop
7 years ago
tianshuo78520a
4dde620eb3
test=develop
7 years ago
Qiao Longfei
a71f7ed787
update API.spec test=develop
7 years ago
tianshuo78520a
4b164c71b8
update linux grammar
7 years ago
nhzlx
ec213730bc
fix trt stream bug.
...
BUG: After continuing to input different data, the output cannot be aligned
test=develop
7 years ago
wopeizl
a8aa79130b
Merge pull request #15453 from wopeizl/fix15313
...
fix pr 15313
7 years ago
gongweibao
7f8b40f68d
Fix brpc compilation error. ( #15451 )
7 years ago
WangZhen
3ce6172052
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into graph_quantization
7 years ago
WangZhen
787c5e714c
Update the API.spec. test=develop.
7 years ago
WangZhen
59e5cc51d6
Add quantization transform pass and UT.
7 years ago
flame
d60751fb71
add python inference api ( #15248 )
...
add python inference api
7 years ago
jerrywgz
0d4b60ab8b
add lod for slice op, test=develop
7 years ago
tianshuo78520a
58e63124eb
update function
7 years ago
Tao Luo
54c0da080d
fix compiler error in paddle_build.sh
...
test=develop
7 years ago
tianshuo78520a
e297c39b52
update linux function
7 years ago
dzhwinter
8f3b252392
squash commits. test=develop
7 years ago
peizhilin
e6a3a3a31a
fix pr 15313
...
test=develop
7 years ago
Qiao Longfei
9449844c2a
update ctr_reader in API.spec
...
test=develop
7 years ago
Tao Luo
cf29ea1592
remove legacy ANDROID option
7 years ago
Tao Luo
e000d17a0c
remove legacy WITH_SWIG_PY option
7 years ago
jerrywgz
66bb5dd760
refine infer shape, test=develop
7 years ago
Tao Luo
561ae9d507
remove legacy WITH_C_API option
7 years ago
tensor-tang
266e625d2e
Merge pull request #15399 from tensor-tang/refine/seqpool/fc
...
fix cpu jitkernel test and refine benchmark test
7 years ago
Qiao Longfei
45578c1b48
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader
7 years ago
jerrywgz
7d0c5fafa9
add API spec, test=develop
7 years ago
Yan Chunwei
885c4e57ab
fea/infer memory optim2 ( #14953 )
7 years ago
jerrywgz
0d91507859
fix share lod, test=develop
7 years ago
minqiyang
a21f4e38c3
Polish code
...
test=develop
7 years ago
minqiyang
8ce198b2e1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_resnet
...
test=develop
7 years ago
minqiyang
31a1cd8ce5
Align the first batch of gpu resnet
7 years ago
Tao Luo
6597ccb01f
Merge pull request #15413 from luotao1/legacy_code
...
remove legacy code
7 years ago
JiabinYang
3972dd88fb
test=develop, refine code
7 years ago
Dun
9f8f0fc2d3
Memory optimization of depthwise conv op and group norm op ( #15313 )
...
* mem opt
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine with cub test=develop
* fix mkldnn test && remove comments && test=develop
* polish code && test=develop
* add only_forward test && test=develop
7 years ago
jerrywgz
5246285e34
test=develop
7 years ago
jerrywgz
b10d84bc5a
fix bug when run on GPU, test=develop
7 years ago
whs
530869f829
Share LoD from Input(Rois). ( #15420 )
...
test=develop
7 years ago
gongweibao
7ab4af2716
Fix brpc compilation. ( #15417 )
7 years ago
Xin Pan
9a9c690e71
Merge pull request #15343 from panyx0718/imperative3
...
add a GAN model in imperative mode
7 years ago
Dun Liang
e5004f3c1c
fix ci && test=develop
7 years ago
WangZhen
e2ff300b02
add UT for quantization.
7 years ago
Tao Luo
c102f427d2
make 'paddle version' valid
...
test=develop
7 years ago
WangZhen
451896fce4
init quantization.
7 years ago
tensor-tang
316e44b1b7
fix unused warnings
...
test=develop
7 years ago
JiabinYang
b17da93cc8
test=develop, fast_install shell for linux and mac
7 years ago
Wu Yi
7e651a38dd
fix mac cmake version 3.13 build ( #15386 )
...
* fix mac cmake version 3.13 test=develop
* fix again test=develop
7 years ago
jerrywgz
b62a17bbae
add nms api
7 years ago
tensor-tang
579d758254
fix jitkernel tests and refine benchmark
...
test=develop
7 years ago
jerrywgz
f660553d77
enhance nms for mask rcnn, test=develop
7 years ago
shippingwang
14f2a1060d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into shufflechannel
7 years ago
jerrywgz
88ee56d0b2
enhance nms for mask rcnn
7 years ago
Tao Luo
193edfa746
remove legacy build_android and build_ios
...
test=develop
7 years ago
zhaozhehao
e2ba9668b4
Tree conv op ( #15217 )
...
* refactor tree2col operator with new memory mechanism test=develop
* test=develop
* test=develop
* Modified API according to panyx0718 test=develop
* fix API change according to heavengate test=develop
* Modify API comment test=develop
7 years ago
Tao Luo
3ede8b67e6
update CMakeLists.txt
7 years ago
Tao Luo
5316c64776
remove legacy cluster_train code
7 years ago
Tao Luo
eec133ca6a
remove legacy testing code
7 years ago
Tao Luo
81da854903
remove legacy C++ code
7 years ago
Tao Luo
8f522c15ed
Merge pull request #15408 from luotao1/mm_dnn
...
test_analyzer_mm_dnn runs in serial
7 years ago
Tao Luo
001827c270
test_analyzer_mm_dnn runs in serial
...
test=develop
7 years ago
Tao Luo
140fc1e92c
Merge pull request #15392 from luotao1/pyramid_dnn
...
add pyramid_dnn c++ inference test
7 years ago
Wei Liu
413543eb8f
print peak memory usage
7 years ago
Yan Chunwei
c9e5aa19c1
get tensor API add more comments ( #15345 )
7 years ago
Yiqun Liu
f413b6892b
Revert the modification of while_op in #14764 . ( #15372 )
...
* Revert the modification of while_op in #14764 .
test=develop
* Remove the dependency of GRPC_DEPS.
test=develop
7 years ago
jerrywgz
ab9d6a4f39
add comments, test=develop
7 years ago
jerrywgz
10dd3b37ad
add axis for box coder op
7 years ago
Yan Chunwei
e84234b551
make clone thread safe ( #15363 )
7 years ago
乔龙飞 Qiao Longfei
adba4384ec
Merge pull request #15161 from jacquesqiao/gru-add-mode
...
gru add origin mode
7 years ago
gongweibao
7cd4dd7ce4
Hide varhandle members. ( #15382 )
7 years ago
Tao Luo
668563088e
add pyramid_dnn c++ inference test
...
test=develop
7 years ago
Zhaolong Xing
236201c222
Merge pull request #15350 from NHZlX/fix_bug_for_precditor
...
fix analysis config bug
7 years ago
nhzlx
8817841c73
fix unit test bug
...
test=develop
7 years ago
Yan Chunwei
e07900d317
cache tensor ptr in ZeroCopyTensor ( #15352 )
7 years ago
Yan Chunwei
b7916440ff
hot fix the Native clone ( #15344 )
7 years ago
jerrywgz
5fb2856584
test_develop
7 years ago
minqiyang
dbd4d058af
Add static implementation and fix fc layer
7 years ago
Xin Pan
3ecf6bb338
Merge pull request #15028 from yihuaxu/develop_641313ea7_elementwise_mul_mkldnn_bug_fix
...
Fix the exception when tensor format is x
7 years ago
jerrywgz
e2044c09e9
test=develop
7 years ago
jerrywgz
af448373c7
test=develop
7 years ago
Xin Pan
e395f2c6a3
polish codes
...
test=develop
7 years ago
nhzlx
b95f2ff8fe
fix win build bug
...
test=develop
7 years ago
nhzlx
b938324381
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_int8_ultimate_version
...
test=develop
7 years ago
nhzlx
312fe0ece1
add trt int8 calibration support
...
fix comments
test=develop
7 years ago
wopeizl
994e73f685
Merge pull request #15351 from wopeizl/fixbuildissue
...
disable the parallel mode for adam op on windows test=develop
7 years ago
minqiyang
315b133e67
Add single GPU support to imperative
7 years ago
jerrywgz
481d8bce2f
add box clip op
7 years ago
Yiqun Liu
568cc2ffa8
Optimize while_op for test ( #14764 )
...
* Simplify the compare op for CPU.
* Use asynchronous tensor copy in reshape_op's kernel.
* Optimize while_op for test, avoiding creating variables every time.
test=develop
* Enable the cache of kernel type and kernel function.
test=develop
* Enable profiling with gperftools.
* Remove flags for testing, and fix the linking error.
test=develop
* Delete the codes of ChooseKernel.
test=develop
* Fix bug when preparing ExecutorPrepareContext for while_op.
* Fix missing depending on grpc libraries.
* Remove the redundant print.
test=develop
* Follow comments.
* Remove the codes related to prepare the ExecutorPrepareContext for while_op.
test=develop
7 years ago
tensor-tang
3759c1db8c
Merge pull request #14805 from mozga-intel/mozga-intel/element_wise_operator_ngraph
...
Enable element_wise_add operator for a ngraph engine
7 years ago
tensor-tang
904a39239d
Merge pull request #15254 from mozga-intel/mozga-intel/softmax_operator_ngraph
...
Enable softmax operator for a ngraph engine
7 years ago
nhzlx
e61a1b9514
merge develop test=develop
7 years ago
peizhilin
cd562f8fb7
disable the parallel mode for adam op on windows test=develop
7 years ago
nhzlx
b2ba3471fd
fix analysis config bug.
7 years ago
Xin Pan
01dc15ce32
Merge pull request #15329 from panyx0718/imperative2
...
add imperative mode design
7 years ago
Xin Pan
16cb3ebd68
Merge pull request #15268 from xiaolil1/pool-int8
...
Enhance key generation for Pool INT8 test
7 years ago
Xin Pan
9a4314f025
imperative gan
...
test=develop
7 years ago
tensor-tang
a7fc3d42a0
Merge pull request #15304 from tensor-tang/fuse/second_order_mul_sub
...
Fuse/second order mul sub and fuse repeated fc relu
7 years ago
bingyanghuang
a152a5c731
Disable conv3d mkldnn in dam ( #15335 )
...
* disable conv3d mkldnn in dam
* Add some comments test=develop
7 years ago
Xin Pan
73093656b8
Merge pull request #15331 from panyx0718/api
...
expose CompiledProgram
7 years ago
Xin Pan
2db6e3ed2a
Merge pull request #15292 from panyx0718/imperative
...
polish imperative codes
7 years ago
乔龙飞 Qiao Longfei
b14d4cdd75
Merge pull request #14890 from jacquesqiao/multithread-sparse-adam
...
adam support multithread
7 years ago
Xin Pan
6b762f6519
add doc
...
test=develop
7 years ago
Xin Pan
d7b159355c
add more doc
...
test=develop
7 years ago
mozga-intel
cba729404d
Enable softmax operator for a ngraph engine
...
test=develop
7 years ago
Qiao Longfei
cd31b90a46
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader
...
test=develop
7 years ago
wopeizl
0fbb76f66b
Merge pull request #15204 from wopeizl/debug/support
...
add the python callstack for debug support test=develop
7 years ago
Xin Pan
24bb6a6aec
expose CompiledProgram
...
test=develop
7 years ago
Xin Pan
783dbe9abb
more doc
...
test=develop
7 years ago
Xin Pan
f997109bb1
polish
7 years ago
Xin Pan
c1fdacd4b4
add imperative mode design
...
test=develop
7 years ago
Qiao Longfei
8c516a24e5
remove min_row_size_to_use_multithread in adam interface test=develop
7 years ago
Tao Luo
9497d43921
Merge pull request #15307 from luotao1/trace_deps
...
fix imperative compile when WITH_PYTHON=OFF
7 years ago
tensor-tang
1a95cd227d
disable seqpool test on mac or without mkl
...
test=develop
7 years ago
Qiao Longfei
9b4fe283e1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
7 years ago
tensor-tang
0b6447a482
Merge pull request #15310 from luotao1/ZeroCopy_omp
...
fix multi-threads in ZeroCopyProfile
7 years ago
peizhilin
5e450833bd
test=develop
7 years ago
Qiyang Min
3f687765e6
Merge pull request #15281 from velconia/fix_expand_op_compile_time
...
Fix expand op compile time bug
7 years ago
peizhilin
eea75a1d93
fix issue when type is invalid
...
test=develop
7 years ago
peizhilin
9adb158e5b
Merge remote-tracking branch 'upstream/develop' into debug/support
7 years ago
minqiyang
29ceb93126
Use malloc and free in JeMalloc
...
test=develop
7 years ago
Tao Luo
2411ed4286
fix multi-threads in ZeroCopyProfile
...
test=develop
7 years ago
minqiyang
c4cf5967db
Change backward op infershape
...
test=develop
7 years ago
tensor-tang
84b0ecdcce
Merge remote-tracking branch 'ups/develop' into fuse/second_order_mul_sub
...
test=develop
7 years ago
tensor-tang
7035f051a8
adjust acc on mac
7 years ago
luotao1
346561a37f
fix imperative compile when WITH_PYTHON=OFF
...
test=develop
7 years ago
Xin Pan
b29eca3b71
code style
...
test=develop
7 years ago
Xin Pan
7bc67c31e5
polish more
...
test=develop
7 years ago
Xin Pan
0c04cac484
polish
...
test=develop
7 years ago
Xin Pan
47ef2df01a
polish
...
test=develop
7 years ago
Xin Pan
0d5819eb4f
polish imperative codes
...
test=develop
7 years ago
Tao Luo
e33427da0d
Merge pull request #15280 from luotao1/random_test
...
fix CompareDeterministic error when test_all_data
7 years ago
chengduo
46d01d798e
Revert "Revert "Remove workspace_handle in conv_cudnn ( #15186 )"" ( #15290 )
...
test=develop
This reverts commit 358e657f68.
7 years ago
Qiao Longfei
4d15515c40
fix gru_gpu_kernel test=develop
7 years ago
tensor-tang
93e75c5ae5
refine jitcode of vsub and vsquare
...
test=develop
7 years ago
tensor-tang
d618e48309
fix fuse square mat order and refine test
...
test=develop
7 years ago
tensor-tang
a5d2a6d1ad
add fuse pass of squared mat sub fusion
7 years ago
tensor-tang
531f4a1578
Merge branch 'fuse/repeatedfcrelu' into fuse/second_order_mul_sub
7 years ago
tensor-tang
84e023eae5
adjust the acc since the refer result is too large
...
test=develop
7 years ago
Qiao Longfei
4feae25378
fix build problem test=develop
7 years ago
tensor-tang
38de1ff472
add fusion squared mat sub op
7 years ago
Qiao Longfei
e641ffe77b
change interface and api spec for dynamic_gru test=develop
7 years ago
tensor-tang
09c5786e22
add square jitkernel
7 years ago
Qiao Longfei
4c7be265d3
update avx gru grad kernel test=develop
7 years ago
tensor-tang
4461a458a5
adjust diff since abs is too large
...
test=develop
7 years ago
Qiao Longfei
9b16e54064
update gru_grad_op
...
test=develop
7 years ago
tensor-tang
ca6fdc6e33
refine and fix test
...
test=develop
7 years ago
tensor-tang
a89296ac1f
add repeated fc relu pass
7 years ago
Qiao Longfei
e477d789a1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gru-add-mode
7 years ago
tensor-tang
f347d6e4a1
add repeated fc relu unit test
...
test=develop
7 years ago
tensor-tang
99010e6eae
init repeated fc relu op
7 years ago
tensor-tang
266a5d2f52
implement matmul refer and mkl kernel
7 years ago
tensor-tang
c5623c87a3
init jit matmul kernel
7 years ago
Xin Pan
a92860a3b1
Merge pull request #15298 from panyx0718/fix
...
Fix python2 bug cause CE to fail
7 years ago
Dun Liang
0c5c561bd1
test=develop
7 years ago
Xin Pan
50b4ac08b0
fix
...
test=develop
7 years ago
Xin Pan
a1bfb35dd6
try fix py2
...
test=develop
7 years ago
tensor-tang
781cd0cf51
add multi threads test of seqpool test ( #15293 )
7 years ago
Dun Liang
a900015c03
add async copy and pinned place
7 years ago
Xin Pan
3f65869ba6
try fix
...
test=develop
7 years ago
Xin Pan
3e79e6544f
try fix
...
test=develop
7 years ago
Tao Luo
1d434a9de6
Merge pull request #15291 from wojtuss/wojtuss/fix-performance-drop
...
Fix performance drop when with MKL-DNN
7 years ago
Cheerego
e387667ffe
Merge pull request #15288 from tianshuo78520a/tools
...
test=develop
7 years ago
minqiyang
c86b3dd6e6
Polish code
...
test=develop
7 years ago
minqiyang
ddfb9f1123
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_shared_ptr
...
test=develop
7 years ago
Xin Pan
d1220f23bb
Merge pull request #15229 from panyx0718/imperative
...
support python codes in the imperative model
7 years ago
colourful-tree
576c740d5d
Merge pull request #14964 from colourful-tree/data_norm
...
add data norm op
7 years ago
tianshuo78520a
1c6d0342c0
test=develop
7 years ago
colourful-tree
d5a8909131
Merge pull request #14950 from colourful-tree/develop
...
add teacher student sigmoid loss
7 years ago
minqiyang
bc3e0d6e01
Fix expand op compile time bug
...
test=develop
7 years ago
Tao Luo
cbd1c7c01f
fix CompareDeterministic error when test_all_data
...
test=develop
7 years ago
Xin Pan
6a18c0f9ff
Merge pull request #15278 from chengduoZH/revert_remove_workspace_handle_in_conv2d_cudnn
...
Revert "Remove workspace_handle in conv_cudnn (#15186 )"
7 years ago
Zhaolong Xing
98e85f3735
add_transpose_flatten_concat_fuse ( #15121 )
7 years ago
chengduozh
c4eced9881
fix thread safe bug
...
test=develop
7 years ago
chengduozh
358e657f68
Revert "Remove workspace_handle in conv_cudnn ( #15186 )"
...
test=develop
This reverts commit 064512aa47.
7 years ago
wopeizl
5d9edb4124
Merge pull request #15156 from wopeizl/windows/fixgpuissue
...
fix gpu build issue on windows test=develop
7 years ago
Wojciech Uss
cb2ba58458
Fix performance drop when with MKL-DNN
...
test=develop
7 years ago
tensor-tang
fc9fbab6a0
Merge pull request #15271 from tensor-tang/fix/typo
...
fix typo and refine
7 years ago
minqiyang
d0b640dca1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_shared_ptr
...
test=develop
7 years ago
chengduo
064512aa47
Remove workspace_handle in conv_cudnn ( #15186 )
...
* remove workspace_handle in conv2d_cudnn
test=develop
* remove workspace_handle
test=develop
* fix bug
test=develop
* make test_conv2d_op SERIAL
test=develop
* save memory in conv_cudnn
test=develop
* enhance thread safety
test=develop
* enhance temporary allocator
test=develop
* Add excess fraction
test=develop
* follow comments
test=develop
* fix bug and code refine
test=develop
* fix memory size check
test=develop
* rename reuse_tmp_allocation_excess_fraction
test=develop
7 years ago
minqiyang
687171d22b
Move from shared_ptr to raw pointer
...
test=develop
7 years ago
tensor-tang
c3a9f3c4b2
fix typo and refine
...
test=develop
7 years ago
tensor-tang
146e942c65
Merge pull request #15250 from tensor-tang/refine/seqpool/feed
...
Refine/seqpool/feed with infer zerocopytensor
7 years ago
xiaolil1
8f17c714de
Conv int8 residual ( #15145 )
...
* Enable basic MKL-DNN INT8 Conv OP
test=develop
* Modify test case
test=develop
* Clean unittest code
test=develop
* Fix test
test=develop
* Modify test
test=develop
* Enable MKL-DNN INT8 Conv with Relu Fusion OP
test=develop
* Enable INT8 Conv with residual fusion OP
test=develop
* Modify code.
test=develop
* Modify basic INT8 Conv
test=develop
* Modify Conv.
test=develop
* fix style
test=develop
* Fix style
test=develop
* Fix test
test=develop
* Modify code.
test=develop
* Fix test
test=develop
7 years ago
Tao Luo
93d5c1ed5a
Merge pull request #15261 from wopeizl/fixdemos
...
remove the dismatch enclosure to avoid warning message test=develop
7 years ago
xiaoli.liu@intel.com
f34e779f4d
Enhance key generation for INT8 test.
...
test=develop
7 years ago
peizhilin
439691f5bd
adjust the shlwapi on windows
...
test=develop
7 years ago
peizhilin
92da467c99
Merge remote-tracking branch 'upstream/develop' into windows/fixgpuissue
7 years ago
Wu Yi
fd85418329
[Feature] support mix precision training for resnet ( #14899 )
...
* clip softmax for fp16
* updates
* fuse xent support fp16 test=develop
* wip
* wip
* add simple row reduce
* wip fp16 accurate softmax
* add accurate softmax kernel for fp16 test=develop
* update test=develop
* fix cpu build test=develop
* update api.spec test=develop
* follow comments test=develop
* fix build test=develop
* fix trt build test=develop
* fix inference build test=develop
* fix merge test=develop
* update test=develop
* try fix build test=develop
* fix build test=develop
* rename real_exp test=develop
* fortest
* remove hacky kernels test=develop
* clean up test=develop
7 years ago
tensor-tang
96786d3716
add compare_determine of seqpool1 test
...
test=develop
7 years ago
tensor-tang
ab9c4b2a9f
refine seqpool concat pass and remove unused nodes
...
test=develop
7 years ago
tensor-tang
ce909664d8
Merge remote-tracking branch 'ups/develop' into refine/seqpool/feed
7 years ago
peizhilin
e239558e56
remove the dismatch enclosure to avoid warning message test=develop
7 years ago
flame
fb63cd89d4
Add python ir graph API ( #14917 )
7 years ago
tensor-tang
a0a27bd240
add seqpool concat fuse pass tester
...
test=develop
7 years ago
Tao Luo
7d13d20769
Merge pull request #15245 from luotao1/rnn1_multi_thread
...
reduce threads number to avoid analyzer_rnn1_tester hang in CI
7 years ago
minqiyang
80197fac26
Add missing files
...
test=develop
7 years ago
Tao Luo
2b11c710b3
Merge pull request #15249 from NHZlX/fix_trt_demo_ci
...
fix demo ci bug
7 years ago
乔龙飞 Qiao Longfei
5e74c4e88f
Merge pull request #15100 from jacquesqiao/fix-dist-sparse-decay
...
fix dist sparse l2 decay
7 years ago
tensor-tang
8e086a8521
follow comment and fix typo
...
test=develop
7 years ago
minqiyang
08e2a5d611
Polish tracer code
...
test=develop
7 years ago
minqiyang
cded24768c
Remove shared_ptr holder for VarBase
...
test=develop
7 years ago
minqiyang
c8d1a8e909
Change var_ and grad_ to shared_ptr
7 years ago
minqiyang
7aab39af15
Change grads to VarBase
7 years ago
tensor-tang
54afcb7ec6
add compare zerocopy test with native result
...
test=develop
7 years ago
tensor-tang
137060135e
fix zerocopy size
7 years ago
tensor-tang
7461356723
add zerocopy for seqpool test
7 years ago
tensor-tang
48410b9bfe
Merge pull request #15237 from tensor-tang/fuse/seqpool_concat_2
...
Fuse/seqpool concat 2
7 years ago
nhzlx
e7d83389e6
fix demo ci bug
...
1. trt_demo bug
2. trigger exit when exists a bug
test=develop
7 years ago
Tao Luo
9b41e45584
Merge pull request #15222 from luotao1/native_config
...
fix analyzer_test runs error in native_config
7 years ago
Tao Luo
d43983b61d
reduce threads number to avoid hang in CI
...
test=develop
7 years ago
Qiao Longfei
653cd31971
remove unused code
7 years ago
Qiao Longfei
0a79d7a404
fix merge
7 years ago
Qiao Longfei
422449a945
fix style
7 years ago
Qiao Longfei
edad60e612
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader
7 years ago
nhzlx
c1264e99f3
fix win error
...
test=develop
7 years ago
peizhilin
c1235c935f
add the enable_debug flag
...
test=develop
7 years ago
nhzlx
4e3522e5b4
add trt int8 support
...
test=develop
7 years ago
Xin Pan
7b73fc9e1a
Merge pull request #15089 from panyx0718/api
...
try unify Executor and ParallelExecutor
7 years ago
Xin Pan
9597fd05e9
polish
...
test=develop
7 years ago
Qiao Longfei
d0e3b24002
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay
...
test=develop
7 years ago
tensor-tang
f8c305b243
Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat_2
...
test=develop
7 years ago
Tao Luo
197d0f2431
fix trt_model_tester to pass the ci
...
test=develop
7 years ago
tensor-tang
223c61ca5e
Merge pull request #15170 from tensor-tang/jit/seqpool
...
refine seqpool op
7 years ago
Qiao Longfei
c3b9edf958
follow comment test=develop
7 years ago
Zeng Jinle
e29f10d315
Merge pull request #15207 from sneaxiy/remove_op_handle_lock_and_fix_var
...
Remove op handle lock and fix var
7 years ago
Zeng Jinle
7b638f2781
Merge pull request #15218 from sneaxiy/fix_same_name_func
...
Fix same name func framework::ToTypeIndex
7 years ago
Tao Luo
feee78d9f0
Merge pull request #15214 from tensor-tang/fix/debug
...
fix debug build error
7 years ago
Xin Pan
7aad6afd49
forward and backward
...
test=develop
7 years ago
mozga-intel
eff90eb941
PADDLE_WITH_NGRAPH was removed from the code
...
test=develop
7 years ago
mozga-intel
a42f8f4f6f
Enable element_wise_add operator for a ngraph
...
test=develop
7 years ago
mozga-intel
e4184008a4
PADDLE_WITH_NGRAPH was removed from the code
...
test=develop
7 years ago
Qiao Longfei
3ace486ebd
fix sum_op selected rows test=develop
7 years ago
Tao Luo
71d9097a89
fix analyzer_test runs error in native_config
...
test=develop
7 years ago
Tao Luo
9c02765158
Merge pull request #15210 from Superjomn/fix/analysis_tester_bug
...
fix analysis_tester bug
7 years ago
tensor-tang
72d2a1801e
add seqpool concat fuse pass
...
test=develop
7 years ago
tensor-tang
f702f8fd10
Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat
7 years ago
sneaxiy
bc205ef374
fix same name func
...
test=develop
7 years ago
tensor-tang
69fd3fdb52
fix debug build error
...
test=develop
7 years ago
Xin Pan
2349acea48
checkpoint
...
test=develop
7 years ago
xuezhong
c0bc818688
Merge pull request #15188 from velconia/add_pyramid_dnn_support
...
Add no lock optimization pass
7 years ago
Qiao Longfei
b16e832d4d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay
7 years ago
Qiao Longfei
4d169ad981
update api spec test=develop
7 years ago
superjomn
23bdd0a223
fix analysis_tester bug
...
test=develop
7 years ago
Yan Chunwei
d09d6eadc0
make inference api work with Doxygen ( #15195 )
7 years ago
Zeng Jinle
c562be20d9
Merge pull request #15193 from sneaxiy/fix_cudnn_compatible_check
...
Fix cudnn compatible check
7 years ago
peizhilin
1cd95d8a0b
use thread local instance test=develop
7 years ago
minqiyang
7b7d0d0caf
Change hash function back
...
test=develop
7 years ago
Xin Pan
11d4d39cd7
forward working
...
test=develop
7 years ago
sneaxiy
ed409ac9f4
Revert "Revert "Remove op handle lock""
...
test=develop
7 years ago
sneaxiy
4a443ffc98
merge develop
...
test=develop
7 years ago
peizhilin
d54133ea85
not include the numeric under linux test=develop
7 years ago
sneaxiy
7c7342bf12
fix scope.var()
...
test=develop
7 years ago
Tao Luo
4d9aa1745a
Merge pull request #14806 from mozga-intel/mozga-intel/scale_operator_ngraph
...
Enable scale operator for a ngraph engine
7 years ago
Tao Luo
dc0c221426
Merge pull request #14803 from mozga-intel/mozga-intel/mean_operator_ngraph
...
Enable mean operator for a ngraph engine
7 years ago
Xin Pan
b629133375
checkpoint runnable PyLayer
...
test=develop
7 years ago
peizhilin
a6f5ceee74
add the python callstack for debug support test=develop
7 years ago
Zeng Jinle
dacfaaa966
Revert "Remove op handle lock"
...
test=develop
7 years ago
Tao Luo
6ca9a4810b
Merge pull request #15196 from luotao1/serial
...
run analyzer_tester serial in multi-thread
7 years ago
Xin Pan
c4b09a713f
polish
...
test=develop
7 years ago
minqiyang
b76695418a
Polish log
...
test=develop
7 years ago
minqiyang
1bfbc0d963
Polish code
...
test=develop
7 years ago
minqiyang
7f45b9511a
Polish code
7 years ago
minqiyang
68a07328fa
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_pyramid_dnn_support
...
test=develop
7 years ago
Qiyang Min
317840d3ba
Merge pull request #14277 from velconia/add_fused_emb_seq_pool_op
...
Add fused emb seq pool op
7 years ago
tensor-tang
2dd331cc21
Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat
...
test=develop
7 years ago
tensor-tang
316636404f
add seqpool concat unit test
7 years ago
Yan Chunwei
6ccf8685f7
refactor tensorrt node teller ( #15181 )
7 years ago
Tao Luo
7dc0181c46
run analyzer_tester serial in multi-thread
...
test=develop
7 years ago
xiaolil1
c8f101e5da
Conv int8 relu ( #15130 )
...
* Enable basic MKL-DNN INT8 Conv OP
test=develop
* Modify test case
test=develop
* Clean unittest code
test=develop
* Fix test
test=develop
* Modify test
test=develop
* Enable MKL-DNN INT8 Conv with Relu Fusion OP
test=develop
* Modify basic INT8 Conv
test=develop
* fix type
test=develop
* Modify test
test=develop
7 years ago
sneaxiy
9793a0b6a6
fix_cudnn_compatible_check
7 years ago
Zeng Jinle
ccb322d6a5
merge develop
7 years ago
Xin Pan
0d0bc61248
update api
...
test=develop
7 years ago
tensor-tang
7923d7271f
add fusion seqpool concat op
7 years ago
Zeng Jinle
f3a13512fc
Merge pull request #15139 from sneaxiy/remove_op_handle_lock
...
Remove op handle lock
7 years ago
Qiao Longfei
44b300556d
change min_row_size_to_use_multithread to parameter of adam
...
test=develop
7 years ago
Qiao Longfei
87b4eb1da4
change min_param_size_to_use_multithread to min_row_size_to_use_multithread
7 years ago
minqiyang
0f94c1ac14
Polish code
...
test=develop
7 years ago
minqiyang
00e4de04bf
Polish code
7 years ago
minqiyang
4bfa110fd8
Add no lock optimize pass
...
test=develop
7 years ago
Qiyang Min
1df2399e00
Merge pull request #15180 from velconia/add_pyramid_dnn_support
...
Add JeMalloc
7 years ago
chengduo
eabb2105fa
Refactor MultiDevSSAGraphBuilder ( #15090 )
...
* Refactor ParallelExecutor
test=develop
* extract Reduce and AllReduce mode from MultiDevSSAGraphBuilder
test=develop
* Refactor MultiDevSSAGraphBuilder
test=developt
* Remove enable_data_balance
test=develop
* code refine
test=develop
* remove data balance
test=develop
* refine ScaleLossGradOp
test=develop
* remove unnecessary file
test=develop
* code refine
test=develop
* modify function name
test=develop
* follow comments
test=develop
* add is_distribution field
test=develop
* set is_distribution
test=develop
* fix DistSSAGraphBuilder
test=develop
7 years ago
Yan Chunwei
875a07c32d
refactor inference analysis api ( #14634 )
7 years ago
minqiyang
c09a379015
remove const_cast
...
test=develop
7 years ago
tensor-tang
102d93712e
Merge remote-tracking branch 'ups/develop' into jit/seqpool
...
test=develop
7 years ago
tensor-tang
123b98f417
refine height and codesize and support all pool
...
test=develop
7 years ago
tensor-tang
0145f40f45
use height from params of jitcode
7 years ago
tensor-tang
e0591deebc
enhance seqpool jitcode
7 years ago
Zeng Jinle
99e6e8b00f
Merge pull request #15179 from sneaxiy/fix_crf_grad_lod
...
Fix crf grad lod share
7 years ago
minqiyang
db8eb9b688
Polish code
...
test=develop
7 years ago
minqiyang
f4c990e7b8
Add fused embedding ops
7 years ago
minqiyang
39b98709b1
Move fused ops to fused dir
...
test=develop
7 years ago
minqiyang
920d4a8b78
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_fused_emb_seq_pool_op
...
test=develop
7 years ago
minqiyang
b2716909b4
Add changes to paddle_build
...
test=develop
7 years ago
Tao Luo
5ee596cae5
Merge pull request #15175 from baojun-nervana/intel/mkldnn
...
Update ngraph to resolve issue with mkldnn upgrade
7 years ago
乔龙飞 Qiao Longfei
7c891e1ecc
Merge pull request #15111 from jacquesqiao/fix-adam-tmp-var
...
Fix adam tmp var on cpu
7 years ago
mozga-intel
e77956c920
Enable mean operator for a ngraph
...
test=develop
7 years ago
mozga-intel
dd768714ab
Enable scale operator for a ngraph
...
test=develop
7 years ago
sneaxiy
be425461a1
fix crf grad lod share
...
test=develop
7 years ago
Qiao Longfei
3e1b914fcb
update gru op forward kernel
7 years ago
Qiao Longfei
7a81ab8607
complete gru_unite_op and test
7 years ago
Qiao Longfei
72618c8da5
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gru-add-mode
7 years ago
Qiao Longfei
17b1b660fc
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
7 years ago
Qiao Longfei
c15270c5b2
optimize multi thread adam
7 years ago
乔龙飞 Qiao Longfei
e1679b8847
Merge pull request #14893 from JiabinYang/feature/add_prefech_hs
...
Feature/add prefech hs
7 years ago
baojun-nervana
f0cde74564
Update ngraph with elt-wise relu test=develop
7 years ago
tensor-tang
92201d3956
support avg and sqrt pool and add mkl impl
...
test=develop
7 years ago
tensor-tang
c50060bb26
add jitcode impl and use it
7 years ago
tensor-tang
142bb41748
add seqpool jitkernel test and benchmark
7 years ago
tensor-tang
e58a569c6c
use seqpool jitkernel
7 years ago
tensor-tang
3e01a4048f
add refer seqpool jitkernel
7 years ago
Qiao Longfei
4ecb9c93f0
update API.spec
...
test=develop
7 years ago
Xin Pan
f1c7f4b016
Merge pull request #15142 from tianshuo78520a/tools
...
test=develop
7 years ago
xiaolil1
bbc9336878
Enable basic MKL-DNN INT8 Conv OP ( #15124 )
...
* Enable basic MKL-DNN INT8 Conv OP
test=develop
* Modify test case
test=develop
* Clean unittest code
test=develop
* Fix test
test=develop
* Modify test
test=develop
* Modify basic INT8 Conv
test=develop
7 years ago
Xin Pan
8ae9094e07
polish and resolve conflicts
...
test=develop
7 years ago
Xin Pan
5e928e579a
try unify Executor and ParallelExecutor
...
test=develop
7 years ago
Qiao Longfei
e10af895de
update gru grad op
...
test=develop
7 years ago
Qiao Longfei
78ec7c0f99
gru add origin mode
...
test=develop
7 years ago
peizhilin
c919b2f31d
Merge remote-tracking branch 'upstream/develop' into windows/fixgpuissue
7 years ago