Commit Graph

6387 Commits (29a4b21bc8d49067e0e4ce470aedb74b29050b37)

Author SHA1 Message Date
jerrywgz 9eb2d7b3e1 refine code, test=develop
6 years ago
nhzlx 484b3bc801 When cudnn version < 7100, there is problem with conv_fusion.
6 years ago
jerrywgz 6dfd789bfc Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_nms
6 years ago
jerrywgz 6928f8318f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_axis_for_boxcoder
6 years ago
jerrywgz e60c8438fc Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_clip_op
6 years ago
tensor-tang af07118dd7 Merge pull request #15486 from tensor-tang/fix/pass/debug
6 years ago
liuwei1031 5d026a881a Gpu memory monitoring (#15436)
6 years ago
Xin Pan 58cb18d9d9 Merge pull request #15322 from velconia/imperative_resnet
6 years ago
sneaxiy 51227bd447 lazy_allocator
6 years ago
tink2123 48cc484643 add align_corners and align_mode for image_resize
6 years ago
minqiyang ac80273686 Change definitions to PADDLE_WITH_JEMALLOC
6 years ago
minqiyang c8965dc1ab Polish code
6 years ago
tensor-tang 5c68dee798 fix debug compile of analysis pass fail
6 years ago
乔龙飞 Qiao Longfei d243e555eb Merge pull request #15080 from jacquesqiao/optimize-assign
6 years ago
jerrywgz 11f1baa406 refine code, test=develop
6 years ago
Zhaolong Xing b7b68f2a8c Merge pull request #15461 from NHZlX/fix_trt_stream_bug
6 years ago
luotao1 353b5f06a7 refine analyzer_bert_test to pass the ci
6 years ago
tangwei12 8b50ad80ff checkpoint at distributed training (#14854)
6 years ago
luotao1 cc618934c0 Merge branch 'bert_test' of https://github.com/fc500110/Paddle into fc500110-bert_test
6 years ago
jerrywgz 57e5f61ec8 add gpu kernel, test=develop
6 years ago
jerrywgz cc53453057 add comment and refine code, test=develop
6 years ago
nhzlx e6218c1d7b change the input to a smaller value
6 years ago
qingqing01 07dc5a1506 Add generate_mask_labels_op to support Mask-RCNN and refine some code. (#15371)
6 years ago
Qiao Longfei 6833ec06dc Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-assign
6 years ago
Yiqun Liu eaad3e4c3d Add check of input in sequence_expand op. (#15466)
6 years ago
sneaxiy ef788603d4 merge develop
6 years ago
gongweibao f4dec5cdee Check collective server's data. (#15449)
6 years ago
Zhen Wang 58727e8e6d Merge pull request #15455 from wzzju/graph_quantization
6 years ago
jerrywgz f44b1507f0 revised API spec, test=develop
6 years ago
jerrywgz b449f8ff2f revised API spec, test=develop
6 years ago
fuchang01 4a33a44f45 analyzer bert tester
6 years ago
Tao Luo fef3fd6d62 Merge pull request #15452 from luotao1/legacy_option
6 years ago
Paddle CI 289aba750a Polish code
6 years ago
jerrywgz c12a969bd4 refine comment and unittest, test=develop
6 years ago
chengduo 5a8bd82c0c Remove workspace_handle (#15376)
6 years ago
jerrywgz 1c558ad388 add gpu kernel for box clip, test=develop
6 years ago
JiabinYang 266e0b63cd Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/imperative
6 years ago
JiabinYang e686818aed simple RNN
6 years ago
WangZhen 4e91d8d291 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into graph_quantization
6 years ago
nhzlx 5b92ddabe2 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_trt_stream_bug
6 years ago
nhzlx 2f4aee361a fix comments
6 years ago
WangZhen c6f99a1645 Update API.spec. test=develop
6 years ago
WangZhen b913463e83 Update according to the reviewers' suggestion. test=develop
6 years ago
sneaxiy d8568acd19 turn on remove_unnecessary_lock
6 years ago
Qiao Longfei a71f7ed787 update API.spec test=develop
6 years ago
nhzlx ec213730bc fix trt stream bug.
6 years ago
wopeizl a8aa79130b Merge pull request #15453 from wopeizl/fix15313
6 years ago
gongweibao 7f8b40f68d Fix brpc complation error. (#15451)
6 years ago
WangZhen 3ce6172052 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into graph_quantization
6 years ago
WangZhen 787c5e714c Update the API.spec. test=develop.
6 years ago
WangZhen 59e5cc51d6 Add quantization transform pass and UT.
6 years ago
flame d60751fb71 add python inference api (#15248)
6 years ago
jerrywgz 0d4b60ab8b add lod for slice op, test=develop
6 years ago
dzhwinter 8f3b252392 squash commits. test=develop
6 years ago
peizhilin e6a3a3a31a fix pr 15313
6 years ago
Qiao Longfei 9449844c2a update ctr_reader in API.spec
6 years ago
Tao Luo cf29ea1592 remove legacy ANDROID option
6 years ago
jerrywgz 66bb5dd760 refine infer shape, test=develop
6 years ago
tensor-tang 266e625d2e Merge pull request #15399 from tensor-tang/refine/seqpool/fc
6 years ago
Qiao Longfei 45578c1b48 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader
6 years ago
jerrywgz 7d0c5fafa9 add API spec, test=develop
6 years ago
Yan Chunwei 885c4e57ab fea/infer memory optim2 (#14953)
6 years ago
jerrywgz 0d91507859 fix share lod, test=develop
6 years ago
minqiyang a21f4e38c3 Polish code
6 years ago
minqiyang 8ce198b2e1 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_resnet
6 years ago
minqiyang 31a1cd8ce5 Align the first batch of gpu resnet
6 years ago
Tao Luo 6597ccb01f Merge pull request #15413 from luotao1/legacy_code
6 years ago
Dun 9f8f0fc2d3 Memory optimization of depthwise conv op and group norm op (#15313)
6 years ago
jerrywgz 5246285e34 test=develop
6 years ago
jerrywgz b10d84bc5a fix bug when run on GPU, test=develop
6 years ago
whs 530869f829 Share LoD from Input(Rois). (#15420)
6 years ago
gongweibao 7ab4af2716 Fix brpc compilation. (#15417)
6 years ago
Xin Pan 9a9c690e71 Merge pull request #15343 from panyx0718/imperative3
6 years ago
Dun Liang e5004f3c1c fix ci && test=develop
6 years ago
WangZhen e2ff300b02 add UT for quantization.
6 years ago
WangZhen 451896fce4 init quantization.
6 years ago
tensor-tang 316e44b1b7 fix unused warnings
6 years ago
Wu Yi 7e651a38dd fix mac cmake version 3.13 build (#15386)
6 years ago
jerrywgz b62a17bbae add nms api
6 years ago
tensor-tang 579d758254 fix jitkernel tests and refine benchmark
6 years ago
jerrywgz f660553d77 enhance nms for mask rcnn, test=develop
6 years ago
shippingwang 14f2a1060d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into shufflechannel
6 years ago
jerrywgz 88ee56d0b2 enhance nms for mask rcnn
6 years ago
zhaozhehao e2ba9668b4 Tree conv op (#15217)
6 years ago
Tao Luo 3ede8b67e6 update CMakeLists.txt
6 years ago
Tao Luo 8f522c15ed Merge pull request #15408 from luotao1/mm_dnn
6 years ago
Tao Luo 001827c270 test_analyzer_mm_dnn runs in serial
6 years ago
Tao Luo 140fc1e92c Merge pull request #15392 from luotao1/pyramid_dnn
6 years ago
Yan Chunwei c9e5aa19c1 get tensor API add more comments (#15345)
6 years ago
Yiqun Liu f413b6892b Revert the modification of while_op in #14764. (#15372)
6 years ago
jerrywgz ab9d6a4f39 add comments, test=develop
6 years ago
jerrywgz 10dd3b37ad add axis for box coder op
6 years ago
Yan Chunwei e84234b551 make clone thread safe (#15363)
6 years ago
乔龙飞 Qiao Longfei adba4384ec Merge pull request #15161 from jacquesqiao/gru-add-mode
6 years ago
gongweibao 7cd4dd7ce4 Hide varhandle members. (#15382)
6 years ago
Tao Luo 668563088e add pyramid_dnn c++ inference test
6 years ago
Zhaolong Xing 236201c222 Merge pull request #15350 from NHZlX/fix_bug_for_precditor
6 years ago
nhzlx 8817841c73 fix unit test bug
6 years ago
Yan Chunwei e07900d317 cache tensor ptr in ZeroCopyTensor (#15352)
6 years ago
Yan Chunwei b7916440ff hot fix the Native clone (#15344)
6 years ago
jerrywgz 5fb2856584 test_develop
6 years ago
minqiyang dbd4d058af Add static implementation and fix fc layer
6 years ago
Xin Pan 3ecf6bb338 Merge pull request #15028 from yihuaxu/develop_641313ea7_elementwise_mul_mkldnn_bug_fix
6 years ago
jerrywgz e2044c09e9 test=develop
6 years ago
jerrywgz af448373c7 test=develop
6 years ago
Xin Pan e395f2c6a3 polish codes
6 years ago
nhzlx b95f2ff8fe fix win build bug
6 years ago
nhzlx b938324381 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_int8_ultimate_version
6 years ago
nhzlx 312fe0ece1 add trt int8 calibration support
6 years ago
wopeizl 994e73f685 Merge pull request #15351 from wopeizl/fixbuildissue
6 years ago
minqiyang 315b133e67 Add single GPU support to imperative
6 years ago
jerrywgz 481d8bce2f add box clip op
6 years ago
Yiqun Liu 568cc2ffa8 Optimize while_op for test (#14764)
6 years ago
tensor-tang 3759c1db8c Merge pull request #14805 from mozga-intel/mozga-intel/element_wise_operator_ngraph
6 years ago
tensor-tang 904a39239d Merge pull request #15254 from mozga-intel/mozga-intel/softmax_operator_ngraph
6 years ago
nhzlx e61a1b9514 merge develop test=develop
6 years ago
peizhilin cd562f8fb7 disable the parallel mode for adam op on windows test=develop
6 years ago
nhzlx b2ba3471fd fix analysis config bug.
6 years ago
Xin Pan 01dc15ce32 Merge pull request #15329 from panyx0718/imperative2
6 years ago
Xin Pan 16cb3ebd68 Merge pull request #15268 from xiaolil1/pool-int8
6 years ago
Xin Pan 9a4314f025 imperative gan
6 years ago
tensor-tang a7fc3d42a0 Merge pull request #15304 from tensor-tang/fuse/second_order_mul_sub
6 years ago
bingyanghuang a152a5c731 Disable conv3d mkldnn in dam (#15335)
6 years ago
Xin Pan 73093656b8 Merge pull request #15331 from panyx0718/api
6 years ago
Xin Pan 2db6e3ed2a Merge pull request #15292 from panyx0718/imperative
6 years ago
乔龙飞 Qiao Longfei b14d4cdd75 Merge pull request #14890 from jacquesqiao/multithread-sparse-adam
6 years ago
Xin Pan 6b762f6519 add doc
6 years ago
Xin Pan d7b159355c add more doc
6 years ago
mozga-intel cba729404d Enable softmax operator for a ngraph engine
6 years ago
Qiao Longfei cd31b90a46 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader
6 years ago
wopeizl 0fbb76f66b Merge pull request #15204 from wopeizl/debug/support
6 years ago
Xin Pan 24bb6a6aec expose CompiledProgram
6 years ago
Xin Pan 783dbe9abb more doc
6 years ago
Xin Pan f997109bb1 polish
6 years ago
Xin Pan c1fdacd4b4 add imperative mode design
6 years ago
Qiao Longfei 8c516a24e5 remote min_row_size_to_use_multithread in adam interface test=develop
6 years ago
Tao Luo 9497d43921 Merge pull request #15307 from luotao1/trace_deps
6 years ago
tensor-tang 1a95cd227d disable seqpool test on mac or without mkl
6 years ago
Qiao Longfei 9b4fe283e1 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
6 years ago
tensor-tang 0b6447a482 Merge pull request #15310 from luotao1/ZeroCopy_omp
6 years ago
peizhilin 5e450833bd test=develop
6 years ago
Qiyang Min 3f687765e6 Merge pull request #15281 from velconia/fix_expand_op_compile_time
6 years ago
peizhilin eea75a1d93 fix issue when type is invalid
6 years ago
peizhilin 9adb158e5b Merge remote-tracking branch 'upstream/develop' into debug/support
6 years ago
minqiyang 29ceb93126 Use malloc and free in JeMalloc
6 years ago
Tao Luo 2411ed4286 fix multi-threads in ZeroCopyProfile
6 years ago
minqiyang c4cf5967db Change backward op infershape
6 years ago
tensor-tang 84b0ecdcce Merge remote-tracking branch 'ups/develop' into fuse/second_order_mul_sub
6 years ago
tensor-tang 7035f051a8 adjust acc on mac
6 years ago
luotao1 346561a37f fix imperative compile when WITH_PYTHON=OFF
6 years ago
Xin Pan b29eca3b71 code style
6 years ago
Xin Pan 7bc67c31e5 polish more
6 years ago
Xin Pan 0c04cac484 polish
6 years ago
Xin Pan 47ef2df01a polish
6 years ago
Xin Pan 0d5819eb4f polish imperative codes
6 years ago
Tao Luo e33427da0d Merge pull request #15280 from luotao1/random_test
6 years ago
chengduo 46d01d798e Revert "Revert "Remove workspace_handle in conv_cudnn (#15186)"" (#15290)
6 years ago
Qiao Longfei 4d15515c40 fix gru_gpu_kernel test=develop
6 years ago
tensor-tang 93e75c5ae5 refine jitcode of vsub and vsquare
6 years ago
tensor-tang d618e48309 fix fuse square mat order and refine test
6 years ago
tensor-tang a5d2a6d1ad add fuse pass of sequared mat sub fusion
6 years ago
tensor-tang 531f4a1578 Merge branch 'fuse/repeatedfcrelu' into fuse/second_order_mul_sub
6 years ago
tensor-tang 84e023eae5 adjust the acc since the refer result is too large
6 years ago
Qiao Longfei 4feae25378 fix build problem test=develop
6 years ago
tensor-tang 38de1ff472 add fusion squared mat sub op
6 years ago
Qiao Longfei e641ffe77b change interface and api spec for dynamic_gru test=develop
6 years ago
tensor-tang 09c5786e22 add square jitkernel
6 years ago
Qiao Longfei 4c7be265d3 update avx gru grad kernel test=develop
6 years ago
tensor-tang 4461a458a5 adjust diff since abs is too large
6 years ago
Qiao Longfei 9b16e54064 update gru_grad_op
6 years ago
tensor-tang ca6fdc6e33 refine and fix test
6 years ago
tensor-tang a89296ac1f add repeated fc relu pass
6 years ago
Qiao Longfei e477d789a1 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gru-add-mode
6 years ago
tensor-tang f347d6e4a1 add repeated fc relu unit test
6 years ago
tensor-tang 99010e6eae init repeated fc relu op
6 years ago
tensor-tang 266a5d2f52 implement matmul refer and mkl kernel
6 years ago
tensor-tang c5623c87a3 init jit matmul kernel
6 years ago
Xin Pan a92860a3b1 Merge pull request #15298 from panyx0718/fix
6 years ago
Dun Liang 0c5c561bd1 test=develop
6 years ago
Xin Pan 50b4ac08b0 fix
6 years ago
Xin Pan a1bfb35dd6 try fix py2
6 years ago
tensor-tang 781cd0cf51 add multi threads test of seqpool test (#15293)
6 years ago
Dun Liang a900015c03 add async copy and pinned place
6 years ago
Xin Pan 3f65869ba6 try fix
6 years ago
Xin Pan 3e79e6544f try fix
6 years ago
Tao Luo 1d434a9de6 Merge pull request #15291 from wojtuss/wojtuss/fix-performance-drop
6 years ago
minqiyang c86b3dd6e6 Polish code
6 years ago
minqiyang ddfb9f1123 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_shared_ptr
6 years ago
Xin Pan d1220f23bb Merge pull request #15229 from panyx0718/imperative
6 years ago
colourful-tree 576c740d5d Merge pull request #14964 from colourful-tree/data_norm
6 years ago
colourful-tree d5a8909131 Merge pull request #14950 from colourful-tree/develop
6 years ago
minqiyang bc3e0d6e01 Fix expand op compile time bug
6 years ago
Tao Luo cbd1c7c01f fix CompareDeterministic error when test_all_data
6 years ago
Xin Pan 6a18c0f9ff Merge pull request #15278 from chengduoZH/revert_remove_workspace_handle_in_conv2d_cudnn
6 years ago
Zhaolong Xing 98e85f3735 add_transpose_flatten_concat_fuse (#15121)
6 years ago
chengduozh c4eced9881 fix thread safe bug
6 years ago
chengduozh 358e657f68 Revert "Remove workspace_handle in conv_cudnn (#15186)"
6 years ago
wopeizl 5d9edb4124 Merge pull request #15156 from wopeizl/windows/fixgpuissue
6 years ago
Wojciech Uss cb2ba58458 Fix performance drop when with MKL-DNN
6 years ago
tensor-tang fc9fbab6a0 Merge pull request #15271 from tensor-tang/fix/typo
6 years ago
minqiyang d0b640dca1 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_shared_ptr
6 years ago
chengduo 064512aa47 Remove workspace_handle in conv_cudnn (#15186)
6 years ago
minqiyang 687171d22b Move from shared_ptr to raw pointer
6 years ago
tensor-tang c3a9f3c4b2 fix typo and refine
6 years ago
tensor-tang 146e942c65 Merge pull request #15250 from tensor-tang/refine/seqpool/feed
6 years ago
xiaolil1 8f17c714de Conv int8 residual (#15145)
6 years ago
Tao Luo 93d5c1ed5a Merge pull request #15261 from wopeizl/fixdemos
6 years ago
xiaoli.liu@intel.com f34e779f4d Enhance key generation for INT8 test.
6 years ago
peizhilin 439691f5bd adjust the shlwapi on windows
6 years ago
peizhilin 92da467c99 Merge remote-tracking branch 'upstream/develop' into windows/fixgpuissue
6 years ago
Wu Yi fd85418329 [Feature] support mix precision training for resnet (#14899)
6 years ago
tensor-tang 96786d3716 add compare_determine of seqpool1 test
6 years ago
tensor-tang ab9c4b2a9f refine seqpool concat pass and remove unused nodes
6 years ago
tensor-tang ce909664d8 Merge remote-tracking branch 'ups/develop' into refine/seqpool/feed
6 years ago
peizhilin e239558e56 remove the dismatch enclosure to avoid warning message test=develop
6 years ago
flame fb63cd89d4 Add python ir graph API (#14917)
6 years ago
tensor-tang a0a27bd240 add seqpool concat fuse pass tester
6 years ago
Tao Luo 7d13d20769 Merge pull request #15245 from luotao1/rnn1_multi_thread
6 years ago
minqiyang 80197fac26 Add missing files
6 years ago
Tao Luo 2b11c710b3 Merge pull request #15249 from NHZlX/fix_trt_demo_ci
6 years ago
乔龙飞 Qiao Longfei 5e74c4e88f Merge pull request #15100 from jacquesqiao/fix-dist-sparse-decay
6 years ago
tensor-tang 8e086a8521 follow comment and fix typo
6 years ago
minqiyang 08e2a5d611 Polish tracer code
6 years ago
minqiyang cded24768c Remove shared_ptr holder for VarBase
6 years ago
minqiyang c8d1a8e909 Change var_ and grad_ to shared_ptr
6 years ago
minqiyang 7aab39af15 Change grads to VarBase
6 years ago
tensor-tang 54afcb7ec6 add compare zerocopy test with native result
6 years ago
tensor-tang 137060135e fix zerocopy size
6 years ago
tensor-tang 7461356723 add zerocopy for seqpool test
6 years ago
tensor-tang 48410b9bfe Merge pull request #15237 from tensor-tang/fuse/seqpool_concat_2
6 years ago
nhzlx e7d83389e6 fix demo ci bug
6 years ago
Tao Luo 9b41e45584 Merge pull request #15222 from luotao1/native_config
6 years ago
Tao Luo d43983b61d reduce threads number to avoid hang in CI
6 years ago
Qiao Longfei 653cd31971 remote unused code
6 years ago
Qiao Longfei 0a79d7a404 fix merge
6 years ago
Qiao Longfei 422449a945 fix style
6 years ago
Qiao Longfei edad60e612 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader
6 years ago
nhzlx c1264e99f3 fix win error
6 years ago
peizhilin c1235c935f add the enable_debug flag
6 years ago
nhzlx 4e3522e5b4 add trt int8 support
6 years ago
Xin Pan 7b73fc9e1a Merge pull request #15089 from panyx0718/api
6 years ago
Xin Pan 9597fd05e9 polish
6 years ago
Qiao Longfei d0e3b24002 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay
6 years ago
tensor-tang f8c305b243 Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat_2
6 years ago
Tao Luo 197d0f2431 fix trt_model_tester to pass the ci
6 years ago
tensor-tang 223c61ca5e Merge pull request #15170 from tensor-tang/jit/seqpool
6 years ago
Qiao Longfei c3b9edf958 follow comment test=develop
6 years ago
Zeng Jinle e29f10d315 Merge pull request #15207 from sneaxiy/remove_op_handle_lock_and_fix_var
6 years ago
Zeng Jinle 7b638f2781 Merge pull request #15218 from sneaxiy/fix_same_name_func
6 years ago
Tao Luo feee78d9f0 Merge pull request #15214 from tensor-tang/fix/debug
6 years ago