11210 Commits (b58957d9d792b8ec85ad460a02ecc1f13575e7cd)

Author SHA1 Message Date
Yibing Liu cbe4292516 Add sequence unpad op
7 years ago
tensor-tang bcb8ea397d Merge remote-tracking branch 'ups/develop' into fea/jitkernel_peephole
7 years ago
tensor-tang 8e182170ba refine and replace lstm peephole kernel
7 years ago
nhzlx efa5bac7ad fix demo_ci bug in vis_demo.cc
7 years ago
tensor-tang dc5a7b906d fix default number of threads when inference with or without MKLDNN
7 years ago
Xin Pan 228506618b Avoid GetMutable implicitly reset Var Type.
7 years ago
jerrywgz 3c963336e4 fix roi pool register
7 years ago
Dun 5f2e837847 optimize depthwise conv by register memory (#13778)
7 years ago
minqiyang 3f6ec90060 Polish code
7 years ago
minqiyang 9878eedbaa Change API.spec
7 years ago
Qiao Longfei 5428cb9908
Profiler support merge data of all thread (#13811)
7 years ago
nhzlx bf7a2789a0 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_trt_pad_op
7 years ago
nhzlx 0cb88c34be add op converter
7 years ago
tensor-tang 7ef2699e18 init peephole runtime kernel
7 years ago
Qiyang Min f99ea99e36
Merge pull request #13720 from velconia/fix_grad_clip
7 years ago
minqiyang f40848828d Polish code
7 years ago
minqiyang e2e82bde32 Accelerate Reshape op
7 years ago
minqiyang 0385b0a1ea Accelerate SequencePool Op on SUM mode
7 years ago
minqiyang 8ec748cfa0 Accelerate SelectedRows Functors:
7 years ago
Xin Pan 63b2e98f3d Explain LoD and a few other concepts
7 years ago
Tao Luo 9b11a17502
Revert "[MKLDNN] Pass: Fuse Conv + Bias"
7 years ago
Tao Luo ce248a15d9
Merge pull request #13368 from Sand3r-/mgallus/conv-bias-pass
7 years ago
whs 7e651c8641
Fix truncated norm (#13785)
7 years ago
Tao Luo 16b1beb244
Merge pull request #13486 from sfraczek/sfraczek/conv-bn-fuse-pass
7 years ago
Zhaolong Xing 5d5587fff7
Merge pull request #13792 from NHZlX/trt_dy_lib
7 years ago
Michal Gallus 40b17be4b0 Pass: Fuse Conv + Bias
7 years ago
minqiyang 1456b8ec7d Add unittest for clip_by_norm_op with SelectedRows
7 years ago
Tao Luo fd0dd07ab4
Merge pull request #13726 from jczaja/prv-fused_embedding_fc_lstm-ut
7 years ago
Sylwester Fraczek 3fcca40909 eigen sqrt fix and change 1e-5 to epsilon
7 years ago
Qiao Longfei 5fc305220c
Merge pull request #13787 from PaddlePaddle/revert-13637-optimize-opyreader
7 years ago
nhzlx 9445502f90 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_dy_lib
7 years ago
chengduo e1904ac2c8
Add doc (#13765)
7 years ago
nhzlx d347ea689a fix comments
7 years ago
chengduo e1761709f8
Set the right shape of selected_rows (#13723)
7 years ago
tensor-tang 3ee8f2c6cf thread local jit kernels
7 years ago
tensor-tang 9131a35676 replace the lstm compute with jitkernel
7 years ago
Qiao Longfei 9d087d5139 Revert "optimize pyreader"
7 years ago
tensor-tang b55c247678 add lstm compute unit test
7 years ago
nhzlx f3af90d121 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_dy_lib
7 years ago
nhzlx f569095084 add tensorrt api lib to paddle_fluid
7 years ago
qingqing01 6094a72308
Fix bug in reduce_op caused by PR #13534 (#13748)
7 years ago
Tao Luo 59c306eb52
Merge pull request #13776 from luotao1/revert_fast_math
7 years ago
Tao Luo 84a55155ec revert with_fast_math to ON
7 years ago
Wu Yi cc36bab184
fix manylinux multi arch docker build test=develop (#13770)
7 years ago
Qiao Longfei b1d5135ffb
Merge pull request #13637 from jacquesqiao/optimize-opyreader
7 years ago
dzhwinter a46e30aa6d
enhance isinf/isnan in tensor util, avoid copy back to cpu (#12688)
7 years ago
tensor-tang 2a00969165 optimize lstm jitkernel keq8
7 years ago
tensor-tang f2adaf1c3e add vrelu and lstm kernel
7 years ago
Xin Pan 943e4deb23
Merge pull request #13750 from panyx0718/fix
7 years ago
Jacek Czaja 9f15d8817e - Cleanup as suggessted by reviewers
7 years ago
Wu Yi 25262ed076
fix cuda9 docker build test=develop (#13701)
7 years ago
Sylwester Fraczek 78f98294c2 conv bn fuse pass
7 years ago
Jacek Czaja ae8b4717cc - Cleaning fused_embedding_fc_lstm op
7 years ago
Jacek Czaja fd31b54cf1 - Removed disabled code
7 years ago
Jacek Czaja f9da2d6416 - Removed disabled diagnostic code
7 years ago
Jacek Czaja 809dbc5c17 - Added file for fused_embedded_fc_lstm_op unit test
7 years ago
Tao Luo 75bd0f188b
Merge pull request #13754 from luotao1/fast_math
7 years ago
qiaolongfei 5238a7f5b9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-opyreader
7 years ago
tensor-tang e6d8aca3bf refine code and fix
7 years ago
qingqing01 41e4f7ea0a
Optimize Topk when height is large. (#13710)
7 years ago
xuezhong 65ed45a154
Merge pull request #13717 from chengduoZH/enhance_while
7 years ago
minqiyang bcd8c2ccc3 Add unit test
7 years ago
Tao Luo 28889caea5 disable EIGEN_FAST_MATH and use_fast_math
7 years ago
tensor-tang ea7dc9cbf6 Merge remote-tracking branch 'ups/develop' into fea/jitkernel
7 years ago
Xin Pan d2079b1ddb clean unused code and small optimize
7 years ago
tensor-tang 2513b2cc4e fix bug vtanh
7 years ago
chengduoZH e59ab42caa add nodes for drnn
7 years ago
Xin Pan ab798a2832 clarify the fraction_of_gpu_memory flag
7 years ago
Tao Luo d770b9bda3
Merge pull request #13663 from luotao1/resnet50_ut
7 years ago
dzhwinter 32c260cd1f
"fix operators cmake" (#13581)
7 years ago
Tao Luo 6ef6a9180a
Merge pull request #13727 from Sand3r-/mgallus/enable-mkldnn-naive-exe
7 years ago
minqiyang f20fc95539 Resize output ddims and rows
7 years ago
qiaolongfei 91756a5a90 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-opyreader
7 years ago
Zeng Jinle 8cd17c04c1
Merge pull request #13718 from sneaxiy/fix_eager_deletion
7 years ago
Michal Gallus 09d9d77a8f Enable MKLDNN in Naive Executor
7 years ago
Jiabin Yang 8e63bc2307
Merge pull request #13700 from JiabinYang/fix/mac_ci_job
7 years ago
tensor-tang 5ef14dd386
Merge pull request #13715 from tensor-tang/fix/op
7 years ago
minqiyang 67308822f8 Add selected_rows merge for clip_by_norm op
7 years ago
sneaxiy 9606b37ce4 test=develop
7 years ago
Xin Pan c0dfd5eee8
Merge pull request #13712 from luotao1/doc_element
7 years ago
tensor-tang ea0b98e007 bugfix: fusion lstm and gru batch,seq mode switch
7 years ago
Tao Luo 69ed75e77c refine elementwise doc
7 years ago
Tao Luo 49cd43000c
Merge pull request #12981 from chenwhql/sequence_erase_op_infershape_note_polish
7 years ago
JiabinYang 248400f43a test=develop
7 years ago
tensor-tang cf8c8e72bd add vtanh and unit test
7 years ago
Tao Luo 067b8582af
Merge pull request #13625 from wanghaoshuang/fix_warning_roi
7 years ago
tensor-tang b37fe30417
Merge pull request #13690 from wangguibao/fix_cpu_lstm_compute_cc
7 years ago
dzhwinter 26771f41ba
"fix compile error" (#13579)
7 years ago
JiabinYang 4cc3c4c976 test=develop
7 years ago
Yu Yang 186b2b138d
Revert "Merge pull request #13201 from reyoung/revert_callstack" (#13697)
7 years ago
Tao Luo a89afd4c22
Merge pull request #13685 from luotao1/naive_cmake
7 years ago
tensor-tang d10a9df7b8 add vaddbias and unit test
7 years ago
tensor-tang 3c8b651187 add vsigmoid avx implementations and unit test
7 years ago
luotao1 9cbf2023ab rollback paddle_inference_helper.h to helper.h
7 years ago
sneaxiy 6f748a035d test=develop
7 years ago
tensor-tang 55e44761fb refine code and init vsigmoid
7 years ago
Xin Pan 2f5a7cc470
Merge pull request #13673 from panyx0718/infer
7 years ago
wangguibao 1940bc2d83 Avoid multiple definitions of lstm_compute_ctht when linking libpaddle_fluid.so
7 years ago
sneaxiy 584c3f048f fix sparse rmsprop
7 years ago
chengduo d6747a9ac2
make check_graph choosable (#13674)
7 years ago
Tao Luo 824a82d728
Merge pull request #13672 from luotao1/gen_fluid_library
7 years ago
luotao1 d55d7e04fd update libpaddle_fluid.so with zeroCopy
7 years ago
Xin Pan 425a882165
Merge pull request #13643 from panyx0718/ir2
7 years ago
luotao1 a989a4e7c2 refine paddle_inference_helper.h
7 years ago
Jiabin Yang 2364494090
Merge pull request #13662 from JiabinYang/mac/fix_unittest_0927
7 years ago
Xin Pan 642905958a fix compile error
7 years ago
Xin Pan 33b68fdf25 fix compile error
7 years ago
tensor-tang ede4b230be
Merge pull request #13553 from jczaja/prv-fused_embedding_fc_lstm_op
7 years ago
Jiabin Yang 618b3297e6
Merge pull request #13668 from JiabinYang/mac/fix_ci_unittest09272
7 years ago
Xin Pan 6746b1fdf3 add missing header
7 years ago
Dun 161c3e31f7 Optimization of Kernels that related to DeepLabv3+ (#13534)
7 years ago
Xin Pan 5fb72d840a add header
7 years ago
Xin Pan ddd60581b7 clean up channel
7 years ago
tensor-tang 2d0ff6a3c2 add vexp and unit test
7 years ago
tensor-tang b3c63f40fa add vscal and unit test
7 years ago
Xin Pan 3d339797fb clean use_mkldnn options
7 years ago
Tao Luo cfbd71c223 reduce inference ci time
7 years ago
Tao Luo 83ca657f96 Merge branch 'develop' into resnet50_ut
7 years ago
Xin Pan 35b713c3fd
Merge pull request #13670 from typhoonzero/disable_dist_se_resnext
7 years ago
typhoonzero e6d357ff5d disable dist se resnet
7 years ago
tensor-tang 0987f2b4d9 add vadd unit test
7 years ago
Jacek Czaja e202f33aa9 - Yet another clarification to comment
7 years ago
JiabinYang 358b386953 test=develop
7 years ago
tensor-tang 3d928d4f9d refine and seepdup
7 years ago
qiaolongfei c5292b181e change py_reader_by_data to create_py_reader_by_data
7 years ago
Xin Pan 00ca94578c
Merge pull request #13657 from panyx0718/fix
7 years ago
Zeng Jinle 1cbaf71a68
Merge pull request #13620 from sneaxiy/fix_api_kwargs2
7 years ago
Tao Luo 21ee30595b clean some CMakeLists
7 years ago
dzhwinter 2d00e65819
namespace issue (#13543)
7 years ago
JiabinYang 9ae5baebfa test=develop
7 years ago
Jiabin Yang a5b20a9e37
Merge pull request #13651 from velconia/mac_py3
7 years ago
Jacek Czaja 1df69f7c9d - Fix to comment
7 years ago
Tao Luo b31905c54d Merge branch 'develop' into resnet50_ut
7 years ago
Tao Luo 1dcd6ee532 add resnet50 inference UT
7 years ago
Xin Pan 2c4b8393ce
Merge pull request #13573 from velconia/fix_api
7 years ago
tensor-tang 77fc42d2d1 Merge remote-tracking branch 'ups/develop' into fea/jitkernel
7 years ago
Yu Yang 593ad763cd refactor(op): polish generate_proposals_op
7 years ago
Xin Pan d24f1f0aa4 Current scope needs to be thread-safe for training
7 years ago
velconia 4a7b9f7833 Fix pip install in mac
7 years ago
Wu Yi 7a5f3f750b
Fix memory optimization with dist train (#13535)
7 years ago
Yan Chunwei c8744d118d
fea/infer executor and concurrency performance issue bug fix (#13451)
7 years ago
tensor-tang 2937314d8e refine vmul and test
7 years ago
Dang Qingqing f189bf6a42 Update API.spec
7 years ago
Dang Qingqing e79ad2ea87 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into quantize_transpiler_update
7 years ago
Jiabin Yang 643b6faa0c
Merge pull request #13631 from JiabinYang/mac/add_ci_job
7 years ago
qiaolongfei 1d618225a7 add py_reader_by_data to API.spec
7 years ago
tensor-tang 6c986e127a fix macro and add vmul unit test
7 years ago
chengduo 5175b3cb2b
Add GraphChecker (#13580)
7 years ago
Jacek Czaja 910cd415f2 - Disabled embedding_fc_lstm_fuse by defult and
7 years ago
sneaxiy 31e67b9042 test=develop
7 years ago
minqiyang 7aa0247bd1 Regenerate API.spec
7 years ago
qiaolongfei 85ddb5c76e Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-opyreader
7 years ago
Xin Pan 7cd2761736
Merge pull request #13416 from panyx0718/ir
7 years ago
minqiyang 4c89137427 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_api
7 years ago
chengduo 43a3af86be
refine sgd_op (#13626)
7 years ago
qiaolongfei accf3f7505 optimize pyreader
7 years ago
Qiyang Min adae0a3b54
Merge pull request #13627 from velconia/mac_py3
7 years ago
Xin Pan 2c01c2216a
Merge pull request #13531 from gongweibao/generator2
7 years ago
qingqing01 479ad4bb92
Merge branch 'develop' into quantize_transpiler_update
7 years ago
Yu Yang 0be1582df0
Merge pull request #13525 from reyoung/fix_mixed_vector
7 years ago
Jacek Czaja d5114c60b0 - Reviewers suggesstions to fused_embedding_fc_lstm_op
7 years ago
Jacek Czaja 7ab5626dee - Added initial pass for embedding-fc-lstm
7 years ago
JiabinYang 6e26a45c89 test=develop
7 years ago
chengduo 4e81e22827
add op frequence (#13328)
7 years ago
qingqing01 fd4c4df93d
Cuda speed for generate_proposals_op. (#13596)
7 years ago
tensor-tang 8c69764d12 add vmul unit tests
7 years ago
velconia 1512cf247f Polish code
7 years ago
velconia 688ddc9095 Polish code
7 years ago
velconia d26d356de3 Make python3 only build in fluid only
7 years ago
tensor-tang 084893a9a9 add vadd kernel
7 years ago
wanghaoshuang 153d4f5d15 test=develop
7 years ago
wanghaoshuang 5d7395cd0f Fix warning of roi perspective transform op.
7 years ago
Yan Chunwei 9e8d372ff4
hide attention lstm fuse (#13615)
7 years ago
Dang Qingqing f7bd1761a0 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into quantize_transpiler_update
7 years ago
Tao Luo e9bc5faab1
Merge pull request #13614 from luotao1/gen_capi
7 years ago
Wu Yi 10dffc68cd
Merge pull request #13618 from typhoonzero/revert_13530
7 years ago
sneaxiy f1a08a3bab test=develop
7 years ago
tangwei12 97cf1eb6d7
Add distributed unit tests about text_classification/simnet-bow/ctr (#12812)
7 years ago
typhoonzero a4f7696a18 Revert "Some trivial optimization (#13530)"
7 years ago
tangwei12 85362e98dd
Batch AUC (#13567)
7 years ago
tensor-tang 6938e6cf06
Merge pull request #13603 from tensor-tang/refine/peephole
7 years ago
Zhaolong Xing 9b03d53543
Merge pull request #13469 from NHZlX/add_ut_for_trt
7 years ago
Tao Luo 1e46c91a1b change the install prefix for capi
7 years ago
Wu Yi 16e73e0d28
hide operator API (#12543)
7 years ago
tensor-tang eeff268a6c clean and refine kernels
7 years ago
tensor-tang dee5d35c20 refine vmul
7 years ago
tensor-tang 209e9c3db1 refine peephole
7 years ago
JiabinYang 2d35fec233 test=develop
7 years ago
JiabinYang 87501e1a1c Add mutable proc for mac run test
7 years ago
velconia 44c7beb0a6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into mac_py3
7 years ago
velconia 671a948226 Add python3.5 support for mac
7 years ago
Dang Qingqing d94920ce6f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into quantize_transpiler_update
7 years ago
chengduo 1d91a49d2f
Some trivial optimization (#13530)
7 years ago
ktlichkid 5093afced1 Fix bug in sequence_slice_op
7 years ago
Xin Pan ebf9171df3
Merge pull request #13532 from panyx0718/infer
7 years ago
nhzlx 6c81230683 update code for config change
7 years ago
tensor-tang 92031968d7 init vmul kernel
7 years ago
tensor-tang b9acbcc8c5 init lstm kernel
7 years ago
tensor-tang c260bf942d init jit kernel
7 years ago
nhzlx 5c57e15044 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_ut_for_trt
7 years ago
Tao Luo f67483bf3b add seq_conv UT (#13517)
7 years ago
Tao Luo c07b2a97a9
Merge pull request #13521 from Sand3r-/mgallus/fix-pooling-ceiled-size
7 years ago
Tao Luo d0000082c4
Merge pull request #13552 from sfraczek/sfraczek/conv-relu-update
7 years ago
dzhwinter cc20867d49
flags (#13542)
7 years ago
dzhwinter 7806c5625f
fix enforce (#13544)
7 years ago
Michal Gallus 0e6b303f54 MKLDNN Pooling: inline functions handling ceiled mode
7 years ago
minqiyang b1448ded40 Port clip and clip_by_norm op to nn and change API.sepc
7 years ago
Yu Yang 21bb9e91fc
Merge pull request #13201 from reyoung/revert_callstack
7 years ago
gongweibao be97c47efc merge
7 years ago
gongweibao 3dc54af2d3 merge
7 years ago
Michal Gallus f465b03ef9 Enable MKLDNN in Analysis Predictor
7 years ago
Xin Pan cbdf9833e3 hide create_passes_from_strategy for now
7 years ago
Wu Yi 3fa68dc101
mac ci make install fix (#13528)
7 years ago
Sylwester Fraczek e5d1bd1e93 remove unused variable nodes2delete
7 years ago
Sylwester Fraczek a49aa4dac9 make bias unnecessary for ConvRelu fuse
7 years ago
Sylwester Fraczek 493ef0c8df do not remove conv node just rewire the output
7 years ago
Sylwester Fraczek 667b661786 updated the test
7 years ago
Zeng Jinle 2cd558fb36
Merge pull request #13561 from sneaxiy/fix_api_kwargs
7 years ago
Yan Chunwei e426cdae32
fix inference output with lod (#13557)
7 years ago
Xin Pan bc1fa4fd6f
Merge pull request #13556 from panyx0718/doc
7 years ago
sneaxiy 48d82bd008 add out params
7 years ago
Dang Qingqing b7146d60e4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into quantize_transpiler_update
7 years ago
Yu Yang 46498bf1a0
Merge pull request #13508 from reyoung/hide_parameter
7 years ago
gongweibao 6ba86617ba Merge branch 'generator2' of https://github.com/gongweibao/Paddle into generator2
7 years ago
gongweibao 1113337764 merge
7 years ago
Xin Pan 7ba55aa294 fix CMAKE
7 years ago
Xin Pan 6974265292 support offline train
7 years ago
Yu Yang 606dfb13d5
Merge pull request #13442 from reyoung/feature/remove_trainer_api
7 years ago
Xin Pan f117feab0c modify comments
7 years ago
Yu Yang 7119d6c3cf Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_callstack
7 years ago
Xin Pan c2b3838cf5 add some comments
7 years ago
Yan Chunwei 5de14c6b96
refine inference api (#13518)
7 years ago
Wu Yi aeb2dc2b05
Nccl2 dist API (#13506)
7 years ago
dzhwinter c66a8d2cd8
add guide (#13332)
7 years ago
dzhwinter 24447ec517
flags (#13541)
7 years ago
dzhwinter 4fd5eb2255
"refine cmake" (#13546)
7 years ago
Xin Pan 0d9ee0dced fix resolve conflicts
7 years ago
Xin Pan b43e49fa31 fix
7 years ago
Xin Pan afc603c108 update API.spec
7 years ago
Xin Pan 36c2a9af27 pass builder allow cutomize pass in python.
7 years ago
Xin Pan 355a2265a0 update API.spec
7 years ago
Xin Pan eb1aeb175b
Merge pull request #13538 from baiyfbupt/softshrink
7 years ago
dzhwinter 97636a9fcf
"fix link error" (#13545)
7 years ago
Jiabin Yang efc2ac950c
Merge pull request #13527 from JiabinYang/mac/fix_mac_compile
7 years ago
Qiao Longfei bcc7bff12f
Merge pull request #13488 from jacquesqiao/fix-img_conv_group-doc
7 years ago
nhzlx baae7e4f63 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_ut_for_trt
7 years ago
baiyf 3986242e5e remove kwargs
7 years ago
Zeng Jinle a8f66365c9
Merge pull request #13524 from sneaxiy/fix_api_kwargs
7 years ago
sneaxiy 70e70d7d38 fix api.spec
7 years ago
nhzlx 2763321684 fix comments
7 years ago