Commit Graph

10520 Commits (5b5fa37fb98bfa05f23e5ad508f6dbf3e7ec9f93)

Author SHA1 Message Date
fengjiayi ce182d9037 bug fix
7 years ago
Xin Pan a2c0e52f3e speed up while_op
7 years ago
typhoonzero dd7a79158b add scope info in graphviz debug
7 years ago
tensor-tang 6f78fd7d1e fuse fc in gru
7 years ago
tensor-tang 300180cc26 init fusion gru op
7 years ago
Zhaolong Xing 21ba32b065 Merge pull request #12843 from NHZlX/fix_ssa_bug_for_trt
7 years ago
Michał Gallus cd32ddac12 Fuse Convolution and Eltwise Add into MKLDNN's Conv+Bias (#12669)
7 years ago
nhzlx c999895e93 merge develop
7 years ago
nhzlx 276950291a 1. fix ssa bug with batchnorm, 2. refine the trt
7 years ago
Yan Chunwei 896a37b6e3 fea/link ir to inference analysis and fc fuse support (#12789)
7 years ago
dzhwinter e23ddf6ae4 status (#12764)
7 years ago
Tao Luo d04ef276a5 Merge pull request #12745 from tensor-tang/refine/op/elewise_mul
7 years ago
tangwei12 cbc6e6eb97 Merge pull request #12247 from seiriosPlus/dis_ckpt_fix
7 years ago
Qiyang Min 72965226e6 Merge pull request #12818 from velconia/fix_python3_CI_job
7 years ago
minqiyang 656c77e712 Resume cicheck
7 years ago
minqiyang e1492f19e1 Change the sequence of ci check
7 years ago
tangwei12 44bade8b17 fix api spec
7 years ago
Zhaolong Xing 470335e8c4 Merge pull request #12786 from NHZlX/add_batch_norm_trt_converter
7 years ago
Qingsheng Li 3d11d018e0 Fix scatter_op python API (#12742)
7 years ago
nhzlx ff052c0e6f merge develop
7 years ago
nhzlx c6a5c4b0c0 add comments for execute in ut_helper
7 years ago
minqiyang 50d66a0790 Fix prelu_op
7 years ago
minqiyang beb93bb901 Fix ut bug for graph_test
7 years ago
Tao Luo 8f9f414a14 Merge pull request #12805 from tensor-tang/fix/op/elewise_add
7 years ago
tensor-tang e955361267 Merge pull request #12737 from tensor-tang/feature/op/fusion_lstm
7 years ago
tensor-tang 82bb9170fb Merge remote-tracking branch 'ups/develop' into fix/op/elewise_add
7 years ago
tangwei12 99f74be561 Merge pull request #12802 from seiriosPlus/inference_teeny_mistakes
7 years ago
Tao Luo 2ae885e224 Merge pull request #12811 from luotao1/tensorrt_compiler_bug
7 years ago
Chen Weihang 57b34d9196 Merge pull request #12808 from chenwhql/remove_inplace_param_in_squeeze_and_unsqueeze
7 years ago
Xin Pan daf464af68 Merge pull request #12807 from panyx0718/fix
7 years ago
luotao1 808e5b1748 fix tensorrt compiler bug
7 years ago
Yihua Xu 084d4a9e9e Optimize CRF Decoding with AVX/AVX2/AVX512F instruction (#12767)
7 years ago
fengjiayi 34b209cffa Complete sequence_padding GPU kernel
7 years ago
dzhwinter 00463fdfe3 cudnn windows support (#12757)
7 years ago
Xin Pan 4a4c469f61 add test
7 years ago
qingqing01 c62f68cb94 Fix bug in conditional_block_op. (#12246)
7 years ago
nhzlx 1bf9d9e90c fix comments
7 years ago
chenweihang bc471b6ac4 refactor: remove inplace parameter from squeeze and unsqueeze op
7 years ago
Xin Pan 7473d5f735 fix program_desc constructor
7 years ago
tensor-tang 0507f7bc3c fix SEGV elementwise add at debug mode
7 years ago
tangwei12 cfb12f09bf fix some teeny mistakes
7 years ago
Yu Yang c6af7201e9 Merge pull request #12692 from reyoung/feature/fast_executor
7 years ago
Xin Pan e525aa232e Merge pull request #12780 from panyx0718/ir4
7 years ago
Tao Luo 7decbaaa13 Merge pull request #12762 from luotao1/anakin_cuda_env
7 years ago
nhzlx 324dd16816 merge develop
7 years ago
yuyang18 b8029fd650 Follow comments
7 years ago
tangwei12 ca1e18c04a Merge pull request #12469 from seiriosPlus/sum_op_dim_fix
7 years ago
Xin Pan 1d3343240e fix
7 years ago
nhzlx 144b20c160 add batch norm op converter
7 years ago
nhzlx 14311bb094 merge develop
7 years ago
Zhaolong Xing e5674f6dde Merge pull request #12753 from NHZlX/add_benchmark
7 years ago
Zhaolong Xing 310708726b Merge pull request #12761 from NHZlX/global_pooling_trt
7 years ago
tensor-tang b090479409 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
nhzlx 1e92baf746 fix comments
7 years ago
Xin Pan 17b88811e0 fix ProgramToGraph
7 years ago
tangwei12 b4f52b01d0 bug fix when all inputs are empty
7 years ago
tangwei12 3efac174ea Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tangwei12 dbb4f0d35d Merge branch 'develop' of github.com:PaddlePaddle/Paddle into dis_ckpt_fix
7 years ago
Qiao Longfei fd10669ecb Add dependency to send recv (#12760)
7 years ago
nhzlx ce7f361a80 fix comments
7 years ago
Xin Pan a9217031ba small fix
7 years ago
fengjiayi 8d8d48a34f Complete sequence_pad_op and its CPU kernel. Add unittests
7 years ago
nhzlx df9cbabcee add pool2d test for global_pooling true
7 years ago
dzhwinter 2673798ddb "fix float16 ShuffleDownSync Bug" (#12756)
7 years ago
Yan Chunwei 6fe5547db7 switch NodeAttr to boost::varient (#12539)
7 years ago
Chen Weihang 535a6e9206 Merge pull request #12509 from JiabinYang/scripts0802
7 years ago
nhzlx 133ec69625 add batch norm trt converter
7 years ago
tangwei12 7c12c0f865 add sync in load selectedrows
7 years ago
luotao1 413bf9d494 disable anakin when cuda < 8.0 or cudnn < 7.0
7 years ago
Michal Gallus 4a7f0698e0 Add consts to new MKLDNN integration
7 years ago
Michal Gallus 6588d0e039 Update MKLDNN to 0.15, fix conv integration
7 years ago
tangwei12 9f11db4080 add todo in impl
7 years ago
tangwei12 40febec402 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into dis_ckpt_fix
7 years ago
tangwei12 c24a9263ba Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
Qiao Longfei 03d4c7efd3 add rw lock test (#12752)
7 years ago
dzhwinter f36818d532 "windows testing easier" (#12739)
7 years ago
nhzlx 2bdd20be22 add support for global pooling for trt
7 years ago
tangwei12 ac9ae97001 code fix
7 years ago
nhzlx f55e8901c8 merge develop
7 years ago
nhzlx 1600ba86f6 1. change tensorrt op from cpu to gpu
7 years ago
tangwei12 bb9f494740 merge develop
7 years ago
tangwei12 eba7177475 add unit test and code fix
7 years ago
dzhwinter 4069262f0e Revert ""cherry picked operators changes" (#12184)" (#12747)
7 years ago
Qiao Longfei 653fad08f8 Optimize selected rows for dist lookup table with pthread rwlock (#12635)
7 years ago
Qiao Longfei 64d48f4d6a fix mac compile (#12751)
7 years ago
fengjiayi 3c749fae43 update CPU sequence_padding functor
7 years ago
tensor-tang 92890ac258 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tangwei12 0749c8822d Merge pull request #12556 from seiriosPlus/samplingIdOp
7 years ago
Qiyang Min 340a104c58 Merge pull request #12658 from velconia/port_pybind11
7 years ago
tensor-tang a56142c155 optimize elementwise_mul cpu forward
7 years ago
tensor-tang 6644ce79a5 add mklml vmul
7 years ago
tensor-tang ff92b6ba81 Merge pull request #12531 from tensor-tang/refine/op/gru
7 years ago
tangwei12 26b228e405 remove assignment and add vlog
7 years ago
Chen Weihang d4d8f83137 Merge pull request #12633 from chenwhql/demangle_type_name
7 years ago
Chen Weihang 1e961b145c Merge pull request #12591 from chenwhql/enforce_msg_polish
7 years ago
tangwei12 125e9166e1 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into sum_op_dim_fix
7 years ago
tensor-tang a72f68f223 Merge remote-tracking branch 'ups/develop' into feature/op/fusion_lstm
7 years ago
tensor-tang df28a3b452 fix lod and op test
7 years ago
Tao Luo 17da113c87 Merge pull request #12693 from luotao1/anakin_bug
7 years ago
Qingsheng Li 317e18abd2 Remove Data Sharing between input and output in scatter_op (#12672)
7 years ago
tensor-tang f3cd2612ae refine fc and use the fc compute in fusion_lstm
7 years ago
qingqing01 c44fb00371 Add name in relu and log API. (#12438)
7 years ago
luotao1 9f3789944c use latest anakin commit
7 years ago
tangwei12 822496f626 merge cpu and gpu
7 years ago
dzhwinter bf3c34960f "cherry picked operators changes" (#12184)
7 years ago
tensor-tang 40138c4cd6 add unit test of fusion lstm op
7 years ago
jerrywgz c108376506 Add three modes for prelu_op (#12630)
7 years ago
tangwei12 9f09d68678 add enforce
7 years ago
gongweibao d06849305a parameter dispather. (#12666)
7 years ago
tensor-tang 852bc6f4aa refine fusion lstm op doc
7 years ago
tensor-tang 8f9132959e fuse fc in lstm
7 years ago
tensor-tang ddb05dffb6 init fusion lstm op
7 years ago
tensor-tang efc5392d97 Merge pull request #12676 from tensor-tang/refine/op/fc
7 years ago
minqiyang a32ce8c444 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
Yan Chunwei 5d2834fcf7 fea/ir support fuse, based on graph pattern detection helper (#12636)
7 years ago
tangwei12 470fb7c5c3 bug fix
7 years ago
minqiyang 0d7047ca79 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
yuyang18 d1d825ee02 Hide unnecessary API
7 years ago
yuyang18 265302edea Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/fast_executor
7 years ago
tangwei12 60dda7bf9f add gpu Implementation
7 years ago
tangwei12 4661f5589d random optimize
7 years ago
Wu Yi bd87f67f0e Dist transpile can pass startup program by argument (#12606)
7 years ago
Bai Yifan 9333a62792 Add flatten op interface and enhance APIs about detection to support variable-length image. (#12422)
7 years ago
tensor-tang eee38464dc refine fc op use cpu only
7 years ago
tangwei12 ed937bc6f8 merge
7 years ago
fengjiayi f276006f0c Merge pull request #12694 from JiayiFeng/dev_op_tensor_support
7 years ago
Yu Yang a197737c02 Merge pull request #12690 from reyoung/feature/better_exception_holder
7 years ago
Yan Chunwei e765dead86 add profiler to fluid inference (#12707)
7 years ago
tensor-tang d84a1a0010 fc op use cpu only
7 years ago
tensor-tang fbc164047d Merge remote-tracking branch 'ups/develop' into refine/op/fc
7 years ago
Xin Pan d96ee24f0b Merge pull request #12697 from panyx0718/ir2
7 years ago
minqiyang 77f12e000f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
tangwei12 f56102505a add pserver_endpoints args in load_inference_model
7 years ago
fengjiayi a38a8db928 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_sequence_padding_op
7 years ago
tangwei12 478f73c188 merge header in cc
7 years ago
fengjiayi d6b5302bd6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
Yan Chunwei 0a641ba326 add ratio to profiler (#12701)
7 years ago
tensor-tang c588c64a76 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
tensor-tang 0098a494a2 Merge remote-tracking branch 'ups/develop' into refine/op/fc
7 years ago
gongweibao 842fb021b3 Fix clone() bug. (#12583)
7 years ago
Qiao Longfei 5d579e1a96 add export_for_deployment flag to save_inference_model (#12582)
7 years ago
chenweihang 7797e55f42 use paddle::platform::demangle
7 years ago
minqiyang e0d5f8a820 Move compat module to python/paddle
7 years ago
chenweihang da39d84a48 refine by reviewer's advice
7 years ago
Xin Pan 891c3c0f9a test and doc IR Graph
7 years ago
minqiyang 7e0f66e99a Polish code
7 years ago
minqiyang 5338417b47 Polish code style
7 years ago
minqiyang ae39709e59 Polish code
7 years ago
minqiyang 55d7f55c63 Revert the changes to attribute.h
7 years ago
fengjiayi 5e7aa8c7e5 code clean
7 years ago
chenweihang 21d5b94228 error message refine: add demangle api to attribute type
7 years ago
minqiyang 1800fef142 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
tensor-tang 742300baa8 fix unkown omp pragmas
7 years ago
yuyang18 05cadf1b24 Add FastExecutor
7 years ago
tensor-tang b9dbb7c5cb fix bias attri in mkldnn fc
7 years ago
yuyang18 c6eb7a89ff Merge branch 'feature/better_exception_holder' into feature/fast_executor
7 years ago
yuyang18 aac80ef4cc Polish API of exception holder
7 years ago
yuyang18 d49763a87d Stash
7 years ago
tangwei12 59580a7f69 bug fix
7 years ago
Zhaolong Xing 83c85f34e8 Merge pull request #12598 from NHZlX/add_tensorrt_softmax
7 years ago
Tao Luo 1e1974c998 Merge pull request #12563 from luotao1/anakin_test
7 years ago
tensor-tang a85bf42ae4 Merge pull request #12681 from PaddlePaddle/revert-12554-refine_elementwise_add
7 years ago
tensor-tang 4b5986bb77 enable fc op in normal case
7 years ago
Wu Yi 8b77448d5f hide misc APIs (#12540)
7 years ago
tensor-tang e133df6037 enable native fc forward
7 years ago
tensor-tang 6a2a9a8350 Revert "Refine elementwise_add op"
7 years ago
minqiyang 68b221401d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
Yu Yang 8dda526a45 Merge pull request #12659 from sneaxiy/refine_softmax_with_cross_entropy
7 years ago
sneaxiy f6f5cdaa05 Merge pull request #12555 from sneaxiy/refine_layer_norm
7 years ago
sneaxiy c50c537732 fix arithmetic error in backward kernel
7 years ago
tensor-tang 038cbf799d add bias for fc op
7 years ago
whs 9d6243b6fb Fix crop op. (#12603)
7 years ago
Bai Yifan 649f5d74f0 fix mine_hard_example bug (#12664)
7 years ago
Tao Luo 51cc80cca0 Merge pull request #12662 from tensor-tang/fix/xbyak
7 years ago
sneaxiy 2d9508f8f3 Merge pull request #12554 from sneaxiy/refine_elementwise_add
7 years ago
tensor-tang 171a0e2b42 add some comment
7 years ago
tensor-tang 1ab1d03c62 fix missing macro condition
7 years ago
Xin Pan 6b45c5a134 Merge pull request #12605 from panyx0718/ir
7 years ago
sneaxiy 2c560623d1 fix dependency error
7 years ago
Qiao Longfei 331151f065 Merge pull request #12647 from jacquesqiao/add-RPCServerProfiler
7 years ago
Qiao Longfei e8fcb71bed Merge pull request #12620 from jacquesqiao/timeline-support-pure-cpu
7 years ago
minqiyang e4057d071b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_pybind11
7 years ago
tensor-tang 5377edd282 refine packed condition
7 years ago
tensor-tang 3bf3e77ac8 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
qiaolongfei 5a6c3cd9e0 fix profiler dead lock
7 years ago
Tao Luo 16b65c559d Merge pull request #12646 from tensor-tang/feature/jit/xbyak
7 years ago
chengduo 64824ac73f Add write after write dependence (#12632)
7 years ago
qiaolongfei c0890988da add RPCServerProfiler, replace listen and serv optimizer
7 years ago
tensor-tang a50889f523 introduce xbyak
7 years ago
qiaolongfei 3f2aa91970 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu
7 years ago
tangwei12 64a4925cb4 Merge branch 'Pdv' into samplingIdOp
7 years ago
tangwei12 0bfd62be3d remove gpu supported, will add it later
7 years ago
luotao1 2ea110cd4a Merge branch 'develop' into anakin_test
7 years ago
luotao1 a222d336ca modify the anakin_model download dir
7 years ago
luotao1 22bc328951 fix anakin-NOTFOUND compiler error
7 years ago
luotao1 b2367f3661 update anakin.cmake
7 years ago
Qiyang Min 29fac3c092 Merge pull request #12390 from velconia/port_python3_syntax
7 years ago
Tao Luo 5a9ae411e0 Merge pull request #12618 from sfraczek/sfraczek/fix-new-mkldnn-conv-tests
7 years ago
qiaolongfei d080d3e694 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu
7 years ago
sneaxiy cf799a6a04 Merge pull request #12553 from sneaxiy/refine_softmax_with_cross_entropy
7 years ago
xzl 29ad9794bb Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_softmax
7 years ago
luotao1 f4bcee1d6f Merge branch 'develop' into anakin_test
7 years ago
luotao1 94042ccd2d add comment
7 years ago
dzhwinter 8499559c42 "fix style" (#12600)
7 years ago
sneaxiy 010883689c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_layer_norm
7 years ago
minqiyang bc12c2c616 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
sneaxiy 5d698589ce Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_elementwise_add
7 years ago
sneaxiy 19ff254d05 Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add
7 years ago
qiaolongfei e008600b08 optimize code
7 years ago
Yan Chunwei 7555cfe33a fix inference double free bug (#12613)
7 years ago
Zhaolong Xing 5dc57b71ee Merge pull request #12593 from NHZlX/filter_redundant_output
7 years ago
Luo Tao 64c0ba288a fix inference_lib_dist error
7 years ago
qiaolongfei 7c649e06c3 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu
7 years ago
minqiyang 09103084d3 Polish compat.py and add unittest for it
7 years ago
Sylwester Fraczek d74bb6ab9c fix ut for mkldnn 0.15 - added forcing layout NCHW in mkldnn conv tests
7 years ago
Xin Pan 626abfc33a code clean up and renaming
7 years ago
Qiao Longfei c1446342ff Merge pull request #12577 from jacquesqiao/optimize-vlog-before-and-after-op-run
7 years ago
minqiyang c3fdf3aee4 Fix divide problem in CI
7 years ago
fengjiayi 855c9e3311 clean softmax_op code
7 years ago
fengjiayi 24d51de022 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
fengjiayi 27df3a9f2b make cross_entropy_op supporting tensors
7 years ago
Chen Weihang b2435a3a11 Merge pull request #12374 from chenwhql/py_calc_memory
7 years ago
fengjiayi 66be53264e Merge pull request #12592 from JiayiFeng/fix_mac_compile_error
7 years ago
chenweihang b1dd4149b9 adjust enforce test cases
7 years ago
Yu Yang cb79b0233e Merge pull request #12595 from reyoung/fix_scale_loss_with_memopt
7 years ago
nhzlx 641f32da8c add softmax op converter
7 years ago
nhzlx 943950c190 refine graph draw
7 years ago
Yu Yang c4f8afa258 Fix bug when memopt optimize loss.grad and use ParallelExecutor
7 years ago
nhzlx 7a019cd608 merge develop
7 years ago
nhzlx e823ce68bb filter redundant output
7 years ago
fengjiayi 8e604a10aa fix mac compile error
7 years ago
nhzlx 551c802cdc merge develop
7 years ago
nhzlx c69ae865db fix comments
7 years ago
Luo Tao e8aa6d1283 add anakin compiler from github source code
7 years ago
chenweihang 61052cdbc6 polish high frequency enforce error message
7 years ago
sneaxiy ad45d39222 refine layer_norm
7 years ago
chengduo 7c8b69c700 Feature/op fusion (#12240)
7 years ago
sneaxiy 1b4515f6db refine softmax_with_cross_entropy
7 years ago
nhzlx 8f9e704f94 merge develop
7 years ago
nhzlx 3a0caf801f modify trt engine op test
7 years ago
nhzlx e51d045a6d modify trt engine op test
7 years ago
Luo Tao 21b4d90ab9 Merge branch 'develop' into anakin_test
7 years ago
qiaolongfei b4d48531e4 optimize vlog before and after op run, move into op.run
7 years ago
Qiao Longfei 88e47e1e2d Merge pull request #12570 from jacquesqiao/add-flag-to-disable-inference
7 years ago
nhzlx e8954a36f5 merge develop
7 years ago
nhzlx 32a9e050bc mapping the variable name inside the subgraph
7 years ago
minqiyang 6abe819f07 Fix pybind11 problem
7 years ago
Wu Yi 2d036c47cd polish dist unit test code (#12512)
7 years ago
qiaolongfei 9331ba752f add WITH_INFERENCE flag
7 years ago
chengduo 97a77512b4 Fix the order of sum (#12562)
7 years ago
fengjiayi 7834b4a470 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_op_tensor_support
7 years ago
Luo Tao cf74473244 make inference_anakin_test SERIAL
7 years ago
Jeff Wang 4713f0a9e4 Simplify the travis script. (#12557)
7 years ago
tangwei12 5bfdefae91 Merge branch 'Pdv' into samplingIdOp
7 years ago
tangwei12 b30bdde15a random optimize
7 years ago
tangwei12 9c63fef63c random optimize
7 years ago
Qiao Longfei 88a607c342 Merge pull request #12541 from jacquesqiao/optimize-profiler
7 years ago
tangwei12 5b9716d1f6 add dims check
7 years ago
tangwei12 4cd504d3b4 bug fix
7 years ago
sneaxiy e57bc4d745 Merge branch 'refine_elementwise_add' of https://github.com/sneaxiy/Paddle into refine_elementwise_add
7 years ago
qiaolongfei 954d680b40 fix test_parallel_do.py
7 years ago
sneaxiy 222fbbedfb Merge branch 'develop' into refine_elementwise_add
7 years ago
sneaxiy 4b83afff6e Merge branch 'develop' into refine_elementwise_add
7 years ago
sneaxiy b2d0ee5159 refine elementwise_add op
7 years ago
tangwei12 da2cc99f67 sampling op optimize
7 years ago
Tao Luo 0fd2f713a4 Merge pull request #12548 from Superjomn/bugfix/disable-anakin-test
7 years ago
fengjiayi 7c55e08c93 stash
7 years ago
superjomn ebe1920626 add comment
7 years ago
superjomn 3c5e15de03 disable anakin test
7 years ago
tangwei12 4973e07be3 sampling op optimize
7 years ago
tensor-tang 836068569f Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
tensor-tang 18c322c2a1 seperate cpu and gpu implementations for gru kernel compute
7 years ago
tensor-tang 54c95e49f0 fix blas
7 years ago
fengjiayi b656d97e86 Merge pull request #12485 from JiayiFeng/dev_ops_tensor_support
7 years ago
qiaolongfei 52576c5f38 revert inference
7 years ago
qiaolongfei 1623f1ba4f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-profiler
7 years ago
tangwei12 3206970b77 sampling op rename
7 years ago
qiaolongfei 903b2c0162 optimize code
7 years ago
Xin Pan 99a77cfc62 Merge pull request #12468 from panyx0718/improve_profiler2
7 years ago
qiaolongfei 4c5bcd7859 add guard to profiler
7 years ago
qiaolongfei d553e2ff3f revert inference
7 years ago
qiaolongfei a3f9d6a38c optimize profiler
7 years ago
tangwei12 e0ab2f7158 new sampling op
7 years ago
minqiyang a58dd3e557 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
tensor-tang 8c23f7c4f0 fix blas and use packed weight
7 years ago
tensor-tang d9cc6b1866 replace gru compute with details
7 years ago
tensor-tang 43cee33a23 add mkl packed gemm
7 years ago
minqiyang f9ef0ee8a9 Polish code
7 years ago
minqiyang c4d000a990 Make code more efficient
7 years ago
JiabinYang 4af5d3e3d3 fix the paddle script causes 'command not found' error'
7 years ago
minqiyang 9812bb8b48 Fix pserver can NOT start with DebugString problem
7 years ago
tangwei12 766ac488ac sum_op selectedRows dim bug fix
7 years ago
Zhaolong Xing d7dd0868db Merge pull request #12449 from NHZlX/add_tensorrt_elementwise_add
7 years ago
nhzlx d50f776b27 merge develop
7 years ago
Bai Yifan 900d61dd98 Clean python api (#12406)
7 years ago
dzhwinter 0c8fde7dce "cherry picked cpp tests" (#12182)
7 years ago
dzhwinter 595a2c83ae explicit gradient of elementwise_add/elementwise_sub (#11970)
7 years ago
nhzlx 64a08f840f increase the test batch
7 years ago
Zhaolong Xing f37f875f1f Merge pull request #12349 from NHZlX/add_tensorrt_conv2d_converter
7 years ago
Zhaolong Xing 7e6bac3ea6 Merge pull request #12479 from NHZlX/fix_gtest_test_eq_warning
7 years ago
fengjiayi e7d8e16a66 update softmax_mkldnn_op
7 years ago
nhzlx c7e6a11bc1 merge develop
7 years ago
nhzlx 0015df1b12 modify op converter for conv2d
7 years ago
Yu Yang 2567afa35d Merge pull request #12462 from reyoung/feature/fix_cudnn_deterministic
7 years ago
fengjiayi dc111d3476 update softmax_cudnn_op
7 years ago
nhzlx 66406619ec merge develop
7 years ago
nhzlx a2749adf5d fix warning
7 years ago
fengjiayi f7bd0b227b Add unittests for softmax_op
7 years ago
gongweibao 819ac3df0a Modify style (#12465)
7 years ago
cuichaowen 046de2acdb Improve anakin feature (#11961)
7 years ago
fengjiayi b314a69523 make softmax supporting tensors
7 years ago
fengjiayi b1af7e5d9b Add unittests for lookup_table_op
7 years ago
tangwei12 c4c8f60bec sum_op selectedRows dim bug fix
7 years ago
Xin Pan 486345551d clean
7 years ago
Xin Pan caf10b474f make profiler use thread_id from g_thread_id
7 years ago
nhzlx c13efe02d9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_elementwise_add
7 years ago
nhzlx a5c96af33c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_conv2d_converter
7 years ago
Yu Yang 040fc1c39b Fix bug in cudnn_determistic
7 years ago
fengjiayi 7efdf05ac2 make look_up_op supporting tensor ids
7 years ago
Tao Luo baff71d504 Merge pull request #12460 from luotao1/small_tgz
7 years ago
Yan Chunwei dcfbc6a661 inference analyzer as bin (#12450)
7 years ago
Yan Chunwei 31a2c87688 fea/lightly support lod (#12451)
7 years ago
fengjiayi 38863a2c9d Merge pull request #12454 from JiayiFeng/dev_exception_holder
7 years ago
Qiao Longfei 690625fe15 Merge pull request #12456 from jacquesqiao/add-profiler-to-pserver
7 years ago
dzhwinter 6d3da458a7 Fix/float16 style (#12446)
7 years ago
yuyang18 59c900e1e9 Update API.spec
7 years ago
Luo Tao 5e6f7bc569 compress the fluid.tgz
7 years ago
fengjiayi bc1b7b96ec Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_exception_holder
7 years ago
qiaolongfei 7e46a8d172 fix logical bug, optimize code
7 years ago
qiaolongfei d04dca3798 revert cmakelist
7 years ago
qiaolongfei 0b62f61d29 add init flag in __init__.py for listen_and_serv_profile_period
7 years ago
dzhwinter 91fb0156ca Memory/reshape op (#12414)
7 years ago
qiaolongfei b4496ee442 Merge branch 'fix-mac-build-graph_executor' of ssh://github.com/jacquesqiao/Paddle into add-profiler-to-pserver
7 years ago
qiaolongfei c8c8c01a23 fix mac build of graph_executor
7 years ago
qiaolongfei 0b861bbca9 add profiler for listen_and_serv op
7 years ago
Zhaolong Xing 7ae73e33da Merge pull request #12432 from Superjomn/fea/analysis-ssa
7 years ago
fengjiayi 3e4083ed1f Make exception handling of threaded_ssa_graph_executor an independent class
7 years ago
tensor-tang 059b27840c Merge pull request #12408 from tensor-tang/refine/im2col
7 years ago
Superjomn 15c2f1abb3 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fea/analysis-ssa
7 years ago
nhzlx b241a47e8e merge develop
7 years ago
nhzlx 5fcdd81da7 tiny modify
7 years ago
minqiyang ce4eba3b0d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
qiaolongfei 236fc1bd38 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-test-for-split-ids-op
7 years ago
qingqing01 f372f27e3f Hidden APIs for While, StaticRNN, ParallelDo. (#12332)
7 years ago
minqiyang 000ba1ac5f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
Xin Pan 4b8ae523c4 Merge pull request #12367 from panyx0718/ir_pass
7 years ago
nhzlx f05c7fb8ae Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_conv2d_converter
7 years ago
nhzlx 6f6d552790 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_conv2d_converter
7 years ago
Qiao Longfei 297cbeb1c6 Merge pull request #12439 from jacquesqiao/CheckTensorNANOrInf-support-selectedrows
7 years ago
dzhwinter 39ac9e39c2 float16 type support enhance (#12181)
7 years ago
qiaolongfei 3033841b4a CheckTensorNANOrInf support checking SelectedRows
7 years ago
qiaolongfei 147bf00ffe clear mutable rows for the output of split_ids_op
7 years ago
qiaolongfei 91b114a787 change map to unordered_map
7 years ago
tensor-tang d8d2dbcfac further optimize im2col using variables
7 years ago
Superjomn 4d2405d851 inference analysis support ssa
7 years ago
qiaolongfei 91f63cd401 fix split_ids_op and add unit test
7 years ago
tensor-tang 5373fe29c2 Merge remote-tracking branch 'ups/develop' into refine/im2col
7 years ago
Xin Pan 02c31458bb Merge pull request #12417 from panyx0718/add_dist_deps
7 years ago
Xin Pan 25706d0868 properly set up dep of concat and fetch_bar
7 years ago
minqiyang e96fef2cf7 Fix inference api impl deps
7 years ago
Xin Pan 4abcb1b8e7 Merge pull request #12409 from panyx0718/add_dist_deps
7 years ago
Qiyang Min 7da453630e Merge pull request #12403 from velconia/fix_hang_up
7 years ago
Xin Pan 398cfb47b1 disable dist_se_resnext since it's not stable yet.
7 years ago
Tao Luo 5a634786af Merge pull request #12312 from luotao1/unify
7 years ago
Bai Yifan e12b1d1792 Add flatten op (#12341)
7 years ago
Luo Tao 062556f938 Merge branch 'develop' into unify
7 years ago
Xin Pan 5fff8d7a55 add distributed training deps.
7 years ago
nhzlx 98948b975e wrong added file
7 years ago
nhzlx 830aa12c1a add elementwise init code
7 years ago
chengduo 2409d0f710 Refine regularization for selected_rows (#12369)
7 years ago
Zhaolong Xing 85c4912755 Merge pull request #12355 from NHZlX/add_tensorrt_pooling_converter
7 years ago
tensor-tang 5bea9c148c Merge pull request #12397 from tensor-tang/refine/num_threads
7 years ago
tensor-tang 687a322267 Merge remote-tracking branch 'ups/develop' into refine/im2col
7 years ago
tensor-tang 65d418f060 complete im2col with padding==1 and speedup filter width==1
7 years ago
minqiyang 053540e199 Add volatile to stopped_ member
7 years ago
tensor-tang 4f0383f52e fix unknown flag
7 years ago
minqiyang 0c7d6eb8b2 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into port_python3_syntax
7 years ago
minqiyang b78ffde6d5 Add stopped sign for grpc client
7 years ago
fengjiayi ec4c6e1f7c Merge pull request #12384 from JiayiFeng/dev_update_save_inference_model
7 years ago
tensor-tang 9788e5ab87 add flags to control num_threads
7 years ago
tensor-tang 10a1c2bb86 control omp num_threads
7 years ago
Xin Pan 99c0c20468 add pass test
7 years ago
tensor-tang 52eb86e30f refine im2col benchmark
7 years ago
tensor-tang 3017f46076 add more test cases
7 years ago
typhoonzero 54e9fd3f61 fix cudnn enforce
7 years ago
tensor-tang 8d6be4fb5f refine im2col test and add benchmark
7 years ago
minqiyang 559d36328c Apply 2to3 to current paddle main python code
7 years ago
tensor-tang 507c143047 im2col cfo cpu code clean
7 years ago
fengjiayi 604bd85a45 update inference_optimize()
7 years ago
Xin Pan 12e9bf6c17 clean up
7 years ago
Xin Pan ab72d28a5e clean up and correctness check
7 years ago
tensor-tang 4eeed0b5e4 refine width padding and enable core copy
7 years ago
Tao Luo 3ade95d0db Merge pull request #12379 from luotao1/demo_ci_fix
7 years ago
fengjiayi 0d43594d16 Merge pull request #12364 from JiayiFeng/dev_add_FLAG_free_idle_memory
7 years ago
Wu Yi 73fcfc06ec refine conv cudnn enforce (#12353)
7 years ago
Xin Pan aa1085ddc5 all passes
7 years ago
nhzlx fb204fbfbe Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_pooling_converter
7 years ago
nhzlx 4f71a3b12b fix a bug
7 years ago
Luo Tao 83e59257d0 fix manylinux1 Failed to publish artifacts
7 years ago
Xin Pan e4d7d7ae8f pass refactoring
7 years ago
tensor-tang e3131e2d73 enable width padding
7 years ago
Xin Pan 142e832d21 pass registration
7 years ago
Xin Pan 5b183557f3 graph viz pass
7 years ago
qiaolongfei 64e7902530 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into parallel-executor-support-prefetch
7 years ago
Xin Pan d7e08c53c2 Merge pull request #12169 from panyx0718/ir_graph_sort
7 years ago
tensor-tang 92518c519f reuse sizes saving time
7 years ago
tensor-tang 660df122ce enable padding!=0 and fill height padding with 0
7 years ago
tensor-tang d8e00facf7 reuse im_size
7 years ago
tensor-tang 179dd0cb8a Merge pull request #12337 from tensor-tang/refine/im2col
7 years ago
nhzlx c8adfb3451 add paddle_enforce
7 years ago
nhzlx 5533400720 fix comments
7 years ago
fengjiayi fd2d2c66e9 add flag to prevent unnessary memory free
7 years ago
qiaolongfei e7eeb19f90 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into parallel-executor-support-prefetch
7 years ago
Qiao Longfei 2d21aa76c7 Merge pull request #12331 from jacquesqiao/fix-mixed-tensor
7 years ago
Luo Tao 5ba4337698 unify libpaddle_inference_api into libpaddle_fluid
7 years ago
nhzlx 01566fb61b 1. support mutil batch utest 2. support pool op
7 years ago
qiaolongfei 754e96a30c distribute lookup table work with parallel executor
7 years ago
qiaolongfei 65e5aebd43 fix mixed_vector_test
7 years ago
nhzlx 990741aa85 add weight's dim assert
7 years ago
nhzlx 21890ca0cf Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_tensorrt_pooling_converter
7 years ago
qiaolongfei da035fc674 remove explicit for compile problem
7 years ago
tensor-tang 7b63b85086 fix mismatch of infer api (#12342)
7 years ago
tensor-tang b72befc5cc reuse copy size
7 years ago
Yancey 6133efd9ed Merge pull request #12218 from Yancey1989/rpc_complete_interface
7 years ago
qiaolongfei c6fb163571 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-mixed-tensor
7 years ago
nhzlx fc41eb40b1 add conv2d trt converter
7 years ago
qingqing01 24bea40116 Hiden some LoDTensor ralated ops' Python wrapper. (#12230)
7 years ago
Zhaolong Xing 6169d724b9 Merge pull request #12324 from NHZlX/enhance_for_tensorrt_infer
7 years ago
nhzlx 4d49e61ab8 fix comments
7 years ago
qiaolongfei 18d539e82a Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-mixed-tensor
7 years ago
Wu Yi 9f0d9dffe6 hide variable API (#12307)
7 years ago
tensor-tang 6788af4bf1 refine test cases
7 years ago
tensor-tang b163e601b6 add gtest
7 years ago
Yu Yang 7c046ae772 Merge pull request #12323 from reyoung/feature/polish_reshape_and_lod_tensor_blocking_queue
7 years ago
nhzlx bcd67bdd71 add assert for GetOutput
7 years ago
qiaolongfei 5022b14de8 fix mixed tensor compile and add cpu unit test
7 years ago
tensor-tang aae994fd26 refine im2col no padding
7 years ago
Yancey1989 fb06ed7bdc Merge branch 'develop' of github.com:PaddlePaddle/Paddle into rpc_complete_interface
7 years ago
nhzlx 7382f98600 1. set ut batch > 1 2. readd the mul op(utest will be added later)
7 years ago
nhzlx bd64979fe9 the argument should not be a const one
7 years ago
Yu Yang 21387e3c2a Tiny refines for lod_tensor_blocking_queue and reshape_op
7 years ago
nhzlx f42ea48996 deal with conflict
7 years ago
nhzlx 940f5dbcac modify the tensorrt engine op to adapt to chage
7 years ago
nhzlx 82527696e7 1. we delelte mul op, 2.modify fc and action op 3. modify the test inferface
7 years ago
nhzlx 2372daff1d there is no batchsize concept in tensorrt's tensor
7 years ago
qiaolongfei 35d09abd01 add profiler for demo_trainer
7 years ago
qiaolongfei a6d30a8607 profiler support cpu
7 years ago
Yan Chunwei 02cf54d331 bugfix lod cpu performance (#12297)
7 years ago
Qiao Longfei b41f8b9d42 Merge pull request #12295 from jacquesqiao/speedup-reduce-sum-grad-op
7 years ago
Xin Pan 5173a53c8a fix reorder issue.
7 years ago