Commit Graph

7208 Commits (59f75ec76e8fea156e97bea8739bb3bd4e27bf87)

Author SHA1 Message Date
Aurelius84 2d1e76fb0c fix API.spec test=develop
6 years ago
luotao1 6ce25c99a0 Merge branch 'develop' into runtime_context
6 years ago
Aurelius84 6cfd20dea8 fix words spell error test=develop
6 years ago
qingqing01 8ad672a287 Support sync batch norm. (#16121)
6 years ago
shippingwang 98d9552f0f update sqrt explaination, test=develop
6 years ago
minqiyang ca392c7e97 Implement infer var type context
6 years ago
Yibing Liu 4ae23cc3c5 Impl fp16 compute kernel for slice_op (#16206)
6 years ago
sneaxiy f0d108f589 fix const_cast
6 years ago
Dang Qingqing e5e7628a62 Skip compile infer shape in box_coder_op
6 years ago
Aurelius84 a59b7d47a8 improve layers.fc api doc test=develop
6 years ago
sneaxiy 3e03695629 fix numeric error
6 years ago
sneaxiy 5a92e4c097 revert revert 16144
6 years ago
sneaxiy e993effb29 include unordered_map to cross_entropy_op.cc
6 years ago
Zeng Jinle a91964c8fe Revert "PaddingRNN model memory optimize"
6 years ago
liuwei1031 1c6caf8466 1. disable reuse SELECTED_ROWS type variable (#16150)
6 years ago
Wojciech Uss b9252f3df8 Add cpu_quantize_squash_pass for C-API quantization (#16128)
6 years ago
minqiyang f83739499c Polish code
6 years ago
minqiyang 7355d41834 1. Add imperative gperf profiler
6 years ago
Zeng Jinle 0b49e43d3a Merge pull request #16144 from sneaxiy/rnn_mem_opt
6 years ago
luotao1 b2898c0f57 Merge branch 'develop' into runtime_context
6 years ago
minqiyang 98dfb492bb Release GIL lock
6 years ago
sneaxiy ac0e0f5181 merge develop
6 years ago
sneaxiy a7a4f053da Merge develop
6 years ago
Tao Luo 4ef6f738c3 Merge pull request #16154 from luotao1/infershape_example
6 years ago
minqiyang 42e96a029f Accelerate CPU part
6 years ago
sneaxiy 487624e15d fix travis-ci
6 years ago
sneaxiy 0279020ba6 Merge develop
6 years ago
luotao1 1510b866b6 turn off runtime_context_cache for tensorrt
6 years ago
guomingz decdbed054 resolve #15618 (#16114)
6 years ago
sneaxiy 1e9fd40777 combine op files
6 years ago
sneaxiy 682f2dbf29 merge develop
6 years ago
sneaxiy 2c4fcaa683 merge develop
6 years ago
luotao1 d94fd97230 add runtime_context_cache_pass
6 years ago
Kaipeng Deng 1a4a90a81d Merge pull request #16140 from tink2123/arc_function
6 years ago
Yan Xu 30568473ec fix broadcast on mp mode (#15951)
6 years ago
baojun e3c37bd564 remove const_cast and refactor ngraph engine code (#15925)
6 years ago
fc500110 1c6e72b905 remove visualizer, which can be replaced by python IrGraph draw API
6 years ago
chengduo 0979956619 Add memory profiler (#16137)
6 years ago
luotao1 b561ad1e55 Merge branch 'develop' into runtime_context
6 years ago
tink2123 61a6165c2c modified api.spec
6 years ago
Zhen Wang 41b8cf0bae Merge pull request #16162 from wzzju/fix_nan_static_quant
6 years ago
luotao1 fe78a92e6e refine with comments
6 years ago
Zhen Wang 94b7c1ea7b Merge pull request #16107 from wzzju/add_graph_clone
6 years ago
dengkaipeng 0ff9a403d0 fix format. test=develop
6 years ago
wopeizl 85709f4378 restore the exception caught since it is necessary for python call stack (#16160)
6 years ago
Zhen Wang 5420cf95f5 Merge pull request #16070 from wzzju/channel_wise_quant_op
6 years ago
Zhen Wang 5685a48c23 Add some fixme. test=develop
6 years ago
dengkaipeng b33e6bf5ef remove comment code. test=develop
6 years ago
tink2123 eb09bd456a modified api.spec
6 years ago
dengkaipeng 746740c41b fix API.spec. test=develop
6 years ago
dengkaipeng e4e3764060 use memory Copy. test=develop
6 years ago
dengkaipeng d31693afec no use _gt_score. test=develop
6 years ago
luotao1 5d20954ac4 add runtime shape for fuse_emb_seq_pool_grad
6 years ago
dengkaipeng aad62eeca0 add doc for param default. test=develop
6 years ago
tink2123 a8e375d463 refine doc
6 years ago
dengkaipeng 585766acc0 fix spell mistake in doc. test=develop
6 years ago
dengkaipeng b307533b7d fix format. test=develop
6 years ago
dengkaipeng afdf3c3f84 fix doc.test=develop
6 years ago
dengkaipeng 5b37cf0add fix API.spec for yolov3_loss. test=develop
6 years ago
dengkaipeng af4ef80e5b fix API.spec not add defaults. test=develop
6 years ago
dengkaipeng 0d1a9996ac fix unittest for yolov3_loss. test=develop
6 years ago
dengkaipeng f0804433b0 add mixup score and label_smooth for yolov3_loss. test=develop
6 years ago
dengkaipeng 626fb859d9 add param default doc. test=develop
6 years ago
dengkaipeng 33c8607ef3 fix doc. test=develop
6 years ago
dengkaipeng abb5a9c726 fix doc statement. test=develop
6 years ago
dengkaipeng b399ee2a23 fix doc. test=develop
6 years ago
dengkaipeng ad897304f9 fix pre-commit. test=develop
6 years ago
dengkaipeng 72a18bb160 add bbox range limit. test=develop
6 years ago
dengkaipeng fb863b4820 add API.spec for yolo_box. test=develop
6 years ago
dengkaipeng c9d4676bee fix multi batch idx error. test=develop
6 years ago
dengkaipeng 7808f4c097 fix unittest for yolo_box_op. test=develop
6 years ago
dengkaipeng cb2dca53c1 fix cuda kernel error
6 years ago
dengkaipeng 04b8b9e96c add yolo_box_op CUDA kernel
6 years ago
dengkaipeng 452373decb resize box in input image scale. test=develop
6 years ago
dengkaipeng 3896d955c7 add yolo_box_op CPU kernel
6 years ago
luotao1 8f6597aa0e Merge branch 'develop' into infershape_example
6 years ago
sneaxiy b26e9bd232 refine code
6 years ago
Zhen Wang ac6ef06ffa Add the Clone method in Graph. test=develop
6 years ago
Zhen Wang 01eddf125c Not add graph copy construction method. test=develop
6 years ago
Zhen Wang 1b9c8d5f06 add clone function for IrGraph. test=develop
6 years ago
Tao Luo ccc7c358b3 Merge pull request #16104 from tensor-tang/refine/jit
6 years ago
Tao Luo c49b7855fa Merge pull request #16120 from Xreki/fix_cmake_compress
6 years ago
Qiyang Min 1f4aa7a202 Imperative remove all descs (#16045)
6 years ago
Zeng Jinle 472f16b5aa Merge pull request #16063 from sneaxiy/enhance_gc
6 years ago
Tao Luo e31f6e9831 Merge pull request #16146 from luotao1/zero_copy
6 years ago
luotao1 31ccaf0916 add all_kernels_must_compute_runtime_shape example for speedup infershape
6 years ago
tensor-tang 14d871121b enhance jitkernel unit test
6 years ago
Liu Yiqun 4e052e0ac9 Disable inference download for WIN32 temporary.
6 years ago
chengduo ad80bde824 Revert "Revert "Add Event for TensorCopy"" (#16035)
6 years ago
luotao1 1283833395 zero_copy tensor support INT32
6 years ago
tensor-tang cfc83c1445 refine jitcodekey and enhance unit tests
6 years ago
tensor-tang 6ff230a624 Merge remote-tracking branch 'ups/develop' into refine/jit
6 years ago
luotao1 31c4e1d9fc Merge branch 'develop' into zero_copy
6 years ago
wopeizl a38db3cb99 Fixrecordio (#16124)
6 years ago
sneaxiy fc12f38394 add API.spec
6 years ago
sneaxiy b80d76f784 merge develop
6 years ago
sneaxiy cfd012e2cb add unittest
6 years ago
sneaxiy d7407c90aa refine cross_entropy mem
6 years ago
luotao1 9e2c7e69fb simplify the zero_copy tests
6 years ago
tink2123 cfc59b13e9 modified api.spec
6 years ago
sneaxiy 732fa00eaf disable gc in recurrent_op currently
6 years ago
tink2123 e4e0d03459 fix format
6 years ago
Tink_Y 5579fae1d2 Update activation_op.cc
6 years ago
tensor-tang 45bdd84dac enhance the jitkernel helper and add unit tests
6 years ago
tink2123 837ad7f86f Add the inverse trigonometric function
6 years ago
ceci3 415d74a08e Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into npair_loss0
6 years ago
luotao1 aeee4cbe71 add compare between zerocopy and analysis
6 years ago
Liu Yiqun 6bb84b74b2 Change the download and compress command of cmake.
6 years ago
ceci3 3e0eb55515 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into npair_loss0
6 years ago
tensor-tang 14a764c930 simplify the jitkernel templates and tests
6 years ago
ceci3 8b86c12e46 test=develop, update API.spec
6 years ago
Tao Luo 25ca2ca001 change init_idx to INT32 in transformer_test
6 years ago
ceci3 c8610739c3 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into npair_loss0
6 years ago
Yiqun Liu 66ead07ef9 Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106)
6 years ago
Yihua Xu 0a45441a84 Fix the node's order issue when the content of graph is changed (#16088)
6 years ago
nhzlx 2891070c66 cant not pass ci
6 years ago
lidanqing 717bbc087b Add INT32 support. INT32 in last switch case
6 years ago
nhzlx 93edcd773b 7 refine zero copy
6 years ago
nhzlx 4b59646ed1 fix comments and fix cpplint
6 years ago
nhzlx 5863c86143 6. delete useless predictor id
6 years ago
nhzlx f3d164faad 5. add static trt load model
6 years ago
nhzlx 31008100ba 4. do the trt_engine optim during init.
6 years ago
nhzlx 4f77248dd8 3. when runing in trt mode, do not allocate memory for parameters in fluid.
6 years ago
nhzlx 8c17190279 2. TRTEngine using stream only when execute.
6 years ago
nhzlx 88c24baa25 add static model load for trt
6 years ago
Tao Luo e5e7e9b865 Merge branch 'develop' into transformer_ut
6 years ago
Yiqun Liu 5bde120243 Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106)
6 years ago
Tao Luo 6f2581e4c5 Merge pull request #16090 from lidanqing-intel/paddle-int32
6 years ago
Yihua Xu 40f1dd818b Fix the node's order issue when the content of graph is changed (#16088)
6 years ago
Zhaolong Xing 3d63aa0a11 Merge pull request #15729 from NHZlX/add_static_model_load_for_trt
6 years ago
jerrywgz b0e3c02410 Merge pull request #15952 from jerrywgz/fpn_ops
6 years ago
tensor-tang 802f362ac4 unify the kernelfuncs cache and add unit test
6 years ago
nhzlx a9ed427749 cant not pass ci
6 years ago
Yiqun Liu f31d515ce3 Enhance the op benchmark: (#16066)
6 years ago
tensor-tang 2e7fea2b7f polish the cast op doc (#16078)
6 years ago
luotao1 fad06cb928 unify ZeroCopy in analysis_test
6 years ago
lidanqing 4aeb261da9 Add INT32 support. INT32 in last switch case
6 years ago
Yiqun Liu 36e2d3241e Enhance the op benchmark: (#16066)
6 years ago
tensor-tang 9be825a982 polish the cast op doc (#16078)
6 years ago
jerrywgz 847bb6a279 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fpn_ops
6 years ago
Wu Yi 5e92eb3f25 add parallel graph dist test (#16076)
6 years ago
jerrywgz e5759d6c38 refine doc, test=develop
6 years ago
jerrywgz 2b41743791 fix doc, test=develop
6 years ago
jerrywgz c2eda2325b refine code, test=develop
6 years ago
jerrywgz 9eb6d35f59 fix API.spec,test=develop
6 years ago
jerrywgz a2e83d1d7b add box_coder_and_assign, test=develop
6 years ago
Wu Yi d206582337 add parallel graph dist test (#16076)
6 years ago
jerrywgz 893789a0d1 Merge pull request #16050 from jerrywgz/add_box_decoder_and_assign
6 years ago
liuwei1031 1b5768c33b fix a code bug which cause crash when empty variable is used, test=develop (#16080)
6 years ago
sneaxiy 2a639d5c2a add allocator chain to fix bug
6 years ago
liuwei1031 045e5911bf fix a code bug which cause crash when empty variable is used, test=develop (#16080)
6 years ago
ceci3 c109e6b3aa Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into npair_loss0
6 years ago
ceci3 7613918e23 test=develop, change labels name
6 years ago
xiaolil1 a177d48217 Add Requantize OP (#15318)
6 years ago
chengduo 6fe7478ba8 Refine recurrent_op (#16027)
6 years ago
chengduo f5a3751845 Refine recurrent_op (#16027)
6 years ago
sneaxiy 7b608396fe fix travis-ci format check
6 years ago
ceci3 dc57952b7f test=develop, add random to testfile
6 years ago
chengduo 84e3adbe60 Fix reshape bug (#16069)
6 years ago
wopeizl eb367f990c remove the ignored from is_empty and less_than test=develop (#15971)
6 years ago
liuwei1031 9cc6f4009f add IfElse test case for ir memory optimize (#15998)
6 years ago
luotao1 503efa8b86 refine SetCpuMathLibraryNumThreads
6 years ago
baojun 9f85876885 fix tanh typo test=develop (#16049)
6 years ago
whs bd9669003f Make sequence_erase op support for input with multi-level LoD. (#15982)
6 years ago
Tao Luo 1301dc1a27 remove legacy function in ExecutionContext
6 years ago
lidanqing 21156b8d4c MKLDNN: Add UT for conv_transpose_mkldnn op. (#16030)
6 years ago
dengkaipeng b1a49e873f fix statement. test=develop
6 years ago
dengkaipeng 0e0a2d046d fix API.spec. test=develop
6 years ago
dengkaipeng dbb8d07886 fix doc statement. test=develop
6 years ago
dengkaipeng eeeebdd006 refine doc. test=develop
6 years ago
dengkaipeng 8ee866bf19 fix format. test=develop
6 years ago
dengkaipeng 9c47f36d1b fix spectral_norm doc. test=develop
6 years ago
dengkaipeng 12416a24d2 add doc and test_layers. test=develop
6 years ago
dengkaipeng 63d322f07c fix attr dim calc. test=develop
6 years ago
dengkaipeng ca1502c7f5 add grad kernel for spectral_norm. test=develop
6 years ago
dengkaipeng 8956a59637 add unittest for spectral_norm. test=develop
6 years ago
dengkaipeng fd66089d23 add spectral_norm forwarn kenel
6 years ago
tensor-tang cab46b62f8 refine vbroadcast jitcode
6 years ago
tensor-tang 6010361c7a add vbroadcast mkl code and jitcode
6 years ago
tensor-tang 2e96da453a add vbroadcast jitkernel refer code and use it
6 years ago
tensor-tang 020540948f add jitkernel vcopy and speedup unit test time
6 years ago
tensor-tang 6057f36208 Merge pull request #15996 from tensor-tang/op/embgrad
6 years ago
chengduo c67afb0f76 Fix reshape bug (#16069)
6 years ago
Tao Luo 14b4337663 Merge pull request #16062 from luotao1/num_threads
6 years ago
sneaxiy 33138a421d remove match check
6 years ago
wopeizl 7fbf52daa3 remove the ignored from is_empty and less_than test=develop (#15971)
6 years ago
Zhen Wang 8063b31e2d Reduce redundant code for channel wise dequant op. test=develop
6 years ago
Tao Luo 6375fe45d7 Merge pull request #16039 from luotao1/execution_context
6 years ago
Zhen Wang e8f9dac7ab Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into channel_wise_quant_op
6 years ago
Zhen Wang 806832e091 update the input format of channel wise dequantize op.
6 years ago
jerrywgz f0177a1ed1 refine doc, test=develop
6 years ago
jerrywgz 072eca348a refine doc, test=develop
6 years ago
Kaipeng Deng 6d8771b55c Merge pull request #15864 from heavengate/spectral_norm
6 years ago
sneaxiy 814a759061 merge develop
6 years ago
sneaxiy 597dc65e76 enhance gc
6 years ago
liuwei1031 caadd0581d add IfElse test case for ir memory optimize (#15998)
6 years ago
luotao1 06aab1b493 refine SetCpuMathLibraryNumThreads
6 years ago
baojun da45fbdaf5 fix tanh typo test=develop (#16049)
6 years ago
whs 0f99d24083 Make sequence_erase op support for input with multi-level LoD. (#15982)
6 years ago
Zhen Wang 89dee160d1 add channel wise dequantize op.
6 years ago
Tao Luo f4587789d8 remove legacy function in ExecutionContext
6 years ago
luotao1 c0b240aa43 try to fix distributed unit-test
6 years ago
jerrywgz b4f5180299 fix doc, test=develop
6 years ago
jerrywgz 21e0d35ce3 fix formula, test=develop
6 years ago
jerrywgz d1901f27bc refine doc
6 years ago
jerrywgz a1ef7df865 refine code, test=develop
6 years ago
tensor-tang 12eb9aecde Merge remote-tracking branch 'ups/develop' into op/embgrad
6 years ago
jerrywgz e64921c79a fix API.spec,test=develop
6 years ago
jerrywgz d497bd9079 resolve conflict, test=develop
6 years ago
jerrywgz 41471d28ac add box_coder_and_assign, test=develop
6 years ago
lidanqing 02c106c717 MKLDNN: Add UT for conv_transpose_mkldnn op. (#16030)
6 years ago
sneaxiy 7e5a4a3d63 test=develop
6 years ago
luotao1 784826a4f5 enhance cache runtime_context for different scope
6 years ago
dengkaipeng 3eab9e4b95 fix statement. test=develop
6 years ago
dengkaipeng e37f5ab5b1 fix API.spec. test=develop
6 years ago
dengkaipeng 54bbbfa71f fix doc statement. test=develop
6 years ago
dengkaipeng c1a69e3ea0 refine doc. test=develop
6 years ago
dengkaipeng 65d375a09f fix format. test=develop
6 years ago
dengkaipeng 82d514345c fix spectral_norm doc. test=develop
6 years ago
dengkaipeng 2ea5843cbf add doc and test_layers. test=develop
6 years ago
dengkaipeng 037855f42d fix attr dim calc. test=develop
6 years ago
dengkaipeng 70dbd59839 add grad kernel for spectral_norm. test=develop
6 years ago
dengkaipeng 72509ec3bd add unittest for spectral_norm. test=develop
6 years ago
dengkaipeng 3bf1ae9b59 add spectral_norm forwarn kenel
6 years ago
Zhen Wang 545247d7b4 add channel wise quantize op.
6 years ago
sneaxiy f0634da4b5 test=develop
6 years ago
ceci3 44a4ac0f8c fix API.spec and testfile
6 years ago
tensor-tang b16dabd7e0 refine vbroadcast jitcode
6 years ago
tensor-tang c2e56e6bbc Merge remote-tracking branch 'ups/develop' into op/embgrad
6 years ago
ceci3 3b96aa0839 conflict fix
6 years ago
ceci3 06d8e1a15d test=develop
6 years ago
chengduo 92438f6132 Revert "Add Event for TensorCopy" (#16022)
6 years ago
baojun 742839f8f4 fix cpplint test=develop (#16028)
6 years ago
chengduo d4b461eb10 Unified ParallelExecutor and Compiler (#15970)
6 years ago
chengduo 06f3c8575d Add Event for TensorCopy (#15953)
6 years ago
Tink_Y 8949a94691 refine image_resize annotation (#15976)
6 years ago
tangwei12 7b0875e9f8 add op type in check nan/inf (#15986)
6 years ago
Yiqun Liu 2bdf44641c Add the include of cudnn.h to enable the use of CUDNN_VERSION. (#15961)
6 years ago
Yiqun Liu b94307a919 Revert "Optimize while_op when is_test is true. (#15811)" (#15968)
6 years ago
flame eeb70edd9a add anakin fc op converter (#15965)
6 years ago
minqiyang ab5a648481 Add missing headers
6 years ago
minqiyang 94c8ce3f13 reduce ut time
6 years ago
Yiqun Liu c90b82a637 Fix error in CUDA kernel of beam_search. (#15957)
6 years ago
minqiyang 3723dcc301 Polish code
6 years ago
flame afc3fcd509 anakin subgraph engine (#15774)
6 years ago
minqiyang 212242c4e4 Polish code
6 years ago
Yiqun Liu 1b10a7843c Optimize while_op when is_test is true. (#15811)
6 years ago
xiaolil1 91838c3214 Optimize Quantize Op with primitive reuse. (#15929)
6 years ago
luotao1 1c58eee9b2 refine infershape of sequence_enumerate, hash and fuse_emb_seq_pool
6 years ago
minqiyang 3f4aeed57f Polish code
6 years ago