Tao Luo
38898c2808
Merge pull request #16212 from Aurelius84/develop
...
improve layers.fc api doc
6 years ago
minqiyang
db0c970823
Polish code
...
test=develop
6 years ago
Kaipeng Deng
b77ebb2af2
Merge pull request #15919 from heavengate/yolo_box
...
add yolo_box for detection box calc in YOLOv3
6 years ago
minqiyang
362253732c
Polish code
...
test=develop
6 years ago
minqiyang
c0ddb93ccc
Polish code
...
test=develop
6 years ago
minqiyang
b5078c211a
Make infer var type virtual
...
test=develop
6 years ago
minqiyang
9041b238e3
Polish code
...
test=develop
6 years ago
minqiyang
438bca9c3d
Implement Runtime Var Type Inference
...
test=develop
6 years ago
Xin Pan
50ff898378
graph neural network for imperative mode
...
test=develop
6 years ago
luotao1
5ecdc49c6b
set enable_runtime_context_cache_ default false
...
test=develop
6 years ago
Zhaolong Xing
c49e604906
Merge pull request #16213 from qingqing01/compile_infer_shape
...
Skip compile infer shape in box_coder_op
6 years ago
achao2013
81b4fad8b9
add moving average absmax op and fix bug ( #15155 )
...
* Add moving average absmax op in quantization-aware training.
6 years ago
luotao1
721c2c00ef
refine fc_infershape
...
test=develop
6 years ago
Kaipeng Deng
74037cc1c8
Merge branch 'develop' into yolo_box
6 years ago
Xin Pan
92b9ce3479
Merge pull request #16073 from heavengate/yolov3_loss_imporve
...
Yolov3 loss: add mixup score and label smooth
6 years ago
luotao1
46ee6bb1aa
fix distributed unit-tests
...
test=develop
6 years ago
luotao1
1b59bed989
Merge branch 'develop' into runtime_context
6 years ago
Aurelius84
2d1e76fb0c
fix API.spec test=develop
6 years ago
luotao1
6ce25c99a0
Merge branch 'develop' into runtime_context
6 years ago
Aurelius84
6cfd20dea8
fix spelling errors test=develop
6 years ago
qingqing01
8ad672a287
Support sync batch norm. ( #16121 )
...
* Support Sync Batch Norm.
* Note: do not enable it on a single device.
Usage:
build_strategy = fluid.BuildStrategy()
build_strategy.sync_batch_norm = True
binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(
    loss_name=loss_mean.name,
    build_strategy=build_strategy)
6 years ago
shippingwang
98d9552f0f
update sqrt explanation, test=develop
6 years ago
minqiyang
ca392c7e97
Implement infer var type context
6 years ago
Yibing Liu
4ae23cc3c5
Impl fp16 compute kernel for slice_op ( #16206 )
...
* Impl fp16 compute kernel for slice_op
test=develop
* Use data() to replace mutable_data()
6 years ago
sneaxiy
f0d108f589
fix const_cast
...
test=develop
6 years ago
Dang Qingqing
e5e7628a62
Skip compile infer shape in box_coder_op
...
test=develop
6 years ago
Aurelius84
a59b7d47a8
improve layers.fc api doc test=develop
6 years ago
sneaxiy
3e03695629
fix numeric error
...
test=develop
6 years ago
sneaxiy
5a92e4c097
revert revert 16144
...
test=develop
6 years ago
sneaxiy
e993effb29
include unordered_map to cross_entropy_op.cc
...
test=develop
6 years ago
Zeng Jinle
a91964c8fe
Revert "PaddingRNN model memory optimize"
...
test=develop
6 years ago
liuwei1031
1c6caf8466
1. disable reuse SELECTED_ROWS type variable ( #16150 )
...
2. remove lod check in reshape op
test=develop
6 years ago
Wojciech Uss
b9252f3df8
Add cpu_quantize_squash_pass for C-API quantization ( #16128 )
...
* Add cpu_quantize_squash_pass for C-API quantization
test=develop
* add cpu_quantize_squash_pass test
* fix lint: add include memory, unordered_map and unordered_set
test=develop
* lint fix 2
* fixes
test=develop
* refactored
test=develop
* fix windows ci
test=develop
6 years ago
minqiyang
f83739499c
Polish code
...
test=develop
6 years ago
minqiyang
7355d41834
1. Add imperative gperf profiler
...
2. Add binutils 2.27 in manylinux support
test=develop
6 years ago
Zeng Jinle
0b49e43d3a
Merge pull request #16144 from sneaxiy/rnn_mem_opt
...
PaddingRNN model memory optimize
6 years ago
luotao1
b2898c0f57
Merge branch 'develop' into runtime_context
...
test=develop
6 years ago
minqiyang
98dfb492bb
Release GIL lock
6 years ago
Tao Luo
4ef6f738c3
Merge pull request #16154 from luotao1/infershape_example
...
add all_kernels_must_compute_runtime_shape example for speedup infershape
6 years ago
minqiyang
42e96a029f
Accelerate CPU part
6 years ago
sneaxiy
487624e15d
fix travis-ci
...
test=develop
6 years ago
luotao1
1510b866b6
turn off runtime_context_cache for tensorrt
...
test=develop
6 years ago
guomingz
decdbed054
resolve #15618 ( #16114 )
...
* resolve #15618
Background: PR #15398 caused a performance regression in the box_coder op; we optimized the code by using OpenMP more efficiently.
6 years ago
sneaxiy
1e9fd40777
combine op files
...
test=develop
6 years ago
luotao1
d94fd97230
add runtime_context_cache_pass
...
test=develop
6 years ago
Kaipeng Deng
1a4a90a81d
Merge pull request #16140 from tink2123/arc_function
...
Add the inverse trigonometric function
6 years ago
Yan Xu
30568473ec
fix broadcast on mp mode ( #15951 )
...
* fix broadcast with mp mode
* polish code test=develop
* fix bcast strategy test=develop
* fix cpplint test=develop
* fix py3 failed test=develop
* fix comment test=develop
* update comment test=develop
6 years ago
baojun
e3c37bd564
remove const_cast and refactor ngraph engine code ( #15925 )
...
* remove const_cast and refactor code test=develop
* reduce flag use test=develop
6 years ago
chengduo
0979956619
Add memory profiler ( #16137 )
...
test=develop
6 years ago
luotao1
b561ad1e55
Merge branch 'develop' into runtime_context
6 years ago
tink2123
61a6165c2c
modified api.spec
...
test=develop
6 years ago
Zhen Wang
41b8cf0bae
Merge pull request #16162 from wzzju/fix_nan_static_quant
...
Fix NaN bugs for static quantization strategy (multi-card training).
6 years ago
luotao1
fe78a92e6e
refine with comments
...
test=develop
6 years ago
Zhen Wang
94b7c1ea7b
Merge pull request #16107 from wzzju/add_graph_clone
...
Add clone function for IrGraph.
6 years ago
dengkaipeng
0ff9a403d0
fix format. test=develop
6 years ago
wopeizl
85709f4378
restore the exception caught since it is necessary for python call stack ( #16160 )
...
test=develop
6 years ago
Zhen Wang
5420cf95f5
Merge pull request #16070 from wzzju/channel_wise_quant_op
...
Add channel wise quant op and channel wise dequant op.
6 years ago
Zhen Wang
5685a48c23
Add some fixme. test=develop
6 years ago
dengkaipeng
b33e6bf5ef
remove comment code. test=develop
6 years ago
tink2123
eb09bd456a
modified api.spec
...
test=develop
6 years ago
dengkaipeng
746740c41b
fix API.spec. test=develop
6 years ago
dengkaipeng
e4e3764060
use memory Copy. test=develop
6 years ago
dengkaipeng
d31693afec
no use _gt_score. test=develop
6 years ago
luotao1
5d20954ac4
add runtime shape for fuse_emb_seq_pool_grad
...
test=develop
6 years ago
dengkaipeng
aad62eeca0
add doc for param default. test=develop
6 years ago
tink2123
a8e375d463
refine doc
...
test=develop
6 years ago
dengkaipeng
585766acc0
fix spell mistake in doc. test=develop
6 years ago
dengkaipeng
b307533b7d
fix format. test=develop
6 years ago
dengkaipeng
afdf3c3f84
fix doc.test=develop
6 years ago
dengkaipeng
5b37cf0add
fix API.spec for yolov3_loss. test=develop
6 years ago
dengkaipeng
af4ef80e5b
fix API.spec not add defaults. test=develop
6 years ago
dengkaipeng
0d1a9996ac
fix unittest for yolov3_loss. test=develop
6 years ago
dengkaipeng
f0804433b0
add mixup score and label_smooth for yolov3_loss. test=develop
6 years ago
dengkaipeng
626fb859d9
add param default doc. test=develop
6 years ago
dengkaipeng
33c8607ef3
fix doc. test=develop
6 years ago
dengkaipeng
abb5a9c726
fix doc statement. test=develop
6 years ago
dengkaipeng
b399ee2a23
fix doc. test=develop
6 years ago
dengkaipeng
ad897304f9
fix pre-commit. test=develop
6 years ago
dengkaipeng
72a18bb160
add bbox range limit. test=develop
6 years ago
dengkaipeng
fb863b4820
add API.spec for yolo_box. test=develop
6 years ago
dengkaipeng
c9d4676bee
fix multi batch idx error. test=develop
6 years ago
dengkaipeng
7808f4c097
fix unittest for yolo_box_op. test=develop
6 years ago
dengkaipeng
cb2dca53c1
fix cuda kernel error
6 years ago
dengkaipeng
04b8b9e96c
add yolo_box_op CUDA kernel
6 years ago
dengkaipeng
452373decb
resize box in input image scale. test=develop
6 years ago
dengkaipeng
3896d955c7
add yolo_box_op CPU kernel
6 years ago
luotao1
8f6597aa0e
Merge branch 'develop' into infershape_example
6 years ago
sneaxiy
b26e9bd232
refine code
...
test=develop
6 years ago
Zhen Wang
ac6ef06ffa
Add the Clone method in Graph. test=develop
6 years ago
Zhen Wang
01eddf125c
Not add graph copy construction method. test=develop
6 years ago
Zhen Wang
1b9c8d5f06
add clone function for IrGraph. test=develop
6 years ago
Tao Luo
ccc7c358b3
Merge pull request #16104 from tensor-tang/refine/jit
...
refine jitkernels and test
6 years ago
Tao Luo
c49b7855fa
Merge pull request #16120 from Xreki/fix_cmake_compress
...
Change the download and compress command of cmake.
6 years ago
Qiyang Min
1f4aa7a202
Imperative remove all descs ( #16045 )
...
* Remove Desc in Forward Pass
* Refactor VarBase
* Add dbg info
* Only check type in imperative mode
* Polish code and support optimizer
test=develop
* Fix stop gradient problem in PyLayer
test=develop
6 years ago
Zeng Jinle
472f16b5aa
Merge pull request #16063 from sneaxiy/enhance_gc
...
Enhance gc
6 years ago
Tao Luo
e31f6e9831
Merge pull request #16146 from luotao1/zero_copy
...
unify ZeroCopy in analysis_test
6 years ago
luotao1
31ccaf0916
add all_kernels_must_compute_runtime_shape example for speedup infershape
...
test=develop
6 years ago
tensor-tang
14d871121b
enhance jitkernel unit test
...
test=develop
6 years ago
Liu Yiqun
4e052e0ac9
Disable inference download for WIN32 temporarily.
...
test=develop
6 years ago
chengduo
ad80bde824
Revert "Revert "Add Event for TensorCopy"" ( #16035 )
...
* Revert "Revert "Add Event for TensorCopy" (#16022 )"
This reverts commit e2da3a5b22.
* use default stream
test=develop
6 years ago