Wojciech Uss
b9252f3df8
Add cpu_quantize_squash_pass for C-API quantization ( #16128 )
...
* Add cpu_quantize_squash_pass for C-API quantization
test=develop
* add cpu_quantize_squash_pass teste
* fix lint: add include memory unordered_map and unordered_set
test=develop
* lint fix 2
* fixes
test=develop
* refactored
test=develop
* fix windows ci
test=develop
6 years ago
Zeng Jinle
0b49e43d3a
Merge pull request #16144 from sneaxiy/rnn_mem_opt
...
PaddingRNN model memory optimize
6 years ago
Tao Luo
4ef6f738c3
Merge pull request #16154 from luotao1/infershape_example
...
add all_kernels_must_compute_runtime_shape example for speedup infershape
6 years ago
tianshuo78520a
f404d53ba5
Api approvals ( #16179 )
6 years ago
sneaxiy
487624e15d
fix travis-ci
...
test=develop
6 years ago
guomingz
decdbed054
resolve #15618 ( #16114 )
...
* resolve #15618
Background: PR #15398 introduced a performance regression in the box_coder op; we optimized the code by leveraging OpenMP more efficiently.
6 years ago
sneaxiy
1e9fd40777
combine op files
...
test=develop
6 years ago
Kaipeng Deng
1a4a90a81d
Merge pull request #16140 from tink2123/arc_function
...
Add the inverse trigonometric function
6 years ago
Yan Xu
30568473ec
fix broadcast on mp mode ( #15951 )
...
* fix broadcast with mp mode
* polish code test=develop
* fix bcast strategy test=develop
* fix cpplint test=develop
* fix py3 failed test=develop
* fix comment test=develop
* update comment test=develop
6 years ago
baojun
e3c37bd564
remove const_cast and refactor ngraph engine code ( #15925 )
...
* remove const_cast and refactor code test=develop
* reduce flag use test=develop
6 years ago
chengduo
0979956619
Add memory profiler ( #16137 )
...
test=develop
6 years ago
tink2123
61a6165c2c
modified api.spec
...
test=develop
6 years ago
Zhen Wang
41b8cf0bae
Merge pull request #16162 from wzzju/fix_nan_static_quant
...
Fix NaN bugs for static quantization strategy (multi-card training).
6 years ago
luotao1
fe78a92e6e
refine with comments
...
test=develop
6 years ago
Zhen Wang
94b7c1ea7b
Merge pull request #16107 from wzzju/add_graph_clone
...
Add clone function for IrGraph.
6 years ago
wopeizl
85709f4378
restore the exception caught since it is necessary for python call stack ( #16160 )
...
test=develop
6 years ago
Zhen Wang
5420cf95f5
Merge pull request #16070 from wzzju/channel_wise_quant_op
...
Add channel wise quant op and channel wise dequant op.
6 years ago
Zhen Wang
5685a48c23
Add some fixme. test=develop
6 years ago
tink2123
eb09bd456a
modified api.spec
...
test=develop
6 years ago
luotao1
5d20954ac4
add runtime shape for fuse_emb_seq_pool_grad
...
test=develop
6 years ago
tink2123
a8e375d463
refine doc
...
test=develop
6 years ago
luotao1
8f6597aa0e
Merge branch 'develop' into infershape_example
6 years ago
sneaxiy
b26e9bd232
refine code
...
test=develop
6 years ago
Zhen Wang
ac6ef06ffa
Add the Clone method in Graph. test=develop
6 years ago
Zhen Wang
01eddf125c
Do not add the graph copy construction method. test=develop
6 years ago
Zhen Wang
1b9c8d5f06
add clone function for IrGraph. test=develop
6 years ago
Tao Luo
ccc7c358b3
Merge pull request #16104 from tensor-tang/refine/jit
...
refine jitkernels and test
6 years ago
Tao Luo
c49b7855fa
Merge pull request #16120 from Xreki/fix_cmake_compress
...
Change the download and compress command of cmake.
6 years ago
Qiyang Min
1f4aa7a202
Imperative remove all descs ( #16045 )
...
* Remove Desc in Forward Pass
* Refactor VarBase
* Add dbg info
* Only check type in imperative mode
* Polish code and support optimizer
test=develop
* Fix stop gradient problem in PyLayer
test=develop
6 years ago
Zeng Jinle
472f16b5aa
Merge pull request #16063 from sneaxiy/enhance_gc
...
Enhance gc
6 years ago
Tao Luo
e31f6e9831
Merge pull request #16146 from luotao1/zero_copy
...
unify ZeroCopy in analysis_test
6 years ago
luotao1
31ccaf0916
add all_kernels_must_compute_runtime_shape example for speedup infershape
...
test=develop
6 years ago
tensor-tang
14d871121b
enhance jitkernel unit test
...
test=develop
6 years ago
Liu Yiqun
4e052e0ac9
Disable inference download for WIN32 temporarily.
...
test=develop
6 years ago
chengduo
ad80bde824
Revert "Revert "Add Event for TensorCopy"" ( #16035 )
...
* Revert "Revert "Add Event for TensorCopy" (#16022 )"
This reverts commit e2da3a5b22.
* use default stream
test=develop
6 years ago
luotao1
1283833395
zero_copy tensor support INT32
...
test=develop
6 years ago
tensor-tang
cfc83c1445
refine jitcodekey and enhance unit tests
...
test=develop
6 years ago
tensor-tang
6ff230a624
Merge remote-tracking branch 'ups/develop' into refine/jit
6 years ago
luotao1
31c4e1d9fc
Merge branch 'develop' into zero_copy
6 years ago
wopeizl
a38db3cb99
Fixrecordio ( #16124 )
...
* fix recordio on win
test=develop
* test=develop
* test=develop
* fix code style
test=develop
* test=develop
6 years ago
sneaxiy
cfd012e2cb
add unittest
...
test=develop
6 years ago
sneaxiy
d7407c90aa
refine cross_entropy mem
...
test=develop
6 years ago
Cheerego
3c60446e59
fix deadlink ( #16129 )
...
* fix deadlink
fix https://github.com/PaddlePaddle/FluidDoc/issues/679
* test=develop
* test=develop
6 years ago
luotao1
9e2c7e69fb
simplify the zero_copy tests
...
test=develop
6 years ago
tink2123
cfc59b13e9
modified api.spec
...
test=develop
6 years ago
sneaxiy
732fa00eaf
disable gc in recurrent_op currently
...
test=develop
6 years ago
tink2123
e4e0d03459
fix format
...
test=develop
6 years ago
Tink_Y
5579fae1d2
Update activation_op.cc
...
test=develop
6 years ago
tensor-tang
45bdd84dac
enhance the jitkernel helper and add unit tests
...
test=develop
6 years ago
tink2123
837ad7f86f
Add the inverse trigonometric function
...
test=develop
6 years ago