jakpiase
d834f4e6e8
Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel ( #30661 )
...
* added external reorder to profiler
* resolved conflict
* added enable_static
* initial version of lstm, not working yet
* added lstm to operators.cmake
* added vanilla lstm mkldnn op
* added peephole weights integration
* minor changes
* added formatting
* added fusion_lstm_mkldnn to static_whitelist
* added formatting
* removed comment
* moved use_peepholes attribute inside is_cached block
* reverted wrong changes
* minor formatting change
* minor changes
4 years ago
arlesniak
5bf25d1e8b
More precise mkldnn kernel rules in GetExpectedKernelType ( #29840 )
...
* More precise mkldnn kernel choice in GetExpectedKernelType
* Fixes after review
* Refresh develop for CI
* CI experiment
* get back from CI exper
4 years ago
Jacek Czaja
173660be7b
[oneDNN] Cache oneDNN stream not to recreate in each oneDNN op ( #30358 )
4 years ago
Shang Zhizhou
ae0f88a988
add DLA support: C++ && Python api ( #30165 )
...
* add dla
* add dla done
* add python api
Co-authored-by: shangzhizhou <root@szth-rp-fanyi-opera49.szth.baidu.com>
4 years ago
chentianyu03
fb7fbc7a5d
fix abs bug and add abs test case ( #30637 )
...
* add abs test case
* use std::abs to fix abs bug
* fix the abs bug
* fix abs bug
4 years ago
ShenLiang
9514b4aa5f
Fix scatter grad bug ( #30604 )
4 years ago
Pei Yang
cf9bdb9404
extend trt ut timeout threshold ( #30537 )
4 years ago
Thunderbrook
1bebc09253
solve build gpu task core ( #30626 )
...
* build gpu task core
* format
4 years ago
石晓伟
33bf6eb753
revert external gflags, test=develop ( #30623 )
4 years ago
Jacek Czaja
dfdb0359ea
- Disabling oneDNN inplace pass ( #30588 )
4 years ago
TTerror
10271ddfc4
support reduce_max op on kunlun ( #30581 )
...
* support reduce_max op on kunlun
4 years ago
QingshuChen
5013c67644
fix softmax bug for multi_card in kunlun ( #30600 )
4 years ago
wuhuanzhou
7e671c07b6
optimize unity build ( #30195 )
...
* optimize unity build, test=develop
* fix code style error, test=develop
* fix code style error and test /MP settings, test=develop
4 years ago
liuyuhui
e5b0d9e1fc
[Kunlun] Add condition_variable and notify() in BindThreadedSSAGraphExecutor ( #30586 )
4 years ago
wanghuancoder
90773473a0
use nvtx push pop in timeline ( #30567 )
...
* delete empty line of pybind.cc, test=develop
* use nvtx push pop in timeline, test=develop
* change year, test=develop
* add #ifdef PADDLE_WITH_CUDA, test=develop
* add #ifndef WIN32, test=develop
* rename is_pushed to is_pushed_, test=develop
4 years ago
chentianyu03
358106fcb0
make abs op support complex types ( #30375 )
...
* rewrite abs op
* rewrite abs op and remove abs in activation
* remove abs register in old codes
* fix abs_grad type error
* fix abs double_grad output name error
* modify abs_grad, abs_grad_grad functor for windows building
* format code style
* fix the bug where the result is NaN when the divisor is zero
* add missing abs attr and add abs for float16
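The NaN case in this commit can be pictured with a small sketch. It assumes the gradient of |x| is computed as dout * x / |x|, which evaluates 0/0 at x = 0. The `AbsGrad` name and the choice to return zero at that point are illustrative assumptions, not code taken from the PR.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper (not the Paddle functor): gradient of |x| for a
// complex input x, given the upstream gradient dout.
// Assumption: the gradient is dout * x / |x|, and it is defined as 0 at
// x == 0 so that the 0/0 division never happens.
template <typename T>
std::complex<T> AbsGrad(std::complex<T> x, T dout) {
  const T mag = std::abs(x);  // |x|, the divisor
  if (mag == T(0)) {
    return std::complex<T>(T(0), T(0));  // guard: avoid 0/0 -> NaN
  }
  return std::complex<T>(dout * x.real() / mag, dout * x.imag() / mag);
}
```

Without the guard, an all-zero input would propagate NaN into every upstream gradient that depends on it.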
4 years ago
Wilber
2d5758c456
update. ( #30585 )
4 years ago
Tao Luo
9dd71c74df
disable test_analyzer_detect ( #30541 )
4 years ago
tangwei12
c9e78a22c5
add trainers for pserver ( #30523 )
...
* add trainers for pserver
Change-Id: I1a75793ec81ce126d07f4c47cae09b95d530bbc8
4 years ago
wanghuancoder
d1b25ed9d7
add some RecordEvent, for dygraph timeline ( #30299 )
...
* add some RecordEvent, for dygraph timeline, test=develop
* change GpuMemcpySync to memory::Copy, test=develop
* fix compile problem, test=develop
* fix compile problem, test=develop
* fix, test=develop
* fix, test=develop
4 years ago
liym27
ff25c5b36f
Fix bug: GetAttrValue should deal with attr with attrType vector<double> ( #30536 )
4 years ago
WangXi
572c466d19
[Prepare for MultiProcess xpu] unified gen nccl id, refine imperative reducer ( #30455 )
4 years ago
ykkk2333
549855ac20
add rmsprop_op_xpu test=kunlun ( #30493 )
...
* add rmsprop_op_xpu test=kunlun
* modified rmsprop_op_xpu error code. test=kunlun
4 years ago
Zhou Wei
fb20ec9a4e
fix bug of multicard grad ncclAllReduce ( #30553 )
4 years ago
Zhen Wang
f30d00553a
Fix the compiling error of update_loss_scaling when using cuda9. ( #30538 )
4 years ago
Leo Chen
81217a94d8
unify calling cudaSetDevice ( #30470 )
...
* unify calling cudaSetDevice
* fix compile
4 years ago
pangyoki
00554b3f6b
fix error message of Inplace strategy ( #30520 )
4 years ago
Leo Chen
7043b8cfc6
support layer_norm fp16 in dygraph amp ( #30430 )
...
* support layer_norm fp16 in dygraph amp
* add ut
* refine code
4 years ago
wanghuancoder
59ad6ff3e3
delete empty line of pybind.cc, test=develop ( #30529 )
4 years ago
hutuxian
e207fe6385
Ascend Framework Part2: pybind files ( #30410 )
4 years ago
hutuxian
40ede12631
Ascend Framework Part1: OP & Wrapper ( #30281 )
4 years ago
liuyuhui
843dc3cdbd
[Kunlun]PR3: add xpu executor, multi xpu card train function optimization ( #30317 )
4 years ago
QingshuChen
8489d4f76f
optimize batch_norm & pool op for kunlun ( #30490 )
4 years ago
wanghuancoder
bd97192274
if pybind.cc changed, generate total report, test=develop ( #30514 )
4 years ago
taixiurong
5e5c2827a3
fix range op crash in dygraph xpu place ( #30469 )
4 years ago
JZ-LIANG
16ba0abc79
Recompute Offload: fixed bug in memcpy ( #30484 )
4 years ago
guofei
11e78ebaa3
Modify the calculation logic of LambOptimizer ( #29313 )
...
* Modify the calculation logic of LambOptimizer
4 years ago
Adam Osewski
c5ffad126c
[oneDNN] Refactor fuse pass helper functions to one place. ( #30460 )
...
* Move pass tester helper functions to single common place.
* Use helper functions in two more fuse pass tests.
4 years ago
Zhang Ting
c9a334e1b3
add VecCastCUDAKernel ( #30296 )
4 years ago
pangyoki
13d757362c
Add Inplace strategy (Output reuse Input Varbase) in dygraph ( #30103 )
...
* add view strategy on squeeze,unsqueeze,reshape,flatten
* add squeeze unittest
* add unittests
* use View strategy as name rather than Reuse Allocation
* fix view api doc
* fix format
* use core.ops when input of reshape2 is Tensor
* fix test_cross_entropy_loss error because of reshape2
* fix test_cross_entropy_loss error because of reshape2
* add inplace strategy
* add elementwise_add sub
* let backward op not use inplace
* grad op do not use inplace
* fix memory increase error and add leaf error message
* delete selected_rows
* change op_function
* little change
* solve HandleViewBetweenInputAndOutput
* add unittest and leaf error message
* merge view error
* optimize op_function_generator format and support sum inplace op
* fix format of basic_engine
* fix format for framework
* little change of variable wrapper
* add reshape, squeeze, unsqueeze, scatter api
* add relu elu tanh softmax inplace api
* fix test_squeeze_op unittest
* fix test_relu_op unittest
* fix comment problems
* delete sample code of inplace api
* add reference of grad_pending_nodes in basic_engine
* fix unittest name
* add inplace apis into wlist
* fix error message
* add PADDLE_ENFORCE for set grad op twice
* fix header file error
4 years ago
Yang Zhang
008b0a8b56
Fix float64 bug in layer norm ( #30452 )
...
built-in `rsqrt` is shadowed
4 years ago
石晓伟
715d862868
export global google flags to users, test=develop ( #30448 )
4 years ago
Wojciech Uss
88fc7a7d68
fix cache key for inplaced elementwise ops ( #30404 )
4 years ago
wawltor
3d49882e2c
fix the out-of-bounds read bug in the rnn mask memory ( #30459 )
...
* fix the out-of-bounds read bug in the rnn mask memory
* update the code for the rnn
4 years ago
taixiurong
6a3c8725b0
support transformer v2.0 ( #30381 )
4 years ago
ShenLiang
e85be1b1b2
fix flatten api grad ( #30426 )
4 years ago
yaoxuefeng
6e0da01c61
Heter ps new ( #30198 )
4 years ago
123malin
2a98e9323a
test=develop, add distributed_infer ( #30300 )
...
* test=develop, add distributed_infer
4 years ago
QingshuChen
cf786d22ec
fix bug that can't find mkldnn(kunlun) ( #30394 )
4 years ago
cc
8e3a294045
skip quantizing ops in cpu inference ( #30342 )
...
* skip quantizing ops in cpu inference, test=develop
4 years ago
alncat
7bbf3ac5ab
Added support for inference using quantization aware trained dygraph ( #30288 )
...
* added support for inference using quantization aware trained dygraph
* added support for inference using quantization aware trained dygraph
correct boost get usage
* Delete incorrect warning message (#30196 )
* fix warning and no grad
* clean redundant API alias in 2.0 - part 2 (#30013 )
* delete paddle.nn.functional.assign
* fix dynamic to static error
* just add the op error message for the matmul xpu (#30246 )
add the op error message for the matmul xpu
* Add Static Variable Clone (#30208 )
Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat
* use wget to replace curl to download the lcov file (#30229 )
* use wget to replace curl to download the lcov file
* add cache for lcov
* fix test_pool3d_op timeout issue (#30248 )
* Fix unittests bugs. (#30250 )
* modify error message based on comments (#30189 )
* modify error message based on comments
* edit code according to review.
* Correct spelling according to review.
* Fix bug for 'save multiple method' (#30218 )
* Fix bug for 'save multiple method'
* To pass coverage.
* edit code to pass coverage.
* edit code to pass coverage.
* add unittest for coverage.
* change for coverage.
* edit for coverage.
* added support for inference using quantization aware trained dygraph
* Alias from paddle.fluid.layers.auc to paddle.static.auc (#30206 )
* add alias from fluid.layers.auc to static.auc
* Update __init__.py
* added support for inference using quantization aware trained dygraph
correct boost get usage
* corrected boost get usage
* corrected naming issues and enforcing zero check
* correct paddle enforce message
* added more error checkings
* corrected error report message and optimized code
* corrected findvar usage
* corrected paddle_enforce in scope
* correct error messages
* correct error reporting format
Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
Co-authored-by: Bai Yifan <me@ethanbai.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: WeiXin <weixin10@baidu.com>
Co-authored-by: Jiaqi Liu <liujiaqi06@baidu.com>
4 years ago
GaoWei8
180877e988
Softmax backward optimize ( #30249 )
...
* softmax backward optimize
4 years ago
Zhang Jun
10a8f3e5c3
fix bug on compiling inference shared lib with crypto;test=develop ( #30269 )
...
* fix bug on compiling inference shared lib with crypto;test=develop
* fix cmake bug when build inference lib using -DWITH_CRYPTO=OFF
* update cmake
* remove unnecessary enforce message
4 years ago
Huihuang Zheng
28e156c27f
Fix Sleep Error in enforce.h ( #30335 )
...
The usleep function in <unistd.h> only accepts arguments less than 1,000,000. The current call can exceed this limit, so it has to be fixed. This PR fixes a random CI error.
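One way to respect the usleep limit is to split a long delay into chunks that each stay below 1,000,000 microseconds. The sketch below shows that idea only; `ChunkMicros` and `SleepMicros` are hypothetical names, and the PR itself may have fixed the call differently.

```cpp
#include <unistd.h>  // usleep, useconds_t (POSIX only)

#include <cstdint>

// Hypothetical helper: size of the next chunk, always below 1,000,000 us.
inline uint64_t ChunkMicros(uint64_t remaining_us) {
  constexpr uint64_t kMaxChunkUs = 999999;
  return remaining_us < kMaxChunkUs ? remaining_us : kMaxChunkUs;
}

// Hypothetical helper: sleep for total_us microseconds without ever
// passing usleep an argument of 1,000,000 or more.
// Returns the number of usleep calls made.
inline int SleepMicros(uint64_t total_us) {
  int calls = 0;
  while (total_us > 0) {
    const uint64_t chunk = ChunkMicros(total_us);
    usleep(static_cast<useconds_t>(chunk));
    total_us -= chunk;
    ++calls;
  }
  return calls;
}
```

In C++11 code, `std::this_thread::sleep_for` with a `std::chrono` duration is a portable alternative that does not have this argument limit.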
4 years ago
Leo Chen
3d015f1cf5
Set expected place in child thread for dataloader to avoid costing cuda memory on other card ( #30338 )
...
* set expected place in child thread for dataloader
* set device id when set tensor from numpy
* revert tensor_py change
* add compile guard
* fix ci
* fix bug
4 years ago
QingshuChen
2c1bba02e4
optimize memcpy perf for kunlun ( #30291 )
...
* optimize memcpy perf for kunlun
* remove useless unit test for kunlun mean
* minor
4 years ago
ShenLiang
a60f17b89d
Support unused parameters in dynamic graph distributed ( #30224 )
4 years ago
JZ-LIANG
75936d838f
Recompute Offload ( #30233 )
4 years ago
lidanqing
a60893f6b5
correct the allowed dimension size ( #30326 )
4 years ago
Chen Weihang
c8c8f205ba
remove c++ stacktrace hint ( #30325 )
4 years ago
tangwei12
5e839e4da5
add sparse embedding & load vars for 2.0 & gloo bug fix ( #30306 )
...
* add sparse embedding & load vars for 2.0
Change-Id: I36b59ed5f015189dc9d9d2e34a9357722d369f1b
* fix hdfs gloo
Change-Id: Ia84d579053720ad804183e54c9a04b4f031c79c6
* fix gloo hdfs
Change-Id: I5ab982fd483cddc10adcdef0b8aa83aca976cb9e
* move loadvar/sparse embedding from incubate to static
Change-Id: I57081d3545ad2efab78c72420d2162c0eacaf3a0
4 years ago
tangwei12
25f80fd304
Fix/distributed proto ( #29981 )
...
* rename sendrecv.proto to namespace paddle.distributed
* split ps with distributed
4 years ago
Chengmo
d479ae1725
【Paddle.Fleet】Support local save sparse param ( #30175 )
...
* add save tensor support
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
4 years ago
Double_V
231501fefc
fix elugradgrad test fail & error message opt ( #30171 )
...
* fix elugradgrad test fail and error message opt
* fix unittest, test=develop
* Update prroi_pool_op.h
fix error message
* opt message,test=develop
* fix ci fail,test=develop
4 years ago
Zhen Wang
fb49ea388e
Fix the accuracy problem of allclose op when using float64 data type in static mode. ( #29890 )
...
* Fix the accuracy problem of allclose op when using float64 data type in static mode.
* Format the code style.
4 years ago
yaoxuefeng
4656525e24
fix datanorm error msg ( #30294 )
4 years ago
furnace
77051cc9f0
add fp16 support for tril_triu op ( #30186 )
4 years ago
石晓伟
efa54629fb
fix header file paths of gflags, commit 3, test=develop ( #30273 )
4 years ago
Chengmo
5b2c15afcd
Fix server.h include device_context ( #30243 )
...
* fix cmake
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
4 years ago
石晓伟
a0ee09148e
enhance error msgs of fusion_seqpool_cvm_concat_op.cc, test=develop ( #30240 )
4 years ago
石晓伟
a66eebab5c
fix header file paths of gflags, commit 4, test=develop ( #30274 )
4 years ago
石晓伟
8c4500ff6d
fix header file paths of gflags, commit 2, test=develop ( #30272 )
4 years ago
liym27
b4989fb744
Support vector<double> as an op attribute type; set_value op supports vector<double> as value ( #30126 )
4 years ago
wangchaochaohu
8dcae0c55d
register OPMaker and Infer Shape Check for fused_elementwise_add ( #30259 )
4 years ago
AshburnLee
924aac2216
Add tf32 switch for cuDNN ( #29192 )
4 years ago
石晓伟
8ce2482b80
fix header file paths of gflags, commit 1, test=develop ( #30271 )
4 years ago
chentianyu03
c7371b7b20
type promotion for grad ( #30177 )
...
* type promotion for grad
* add type promotion for div op
4 years ago
liym27
3ce878f309
Check the rank of input in kernel of set_value op ( #30147 )
4 years ago
WeiXin
66dc4ac77b
modify error message based on comments ( #30189 )
...
* modify error message based on comments
* edit code according to review.
* Correct spelling according to review.
4 years ago
wawltor
fee424411a
just add the op error message for the matmul xpu ( #30246 )
...
add the op error message for the matmul xpu
4 years ago
GaoWei8
0a21924a8d
optimize softmax forward ( #30217 )
...
* optimize softmax forward
4 years ago
wangchaochaohu
af80859dd6
reduce the occupied size of memory for the fused pattern of elementwise_add Op and activation Op(relu Op for example) ( #29885 )
4 years ago
zhang wenhui
5932fee60a
enhance error message, test=develop ( #30220 )
4 years ago
pangyoki
da16b33f2e
add View(reuse allocation) strategy on squeeze, unsqueeze, reshape, flatten op ( #29913 )
...
* add view strategy on squeeze,unsqueeze,reshape,flatten
* add squeeze unittest
* add unittests
* use View strategy as name rather than Reuse Allocation
* fix view api doc
* fix format
* use core.ops when input of reshape2 is Tensor
* fix test_cross_entropy_loss error because of reshape2
* delete selected_rows
* change op_function
* little change
* solve HandleViewBetweenInputAndOutput
4 years ago
Jacek Czaja
4aba17b5db
[oneDNN] Added UT for testing elementwise_mul caching ( #30203 )
...
* - Added UT for testing elementwise_mul caching
* lint fixes
4 years ago
Zhen Wang
7f7dfccf20
Support pure fp16 training for AMP API. ( #29544 )
...
* add cast ops before and after unsupported fp16 ops.
* Keep partial net in FP32 pattern.
* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
* Add fp16 support for adam op.
* add multi precision attr for adam.
* Fix the bug of test_multi_precision_fp16_train UT.
* Code format for CI.
* Fix the redefine error about MPTypeTrait on windows.
* fix bugs of the _create_accumulators func in Momentum.
* fix bug when inserting post cast op.
* Add the update_loss_scaling op in allow_set of UnusedVarCheck.
* Update for ci coverage.
* Add some doc for OptimizerWithMixedPrecision.
* Fix the code style.
* Improve the doc of `amp_init`.
* Change for fp16 testing if users have the infer program defined in separate way.
4 years ago
Leo Chen
789743e190
use cuda generator in bernoulli cuda kernel ( #30199 )
4 years ago
Leo Chen
8696335f86
Fix dtype of ungenerated grad var ( #28511 )
...
* fix dtype of ungenerated grad var
* update ut
* refine code
* set default dtype
* fix could_use_cudnn bug
* remove debug code
* re-implement
* fix bug
4 years ago
Wilber
609c022222
shape op support int8 and uint8 tensor ( #30201 )
4 years ago
Wilber
01a287bf0a
fix windows compile when WITH_PYTHON=ON and WITH_TENSORRT=ON ( #30194 )
4 years ago
ruri
e42e1e80dc
Add version checking, test=op_version ( #30129 )
4 years ago
Leo Chen
1f97d61c68
Add callback after TensorCopy ( #30123 )
...
* change to tensor copy sync
* change to tensor copy sync
* make copy_to safe when use TensorCopy
* refine code
* add ut
* add cudapinned garbagecollector
* add testcase: cpu place -> cuda pinned place
4 years ago
Chengmo
528e03fc08
【Paddle.Fleet】Fix tensor table ( #30075 )
...
* add tensor table
4 years ago
Wilber
ade244948c
disable mkldnn inplace pass on windows ( #30164 )
4 years ago
joanna.wozna.intel
907262ee15
Fix analysis predictor test ( #30191 )
...
* Add a necessary condition
* Remove test for white list and add header
4 years ago
lijianshe02
2dc7ee276b
enhance error message of nll_loss op test=develop ( #30125 )
...
* enhance error message of nll_loss op test=develop
4 years ago
Huihuang Zheng
54bf3f5a56
Refine PADDLE_ENFORCE Error Messages. test=develop ( #30149 )
...
Improve some error messages in parallel_executor.cc, conditional_block_op.cc, recurrent_op.cc
4 years ago
Chen Weihang
d0fb06b27f
[Complex] Simplify prepared op impl to improve performance ( #30153 )
...
* simplify prepared op impl to improve performance
* fix kunlun compile error
* continue fix kunlun compile error
* only transform diff place when dtype diff
* fix failed unittests
* remove useless file
* polish impl by review comment
4 years ago
123malin
c5b415bfd9
Improve Index select cuda kernel ( #30139 )
...
* test=develop, add index_select_cuda kernel
4 years ago
wangchaochaohu
7dd551e08b
refine the paddle place support using str ( #28769 )
4 years ago
WeiXin
404c16763a
Add detailed error message for curandStatus_t, cublasStatus_t, cusolverStatus_t ( #30161 )
4 years ago
Wilber
91a8a25721
enhance error info for py_func ( #30138 )
...
* enhance error info for py_func
* update
4 years ago
weihaoji
b8207af6bc
[XPU] Remove lite_xpu ut lite_resnet50_test since fusion pass changes introduced precision diff. test=develop ( #30122 )
4 years ago
liuyuhui
15fac5e7fa
fix assign_op_xpu concat_op_xpu warning ( #30120 )
4 years ago
Jack Zhou
f5428eca4f
fix enforce msg of sum xpu op ( #30113 )
4 years ago
123malin
198fbdfb60
Add Lookahead and ModelAverage Optimizer ( #30004 )
...
* test=develop, add model_average and lookahead
4 years ago
Leo Chen
adac38c506
add dispensable input for core.ops.reshape2/expand/slice ( #30072 )
...
* add dispensable input 'shape' for core.ops.reshape2
* add dispensable inputs for core.ops.reshape2/expand/slice
* add ut
4 years ago
ShenLiang
becf99d2e8
fix error message ( #30135 )
4 years ago
Zhou Wei
30888ca343
Polish and Optimize the print/repr information of Layer ( #29998 )
...
* Polish and Optimize the print/repr message of all layer
* fix some code format
4 years ago
wangguanzhong
69839f8a9a
fix error message for distribute_fpn_proposals_op ( #30116 )
4 years ago
QingshuChen
8e1c3ddf15
add aarch64 and sunway kunlun lib ( #30027 )
...
* add aarch64 and sunway kunlun lib
* minor
* optimize elementwise_add for kunlun
* update kunlun dependence
* minor
* minor
4 years ago
Shang Zhizhou
05b27695f1
add inference api: DisableTensorRtOps ( #30109 )
...
* snap
* add inference api: DisableTensorRtOPs
* fix code style
* update api to experimental
* update variable name
4 years ago
石晓伟
53bb126510
fix a bug in op_version_registry, test=develop, test=op_version ( #29994 )
4 years ago
xiemoyuan
3e0c492910
Optimize the error message of framework. ( #30134 )
4 years ago
liym27
9922bd4125
Fix bug: In dynamic mode, if start or end is negative, __getitem__ returns wrong result ( #30003 )
...
1. when slice_item is a slice:
   1) the start of __getitem__ should be std::max(start, 0)
   2) the end of __getitem__ should be std::min(end, dim)
2. when slice_item is an integer, it should be in [-dim_len, dim_len)
3. Fix error message to use accurate data
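The index rules in this commit message can be sketched as two small helpers. This is a minimal illustration, assuming that the std::min(end, dim) clamp applies to the slice end and that negative values are first shifted by the dimension length (the usual Python convention). `NormalizeSlice` and `NormalizeIndex` are hypothetical names, not functions from the Paddle source.

```cpp
#include <algorithm>
#include <cstdint>
#include <stdexcept>
#include <utility>

// Hypothetical helper: normalize a [start, end) slice against a dimension
// of length dim. Assumption: negative values count from the end, so they
// are shifted by dim before clamping.
inline std::pair<int64_t, int64_t> NormalizeSlice(int64_t start, int64_t end,
                                                  int64_t dim) {
  if (start < 0) start += dim;
  if (end < 0) end += dim;
  start = std::max<int64_t>(start, 0);  // rule 1.1: clamp start from below
  end = std::min<int64_t>(end, dim);    // rule 1.2: clamp end from above
  return {start, end};
}

// Hypothetical helper: an integer index must lie in [-dim, dim).
// Returns the equivalent non-negative index, or throws if out of range.
inline int64_t NormalizeIndex(int64_t index, int64_t dim) {
  if (index < -dim || index >= dim) {
    throw std::out_of_range("index is outside [-dim_len, dim_len)");
  }
  return index < 0 ? index + dim : index;
}
```

For a dimension of length 5, a slice written as `-3:100` would normalize to start 2 and end 5 under these assumptions, and the index `-1` would map to 4.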
4 years ago
chentianyu03
666e665132
change the kron gradient when complex types ( #29995 )
4 years ago
chentianyu03
a5e422c85d
add trace op_register_version and fix version bug; test=op_version ( #30000 )
...
* add trace op_register_version and fix default bug; test=op_version
* add trace op_register_version; test=op_version
* fix missing the template bug of vector; test=op_version
4 years ago
cc
9f34374b48
Fix the format of raising error in randperm op ( #30108 )
...
* fix the format of raising error in randperm op
4 years ago
liuyuhui
254ad61959
fix xpu pe sync, test=notest ( #30095 )
4 years ago
Thunderbrook
0b8e1fadc5
add topo-aware in heter-ps ( #30087 )
...
* add topo aware
* resource.h
* topo aware
* format
4 years ago
hong
297fff1a79
support dygraph in xpu place ( #30051 )
...
* support dygraph in xpu place; test=develop
* fix cpu/gpu compile error; test=develop
* fix compile error; test=develop
* fix xpu compile error; test=develop
4 years ago
wangchaochaohu
d0a5620575
fix the compiler error when gcc4 cuda9.0 ( #29997 )
4 years ago
WangXi
ee16006b5d
Optimize grad merge performance ( #29784 )
4 years ago
yongqiangma
e891f4da1b
Add p_norm op version info ( #30042 )
...
* p_norm fix op version info. test=develop
4 years ago
tangwei12
7d1c149e09
for inference checkpoint ( #30081 )
...
* for inference checkpoint
Change-Id: I36c979240ffa55bf1ef0c9315402960762af6be4
* for inference checkpoint
Change-Id: I82025365d5b792cbea1ead506df685aecc8ac198
4 years ago
tangwei12
7d4bdff07d
fix large scale memory ( #30035 )
...
* memory holder optimize
Change-Id: Ic91af8ac6f2853336d28a9fbbc5e8d0c57b5d05e
* memory holder optimize
Change-Id: I2fd1c14ecc17f5d5ce88b87890381ea801e6367f
* fix large scale memory holder
Change-Id: Ief0992b02b00220e16c72cc637a56e7b5788140f
* fix large scale memory holder
Change-Id: I910142a3952ead643a5604f8f80955f3e6efe655
4 years ago
Shang Zhizhou
08dc5bc27e
fix op version checker of pass bug ( #30028 )
...
* fix op version checker of pass bug
* fix code style
* update pass version
4 years ago
cc
68398abce9
[Inference] zero_copy_tensor supports int8_t ( #30053 )
...
* zero_copy_tensor supports int8_t
4 years ago
whs
1b999d2b5d
Add version checking ( #30040 )
4 years ago
ceci3
85b2f05ab0
register ModifyAttr for instance_norm, test=op_version ( #30065 )
...
* register instance norm, test=op_version
4 years ago
channings
ddcff254db
fix op_register_version for compare ops, test=op_version ( #30007 )
...
Co-authored-by: zhoushunjie <zhoushunjie@baidu.com>
4 years ago
Wilber
66e16b7e99
update lite subgraph. ( #30056 )
4 years ago
GaoWei8
a64822589f
add REGISTER_OP_VERSION for LSTM ( #30038 )
4 years ago
yinhaofeng
6e93fb92f9
Register op version for linspace,test=op_version ( #30025 )
...
* Register op version for linspace,test=op_version
4 years ago
123malin
d0056c324d
test=develop, add op_register_version for roll_op ( #30023 )
...
* test=develop, add op_register_version for roll_op
4 years ago
chentianyu03
e012930aa3
complex gradient matmul ( #29966 )
...
* dot op support complex types
* matmul support complex types
* add test case
* matmul broadcast gradient support complex
* move conjFunctor to complex_functor.h
4 years ago
ShenLiang
893d37e5c6
Fix rank_attention op_version, test=op_version ( #30006 )
...
* fix rank_attention, test=op_version
4 years ago
Adam Osewski
13aef97043
operator checkpoints for new attributes. ( #29832 )
...
* Add operator checkpoints for new attributes.
* Fix adding subsequent checkpoint to quantize op.
4 years ago
wangguanzhong
844d8e0c2c
add REGISTER_OP_VERSION for generate_proposals, roi_align, roi_pool test=op_version ( #30034 )
4 years ago
cc
c3c064a8fc
Add mkldnn nearest_interp and bilinear_interp op ( #30016 )
...
* Add mkldnn nearest_interp and bilinear_interp op
* don't run mkldnn interpolate in default
* add interpolate_mkldnn_pass
4 years ago
chalsliu
c053bf2a57
Revert "register ModifyAttr for instance_norm, test=op_version ( #29938 )"
4 years ago
wawltor
cc2f94620c
add support for the op version check for matmul, test=op_version ( #30011 )
...
* add support for the op version check for matmul, test=op_version
4 years ago
wawltor
b33aaea86c
add the op version check for the elementwise ops, test=op_version ( #30010 )
...
* add the op version check for the elementwise ops, test=op_version
* add the support check for elementwise_ops, test=op_version
4 years ago
Chengmo
4cbcc9b6da
fix momentum op register ( #29941 )
...
* fix momentum op register
4 years ago
hutuxian
7c1f69bdf0
add op_version for flip op [test=op_version] ( #30019 )
4 years ago
ceci3
77c1684397
register ModifyAttr for instance_norm, test=op_version ( #29938 )
...
* upgrade instance_norm, test=op_version
* fix
4 years ago
Leo Chen
47d10c55d5
Enhance debugging ( #30001 )
...
* add debug code
* add place info
* fix compile problem
* add place for output
4 years ago
FlyingQianMM
d42f93e504
add op_register_version for allclose op; test=op_version ( #29968 )
4 years ago
wawltor
8f49f9d5c9
change the elementwise ops version check, test=op_version
...
change the elementwise ops version check, test=op_version
4 years ago
guofei
b23faf37be
Add moving_average_abs_max_scale op_register_version test=develop ( #29957 )
...
Add moving_average_abs_max_scale op_register_version
4 years ago