zlsh80826
ac63c7cdef
fix a skip_layernorm bug, test=develop ( #26800 )
5 years ago
Jiawei Wang
a1b99fae07
Adadelta Optimizer ( #26590 )
...
* add doc; notest
* fix doc; notest
* update doc; notest
* refine optimizer && adam
* refine optimizer; notest
* add adam
* fix doc
* fix doc && add adamw; notest
* add error message
* bug fix
* refine rmsprop && adamax
* fix ci
* bug fix
* update comment
* unify arguments place; notest
* fix ut, test=develop
* bug fix
* fix conflicts, test=develop
* add examples code
* bug fix
* fix comments
* fix sample code
* add sample code for Optimizer
* add adamax ut, test=develop
* fix rmsprop ut, test=develop
* add ut for optimizer.py and adamw.py
* first commit of adadelta optimizer
* fix learning rate
* fix adadelta doc and add sgd momentum
* remove unused fluid
* fix codestyle
* Update test_adam_op.py
* Update test_adam_op.py
* fix SGD in 2 unittests
* fix SGD in 2 unittests
* fix ci
* fix ut
Co-authored-by: MRXLT <xlt2024@gmail.com>
Co-authored-by: mapingshuo <mps2012@yeah.net>
5 years ago
LielinJiang
346689c6f1
Register conv_transpose Op version for compatible Op upgrades ( #26745 )
...
* fix bug
* add version check
* fix docs, test=document_fix
* fix formula, test=document_fix
5 years ago
Adam
8bcb1f29d9
Add conv+affine_channel fuse pass to MKLDNN pass strategy and fix it ( #26779 )
5 years ago
Wilber
68e0560c2f
refine paddle inference api ( #26774 )
...
* refine paddle inference api
Co-authored-by: nhzlx <nhzlx.dragon@gmail.com>
5 years ago
iducn
64df9b99a9
add shell of GPU version ( #26589 )
5 years ago
Wojciech Uss
7afb1df11e
Decouple weights and bias from fc primitive in MKLDNN cache ( #26708 )
...
* decouple weights and bias from fc primitive in cache
* removed redundant update of pointers
5 years ago
Zhen Wang
f32ae272ec
Remove `sorted_sum_gradient_` from BasicEngine and PartialGradTask. ( #26766 )
...
Use `Tensor` instead of `Variable` in the doc of paddle.grad.
5 years ago
Leo Chen
844583c8fd
Refine paddle.manual_seed ( #26496 )
...
* refine manual seed
* fix ci problem
* fix unittests
* fix unittest
* set is_init_py=false in manual_seed
* fix unittest
* fix bernoulli_op
* fix(unittest): change random_seed to manual_seed
* 🐞 fix(unittest): fix manual_seed
* trigger ci
* fix test_sentiment
* fix test_imperative_save_load
* fix test_uniform_random_op
* fix test_uniform_random_op
* fix test_jit_save_load
* merge develop
* fix manual_seed
* fix manual_seed
* use global engine
* use shared_ptr
* fix double free
* fix bug
* fix bug
* fix bug
* fix test bug
* fix test bug
* fix test bug
* fix ci
5 years ago
Zhou Wei
2d88b9ffe7
turn on WITH_INFERENCE_API_TEST ( #26746 )
5 years ago
Pei Yang
e3f8e5cf5c
trt int8 support conv2d_transpose ( #26636 )
5 years ago
ShenLiang
29494d703d
fix remainder, floor_div ( #26732 )
...
* fix remainder, floordiv
5 years ago
zhangchunle
623a4c2e56
fix ci coverage build error ( #26761 )
5 years ago
lilong12
5f524efe56
modify error report message, test=develop ( #26743 )
5 years ago
wangchaochaohu
4561fc37e2
Add check point for gather Op ( #26696 )
5 years ago
joanna.wozna.intel
eb097d64f6
Fix int8 performance drop in cpu_quantize_placement_pass ( #26715 )
...
* Fix cpu quantize placement pass
* Include string lib
5 years ago
joanna.wozna.intel
02083bda40
Add mkldnn bfloat16 option to C-API ( #26676 )
...
* Add mkldnn bfloat16 option to C-API
* Add test for bfloat16 gpu
* Change coverage test
5 years ago
LutaoChu
1ec30cb160
register cumsum Op version for compatible Op upgrades ( #26734 )
...
register cumsum Op version for compatible Op upgrades
5 years ago
Jack Zhou
c282db3a93
add broadcast feature for elementwise logical op
...
add broadcast feature for elementwise logical op
5 years ago
Yang Zhang
63eef7632e
Fix clip input check ( #26683 )
...
* Fix clip input check
* Fix default min/max value
* Allow both max and min to be None
* Register op change
* Revert OP signature change
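The "Allow both max and min to be None" behavior can be illustrated with a minimal pure-Python sketch. This is not the Paddle implementation: the function name `clip_sketch` and the parameter names `min_value` and `max_value` are hypothetical, and the only assumption carried over from the commit message is that a bound left as `None` is simply not applied.

```python
from typing import List, Optional, Sequence


def clip_sketch(values: Sequence[float],
                min_value: Optional[float] = None,
                max_value: Optional[float] = None) -> List[float]:
    """Clamp each element to [min_value, max_value].

    Hypothetical helper: a bound that is None is skipped, so calling it
    with both bounds set to None returns the values unchanged.
    """
    out = []
    for v in values:
        if min_value is not None and v < min_value:
            v = min_value  # apply the lower bound only when it is given
        if max_value is not None and v > max_value:
            v = max_value  # apply the upper bound only when it is given
        out.append(v)
    return out
```

For example, `clip_sketch([-2, 0.5, 3], min_value=0)` clamps only from below, while `clip_sketch([-2, 0.5, 3])` leaves every element as it was.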
5 years ago
Zhen Wang
f9066e6a6f
Update the demo code and the doc of varbase.backward. ( #26506 )
...
* update the demo code and the doc of varbase.backward.
* update the doc of the fake interface `paddle.fluid.Variable`.
* remove BackwardStrategy.
5 years ago
Wilber
1c898b66d6
add bug fix enum. ( #26736 )
5 years ago
Zhou Wei
8071d23073
fix bug that can't print int8_t ( #26712 )
...
fix bug that can't print int8_t
5 years ago
joejiong
f311d3c1cf
Fix pow api type error with python side method, merge elementwise_pow and pow. ( #26163 )
...
As the title
5 years ago
yongqiangma
e4cc6a28b0
Norm op support 2-axis ( #26492 )
5 years ago
chalsliu
dc56c89822
Add the option to execute unit tests only at night ( #26669 )
...
* Add the option to execute unit tests only at night
* set ut nightly label for 3 cases.
5 years ago
xiaoting
89d7d86684
add interpolate_v2 ( #26520 )
...
* add interpolate_v2
* fix linear interp
* polish unittest, test=develop
* update code samples to 2.0 API, test=develop
* remove warning, test_develop
* add name in attrs, test=develop
* polish code, test=develop
* change Align to align, test=develop
* fix unittest in py3,test=develop
* fix coverage, test=develop
* fix coverage, test=develop
* fix for windows ci, test=develop
* fix coverage, test=develop
5 years ago
Adam Osewski
c2c689582e
Update Paddle-Lite commit hash. ( #26413 )
...
* Update Paddle-Lite commit hash.
* Add BF16 data type to VarType protobuf message.
5 years ago
Zhang Ting
97cebfa4d3
add dtype for unique ( #26655 )
...
* update doc, test=document_fix
* add attr(dtype)
* refine code
5 years ago
lilong12
1c68138327
[api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis ( #26552 )
...
add collective op for cpu using gloo and paddle.distributed.* apis
5 years ago
joanna.wozna.intel
559e43eee4
Small change in conv2d and quantize pass ( #26671 )
5 years ago
Bai Yifan
8986a82131
fix adaptive gpu grad bug, add doc refine ( #26660 )
5 years ago
wawltor
286eca2d9e
update the code for the topk v2
...
add the top v2 for the paddlepaddle api 2.0
5 years ago
whs
f82384113b
Fix atomicAdd in grid sample op and affine grid op ( #26647 )
...
test=develop
5 years ago
Wilber
32ba8602c6
Enhance py_func error info message. ( #26557 )
5 years ago
chalsliu
cb3f131f1c
Set timeout property for a few unittests
5 years ago
石晓伟
32ceacf317
update op_version_registry, test=develop ( #26644 )
5 years ago
RandyLi
2f5bdd8dc7
Remove WOBOQ, gen_html() and sphinx ( #26128 )
5 years ago
Dong Daxiang
08d736ad78
【paddle.fleet】add cudnn related strategies to DistributedStrategy ( #26598 )
...
* add cudnn related strategies to DistributedStrategy
5 years ago
Zhang Ting
0a895bc0df
improve unique op ( #26537 )
...
* add unique_v2 op
* remove unique_v2 op
* update doc
5 years ago
whs
a004dfde3d
Use atomicAdd defined in paddle framework ( #26631 )
...
test=develop
5 years ago
LoveAn
02fc1fef8b
Fix the cmake-function named inference_download_and_uncompress on Windows ( #26512 )
...
* Fix the cmake-function named inference_download_and_uncompress with Windows, test=develop
* Fix some problems when remove limit of unittests on Windows, test=develop
* Using URL to download file instead of DOWNLOAD_COMMAND. test=develop
5 years ago
YUNSHEN XIE
a8b5741fb4
add a few unittests for setting timeout property ( #26630 )
5 years ago
zhangchunle
ef317b4b14
add mac tests failed exitcode ( #26611 )
5 years ago
wanghuancoder
c1f5df5269
optimized transformation from tensor to numpy ( #26447 )
...
* optimized transformation from tensor to numpy, test=develop
* optimized transformation from tensor to numpy, pass pre-commit, test=develop
* modify fetchophandle zerocopy to deepcopy in PE&CPU, test=develop
* modify py:array construct, test=develop
* fix _fetch_var to use deep copy, test=develop
5 years ago
zhupengyang
c80fcf901e
reduce_mean error if keepdim=True and reduce_all=True ( #26614 )
5 years ago
whs
a065a24232
【2.0 API】Enhance affine grid operator ( #26385 )
...
* Enhance affine grid operator:
1. Add cuda kernel
2. Add align corners options
test=develop
* Move new affine_grid api to functional
test=develop
* Add CUDA kernel for affine_grid.
test=develop
* Add more unittests for grid sample API
test=develop
5 years ago
Qi Li
6f69fbc8ea
fix elu grad when alpha less than zero, test=develop ( #26543 )
5 years ago
whs
786373ba29
Use atomicAdd defined in paddle framework ( #26628 )
...
test=develop
5 years ago
ruri
1f82c0cd62
[Api2.0] add pixel shuffle ( #26071 )
5 years ago
Zhou Wei
1ed74aae7c
fix msbuild log level ( #26607 )
5 years ago
wanghuancoder
422a162019
api2.0 paddle.nn.Bilinear and paddle.nn.functional.bilinear ( #26399 )
...
* api2.0 paddle.nn.Bilinear and paddle.nn.functional.bilinear, test=develop
* api2.0 fix code examples, test=develop
* modify test_bilinear_api, about place,to_tensor , test=develop
* re pass pre-commit, test=develop
* Update common.py
* fix BilinearTensorProduct ci error, test=develop
5 years ago
wanghuancoder
6e823cfec3
add op_function_generator.exe retry in windows, test=develop ( #26591 )
...
add op_function_generator.exe retry in windows
5 years ago
石晓伟
fa08a834be
update op_version_registry, test=develop ( #26592 )
5 years ago
whs
79539cf198
【2.0 API】Add CUDA kernel and enhance options for grid_sample ( #26576 )
...
This PR enhances the CPU kernel and adds a new CUDA kernel so that grid_sample supports:
- align_corners: a bool.
- padding mode: one of ['zeros', 'reflect', 'border'].
- interpolation mode: one of ['bilinear', 'nearest'].
The old CPU and CUDNN versions only support align_corners=true, padding_mode='zeros' and interpolation_mode='bilinear'.
The default behavior of the new op is compatible with the old version.
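To make the three options concrete, here is a hedged one-dimensional sketch in pure Python. It uses the coordinate convention commonly used for grid sampling (normalized coordinates in [-1, 1]) and has not been checked against the Paddle kernels. The names `unnormalize` and `sample_1d` are hypothetical, the 'reflect' padding mode is omitted for brevity, and 'nearest' is simplified to round-half-up.

```python
import math


def unnormalize(coord: float, size: int, align_corners: bool) -> float:
    """Map a normalized coordinate in [-1, 1] to a pixel index (assumed convention)."""
    if align_corners:
        # -1 and 1 refer to the centers of the first and last pixels.
        return (coord + 1.0) / 2.0 * (size - 1)
    # -1 and 1 refer to the outer edges of the first and last pixels.
    return ((coord + 1.0) * size - 1.0) / 2.0


def sample_1d(row, coord, align_corners=True, mode="bilinear", padding_mode="zeros"):
    """Sample one value from a 1-D row; a simplified stand-in for grid_sample."""
    size = len(row)
    x = unnormalize(coord, size, align_corners)
    if padding_mode == "border":
        # Clamp the sampling position into the valid index range.
        x = min(max(x, 0.0), size - 1.0)
    elif padding_mode != "zeros":
        raise ValueError("only 'zeros' and 'border' are sketched here")

    def at(i):
        # Out-of-range reads return 0, which implements 'zeros' padding.
        return row[i] if 0 <= i < size else 0.0

    if mode == "nearest":
        return at(math.floor(x + 0.5))
    if mode == "bilinear":
        left = math.floor(x)
        frac = x - left
        # Linear blend of the two neighbouring pixels (bilinear reduces to this in 1-D).
        return at(left) * (1.0 - frac) + at(left + 1) * frac
    raise ValueError("mode must be 'bilinear' or 'nearest'")
```

With `row = [10, 20, 30, 40]`, the coordinate `1.0` lands on the last pixel center when `align_corners=True`, but half a pixel past it when `align_corners=False`, which is where the padding mode starts to matter.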
5 years ago
Guanghua Yu
8645591d66
support fp64 in huber_loss cuda kernel ( #26583 )
5 years ago
yaoxuefeng
efee426742
support generator seed in related kernels test=develop ( #26495 )
5 years ago
Zhong Hui
bf4a4636f1
change to use bce_loss op, add shape check for bce_loss
...
change to use bce_loss op, add numel check for bce_loss.
5 years ago
ShenLiang
0e81626081
add div, floor_div, remainder ( #26562 )
...
* add div, floor_div, remainder
5 years ago
石晓伟
656e60b18f
new class: op_version_registry, test=develop ( #26542 )
5 years ago
qingqing01
24566e951c
Support empty bbox in bipartite match op ( #26488 )
5 years ago
Jack Zhou
199b0c7c1b
Add isfinite v2 op ( #26344 )
...
add the isnan, isfinite, isinf api for the paddle 2.0
5 years ago
Zhou Wei
28554c3f85
add --user for pip ( #26440 )
5 years ago
wangchaochaohu
ebf9b2125e
add paddle.gather for API2.0 ( #26455 )
5 years ago
wangchaochaohu
9219b79104
gather_nd Op for API 2.0 refine ( #26540 )
5 years ago
zhupengyang
9b14117cac
logsumexp: impl kernel, refine docs ( #26307 )
5 years ago
Wojciech Uss
5c2b9258a6
Fix (de/re)quantize cache keys ( #26549 )
5 years ago
YUNSHEN XIE
df7fe1fe23
fix unittests run with error of Expression too big ( #26573 )
5 years ago
wawltor
6b28456ed0
add the argmax, argmin for the api2.0
...
* add the new api and op for the argmax, argmin
5 years ago
LielinJiang
d26ae9ad87
Update conv_transpose api ( #26427 )
...
* update conv_transpose api
5 years ago
lilong12
faa9b97b78
fix cscatter, test=develop ( #26554 )
5 years ago
WangXi
45711dade7
【API】rename div to divide, add floor_divide, remainder ( #26434 )
5 years ago
LutaoChu
4e0c6d91aa
add paddle.tensor.linalg.diag API, diag_v2 OP and CUDA kernel
...
add paddle.tensor.linalg.diag API, diag_v2 OP and CUDA kernel.
5 years ago
zhupengyang
f8863e0603
leaky_relu and LeakyReLU: alpha->negative_slope ( #26216 )
5 years ago
ShenLiang
c609066074
Add Matmul op ( #26411 )
...
* add matmul_v2
5 years ago
Leo Chen
aa2a9b5d89
add bernoulli op ( #26511 )
...
* add bernoulli op
* fix cuda kernel and add unit test
* refine doc
* fix uniform
5 years ago
Adam
f3909020de
Add mechanism for blocking oneDNN cache clearing ( #26502 )
...
* Add mechanism for blocking oneDNN cache clearing
* Review changes and Add thread guards
5 years ago
ShenLiang
b6eb37f5b3
add error message for cholesky ( #26444 )
...
* add error message
5 years ago
QingshuChen
138ecf24aa
support Baidu Kunlun AI Accelerator ( #25959 )
...
* support Baidu AI Accelerator
* test=kunlun
* minor
* test=kunlun
* support xpu op in separate file
* test=kunlun
* update XPU error message and remove duplicated code
* test=kunlun
* minor
* test=kunlun
* minor
* test=kunlun
5 years ago
yaoxuefeng
4f259354d2
mod cvm test=develop ( #25146 )
...
* mod cvm test=develop
* mod code format test=develop
5 years ago
wangchaochaohu
e167e87974
【API2.0】add masked_select Op for API2.0 ( #26374 )
5 years ago
Pei Yang
379222c3f1
add output scale and trt op teller support for hard_swish and hard_sigmoid ( #26499 )
5 years ago
zhupengyang
6e5670b8bd
mean: not support int32, int64; add check for axis ( #26401 )
5 years ago
zhupengyang
4ad504e7c7
hardshrink: support threshold < 0 ( #26403 )
5 years ago
lilong12
e92f770c42
Add collective ops (reduce) ( #26340 )
5 years ago
wangchaochaohu
bdb805505e
【API2.0】add numel API for paddle test=develop ( #26311 )
5 years ago
wangchaochaohu
2073ffc04d
Enhance the data type of linspace API ( #26247 )
5 years ago
hong19860320
40d193ed17
Add the ReLU6, Tanhshrink, SELU, Softplus, Softshrink and Softsign for the api 2.0 ( #26376 )
5 years ago
Chen Weihang
9108282883
Polish framework error message part 5 ( #26204 )
...
* polish framework error msg part 5
* revert enforce change
* refine error type
* trigger ci check
* polish details by review comment
5 years ago
Zhaolong Xing
f00f982a02
add cub impl for arg max, min ( #25941 )
...
test=develop
5 years ago
Zhang Ting
6914a12f82
rename the inputs of allclose ( #26360 )
...
* rename input
* add unittest, test=develop
* use paddle.data instead of fluid.data, test=develop
5 years ago
YUNSHEN XIE
e3612de8d7
add failed unittests retry ( #26342 )
5 years ago
littletomatodonkey
bcf03273f6
add pad func ( #26106 )
...
* add pad func
* add pad
* test=develop, add pad op and apis
* restore pad2d
* test=develop, fix paddle declare
* fix pad interface
* test=develop, fix pad
* test=develop, add all pad api and cos_sim
* test=develop, remove padding default value
* test=develop, rename var to tensor
* test=develop, add more tests
* test=develop, rename tovar to totensor
* test=develop, fix init
* test=develop, add more test
* test=develop, add more tests
5 years ago
Chengmo
eeeef957c7
Fix ps gpu ( #26218 )
...
* support ps-gpu
5 years ago
Zhong Hui
6cbeafb6c0
add zero norm, inf norm support for p_norm op ( #26364 )
...
* add zero norm, inf norm support for p_norm op
* fix the invalid argument check, fix the dtype problem in test case.
5 years ago
tianshuo78520a
029390b1d2
fix ci bug ( #26276 )
5 years ago
Tao Luo
1b03ab3899
set opencv-python <=4.2.0.32 ( #26415 )
5 years ago
Zhaolong Xing
b7a86e92a8
fix dy shape bug in trt7.1 ( #26273 )
...
test=develop
5 years ago
ceci3
56890dc729
Add SyncBatchNorm ( #26032 )
...
* add SyncBatchNorm,test=develop
5 years ago
GaoWei8
1fbee267d4
remove scope in cudnn lstm ( #25188 )
5 years ago
Zhou Wei
da29760d58
add msvc log from quiet to minimal ( #26383 )
5 years ago
Pei Yang
b757466b0d
fix trt dynamic ernie serialization unit test ( #26228 )
5 years ago
Wilber
3ec0bcbbb8
[Bug] Fix prune for save_inference_model about transformer ( #25347 )
5 years ago
cc
3f816bc8b4
[Quantization] Conv2d_transpose and mul support channelwise quantization ( #25639 )
...
* Conv2d_transpose and mul support channelwise quantization, test=develop
* Skip collecting out threshold for output tensor of which the type is not fp32 or fp64, test=develop
* Fix error in test_user_defined_quantization, test=develop
* Add depthwise_conv_bn_fuse, test=develop
* Add conv_transpose_bn_fuse_pass for post_training_quant, test=develop
5 years ago
lilong12
638bbb6153
Improve expand as ( #26290 )
...
align expand_as op to expand.
5 years ago
Thunderbrook
a83e0f264c
fix heter proto ( #26093 )
...
test=develop
5 years ago
Leo Chen
049ac56c08
Print user-friendly error message in core.ops [part 2] ( #26377 )
5 years ago
zhupengyang
586a6dd358
log_softmax and LogSoftmax: impl kernel and refine docs ( #26088 )
5 years ago
yaoxuefeng
23261ff44b
add cpu random Generator ( #26013 )
5 years ago
Sylwester Fraczek
69742bd9a4
Enable mkldnn layout conversion ( #25778 )
...
* enable mkldnn layout conversion
* review fix: remove tmp_place
* fix test mkldnn swish
* add UT for PrepareData CPU->MKLDNN
* add #ifdef PADDLE_WITH_MKLDNN
* Force-push commit
Co-authored-by: grygielski <adam.grygielski@gmail.com>
5 years ago
Leo Chen
672578a797
Print user-friendly error message in core.ops ( #26261 )
...
* print user-friendly error message
* adjust error summary
5 years ago
Zhou Wei
5017aa76e6
set default python3,fix incompatible,cache dir for third party,unify error code,for windows ( #26178 )
...
* set default python3 for paddle windows,test=win
* set default python3,cache dir for third party,error code,test=win
* fix some incompatible
* fix some error
* set virtual environment,test=win
5 years ago
Jack Zhou
6d22f5c73e
Add PADDLE_ENFORCE in nll loss cuda kernel ( #26294 )
...
* add nll loss API, update demo code of the comment
5 years ago
wangchaochaohu
0b81d76310
[API2.0] add op for cudnn version query test=develop ( #26180 )
5 years ago
lilong12
241b44db14
[API 2.0] adaptive expand op to use shape instead of expand_times ( #26206 )
...
* adaptive expand op to 2.0 (align to torch.expand) , test=develop
5 years ago
wangchaochaohu
bb11cbc250
[API2.0] add Device api (set_device and get_device)( #26103 )
5 years ago
Zhou Wei
6de463d3d1
expose and unify the Tensor concepts to the user ( #25978 )
...
* expose and unify the Tensor concepts to the user
* expose tensor to user
* add copy place for Tensor
* add copy place for Tensor
* add note
* add macro PADDLE_WITH_CUDA
* remove RUN_TYPE=DIST
* fix some error
5 years ago
lilong12
fbd4d3cc97
[API 2.0] add paddle.tile op ( #26245 )
...
* add tile_op, test=develop
5 years ago
Zhou Wei
20147ace3f
fix_copy_if_different ( #25868 )
5 years ago
Wilber
c84aa9c61f
update diff val. ( #26242 )
5 years ago
Yang Zhang
a2d3e5c03b
Fix `paddle.abs` docstring ( #25942 )
...
test=document_fix
remove activation wording
5 years ago
Yang Zhang
22165934bc
Fix `paddle.acos` docstring ( #25958 )
...
test=develop,test=document_fix
remove activation wording
5 years ago
Yang Zhang
a5b5b00e02
Fix `paddle.asin` docstring ( #25967 )
...
test=develop,test=document_fix
remove activation wording
5 years ago
Yang Zhang
c758765769
Fix `paddle.atan` docstring ( #25968 )
...
test=develop,test=document_fix
remove activation wording
tanh -> tan
5 years ago
Yang Zhang
c4e480efc5
Fix `paddle.cos` docstring ( #25969 )
...
test=develop,test=document_fix
explain input/output range and out of boundary behavior
5 years ago
liuyuhui
935da32d25
【paddle.fleet】upgrade fleet: modify role_maker ( #26038 )
...
* add unittest for paddlerolemaker with gloo
5 years ago
wawltor
2d6cc0b125
support the tuple for attribute of axis in min, max for api2.0
...
Update the code for the min,max, test=develop
5 years ago
Dong Daxiang
50a5bcfc9d
【paddle.fleet】paddle.fleet -> paddle.distributed.fleet. ( #26186 )
...
* move paddle.fleet to paddle.distributed.fleet
5 years ago
Leo Chen
ffe52b4452
[OpDevOptimize] Add common infershape functions ( #26096 )
...
* add unchanged infershape function
* add broadcast infershape function
* fix bug
* rename infershape functions
* add UnaryOpUnchangedInferShapeCheckAxis
* add error message
* add test for common infer shape functions
* don't update existing ops
* don't update op_desc.h
* add more test
* add error check, refine error message
5 years ago
Leo Chen
2d95280e1f
Feature/Enable Auto-Mixed-Precision in dynamic graph ( #24903 )
...
* add auto_cast, test=develop
* add loss scaler, test=develop
* add comments, test=develop
* refine code, test=develop
* refine code, test=develop
* do not set flags automatically, test=develop
* fix custom op bug, test=develop
* add more test, test=develop
* refine enable logic, test=develop
* enable amp test with GPU, test=develop
* add unittest
* add test for found_inf
* follow comments
* follow comments
* remove global variable, use singleton
* add some notes
* update comments
* update comments
* update comments
* add use_dynamic_loss_scaling argument
* refine found_inf
* refine found_inf
5 years ago
Chen Weihang
838e36e9ed
Fix loaded variable suffix repeat error ( #26169 )
...
* fix loaded var suffix repeat error
* use new dygraph name for loaded param
5 years ago
Jack Zhou
dea41da715
add nll loss API for the paddlepaddle api2.0
...
* add nll loss API, update demo code of the comment
5 years ago
Wilber
fb72b192e7
[DOC] Fix dead link ( #26154 )
5 years ago
wawltor
9c17b3c9f8
Add the max, min, maximum, minimum api for the API 2.0
...
* Add the max, min, maximum, minimum api for the API 2.0, test=develop
5 years ago
JZ-LIANG
54003b873e
【paddle.fleet】add lamb to fleet meta optimizer ( #26025 )
...
add lamb to fleet meta optimizer
5 years ago
Yiqun Liu
1be6bf45ae
Add assign to fusion_group and enhance inplace execution in fusion_group. ( #26121 )
5 years ago
lidanqing
65b97d6215
GRU model xnli dataset C++ tester ( #25534 )
...
* Add lexical GRU unit test
performance works
* Get model accuracy
* model and data name to be confirmed
test=develop
* update model name and output format
test=develop
* update according to reviews
test=develop
* add accuracy check
* accuracy check between native and analysis
test=develop
* fix a reading bug, fix gru passes sequence
test=develop
* fix passes sequence
test=develop
5 years ago
Zhen Wang
a86e8c0eef
add more error info for these ops without double grad ops. ( #25987 )
5 years ago
tianshuo78520a
75a1311400
Fix inference CI bug ( #26080 )
...
* Fix inference bug
* fix inference lib
5 years ago
MRXLT
6559229b7e
fix encryption infer ( #25979 )
...
* add encrypt for inference lib
* fix code;test=develop
* fix test; test=develop
* bug fix; test=develop
* add MakeCipher;test=develop
* fix bug;test=develop
* move MakeCipher to paddle space; test=develop
* fix include dir ;test=develop
* add include dir; test=develop
* move include; test=develop
* move include; test=develop
* fix for windows ci
* fix cmake; test=develop
* fix bug
bug fix
5 years ago
lilong12
8caee2ad51
【paddle.fleet】add the support for multi-node training for pipeline ( #25907 )
...
* add the support for multi-node training
5 years ago
LutaoChu
bf2db646de
fix cumsum op for API 2.0, optimize performance
...
update cumsum api and fix up the cumsum op
5 years ago
Adam
1893cd6bb8
Add oneDNN relu6 op ( #26037 )
...
* Add oneDNN relu6 op
* Lint fixes
5 years ago
Zhaolong Xing
50f149a48e
fix cudnn workspace size problem during inference. ( #26021 )
...
test=develop
5 years ago
Zhou Wei
1f74b94d3f
fix compile warning on windows MSVC, fix paddle_build.bat more safe ( #25933 )
...
* Fixed compile warning about incorrect compile options,fix paddle_build.bat
* fix paddle_build.bat to more safe
5 years ago
tangwei12
c14ec8782b
【paddle.fleet】Feature/fleet ps api 2.0 ( #25857 )
...
* add paddle.fleet.AsyncOptimizer
Co-authored-by: dongdaxiang <dongdaxiang@baidu.com>
5 years ago
Chen Weihang
3c8daa9b89
Add pin memory control for BufferedReader ( #26026 )
...
* add pin memory control
* fix buffered reader init problem
* fix unittest error
* add unittest for coverage
5 years ago
Chen Weihang
ad4a0466a5
Add cuda pinned place branch in slice op GetExpectedKernelType ( #26027 )
...
* add cuda pinned place branch
* add unittest
* add skip when not gpu
5 years ago
zhangchunle
86794cccbd
separate approve ( #26035 )
5 years ago
Feiyu Chan
e853ece0a2
update document template for unary elementwise layers ( #25896 )
...
1. update document template for unary elementwise layers(a.k.a. activation layer);
2. remove generate_op_noattr and use generate_activation instead; remove redundant function copies;
3. minor update for docstring to fix rst format errors.
4. fix doc for Rsqrt OP
5. add sample code for each activation separately;
6. remove the unused deprecated decorator.
5 years ago
joanna.wozna.intel
734cf1c3e9
Change use_quantizer attribute name and data type ( #25838 )
...
* Change use_quantizer attribute name and data type
* Fix problem with setting attribute
* Add changes due to review
* Small change in function
* Restore use_quantizer attr for compatibility
5 years ago
Leo Chen
5258d53d65
refine unsqueeze, test=develop ( #25470 )
...
* refine unsqueeze, test=develop
* update unsqueeze, test=develop
* refine unsqueeze, test=develop
* refine unsqueeze, test=develop
* update
* remove None, test=develop
* follow comments
* support bool
* update doc
* follow comments
* merge develop
5 years ago
tangwei12
3755564ae1
Fix/large scale fix ( #25999 )
...
* fix large scale KV
* fix single training using async ssa graph
5 years ago
Leo Chen
751305ecf0
Add flags to control call stack of error message ( #25997 )
...
* add flags_call_stack_level
* update
* refine code
5 years ago
Thunderbrook
fd2947babf
fix compile error with mkl ( #26030 )
...
test=develop
5 years ago
Leo Chen
0a47387bd8
Use static local variable instead of global variable for safety ( #26018 )
...
* remove global variable
* refine code
5 years ago
Pei Yang
beb0ca5fab
Fix TRT plugin registry without TRT lib ( #25982 )
...
* fix trt plugin registry without trt lib
* support trt4
* refine code style
5 years ago
123malin
2191a08317
【paddle.fleet】fleet_util move to paddle.fleet ( #25805 )
...
* test=develop,test=document_fix, remove the out args
* fleet_util move to paddle.fleet
Co-authored-by: WuHaobo <wuhaobo1994@gmail.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>
5 years ago
yaoxuefeng
224620071b
add new flatten op test=develop ( #25393 )
5 years ago
Adam
68c6160e63
Add oneDNN fusion_gru kernel ( #25594 )
...
* Add oneDNN fusion_gru kernel and fix fc+gru pass
test=develop
* Formatting changes
test=develop
* Lint fixes
test=develop
* Add memory::format_tag::any to GRU weights
test=develop
* Fix build with CUDA
* Fix build with CUDA v2
5 years ago
Thunderbrook
0cb60c700d
add heter ps mode ( #25682 )
...
* add heter ps mode
* code style
test=develop
* add with_pslib
test=develop
* unittest
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* test monitor
test=develop
* prepare trainer
test=develop
* code style
test=develop
5 years ago
Zhong Hui
dca56f47f5
fix invalid read of pnorm gradient function
...
fix invalid read of pnorm gradient function and delete the unused code
5 years ago
WangXi
2c9d0f3cb9
【paddle.fleet】Add dgc to fleet meta optimizer ( #25738 )
...
Add dgc to fleet meta optimizer, rm dgc from optimizer all
5 years ago
Zhaolong Xing
358bc06c72
[CUDNN8 support] : support CUDNN8 ( #25664 )
...
* cudnn8 support
test=develop
* fix ci error
test=develop
5 years ago
Zhaolong Xing
5970871a64
add eltwise clip cuda impl. ( #25689 )
...
test=develop
5 years ago
Zhen Wang
82374dc12f
Add some error messages for the op without double grads. ( #25951 )
...
* Add some error messages for the op without double grads.
* fix the test_imperative_double_grad UT.
5 years ago
danleifeng
3dd2e3801a
【paddle.fleet】add fleetrun command for distributed running ( #25806 )
...
* add fleetrun command for distributed running; test=develop
5 years ago
Pei Yang
b717895f64
Fix registering trt plugin ( #25744 )
...
* develop dynamic shape serialization
* add test param for gelu
* fix bugs
* delete redundant comments
* debug
* fix conflict. test=develop
* fix bug. test=develop
* add trt dynamic shape serialized support
* fix ernie serialized bug
test=develop
* fix codestyle
test=develop
* fix bug
test=develop
* fix bug.test=develop
* modify cmakelist test=develop
* fix bug
test=develop
* fix error message. test=develop
* fix trt register plugin based on pr#25003
* add trt dynload
* fix deserialization bug of not finding plugin registration
* refine code style
* recover engine key in tensorrt_subgraph_pass
* for ci coverage
* add unittest for deserialization
Co-authored-by: haozech <chenhaoze94@gmail.com>
5 years ago
wawltor
a697e94693
Update the code of the compare ops for the broadcast function
...
Update the code for the compare ops for the broadcast function
5 years ago
Chen Weihang
9b5a65b819
refine init signal handler meg dumper ( #25911 )
5 years ago
wangchaochaohu
ff717d5158
Add support for tuple of concat Op test=develop ( #25800 )
5 years ago
tangwei12
253fd407e8
Fix/distributed heart beat ( #25902 )
...
* disable heart beat UT
5 years ago
WangXi
a6c87fd091
Add amp to fleet meta optimizer, test=develop ( #25770 )
5 years ago
Pei Yang
9e9a569dae
add trt int8 support for elementwise_mul and scale ( #25676 )
5 years ago
xujiaqi01
d11c140e28
fix dump, fix cvm check ( #25400 )
...
* fix dump, fix cvm check
test=develop
* fix
test=develop
* fix
test=develop
* fix
test=develop
5 years ago
JZ-LIANG
8ebffc78c9
add lars to fleet meta optimizer ( #25884 )
5 years ago
Dong Daxiang
8d2896f1fe
【paddle.fleet】Fleet run graph in Executor and add two more strategies ( #25844 )
...
* split meta optimizer files
* add graph execution in execution, update two properties in DistributedStrategy, unit tests for these features
5 years ago
Zhang Ting
6486fe8a94
improve GPU performance of transpose, test=develop ( #25862 )
5 years ago
Zhang Ting
2d24f56a7a
avoid data transfer, test=develop ( #25810 )
5 years ago
ShenLiang
bca303165a
fix inverse bug ( #25641 )
...
* fix inverse bug, test=develop
* fix the unittest, test=develop
* add singular checking, test=develop
* fix the unittest, test=develop
* use memory::copy, test=develop
* fix boost_get, test=develop
* fix position, test=develop
5 years ago
Chen Weihang
48b9a56f1c
Polish framework error message - part 4 ( #25807 )
...
* polish framework error message part 4
* fix type error
* fix message error
* polish by review comments
5 years ago
Aurelius84
e52dae6ef6
Using input.place() in GetExpectedKernel in slice_op ( #25595 )
...
* modify GetExpectedKernelType
* use input place
* add ENFORCE check
5 years ago
wawltor
595a719795
Update the api for the compare_ops
...
Update the code for the compare_ops, update the api and doc
5 years ago
wangchaochaohu
32b9577b2a
refine the split op for API 2.0 test=develop ( #25320 )
5 years ago
lilong12
ce506930c3
Fix the bug that Input(Offsets) and attr(offsets) cannot be set at the same time. ( #24975 )
...
* bug fix, test=develop
5 years ago
tangwei12
2d9dbd31ad
Fix/mkl dnn ( #25835 )
5 years ago
Zhaolong Xing
bcddefef39
[Fix Ut]: fix inference ut which exist bug on windows. ( #25814 )
...
* fix windows test
test=develop
* fix ci
test=develop
5 years ago
lilong12
5f30e57cdd
fix test_pipeline, test=develop ( #25808 )
...
* fix test_pipeline, test=develop
5 years ago
Chen Weihang
d47304e6d9
Refine paddle error stack format ( #25790 )
...
* refine error stack format
* polish compile traceback format
* polish detail format
5 years ago
tangwei12
caa90a6510
Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) ( #22957 )
...
* Integrated Trainer of Parameter Server
5 years ago
hong
c2a21ca9c9
Fix dygraph grad bugs ( #25781 )
...
* fix double grad visited unit; test=develop
* change name hash_pair to HashPair; test=develop
* follow comment; test=develop
5 years ago
cc
42189be67b
[Quant] Remove the output for moving_average_abs_max_scale op ( #25697 )
...
* Remove the output for moving_average_abs_max_scale op, test=develop
5 years ago
Dong Daxiang
a96d54ac19
Generate final strategy ( #25782 )
...
* refine strategy compiler and meta optimizers
make async as a_sync
5 years ago
Chen Weihang
2469b578f5
Unified paddle error format when catch system signal ( #25765 )
...
* unified signal error format
* refine signal error message
5 years ago
tianshuo78520a
818d38f150
Update conda_build.py for opencv dependency( #25654 )
5 years ago
Zhou Wei
b484a59c39
fix copy file random fail on windows ( #25731 )
5 years ago
Chen Weihang
23d1228c4d
remove ProgramTranslator.save_inference_model ( #25740 )
...
* remove ProgramTranslator.save_inference_model
* adapt save_quantized_model
* revert buffer check implemention
* remove useless import function
5 years ago
Chen Weihang
1b3081b1b4
Simplify BufferedReader to improve DataLoader performance ( #25648 )
...
* simplify buffered reader to improve DataLoader performance
* fix 22 failed unittests
* fix cuda pinned context condition
* fix test_reader_reset failed
* fix two failed unittests
* change unittest place
* polish error message
* polish cast op GetExpectedKernelType
* remove debug info in unittest
5 years ago
Pei Yang
55b6205ddf
add set_mkldnn_cache_capacity python api( #25524 )
5 years ago
Zhou Wei
e0a9115e28
fix random compile failure due to missing file ( #25661 )
5 years ago
Pei Yang
eef98b7f86
add macro check for using TRT api dynamicRangeIsSet() ( #25694 )
5 years ago
Pei Yang
f82baed866
fix trt instance norm plugin on gcc8. test=develop ( #25730 )
5 years ago
Dong Daxiang
920d998f1e
add more settings for distributed strategy ( #25685 )
...
* add more settings for distributed strategy
Basically, DistributedStrategy has several parts of configurations:
- BuildStrategy: the same as paddle.fluid.BuildStrategy, but the distributed arguments are moved out of BuildStrategy
- ExecutionStrategy: the same as paddle.fluid.ExecutionStrategy
- collective communication configs: nccl_comm_num, hierarchical allreduce and so on
- distributed algorithms: async_update(mainly used in PS), lars, lamb and so on
5 years ago
Sylwester Fraczek
1aaa26f102
add dnnl sigmoid (logistic) activation ( #25745 )
5 years ago
Chen Weihang
c34c80d302
Polish framework error message part3 ( #25701 )
...
* polish framework error message part3
* polish details
* fix error message print error
5 years ago
arlesniak
e52df3b125
Added DNNL cache management for DyGraph ( #25624 )
...
* Added DNNL cache management for DyGraph
* move FLAGS_use_mkldnn to more general CMakeLists, make use of the flag in ClearGradients
* missing file
* Fixes after review
* Bringing back original idea of place for 'use_mkldnn' flag to be accessible from platform and imperative.
* Removed duplicate and added docs
* Fixes for CI
5 years ago
wangchaochaohu
1e4ab728fb
refine the concat Op for API 2.0 test=develop ( #25307 )
5 years ago
Zhen Wang
cea5086853
Fix the double grad bug for the star gan. ( #25655 )
...
* fix the double grad bug for the star gan. test=develop
* update the retain_graph parameter doc. test=develop
* add the unit test for the retain_graph parameter. test=develop
5 years ago
Chen Weihang
364cc53618
Polish paddle fluid framework error message - part2 ( #25667 )
...
* polish framework error msg part2
* polish details
5 years ago
Adam
98899b73d2
Fix FC + GRU fuse pass ( #25687 )
5 years ago
wanghuancoder
1917b38099
fix some errmsg report,in framework/ir/, about 21 files ( #25525 )
...
* fix error msg report in ir/, about 19 files, test=develop
* modified some unclear descriptions, test=develop
* modified some unclear descriptions, test=develop
* modify unit test pass_test.cc, because the error report in pass.cc is used by pass_test.cc, test=develop
5 years ago
Leo Chen
4ec1251a1e
Refine squeeze, test=develop ( #25281 )
...
* refine squeeze, test=develop
* update squeeze, test=develop
* refine compile-time infershape, test=develop
* add more unittest, test=develop
* follow comments, test=develop
* add update_api, test=develop
* follow comments, test=develop
5 years ago
joanna.wozna.intel
e5bbffa84c
Add NOMINMAX define due to windows.h max/min macro conflict ( #25637 )
...
test=develop
5 years ago
cnn
70cee22fde
New features, add sinh and cosh op, test=develop ( #25495 )
...
* New features, add sinh and cosh op, test=develop
* remove duplicate test function and remove out parameters, test=develop
* Add out parameters temporarily, remove later. test=develop
* remove out args, PR 25570, test=develop
* remove TestParameter, test=develop
* add test api for static dygraph, test=develop
* add backward unittests for sinh and cosh, test=develop
5 years ago
Zhang Ting
a1350744eb
register fp16 kernel, test=develop ( #25630 )
5 years ago
mapingshuo
5453a912fe
add fp64 support in sequence_pool, test=develop ( #25662 )
...
add fp64 support in sequence_pool, test=develop
5 years ago
Leo Chen
417b243968
fix best_fit_allocator_test on windows, test=develop ( #25650 )
...
* fix best_fit_allocator_test on windows, test=develop
* enable best_fit_allocator_test and test_math_op_patch_var_base, test=develop
5 years ago
GaoWei8
6e86fd3750
fix concat dimension ( #25606 )
...
Fix the condition of concat dimension judgment.
5 years ago
donproc
95fa383df2
optimize embedding cuda kernel lookup_table_v2,test=develop ( #25587 )
5 years ago
石晓伟
7206417259
supports xpu runtime, test=develop ( #25554 )
...
* update ResetHolder, test=develop
* add TensorShare for lite engine, test=develop
* tensor data changed from copying to sharing, test=develop
* supports xpu runtime, test=develop
* fix code styles, test=develop
5 years ago
Chen Weihang
dfb3ae1b9b
Polish some error message in framework holder - Part 1 ( #25509 )
...
* polish some error message in framework, test=develop
* fix unittest error, test=develop
* replace PADDLE_ENFORCE, test=develop
* polish details based review comment, test=develop
5 years ago
Leo Chen
1ab4101d6c
add ci check for changing op-related api without core.ops, test=develop ( #25596 )
...
* add ci check for changing op-related api without core.ops, test=develop
* generate api_source_md5 file when build, test=develop
* add failed example, test=develop
* add failed example, test=develop
* handle exception, test=develop
5 years ago
Zhang Ting
30d1ff3bb4
call cublasGemmStridedBatchedEx when using fp16, test=develop ( #25553 )
5 years ago
Zhaolong Xing
9df18b08f3
Disable windows static library generation ( #25593 )
...
* fix windows ci
test=develop
* fix ci error
5 years ago
Aurelius84
ca1185d06b
[Dy2Stat] Fix scope in run_program_op ( #25579 )
...
* add reinforcement learning model test=develop
* align backward test=develop
* add gym in paddle_build.sh test=develop
* rm pip install in script test=develop
* refine paddle_build.sh test=develop
* fix sed error in macOS test=develop
* polish code test=develop
* fix scope problem
* refine code by reviewer comment
5 years ago
Chen Weihang
a6abd92dfd
Polish install error hint message ( #25531 )
...
* polish install error hint msg, test=develop
* fix variable error, test=develop
* polish hint message again
5 years ago
wanghuancoder
9b46fe0440
fix some errmsg report,in framework/ir/, about 5 files ( #25539 )
...
* fix error msg report in ir/, about 5 files, test=develop
* fix error msg report in ir/, about 5 files, test=develop
* fix error msg report in ir/, about 5 files, test=develop
5 years ago
Zhou Wei
1ab60544f2
windows CI scripts for xly,test=develop,test=win ( #25533 )
...
windows CI scripts for xly
5 years ago
Dong Daxiang
e657d7062d
fleet base initial implementation and the API ( #25442 )
...
refactor fleet api under paddle.fleet
update DistributedStrategy
5 years ago
zhangchunle
3382f395ed
example EXcode ( #25578 )
5 years ago
Jacek Czaja
7dbc441eab
[oneDNN] cache cosmetics improvement ( #25576 )
5 years ago
Aurelius84
1a5d3defb1
[Dy2stat] Add Reinforcement learning unittest ( #25445 )
...
* add reinforcement learning model test=develop
* align backward test=develop
* add gym in paddle_build.sh test=develop
* rm pip install in script test=develop
* refine paddle_build.sh test=develop
* fix sed error in macOS test=develop
* polish code test=develop
5 years ago
zhangchunle
1a4a4219cb
mac build exitcode ( #25540 )
5 years ago
hong
e362095e45
fix softmax with cross entropy out of bound; test=develop ( #25549 )
5 years ago
Huihuang Zheng
d8fe517bf8
Add Support for SelectedRows for Transpose OP and Fix a Bug That SelectedRows Cannot be Supported in SimNet ( #25536 )
...
This PR fixes a bug where SelectedRows was not supported in SimNet. The cause is that the dygraph basic_engine did not copy a var's type when the var needed to be accumulated during backward. So when a var is SelectedRows and needs to be accumulated, as in SimNet, which calls the net twice, the var's type is changed to the default LoDTensor and the bug occurs. The fix is to copy the type as well.
Without this PR, accumulated SelectedRows parameters in dygraph are changed into LoDTensor. After fixing SelectedRows support in SimNet, we found that `test_imperative_lod_tensor_to_selected_rows` failed with an error that SelectedRows was not supported by the Transpose OP. To fix that as well, this PR also adds SelectedRows support to the Transpose OP.
5 years ago
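The root cause described in the commit above (the accumulated var did not inherit the source var's type) can be illustrated with a toy model. Everything here is hypothetical Python written for illustration only: `Var`, `make_accumulator`, and the string type tags are assumptions, not Paddle's classes or its basic_engine code.

```python
from dataclasses import dataclass

DEFAULT_TYPE = "LoDTensor"  # stand-in for the framework's default var type


@dataclass
class Var:
    """Toy variable that only tracks a name and a type tag."""
    name: str
    type: str = DEFAULT_TYPE


def make_accumulator(grad, copy_type):
    """Create the temporary var that sums gradients during backward.

    copy_type=False mimics the buggy behavior (type falls back to the
    default); copy_type=True mimics the fix (type is copied from grad).
    """
    acc = Var(name=grad.name + "@ACC")
    if copy_type:
        acc.type = grad.type
    return acc


sparse_grad = Var(name="emb@GRAD", type="SelectedRows")
buggy = make_accumulator(sparse_grad, copy_type=False)
fixed = make_accumulator(sparse_grad, copy_type=True)
print(buggy.type, fixed.type)
```

In the buggy variant the accumulator silently loses the sparse type, which is the kind of mismatch the commit message attributes to the later Transpose OP failure.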
Wilber
848aca7ae8
[CI] [Lite-Subgraph] CI add lite subgraph check. ( #25346 )
5 years ago
wanghuancoder
e65c5b8e83
fix some errmsg report, in framework/ir/ ( #25471 )
...
* fix paddle/fluid/framework/ir/ error msg report, test=develop
* modify error msg report in ir/, about errortype, grammar, supplementary info, test=develop
* modified some unclear descriptions, test=develop
* Modify the problem that report msg is less than 20 characters, test=develop
5 years ago
Shibo Tao
71c71e684c
fix logical_* ops' doc ( #25479 )
...
* fix doc of logical_* op.
* fix doc of op pow.
* fix comment syntax error
* fix operator reciprocal demo.
* fix logical_* ops' doc. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
5 years ago
Aurelius84
4717bdbcfb
Fix hang in seq_topk_avg_pooling op ( #25522 )
...
* fix topk_avg_pool hang test=develop
* refactor get_topk_pos test=develop
* add check of channel_num and num_k test=develop
* add TopKPosPaddingId test=develop
5 years ago
LielinJiang
7129f544f0
Add bilateral_slice op ( #25401 )
...
* add bilateral slice op
5 years ago
GaoWei8
c10dcff12d
refine PADDLE_ENFORCE ( #25456 )
...
* Refine PADDLE_ENFORCE in paddle/fluid/platform
test=develop
5 years ago
wanghuancoder
6c0982b942
fix some errmsg report, in framework/ir/mkldnn ( #25467 )
...
* fix paddle/fluid/framework/ir/mkldnn/ error msg report, test=develop
* modify error msg report, about errortype, grammar, supplementary info, test=develop
* modified some error descriptions, test=develop
5 years ago
wanghuancoder
fce6466217
fix some errmsg report, in framework/ir/ subdir(memory,optimizer,multi_device) ( #25460 )
...
* fix paddle/fluid/framework/ir/multi_devices_graph_pass/ error msg report, test=develop
* fix paddle/fluid/framework/ir/memory_optimize_pass/ error msg report, test=develop
* fix paddle/fluid/framework/ir/fuse_optimizer_ops_pass/ error msg report, test=develop
* fix paddle/fluid/framework/ir/memory_optimize_pass/ error msg report about PADDLE_ENFORCE, test=develop
* modify error msg report, about errortype, grammar. test=develop
* modify error msg report, about PADDLE_ENFORCE to PADDLE_ENFORCE_XXX, test=develop
* modify error msg report, about PADDLE_ENFORCE to PADDLE_ENFORCE_XXX, and %s to %d, test=develop
* modified some error descriptions, test=develop
5 years ago
Zhang Ting
ca725c82f2
improve fp16 performance of slice_grad, test=develop ( #25523 )
5 years ago
yaoxuefeng
5d3766ff3d
modify flip test=develop ( #25312 )
...
According to paddle 2.0 standard
1, change flip api attr name 'dim' to 'axis'.
2, support empty axis
3, change example code to imperative mode.
5 years ago
Chen Weihang
41d2247275
[Dy2static] Refactor ProgramTranslator save_inference_model API ( #24989 )
...
* experimental refactoring, test=develop
* add TranslatedLayer & remove StaticModelRunner, test=develop
* revert tracedlayer change, test=develop
* fix test_mnist unittest error, test=develop
* add doc & examples, test=develop
* polish doc details, test=develop
* add imperative.jit module, test=develop
* change TranslatedLayer pos, test=develop
* adjust jit module import path, test=develop
* polish doc based review result
* add SaveLoadConfig.separate_params to save params separately
* add Layer.buffer support, test=develop
* polish doc details based review result, test=develop
* polish details based review comments, test=develop
* add empty str check for param, test=develop
* add unittests, test=develop
* polish details based review comment, test=develop
* remove blanks in comment, test=develop
* polish doc details, test=develop
* update imperative doc link, test=develop
* add api attr for load, test=develop
5 years ago
Pei Yang
43f9f180e5
Add api to clear intermediate tensors in AnalysisPredictor ( #25069 )
...
* add api to clear intermediate tensors in analysis predictor. test=develop
* add python api. test=develop
5 years ago
zhangchunle
6bfbb6abab
exitcode normalize ( #25487 )
5 years ago
zhangchunle
cf6eb0e175
summary failedtests ( #25388 )
5 years ago
yaoxuefeng
aaa7cbd56f
modify trace api test=develop ( #25397 )
5 years ago
Huihuang Zheng
f9ac5fb992
[Dy2stat] Fix Memory Optimization in run_program_op and Add SimNet as Unit Test ( #25383 )
...
Add Similarity Net as a unit test. During the unit test, we found three problems:
1. The run_program_op has a memory optimization error when running a dy2stat net multiple times.
2. The support for SelectedRows can cause problems in dy2stat.
3. The return grammar has a problem.
This PR fixes problem 1 but only modifies code for problems 2 and 3, to keep the PR small. I will fix those two problems in the next PR(s).
5 years ago
yaoxuefeng
c42d662e2a
modify roll test=develop ( #25321 )
5 years ago
Zhen Wang
548cdbc544
Quantization-aware training for dygraph ( #24634 )
...
* Add the imperative quantization aware training.
* This is the python part of Imperative QAT. test=develop
5 years ago
Chen Weihang
0b54d54fd8
Fix index overflow bug of the CUDA kernel loop increment ( #25435 )
...
* fix softmax_with_cross_entropy cuda kernel overflow bug, test=develop
* replace old macro & for condition, test=develop
* polish details, test=develop
5 years ago
zlsh80826
e528392de9
[Paddle-TRT] SkipLayernorm vectorized memory optimization ( #25117 )
...
* add explicit specialization
* add skiplayernorm vector load if available
* test=develop
5 years ago
Chen Weihang
4061aa6488
Polish ParallelExecutor exception process logic ( #25449 )
...
* polish pe exception process logic, test=develop
* fix unittest, test=develop
* add unittests, test=develop
5 years ago
Jeng Bai-Cheng
fc93266b0a
Improve qkv transpose performance ( #23919 )
...
Use vector instruction (LDG.128) to improve qkv transpose. It
provides 1.4X speedup at same GPU base frequency.
test=develop
5 years ago
zhupengyang
5b573c58e2
randperm API: remove out, device, stop_gradient; add name ( #25410 )
5 years ago
Chen Weihang
7be285a66f
remove useless property, test=develop ( #25461 )
...
remove useless property
5 years ago
tianshuo78520a
2d028389e4
Fix Cpu CI error( #25457 )
5 years ago
Jacek Czaja
a5d1592f6c
Added missing oneDNN format ( #25450 )
...
test=develop
5 years ago
Chen Weihang
172d4ecb6c
remove WITH_DSO compile option ( #25444 )
5 years ago
Zhen Wang
bb45af02ac
add the c++ part of Imperative QAT. test=develop ( #25446 )
5 years ago
Jacek Czaja
050a9bf79d
[oneDNN] LRN cleanup ( #25416 )
5 years ago
GaoWei8
1974aadcf0
fix concat shape error ( #25414 )
...
* fix concat shape error
test=develop
5 years ago
tangwei12
4b3778a3ee
Revert/barrier for sync ( #25417 )
...
* add retry for prefetch
* Revert "Fix/sync barrier (#25016)"
This reverts commit be6a315fbd.
* reopen dist UT, test=develop
* remove fl UT, test=develop
5 years ago
ceci3
52be62c5ae
fix instance norm in dy ( #24717 )
...
* fix bn & in in dy, test=develop
* update instance_norm,test=develop
* fix bugs,test=develop
* add more case in unittest,test=develop
* fix,test=develop
* fix,test=develop
5 years ago
lilong12
e39aa70ec7
add the support for pipeline ( #24560 )
...
* add device_worker for pipeline, test=develop
5 years ago
hong
70d7d07fea
catch bad alloc exception ( #25140 )
...
* catch bad alloc exception; test=develop
* add unittest; test=develop
* move bad alloc catch to the first place; test=develop
* polish error message; test=develop
* polish error message; test=develop
* add mutex header; test=develop
5 years ago
gongweibao
80f1c50738
Fix typo in interface. ( #24779 )
5 years ago
Zhaolong Xing
7b7e605189
[Fix BUGs]: fix multihead matmul pass's instability bug ( #25123 )
...
* fix multihead matmul's instability
test=develop
* fix multihead matmul bug
test=develop
* fix coverage problem
test=develop
5 years ago
zhupengyang
eb3173e2b6
rand API: remove out, device, stop_gradient; add name ( #25246 )
5 years ago
GaoWei8
ea7e532598
Refine PADDLE_ENFORCE ( #25369 )
...
* refine PADDLE_ENFORCE
test=develop
5 years ago
zhupengyang
6de75082cb
fix test_hsigmoid windows ci ( #25311 )
5 years ago
Dong Daxiang
d5e40d1ba9
Paddle fleet distributed strategy ( #25379 )
...
* add paddle.fleet.DistributedStrategy for 2.0
5 years ago
WuHaobo
f593c3fb2f
fix the formula of floor OP and ceil OP ( #25292 )
5 years ago
Wojciech Uss
d0a921ba98
Quant2 updates and fixes ( #25313 )
5 years ago
Zhang Ting
bc7610583b
use eval() to improve CPU performance ( #25243 )
5 years ago
lilong12
3d96601b82
modify pipeline optimizer to only support the mode of sync pipeline training ( #25065 )
...
* modify pipeline optimizer, test=develop
5 years ago
Kaipeng Deng
74468bf428
add mish op. ( #24565 )
...
* add mish op. test=develop
5 years ago
Chen Weihang
f07b25d8e5
fix DataLoader.generator usage error, test=develop ( #25355 )
5 years ago
GaoWei8
fb70682f00
fix PADDLE_ENFORCE ( #25297 )
...
* fix PADDLE_ENFORCE and refine the description
test=develop
5 years ago
Yang Zhang
6d6efafeeb
Add `matrix_nms_op` ( #24400 )
...
* Add `matrix_nms_op`
test=develop
* Make ci happy
test=develop
* Exit early when no detection
test=develop
* Fix license year
test=develop
* Output index as well
test=develop
* Match nms2 lod behavior and add `return_index` flag
test=develop
* Make CI happy
test=develop
* Fix wording
test=develop
5 years ago
Chen Weihang
5a959f6e6e
Refactor dynamic dso search functions ( #25214 )
...
* refactor dynamic dso search func, test=develop
* polish details, test=develop
* polish detail based review comments, test=develop
* revert string type change, test=develop
5 years ago
Jacek Czaja
17c751bec6
[oneDNN] Fix to #25078 ( #25256 )
5 years ago
MRXLT
3b8f0a64c2
Encryption infer ( #25119 )
...
* add encrypt api for inference lib
5 years ago
Wilber
4474fc1033
fix compile on windows. test=develop ( #25310 )
5 years ago
Aurelius84
bc2bd3c1ed
modify into eager_tmp of Base Class test=develop ( #25323 )
5 years ago
Chengmo
e85fcaa712
Fix fluid.embedding in Distributed Training ( #25174 )
...
* test=develop, fix_embedding
5 years ago
Aurelius84
494cb36d09
Modify tmp var name prefix in dygraph ( #25280 )
...
* Modify tmp var name prefix in dygraph test=develop
* refine comment test=develop
5 years ago
Wilber
0371cf6f94
fix compile for lite subgraph. test=develop ( #25285 )
5 years ago
Yiqun Liu
c00f827843
Avoid data transforming ShapeTensor from CPU to GPU in fill_constant op. ( #25267 )
5 years ago
Wojciech Uss
23a4f54b73
rename qat into quant ( #24948 )
...
test=develop
5 years ago
123malin
f1a9593d69
test=develop, bug fix for index_select and roll op ( #25251 )
5 years ago
FDInSky
c2e072587c
test=develop fix generate_proposals's error ( #25227 )
5 years ago
Sylwester Fraczek
36abeff44f
adding elementwiseadd quantization ( #25178 )
5 years ago
Wojciech Uss
56fa3880e3
rename qat into quant in filenames only ( #25194 )
...
test=develop
5 years ago
Wilber
4c964abdf7
support build on arm. test=develop ( #25212 )
5 years ago
Wilber
f78e161ea3
remove paddle_use_kernel and paddle_use_op. test=develop ( #25189 )
5 years ago
liym27
1458cc0c68
Fix bug: Don't check dims if contain_unknown_dim of cross_entropy_grad_op in compile time ( #25221 )
5 years ago
liu zhengxi
68e93d8a17
Fix beam_search InferShape ( #25169 )
...
* fix beam_search infershape, test=develop
* fix beam search op unittest, test=develop
5 years ago
Chen Weihang
353ea9e8ad
Add default cudnn lib path ( #25175 )
...
* add default cudnn lib path, test=develop
* change default path in func, test=develop
* move to linux branch, test=develop
* fix var error in other plat, test=develop
5 years ago
Leo Chen
ff5be2fb77
Refine error message in memory folder ( #25095 )
...
* refine PADDLE_THROW, test=develop
* refine error msg, test=develop
* refine cuda error, test=develop
* follow comments, test=develop
* fix compile problem, test=develop
* fix bug, test=develop
5 years ago
Adam
bd0b38e671
Refactor of conv fp32 oneDNN operator ( #25137 )
...
* Refactor of conv fp32 oneDNN operator
test=develop
* Formatting fix
test=develop
* Return Enforces
test=develop
* GetWeights improvements
test=develop
5 years ago
Pei Yang
b2f5a149e7
[Paddle-TRT] Better Paddle-TensorRT support for PaddleSlim quant models ( #25097 )
...
* Paddle-TensorRT support slim QAT. test=develop
* add comments. test=develop
* use RenameInput instead of ResetInputs. test=develop
5 years ago
Tao Luo
2996315fc9
fix profiler_test on win32 ( #25073 )
...
* remove disable profiler_test on win32
* add log
* enlarge the elapsed time
* Revert "add log"
test=develop
5 years ago
Shibo Tao
19c4db1b56
don't re-generate header file if content doesn't change ( #25130 )
...
* don't re-generate header file if content doesn't change. test=develop
* add copy_if_different function. test=develop
5 years ago
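The idea behind the `copy_if_different` commit above (don't re-generate a header if its content is unchanged) can be sketched as follows. This is an illustrative Python sketch, not the code from the PR: the function name comes from the commit message, while the signature and the "generate to a temporary file, then copy" workflow are assumptions.

```python
import filecmp
import os
import shutil


def copy_if_different(src, dst):
    """Copy src to dst only when dst is missing or its bytes differ.

    Returns True if a copy happened. Hypothetical helper; the signature
    is an assumption made for this sketch.
    """
    # shallow=False forces a byte-by-byte comparison instead of a stat check.
    if os.path.exists(dst) and filecmp.cmp(src, dst, shallow=False):
        return False  # identical content: leave dst and its mtime alone
    shutil.copyfile(src, dst)
    return True
```

Leaving an unchanged destination untouched preserves its modification time, so timestamp-based build tools have no reason to rebuild the files that include it.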
iducn
f282599229
disable unittest for gcc8 ( #25134 )
5 years ago
tianshuo78520a
1eb9ee242b
delete buddy_allocator_test_data to make repo clean ( #25046 )
5 years ago
Chen Weihang
b23801a262
polish tensor set error message, test=develop ( #25113 )
5 years ago
Jacek Czaja
a7944904d3
[oneDNN]elementwise_add and elementwise_mul int8 support ( #24984 )
...
* Start implementing int8 eltwise add
test=develop
* - Fix to Michal PR
* - Fix
test=develop
* - Lint fixes
test=develop
* - Added checking if elementwise_mul can be used
test=develop
* - Added attribs to skip_attrs_set
test=develop
* - Improved broadcasting
test=develop
- fixes to compilation
- fix
- fix
- Lint fixes
test=develop
* - removed redundant condition
test=develop
Co-authored-by: Michal Gallus <michal.gallus@intel.com>
5 years ago
Zhaolong Xing
843581154f
fix emb eltwise layernorm ( #24873 )
...
test=develop
5 years ago
石晓伟
9ab3cf039c
remove useless test_dot, test=develop ( #24957 )
5 years ago
石晓伟
6783441e70
fix repeat definitions in liengine.cc, test=develop ( #25020 )
5 years ago
Leo Chen
fa657b3dbb
fix bug of prelu when rank not equal 4, test=develop ( #25067 )
...
* fix bug of prelu when rank not equal 4, test=develop
* fix prelu inference, test=develop
* fix api, test=develop
* fix shape when mode is channel, test=develop
* remove debug code, test=develop
* add unittest, test=develop
5 years ago
zlsh80826
479c8834f7
[Paddle-TRT] Fixes #24731 , opt for SoftmaxKernelWithEltadd kernel, test=develop ( #24834 )
...
* blockReduce opt
* launch threads align to warpSize
* reduce unnecessary shared memory for broadcast reduced value
* vectorize SoftmaxKernelWithEltadd
* add fp16 constraint
* test=develop
5 years ago
hutuxian
5822862d8a
Monitor Framework ( #24079 )
...
* Add a StatValue class in the backend to represent a stat.
* Add a singleton StatRegistry to maintain the collection of stats.
* For the sake of code neatness, we only support type of int and float, which can cover most of the scenarios.
5 years ago
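The design described in the Monitor Framework commit above (a `StatValue` per stat plus a singleton `StatRegistry`) can be sketched roughly as below. This is a Python illustration under assumptions, not Paddle's backend code: only the two class names and the int/float restriction come from the commit message, while the method names (`increase`, `get`, `reset`, `instance`, `get_or_create`) and the locking are hypothetical.

```python
import threading


class StatValue:
    """One named statistic holding an int or a float (hypothetical API)."""

    def __init__(self, initial=0):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(initial, bool) or not isinstance(initial, (int, float)):
            raise TypeError("only int and float stats are supported")
        self._value = initial
        self._lock = threading.Lock()

    def increase(self, delta):
        with self._lock:
            self._value += delta
            return self._value

    def get(self):
        with self._lock:
            return self._value

    def reset(self, value=0):
        with self._lock:
            self._value = value


class StatRegistry:
    """Process-wide collection of stats, created once (singleton)."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self):
        self._stats = {}
        self._lock = threading.Lock()

    @classmethod
    def instance(cls):
        # Lazily create the single shared registry.
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def get_or_create(self, name, initial=0):
        with self._lock:
            if name not in self._stats:
                self._stats[name] = StatValue(initial)
            return self._stats[name]
```

A caller would fetch the shared registry with `StatRegistry.instance()`, look up a stat by name, and update it; restricting values to int and float keeps the stat type simple, which matches the "code neatness" rationale in the commit message.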
Leo Chen
028de857d4
fix dtype error of compare op, test=develop ( #25059 )
5 years ago
Jeng Bai-Cheng
bef4afa6de
bugfix for unique_ptr of IOptimizationProfile ( #23917 )
...
This commit fixes the compilation bug regarding unique_ptr of IOptimizationProfile.
IOptimizationProfile has a protected dtor and is controlled by TensorRT
internally. The application shouldn't delete the pointer of IOptimizationProfile.
See the TensorRT documentation: https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_builder.html#a9ac47e100454151d8206ac91d543299a
test=develop
5 years ago
zlsh80826
49e4ee27e1
[Paddle-TRT] slice kernel optimization ( #24783 )
...
* parallel move shared data test=develop
* test=develop
5 years ago
tianshuo78520a
770c11a117
fix make device_context error ( #25045 )
...
* test=develop
* test=develop
* fix bug
* test=develop
* test=develop
5 years ago
tangwei12
be6a315fbd
Fix/sync barrier ( #25016 )
...
* fix sync barrier with barrier monitor, test=develop
5 years ago
ceci3
8db66fc3f6
fix cos_sim, test=develop ( #25017 )
5 years ago
Leo Chen
25a4dac4c2
Use allow list instead of white list ( #25002 )
...
* use allow list instead of white list, test=develop
* reduce include, test=develop
5 years ago
Zhang Ting
621b638550
improve performance of instance_norm, test=develop ( #25005 )
5 years ago
hutuxian
1c224e26af
support CMatchAuc ( #24990 )
...
Support CMatchAucCalculator based on CMatchRankAucCalculator with a new parameter ignore_rank
5 years ago
Zhou Wei
ff8ca52f88
windows publish package scripts ( #24851 )
...
* windows publish package scripts,test=develop
* windows publish package scripts,test=develop
* windows publish package scripts,test=develop
5 years ago
Leo Chen
bfa46c38d5
bn supports reverse_space, test=develop ( #24988 )
5 years ago
wangchaochaohu
613303dbf6
refine the slice Op to improve the performance of xlnet for fp16 training ( #24967 )
5 years ago
silingtong123
37bdb5269f
test=develop, add log message in the function UpdateDllFlag ( #24937 )
...
* test=develop, add log message in the function UpdateDllFlag
* test=develop, add the test
5 years ago
Chen Weihang
d152d7231e
clear old var in scope, test=develop ( #24976 )
5 years ago
Sylwester Fraczek
53d563a0fe
Reshape transpose matmul coverage ( #24970 )
...
* remove gmock from ut
test=develop
* coverage enabled for r+t+m fuse pass
test=develop
5 years ago
wawltor
0eb1b0bc01
Add 5-D and 6-D tensor support for the reduce ops
...
Add 5-D and 6-D tensor support for the reduce ops.
At the same time, the compile time was reduced: it was 22 minutes before and 21 minutes after the fix.
5 years ago
liuwei1031
8603b5fb72
fix randomly hang issue of PaddleDetection training task on windows ( #24977 )
5 years ago
silingtong123
640196c446
test=develop, remove the tensorrt dll file from windows package ( #24922 )
5 years ago
wangchaochaohu
feba131893
fix the segment fault error of profiler in seqseq model test=develop ( #24952 )
5 years ago
Sylwester Fraczek
a7ee634b45
fix WARNING: ThreadSanitizer: heap-use-after-free ( #24929 )
...
test=develop
5 years ago
mapingshuo
24e24987f0
fixes the place info in the Print op ( #24934 )
...
fixes the CUDAPlace info in the Print op
5 years ago
Aurelius84
6be0ee159e
Support LoDTensorArray in reverse_op ( #24797 )
...
* Support LoDTensorArray in reverse_op test=develop
* polish en doc and unittest code test=develop
* refine sample code test=develop
* add example of LoDTensorArray test=develop
* fix typo test=develop
5 years ago
Leo Chen
6190023ac9
Refine error message in pybind folder ( #24886 )
...
* refine err_msg of pybind.cc, test=develop
* refine err_msg in tensor_py.h, test=develop
* refine error msg, test=develop
* fix test_exception, test=develop
* follow comments, test=develop
5 years ago
Zhou Wei
4058e736ff
temporarily disable these unittests failed on windows ( #24942 )
5 years ago
Leo Chen
a7cb97a1a5
Fix/isfinite on windows ( #24927 )
...
* refine isfinite, test=develop
* use namespace std of isfinite, test=develop, test=win_gpu
5 years ago
silingtong123
ef9b36873d
test=develop, remove the gflags/gflags.h from paddle_api.h ( #24921 )
5 years ago
whs
4c01d6d53e
Enhance checking in some operator. ( #24473 )
5 years ago
Chen Weihang
4a702ef361
Support SelectedRows allreduce in multi-cards imperative mode ( #24690 )
...
* support selectedrows allreduce in multi-cards dygraph, test=develop
* remove useless import modules in unittests, test=develop
* add nccl cmake to get nccl version, test=develop
* add if-condition to compiled correctly, test=develop
* add detail version parsing for old nccl, test=develop
* polish cmake details, test=develop
* fix remove test cmake error, test=develop
* fix cmake condition, test=develop
* change unittest cmake list, test=develop
* fix unittest cmake rule, test=develop, test=framep0
5 years ago
Pei Yang
14b8540551
add default ctor for AnalysisConfig python api. test=develop ( #24924 )
5 years ago
silingtong123
fc4435174b
test=develop, fix the bug of tensorrt package can't compile on windows ( #24860 )
...
* test=develop, fix a bug
* test=develop, remove the macro of PADDLE_DLL_INFERENCE
5 years ago
lilong12
29de0d97a5
add the support to specify device index for device_guard ( #24555 )
...
* add the support of device index for device_guard.
5 years ago
lilong12
6e10022781
add queue_generator_op, dequeue_op, enqueue_op and ut ( #24481 )
...
* add queue_generator_op, dequeue_op, enqueue_op and ut, test=develop
5 years ago
hutuxian
b8f17a049d
fix problem in dump and add log ( #24891 )
...
* Fix the field length in LoD scenario
* Fix the missed lod info when copy tensor in dump field
* Add some log to make debug easy
5 years ago