Zhong Hui
f4c750d721
Add the cpu version of segment sum mean max min op
5 years ago
Wilber
afe94903c3
Rename fluid_inference to paddle_inference. ( #27422 )
5 years ago
Pei Yang
8182337096
clear pass logs ( #27434 )
5 years ago
furnace
13a4c74efd
add mv op(c++, python, unit test) ( #27024 )
5 years ago
LutaoChu
f11a53ee76
Optimize argsort Op performance on GPU
...
* argsort op acceleration on GPU when the input size is equal to the length of the ‘axis’ dimension
5 years ago
ceci3
1d3b27cae8
add double grad compute for batch norm ( #27296 )
...
* add double grad compute for batch norm,test=develop
* fix unittest, test=develop
* remove unused tensor,test=develop
* add format,test=develop
* update, test=develop
5 years ago
Shang Zhizhou
d93661942e
fix bug sequececonv_eltadd_relu_fuse_pass ( #27404 )
...
* fix bug sequececonv_eltadd_relu_fuse_pass, output error when sequence_conv's padding_start > 0
* fix seqconv_eltadd_relu_fuse_pass unittest error
5 years ago
Leo Chen
aba759ba16
[Feature] Enhance inplace addto strategy for gradient accumulation in static graph ( #27112 )
...
* support use add instead of sum to do gradient accumulation
* add inplace addto pass
* add grad_add op and inplace addto pass
* remove debug code
* code refine
* fix bug when several sum ops are inserted at the same op_idx
* fix Flags type
* add addto attribute for conv3d
* fix ut
* code clean
* fix type
5 years ago
LutaoChu
669efb98de
Fix bug: shapes of Topk outputs are wrong when the parameter k is Tensor
5 years ago
Wilber
39546aa2f3
Add pass compatible and unit test. ( #27377 )
5 years ago
huangxu96
02606d45ef
Quant op dev ( #25932 )
...
* Finished ChannelWiseQuantDequantAbsMaxOp and Passed unittests.
* Finished channel-wise quantize strategy in imperative quantization.
* Added Cuda code of ChannelWiseQuantDequantMaxAbsOP
Add Cuda code of ChannelWiseQuantDequantMaxAbsOp
* Add quant_axis for channel_wise quant.
* fixed a bug in unittests that failed to trigger the axis = 1 case and so could not meet the coverage rate requirement.
* Added some assert information and fixed some coding style mistakes.
5 years ago
Leo Chen
bbc84e0fe0
Refine error msg in paddle/fluid/framework/details [part 1] ( #25631 )
...
* refine error msg in var_handle.h, test=develop
* refine all_reduce_op_handle
* fix some error msg
* refine variable_visitor
* refine threaded_ssa_graph_executor
* refine inplace related files
* refine executor related files
* refine fetch_op_handle.cc
* fix bug
* follow comments
5 years ago
MRXLT
f936adbd2d
fix adam ( #27343 )
...
* fix adam
* rmsprop support double
5 years ago
tangwei12
99626502f7
【paddle.fleet】gloo and util ( #27213 )
...
* fix worker endpoints
* fix gloo wrapper for hdfs
* GPU fleetrun support gloo
* parameterserver fleetrun support gloo
* fix get server endpoint
5 years ago
Pei Yang
a5ef246cac
Optimize emb_eltwise_layernorm_plugin and support fp16 ( #27128 )
5 years ago
yaoxuefeng
d726fd5e86
enhance dataset err msg ( #27363 )
5 years ago
Pei Yang
fd7ab4e63c
register pass compatibility ( #27357 )
...
* pass compatibility
* add compatibility registry
* add unittests for different padding
* add assert
* drop errmsg
5 years ago
haozech
7e6dfcf9b2
Add 3 pass version check ( #27283 )
5 years ago
GaoWei8
1a7559718e
fix cudnn dyload ( #27308 )
...
* fix cudnn dyload error
5 years ago
wawltor
b6a4349dd4
fix the error message for the math dir
...
https://github.com/PaddlePaddle/Paddle/pull/27332
5 years ago
HappyAngel
01659a6961
Polish operators error message in average_accumlate OP ( #27268 )
...
* fix op print error info problem. test=develop
* fix build error
* fix format
* fix error msg info
* fix format
5 years ago
Shang Zhizhou
3c11717988
add op version checker to ir passes ( #27329 )
5 years ago
furnace
515efe4240
add empty_like op (python, and unit test), use c++ implementation of empty op, ( #27287 )
...
and optimize the c++ implementation of empty op as PR#26659 reviews,
and add bool for shape op.
5 years ago
Yi Liu
e9a0fbfff2
Optimize OP error messages ( #27301 )
...
Optimize OP error messages in paddle/fluid/operators/distributed_ops
5 years ago
Jack Zhou
63203c4abc
enhance reduce op which can reduce tensor with arbitrary rank
5 years ago
lilong12
9f9d15e285
fix the bug of non-exit, test=develop ( #27350 )
5 years ago
ShenLiang
9ee77b1f41
Fix elementwise_floordiv op ( #27352 )
...
* fix floordiv
5 years ago
Zhou Wei
ebc6d54446
fix cache file judge ( #27369 )
5 years ago
ShenLiang
54b81fa32c
add adaptivelsgd in meta_optimizer ( #27289 )
...
* add adaptivelsgd
* Todo fix the code to avoid the conflict.
5 years ago
Jack Zhou
6e29c2da05
Error description optimize for the math dir
5 years ago
Zhou Wei
f992f8d7ef
fix judge cache file of inference api more accurate ( #27175 )
5 years ago
Jacek Czaja
4582f697b6
- Fix to concat oneDNN overwriting data ( #27273 )
...
test=develop
5 years ago
ShenLiang
c296618c94
fix error message in broadcast/allreduce/gather ( #27302 )
...
* fix error message
5 years ago
Chen Weihang
4f9d6529fe
Polish framework error message part 7 ( #27266 )
...
* polish framework error message part 7
* fix typo
* polish by reviewer's comment
5 years ago
wawltor
4e8582fe5a
update the error message check for some ops
5 years ago
wawltor
d003573f90
add the error message check for some operators
5 years ago
Wilber
dae62556cb
Enhance infer error info message ( #26731 )
5 years ago
Leo Chen
4c8ea492cd
use shared dev_ctx ( #27313 )
5 years ago
Shang Zhizhou
47fdc60ecc
Optimize slice trt plugin ( #26970 )
...
* optimize slice TRT plugin
This patch removes unnecessary barrier for data transfer of needed offset,
so data transfer can overlap with GPU kernel execution.
This patch also fixes incorrect name of slice plugin. That is, replaces
"layernorm" with "slice"
test=develop
* add serialize/deserialize to slice plugin
* add static shape slice trt plugin
* fix slice trt op convertor dynamic shape bug
* fix format by clang-format
* fix pylint format error
* fix problems commented by peiyang
Co-authored-by: Ryan Jeng <rjeng@nvidia.com>
5 years ago
Wilber
f827665ae6
[Pass Compatible] Bind python compatible. ( #27262 )
5 years ago
石晓伟
bd77a4258d
error messages of inference/tests, test=develop ( #27259 )
5 years ago
Chen Weihang
dafb0e3bb7
Polish framework error message part 6 ( #27257 )
...
* polish framework error msg part 6
* polish lossed item
* fix failed unittest
* polish by reviewer comments
5 years ago
Shang Zhizhou
e6e2e53782
Optimize error report ( #27254 )
...
* optimize error report
* add test case for pad op converter
* fix some spelling mistake commented by peiyang
5 years ago
GaoWei8
ee1ed42c99
change sequence length attribute to input ( #27193 )
...
* replace sequence length attr to input
5 years ago
Pei Yang
3ae3b86489
fix trt_dynamic_shape_ernie_deserialize_test ( #27290 )
...
* fix trt_dynamic_shape_ernie_deserialize_test
* support when opt cache dir does not exist
5 years ago
joanna.wozna.intel
1483ea2304
Add bfloat16 passes ( #26999 )
5 years ago
lilong12
bf461fa524
Improving error report message for sequence_expand op ( #27245 )
...
* improve err report, test=develop
5 years ago
Zhong Hui
bbad3414e8
Enhance the error messages for files in operators/math
5 years ago
Chen Weihang
79149c8ee6
polish framework error message part 8 ( #27269 )
5 years ago
Pei Yang
aae41c6fca
refine error message related to paddle-TRT ( #27256 )
5 years ago
Zhen Wang
d708b21074
Update amp_check_finite_and_scale_op and add an updating_loss_scaling op for static graph amp training. ( #26240 )
...
* update amp_check_finite_and_scale_op for static_amp.
* use amp_check_finite_and_scale in static graph amp.
* update grads to zero when grads own infinite values(as for amp_checkout_finite_and_scale op).
* add update_loss_scaling op in cpp.
* add update_loss_scaling_op unit test.
* update the doc of the check_finite_and_unscale op
* Update the process of gradients updating skipping if the gradients have infinite values.
* update the way to zero grads.
* update test_update_loss_scaling_op.py
* add log info when find infinite grads.
* add the unit test for UpdateLossScaling Layer.
5 years ago
ShenLiang
2b6a5793fe
remove auto mode from localsgd optimizer ( #27237 )
...
* rm auto from localsgd
5 years ago
Adam
cc3f4b813a
Add int8 GRU kernel ( #27220 )
...
* Add int8 GRU kernel with UTs
* Lint fixes
* More lint fixes
5 years ago
石晓伟
255e0cf978
error messages of inference/capi, test=develop ( #27258 )
5 years ago
Jack Zhou
9437ce36c4
Error description optimize for math dir
5 years ago
Zhang Ting
5c1bafbbc6
use eval to improve performance, test=develop ( #25459 )
5 years ago
lidanqing
5c4eed66fd
Fix GRU mkldnn kernel fail on lookup_table_v2 ( #27198 )
...
* Fix the lookup_table_v2 failed on GRU mkldnn kernel issue
test=develop
* fix according to reviews, removed x_num_col_dims
test=develop
* update gru model. change according to reviews
test=develop
* change according to reviews
test=develop
5 years ago
Chen Weihang
33ff833af2
fix loaded no params layer run error ( #27241 )
5 years ago
Wilber
f1ab288201
enhance inference error info. ( #27251 )
5 years ago
Wilber
1b84c0bf43
Lite subgraph refine predictor ( #27167 )
5 years ago
furnace
2e59769612
add empty op (c++, python, unit test) ( #26659 )
5 years ago
lilong12
c5f957ae38
add double grad for tile op and expand_v2 op ( #27114 )
...
* add double grad for tile, test=develop
* add double grad for expand_v2 op, test=develop
5 years ago
lilong12
58a88ba9af
add double grad for expand ( #27183 )
...
* add double grad for expand, test=develop
5 years ago
Qi Li
7c7fbd3218
fix error msg of fused_embedding_fc_lstm_op, test=develop ( #27231 )
5 years ago
Qi Li
78446ecdba
[UT] fix run type of ut test cases of test_train_recognize_digits and test_api_impl, test=develop ( #27218 )
5 years ago
Jacek Czaja
e005861598
[oneDNN]Introducing oneDNN 1.6 ( #27137 )
...
* - introducing oneDNN 1.6
test=develop
* - Removed redundant code
test=develop
5 years ago
ShenLiang
5bd84b22c4
revert divide ( #27202 )
5 years ago
wawltor
fde5cfe881
fix the CudaPinMemory bug for the equal op ( #27176 )
...
fix the CudaPinMemory bug for the equal op and add the test case for the equal op
5 years ago
zhupengyang
cc3306f7c8
restruct logsumexp to speed up compiling ( #27191 )
5 years ago
Steffy-zxf
50e60e8779
update error info for selected_rows_functor
5 years ago
Wilber
edd962b1d0
Add 2.0 inference api doc. ( #27125 )
5 years ago
JZ-LIANG
5d039f4086
modified the implement of Lars optimizer ( #26733 )
...
add lars to fleet meta optimizer
5 years ago
wangchaochaohu
c71d79b1d2
[cuda11 support] change the CMakeLists to support the cuda11 ( #27124 )
5 years ago
Qinghe JING
43b0445b29
Add double grad in reduce sum ( #27115 )
...
* set default value to strategy in distributed_optimizer test=develop
5 years ago
kinghuin
ed292695c5
optimize the error message for math dir
5 years ago
yongqiangma
4558d395e9
fix Norm op error ( #26771 )
...
* fix frobenius_norm error, rm p=0 2-axis support. test=develop
5 years ago
LielinJiang
4d7d661249
Fix kl and summary bug ( #27132 )
...
* fix summary rnn
* fix kl_div bug when input shape is [1] and reduction is batchmean
5 years ago
WeiXin
13804ed80c
Error msg/polish tensor error msg ( #26976 )
...
* polish one line error message in tensor.cc
* polish error messages in tensor.cc,tensor.h tensor_impl.h
* polish error messages in tensor.cc tensor.h tensor_impl.h
* polish error messages in tensor.cc,tensor.h tensor_impl.h
* polish error messages in tensor.cc tensor.h tensor_impl.h tensor_test.cc
* polish error messages in tensor.cc tensor.h tensor_impl.h
5 years ago
whs
eb01976037
[2.0 API]Add checker in grid_sample_grad op ( #27126 )
5 years ago
wangguanzhong
a28ae86e11
Enhance ops to support LoD as input for dygraph detection models. ( #25316 )
...
* enhance collect_op for dygraph, test=develop
* enhance detection ops with lod, test=develop
* support none bbox left in generate_proposals, test=develop
* unify MultiLevelRoisNum, test=develop
* update core.ops, test=develop
* add op register for new input & output, test=develop
5 years ago
LielinJiang
8df5b4d608
Add correlation api to contrib ( #27015 )
...
* add correlation api to contrib
5 years ago
LoveAn
cbcd5e407a
Fix problem that target name already exists when there isn't model data cache, test=develop ( #27142 )
5 years ago
kinghuin
1b102dd552
optimize the error message for unpooling.cc
...
fix the error message for the unpooling.cc
5 years ago
Pei Yang
5fb8c92054
fix multihead matmul shared params ( #27121 )
5 years ago
xiaoting
58f3ef982a
fix typo for interp_v2,test=develop ( #26843 )
...
* fix typo for interp_v2,test=develop
* align with torch, test=develop
* add area mode, test=develop
* fix bug, test=develop
* format notes, test=develop
* update for coverage, test=develop
* fix bilinear, test=develop
* fix bicubic, test=develop
* fix typo, test=develop
* fix coverage, test=develop
* fix helper.input_dtype, test=develop
* polish notes, test=develop
* polish notes, test=develop
* polish notes, test=develop
5 years ago
wangchaochaohu
5af81f833c
fix gpu kernel for numel Op ( #27085 )
5 years ago
Wilber
632125415c
Refine python inference api ( #26958 )
5 years ago
YUNSHEN XIE
b150f2b3a6
disable test_trt_dynamic_shape_ernie_ser_deser,test=document_fix ( #27059 )
5 years ago
zhupengyang
19ca6d9dd2
add .part to speed up compile ( #27044 )
5 years ago
LoveAn
fab8bbf25b
Modify data download function and support unittests of inference APIs on windows ( #26988 )
...
* Modify data download function, and support unittests of inference APIs on windows, test=develop
* The import error compatible with py2 and py3, and fix unittests problems of inference APIs on Windows, test=develop
5 years ago
GaoWei8
4ff16eb201
Add padding cudnn interface ( #26370 )
...
* add lstm cudnn of padding data and refine cudnn codes
5 years ago
wawltor
8857e3911f
add the dynamic dtype check for the argmin/argma
...
update the check for the dtype check for the argmin, argmax
5 years ago
wangchaochaohu
041f4ab842
refine linspace Op for dtype setting( #27071 )
5 years ago
yaoxuefeng
9aa39584fe
fix cuda generator hard-coded offset step ( #27027 )
5 years ago
Jacek Czaja
f6653c71e9
[oneDNN] Fix to conv2d grad with groups ( #27006 )
...
* - Added fix to mobilenet
* - compilation fix
* - Fix to conv2d grad oneDNN with groups
test=develop
5 years ago
Chengmo
a72752263b
support heter-xpu-ps ( #27018 )
5 years ago
whs
2660ea379d
Fix cuda kernel of affine grid ( #27003 )
...
test=develop
5 years ago
ShenLiang
ff3dc8ac73
fix the remainder ( #26995 )
5 years ago
yaoxuefeng
7f3e6ca596
add cuda generator ( #26786 )
5 years ago
Feiyu Chan
c8cc094576
add template specialization for bfloat16 for gcc 4.8 compatibility ( #26985 )
5 years ago
wangchaochaohu
3eacced950
[cuda11 support] add support for cublas load of same function name (parameter diff) ( #26963 )
5 years ago
joanna.wozna.intel
95e1434bb2
Add bfloat16 data type ( #25402 )
5 years ago
Yang Zhang
29b844ad5e
Fix clip op attr ( #26924 )
5 years ago
Shang Zhizhou
61fc7a3e45
Pass version check ( #26887 )
5 years ago
huangjun12
e480168fae
fix dropout bug in backward when input is 1d tensor ( #26837 )
...
* fix dropout bug in backward when input is 1d tensor, test=develop
* add test case and refine error message, test=develop
* refine error message, test=develop
5 years ago
YUNSHEN XIE
d8984a6b90
limit timeout value setting on linux ( #26923 )
5 years ago
Zhou Wei
1771d9f880
fix cache judge more safe ( #26910 )
5 years ago
joanna.wozna.intel
0627a319b0
Restore "Add mkldnn bfloat16 option to C-API " ( #26882 )
...
* Add mkldnn bfloat16 option to C-API
* Add test for bfloat16 gpu
* Change coverage test
* Repair capi_gpu test
5 years ago
Jacek Czaja
5e874cc333
- Cosmetic fixes to align with PADDLE_ENFORCE guidelines ( #26891 )
...
test=develop
5 years ago
wanghuancoder
2d2c31a63a
Add FetchAsyncOpHandle, and use it in FastThreadedExecutor ( #26643 )
...
* optimized transformation from tensor to numpy, test=develop
* Modify fetch op handle, from memcpy Sync to memcpy Async, test=develop
* modify CUDAPinnedPlace to CPUPlace, test=develop
* modify CPUPlace to CUDAPinnedPlace, and set default inplace to false, test=develop
* revert fetch_op_handle, add fetch_async_op_handle, test=develop
* revert fetch_op_handle, add fetch_async_op_handle, test=develop
* fix error msg report, test=develop
* fix bug in cpuplace, test=develop
* fix bug in unmerge and tensorarray modle, test=develop
* fix bug, double copy gpu memory, test=develop
* fix chenweihang's review advice, test=develop
5 years ago
Thunderbrook
5205748481
fix eigen in push sparse; fix hadoop command ( #26872 )
...
* fix eigen in push sparse; fix hadoop command
test=develop
* add log in load_combine_op
test=develop
5 years ago
Zhaolong Xing
932bbe955b
fix pool trt plugin bug ( #26463 )
...
test=develop
5 years ago
wawltor
0a29fc85d6
fix the argmin,argmax op for the paddlepaddle 2.0
...
* fix the argmin,argmax op for the paddlepaddle 2.0, add checkPoint for the argmax/argmin
5 years ago
Chengmo
d0962abd20
supplement bug fix of parameter server ( #26217 )
...
* fix fluid.embedding
5 years ago
zlsh80826
ad6e3dd69c
[Paddle-TRT] Stack op plugin ( #25605 )
...
* add stack_op to CMakeLists
* add dim=3 support for scale op
* add trt stack op, test=develop
* remove debug message
* add stack plugin serialize
* remove slice, scale op, will add later
* enhance error message
* revise trt ernie test to cover the stack op CI test, test=develop
* add stack op serialization
* fix test shape after adding stack op
* remove slice op, will add after implementing serialization
* roll back to min_graph=5 to avoid using slice op
* fix scale op output layer
* implement stack op createPlugin
* use workspace and move the definition to .cu
* move stack plugin creator definition to .cu, test=develop
5 years ago
Leo Chen
60ffc22026
Refine bernoulli and unsqueeze op ( #26842 )
...
* add check for bernoulli and register bool for unsqueeze
* follow comments
5 years ago
石晓伟
ced6e87eee
Revert "Add mkldnn bfloat16 option to C-API ( #26676 )" ( #26854 )
...
This reverts commit 02083bda40.
5 years ago
tangwei12
ebc5f99789
add embedding 2.0 ( #26649 )
...
* add embedding 2.0
* add embedding support input int32
5 years ago
hong19860320
40378edfa8
Add the AddCheckpoint macro to softplus op ( #26809 )
5 years ago
GaoWei8
11fb8a1c10
Refine cudnn softmax ( #25757 )
...
* refine cudnn softmax
5 years ago
arlesniak
885c61f086
Add use of global flag 'use_mkldnn' to layer_helper ( #26497 )
...
* get use of global 'use_mkldnn' in layer_helper
* update for CI
* update for CI, relu test
* update for CI, relu test added, make FLAGS_use_mkldnn a public flag
* added more strict tests, fixes after review
* fixes after review
* fixes after review, CI stuff
5 years ago
Pei Yang
78a530c219
[Paddle-TRT] TRT dynamic shape support PaddleSlim quant models ( #26536 )
...
* support trt dynamic shape int8
* add unittest
* add support for sigmoid; adapt to trt6+ api
5 years ago
wawltor
7ee70a47b8
update the doc for some ops
...
update the doc for some ops: ceil, asin, atan
5 years ago
yaoxuefeng
a47d92d868
fleet add save with whitelist test=develop ( #23376 )
5 years ago
zhupengyang
0f1ad9b06c
leaky_relu and hardshrink add checkpoint for behavior changed ( #26802 )
5 years ago
Chengmo
7f2aa2db3c
【paddle.fleet】Support Heter Parameter Server ( #25998 )
...
* Support Heter Parameter Server
5 years ago
zlsh80826
ac63c7cdef
fix a skip_layernorm bug, test=develop ( #26800 )
5 years ago
Jiawei Wang
a1b99fae07
Adadelta Optimizer ( #26590 )
...
* add doc; notest
* fix doc; notest
* update doc; notest
* refine optimizer && adam
* refine optimizer; notest
* add adam
* fix doc
* fix doc && add adamw; notest
* add error message
* bug fix
* refine rmsprop && adamax
* fix ci
* bug fix
* update comment
* unify arguments place; notest
* fix ut, test=develop
* bug fix
* fix conflicts, test=develop
* add examples code
* bug fix
* fix comments
* fix sample code
* add sample code for Optimizer
* add adamax ut, test=develop
* fix rmsprop ut, test=develop
* add ut for optimizer.py and adamw.py
* first commit of adadelta optimizer
* fix learning rate
* fix adadelta doc and add sgd momentum
* remove unused fluid
* fix codestyle
* Update test_adam_op.py
* Update test_adam_op.py
* fix SGD in 2 unittests
* fix SGD in 2 unittests
* fix ci
* fix ut
Co-authored-by: MRXLT <xlt2024@gmail.com>
Co-authored-by: mapingshuo <mps2012@yeah.net>
5 years ago
LielinJiang
346689c6f1
Register conv_transpose Op version for compatible Op upgrades ( #26745 )
...
* fix bug
* add version check
* fix docs, test=document_fix
* fix formula, test=document_fix
5 years ago
Adam
8bcb1f29d9
Add conv+affine_channel fuse pass to MKLDNN pass strategy and fix it ( #26779 )
5 years ago
Wilber
68e0560c2f
refine paddle inference api ( #26774 )
...
* refine paddle inference api
Co-authored-by: nhzlx <nhzlx.dragon@gmail.com>
5 years ago
Wojciech Uss
7afb1df11e
Decouple weights and bias from fc primitive in MKLDNN cache ( #26708 )
...
* decouple weights and bias from fc primitive in cache
* removed redundant update of pointers
5 years ago
Zhen Wang
f32ae272ec
Remove `sorted_sum_gradient_` from BasicEngine and PartialGradTask. ( #26766 )
...
Use `Tensor` instead of `Variable` in the doc of paddle.grad.
5 years ago
Leo Chen
844583c8fd
Refine paddle.manual_seed ( #26496 )
...
* refine manual seed
* fix ci problem
* fix unittests
* fix unittest
* set is_init_py=false in manual_seed
* fix unittest
* fix bernoulli_op
* fix(unittest): change random_seed to manual_seed
* 🐞 fix(unittest): fix manual_seed
* trigger ci
* fix test_sentiment
* fix test_imperative_save_load
* fix test_uniform_random_op
* fix test_uniform_random_op
* fix test_jit_save_load
* merge develop
* fix manual_seed
* fix manual_seed
* use global engine
* use shared_ptr
* fix double free
* fix bug
* fix bug
* fix bug
* fix test bug
* fix test bug
* fix test bug
* fix ci
5 years ago
Pei Yang
e3f8e5cf5c
trt int8 support conv2d_transpose ( #26636 )
5 years ago
ShenLiang
29494d703d
fix remainder, floor_div ( #26732 )
...
* fix remainder, floordiv
5 years ago
zhangchunle
623a4c2e56
fix ci coverage build error ( #26761 )
5 years ago
lilong12
5f524efe56
modify error report message, test=develop ( #26743 )
5 years ago
wangchaochaohu
4561fc37e2
Add check point for gather Op ( #26696 )
5 years ago
joanna.wozna.intel
eb097d64f6
Fix int8 performance drop cpu_quantize_placement_pass ( #26715 )
...
* Fix cpu quantize placement pass
* Include string lib
5 years ago
joanna.wozna.intel
02083bda40
Add mkldnn bfloat16 option to C-API ( #26676 )
...
* Add mkldnn bfloat16 option to C-API
* Add test for bfloat16 gpu
* Change coverage test
5 years ago
LutaoChu
1ec30cb160
register cumsum Op version for compatible Op upgrades ( #26734 )
5 years ago
Jack Zhou
c282db3a93
add broadcast feature for elementwise logical op
5 years ago
Yang Zhang
63eef7632e
Fix clip input check ( #26683 )
...
* Fix clip input check
* Fix default min/max value
* Allow both max and min to be None
* Register op change
* Revert OP signature change
5 years ago
Zhen Wang
f9066e6a6f
Update the demo code and the doc of varbase.backward. ( #26506 )
...
* update the demo code and the doc of varbase.backward.
* update the doc of the fake interface `paddle.fluid.Variable`.
* remove BackwardStrategy.
5 years ago
Wilber
1c898b66d6
add bug fix enum. ( #26736 )
5 years ago
Zhou Wei
8071d23073
fix bug that can't print int8_t ( #26712 )
5 years ago
joejiong
f311d3c1cf
Fix pow api type error with python side method, merge elementwise_pow and pow. ( #26163 )
...
As the title
5 years ago
yongqiangma
e4cc6a28b0
Norm op support 2-axis ( #26492 )
5 years ago
chalsliu
dc56c89822
Add the option to execute unit tests only at night ( #26669 )
...
* Add the option to execute unit tests only at night
* set ut nightly label for 3 cases.
5 years ago
xiaoting
89d7d86684
add intepolte_v2 ( #26520 )
...
* add intepolte_v2
* fix linear interp
* polish unittest, test=develop
* update code samples to 2.0 API, test=develop
* remove warning, test_develop
* add name in attrs, test=develop
* polish code, test=develop
* change Align to align, test=develop
* fix unittest in py3,test=develop
* fix coverage, test=develop
* fix coverage, test=develop
* fix for windows ci, test=develop
* fix coverage, test=develop
5 years ago
Adam Osewski
c2c689582e
Update Paddle-Lite commit hash. ( #26413 )
...
* Update Paddle-Lite commit hash.
* Add BF16 data type to VarTyp protobuf message.
5 years ago
Zhang Ting
97cebfa4d3
add dtype for unique ( #26655 )
...
* update doc, test=document_fix
* add attr(dtype)
* refine code
5 years ago
lilong12
1c68138327
[api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis ( #26552 )
...
add collective op for cpu using gloo and paddle.distributed.* apis
5 years ago
joanna.wozna.intel
559e43eee4
Small change in conv2d and quantize pass ( #26671 )
5 years ago
Bai Yifan
8986a82131
fix adaptive gpu grad bug, add doc refine ( #26660 )
5 years ago
wawltor
286eca2d9e
update the code for the topk v2
...
add the top v2 for the paddlepaddle api 2.0
5 years ago
whs
f82384113b
Fix atomicAdd in grid sample op and affine grid op ( #26647 )
...
test=develop
5 years ago
Wilber
32ba8602c6
Enhance py_func error info message. ( #26557 )
5 years ago
chalsliu
cb3f131f1c
Set timeout property for a few unittests
5 years ago
石晓伟
32ceacf317
update op_version_registry, test=develop ( #26644 )
5 years ago
Dong Daxiang
08d736ad78
【paddle.fleet】add cudnn related strategies to DistributedStrategy ( #26598 )
...
* add cudnn related strategies to DistributedStrategy
5 years ago
Zhang Ting
0a895bc0df
improve unique op ( #26537 )
...
* add unique_v2 op
* remove unique_v2 op
* update doc
5 years ago
whs
a004dfde3d
Use atomicAdd defined in paddle framework ( #26631 )
...
test=develop
5 years ago
LoveAn
02fc1fef8b
Fix the cmake-function named inference_download_and_uncompress on Windows ( #26512 )
...
* Fix the cmake-function named inference_download_and_uncompress with Windows, test=develop
* Fix some problems when remove limit of unittests on Windows, test=develop
* Using URL to download file instead of DOWNLOAD_COMMAND. test=develop
5 years ago
YUNSHEN XIE
a8b5741fb4
add a few unittests for setting timeout property ( #26630 )
5 years ago
wanghuancoder
c1f5df5269
optimized transformation from tensor to numpy ( #26447 )
...
* optimized transformation from tensor to numpy, test=develop
* optimized transformation from tensor to numpy, pass pre-commit, test=develop
* modify fetchophandle zerocopy to deepcopy in PE&CUP, test=develop
* modify py:array construct, test=develop
* fix _fetch_var to use deep copy, test=develop
5 years ago
zhupengyang
c80fcf901e
reduce_mean error if keepdim=True and reduce_all=True ( #26614 )
5 years ago
whs
a065a24232
【2.0 API】Enhance affine grid operator ( #26385 )
...
* Enhance affine grid operator:
1. Add cuda kernel
2. Add align corners options
test=develop
* Move new affine_grid api to functional
test=develop
* Add CUDA kernel for affine_grid.
test=develop
* Add more unittests for grid sample API
test=develop
5 years ago
Qi Li
6f69fbc8ea
fix elu grad when alpha is less than zero, test=develop ( #26543 )
5 years ago
whs
786373ba29
Use atomicAdd defined in paddle framework ( #26628 )
...
test=develop
5 years ago
ruri
1f82c0cd62
[Api2.0] add pixel shuffle ( #26071 )
5 years ago
wanghuancoder
422a162019
api2.0 paddle.nn.Bilinear and paddle.nn.functional.bilinear ( #26399 )
...
* api2.0 paddle.nn.Bilinear and paddle.nn.functional.bilinear, test=develop
* api2.0 fix code examples, test=develop
* modify test_bilinear_api, about place,to_tensor , test=develop
* re pass pre-commit, test=develop
* Update common.py
* fix BilinearTensorProduct ci error, test=develop
5 years ago
wanghuancoder
6e823cfec3
add op_function_generator.exe retry in windows, test=develop ( #26591 )
...
add op_function_generator.exe retry in windows
5 years ago
石晓伟
fa08a834be
update op_version_registry, test=develop ( #26592 )
5 years ago
whs
79539cf198
【2.0 API】Add CUDA kernel and enhance options for grid_sample ( #26576 )
...
This PR enhances the CPU kernel and adds a new CUDA kernel so that grid_sample supports:
- align_corners: a bool.
- padding mode: one of ['zeros', 'reflect', 'border'].
- interpolation mode: one of ['bilinear', 'nearest'].
The old CPU and CUDNN versions only support align_corners=true, padding_mode='zeros' and interpolation_mode='bilinear'.
The behavior of the new version of the op in default mode is compatible with the old version.
5 years ago
Guanghua Yu
8645591d66
support fp64 in huber_loss cuda kernel ( #26583 )
5 years ago
yaoxuefeng
efee426742
support generator seed in related kernels test=develop ( #26495 )
5 years ago
Zhong Hui
bf4a4636f1
change to use bce_loss op, add shape check for bce_loss
...
change to use bce_loss op, add numel check for bce_loss.
5 years ago
ShenLiang
0e81626081
add div, floor_div, remainder ( #26562 )
...
* add div, floor_div, remainder
5 years ago
石晓伟
656e60b18f
new class: op_version_registry, test=develop ( #26542 )
5 years ago
qingqing01
24566e951c
Support empty bbox in bipartite math op ( #26488 )
5 years ago
Jack Zhou
199b0c7c1b
Add isfinite v2 op ( #26344 )
...
add the isnan, isfinite, isinf api for the paddle 2.0
5 years ago
wangchaochaohu
ebf9b2125e
add paddle.gather for API2.0 ( #26455 )
5 years ago
wangchaochaohu
9219b79104
gather_nd Op for API 2.0 refine ( #26540 )
5 years ago
zhupengyang
9b14117cac
logsumexp: impl kernel, refine docs ( #26307 )
5 years ago
Wojciech Uss
5c2b9258a6
Fix (de/re)quantize cache keys ( #26549 )
5 years ago
wawltor
6b28456ed0
add the argmax, argmin for the api2.0
...
* add the new api and op for the argmax, argmin
5 years ago
LielinJiang
d26ae9ad87
Update conv_transpose api ( #26427 )
...
* update conv_transpose api
5 years ago
lilong12
faa9b97b78
fix cscatter, test=develop ( #26554 )
5 years ago
WangXi
45711dade7
【API】rename div to divide, add floor_divide, remainder ( #26434 )
5 years ago
LutaoChu
4e0c6d91aa
add paddle.tensor.linalg.diag API, diag_v2 OP and CUDA kernel
5 years ago
zhupengyang
f8863e0603
leaky_relu and LeakyReLU: alpha->negative_slope ( #26216 )
5 years ago
ShenLiang
c609066074
Add Matmul op ( #26411 )
...
* add matmul_v2
5 years ago
Leo Chen
aa2a9b5d89
add bernoulli op ( #26511 )
...
* add bernoulli op
* fix cuda kernel and add unit test
* refine doc
* fix uniform
5 years ago
Adam
f3909020de
Add mechanism for blocking oneDNN cache clearing ( #26502 )
...
* Add mechanism for blocking oneDNN cache clearing
* Review changes and Add thread guards
5 years ago
ShenLiang
b6eb37f5b3
add error message for cholesky ( #26444 )
...
* add error message
5 years ago
QingshuChen
138ecf24aa
support Baidu Kunlun AI Accelerator ( #25959 )
...
* support Baidu AI Accelerator
* test=kunlun
* minor
* test=kunlun
* support xpu op in separate file
* test=kunlun
* update XPU error message and remove duplicated code
* test=kunlun
* minor
* test=kunlun
* minor
* test=kunlun
5 years ago
yaoxuefeng
4f259354d2
mod cvm test=develop ( #25146 )
...
* mod cvm test=develop
* mod code format test=develop
5 years ago
wangchaochaohu
e167e87974
【API2.0】add masked_select Op for API2.0 ( #26374 )
5 years ago
Pei Yang
379222c3f1
add output scale and trt op teller support for hard_swish and hard_sigmoid ( #26499 )
5 years ago
zhupengyang
6e5670b8bd
mean: not support int32, int64; add check for axis ( #26401 )
5 years ago
zhupengyang
4ad504e7c7
hardshrink: support threshold < 0 ( #26403 )
5 years ago
lilong12
e92f770c42
Add collective ops (reduce) ( #26340 )
5 years ago
wangchaochaohu
bdb805505e
【API2.0】add numel API for paddle test=develop ( #26311 )
5 years ago
wangchaochaohu
2073ffc04d
Enhance the data type of linspace API ( #26247 )
5 years ago
hong19860320
40d193ed17
Add the ReLU6, Tanhshrink, SELU, Softplus, Softshrink and Softsign for the api 2.0 ( #26376 )
5 years ago
Chen Weihang
9108282883
Polish framework error message part 5 ( #26204 )
...
* polish framework error msg part 5
* revert enforce change
* refine error type
* trigger ci check
* polish details by review comment
5 years ago
Zhaolong Xing
f00f982a02
add cub impl for arg max, min ( #25941 )
...
test=develop
5 years ago
Zhang Ting
6914a12f82
rename the inputs of allclose ( #26360 )
...
* rename input
* add unittest, test=develop
* use paddle.data instead of fluid.data, test=develop
5 years ago
littletomatodonkey
bcf03273f6
add pad func ( #26106 )
...
* add pad func
* add pad
* test=develop, add pad op and apis
* restore pad2d
* test=develop, fix paddle declare
* fix pad interface
* test=develop, fix pad
* test=develop, add all pad api and cos_sim
* test=develop, remove padding default value
* test=develop, rename var to tensor
* test=develop, add more tests
* test=develop, rename tovar to totensor
* test=develop, fix init
* test=develop, add more test
* test=develop, add more tests
5 years ago
Chengmo
eeeef957c7
Fix ps gpu ( #26218 )
...
* support ps-gpu
5 years ago
Zhong Hui
6cbeafb6c0
add zero norm, inf norm support for p_norm op ( #26364 )
...
* add zero norm, inf norm support for p_norm op
* fix the invalid argument check, fix the dtype problem in test case.
5 years ago
Zhaolong Xing
b7a86e92a8
fix dy shape bug in trt7.1 ( #26273 )
...
test=develop
5 years ago
ceci3
56890dc729
Add SyncBatchNorm ( #26032 )
...
* add SyncBatchNorm,test=develop
5 years ago
GaoWei8
1fbee267d4
remove scope in cudnn lstm ( #25188 )
5 years ago
Pei Yang
b757466b0d
fix trt dynamic ernie serialization unit test ( #26228 )
5 years ago
Wilber
3ec0bcbbb8
[Bug] Fix prune for save_inference_model about transformer ( #25347 )
5 years ago
cc
3f816bc8b4
[Quantization] Conv2d_transpose and mul support channelwise quantization ( #25639 )
...
* Conv2d_transpose and mul support channelwise quantization, test=develop
* Skip collecting out threshold for output tensor of which the type is not fp32 or fp64, test=develop
* Fix error in test_user_defined_quantization, test=develop
* Add depthwise_conv_bn_fuse, test=develop
* Add conv_transpose_bn_fuse_pass for post_training_quant, test=develop
5 years ago
lilong12
638bbb6153
Improve expand as ( #26290 )
...
align expand_as op to expand.
5 years ago
Thunderbrook
a83e0f264c
fix heter proto ( #26093 )
...
test=develop
5 years ago
Leo Chen
049ac56c08
Print user-friendly error message in core.ops [part 2] ( #26377 )
5 years ago
zhupengyang
586a6dd358
log_softmax and LogSoftmax: impl kernel and refine docs ( #26088 )
5 years ago
yaoxuefeng
23261ff44b
add cpu random Generator ( #26013 )
5 years ago
Sylwester Fraczek
69742bd9a4
Enable mkldnn layout conversion ( #25778 )
...
* enable mkldnn layout conversion
* review fix: remove tmp_place
* fix test mkldnn swish
* add UT for PrepareData CPU->MKLDNN
* add #ifdef PADDLE_WITH_MKLDNN
* Force-push commit
Co-authored-by: grygielski <adam.grygielski@gmail.com>
5 years ago
Leo Chen
672578a797
Print user-friendly error message in core.ops ( #26261 )
...
* print user-friendly error message
* adjust error summary
5 years ago
Jack Zhou
6d22f5c73e
Add PADDLE_ENFORCE in nll loss cuda kernel ( #26294 )
...
* add nll loss API, update demo code of the comment
5 years ago
wangchaochaohu
0b81d76310
[API2.0] add op for cudnn version query test=develop ( #26180 )
5 years ago
lilong12
241b44db14
[API 2.0] adaptive expand op to use shape instead of expand_times ( #26206 )
...
* adaptive expand op to 2.0 (align to torch.expand) , test=develop
5 years ago
wangchaochaohu
bb11cbc250
[API2.0] add Device api (set_device and get_device)( #26103 )
5 years ago
Zhou Wei
6de463d3d1
expose and unify the Tensor concepts to the user ( #25978 )
...
* expose and unify the Tensor concepts to the user
* expose tensor to user
* add copy place for Tensor
* add copy place for Tensor
* add note
* add macro PADDLE_WITH_CUDA
* remove RUN_TYPE=DIST
* fix some error
5 years ago
lilong12
fbd4d3cc97
[API 2.0] add paddle.tile op ( #26245 )
...
* add tile_op, test=develop
5 years ago
Zhou Wei
20147ace3f
fix_copy_if_different ( #25868 )
5 years ago
Wilber
c84aa9c61f
update diff val. ( #26242 )
5 years ago
Yang Zhang
a2d3e5c03b
Fix `paddle.abs` docstring ( #25942 )
...
test=document_fix
remove activation wording
5 years ago
Yang Zhang
22165934bc
Fix `paddle.acos` docstring ( #25958 )
...
test=develop,test=document_fix
remove activation wording
5 years ago
Yang Zhang
a5b5b00e02
Fix `paddle.asin` docstring ( #25967 )
...
test=develop,test=document_fix
remove activation wording
5 years ago
Yang Zhang
c758765769
Fix `paddle.atan` docstring ( #25968 )
...
test=develop,test=document_fix
remove activation wording
tanh -> tan
5 years ago
Yang Zhang
c4e480efc5
Fix `paddle.cos` docstring ( #25969 )
...
test=develop,test=document_fix
explain input/output range and out of boundary behavior
5 years ago
wawltor
2d6cc0b125
support the tuple for attribute of axis in min, max for api2.0
...
Update the code for the min,max, test=develop
5 years ago
Dong Daxiang
50a5bcfc9d
【paddle.fleet】paddle.fleet -> paddle.distributed.fleet. ( #26186 )
...
* move paddle.fleet to paddle.distributed.fleet
5 years ago
Leo Chen
ffe52b4452
[OpDevOptimize] Add common infershape functions ( #26096 )
...
* add unchanged infershape function
* add broadcast infershape function
* fix bug
* rename infershape functions
* add UnaryOpUnchangedInferShapeCheckAxis
* add error message
* add test for common infer shape functions
* don't update existing ops
* don't update op_desc.h
* add more test
* add error check, refine error message
5 years ago
Leo Chen
2d95280e1f
Feature/Enable Auto-Mixed-Precision in dynamic graph ( #24903 )
...
* add auto_cast, test=develop
* add loss scaler, test=develop
* add comments, test=develop
* refine code, test=develop
* refine code, test=develop
* do not set flags automatically, test=develop
* fix custom op bug, test=develop
* add more test, test=develop
* refine enable logic, test=develop
* enable amp test with GPU, test=develop
* add unittest
* add test for found_inf
* follow comments
* follow comments
* remove global variable, use singleton
* add some notes
* update comments
* update comments
* update comments
* add use_dynamic_loss_scaling argument
* refine found_inf
* refine found_inf
5 years ago
Chen Weihang
838e36e9ed
Fix loaded variable suffix repeat error ( #26169 )
...
* fix loaded var suffix repeat error
* use new dygraph name for loaded param
5 years ago
Jack Zhou
dea41da715
add nll loss API for the paddlepaddle api2.0
...
* add nll loss API, update demo code of the comment
5 years ago
Wilber
fb72b192e7
[DOC] Fix dead link ( #26154 )
5 years ago
wawltor
9c17b3c9f8
Add the max, min, maximum, minimum api for the API 2.0
...
* Add the max, min, maximum, minimum api for the API 2.0, test=develop
5 years ago
JZ-LIANG
54003b873e
【paddle.fleet】add lamb to fleet meta optimizer ( #26025 )
...
add lamb to fleet meta optimizer
5 years ago
Yiqun Liu
1be6bf45ae
Add assign to fusion_group and enhance inplace execution in fusion_group. ( #26121 )
5 years ago
lidanqing
65b97d6215
GRU model xnli dataset C++ tester ( #25534 )
...
* Add lexical GRU unit test
performance works
* Get model accuracy
* model and data name to be confirmed
test=develop
* update model name and output format
test=develop
* update according to reviews
test=develop
* add accuracy check
* accuracy check between native and analysis
test=develop
* fix a reading bug, fix gru passes sequence
test=develop
* fix passes sequence
test=develop
5 years ago
Zhen Wang
a86e8c0eef
add more error info for these ops without double grad ops. ( #25987 )
5 years ago
MRXLT
6559229b7e
fix encryption infer ( #25979 )
...
* add encrypt for inference lib
* fix code;test=develop
* fix test; test=develop
* bug fix; test=develop
* add MakeCipher;test=develop
* fix bug;test=develop
* move MakeCipher to paddle space; test=develop
* fix include dir ;test=develop
* add include dir; test=develop
* move include; test=develop
* move include; test=develop
* fix for windows ci
* fix cmake; test=develop
* fix bug
bug fix
5 years ago
lilong12
8caee2ad51
【paddle.fleet】add the support for multi-node training for pipeline ( #25907 )
...
* add the support for multi-node training
5 years ago
LutaoChu
bf2db646de
fix cumsum op for API 2.0, optimize performance
...
update cumsum api and fix up the cumsum op
5 years ago
Adam
1893cd6bb8
Add oneDNN relu6 op ( #26037 )
...
* Add oneDNN relu6 op
* Lint fixes
5 years ago
Zhaolong Xing
50f149a48e
fix cudnn workspace size problem during inference. ( #26021 )
...
test=develop
5 years ago
tangwei12
c14ec8782b
【paddle.fleet】Feature/fleet ps api 2.0 ( #25857 )
...
* add paddle.fleet.AsyncOptimizer
Co-authored-by: dongdaxiang <dongdaxiang@baidu.com>
5 years ago
Chen Weihang
3c8daa9b89
Add pin memory control for BufferedReader ( #26026 )
...
* add pin memory control
* fix buffered reader init problem
* fix unittest error
* add unittest for coverage
5 years ago
Chen Weihang
ad4a0466a5
Add cuda pinned place branch in slice op GetExpectedKernelType ( #26027 )
...
* add cuda pinned place branch
* add unittest
* add skip when not gpu
5 years ago
Feiyu Chan
e853ece0a2
update document template for unary elementwise layers ( #25896 )
...
1. update document template for unary elementwise layers(a.k.a. activation layer);
2. remove generate_op_noattr and use generate_activation instead; remove redundant function copies;
3. minor update for docstring to fix rst format errors.
4. fix doc for Rsqrt OP
5. add sample code for each activation separately;
6. remove the unused deprecated decorator.
5 years ago
joanna.wozna.intel
734cf1c3e9
Change use_quantizer attribute name and data type ( #25838 )
...
* Change use_quantizer attribute name and data type
* Fix problem with setting attribute
* Add changes due to review
* Small change in function
* Restore use_quantizer attr for compatibility
5 years ago
Leo Chen
5258d53d65
refine unsqueeze, test=develop ( #25470 )
...
* refine unsqueeze, test=develop
* update unsqueeze, test=develop
* refine unsqueeze, test=develop
* refine unsqueeze, test=develop
* update
* remove None, test=develop
* follow comments
* support bool
* update doc
* follow comments
* merge develop
5 years ago
tangwei12
3755564ae1
Fix/large scale fix ( #25999 )
...
* fix large scale KV
* fix single training using async ssa graph
5 years ago
Leo Chen
751305ecf0
Add flags to control call stack of error message ( #25997 )
...
* add flags_call_stack_level
* update
* refine code
5 years ago
Thunderbrook
fd2947babf
fix compile error with mkl ( #26030 )
...
test=develop
5 years ago
Leo Chen
0a47387bd8
Use static local variable instead of global variable for safety ( #26018 )
...
* remove global variable
* refine code
5 years ago
Pei Yang
beb0ca5fab
Fix TRT plugin registry without TRT lib ( #25982 )
...
* fix trt plugin registry without trt lib
* support trt4
* refine code style
5 years ago
123malin
2191a08317
【paddle.fleet】fleet_util move to paddle.fleet ( #25805 )
...
* test=develop,test=document_fix, remove the out args
* fleet_util move to paddle.fleet
Co-authored-by: WuHaobo <wuhaobo1994@gmail.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>
5 years ago
yaoxuefeng
224620071b
add new flatten op test=develop ( #25393 )
5 years ago
Adam
68c6160e63
Add oneDNN fusion_gru kernel ( #25594 )
...
* Add oneDNN fusion_gru kernel and fix fc+gru pass
test=develop
* Formatting changes
test=develop
* Lint fixes
test=develop
* Add memory::format_tag::any to GRU weights
test=develop
* Fix build with CUDA
* Fix build with CUDA v2
5 years ago
Thunderbrook
0cb60c700d
add heter ps mode ( #25682 )
...
* add heter ps mode
* code style
test=develop
* add with_pslib
test=develop
* unittest
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* test monitor
test=develop
* prepare trainer
test=develop
* code style
test=develop
5 years ago
Zhong Hui
dca56f47f5
fix invalid read of pnorm gradient function
...
fix invalid read of pnorm gradient function and delete the unused code
5 years ago
WangXi
2c9d0f3cb9
【paddle.fleet】Add dgc to fleet meta optimizer ( #25738 )
...
Add dgc to fleet meta optimizer, rm dgc from optimizer all
5 years ago
Zhaolong Xing
358bc06c72
[CUDNN8 support] : support CUDNN8 ( #25664 )
...
* cudnn8 support
test=develop
* fix ci error
test=develop
5 years ago
Zhaolong Xing
5970871a64
add eltwise clip cuda impl. ( #25689 )
...
test=develop
5 years ago
Zhen Wang
82374dc12f
Add some error messages for the op without double grads. ( #25951 )
...
* Add some error messages for the op without double grads.
* fix the test_imperative_double_grad UT.
5 years ago
Pei Yang
b717895f64
Fix registering trt plugin ( #25744 )
...
* develop dynamic shape serialization
* add test param for gelu
* fix bugs
* delete redundant comments
* debug
* fix conflict. test=develop
* fix bug. test=develop
* add trt dynamic shape serialized support
* fix ernie serialized bug
test=develop
* fix codestyle
test=develop
* fix bug
test=develop
* fix bug.test=develop
* modify cmakelist test=develop
* fix bug
test=develop
* fix error message. test=develop
* fix trt register plugin based on pr#25003
* add trt dynload
* fix deserialization bug of not finding plugin registration
* refine code style
* recover engine key in tensorrt_subgraph_pass
* for ci coverage
* add unittest for deserialization
Co-authored-by: haozech <chenhaoze94@gmail.com>
5 years ago
wawltor
a697e94693
Update the code of the compare ops for the broadcast function
...
Update the code for the compare ops for the broadcast function
5 years ago
Chen Weihang
9b5a65b819
refine init signal handler meg dumper ( #25911 )
5 years ago
wangchaochaohu
ff717d5158
Add support for tuple of concat Op test=develop ( #25800 )
5 years ago
tangwei12
253fd407e8
Fix/distributed heart beat ( #25902 )
...
* disable heart beat UT
5 years ago
WangXi
a6c87fd091
Add amp to fleet meta optimizer, test=develop ( #25770 )
5 years ago
Pei Yang
9e9a569dae
add trt int8 support for elementwise_mul and scale ( #25676 )
5 years ago
xujiaqi01
d11c140e28
fix dump, fix cvm check ( #25400 )
...
* fix dump, fix cvm check
test=develop
* fix
test=develop
* fix
test=develop
* fix
test=develop
5 years ago
JZ-LIANG
8ebffc78c9
add lars to fleet meta optimizer ( #25884 )
5 years ago
Dong Daxiang
8d2896f1fe
【paddle.fleet】Fleet run graph in Executor and add two more strategies ( #25844 )
...
* split meta optimizer files
* add graph execution in execution, update two properties in DistributedStrategy, unit tests for these features
5 years ago
Zhang Ting
6486fe8a94
improve GPU performance of transpose, test=develop ( #25862 )
5 years ago
Zhang Ting
2d24f56a7a
avoid data transfer, test=develop ( #25810 )
5 years ago
ShenLiang
bca303165a
fix inverse bug ( #25641 )
...
* fix inverse bug, test=develop
* fix the unittest, test=develop
* add singular checking, test=develop
* fix the unittest, test=develop
* use memory::copy, test=develop
* fix boost_get, test=develop
* fix position, test=develop
5 years ago
Chen Weihang
48b9a56f1c
Polish framework error message - part 4 ( #25807 )
...
* polish framework error message part 4
* fix type error
* fix message error
* polish by review comments
5 years ago
Aurelius84
e52dae6ef6
Using input.place() in GetExpectedKernel in slice_op ( #25595 )
...
* modify GetExpectedKernelType
* use input place
* add ENFORCE check
5 years ago
wawltor
595a719795
Update the api for the compare_ops
...
Update the code for the compare_ops, update the api and doc
5 years ago
wangchaochaohu
32b9577b2a
refine the split op for API 2.0 test=develop ( #25320 )
5 years ago
lilong12
ce506930c3
Fix the bug that Input(Offsets) and attr(offsets) cannot be set at the same time. ( #24975 )
...
* bug fix, test=develop
5 years ago
tangwei12
2d9dbd31ad
Fix/mkl dnn ( #25835 )
5 years ago
Zhaolong Xing
bcddefef39
[Fix UT]: fix inference UT that has a bug on Windows. ( #25814 )
...
* fix windows test
test=develop
* fix ci
test=develop
5 years ago
lilong12
5f30e57cdd
fix test_pipeline, test=develop ( #25808 )
...
* fix test_pipeline, test=develop
5 years ago
Chen Weihang
d47304e6d9
Refine paddle error stack format ( #25790 )
...
* refine error stack format
* polish compile traceback format
* polish detail format
5 years ago
tangwei12
caa90a6510
Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) ( #22957 )
...
* Integrated Trainer of Parameter Server
5 years ago
hong
c2a21ca9c9
Fix dygraph grad bugs ( #25781 )
...
* fix double grad visitid unit; test=develop
* change name hash_pair to HashPair; test=develop
* follow comment; test=develop
5 years ago