Double_V
d43f75e4cc
add rois_num for roi_align xpu OP ( #28077 )
...
* add stack pool2d roi_align xpu op,test=kunlun
* error message opt, test=kunlun
* add xpu unittest,test=kunlun
* skip check grad,test=kunlun
* fix boostget , test=kunlun
* error message opt for XPU, test=kunlun
* add rois_num for roi_align xpu OP, test=develop
5 years ago
xiaoting
e3d02c9574
rm max_input in conv2d for kunlun, test=kunlun ( #28062 )
5 years ago
joanna.wozna.intel
a21b57109c
Add AVX512 instruction check for C-API ( #28087 )
...
* Add AVX512 instruction check for C-API
* Fix formatting
5 years ago
wangchaochaohu
463c72c2d9
refine gpu kernel config for Paddle ( #28085 )
5 years ago
yinhaofeng
2cb1ecb99e
lookup_table_v2_op_xpu report errors;test=kunlun ( #28064 )
...
* lookup_table_v2_op_xpu report errors;test=kunlun
* lookup_table_v2_op_xpu report errors;test=kunlun
5 years ago
yinhaofeng
6f0c3d1f06
xpu adam op ( #28031 )
...
* lookup_table_xpu op report errors;test=kunlun
* add adam xpu op;test=kunlun
* reset lookup
* change adam wrong;test=kunlun
5 years ago
TeslaZhao
a5c95cd588
Add xpu transpose2 op.test=kunlun ( #28086 )
5 years ago
Chengmo
5f04875c30
Fix xpu error message ( #28061 )
...
* fix error message,test=kunlun
* fix, test=kunlun
5 years ago
LutaoChu
c8d32c8c10
Fix diag OP bug on Windows Python3.8
...
Fix diag OP bug on Windows Python3.8, remove the std::min
5 years ago
Pei Yang
a0b2f93689
reduce trt warning message ( #28011 )
5 years ago
huangxu96
d466893820
Allclose op ( #27891 )
...
* Still has bugs.
* Fixed allclose_op bug, which cannot deal with some cases of fp64 inputs.
* improved CUDA kernel performance.
* Changed CUDA code.
* Fixed a bug in cuda kernel which cannot deal with large dimension input, and added an unittest for it.
* Add a test case for float32 input.
5 years ago
pangyoki
975bd8873b
Fix error message of multinomial op ( #27946 )
...
* fix multinomial doc
* fix multinomial error message
* little doc change
* fix Categorical class doc
* optimize format of error message
* fix CPU Kernel error message format
* fix isinf and isnan error in WindowsOPENBLAS CI
* delete inf and nan
* add manual_seed in sample code
* little error message change
* change error message to InvalidArgument
* add full point for error message and add manual_seed in CPU environment
5 years ago
Kaipeng Deng
b6eff4427c
update yolo_box support h != w. test=develop ( #27327 )
5 years ago
Double_V
c1eed1fa24
error message opt for XPU, test=kunlun ( #27972 )
...
* add stack pool2d roi_align xpu op,test=kunlun
* error message opt, test=kunlun
* add xpu unittest,test=kunlun
* skip check grad,test=kunlun
* fix boostget , test=kunlun
* error message opt for XPU, test=kunlun
5 years ago
pangyoki
4c5b779a99
Add truncated_gaussian_random XPU kernel ( #27861 )
...
* Add truncated_gaussian_random_op XPU kernel
* Add truncated_gaussian_random_op XPU kernel, test=kunlun
* little change, test=kunlun
* change boost_get to BOOST_GET_CONST
* change boost_get to BOOST_GET_CONST, test=kunlun
* little change, test=kunlun
* use Generator to generate random number and optimize format, test=kunlun
* little change, test=kunlun
* add TODO, test=kunlun
5 years ago
pangyoki
5b8e500135
Add gaussian_random XPU kernels ( #27853 )
...
* Add gaussian_random XPU kernels
* commit kunlun, test=kunlun
* new version, test=kunlun
* change boost_get to BOOST_GET_CONST, test=kunlun
* use Generator to generate random number and optimize format, test=kunlun
* add TODO, test=kunlun
5 years ago
pangyoki
74ce039743
Add uniform_random XPU kernel ( #27846 )
...
* support uniform_random op on Baidu Kunlun
* change dtype of attr shape from int to int64_t
* kunlun ci, test=kunlun
* new version, test=kunlun
* change boost_get to BOOST_GET_CONST
* change boost_get to BOOST_GET_CONST, test=kunlun
* use Generator to generate random number and optimize format
* run Kunlun CI, test=kunlun
* add TODO, test=kunlun
5 years ago
xiaoting
abf4d52a74
Polish kunlun error ( #27974 )
...
* polish error message,test=kunlun
* polish error,test=kunlun
* polish error,test=kunlun
* polish error,test=kunlun
5 years ago
liuyuhui
3e9568653b
add cast/concat/assign xpu op ( #27911 )
...
* addd
* add cast_op_xpu, test=kunlun
* fix bug for cast_op_xpu,test=kunlun
* add concat_op_xpu, test=kunlun
* solve conflicts, test=kunlun
* fix bug,test=kunlun
* add assign_op_xpu, test=kunlun
* fix bug,test=kunlun
* test=kunlun;test=develop
* fix concat bug,test=kunlun
* fix check_dygraph set in test_concat_op_xpu.py,test=kunlun
* fix error message,test=kunlun
Co-authored-by: mapingshuo <mps2012@yeah.net>
5 years ago
Guo Sheng
fa9d3fa5bf
Incorporate cudnn_lstm into LSTM api ( #27217 )
...
* Incorporate cudnn_lstm into LSTM api.
test=develop
* Make coalesce_tensor support alignment optionally.
test=develop
* Reorganize RNN apis. test=develop
* Fix cudnn rnn layout conversion.
test=develop
* Add sequence_length support for RNN cudnn implement.
Add optional init_h and init_c gradient for cudnn_lstm_op.
test=develop
* Use create_parameter for rnn cudnn impl.
test=develop
* Move `self._flat_weight = self.create_parameter()` in RNNBase to main_program.
test=develop
* Update RNN api unittest to use set_device.
test=develop
* Fix set_place for unit tests of RNN apis.
test=develop
* Fix use_align in coalesce_tensor_op.
test=develop
* Adjust RNN apis arguments according to comments.
test=develop
* Polish documents for SimpleRNN apis.
test=develop
* Refine random seed in cudnn_lstm_op.
Expose rnn params from sublayers to RNN.
test=develop
* Fix RNN saving for jit.save.
Refine cudnn_lstm dropout behavior.
test=develop
* Fix doc of GRU. test=develop
* Use ShareDataWith to avoid copying for cudnn_lstm_op test.
test=develop
* Remove updates on cudnn_lstm temporarily.
test=develop
* Use ShareDataWith to avoid copying for cudnn_lstm_op test.
test=develop
* Refine random seed in cudnn_lstm_op.
test=develop
* Fix test_lstm by adjust ConcreteProgram buffer getter.
test=develop
* Use create_parameter instead of create_var for rnn._flat_weight for static graph usage.
test=develop
* Remove W input for cudnn_lstm to pass unused_var_check.
test=develop
* Add test_predict for RNN unit tests coverage.
test=develop
* Fix code style of rnn.
test=develop
* Fix F.rnn usage in rnn.py.
test=develop
5 years ago
chentianyu03
05fd49e974
change paddle.fluid.layers.reduce_sum to paddle.sum in sample codes ( #27998 )
...
* change paddle.fluid.layers.reduce_sum to paddle.sum in sample codes
* format codes
5 years ago
Guanghua Yu
f94d053705
error message optimization in mean_xpu,softmax_with_cross_entropy_op_xpu,test=kunlun ( #27967 )
5 years ago
Jack Zhou
d330cf66cc
Fix xpu enforce ( #27978 )
...
* test=kunlun;
Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast):
* elementwise_div op
* elementwise_max op
* elementwise_mul op (with grad op)
* elementwise_sub op (with grad op)
* 0.05->0.01
* add xpu error message description;test=kunlun
5 years ago
lidanqing
7cb4a8b8f2
[oneDNN] Conv dilation support ( #27914 )
...
* conv dilated mkldnn support: forward and backward pass
* add mkldnn conv_transpose dilation UT
test=develop
* remove unnecessary PADDLE_ENFORCE
* add int8 and bf16 dilated conv UT
* update according to reviews
5 years ago
mapingshuo
64c2634995
fix kunlun kernel of reshape op ( #27988 )
5 years ago
tangwei12
202bfab1be
Feature/large scale kv save base/delta ( #27470 )
...
* add size method for large scale
* add large scale UT
* add ut for checkpoint
5 years ago
123malin
aa3b4ed717
【paddle.fleet】geo send sparse optimize ( #27719 )
...
* test=develop, fix geo sgd communicator
* test=develop, gloo_init_method
* test=develop, bug fix for gloo http_init
5 years ago
Zhou Wei
2ac6c6c3af
fix bug of tensor copy of CUDAPinnedPlace ( #27966 )
5 years ago
joanna.wozna.intel
840c521b77
Fix problem with flags fp32 and int8 ( #27954 )
5 years ago
mapingshuo
5ccaaab8aa
reshape support bool, test=develop ( #27944 )
5 years ago
Qinghe JING
4a4f773658
Add reduce sum and reduce mean xpu op ( #27939 )
...
* add reduce xpu op test=develop;test=kunlun
* add reduce xpu op test=develop;test=kunlun
* add reduce xpu op test=develop;test=kunlun
* add reduce xpu op test=develop;test=kunlun
* add reduce xpu op test=develop;test=kunlun
5 years ago
Zhou Wei
bf412f4665
add tensor clone ( #27953 )
...
* add tensor clone
* fix unittest test_var_base
5 years ago
Feiyu Chan
2e845182d9
support channel last in BatchNorm*d
...
1. support channel last in BatchNorm*d (#27875 )
2. fix a bug in batch_norm_op cuda kernel by extracting ResizeToChannelFist(Last), TransToChannelFirst(Last) to operators/layer_utils.h
5 years ago
guofei
6bbb6e7f45
Implement the function of OutScaleForTraining/OutScaleForInference in dygraph ( #26601 )
...
* Implement the function of OutScaleForTraining/OutScaleForInference in dygraph
test=develop
5 years ago
chentianyu03
d05058d268
Remove and reorganize the alias of APIs ( #27717 )
...
* modify cond while_loop to paddle.static.nn.cond
* modify crop_tensor to paddle.crop
* modify Variable to paddle.static.Variable
* remove nn.beam_search, nn.beam_search_decode, nn.gather_tree
* remove bpr_loss, center_loss, rank_loss, smooth_l1, teacher_student_sigmoid_loss, edit_distance, sampled_softmax_with_cross_entropy in nn.functional
* remove apis in nn.functional.learn_rate.py
* remove pool2d, pool3d, adaptive_pool2d, adaptive_pool3d in nn.functional
* remove apis in nn.functional.vision
* remove erf, soft_relu in nn.functional.activation
* remove apis in nn.functional.extension
* remove nn.functional.rnn
* remove hash from nn.functional.lod
* remove row_conv from nn.functional.extension
* remove one_hot, pad2d, pad_constant_like from nn.functional.common
* remove nn.gather_tree, nn.BilinearTensorProduct, nn.Pool2D, nn.Pad2D
* remove apis from optimizer.__init
* remove tensor.creation.fill_constant
* remove elementwise_mul in nn.functional.common and modify to paddle.multiply
* remove tensor.stat.reduce_mean
* remove reduce_all, reduce_any in tensor.logic
* remove apis in tensor.math
* remove apis in tensor.__init__
* remove has_inf, has_nan in tensor.search
* remove apis in framework.__init__
* remove apis in paddle.__init__
* remove apis in nn.functional.__init__
* modify removed alias apis to raw api in doc and unittests
* fix remove grid_sample bug
* modify removed alias apis to raw api in doc and unittests
* modify removed alias apis to raw api in doc and unittests
* modify removed alias apis to raw api in doc and unittests
* modify removed alias apis to raw api in doc and unittests
* modify removed alias apis to raw api in doc and unittests
* modify removed alias apis to raw api in doc and unittests
* delete alias api relations in doc
* reserve paddle.compat, paddle.sysconfig
* remove unittest for paddle.reduce_all, paddle.reduce_any
* modify removed alias apis to raw api in doc and unittests
* recover paddle.save and paddle.load
* resolve conflicts
* fix sample code missing paddle.enable_static() bug
* fix sample code missing paddle.enable_static() bug
* fix to_string sample code error
5 years ago
Leo Chen
9a2a4b5f65
Support setting xpu place in dygraph mode ( #27909 )
...
* support setting xpu place
* add ut, test=kunlun
5 years ago
Thunderbrook
3ee6ad6ec5
solve bug in pull_dense_worker ( #27918 )
...
* op error info
* style
* code format
* create pin var bug
5 years ago
MRXLT
263a9e97fd
Fix adam ( #27778 )
...
* fix adam
* fix gpu adam
* fix code style
* fix ut
* update ut add cuda code
5 years ago
Double_V
b0edda4d99
kunlun add op ( #27890 )
...
* add stack pool2d roi_align xpu op,test=kunlun
* error message opt, test=kunlun
* add xpu unittest,test=kunlun
* skip check grad,test=kunlun
* fix boostget , test=kunlun
5 years ago
Jack Zhou
c791df09cf
Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast
...
Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast
5 years ago
wangchaochaohu
c5fcc96d5b
xpu support for fill_constant Op ( #27675 )
5 years ago
Chengmo
328cb289ed
【paddle.fleet】fix sparse load ( #27680 )
...
* add sparse tensor load method
5 years ago
tangwei12
cf70d5b350
fix paddle error informations ( #27889 )
5 years ago
wawltor
95aa53425d
update the code for the topk message optimize
...
update the code for the topk message optimize
5 years ago
Chen Weihang
4ba977c720
Polish some error message in operators ( #27876 )
...
* polish some error message
* add white list
* revert shell script change
5 years ago
123malin
a4f850748a
【paddle.fleet】bug fix for parameter_recv ( #27838 )
...
* test=develop, bug fix for parameter_recv
* test=develop, for unittest, test_fleet_rolemaker_new
5 years ago
QingshuChen
2712d07644
support kunlun matmul_v2 ( #27910 )
...
*test=kunlun
5 years ago
zhang wenhui
5a83496c8d
Multi task ( #26002 )
...
* add multitask
* add multitask, test=develop
* fix code style, test=develop
* add partial push dense, test=develop
* fix has_key in py3, test=develop
* fix, test=develop
* fix, test=develop
* fix, test=develop
5 years ago
zhang wenhui
7a58431c0a
fix norm api doc, test=develop ( #27652 )
...
* fix norm api doc, test=develop
* fix error message, test=develop
* fix api norm, test=develop
* add adagrad, test=develop
* fix bug, test=develop
* fix bug, test=develop
* add spectral_norm, test=develop
* fix adagrad, test=develop
* merge , test=develop
5 years ago
yinhaofeng
3eb106da6d
Lookup table v2 xpu ( #27888 )
...
* add lookup_table_v2_op_xpu, test=kunlun
* add lookup_table_v2_op_xpu, test=kunlun
* change some Tips ,test=kunlun
5 years ago
Zhang Ting
d5cc144c60
tune backward filter algorithm for float16 ( #27529 )
...
* use exhaustive_search for float16
* tune algo only when dtype is float16
5 years ago
wanghuancoder
41aad9bfcd
revert 4 files, from clear include by iwyu, test=develop ( #27895 )
5 years ago
hutuxian
3f2a6ab65d
fix error msg ( #27887 )
5 years ago
xiaoting
ae01801f0a
Add dropout and log_loss for kunlun ( #27790 )
...
* add dropout,log_loss, test=kunlun
* fix dropout, test=kunlun
* polish error message, test=kunlun
* change boost::get to BOOST_GET_CONST, test=kunlun
* fix copyright, test=kunlun
5 years ago
Guanghua Yu
70c8c31371
support mean,softmax_with_cross_entropy on Baidu Kunlun ( #27792 )
...
* support mean,softmax_with_cross_entropy on Baidu Kunlun,test=kunlun
* fix unittests error,test=kunlun
* delete boost::get,test=kunlun
5 years ago
Chengmo
1607e87cb9
add xpu sgd & momentum ( #27728 )
...
* add xpu sgd & momentum
5 years ago
Leo Chen
049696bf67
Refine the format of printing tensor ( #27673 )
...
* add summary feature
* refine printing tensor
* add sci_mode
* add sample code
* fix indent error
* fix _format_item
* polish code
* support item indent
* add ut
* set place for ut
* fix py2 issue
* fix ut
5 years ago
hong19860320
c90d35564b
Add batch_norm and layer_norm XPU kernels ( #27818 )
5 years ago
joanna.wozna.intel
ddcd1b5381
Add bfloat16 resnet50 test ( #27755 )
5 years ago
xiaoting
6da7a7458b
add conv for xpu, test=kunlun ( #27809 )
...
* add conv for xpu, test=kunlun
* polish error_message, test=kunlun
* polish error_message, test=kunlun
* fix copyright, test=kunlun
5 years ago
Thunderbrook
04be37c57f
add xpu slice op ( #27349 )
...
* add xpu slice op
test=xpu
* add slice xpu op
test=xpu
* code style
test=kunlun
* style
test=kunlun
* format
test=kunlun
5 years ago
Thunderbrook
8c25dfaacc
op error info ( #27856 )
...
* op error info
* style
* code format
5 years ago
Wilber
345574a6ed
Demo CMakeLists add openmp flag. ( #27848 )
5 years ago
ShenLiang
6d63cd2b93
add gather_op xpu, test=kunlun ( #27822 )
...
* add gather_op xpu, test=develop, test=kunlun
* fix ut, test=develop, test=kunlun
* fix the ut,test=develop, test=kunlun
5 years ago
Feiyu Chan
1d95a0fbc3
fix error message for nce_op ( #27863 )
5 years ago
gongweibao
4237fefeb4
Add shellcheck tools and modify copyright hook ( #27722 )
5 years ago
Chengmo
c5f2802d56
【paddle.fleet】Update fleetrun & ps-heter ( #27472 )
...
* refine fleetrun.ps_launch
* update fleet run for multi device support
* ps_graph support ps-gpu
* fix heter save
* add heter save unittest
* fix unittest & simple code
* update fleetrun
* fix fleetrun
* fix launch barrier
* fix role maker
* add paddlecloud rolemaker unittest
* rename heter_worker_device_guard
5 years ago
Shang Zhizhou
bbc837ee72
add info log for trt input dynamic shape check ( #27796 )
...
* add info log for trt input dynamic shape check
* fix error msg error
5 years ago
guofei
2e1bca99ca
Refine the gradient calculation errors caused by renaming in while_grad ( #27814 )
...
test=develop
5 years ago
wanghuancoder
8fa4c09889
add load_op_xpu for Baidu Kunlun ( #27817 )
...
* add load_op_xpu for Baidu Kunlun, test=kunlun
* add is_compiled_with_xpu for unit test, test=kunlun
* add is_compiled_with_xpu for unit test, test=kunlun
5 years ago
Wilber
9005c5a260
Lite subgraph support arm cpu. ( #27827 )
5 years ago
Jacek Czaja
55e63763ec
[oneDNN] adaptive pool support ( #27747 )
5 years ago
chen zhiyu
6335e6a0a6
add musl option ( #27798 )
5 years ago
yongqiangma
e8a5aefbbd
update CUDAPlace doc. test=document_fix ( #27711 )
5 years ago
Zhang Ting
16999ae49d
use IndexList to improve performance of instance_norm op ( #25132 )
...
* use IndexList to improve performance, test=develop
* remove EIGEN_HAS_INDEX_LIST, test=develop
* use IndexList only when EIGEN_HAS_INDEX_LIST is true
5 years ago
GaoWei8
36bb056ed6
Add flattern weight of lstm ( #27192 )
...
* add flattern weight of lstm
5 years ago
Guanghua Yu
7779790c61
error message optimization in softmax_with_cross_entropy_op ( #27772 )
...
* error message optimization in softmax_with_cross_entropy_op
* fix some unsuited comment
5 years ago
zhupengyang
659d04df2c
hsigmoid -> hsigmoid_loss/HSigmoidLoss; refine docs ( #27745 )
5 years ago
TeslaZhao
070ac9590c
Add double grad in Squeeze and Unsqueeze ( #27810 )
...
* Add double grad in Squeeze and Unsqueeze
* Add double grad in Squeeze and Unsqueeze
5 years ago
Jack Zhou
d4359b0f39
add the kunlun kernel for the paddle 2.0
...
Add xpu kernel for KUNLUN core:
* accuracy op
* sign op
* scale op
* sum op
Add default atol in xpu unittest.
5 years ago
mapingshuo
840d54de9b
add XPU support for shape op and reshape op ( #27804 )
5 years ago
cc
8fabb1c32f
Add test attribute in channelwise_quant op, test=develop ( #27742 )
...
* Add test attribute in channelwise_quant op, test=develop
5 years ago
wangxinxin08
ad99e638fd
add double grad op for matmul ( #27776 )
...
* add matmul doublegrad op
* fix compile errors
* modify code according to review
* delete float16
5 years ago
zhupengyang
0025e0d87b
refine APIs: brelu, hardsigmoid, hardswish, maxout ( #27658 )
5 years ago
zhupengyang
5098891fdf
add softmax xpu kernel ( #27700 )
5 years ago
Double_V
f6ad2375be
fix pool3d bug, test=develop ( #27718 )
...
* fix pool3d bug, test=develop
* fix unitest, test=develop
* fix test and fix pool2d bug, test=develop
5 years ago
石晓伟
0d27591642
save operator version information to program desc, test=develop ( #27668 )
5 years ago
Qi Li
b8d2a021f0
fix ut error of test_recognize_digits, test=develop ( #27791 )
5 years ago
Jacek Czaja
631c1f3018
- Fix to 27398 ( #27770 )
...
test=develop
- compilation fix
test=develop
5 years ago
Feiyu Chan
0a7bab4e34
fix error message for negative_positive_pair_op and nce_op ( #27779 )
5 years ago
zhupengyang
395cb561aa
refine logsumexp error message and docs ( #27713 )
5 years ago
smallv0221
057e28bc8f
API(lstm_unit, lstmp, sequence_mask, sequence_enumerate, sequence_conv) error message enhancement ( #27572 )
...
* API(Compute) error message enhancement on line 44, 50, 53.
* lstm_unit error message enhancement.
lstmp error message enhancement.
sequence_conv error message enhancement.
sequence_enumerate error message enhancement.
sequence_mask error message enhancement.
* Update lstm_unit_op.cc
* Update lstm_unit_op.h
* error msg enhancement.
* Update sequence_conv_op.cc
* Update lstm_unit_op.cc
* Update sequence_conv_op.cc
* Update sequence_enumerate_op.cc
* Update sequence_enumerate_op.cu
* Update sequence_enumerate_op.h
* Update sequence_pool_op.h
* error message enhancement.
* error message enhancement.
5 years ago
Jacek Czaja
606611d351
[oneDNN] GRU BF16 kernel ( #27731 )
5 years ago
xiemoyuan
6c1acf34ed
Optimize the error message for OP ( #27617 )
...
* Optimize the error message for OPs.
* Optimize the error message for OPs in details.
5 years ago
cc
ec7d11a492
refine fused_elemwise_activation error message ( #27734 )
5 years ago
Zhen Wang
365c2c9c89
fix error message showing in UpdateLossScalingOp ( #27596 )
5 years ago
LielinJiang
9089841b6e
Fix bilateral inference shape bug ( #26822 )
...
* fix bilateral bug
5 years ago
Yiqun Liu
65207b4560
Polish the error message of fc, fused_fc_elementwise_layernorm and fused_embedding_seq_pool. ( #27692 )
...
* Polish the error message of fc_op.
* Polish the error message of fused_fc_elementwise_layer_norm op.
* Polish an error message in fused_embedding_seq_pool_op.
5 years ago
Wojciech Uss
f399bed8d9
Add an option to set number of warmup iterations ( #27739 )
5 years ago
Jacek Czaja
b9fda2ff09
Fix to issue #25537 ( #27546 )
...
* - candidate fix to issue #25537
test=develop
* - UT for transpose NHWC
test=develop
5 years ago
Wojciech Uss
966447e338
Added support for quantization of fusion_gru ( #27518 )
5 years ago
joanna.wozna.intel
0cd4907eba
Add avx512 core instructions check ( #27732 )
...
* Add avx instructions check
* Small fix
* Change function name
* Change uint to unsigned int
5 years ago
hong19860320
7a96d5788d
Optimize the error messages of the CUDA implementation of activation ops ( #27741 )
...
test=develop
5 years ago
tangwei12
fd616fadc2
reopen heartbeat ut ( #27684 )
5 years ago
Qi Li
f373269df0
update histogram op for performance optimization, test=develop ( #24912 )
5 years ago
MRXLT
20fb01fb00
fix distributed error info ( #27206 )
...
* fix distributed error info
* bug fix; notest
* error info refine
* update error info
* update error info
* update error info
* bug fix
* bug fix
* bug fix
* bug fix
5 years ago
pangyoki
7cd2c13f1b
add multinomial op ( #27219 )
...
* add multinomial cpu kernel
* fix C++ notype error
* fix windows ci array len error
* let array len be const
* change array to vector
* add cuda kernel with num_distribution is 1, and not support replacement=False
* add multinomial python api
* support num_distribution different multinomial distributions
* add multinomial python api unittest
* change output dtype to int64
* fix coverage prob
* optimize format
* fix dtype of output error, should be int64_t
5 years ago
Zhang Ting
d2369dd91f
modify docs of CPUPlace and CUDAPinnedPlace, test=document_fix ( #27587 )
5 years ago
Wojciech Uss
42d175385d
Add support for (de/re)quantization with shift ( #27481 )
5 years ago
123malin
cc780b1977
test=develop, optimize geo communicator ( #26857 )
...
* test=develop, optimize geo communicator
5 years ago
Pei Yang
8a4f85feb9
Add unittests and OP version registry for quant_conv2d_dequant_fuse_pass ( #27689 )
5 years ago
yukavio
7b46fb0f14
fix generate_proposals and affine grid error info ( #27636 )
5 years ago
Chen Weihang
b14ecb8632
Polish api BuildStrategy/ExecutionStrategy doc & code example ( #27662 )
...
* polish BuildStrategy api doc & example
* polish ExecutionStrategy api doc & example
* polish details
5 years ago
AshburnLee
c3a3df6466
Add cuda support for unique op ( #27646 )
...
* unique op for cuda is added
* add support for cuda
* Add cuda support for unique op.
* Add support for int32_t and int64_t.
* For old version, process by cpu
* Add VisitDataType for thrust
5 years ago
lilong12
bbc2add703
Initialize gloo for low level collective apis ( #27672 )
...
* add gloo initializer, test=develop
5 years ago
wawltor
29f4922906
optimize the error message for detection_map_op
...
optimize the error message for detection_map_op
5 years ago
whs
daf5aa9b8b
Fix round in grid sample op ( #27657 )
5 years ago
arlesniak
0ecf441af1
Add support for mkldnn ops types selection with FLAGS in dygraph ( #27482 )
...
* Add support for mkldnn ops types selection with FLAGS in dygraph
* use regex to match DNNL verbose
* python3 encoding fix
5 years ago
Wilber
2bc70ab2e2
Fix lite_resnet50 unit test. ( #27611 )
5 years ago
ysh329
2f9cdd9038
API/OP clip_by_norm_op error message enhancement. test=develop ( #27614 )
...
* Fix clip_by_norm_op error message. test=develop
* test=develop
* test=develop
5 years ago
yongqiangma
aac57159c9
enhance array_to_lod_tensor_op lod_tensor_to_array_op errors information ( #27386 )
...
* enhance array_to_lod_tensor_op lod_tensor_to_array_op errors information. test=develop
5 years ago
lilong12
36c0410223
Revert "Initialize gloo for low level collective apis ( #27356 )", test=document_fix ( #27665 )
5 years ago
xiemoyuan
99e3337368
Optimize the error message of OP. ( #27478 )
...
* iCafe 9009: Optimize the error message of OP.
* Optimize the error message of GatherTreeOP.
5 years ago
ShenLiang
e8f873df88
optimize the speed&memory of matmul op ( #27610 )
...
* fix the speed&memory of matmul
* fix the comment
* fix the memory copy
* fix the windows ci
5 years ago
Pei Yang
ae6e40a7fd
Add unittests and OP version registry for tensorrt_subgraph_pass ( #27544 )
...
* add unittests and op version register for tensorrt_subgraph_pass
* rename to test_trt_subgraph_pass.py
* fix softmax converter diff when padding dim=1
5 years ago
tangwei12
9704582eef
fix op error ( #27599 )
...
* fix error
* fix error
* fix error
* merge develop
5 years ago
wanghuancoder
c68a0313a5
add paddle.fluid._cuda_synchronize ( #27595 )
...
* add paddle.fluid._cuda_synchronize, test=develop
* fix bug about core_avx core_noavx, test=develop
* delete CPUPlace and XPUPlace, test=develop
5 years ago
yaoxuefeng
c9a8801325
enhance error messages of lookup_table, merge_ids, data_norm ( #27619 )
...
* enhance error messages of lookup_table, merge_ids, data_norm
* fix
* fix error msg in .cu
5 years ago
whs
9cc5603d56
Make grid support stopping gradients. ( #27630 )
5 years ago
liym27
074a71bd25
Support assignment to a Variable in dynamic mode but not deal with backward. ( #27471 )
...
* Support assignment to a Variable in dynamic mode. Note: not deal with backward.
* Rewrite VarBase __setitem__ for high-performance.
* try to test 3 means to do __setitem__ and test the performance of 3 means.
* Retain the means of the highest performance: C++ code and don't trace op.
5 years ago
lilong12
5218b7af6b
add ncclSend and ncclRecv ( #27621 )
...
* include ncclRecv and ncclSend, test=develop
5 years ago
lilong12
fa73e4a284
Initialize gloo for low level collective apis ( #27356 )
...
* add gloo initializer, test=develop
5 years ago
furnace
d01f626944
update mv op according PR#27024 ( #27474 )
5 years ago
Double_V
9d783aeddd
Error message opt, test=develop ( #27467 )
...
* Error message opt, test=develop
* solve comments, test=develop
* fix typo, test=develop
5 years ago
Li Fuchen
1501a80f74
add support to float64 input of warpctc op. ( #27399 )
...
* add float64 input to ctc_loss
* modified error message of warpctc
* update repo and tag of warpctc
* add test for warpctc with float64 input
* modified warpctc.cmake to make sure build always
* resolved sample code bug of warpctc
* add core.ops in warpctc dygraph
* fix a bug of test
5 years ago
QingshuChen
6b727e08b1
support elementwise add, activation, matmul on Baidu Kunlun ( #27143 )
...
* support elementwise add, activation, matmul on Baidu Kunlun
* test=kunlun
* minor
* test=kunlun
* reconstruct the xpu directory
* test=kunlun
* minor
* test=kunlun
* minor
* test=kunlun
* minor
* test=kunlun
* minor
* test=kunlun
* minor
* test=kunlun
5 years ago
Jack Zhou
d37b3774fd
register log double grad kernel for cpu and cuda
...
register log double grad kernel for cpu and cuda
5 years ago
Chengmo
d014e29fc6
fix error message ( #27318 )
...
* fix sgd/momentum/dpsgd/rmsprop error message
5 years ago
Leo Chen
35074963e3
Refine error msg in paddle/fluid/framework/details [part 2] ( #27429 )
...
* refine broadcast_op_handle
* refine some error messages
* refine some files
* fix bug
* fix bug
* fix bug
* follow comments
* follow comments
5 years ago
Chengmo
0e101c4f6f
Fix test dist fleet heter ctr ( #27513 )
...
* fix test_dist_fleet_heter_ctr & performance update
5 years ago
Zhong Hui
a85592bcbf
fix cpplint error for the atomic max/min
...
fix cpplint error for the atomic max/min
5 years ago
joanna.wozna.intel
b0ee1405f7
Add conv2d bfloat16 support ( #27325 )
5 years ago
Leo Chen
a5b3263782
Refine error msg in paddle/fluid/imperative ( #27521 )
...
* refine err msg
* follow comments
5 years ago
Thunderbrook
6f69a4cb05
add xpu in heter mode ( #27000 )
...
* add xpu in heter mode
test=develop
* BOOST_CONST_GET; PADDLE_THROW
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* refine
test=develop
* refine
test=develop
* refine
test=develop
* refine code
test=develop
5 years ago
ceci3
8daccc9ea7
Fix batch norm double grad compute ( #27549 )
...
* fix bn double grad, test=develop
* update, test=develop
5 years ago
ShenLiang
6fc74bbaf6
add fp16 for matmul ( #27523 )
...
* add fp16 for matmul
5 years ago
Zhong Hui
fab4e6d08f
add abs support double grad
...
add abs support double grad for the api 2.0
5 years ago
GaoWei8
36ed83d270
Refine PADDLE_ENFORCE ( #27360 )
...
* refine PADDLE_ENFORCE
5 years ago
liym27
effd51b6be
Fix error message in operator/utils.h ( #27532 )
5 years ago
Leo Chen
6bb02e8e3c
increase retry time ( #27553 )
5 years ago
Shang Zhizhou
77a36f8997
[bug fix]:fix some unittests error ( #27540 )
...
* [bug fix]:fix unittest test_activation_op error
* split long-time unittests to smaller ones
* rename some unittests
5 years ago
Zhong Hui
597345d17b
fix cuda atomic for ARCH<350 for the atomic_max
...
fix cuda atomic for ARCH<350 for the atomic_max
5 years ago
WangXi
e550fc02ae
fleet2.0 add fp16 grad compression ( #27480 )
5 years ago
cc
c5c13473c6
Add compatibility check for four mkldnn pass ( #27364 )
...
* Add pass compatibility check for four mkldnn pass, test=develop
5 years ago
mapingshuo
c83ade6d6b
add AsDuplicable for sync_comm op( #27515 )
5 years ago
Wilber
3d5522146e
register seq_concat_fc_fuse pass. ( #27479 )
5 years ago
Wilber
df7fabeedc
Fix memory leak for mkldnn. ( #27493 )
5 years ago
ruri
b7319ef518
fix err msg in pixel shuffle op ( #27503 )
5 years ago
Kaipeng Deng
d7f422c984
fix error message in conv/conv_transpose. test=develop ( #27464 )
...
* fix error message in conv/conv_transpose. test=develop
5 years ago
Wilber
ec4155d7d0
windows lib size crop from 5.4G to 3.9G ( #27477 )
5 years ago
ruri
e1fb77d123
[2.0RC]refine error message in shuffle channel OP ( #27505 )
...
* refine err msg in shuffle channel op
5 years ago
Aurelius84
f91c37e665
Refine error message of MatchMatrix and PyramidHash ( #27484 )
5 years ago
Shibo Tao
8f7bb52bd2
fix tensorrt 6 build error. test=develop ( #27511 )
...
* fix tensorrt 6 build error. test=develop
* fix. test=develop
* bug fix
* test=develop
5 years ago
wanghuancoder
df43905f12
use iwyu clean include ( #27267 )
...
* use iwyu clean include, test=develop, test=win
* compilation error, test=develop
* fix compilation error2, test=develop
* fix compilation error3, test=develop
* fix compilation error4, test=develop
* fix compilation error5, test=develop
* fix compilation error6, test=develop
* fix compilation error7, test=develop
* fix compilation error8, test=develop
* fix compilation error8, test=develop
* fix compilation error10, test=develop
* fix compilation error11, test=develop
5 years ago
wangchaochaohu
dc713116e0
refine the error message for batch size like OP ( #27446 )
...
* refine the error message for batch size like
5 years ago
Zhong Hui
4a9d21de49
Add GPU Kernels of Segment Ops, support, sum, max, min, mean
...
Add GPU Kernels of Segment Ops, support, sum, max, min, mean
5 years ago
YUNSHEN XIE
66951ab2ea
modified timeout value for 4 ut ( #27462 )
5 years ago
Shang Zhizhou
c17f9cf25f
[bug fix]:Memory increases after adapting the cudnn version to cudnn8 ( #27436 )
...
* [bug fix]:Memory increases after adapting the cudnn version to 8
* [bug fix]cudnnGetConvolutionForwardAlgorithm not defined
5 years ago
Zhou Wei
1e1ae5c54d
Make the Bind Method of Tensor more automatic ( #27270 )
...
* Makes the Bind Method more intelligent
* Makes the Bind Method more intelligent
* fix unittest
* fix unittest
* fix conflict
5 years ago
LutaoChu
5508c78744
Fix bug: The calculation result of Diag_v2 Op under large size input is wrong ( #27447 )
...
The calculation result of Diag_v2 Op under large size input is wrong
5 years ago
tangwei12
bc5f0246a8
large scale kv speedup ( #26510 )
...
* rename communicator meet->BatchesCounter
* fix parame recv for sparse
* geo sparse init from pserver
* optimize init from pserver
* add large scale optimizer fuse(SGD/ADAM)
* rectification init_worker and exe.run startup program
5 years ago
Qi Li
d7b7dcd10e
fix cmake dependencies of test_recognize_digits, test=develop ( #27475 )
5 years ago
Zhou Wei
292b24aa6d
fix bug MD of compile, And add MD/STATIC/OPENBLAS inference lib check on windows ( #27051 )
5 years ago
Chen Weihang
41b5955538
Polish no owner ops error message ( #27448 )
...
* polish no owner op error message
* fix unittest failed
* polish details based reviewer comment
5 years ago
Zhang Ting
906e7f921e
add fuse_bn_act op ( #27230 )
...
* add fused_bn_add_relu op
5 years ago
Wilber
5034d181f3
update for 2.0 inference api. ( #27473 )
5 years ago
Chen Weihang
765064476b
Polish some lost invalid error message ( #27445 )
...
* polish some lost error msg
* add some math file to white list
* polish detail based reviewer comment
5 years ago
wangchaochaohu
76fb95fe76
avoid data transform for linspace OP ( #27444 )
5 years ago
123malin
a04524759e
Enhance Op's Error Message ( #27455 )
...
* test=develop, update error message
5 years ago
wangchaochaohu
0a862fd356
refine the precision of linspace Op using half way ( #27452 )
5 years ago
Pei Yang
fda54c0212
errmsg refine of trt plugin ( #27309 )
5 years ago
石晓伟
dd4c2d86a5
enhance error messages, test=develop ( #27423 )
5 years ago
Zhong Hui
f4c750d721
Add the cpu version of segment sum mean max min op
...
Add the cpu version of segment sum mean max min op
5 years ago
Wilber
afe94903c3
Rename fluid_inference to paddle_inference. ( #27422 )
5 years ago
Pei Yang
8182337096
clear pass logs ( #27434 )
5 years ago
furnace
13a4c74efd
add mv op(c++, python, unit test) ( #27024 )
5 years ago
LutaoChu
f11a53ee76
Optimize argsort Op performance on GPU
...
* argsort op acceleration on GPU when the input size is equal to the length of the ‘axis’ dimension
5 years ago
ceci3
1d3b27cae8
add double grad compute for batch norm ( #27296 )
...
* add double grad compute for batch norm,test=develop
* fix unittest, test=develop
* remove unused tensor,test=develop
* add format,test=develop
* update, test=develop
5 years ago
Shang Zhizhou
d93661942e
fix bug sequececonv_eltadd_relu_fuse_pass ( #27404 )
...
* fix bug sequececonv_eltadd_relu_fuse_pass, output error when sequence_conv's padding_start > 0
* fix seqconv_eltadd_relu_fuse_pass unittest error
5 years ago
Leo Chen
aba759ba16
[Feature] Enhance inplace addto strategy for gradient accumulation in static graph ( #27112 )
...
* support use add instead of sum to do gradient accumulation
* add inplace addto pass
* add grad_add op and inplace addto pass
* remove debug code
* code refine
* fix bug when several sum ops are inserted at the same op_idx
* fix Flags type
* add addto attribute for conv3d
* fix ut
* code clean
* fix type
5 years ago
LutaoChu
669efb98de
Fix bug: shapes of Topk outputs are wrong when the parameter k is Tensor
...
Fix bug: shapes of Topk outputs are wrong when the parameter k is Tensor
5 years ago
Wilber
39546aa2f3
Add pass compatible and unit test. ( #27377 )
5 years ago
huangxu96
02606d45ef
Quant op dev ( #25932 )
...
* Finished ChannelWiseQuantDequantAbsMaxOp and Passed unittests.
* Finished channel-wise quantize strategy in imperative quantization.
* Added Cuda code of ChannelWiseQuantDequantMaxAbsOP
* Add quant_axis for channel_wise quant.
* fixed a bug in unittests, which did not trigger the axis = 1 case and could not meet the coverage rate requirement.
* Added some assert information and fixed some coding style mistakes.
5 years ago
Leo Chen
bbc84e0fe0
Refine error msg in paddle/fluid/framework/details [part 1] ( #25631 )
...
* refine error msg in var_handle.h, test=develop
* refine all_reduce_op_handle
* fix some error msg
* refine variable_visitor
* refine threaded_ssa_graph_executor
* refine inplace related files
* refine executor related files
* refine fetch_op_handle.cc
* fix bug
* follow comments
5 years ago
MRXLT
f936adbd2d
fix adam ( #27343 )
...
* fix adam
* rmsprop support double
5 years ago
tangwei12
99626502f7
【paddle.fleet】gloo and util ( #27213 )
...
* fix worker endpoints
* fix gloo wrapper for hdfs
* GPU fleetrun support gloo
* parameterserver fleetrun support gloo
* fix get server endpoint
5 years ago
Pei Yang
a5ef246cac
Optimize emb_eltwise_layernorm_plugin and support fp16 ( #27128 )
5 years ago
yaoxuefeng
d726fd5e86
enhance dataset err msg ( #27363 )
5 years ago
Pei Yang
fd7ab4e63c
register pass compatibility ( #27357 )
...
* pass compatibility
* add compatibility registry
* add unittests for different padding
* add assert
* drop errmsg
5 years ago
haozech
7e6dfcf9b2
Add 3 pass version check ( #27283 )
5 years ago