Commit Graph

17783 Commits (e8db4412d00b9fb72f9a0a04d90f15fbf861c1fa)

Author SHA1
Message
Date
Jack Zhou c791df09cf
Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast
4 years ago
wangchaochaohu c5fcc96d5b
xpu support for fill_constant Op (#27675)
4 years ago
tianshuo78520a a820871669
Change PR-CI-Kunlun Test Number (#27923)
4 years ago
Chengmo 328cb289ed
【paddle.fleet】fix sparse load (#27680)
4 years ago
tangwei12 cf70d5b350
fix paddle error informations (#27889)
4 years ago
wawltor 95aa53425d
update the code for the topk message optimize
4 years ago
Chen Weihang 4ba977c720
Polish some error message in opeators (#27876)
4 years ago
123malin a4f850748a
【paddle.fleet】bug fix for parameter_recv (#27838)
4 years ago
QingshuChen 2712d07644
support kunlun matmul_v2 (#27910)
4 years ago
zhang wenhui 5a83496c8d
Multi task (#26002)
4 years ago
zhang wenhui 7a58431c0a
fix norm api doc, test=develop (#27652)
4 years ago
yinhaofeng 3eb106da6d
Lookup table v2 xpu (#27888)
4 years ago
Zhang Ting d5cc144c60
tune backward filter algorithm for float16 (#27529)
4 years ago
wanghuancoder 41aad9bfcd
revert 4 files, from clear include by iwyu, test=develop (#27895)
4 years ago
hutuxian 3f2a6ab65d
fix error msg (#27887)
4 years ago
xiaoting ae01801f0a
Add dropout and log_loss for kunlun (#27790)
4 years ago
Guanghua Yu 70c8c31371
support mean,softmax_with_cross_entropy on Baidu Kunlun (#27792)
4 years ago
Chengmo 1607e87cb9
add xpu sgd & momentum (#27728)
4 years ago
Leo Chen 049696bf67
Refine the format of printing tensor (#27673)
4 years ago
hong19860320 c90d35564b
Add batch_norm and layer_norm XPU kernels (#27818)
4 years ago
joanna.wozna.intel ddcd1b5381
Add bfloat16 resnet50 test (#27755)
4 years ago
xiaoting 6da7a7458b
add conv for xpu, test=kunlun (#27809)
4 years ago
Thunderbrook 04be37c57f
add xpu slice op (#27349)
4 years ago
Thunderbrook 8c25dfaacc
op error info (#27856)
4 years ago
Wilber 345574a6ed
Demo CMakeLists add openmp flag. (#27848)
4 years ago
ShenLiang 6d63cd2b93
add gather_op xpu, test=kunlun (#27822)
4 years ago
Feiyu Chan 1d95a0fbc3
fix error message for nce_op (#27863)
4 years ago
gongweibao 4237fefeb4
Add shellcheck tools and modify copyright hook (#27722)
4 years ago
Chengmo c5f2802d56
【paddle.fleet】Update fleetrun & ps-heter (#27472)
4 years ago
Shang Zhizhou bbc837ee72
add info log for trt input dynamic shape check (#27796)
4 years ago
guofei 2e1bca99ca
Refine the gradient calculation errors caused by renaming in while_grad (#27814)
4 years ago
wanghuancoder 8fa4c09889
add load_op_xpu for Baidu Kunlun (#27817)
4 years ago
Wilber 9005c5a260
Lite subgraph support arm cpu. (#27827)
4 years ago
Jacek Czaja 55e63763ec
[oneDNN] adaptive pool support (#27747)
4 years ago
chen zhiyu 6335e6a0a6
add musl option (#27798)
4 years ago
yongqiangma e8a5aefbbd
update CUDAPlace doc. test=document_fix (#27711)
4 years ago
Zhang Ting 16999ae49d
use IndexList to improve performance of instance_norm op (#25132)
4 years ago
GaoWei8 36bb056ed6
Add flattern weight of lstm (#27192)
4 years ago
Guanghua Yu 7779790c61
error message optimization in softmax_with_cross_entropy_op (#27772)
4 years ago
zhupengyang 659d04df2c
hsigmoid -> hsigmoid_loss/HSigmoidLoss; refine docs (#27745)
4 years ago
TeslaZhao 070ac9590c
Add double grad in Squeeze and Unsqueeze (#27810)
4 years ago
Jack Zhou d4359b0f39
add the kunlun kernel for the paddle 2.0
4 years ago
mapingshuo 840d54de9b
add XPU support for shape op and reshape op (#27804)
4 years ago
cc 8fabb1c32f
Add test attribute in channelwise_quant op, test=develop (#27742)
4 years ago
wangxinxin08 ad99e638fd
add double grad op for matmul (#27776)
4 years ago
zhupengyang 0025e0d87b
refine APIs: brelu, hardsigmoid, hardswish, maxout (#27658)
4 years ago
zhupengyang 5098891fdf
add softmax xpu kernel (#27700)
4 years ago
Double_V f6ad2375be
fix pool3d bug, test=develop (#27718)
4 years ago
石晓伟 0d27591642
save operator version infomation to program desc, test=develop (#27668)
4 years ago
Qi Li b8d2a021f0
fix ut error of test_recognize_digits, test=develop (#27791)
4 years ago
Jacek Czaja 631c1f3018
- Fix to 27398 (#27770)
4 years ago
Feiyu Chan 0a7bab4e34
fix error mesage for negative_positive_pair_op and nce_op (#27779)
4 years ago
zhupengyang 395cb561aa
refine logsumexp error message and docs (#27713)
4 years ago
smallv0221 057e28bc8f
API(lstm_unit, lstmp, sequence_mask, sequence_enumerate, sequence_conv) error message enhancement (#27572)
4 years ago
Jacek Czaja 606611d351
[oneDNN] GRU BF16 kernel (#27731)
4 years ago
xiemoyuan 6c1acf34ed
Optimize the error message for OP (#27617)
4 years ago
cc ec7d11a492
refine fused_elemwise_activation error message (#27734)
4 years ago
Zhen Wang 365c2c9c89
fix error message showing in UpdateLossScalingOp (#27596)
4 years ago
LielinJiang 9089841b6e
Fix bilateral inference shape bug (#26822)
4 years ago
Yiqun Liu 65207b4560
Polish the error message of fc, fused_fc_elementwise_layernorm and fused_embedding_seq_pool. (#27692)
4 years ago
Wojciech Uss f399bed8d9
Add an option to set number of warmup iterations (#27739)
4 years ago
Jacek Czaja b9fda2ff09
Fix to issue #25537 (#27546)
4 years ago
Wojciech Uss 966447e338
Added support for quantization of fusion_gru (#27518)
4 years ago
joanna.wozna.intel 0cd4907eba
Add avx512 core instructions check (#27732)
4 years ago
hong19860320 7a96d5788d
Optimize the error messages of the CUDA implementation of activation ops (#27741)
4 years ago
tangwei12 fd616fadc2
repen heartbeat ut (#27684)
4 years ago
Qi Li f373269df0
update histogram op for performance optimization, test=develop (#24912)
4 years ago
tianshuo78520a 4d5ddbf106
add xpu test (#27622)
4 years ago
MRXLT 20fb01fb00
fix distributed error info (#27206)
4 years ago
pangyoki 7cd2c13f1b
add multinomial op (#27219)
4 years ago
Zhang Ting d2369dd91f
modify docs of CPUPlace and CUDAPinnedPlace, test=document_fix (#27587)
4 years ago
iducn 7c69e36131
add pip new requirements to windows (#27697)
4 years ago
Wojciech Uss 42d175385d
Add support for (de/re)quantization with shift (#27481)
4 years ago
123malin cc780b1977
test=develop, optimize geo communicator (#26857)
4 years ago
Pei Yang 8a4f85feb9
Add unittests and OP version registry for quant_conv2d_dequant_fuse_pass (#27689)
4 years ago
yukavio 7b46fb0f14
fix generate_proposals and affine grid error info (#27636)
4 years ago
Chen Weihang b14ecb8632
Polish api BuildStrategy/ExecutionStrategy doc & code example (#27662)
4 years ago
AshburnLee c3a3df6466
Add cuda support for unique op (#27646)
4 years ago
lilong12 bbc2add703
Initialize gloo for low level collective apis (#27672)
4 years ago
wawltor 29f4922906
optimize the error meesage for detetion_map_op
4 years ago
whs daf5aa9b8b
Fix round in grid sample op (#27657)
4 years ago
arlesniak 0ecf441af1
Add support for mkldnn ops types selection with FLAGS in dygraph (#27482)
4 years ago
Wilber 2bc70ab2e2
Fix lite_resnet50 unit test. (#27611)
4 years ago
ysh329 2f9cdd9038
API/OP clip_by_norm_op error message enhancement. test=develop (#27614)
4 years ago
yongqiangma aac57159c9
enhance array_to_lod_tensor_op lod_tensor_to_array_op errors informaiton (#27386)
4 years ago
lilong12 36c0410223
Revert "Initialize gloo for low level collective apis (#27356)", test=document_fix (#27665)
4 years ago
xiemoyuan 99e3337368
Optimize the error message of OP. (#27478)
4 years ago
ShenLiang e8f873df88
optimize the speed&memory of matmul op (#27610)
4 years ago
Pei Yang ae6e40a7fd
Add unittests and OP version registry for tensorrt_subgraph_pass (#27544)
4 years ago
tangwei12 9704582eef
fix op error (#27599)
4 years ago
wanghuancoder c68a0313a5
add paddle.fluid._cuda_synchronize (#27595)
4 years ago
yaoxuefeng c9a8801325
enhance error messages of lookup_tale, merge_ids, data_norm (#27619)
4 years ago
whs 9cc5603d56
Make grid support stopping graients. (#27630)
4 years ago
liym27 074a71bd25
Support assignment to a Variable in dynamic mode but not deal with backward. (#27471)
4 years ago
lilong12 5218b7af6b
add ncclSend and ncclRecv (#27621)
4 years ago
lilong12 fa73e4a284
Initialize gloo for low level collective apis (#27356)
4 years ago
furnace d01f626944
update mv op according PR#27024 (#27474)
4 years ago
Double_V 9d783aeddd
Error message opt, test=develop (#27467)
4 years ago
Li Fuchen 1501a80f74
add support to float64 input of warpctc op. (#27399)
4 years ago
QingshuChen 6b727e08b1
support elementwise add, activation, matmul on Baidu Kunlun (#27143)
4 years ago