12587 Commits (73a6fa3ed0fe2bbbfe72c05f42faabccd3bbadb7)

Author SHA1 Message Date
Jiabin Yang 0c38708a90
[Custom Op] Remove unsupport dtypes (#31232)
4 years ago
WangXi b8bce682e0
xpu support fuse allreduce (#31104)
4 years ago
Aurelius84 59b00e8c45
[CustomOP]Support Incremental compilation and Add Version management (#31228)
4 years ago
Chen Weihang 126633c50f
[CustomOp] Split build op marco & polish details (#31229)
4 years ago
Aurelius84 e8d24b546a
[CustomOp] Add Modeling with Custom op unittest (#31218)
4 years ago
littletomatodonkey ad50fa710b
add int pad support for Pad1D/2D/3D (#31209)
4 years ago
jakpiase 2f1165342b
OneDNN hardswish integration (#30211)
4 years ago
Aurelius84 912022fa0c
[CustomOp]Add cpp_extension en doc (#31187)
4 years ago
Chen Weihang e8cdb49aa9
[CustomOp] Support attributes as func input in custom op (#31128)
4 years ago
Zhou Wei ffbf71359a
modify custom op dependent from paddle_framework to paddle_custom_op (#31195)
4 years ago
lilong12 dc8dfba35b
align the default value of some configuration for fleet to that of single cards (#30740)
4 years ago
lilong12 a373aa7645
fix the bug in expand_v2 op (#30984)
4 years ago
Thunderbrook c4f279fe8d
support multi node in heterps (#31102)
4 years ago
Aurelius84 406f4a7513
[CustomOp] Support to specific extra_cflags and exctra_cuda_flags independently (#31059)
4 years ago
qingqing01 572cc8bd0f
Update doc for 2.0 API and some callback (#31180)
4 years ago
Pei Yang 00b09e86ac
[Paddle-TRT] support group_norm (#31040)
4 years ago
Chen Weihang c209751c8d
change test_multiprocess_reader_exception cmake (#31174)
4 years ago
YUNSHEN XIE 153121457f
fix ut timeout (#31061)
4 years ago
Chen Weihang 1ce96fa118
[CustomOp] Add new paddle custom op so (#31141)
4 years ago
tangwei12 ebbdf52557
fix entry (#31079)
4 years ago
Aurelius84 dce2db4857
[CustomOp] Split build directory for each setup.py (#31124)
4 years ago
Zhou Wei 4b220550ef
[Custom OP]Fix problem of custom op unitests on Windows CI (#31114)
4 years ago
chentianyu03 70131b475f
add warning message when dtypes of operator are not same (#31136)
4 years ago
Chen Weihang e60fd1f6a8
[CustomOp] Split test and add inference test (#31078)
4 years ago
xiemoyuan edacb6293c
Optimization of Transformer API (#30957)
4 years ago
WeiXin ee1801c1ad
Save load/save pickle protocol (#31044)
4 years ago
yukavio 99fd9815b6
fix flops api (#31081)
4 years ago
Zhou Wei 44ee251fde
fix UNIX cmake problem (#31113)
4 years ago
Thunderbrook 565354f676
support save multi sparse table in one path (#31108)
4 years ago
Huihuang Zheng cf43a321a8
[Dy2stat] Refactoring tensor_shape_transformer.py to Fix Change after Assign Bug (#31082)
4 years ago
tangwei12 0e4b154298
fix dist fleet ctr ut (#31087)
4 years ago
Zhou Wei adaec0073d
[2.0Custom OP]Support New Custom OP on Windows (#31063)
4 years ago
Chen Weihang 2168f08ac8
add optional for param attr args, test=document_fix (#31105)
4 years ago
Chen Weihang 6beeafe797
[CustomOp] Add more dispatch marco for users (#31058)
4 years ago
TTerror d5323dab41
add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel executor;optimize lookup_table op (#31056)
4 years ago
123malin 16b4260b2f
test=develop, save/load, shrink (#30625)
4 years ago
Shibo Tao 4424aac608
export paddle.static.normalize_program method. (#31072)
4 years ago
liym27 5b367dab44
[static setitem] Support the index is Tensor; step>1; step<0 .(#30949)
4 years ago
Jack Zhou 6df1ca54c8
add detail about states index in rnn result, test=document_fix (#31048)
4 years ago
Huihuang Zheng ef627ac5b9
Fix that convert_var_shape doesn't support slice like [0:], test=develop (#31051)
4 years ago
Jacek Czaja f7465641c3
Added reshape grad bf16 (#31035)
4 years ago
Aurelius84 4dbe16c48f
[CustomOp] Refine name argument in setup (#31049)
4 years ago
Aurelius84 f2dc29a9fa
[CustomOp] Support output dtypes in generated Python API (#31045)
4 years ago
ShenLiang 9401173e3a
Remove scale loss before reduce in dygraph (#30807)
4 years ago
Kaipeng Deng c4ddc3ab0d
fix dataloader collate return list mix tensor and numpy array (#30904)
4 years ago
Guanghua Yu 5b267474a9
add offset parameter in roi_align,generate_proposals.etc ops (#30864)
4 years ago
Chen Weihang 75f81233ae
fix regex error & simplify marco name (#31031)
4 years ago
Pei Yang 9b54fe4154
add trt transpose and flatten converter (#31022)
4 years ago
Aurelius84 4c9f96c902
[CustomOp] Support Compile multi ops at same time (#30920)
4 years ago
joanna.wozna.intel caf9d39839
Add Conv Transpose BF16 (#30877)
4 years ago
Huihuang Zheng cbbe127483
Refine fake_interface Error Message (#30981)
4 years ago
Huihuang Zheng c137578341
Add Support for Tuple in for Loop (#30998)
4 years ago
Wojciech Uss 2497f4392f
Handle missing symlink method on Windows (#31006)
4 years ago
Aurelius84 5653c3a488
[CustomOp] Check Compiler ABI compatibility (#30869)
4 years ago
huangjun12 20e300e2df
fix lrn bug in reshape size, test=develop (#30968)
4 years ago
WeiXin 8ab29f4bea
delay timeout of unnittest 'test_static_save_load'. (#30975)
4 years ago
Chen Weihang f649442ddd
New custom operator extension mechanism (#30690)
4 years ago
chajchaj f5ca2db2cc
support label with float input of cross_entropy, test=develop (#30929)
4 years ago
Huihuang Zheng 8e72e031fc
Update gast requirement, test=develop (#30932)
4 years ago
Chen Weihang 010f2caa23
try to fix reader and signal test failed (#30960)
4 years ago
liym27 12c15bebe4
[Static setitem] Support index is ellipsis for setitem in static mode (#30836)
4 years ago
liuyuhui 87197f8c2e
[kunlun]fix sync in multi kunlun xpu dygraph training. (#30943)
4 years ago
wanghuancoder 823f499a8a
fix a bug of Sequential::__getitem__ (#30899)
4 years ago
Jacek Czaja 9e527d9956
[oneDNN] Added basic changes for elementwise_add_grad bf16 (#30925)
4 years ago
liuyuhui 4a8b8b4547
[Kunlun] add gen_bkcl_id_op, support multi XPU cards training using multiprocess (#30858)
4 years ago
wanghuancoder 90d92111cf
let LayerList could add [None], test=develop (#30911)
4 years ago
taixiurong 24873f4f77
dyngraph (#30892)
4 years ago
Zhen Wang 71acde9afc
Use correct master weights in AdamW. (#30895)
4 years ago
Jacek Czaja abfa822650
[oneDNN]Extended adaptive pooling support for oneDNN pool kernel (#30757)
4 years ago
Zhang Ting e97905c5fa
improve performance of momentum (#30881)
4 years ago
cucuzg ac2e2e6b7f
add clip_by_norm on kunlun, *test=kunlun (#30862)
4 years ago
Kaipeng Deng 302427170f
remove numpy array check in single-process dataloader. test=develop (#30861)
4 years ago
wawltor b7560a59ab
fix the broadcast for the large second input (#30818)
4 years ago
JamesLim 6e1e036a75
Implement cuda kernel for index_sample. (#30380)
4 years ago
AshburnLee 666efc2336
Call new cudnn batch norm API regardless of data type and data layout (#30157)
4 years ago
石晓伟 2ac4143b6c
support xpu with analysis predictor, test=develop (#30832)
4 years ago
joejiong 05d2b7a37f
Update paddle.static.Print with paddle2.0 api (#30846)
4 years ago
Aurelius84 e49d0746dd
[CustomOp] Support install as Package and Add load interface (#30798)
4 years ago
Adam Osewski 4f066e316e
Layer normalization fuse pass. (#30721)
4 years ago
WangXi b1026f64af
【kunlun】dygraph supports multi xpu card training (#30671)
4 years ago
LielinJiang 3a3ff75c52
Fix unittest random failed of test_datasets (#30804)
4 years ago
Shang Zhizhou b909450994
fix trt plugin clone and initialize bugs in TRT7.1+ (#30709)
4 years ago
Shang Zhizhou 200ee33df8
fix unittest random error (#30808)
4 years ago
xiemoyuan db87087283
Optimize the encoder of Transformer. (#30439)
4 years ago
WangXi 31ed9c9eed
Fleet distributed strategy support pure fp16 (#30754)
4 years ago
Aurelius84 2c974cc316
【CustomOp】support setup.py to compile custom op (#30753)
4 years ago
Jiaqi Liu 65a9744cfd
fix paddle.static.acc and auc sample code bug, test=document_fix (#30715)
4 years ago
Wojciech Uss fc00240575
A fix for oneDNN matmul kernel. Fixes issue #30309 (#30723)
4 years ago
tianshuo78520a a12b6bb9cb
add readme in whl package (#30726)
4 years ago
WeiXin 3491acfb1e
Split unittest. (#30727)
4 years ago
liu zhengxi a87d78f1a9
update gather_tree doc (#30693)
4 years ago
liu zhengxi fef3654b4e
upgrade gather_tree to core.ops (#30697)
4 years ago
jakpiase f8da5536ed
REUPLOAD Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30719)
4 years ago
liym27 13ef444fa6
[Dy2Stat] Fix error message when the message has more than one lines. (#30714)
4 years ago
Tao Luo 824a79d383
Revert "Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30661)" (#30708)
4 years ago
jakpiase d834f4e6e8
Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30661)
4 years ago
Leo Chen 1a13626f5f
polish printing dtype (#30682)
4 years ago
WangXi a28a202603
fix test_gen_nccl_id_op failed (#30686)
4 years ago
123malin 164275704d
test=develop, fix nonzero astuple=true (#30647)
4 years ago
yingshengBD 0eea5d714f
post quantize support insert fake_quantize_dequantize node before the OPs that will be used in VIS's faceid models (#30659)
4 years ago
123malin 06a3e31148
test=develop, fix test_lookahead (#30677)
4 years ago
yukavio 8c5f158172
remove PrettyTable dependence from paddle.flops (#30675)
4 years ago
chentianyu03 fb7fbc7a5d
fix abs bug and add abs test case (#30637)
4 years ago
ShenLiang 9514b4aa5f
Fix scatter grad bug (#30604)
4 years ago
Qi Li 1f5841c2a0
[ROCM] update cmake and dockerfile, test=develop (#30598)
4 years ago
Zhen Wang 4a9de931a2
Fix the bug in fleet amp_init. (#30606)
4 years ago
cnn 7e9f336b58
update document of paddle.vision.dataset, test=document (#30414)
4 years ago
guofei 430f8449f1
Fix the error of save_quantized_model (#30583)
4 years ago
TTerror 10271ddfc4
support reduce_max op on kunlun (#30581)
4 years ago
WeiXin ca33821475
Extend the timeout of unit test 'test_static_save_load' (#30599)
4 years ago
chentianyu03 358106fcb0
make abs op support complex types (#30375)
4 years ago
huangxu96 138620084c
Add fleet amp_init() (#30572)
4 years ago
wanghuancoder 27a5c0cff6
fix layers train eval bug (#30580)
4 years ago
lilong12 8126a41d73
fix the bug of all_reduce pipeline gradient multiple times (#30437)
4 years ago
Aurelius84 621bc4f771
[Dy2static]Fix paddle prefix in is_paddle_api (#30569)
4 years ago
tangwei12 c9e78a22c5
add trainers for pserver (#30523)
4 years ago
Aurelius84 5067e3a8d2
[Dy2Static]Enhance check of TracedLayers out vars (#30576)
4 years ago
liym27 ff25c5b36f
Fix bug: GetAttrValue should deal with attr with attrType vector<double> (#30536)
4 years ago
WangXi 572c466d19
[Prepare for MultiProcess xpu] unified gen nccl id, refine imperative reducer (#30455)
4 years ago
ykkk2333 549855ac20
add rmsprop_op_xpu test=kunlun (#30493)
4 years ago
Leo Chen 7043b8cfc6
support layer_norm fp16 in dygraph amp (#30430)
4 years ago
Zhang Ting 66c514ce83
[2.0 API] device guard (#30307)
4 years ago
WangXi 7a0a576e51
fix adamw lr_to_coeff is fixed when dygraph (#30526)
4 years ago
cc ce6777fcdf
Fix bug of supporting channelwise dygraph quantized model, test=develop (#30531)
4 years ago
WeiXin c0fb03a0dc
Supplement PR29988(https://github.com/PaddlePaddle/Paddle/pull/29988) (#30507)
4 years ago
hutuxian 9fec1618d2
Ascend Framework Part3: Ascend Parser (#30391)
4 years ago
hutuxian 40ede12631
Ascend Framework Part1: OP & Wrapper (#30281)
4 years ago
Zhang Ting 34bf8dfc40
avoid calling cast twice (#30527)
4 years ago
gongweibao bdae7ed326
Fix potential port conflicts. (#30508)
4 years ago
QingshuChen 8489d4f76f
optimize batch_norm & pool op for kunlun (#30490)
4 years ago
taixiurong 5e5c2827a3
fix range op crash in dygraph xpu place (#30469)
4 years ago
WeiXin 18ecd433f5
Avoid bug on 'MAC python3.5/6'. (#30485)
4 years ago
JZ-LIANG 16ba0abc79
Recompute Offload: fixed bug in memcpy (#30484)
4 years ago
lijianshe02 d8a9ba56ef
fix random seed in nll_loss unittest test=develop (#30468)
4 years ago
cc 5d8d463cf7
Collect weight threshold for lstm op in post_training_quantization (#28701)
4 years ago
guofei 11e78ebaa3
Modify the calculation logic of LambOptimizer (#29313)
4 years ago
LielinJiang 1d7bf1de2b
Update voc dataset url (#30450)
4 years ago
pangyoki 13d757362c
Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103)
4 years ago
WeiXin e5bb4edb2c
perfect 'var_list' of static.load/fluid.load (#30457)
4 years ago
123malin 05f06d9ae1
test=develop, fix fleet.metric (#30438)
4 years ago
taixiurong 6a3c8725b0
support transformer v2.0 (#30381)
4 years ago
Zhou Wei c94a4b9468
Separate AVX and NO_AVX compilation, enhance installation error message (#30413)
4 years ago
Jiaqi Liu e395bcd1e0
add auc into 'all' list (#30310)
4 years ago
Chengmo 859431aadb
fix ps init(#30397)
4 years ago
123malin 2a98e9323a
test=develop, add distributed_infer (#30300)
4 years ago
Wilber 96784ed6c8
fix compile error on ARM (#30398)
4 years ago
Chen Weihang ae1f32091a
fix prune input bug (#30384)
4 years ago
WeiXin 5ff4f1ad5e
move 'load_op_library','LayerHelper' to 'paddle/incubate' (#30339)
4 years ago
Huihuang Zheng cd5f11b822
Decrease Batch Size for Windows CI, test=develop (#30331)
4 years ago
cc 8e3a294045
skip quantizing ops in cpu inference (#30342)
4 years ago