Commit Graph

3157 Commits (bc7a3afa687696541b032d56d1e9a8ca8e101c77)

Author SHA1 Message Date
liym27 9602a182b2
[Dynamic Inplace] Support ShareInplaceVersionCounterWith for C++ Tensor (#29842)
5 years ago
liuyuhui 4427df37cf
[Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574)
5 years ago
YUNSHEN XIE 2a01756bf3
remove duplicate ut names (#29809)
5 years ago
Chen Weihang a6072055be
[Complex] Handle complex to real after type promotion (#29855)
5 years ago
Leo Chen 6b258317cb
fix TransferInplaceBack (#29830)
5 years ago
QingshuChen 59b47f3b32
feat: support check_nan_inf for kunlun/xpu device (#29694)
5 years ago
tangwei12 032414ca2a
[Feature] one ps (3/4) (#29604)
5 years ago
jakpiase edc06c6a1b
Added fc + activation fuse pass (currently only gelu, sigmoid and tanh are supported) (#29772)
5 years ago
YUNSHEN XIE 24ce051a84
remove duplicate ut reload (#29810)
5 years ago
Thunderbrook 09b6e71928
heter box (#29734)
5 years ago
Jacek Czaja 7b33720c90
[oneDNN] Tensor copy fix to oneDNN tensors (#29771)
5 years ago
Leo Chen 224f3bcbb1
format code (#29714)
5 years ago
石晓伟 8bd2879ef7
update the operator registration for incompatible upgrade, test=develop (#29720)
5 years ago
WangXi 9cbcc6cadc
fleet sync build strategy, test=develop (#29732)
5 years ago
Chen Weihang 6cfa59de1b
[Complex] Add real & imag op and api for complex tensor (#29672)
5 years ago
liuyuhui f13c3a9cd7
[Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)
5 years ago
lilong12 ff6a145011
update, test=develop (#29559)
5 years ago
Jacek Czaja f6cca62575
[oneDNN] Making ThreadID info in caching key optional (#29272)
5 years ago
JZ-LIANG d33d468f02
[Sharding] add hybrid-dp feature (#29518)
5 years ago
LoveAn b5d4a1f33d
Add the strategy of skipping cc/cu test compilation and execution in CI (#29499)
5 years ago
Aurelius84 2a42250699
Polish hash function of executor cache key (#29556)
5 years ago
jakpiase 57a4f16d9e
added internal and external reorders to profiler (#29443)
5 years ago
LoveAn 03b42d9fa7
fix unittest on windows, test=develop (#29365)
5 years ago
cc a623ce044f
Use different name_scope for different conv type, test=develop (#29355)
5 years ago
liym27 b10ecd9d3a
[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows (#29267)
5 years ago
Chen Weihang 9ad800ebb2
Support type promote for basic math ops (quantum required) (#29265)
5 years ago
Aurelius84 67c700b479
[Dy2Stat] Add cache for Executor and Context in run_program_op (#28421)
5 years ago
Chen Weihang 1de32f823d
Hot fix compile failed in gcc4.8 caused by complex impl (#29254)
5 years ago
GeminiCarrie 642abe2a48
Fix a bug when running on an operating system without "bash." (#29131)
5 years ago
ShenLiang 46b73e6cd9
Change the api of DataParallel and Fleet (#29224)
5 years ago
chentianyu03 8f45d14263
add complex64 and complex128 type; add +-*/@ and slice operator for c… (#29199)
5 years ago
liym27 865a45984f
Check whether there is any inplace operation affecting gradient calculation. (#27901)
5 years ago
Chen Weihang 0b032faeee
Polish unittests details and execution conditions to adapt to MUSL (#29044)
5 years ago
Wojciech Uss 4fd4095d1b
Add quantization of multi_gru op and tests (#28615)
5 years ago
yaoxuefeng 545df287fc
add user_define_dump (#28596)
5 years ago
arlesniak bc902044a4
Fixes mkldnn dygraph learning rate scheduler crashes (#28988)
5 years ago
WangXi 173c22aec2
optimize fast graph executor (#28962)
5 years ago
Shibo Tao db41258501
add API serialize_program, serialize_persistables, save_to_file, deserialize_program, deserialize_persistables, load_from_file. (#29034)
5 years ago
joanna.wozna.intel b0d1ac161e
Add bf16 pool2d and unify bf16 unit tests (#29039)
5 years ago
joanna.wozna.intel fddea67445
Fix cpu_bfloat16_pass (#28730)
5 years ago
Chen Weihang fea0e294ee
Hide the C++ stack by default and add hints (#29042)
5 years ago
Wojciech Uss 7b5a8e46de
Add multi_gru_fuse_pass and tests (#28601)
5 years ago
Wojciech Uss 991345b368
Add multi_gru_seq_fuse_pass and tests (#28604)
5 years ago
lilong12 f77a78cdee
enable pipeline to run with Executor.run() (#28373)
5 years ago
Thunderbrook 0073f9bdb0
support ps-gpu (#28752)
5 years ago
Jacek Czaja bd1d6d3b30
extends oneDNN caching keys so caching objects are unique to executor/predictor (#28758)
5 years ago
gongweibao 1dad8ceaab
Fix gpu memory allocation bug. (#28703)
5 years ago
joanna.wozna.intel 8c0ea4bffe
Add bf16 matmul, fc, elementwise add and mul (#28729)
5 years ago
Wojciech Uss efc3b182f0
a fix for the fc_lstm_fuse_pass (#28709)
5 years ago
wanghuancoder 5aec7dbeb0
use forward declarations for framework.pb.h (#28494)
5 years ago
Jacek Czaja 6d8d3d4c22
[oneDNN] Layer norm bf16 kernel (#28619)
5 years ago
joanna.wozna.intel 2cb71c0cde
Add checkpoint to quantize (#28612)
5 years ago
lidanqing 804271cff9
Op version python mkldnn_inplace test (#28354)
5 years ago
Leo Chen 90805e2df7
Register op_version for new attribute use_addto (#28463)
5 years ago
Shang Zhizhou 8699f38d08
TRT support for pruned transformer models; fix the bug that TensorRT does not support DeletePass (#28517)
5 years ago
lidanqing 0fc181dbd0
[Fix bug] If the pass name is not found, IsCompatible should return false (#28475)
5 years ago
wangchaochaohu d7cfee9b31
Checkout point add (#28488)
5 years ago
Pei Yang 75196cda40
Paddle-TRT int8 support mul op channelwise quant (#28422)
5 years ago
YUNSHEN XIE 369605be1d
fix cmake error when execute build_inference_lib (#28503)
5 years ago
YUNSHEN XIE 1e698c600e
fix cmake error when setting ut timeout property (#28492)
5 years ago
YUNSHEN XIE ba0756325a
exec ut no more than 15s 1 (#28439)
5 years ago
joanna.wozna.intel 7821759d48
Add bfloat16 softmax and gelu (#28394)
5 years ago
石晓伟 c41fd033e5
check op_version_registry in CI test, test=develop (#28402)
5 years ago
Jacek Czaja ca41541472
[oneDNN]Sum bf16 kernel (#28382)
5 years ago
lidanqing 12b9587be5
Add conv_bias pass version python test (#28278)
5 years ago
石晓伟 21a63f6f90
enhance the op_version_registry, test=develop (#28347)
5 years ago
joanna.wozna.intel 571a63e7ec
Add bf16 transpose2, reshape2, concat ops (#28195)
5 years ago
Zhang Ting fdc06f2158
add Fuse bn add act pass (#28196)
5 years ago
Chen Weihang 813b2ade34
Enrich the python error types of paddle & polish format (#28124)
5 years ago
Adam Osewski 7db747d9e8
oneDNN BatchNorm + Act fusion pass. (#27912)
5 years ago
mapingshuo 81244fbfab
add sharding strategy in fleet(#27900)
5 years ago
Chen Weihang 2babd6ff67
Add compile limit for PADDLE_ENFORCE without error message (#28221)
5 years ago
Leo Chen 1f3be85914
Fix bug of fetch_async_op_handle when fetching the feed variable (#28194)
5 years ago
lidanqing 7cb4a8b8f2
[oneDNN] Conv dilation support (#27914)
5 years ago
Zhou Wei 2ac6c6c3af
fix bug of tensor copy of CUDAPinnedPlace (#27966)
5 years ago
guofei 6bbb6e7f45
Implement the function of OutScaleForTraining/OutScaleForInference in dygraph (#26601)
5 years ago
Thunderbrook 3ee6ad6ec5
solve bug in pull_dense_worker (#27918)
5 years ago
zhang wenhui 5a83496c8d
Multi task (#26002)
5 years ago
wanghuancoder 41aad9bfcd
revert 4 files, from clear include by iwyu, test=develop (#27895)
5 years ago
Leo Chen 049696bf67
Refine the format of printing tensor (#27673)
5 years ago
Chengmo c5f2802d56
【paddle.fleet】Update fleetrun & ps-heter (#27472)
5 years ago
石晓伟 0d27591642
save operator version information to program desc, test=develop (#27668)
5 years ago
Jacek Czaja 631c1f3018
- Fix to 27398 (#27770)
5 years ago
Jacek Czaja 606611d351
[oneDNN] GRU BF16 kernel (#27731)
5 years ago
Jacek Czaja b9fda2ff09
Fix to issue #25537 (#27546)
5 years ago
Wojciech Uss 966447e338
Added support for quantization of fusion_gru (#27518)
5 years ago
Pei Yang 8a4f85feb9
Add unittests and OP version registry for quant_conv2d_dequant_fuse_pass (#27689)
5 years ago
AshburnLee c3a3df6466
Add cuda support for unique op (#27646)
5 years ago
Leo Chen 35074963e3
Refine error msg in paddle/fluid/framework/details [part 2] (#27429)
5 years ago
Chengmo 0e101c4f6f
Fix test dist fleet heter ctr (#27513)
5 years ago
joanna.wozna.intel b0ee1405f7
Add conv2d bfloat16 support (#27325)
5 years ago
Thunderbrook 6f69a4cb05
add xpu in heter mode (#27000)
5 years ago
WangXi e550fc02ae
fleet2.0 add fp16 grad compression (#27480)
5 years ago
cc c5c13473c6
Add compatibility check for four mkldnn pass (#27364)
5 years ago
Wilber 3d5522146e
register seq_concat_fc_fuse pass. (#27479)
5 years ago
wanghuancoder df43905f12
use iwyu clean include (#27267)
5 years ago
Pei Yang 8182337096
clear pass logs (#27434)
5 years ago
Shang Zhizhou d93661942e
fix bug sequececonv_eltadd_relu_fuse_pass (#27404)
5 years ago
Leo Chen aba759ba16
[Feature] Enhance inplace addto strategy for gradient accumulation in static graph (#27112)
5 years ago
Wilber 39546aa2f3
Add pass compatible and unit test. (#27377)
5 years ago