Commit Graph

17987 Commits (ebf689197d61af28110fa6b45e91527c47f68076)

Author SHA1 Message Date
iducn f1074e3b19
hide the token output to safely (#28716)
4 years ago
joejiong 32b90b1c2d
add log10 (#28576)
4 years ago
Leo Chen 3d09929b1f
Add check for non-dispensable input (#28666)
4 years ago
Chen Weihang 7eeb99fe02
Add basic hook classes for dygraph & implement reduce hook (#28584)
4 years ago
Guo Sheng 858ffa0c8b
Fix the dropout setting when not initialized in rnn_op. (#28561)
4 years ago
Jacek Czaja 6d8d3d4c22
[oneDNN] Layer norm bf16 kernel (#28619)
4 years ago
lilong12 80d2024644
bug fix, test=develop (#28674)
4 years ago
Zhou Wei bf143652ac
fix lstm OP compile error on windows (#28667)
4 years ago
石晓伟 57dab959ca
add datanorm op new scale_w register (#28657)
4 years ago
cc 65aac81191
Fix fake_quant error when cout > 1024, test=develop (#28603)
4 years ago
lilong12 b2f7ab6636
bug fix, test=develop (#28648)
4 years ago
wawltor 8f2656ef5c
fix the gradient bug for the topk v2
4 years ago
wangchaochaohu a972c33fd7
refine gather OP performance for dynamic mode (#28587)
4 years ago
joanna.wozna.intel 2cb71c0cde
Add checkpoint to quantize (#28612)
4 years ago
lidanqing 804271cff9
Op version python mkldnn_inplace test (#28354)
4 years ago
pangyoki b889a0cee2
add gaussian_random op_version (#28602)
4 years ago
YUNSHEN XIE cf2c42a937
fix exec nightly error on mac (#28567)
4 years ago
Guo Sheng 110febdc54
Fix gradients with ignore_idx in softmax_with_cross_entropy (#28622)
4 years ago
Wilber 8b97bb2e1f
Update cmake for arm ft and fix a bug for Predictor dtor. (#28586)
4 years ago
Leo Chen f962bd3432
Fix cudnn workspace limit in cudnn-8 (#28611)
4 years ago
Leo Chen 90805e2df7
Register op_version for new attribute use_addto (#28463)
4 years ago
danleifeng a24d186814
fix nccl init failed in parallel dygraph mode (#28497)
4 years ago
Zhou Wei 93c39779b4
open a part of GPU unittest for windows (#28378)
4 years ago
lilong12 ed9dd7c9f0
add send and recv ops (#28590)
4 years ago
Zhong Hui a829357e4d
register the op version for some ops
4 years ago
Zhou Wei bf6e7cba7a
update 2.0 API English doc (#28525)
4 years ago
YUNSHEN XIE 7b1619e69b
disable test_trt_dynamic_shape_transformer_prune,test=document_fix (#28588)
4 years ago
Zhou Wei 849467b5aa
fix user set CUDA_VISIBLE_DEVICES start/end with quotation marks (#28547)
4 years ago
Shang Zhizhou 8699f38d08
Add TRT support for the pruned transformer model; fix the bug that TensorRT does not support DeletePass (#28517)
4 years ago
joejiong 08d2413142
add log2 operator (#28319)
4 years ago
lidanqing 0fc181dbd0
[Fix bug] If the pass name is not found, IsCompatible should return false (#28475)
4 years ago
Wilber 1bf4836580
[Inference] Add TryShrinkMemory interface. (#28409)
4 years ago
wangchaochaohu c52fe48f6f
fix the GetKernelTypeForVar of input for fluid.gather (#28534)
4 years ago
wangchaochaohu d7cfee9b31
Checkout point add (#28488)
4 years ago
YUNSHEN XIE 98dc11bb6a
add monitoring for executive ut at night (#28377)
4 years ago
Pei Yang 75196cda40
Paddle-TRT int8 support mul op channelwise quant (#28422)
4 years ago
zhupengyang 47cbf61dd4
fix softmax unittest float16 random error (#28480)
4 years ago
Zhou Wei 53e9aa948d
remove diff with develop (#28504)
4 years ago
YUNSHEN XIE 369605be1d
fix cmake error when execute build_inference_lib (#28503)
4 years ago
Wilber 645e999afc
fix api_impl test. (#28483)
4 years ago
YUNSHEN XIE 1e698c600e
fix cmake error when setting ut timeout property (#28492)
4 years ago
wangchaochaohu e14ed71cc2
refine the performance of gather Op (#28458)
4 years ago
wanghuancoder e29ab5eacb
clear clcache cache file and reopen clcache (#28384)
4 years ago
YUNSHEN XIE ba0756325a
exec ut no more than 15s 1 (#28439)
4 years ago
Chen Weihang 155b4f9b6c
Remove selected rows all reduce over height check (#28460)
4 years ago
taixiurong fad4744aa4
fix crash in adam in xpu, *test=kunlun (#28433)
4 years ago
QingshuChen 6bba8e57b1
fix batch_norm_xpu bug & remove xpusimulator dependence (#28430)
4 years ago
Wilber ced5c40c41
Update memory release interface. (#28456)
4 years ago
joanna.wozna.intel 7821759d48
Add bfloat16 softmax and gelu (#28394)
4 years ago
iducn ba0fe0a812
revert the modified shell script (#28453)
4 years ago
Chen Weihang c42e656179
Add retry for dygraph parallel socket bind (#28404)
4 years ago
石晓伟 c41fd033e5
check op_version_registry in CI test, test=develop (#28402)
4 years ago
Jacek Czaja ca41541472
[oneDNN]Sum bf16 kernel (#28382)
4 years ago
Chen Weihang 23439b1688
show cpp stack when catch signal (#28415)
4 years ago
Leo Chen 44a476c2ab
support cuda pinned place (#28416)
4 years ago
lidanqing 12b9587be5
Add conv_bias pass version python test (#28278)
4 years ago
Wilber 05114693cf
[Inference] Memory modification for ShrinkMemory. (#28355)
4 years ago
Leo Chen 8b2436a776
Add broadcast_shape api (#28257)
4 years ago
石晓伟 21a63f6f90
enhance the op_version_registry, test=develop (#28347)
4 years ago
YUNSHEN XIE c1c3e21726
retry will not be executed when the number of failed ut is greater than 20 (#28374)
4 years ago
Shang Zhizhou ea851796e5
Optimize ernie model inference performance in TensorRT; support variable-length input (#28367)
4 years ago
Jacek Czaja 84cc61b2cd
[oneDNN] sum op refactor (#28318)
4 years ago
Wilber 6f0f45f69c
copy_to_cpu support uint8 (#28372)
4 years ago
Wilber 09fd2b2aab
Paddle support compile on sw (#27858)
4 years ago
chen zhiyu 953302d9eb
add musl docker build script (#28027)
4 years ago
Leo Chen 6115c14fca
Pool2d cuda kernel supports fp16 (#28316)
4 years ago
Zhou Wei f41104efa3
fix compile out of memory temporary (#28346)
4 years ago
Guo Sheng 9a600df373
Add rnn_op (#28197)
4 years ago
wangchaochaohu 0f4b6247c8
refine the gpu config for performance optimization (#28291)
4 years ago
Huihuang Zheng acc11c2a62
Retry CUDA Initialization to Fix Random Failure, test=develop (#28323)
4 years ago
wangguanzhong 5262b02585
add generate_proposals_v2 op (#28214)
4 years ago
石晓伟 d9b5f1261c
update the version of pybind, test=develop (#28284)
4 years ago
Leo Chen 18c86fb2fb
hide some logs of p2p (#28307)
4 years ago
lidanqing 8cd1c102d9
Enable GRU infer model running CAPI (#28313)
4 years ago
wangguanzhong 1c385e26f9
add op_function_generator for box_coder (#28303)
4 years ago
iducn f763cb81a6
Modify the shell script according to the specification (#28302)
4 years ago
joanna.wozna.intel 571a63e7ec
Add bf16 transpose2, reshape2, concat ops (#28195)
4 years ago
Guanghua Yu e8f2614da5
Enhance multiclass_nms op to support LoD for dygraph mode (#28276)
4 years ago
石晓伟 842a4e5abd
fix analyzer_capi_tester, test=develop (#28289)
4 years ago
Leo Chen 8953038400
Fix transpose in conv cudnn kernel when addto enabled (#28295)
4 years ago
Tao Luo e1e666a05f
fix conv mkldnn build error (#28288)
4 years ago
Jacek Czaja 0b678d401b
- sum (#28233)
4 years ago
Jacek Czaja c11d9b3035
[oneDNN ] conv2d fwd&bwd optimization (#27871)
4 years ago
Zhou Wei 8f87c7eac4
fix judge bug of errorlevel on cmd (#28271)
4 years ago
wangxinxin08 41d26a8287
update matrix nms op to api 2.0 (#28265)
4 years ago
Leo Chen 7fcb32ddf3
fill_constant op supports NINF (#28270)
4 years ago
wangchaochaohu 6905608cea
refine yolo box Op for performance optimization (#28155)
4 years ago
wangchaochaohu cdadc8f019
refine temporal_shift_op for performance optimization using gpu kernel config (#28114)
4 years ago
Zhang Ting fdc06f2158
add Fuse bn add act pass (#28196)
4 years ago
Chen Weihang 813b2ade34
Enrich the python error types of paddle & polish format (#28124)
4 years ago
Adam Osewski 7db747d9e8
oneDNN BatchNorm + Act fusion pass. (#27912)
4 years ago
Zhou Wei fb7f85291b
fix print tensor place,add cpu/cuda/pin_memory API for Tensor (#28200)
4 years ago
tianshuo78520a 11089cacdb
Fix xpu notest (#28204)
4 years ago
mapingshuo 81244fbfab
add sharding strategy in fleet(#27900)
4 years ago
Chen Weihang 2babd6ff67
Add compile limit for PADDLE_ENFORCE without error message (#28221)
4 years ago
lidanqing 4ea2330759
use FLAGS_use_mkldnn to prevent unnecessary attrs copy (#28146)
4 years ago
tianshuo78520a d835118dbd
Hide log message (#28220)
4 years ago
Double_V 2db77be423
fix wrong data type, test=develop (#28203)
4 years ago
Feiyu Chan efe6e2840c
fix strided_slice_op's GetExpectedKernelType (#28192)
4 years ago
Zhou Wei 271ee58f5c
Enhance build detection (#28123)
4 years ago
Leo Chen 1f3be85914
Fix bug of fetch_async_op_handle when fetching the feed variable (#28194)
4 years ago
WangXi e450823b8b
Fix nccl op test failed, test=develop (#28172)
4 years ago
tianshuo78520a c226b2e45a
update dockerfile (#27589)
4 years ago
Wilber f935ca8a50
[lite-xpu-subgraph] Fix xpu compile and test xpu ci. (#27932)
4 years ago
Zhou Wei 68c473e3e0
fix Automatic GPU detection failed on windows (#28148)
4 years ago
danleifeng f29fb396df
dygraph nccl init support host domain name (#28107)
4 years ago
wangguanzhong 5cd97a1cb0
support multiclass nms for multi-batch, test=develop (#28154)
4 years ago
Pei Yang 602d2ce5c9
change avg pooling from trt plugin to trt layer (#28032)
4 years ago
Double_V 5289b72acc
fix Wmaybe-uninitialized warning in pooling.cc, test=develop (#28126)
4 years ago
Zhou Wei 5d7000215a
fix dynamic_loader more safe and error message on windows (#28117)
4 years ago
tianshuo78520a d87d286707
Add build paddle inference (#28131)
4 years ago
wangguanzhong d1e1f17482
fix generate_proposal_labels in cascade-rcnn series model, test=develop (#27892)
4 years ago
Leo Chen a911c19eb0
fill_constant op supports NaN and Inf (#28109)
4 years ago
zhupengyang 6dd64b0a30
randperm run error in multi-gpus (#27942)
4 years ago
Double_V d43f75e4cc
add rois_num for roi_align xpu OP (#28077)
4 years ago
xiaoting e3d02c9574
rm max_input in conv2d for kunlun, test=kunlun (#28062)
4 years ago
joanna.wozna.intel a21b57109c
Add AVX512 instruction check for C-API (#28087)
4 years ago
wangchaochaohu 463c72c2d9
refine gpu kernel config for Paddle (#28085)
4 years ago
yinhaofeng 2cb1ecb99e
lookup_table_v2_op_xpu report errors;test=kunlun (#28064)
4 years ago
yinhaofeng 6f0c3d1f06
xpu adam op (#28031)
4 years ago
TeslaZhao a5c95cd588
Add xpu transpose2 op.test=kunlun (#28086)
4 years ago
Chengmo 5f04875c30
Fix xpu error message (#28061)
4 years ago
LutaoChu c8d32c8c10
Fix diag OP bug on Windows Python3.8
4 years ago
Pei Yang a0b2f93689
reduce trt warning message (#28011)
4 years ago
huangxu96 d466893820
Allclose op (#27891)
4 years ago
pangyoki 975bd8873b
Fix error message of multinomial op (#27946)
4 years ago
Kaipeng Deng b6eff4427c
update yolo_box support h != w. test=develop (#27327)
4 years ago
Double_V c1eed1fa24
error message opt for XPU, test=kunlun (#27972)
4 years ago
pangyoki 4c5b779a99
Add truncated_gaussian_random XPU kernel (#27861)
4 years ago
pangyoki 5b8e500135
Add gaussian_random XPU kernels (#27853)
4 years ago
pangyoki 74ce039743
Add uniform_random XPU kernel (#27846)
4 years ago
xiaoting abf4d52a74
Polish kunlun error (#27974)
4 years ago
liuyuhui 3e9568653b
add cast/concat/assign xpu op (#27911)
4 years ago
Guo Sheng fa9d3fa5bf
Incorporate cudnn_lstm into LSTM api (#27217)
4 years ago
chentianyu03 05fd49e974
change paddle.fluid.layers.reduce_sum to paddle.sum in sample codes (#27998)
4 years ago
Guanghua Yu f94d053705
error message optimization in mean_xpu,softmax_with_cross_entropy_op_xpu,test=kunlun (#27967)
4 years ago
Jack Zhou d330cf66cc
Fix xpu enforce (#27978)
4 years ago
lidanqing 7cb4a8b8f2
[oneDNN] Conv dilation support (#27914)
4 years ago
mapingshuo 64c2634995
fix kunlun kernel of reshape op (#27988)
4 years ago
tangwei12 202bfab1be
Feature/large scale kv save base/delta (#27470)
4 years ago
123malin aa3b4ed717
【paddle.fleet】geo send sparse optimize (#27719)
4 years ago
Zhou Wei 2ac6c6c3af
fix bug of tensor copy of CUDAPinnedPlace (#27966)
4 years ago
joanna.wozna.intel 840c521b77
Fix problem with flags fp32 and int8 (#27954)
4 years ago
mapingshuo 5ccaaab8aa
reshape support bool, test=develop (#27944)
4 years ago
Qinghe JING 4a4f773658
Add reduce sum and reduce mean xpu op (#27939)
4 years ago
Zhou Wei bf412f4665
add tensor clone (#27953)
4 years ago
Feiyu Chan 2e845182d9
support channel last in BatchNorm*d
4 years ago
guofei 6bbb6e7f45
Implement the function of OutScaleForTraining/OutScaleForInference in dygraph (#26601)
4 years ago
YUNSHEN XIE fea09fe534
disable ut quickly (#27793)
4 years ago
chentianyu03 d05058d268
Remove and reorganize the alias of APIs (#27717)
4 years ago