17233 Commits (eef98b7f8602b5bd740ceab5f74b19bea66ed853)

Author SHA1 Message Date
Huihuang Zheng f9ac5fb992
[Dy2stat] Fix Memory Optimization in run_program_op and Add SimNet as Unit Test (#25383)
5 years ago
yaoxuefeng c42d662e2a
modify roll test=develop (#25321)
5 years ago
Zhen Wang 548cdbc544
Quantization-aware training for dygraph (#24634)
5 years ago
Chen Weihang 0b54d54fd8
Fix index overflow bug of the CUDA kernel loop increment (#25435)
5 years ago
zlsh80826 e528392de9
[Paddle-TRT] SkipLayernorm vectorized memory optimization (#25117)
5 years ago
Chen Weihang 4061aa6488
Polish ParallelExecutor exception process logic (#25449)
5 years ago
Jeng Bai-Cheng fc93266b0a
Improve qkv transpose performance (#23919)
5 years ago
zhupengyang 5b573c58e2
randperm API: remove out, devive, stop_gradient; add name (#25410)
5 years ago
Chen Weihang 7be285a66f
remove useless property, test=develop (#25461)
5 years ago
tianshuo78520a 2d028389e4
Fix Cpu CI error(#25457)
5 years ago
Jacek Czaja a5d1592f6c
Added missing oneDNN format (#25450)
5 years ago
Chen Weihang 172d4ecb6c
remove WITH_DSO compile option (#25444)
5 years ago
Zhen Wang bb45af02ac
add the c++ part of Imperative QAT. test=develop (#25446)
5 years ago
Jacek Czaja 050a9bf79d
[oneDNN] LRN cleanup (#25416)
5 years ago
GaoWei8 1974aadcf0
fix concat shape error (#25414)
5 years ago
tangwei12 4b3778a3ee
Revert/barrier for sync (#25417)
5 years ago
ceci3 52be62c5ae
fix instance norm in dy (#24717)
5 years ago
lilong12 e39aa70ec7
add the support for pipeline (#24560)
5 years ago
hong 70d7d07fea
catch bad alloc exception (#25140)
5 years ago
gongweibao 80f1c50738
Fix typo in interface. (#24779)
5 years ago
Zhaolong Xing 7b7e605189
[Fix BUGs]: fix multhead matmul pass's instable bug (#25123)
5 years ago
zhupengyang eb3173e2b6
rand API: remove out, device, stop_gradient; add name (#25246)
5 years ago
GaoWei8 ea7e532598
Refine PADDLE_ENFORCE (#25369)
5 years ago
zhupengyang 6de75082cb
fix test_hsigmoid windows ci (#25311)
5 years ago
Dong Daxiang d5e40d1ba9
Paddle fleet distributed strategy (#25379)
5 years ago
WuHaobo f593c3fb2f
fix the formula of floor OP and ceil OP (#25292)
5 years ago
Wojciech Uss d0a921ba98
Quant2 updates and fixes (#25313)
5 years ago
Zhang Ting bc7610583b
use eval() to improve CPU performance (#25243)
5 years ago
lilong12 3d96601b82
modify pipeline optimizer to only support the mode of sync pipeline training (#25065)
5 years ago
Kaipeng Deng 74468bf428
add mish op. (#24565)
5 years ago
Chen Weihang f07b25d8e5
fix DataLoader.generrator using error, test=develop (#25355)
5 years ago
GaoWei8 fb70682f00
fix PADDLE_ENFORCE (#25297)
5 years ago
Yang Zhang 6d6efafeeb
Add `matrix_nms_op` (#24400)
5 years ago
Chen Weihang 5a959f6e6e
Refactor dynamic dso search functions (#25214)
5 years ago
Jacek Czaja 17c751bec6
[oneDNN] Fix to #25078 (#25256)
5 years ago
MRXLT 3b8f0a64c2
Encryption infer (#25119)
5 years ago
Wilber 4474fc1033
fix compile on windows. test=develop (#25310)
5 years ago
Aurelius84 bc2bd3c1ed
modify into eager_tmp of Base Class test=develop (#25323)
5 years ago
Chengmo e85fcaa712
Fix fluid.embedding in Distributed Training (#25174)
5 years ago
Aurelius84 494cb36d09
Modify tmp var name prefix in dygraph (#25280)
5 years ago
Wilber 0371cf6f94
fix compile for lite subgraph. test=develop (#25285)
5 years ago
Yiqun Liu c00f827843
Avoid data transforming ShapeTensor from CPU to GPU in fill_constant op. (#25267)
5 years ago
Wojciech Uss 23a4f54b73
rename qat into quant (#24948)
5 years ago
123malin f1a9593d69
test=develop, bug fix for index_select and roll op (#25251)
5 years ago
FDInSky c2e072587c
test=develop fix generate_proposals's error (#25227)
5 years ago
Sylwester Fraczek 36abeff44f
adding elementwiseadd quantization (#25178)
5 years ago
Wojciech Uss 56fa3880e3
rename qat into quant in filenames only (#25194)
5 years ago
Wilber 4c964abdf7
support build on arm. test=develop (#25212)
5 years ago
Wilber f78e161ea3
remove paddle_use_kernel and paddle_use_op. test=develop (#25189)
5 years ago
liym27 1458cc0c68
Fix bug: Don't check dims if contain_unknown_dim of cross_entropy_grad_op in compile time (#25221)
5 years ago
liu zhengxi 68e93d8a17
Fix beam_search InferShape (#25169)
5 years ago
Chen Weihang 353ea9e8ad
Add default cudnn lib path (#25175)
5 years ago
Leo Chen ff5be2fb77
Refine error message in memory folder (#25095)
5 years ago
Adam bd0b38e671
Refactor of conv fp32 oneDNN operator (#25137)
5 years ago
Pei Yang b2f5a149e7
[Paddle-TRT] Better Paddle-TensorRT support for PaddleSlim quant models (#25097)
5 years ago
Tao Luo 2996315fc9
fix profiler_test on win32 (#25073)
5 years ago
Shibo Tao 19c4db1b56
don't re-generate header file if content doesn't change (#25130)
5 years ago
iducn f282599229
disable unitest for gcc8(#25134)
5 years ago
tianshuo78520a 1eb9ee242b
delete buddy_allocator_test_data to make repo clean (#25046)
5 years ago
Chen Weihang b23801a262
polish tensor set error messag, test=develop (#25113)
5 years ago
Jacek Czaja a7944904d3
[oneDNN]elementwise_add and elementwise_mul int8 support (#24984)
5 years ago
Zhaolong Xing 843581154f
fix emb eltwise layernorm (#24873)
5 years ago
石晓伟 9ab3cf039c
remove useless test_dot, test=develop (#24957)
5 years ago
石晓伟 6783441e70
fix repeat definitions in liengine.cc, test=develop (#25020)
5 years ago
Leo Chen fa657b3dbb
fix bug of prelu when rank not equal 4, test=develop (#25067)
5 years ago
zlsh80826 479c8834f7
[Paddle-TRT] Fixes #24731, opt for SoftmaxKernelWithEltadd kernel, test=develop (#24834)
5 years ago
hutuxian 5822862d8a
Monitor Framework (#24079)
5 years ago
Leo Chen 028de857d4
fix dtype error of compare op, test=develop (#25059)
5 years ago
Jeng Bai-Cheng bef4afa6de
bugfix for unique_ptr of IOptimizationProfile (#23917)
5 years ago
zlsh80826 49e4ee27e1
[Paddle-TRT] slice kernel optimization (#24783)
5 years ago
tianshuo78520a 770c11a117
fix make device_context error (#25045)
5 years ago
tangwei12 be6a315fbd
Fix/sync barrier (#25016)
5 years ago
ceci3 8db66fc3f6
fix cos_sim, test=develop (#25017)
5 years ago
Leo Chen 25a4dac4c2
Use allow list instead of white list (#25002)
5 years ago
Zhang Ting 621b638550
improve performance of instance_norm, test=develop (#25005)
5 years ago
hutuxian 1c224e26af
support CMatchAuc (#24990)
5 years ago
Zhou Wei ff8ca52f88
windows publish package scripts (#24851)
5 years ago
Leo Chen bfa46c38d5
bn supports reverse_space, test=develop (#24988)
5 years ago
wangchaochaohu 613303dbf6
refine the slice Op to improve the performance of xlnet for fp16 training (#24967)
5 years ago
silingtong123 37bdb5269f
test=develop, add log message in the function UpdateDllFlag (#24937)
5 years ago
Chen Weihang d152d7231e
clear old var in scope, test=develop (#24976)
5 years ago
Sylwester Fraczek 53d563a0fe
Reshape transpose matmul coverage (#24970)
5 years ago
wawltor 0eb1b0bc01
Add support the 5d, 6d tensor support for the reduce ops
5 years ago
liuwei1031 8603b5fb72
fix randomly hang issue of PaddleDetection training task on windows (#24977)
5 years ago
silingtong123 640196c446
test=develop, remove the tensorrt dll file from windows package (#24922)
5 years ago
wangchaochaohu feba131893
fix the sgement fault error of profiler in seqseq model test=develop (#24952)
5 years ago
Sylwester Fraczek a7ee634b45
fix WARNING: ThreadSanitizer: heap-use-after-free (#24929)
5 years ago
mapingshuo 24e24987f0
fixes the place info in the Print op (#24934)
5 years ago
Aurelius84 6be0ee159e
Support LoDTensorArray in reverse_op (#24797)
5 years ago
Leo Chen 6190023ac9
Refine error message in pybind folder (#24886)
5 years ago
Zhou Wei 4058e736ff
temporarily disable these unittests failed on windows (#24942)
5 years ago
Leo Chen a7cb97a1a5
Fix/isfinite on windows (#24927)
5 years ago
silingtong123 ef9b36873d
test=develop, remove the gflags/gflags.h form paddle_api.h (#24921)
5 years ago
whs 4c01d6d53e
Enhance checking in some operator. (#24473)
5 years ago
Chen Weihang 4a702ef361
Support SelelctedRows allreduce in multi-cards imperative mode (#24690)
5 years ago
Pei Yang 14b8540551
add default ctor for AnalysisConfig python api. test=develop (#24924)
5 years ago
silingtong123 fc4435174b
test=develop, fix the bug of tensorrt package can't compile on windows (#24860)
5 years ago
lilong12 29de0d97a5
add the support to specify device index for device_guard (#24555)
5 years ago
lilong12 6e10022781
add queue_generator_op, dequeue_op, enqueue_op and ut (#24481)
5 years ago
hutuxian b8f17a049d
fix problem in dump and add log (#24891)
5 years ago