Commit Graph

29843 Commits (f89da4ab4532461903221bc37f97e916fdefcb3d)
 

Author SHA1 Message Date
ceci3 cc387159f3
add pad and concat double grad (#29549)
4 years ago
liuyuhui f13c3a9cd7
[Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)
4 years ago
Y_Xuan 76738504ad
Add ROCm platform support code (#29342)
4 years ago
huangxu96 b96dada4f0
add static.amp into setup.pu.in (#29621)
4 years ago
Zhang Ting 1e9127f688
improve dropout grad (#29605)
4 years ago
wangchaochaohu eab44e1f32
refine (#29622)
4 years ago
YUNSHEN XIE d0b789d27f
disable ut test_cumsum_op (#29613)
4 years ago
Jack Zhou 84bae27779
fix wmt14 doc, remove backward, add bidirect direction in rnn api (#29633)
4 years ago
WangXi 613c46bc07
fix gen_nccl_id_op_helper compile failed, test=develop (#29614)
4 years ago
chen zhiyu f5f8809c1a
1. add python version selection 2.add dynamic flags setting. (#29612)
4 years ago
YUNSHEN XIE 2926e74326
New UT should not exceed 15s (#29492)
4 years ago
Chen Weihang f02aece1f0
Add complex dtype op (add) test example (#29603)
4 years ago
AshburnLee efea540ca9
Add tf32 support for A100 tensor core acceleration for cuBLAS (#28732)
4 years ago
lijianshe02 7779768b53
add transpose double grad test=develop (#29600)
4 years ago
wangchaochaohu 1b69e528d3
optimize for long width for elementwise (#29602)
4 years ago
Wilber 78dad78610
fix none-contiguous bug for python api. (#29615)
4 years ago
Zhou Wei 18f9df0da4
fix cache pip error (#29618)
4 years ago
huangxu96 c05170d3d8
add alias for fluid.contrib.mixed_precision (#29562)
4 years ago
ShenLiang fb6697b424
Fix the dowanload bug in the case of multiple machines (#29551)
4 years ago
ShenLiang 1efef8baed
Fix bug of matmul_v2 for broadcast case (#29599)
4 years ago
Ren Wei (任卫) a9082082d0
Simplify the prompt of const_cast check. (#29548)
4 years ago
qingqing01 8d549fc85d
Add clip double grad (#29590)
4 years ago
Tao Luo 81acc3278c
disable test_parallel_executor_profiler in cuda 10.1 (#29581)
4 years ago
wangchaochaohu ac4bae8ee9
elementwise_add_grad Op optimization (#29575)
4 years ago
arlesniak 62d4483649
Added verbose oneDNN lib version (#29378)
4 years ago
lilong12 ff6a145011
update, test=develop (#29559)
4 years ago
huangxu96 2cb6f94888
add float16 into adaptive_avg_pool2d check list. (#29547)
4 years ago
yukavio ee1a7d020c
add some feature for paddle.flops (#29572)
4 years ago
WangXi 467c716963
gen nccl id use socket (#29431)
4 years ago
Bai Yifan d72604cd46
fix unittst unstable issue on ci machine (#29588)
4 years ago
tangwei12 0034273b7e
add service (#29560)
4 years ago
Leo Chen c0163837a5
Fix compile problem when cuda_arch < 6000 (#29576)
4 years ago
QingshuChen 79a41a9ed6
support roi_align & affine_channel for kunlun (#29561)
4 years ago
liym27 0cad1152f4
[Dy2Stat] 1. Fix bug of for-range stmts. 2. Support that step value is negative in for-range stmts (#29519)
4 years ago
Huihuang Zheng 831e9135b9
Fix Windows Unittest (#29543)
4 years ago
Jacek Czaja f6cca62575
[oneDNN] Making ThreadID info in caching key optional (#29272)
4 years ago
GeminiCarrie 08f24a3108
Fix precision problem (#29567)
4 years ago
Wilber 740c0d58c3
update for xpu ci. (#29568)
4 years ago
JZ-LIANG d33d468f02
[Sharding] add hybrid-dp feature (#29518)
4 years ago
Leo Chen 1e72e03217
remove duplicated macro (#29563)
4 years ago
Zhang Ting 6702040e94
improve dropout (#29465)
4 years ago
Zhang Ting 30d9589afe
add cast cuda kernel (#29352)
4 years ago
Chen Weihang c1a26e2a05
fix train eval set error in static mode (#29540)
4 years ago
LoveAn b5d4a1f33d
Add the strategy of skipping cc/cu test compilation and execution in CI (#29499)
4 years ago
Aurelius84 2a42250699
Polish hash function of executor cache key (#29556)
4 years ago
taixiurong 760d015c14
add xpu ops for training transformer in kunlun (#29539)
4 years ago
Leo Chen 0fdd365665
Add fast path for dropout when p == 0 (#29553)
4 years ago
Wojciech Uss 917a11495f
fix ininite scale values (#29386)
4 years ago
Jacek Czaja 83a693ee55
[oneDNN] Added Unit Test for Multiple instances prediction (#29501)
4 years ago
lijianshe02 bd29052e33
fix random seed in nll_loss unitest test=develop (#29538)
4 years ago