Commit Graph

29635 Commits (4763e6bc4e59b78ac52d02e3b4f4b6fe80a2a91e)
 

Author SHA1 Message Date
WangXi 613c46bc07
fix gen_nccl_id_op_helper compile failed, test=develop (#29614)
5 years ago
chen zhiyu f5f8809c1a
1. add python version selection 2.add dynamic flags setting. (#29612)
5 years ago
YUNSHEN XIE 2926e74326
New UT should not exceed 15s (#29492)
5 years ago
Chen Weihang f02aece1f0
Add complex dtype op (add) test example (#29603)
5 years ago
AshburnLee efea540ca9
Add tf32 support for A100 tensor core acceleration for cuBLAS (#28732)
5 years ago
lijianshe02 7779768b53
add transpose double grad test=develop (#29600)
5 years ago
wangchaochaohu 1b69e528d3
optimize for long width for elementwise (#29602)
5 years ago
Wilber 78dad78610
fix non-contiguous bug for python api. (#29615)
5 years ago
Zhou Wei 18f9df0da4
fix cache pip error (#29618)
5 years ago
huangxu96 c05170d3d8
add alias for fluid.contrib.mixed_precision (#29562)
5 years ago
ShenLiang fb6697b424
Fix the download bug in the case of multiple machines (#29551)
5 years ago
ShenLiang 1efef8baed
Fix bug of matmul_v2 for broadcast case (#29599)
5 years ago
Ren Wei (任卫) a9082082d0
Simplify the prompt of const_cast check. (#29548)
5 years ago
qingqing01 8d549fc85d
Add clip double grad (#29590)
5 years ago
Tao Luo 81acc3278c
disable test_parallel_executor_profiler in cuda 10.1 (#29581)
5 years ago
wangchaochaohu ac4bae8ee9
elementwise_add_grad Op optimization (#29575)
5 years ago
arlesniak 62d4483649
Added verbose oneDNN lib version (#29378)
5 years ago
lilong12 ff6a145011
update, test=develop (#29559)
5 years ago
huangxu96 2cb6f94888
add float16 into adaptive_avg_pool2d check list. (#29547)
5 years ago
yukavio ee1a7d020c
add some feature for paddle.flops (#29572)
5 years ago
WangXi 467c716963
gen nccl id use socket (#29431)
5 years ago
Bai Yifan d72604cd46
fix unittest unstable issue on ci machine (#29588)
5 years ago
tangwei12 0034273b7e
add service (#29560)
5 years ago
Leo Chen c0163837a5
Fix compile problem when cuda_arch < 6000 (#29576)
5 years ago
QingshuChen 79a41a9ed6
support roi_align & affine_channel for kunlun (#29561)
5 years ago
liym27 0cad1152f4
[Dy2Stat] 1. Fix bug of for-range stmts. 2. Support that step value is negative in for-range stmts (#29519)
5 years ago
Huihuang Zheng 831e9135b9
Fix Windows Unittest (#29543)
5 years ago
Jacek Czaja f6cca62575
[oneDNN] Making ThreadID info in caching key optional (#29272)
5 years ago
GeminiCarrie 08f24a3108
Fix precision problem (#29567)
5 years ago
Wilber 740c0d58c3
update for xpu ci. (#29568)
5 years ago
JZ-LIANG d33d468f02
[Sharding] add hybrid-dp feature (#29518)
5 years ago
Leo Chen 1e72e03217
remove duplicated macro (#29563)
5 years ago
Zhang Ting 6702040e94
improve dropout (#29465)
5 years ago
Zhang Ting 30d9589afe
add cast cuda kernel (#29352)
5 years ago
Chen Weihang c1a26e2a05
fix train eval set error in static mode (#29540)
5 years ago
LoveAn b5d4a1f33d
Add the strategy of skipping cc/cu test compilation and execution in CI (#29499)
5 years ago
Aurelius84 2a42250699
Polish hash function of executor cache key (#29556)
5 years ago
taixiurong 760d015c14
add xpu ops for training transformer in kunlun (#29539)
5 years ago
Leo Chen 0fdd365665
Add fast path for dropout when p == 0 (#29553)
5 years ago
Wojciech Uss 917a11495f
fix infinite scale values (#29386)
5 years ago
Jacek Czaja 83a693ee55
[oneDNN] Added Unit Test for Multiple instances prediction (#29501)
5 years ago
lijianshe02 bd29052e33
fix random seed in nll_loss unittest test=develop (#29538)
5 years ago
joanna.wozna.intel 0ce6d7fa77
Fix bf16 activations test for softmax and gelu (#29502)
5 years ago
Zhong Hui 60bfd308ab
fix p_norm with empty shape (#29500)
5 years ago
Zhou Wei b9e926b8e5
change the code format (#29550)
5 years ago
Leo Chen 9f926eb720
Layernorm opt (#29522)
5 years ago
huangxu96 4001979309
Add ReserveSpace in dygraph batch_norm. (#29221)
5 years ago
arlesniak b781953ef5
[oneDNN] Fix flags use test for #29080, assert condition more general (#29493)
5 years ago
tangwei12 ae3f7a7100
add ps table (#29463)
5 years ago
chalsliu 36ec9456cf
Make PADDLE_ROOT as an environment variable
5 years ago