Commit Graph

29479 Commits (1cbb282d7774539a809d32f45bb9b443f56485a7)

Author SHA1 Message Date
wangchaochaohu 1b69e528d3
optimize for long width for elementwise (#29602)
4 years ago
Wilber 78dad78610
fix none-contiguous bug for python api. (#29615)
4 years ago
Zhou Wei 18f9df0da4
fix cache pip error (#29618)
4 years ago
huangxu96 c05170d3d8
add alias for fluid.contrib.mixed_precision (#29562)
4 years ago
ShenLiang fb6697b424
Fix the dowanload bug in the case of multiple machines (#29551)
4 years ago
ShenLiang 1efef8baed
Fix bug of matmul_v2 for broadcast case (#29599)
4 years ago
Ren Wei (任卫) a9082082d0
Simplify the prompt of const_cast check. (#29548)
4 years ago
qingqing01 8d549fc85d
Add clip double grad (#29590)
4 years ago
Tao Luo 81acc3278c
disable test_parallel_executor_profiler in cuda 10.1 (#29581)
4 years ago
wangchaochaohu ac4bae8ee9
elementwise_add_grad Op optimization (#29575)
4 years ago
arlesniak 62d4483649
Added verbose oneDNN lib version (#29378)
4 years ago
lilong12 ff6a145011
update, test=develop (#29559)
4 years ago
huangxu96 2cb6f94888
add float16 into adaptive_avg_pool2d check list. (#29547)
4 years ago
yukavio ee1a7d020c
add some feature for paddle.flops (#29572)
4 years ago
WangXi 467c716963
gen nccl id use socket (#29431)
4 years ago
Bai Yifan d72604cd46
fix unittst unstable issue on ci machine (#29588)
4 years ago
tangwei12 0034273b7e
add service (#29560)
4 years ago
Leo Chen c0163837a5
Fix compile problem when cuda_arch < 6000 (#29576)
4 years ago
QingshuChen 79a41a9ed6
support roi_align & affine_channel for kunlun (#29561)
4 years ago
liym27 0cad1152f4
[Dy2Stat] 1. Fix bug of for-range stmts. 2. Support that step value is negative in for-range stmts (#29519)
4 years ago
Huihuang Zheng 831e9135b9
Fix Windows Unittest (#29543)
4 years ago
Jacek Czaja f6cca62575
[oneDNN] Making ThreadID info in caching key optional (#29272)
4 years ago
GeminiCarrie 08f24a3108
Fix precision problem (#29567)
4 years ago
Wilber 740c0d58c3
update for xpu ci. (#29568)
4 years ago
JZ-LIANG d33d468f02
[Sharding] add hybrid-dp feature (#29518)
4 years ago
Leo Chen 1e72e03217
remove duplicated macro (#29563)
4 years ago
Zhang Ting 6702040e94
improve dropout (#29465)
4 years ago
Zhang Ting 30d9589afe
add cast cuda kernel (#29352)
4 years ago
Chen Weihang c1a26e2a05
fix train eval set error in static mode (#29540)
4 years ago
LoveAn b5d4a1f33d
Add the strategy of skipping cc/cu test compilation and execution in CI (#29499)
4 years ago
Aurelius84 2a42250699
Polish hash function of executor cache key (#29556)
4 years ago
taixiurong 760d015c14
add xpu ops for training transformer in kunlun (#29539)
4 years ago
Leo Chen 0fdd365665
Add fast path for dropout when p == 0 (#29553)
4 years ago
Wojciech Uss 917a11495f
fix ininite scale values (#29386)
4 years ago
Jacek Czaja 83a693ee55
[oneDNN] Added Unit Test for Multiple instances prediction (#29501)
4 years ago
lijianshe02 bd29052e33
fix random seed in nll_loss unitest test=develop (#29538)
4 years ago
joanna.wozna.intel 0ce6d7fa77
Fix bf16 activations test for softmax and gelu (#29502)
4 years ago
Zhong Hui 60bfd308ab
fix p_norm with empty shape (#29500)
4 years ago
Zhou Wei b9e926b8e5
change the code format (#29550)
4 years ago
Leo Chen 9f926eb720
Layernorm opt (#29522)
4 years ago
huangxu96 4001979309
Add ReserveSpace in dygraph batch_norm. (#29221)
4 years ago
arlesniak b781953ef5
[oneDNN] Fix flags use test for #29080, assert condition more general (#29493)
4 years ago
tangwei12 ae3f7a7100
add ps table (#29463)
4 years ago
chalsliu 36ec9456cf
Make PADDLE_ROOT as an environment variable
4 years ago
ShenLiang d8391a1983
fix error message of gather nd (#29521)
4 years ago
chalsliu 98edef3c45
Optimize accurate testing
4 years ago
Zhen Wang 5ac71b36fb
Remove tensor copy in the update_loss_scaling op. (#29426)
4 years ago
Wilber 5fe1f8aff7
update lite tag (#29517)
4 years ago
Zhou Wei e74e1a226c
support deepcopy for Layer/Tensor/Paramerbase (#29387)
4 years ago
chalsliu 701c8e06a0
Support precision test for cuda new ut
4 years ago