Commit Graph

3160 Commits (c98f144fbc012f26c3fd2482d08d174700b09069)

Author | SHA1 | Message | Date
--- | --- | --- | ---
wanghuancoder | aab3a3012e | add include for heterbox_trainer.cc, develop=test (#30910) | 4 years ago
Adam Osewski | 092a2b1413 | More UT for LayerNormFuse pass (#30891) | 4 years ago
wanghuancoder | 35c5b23f68 | use iwyu clean include second time, test=develop (#30829) | 4 years ago
Adam Osewski | 4f066e316e | Layer normalization fuse pass. (#30721) | 4 years ago
Thunderbrook | cb66c53c2d | dump to cpu (#30750) | 4 years ago
WangXi | 31ed9c9eed | Fleet distributed strategy support pure fp16 (#30754) | 4 years ago
alncat | 5b59499e57 | fixed compilation error on gcc 4.8.x due to the usage of isfinite (#30733) | 4 years ago
liuyuhui | 67abfc1588 | [Kunlun] fix dead lock for exec_op_count_ (#30718) | 4 years ago
alncat | 5ace20fc3f | modified conv+bn fuse pass to fix wrong mask in mask rcnn (#30704) | 4 years ago
lilong12 | 7fbc68a2c0 | update, test=develop (#30692) | 4 years ago
arlesniak | 5bf25d1e8b | More precise mkldnn kernel rules in GetExpectedKernelType (#29840) | 4 years ago
Jacek Czaja | 173660be7b | [oneDNN] Cache oneDNN stream not to recreate in each oneDNN op (#30358) | 4 years ago
Thunderbrook | 1bebc09253 | solve build gpu task core (#30626) | 4 years ago
liuyuhui | e5b0d9e1fc | [Kunlun] Add condition_variable and notify() in BindThreadedSSAGraphExecutor (#30586) | 4 years ago
liym27 | ff25c5b36f | Fix bug: GetAttrValue should deal with attr with attrType vector<double> (#30536) | 4 years ago
Leo Chen | 81217a94d8 | unify calling cudaSetDevice (#30470) | 4 years ago
hutuxian | 40ede12631 | Ascend Framework Part1: OP & Wrapper (#30281) | 4 years ago
liuyuhui | 843dc3cdbd | [Kunlun]PR3: add xpu executor, multi xpu card train function optimization (#30317) | 4 years ago
Adam Osewski | c5ffad126c | [oneDNN] Refactor fuse pass helper functions to one place. (#30460) | 4 years ago
pangyoki | 13d757362c | Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103) | 4 years ago
yaoxuefeng | 6e0da01c61 | Heter ps new (#30198) | 4 years ago
cc | 8e3a294045 | skip quantizing ops in cpu inference (#30342) | 4 years ago
alncat | 7bbf3ac5ab | Added support for inference using quantization aware trained dygraph (#30288) | 4 years ago
Zhang Jun | 10a8f3e5c3 | fix bug on compiling inference shared lib with crypto;test=develop (#30269) | 4 years ago
JZ-LIANG | 75936d838f | Recompute Offload (#30233) | 4 years ago
tangwei12 | 5e839e4da5 | add sparse embedding & load vars for 2.0 & gloo bug fix (#30306) | 4 years ago
tangwei12 | 25f80fd304 | Fix/distributed proto (#29981) | 4 years ago
liym27 | b4989fb744 | Support vector<double> as type of op attribute and op set_value suppport vector<double> as value (#30126) | 4 years ago
石晓伟 | 8ce2482b80 | fix header file paths of gflags, commit 1, test=develop (#30271) | 4 years ago
wangchaochaohu | af80859dd6 | reduce the occupied size of memory for the fused pattern of elementwise_add Op and activation Op(relu Op for example) (#29885) | 4 years ago
Zhen Wang | 7f7dfccf20 | Support pure fp16 training for AMP API. (#29544) | 4 years ago
Leo Chen | 789743e190 | use cuda generator in bernoulli cuda kernel (#30199) | 4 years ago
Leo Chen | 1f97d61c68 | Add callback after TensorCopy (#30123) | 4 years ago
Chengmo | 528e03fc08 | 【Paddle.Fleet】Fix tensor table (#30075) | 4 years ago
Huihuang Zheng | 54bf3f5a56 | Refine PADDLE_ENFORCE Error Messages. test=develop (#30149) | 4 years ago
Chen Weihang | d0fb06b27f | [Complex] Simplify prepared op impl to improve performance (#30153) | 4 years ago
liuyuhui | 15fac5e7fa | fix assign_op_xpu concat_op_xpu warining (#30120) | 4 years ago
石晓伟 | 53bb126510 | fix a bug in op_version_registry, test=develop, test=op_version (#29994) | 4 years ago
liuyuhui | 254ad61959 | fix xpu pe sync, test=notest (#30095) | 4 years ago
Thunderbrook | 0b8e1fadc5 | add topo-aware in heter-ps (#30087) | 4 years ago
WangXi | ee16006b5d | Optimization grad merge performance (#29784) | 4 years ago
Shang Zhizhou | 08dc5bc27e | fix op version checker of pass bug (#30028) | 4 years ago
cc | c3c064a8fc | Add mkldnn nearest_interp and bilinear_interp op (#30016) | 5 years ago
wawltor | cc2f94620c | add the support the op version check for matmul, test=op_version (#30011) | 5 years ago
wawltor | b33aaea86c | add the op version check for the elementwise ops, test=op_version (#30010) | 5 years ago
Leo Chen | 47d10c55d5 | Enhance debugging (#30001) | 5 years ago
wawltor | 8f49f9d5c9 | change the elementwise ops version check, test=op_version | 5 years ago
Thunderbrook | 0ca6de171f | add include (#29952) | 5 years ago
cc | 6a0102b038 | map matmul/squeeze2+matmul/reshape2+matmul to mul (#29911) | 5 years ago
Jack Zhou | 5a4e42ca9a | add gru op_register_version; test=op_version; (#29931) | 5 years ago