813 Commits (c98f144fbc012f26c3fd2482d08d174700b09069)

Author SHA1 Message Date
wanghuancoder 35c5b23f68
use iwyu clean include second time, test=develop (#30829)
4 years ago
liuyuhui 67abfc1588
[Kunlun] fix dead lock for exec_op_count_ (#30718)
4 years ago
liuyuhui e5b0d9e1fc
[Kunlun] Add condition_variable and notify() in BindThreadedSSAGraphExecutor (#30586)
4 years ago
Leo Chen 81217a94d8
unify calling cudaSetDevice (#30470)
4 years ago
liuyuhui 843dc3cdbd
[Kunlun]PR3: add xpu executor, multi xpu card train function optimization (#30317)
4 years ago
pangyoki 13d757362c
Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103)
4 years ago
tangwei12 25f80fd304
Fix/distributed proto (#29981)
4 years ago
liuyuhui 254ad61959
fix xpu pe sync, test=notest (#30095)
4 years ago
WangXi ee16006b5d
Optimization grad merge performance (#29784)
4 years ago
liuyuhui 4427df37cf
[Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574)
5 years ago
QingshuChen 59b47f3b32
feat: support check_nan_inf for kunlun/xpu device (#29694)
5 years ago
tangwei12 032414ca2a
[Feature] one ps (3/4) (#29604)
5 years ago
liuyuhui f13c3a9cd7
[Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)
5 years ago
LoveAn 03b42d9fa7
fix unittest on windows, test=develop (#29365)
5 years ago
Chen Weihang 1de32f823d
Hot fix complle failed in gcc4.8 caused by complex impl (#29254)
5 years ago
chentianyu03 8f45d14263
add complex64 and complex128 type; add +-*/@ and slice opreator for c… (#29199)
5 years ago
WangXi 173c22aec2
optimize fast graph executor (#28962)
5 years ago
Zhang Ting fdc06f2158
add Fuse bn add act pass (#28196)
5 years ago
Leo Chen 1f3be85914
Fix bug of fetch_async_op_handle when fetching the feed variable (#28194)
5 years ago
Leo Chen 35074963e3
Refine error msg in paddle/fluid/framework/details [part 2] (#27429)
5 years ago
wanghuancoder df43905f12
use iwyu clean include (#27267)
5 years ago
Leo Chen aba759ba16
[Feature] Enhance inplace addto strategy for gradient accumulation in static graph (#27112)
5 years ago
Leo Chen bbc84e0fe0
Refine error msg in paddle/fluid/framework/details [part 1] (#25631)
5 years ago
Feiyu Chan c8cc094576
add template specialization for bfloat16 for gcc 4.8 compatability (#26985)
5 years ago
joanna.wozna.intel 95e1434bb2
Add bfloat16 data type (#25402)
5 years ago
wanghuancoder 2d2c31a63a
Add FetchAsyncOpHandle, and use it in FastThreadedExecutor (#26643)
5 years ago
wanghuancoder c1f5df5269
optimized transformation form tensor to numpy (#26447)
5 years ago
tangwei12 3755564ae1
Fix/large scale fix (#25999)
5 years ago
tangwei12 caa90a6510
Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) (#22957)
5 years ago
Chen Weihang 4061aa6488
Polish ParallelExecutor exception process logic (#25449)
5 years ago
hong 70d7d07fea
catch bad alloc exception (#25140)
5 years ago
Chen Weihang d1062d5278
Replace all errors thrown by LOG(FATAL) with PADDLE_THROW (#24759)
5 years ago
Chen Weihang aa0f254fbe
Add macro BOOST_GET to enrich the error information of boost :: get (#24175)
5 years ago
Zeng Jinle acef55df04
fix isolated var fetch bug, test=develop (#24070)
5 years ago
Zhou Wei 7817003795
Optimize the error messages of paddle CUDA API (#23816)
5 years ago
guofei 2b896c1f6b
Support LoDTensorArray in fetch (#23645)
5 years ago
Zeng Jinle c49791362f
Correct reader device index (#23802)
5 years ago
liym27 06d4aa4e73
API (BuildStrategy) error message enhancement. (#23462)
5 years ago
Zhen Wang 84cd45f674
Solve the conflict of ops with the same name, test for CI. (#23573)
5 years ago
mozga-intel 3baaee9aab
Remove: NGraph engine from PDPD repository (#23545)
5 years ago
qingqing01 6162cf2f2e
Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO. (#23426)
5 years ago
Tao Luo 0b583235f5
Revert "Solve the conflict of ops with the same name. (#23199)" (#23494)
5 years ago
Zhen Wang abe3e6906d
Solve the conflict of ops with the same name. (#23199)
5 years ago
Zeng Jinle 29337f4e17
fix conflict of inferne partial feed with gpu parallel ssa graph executor, test=develop (#23400)
5 years ago
Zeng Jinle 3a21980b78
add reader dependency pass, test=develop (#23301)
5 years ago
Zeng Jinle 7ca77a90ac
add Tensor::IsSharedBufferWith method, test=develop (#23175)
5 years ago
Zeng Jinle bae5930ba1
fix graph attr copy issues, test=develop (#23191)
5 years ago
Zeng Jinle acfc9b8a70
Reader sequential and inference partial feed (#22699)
5 years ago
Zeng Jinle d33c4343e1
Imperative tracer refactoring (#22457)
5 years ago
Zhen Wang 89cfa49156
Unmerged fetch list (#22635)
5 years ago