452 Commits (develop)

Author SHA1 Message Date
Zhou Wei 04a49b097e
[Custom OP]Remove old custom OP and reduce whl package volume (#31813)
5 years ago
tianshuo78520a e804f08559
delete include framework.pb.h (#31859)
5 years ago
liuyuhui 9ebf05b003
[Kunlun]Multi xpu dygraph performance optimization , add distributed.spawn support for multi xpu and some bug-fixes (#31130)
5 years ago
Qi Li 4d647ec137
[ROCM] update fluid platform for rocm (part5), test=develop (#31315)
5 years ago
Qi Li 28b356b9a2
[ROCM] update fluid framework for rocm (part6), test=develop (#31015)
5 years ago
Chen Weihang f649442ddd
New custom operator extension mechanism (#30690)
5 years ago
joanna.wozna.intel 73cdea01d4
Add bf16 fast performance verification (#30551)
5 years ago
wanghuancoder 90773473a0
use nvtx push pop in timeline (#30567)
5 years ago
wanghuancoder 59ad6ff3e3
delete empty line of pybing.cc, test=develop (#30529)
5 years ago
hutuxian e207fe6385
Ascend Framework Part2: pybind files (#30410)
5 years ago
wanghuancoder bd97192274
if pybind.cc changed, generate total report, test=develop (#30514)
5 years ago
tangwei12 25f80fd304
Fix/distributed proto (#29981)
5 years ago
AshburnLee 924aac2216
Add tf32 switch for cuDNN (#29192)
5 years ago
liuyuhui 4427df37cf
[Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574)
5 years ago
tangwei12 032414ca2a
[Feature] one ps (3/4) (#29604)
5 years ago
Thunderbrook 09b6e71928
heter box (#29734)
5 years ago
liuyuhui f13c3a9cd7
[Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)
5 years ago
AshburnLee efea540ca9
Add tf32 support for A100 tensor core acceleration for cuBLAS (#28732)
5 years ago
yongqiangma 7c508d8668
update unbind norm add CUDAPlace api doc information (#29322)
5 years ago
Chen Weihang 9ad800ebb2
Support type promote for basic math ops (quantum required) (#29265)
5 years ago
Chen Weihang 768dab441e
polish two api doc detail, test=document_fix (#28971)
5 years ago
gongweibao 1dad8ceaab
Fix gpu memory allocation bug. (#28703)
5 years ago
Leo Chen 8b2436a776
Add broadcast_shape api (#28257)
5 years ago
Zhang Ting fdc06f2158
add Fuse bn add act pass (#28196)
6 years ago
Zhou Wei fb7f85291b
fix print tensor place,add cpu/cuda/pin_memory API for Tensor (#28200)
6 years ago
Leo Chen 9a2a4b5f65
Support setting xpu place in dygraph mode (#27909)
6 years ago
Leo Chen 049696bf67
Refine the format of printing tensor (#27673)
6 years ago
yongqiangma e8a5aefbbd
update CUDAPlace doc. test=document_fix (#27711)
6 years ago
石晓伟 0d27591642
save operator version infomation to program desc, test=develop (#27668)
6 years ago
joanna.wozna.intel 0cd4907eba
Add avx512 core instructions check (#27732)
6 years ago
Zhang Ting d2369dd91f
modify docs of CPUPlace and CUDAPinnedPlace, test=document_fix (#27587)
6 years ago
Chen Weihang b14ecb8632
Polish api BuildStrategy/ExecutionStrategy doc & code example (#27662)
6 years ago
wanghuancoder c68a0313a5
add paddle.fluid._cuda_synchronize (#27595)
6 years ago
Leo Chen aba759ba16
[Feature] Enhance inplace addto strategy for gradient accumulation in static graph (#27112)
6 years ago
Wilber f827665ae6
[Pass Compatible] Bind python compatible. (#27262)
6 years ago
lilong12 1c68138327
[api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis (#26552)
6 years ago
QingshuChen 138ecf24aa
support Baidu Kunlun AI Accelerator (#25959)
6 years ago
yaoxuefeng 23261ff44b
add cpu random Generator (#26013)
6 years ago
wangchaochaohu 0b81d76310
[API2.0] add op for cudnn version query test=develop (#26180)
6 years ago
wangchaochaohu bb11cbc250
[API2.0] add Device api (set_device and get_device)(#26103)
6 years ago
Zhou Wei 6de463d3d1
expose and unify the Tensor concepts to the user (#25978)
6 years ago
Chen Weihang 838e36e9ed
Fix loaded variable suffix repeat error (#26169)
6 years ago
Thunderbrook 0cb60c700d
add heter ps mode (#25682)
6 years ago
tangwei12 caa90a6510
Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) (#22957)
6 years ago
gongweibao 80f1c50738
Fix typo in interface. (#24779)
6 years ago
hutuxian 5822862d8a
Monitor Framework (#24079)
6 years ago
Leo Chen 6190023ac9
Refine error message in pybind folder (#24886)
6 years ago
Yanghello aa47356b74
Add crypto python (#24836)
6 years ago
Chen Weihang aa0f254fbe
Add macro BOOST_GET to enrich the error information of boost :: get (#24175)
6 years ago
Zhang Ting ab8f8fa70d
fix example code, test=develop, test=document_fix (#24139)
6 years ago