Commit Graph

843 Commits (e7ac74c85bbc0a1a023a90b9516114c1f458a2d1)

Author SHA1 Message Date
ShenLiang 01e2874a0e
Support multi-stream communication for dynamic graph distributed (#29525)
5 years ago
liuyuhui f13c3a9cd7
[Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)
5 years ago
Y_Xuan 76738504ad
Add ROCm platform support code (#29342)
5 years ago
AshburnLee efea540ca9
Add tf32 support for A100 tensor core acceleration for cuBLAS (#28732)
5 years ago
Wilber 78dad78610
fix non-contiguous bug for python api. (#29615)
5 years ago
Zhou Wei e74e1a226c
support deepcopy for Layer/Tensor/Paramerbase (#29387)
5 years ago
ShenLiang 2ef9e0e23c
Rebuild group automatically in dynamic graph distributed (#29255)
5 years ago
yongqiangma 7c508d8668
update unbind norm add CUDAPlace api doc information (#29322)
5 years ago
liym27 b10ecd9d3a
[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows (#29267)
5 years ago
Chen Weihang 9ad800ebb2
Support type promote for basic math ops (quantum required) (#29265)
5 years ago
Zhen Wang be3777a50a
Add pure fp16 training with master weights. (#27712)
5 years ago
chentianyu03 8f45d14263
add complex64 and complex128 type; add +-*/@ and slice operator for c… (#29199)
5 years ago
Zhou Wei c0a991c874
accumulate gradient for leaf tensor with previous graph and expose leaf tensor concept (#28429)
5 years ago
liym27 865a45984f
Check whether there is any inplace operation affecting gradient calculation. (#27901)
5 years ago
ShenLiang e2d01eb650
Support dynamic graph distributed (#28997)
5 years ago
Leo Chen 770395cb93
Split train_mode and has_grad for tracer (#29064)
5 years ago
Zhou Wei 8ca0a8a859
fix tensor detach to zero copy (#27921)
5 years ago
Chen Weihang 768dab441e
polish two api doc detail, test=document_fix (#28971)
5 years ago
gongweibao 1dad8ceaab
Fix gpu memory allocation bug. (#28703)
5 years ago
Zhou Wei 3b0dd5f620
fix bug that to_tensor not support paddle.Place (#28717)
5 years ago
Leo Chen 3d09929b1f
Add check for non-dispensable input (#28666)
5 years ago
Zhou Wei bf6e7cba7a
update 2.0 API english doc (#28525)
5 years ago
Wilber 1bf4836580
[Inference] Add TryShrinkMemory interface. (#28409)
5 years ago
石晓伟 c41fd033e5
check op_version_registry in CI test, test=develop (#28402)
5 years ago
Leo Chen 8b2436a776
Add broadcast_shape api (#28257)
5 years ago
石晓伟 21a63f6f90
enhance the op_version_registry, test=develop (#28347)
5 years ago
Shang Zhizhou ea851796e5
Optimize ERNIE model inference performance in TensorRT; support variable-length input (#28367)
5 years ago
Wilber 6f0f45f69c
copy_to_cpu support uint8 (#28372)
5 years ago
wangguanzhong 5262b02585
add generate_proposals_v2 op (#28214)
5 years ago
石晓伟 d9b5f1261c
update the version of pybind, test=develop (#28284)
5 years ago
wangguanzhong 1c385e26f9
add op_function_generator for box_coder (#28303)
5 years ago
Guanghua Yu e8f2614da5
Enhance multiclass_nms op to support LoD for dygraph mode (#28276)
5 years ago
wangxinxin08 41d26a8287
update matrix nms op to api 2.0 (#28265)
5 years ago
Zhang Ting fdc06f2158
add Fuse bn add act pass (#28196)
5 years ago
Chen Weihang 813b2ade34
Enrich the python error types of paddle & polish format (#28124)
5 years ago
Zhou Wei fb7f85291b
fix print tensor place,add cpu/cuda/pin_memory API for Tensor (#28200)
5 years ago
Wilber f935ca8a50
[lite-xpu-subgraph] Fix xpu compile and test xpu ci. (#27932)
5 years ago
chentianyu03 05fd49e974
change paddle.fluid.layers.reduce_sum to paddle.sum in sample codes (#27998)
5 years ago
tangwei12 202bfab1be
Feature/large scale kv save base/delta (#27470)
5 years ago
Zhou Wei bf412f4665
add tensor clone (#27953)
5 years ago
guofei 6bbb6e7f45
Implement the function of OutScaleForTraining/OutScaleForInference in dygraph (#26601)
5 years ago
chentianyu03 d05058d268
Remove and reorganize the alias of APIs (#27717)
5 years ago
Leo Chen 9a2a4b5f65
Support setting xpu place in dygraph mode (#27909)
5 years ago
Leo Chen 049696bf67
Refine the format of printing tensor (#27673)
5 years ago
joanna.wozna.intel ddcd1b5381
Add bfloat16 resnet50 test (#27755)
5 years ago
Wilber 9005c5a260
Lite subgraph support arm cpu. (#27827)
5 years ago
yongqiangma e8a5aefbbd
update CUDAPlace doc. test=document_fix (#27711)
5 years ago
zhupengyang 659d04df2c
hsigmoid -> hsigmoid_loss/HSigmoidLoss; refine docs (#27745)
5 years ago
石晓伟 0d27591642
save operator version information to program desc, test=develop (#27668)
5 years ago
joanna.wozna.intel 0cd4907eba
Add avx512 core instructions check (#27732)
5 years ago