Leo Chen
224f3bcbb1
format code ( #29714 )
5 years ago
石晓伟
8bd2879ef7
update the operator registration for incompatible upgrade, test=develop ( #29720 )
5 years ago
WangXi
9cbcc6cadc
fleet sync build strategy, test=develop ( #29732 )
5 years ago
Chen Weihang
6cfa59de1b
[Complex] Add real & imag op and api for complex tensor ( #29672 )
...
* add complex real op & api & unittest
* add imag op & api & unittest
* refactor op impl
* revert simplified writing due to compile failure
* polish details
* polish grad op code
5 years ago
liuyuhui
f13c3a9cd7
[Kunlun] PR1:Support one Kunlun card training in parallel executor ( #29337 )
5 years ago
lilong12
ff6a145011
update, test=develop ( #29559 )
5 years ago
Jacek Czaja
f6cca62575
[oneDNN] Making ThreadID info in caching key optional ( #29272 )
5 years ago
JZ-LIANG
d33d468f02
[Sharding] add hybrid-dp feature ( #29518 )
...
* Sharding add hybrid-dp feature
* update sharding in distributed_strategy
* update sharding unittest
* revise code format for sharding
5 years ago
LoveAn
b5d4a1f33d
Add the strategy of skipping cc/cu test compilation and execution in CI ( #29499 )
...
* Add the strategy of skipping cc/cu test compilation and execution in CI, test=develop
* fix if error with CI_SKIP_TEST, test=develop
* fix add properties to test error on Linux/MAC, test=develop
* fix set test properties of test_code_generator error, test=develop
* remove test codes and advance judgment of file modification on Linux, test=develop
* rename CI_SKIP_TEST to CI_SKIP_CPP_TEST, test=document_fix
* Add branch judgement on Linux, test=develop
5 years ago
Aurelius84
2a42250699
Polish hash function of executor cache key ( #29556 )
...
* Add more value to calculate hash key
* fix size_t
* polish code
5 years ago
jakpiase
57a4f16d9e
added internal and external reorders to profiler ( #29443 )
...
* added external reorder to profiler
* added external and internal reorders to profiler
* added internal and external reorder to profiler
* added formatting to int/ext reorder commit
* removed unnecessary comment
5 years ago
LoveAn
03b42d9fa7
fix unittest on windows, test=develop ( #29365 )
5 years ago
cc
a623ce044f
Use different name_scope for different conv type, test=develop ( #29355 )
5 years ago
liym27
b10ecd9d3a
[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows ( #29267 )
5 years ago
Chen Weihang
9ad800ebb2
Support type promote for basic math ops (quantum required) ( #29265 )
...
* basic impl of type promote
* add comment & another testcase
* fix complex bugs & support python op promote type
* fix failed unittests & polish code
* add unittest for coverage
* change to only promote complex type
* polish code details
* polish several comments
5 years ago
Aurelius84
67c700b479
[Dy2Stat] Add cache for Executor and Context in run_program_op ( #28421 )
5 years ago
Chen Weihang
1de32f823d
Hot fix compile failure in gcc4.8 caused by complex impl ( #29254 )
...
* hot fix compile failure in gcc4.8
* fix failed unittest
5 years ago
GeminiCarrie
642abe2a48
Fix a bug when running on an operating system without "bash." ( #29131 )
...
* Fix a bug when running on an operating system without "bash."
* add execution condition
* for ci-coverage
5 years ago
ShenLiang
46b73e6cd9
Change the api of DataParallel and Fleet ( #29224 )
5 years ago
chentianyu03
8f45d14263
add complex64 and complex128 type; add +-*/@ and slice operator for c… ( #29199 )
...
* add complex64 and complex128 type; add +-*/@ and slice operator for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
5 years ago
liym27
865a45984f
Check whether there is any inplace operation affecting gradient calculation. ( #27901 )
...
* Add a class TensorInplaceVersion to count the inplace version and put it in framework::Tensor instead of Allocation or Variable.
* Add a new attribute `_inplace_version` for VarBase.
* Raise exception if an inplace operation can result in incorrect gradient computation.
* Add a new interface _bump_inplace_version() for VarBase to bump the version whenever the Tensor is modified through an inplace operation.
* For api assign, call _bump_inplace_version() when it's an inplace operation in dynamic mode.
* Use original var_wrapper if the inplace_version is not changed.
* Replace SnapshotVarWrapperList with SnapshotVarWrapper to optimize performance.
5 years ago
Chen Weihang
0b032faeee
Polish unittests details and execution conditions to adapt to MUSL ( #29044 )
...
* fix failed tests in the list given by yingchun
* add unittests into static_mode_white_list
* add enable static
* fix dist unittest
* skip test_sigmoid_focal_loss_op & add gym
* revert no need skip unittests
* remove gym
5 years ago
Wojciech Uss
4fd4095d1b
Add quantization of multi_gru op and tests ( #28615 )
5 years ago
yaoxuefeng
545df287fc
add user_define_dump ( #28596 )
5 years ago
arlesniak
bc902044a4
Fixes mkldnn dygraph learning rate scheduler crashes ( #28988 )
5 years ago
WangXi
173c22aec2
optimize fast graph executor ( #28962 )
5 years ago
Shibo Tao
db41258501
add API serialize_program, serialize_persistables, save_to_file, deserialize_program, deserialize_persistables, load_from_file. ( #29034 )
5 years ago
joanna.wozna.intel
b0d1ac161e
Add bf16 pool2d and unify bf16 unit tests ( #29039 )
...
* Add bf16 pool2d and unify bf16 unit tests
* Add change default ops test
5 years ago
joanna.wozna.intel
fddea67445
Fix cpu_bfloat16_pass ( #28730 )
...
* Fix cpu_bfloat16_pass
* Add output_format
* Fix incorrect SetOutput
* Change formatting
5 years ago
Chen Weihang
fea0e294ee
Hide the C++ stack by default and add hints ( #29042 )
...
* do not show cpp stack by default & add hint
* fix failed unittest
* fix failed unittests
5 years ago
Wojciech Uss
7b5a8e46de
Add multi_gru_fuse_pass and tests ( #28601 )
...
* Add multi_gru_fuse_pass and tests
* fix date
* cleaned up headers
5 years ago
Wojciech Uss
991345b368
Add multi_gru_seq_fuse_pass and tests ( #28604 )
...
* Add multi_gru_seq_fuse_pass and tests
* fix date
* removed unused functions
5 years ago
lilong12
f77a78cdee
enable pipeline to run with Executor.run() ( #28373 )
...
* update, test=develop
5 years ago
Thunderbrook
0073f9bdb0
support ps-gpu ( #28752 )
...
* ps gpu transpile
* ps gpu
* remove op
* gps trainer
* local ps
* add macro
* HeterBox
* def cuda
* tab
* code style
* style
Co-authored-by: Thunderbrook <a754913769#163.com>
5 years ago
Jacek Czaja
bd1d6d3b30
extends oneDNN caching keys so caching objects are unique to executor/predictor ( #28758 )
5 years ago
gongweibao
1dad8ceaab
Fix gpu memory allocation bug. ( #28703 )
5 years ago
joanna.wozna.intel
8c0ea4bffe
Add bf16 matmul, fc, elementwise add and mul ( #28729 )
...
* Add bf16 matmul, fc, elementwise add and mul
* Correct unit test
5 years ago
Wojciech Uss
efc3b182f0
a fix for the fc_lstm_fuse_pass ( #28709 )
5 years ago
wanghuancoder
5aec7dbeb0
use forward declarations for framework.pb.h ( #28494 )
...
* use forward declarations for framework.pb.h, test=develop
* use forward declarations for framework.pb.h, test=develop
5 years ago
Jacek Czaja
6d8d3d4c22
[oneDNN] Layer norm bf16 kernel ( #28619 )
5 years ago
joanna.wozna.intel
2cb71c0cde
Add checkpoint to quantize ( #28612 )
...
* Add checkpoint to quantize
* Change bfloat16 option
5 years ago
lidanqing
804271cff9
Op version python mkldnn_inplace test ( #28354 )
...
* add mkldnn inplace op version test
* update mkldnn_inplace fuse pass
* update the inplace test
5 years ago
Leo Chen
90805e2df7
Register op_version for new attribute use_addto ( #28463 )
...
* register op_version for addto
* upgrade pass capability
* change eq to le
* change eq to le
* fix merge
5 years ago
Shang Zhizhou
8699f38d08
TRT support for pruned transformer models; fix the bug that TensorRT does not support DeletePass ( #28517 )
...
* skip_layernorm_op done
* add unittest
* slice op convertor support trt < 6
* skip_layernorm only work in ernie
5 years ago
lidanqing
0fc181dbd0
[Fix bug] If the pass name is not found, IsCompatible should return false ( #28475 )
5 years ago
wangchaochaohu
d7cfee9b31
Checkout point add ( #28488 )
...
* upgrade pass capability
5 years ago
Pei Yang
75196cda40
Paddle-TRT int8 support mul op channelwise quant ( #28422 )
...
* paddle-trt support mul channelwise quant
* add support for depthwise_conv2d
* add errmsg for unsupported op type
5 years ago
YUNSHEN XIE
369605be1d
fix cmake error when execute build_inference_lib ( #28503 )
5 years ago
YUNSHEN XIE
1e698c600e
fix cmake error when setting ut timeout property ( #28492 )
5 years ago
YUNSHEN XIE
ba0756325a
exec ut no more than 15s 1 ( #28439 )
...
* disable ut test_parallel_executor_fetch_isolated_var,test=document_fix
* test for limiting ut exec time as 15S
* fix an error caused by cannot find ut
* fix some error
* can not find test_transformer
* fix error caused by ut not run in windows
* fix error caused by Compiler Options
* fix error caused by setting timeout value as 15 in python/paddle/tests/CMakeLists.txt
* setting timeout value to 120s for old ut
* add the timeout value setting
* fix error caused by ut only run in coverage_ci
* add analyzer_transformer_profile_tester
* fix some error
* fix some error
* fix error with inference option
* fix error with inference option setting as ON_INFER
* add some ut to set timeout
* modified some option
* fix error
* fix some timeout error
* fix error
* fix error
* fix timeout for test_analyzer_bfloat16_resnet50
* fix error
* setting timeout property for some ut
* first pr for new ut timeout as 15S
5 years ago
joanna.wozna.intel
7821759d48
Add bfloat16 softmax and gelu ( #28394 )
...
* Add bfloat16 softmax and gelu
* Add pass attr bfloat16_enabled_op_types
* Changes from review
5 years ago
石晓伟
c41fd033e5
check op_version_registry in CI test, test=develop ( #28402 )
5 years ago
Jacek Czaja
ca41541472
[oneDNN]Sum bf16 kernel ( #28382 )
...
* - Added sum bf16 oneDNN
test=develop
* - Fix to UT of sum bf16
test=develop
5 years ago
lidanqing
12b9587be5
Add conv_bias pass version python test ( #28278 )
...
* add conv_bias pass version test
* update according to reviews
5 years ago
石晓伟
21a63f6f90
enhance the op_version_registry, test=develop ( #28347 )
...
* enhance the op_version_registry, test=develop
* add unittests, test=develop
* enhance the op_version_registry, test=develop
* fix bugs, test=develop
* revert pybind_boost_headers.h, test=develop
* fix a attribute bug, test=develop
5 years ago
joanna.wozna.intel
571a63e7ec
Add bf16 transpose2, reshape2, concat ops ( #28195 )
5 years ago
Zhang Ting
fdc06f2158
add Fuse bn add act pass ( #28196 )
...
* add fuse_bn_add_act pass
5 years ago
Chen Weihang
813b2ade34
Enrich the python error types of paddle & polish format ( #28124 )
...
* add multiple exception type
* define all exception & polish compile pystack
* mapping paddle error to python exception
* polish static mode error format
* fix failed unittests
* fix dytostatic test_error
* fix check_nan_inf failed
* add unittest for coverage
* revert some code try to solve compile error
* refactor enforce & error change
* polish code & add unittest
5 years ago
Adam Osewski
7db747d9e8
oneDNN BatchNorm + Act fusion pass. ( #27912 )
5 years ago
mapingshuo
81244fbfab
add sharding strategy in fleet( #27900 )
...
* add sharding
5 years ago
Chen Weihang
2babd6ff67
Add compile limit for PADDLE_ENFORCE without error message ( #28221 )
...
* add compile limit for paddle enforce
* polish elementwise_op_function.cu.h
* fix failed unittest
* fix windows compile failed
* detail polish
* revert no type constructor
5 years ago
Leo Chen
1f3be85914
Fix bug of fetch_async_op_handle when fetching the feed variable ( #28194 )
...
* fix bug of fetch_async_op_handle
* revert some changes of test_buffer_shared_memory_reuse_pass
* revert some changes of test_buffer_shared_memory_reuse_pass
5 years ago
lidanqing
7cb4a8b8f2
[oneDNN] Conv dilation support ( #27914 )
...
* conv dilated mkldnn support: forward and backward pass
* add mkldnn conv_transpose dilation UT
test=develop
* remove unnecessary PADDLE_ENFORCE
* add int8 and bf16 dilated conv UT
* update according to reviews
5 years ago
Zhou Wei
2ac6c6c3af
fix bug of tensor copy of CUDAPinnedPlace ( #27966 )
5 years ago
guofei
6bbb6e7f45
Implement the function of OutScaleForTraining/OutScaleForInference in dygraph ( #26601 )
...
* Implement the function of OutScaleForTraining/OutScaleForInference in dygraph
test=develop
5 years ago
Thunderbrook
3ee6ad6ec5
solve bug in pull_dense_worker ( #27918 )
...
* op error info
* style
* code format
* create pin var bug
5 years ago
zhang wenhui
5a83496c8d
Multi task ( #26002 )
...
* add multitask
* add multitask, test=develop
* fix code style, test=develop
* add partial push dense, test=develop
* fix has_key in py3, test=develop
* fix, test=develop
* fix, test=develop
* fix, test=develop
5 years ago
wanghuancoder
41aad9bfcd
revert 4 files, from clear include by iwyu, test=develop ( #27895 )
5 years ago
Leo Chen
049696bf67
Refine the format of printing tensor ( #27673 )
...
* add summary feature
* refine printing tensor
* add sci_mode
* add sample code
* fix indent error
* fix _format_item
* polish code
* support item indent
* add ut
* set place for ut
* fix py2 issue
* fix ut
5 years ago
Chengmo
c5f2802d56
【paddle.fleet】Update fleetrun & ps-heter ( #27472 )
...
* refine fleetrun.ps_launch
* update fleet run for multi device support
* ps_graph support ps-gpu
* fix heter save
* add heter save unittest
* fix unittest & simple code
* update fleetrun
* fix fleetrun
* fix launch barrier
* fix role maker
* add paddlecloud rolemaker unittest
* rename heter_worker_device_guard
5 years ago
石晓伟
0d27591642
save operator version information to program desc, test=develop ( #27668 )
5 years ago
Jacek Czaja
631c1f3018
- Fix to 27398 ( #27770 )
...
test=develop
- compilation fix
test=develop
5 years ago
Jacek Czaja
606611d351
[oneDNN] GRU BF16 kernel ( #27731 )
5 years ago
Jacek Czaja
b9fda2ff09
Fix to issue #25537 ( #27546 )
...
* - candidate fix to issue #25537
test=develop
* - UT for transpose NHWC
test=develop
5 years ago
Wojciech Uss
966447e338
Added support for quantization of fusion_gru ( #27518 )
5 years ago
Pei Yang
8a4f85feb9
Add unittests and OP version registry for quant_conv2d_dequant_fuse_pass ( #27689 )
5 years ago
AshburnLee
c3a3df6466
Add cuda support for unique op ( #27646 )
...
* unique op for cuda is added
* add support for cuda
* Add cuda support for unique op.
* Add support for int32_t and int64_t.
* For old version, process by cpu
* Add VisitDataType for thrust
5 years ago
Leo Chen
35074963e3
Refine error msg in paddle/fluid/framework/details [part 2] ( #27429 )
...
* refine broadcast_op_handle
* refine some error messages
* refine some files
* fix bug
* fix bug
* fix bug
* follow comments
* follow comments
5 years ago
Chengmo
0e101c4f6f
Fix test dist fleet heter ctr ( #27513 )
...
* fix test_dist_fleet_heter_ctr & performance update
5 years ago
joanna.wozna.intel
b0ee1405f7
Add conv2d bfloat16 support ( #27325 )
5 years ago
Thunderbrook
6f69a4cb05
add xpu in heter mode ( #27000 )
...
* add xpu in heter mode
test=develop
* BOOST_CONST_GET; PADDLE_THROW
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* refine
test=develop
* refine
test=develop
* refine
test=develop
* refine code
test=develop
5 years ago
WangXi
e550fc02ae
fleet2.0 add fp16 grad compression ( #27480 )
5 years ago
cc
c5c13473c6
Add compatibility check for four mkldnn pass ( #27364 )
...
* Add pass compatibility check for four mkldnn pass, test=develop
5 years ago
Wilber
3d5522146e
register seq_concat_fc_fuse pass. ( #27479 )
5 years ago
wanghuancoder
df43905f12
use iwyu clean include ( #27267 )
...
* use iwyu clean include, test=develop, test=win
* compilation error, test=develop
* fix compilation error2, test=develop
* fix compilation error3, test=develop
* fix compilation error4, test=develop
* fix compilation error5, test=develop
* fix compilation error6, test=develop
* fix compilation error7, test=develop
* fix compilation error8, test=develop
* fix compilation error8, test=develop
* fix compilation error10, test=develop
* fix compilation error11, test=develop
5 years ago
Pei Yang
8182337096
clear pass logs ( #27434 )
5 years ago
Shang Zhizhou
d93661942e
fix bug sequececonv_eltadd_relu_fuse_pass ( #27404 )
...
* fix bug sequececonv_eltadd_relu_fuse_pass, output error when sequence_conv's padding_start > 0
* fix seqconv_eltadd_relu_fuse_pass unittest error
5 years ago
Leo Chen
aba759ba16
[Feature] Enhance inplace addto strategy for gradient accumulation in static graph ( #27112 )
...
* support use add instead of sum to do gradient accumulation
* add inplace addto pass
* add grad_add op and inplace addto pass
* remove debug code
* code refine
* fix bug when several sum ops are inserted at the same op_idx
* fix Flags type
* add addto attribute for conv3d
* fix ut
* code clean
* fix type
5 years ago
Wilber
39546aa2f3
Add pass compatible and unit test. ( #27377 )
5 years ago
Leo Chen
bbc84e0fe0
Refine error msg in paddle/fluid/framework/details [part 1] ( #25631 )
...
* refine error msg in var_handle.h, test=develop
* refine all_reduce_op_handle
* fix some error msg
* refine variable_visitor
* refine threaded_ssa_graph_executor
* refine inplace related files
* refine executor related files
* refine fetch_op_handle.cc
* fix bug
* follow comments
5 years ago
tangwei12
99626502f7
【paddle.fleet】gloo and util ( #27213 )
...
* fix worker endpoints
* fix gloo wrapper for hdfs
* GPU fleetrun support gloo
* parameterserver fleetrun support gloo
* fix get server endpoint
5 years ago
yaoxuefeng
d726fd5e86
enhance dataset err msg ( #27363 )
5 years ago
Pei Yang
fd7ab4e63c
register pass compatibility ( #27357 )
...
* pass compatibility
* add compatibility registry
* add unittests for different padding
* add assert
* drop errmsg
5 years ago
haozech
7e6dfcf9b2
Add 3 pass version check ( #27283 )
5 years ago
Shang Zhizhou
3c11717988
add op version checker to ir passes ( #27329 )
5 years ago
lilong12
9f9d15e285
fix the bug of non-exit, test=develop ( #27350 )
5 years ago
ShenLiang
54b81fa32c
add adaptivelsgd in meta_optimizer ( #27289 )
...
* add adaptivelsgd
* Todo fix the code to avoid the conflict.
5 years ago
Chen Weihang
4f9d6529fe
Polish framework error message part 7 ( #27266 )
...
* polish framework error message part 7
* fix typo
* polish by reviewers' comments
5 years ago
Wilber
f827665ae6
[Pass Compatible] Bind python compatible. ( #27262 )
5 years ago
Chen Weihang
dafb0e3bb7
Polish framework error message part 6 ( #27257 )
...
* polish framework error msg part 6
* polish missed item
* fix failed unittest
* polish by reviewer comments
5 years ago
joanna.wozna.intel
1483ea2304
Add bfloat16 passes ( #26999 )
5 years ago
Chen Weihang
79149c8ee6
polish framework error message part 8 ( #27269 )
5 years ago
ShenLiang
2b6a5793fe
remove auto mode from localsgd optimizer ( #27237 )
...
* rm auto from localsgd
5 years ago
JZ-LIANG
5d039f4086
modified the implementation of Lars optimizer ( #26733 )
...
add lars to fleet meta optimizer
5 years ago
WeiXin
13804ed80c
Error msg/polish tensor error msg ( #26976 )
...
* polish one line error message in tensor.cc
* polish error messages in tensor.cc,tensor.h tensor_impl.h
* polish error messages in tensor.cc tensor.h tensor_impl.h
* polish error messages in tensor.cc,tensor.h tensor_impl.h
* polish error messages in tensor.cc tensor.h tensor_impl.h tensor_test.cc
* polish error messages in tensor.cc tensor.h tensor_impl.h
5 years ago
Pei Yang
5fb8c92054
fix multihead matmul shared params ( #27121 )
5 years ago
yaoxuefeng
7f3e6ca596
add cuda generator ( #26786 )
5 years ago
Feiyu Chan
c8cc094576
add template specialization for bfloat16 for gcc 4.8 compatibility ( #26985 )
5 years ago
joanna.wozna.intel
95e1434bb2
Add bfloat16 data type ( #25402 )
5 years ago
Shang Zhizhou
61fc7a3e45
Pass version check ( #26887 )
5 years ago
wanghuancoder
2d2c31a63a
Add FetchAsyncOpHandle, and use it in FastThreadedExecutor ( #26643 )
...
* optimized transformation from tensor to numpy, test=develop
* Modify fetch op handle, from memcpy Sync to memcpy Async, test=develop
* modify CUDAPinnedPlace to CPUPlace, test=develop
* modify CPUPlace to CUDAPinnedPlace, and set default inplace to false, test=develop
* revert fetch_op_handle, add fetch_async_op_handle, test=develop
* revert fetch_op_handle, add fetch_async_op_handle, test=develop
* fix error msg report, test=develop
* fix bug in cpuplace, test=develop
* fix bug in unmerge and tensorarray mode, test=develop
* fix bug, double copy gpu memory, test=develop
* fix chenweihang's review advice, test=develop
5 years ago
Thunderbrook
5205748481
fix eigen in push sparse; fix hadoop command ( #26872 )
...
* fix eigen in push sparse; fix hadoop command
test=develop
* add log in load_combine_op
test=develop
5 years ago
yaoxuefeng
a47d92d868
fleet add save with whitelist test=develop ( #23376 )
5 years ago
Adam
8bcb1f29d9
Add conv+affine_channel fuse pass to MKLDNN pass strategy and fix it ( #26779 )
5 years ago
Leo Chen
844583c8fd
Refine paddle.manual_seed ( #26496 )
...
* refine manual seed
* fix ci problem
* fix unittests
* fix unittest
* set is_init_py=false in manual_seed
* fix unittest
* fix bernoulli_op
* fix(unittest): change random_seed to manual_seed
* 🐞 fix(unittest): fix manual_seed
* trigger ci
* fix test_sentiment
* fix test_imperative_save_load
* fix test_uniform_random_op
* fix test_uniform_random_op
* fix test_jit_save_load
* merge develop
* fix manual_seed
* fix manual_seed
* use global engine
* use shared_ptr
* fix double free
* fix bug
* fix bug
* fix bug
* fix test bug
* fix test bug
* fix test bug
* fix ci
5 years ago
Pei Yang
e3f8e5cf5c
trt int8 support conv2d_transpose ( #26636 )
5 years ago
zhangchunle
623a4c2e56
fix ci coverage build error ( #26761 )
5 years ago
joanna.wozna.intel
eb097d64f6
Fix int8 performance drop cpu_quantize_placement_pass ( #26715 )
...
* Fix cpu quantize placement pass
* Include string lib
5 years ago
Wilber
1c898b66d6
add bug fix enum. ( #26736 )
5 years ago
Zhou Wei
8071d23073
fix bug that can't print int8_t ( #26712 )
...
fix bug that can't print int8_t
5 years ago
Adam Osewski
c2c689582e
Update Paddle-Lite commit hash. ( #26413 )
...
* Update Paddle-Lite commit hash.
* Add BF16 data type to VarTyp protobuf message.
5 years ago
lilong12
1c68138327
[api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis ( #26552 )
...
add collective op for cpu using gloo and paddle.distributed.* apis
5 years ago
joanna.wozna.intel
559e43eee4
Small change in conv2d and quantize pass ( #26671 )
5 years ago
石晓伟
32ceacf317
update op_version_registry, test=develop ( #26644 )
5 years ago
Dong Daxiang
08d736ad78
【paddle.fleet】add cudnn related strategies to DistributedStrategy ( #26598 )
...
* add cudnn related strategies to DistributedStrategy
5 years ago
wanghuancoder
c1f5df5269
optimized transformation from tensor to numpy ( #26447 )
...
* optimized transformation from tensor to numpy, test=develop
* optimized transformation from tensor to numpy, pass pre-commit, test=develop
* modify fetchophandle zerocopy to deepcopy in PE&CUP, test=develop
* modify py:array construct, test=develop
* fix _fetch_var to use deep copy, test=develop
5 years ago
石晓伟
fa08a834be
update op_version_registry, test=develop ( #26592 )
5 years ago
石晓伟
656e60b18f
new class: op_version_registry, test=develop ( #26542 )
5 years ago
Jack Zhou
199b0c7c1b
Add isfinite v2 op ( #26344 )
...
add the isnan, isfinite, isinf api for the paddle 2.0
5 years ago
QingshuChen
138ecf24aa
support Baidu Kunlun AI Accelerator ( #25959 )
...
* support Baidu AI Accelerator
* test=kunlun
* minor
* test=kunlun
* support xpu op in separate file
* test=kunlun
* update XPU error message and remove duplicated code
* test=kunlun
* minor
* test=kunlun
* minor
* test=kunlun
5 years ago
Chen Weihang
9108282883
Polish framework error message part 5 ( #26204 )
...
* polish framework error msg part 5
* revert enforce change
* refine error type
* trigger ci check
* polish details by review comment
5 years ago
Pei Yang
b757466b0d
fix trt dynamic ernie serialization unit test ( #26228 )
5 years ago
Wilber
3ec0bcbbb8
[Bug] Fix prune for save_inference_model about transformer ( #25347 )
5 years ago
cc
3f816bc8b4
[Quantization] Conv2d_transpose and mul support channelwise quantization ( #25639 )
...
* Conv2d_transpose and mul support channelwise quantization, test=develop
* Skip collecting out threshold for output tensor of which the type is not fp32 or fp64, test=develop
* Fix error in test_user_defined_quantization, test=develop
* Add depthwise_conv_bn_fuse, test=develop
* Add conv_transpose_bn_fuse_pass for post_training_quant, test=develop
5 years ago
Thunderbrook
a83e0f264c
fix heter proto ( #26093 )
...
test=develop
5 years ago
yaoxuefeng
23261ff44b
add cpu random Generator ( #26013 )
5 years ago
Zhou Wei
6de463d3d1
expose and unify the Tensor concepts to the user ( #25978 )
...
* expose and unify the Tensor concepts to the user
* expose tensor to user
* add copy place for Tensor
* add copy place for Tensor
* add note
* add macro PADDLE_WITH_CUDA
* remove RUN_TYPE=DIST
* fix some error
5 years ago
Dong Daxiang
50a5bcfc9d
【paddle.fleet】paddle.fleet -> paddle.distributed.fleet. ( #26186 )
...
* move paddle.fleet to paddle.distributed.fleet
5 years ago
Leo Chen
ffe52b4452
[OpDevOptimize] Add common infershape functions ( #26096 )
...
* add unchanged infershape function
* add broadcast infershape function
* fix bug
* rename infershape functions
* add UnaryOpUnchangedInferShapeCheckAxis
* add error message
* add test for common infer shape functions
* dont update existed ops
* dont update op_desc.h
* add more test
* add error check, refine error message
5 years ago
Chen Weihang
838e36e9ed
Fix loaded variable suffix repeat error ( #26169 )
...
* fix loaded var suffix repeat error
* use new dygraph name for loaded param
5 years ago
JZ-LIANG
54003b873e
【paddle.fleet】add lamb to fleet meta optimizer ( #26025 )
...
add lamb to fleet meta optimizer
5 years ago
Yiqun Liu
1be6bf45ae
Add assign to fusion_group and enhance inplace execution in fusion_group. ( #26121 )
5 years ago
MRXLT
6559229b7e
fix encryption infer ( #25979 )
...
* add encrypt for inference lib
* fix code;test=develop
* fix test; test=develop
* bug fix; test=develop
* add MakeCipher;test=develop
* fix bug;test=develop
* move MakeCipher to paddle space; test=develop
* fix include dir ;test=develop
* add include dir; test=develop
* move include; test=develop
* move include; test=develop
* fix for windows ci
* fix cmake; test=develop
* fix bug
bug fix
5 years ago
tangwei12
c14ec8782b
【paddle.fleet】Feature/fleet ps api 2.0 ( #25857 )
...
* add paddle.fleet.AsyncOptimizer
Co-authored-by: dongdaxiang <dongdaxiang@baidu.com>
5 years ago
joanna.wozna.intel
734cf1c3e9
Change use_quantizer attribute name and data type ( #25838 )
...
* Change use_quantizer attribute name and data type
* Fix problem with setting attribute
* Add changes due to review
* Small change in function
* Restore use_quantizer attr for compatibility
5 years ago
tangwei12
3755564ae1
Fix/large scale fix ( #25999 )
...
* fix large scale KV
* fix single training using async ssa graph
5 years ago
Thunderbrook
fd2947babf
fix compile error with mkl ( #26030 )
...
test=develop
5 years ago
Leo Chen
0a47387bd8
Use static local variable instead of global variable for safety ( #26018 )
...
* remove global variable
* refine code
5 years ago
123malin
2191a08317
【paddle.fleet】fleet_util move to paddle.fleet ( #25805 )
...
* test=develop,test=document_fix, remove the out args
* fleet_util move to paddle.fleet
Co-authored-by: WuHaobo <wuhaobo1994@gmail.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>
5 years ago
Thunderbrook
0cb60c700d
add heter ps mode ( #25682 )
...
* add heter ps mode
* code style
test=develop
* add with_pslib
test=develop
* unittest
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* test monitor
test=develop
* prepare trainer
test=develop
* code style
test=develop
5 years ago