Paddle

Commit Graph

Author	SHA1	Message	Date
liym27	ff25c5b36f	Fix bug: GetAttrValue should deal with attr with attrType vector<double> (#30536 )	4 years ago
Leo Chen	81217a94d8	unify calling cudaSetDevice (#30470 ) * unify calling cudaSetDevice * fix compile	4 years ago
hutuxian	40ede12631	Ascend Framework Part1: OP & Wrapper (#30281 )	4 years ago
liuyuhui	843dc3cdbd	[Kunlun]PR3: add xpu executor, multi xpu card train function optimization (#30317 )	4 years ago
Adam Osewski	c5ffad126c	[oneDNN] Refactor fuse pass helper functions to one place. (#30460 ) * Move pass tester helper functions to single common place. * Use helper functions in two more fuse pass tests.	4 years ago
pangyoki	13d757362c	Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103 ) * add view strategy on squeeze,unsqueeze,reshape,flatten * add squeeze unittest * add unittests * use View strategy as name rather than Reuse Allacation * fix view api doc * fix format * use core.ops when input of reshape2 is Tensor * fix test_cross_entropy_loss error because of reshape2 * fix test_cross_entropy_loss error because of reshape2 * add inplace strategy * add elementwise_add sub * let backward op not use inplace * grad op do not use inplace * fix memory increase error and add leaf error message * delete selected_rows * change op_function * little change * solve HandleViewBetweenInputAndOutput * add unittest and leaf error message * merge view error * optimize op_function_generator format and support sum inplace op * fix format of basic_engine * fix format for framework * little change of variable wrapper * add reshape, squeeze, unsqueeze, scatter api * add relu elu tanh softmax inplace api * fix test_squeeze_op unittest * fix test_relu_op unittest * fix comment problems * delete sample code of inplace api * add reference of grad_pending_nodes in basic_engine * fix unittest name * add inplace apis into wlist * fix error message * add PADDLE_ENFORCE for set grad op twice * fix head file error	4 years ago
yaoxuefeng	6e0da01c61	Heter ps new (#30198 )	4 years ago
cc	8e3a294045	skip quantizing ops in cpu inference (#30342 ) * skip quantizing ops in cpu inference, test=develop	4 years ago
alncat	7bbf3ac5ab	Added support for inference using quantization aware trained dygraph (#30288 ) * added support for inference using qunatization aware trained dygraph * added support for inference using qunatization aware trained dygraph correct boost get usage * Delete incorrect warning message (#30196) * fix warning and no grad * clean redundant API alias in 2.0 - part 2 (#30013) * delete paddle.nn.functional.assign * fix dynamic to static error * just add the op error message for the matmul xpu (#30246) add the op error message for the matmul xpu * Add Static Variable Clone (#30208) Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat * use wget to replace curl to download the lcov file (#30229) * use wget to replace curl to download the lcov file * add cache for lcov * fix test_pool3d_op timeout issue (#30248) * Fix unittests bugs. (#30250) * modify error message based on comments (#30189) * modify error message based on comments * edit code according to review. * Correct spelling according to review. * Fix bug for 'save mutiple method' (#30218) * Fix bug for 'save mutiple method' * To pass coverage. * edit code to pass coverage. * edit code to pass coverage. * add unittest for coverage. * change for coverage. * edit for coverage. * added support for inference using qunatization aware trained dygraph * Alias from paddle.fluid.layers.auc to paddle.static.auc (#30206) * add alias from fluid.layers.auc to static.auc * Update __init__.py * added support for inference using qunatization aware trained dygraph correct boost get usage * corrected boost get usage * corrected naming issues and enforcing zero check * correct paddle enforce message * added more error checkings * corrected error report message and optimized code * corrected findvar usage * corrected paddle_enforce in scope * correct error messages * correct error reporting format Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com> Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com> Co-authored-by: wawltor <fangzeyang0904@hotmail.com> Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com> Co-authored-by: YUNSHEN XIE <1084314248@qq.com> Co-authored-by: Bai Yifan <me@ethanbai.com> Co-authored-by: gongweibao <weibao.gong@gmail.com> Co-authored-by: WeiXin <weixin10@baidu.com> Co-authored-by: Jiaqi Liu <liujiaqi06@baidu.com>	4 years ago
Zhang Jun	10a8f3e5c3	fix bug on compiling inference shared lib with crypto;test=develop (#30269 ) * fix bug on compiling inference shared lib with crypto;test=develop * fix cmake bug when build inference lib using -DWITH_CRYPTO=OFF * update cmake * remove unnecessary enforce message	4 years ago
JZ-LIANG	75936d838f	Recompute Offload (#30233 )	4 years ago
tangwei12	5e839e4da5	add sparse embedding & load vars for 2.0 & gloo bug fix (#30306 ) * add sparse embedding & load vars for 2.0 Change-Id: I36b59ed5f015189dc9d9d2e34a9357722d369f1b * fix hdfs gloo Change-Id: Ia84d579053720ad804183e54c9a04b4f031c79c6 * fix gloo hdfs Change-Id: I5ab982fd483cddc10adcdef0b8aa83aca976cb9e * move loadvar/sparse embedding from incubute to static Change-Id: I57081d3545ad2efab78c72420d2162c0eacaf3a0	4 years ago
tangwei12	25f80fd304	Fix/distributed proto (#29981 ) * rename sendrecv.proto to namespace paddle.distributed * split ps with distributed	4 years ago
liym27	b4989fb744	Support vector<double> as type of op attribute and op set_value suppport vector<double> as value (#30126 )	4 years ago
石晓伟	8ce2482b80	fix header file paths of gflags, commit 1, test=develop (#30271 )	4 years ago
wangchaochaohu	af80859dd6	reduce the occupied size of memory for the fused pattern of elementwise_add Op and activation Op(relu Op for example) (#29885 )	4 years ago
Zhen Wang	7f7dfccf20	Support pure fp16 training for AMP API. (#29544 ) * add cast ops before and after unsupported fp16 ops. * Keep partial net in FP32 pattern. * Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode. * Add fp16 support for adam op. * add multi precision attr for adam. * Fix the bug of test_multi_precision_fp16_train UT. * Code format for CI. * Fix the redefine error about MPTypeTrait on windows. * fix bugs of the _create_accumulators func in Momentum. * fix bug when inserting post cast op. * Add the update_loss_scaling op in allow_set of UnusedVarCheck. * Update for ci coverage. * Add some doc for OptimizerWithMixedPrecision. * Fix the code style. * Imporve the doc of `amp_init`. * Change for fp16 testing if users have the infer program defined in separate way.	4 years ago
Leo Chen	789743e190	use cuda generator in bernoulli cuda kernel (#30199 )	4 years ago
Leo Chen	1f97d61c68	Add callback after TensorCopy (#30123 ) * change to tensor copy sync * change to tensor copy sync * make copy_to safe when use TensorCopy * refine code * add ut * add cudapinned garbagecollector * add testcase: cpu place -> cuda pinned place	4 years ago
Chengmo	528e03fc08	【Paddle.Fleet】Fix tensor table (#30075 ) * add tensor table	4 years ago
Huihuang Zheng	54bf3f5a56	Refine PADDLE_ENFORCE Error Messages. test=develop (#30149 ) Improve some error messages in parallel_executor.cc, conditional_block_op.cc, recurrent_op.cc	4 years ago
Chen Weihang	d0fb06b27f	[Complex] Simplify prepared op impl to improve performance (#30153 ) * simplify prepared op impl to improve performance * fix kunlun compile error * continue fix kunlun compile error * only transform diff place when dtype diff * fix failed unittests * remove useless file * polish impl by review comment	4 years ago
liuyuhui	15fac5e7fa	fix assign_op_xpu concat_op_xpu warining (#30120 )	4 years ago
石晓伟	53bb126510	fix a bug in op_version_registry, test=develop, test=op_version (#29994 )	4 years ago
liuyuhui	254ad61959	fix xpu pe sync, test=notest (#30095 )	4 years ago
Thunderbrook	0b8e1fadc5	add topo-aware in heter-ps (#30087 ) * add topo aware * resource.h * topo aware * format	4 years ago
WangXi	ee16006b5d	Optimization grad merge performance (#29784 )	5 years ago
Shang Zhizhou	08dc5bc27e	fix op version checker of pass bug (#30028 ) * fix op version checker of pass bug * fix code style * update pass version	5 years ago
cc	c3c064a8fc	Add mkldnn nearest_interp and bilinear_interp op (#30016 ) * Add mkldnn nearest_interp and bilinear_interp op * don't run mkldnn interpolate in default * add interpolate_mkldnn_pass	5 years ago
wawltor	cc2f94620c	add the support the op version check for matmul, test=op_version (#30011 ) * add the support the op version check for matmul, test=op_version	5 years ago
wawltor	b33aaea86c	add the op version check for the elementwise ops, test=op_version (#30010 ) * add the op version check for the elementwise ops, test=op_version * add the support check for elementwise_ops, test=op_version	5 years ago
Leo Chen	47d10c55d5	Enhance debugging (#30001 ) * add debug code * add place info * fix compile problem * add place for output	5 years ago
wawltor	8f49f9d5c9	change the elementwise ops version check, test=op_version change the elementwise ops version check, test=op_version	5 years ago
Thunderbrook	0ca6de171f	add include (#29952 )	5 years ago
cc	6a0102b038	map matmul/squeeze2+matmul/reshape2+matmul to mul (#29911 ) * map matmul/squeeze2+matmul/reshape2+matmul to mul	5 years ago
Jack Zhou	5a4e42ca9a	add gru op_register_version; test=op_version; (#29931 ) * add gru op_register_version; test=op_version; * Update fc,mul version;test=op_version;	5 years ago
Wilber	2b1d796cd0	[Inference] Solve 2.0 trt performance reduce compare 1.8. (#29925 )	5 years ago
石晓伟	181ea1870b	flush denormals to zero, test=develop (#29924 ) * flush denormals to zero, test=develop * add comments, test=develop	5 years ago
liuyuhui	3d1741b794	[Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor (#29926 )	5 years ago
liym27	9602a182b2	[Dynamic Inplace] Support ShareInplaceVersionCounterWith for C++ Tensor (#29842 ) * Revert "[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows (#29267)" This reverts commit `b10ecd9d3a`. * Support ShareInplaceVersionCounterWith to share the same inplace version counter for VarBase	5 years ago
liuyuhui	4427df37cf	[Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574 )	5 years ago
YUNSHEN XIE	2a01756bf3	remove duplicate ut names (#29809 )	5 years ago
Chen Weihang	a6072055be	[Complex] Handle complex to real after type promotion (#29855 ) * try to add fwd op input dtypes * refactor base impl * return tmp_ins after dygraph prepare data * fix typo found in debug * polish comment & add complex net test * revert detail change * fix unittest failed * add complex kernel condition control * fix xpu test failed & polish comment * polish details by review comments	5 years ago
Leo Chen	6b258317cb	fix TransferInplaceBack (#29830 )	5 years ago
QingshuChen	59b47f3b32	feat: support check_nan_inf for kunlun/xpu device (#29694 ) * feat: support check_nan_inf for kunlun device * support kunlun stack * minor	5 years ago
tangwei12	032414ca2a	[Feature] one ps (3/4) (#29604 ) * oneps (3/4) Co-authored-by: MrChengmo <cmchengmo@163.com> Co-authored-by: malin10 <malin10@baidu.com> Co-authored-by: chengmo <chengmo@baidu.com>	5 years ago
jakpiase	edc06c6a1b	Added fc + activation fuse pass (currently only gelu, sigmoid and tanh are supported) (#29772 )	5 years ago
YUNSHEN XIE	24ce051a84	remove duplicate ut reload (#29810 ) * remove duplicate ut reload * remove duplicate ut define in cmakelist	5 years ago
Thunderbrook	09b6e71928	heter box (#29734 ) * add heter box * add trainer, worker, wrapper... * format * for ci * format * remove boost get * boost & copyright * rename * rename * format * format * format Co-authored-by: yaoxuefeng6 <yaoxuefeng@baidu.com>	5 years ago
Jacek Czaja	7b33720c90	[oneDNN] Tensor copy fix to oneDNN tensors (#29771 ) * - Tensor copy fix to oneDNN tensors * - Fixes after review	5 years ago
Leo Chen	224f3bcbb1	format code (#29714 )	5 years ago
石晓伟	8bd2879ef7	update the operator registration for incompatible upgrade, test=develop (#29720 )	5 years ago
WangXi	9cbcc6cadc	fleet sync build strategy, test=develop (#29732 )	5 years ago
Chen Weihang	6cfa59de1b	[Complex] Add real & imag op and api for complex tensor (#29672 ) * add complex real op & api & unittest * add imag op & api & unittest * refactor op impl * revert simplify writing due to complile failed * polish details * polish grad op code	5 years ago
liuyuhui	f13c3a9cd7	[Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337 )	5 years ago
lilong12	ff6a145011	update, test=develop (#29559 )	5 years ago
Jacek Czaja	f6cca62575	[oneDNN] Making ThreadID info in caching key optional (#29272 )	5 years ago
JZ-LIANG	d33d468f02	[Sharding] add hybrid-dp feature (#29518 ) * Sharding add hybrid-dp feature * update sharding in distributed_strategy * update sharding unitest * revise code format for sharding	5 years ago
LoveAn	b5d4a1f33d	Add the strategy of skipping cc/cu test compilation and execution in CI (#29499 ) * Add the strategy of skipping cc/cu test compilation and execution in CI, test=develop * fix if error with CI_SKIP_TEST, test=develop * fix add properties to test error on Linux/MAC, test=develop * fix set test properties of test_code_generator error, test=develop * remove test codes and advance judgment of file modification on Linux, test=develop * rename CI_SKIP_TEST to CI_SKIP_CPP_TEST, test=document_fix * Add branch judgement on Linux, test=develop	5 years ago
Aurelius84	2a42250699	Polish hash function of executor cache key (#29556 ) * Add more value to calculate hash key * fix size_t * polish code	5 years ago
jakpiase	57a4f16d9e	added internal and external reorders to profiler (#29443 ) * added external reorder to profiler * added external and internal reorders to profiler * added internal and external reorder to profiler * added formatting to int/ext reorder commit * removed unnecessary comment	5 years ago
LoveAn	03b42d9fa7	fix unittest on windows, test=develop (#29365 )	5 years ago
cc	a623ce044f	Use different name_scope for different conv type, test=develop (#29355 )	5 years ago
liym27	b10ecd9d3a	[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows (#29267 )	5 years ago
Chen Weihang	9ad800ebb2	Support type promote for basic math ops (quantum required) (#29265 ) * basic impl of type promote * add comment & another testcase * fix complex bugs & support python op promote type * fix failed unittests & polish code * add unittest for coverage * change to only promote complex type * polish code details * polish several comments	5 years ago
Aurelius84	67c700b479	[Dy2Stat] Add cache for Executor and Context in run_program_op (#28421 )	5 years ago
Chen Weihang	1de32f823d	Hot fix complle failed in gcc4.8 caused by complex impl (#29254 ) * hot fix complle failed in gcc4.8 * fix failed unittest	5 years ago
GeminiCarrie	642abe2a48	Fix a bug when running on an operating system without "bash." (#29131 ) * Fix a bug when running on an operating system without "bash." * add execution condition * for ci-coverage	5 years ago
ShenLiang	46b73e6cd9	Change the api of DataParallel and Fleet (#29224 )	5 years ago
chentianyu03	8f45d14263	add complex64 and complex128 type; add +-/@ and slice opreator for c… (#29199 ) add complex64 and complex128 type; add +-/@ and slice opreator for complex types add test cases for complex elementwise, matmul and getitem unittest * add test cases for complex types * add test cases for complex matmul unittest	5 years ago
liym27	865a45984f	Check whether there is any inplace operation affecting gradient calculation. (#27901 ) * Add a class TensorInplaceVersion to count the inplace version and put it in framework::Tensor instead of Allocation or Variable. * Add a new attribute `_inplace_version` for VarBase. * Raise exception if an inplace operation can result in incorrect gradient computation. * Add a new interface _bump_inplace_version() for VarBase to bump the version whenever the Tensor is modified through an inplace operation. * For api assign, call _bump_inplace_version() when it's an inplace operation inn dynamic mode. * Use original var_wrapper if the inplace_version is not changed. * Replace SnapshotVarWrapperList with SnapshotVarWrapper to optimize performane.	5 years ago
Chen Weihang	0b032faeee	Polish unittests details and execution conditions to adapt to MUSL (#29044 ) * fix failed tests in yingchun gived list * add unittests into static_mode_white_list * add enable static * fix dist unittest * skip test_sigmoid_focal_loss_op & add gym * revert no need skip unittests * remove gym	5 years ago
Wojciech Uss	4fd4095d1b	Add quantization of multi_gru op and tests (#28615 )	5 years ago
yaoxuefeng	545df287fc	add user_define_dump (#28596 )	5 years ago
arlesniak	bc902044a4	Fixes mkldnn dygraph learning rate scheduler crashes (#28988 )	5 years ago
WangXi	173c22aec2	optimize fast graph executor (#28962 )	5 years ago
Shibo Tao	db41258501	add API serialize_program, serialize_persistables, save_to_file, deserialize_program, deserialize_persistables, load_from_file. (#29034 )	5 years ago
joanna.wozna.intel	b0d1ac161e	Add bf16 pool2d and unify bf16 unit tests (#29039 ) * Add bf16 pool2d and unify bf16 unit tests * Add change default ops test	5 years ago
joanna.wozna.intel	fddea67445	Fix cpu_bfloat16_pass (#28730 ) * Fix cpu_bfloat16_pass * Add output_format * Fix incorrect SetOutput * Change fromating	5 years ago
Chen Weihang	fea0e294ee	Hide the C++ stack by default and add hints (#29042 ) * default not show cpp statck & add hint * fix failed unittest * fix failed unittests	5 years ago
Wojciech Uss	7b5a8e46de	Add multi_gru_fuse_pass and tests (#28601 ) * Add multi_gru_fuse_pass and tests * fix date * cleaned up headers	5 years ago
Wojciech Uss	991345b368	Add multi_gru_seq_fuse_pass and tests (#28604 ) * Add multi_gru_seq_fuse_pass and tests * fix date * removed unused functions	5 years ago
lilong12	f77a78cdee	enable pipeline to run with Executor.run() (#28373 ) * update, test=develop	5 years ago
Thunderbrook	0073f9bdb0	support ps-gpu (#28752 ) * ps gpu transpile * ps gpu * remove op * gps trainer * local ps * add macro * HeterBox * def cuda * tab * code style * style Co-authored-by: Thunderbrook <a754913769#163.com>	5 years ago
Jacek Czaja	bd1d6d3b30	extends oneDNN caching keys so caching objects are unique to executor/predictor (#28758 )	5 years ago
gongweibao	1dad8ceaab	Fix gpu memory allocation bug. (#28703 )	5 years ago
joanna.wozna.intel	8c0ea4bffe	Add bf16 matmul, fc, elementwise add and mul (#28729 ) * Add bf16 matmul, fc, elementwise add and mul * Correct unit test	5 years ago
Wojciech Uss	efc3b182f0	a fix for the fc_lstm_fuse_pass (#28709 )	5 years ago
wanghuancoder	5aec7dbeb0	use forward declarations for framework.pb.h (#28494 ) * use forward declarations for framework.pb.h, test=develop * use forward declarations for framework.pb.h, test=develop	5 years ago
Jacek Czaja	6d8d3d4c22	[oneDNN] Layer norm bf16 kernel (#28619 )	5 years ago
joanna.wozna.intel	2cb71c0cde	Add checkpoint to quantize (#28612 ) * Add checkpoint to quantize * Change bfloat16 option	5 years ago
lidanqing	804271cff9	Op version python mkldnn_inplace test (#28354 ) * add mkldnn inplace op version test * update mkldnn_inplace fuse pass * update the inplace test	5 years ago
Leo Chen	90805e2df7	Register op_version for new attribute use_addto (#28463 ) * register op_version for addto * upgrade pass capability * change eq to le * change eq to le * fix merge	5 years ago
Shang Zhizhou	8699f38d08	Support TRT for the pruned transformer model; fix the bug that TensorRT does not support DeletePass (#28517 ) * skip_layernorm_op done * add unittest * slice op convertor support trt < 6 * skip_layernorm only work in ernie	5 years ago
lidanqing	0fc181dbd0	[Fix bug] If the pass name is not found, IsCompatible should return false (#28475 )	5 years ago
wangchaochaohu	d7cfee9b31	Checkout point add (#28488 ) * upgrade pass capability	5 years ago
Pei Yang	75196cda40	Paddle-TRT int8 support mul op channelwise quant (#28422 ) * paddle-trt support mul channelwise quant * add support for depthwise_conv2d * add errmsg for unsupported op type	5 years ago
YUNSHEN XIE	369605be1d	fix cmake error when execute build_inference_lib (#28503 )	5 years ago
YUNSHEN XIE	1e698c600e	fix cmake error when setting ut timeout properity (#28492 )	5 years ago
YUNSHEN XIE	ba0756325a	exec ut no more than 15s 1 (#28439 ) * disable ut test_parallel_executor_fetch_isolated_var,test=document_fix * test for limiting ut exec time as 15S * fix an error caused by cannot find ut * fix some error * can not find test_transformer * fix error caused by ut not run in windows * fix error caused by Compiler Options * fix error caused by setting timeout value as 15 in python/paddle/tests/CMakeLists.txt * setting timeout value to 120s for old ut * add the timeout value setting * fix error caused by ut only run in coverage_ci * add analyzer_transformer_profile_tester * fix some error * fix some error * fix error with inference option * fix error with inference option setting as ON_INFER * add some ut to set timeout * modified some option * fix error * fix some timeout error * fix error * fix error * fix timeout for test_analyzer_bfloat16_resnet50 * fix error * setting timeout properity for some ut * first pr for new ut timeout as 15S	5 years ago


3196 Commits (5165bc854ac35c22c3d1d8c04629f5420972a23a)