Paddle

Author	SHA1	Message	Date
wawltor	3d49882e2c	fix the rnn mask memory bug for out of read (#30459 ) * fix the rnn mask memory bug for out of read * update the code for the rnn	4 years ago
taixiurong	6a3c8725b0	support transformer v2.0 (#30381 )	4 years ago
ShenLiang	e85be1b1b2	fix flatten api grad (#30426 )	4 years ago
yaoxuefeng	6e0da01c61	Heter ps new (#30198 )	4 years ago
123malin	2a98e9323a	test=develop, add distributed_infer (#30300 ) * test=develop, add distributed_infer	4 years ago
QingshuChen	cf786d22ec	fix bug that can't find mkldnn(kunlun) (#30394 )	4 years ago
cc	8e3a294045	skip quantizing ops in cpu inference (#30342 ) * skip quantizing ops in cpu inference, test=develop	4 years ago
alncat	7bbf3ac5ab	Added support for inference using quantization aware trained dygraph (#30288 ) * added support for inference using quantization aware trained dygraph * added support for inference using quantization aware trained dygraph correct boost get usage * Delete incorrect warning message (#30196) * fix warning and no grad * clean redundant API alias in 2.0 - part 2 (#30013) * delete paddle.nn.functional.assign * fix dynamic to static error * just add the op error message for the matmul xpu (#30246) add the op error message for the matmul xpu * Add Static Variable Clone (#30208) Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat * use wget to replace curl to download the lcov file (#30229) * use wget to replace curl to download the lcov file * add cache for lcov * fix test_pool3d_op timeout issue (#30248) * Fix unittests bugs. (#30250) * modify error message based on comments (#30189) * modify error message based on comments * edit code according to review. * Correct spelling according to review. * Fix bug for 'save multiple method' (#30218) * Fix bug for 'save multiple method' * To pass coverage. * edit code to pass coverage. * edit code to pass coverage. * add unittest for coverage. * change for coverage. * edit for coverage. * added support for inference using quantization aware trained dygraph * Alias from paddle.fluid.layers.auc to paddle.static.auc (#30206) * add alias from fluid.layers.auc to static.auc * Update __init__.py * added support for inference using quantization aware trained dygraph correct boost get usage * corrected boost get usage * corrected naming issues and enforcing zero check * correct paddle enforce message * added more error checkings * corrected error report message and optimized code * corrected findvar usage * corrected paddle_enforce in scope * correct error messages * correct error reporting format Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com> Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com> Co-authored-by: wawltor <fangzeyang0904@hotmail.com> Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com> Co-authored-by: YUNSHEN XIE <1084314248@qq.com> Co-authored-by: Bai Yifan <me@ethanbai.com> Co-authored-by: gongweibao <weibao.gong@gmail.com> Co-authored-by: WeiXin <weixin10@baidu.com> Co-authored-by: Jiaqi Liu <liujiaqi06@baidu.com>	4 years ago
GaoWei8	180877e988	Softmax backward optimize (#30249 ) * softmax backward optimize	4 years ago
Zhou Wei	b1d8ff45d7	running single-GPU unit tests in parallel on Linux/Windows GPU (#29523 )	4 years ago
Zhang Jun	10a8f3e5c3	fix bug on compiling inference shared lib with crypto;test=develop (#30269 ) * fix bug on compiling inference shared lib with crypto;test=develop * fix cmake bug when build inference lib using -DWITH_CRYPTO=OFF * update cmake * remove unnecessary enforce message	4 years ago
Huihuang Zheng	28e156c27f	Fix Sleep Error in enforce.h (#30335 ) usleep function in <unistd.h> only takes argument less than 1,000,000. Current call can exceed this limit, we have to fix it. This PR can fix random CI error.	4 years ago
Leo Chen	3d015f1cf5	Set expected place in child thread for dataloader to avoid costing cuda memory on other card (#30338 ) * set expected place in child thread for dataloader * set device id when set tensor from numpy * revert tensor_py change * add compile guard * fix ci * fix bug	4 years ago
QingshuChen	2c1bba02e4	optimize memcpy perf for kunlun (#30291 ) * optimize memcpy perf for kunlun * remove useless unittest for kunlun mean * minor	4 years ago
ShenLiang	a60f17b89d	Support unused parameters in dynamic graph distributed (#30224 )	4 years ago
JZ-LIANG	75936d838f	Recompute Offload (#30233 )	4 years ago
lidanqing	a60893f6b5	correct the allowed dimension size (#30326 )	4 years ago
Chen Weihang	c8c8f205ba	remove c++ stacktrace hint (#30325 )	4 years ago
tangwei12	5e839e4da5	add sparse embedding & load vars for 2.0 & gloo bug fix (#30306 ) * add sparse embedding & load vars for 2.0 Change-Id: I36b59ed5f015189dc9d9d2e34a9357722d369f1b * fix hdfs gloo Change-Id: Ia84d579053720ad804183e54c9a04b4f031c79c6 * fix gloo hdfs Change-Id: I5ab982fd483cddc10adcdef0b8aa83aca976cb9e * move loadvar/sparse embedding from incubate to static Change-Id: I57081d3545ad2efab78c72420d2162c0eacaf3a0	4 years ago
tangwei12	25f80fd304	Fix/distributed proto (#29981 ) * rename sendrecv.proto to namespace paddle.distributed * split ps with distributed	4 years ago
Chengmo	d479ae1725	【Paddle.Fleet】Support local save sparse param (#30175 ) * add save tensor support Co-authored-by: seiriosPlus <tangwei12@baidu.com>	4 years ago
Double_V	231501fefc	fix elugradgrad test fail & error message opt (#30171 ) * fix elugradgrad test fail and error message opt * fix unittest,test=develop * Update prroi_pool_op.h fix error message * opt message,test=develop * fix ci fail,test=develop	4 years ago
Zhen Wang	fb49ea388e	Fix the accuracy problem of allclose op when using float64 data type in static mode. (#29890 ) * Fix the accuracy problem of allclose op when using float64 data type in static mode. * Format the code style.	4 years ago
yaoxuefeng	4656525e24	fix datanorm error msg (#30294 )	4 years ago
furnace	77051cc9f0	add fp16 support for tril_triu op (#30186 )	4 years ago
石晓伟	efa54629fb	fix header file paths of gflags, commit 3, test=develop (#30273 )	4 years ago
Chengmo	5b2c15afcd	Fix server.h include device_context (#30243 ) * fix cmake Co-authored-by: seiriosPlus <tangwei12@baidu.com>	4 years ago
石晓伟	a0ee09148e	enhance error msgs of fusion_seqpool_cvm_concat_op.cc, test=develop (#30240 )	4 years ago
石晓伟	a66eebab5c	fix header file paths of gflags, commit 4, test=develop (#30274 )	4 years ago
石晓伟	8c4500ff6d	fix header file paths of gflags, commit 2, test=develop (#30272 )	4 years ago
liym27	b4989fb744	Support vector<double> as type of op attribute and op set_value supports vector<double> as value (#30126 )	4 years ago
wangchaochaohu	8dcae0c55d	register OPMaker and Infer Shape Check for fused_elementwise_add (#30259 )	4 years ago
AshburnLee	924aac2216	Add tf32 switch for cuDNN (#29192 )	4 years ago
石晓伟	8ce2482b80	fix header file paths of gflags, commit 1, test=develop (#30271 )	4 years ago
chentianyu03	c7371b7b20	type promotion for grad (#30177 ) * type promotion for grad * add type promotion for div op	4 years ago
liym27	3ce878f309	Check the rank of input in kernel of set_value op (#30147 )	4 years ago
WeiXin	66dc4ac77b	modify error message based on comments (#30189 ) * modify error message based on comments * edit code according to review. * Correct spelling according to review.	4 years ago
wawltor	fee424411a	just add the op error message for the matmul xpu (#30246 ) add the op error message for the matmul xpu	4 years ago
GaoWei8	0a21924a8d	optimize softmax forward (#30217 ) * optimize softmax forward	4 years ago
wangchaochaohu	af80859dd6	reduce the occupied size of memory for the fused pattern of elementwise_add Op and activation Op(relu Op for example) (#29885 )	4 years ago
zhang wenhui	5932fee60a	enhance error message, test=develop (#30220 )	4 years ago
pangyoki	da16b33f2e	add View(reuse allocation) strategy on squeeze, unsqueeze, reshape, flatten op (#29913 ) * add view strategy on squeeze,unsqueeze,reshape,flatten * add squeeze unittest * add unittests * use View strategy as name rather than Reuse Allocation * fix view api doc * fix format * use core.ops when input of reshape2 is Tensor * fix test_cross_entropy_loss error because of reshape2 * delete selected_rows * change op_function * little change * solve HandleViewBetweenInputAndOutput	4 years ago
Jacek Czaja	4aba17b5db	[oneDNN] Added UT for testing elementwise_mul caching (#30203 ) * - Added UT for testing elementwise_mul caching * lint fixes	4 years ago
Zhen Wang	7f7dfccf20	Support pure fp16 training for AMP API. (#29544 ) * add cast ops before and after unsupported fp16 ops. * Keep partial net in FP32 pattern. * Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode. * Add fp16 support for adam op. * add multi precision attr for adam. * Fix the bug of test_multi_precision_fp16_train UT. * Code format for CI. * Fix the redefine error about MPTypeTrait on windows. * fix bugs of the _create_accumulators func in Momentum. * fix bug when inserting post cast op. * Add the update_loss_scaling op in allow_set of UnusedVarCheck. * Update for ci coverage. * Add some doc for OptimizerWithMixedPrecision. * Fix the code style. * Improve the doc of `amp_init`. * Change for fp16 testing if users have the infer program defined in separate way.	4 years ago
Leo Chen	789743e190	use cuda generator in bernoulli cuda kernel (#30199 )	4 years ago
Leo Chen	8696335f86	Fix dtype of ungenerated grad var (#28511 ) * fix dtype of ungenerated grad var * update ut * refine code * set default dtype * fix could_use_cudnn bug * remove debug code * re-implement * fix bug	4 years ago
Wilber	609c022222	shape op support int8 and uint8 tensor (#30201 )	4 years ago
Wilber	01a287bf0a	fix windows compile when WITH_PYTHON=ON and WITH_TENSORRT=ON (#30194 )	4 years ago
ruri	e42e1e80dc	Add version checking, test=op_version (#30129 )	4 years ago
Leo Chen	1f97d61c68	Add callback after TensorCopy (#30123 ) * change to tensor copy sync * change to tensor copy sync * make copy_to safe when use TensorCopy * refine code * add ut * add cudapinned garbagecollector * add testcase: cpu place -> cuda pinned place	4 years ago
Chengmo	528e03fc08	【Paddle.Fleet】Fix tensor table (#30075 ) * add tensor table	4 years ago
Wilber	ade244948c	disable mkldnn inplace pass on windows (#30164 )	4 years ago
joanna.wozna.intel	907262ee15	Fix analysis predictor test (#30191 ) * Add a necessary condition * Remove test for white list and add header	4 years ago
lijianshe02	2dc7ee276b	enhance error message of nll_loss op test=develop (#30125 ) * enhance error message of nll_loss op test=develop	4 years ago
Huihuang Zheng	54bf3f5a56	Refine PADDLE_ENFORCE Error Messages. test=develop (#30149 ) Improve some error messages in parallel_executor.cc, conditional_block_op.cc, recurrent_op.cc	4 years ago
Chen Weihang	d0fb06b27f	[Complex] Simplify prepared op impl to improve performance (#30153 ) * simplify prepared op impl to improve performance * fix kunlun compile error * continue fix kunlun compile error * only transform diff place when dtype diff * fix failed unittests * remove useless file * polish impl by review comment	4 years ago
123malin	c5b415bfd9	Improve Index select cuda kernel (#30139 ) * test=develop, add index_select_cuda kernel	4 years ago
wangchaochaohu	7dd551e08b	refine the paddle place support using str (#28769 )	4 years ago
WeiXin	404c16763a	Add detailed error message for curandStatus_t, cublasStatus_t, cusolverStatus_t (#30161 )	4 years ago
Wilber	91a8a25721	enhance error info for py_func (#30138 ) * enhance error info for py_func * update	4 years ago
weihaoji	b8207af6bc	[XPU] Remove lite_xpu ut lite_resnet50_test since fusion pass changes introduced precision diff. test=develop (#30122 )	4 years ago
liuyuhui	15fac5e7fa	fix assign_op_xpu concat_op_xpu warning (#30120 )	4 years ago
Jack Zhou	f5428eca4f	fix enforce msg of sum xpu op (#30113 )	4 years ago
123malin	198fbdfb60	Add Lookahead and ModelAverage Optimizer (#30004 ) * test=develop, add model_average and lookahead	4 years ago
Leo Chen	adac38c506	add dispensable input for core.ops.reshape2/expand/slice (#30072 ) * add dispensable input 'shape' for core.ops.reshape2 * add dispensable inputs for core.ops.reshape2/expand/slice * add ut	4 years ago
ShenLiang	becf99d2e8	fix error message (#30135 )	4 years ago
Zhou Wei	30888ca343	Polish and Optimize the print/repr information of Layer (#29998 ) * Polish and Optimize the print/repr message of all layer * fix some code format	4 years ago
Zhou Wei	9c99d37906	fix unittest failed on windows (#29837 )	4 years ago
wangguanzhong	69839f8a9a	fix error message for distribute_fpn_proposals_op (#30116 )	4 years ago
QingshuChen	8e1c3ddf15	add aarch64 and sunway kunlun lib (#30027 ) * add aarch64 and sunway kunlun lib * minor * optimize elementwise_add for kunlun * update kunlun dependence * minor * minor	4 years ago
Shang Zhizhou	05b27695f1	add inference api: DisableTensorRtOps (#30109 ) * snap * add inference api: DisableTensorRtOPs * fix code style * update api to experimental * update variable name	4 years ago
石晓伟	53bb126510	fix a bug in op_version_registry, test=develop, test=op_version (#29994 )	4 years ago
xiemoyuan	3e0c492910	Optimize the error message of framework. (#30134 )	4 years ago
liym27	9922bd4125	Fix bug: In dynamic mode, if start or end is negative, __getitem__ returns wrong result (#30003 ) 1. when slice_item is a slice: 1) the start of __getitem__ should be std::max(start, 0) if slice 2) the end of __getitem__ should be std::min(end, dim) 2. when slice_item is an integer, it should be in [-dim_len, dim_len) 3. Fix error message to use accurate data	4 years ago
chentianyu03	666e665132	change the kron gradient when complex types (#29995 )	4 years ago
chentianyu03	a5e422c85d	add trace op_register_version and fix version bug; test=op_version (#30000 ) * add trace op_register_version and fix default bug; test=op_version * add trace op_register_version; test=op_version * add trace op_register_version; test=op_version * add trace op_register_version; test=op_version * fix missing the template bug of vector; test=op_version	4 years ago
cc	9f34374b48	Fix the format of raising error in randperm op (#30108 ) * fix the format of raising error in randperm op	4 years ago
liuyuhui	254ad61959	fix xpu pe sync, test=notest (#30095 )	4 years ago
Thunderbrook	0b8e1fadc5	add topo-aware in heter-ps (#30087 ) * add topo aware * resource.h * topo aware * format	4 years ago
hong	297fff1a79	support dygraph in xpu place (#30051 ) * support dygraph in xpu place; test=develop * fix cpu/gpu compile error; test=develop * fix compile error; test=develop * fix xpu compile error; testd=develop	4 years ago
wangchaochaohu	d0a5620575	fix the compiler error when gcc4 cuda9.0 (#29997 )	4 years ago
WangXi	ee16006b5d	Optimization grad merge performance (#29784 )	4 years ago
yongqiangma	e891f4da1b	Add p_norm op version info (#30042 ) * p_norm fix op version info. test=develop	4 years ago
tangwei12	7d1c149e09	for inference checkpoint (#30081 ) * for inference checkpoint Change-Id: I36c979240ffa55bf1ef0c9315402960762af6be4 * for inference checkpoint Change-Id: I82025365d5b792cbea1ead506df685aecc8ac198	4 years ago
tangwei12	7d4bdff07d	fix large scale memory (#30035 ) * memory holder optimize Change-Id: Ic91af8ac6f2853336d28a9fbbc5e8d0c57b5d05e * memory holder optimize Change-Id: I2fd1c14ecc17f5d5ce88b87890381ea801e6367f * fix large scale memory holder Change-Id: Ief0992b02b00220e16c72cc637a56e7b5788140f * fix large scale memory holder Change-Id: I910142a3952ead643a5604f8f80955f3e6efe655	4 years ago
Shang Zhizhou	08dc5bc27e	fix op version checker of pass bug (#30028 ) * fix op version checker of pass bug * fix code style * update pass version	4 years ago
cc	68398abce9	[Inference] zero_copy_tensor supports int8_t (#30053 ) * zero_copy_tensor supports int8_t	4 years ago
whs	1b999d2b5d	Add version checking (#30040 )	4 years ago
ceci3	85b2f05ab0	register ModifyAttr for instance_norm, test=op_version (#30065 ) * register instance norm, test=op_version	4 years ago
channings	ddcff254db	fix op_register_version for compare ops, test=op_version (#30007 ) Co-authored-by: zhoushunjie <zhoushunjie@baidu.com>	4 years ago
Wilber	66e16b7e99	update lite subgraph. (#30056 )	4 years ago
GaoWei8	a64822589f	add REGISTER_OP_VERSION for LSTM (#30038 )	4 years ago
yinhaofeng	6e93fb92f9	Register op version for linspace,test=op_version (#30025 ) * Register op version for linspace,test=op_version * Register op version for linspace,test=op_version * Register op version for linspace,test=op_version * Register op version for linspace,test=op_version * Register op version for linspace,test=op_version	4 years ago
123malin	d0056c324d	test=develop, add op_register_version for roll_op (#30023 ) * test=develop, add op_register_version for roll_op	4 years ago
chentianyu03	e012930aa3	complex gradient matmul (#29966 ) * dot op support complex types * matmul support complex types * add test case * matmul broadcast gradient support complex * move conjFunctor to complex_functor.h	4 years ago
ShenLiang	893d37e5c6	Fix rank_attention op_version, test=op_version (#30006 ) * fix rank_attention, test=op_version	4 years ago
Adam Osewski	13aef97043	operator checkpoints for new attributes. (#29832 ) * Add operator checkpoints for new attributes. * Fix adding subsequent checkpoint to quantize op.	4 years ago
wangguanzhong	844d8e0c2c	add REGISTER_OP_VERSION for generate_proposals, roi_align, roi_pool test=op_version (#30034 )	4 years ago
cc	c3c064a8fc	Add mkldnn nearest_interp and bilinear_interp op (#30016 ) * Add mkldnn nearest_interp and bilinear_interp op * don't run mkldnn interpolate in default * add interpolate_mkldnn_pass	4 years ago
chalsliu	c053bf2a57	Revert "register ModifyAttr for instance_norm, test=op_version (#29938 )"	4 years ago
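
The bounds handling described in commit 9922bd4125 (#30003) can be illustrated with a minimal Python sketch. This is not the code from that commit, which is C++. The function names `normalize_slice` and `normalize_index` are hypothetical, and wrapping negative bounds by adding the dimension length before clamping is an assumption borrowed from ordinary Python indexing rather than something the commit message states.

```python
def normalize_slice(start, end, dim_len):
    """Clamp slice bounds into [0, dim_len] for one dimension."""
    # Assumption: negative bounds count from the end, as in Python.
    if start < 0:
        start += dim_len
    if end < 0:
        end += dim_len
    # Clamping as listed in the commit message:
    # start -> max(start, 0), end -> min(end, dim).
    start = max(start, 0)
    end = min(end, dim_len)
    return start, end


def normalize_index(index, dim_len):
    """Validate an integer index against [-dim_len, dim_len)."""
    if not -dim_len <= index < dim_len:
        raise IndexError(
            f"index {index} is out of range [{-dim_len}, {dim_len})"
        )
    # Map a valid negative index to its non-negative equivalent.
    return index + dim_len if index < 0 else index
```

With a dimension of length 5, a slice written as `-10:100` would be clamped to the full range, while an integer index of 5 would be rejected.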

18301 Commits (f8da5536edaa004fd42988539508f6810a2fe958)