Paddle

Commit Graph

Author	SHA1	Message	Date
lilong12	bf461fa524	Improving error report message for sequence_expand op (#27245 ) * improve err report, test=develop	5 years ago
Zhong Hui	bbad3414e8	Enhance the error messages for files in operators/math	5 years ago
Pei Yang	aae41c6fca	refine error message related to paddle-TRT (#27256 )	5 years ago
Zhen Wang	d708b21074	Update amp_check_finite_and_scale_op and add an updating_loss_scaling op for static graph amp training. (#26240 ) * update amp_check_finite_and_scale_op for static_amp. * use amp_check_finite_and_scale in static graph amp. * update grads to zero when grads own infinite values(as for amp_checkout_finite_and_scale op). * add update_loss_scaling op in cpp. * add update_loss_scaling_op unit test. * update the doc of the check_finite_and_unscale op * Update the process of gradients updating skipping if the gradients have infinite values. * update the way to zero grads. * update test_update_loss_scaling_op.py * add log info when find infinite grads. * add the unit test for UpdateLossScaling Layer.	5 years ago
Adam	cc3f4b813a	Add int8 GRU kernel (#27220 ) * Add int8 GRU kernel with UTs * Lint fixes * More lint fixes	5 years ago
Jack Zhou	9437ce36c4	Error description optimize for math dir	5 years ago
Zhang Ting	5c1bafbbc6	use eval to improve performance, test=develop (#25459 )	5 years ago
lidanqing	5c4eed66fd	Fix GRU mkldnn kernel fail on lookup_table_v2 (#27198 ) * Fix the lookup_table_v2 failed on GRU mkldnn kernel issue test=develop * fix according to reviews, removed x_num_col_dims test=develop * update gru model. change according to reviews test=develop * change according to reviews test=develop	5 years ago
Chen Weihang	33ff833af2	fix loaded no params layer run error (#27241 )	5 years ago
Wilber	1b84c0bf43	Lite subgraph refine predictor (#27167 )	5 years ago
furnace	2e59769612	add empty op (c++, python, unit test) (#26659 )	5 years ago
lilong12	c5f957ae38	add double grad for tile op and expand_v2 op (#27114 ) * add double grad for tile, test=develop * add double grad for expand_v2 op, test=develop	5 years ago
lilong12	58a88ba9af	add double grad for expand (#27183 ) * add double grad for expand, test=develop	5 years ago
Qi Li	7c7fbd3218	fix error msg of fused_embedding_fc_lstm_op, test=develop (#27231 )	5 years ago
Jacek Czaja	e005861598	[oneDNN]Introducing oneDNN 1.6 (#27137 ) * - introducing oneDNN 1.6 test=develop * - Removed redundant code test=develop	5 years ago
ShenLiang	5bd84b22c4	revert divide (#27202 )	5 years ago
wawltor	fde5cfe881	fix the CudaPinMemory bug for the equal op (#27176 ) fix the CudaPinMemory bug for the equal op and add the test case for the equal op	5 years ago
zhupengyang	cc3306f7c8	restruct logsumexp to speed up compiling (#27191 )	5 years ago
Steffy-zxf	50e60e8779	update error info for selected_rows_functor	5 years ago
JZ-LIANG	5d039f4086	modified the implement of Lars optimizer (#26733 ) add lars to fleet meta optimizer	5 years ago
wangchaochaohu	c71d79b1d2	[cuda11 support] change the CMakeLists to support the cuda11 (#27124 )	5 years ago
Qinghe JING	43b0445b29	Add double grad in reduce sum (#27115 ) * set default value to strategy in distributed_optimizer test=develop	5 years ago
kinghuin	ed292695c5	optimize the error message for math dir	5 years ago
yongqiangma	4558d395e9	fix Norm op error (#26771 ) * fix frobenius_norm error, rm p=0 2-axis support. test=develop	5 years ago
LielinJiang	4d7d661249	Fix kl and summary bug (#27132 ) * fix summary rnn * fix kl_div bug when input shape is [1] and reduction is batchmean	5 years ago
whs	eb01976037	[2.0 API]Add checker in grid_sample_grad op (#27126 )	5 years ago
wangguanzhong	a28ae86e11	Enhance ops to support LoD as input for dygraph detection models. (#25316 ) * enhance collect_op for dygraph, test=develop * enhance detection ops with lod, test=develop * support none bbox left in generate_proposals, test=develop * unify MultiLevelRoisNum, test=develop * update core.ops, test=develop * add op register for new input & output, test=develop	5 years ago
LielinJiang	8df5b4d608	Add correlation api to contrib (#27015 ) * add correlation api to contrib	5 years ago
kinghuin	1b102dd552	optimize the error message for unpooling.cc fix the error message for the unpooling.cc	5 years ago
xiaoting	58f3ef982a	fix typo for interp_v2,test=develop (#26843 ) * fix typo for interp_v2,test=develop * align with torch, test=develop * add area mode, test=develop * fix bug, test=develop * format notes, test=develop * update for coverage, test=develop * fix bilinear, test=develop * fix bicubic, test=develop * fix typo, test=develop * fix coverage, test=develop * fix helper.input_dtype, test=develop * polish notes, test=develop * polish notes, test=develop * polish notes, test=develop	5 years ago
wangchaochaohu	5af81f833c	fix gpu kernel for numel Op (#27085 )	5 years ago
zhupengyang	19ca6d9dd2	add .part to speed up compile (#27044 )	5 years ago
GaoWei8	4ff16eb201	Add padding cudnn interface (#26370 ) * add lstm cudnn of padding data and refine cudnn codes	5 years ago
wawltor	8857e3911f	add the dynamic dtype check for the argmin/argmax update the check for the dtype check for the argmin, argmax	5 years ago
wangchaochaohu	041f4ab842	refine linspace Op for dtype setting(#27071 )	5 years ago
yaoxuefeng	9aa39584fe	fix cuda generator hard-coded offset step (#27027 )	5 years ago
Jacek Czaja	f6653c71e9	[oneDNN] Fix to conv2d grad with groups (#27006 ) * - Added fix to mobilenet * - compilation fix * - Fix to conv2d grad oneDNN with groups test=develop	5 years ago
Chengmo	a72752263b	support heter-xpu-ps (#27018 )	5 years ago
whs	2660ea379d	Fix cuda kernel of affine grid (#27003 ) test=develop	5 years ago
ShenLiang	ff3dc8ac73	fix the remainder (#26995 )	5 years ago
yaoxuefeng	7f3e6ca596	add cuda generator (#26786 )	5 years ago
joanna.wozna.intel	95e1434bb2	Add bfloat16 data type (#25402 )	5 years ago
Yang Zhang	29b844ad5e	Fix clip op attr (#26924 )	5 years ago
huangjun12	e480168fae	fix dropout bug in backward when input is 1d tensor (#26837 ) * fix dropout bug in backward when input is 1d tensor, test=develop * add test case and refine error message, test=develop * refine error message, test=develop	5 years ago
Jacek Czaja	5e874cc333	- Cosmetic fixes to align with PADDLE_ENFORCE guidelines (#26891 ) test=develop	5 years ago
Thunderbrook	5205748481	fix eigen in push sparse; fix hadoop command (#26872 ) * fix eigen in push sparse; fix hadoop command test=develop * add log in load_combine_op test=develop	5 years ago
wawltor	0a29fc85d6	fix the argmin,argmax op for the paddlepaddle 2.0 * fix the argmin,argmax op for the paddlepaddle 2.0, add checkPoint for the argmax/argmin	5 years ago
Chengmo	d0962abd20	supplement bug fix of parameter server (#26217 ) * fix fluid.embedding	5 years ago
Leo Chen	60ffc22026	Refine bernoulli and unsqueeze op (#26842 ) * add check for bernoulli and register bool for unsqueeze * follow comments	5 years ago
tangwei12	ebc5f99789	add embedding 2.0 (#26649 ) * add embedding 2.0 * add embedding support input int32	5 years ago
hong19860320	40378edfa8	Add the AddCheckpoint macro to softplus op (#26809 )	5 years ago
GaoWei8	11fb8a1c10	Refine cudnn softmax (#25757 ) * refine cudnn softmax	5 years ago
wawltor	7ee70a47b8	update the doc for the some ops update the doc for the some ops, ceil asin, atan	5 years ago
zhupengyang	0f1ad9b06c	leaky_relu and hardshrink add checkpoint for behavior changed (#26802 )	5 years ago
Chengmo	7f2aa2db3c	【paddle.fleet】Support Heter Parameter Server (#25998 ) * Support Heter Parameter Server	5 years ago
Jiawei Wang	a1b99fae07	Adadelta Optimizer (#26590 ) * add doc; notest * fix doc; notest * update doc; notest * refine optimizer && adam * refine optimizer; notest * add adam * fix doc * fix doc && add adamw; notest * add error message * bug fix * refine rmsprop && adamax * fix ci * bug fix * update comment * unify arguments place; notest * fix ut, test=develop * bug fix * fix conflicts, test=develop * add examples code * bug fix * fix comments * fix sample code * add sample code for Optimizer * add adamax ut, test=develop * fix rmsprop ut, test=develop * add ut for optimizer.py and adamw.py * first commit of adadelta optimizer * fix learning rate * fix adadelta doc and add sgd momentum * remove unused fluid * fix codestyle * Update test_adam_op.py * Update test_adam_op.py * fix SGD in 2 unittests * fix SGD in 2 unittests * fix ci * fix ut Co-authored-by: MRXLT <xlt2024@gmail.com> Co-authored-by: mapingshuo <mps2012@yeah.net>	5 years ago
LielinJiang	346689c6f1	Register conv_transpose Op version for compatible Op upgrades (#26745 ) * fix bug * add version check * fix docs, test=document_fix * fix formula, test=document_fix	5 years ago
Wojciech Uss	7afb1df11e	Decouple weights and bias from fc primitive in MKLDNN cache (#26708 ) * decouple weights and bias from fc primitive in cache * removed redundant update of pointers	5 years ago
Leo Chen	844583c8fd	Refine paddle.manual_seed (#26496 ) * refine manual seed * fix ci problem * fix unittests * fix unittest * set is_init_py=false in manual_seed * fix unittest * fix bernoulli_op * fix(unittest): change random_seed to manual_seed * 🐞fix(unittest): fix manual_seed * trigger ci * fix test_sentiment * fix test_imperative_save_load * fix test_uniform_random_op * fix test_uniform_random_op * fix test_jit_save_load * merge develop * fix manual_seed * fix manual_seed * use global engine * use shared_ptr * fix double free * fix bug * fix bug * fix bug * fix test bug * fix test bug * fix test bug * fix ci	5 years ago
ShenLiang	29494d703d	fix remainder, floor_div (#26732 ) * fix remainder, floordiv	5 years ago
lilong12	5f524efe56	modify error report message, test=develop (#26743 )	5 years ago
wangchaochaohu	4561fc37e2	Add check point for gather Op (#26696 )	5 years ago
LutaoChu	1ec30cb160	register cumsum Op version for compatible Op upgrades (#26734 )	5 years ago
Jack Zhou	c282db3a93	add broadcast feature for elementwise logical op	5 years ago
Yang Zhang	63eef7632e	Fix clip input check (#26683 ) * Fix clip input check * Fix default min/max value * Allow both max and min to be None * Register op change * Revert OP signature change	5 years ago
joejiong	f311d3c1cf	Fix pow api type error with python side method, merge elementwise_pow and pow. (#26163 ) As the title	5 years ago
yongqiangma	e4cc6a28b0	Norm op support 2-axis (#26492 )	5 years ago
xiaoting	89d7d86684	add intepolte_v2 (#26520 ) * add intepolte_v2 * fix linear interp * polish unittest, test=develop * update code samples to 2.0 API, test=develop * remove warning, test_develop * add name in attrs, test=develop * polish code, test=develop * change Align to align, test=develop * fix unittest in py3,test=develop * fix coverage, test=develop * fix coverage, test=develop * fix for windows ci, test=develop * fix coverage, test=develop	5 years ago
Zhang Ting	97cebfa4d3	add dtype for unique (#26655 ) * update doc, test=document_fix * add attr(dtype) * refine code	5 years ago
lilong12	1c68138327	[api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis (#26552 ) add collective op for cpu using gloo and paddle.distributed.* apis	5 years ago
joanna.wozna.intel	559e43eee4	Small change in conv2d and quantize pass (#26671 )	5 years ago
Bai Yifan	8986a82131	fix adaptive gpu grad bug, add doc refine (#26660 )	5 years ago
wawltor	286eca2d9e	update the code for the topk v2 add the top v2 for the paddlepaddle api 2.0	5 years ago
whs	f82384113b	Fix atomicAdd in grid sample op and affine grid op (#26647 ) test=develop	5 years ago
Wilber	32ba8602c6	Enhance py_func error info message. (#26557 )	5 years ago
Zhang Ting	0a895bc0df	improve unique op (#26537 ) * add unique_v2 op * remove unique_v2 op * update doc	5 years ago
whs	a004dfde3d	Use atomicAdd defined in paddle framework (#26631 ) test=develop	5 years ago
zhupengyang	c80fcf901e	reduce_mean error if keepdim=True and reduce_all=True (#26614 )	5 years ago
whs	a065a24232	【2.0 API】Enhance affine grid operator (#26385 ) * Enhance affine grid operator: 1. Add cuda kernel 2. Add align corners options test=develop * Move new affine_grid api to functional test=develop * Add CUDA kernel for affine_grid. test=develop * Add more unittest for grid sample API test=develop	5 years ago
Qi Li	6f69fbc8ea	fix elu grad when alpha less than zero, test=develop (#26543 )	5 years ago
whs	786373ba29	Use atomicAdd defined in paddle framework (#26628 ) test=develop	5 years ago
ruri	1f82c0cd62	[Api2.0] add pixel shuffle (#26071 )	5 years ago
whs	79539cf198	【2.0 API】Add CUDA kernel and enhance options for grid_sample (#26576 ) This PR enhances the CPU kernel and adds a new CUDA kernel to make grid_sample support: - align_corners: with bool type. - padding mode: which can be in ['zeros', 'reflect', 'border'] - Interpolation mode: which can be in ['bilinear', 'nearest'] The old CPU and CUDNN versions only support align_corners=true, padding_mode='zeros' and interpolation_mode='bilinear'. The behavior of the new version op in default mode is compatible with the old version.	5 years ago
Guanghua Yu	8645591d66	support fp64 in huber_loss cuda kernel (#26583 )	5 years ago
yaoxuefeng	efee426742	support generator seed in related kernels test=develop (#26495 )	5 years ago
Zhong Hui	bf4a4636f1	change to use bce_loss op, add shape check for bce_loss change to use bce_loss op, add numel check for bce_loss.	5 years ago
ShenLiang	0e81626081	add div, floor_div, remainder (#26562 ) * add div, floor_div, remainder	5 years ago
qingqing01	24566e951c	Support empty bbox in bipartite math op (#26488 )	5 years ago
Jack Zhou	199b0c7c1b	Add isfinite v2 op (#26344 ) add the isnan, isfinite, isinf api for the paddle 2.0	5 years ago
wangchaochaohu	ebf9b2125e	add paddle.gather for API2.0 (#26455 )	5 years ago
wangchaochaohu	9219b79104	gather_nd Op for API 2.0 refine (#26540 )	5 years ago
zhupengyang	9b14117cac	logsumexp: impl kernel, refine docs (#26307 )	5 years ago
Wojciech Uss	5c2b9258a6	Fix (de/re)quantize cache keys (#26549 )	5 years ago
wawltor	6b28456ed0	add the argmax, argmin for the api2.0 * add the new api and op for the argmax, argmin	5 years ago
LielinJiang	d26ae9ad87	Update conv_transpose api (#26427 ) * update conv_transpose api	5 years ago
lilong12	faa9b97b78	fix cscatter, test=develop (#26554 )	5 years ago
WangXi	45711dade7	【API】rename div to divide, add floor_divide, remainder (#26434 )	5 years ago
LutaoChu	4e0c6d91aa	add paddle.tensor.linalg.diag API, diag_v2 OP and CUDA kernel	5 years ago
zhupengyang	f8863e0603	leaky_relu and LeakyReLU: alpha->negative_slope (#26216 )	5 years ago
ShenLiang	c609066074	Add Matmul op (#26411 ) * add matmul_v2	5 years ago
Leo Chen	aa2a9b5d89	add bernoulli op (#26511 ) * add bernoulli op * fix cuda kernel and add unit test * refine doc * fix uniform	5 years ago
Adam	f3909020de	Add mechanism for blocking oneDNN cache clearing (#26502 ) * Add mechanism for blocking oneDNN cache clearing * Review changes and Add thread guards	5 years ago
ShenLiang	b6eb37f5b3	add error message for cholesky (#26444 ) * add error message	5 years ago
QingshuChen	138ecf24aa	support Baidu Kunlun AI Accelerator (#25959 ) * support Baidu AI Accelerator * test=kunlun * minor * test=kunlun * support xpu op in separate file * test=kunlun * update XPU error message and remove duplicated code * test=kunlun * minor * test=kunlun * minor * test=kunlun	5 years ago
yaoxuefeng	4f259354d2	mod cvm test=develop (#25146 ) * mod cvm test=develop * mod code format test=develop	5 years ago
wangchaochaohu	e167e87974	【API2.0】add masked_select Op for API2.0 (#26374 )	5 years ago
zhupengyang	6e5670b8bd	mean: not support int32, int64; add check for axis (#26401 )	5 years ago
zhupengyang	4ad504e7c7	hardshrink: support threshold < 0 (#26403 )	5 years ago
lilong12	e92f770c42	Add collective ops (reduce) (#26340 )	5 years ago
wangchaochaohu	bdb805505e	【API2.0】add numel API for paddle test=develop (#26311 )	5 years ago
wangchaochaohu	2073ffc04d	Enhance the data type of linspace API (#26247 )	5 years ago
hong19860320	40d193ed17	Add the ReLU6, Tanhshrink, SELU, Softplus, Softshrink and Softsign for the api 2.0 (#26376 )	5 years ago
Zhaolong Xing	f00f982a02	add cub impl for arg max, min (#25941 ) test=develop	5 years ago
Zhang Ting	6914a12f82	rename the inputs of allclose (#26360 ) * rename input * add unittest, test=develop * use paddle.data instead of fluid.data, test=develop	5 years ago
littletomatodonkey	bcf03273f6	add pad func (#26106 ) * add pad func * add pad * test=develop, add pad op and apis * restore pad2d * test=develop, fix paddl declare * fix pad interface * test=develop, fix pad * test=develop, add all pad api and cos_sim * test=develop, remove padding default value * test=develop, rename var to tensor * test=develop, add more tests * test=develop, rename tovar to totensor * test=develop, fix init * test=develop, add more test * test=develop, add more tests	5 years ago
Chengmo	eeeef957c7	Fix ps gpu (#26218 ) * support ps-gpu	5 years ago
Zhong Hui	6cbeafb6c0	add zero norm, inf norm support for p_norm op (#26364 ) * add zero norm, inf norm support for p_norm op * fix the invalid argument check, fix the dtype problem in test case.	5 years ago
GaoWei8	1fbee267d4	remove scope in cudnn lstm (#25188 )	5 years ago
cc	3f816bc8b4	[Quantization] Conv2d_transpose and mul support channelwise quantization (#25639 ) * Conv2d_transpose and mul support channelwise quantization, test=develop * Skip collecting out threshold for output tensor of which the type is not fp32 or fp64, test=develop * Fix error in test_user_defined_quantization, test=develop * Add depthwise_conv_bn_fuse, test=develop * Add conv_transpose_bn_fuse_pass for post_training_quant, test=develop	5 years ago
lilong12	638bbb6153	Improve expand as (#26290 ) align expand_as op to expand.	5 years ago
zhupengyang	586a6dd358	log_softmax and LogSoftmax: impl kernel and refine docs (#26088 )	5 years ago
yaoxuefeng	23261ff44b	add cpu random Generator (#26013 )	5 years ago
Sylwester Fraczek	69742bd9a4	Enable mkldnn layout conversion (#25778 ) * enable mkldnn layout conversion * review fix: remove tmp_place * fix test mkldnn swish * add UT for PrepareData CPU->MKLDNN * add #ifdef PADDLE_WITH_MKLDNN * Force-push commit Co-authored-by: grygielski <adam.grygielski@gmail.com>	5 years ago
Jack Zhou	6d22f5c73e	Add PADDLE_ENFORCE in nll loss cuda kernel (#26294 ) * add nll loss API, update demo code of the comment	5 years ago
lilong12	241b44db14	[API 2.0] adaptive expand op to use shape instead of expand_times (#26206 ) * adaptive expand op to 2.0 (align to torch.expand) , test=develop	5 years ago
lilong12	fbd4d3cc97	[API 2.0] add paddle.tile op (#26245 ) * add tile_op, test=develop	5 years ago
Yang Zhang	a2d3e5c03b	Fix `paddle.abs` docstring (#25942 ) test=document_fix remove activation wording	5 years ago
Yang Zhang	22165934bc	Fix `paddle.acos` docstring (#25958 ) test=develop,test=document_fix remove activation wording	5 years ago
Yang Zhang	a5b5b00e02	Fix `paddle.asin` docstring (#25967 ) test=develop,test=document_fix remove activation wording	5 years ago
Yang Zhang	c758765769	Fix `paddle.atan` docstring (#25968 ) test=develop,test=document_fix remove activation wording tanh -> tan	5 years ago
Yang Zhang	c4e480efc5	Fix `paddle.cos` docstring (#25969 ) test=develop,test=document_fix explain input/out put range and out of boundary behavior	5 years ago
wawltor	2d6cc0b125	support the tuple for attribute of axis in min, max for api2.0 Update the code for the min,max, test=develop	5 years ago
Leo Chen	ffe52b4452	[OpDevOptimize] Add common infershape functions (#26096 ) * add unchaged infershape function * add broadcast infershape function * fix bug * rename infershape functions * add UnaryOpUnchangedInferShapeCheckAxis * add error message * add test for common infer shape functions * dont update existed ops * dont update op_desc.h * add more test * add error check, refine error message	5 years ago
Leo Chen	2d95280e1f	Feature/Enable Auto-Mixed-Precision in dynamic graph (#24903 ) * add auto_cast, test=develop * add loss scaler, test=develop * add comments, test=develop * refine code, test=develop * refine code, test=develop * do not set flags automatically, test=develop * fix custom op bug, test=develop * add more test, test=develop * refine enable logic, test=develop * enable amp test with GPU, test=develop * add unittest * add test for found_inf * follow comments * follow comments * remove global variable, use singleton * add some notes * update comments * update comments * update comments * add use_dynamic_loss_scaling argument * refine found_inf * refine found_inf	5 years ago
wawltor	9c17b3c9f8	Add the max, min, maximum, minimum api for the API 2.0 * Add the max, min, maximum, minimum api for the API 2.0, test=develop	5 years ago
Yiqun Liu	1be6bf45ae	Add assign to fusion_group and enhance inplace execution in fusion_group. (#26121 )	5 years ago
lilong12	8caee2ad51	【paddle.fleet】add the support for multi-node training for pipeline (#25907 ) * add the support for multi-node training	5 years ago
LutaoChu	bf2db646de	fix cumsum op for API 2.0, optimize performance update cumsum api and fix up the cumsum op	5 years ago
Adam	1893cd6bb8	Add oneDNN relu6 op (#26037 ) * Add oneDNN relu6 op * Lint fixes	5 years ago
Zhaolong Xing	50f149a48e	fix cudnn workspace size problem during inference. (#26021 ) test=develop	5 years ago
Chen Weihang	3c8daa9b89	Add pin memory control for BufferedReader (#26026 ) * add pin memory control * fix buffered reader init problem * fix unittest error * add unittest for coverage	5 years ago
Chen Weihang	ad4a0466a5	Add cuda pinned place branch in slice op GetExpectedKernelType (#26027 ) * add cuda pinned place branch * add unittest * add skip when not gpu	5 years ago
Feiyu Chan	e853ece0a2	update document template for unary elementwise layers (#25896 ) 1. update document template for unary elementwise layers(a.k.a. activation layer); 2. remove generate_op_noattr and use generate_activation instead; remove redundant function copies; 3. minor update for docstring to fix rst format errors. 4. fix doc for Rsqrt OP 5. add sample code for each activation separately; 6. remove the unused deprecated decorator.	5 years ago
joanna.wozna.intel	734cf1c3e9	Change use_quantizer attribute name and data type (#25838 ) * Change use_quantizer attribute name and data type * Fix problem with setting attribute * Add changes due to review * Small change in function * Restore use_quantizer attr for compatibility	5 years ago
Leo Chen	5258d53d65	refine unsqueeze, test=develop (#25470 ) * refine unsqueeze, test=develop * update unsqueeze, test=develop * refine unsqueeze, test=develop * refine unsqueeze, test=develop * update * remove None, test=develop * follow comments * support bool * update doc * follow comments * merge develop	5 years ago
yaoxuefeng	224620071b	add new flatten op test=develop (#25393 )	5 years ago
Adam	68c6160e63	Add oneDNN fusion_gru kernel (#25594 ) * Add oneDNN fusion_gru kernel and fix fc+gru pass test=develop * Formatting changes test=develop * Lint fixes test=develop * Add memory::format_tag::any to GRU weights test=develop * Fix build with CUDA * Fix build with CUDA v2	5 years ago
Zhong Hui	dca56f47f5	fix invalid read of pnorm gradient function fix invalid read of pnorm gradient function and delete the unused code	5 years ago
Zhaolong Xing	358bc06c72	[CUDNN8 support] : support CUDNN8 (#25664 ) * cudnn8 support test=develop * fix ci error test=develop	5 years ago
Zhaolong Xing	5970871a64	add eltwise clip cuda impl. (#25689 ) test=develop	5 years ago
Pei Yang	b717895f64	Fix registering trt plugin (#25744 ) * develop dynamic shape serialization * add test param for gelu * fix bugs * delete redundant comments * debug * fix conflict. test=develop * fix bug. test=develop * add trt dynamic shape serialized support * fix ernie serialized bug test=develop * fix codestyle test=develop * fix bug test=develop * fix bug.test=develop * modify cmakelist test=develop * fix bug test=develop * fix error message. test=develop * fix trt register plugin based on pr#25003 * add trt dynload * fix deserialization bug of not finding plugin registration * refine code style * recover engine key in tensorrt_subgraph_pass * for ci coverage * add unittest for deserialization Co-authored-by: haozech <chenhaoze94@gmail.com>	5 years ago
wawltor	a697e94693	Update the code of the compare ops for the broadcast function Update the code for the compare ops for the broadcast function	5 years ago
wangchaochaohu	ff717d5158	Add support for tuple of concat Op test=develop (#25800 )	5 years ago
tangwei12	253fd407e8	Fix/distibuted heart beat (#25902 ) * disable heart beat UT	5 years ago
xujiaqi01	d11c140e28	fix dump, fix cvm check (#25400 ) * fix dump, fix cvm check test=develop * fix test=develop * fix test=develop * fix test=develop	5 years ago
Zhang Ting	6486fe8a94	improve GPU performance of transpose, test=develop (#25862 )	5 years ago
Zhang Ting	2d24f56a7a	avoid data transfer, test=develop (#25810 )	5 years ago
ShenLiang	bca303165a	fix inverse bug (#25641 ) * fix inverse bug, test=develop * fix the unittest, test=develop * add singular checking, test=develop * fix the utest, test=develop * use memory::copy, test=develop * fix boost_get, test=develop * fix position, test=develop	5 years ago
Aurelius84	e52dae6ef6	Using input.place() in GetExpectedKernel in slice_op (#25595 ) * modify GetExpectedKernelType * use input place * add ENFORCE check	5 years ago
wawltor	595a719795	Update the api for the compare_ops Update the code for the compare_ops, update the api and doc	5 years ago
wangchaochaohu	32b9577b2a	refine the split op for API 2.0 test=develop (#25320 )	5 years ago
lilong12	ce506930c3	Fix the bug that Input(Offsets) and attr(offsets) cannot be set at the same time. (#24975 ) * bug fix, test=develop	5 years ago
tangwei12	2d9dbd31ad	Fix/mkl dnn (#25835 )	5 years ago
tangwei12	caa90a6510	Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) (#22957 ) * Integrated Trainer of Parameter Server	5 years ago
cc	42189be67b	[Quant] Remove the output for moving_average_abs_max_scale op (#25697 ) * Remove the output for moving_average_abs_max_scale op, test=develop	5 years ago
Chen Weihang	23d1228c4d	remove ProgramTranslator.save_inference_model (#25740 ) * remove ProgramTranslator.save_inference_model * adapt save_quantized_model * revert buffer check implementation * remove useless import function	5 years ago
Chen Weihang	1b3081b1b4	Simplify BufferedReader to improve DataLoader performance (#25648 ) * simplify buffered reader to improve DataLoader performance * fix 22 failed unittests * fix cuda pinned context condition * fix test_reader_reset failed * fix two failed unittests * change unittest place * polish error message * polish cast op GetExpectedKernelType * remove debug info in unittest	5 years ago
Zhou Wei	e0a9115e28	fix random compile failure due to missing file (#25661 )	5 years ago
Sylwester Fraczek	1aaa26f102	add dnnl sigmoid (logistic) activation (#25745 )	5 years ago
wangchaochaohu	1e4ab728fb	refine the concat Op for API 2.0 test=develop (#25307 )	5 years ago
Adam	98899b73d2	Fix FC + GRU fuse pass (#25687 )	5 years ago
Leo Chen	4ec1251a1e	Refine squeeze, test=develop (#25281 ) * refine squeeze, test=develop * update squeeze, test=develop * refine compile-time infershape, test=develop * add more unittest, test=develop * follow comments, test=develop * add update_api, test=develop * follow comments, test=develop	5 years ago
joanna.wozna.intel	e5bbffa84c	Add NOMINMAX define due to windows.h max/min macro conflict (#25637 ) test=develop	5 years ago
cnn	70cee22fde	New features, add sinh and cosh op, test=develop (#25495 ) * New features, add sinh and cosh op, test=develop * remove duplicate test function and remove out parameters, test=develop * Add out parameters temporary, remove later. test=develop * remove out args, PR 25570, test=develop * remove TestParameter, test=develop * add test api for static dygraph, test=develop * add backward unittests for sinh and cosh, test=develop	5 years ago
Zhang Ting	a1350744eb	register fp16 kernel, test=develop (#25630 )	5 years ago
mapingshuo	5453a912fe	add fp64 support in sequence_pool, test=develop (#25662 )	5 years ago
GaoWei8	6e86fd3750	fix concat dimension (#25606 ) Fix the condition of concat dimension judgment.	5 years ago
donproc	95fa383df2	optimize embedding cuda kernel lookup_table_v2,test=develop (#25587 )	5 years ago
石晓伟	7206417259	supports xpu runtime, test=develop (#25554 ) * update ResetHolder, test=develop * add TensorShare for lite engine, test=develop * tensor data changed from copying to sharing, test=develop * supports xpu runtime, test=develop * fix code styles, test=develop	5 years ago
Zhang Ting	30d1ff3bb4	call cublasGemmStridedBatchedEx when using fp16, test=develop (#25553 )	5 years ago
Aurelius84	ca1185d06b	[Dy2Stat] Fix scope in run_program_op (#25579 ) * add reinforcement learning model test=develop * align backward test=develop * add gym in paddle_build.sh test=develop * rm pip install in script test=develop * refine paddle_build.sh test=develop * fix sed error in macOS test=develop * polish code test=develop * fix scope problem * refine code by reviewer comment	5 years ago
Jacek Czaja	7dbc441eab	[oneDNN] cache cosmetics improvement (#25576 )	5 years ago
hong	e362095e45	fix softmax with cross entropy out of bound; test=develop (#25549 )	5 years ago
Huihuang Zheng	d8fe517bf8	Add Support for SelectedRows for Transpose OP and Fix a Bug That SelectedRows Cannot be Supported in SimNet (#25536 ) This PR fixes a bug that SelectedRows cannot be supported in SimNet. The reason of this bug is that dygraph basic_engine didn't copy var's type when the var needs to be accumulated during backward. So when a var is SelectedRows and needs to be accumulated, like SimNet which calls net for two times, the var's type will be changed to default LoDTensor thus bug happens. To fix it, we just also copy the type. Without this PR, the accumulated SelectedRows parameters in dygraph will be changed into LoDTensor. So when we fixed the bug of supporting SelectedRows in SimNet, we found `test_imperative_lod_tensor_to_selected_rows` failed and threw the error that SelectedRows was not supported for Transpose OP. To fix it, too, this PR also added support for SelectedRows for Transpose OP.	5 years ago
Wilber	848aca7ae8	[CI] [Lite-Subgraph] CI add lite subgraph check. (#25346 )	5 years ago
Shibo Tao	71c71e684c	fix logical_* ops' doc (#25479 ) * fix doc of logical_* op. * fix doc of op pow. * fix comment syntax error * fix operator reciprocal demo. * fix logical_* ops' doc. test=develop,test=document_fix * bug fix. test=develop,test=document_fix * bug fix. test=develop,test=document_fix * bug fix. test=develop,test=document_fix * bug fix. test=develop,test=document_fix	5 years ago
Aurelius84	4717bdbcfb	Fix hang in seq_topk_avg_pooling op (#25522 ) * fix topk_avg_pool hang test=develop * refactor get_topk_pos test=develop * add check of channel_num and num_k test=develop * add TopKPosPaddingId test=develop	5 years ago
LielinJiang	7129f544f0	Add bilateral_slice op (#25401 ) * add bilateral slice op	5 years ago
Zhang Ting	ca725c82f2	improve fp16 performance of slice_grad, test=develop (#25523 )	5 years ago
yaoxuefeng	5d3766ff3d	modify flip test=develop (#25312 ) According to the paddle 2.0 standard: 1. change flip api attr name 'dim' to 'axis'. 2. support empty axis. 3. change example code to imperative mode.	5 years ago
Chen Weihang	41d2247275	[Dy2static] Refactor ProgramTranslator save_inference_model API (#24989 ) * experimental refactoring, test=develop * add TranslatedLayer & remove StaticModelRunner, test=develop * revert tracedlayer change, test=develop * fix test_mnist unittest error, test=develop * add doc & examples, test=develop * polish doc details, test=develop * add imperative.jit module, test=develop * change TranslatedLayer pos, test=develop * adjust jit module import path, test=develop * polish doc based review result * add SaveLoadConfig.separate_params to save params separately * add Layer.buffer support, test=develop * polish doc details based review result, test=develop * polish details based review comments, test=develop * add empty str check for param, test=develop * add unittests, test=develop * polish details based review comment, test=develop * remove blanks in comment, test=develop * polish doc details, test=develop * update imperative doc link, test=develop * add api attr for load, test=develop	5 years ago
yaoxuefeng	aaa7cbd56f	modify trace api test=develop (#25397 )	5 years ago
Huihuang Zheng	f9ac5fb992	[Dy2stat] Fix Memory Optimization in run_program_op and Add SimNet as Unit Test (#25383 ) Add Similarity Net as unit test. During the unit test, we found three problems: 1. The run_program_op has memory optimization error when running dy2stat net multiple times. 2. The support for SelectedRows can cause problem in dy2stat. 3. The return grammar has problem. This PR fixes the 1. problem but modify codes for the 2. 3. problems to make PR smaller. I will fix those two problems in the next PR(s)	5 years ago
yaoxuefeng	c42d662e2a	modify roll test=develop (#25321 )	5 years ago
Zhen Wang	548cdbc544	Quantization-aware training for dygraph (#24634 ) * Add the imperative quantization aware training. * This is the python part of Imperative QAT. test=develop	5 years ago
Chen Weihang	0b54d54fd8	Fix index overflow bug of the CUDA kernel loop increment (#25435 ) * fix softmax_with_cross_entropy cuda kernel overflow bug, test=develop * replace old macro & for condition, test=develop * polish details, test=develop	5 years ago
zlsh80826	e528392de9	[Paddle-TRT] SkipLayernorm vectorized memory optimization (#25117 ) * add explicit specialization * add skiplayernorm vector load if available * test=develop	5 years ago
zhupengyang	5b573c58e2	randperm API: remove out, device, stop_gradient; add name (#25410 )	5 years ago
Zhen Wang	bb45af02ac	add the c++ part of Imperative QAT. test=develop (#25446 )	5 years ago
Jacek Czaja	050a9bf79d	[oneDNN] LRN cleanup (#25416 )	5 years ago
GaoWei8	1974aadcf0	fix concat shape error (#25414 ) * fix concat shape error test=develop	5 years ago
tangwei12	4b3778a3ee	Revert/barrier for sync (#25417 ) * add retry for prefetch * Revert "Fix/sync barrier (#25016)" This reverts commit `be6a315fbd`. * reopen dist UT, test=develop * remove fl UT, test=develop	5 years ago
ceci3	52be62c5ae	fix instance norm in dy (#24717 ) * fix bn & in in dy, test=develop * update instance_norm,test=develop * fix bugs,test=develop * add more case in unittest,test=develop * fix,test=develop * fix,test=develop	5 years ago
zhupengyang	eb3173e2b6	rand API: remove out, device, stop_gradient; add name (#25246 )	5 years ago
zhupengyang	6de75082cb	fix test_hsigmoid windows ci (#25311 )	5 years ago
WuHaobo	f593c3fb2f	fix the formula of floor OP and ceil OP (#25292 )	5 years ago
Zhang Ting	bc7610583b	use eval() to improve CPU performance (#25243 )	5 years ago
Kaipeng Deng	74468bf428	add mish op. (#24565 ) * add mish op. test=develop	5 years ago
Yang Zhang	6d6efafeeb	Add `matrix_nms_op` (#24400 ) * Add `matrix_nms_op` test=develop * Make ci happy test=develop * Exit early when no detection test=develop * Fix license year test=develop * Output index as well test=develop * Match nms2 lod behavior and add `return_index` flag test=develop * Make CI happy test=develop * Fix wording test=develop	5 years ago
Chengmo	e85fcaa712	Fix fluid.embedding in Distributed Training (#25174 ) * test=develop, fix_embedding	5 years ago
Yiqun Liu	c00f827843	Avoid data transforming ShapeTensor from CPU to GPU in fill_constant op. (#25267 )	5 years ago
123malin	f1a9593d69	test=develop, bug fix for index_select and roll op (#25251 )	5 years ago
FDInSky	c2e072587c	test=develop fix generate_proposals's error (#25227 )	5 years ago
Wilber	4c964abdf7	support build on arm. test=develop (#25212 )	5 years ago
liym27	1458cc0c68	Fix bug: Don't check dims if contain_unknown_dim of cross_entropy_grad_op in compile time (#25221 )	5 years ago
liu zhengxi	68e93d8a17	Fix beam_search InferShape (#25169 ) * fix beam_search infershape, test=develop * fix beam search op unittest, test=develop	5 years ago
Adam	bd0b38e671	Refactor of conv fp32 oneDNN operator (#25137 ) * Refactor of conv fp32 oneDNN operator test=develop * Formatting fix test=develop * Return Enforces test=develop * GetWeights improvements test=develop	5 years ago
Shibo Tao	19c4db1b56	don't re-generate header file if content doesn't change (#25130 ) * don't re-generate header file if content doesn't change. test=develop * add copy_if_different function. test=develop	5 years ago
Jacek Czaja	a7944904d3	[oneDNN]elementwise_add and elementwise_mul int8 support (#24984 ) * Start implementing int8 eltwise add test=develop * - Fix to Michal PR * - Fix test=develop * - Lint fixes test=develop * - Added checking if elementwise_mul can be used test=develop * - Added attribs to skip_attrs_set test=develop * - Improved broadcasting test=develop - fixes to compilation - fix - fix - Lint fixes test=develop * - removed redundant condition test=develop Co-authored-by: Michal Gallus <michal.gallus@intel.com>	5 years ago
Leo Chen	fa657b3dbb	fix bug of prelu when rank not equal 4, test=develop (#25067 ) * fix bug of prelu when rank not equal 4, test=develop * fix prelu inference, test=develop * fix api, test=develop * fix shape when mode is channel, test=develop * remove debug code, test=develop * add unittest, test=develop	5 years ago
zlsh80826	479c8834f7	[Paddle-TRT] Fixes #24731 , opt for SoftmaxKernelWithEltadd kernel, test=develop (#24834 ) * blockReduce opt * launch threads align to warpSize * reduce unnecessary shared memory for broadcast reduced value * vectorize SoftmaxKernelWithEltadd * add fp16 constrain * test=develop	5 years ago
Leo Chen	028de857d4	fix dtype error of compare op, test=develop (#25059 )	5 years ago
tianshuo78520a	770c11a117	fix make device_context error (#25045 ) * test=develop * test=develop * fix bug * test=develop * test=develop	5 years ago
tangwei12	be6a315fbd	Fix/sync barrier (#25016 ) * fix sync barrier with barrier monitor, test=develop	5 years ago
ceci3	8db66fc3f6	fix cos_sim, test=develop (#25017 )	5 years ago
Zhang Ting	621b638550	improve performance of instance_norm, test=develop (#25005 )	5 years ago
wangchaochaohu	613303dbf6	refine the slice Op to improve the performance of xlnet for fp16 training (#24967 )	5 years ago
Chen Weihang	d152d7231e	clear old var in scope, test=develop (#24976 )	5 years ago
wawltor	0eb1b0bc01	Add 5-D and 6-D tensor support for the reduce ops Add 5-D and 6-D tensor support for the reduce ops. Compile time was 22 minutes before the change and 21 minutes after.	5 years ago
mapingshuo	24e24987f0	fixes the place info in the Print op (#24934 ) fixes the CUDAPlace info in the Print op	5 years ago
Aurelius84	6be0ee159e	Support LoDTensorArray in reverse_op (#24797 ) * Support LoDTensorArray in reverse_op test=develop * polish en doc and unittest code test=develop * refine sample code test=develop * add example of LoDTensorArray test=develop * fix typo test=develop	5 years ago
Leo Chen	a7cb97a1a5	Fix/isfinite on windows (#24927 ) * refine isfinite, test=develop * use namespace std of isfinite, test=develop, test=win_gpu	5 years ago
whs	4c01d6d53e	Enhance checking in some operator. (#24473 )	5 years ago
lilong12	6e10022781	add queue_generator_op, dequeue_op, enqueue_op and ut (#24481 ) * add queue_generator_op, dequeue_op, enqueue_op and ut, test=develop	5 years ago
Leo Chen	1e818158f5	Feature/add amp_checkout_finite_and_scale op (#24875 ) * add amp_check_finite_and_scale op, test=develop * add cpu kernel, test=develop * use bool, test=develop * follow comments, test=develop	5 years ago
leesusu	a6beb96dd0	FTRL with sparse update, test=develop (#22092 )	5 years ago
Chen Weihang	d1062d5278	Replace all errors thrown by LOG(FATAL) with PADDLE_THROW (#24759 ) * remove REPLACE_ENFORCE_GLOG compile option & add ci rule prohibit LOG(FATAL) using, test=develop * remove ci test case, test=develop * replace all LOG(FATAL) & polish message, test=develop * fix typo, test=develop * polish error info detail, test=develop	5 years ago
Michał Gallus	23a85f030c	Remove old mkldnn_elementwise_mul test (#24855 ) test=develop	5 years ago
Leo Chen	b67ded04f2	Support gradient accumulation of fp16 in imperative mode (#24823 ) * support gradient accumulation of fp16 in imperative mode, test=develop * enhance coverage test, test=develop * follow comments, test=develop	5 years ago
Qi Li	704cad6a66	Add histc op (#24562 ) * add histc operator, test=develop * update english doc to 2.0 API, test=develop * update API from histc to histogram, test=develop Co-authored-by: root <root@yq01-gpu-255-129-15-00.epc.baidu.com>	5 years ago
Yi Liu	12bffdc086	Enhance error message of checkpoint_notify_op, fake_init_op gen_nccl_id_op and listen_and_serv_op (#24554 ) test=develop	5 years ago
Adam	b490e41c1d	Add isCached() mechanism for BatchNorm and LRN oneDNN operators (#24798 ) * Add isCached() mechanism for BatchNorm and LRN oneDNN operators test=develop * Formatting fix test=develop	5 years ago
Aurelius84	a7e21cbed3	Move input_size check into RunTime phase of gru_unit_op and refine error message (#24776 ) * Add IsRuntime judgement in GRUUnit test=develop * add IsRuntime judgement in GradOp test=develop * Refine Error Message of SelecteInput/Output test=develop * refine Error Message of RNNMemoryHelperOp test=develop	5 years ago
Zhou Wei	f66594a558	fix bug that diag API can't use on Windows(#24762 )	5 years ago
Leo Chen	c0911fdd32	rename inplace/no_need_buffer inferer, part4, test=develop (#24781 )	5 years ago
Chen Weihang	be82de4c79	polish two error message, test=develop (#24778 )	5 years ago
Leo Chen	b0e7439fbc	rename inplace/no_need_buffer inferer, part2, test=develop (#24733 )	5 years ago
Leo Chen	a6fbba65ff	rename inplace/no_need_buffer inferer, part3, test=develop (#24734 )	5 years ago
wangchaochaohu	355caee18b	fix conv_transpose Op fp16 error test=develop (#24695 )	5 years ago
Adam	56a714a19b	Add isCached() mechanism to oneDNN pooling primitive (#24724 )	5 years ago

5691 Commits (8f2656ef5ca4ab16f06d94b8ca9392d3f0f760ae)