Paddle

Commit Graph

Author	SHA1	Message	Date
Zhaolong Xing	430b0099c9	[Paddle-TRT]: Ernie Dynamic shape support. (#23138 ) * add dynamic plugin support. test=develop * change emb eltwise layernorm to math function test=develop * add emb eltwise layernorm test=develop * can run dynamic shape ernie test=develop * fix ci test=develop * add ut for trt ernie dynamic test=develop * refine dynamic shape c++ interface. test=develop * fix comments test=develop * fix comments test=develop	5 years ago
liym27	6af480ca33	Support int64 for op assign_value. test=develop (#23179 )	5 years ago
Zeng Jinle	53e6f8e1da	rename macro, test=develop (#23161 )	5 years ago
Zeng Jinle	bba740710d	add cuda resource pool for BufferedReader, test=develop (#23152 )	5 years ago
Zeng Jinle	7d8d50b6cc	rename no_need_buffer_vars macro, test=develop (#23160 )	5 years ago
Liufang Sang	a486a739e1	fix compile error in win gpu (#23196 ) * fix compile error in win gpu test=develop * fix compile error in win gpu test=develop * fix compile error in win gpu test=develop	5 years ago
Zeng Jinle	7ca77a90ac	add Tensor::IsSharedBufferWith method, test=develop (#23175 )	5 years ago
Zeng Jinle	b8886bf122	rename no_need_buffer_vars_macro, test=develop (#23159 )	5 years ago
wangchaochaohu	b721e23b25	transpose cudnn using cudnn v7 api (#19738 ) * refine the transpose conv using v7 to choose algorithm	5 years ago
Adam	4f5e4540f8	Improve SGD jit code to work with large data (#23120 )	5 years ago
Liufang Sang	4db031902d	add dequantize_log_op and make pyramid hash support int8 weight (#22548 ) * add dequantize_log_op and make pyramid hash support int8 weight test=develop * add unittest and update pyramid hash op test=develop * remove paddle_enforce test=develop * fix error message test=develop * remove incorrect commit test=develop * fix error message in log_dequantize test=develop * change 2019 to 2020 test=develop * remove useless check_grad test=develop	5 years ago
Zeng Jinle	9258e96094	fix read op comments, test=develop, test=document_fix (#23122 )	5 years ago
Zeng Jinle	acfc9b8a70	Reader sequential and inference partial feed (#22699 ) * sequential reader stage 1, test=develop * fix ut, test=develop * fix iterable=False reset bug, add some logs and polish code, test=develop * inference feed partial data, test=develop * Turn on keep_order=True for test, test=develop * enhance ut to test more cases, test=develop * test commit for reverting * Revert "test commit for reverting", test=develop This reverts commit 80aef42ef52ba1ee79627d6f663a624ec4f12f58. * add ut of merged and unmerged results, test=develop * add more uts for coverages and add en doc of api, test=develop * follow comments, test=develop * change note style, test=develop	5 years ago
Wilber	95b356a069	update embedding_eltwise_layernorm fuse and kernel. test=develop (#23114 ) update embedding_eltwise_layernorm fuse pass and fused kernel, to support multi input	5 years ago
Zeng Jinle	a31d7328b7	Add dygraph double grad implementation (#22939 ) * add double grad implementation for dygraph, test=develop * polish code, add uts, test=develop * fix place bug, test=develop * polish codes, add more uts for coverages, test=develop * add no_grad_set, test=develop * add star gan ut, test=develop * follow comments, test=develop	5 years ago
songyouwei	2e2da7124b	high-performance dygraph slice (#22879 ) * move __getitem__ to cpp * bug fix * add type check and gil release * support negative step with omitted ends test=develop * code refine test=develop * bug fix test=develop * slice always return different pyobj test=develop	5 years ago
Sylwester Fraczek	abee05a8c8	added mkldnn swish activation (#23041 )	5 years ago
Zhaolong Xing	8c6fde9e69	fix align error (#23090 ) test=develop	5 years ago
Liufang Sang	915b892a15	Fix div zero in fake quantize op (#22966 ) * fix div zero test=develop * fix div zero test=develop * add hostdevice function test=develop * add eps when is zero test=develop	5 years ago
Feiyu Chan	01ab8a0619	add approximation for gelu, test=develop (#22961 ) add approximation for gelu, default value is False (only kernel with eigen is added, remove code for computing gelu with MKLDNN temporarily)	5 years ago
Adam	5842ae6785	Revert "Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695 )" (#22985 )	5 years ago
GaoWei8	1dc1f9270e	Fix lod error of concat op for axis = 0 (#22538 )	5 years ago
Zhang Ting	714b0076b6	Override GetKernelTypeForVar to avoid device transform, test=develop (#23032 )	5 years ago
wangchaochaohu	112e3edbf6	fix the conv group problem test=develop (#23025 )	5 years ago
wangchaochaohu	3757e0687c	Add Unittest for backward of fusion group (#22932 ) * add fusion group test for backward and refine code	5 years ago
chengjuntao	63f3ada7b9	fix bug which input shape (#22965 ) * fix bug which input shape, test=develop * add error type,test=develop	5 years ago
wangchaochaohu	f0d193a23c	Cast fusion for fusion group (#22876 ) * add support for expression type convert and add cast Op support in fusion group	5 years ago
yaoxuefeng	29a7a52d38	Fix instag (#22632 ) * update * update test=develop * update compile set test=develop * update compile set test=develop * update test=develop * update test=develop * update test=develop * update compile setting test=develop * update compile setting test=develop * update run demo test=develop * update test=develop * update test=develop * fix test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update format test=develop * update format test=develop * update style test=develop * update style test=develop * change style test=develop * change style test=develop * change style test=develop * add dataset unittest test=develop * update test=develop * update for record test=develop * udpate style for record test=develop * update for record test=develop * update for record test=develop * update for record test=develop * fix format test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * fix compile warning test=develop * add attr default test=develop * add unittest test=develop * fix style test=develop * fix style test=develop * change out_val_ifempty to out_val_if_empty test=develop	5 years ago
wawltor	f154d5860f	Speed up the matmul op, use the gemm replace the batch gemm (#22926 ) In the op of gemm, we use the gemm to replace batch gemm, speed up the matmul op	5 years ago
Adam	056edf3929	Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695 )	5 years ago
Zhaolong Xing	8d6dc102fe	[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494 ) * 1. add embedding eltwise layernorm fuse 2. add embedding eltwise layernorm op 3. refine inplace_add_relu 4. refine fc_eltwise_layernorm test=develop * 1. refine fc test=develop * fix comments test=develop * fix comments test=develop	5 years ago
guofei	3d8571e884	modify assign op and add unittest of assign op (#22769 ) As the title.	5 years ago
Zeng Jinle	d33c4343e1	Imperative tracer refactoring (#22457 ) * refine grad maker, test=develop * refactor tracer stage 1, test=develop * merge develop to solve conflict third times, test=develop	5 years ago
tangwei12	ad9c8f6d2d	fix communicator when breaking under PyReader mode (#22911 ) * fix communicator when breaking under PyReader mode, test=develop * revert some vlog level to 0, test=develop	5 years ago
mapingshuo	5ba9dfc16a	add lookup_table_dequant_op (#22900 ) add lookup_table_dequant_op	5 years ago
Zhaolong Xing	dd67d44a50	[Paddle-TRT] : (Part1) Dynamic shape support (#22868 ) * change the ci trt from version 5. to 6.0 * paddle-trt dynamic shape support init * conv+bias or conv+bn dynamic shape support test=develop * modify trt engine opconvert test=develop * fix ci error test=develop	5 years ago
tangwei12	07e13b84cd	remove vlog, test=develop (#22898 )	5 years ago
Wilber	f686310d81	fix concat_mkldnn op. test=develop (#22692 ) fix concat_mkldnn op when encountering extreme conditions.	5 years ago
Zhaolong Xing	1a533ed2de	[BUG]: Multihead matmul op's output size should be BxSx(N*H) (#22848 ) test=develop	5 years ago
石晓伟	ddb9b46fec	change the function in op_teller, test=develop (#22794 ) * change the function in op_teller, test=develop * correct the commit-id, test=develop	5 years ago
tianshuo78520a	433cef03e5	fix typo word (#22784 )	5 years ago
Kaipeng Deng	ebc7ffc300	fix detection_map. test=develop (#22705 )	5 years ago
zhaoyuchen2018	72dde4abde	Refine adam op to improve performance, test=develop (#22346 ) * Refine adam op, test=develop * Fuse kernels together to reduce cpu time. * Refine paddle enforce, test=develop * Remove some comments, test=develop * Refine code,test=develop * Refine cuda kernel, test=develop * Refine code according to comments, test=develop	5 years ago
wangguanzhong	f2d1cd119a	fix lod level, test=develop (#22755 )	5 years ago
FlyingQianMM	79d712346f	Correct CPU gradients of the argsort op (#22739 ) * Correct CPU gradients of the argsort op, form a network to test its forward and backward process, test=develop * fix dynamic threshold error in test_argsort_op, test=develop	5 years ago
guofei	ae8b5f11a3	Change ShareDataWith() to TensorCopy() in ref_by_trainer_id (#22717 ) As the title	5 years ago
chengjuntao	15c2667143	register fp16 for assign op (#22744 ) * register fp16 for assign op, test=develop * add op test for fp16, test=develop	5 years ago
dyning	1c0653462d	fix generate_mask_labels lod level (#22743 )	5 years ago
GaoWei8	ba140222d6	fix compile&runtime lod_equality of lod_reset (#22737 )	5 years ago
ShenLiang	3132681e8a	add partial_sum op in contrib (#22292 ) * add partial_sum_op, test=develop * modify the Paddle Error Message, test=develop * modify the Paddle Error Message, test=develop * modify the bug for python3, test=develop * modify the ut for ci, test=develop * mv to contrib, test=develop * use check_variable_and_dtype, test=develop * fix ci, test=develop * fix conflict, test=dvelop * add partial concat, test=develop * fix the conflict, test=develop * fix the error, test=develop * rm SSE4, test=develop	5 years ago
ShenLiang	e136661304	add partial_concat op in contrib (#22528 ) * add partial_concat, test=develop * fix the grids and blocks, test=develop * fix the Paddle_Enforce, test=develop * fix the doc of op, test=develop * fix the doc, test=develop * fix the doc of the op, test=develop * replace -1 with None, test=develop	5 years ago
tianshuo78520a	d2ba91aad1	fix typo words (#22653 )	5 years ago
Yibing Liu	6e7bfe30a6	register fp16 kernel for some ops (#22650 ) (#22696 ) test=develop	5 years ago
tangwei12	66a3150135	SYNC with communicator (#22344 ) * add sync communicator and implement	5 years ago
Yiqun Liu	22bbd54719	Add the support of fp16 in fusion_group (#22239 )	5 years ago
Huihuang Zheng	adfa5b8354	Add PADDLE_ENFORCE to Check Sequence Length of RecurrentOp (#22673 ) 1. Add PADDLE_ENFORCE to Check Sequence Length of RecurrentOp. 2. Also enrich PADDLE_ENFORCE error messages.	5 years ago
lidanqing	d926214535	[UT coverage] improve the mul_mkldnn_op line coverage (#22408 ) * improve the mul_mkldnn_op line coverage test=develop * remove fp32 mul mkldnn kernel test=develop * locally refactoring test=develop * change according to reviews test=develop	5 years ago
Zhaolong Xing	a06d75a280	[Paddle-TRT] Refine the error log about runtime batch and max_batch_size. (#22535 ) * fix trt log test=develop * fix comments test=develop	5 years ago
Adam	608447bfd5	Update MKLDNN to v1.2 (#22521 )	5 years ago
Adam	ab610a34ff	transpose_mkldnn code change to meet Paddle standards (#22591 )	5 years ago
Jiawei Wang	8f035fb637	Add TopK Op Grad CPU&GPU Kernel test=develop (#22628 ) * Add TopK Op Grad CPU&GPU Kernel test=develop * Add TopK Op Grad, modify grad op maker test=develop * Add TopK Op Grad, modify grad op maker test=develop * Add TopK Op Grad, modify PADDLE_ENFORCE test=develop * Add TopK Op Grad, modify PADDLE_THROW test=develop * Add TopK Op Grad, modify unittest test=develop * fix ngraph top k op unittest test=develop	5 years ago
Steffy-zxf	90ee366653	update ops's unittest data type from float32 to float64 and shape over 100 (#22544 ) * update ops's unittest of elementwise_pow, elementwise_max, elementwise_min, scale and sqrt 1. update elementwise_pow, elementwise_max and scale's unittests with input data type (float32 -> float64) 2. fix bug that the elementwise_pow doesn't meet threshold requirements with tackling float64 data 3. remove sqrt from op_accuracy_white_list.py 4. update the unittests of elementwise_pow, elementwise_max and elementwise_min ops that their input data shape over 100 5. test=develop * modify the writing style according to suggestions test=develop	5 years ago
Zhaolong Xing	8acd745c25	[Ernie GPU Optim]: Fuse three fc to multihead matmul (#22486 ) * 1. optim multihead matmul: fuse three fc to multihead matmul test=develop * fix conflict test=develop * fix comments test=develop	5 years ago
Guo Sheng	31b5464632	Add support for dynamic_decode(while) training. (#22231 ) * Add support for dynamic_decode(while) training. test=develop * Fix assign_op and tensor_array_read_write_op after solving conflict. test=develop * Fix test_rnn_decode_api.py. test=develop * Refine docs for apis in rnn.py. test=develop * Adjust outputs of dynamic_decode. test=develop * Remove the force_cpu update in assign_op. test=develop * Remove the force_cpu update in assign_op. test=develop * Make RNNCell.get_initial_states support batch_dim_idx argument. test=develop * Rename _create_array_outof_while as _create_array_out_of_while in rnn.py. test=develop	5 years ago
Wojciech Uss	4cddb43c5c	Add support for Ernie NLP model to the Slim QAT (#22506 ) * a test for Ernie QAT INT8 accuracy check test=develop * Remove NLP comparison test to split PRs test=develop * Fix typo and tabs, delete commented lines test=develop * re-combine the 2 PRs, test=develop Co-authored-by: Michał Gallus <sand3r@interia.eu> Co-authored-by: bingyanghuang <33643817+bingyanghuang@users.noreply.github.com>	5 years ago
Double_V	58d99247f4	support slice double grad, test=develop (#22166 ) * support slice double grad, test=develop * merge two doublegradopmaker to one doublegradopmaker,test=develop * change the shape of slice_OP's unittest, test=develop	5 years ago
hutuxian	1a7962be97	Paddlebox about box_wrapper (#22497 ) Refine PaddleBox Framework, Main functions: * Add MetricMsg util class, which can calculate metrics like AUC, bucket_error, COPC. * Replace FeedPass with new interface: BeginFeedPass & EndFeedPass * Refactor Pull/Push Sparse Function in box_wrapper. * Use CUDA Kernel to copy keys and copy feasign between tensor and boxps struct. * Cache copied keys in pull sparse in order to reuse it in push period.	5 years ago
huzhiqiang	9e29d3ebed	【OpPorting Example】DEMO OF FIX COMPILE&RUNTIME LOD_EQUALITY (#22460 )	5 years ago
zhaoyuchen2018	54970444ce	Improve transpose performance with tile sm copy, test=develop (#22311 ) * Refine code, fix select tile error,test=develop * Refine element type and some comments, test=develop * Refine comments and gpu utils, test=develop * Remove some useless condition * Refine floor and ceil, test=develop * refine for loop. test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
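Wilber	a90fa54092	Compile without nccl deps. [1/2] (#22509 ) Support compiling without the nccl dependency. [1/2] In a multi-card setup, if Paddle is compiled without the WITH_NCCL switch enabled, the cards cannot communicate, so only a single card can be used. Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	5 years ago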
Wilber	de009152a7	Compile without nccl deps. [2/2] (#22484 ) Compile without nccl deps. [1/2] Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	5 years ago
Yiqun Liu	4b2227e958	Fix mismatch of std::max's arguments type on windows. (#22507 ) test=develop	5 years ago
Wilber	870f465887	fix test_fusion_seqpool_concat lod level between compile and runtime (#22488 )	5 years ago
Zhong Hui	a61d09527b	Fix the integer overflow problem of sequence2batch (#22479 ) Fix the integer overflow problem in the op of sequence2batch, changing int32_t to size_t in /paddle/fluid/operators/math/sequence2batch.h#L122.	5 years ago
cc	197913ebe1	Add weight quantization in post_training_quanzitaion (#22445 ) * support weight quantization in post_training_quanzitaion, test=develop * add test for weight quantization, test=develop	5 years ago
Tao Luo	7c9ce097f1	refine reshape_op shape error message (#22480 ) test=develop	5 years ago
LielinJiang	2b1386b2b2	optimize performance of interpolate op (#22436 ) * optimize interpolate op, test=develop	5 years ago
Yiqun Liu	44b45b9f07	Correct the use of DeviceContext in unittest sequence_pooling_test and sequence_padding_test (#22456 ) * Add log in memory::Copy for debug purpose. * Change to use context in DeviceContextPool directly in sequence_pooling_test, instead to new one. * Change to use context in DeviceContextPool directly in sequence_padding_test, instead to new one. test=develop * Change the type of second_dim from size_t to int64_t. test=develop	5 years ago
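Wilber	7bc4b09500	add WITH_NCCL option for cmake. (#22384 ) Added a WITH_NCCL cmake option that explicitly specifies whether to compile the NCCL-related code. WITH_NCCL is ON by default, but it is turned off if WITH_GPU is OFF. Added the PADDLE_WITH_NCCL definition. Single-machine, single-card setups can disable NCCL compilation; multi-card setups need NCCL enabled (the default), and if NCCL is disabled only a single card can be used. Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	5 years ago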
Tao Luo	943cb8c664	fix sigmoid cudnn bug (#22439 ) * Sigmoid bug fix, test=develop * fix code format test=develop Co-authored-by: Manjunath Bhat <manjunathbhat9920@gmail.com>	5 years ago
石晓伟	e1b0d7cbb1	remove anakin from code, test=develop (#22420 )	5 years ago
liu zhengxi	0404e7a985	Update the precision of pad, pad2d, pad_constant_like's unit tests from fp32 to fp64 (#22394 ) * update the ut precision of pad pad2d pad_constant_like from fp32 to fp64, test=develop	5 years ago
Michał Gallus	269db0d1d1	[DNNL] Fix accuracy in INT8 FC (#22404 ) * Enable quantize to reorder to nchw as well * Correct FC MKL-DNN input dim requirements to accept 3D * Improve DNNL FC format, error and 3D input handling test=develop * Improve error checking in FC test=develop * Improve PADDLE_ENFORCE messages in fc-related files * Remove data layout attribute from obligatory pass args test=develop * Fix message in fc_mkldnn_pass to be logically correct test=develop	5 years ago
joanna.wozna.intel	fb3086fd57	[UT coverage]Remove unnecessary transpose op registration (#22402 )	5 years ago
lidanqing	ade5022681	[UT Coverage]Improve sum_mkldnn_op line coverage (#22275 )	5 years ago
Wojciech Uss	92462e948d	improve elementwise_add_mkldnn_op test code coverage (#22359 )	5 years ago
ceci3	20f30dd604	add benchmark flag for conv_transpose (#22389 )	5 years ago
Chengmo	8f36c39537	Fix GEO-SGD init & send Bug (#22375 ) * test=develop, fix geo Send & Init	5 years ago
zhupengyang	c6f888e5a5	update unittest accuracy to float64 for relu, prelu, maxout (#22273 )	5 years ago
wangchaochaohu	0d8b222b79	Optimize the depthwise op test=develop (#22265 )	5 years ago
qingqing01	2d20869c94	Fix infer_shape in compiling for elementwise_op (#22291 )	5 years ago
tangwei12	82bc814a57	integrated HALF_ASYNC to communicator (#21869 ) * add half_async in the communicator * fix DistributedStrategy	5 years ago
wangchaochaohu	1e932eccfa	remove unused code test=develop (#22327 )	5 years ago
Leo Chen	3e5744aa65	Remove unused inputs for some operators (#22284 ) * remove unused inputs, test=develop * remove unused inputs, test=develop * update dtype, test=develop * remove unused inputs, test=develop * update op_use_default_grad_op_maker, tese=develop * resolve conflicts, test=develop * follow comments, test=develop * update center_loss_grad, test=develop	5 years ago
zhangchunle	805328e13b	fix typo in error message (#22312 )	5 years ago
lidanqing	895f8da7d6	change std::cout to log(INFO), vlog (#22316 )	5 years ago
Bai Yifan	faba4b116a	Remove disable flag in test_fsp_op.py (#22171 ) * fix fsp_op, test=develop * fix fsp grad op maker, test=develop * update op_use_default_grad_op_maker.spec, test=develop	5 years ago
Adam	9942d9ed5c	Add caching mechanism to requantize_mkldnn_op (#22223 )	5 years ago
123malin	985bceac53	Bug fix for sparse recorder (#21969 ) * test=develop, bug fix for sparse recorder	5 years ago
FlyingQianMM	443a713c9e	add backward gradient computation for op argsort (#22203 ) * add backward gradient computation for op argsort test=developo * use pre-commit test=develop	5 years ago
Zhen Wang	46189b166d	Add bn and relu fuse pass (#22048 ) * add bn and relu fuse pass * add op attr assert and dtype assert * fix some inputs&&outputs bugs for the fused op and pattern. * add the unittest for fuse_bn_act_pass. test=develop * use normative enforce statements. test=develop * add the cpu test. test=develop * add the support of batch_size=1 for the bn with relu op. test=develop * add the error type for paddle throws. test=develop * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop	5 years ago
baojun	298ee7d28a	Improve ngraph file line coverage (#22155 )	5 years ago
zhongpu	d0f0a2520c	test Optimizer in dygraph (#21949 ) * test Optimizer in dygraph, test=develop * add optest for Optimizer in dygraph, test=develop * fix adagrad optimizer, test=develop * fix dpsgd optimizer, test=develop * fix test_optimizer.py, test=develop * fix dpsgd optimizer, this op only support cpu, test=develop * add optest for optimizer, test=develop * add description for dpsgd, test=develop * add rmsprop to white_list in unused_var_check.cc, test=develop * polish code style, test=develop * polish code style, test=develop * delete seed attribute for DpsgdOptimizer, test=develop * change testing to debugging, test=develop	5 years ago
石晓伟	ad0dfb17c1	[Feature] Lite subgraph (#22114 )	5 years ago
zhaoyuchen2018	3d4f2aa689	Refine stack op to improve xlnet performance, test=develop (#22142 ) stack's wait cost a lot of cpu time, use cuda kernel to do memory copy will reduce cpu time. Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
liu zhengxi	64a4044292	add double register op_data_type of pad2d and fix compile error, test=develop (#22075 )	5 years ago
Double_V	6ea3809143	Support prroi_pool_op with Tensor and LoDTensor rois (#20649 ) 1. Add a new input named batch_roi_nums for prroi_pool_op. batch_roi_nums includes the number of roi for each image in batch when rois is Tensor. This information is saved in rois's lod when rois is LoDTensor. 2. add grad check to prroi_pool_op and solve abnormal X grad diff in CPU.	5 years ago
zhaoyuchen2018	3dbd4087fe	Fix windows build not kernel issue, test=develop (#22105 ) windows conv_fusion failed as no kernel, explicit declare lambda Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
Chengmo	418abc92f4	Update pyramid related OP (#21372 ) * add special way to add distribute vars, Update Pyramid hash op	5 years ago
Feiyu Chan	14aebc7a95	add erf op (#21785 ) * add erf op and python interface. * add fp16 support for erf op. * add unittests for erf op and its python interface.	5 years ago
Chen Weihang	ba8414d3a5	replace CUDNN_ENFORCE with PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#22109 )	5 years ago
Double_V	fab4b0765a	support elu_op double grad (#21822 ) * support elu activation double grad,test=develop * delete the code commit in .cc,test=develop * fix relu test unpass, test=develop * add elu double grad kernel and unit test * add calculate dX in elu double grad functor, test=develop * update the commit code,test=develop	5 years ago
Pei Yang	0a51098a71	Add TRT support for BERT (#21135 ) * add gelu plugin * align trt bert with gpu * add support for fused fc with relu, * add unittest for bert trt	5 years ago
Jacek Czaja	b0b27ff699	[MKL-DNN] Conv grad and Batch Norm grad NHWC support (#22088 )	5 years ago
123malin	7fb817d447	add distributed_strategy (#21710 ) * add distributed_strategy	5 years ago
Jacek Czaja	ad8a9cb82c	[MKL-DNN] Pool & LRN Grad Ops NHWC support (#21747 )	5 years ago
Kaipeng Deng	34c57120eb	polish cross_entropy ENFORCE (#22056 )	5 years ago
SunAhong1993	7f4abaf2f5	register int/int64_t/float16 in pow/square kernel,test=develop (#22023 ) * register int/int64_t/float16 in pow/square kernel,test=develop * add abs/square/exp type,test=develop	5 years ago
Leo Chen	3f653c8323	register NoNeedBufferVarsInference for max_pool_grad_op, test=develop (#22055 ) * fix test_conv2d_ngraph for grad diff, test=develop * register NoNeedBufferVarsInference for max_pool_grad_op, test=develop * refine error message, test=develop * fix numpy, test=develop * disable test conv2d_ngraph_op, test=develop Co-authored-by: Zhang Ting <709968123@qq.com>	5 years ago
Yiqun Liu	d48320777e	Add the first implementation of fusion_group op (#19621 ) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop * Add DeviceCodePool to manage all device codes. * Add the first implementation fusion_group op. * Add unit-test for fusion_group op. * Add the check of result. * Add the check of nvrtc in unit-test. test=develop * Add comment to explain the inputs, outputs and features of fusion_group op. test=develop * Disable fusion_group op for mac and windows. test=develop * Make the compiling of device code return status instead of hanging up. test=develop * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API. * Unify fusion_group_op's input and output names. test=develop * Add the check of CUDA driver library in unittest. test=develop * Refine the calling of PADDLE_ENFORCE. test=develop	5 years ago
Michał Gallus	6192108408	[DNNL] 3D Fully-Connected (#21746 )	5 years ago
FDInSky	aa2ed0dcc6	fix generate_proposal_labesl op (#21793 ) * test=develop fix generate_proposal_labesl op	5 years ago
ceci3	95d79b6d00	update error log for batch_norm_grad (#22017 ) * update error information about batch_norm_grad * update bn,test=develop	5 years ago
Aurelius84	c53b62eb8e	fix integer overflow in match_matrix (#22036 ) * fix integer overflow in match_matrix test=develop * fix integer overflow in match_matrix test=develop * fix typo test=develop	5 years ago
wangchaochaohu	64baee4144	polish code test=develop (#22014 )	5 years ago
danleifeng	b7697f6218	fix broadcast bug;test=develop (#21898 )	5 years ago
zhaoyuchen2018	8859ddd6cf	Refine multihead kernel, align block to 32 (#21961 ) * Refine multihead kernel, align block to 32 test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine log comments test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
zhoushiyu	cee2ccb078	add shuffle batch op (#21674 ) * add shuffle batch op, test=develop, test=document_preview * fix size_t conflict and check_output test=develop, test=document_preview * fix bug test=develop, test=document_preview * add unittest of shuffle_batch layer test=develop, test=document_preview * fix py coverage and op input type, test=develop, test=document_preview * fix py coverage, test=develop * fix en doc, test=develop * move to contrib test=develop * add unique_name test=develop * invoke shuffle_batch in contrib.layers test=develop	5 years ago
mapingshuo	c3e1954918	make reverse op support negative axis (#21925 ) * make reverse op support negative axis	5 years ago
Aurelius84	10d6846900	Remove double registered dataType in Pad2d (#21942 ) * fix compile error in CUDA10 test=develop * remove double in pad2d test=develop	5 years ago
hutuxian	27decacb8a	fix aucop stat shape (#21846 ) * fix stat shape back in global auc scenario * add UT to cover global auc	5 years ago
Aurelius84	5cb2c74127	add register op_data_type of pad/expand_as et.al (#21718 ) * add register op_data_type test=develop * fix register bug in isfinite op test=develop * rm int int64_t in pad2d gradKernel test=develop	5 years ago
hong	30d000f8c2	fix matmul error message; test=develop (#21885 )	5 years ago
zhouwei25	a01663ca1f	remove patch command and file of cares to improve quality of Paddle Repo (#21776 )	5 years ago
Aurelius84	51a86d2b6b	Optimize adam speed (#21777 ) * optimize adam speed by removing _finish_update test=develop * fix SparseAdamFunctor param list test=develop * Remove scale_op in expect_list of adam_op test=develop * fix test optimizer loss assert error test=develop * fix test optimizer loss assert error test=develop * modify PADDLE_ENFORCE usage test=develop * fix op_type in lamb_op.cc test=develop * fix errors ostream format bug test=develop * add betaPowOut in ngraph op test=develop * fix ngraph::op api for gcc8 test=develop * clean code test=develop * modify struct into class test=develop * remove code of beta1Tensor in lamb_op test=develop	5 years ago
FDInSky	6b9fbcf3ad	Update iou_similarity op to support non-normalized bbox (#21671 ) Update iou_similarity op to support non-normalized bbox	5 years ago
guofei	46f9184aff	Modify the while_loop API (#21844 )	5 years ago
Guo Sheng	7689b6aaa4	Fix default label dim of label_smooth_op. test=develop (#21862 )	5 years ago
GaoWei8	d4dda8628e	optimize fc jit (#21878 ) test=develop	5 years ago
Chen Weihang	2b941736f3	fix softmax_with_cross_entropy_fix bug, test=develop (#21810 )	5 years ago
Chengmo	a86f11b5f5	Speed GEO dense calc & communication (#21579 ) * test=develop, speed dense calc & communication	5 years ago
Wojciech Uss	666c3bb9b0	handle multi-inputs with empty inputs for mkldnn_concat_op (#21827 ) test=develop	5 years ago
guofei	8b7c50f49a	Make While Op could run on GPU place and add while_loop unittest (#21672 ) 1. Make while_op accept GPU conditional data 2. Add more complex test cases for while_loop API	5 years ago
WangXi	17299b8d21	fix batch_norm_grad infer shape=0 & add allreduce enforce shape, test=develop (#21801 )	5 years ago
Huihuang Zheng	0677a1c1c1	Fix That conditional_block_op Doesn't Have InferShape (#21733 )	5 years ago
zhaoyuchen2018	a5a8d14414	Fix softmax cuda bug (#21720 ) * Fix softmax cuda bug * Refine multihead log and softmax logic	6 years ago
Kaipeng Deng	943a44492b	yolo_box OP add Attr(clip_bbox). (#21620 ) * yolo_box OP add Attr(clip_bbox). test=develop	6 years ago
Leo Chen	7181afd75c	Fix elementwise_pow bug on CUDA place with integer (#21675 ) * fix elementwise_pow bug on integer, test=develop * use llrint to support elementwise_pow_grad, test=develop * add some tests, test=develop * revert grad functor, test=develop	6 years ago
Chen Weihang	1fd1f06f11	Rename paddle throw error macro (#21657 ) * rename paddle throw error macro, test=develop * fix new error use case, test=develop	6 years ago
joanna.wozna.intel	d419b859c0	Add reshape int8 mkldnn op (#21428 ) * Add reshape int8 op test=develop * Change test to CPUPlace test=develop * Correct tests test=develop	6 years ago
tangwei12	9ad940fdfe	memory leak for cpu (#21174 ) * add fake init for the trainer, fix large memory hold in the trainer * do not merge recv vars from a remote endpoint, test=develop * add recv and save op, merge slice var in one op, save memory * remove hsigmoid with pull sparse, test=develop	6 years ago
GaoWei8	5af0c7ba89	Modify padding strategy: remove weight copy in fc padding (#21650 ) test=develop	6 years ago
wangchaochaohu	5eec8cf5af	fix the mean grad OP performance improvement test=develop (#21658 )	6 years ago
Zeng Jinle	29f64c8c9e	refine some grad op makers, test=develop (#21629 )	6 years ago
mapingshuo	e2d849b989	Dropout with seed (#21590 ) * add seed op	6 years ago
Adam	e81f0228df	MKL-DNN 1.0 Update (#20162 ) * MKLDNN v1.0 rebase to Paddle 1.6 test=develop * Add hacky paddle::string::to_string() implementation * vectorize<int64-t>() -> vectorize() cleanup test=develop * PADDLE_ENFORCE and void_cast fixes test=develop * Rebase changes test=develop * Cosmetics test=develop * Delete MKL from mkldnn.cmake test=develop * CMake debug commands test=develop * Delete MKLDNN_VERBOSE and rebase fixes test=develop * Rebase fixes test=develop * Temporarily disable int8 resnet101 vgg16 and vgg19 tests test=develop * Add libmkldnn.so.1 to python setup test=develop * Add libmkldnn.so.1 to inference_lib cmake after rebase test=develop * Post rebase fixes + FC int8 changes test=develop * Fix LRN NHWC test=develop * Fix NHWC conv3d test=develop * Windows build fix + next conv3d fix test=develop * Fix conv2d on AVX2 machines test=develop	6 years ago
wangchaochaohu	95b95a284b	Mean gpu optimize (#21643 ) * accelerate mean op test=develop	6 years ago
Zeng Jinle	0f8888360e	Polish op registry codes (#21561 ) * polish infer shape registry, test=develop * modify some operators registry, test=develop	6 years ago
Aurelius84	3d9dee575e	Set lod_level of Out in compile time of sequence_pool_op (#21604 )	6 years ago
Huihuang Zheng	1dcf6a7212	Add Much Complex Test and Fix Bugs for Control Flow cond API (#21532 ) Add tests to use dy/dx to make sure the gradient values calculated by the control flow backward is correct. Also fixed bugs detected by those tests. Fix bugs: 1. Unlike sum_op, optimizer ops don't allow uninitialized input tensor. But in conditional_block_grad_op, since the conditional_block may not run, the output gradient tensor may be uninitialized, which will cause the optimizer op error. To fix it, we should let optimizer ops support uninitialized input like sum_op or assign the uninitialized gradient to 0 when the conditional_block_grad_op doesn't run. I found there are about 10+ optimizer ops. To be simpler, I just assign output gradient of the conditional_block_grad_op to 0 in this PR. But it can be further explored whether we can make optimizer ops like sum_op to support uninitialized input tensor because theoretically we can speed up without the assigning in conditional_block_grad_op. 2. Infer parameter shapes during append_backward. I didn't know that all our parameters are in global block. When op_desc is inferring shapes at the sub-block, it may not know the shape of gradients of parameters whose shape information is at global block. I fixed it by inferring shapes of gradients from forward var. This PR also did some code clean up: 1. Print the var name when sgd_op catches shape error so that it is easier to debug 2. Fix a typo: dicta -> dict	6 years ago
Jacek Czaja	8f5a93a07b	- Fix to regression in performance of ResNet-50 training (#21588 ) test=develop	6 years ago
Jacek Czaja	9ce0e29dc3	[MKL-DNN] Batch norm mkl-dnn NHWC support (#21553 ) * - BAtch norm mkl-dnn NHWC test=develop - compilation fix test=develop - UT fix - cosmetics test=develop - Fix to Batch Norm MKL-DNN NHWC UT test=develop Conflicts: paddle/fluid/operators/batch_norm_op.h * - Lint fixes test=develop	6 years ago
Youwei Song	cdba41af4d	dygraph Embedding layer use lookuptable v2 (#21209 ) * dygraph Embedding layer use lookuptable v2 test=develop * fix test_nce test=develop	6 years ago
wangchaochaohu	4c9b3dafa7	fill_constant_batch_size_like OP precious problem fix (#21337 ) * fix fill_constant_batch_size_like_op precious problem test=develop	6 years ago
WangXi	768f9242e9	Fix dgc clip & rampup step, test=develop (#21491 )	6 years ago
Zeng Jinle	3662fb71a7	remove eval() calls in Eigen, test=develop (#21498 )	6 years ago
Jacek Czaja	18a5d30754	[MKL-DNN] Conv2d and Conv2d transpose MKL-DNN NHWC support (#21466 )	6 years ago
Tao Luo	70eb397677	remove unused snappy/snappystream depends in distributed codes (#21484 ) test=develop	6 years ago
lilong12	0bc8bdf724	set dim[0] to -1 if dim[0] < 0 during compiling for c_allgather op (#21402 ) * set dim[0] to -1 if dim[0] < 0 and remove assertion to runtime, test=develop * modify ENFORCE message, test=develop * add validation for x.shape[0] > 0, test=develop * add ut, test=develop	6 years ago
tangwei12	0bddb951c2	fix async mode, test=develop (#21367 )	6 years ago
Leo Chen	b3090ad406	fix synchronization problem in softmax_with_cross_entropy_op, test=develop (#21480 )	6 years ago
Tao Luo	01fa4ead61	fix -Wno-error=sign-compare warning in gcc8 (#21434 ) * fix -Wno-error=sign-compare warning in gcc8 test=develop * fix warning in distributed codes test=develop	6 years ago
Lv Mengsi	37f3e56dea	Fix transpose conv (#21406 ) * fix transpose conv,test=develop * fix comments test=develop	6 years ago
hutuxian	7e68bc896b	refactor AUC OP and add its CUDA Kernel (#21336 ) * refactor AUC OP and add its CUDA Kernel * the layout of global auc doesn't change	6 years ago
wawltor	dbbe6e9cb6	fix the device supported of the op unique and unique_with_counts. (#21395 ) * fix the device supported of the op unique and unique_with_counts. test=develop test=document_fix * Fix the precision of test in the op of unique and unique_with_counts. test=develop test=document_fix	6 years ago
wangguanzhong	379e3febf2	fix shape check in density_prior_box, test=develop (#21414 ) * fix shape check in density_prior_box, test=develop	6 years ago
Adam	76b55da15a	Fix bug in UpdatePadding for int64_t type (#21465 ) test=develop	6 years ago
Pei Yang	7b28d938bf	show shape diff in wrong trt input shape errmsg, test=develop (#21451 )	6 years ago
Jie Fang	5e813b53c5	nhwc optimization for batchnorm (#21090 )	6 years ago
Leo Chen	e0c9d856fb	add unused input vars check for OpWithKernel, test=develop (#21169 ) * add unused input vars check for OpWithKernel, test=develop * remove unused vars in some ops, test=develop * fix batch_norm, test=develop * add white list, test=develop * add CI check for white list, test=develop * :ove white list to c++, test=develop * solve failure of CI, test=develop * add unittest for unused_var_check, test=develop * refine code, enable check in operator_test, test=develop * skip mkldnn, test=develop * extend white list, test=develop * refine condition of mkldnn, test=develop * fix paddle_build, test=develop * follow comments, test=develop * fix GetExpectedKernelType * add wiki ref to err_msg, test=develop * follow comment, test=develop	6 years ago
Chen Weihang	664f958a02	Fix optimizer op infershape failed in dygraph multi-cards mode (#21374 ) * add param & grad shape check for sgd op * add _reshape_inplece interface for dygraph parallel * refine unittest based paddle/models scripts, test=develop * add unittest for parallel grad fuse, test=develop	6 years ago
Huihuang Zheng	630be31952	Fix Cond Bug for Nested Control Flow (#21340 ) * Commit before merging develop test=develop * Backup after working with Huihuang logs * Commit before deleting Huihuang debug loggings * Commit before debug test=develop * Fix bug commit test=develop * Backup of fixing bugs test=develop * Clean up code test=develop * Fix a bug in sum_op test=develop	6 years ago
Jacek Czaja	cd43c4440e	[MKL-DNN] LRN and Pool2d (FWD) NHWC support (#21375 )	6 years ago
Leo Chen	add62acfd1	remove kDepXOut for abs_grad op, test=develop (#21407 )	6 years ago
Adam	9107bf209f	Add template version of UpdatePadding (#21426 ) test=develop	6 years ago
zhaoyuchen2018	b16274556a	Add dscending for argsort (#21400 ) * Add ascending for argsort * Refine api doc description. * Refine descending description * Add int32 logic to speedup when data is small size. * Remove int32 opt as not support in python	6 years ago
hong	ac8546701d	Add dygraph execution context (#20157 ) * add_dygraph_execution_context * add dygraph infershape context and execution context; test=develop * fix imperative bug; test=develop * remove inputs outputs interface from execution context, because it have same function with inputNames; test=develop * remove tracer_test ctest; test=develop * fix split op bug; test=develop * fix unitests bug; test=develop * fix distribute test bug; test=develop * fix ngraph compile bug; test=develop * fix grad maker bug; test=develop * fix load op bugs; test=develop * fix operator.cc construct bug; test=develop * remove useless name find in operator; test=develop * add tracer_test; test=develop * fix concat, split bug; test=develop * remove tracer_test unitest; test=develop * fix attribute check bug; test=develop * add test code to fix converage; test=develop * remove useless code, change check backward input in engin; test=develop * unlock var type infer shape;test=develop * add ShareAllLoD api; test=develop * add dygraph infershape context unitest; test=develop * remove increase and decrease lod in dygraph; test=develop * addd override; test=develop * fix increase descrease lod; test=develop * fix paddle_enforce; test=develop * disable lod op dygraph check; test=develop * fix paddle enforce error; test=develop * add comment for op_registry and OperatorBase; test=develop * optimize the comment of op_registry; test=develop * fix format of comment; test=develop * fix format of comment; test=develop * optimize the format of comment; test=develop * optimize the format of the comment; test=develop * optimize comment of op_registry; test=develop	6 years ago
hutuxian	a6b089c614	add macro to ban windows (#21422 ) remove nccl related code in windows	6 years ago
Kaipeng Deng	ebfb720a63	add Adam beta1/beta2 support Variable (#21234 ) * add Adam beta1/beta2 support Variable. test=develop	6 years ago
Zeng Jinle	09696d5df8	Use system allocator in OpTest (#21335 ) * use system allocator in unittests, test=develop * fix op bugs, test=develop * fix tensor copy bug when src and dst are the same, test=develop	6 years ago
Kaipeng Deng	67c836fb5c	batch_norm momentum support variable (#21246 ) * batch_norm momentum support variable. test=develop * fix format. test=develop * add batch_norm momentum variable example. test=develop * move MomentumTensor to training branch. test=develop * split example. test=develop * fix doc. test=develop * fix PADDLE_ENFORCE ci. test=develop * fix format. test=develop	6 years ago
ShenLiang	e2c6f434ec	Add Lod information for gather_nd & scatter_nd (#21404 ) * add lod information, test=develop * add lod, test=develop * fix lod, test=develop * fix lod, test=develop	6 years ago
Tao Luo	c0656dcb1a	remove -Wno-error=sign-compare, make warning as error (#21358 ) * remove -Wno-error=sign-compare, make warning as error test=develop test=document_fix * fix exist compile warning test=develop	6 years ago
Zeng Jinle	b97fc16d21	fix lod_reset bug, test=develop (#21392 )	6 years ago
Zeng Jinle	89966525f1	Polish reference count pass (#21324 ) * fix ref_cnt pass, test=develop * add cpp unittests to reference_count_pass, test=develop * follow comments, test=develop	6 years ago
hutuxian	47a82e38e3	Support data_norm gpu kernel (#21325 ) * support data_norm_op run in CUDA * add two parameters sync_stats & summary_decay_rate * add UT	6 years ago
GaoWei8	8493f20ebc	Polish the codes of fc when needs padding (#21378 ) test=develop	6 years ago
Michał Gallus	5d7d548275	INT8 Fully-connected (#17641 ) * Implement Int8 FC * Integrate FC into INT8v2 test=develop * int8 FC: transpose weights before computing scales test=develop * Add support for activation_type string in FC test=develop * Disable MKL-DNN's FC in VGG16 and 19 test=develop * Disable FC quantization when mkldnn FC is disabled test=develop * Solve PADDLE_ENFORCES in FC int8 * Fix Paddle enforces and remove const cast test=develop * Fix style changes test=develop * Fix quantizer_tester test and add fc quantization test=develop * Fix FC test fail on CUDA * Remove unnecessary log from quantize placement pass test=develop * Add Thread ID to FC hash key test=develop * Add comments to MKL-DNN FC Kernel test=develop * Refactor quantizer test=develop * Fix linter issues test=develop * Fix crash in slim googlenet test=develop * Fix PADDLE_ENFORCE messages test=develop	6 years ago
Zeng Jinle	b639a882c3	fix syn bn grad maker, test=develop, test=document_fix (#21317 )	6 years ago
Youwei Song	4d0f5ab1a8	add axis check for concat op (#21288 ) * add axis check for concat op test=develop * fix PADDLE_ENFORCE format test=develop * move to ComputeAxis for InferShape check test=develop	6 years ago
zhaoyuchen2018	afb134847d	Fix ernie python infer diff (#21311 ) * Fix ernie pythoin infer diff * Refine mask test=develop	6 years ago
Lv Mengsi	b6ce4f8b2f	Fix mistake of batch norm op (#21237 ) * fix_bn * revert unittest,test=develop	6 years ago
lilong12	41d13209d7	add the framework support for distfc (#21197 ) * add the framework support for distfc and ut, test=develop * fix the implementation of shard_index_op, test=develop	6 years ago
GaoWei8	234060f88f	Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972 ) * Add fc padding to solve mkl performance test=develop * fix gpu pass and error information test=develop * fix fc_fuse_pass_test test=develop * fix error information test=develop * fix error information test=develop * fix name and add fc op padding test test=develop * fix attributes test=develop * optimize fc padding test=develop * fix test test=develop	6 years ago
Jacek Czaja	f4cf028a8c	[MKL-DNN] Error throwing for NHWC layout for MKL-DNN ops (#21207 )	6 years ago
Michał Gallus	ed9ceb9f98	Refactor MKL-DNN ElementwiseMul (#21061 ) * Refactor MKL-DNN ElementwiseMul remove manual fallback, remove format attrs test=develop * Refine PADDLE_ENFORCEs in eltwise_mul_op.h test=develop * Make ElementwiseMulOp inherit from ElementwiseOp * Change type of simd_width to int test=develop * Remove Constructor extensions in ElementwiseOp and ElementwiseMulOp test=develop * Restore attributes test=develop * Fix test coverage for mkldnn eltwise mul test=develop * Conform to new is_run_common_broadcast API test=develop * Add UT for AreDimsAndFormatCorrect test=develop	6 years ago
zhouwei25	345b67b5e2	remove warning LNK4006 and warning LNK4221 (#21226 )	6 years ago
wangchaochaohu	6514f52e46	fix the fill_constant op precious problem (#21322 ) * fix the fill_constant op precious problem test=develop	6 years ago
zhaoyuchen2018	08c19c585d	Improve argsort performance. (#21267 ) * Improve argsort performance. - Give 200000 data to compute argsort on v100, can speed up ~190x before opt cost: 0.53s after opt cost:0.0027s - Add fp16 support * Refine error message * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	6 years ago
WangXi	8ac7687e36	Fix dgc accuracy by mv regularization to local (#21278 )	6 years ago
Leo Zhao	b19e1a1b56	use prefetch to load next mem into cache (#21206 ) * use prefetch to load next mem into cache test=develop * remove hard code memcpy om pyramid_hash_ff test=develop	6 years ago
gongweibao	ed2a185248	optimize nhwc for tensor core in ConvOp and ConvGradOp (#20597 )	6 years ago
Yihua Xu	69dd5152cf	Fix the crash issue when scale or bias was null-pointer. (#21284 ) * Fix the crash issue when scale or bias was null-pointer. test=develop * Add the error message for passing CI. test=develop	6 years ago
Zhang Ting	698b8b73ad	optimize lod_reset op to avoid data transform	6 years ago
Liufang Sang	f0b1518438	add dequantize_abs_max op and modify lookup_table op (#20899 ) * add int8 kernel to lookup_table op and add dequantize op test=develop * change paddle_enforce to paddle_enforce_eq test=develop * change copyright and change some not suitable code test=develop * remove debug log test=develop * replace GetInputType with IndicateVarDataType test=develop * fix EmptyGradMaker test=develop * fix diff between cpu and gpu test=develop * use memcopy when int8_t test=develop	6 years ago
hutuxian	a6ce2306f9	support cvm_op run in gpu (#21300 ) Previously, CVM OP was only able to run in CPU. This PR implements its GPU kernel. What's more, we improve the UTs about CVM OP.	6 years ago
Yihua Xu	b085ecc258	Avoid the string as the key of map to improve the jit performance (#21292 ) * Avoid the string as the key of map to improve the jit performance. test=develop * Use map to replace unordered_map. test=develop	6 years ago
zhongpu	c4ede95c74	open dygraph op test, test=develop (#19787 ) * open dygraph op test, test=develop * modify to_variable, test=develop * modify input and output for dygraph, test=develop * modify input and output for dygraph(fix bug), test=develop * fix input processing of dygraph op test, test=develop * fix bug, test=develop * fix op test, test=develop * fix forward bug for dygraph, test=develop * fix mkldnn op test for forward, test=develop * update nn.py for dygraph, test=develop * fix crop_tensor_op, test=develop * fix elementwise_mul_op, test=develop * fix fill_op, test=develop * fix some mkldnn op, test=develop * open backward op test for dygraph, test=develop * delete log, test=develop * close backward op test for dygraph, test=develop * fix bug for edit_distance_op and test_lstm_cudnn_op, test=develop * fix optest backward bug for dygraph, test=develop * fix optest backward bug for dygraph, test=develop * close backward op test for dygraph, test=develop * close backward op test for dygraph, test=develop * open dygraph op test, test=develop * fix op test for dygraph, fix GradOpDescMaker, test=develop * fix bug for linear_chain_crf_op.h, test=develop * remove log, test=develop * remove log, test=develop * remove log for op_test.py, test=develop * remove log for op_test.py, test=develop * fix bug for var_conv_2d_op, change PADDLE_ENFORCE, test=develop * fix PADDLE_ENFORCE_EQ for hierarchical_sigmoid_op.cc, test=develop * fix bug for test_increment_ngraph_op.py, test=develop * fix lod for op test in dygraph, test=develop * refactor op_test.py to reduce redundant code, test=develop * fix lod optest, modify InputVar/OutputVar to HasInput/HasOutput, test=develop * remove debug log, test=develop * remove redundant code in base.py, test=develop * fix some error in optest, test=develop * fix ClearNoNeedBufferInputs function's bug for LoDTensor, test=develop * refactor op_test.py, test=develop * remove redundant writing, test=develop * fix error(get tensor of the grad variable), test=develop * fix test_concat_mkldnn test_conv2d_mkldnn, test=develop * fix optest.py for get tensor of LoDTensor, test=develop * fix optest.py for get tensor of LoDTensor, test=develop * fix optest.py for get tensor of LoDTensor, test=develop * fix some redundant code, test=develop * reslove conflict and rewrite paddle error message, test=develop	6 years ago
danleifeng	6fc3e8ec84	edit elementwise_mul doublegrad inplace (#21245 )	6 years ago
zhaoyuchen2018	3ff5cc2d5e	Fix topk compile failed on windows (#21243 ) * Fix topk compile failed on windows * Use explicit cast for assign data	6 years ago
Zhang Ting	01a9646323	optimize assign op to avoid copy data from GPU to GPU (#21181 ) * optimize assign op to avoid copy data from GPU to GPU, test=develop * modified GetkernelTypeForVar and just avoid device transform, test=develop	6 years ago
danleifeng	0e7baabe59	extend elementwise broadcast function (#20957 )	6 years ago
Adam	d623e863c9	Fix GELU grad error (#21204 ) test=develop	6 years ago
yaoxuefeng	b5d8ba8394	fix data_norm op to avoid impractical normalization result test=develop (#21152 ) * fix auc drop first commit test=develop * update datanorm op * update datanorm with enforce test=develop * update test=develop * update format test=develop * update format * update format test=develop * add unit test test=develop * update unit test test=develop * update format test=develop * update format test=develop * update API description test=develop * update API description test=develop * update format test=develop * fix codes as comments test=develop * fix description as comments test=develop * fix description as comments test=develop * update codes.. test=develop	6 years ago
Zhang Ting	9cbe7bccba	modified error message and API doc for channel_last supported Op (#21002 ) * modified error message for conv and conv_transpose, test=develop * modified doc of conv and conv_transpose op, test=develop * modified the expression for error message, test=develop * modified error message for group_norm op, test=develop * modified detail of Attr(data_format) or Attr(data_layout) * add ValueError in API doc for maxout op, test=develop	6 years ago
guofei	56b5d14704	Fix the error of init variable in StaticRNN when stop_gradient=ON (#21118 )	6 years ago
WangXi	3c98ec90ce	Fix INF bug of softmax_cross_entropy_op (#21165 )	6 years ago
Yihua Xu	eec9c9cbe7	Fix jit tls issue (#21151 )	6 years ago
ruri	aeb887911f	Refine edit distance cn (#21121 )	6 years ago
Kaipeng Deng	98b59cb82c	fix elementwise_mod float point kernel. test=develop (#21183 )	6 years ago
whs	cfdd1fc2cd	Fix warpctc in padding mode. (#21033 )	6 years ago
Chen Weihang	8da0cd537a	Add examples for error message writing specification - NotFound, OutOfRange, AlreadyExists, PermissionDenied (#21134 ) * add examples for error msg spec, test=develop * change ENFORCE to ENFORCE_*, test=develop add more already exists examples, test=develop	6 years ago
zhaoyuchen2018	b93870e696	Improve topk performance. (#21087 ) * Improve topk performance. give 200000 data to compute topk, before opt: cost 1s after opt: cost 0.0028s. * Refine return value. * Add cuda util funtions. * Fix ComputeBlockSize bug & refine comments. Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	6 years ago
Chen Weihang	8414575b78	Add examples for error message writing specification - PreconditionNotMet, Unimplemented, Unavailable (#21137 ) * add examples for error spec, test=develop * change ENFORCE to ENFORCE_**, test=develop	6 years ago
Chen Weihang	7e5f74b825	Add examples for error message writing specification - InvalidArgument (#21132 ) * add examples for error msg spec, test=develop * change ENFORCE to ENFORCE_*, test=develop fix error, test=develop	6 years ago
zhaoyuchen2018	4a544762a2	Add Asypadding for conv fusion. (#21041 ) * Add Asypadding for conv fusion. test=develop reference: pr/20042 * Fix eigen build link error * Change back file mode * Use math function & add more checks.	6 years ago
WangXi	de5d3ff688	Fix dgc buffer illegal & reuse velocity (#21012 )	6 years ago
ceci3	f62a929151	fix instance norm (#21042 ) * fix instance norm * update unitest,test=develop	6 years ago
lilong12	e249d9a3e2	fix the computation for dx (grad for x) for prelu operation. (#20949 ) * set the default value of alpha for prelu to 0.25, test=develop * add the call to __syncthreads(), test=develop * fix the implementation of cpu prelu, test=develop * repair the implementation of element mode prelu, test=develop * modify test_prelu_op.py, test=develop	6 years ago
Zhang Ting	e0285eae64	add check for input channels and Attr(groups), test=develop (#21095 )	6 years ago
Yiqun Liu	35f17ae28f	Add the check of lod_level between compile-time and runtime. (#20961 ) * Add the check of lod_level between compile-time and runtime. test=develop * Fix bug in check_compile_vs_runtime. test=develop * Fix the check of output when it is dispensiable or intermediate. test=develop * Share lod of x to out in match_matrix_tensor op in compile-time. * Implement GetLoDLevel in InferShapeContext. * Set the default value of check_compile_vs_runtime to False and enable it in test_sequence_pad_op. test=develop * Enable check_compile_vs_runtime in test_match_matrix_tensor. * Add the implementation of SetLoDLevel in InferShapeContext. * Remove the implementation of IncreaseLoDLevel and call Get/SetLoDLevel instead. * Remove the implementation of DecreaseLoDLevel and call Set/GetLoDLevel instead. * Refine some ops and unittests. test=develop * Fix a typo. test=develop * Remove the check of var type, and change int to int32_t. test=develop * Add unittest for Get/SetLoDLevel. test=develop	6 years ago
Chen Weihang	826254f664	Add pre-condition check for fuse optimizer op pass (#21005 ) * add pre condition check for fuse optimizer op pass, test=develop * add log & set init to zero, test=develop * fix test_fuse_all_reduce_pass failed, test=develop * polish details, test=develop * refine PADDLE_ENFORCE & remove needless VLOG, test=develop * refactor op check method, test=develop	6 years ago
Aurelius84	1cd6721873	Optimizer mmcpy if _rand_len=16 and remove data copy in GradKernel (#21099 )	6 years ago
joanna.wozna.intel	77c2083586	Add transpose2 INT8 for mkl-dnn (#19424 ) * Add transpose2 INT8 for mkl-dnn test=develop * Fix test_transpose_int8_mkldnn test=develop * Revert "Merge branch 'develop' into transpose_int8_mkldnn_2" This reverts commit 34011bdba4c859abb945e062ab13124f70508054, reversing changes made to 2ce6473f144da298aba4a43d46918f27d463cf7c. * Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"" This reverts commit 23754dd78ca47ae56881161172b2aacd349aba90. * Add template to TransposeMKLDNNHandler test=develop * Resolve conflict test=develop * Restore get_size and refactor test=develop	6 years ago
LielinJiang	06063b7001	add op locality_aware_nms, test=develop (#20976 )	6 years ago
wangchaochaohu	fc385777e4	fix the compile cost long time test=develop (#21064 )	6 years ago
Chen Weihang	2f27b10331	Add dependency for error_codes.proto (#21084 ) * fix activation_functions deps, test=develop, test=document_fix * add error_codes_proto deps, test=develop, test=document_fix * try delete enforce.h, test=develop, test=document_fix	6 years ago
wangchaochaohu	149a1e3124	Expand refine (#21063 ) * fix the expand op compile time cost long time test=develop * add tag for just copy test=develop	6 years ago
Wojciech Uss	af3ff422cc	Fix dst memory allocation in elementwise_add (#21059 ) test=develop	6 years ago
liym27	26a6e27afe	fix bug in pool/conv/conv_transpose: UpdatePaddingAndDilation, _get_padding_with_SAME and conv2dtranspose_forward_naive. (#20997 ) * fix bug in pool/conv/conv_transpose: 1. It should be stride[i] not stride[0] in UpdatePaddingAndDilation; 2. fix bug of func _get_padding_with_SAME in test_conv/conv_transpose_op.py; 3. fix bug of the computation process in function conv2dtranspose_forward_naive. test=develop * change test to make the data of different dimensions different. test=develop	6 years ago

5199 Commits (05c3bc3bf616731a2da15747b8da9cb8064e39b9)