* Implement a common Python unittest to test the IR passes.
test=develop
* Save the results as np.array and support startup on CPU.
test=develop
* Fix the unittest.
test=develop
* Add check_program to check whether the optimized program is different from the original one (sketched below).
test=develop
* Remove the interface all_ops.
test=develop
* Add exception test in pass_test.
test=develop
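A minimal sketch of the shape such a common pass test can take; the class, method, and callables here are illustrative, not Paddle's actual PassTest API. The idea: run the origin and the optimized program with the same feeds, keep the fetched results as np.array so the comparison also runs on CPU, and compare the program descs to assert the pass actually changed something.

```python
import unittest
import numpy as np

class PassTestSketch(unittest.TestCase):
    """Illustrative skeleton of a common IR-pass unittest."""

    def check_pass(self, run_origin, run_optimized, origin_desc, optimized_desc):
        # check_program: the optimized program must differ from the original one.
        self.assertNotEqual(origin_desc, optimized_desc)
        # Results are saved as np.array, so the same check runs on CPU or GPU.
        for expect, actual in zip(run_origin(), run_optimized()):
            np.testing.assert_allclose(np.array(expect), np.array(actual),
                                       rtol=1e-5, atol=1e-8)
```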
* add bn and relu fuse pass
* add op attr assert and dtype assert
* fix some input and output bugs for the fused op and pattern.
* add the unittest for fuse_bn_act_pass. test=develop
* use normative enforce statements. test=develop
* add the cpu test. test=develop
* add the support of batch_size=1 for the bn with relu op. test=develop
* add the error type for paddle throws. test=develop
* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unused_vars_white_list. test=develop
* Polish the PADDLE_ENFORCE usage in fusion_group pass related code.
test=develop
* Correct the unittest because relu_grad's formula changed.
test=develop
* modify fc to linear in sample code, test=develop
* remove FC, test=develop
* remove warnings, test=develop
* drop fluid/imperative/README.md, test=develop
* change fc to linear, test=develop
* polish code style, test=develop
1. Add a new input named batch_roi_nums to prroi_pool_op. batch_roi_nums holds the number of RoIs for each image in the batch when rois is a Tensor; when rois is a LoDTensor, this information is saved in rois' LoD (illustrated below).
2. Add a grad check to prroi_pool_op and fix the abnormal X grad diff on CPU.
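A small NumPy illustration of the two ways the per-image RoI counts travel; the numbers are made up, only the relationship between batch_roi_nums and the LoD comes from the description above.

```python
import numpy as np

rois = np.random.rand(5, 4).astype("float32")  # 5 RoIs in total for a batch of 3 images

# rois is a plain Tensor: pass the per-image RoI counts explicitly.
batch_roi_nums = np.array([2, 0, 3], dtype="int64")  # image i owns batch_roi_nums[i] RoIs

# rois is a LoDTensor: the same counts are encoded as offsets in its LoD.
lod = [[0, 2, 2, 5]]  # image i owns rows lod[0][i]:lod[0][i+1] of rois
assert np.array_equal(np.diff(lod[0]), batch_roi_nums)
```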
* support elu activation double grad, test=develop
* delete the commented-out code in the .cc file, test=develop
* fix the failing relu test, test=develop
* add elu double grad kernel and unit test
* add calculation of dX in the elu double grad functor, test=develop
* update the committed code, test=develop
* Add dynamic loading of nvrtc, and support runtime compilation of CUDA kernels using nvrtc.
test=develop
* Call the CUDA driver API to launch the kernel compiled by nvrtc.
test=develop
* Disable for Mac and Windows.
test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop
* Refine the CUDA kernel to support large dims.
test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation of the fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of results.
* Add the check of nvrtc in unit-test.
test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop
* Disable fusion_group op for Mac and Windows.
test=develop
* Make the compiling of device code return a status instead of hanging up.
test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names.
test=develop
* Add the check of CUDA driver library in unittest.
test=develop
* Refine the calls to PADDLE_ENFORCE.
test=develop
* optimize adam speed by removing _finish_update test=develop
* fix SparseAdamFunctor param list test=develop
* Remove scale_op in expect_list of adam_op test=develop
* fix test optimizer loss assert error test=develop
* fix test optimizer loss assert error test=develop
* modify PADDLE_ENFORCE usage test=develop
* fix op_type in lamb_op.cc test=develop
* fix ostream format bug in error messages test=develop
* add betaPowOut in ngraph op test=develop
* fix ngraph::op api for gcc8 test=develop
* clean code test=develop
* modify struct into class test=develop
* remove code of beta1Tensor in lamb_op test=develop
* fix elementwise_pow bug on integers, test=develop
* use llrint to support elementwise_pow_grad (see the sketch below), test=develop
* add some tests, test=develop
* revert grad functor, test=develop
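A tiny Python illustration of the failure mode that motivates llrint (the value is synthetic, not Paddle's kernel): floating-point pow can land a hair below the exact integer result, and truncating to an integer then loses 1, while rounding to nearest, which is what llrint does in C, recovers it.

```python
# e.g. what a floating-point pow might return when the exact answer is 8
approx = 8.0 - 1e-12
print(int(approx))         # 7 -> truncating cast, off by one
print(int(round(approx)))  # 8 -> round to nearest, as llrint does
```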
* fc-dequantize squash
test=develop
* change according to reviews
test=develop
* change PADDLE_ENFORCE
test=develop
* add a second test for when fc-dequant does not fuse
test=develop
* change all related PADDLE_ENFORCE
test=develop
* add fake init for the trainer, fix large memory usage in the trainer
* do not merge recv vars from a remote endpoint, test=develop
* add recv and save op, merge sliced vars in one op to save memory
* remove hsigmoid with pull sparse, test=develop
* update benchmark for int8v2, QAT1, QAT2 accuracy and performance
test=document_fix
* change according to reviews
test=develop test=document_fix
* improve some descriptions and some models
test=develop test=document_fix
* update models benchmark data
test=develop test=document_fix
* update int8v2 and qat2 performance
test=develop test=document_fix
Add tests that use dy/dx to make sure the gradient values calculated by the control flow backward are correct (a minimal illustration of the dy/dx idea follows this description). Also fix bugs detected by those tests.
Fix bugs:
1. Unlike sum_op, optimizer ops don't allow uninitialized input tensors. But in conditional_block_grad_op, since the conditional_block may not run, the output gradient tensor may be uninitialized, which causes an error in the optimizer op. To fix it, we should either let optimizer ops support uninitialized input like sum_op does, or assign the uninitialized gradient to 0 when the conditional_block_grad_op doesn't run. I found there are about 10+ optimizer ops. **To keep it simple, I just assign the output gradient of conditional_block_grad_op to 0 in this PR.** Whether optimizer ops can be made to support uninitialized input tensors like sum_op can be explored further, because theoretically we could then skip the assignment in conditional_block_grad_op and speed up.
2. Infer parameter shapes during append_backward. I didn't realize that all our parameters are in the global block. When an op_desc infers shapes in a sub-block, it may not know the shapes of the gradients of parameters whose shape information is in the global block. I fixed it by inferring the shapes of those gradients from the forward vars.
This PR also did some code cleanup:
1. Print the var name when sgd_op catches a shape error so that it is easier to debug.
2. Fix a typo: dicta -> dict
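A minimal NumPy sketch of the dy/dx idea behind those tests (illustration only, not the Paddle test code): compare the analytic gradient of a conditional computation against a numerical estimate, with the branch-not-taken gradient defined to be 0, matching the fix described above.

```python
import numpy as np

def forward(x, cond):
    # Mimics a conditional_block: when cond is False the block does not run.
    return 2.0 * x if cond else np.zeros_like(x)

def backward(x, cond):
    # Analytic dy/dx; a block that did not run gets a 0 output gradient.
    return np.full_like(x, 2.0) if cond else np.zeros_like(x)

def numeric_grad(x, cond, eps=1e-6):
    g = np.zeros_like(x)
    for i in range(x.size):
        xp, xm = x.copy(), x.copy()
        xp.flat[i] += eps
        xm.flat[i] -= eps
        g.flat[i] = (forward(xp, cond).sum() - forward(xm, cond).sum()) / (2 * eps)
    return g

x = np.random.rand(4)
for cond in (True, False):
    np.testing.assert_allclose(backward(x, cond), numeric_grad(x, cond), atol=1e-4)
```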
* add file check_op_desc.py and add an interface to get default values. test=develop
* add test for C++ coverage rate. test=develop
* Correct typo. test=develop
* test=develop, fix Docker problem with Paddle NCCL
* don't expose numerous Tensor.set(), test=develop
* fix condition, test=develop
* fix float16 bug, test=develop
* feed should be Tensor or np.array, not Variable or number, test=develop
* use forcecast to copy numpy slice to new array, test=develop
* remove float16-uint16 hacking, test=develop
* add variable method to varbase and refactor to_variable to support returning varbase
* support kwargs in varbase constructor
* add VarBase constructor to support default python args
* refine varbase initialization method
* reset branch
* fix ut for change VarBase error info to PaddleEnforce
* cherry-pick the parameter change from before
* overload isinstance to avoid too many changes of is_variable checks
* rm useless files
* rm useless code merged by git
* test=develop, fix some ut failed error
* test=develop, fix test_graph_wrapper
* add some tests, test=develop
* refine __getitem__, test=develop
* add tests, test=develop
* fix err_msg, test=develop
* add data_type registration CI check test=develop
* add an op test case test=develop
* add test case for registering an op kernel test=develop
* fix shell script bug test=develop
* fix checkout branch test=develop
* remove test case code test=develop
* fix op_type.spec name test=develop
* fix default grad op maker CI bug, test=develop, test=document_fix
* remove some ops from paddle/fluid/op_use_default_grad_op_maker.spec, test=develop, test=document_fix
* Fix the supported devices of the unique and unique_with_counts ops.
test=develop
test=document_fix
* Fix the test precision for the unique and unique_with_counts ops.
test=develop
test=document_fix
* add param & grad shape check for sgd op
* add _reshape_inplece interface for dygraph parallel
* refine unittest based on paddle/models scripts, test=develop
* add unittest for parallel grad fuse, test=develop
* Commit before merging develop
test=develop
* Backup after working with Huihuang logs
* Commit before deleting Huihuang debug logging
* Commit before debug
test=develop
* Fix bug commit
test=develop
* Backup of fixing bugs
test=develop
* Clean up code
test=develop
* Fix a bug in sum_op
test=develop
* Add ascending for argsort
* Refine api doc description.
* Refine descending description
* Add int32 logic to speed up when the data size is small.
* Remove the int32 opt as it is not supported in Python
* add ut for comparing FP32 and QAT INT8
* add save qat transformed model python script
test=develop
* updated
* added missing file
* add "with_label"
test=develop
* performance benchmark as unit test
test=develop
* change names of unnecessary things
* Change CMakeLists.txt for model downloading and UT
test=develop
* change names of functions and params for more readable code
test=develop
* Change PADDLE_ENFORCE messages
test=develop
* fix indent problems
test=develop
* indent problems
test=develop
* Implement Int8 FC
* Integrate FC into INT8v2
test=develop
* int8 FC: transpose weights before computing scales
test=develop
* Add support for activation_type string in FC
test=develop
* Disable MKL-DNN's FC in VGG16 and 19
test=develop
* Disable FC quantization when mkldnn FC is disabled
test=develop
* Solve PADDLE_ENFORCES in FC int8
* Fix Paddle enforces and remove const cast
test=develop
* Fix style changes
test=develop
* Fix quantizer_tester test and add fc quantization
test=develop
* Fix FC test fail on CUDA
* Remove unnecessary log from quantize placement pass
test=develop
* Add Thread ID to FC hash key
test=develop
* Add comments to MKL-DNN FC Kernel
test=develop
* Refactor quantizer
test=develop
* Fix linter issues
test=develop
* Fix crash in slim googlenet
test=develop
* Fix PADDLE_ENFORCE messages
test=develop
* Add fc padding to solve the MKL performance issue
test=develop
* fix gpu pass and error information
test=develop
* fix fc_fuse_pass_test
test=develop
* fix error information
test=develop
* fix error information
test=develop
* fix name and add fc op padding test
test=develop
* fix attributes
test=develop
* optimize fc padding
test=develop
* fix test
test=develop
* Refactor MKL-DNN ElementwiseMul
remove manual fallback, remove format attrs
test=develop
* Refine PADDLE_ENFORCEs in eltwise_mul_op.h
test=develop
* Make ElementwiseMulOp inherit from ElementwiseOp
* Change type of simd_width to int
test=develop
* Remove Constructor extensions in ElementwiseOp and ElementwiseMulOp
test=develop
* Restore attributes
test=develop
* Fix test coverage for mkldnn eltwise mul
test=develop
* Conform to new is_run_common_broadcast API
test=develop
* Add UT for AreDimsAndFormatCorrect
test=develop
* Improve argsort performance.
- With 200,000 elements on a V100, argsort speeds up ~190x
(before opt: 0.53 s; after opt: 0.0027 s)
- Add fp16 support
* Refine error message
* Refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* fix fetch handler problem and refactor
When a user defines a FetchHandler class, he or she should initialize the handler
with a variable dict. The key of the variable dict is a user-defined name, and
the value is a Variable generated from the Python API.
For each fetch, the user should implement the handler function, in which
fetched_result_dict will be available and the user can access the fetched values
with the user-defined keys.
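A pure-Python sketch of that contract (illustration only; the actual FetchHandler class in fluid may differ in names and extra arguments such as a fetch period):

```python
class FetchHandlerSketch(object):
    """Mirrors the described contract of a user-defined FetchHandler."""

    def __init__(self, var_dict):
        # Keys are user-defined names; values are Variables created
        # through the Python API.
        self.var_dict = var_dict

    def handler(self, fetched_result_dict):
        # Called on each fetch; values are looked up by the user-defined keys.
        for name in self.var_dict:
            print(name, fetched_result_dict[name])
```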
* Disable fusion_group pass for windows and mac. We will do some experiments on Linux first.
test=develop
* Print the subgraph when the check fails.
test=develop
* add int8 kernel to lookup_table op and add dequantize op test=develop
* change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ test=develop
* change copyright and fix some unsuitable code test=develop
* remove debug log test=develop
* replace GetInputType with IndicateVarDataType test=develop
* fix EmptyGradMaker test=develop
* fix diff between cpu and gpu test=develop
* use memcpy for int8_t test=develop
* open dygraph op test, test=develop
* modify to_variable, test=develop
* modify input and output for dygraph, test=develop
* modify input and output for dygraph(fix bug), test=develop
* fix input processing of dygraph op test, test=develop
* fix bug, test=develop
* fix op test, test=develop
* fix forward bug for dygraph, test=develop
* fix mkldnn op test for forward, test=develop
* update nn.py for dygraph, test=develop
* fix crop_tensor_op, test=develop
* fix elementwise_mul_op, test=develop
* fix fill_op, test=develop
* fix some mkldnn op, test=develop
* open backward op test for dygraph, test=develop
* delete log, test=develop
* close backward op test for dygraph, test=develop
* fix bug for edit_distance_op and test_lstm_cudnn_op, test=develop
* fix optest backward bug for dygraph, test=develop
* fix optest backward bug for dygraph, test=develop
* close backward op test for dygraph, test=develop
* close backward op test for dygraph, test=develop
* open dygraph op test, test=develop
* fix op test for dygraph, fix GradOpDescMaker, test=develop
* fix bug for linear_chain_crf_op.h, test=develop
* remove log, test=develop
* remove log, test=develop
* remove log for op_test.py, test=develop
* remove log for op_test.py, test=develop
* fix bug for var_conv_2d_op, change PADDLE_ENFORCE, test=develop
* fix PADDLE_ENFORCE_EQ for hierarchical_sigmoid_op.cc, test=develop
* fix bug for test_increment_ngraph_op.py, test=develop
* fix lod for op test in dygraph, test=develop
* refactor op_test.py to reduce redundant code, test=develop
* fix lod optest, modify InputVar/OutputVar to HasInput/HasOutput, test=develop
* remove debug log, test=develop
* remove redundant code in base.py, test=develop
* fix some error in optest, test=develop
* fix ClearNoNeedBufferInputs function's bug for LoDTensor, test=develop
* refactor op_test.py, test=develop
* remove redundant writing, test=develop
* fix error(get tensor of the grad variable), test=develop
* fix test_concat_mkldnn test_conv2d_mkldnn, test=develop
* fix optest.py for get tensor of LoDTensor, test=develop
* fix optest.py for get tensor of LoDTensor, test=develop
* fix optest.py for get tensor of LoDTensor, test=develop
* fix some redundant code, test=develop
* resolve conflicts and rewrite Paddle error messages, test=develop
* fix the C-API ZeroCopy shape error and refactor how the output is obtained
* use an anonymous namespace to cover the functor
* fix unit tests because the output of typeid(T).name() differs between Linux and Windows, test=develop