Paddle

Commit Graph

Author	SHA1	Message	Date
Zeng Jinle	91a0911ca3	Make PADDLE_ENFORCE_EQ support types that cannot be converted to std::string (#19243 ) * make PADDLE_ENFORCE_EQ support cannot to string types, test=develop * follow huihuang's comments, test=develop	6 years ago
chengduo	8a89ca94ce	Fix REGISTER_OP_WITHOUT_GRADIENT (#19251 ) * fix REGISTER_OP_WITHOUT_GRADIENT test=develop	6 years ago
gongweibao	fd4b15a2f6	Unset unittests http_proxy env to avoid timeout. (#19269 ) Unset unittests http_proxy env to avoid timeout.	6 years ago
silingtong123	a94a25867d	improve the doc of decorate_reader API (#19206 ) * improve the doc of decorate_reader API, test=develop * update API.spec, test=develop	6 years ago
zhongpu	c27b081397	modify paddle/scripts/fast_install.sh about mac installation, test=develop (#19187 ) modify checkMacPython2 and checkMacPython3 function to ensure python version used for paddle installation is EQ/GT than 2.7.15 on MacOS; modify checkMacPaddleVersion function to ensure paddle version is release version, because paddle don't have develop version on MacOS.	6 years ago
Kaipeng Deng	2848cb791e	fix temporal_shift OP PADDLE_ENFORCE. test=develop (#19161 ) * fix temporal_shift OP PADDLE_ENFORCE. test=develop * fix HasInput/HasOutput ENFORCE. test=develop	6 years ago
Tao Luo	2f8c7e021f	remove unused inference_transpiler unit-tests (#19130 ) * remove unused inference_transpiler unit-tests test=develop * remove InferenceTranspiler usage in quantize_transpiler.py test=develop	6 years ago
Zeng Jinle	708bd9798d	move_flags_to_unified_files_for_management, test=develop (#19224 )	6 years ago
Zeng Jinle	002f325dcd	add PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#19211 )	6 years ago
lidanqing	07a4d8f8d6	Fix mAP problem in unit test of int8 object detection test (#18946 ) * change the top1 comparison to mAP comparison test=develop * change the mobilenet-ssd tester demo data and batch_size settings test=develop	6 years ago
Hao Wang	d53fa53b65	CI - Improve example code check (#19170 ) * add exception exit on error example codes test=develop	6 years ago
Adam	b837689e97	Add generalized Conv+Activation MKLDNN fuse pass creation (#19072 ) test=develop	6 years ago
Yibing Liu	50b1cab122	Add padding support for crf_decoding (#19057 ) * Add padding support for crf_decoding * Fixes in compute kernel test=develop * Update API Spec test=develop * Update API.spec test=develop * Avoid using paddle_enforce test=develop * Fix enforce test=develop	6 years ago
Aurelius84	45fb031f6b	remove is_test param of FC test=develop (#19209 ) Remove is_test parameter of FC op. The parameter is_test is not used anywhere.	6 years ago
liym27	c8cdef37b2	change the default value of summarize from -1 to 20 in Print API to improve ease of use (#18738 ) * change the default value of summarize from -1 to 20 in Print op to improve ease of use, test=develop * change the doc of API Print to make the document easier to understand, test=develop	6 years ago
Yiqun Liu	77572b70cb	Enhance the error message when GrapOpMaker is null. (#19070 ) * Enhance the error message when GrapOpMaker is null. test=develop * Call Proto() instead of directly using proto_ pointer. test=develop * Rollback to use proto_ directly, because some special grad ops, such as some double grad ops, do not have proto. test=develop	6 years ago
lvmengsi	c6f163cd7a	add description of sync_bn (#19056 )	6 years ago
chengduo	b5ba801ef0	Fix gather op bug (#19168 ) * fix gather op bug test=develop	6 years ago
Zeng Jinle	0f9b33954a	move python reader api to fluid.io module, test=develop (#19143 )	6 years ago
Leo Chen	80eab822c1	Remove unused DefaultGradOpDescMaker in REGISTER_OPERATOR() (#19166 ) * remove unused DefaultGradOpDescMaker in REGISTER_OPERATOR(), test=develop * remove SplitIdsOpGradMaker since it is buggy and not tested, update spec file, test=develop	6 years ago
chengduo	c70a97f46e	Use CUDAPinnedPlace in buffered_reader (#19112 ) Use CUDAPinnedPlace in buffered_reader	6 years ago
jiaqi	b104ea0684	add get_last_save_xbox_base/get_last_save_xbox (#19122 ) * add get_last_save_xbox_base/get_last_save_xbox * fix fleet_util bug of load paddle model * add doc string in fleet api	6 years ago
joanna.wozna.intel	492a00f53e	Add conv requantize squash (#18754 ) * Add requantize squash test=develop * Add more precise tests test=develop * Rename and refactor tester test=develop	6 years ago
Jiawei Wang	6ac32d0981	Instag Implemention (#18394 ) * instag lod tensor impl * First PR for instag * First PR for instag * Before adding Selection Rows. * Change name from instag to filter_instag, add upgrade the impl of filter_instag * Change name from instag to filter_instag, add upgrade the impl of filter_instag * Fix yapf error in gradient_checker.py to pass Travis-CI * Fix Filter Instag Grad test=develop * Fix Filter Instag Grad test=develop * 1) Fix API.spec, add filter_instag Op. 2) Add Vector Support for CUDA. test=develop * Impl Loss_weight and empty output handler * change Loss Weight datatype to Float32, and add Loss Weight as 2nd output * 1) Support Tensor Input(without LOD) 2) Add Unit test * Filter By Instag Final test=develop * Update API.spec for filter_by_instag test=develop * Update API.spec for filter_by_instag 2 test=develop * Add Filter By Instag Coverage * code format of test_layers.py * code format test_layers.py test=develop * Make API args more readable test=develop * Make API args more readable and pass code format test=develop * Filter By Instag Op, Rename Map to Index Map test=develop * Filter By Instag Op, code format err in filter_by_instag_op.cc test=develop * Filter by instag op: code format of cpp files test=develop * Filter by instag Op: Api spec modification test=develop * Filter by instag Op: Api spec doc id modification test=develop * Filter by instag Op: Api spec and doc preview test=develop test=document_preview * Filter By Instag Op, fix doc erro test=document_preview test=develop * Filter By Instag Op, fix doc err and Api spec test=document_preview test=develop * Filter By Instag Op, fix Api spec test=document_preview test=develop * Filter By Instag Op, fix Paddle Encoforce deprecated warning test=document_preview test=develop * Filter By Instag Op, fix Paddle Encoforce deprecated and code format warning test=document_preview test=develop	6 years ago
zhongpu	2e76e75517	modify paddle/scripts/fast_install.sh about Mac installation to support paddle version check on MacOS (#19108 ) Add python version check on MacOS	6 years ago
Tao Luo	5f5648a8ff	Revert "Python inference API support numpy (#19009 )" (#19160 ) test=develop	6 years ago
wawltor	0019eb376a	Fix the error of op `ones_like` document, change the output variable test=document_preview test=develop Fix the error of op `ones_like` document, change the output variable from x to out.	6 years ago
huangjun12	20f18930ae	Add hard swish op (new op) (#19001 ) * add hard_swish activation op (new op) test=develop * remove redundancy files * modify document content of HardSwish OP * add API test in test_layers.py * add dynamic_graph for test_hard_swish	6 years ago
joanna.wozna.intel	bce72c7fea	Replace Relu with bounded Relu in MobileNetV2 quantization (#18988 ) test=develop	6 years ago
chengduo	e044e84264	open fuse_all_optimizer_ops (#19087 ) test=develop	6 years ago
wangguanzhong	1fc242a7ed	refine infer shape in box decoder and assign op, test=develop (#19118 )	6 years ago
gongweibao	29d8781240	Polish fleet API to support cuda collective mode and nccl2 mode. (#18966 ) Polish fleet API to support cuda collective mode and nccl2 mode	6 years ago
flame	b7e1a1d7e7	Python inference API support numpy (#19009 ) test=develop	6 years ago
wopeizl	80b7ef6fc8	add tensorrt support for windows (#19084 ) * add tensorrt support for windows	6 years ago
Kevin	744279fe68	Refine embedding Api doc (#18820 ) * fix overflow by int32 mul test=develop * fix reference nullptr * fix codestyle test=develop * modify to point in ContextProjectFunctor test=develop * modify to point in ContextProjectFunctor test=develop * modify . to -> test=develop * refine embedding padding_idx doc test=develop * fix math:padding_idx preview bug test=develop * modify API.spec test=develop * fix spell error test=develop * refine dtype parm desc test=develop	6 years ago
Kevin	945f3cf631	fix code too big test=develop (#19111 ) Fix seq_pool failed when input dims is too large. Resolve issue #3023	6 years ago
Tao Luo	4a959883e7	remove unused aws_benchmarking and go directory (#19103 ) test=develop	6 years ago
yaoxuefeng	9150cf50fc	add save cache model api in fleet& add slots shuffle in dataset module & add metric op to calculate ctr related metrics (#18871 ) * add ctr related metric layer test=develop * add save cache and slots shuffle test=develop * add save cache and slots shuffle test=develop * fix error * fix error * fix style for ci * fix for comments * change SlotsShuffle input to std::strinf for generality * fix style * fix style * fix style * fix style * fix style * fix style * fix stylr * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * change non-const reference to pointer * fix style * fix style * fix style test=develop * fix style test=develop * add return ins num in ctr metric op * change dtype to float in metric_op.py * fix error test=develop * fix style test=develop * fix API spec * fix API spec * fix API spec test=develop * add UT test=develop	6 years ago
hutuxian	5a80cc8431	Datafeed support reading to cuda place directly. (#19071 ) * add a place field in DataFeed to denote which place it will feed data to. * abstract the copy process in CopyToFeedTensor function * add UT for float32 type and for CUDAPlace	6 years ago
Zeng Jinle	88f111f885	remove unused inplace act codes, test=develop (#19079 )	6 years ago
tianshuo78520a	cfa6305303	Add check PADDLE_ENFORCE approval (#19088 )	6 years ago
ShenLiang	4397cb318e	add eye op, kernel and unitest test=develop (#18980 ) * add eye op,test=document_preview test=develop * fix the API.spec, test=develop * fix the document, test=document_preview test=develop * add unitest for CI coverage, test=develop	6 years ago
Kaipeng Deng	f86fead693	Add trilinear_interp OP (#18711 ) * add trilinear interp. test=develop * fix unittest. test=develop * add python api and test_layers. test=develop * refine API.spec. test=develop * fix format. test=develop * add python API test. test=develop * format code. test=develop * refine code structure. test=develop * fix format * fix doc. test=develop * fix coverage. test=develop * fix format. test=develop	6 years ago
Zhang Ting	c2063217e7	optimize error message for "embedding" and "cross_entropy" OP (#18765 ) * optimize error message, test=develop * optimize error message, test=develop	6 years ago
Tao Luo	741ce8bb1a	inference_shared_library support profile (#16275 ) test=develop	6 years ago
chengduo	17d62ab220	Enhance fuse optimization op pass (#19010 ) * Enhance fuse optimization op pass test=develop	6 years ago
chengduo	21440b4d69	Add call stack info during compile time (#19067 ) * Add call stack info during runtime and compile time test=develop * Rename operator_call_stack test=develop * Add unit test test=develop * follow comment test=develop	6 years ago
jiaqi	a99bc64c63	add fleet util, add some interface in hdfs util (#18752 ) * add fleet util (fleet/utils/fleet_util.py): functions for users' convenience * add some interface in hdfs util : hdfs is_file, hdfs cat	6 years ago
mapingshuo	4ad7c9d5a7	[WIP] Add Imdb train demo (#18895 ) * add train demo for imdb text classification task * make inference library release data_feed dataset dataset_factory data_feed_factory * add String Data Generator * new feature of train demo: save model params * New feature of train demo: set training config using gflags * change code style for CI * add readme and dataset for imdb demo trainer	6 years ago
tianshuo78520a	0b1025769c	Add op_use_default_grad_op_maker.spec approval (#19035 ) * change grad_op approval * test=develop	6 years ago
wangguanzhong	e50f527fee	update roi doc in roi_pool and roi_align (#19036 ) * update roi doc in roi_pool and roi_align, test=develop	6 years ago
jiaqi	fc038da749	fix QueueDataset queue size (#19016 ) * fix QueueDataset queue size, set queue size = batch size * 100, to avoid too many instances in channel when training is much slower than reading data.	6 years ago
Leo Chen	8f53735437	Fix memory overwriting of tensors returned by executor (#19030 ) * fix memory overlapping of fetch var (return of executor.run), test=develop * fix wrong usage of ParallelExecutor in op_test, test=develop * remove useless parameter and simplify code * avoid tensor destruct untimely, test=develop * add testcase independent of OpTest, test=develop	6 years ago
Kaipeng Deng	1f46253d4a	fix natural exp decay doc. test=develop (#19025 )	6 years ago
tianshuo78520a	be3f469ad1	CI Add Reviewer Rules for large PRs (modify 20+ files or add 1000+ lines) (#19033 ) * CI Add Reviewer Rules * CI Add Reviewer Rules * change git_files * change git_files * test=develop * test=develop	6 years ago
Yiqun Liu	a445c33552	Add the check of lod in sequence_softmax kernel. (#18996 ) * Add the check of lod in sequence_softmax kernel. test=develop * Refine the comments. test=develop	6 years ago
Zeng Jinle	2175d19993	fix memory_reuse_pass memory_size calculation error, test=develop (#19020 )	6 years ago
tianshuo78520a	de975be1ec	change op_use_default_grad_op_maker.spec approval member (#19029 )	6 years ago
Kevin	e681d65515	Add var_conv_2d op (#18518 ) * fix overflow by int32 mul test=develop * fix reference nullptr * fix codestyle test=develop * modify to point in ContextProjectFunctor test=develop * modify to point in ContextProjectFunctor test=develop * modify . to -> test=develop * add var_conv_2d op test=develop * edit api.spec test=develop * ignore unittest if with_mkl=off test=develop * fix python3 division test=develop * fix ignore unittest bug test=develop * remove useless code test=develop * modify api.spec test=develop * modify default_grad.spec test=develop	6 years ago
Chen Weihang	81fe02c3fe	Fix config description error in cuda_profiler function document (#18750 ) * fix profiler doc error, test=develop * update API.spec, test=develop	6 years ago
SunGaofeng	4da1c4f15d	fix g_param shape mismatch in WeightNormParamAttr (#18940 ) * fix g_param shape mismatch in WeightNormParamAttr * add comment to show why insert reshape in startup_program test=develop	6 years ago
liuwei1031	a43a763b54	fix warpctc.dll not found issue (#18761 ) * fix warpctc.dll not found issue, test=develop * revert the linux platform change, test=develop * delete warpctc_lib_path.h.in, test=develop * add SetPySitePackagePath function * fix warpctc.dylib not found issue on Mac, test=develop * improve the paddle lib path setting logic, test=develop * fix mac ci issue caused by test_warpctc_op unittest, test=develop * tweak code, test=develop	6 years ago
chengduo	01c7daade7	Add checking for the fetch_list of Executor.run (#18957 ) * update exe.run	6 years ago
pawelpiotrowicz	e53f517a44	fix for multithreading test_analyzer_image_classification --num_threads=X (#18265 ) test=develop	6 years ago
flame	65d987527d	python inference enable_memory_optim(#18817 ) python inference API support enable_memory_optim	6 years ago
silingtong123	fd3b666d8c	test=develop,Synchronize the contents of develop with release1.5 (#18937 ) Fix the third-party openblas dependency for paddle on windows	6 years ago
Liufang Sang	faf6890b6c	support tensor input for ctc align op (#18887 ) * test=develop support Tensor input for ctc_align_op * test=develop add some comment	6 years ago
xsrobin	8ce902541c	fix unalign of some examples (#18943 ) * test=develop test=document_preview * Update API.spec	6 years ago
hutuxian	b62c4f9b04	fix concat check info typo (#18975 )	6 years ago
Zeng Jinle	7ac748adb4	Open gc by default (#18836 ) * open gc by default, test=develop * fix test_train_recognize_digits and disable gc when ngraph is enabled, test=develop * fix conditional_block op eager deletion bug, test=develop * add some comments to reviewers, test=develop	6 years ago
Zhaolong Xing	3816d221ff	Fix the CE error which caused by paddle-trt version (#18941 ) * Fix Mask rcnn predictor 1. refine memory optim algorithm to support the model with the block op. 2. output diff : modify the affine channel fuse 3. add condition_block_infer op add interface for setting trt calib table dir test=develop * add the missing files. test=develop * 1 add trt fp16 support test=develop * fix trt fp16 ce error test=develop * add an vlog if the user use trt4 and specify fp16. test=develop	6 years ago
jiaqi	02c370c3dc	support filelist size < trainer num && fix pull dense (#18956 ) * support filelist size < trainer num * pull dense when stop, to make sure local dense params are same as pserver, so save paddle model will save dense model same as pserver * enable QueueDataset train same filelist for several times	6 years ago
chengduo	e7da0940f9	Disable fuse optimization option (#18924 ) * Disable fuse optimization test=develop	6 years ago
Krzysztof Binias	c2c876f718	Fix memory leak in test (#18622 ) * Fix memory leak in test test=develop * Fix memory leak in test test=develop * Fix memory leak in test test=develop * Pull out vars of the loops test=develop	6 years ago
石晓伟	ee2f296ef8	Fusion: seqpool_cvm_concat (#18471 ) * add fusion_seqpool_cvm_concat test=develop * simplify pass, test=develop * fix code style, test=develop	6 years ago
jiaqi	768059b3a0	adjust ins weight according to nid slot (#18784 ) adjust ins weight according to nid slot , user can specify adjust_ins_weight in strategy	6 years ago
Zeng Jinle	08fa98f7cc	Fix gpu_info PADDLE_ENFORCE_GT when fraction_of_gpu_memory_to_use=1.0 (#18950 ) * fix gpu_info, test=develop * fix reserving gpu memory calculation bug, add fraction=1 unittest, test=develop * fix bug again for reserving size, test=develop	6 years ago
wawltor	3ab1866ca5	Add the op of unique_with_counts, expand count function of the op unique (#18720 ) * test=develop Add the op of unique_with_counts, the op calculates the unique input of data, and outputs the corresponding indices and count of data. * test=develop Check the input and dtype in the op of unique_with_counts * test=develop test=document_preview update the API.spec for `unique_with_counts`, at the same time, optimize the python api in the op of `unique_with_count` * test=develop test=document_preview Fix some python api problem in the op of `unique_with_counts`, and change the error message in this op. * Fix some API problem in the op of `unique_with_counts` test=develop test=document_preview * test=develop test=document_preview Fix the api sample of op `unique_with_counts`, and update api.spec	6 years ago
Jacek Czaja	5cf2d38594	- Removed passing X from FWD to GRAD via device context (#18911 ) test=develop - Extracted key generation from FWD and GRAD into separate function test=develop - Compilation fix test=develop - another compilation test=develop	6 years ago
LielinJiang	22fa4c2d24	Fix depthwise conv gpu kernel bug (#18582 ) * fix depthwise conv gpu kernel bug, test=develop * add more depthwise conv test, test=develop	6 years ago
Huihuang Zheng	ea6ee76fa9	GPU allocation uses fraction of available memory (#18896 ) GPU allocation uses fraction of available memory, also fix the GetUsed without lock	6 years ago
liuwei1031	0d99690809	fix several security bugs reported by security team (#18831 ) * fix security issue, test=develop * bug fix, test=develop * throw an exception when null pointer data with non-zero length PaddleBuf is passed, test=develop	6 years ago
Zhaolong Xing	61238d31f7	Trt fp16 support (#18860 ) * Fix Mask rcnn predictor 1. refine memory optim algorithm to support the model with the block op. 2. output diff : modify the affine channel fuse 3. add condition_block_infer op add interface for setting trt calib table dir test=develop * add the missing files. test=develop * 1 add trt fp16 support test=develop	6 years ago
chengduo	20859c08e8	[DyGraph] Make multi-card program faster (#18892 ) * update parallel.py test=develop	6 years ago
HaoRen	24f8543106	Add center Loss Op Support (#18681 ) * support center loss * change tensor copy api to high level api tensorcopy * test=develop rewrite the center_loss cuda_kernel to make it faster and add document of the center loss api,also update test function * test=document_preview test=develop update document of center loss * test=document_preview test=develop modify API.spec modify test code remove nouse const_cast	6 years ago
lvmengsi	d21c391447	replace paper link (#18861 ) Update conv2d transpose link	6 years ago
Leo Zhao	86e494eb64	use mkl to accelerate gelu_grad (#18099 ) test=develop	6 years ago
wopeizl	dfd6a62a9a	Optimize the error report information when loadcombine fail to open model files test=develop (#18888 )	6 years ago
baojun	adcfc53b18	upgrade ngraph version and simplify ngraph engine (#18853 ) * upgrade ngraph to v0.24 test=develop * simplify io test=develop	6 years ago
whs	6cccab9203	Make lod_append support variable lod. (#18908 ) test=develop	6 years ago
Jacek Czaja	cfcb96d2df	[MKL-DNN] Fix int8 performance regression (#18758 ) test=develop - optimization of TID to string test=develop	6 years ago
danleifeng	e0a2d4dfec	Add elementwise_pow_op backward implementation and the unit test codes of it. (#18848 )	6 years ago
Leo Zhao	10eeed93d1	Revert "use static variable to do cache instead of thread local in thread frequent switching case (#18428 )" (#18879 ) This reverts commit `ce38bb5341`. test=develop	6 years ago
tianshuo78520a	6cd1b71208	add DEFINE_int32/DEFINE_bool/DEFINE_string flag (#18869 )	6 years ago
Huihuang Zheng	0d3f16f53e	Try to modify external gflags to solve CI compilation (#18872 )	6 years ago
Zeng Jinle	8008ab4e6b	Remove legacy C++ memory optimization codes (#18834 ) * remove legacy memory optimization codes, test=develop * follow huihuang's comments,test=develop * follow luotao's comments, test=develop	6 years ago
Thunderbrook	52c1431eee	add clear_model interface in fleetwrapper (#18815 ) * dump slot * test * proto * dump slot * test * proto * code style * code style * code style * style * add delete after unseen days * add unseen days * code style * conflict solve test=develop * add clear model * code style test=develop * code style test=develop	6 years ago
Zeng Jinle	9a8a7a1ddc	fix affine_channel no_need buffer bug, test=develop (#18844 )	6 years ago
lvmengsi	829ef26281	Fix drop deconv (#18813 ) * replace link * update api.spec * fix mistake	6 years ago
Huihuang Zheng	cfce4994cf	Merge cuda 9/10 dockerfile with root dockerfile (#18693 ) Also fix a dependency error which may cause compile error	6 years ago
chengduo	4140fe11a4	Open fuse optimization ops (#18741 ) * open fuse optimization ops test=develop	6 years ago
Adam	ee02227949	Add LeakyReLU MKLDNN support (#18762 )	6 years ago
lidanqing	b05bdda0cf	remove unused TransposeINT8Op for higher UT coverage (#18791 ) test=develop	6 years ago
Zeng Jinle	a802da650b	Feature/mem opt pass refactor (#18735 ) * first version memory optimize pass, test=develop * remove move_tensor_sharing_pass, test=develop * refine code comments, add unittests, test=develop * turn off memory_optimize by default, test=develop * follow huihuang's comments, test=develop * follow chengduoZH's comments, test=develop * fix grammar error, add const qualifier, fix pass_test exception message, test=develop * follow chengduoZH's comments 2nd, test=develop	6 years ago
Physher	c5f47c2107	fix mul_mkldnn_op build failure (#18816 )	6 years ago
Physher	a5c986301c	clarify MKLDNN INT8 Mul Op attributes (#18685 )	6 years ago
FDInSky	cff5e2c173	fix roi_align_op cpu backward's bug (#18789 ) * test=develop fix cpu roi_align_op backward bug	6 years ago
石晓伟	9dbb62eeb9	Fix examples of API (#18092 ) * fix logical APIs test=develop test=document_preview * fix isfinite * update matmul comments * update API.spec test=document_preview test=develop * update API.spec test=document_preview test=develop * update API.spec test=document_preview test=develop	6 years ago
chengduo	292dfbce63	fix build strategy doc (#18725 ) test=develop	6 years ago
fuyinno4	c167a4b4dd	Fix shrink-dense and add scale-datanorm (#18746 ) Fix FleetWrapper: 1. fix shrink dense: just scale show 2. add datanorm scale: divide datanorm's gradient by batch_size	6 years ago
Bai Yifan	d3ac561d65	fix deformable_conv_op compile error, test=develop (#18793 )	6 years ago
lidanqing	9ecd8ee789	change ComputeINT8 to template version to remove checking dst_datatype code (#18756 ) * change INT8 to template so that checking dst_dt with if-else could be removed. CI will be enabled after fixing reviews * reverse user_residual_memory_p and user_bias_memory_p declaration scope test=develop	6 years ago
JesseyXujin	d9e7b5b5e9	fix bug of swish op formula,test=develop (#18772 )	6 years ago
Bob Zhu	220eef602e	Extend Matmul to support matrix multiplication with multiple heads (#18570 ) * extend matmul op to support multiple head multiplication With the support of multiple head, the multiplication of two big matrixes is split into multiplication of several (head_number) small matrixes. e.g. if Mat A is [3, 24] and Mat B is [24, 4], when multiple A and B with head_number as 4, Mat A will be split as 4 matrix of [3, 6] and Mat B will be 4 matrix of [6, 4]. The result of final matrix will be 4 matrix of [3, 4], i.e. [3, 16].	6 years ago
whs	075e1cf78e	Add python API for appending LoD level (#18702 ) * Make lod reset op support for append lod level. * Fix API.spec test=develop * Fix unitest. test=develop * Add python api for lod append. test=develop * Fix API.spec test=develop * Fix format of doc. test=develop * Fix unitest. test=develop * Fix doc. test=develop	6 years ago
chengduo	8259f1418f	Enhance backward process (#18700 ) * prune backward ops test=develop	6 years ago
JesseyXujin	25c9b57bcd	Modify auc doc. Add output variable description, previously was the scalar type, now changed to the tuple type.test=develop (#18771 )	6 years ago
Zhaolong Xing	26ae6d49e4	Update trt5 for paddle-trt (#18645 ) * update paddle-trt for: 1. fix bug: when batch > 2, core in split plugin. 2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.) 3. add new attr to dropout. 4. shuffle channel, swish, relu6 support test=develop * 1. fix ci test=develop	6 years ago
Thunderbrook	d8396281ef	add slot to sparse table (#18686 ) The change includes 2 things: 1. save delta model and shrink table are control by the same parameter before, now add delete_after_unseen_days to control shrink table. 2. value in sparse table has no slot before, now add slot in sparse table, and add DownpureCtrAccessor to support the new meta. test=develop	6 years ago
Jacek Czaja	95c1816ec0	[MKL-DNN] Extended LRN with reusing via Acquire API (#18675 ) test=develop - compilation fix - Yet another compilation fix - Even yet another compilation fix - Surprise! Again compilation fix - lint fixes test=develop - Fix to workspace acquire of LRN test=develop - Fix to hash of BWD LRN test=develop - fix to lrn BWD PD acquire test=develop - Fixing LRN PD creation test=develop - cosmetic fix in comment test=develop - Fixes after review test=develop	6 years ago
jiaqi	d18aabb472	support patch data, add load_one_table, fix bug (#18509 ) (1) support patch data (merge slots of instances of same line id, modify dense layer which changes its size) (2) add fleet load_one_table interface, support load from paddle model and load from pslib model (3) fix push sparse bug which cause push sparse cost more time (about 10% in my testcase) (4) when some slots are not in one of your network (join/update, etc.), data feed, collect label info, push/pull sparse will skip these slots, instead of throw error. (5) add more debug info in TrainFilesWithProfiler	6 years ago
chengduo	fd3aad6cb3	Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664 ) * support sparse gradients test=develop	6 years ago
wangchaochaohu	6b78e00da4	Cudnn convolution reconstruction (#18284 ) * rewrite the conv_op using cudnn_conv_helper * add workspace limit for v7 test=develop * fix test=develop * add half float test=develop * fix test=develop * fix test=develop * revise code style test=develop * fix test=develop	6 years ago
Yi Liu	157211c4e1	supports distributed classification (#18690 ) * supports distributed classification training * update API.spec * fix evenly division in python3 * change "index_range" to "index_num" in shard_index operator test=document_preview test=develop	6 years ago
qingqing01	3429e65aa8	Fix CPU implementation of roi_align_op backward (#18728 )	6 years ago
Tao Luo	bd22453f20	Revert "Add LeakyRelu MKLDNN support (#18656 )" (#18723 ) test=develop	6 years ago
tianshuo78520a	58469186c3	Change api approval people name (#18699 )	6 years ago
whs	189b08dc0d	Make infer shape of pad2d support for input with negative dims in compile time. (#18695 ) test=develop	6 years ago
Bai Yifan	7e3963f295	add license, test=develop (#18709 )	6 years ago
cjt222	ccf06a48b0	test=develop (#18701 ) add license	6 years ago
wangguanzhong	185b3acea1	fix clip_by_norm doc (#18688 ) * fix clip_by_norm doc, test=develop	6 years ago
Huihuang Zheng	89bc3fd841	Support memory eager deletion on recurrent OP (#17710 ) Test PaddingRNN on V100 GPU device. Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU. GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR) Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)	6 years ago
Jacek Czaja	0d8e6c9b8b	MKL-DNN upgrade to 0.20 (#18370 ) test=develop	6 years ago
Adam	d6b6a337a9	Add LeakyRelu MKLDNN support (#18656 ) test=develop	6 years ago
zhouwei25	772e09560e	Optimize the content of error reporting information, print error code and official document web sites (#18671 ) optimize the error reporting information of cuda related API index on develop: 130ac17 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into develop	6 years ago
Zeng Jinle	ae58afc546	Feature/auto_growth_allocator (#18561 ) * feature/auto_growth_allocator, test=develop * add unittest of AlignedAllocator, test=develop * try to turn on auto_growth to test on CI, test=develop * fix segmentation fault in mixed_vector.h, test=develop * add unittests, test=develop	6 years ago
hutuxian	bb2f5d24a2	hash_op support int64 hash_size (#18674 ) * hash_op support int64 hash_size * add corresponding UT	6 years ago
guru4elephant	5ed713d519	remove ctr reader, all functions are satisfied in dataset (#18672 ) * remove ctr reader, all functions are satisfied in dataset	6 years ago
guru4elephant	d714bf037c	remove async executor and add data_feed.proto to the deps of train demo (#18659 ) * remove async executor and add data_feed.proto to the deps of train demo	6 years ago
Yang Zhang	ce1ec33299	Add cuda implementation for `prelu` backward pass (#18633 ) * Add GPU implementation for `prelu` backward pass test=develop * Fix logic error in `prelu` GPU backward and simplify a bit test=develop * Fix `prelu` backward CUDA implementation test=develop CPU version was not used actually, so test passed	6 years ago
石晓伟	25d8079140	Fix Bitmain Predictor::Clone() (#18599 ) * update anakin-engine interfaces for content-dnn test=develop * support only-gpu mode of Anakin modify eltwise parse test=develop * modification for thread-safe test=develop * Integrated template instance test=develop * increase template parameters test=develop * support MLU predictor test=develop * update anakin cmake files test=develop * update TargetWrapper::set_device * update the initialization of anakin subgraph test=develop * use the default constructor of base class test=develop * load model from buffer with length test=develop * modify the access level of class test=develop * support anakin for bitmain arch test=develop * remove files * checkout cmakelists test=develop * modify interfaces test=develop * add cmake dependments test=develop * enforce the outputs of net test=develop	6 years ago
Yihua Xu	97549a4f13	[CPU] Fix the compiling issue with AVX512F macro. (#18634 )	6 years ago
baojun	256ba7cbb8	[NGraph] handle dim element 0 of ngraph op (#18568 )	6 years ago
chengduo	a6d468a265	fix PE fetch bug (#18644 ) test=develop	6 years ago
liuwei1031	759530966c	print out error code of cudaGetDeviceProperties if failed (#18643 )	6 years ago
Jacek Czaja	71d883b8ef	[MKL-DNN] Reimplemented pool2d mkl-dnn to use Acquire API (#18585 ) * - Added partial draft of pooling acquire - Workspace support - compilation fix - Added draft of pooling backward reimplementation - Segfault fix - reverted 'any' for diff_dst creation in pooling - Lint fixes test=develop - lint fixes test=develop - Further lint fixes test=develop * - Fixes after review test=develop * - Lint fixes test=develop * - Even more lint fixes test=develop	6 years ago
chengduo	f4ec7d54c8	fix bug of scatter op (#18640 ) test=develop	6 years ago
tianshuo78520a	112cf850b7	change pip install whl;test=develop (#18635 )	6 years ago
guru4elephant	ab57d3893e	make auc op compatible with 1 dim (#18551 ) * make auc op compatible with 1 dim	6 years ago
tianshuo78520a	de22215c8f	change const_cast error message (#18620 )	6 years ago
Leo Zhao	ff77dea969	not use transferscope cache in cpu case (#18578 ) * not use transferscope cache in cpu case test=develop * adjust variable name and add comments test=develop * use correct format for class member in operator.h * use correct format for class member in operator.cc test=develop	6 years ago
123malin	b414645a65	fix #17430 : training with int64-type attr behaves unexpectedly (#18264 ) * fix int64_t * update fill constant op unittest * add empty line	6 years ago
tangwei12	db212bb932	delete AllocatorFacade destructor (#18606 ) * delete m, test=develop	6 years ago
Kevin	995d7d8600	Modify embedding_op input dtype to int64 (#18598 )	6 years ago
kh2se2013	9ad57f2dfd	1) change to parallel mode on python coverage run (#18594 ) 2) add pip install coverage in Dockerfile.tmp test=develop	6 years ago
Tao Luo	076f833110	add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy (#18580 ) * add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy test=develop * enhance MkldnnPostReset test=develop * add comments for mkldnn_cache_capacity field test=develop	6 years ago
Hongyu Liu	a20b2b43fc	fix cudnn lstm shape bug; test=develop (#18492 )	6 years ago
gongweibao	c0a82748cf	Polish backwards optimizer dependency codes and use more default values. (#18255 )	6 years ago
Zeng Jinle	d3003a1620	Feature/buffer_shared_inplace (#17911 ) * feature/buffer_shared_inplace, test=develop * refine code, test=develop * fix elementwise_add op cpu inplace and sum inplace bug, test=develop * add unittest and debug log, test=develop * fix parallel_executor scope bug, polish code, test=develop * fix sum op, activation op, single_in_place_inference bug, test=develop * remove kLocalExecScopeName, test=develop * fix unittest,test=develop * fix out_var first version bug, test=develop * follow comments,test=develop	6 years ago
tianshuo78520a	1c10dac4f2	Add code example in CI (#18228 ) * test api example * update python * add sampcd_processor.py * add if 0 * sort * test paddle * test paddle * test paddle * add whitelist * change sampcd_processor.py * change sampcd_processor.py * change sampcd_processor.py * add exit * test=develop * test=develop	6 years ago
Zeng Jinle	be24e5b391	Clean unused code of dim and place (#18565 ) * clean code of dim and place, test=develop * fix failed unittests, test=develop	6 years ago
Jacek Czaja	8869d7f735	Activations MKLDNN ops refactoring (#18191 )	6 years ago
lujun	b6d5c74f69	update dygraph api doc for web (#18550 ) remove dygraph.enable from __all__; hide dygraph.profiler; add doc to dygraph.no_grad	6 years ago
Yibing Liu	b86234fc0b	Register fp16 for concat_op (#18563 )	6 years ago
Physher	5e1220ef37	fix compile error which caused by gcc4.8 related commit;test=develop (#18567 )	6 years ago
Jiabin Yang	667f88f9a6	Fix/gcc 4.8 ubt link error (#18558 ) * test=develop, fix docker with paddle nccl problem * test=develop, fix/gcc_4.8_ubt_link_error * test=develop, fix code format	6 years ago
Physher	0caa08ea40	Add mkldnn int8 mul-op kernel (#17834 )	6 years ago
LielinJiang	24d1c44a0c	Fix roi_perspective_transform_op bug (#18522 ) * fix transform matrix bug, test=develop * modify API.spec	6 years ago
Zhaolong Xing	88b52a27fe	Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532 ) * Fix Mask rcnn predictor 1. refine memory optim algorithm to support the model with the block op. 2. output diff : modify the affine channel fuse 3. add condition_block_infer op add interface for setting trt calib table dir test=develop * add the missing files. test=develop	6 years ago
石晓伟	1529154821	Support Bitmain Anakin (#18542 ) * update anakin-engine interfaces for content-dnn test=develop * support only-gpu mode of Anakin modify eltwise parse test=develop * modification for thread-safe test=develop * Integrated template instance test=develop * increase template parameters test=develop * support MLU predictor test=develop * update anakin cmake files test=develop * update TargetWrapper::set_device * update the initialization of anakin subgraph test=develop * use the default constructor of base class test=develop * load model from buffer with length test=develop * modify the access level of class test=develop * support anakin for bitmain arch test=develop * remove files * checkout cmakelists test=develop	6 years ago
tianshuo78520a	9b3d3b8387	Cancel jacquesqiao approval authority (#18538 )	6 years ago
Leo Zhao	ce38bb5341	use static variable to do cache instead of thread local in thread frequent switching case (#18428 )	6 years ago
gongweibao	160ddc980c	Regroup fusion by data type. (#18496 )	6 years ago
Tao Luo	fe32879d2a	add mkldnn shapeblob cache clear strategy (#18513 ) * add mkldnn shapeblob cache clear strategy test=develop * refine with comments test=develop * make cache clear strategy safer test=develop * add lock for GetShapeBlobSize test=develop	6 years ago
chengduo	e576f2667b	update docker build (#18523 ) test=develop	6 years ago
zhaoyuchen2018	832d8191ff	Fix topk cannot handle 1D vector bug (#18466 ) * Fix topk cannot handle 1D vector bug Add path to handle 1D vector test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	6 years ago
石晓伟	280a8784f7	Remove the obsolete cmake options (#18493 ) * remove the obsolete cmake options, test=develop * remove unittests, test=develop * delete options in paddle/scripts/paddle_build.sh	6 years ago
LielinJiang	43e17c7951	Add distributions of normal and uniform (#18023 ) * add_distributions_of_normal_and_uniform * paddle/fluid/API.spec * modify API.spec * modified paddle/fluid/API.spec, test=develop * modify paddle/fluid/API.spec, test=develop * modify paddle/fluid/API.spec, test=develop * fix some comment, test=develop * modify API.spec, test=develop * add comment for init function, modify hard code, test=develop * modify API.spec, test=develop * modify API.spec, test=develop * make unit test function shorter, test=develop * modify paddle/fluid/API.spec	6 years ago
bingyanghuang	3fe6bf5ee6	fix command line bug in int8v2 readme (#18507 )	6 years ago
tensor-tang	4828a5e008	core remove pycpuinfo (#18479 ) remove pycpuinfo deps in core	6 years ago
qingqing01	7ac4818a98	Refine Infershape in activation_op for double_grad. (#18485 ) * Refine Infershape in activation_op for double_grad.	6 years ago
qingqing01	602cb6a5b4	Enhance linear_lr_warmup (#18463 ) * make it support float/int learning as input.	6 years ago
chengduo	7453857324	Make fuse_all_reduce_op_pass support mix_precision (#17652 )	6 years ago
chengduo	55baeceddb	Enhance execution error info (#18482 ) * enhance execution error info test=develop	6 years ago
石晓伟	047bba855b	Remove the obsolete cmake options (#18481 ) * remove the obsolete cmake options, test=develop * remove unittests, test=develop	6 years ago
pkpk	e9c7e218f2	Nan debugger init (#18401 ) test=develop	6 years ago
Jiabin Yang	f72ced8814	test=develop, fix docker with paddle nccl problem (#18451 )	6 years ago
Tao Luo	3f3112ceb0	add shape_blob for cache mkldnn primitive (#18454 ) test=develop	6 years ago
Tao Luo	d234aa02cd	add transfer_scope_cache unit-test (#18467 ) test=develop	6 years ago
zhoukunsheng	7c6f2350b9	support Tensor input for edit_distance op (#18162 )	6 years ago
zhoukunsheng	26318544d2	support Tensor input for chunk_eval op (#18226 ) * test=develop support Tensor input for chunk_eval op * test=develop fix testcase for chunk_eval op * test=develop fix typos in nn.py	6 years ago
zhoukunsheng	206c44e2a8	add unique kernel and op (#17557 )	6 years ago
zhoukunsheng	71af72b1c2	upgrade hash op to support Tensor and LoDTensor input (#17998 )	6 years ago
zhoukunsheng	d3b3443d10	add ones_like op (#17388 )	6 years ago
zhoukunsheng	67b48d7fe7	add size op (#17412 )	6 years ago
Leo Zhao	8f5fffca0a	rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id() (#18453 ) * rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id() test=develop * update session id definition and adjust logic for default behavior test=develop * reset logic in mkldnn reuse as most of cases work in default. test=develop	6 years ago
Tao Luo	3123d18787	remove unused AnalysisPredictor::SetMkldnnThreadID() (#18444 ) test=develop	6 years ago
Yi Liu	a873fa84ce	supports collective training with programs (#18392 ) 1. Since allreduce op has 4 reduce types, We split these four reduce types into four ops 2. We also refined the collective op code, e.g. we separated the collective op kernel into CPUKernel and CUDAKernel, and remove the device specified DeviceContext parameter in template as we already knew the target DeviceContext 3. We remove the newly added Collective op role to reduce the complexity of program and graph analysis	6 years ago
tianshuo78520a	85b49d8473	fix the api.spec file does not get the class comment problem (#18439 ) * fix the api.spec file does not get the class comment problem * cat new.spec * check api.spec * test=develop	6 years ago
chengduo	e0d8c6ac68	Add find_no_grad_vars in backward.py (#17942 ) * add not_been_used_vars to no_grad_set test=develop	6 years ago
LielinJiang	449c7a9f98	Make roi_perspective_transform op return mask and transform matrix (#18371 ) * modify roi_perspective_transform_op to output mask and transform matrix * modify comment * modify comment * modify API.spec * update API.spec * remove no use header, test=develop * resolve conflict	6 years ago
tensor-tang	a3bc804f5f	fix mac ci random fail (#18430 ) * fix mac ci random fail * use platform instead	6 years ago
Michał Gallus	7023a86c3a	Fix Pooling output scale (#18186 ) * Int8: Fix Pooling output scale test=develop * Update scales quantization for certain operators These include: concat, transpose, pool and reshape. test=develop * Move concat minimum scale finding to quantizer test=develop	6 years ago
Brian Liu	4bc2987d2f	Fix bug in quantize kernel which cause crash in vgg16/19 model (#17964 ) * Fix bug in quantize kernel which cause crash in vgg16/19 model test=develop * refine the code to reduce verbose code; test=develop * remove useless code; test=develop	6 years ago
xsrobin	47e2ef38e9	add "import paddle.fluid as fluid" to examples that lack it	6 years ago
tianshuo78520a	92ecb305c2	test=develop (#18426 )	6 years ago
hutuxian	8a39e5c110	update api format (#18413 ) * update api format test=develop * update API.spec test=develop	6 years ago
jiaqi	93a2b317f7	fix data feed ptr error (#18419 ) fix data feed ptr runtime error: the pipeline trainer will core dump in some cases, so set it to nullptr as the default value.	6 years ago
tensor-tang	ce7a024c6d	fix py-cpuinfo mac random fail (#18383 ) * fix py-cpuinfo mac random fail * differentiate version on windows	6 years ago
Jie Fang	2b4ef509ea	init custom black white list (#18377 ) test=develop	6 years ago
Leo Zhao	681d3553f1	Fix potential mkldnn concat/pool/conv kernel issues (#18393 ) 1. some key generation method is not aligned with PR#17965 2. enlarge ptr lifetime to avoid memory release if SetBlob fails otherwise it will get core dump. test=develop	6 years ago
tianshuo78520a	052b044873	Fix mac build nproc command not found (#18362 ) * change nproc 8	6 years ago
Zeng Jinle	f5641000bb	Add a unittest to inplace elementwise_add (#18385 ) * add_elementwise_add_inplace_test,test=develop * rename file, test=develop	6 years ago
Jiabin Yang	43f64a177e	Fix/program doc (#17908 ) * test=develop, add some comments for Program.clone * test=develop, add API.spec * test=develop, refine comments * refine Program doc and clone doc * test=develop, refine doc	6 years ago
Jiabin Yang	af874a1f1d	test=develop, fix multigpu hang on latest docker (#18379 )	6 years ago
chengduo	871cc15e6a	Add is_compiled_with_cuda (#18356 ) * add cuda_is_available test=develop * Fix api.spec test=develop * fix api doc test=develop	6 years ago
lujun	fd6631ef2f	Fix dygraph show style (#18297 ) Fix dygraph show style for FluidDoc.	6 years ago
HaoRen	9931bc64f5	add dependency of collective_helper (#18365 ) * add dependency of collective_helper * test=develop fix dependency of collective_helper	6 years ago
翟飞跃	19da59ed3f	Remove all the code, API and doc of MKL-DNN INT8v1 (#18347 )	6 years ago
chengduo	8ed33bf91f	Fix Bug-prone code of PE (#18354 ) * update pe reduce config test=develop * drop the local_exe_scopes of the previous parallel_executor test=develop	6 years ago
tangwei12	999d9a59a5	fix communicator with pyreader (#18350 ) * add is_runnning in communicator, test=develop	6 years ago
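tianshuo78520a	cff2c2d83f	add combine_avx_noavx build to dockerfile A Dockerfile needs to be generated during the avx_noavx build. After generating the whl with the combine_avx_noavx parameter, the image could not be built because no Dockerfile had been generated. An option to generate the Dockerfile needs to be added.	6 years ago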
kh2se2013	27fb9cad65	add WITH_COVERAGE option, default OFF (#17872 ) * add WITH_COVERAGE option, default OFF test=develop * add coverage for python sdk test=develop * fix code style * fix COVERAGE_FILE path test=develop * remove coverage package test=develop * test = develop, run coverage as module	6 years ago
Michał Gallus	8409693272	Reset DeviceContext after quantization warmup (#18182 ) test=develop	6 years ago
HaoRen	b7128bac5f	supports collective communicated training (#18175 ) * fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop * supports collective training in executor * make fetch_list runable with variables, add more unittest for use_program_cache test=develop * fix comment test=develop * use unique name for nccl_id * supports output to stream in program_to_code * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code * set op role in collective training * add collective op role * remove orig file * add build optimizer by strategy * add collective strategy * refine collective strategy * add multi-process role maker * refine strategy building factory so that we can easily plugin more strategy * scale loss grad in collective sgd transpiler * add support for distributed fc * code format * revert some features for dist fc * add support for distributed fc training * fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop * supports collective training in executor * make fetch_list runable with variables, add more unittest for use_program_cache test=develop * use unique name for nccl_id * supports output to stream in program_to_code * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code * set op role in collective training * add collective op role * fix comment test=develop * remove orig file * add build optimizer by strategy * add collective strategy * refine collective strategy * add multi-process role maker * refine strategy building factory so that we can easily plugin more strategy * scale loss grad in collective sgd transpiler * add support for distributed fc * code format * revert some features for dist fc * add support for distributed fc training * test=develop add collective op unittest standard * test=develop remove the test_collective directory * test=develop remove the test_collective directory * remove slicegather test * code format for reducescatter * update attr of shard_index_op * Modify macro nccl_helper * remove test without distribute * macro collective_helper * marcro update * test=develop update support python3.5 * test=develop change gpu memory use to 0.1 when test * test=develop update ut equal func * test=develop set flags to 1.5 * test=develop fix pickle dumple py35 * test=develop fix divide in slice and add sync_comm_stream update atol and rtol to 1e-05 rm shard_index op and test modify read input from file to read from memory remove origin_program in framework and add i/o in c_sync_calc_stream * test=develop update unittest sync operator I/O	6 years ago
Sylwester Fraczek	9252e8fa08	add int8 mkldnn prior_box (#17242 ) add prior_box quantization code add scale algo rules for prior box test=develop	6 years ago
lidanqing	5fd68ac154	some fixes for int8 mobilenet_ssd tester (#18112 ) * some fixes for int8 mobilenet_ssd tester test=develop * change wrong data file name test=develop * change test images bin file from 200 images to 100 images * change directory existence to file existence during downloading test=develop * reuse download_data test=develop * run full dataset when iterations=0 test=develop	6 years ago
Jacek Czaja	c2efdfd5bc	[MKL-DNN] Extending reusing to Elementwise_add_mkldnn op (#18146 ) * - Reusing of reuder used in elementwise_add_mkldnn - Added MKL-DNN sum prim reusing test=develop - Compilation fixes test=develop - Yet another compilation fix test=develop - Yet another compilation fix test=develo - Yet another linking fix test=develop - Final compilation fix test=develop - lint fixes test=develop - Lint fixes test=develop * - Fixes after review test=develop	6 years ago
qingqing01	9047ac687e	Simplify multi_box_head API in detection.py and remove assign op. (#18310 ) * Simplify multi_box_head API in detection.py and remove assign op.	6 years ago
Zeng Jinle	5826b72e06	Refine CUDAPlace error message. (#18343 ) * refine cuda place error msg, test=develop * use LOG(ERROR)+exit(-1), test=develop	6 years ago
Tao Luo	3c9755bbb9	remove unused jemalloc option (#18314 ) test=develop	6 years ago
Yibing Liu	23941e43ec	Update lamb optimizer (#18333 ) * Update lamb optimizer test=develop, test=document_preview * Regenerate api spec test=develop, test=document_preview	6 years ago
chengduo	135a59ed45	update reduce config (#18334 ) test=develop	6 years ago
tensor-tang	81ec538279	fix softrelu doc (#18324 ) * fix softrelu doc test=develop * update API doc test=develop	6 years ago
Hongyu Liu	df2eee71d8	Sequence mask support tensor (#18249 ) * sequence mask support max length tensor input; test=develop * add rnn_impl.py; test=develop * add basic gru lstm unittest; test=develop * fix api spec; test=develop * fix sequence_mask op bug; test=develop test=document_preview * change +-x to elementwise_op; test=develop add mkl flag; test=develop * fix rnn impl bug; test=develop * update api spec; test=develop * fix doc bug; test=develop * fix lstm bugs; test=develop	6 years ago
Qiao Longfei	0e08e91c18	optimize communicator merge sparse gradient test=develop (#18159 ) * optimize communicator merge sparse gradient test=develop * revert multithread selected rows merge add test=develop * follow comment test=develop	6 years ago
chengduo	e06c69c788	Fix default value of fluid.memory_optimize (#18295 ) * fix default value of fluid.memory_optimize test=develop * fix api.spec test=develop	6 years ago
Zhaolong Xing	6978b2e48e	fix split and sampled softmax (#18280 ) test=develop	6 years ago
Yibing Liu	f57ee3693b	Fix the bug of sequence_unpad op (#18290 ) * Use TensorCopySync for sequence_unpad op test=develop * Fix the tensor memory alloc bug test=develop	6 years ago
chengduo	5489216eba	Clean build strategy (#18148 ) * clean build_strategy test=develop * DataBalanceOpHandle has been removed test=develop * debug * update build_strategy. test=develop	6 years ago
chengduo	14e1e165df	update alloc_continuous_space_for_grad_pass (#18287 ) test=develop	6 years ago
lujun	7e61baaa94	add Dygraph api to api.spec (#18235 ) add Dygraph api to api.spec	6 years ago
liuwei1031	a736c03b10	improve doc of lstm, sequence_enumerate, softmax_with_cross_entropy, space_to_depth APIs (#18261 ) * improve doc of lstm, sequence_enumerate, softmax_with_cross_entropy, space_to_depth APIs, test=develop * update API.spec, test=develop	6 years ago
flame	fdf798f95a	fix double buffer example (#18169 ) test=develop test=document_preview	6 years ago
Bai Yifan	23b8b18e56	fix api doc example, test=develop (#18266 )	6 years ago
xiaoting	2f0d68261c	fix yolo_box example,test=develop (#18247 )	6 years ago
songhao	6b3d96254d	fix some bug when merge sparse embedding parameters, test=develop (#18223 ) 1. fix the bug that out_put_var in SaveSelectedRows would be empty string 2. use merge_sparse_lookup_table to replace sum op for load_persistables_for_inference 3. fix the bug in _clone_var_in_block_ when the var is SELECTED_ROWS.	6 years ago
jiaqi	3f8031e256	dataset (#17973 ) (1) use channel instead of vector/BlockingQueue in Dataset, to stay consistent with the existing implementation and make the code more readable and flexible (dataset single output channel or multi output channel). One previous out-of-memory problem was caused by not releasing memory after training. (2) add Record because MultiSlotType costs too much memory (80B), fixing the out-of-memory problem. (3) add Channel, Archive in paddle/fluid/framework (4) change dataset from shared_ptr to unique_ptr in pybind (5) move create/destroy readers from trainer to dataset (6) move shuffle from datafeed to dataset. dataset holds memory, datafeed is only for loading data and feeding data to the network. (7) fix thread num bug of Dataset when filelist size < thread num (8) support set_queue_num in InMemoryDataset	6 years ago
liuwei1031	5d54ed4a84	improve the doc of DataFeeder and default_main_program (#18241 ) * improve the doc of DataFeeder and default_main_program * update API.spec, test=develop	6 years ago
xiaoting	b58bb80248	set src_idx > 0 for bilinear_interp_op (#18238 ) * set src_idx > 0, test=develop * add unittest and cu, test=develop	6 years ago
wopeizl	daa32d5383	fix package generation for inference test=develop (#18220 )	6 years ago
Shuai Yuan	9a32dad811	[DOC] Fix comment code of API create_py_reader_by_data (#18193 ) * [DOC] Fix comment code of API create_py_reader_by_data. test=develop, test=document_preview * Fix code style of API comment. test=develop,test=document_preview Fix code style of API comment. test=develop,test=document_preview * update api spec of api create_py_reader_by_data * remove default config code. test=develop * remove useless code. test=develop * update create_py_reader_by_data api. test=develop	6 years ago
Hongyu Liu	cefd0fb598	Fix slice op shape=-1 bug (#18107 ) * fix slice op bug; test=develop * fix variable test bug; test=develop * remove slice while true; test=develop	6 years ago
lijianshe02	ff4279e3b2	fix paddle.fluid.layers.io.open_files api doc bug test=develop (#18203 ) * fix paddle.fluid.layers.io.open_files api doc bug test=develop	6 years ago
chengduo	5588b923f3	Add multi process reader (#18115 ) * add multi process reader test=develop	6 years ago
wangchaochaohu	a9dc534f48	fix API example (#18153 ) * API.spec test=develop * update * update test=develop * update test=develop * update * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * test=develop * update * update test=develop * update test=develop * fix test=develop	6 years ago
翟飞跃	de42fe8fd5	Change int8v2 CAPI unit test name and add log in the prediction stage (#18200 ) * fix issue 18111;test=develop * fix timer;test=develop * refine code;test=develop	6 years ago
翟飞跃	802ea50956	fix spelling errors (#17941 ) * fix spelling errors; test=develop * Update API.spec update md5 * Update API.spec * change the order of api;test=develop	6 years ago
zhoukunsheng	0569ff78fa	Fix doc example for greater_equal, greater_than, less_equal, not_equal, rank, reduce_all, reduce_any, sign, where, diag (#18167 ) * test=develop fix greater_than, greater_equal, less_equal, not_equal, rank, reduce_all, reduce_any, sign, where, diag doc example * test=develop fix API.spec conflict	6 years ago
Huihuang Zheng	bbc292920c	Fix API example code (#18176 ) The fixed APIs: 6 Methods in paddle.fluid.io.PyReader paddle.fluid.layers.Preprocessor paddle.fluid.layers.py_reader paddle.fluid.io.save_params paddle.fluid.io.save_persistables test=develop test=document_preview	6 years ago
翟飞跃	78441c5449	add mkldnn Int8v2 slim doc (#17909 )	6 years ago
lvmengsi	d658f1133b	Fix doc for transpose, conv3d and batch_norm. (#18035 ) * update some op doc, test=develop	6 years ago
FlyingQianMM	944c3165ec	fix type error of std::pow in sigmoid_focal_loss_op.cu and sigmoid_focal_loss_op.h (#18152 ) * test=develop fix type error of std::pow in sigmoid_focal_loss_op.cu and sigmoid_focal_loss_op.h * test=develop fix wrong code style in sigmoid_focal_loss_op.cu and sigmoid_focal_loss_op.h	6 years ago
chengduo	25f3cd6486	Update execution_strategy option default value (#18183 ) * update execution_strategy option default value test=develop * fix doc error test=develop	6 years ago
chengduo	4978db2c10	Remove nccl dep when the number of GPU is 1 (#18158 ) * remove nccl dep when the number of GPU is 1 test=develop	6 years ago
Zeng Jinle	25ab23be28	Fix dygraph mem leak (#18082 ) * fix dygraph mem leak, test=develop * polish msg, test=develop	6 years ago
tensor-tang	1c6e560607	core replace x86cpu with py cpuinfo (#18151 ) test=develop	6 years ago
Zeng Jinle	6eec66a1b1	Fix py_reader iterable bug (#18108 ) * fix py_reader iterable bug, test=develop * move data from buffered_reader,test=develop	6 years ago
qingqing01	80d2e66f9e	Update backward appending strategy to support double backward and fix some bugs. (#18104 ) * Update backward.py: - If there is no input grad var in all outputs of previous ops, do not append this op into graph. - Only apply this strategy for double backward. * Update some double backward ops. * Update sum_op to judge whether a tensor is empty by numel or IsInitialized().	6 years ago
Wojciech Uss	ca5642c850	unify FP32 vs. INT8 comparison tests output (#18111 ) test=develop	6 years ago
Wojciech Uss	c26130f3a9	reuse C-API INT8 unit test application (#18077 ) * reuse C-API INT8 unit test application test=develop * updates after review test=develop	6 years ago
FlyingQianMM	ff83655f7e	add detection output operator for supporting retinanet (#17896 ) * test=develop add detection output for supporting retinanet * test=develop add test_layers.py * test=develop add API.spec * test=develop alter test_retinanet_detection_output.py * test=develop alter round 2 * test=develop alter retinanet_detection_output * test=develop alter paddle/fluid/API.spec * test=develop alter detection.py * test=develop alter retinanet_detection_output * test=develop alter paddle/fluid/API.spec * test=develop alter detection.py * test=develop alter API.spec * test=develop alter retinanet_detection_output * test=develop alter paddle/fluid/API.spec * test=develop alter python/paddle/fluid/tests/unittests/test_retinanet_detection_output.py * test=develop alter python/paddle/fluid/tests/unittests/test_retinanet_detection_output.py * test=develop fix grammar error * test=develop fix grammar error * test=develop fix grammar error * test=develop alter python/paddle/fluid/tests/unittests/test_layers.py * test=develop alter paddle/fluid/API.spec	6 years ago
FlyingQianMM	0aee1f0074	add sigmoid focal loss operator for supporting retinanet (#17895 ) * test=develop add sigmoid_focal_loss for supporting retinanet * test=develop add test_layers * test=develop add API.spec * test=develop alter sigmoid_focal_loss_op.cc * test=develop alter detection.py * test=develop alter API.spec * test=develop alter round 1 * test=develop alter sigmoid_focal_loss * test=develop alter sigmoid_focal_loss_op.cc * test=develop alter test_layers.py * test=develop alter paddle/fluid/API.spec * test=develop alter sigmoid_focal_loss_op.cu * test=develop alter paddle/fluid/operators/detection/sigmoid_focal_loss_op.cc	6 years ago
FDInSky	9e4b9d9798	Update generate_proposal_labels_op to support CascadeRCNN. (#17200 ) * Update generate_proposal_labels_op to support CascadeRCNN.	6 years ago
FlyingQianMM	9ed2f936f1	add target assign operator for supporting retinanet (#17893 ) * test=develop add target assign for retinanet * test=develop run ci * test=develop add test_layers * test=develop add API.spec * test=develop alter round 1 * test=develop alter rpn_target_assign_op.cc * test=develop alter test_rpn_target_assign_op.py * test=develop alter rpn_target_assign_op.cc * test=develop alter API.spec * test=develop alter paddle/fluid/operators/detection/rpn_target_assign_op.cc * test=develop alter rpn_target_assign_op.cc * test=develop alter python/paddle/fluid/layers/detection.py * test=develop alter paddle/fluid/API.spec	6 years ago
Huihuang Zheng	7faf095618	Sync Dockerfile change of PR#17889 (#18072 ) Jian Tang made change on latest-dev Dockerfile, so sync the change in the cuda9/10 Dockerfile test=develop	6 years ago
Sylwester Fraczek	accb132f0f	fix slim int8 mkldnn multithreading issue (#18009 )	6 years ago
tianshuo78520a	2e1d8cf7c8	add approval to requirements.txt add luotao to approval requirements.txt	6 years ago
chengduo	24e988a471	Fix bug of scope_buffered_ssa_graph_executor (#18100 ) * fix code bug test=develop	6 years ago
Huihuang Zheng	3f55ab0f89	Modify format of GPU allocation failure log. (#18034 ) As title test=develop	6 years ago
gongweibao	f5caf3443c	Fix reinitialized ncclid error! (#18025 )	6 years ago
whs	354643d8d9	Add warning for cudnn warpctc kernel in CUDA9\CUDA10. (#18046 ) test=develop	6 years ago
qingqing01	e81756f1ba	Hide paddle.fluid.layers.detection_map. (#18033 ) * Remove layers.detection_map API * Users can use fluid.metrics.DetectionMAP to calculate both current-batch and cumulative-batch mAP, whereas layers.detection_map can only calculate current-batch mAP.	6 years ago
Yiqun Liu	660c1a65f3	Optimize fused_elewise_activation_grad op. (#18041 ) test=develop	6 years ago
lidanqing	466254151a	add MobileNet-SSD int8 analyzer tester (#18075 ) * add pascalvoc preprocess script and mobilenet-ssd analyzer_tester, wait 17737 * change converting local dataset to downloading and converting tarfile test=develop * change the test data_path test=develop * change copyright (c) 2016 to copyright (c) 2019 test=develop	6 years ago
石晓伟	42f12a4aca	fix ci test cmake test=develop (#18060 )	6 years ago
chengduo	b5a1c1463d	Update CPU_NUM config (#18059 ) * update CPU_NUM config test=develop	6 years ago
lidanqing	f8ecc3de89	refactor the function ConvFwdPrimitiveDesc (#17897 ) * refactor the function ConvFwdPrimitiveDesc test=develop * change according to review test=develop * use pointer way without boost::optional test=develop * pass vector to function by reference instead of raw vector test=develop * change pointer to shared_ptr test=develop	6 years ago
Michał Gallus	8462e2b805	Disable MKLDNN FC in Resnet50 test (#18030 )	6 years ago
Wojciech Uss	78e932862c	Added unit test for QAT FP32 & INT8 comparison (#17814 ) * added unit test for QAT FP32 & INT8 comparison test=develop * enabled other models and updated filenames test=develop * added accuracy check and multiple batch handling test=develop * removed quantization_mkldnn_pass.py test=develop * cleanup test=develop * updated model paths test=develop * renamed tests without MKL-DNN test=develop * fix reusing mkldnn pool2d primitive test=develop * add performance measuring test=develop * fix accuracy statistics test=develop * removed non-mkldnn tests test=develop * added conv2d_depthwise->conv2d mkldnn transformation test=develop * format update test=develop * fixed creating key for pool2d grad test=develop * added pass * Fix the accuracy issue while using float precision to get the scale. test=develop * Fix the format issue when 'X' is not nchw. test=develop * removed output comparing and changed number of images test=develop * cmake and comment fix test=develop * updated acc threshold for QAT comparison tests test=develop * added OMP_NUM_THREADS setting test=develop * enable all QAT INT8 tests test=develop * restored upstream version of a file test=develop * modified directory names test=develop	6 years ago
tensor-tang	566bf2ec56	concat op support negative axis (#18045 ) test=develop	6 years ago
Yiqun Liu	7e463c84a6	Optimize the concat and split cuda implementation for cases when the number of inputs/outputs is less than 5. (#17979 ) test=develop	6 years ago
tangwei12	101f74cb19	fix save/load in fleet (#17675 ) * fix save/load in Fleet * add UT framework of Fleet	6 years ago
hutuxian	f1d458daf0	add trainer_desc proto DEPS (#18019 )	6 years ago
Guo Sheng	a06b316b94	Fix GetExpectedKernelType of add_position_encoding_op (#17935 ) * Fix the GetExpectedKernelType of add_position_encoding_op. test=develop * Fix the doc of lstm_unit outputs in nn.py. test=develop	6 years ago
tensor-tang	5c06bff222	combine noavx and avx package (#17889 ) * support avx and noavx core * add catch and give some log test=develop * fix build test=develop * add missing package test=develop * fix pybind name test=develop * fix import error test=develop * conbime noavx core test=develop * add requirements test=develop * fix unkown message test=develop * fix api spec test=develop * refine and clean test=develop * update * pass dist ut * follow comments test=develop * refine scripts test=develop	6 years ago
wawltor	8eb134c3c1	Fix scatter and gather op when has duplicate index (#17952 ) * test=develop The scatter op has a calc bug when the indices has same index, the scatter op use overwrite mode to calculate the same index, fix this bug by using the accumulate mode to calculate the same index. At the same time, the gather op has the same bug when the op calc the grad. And we use the lib of open-blas and eigen to optimize the time cost in accumulate mode. * test=develop Fix some code format problem, and the same time add the test case in gather and scatter op	6 years ago
lujun	75fcd29220	update load_error_info, test=develop (#18000 ) Repair error prompt: Users are prompted to check whether the model or parameter files are damaged when loading parameters are wrong.	6 years ago
石晓伟	04ea7cb069	modify the access level of anakin engine (#18015 ) test=develop	6 years ago
wawltor	2ae8decc90	test=develop (#17984 ) Fix bug in sequence_unpad op, when allocate the output memory do not match actual memory, check memory failed. Fix this bug by allocating the output memory in correct code position.	6 years ago

15854 Commits (f04f2b232a22c9aba3ee4538ab708acf9f77c813)