* fix fetch handler problem and refactor
When a user defines a FetchHandler class, he or she should initialize the handler
with a variable dict. The keys of the variable dict are user-defined names, and
the values are Variables generated from the Python API.
On each fetch, the user-implemented handler function is called; inside it,
fetched_result_dict is available, and the user can access the fetched values
by the user-defined keys.
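A minimal sketch of the intended usage, assuming FetchHandler is exposed under paddle.fluid.executor and that the keyword names (var_dict, fetched_result_dict) follow the description above:

```python
import paddle.fluid as fluid

# build a Variable with the Python API; "loss" is the user-defined key
data = fluid.layers.data(name="x", shape=[1], dtype="float32")
loss_var = fluid.layers.mean(data)

class MyFetchHandler(fluid.executor.FetchHandler):
    def handler(self, fetched_result_dict):
        # look up the fetched value by the user-defined key
        print(fetched_result_dict["loss"])

handler = MyFetchHandler(var_dict={"loss": loss_var})
```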
* Disable fusion_group pass for Windows and Mac. We will do some experiments on Linux first.
test=develop
* Print the subgraph when check failed.
test=develop
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearrangement of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions.
test=develop
* Improve the loading statements in generated codes.
test=develop
* Remove unused arguments from formal list.
test=develop
* copy some feasigns and corresponding embeddings from one sparse table to another
* copy all feasigns and corresponding embeddings from one sparse table to another
* copy all dense params from one table to another
* copy some local vars to other local vars
* Add the check of lod_level between compile-time and runtime.
test=develop
* Fix bug in check_compile_vs_runtime.
test=develop
* Fix the check of output when it is dispensable or intermediate.
test=develop
* Share lod of x to out in match_matrix_tensor op in compile-time.
* Implement GetLoDLevel in InferShapeContext.
* Set the default value of check_compile_vs_runtime to False and enable it in test_sequence_pad_op.
test=develop
* Enable check_compile_vs_runtime in test_match_matrix_tensor.
* Add the implementation of SetLoDLevel in InferShapeContext.
* Remove the implementation of IncreaseLoDLevel and call Get/SetLoDLevel instead.
* Remove the implementation of DecreaseLoDLevel and call Set/GetLoDLevel instead.
* Refine some ops and unittests.
test=develop
* Fix a typo.
test=develop
* Remove the check of var type, and change int to int32_t.
test=develop
* Add unittest for Get/SetLoDLevel.
test=develop
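For context, a short illustration of what lod_level means at runtime (standard fluid API; the values are arbitrary):

```python
import numpy as np
import paddle.fluid as fluid

# pack a batch of 3 sequences (lengths 2, 1, 3) into one LoDTensor,
# which gives a runtime lod_level of 1
data = np.arange(6).reshape(6, 1).astype("float32")
t = fluid.create_lod_tensor(data, [[2, 1, 3]], fluid.CPUPlace())
print(len(t.lod()))  # 1; the new check compares this with the compile-time lod_level
```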
* Add the definition of operation in fusion_group.
* Use operations in OperationMap to detect fusion_group of elementwise pattern.
* Add namespace fusion_group in code_generator.
* Use operations recorded in OperationMap to generate code.
* Move implementation codes to .cc file.
* Refine Operation and CodeGenerator to make it easier to generate code for grad_op.
Refine the unittest for better reuse.
* Avoid recording the template's keyword in an array.
* Support the generating of code for grad_op and add unittest.
test=develop
* Remove replaced_element_in_order and use number instead.
test=develop
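To make the OperationMap-driven generation concrete, here is a toy sketch (not the actual implementation) of expression-template substitution:

```python
# hypothetical map from op type to a C-like expression template
OPERATIONS = {
    "elementwise_add": "${0} + ${1}",
    "relu": "${0} > 0 ? ${0} : 0",
}

def generate_expression(op_type, args):
    """Substitute argument names into the recorded expression template."""
    expr = OPERATIONS[op_type]
    for i, name in enumerate(args):
        expr = expr.replace("${%d}" % i, name)
    return expr

# relu(x + y) -> "(x + y) > 0 ? (x + y) : 0"
tmp = "(%s)" % generate_expression("elementwise_add", ["x", "y"])
print(generate_expression("relu", [tmp]))
```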
* Enrich the type of error and declare the error type interfaces, test=develop
* adjust tests to adapt to the new form, test=develop
* add inference deps with error_codes.pb.h, test=develop
* restore stack iter start pos, test=develop
* polish code based review comments, test=develop
* remove duplicate code and duplicate config of master+patch
* drop all ins which have a conflicting slot or size < merge_size
* the user only needs to set merge_size; if the number of ins with the same id is not equal to merge_size, those ins are simply dropped
* the user must make sure master data and patch data have no same slot whose feasigns are both non-zero, otherwise those ins will be dropped (the slot list should still be the same for both master and patch); see the sketch after this block
* test=develop
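A toy sketch of the drop rules above (the data layout is hypothetical: each ins is a dict mapping slot to its feasign list, and a group holds all ins sharing one id):

```python
def should_keep(group, merge_size):
    """Keep a group only if it has exactly merge_size ins and no slot
    has non-zero feasigns in more than one ins (no conflicting slot)."""
    if len(group) != merge_size:
        return False
    for slot in group[0]:  # slot lists of master and patch must match
        if sum(1 for ins in group if ins.get(slot)) > 1:
            return False
    return True

# master and patch contribute feasigns to disjoint slots -> kept
master = {"slot_a": [101, 102], "slot_b": []}
patch = {"slot_a": [], "slot_b": [201]}
print(should_keep([master, patch], merge_size=2))  # True
```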
* support no need buffer vars in dygraph, test=develop
* fix inference compilation error, test=develop
* update no_need_buffer_vars_inference, test=develop
* add unittests for no_need_buffer_vars_context, test=develop
* refine no_need_buffer_vars by return ref, test=develop
* polish some codes, test=develop
* Refine the cache of program, context and scope in executor.
test=develop
* Refine the unittest test_executor_and_use_program_cache.
* Add the test the PaddingRNN with use_program_cache=True.
test=develop
* Remove a check.
test=develop
* Refine the unittest to check whether it is correct when setting use_program_cache=True.
test=develop
* Refine the InferShape of ReadFrom and WriteTo op, and add a comment to explain why ShareLoD is not called for runtime.
test=develop
* Add a comment for ReorderLoDTensorByRank op.
* Add a comment for lod_tensor_to_tensor_array op to explain why DecreaseLoDLevel is called only for compile time.
test=develop
* ShrinkRNNMemory op should call ShareLoD for compile time.
test=develop
* Add the implementation of IncreaseLoDLevel and add the compile-time check of lod_level in InferShape of sequence_pool.
test=develop
* Refine the unittest of DynamicRNN.
test=develop
* Change PADDLE_ENFORCE to PADDLE_ENFORCE_NE.
test=develop
* Add fusion_group_pass and elementwise pattern.
* Rewrite the detector of elementwise group.
test=develop
* Add a comment in codegen.
* Add more unittest cases.
test=develop
* Move code_generator related code to fusion_group directory.
* Correct the include path.
* Add the definition of SubGraph and finish the insert of fusion_group op in pass.
* Insert graph_vis_pass in tester to visualize the graph for debug.
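A toy sketch of detecting elementwise groups, assuming a simplified graph representation (op name -> (op type, downstream op names)) and traversal in the downstream direction only:

```python
ELEMENTWISE = {"elementwise_add", "elementwise_mul", "relu", "sigmoid", "tanh"}

def detect_groups(graph):
    """Collect connected components whose op types are all elementwise."""
    seen, groups = set(), []
    for op, (op_type, _) in graph.items():
        if op in seen or op_type not in ELEMENTWISE:
            continue
        group, stack = [], [op]
        while stack:
            cur = stack.pop()
            if cur in seen or graph[cur][0] not in ELEMENTWISE:
                continue
            seen.add(cur)
            group.append(cur)
            stack.extend(graph[cur][1])
        groups.append(group)
    return groups

g = {"add": ("elementwise_add", ["act"]), "act": ("relu", ["mm"]),
     "mm": ("mul", [])}
print(detect_groups(g))  # [['add', 'act']]
```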
* replace part of the old implementation, test=develop
* restore concat op, test=develop
* update all ops' implementations & delete GetDataTypeOfVar func, test=develop
* no longer need to define the embedding layers of all slots (every single one) in each program; make trainer_param repeated in ps.proto.
* add find_distributed_lookup_table_grads instead of hard-coding GRAD
* support embedding stop gradient. push sparse had an error before this fix.
* fix fill sparse: skip slots which do not have embeddings. Before this fix, each slot's embedding in a sparse table had to be used in all training programs.
* fix pull sparse: skip slots which do not have embeddings.
* fix collecting feasign label info: skip slots which do not have embeddings.
* support multiple sparse tables in one or multiple training programs: each program can pull/push its own related sparse tables instead of all sparse tables.
* test=develop
* - Flushing mkl-dnn cache
test=develop
- Disabled clearing cache for LoadModel
- Added clearing of mkl-dnn cache when Executor is created
test=develop
- Do not clear for GPU places
test=develop
- compilation fix
test=develop
* - Moved clearing of mkl-dnn cache to the destructor of executor
test=develop
* - Compilation fix
test=develop
- Reverted conditional clearing of mkl-dnn cache in Executor's
destructor
test=develop
- compilation fix
* Writing a custom op needs to follow the framework OP spec.
* Package fluid_framework.so and headers into whl.
* Add paddle.sysconfig.get_include() and paddle.sysconfig.get_lib() to get include dir and lib dir.
* Export some C-APIs to merge OpInfo between core.so and custom_op.so.
* Add unit testing.
* Update API.spec.
* Follow Wangzhen's comment in PR 18970, test=develop
* Review comments, test=develop
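Usage of the two new helpers when compiling a custom op against an installed wheel:

```python
import paddle

# directories holding the framework headers and fluid_framework.so,
# to pass to the compiler/linker when building custom_op.so
print(paddle.sysconfig.get_include())
print(paddle.sysconfig.get_lib())
```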
* Leave fake quantization around mul
test=develop
* Replace Fake with Real Quantized Mul
test=develop
* Fix bug in quantize placement pass
Nodes in the graph are now checked by type instead of by node name when they are to be marked for quantization.
test=develop
The new "fluid.data" changes old "fluid.layers.data":
1. Add shape and dtype check.
2. Remove "append_batch_size" parameter. We won't offer this in the new data layer because other deep learning platforms don't have this kind of data layer pre-processing. It may confuse users.
3. Remove "stop gradient" parameter because the data layer doesn't do back-propagation
TODO:
Now data layer feeded by executor is checked, will we want to check the feed data of readers in the future?
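A minimal before/after sketch of the data-layer change (shape values are illustrative):

```python
import paddle.fluid as fluid

# old: fluid.layers.data prepends the batch dimension itself
# (append_batch_size=True) and does not validate the fed data
x_old = fluid.layers.data(name="x_old", shape=[784], dtype="float32")

# new: the full shape is written out (-1 for a variable batch size),
# and the shape/dtype of fed data are checked at feed time
x_new = fluid.data(name="x_new", shape=[-1, 784], dtype="float32")
```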
* support change shuffle thread num
* support change train thread num
* fix receive shuffle data of each channel
* data norm stop gradient
* add check thread_tensor type and root_tensor type when merge metric
* remove sleep in shuffle, add config
* add config of pslib client to client communication
* fix xbox str
* add data norm op testcase
* add flush in trainer finalize
* Set states of recurrent op as dependent vars in prune of save inference model
This PR fixes the save/load inference model problem of RNN models.
The reason for the bug is that save_inference_model prunes OPs that don't contribute to the Output. But in recurrent_op, States are not Outputs, so OPs that refer to States were pruned.
This fix adds the States of recurrent_op as dependent vars so that OPs referring to States won't be pruned.
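For context, the pruning entry point this fix touches (a runnable sketch with a trivial network; an RNN model would hit the States issue described above):

```python
import paddle.fluid as fluid

exe = fluid.Executor(fluid.CPUPlace())
x = fluid.data(name="x", shape=[-1, 4], dtype="float32")
pred = fluid.layers.fc(input=x, size=2)
exe.run(fluid.default_startup_program())

# prunes all OPs that do not contribute to target_vars; with the fix,
# OPs producing recurrent_op States are also kept as dependencies
fluid.io.save_inference_model(dirname="./infer_model",
                              feeded_var_names=["x"],
                              target_vars=[pred],
                              executor=exe)
```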
* Add fc_elementwise_layernorm_fuse pass and unittest.
* Add fused_fc_elementwise_layernorm op and its GPU kernel.
test=develop
* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
* Add the setting of attrs in the definition of binary_op.
test=develop
* Add comment.
* Implement the unittest.
test=develop
* Change the unittest name of layer_norm.
test=develop
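A NumPy sketch of the computation the fused op replaces (parameter names are illustrative, not the op's attribute names):

```python
import numpy as np

def fused_fc_elementwise_layernorm(x, w, b, y, gamma, beta, eps=1e-5):
    """layer_norm(fc(x) + y): the fc + elementwise_add + layer_norm
    pattern matched by fc_elementwise_layernorm_fuse_pass."""
    tmp = x @ w + b + y                      # fc followed by elementwise_add
    mean = tmp.mean(axis=-1, keepdims=True)  # layer_norm statistics
    var = tmp.var(axis=-1, keepdims=True)
    return (tmp - mean) / np.sqrt(var + eps) * gamma + beta
```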
* refactor dygraph,test=develop
* fix failed unittest,test=develop
* polish code,test=develop
* check windows ci error,test=develop
try to fix windows ci error by np.allclose,test=develop
* polish vlog and profiler, test=develop
* try to fix preceding ops order,test=develop
* test transformer in windows ci, test=develop
* use python c-api to speed up tracer.trace,test=develop
* test=develop, fix docker with paddle nccl problem
* test=develop, add ut for debug string and gradient_accumulator
* test=develop, add tests for layer/gradient_accumulator/prepared_op
* test=develop, fix compile error for test_prepared_op
* test=develop, add more ut for dygraph
* test=develop, create API.spec for dygraph api change
* add transform_data to dygraph
* test=develop, refactor names to make them easier to understand
* test=develop, refactor names to make them easier to understand
* add test and change input to const ref for safety
* test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ
* add ut for data transform
* refine ut for data_transform
* test=develop, fix ut failed on parallel se-resnext
* test=develop, change one more PADDLE_ENFORCE
* add test_tracer on multiple devices
* test=develop, change place to mutable for data transform
* test=develop, add transform data on same place test and remove useless log
* test=develop, Add a TODO for data layout and a ut for conv2d with no bias
* Refine the codes related to fc op.
* Add GPU implementation for fc functor.
* Apply fc_fuse_pass in GPU inference.
test=develop
* Change the cmake for fc op.
* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
* Add an attribute to set the activation type in fc_op.
* Enhance the unittest of fc_op.
test=develop
* Remove the declaration of FCOpGrad back to the header file.
test=develop
* Set default value for newly added arguments in test_fc_op.
test=develop
* Enhance fc_fuse_pass to enable fusing relu.
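A NumPy sketch of what the new activation attribute means for the fused op (a sketch, not the kernel):

```python
import numpy as np

def fc_with_activation(x, w, b, activation_type=""):
    """fc output with an optional fused activation; fc_fuse_pass sets
    activation_type to "relu" when it matches an fc + relu pattern."""
    out = x @ w + b
    if activation_type == "relu":
        out = np.maximum(out, 0.0)
    return out
```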
* Allow print the shapes of var_desc in graph.
test=develop
* Enhance fc_fuse_pass_tester.
* Remove the use of PADDLE_ENFORCE.
test=develop
* Correct the number of ops after fusing.
test=develop
* Fix a typo.
test=develop
* Set activation_type to null when there is no relu in fc.
test=develop
* Refine fc_fuse_pass's codes.
* Enable the set of shape for tensor.
* Refine repeated_fc_relu_pass and add unittest.
test=develop
* Open fuse all reduce op
test=develop
* Add Fuse optimization op log
* Add log in fuse_optimizer op pass and fuse all_reduce op pass
* replace with boost::optional<bool>
test=develop
* Polish code
test=develop
* fix code coverage
test=develop
TemporaryAllocator is a singleton used for allocating memory for cuDNN. Since it is a singleton, removing it gives better memory performance.
We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream, avoiding the singleton.
Also added data_feed_proto to operator to fix CI for CPU compilation.
* Add an interface to enable cudnn for inference.
* Add cudnn_placement_pass.
test=develop
* Set the default value of cudnn_enabled_op_types to null.
test=develop
* Write the common basic class, placement_pass_base, to refine the codes.
test=develop
* Call EnableCUDNN in unittest.
test=develop
* Refine cudnn_placement_pass tester.
* Enable the testing of cudnn_placement_pass in inference's unittest.
test=develop
* Add the check of op kernels.
test=develop
* Support looking up embeddings from BoxPS.
* Add a _pull_box_sparse op; for now this op is not exposed to users.
* Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on.
* Add 'BoxPSDataset' in python code.
* Add a compile option WITH_BOX_PS and a MACRO PADDLE_WITH_BOX_PS.
* Add UT.
* For more concrete information, please refer to: https://github.com/PaddlePaddle/Paddle/pull/18982
- Refactor step 1
- Compilation fix
- Yet another compilation fix
- Even more compilation fix
- Lint fixes
test=develop
- Removed deprecated PADDLE_ENFORCE occurrence
test=develop
- Candidate fix to BN forward
- Lint fixes
test=develop
- Refactoring in data_layout_transform
- compilation fix
- Another compilation fix
- Step further into darkness
- Yet another compilation fix
- Yet another compilation fix
- missing header
- compilation fix
- Added MKLDNN -> Paddle conversion in fetch op
test=develop
- Compilation fix
test=develop
- Lint
test=develop
- Mul fix
- Fix to MKLDNN MUL op and Elementwise MUL UT
test=develop
- Workaround for different weights-with-groups representation in Paddle vs
MKL-DNN.
test=develop
- Candidate fix for 5D convolution with groups
- Refactor of fix for conv3d and conv2d in fetch op
test=develop
- Compilation fix
- Still same compilation fix
- Compilation fix
- Compilation fix
- Reverted refactoring of fixes
- Adapted test_conv2d_int8_mkldnn so it expects data in NCHW format,
not NHWC
test=develop
- minor fix in UT
test=develop
- Lint fixes
test=develop
* Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true.
test=develop
* Delete dropout_op directly when upscale_in_train is true.
test=develop
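The inference-time semantics the pass relies on, as a sketch:

```python
def dropout_at_inference(x, dropout_prob, upscale_in_train):
    """With upscale_in_train, dropout is an identity at test time, so the
    op can be deleted; otherwise it is exactly a scale by (1 - dropout_prob),
    so it can be replaced by a scale_op."""
    if upscale_in_train:
        return x
    return x * (1.0 - dropout_prob)
```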
* Improve the debug string by printing the op_desc information.
* Fix the case when dropout's input x is reused as the next op's output.
* Add the pass to inference.
test=develop
* Change the log level.
test=develop
* Add unittest for inplace case.
* Add comment to explain the pass.
* Apply the pass for CPU inference.
test=develop
* Fix the typo.
test=develop
* Add the check of AttrType.
test=develop
* fix correctness of the communicator
* fix a bug in send thread when sending var context is empty, test=develop
* add lookup_table_prefetch_op and prefetch optimize, test=develop
* remove remote prefetch GPU supported
* word2vec force with CPU, test=develop
* test dist remote lookup table force with CPU, test=develop
* add pybind interface to get all inplace ops, test=develop
* enhance OpTest to check the consistency of operator results when using and not using inplace, test=develop
* handle corner cases in op_test, test=develop
* support outputs without tensor holder_, like XShape in reshape_op, test=develop
* fix bug: some ops have GradOpMaker, but actually no grad_op in OpInfoMap, test=develop
* use reshape_grad instead of reshape in FlattenGradOp, test=develop
* fix erroneous debug dims info for variables like XShape, test=develop
* change the computational order in sum_op to relieve the computation difference introduced by inplace, test=develop
* add inplace_atol to check group_norm, and skip inplace_grad for mkldnn, test=develop
* follow sneaxiy's comments, test=develop
* remove unused DefaultGradOpDescMaker in mkldnn op, test=develop
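A toy sketch of the consistency-check idea (run_op and its inplace flag are hypothetical stand-ins for OpTest's machinery):

```python
import numpy as np

def check_inplace_consistency(run_op, x, inplace_atol=0.0):
    """Run an op with and without in-place buffer reuse and compare;
    inplace_atol relaxes the comparison for ops like group_norm."""
    out_plain = run_op(np.array(x, copy=True), inplace=False)
    out_inplace = run_op(x, inplace=True)  # may overwrite x's buffer
    assert np.allclose(out_plain, out_inplace, atol=inplace_atol)
```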