Paddle

Commit Graph

Author	SHA1	Message	Date
wangchaochaohu	fb34bdb40c	API/OP(fill_constant) error message enhancement (#23584 )	5 years ago
liuwei1031	2fd728a978	add new dot op(#23418 )	5 years ago
chenhaoze	9b06dd8628	Add three passes and api reference of paddle_pass_builder. test=develop (#23741 ) * Add three passes and api reference of paddle_pass_builder.h	5 years ago
xujiaqi01	d98084e7ec	add save with prefix (#23449 ) * add save with prefix * test=develop	5 years ago
joanna.wozna.intel	5ee099ca57	Op-requant squash (#23665 ) * Op-requant squash test=develop * Add matmul to op-requant test test=develop	5 years ago
hutuxian	94a3789fd0	Add AfsAPI in PaddleBox (#23419 ) * Involves AfsAPI to resolve slow downloading. * Mainly used in PaddleBox	5 years ago
liym27	06d4aa4e73	API (BuildStrategy) error message enhancement. (#23462 )	5 years ago
Zhen Wang	84cd45f674	Solve the conflict of ops with the same name, test for CI. (#23573 ) * solve the conflict of ops with the same name. test=develop	5 years ago
Zeng Jinle	7f3e0eaad1	refine error msg, test=develop (#23589 )	5 years ago
mozga-intel	3baaee9aab	Remove: NGraph engine from PDPD repository (#23545 ) * Remove the NGraph engine from PDPD repository 1. Each operator was removed from the operator's directory 2. Each test was removed from the unittest directory 3. The parallel executor support was removed from the PDPD 4. The CMake file was removed from the PDPD 5. The NG flags were removed from the repository test=develop * Remove ngraph from: 1. Cmake file 2. Python file test=develop	5 years ago
Zhang Ting	1b8fe70e48	fix VLOG, test=develop (#23327 )	5 years ago
Chen Weihang	45880f604b	API(Program) error message enhancement (#23519 ) * polish api program error message, test=develop * fix condition error, test=develop * fix test prune error, test=develop * fix coverage problem, test=develop	5 years ago
joanna.wozna.intel	3cb5623dad	Add matmul dequant squash (#23505 ) test=develop	5 years ago
wangchaochaohu	c1187cd6f4	Fp16 refine for fusion group (#23472 )	5 years ago
joanna.wozna.intel	ce08fdcf2b	Add support for INT8 matmul in C-API quantization (#23463 ) * Integrate matmul with cpu_quantize_pass test=develop * Add matmul checking scales test=develop * Change condition of matmul quantization test=develop * Remove redundant var test=develop	5 years ago
Aurelius84	8674a82c03	Op (Scope) error message enhancement (#23458 ) * Op (Scope) error message enhancement test=develop	5 years ago
wangchaochaohu	d085f79228	fix untime fail for output var stop_gradient=True for fusion group (#23317 )	5 years ago
qingqing01	6162cf2f2e	Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO. (#23426 ) * Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO	5 years ago
ShenLiang	5223e2bbc4	Add a new DataFeed named PaddleBoxDataFeed (#23321 ) * add paddleboxdatafeed * add ifdef linux and boxps * add untest for datafeed * fix untest of test_paddlebox_datafeed * fix untest * rename function	5 years ago
Chen Weihang	75bd350710	Implement StaticModelRunner to support dygraph fine-tune static graph pre-training model (#23171 ) * static model runner basic implement, test=develop * add run program op to execute loaded program, test=develop * refactor static model runner & run program op, test=develop * reset engine.cc to resolve conflict * adapt the change of dygraph double grad, test=develop * refactor impl to solve control flow error, test=develop * clear debug code, test=develop * fix ci str compatible error & checkout dygraph grad maker & add example, test=develop * hide api & add op test, test=develop * fix run program op test places error, test=develop * fix program by review comment, test=develop * delete change var desc name, test=develop * fix other program by review comment, test=develop * remove _static_graph_guard, test=develop * add selectedrows test, test=develop * remove desc parser, test=develop * fix detail program, test=develop * change socpe create & add test, test=develop	5 years ago
Kaipeng Deng	d223a24904	Fix inplace_abn compile error on Windows (#23464 ) * fix inplace_abn windows compile error. test=develop	5 years ago
Tao Luo	0b583235f5	Revert "Solve the conflict of ops with the same name. (#23199 )" (#23494 ) This reverts commit `abe3e6906d`. test=develop	5 years ago
wawltor	6577f91b74	Add the sum op to API 2.0， add some parameters for new api * Add the sum op to API 2.0, test=develop * Fix the import meesage in common_ops_import	5 years ago
Zhen Wang	abe3e6906d	Solve the conflict of ops with the same name. (#23199 ) * solve the conflict of ops with the same name. test=develop	5 years ago
tianshuo78520a	d8a21ef6f3	test=develop;fix error (#23467 )	5 years ago
zhongpu	dbfbd7eac4	support Exhaustive search in dygraph (#23415 ) * use global conv cache; test=develop * use singleton cache; test=develop * fix format error; test=develop * add cudnn helper header; test=develop * fix header error; test=develop * fix mac unitest; test=develop * fix mac unitest; test=develop * fix file format; test=develop * fix include file error, test=develop * remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop * fix test_elementwise_mul_op_dim, test=develop * fix compile error, test=develop Co-authored-by: phlrain <phliuhongyu@126.com>	5 years ago
gongweibao	24a063f6ac	Add fleet checkpoint on local fs and remote fs(such as hdfs) for EDL (#22586 )	5 years ago
wangchaochaohu	5c60778731	polish the code of fusion group test=develop (#23370 )	5 years ago
Leo Chen	a62599a888	[feature] prune program by feed and fetch_list automatically (#22474 ) * prune train program by fetch_list, test=develop * add unittest for prune, test=develop * fix pruned feed, test=develop * support ParallelExecutor and feed prune, test=develop * add comments, test=develop * update unittest, test=develop * update unittests, test=develop * remove debug code, test=develop * support cond in clone, test=develop * support cond in prune, test=develop * support multiple minimize, test=develop * support cache, test=develop * fix _copy_param_info_from, test=develop * support python2 str, test=develop * remove debug code, test=develop * fix bug of caching CompiledProgram, test=develop * fix multi_device issue, test=develop * tmp * support tuple in fetch_list and overriding use_prune, test=develop * dont use nonlocal in python2, test=develop * remove nonlocal, test=develop * code clean, test=develop * code clean, test=develop * feed list, test=develop * test adam, test=develop * follow comments, test=develop * reduce duplicate code, test=develop * update comments, test=develop	5 years ago
Yiqun Liu	bc2981e998	Disable test_code_generator and test_post_training_quantization_mobilenetv1 (#23440 )	5 years ago
Zeng Jinle	29337f4e17	fix conflict of inferne partial feed with gpu parallel ssa graph executor, test=develop (#23400 )	5 years ago
zhongpu	bfb07aafe8	Revert "Exhaustive search (#22821 )", test=develop (#23401 ) This reverts commit `48144e4099`.	5 years ago
xujiaqi01	93ea9dd27a	fix stat var in hogwild worker (#23367 ) * fix stat var in hogwild worker * test=develop	5 years ago
joanna.wozna.intel	8c463700e1	Add default pass attributes (#23042 )	5 years ago
zhongpu	48144e4099	Exhaustive search (#22821 ) * use global conv cache; test=develop * use singleton cache; test=develop * fix format error; test=develop * add cudnn helper header; test=develop * fix header error; test=develop * fix mac unitest; test=develop * fix mac unitest; test=develop * fix file format; test=develop * fix include file error, test=develop * remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop * fix test_elementwise_mul_op_dim, test=develop Co-authored-by: phlrain <phliuhongyu@126.com>	5 years ago
Kaipeng Deng	21d95be0db	Add inplace abn op (#22806 ) * add inplace_abn_op. test=develop	5 years ago
Yi Liu	821534efd3	add paralell_executor dependancy to collective_helper (#23380 ) test=develop	5 years ago
Zeng Jinle	3a21980b78	add reader dependency pass, test=develop (#23301 )	5 years ago
wangchaochaohu	d280106007	Add support for attr type Op and add fill_constant Op and scale Op (#23163 ) * add attr support for fusion group and add support for fill_constant and scale Op	5 years ago
xujiaqi01	3a45767d49	add fleet pslib pull and push sparse op and push dense op (#23139 ) * add fleet pslib pull and push sparse op and push dense op * test=develop	5 years ago
Jacek Czaja	2bb1b0e89e	[DNNL] Added MKL-DNN inplace pass for C-API inference (#23315 )	5 years ago
Wojciech Uss	f836c8aa8f	add check for scales and a message (#23119 )	5 years ago
Tao Luo	c00d427d52	simplify the cmake log of ir/CMakeLists.txt (#23262 ) test=develop	5 years ago
xujiaqi01	68ea1ad55b	add clear one table (#23089 ) * add clear_one_table * test=develop	5 years ago
danleifeng	ae3bb16d06	add MaskAucCalculator in paddlebox (#23157 ) * add maskauc in paddlebox; test=develop	6 years ago
Zeng Jinle	53e6f8e1da	rename macro, test=develop (#23161 )	6 years ago
Zeng Jinle	7ca77a90ac	add Tensor::IsSharedBufferWith method, test=develop (#23175 )	6 years ago
Zeng Jinle	b8886bf122	rename no_need_buffer_vars_macro, test=develop (#23159 )	6 years ago
Zeng Jinle	bae5930ba1	fix graph attr copy issues, test=develop (#23191 )	6 years ago
Zeng Jinle	acfc9b8a70	Reader sequential and inference partial feed (#22699 ) * sequential reader stage 1, test=develop * fix ut, test=develop * fix iterable=False reset bug, add some logs and polish code, test=develop * inference feed partial data, test=develop * Turn on keep_order=True for test, test=develop * enhance ut to test more cases, test=develop * test commit for reverting * Revert "test commit for reverting", test=develop This reverts commit 80aef42ef52ba1ee79627d6f663a624ec4f12f58. * add ut of merged and unmerged results, test=develop * add more uts for coverages and add en doc of api, test=develop * follow comments, test=develop * change note style, test=develop	6 years ago
Wilber	95b356a069	update embedding_eltwise_layernorm fuse and kernel. test=develop (#23114 ) update embedding_eltwise_layernorm fuse pass and fused kernel, to support multi input	6 years ago
Zeng Jinle	a31d7328b7	Add dygraph double grad implementation (#22939 ) * add double grad implementation for dygraph, test=develop * polish code, add uts, test=develop * fix place bug, test=develop * polish codes, add more uts for coverages, test=develop * add no_grad_set, test=develop * add star gan ut, test=develop * follow comments, test=develop	6 years ago
Yiqun Liu	3af4771122	Add the detection and code-generation of sqrt and square in fusion_group (#23095 )	6 years ago
hutuxian	0c30098f8b	Add need_save_delta parameter to solve OOM (#23097 )	6 years ago
Sylwester Fraczek	abee05a8c8	added mkldnn swish activation (#23041 )	6 years ago
Zhang Ting	880eb04d93	skip PrepareData when it is unnecessary (#22839 ) * remove unnecessary prepare data, test=develop * Op in while block will not skip PrepareData, test=develop	6 years ago
Adam	5842ae6785	Revert "Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695 )" (#22985 )	6 years ago
yaoxuefeng	660ff18488	fix datsset test=develop (#23043 )	6 years ago
wangchaochaohu	3757e0687c	Add Unittest for backward of fusion group (#22932 ) * add fusion group test for backward and refine code	6 years ago
wangchaochaohu	f0d193a23c	Cast fusion for fusion group (#22876 ) * add support for expression type convert and add cast Op support in fusion group	6 years ago
Wilber	ff3ddbb502	add skip_layernorm pass. test=develop (#22895 ) * add skip_layernorm pass. test=develop	6 years ago
Adam	056edf3929	Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695 )	6 years ago
Zhaolong Xing	8d6dc102fe	[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494 ) * 1. add embedding eltwise layernorm fuse 2. add embedding eltwise layernorm op 3. refine inplace_add_relu 4. refine fc_eltwise_layernorm test=develop * 1. refine fc test=develop * fix comments test=develop * fix comments test=develop	6 years ago
Zeng Jinle	d33c4343e1	Imperative tracer refactoring (#22457 ) * refine grad maker, test=develop * refactor tracer stage 1, test=develop * merge develop to solve conflict third times, test=develop	6 years ago
liu zhengxi	61fef9754b	Fix fc padding bug during inference fusion (#22860 ) * fix fc padding during fusion, test=develop * fix optim model inference after SaveOptimModel, test=develop	6 years ago
wangchaochaohu	dbb0b9b3b6	refine the profiler print (#22823 ) * refine the profiler print test=develop	6 years ago
hong	5191e54494	reduce default attrs for dynamic graph (#22850 ) * reduce default attrs for dynamic graph, test=develop * add some explanations for explicit attr, test=develop * tweak explicit attr comments, test=develop	6 years ago
Zhang Ting	72ff5a09c3	fix print bug of profile, test=develop (#22804 )	6 years ago
Zhang Ting	4e8bc02461	add fluid.device_guard to specify the device type for Op (#22254 ) * add fluid.device_guard to specify the device type for Op	6 years ago
Zhen Wang	89cfa49156	Unmerged fetch list (#22635 ) * update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results. * add the unit test for fetch_unmerged. * update ut for multi-card and multi-cpu. * add the error message and the user suggestion in FetchOpHandle. test=develop	6 years ago
hutuxian	53a2b68f4e	support customized download command in dataset (#22782 ) * user can call dataset.set_download_cmd to set its customized download cmd * add UT to cover this scenario	6 years ago
wangchaochaohu	ca9e77a8d4	add sum op support for fusion group (#22771 ) * Add the codegen and auto fusion for sum Op in fusion group	6 years ago
tianshuo78520a	433cef03e5	fix typo word (#22784 )	6 years ago
Leo Chen	b2c1be851a	support cond in clone, test=develop (#22657 ) * support cond in clone, test=develop * refine code, test=develop * refine code, test=develop * follow comments, test=develop * refine code, test=develop	6 years ago
hutuxian	175954d894	PaddleBox Framework Part2 (#22466 ) * Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator. * Add a config for DynamicAdjustChannelNum function to denote whether we will discard the remaining instances when they are not be distributed evenly. * Remove CPU code in Pull/PushSparse and we will add it back when testing it fully. * Fix some known issues: such as copying persistable vars after one epoch running.	6 years ago
GaoWei8	cdf5f6fb8c	Add an inference interface to disable FC padding (#22097 ) * Add an interface of disabling FC padding * fix bert regression * polish fc padding interface * recover pass function * fix argument error * fix mkldnn error	6 years ago
tianshuo78520a	d2ba91aad1	fix typo words (#22653 )	6 years ago
tangwei12	66a3150135	SYNC with communicaotor (#22344 ) * add sync communicator and implement	6 years ago
Yiqun Liu	22bbd54719	Add the support of fp16 in fusion_group (#22239 )	6 years ago
wangchaochaohu	c65c6ae534	add flag to control profile level in python API (#22319 ) * add python flag to control profile level test=develop	6 years ago
123malin	00594c1c88	support dumping params/grads in transpiler mode (#22490 )	6 years ago
flame	f7eafca828	remove python inference warning (#22602 )	6 years ago
Wilber	9a8203aa25	fix fc_lstm_fuse when multi sub-graph use same fc_bias. test=develop (#22551 ) 当一个模型中有多个fc_lstm子图的时候，且其中fc共用了同一个persistable的bias，此时不应该将bias节点删除，只将非persistable的节点去除即可。	6 years ago
Zhaolong Xing	8acd745c25	[Ernie GPU Optim]: Fuse three fc to multihtead matmul (#22486 ) * 1. optim multihead matmul: fuse three fc to multihtead matmul test=develop * fix conflict test=develop * fix comments test=develop	6 years ago
Yiqun Liu	96770f519e	Disable fusion_group for windows and mac in build_strategy. (#22549 ) test=develop	6 years ago
tangwei12	b0675c8193	fix bug with compiledProgram (#22495 ) * add thread barrier for the compiled program	6 years ago
hutuxian	1a7962be97	Paddlebox about box_wrapper (#22497 ) Refine PaddleBox Framework, Main functions: * Add MetricMsg util class, which can calculate metrics like AUC, bucket_error, COPC. * Replace FeedPass with new interface: BeginFeedPass & EndFeedPass * Refactor Pull/Push Sparse Function in box_wrapper. * Use CUDA Kernel to copy keys and copy feasign between tensor and boxps struct. * Cache copied keys in pull sparse in order to reuse it in push period.	6 years ago
yaoxuefeng	2235ee1a5e	multi-loss optimization by adding a DownpourOpt worker (#22025 ) * update * update test=develop * update compile set test=develop * update compile set test=develop * update test=develop * update test=develop * update test=develop * update compile setting test=develop * update compile setting test=develop * update run demo test=develop * update test=develop * update test=develop * fix test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update format test=develop * update format test=develop * update style test=develop * update style test=develop * change style test=develop * change style test=develop * change style test=develop * add dataset unittest test=develop * update test=develop * update for record test=develop * udpate style for record test=develop * update for record test=develop * update for record test=develop * update for record test=develop * fix format test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop	6 years ago
zhaoyuchen2018	54970444ce	Improve transpose performance with tile sm copy, test=develop (#22311 ) * Refine code, fix select tile error,test=develop * Refine element type and some comments, test=develop * Refine comments and gpu utils, test=develop * Remove some useless condition * Refine floor and ceil, test=develop * refine for loop. test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	6 years ago
Wilber	a90fa54092	Compile without nccl deps. [1/2] (#22509 ) 支持不依赖nccl进行编译。[1/2] 多卡下，如果没有打开WITH_NCCL开关编译，多卡不能通信，则只能选择一张卡使用。 Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	6 years ago
guofei	3a59a7a11f	Make assign op support LoDTensorArray and modify while_loop API (#22309 ) This PR makes assign op support LoDTensorArray and enable the loop_vars in while_loop to support tuple or list.	6 years ago
Yiqun Liu	dcfb603897	Enable the detection of subgraph composed of grad ops (#21223 ) * Add the first implememtation of fusion_group op #19621 (#3) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop * Add DeviceCodePool to manage all device codes. * Add the first implementation fusion_group op. * Add unit-test for fusion_group op. * Add the check of result. * Add the check of nvrtc in unit-test. test=develop * Add comment to explain the inputs, outputs and features of fusion_group op. test=develop * Disable fusion_group op for mac and windows. test=develop * Make the compiling of device code return status instead of hanging up. test=develop * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API. * Unify fusion_group_op's input and output names. test=develop * Add the check of CUDA driver library in unittest. test=develop * Enable generating code for a given subgraph. #21126 (#4) * Enable generating code for a given subgraph. * Support sorting the subgraph. * Remove the rearange of expressions because we use the sorted subgraph directly. * Enable generating code for a subgraph which is composed of grad ops. * Use expression information to check the accuracy in unittest. * Separate load and store from computation expressions. test=develop * Improve the loading statements in generated codes. test=develop * Remove unused arguments from formal list. test=develop * Enable the detection of subgraph of grad ops. * Generate code for detected subgraph in fusion_group_pass. * Add an option in BuildStrategy to enable fusion_group_pass and add unittest. test=develop * Fix a bug when checking whether the shape of all inputs are the same. * Add debug information. * Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#5) test=develop * Call subgraph_detector in fusion_group pass. test=develop * Disable fusion_group when WITH_GPU is OFF. test=develop * Refine all PADDLE_ENFORCE message. test=develop * Fix the case that some inputs are not defined in grad ops, and set op_role for fused op. test=develop * Follow review comments. test=develop	6 years ago
joanna.wozna.intel	17f2c0899f	Add dequant-scale squash (#22409 ) * Add dequant scale squash test=develop * Correct dequant-scale squash test test=develop	6 years ago
Wilber	7bc4b09500	add WITH_NCCL option for cmake. (#22384 ) cmake选项中添加了WITH_NCCL，显示指定是否编译NCCL的部分代码，WITH_NCCL默认打开，但如果WITH_GPU为OFF，则关闭WITH_NCCL 添加了PADDLE_WITH_NCCL定义单机单卡能够关闭NCCL编译，多卡的话需要默认打开NCCL，如果关闭NCCL，则只能使用单卡 Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	6 years ago
xujiaqi01	d51ffe860a	fix copy table bug (#22432 ) * fix copy table bug of lost some feasign * test=develop	6 years ago
石晓伟	e1b0d7cbb1	remove anakin from code, test=develop (#22420 )	6 years ago
xujiaqi01	371f377bea	add GeneralRoleMaker (#22295 ) * add GeneralRoleMaker which is for general usage * test=develop	6 years ago
Michał Gallus	269db0d1d1	[DNNL] Fix accuracy in INT8 FC (#22404 ) * Enable quantize to reorder to nchw as well * Correct FC MKL-DNN input dim requirements to accept 3D * Improve DNNL FC format, error and 3D input handling test=develop * Improve error checking in FC test=develop * Improve PADDLE_ENFORCE messages in fc-related files * Remove data layout attribute from obligatory pass args test=develop * Fix message in fc_mkldnn_pass to be logically correct test=develop	6 years ago
joanna.wozna.intel	3099d9d47c	Restore requantize squash (#22399 )	6 years ago
Adam	e7a9f6bbb7	[Bugfix] Preserve shape in inpalce operators (#22360 )	6 years ago
Yiqun Liu	b7cac50b64	Implement a common python unittest to test the ir passes. (#22209 ) * Implement a common python unittest to test the ir passes. test=develop * Save the results in np.array and support to startup on CPU. test=develop * Fix the unittest. test=develop * Add check_program to check whether the optimized program is different from the origin one. test=develop * Remove the inferface all_ops. test=develop * Add exception test in pass_test. test=develop	6 years ago
tangwei12	82bc814a57	integrated HALF_ASYNC to communicator (#21869 ) * add half_async in the communicator * fix DistributedStrategy	6 years ago
Leo Chen	3e5744aa65	Remove unused inputs for some operators (#22284 ) * remove unused inputs, test=develop * remove unused inputs, test=develop * update dtype, test=develop * remove unused inputs, test=develop * update op_use_default_grad_op_maker, tese=develop * resolve conflicts, test=develop * follow comments, test=develop * update center_loss_grad, test=develop	6 years ago
lidanqing	895f8da7d6	change std::cout to log(INFO), vlog (#22316 )	6 years ago
Zhen Wang	e40cfb1010	fix the bug of assert_is_op_output. test=develop (#22262 )	6 years ago
Wojciech Uss	d3a6647372	improve placement pass tests code coverage (#22197 )	6 years ago
zhouwei25	549e6de7ac	faster build by reduce by-product, reduce linking library and fix compile warning of std=c++11 (#22164 )	6 years ago
xujiaqi01	e3a457d34b	add collective communication library in fleet (#22211 ) * add collective communication library in fleet to replace mpi * test=develop	6 years ago
Zhen Wang	f2522e91c4	fix the type error caused by setting bool attr in OpDesc. test=develop (#22257 )	6 years ago
Chen Weihang	fc0b21e17b	Polish fetch error message of parallel executor (#22206 ) * polish error message of parallel executor, test=develop * change PADDLE_ENFORCE, test=develop	6 years ago
wangchaochaohu	621d3e0b66	fix the bug of profile update (#22207 ) * fix the bug of profile update test=develop	6 years ago
Zhen Wang	46189b166d	Add bn and relu fuse pass (#22048 ) * add bn and relu fuse pass * add op attr assert and dtype assert * fix some inputs&&outputs bugs for the fused op and pattern. * add the unittest for fuse_bn_act_pass. test=develop * use normative enforce statements. test=develop * add the cpu test. test=develop * add the support of batch_size=1 for the bn with relu op. test=develop * add the error type for paddle throws. test=develop * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop	6 years ago
zhongpu	d0f0a2520c	test Optimizer in dygraph (#21949 ) * test Optimizer in dygraph, test=develop * add optest for Optimizer in dygraph, test=develop * fix adagrad optimizer, test=develop * fix dpsgd optimizer, test=develop * fix test_optimizer.py, test=develop * fix dpsgd optimizer, this op only support cpu, test=develop * add optest for optimizer, test=develop * add description for dpsgd, test=develop * add rmsprop to white_list in unused_var_check.cc, test=develop * polish code style, test=develop * polish code style, test=develop * delete seed attribute for DpsgdOptimizer, test=develop * change testing to debugging, test=develop	6 years ago
joanna.wozna.intel	5b2e98aa17	Add multiple quantize operators fuse (#22062 )	6 years ago
Yiqun Liu	96980c2244	Polish the PADDLE_ENFORCE in fusion_group pass related codes. (#22144 ) * Polish the PADDLE_ENFORCE in fusion_group pass related codes. test=develop * Correct the unittest because of the change relu_grad's formula. test=develop	6 years ago
wangchaochaohu	c3876cf82d	add support for nested profiling event and printing in different level (#22061 ) * add support for nested profiling event and printing in different level	6 years ago
liu zhengxi	724b13e459	fix xception precision problem, test=develop (#22124 )	6 years ago
Yiqun Liu	b1401fb74d	Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#22094 ) test=develop	6 years ago
bingyanghuang	4b4a9cc88f	fix format in operator.cc (#22101 )	6 years ago
silingtong123	6c20e7c4e6	test=develop, remove unused parameter from class RuntimeInferShapeContext constructors (#22046 )	6 years ago
Jacek Czaja	b0b27ff699	[MKL-DNN] Conv grad and Batch Norm grad NHWC support (#22088 )	6 years ago
Huihuang Zheng	dd4361568e	Add ParallelExecutor Test for Cond API and Fix PE Checks Shape Bug (#22029 )	6 years ago
Jacek Czaja	ad8a9cb82c	[MKL-DNN] Pool & LRN Grad Ops NHWC support (#21747 )	6 years ago
Yiqun Liu	d48320777e	Add the first implememtation of fusion_group op (#19621 ) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop * Add DeviceCodePool to manage all device codes. * Add the first implementation fusion_group op. * Add unit-test for fusion_group op. * Add the check of result. * Add the check of nvrtc in unit-test. test=develop * Add comment to explain the inputs, outputs and features of fusion_group op. test=develop * Disable fusion_group op for mac and windows. test=develop * Make the compiling of device code return status instead of hanging up. test=develop * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API. * Unify fusion_group_op's input and output names. test=develop * Add the check of CUDA driver library in unittest. test=develop * Refine the calling of PADDLE_ENFORCE. test=develop	6 years ago
Michał Gallus	6192108408	[DNNL] 3D Fully-Connected (#21746 )	6 years ago
liu zhengxi	196e20dfbb	Fix multi-threads memory out of bounds error for passes (#21920 ) * fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop * fix attention_lstm_fuse_pass during multi-threads inference, test=develop * fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop * fix fc_lstm_fuse_pass during multi-threads inference, test=develop * fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop	6 years ago
石晓伟	03479469a7	fix multi-thread error of fc_gru_fuse_pass.cc, test=develop (#21841 ) * fix multi-thread error of fc_gru_fuse_pass.cc, test=develop * export FLAGS and GLOG symbols, test=develop	6 years ago
Pei Yang	3e5008ad01	fix trt calib not working bug, test=develop (#21934 )	6 years ago
qingqing01	2066745847	Pack imperative/layer into paddle_framework.so (#21921 ) * Pack imperative/layer into paddle_framework.so	6 years ago
Aurelius84	51a86d2b6b	Optimize adam speed (#21777 ) * optimize adam speed by removing _finish_update test=develop * fix SparseAdamFunctor param list test=develop * Remove scale_op in expect_list of adam_op test=develop * fix test optimizer loss assert error test=develop * fix test optimizer loss assert error test=develop * modify PADDLE_ENFORCE usage test=develop * fix op_type in lamb_op.cc test=develop * fix errors ostream format bug test=develop * add betaPowOut in ngraph op test=develop * fix ngraph::op api for gcc8 test=develop * clean code test=develop * modify struct into class test=develop * remove code of beta1Tensor in lamb_op test=develop	6 years ago
Thunderbrook	c3cf42d0f7	add table id in cache shuffle (#21585 ) * general table * add sparse table test=develop * no cvm test=develop * add no_cvm test=develop * add note test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * add key of optimizer test=develop * solve pslib stop core test=develop * barrier test=develop * add notes test=develop * add table id in cache shuffle test=develop * table id test=develop * code style test=develop	6 years ago
WangXi	17299b8d21	fix batch_norm_grad infer shape=0 & add allreduce enforce shape, test=develop (#21801 )	6 years ago
Huihuang Zheng	557bce77da	Fix Backward Bugs in Conditional Block (#21809 ) The fixed bugs: 1. The condition sub-graph is not pruned 2. When backward graph is extremely simple, the whole backward ops are pruned.	6 years ago
xujiaqi01	0eb4d990c4	fix compiled error when with_pslib=on (#21769 ) * fix compiled error of butil when with_pslib=on and with_testing=on * test=develop	6 years ago
lidanqing	d3a96632fa	Add fc-dequantize squash in cpu_quantize_squash_pass for ernie model (#21714 ) * fc-dequantize squash test=develop * change according to reviews test=develop * change PADDLE_ENFORCE test=develop * add second test when fc-dequant do not fuse test=develop * change all related PADDLE_ENFORCE test=develop	6 years ago
WangXi	8754cbd1f2	fix std::min type in nan_inf, test=develop (#21725 )	6 years ago
joanna.wozna.intel	d419b859c0	Add reshape int8 mkldnn op (#21428 ) * Add reshape int8 op test=develop * Change test to CPUPlace test=develop * Correct tests test=develop	6 years ago
WangXi	8a0f611b64	Rewrite check nan inf tools (#21076 )	6 years ago
tangwei12	9ad940fdfe	memory leak for cpu (#21174 ) * add fake init for the trainer, fix large memory hold in the trainer * do not merge recv vars from a remote endpoint, test=develop * add recv and save op, merge slice var in one op, save memory * remove hsigmoid with pull sparse, test=develop	6 years ago
Zeng Jinle	73461a7ae6	Make OperatorWithKernel::InferShape abstract (#21633 ) * make OperatorWithKernel::InferShape virtual, test=develop * fix test_prepare_op by relu, test=develop	6 years ago
Zeng Jinle	6828f3684b	fix op_registry, add ignore op_function_impl.h, test=develop (#21654 )	6 years ago
Adam	e81f0228df	MKL-DNN 1.0 Update (#20162 ) * MKLDNN v1.0 rebase to Paddle 1.6 test=develop * Add hacky paddle::string::to_string() implementation * vectorize<int64-t>() -> vectorize() cleanup test=develop * PADDLE_ENFORCE and void_cast fixes test=develop * Rebase changes test=develop * Cosmetics test=develop * Delete MKL from mkldnn.cmake test=develop * CMake debug commands test=develop * Delete MKLDNN_VERBOSE and rebase fixes test=develop * Rebase fixes test=develop * Temporarily disable int8 resnet101 vgg16 and vgg19 tests test=develop * Add libmkldnn.so.1 to python setup test=develop * Add libmkldnn.so.1 to inference_lib cmake after rebase test=develop * Post rebase fixes + FC int8 changes test=develop * Fix LRN NHWC test=develop * Fix NHWC conv3d test=develop * Windows build fix + next conv3d fix test=develop * Fix conv2d on AVX2 machines test=develop	6 years ago
xujiaqi01	f404157205	fix master patch when slot is dense (#21580 ) * fix master patch when slot is dense * test=develop	6 years ago
xujiaqi01	c05706fe73	fix code style of fleet_wrapper (#21639 ) * fix code style of fleet_wrapper * test=develop	6 years ago
xujiaqi01	88960684aa	rm optimize_for in framework.proto (#21571 ) * remove optimize_for in framework.proto * test=develop	6 years ago
Zeng Jinle	0f8888360e	Polish op registry codes (#21561 ) * polish infer shape registry, test=develop * modify some operators registry, test=develop	6 years ago
hutuxian	c5aec2fe68	Paddlebox Related to Framework (#21586 ) * Add a single_process_multi_thread transpiler. * Add some UTs. * Fix some API description.	6 years ago
liym27	9da7e6b4d4	add file check_op_desc.py and add interface to get default value. (#21530 ) * add file check_op_desc.py and add interface to get default value. test=develop * add test for c++ coverage rate. test=develop * Correct typo. test=develop	6 years ago
Pei Yang	122b37ce62	make config option DisableGlogInfo() able to mute all inference logs (#21318 ) * make DisableGlogInfo able to mute all logs in inference.	6 years ago
Jacek Czaja	18a5d30754	[MKL-DNN] Conv2d and Conv2d transpose MKL-DNN NHWC support (#21466 )	6 years ago
Zhaolong Xing	c5f0293cf3	NV jetson(nano, tx2, xavier) inference compile support (#21393 ) * add jeston compile support test=develop * refine the cmake test=develop	6 years ago
Tao Luo	01fa4ead61	fix -Wno-error=sign-compare warning in gcc8 (#21434 ) * fix -Wno-error=sign-compare warning in gcc8 test=develop * fix warning in distributed codes test=develop	6 years ago
wangchaochaohu	d4776ec027	fix the correctness of memcpy profiling result test=develop (#21458 )	6 years ago
Jie Fang	5e813b53c5	nhwc optimization for batchnorm (#21090 )	6 years ago
Leo Chen	e0c9d856fb	add unused input vars check for OpWithKernel, test=develop (#21169 ) * add unused input vars check for OpWithKernel, test=develop * remove unused vars in some ops, test=develop * fix batch_norm, test=develop * add white list, test=develop * add CI check for white list, test=develop * :ove white list to c++, test=develop * solve failure of CI, test=develop * add unittest for unused_var_check, test=develop * refine code, enable check in operator_test, test=develop * skip mkldnn, test=develop * extend white list, test=develop * refine condition of mkldnn, test=develop * fix paddle_build, test=develop * follow comments, test=develop * fix GetExpectedKernelType * add wiki ref to err_msg, test=develop * follow comment, test=develop	6 years ago
Huihuang Zheng	630be31952	Fix Cond Bug for Nested Control Flow (#21340 ) * Commit before merging develop test=develop * Backup after working with Huihuang logs * Commit before deleting Huihuang debug loggings * Commit before debug test=develop * Fix bug commit test=develop * Backup of fixing bugs test=develop * Clean up code test=develop * Fix a bug in sum_op test=develop	6 years ago
Jacek Czaja	cd43c4440e	[MKL-DNN] LRN and Pool2d (FWD) NHWC support (#21375 )	6 years ago
Zeng Jinle	6b09b73e17	add explicit conversion to NoNeedBufferVarsFunctor, test=develop (#21430 )	6 years ago
hong	ac8546701d	Add dygraph execution context (#20157 ) * add_dygraph_execution_context * add dygraph infershape context and execution context; test=develop * fix imperative bug; test=develop * remove inputs outputs interface from execution context, because it have same function with inputNames; test=develop * remove tracer_test ctest; test=develop * fix split op bug; test=develop * fix unitests bug; test=develop * fix distribute test bug; test=develop * fix ngraph compile bug; test=develop * fix grad maker bug; test=develop * fix load op bugs; test=develop * fix operator.cc construct bug; test=develop * remove useless name find in operator; test=develop * add tracer_test; test=develop * fix concat, split bug; test=develop * remove tracer_test unitest; test=develop * fix attribute check bug; test=develop * add test code to fix converage; test=develop * remove useless code, change check backward input in engin; test=develop * unlock var type infer shape;test=develop * add ShareAllLoD api; test=develop * add dygraph infershape context unitest; test=develop * remove increase and decrease lod in dygraph; test=develop * addd override; test=develop * fix increase descrease lod; test=develop * fix paddle_enforce; test=develop * disable lod op dygraph check; test=develop * fix paddle enforce error; test=develop * add comment for op_registry and OperatorBase; test=develop * optimize the comment of op_registry; test=develop * fix format of comment; test=develop * fix format of comment; test=develop * optimize the format of comment; test=develop * optimize the format of the comment; test=develop * optimize comment of op_registry; test=develop	6 years ago
Zeng Jinle	09696d5df8	Use system allocator in OpTest (#21335 ) * use system allocator in unittests, test=develop * fix op bugs, test=develop * fix tensor copy bug when src and dst are the same, test=develop	6 years ago
Tao Luo	c0656dcb1a	remove -Wno-error=sign-compare, make warning as error (#21358 ) * remove -Wno-error=sign-compare, make warning as error test=develop test=document_fix * fix exist compile warning test=develop	6 years ago
Zeng Jinle	b97fc16d21	fix lod_reset bug, test=develop (#21392 )	6 years ago
Zeng Jinle	89966525f1	Polish reference count pass (#21324 ) * fix ref_cnt pass, test=develop * add cpp unittests to reference_count_pass, test=develop * follow comments, test=develop	6 years ago
Youwei Song	d5ff79e55e	Support numpy bridge (enabled by default in dygraph mode) (#20983 ) * add numpy bridge * fix template compile * add unittest, add default test=develop * fix unittest test=develop * fix unittest test=develop * zero_copy=True for to_variable, test=develop * bug fix test=develop * disable deprecated NumPy API test=develop * use better design of NumpyAllocator test=develop * fix Py_None check test=develop * reset c++ tracer when jump out dygraph guard test=develop * refine PADDLE_ENFORCE_xx format test=develop * bug fix of tracer switch test=develop * update decref test=develop	6 years ago
GaoWei8	8493f20ebc	Polish the codes of fc when needs padding (#21378 ) test=develop	6 years ago
Michał Gallus	5d7d548275	INT8 Fully-connected (#17641 ) * Implement Int8 FC * Integrate FC into INT8v2 test=develop * int8 FC: transpose weights before computing scales test=develop * Add support for activation_type string in FC test=develop * Disable MKL-DNN's FC in VGG16 and 19 test=develop * Disable FC quantization when mkldnn FC is disabled test=develop * Solve PADDLE_ENFORCES in FC int8 * Fix Paddle enforces and remove const cast test=develop * Fix style changes test=develop * Fix quantizer_tester test and add fc quantization test=develop * Fix FC test fail on CUDA * Remove unnecessary log from quantize placement pass test=develop * Add Thread ID to FC hash key test=develop * Add comments to MKL-DNN FC Kernel test=develop * Refactor quantizer test=develop * Fix linter issues test=develop * Fix crash in slim googlenet test=develop * Fix PADDLE_ENFORCE messages test=develop	6 years ago
GaoWei8	234060f88f	Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972 ) * Add fc padding to solve mkl performance test=develop * fix gpu pass and error information test=develop * fix fc_fuse_pass_test test=develop * fix error information test=develop * fix error information test=develop * fix name and add fc op padding test test=develop * fix attributes test=develop * optimize fc padding test=develop * fix test test=develop	6 years ago
zhouwei25	345b67b5e2	remove warning LNK4006 and warning LNK4221 (#21226 )	6 years ago
Thunderbrook	9a7832f8be	print table stat info for pslib (#21296 ) * print table stat test=develop * notes test=develop * notes test=develop	6 years ago
Dong Daxiang	691ced87c0	Refactor fetch handler (#21264 ) * fix fetch handler problem and refactor when a user define FetchHandler class, he or she should initialize a handler with variable dict. the key of a variable dict is a user defined name, the value of a variable dict is a Varaible generated from python API. For each fetching, a user should implement handler function in which fetched_result_dict will be available and the user can access the fetched value with user defined keys.	6 years ago
Yiqun Liu	c918788ba9	Disable fusion_group pass for windows and mac. We will do some experiments on Linux first. (#21310 ) * Disable fusion_group pass for windows and mac. We will do some experiments on Linux first. test=develop * Print the subgraph when check failed. test=develop	6 years ago
Chen Weihang	952508527a	Polish some PE code details (#21274 ) * polish code details, test=develop * futher polish hint msg, test=develop	6 years ago
Thunderbrook	0d17c1b816	solve pslib core in stop worker (#21263 ) * general table * add sparse table test=develop * no cvm test=develop * add no_cvm test=develop * add note test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * add key of optimizer test=develop * solve pslib stop core test=develop * barrier test=develop * add notes test=develop	6 years ago
Thunderbrook	349e82d669	support general embedding params (#21217 ) * general table * add sparse table test=develop * no cvm test=develop * add no_cvm test=develop * add note test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * add key of optimizer test=develop	6 years ago
Yiqun Liu	6b1e1f0dda	Enable generating code for a given subgraph. (#21126 ) * Enable generating code for a given subgraph. * Support sorting the subgraph. * Remove the rearange of expressions because we use the sorted subgraph directly. * Enable generating code for a subgraph which is composed of grad ops. * Use expression information to check the accuracy in unittest. * Separate load and store from computation expressions. test=develop * Improve the loading statements in generated codes. test=develop * Remove unused arguments from formal list. test=develop	6 years ago
Zeng Jinle	a152315be7	refine Tensor method, test=develop (#21031 )	6 years ago
Zeng Jinle	cdb3d27985	Fix warn of gcc8 (#21205 ) * fix warnings oof gcc 8 compilation, test=develop * fix boost::bad_get, test=develop * refine PADDLE_ENFORCE, test=develop	6 years ago
Zhaolong Xing	65f7052554	TRT int8: refine trt int8 for dynamic range set (#21112 ) * refine trt int8 for dynamic range set test=develop * refine trt int8 test=develop	6 years ago
xujiaqi01	23876de55b	fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21052 ) * fix cache table bug * add save_paddle_inference_model * fix hdfs util bug * test=develop	6 years ago
xujiaqi01	9e045170c0	add copy table (#21086 ) * copy some feasigns and corresponding embeddings from one sparse table to another * copy all feasigns and corresponding embeddings from one sparse table to another * copy all dense params from one table to another * copy some local vars to other local vars	6 years ago
Chen Weihang	4bd9463630	fix detail error message error, test=develop (#21170 )	6 years ago
Chen Weihang	8da0cd537a	Add examples for error message writing specification - NotFound, OutOfRange, AlreadyExists, PermissionDenied (#21134 ) * add examples for error msg spec, test=develop * change ENFORCE to ENFORCE_*, test=develop add more already exists examples, test=develop	6 years ago
Chen Weihang	8414575b78	Add examples for error message writing specification - PreconditionNotMet, Unimplemented, Unavailable (#21137 ) * add examples for error spec, test=develop * change ENFORCE to ENFORCE_**, test=develop	6 years ago
Chen Weihang	7e5f74b825	Add examples for error message writing specification - InvalidArgument (#21132 ) * add examples for error msg spec, test=develop * change ENFORCE to ENFORCE_*, test=develop fix error, test=develop	6 years ago
WangXi	de5d3ff688	Fix dgc buffer illegal & reuse velocity (#21012 )	6 years ago
Zeng Jinle	d625aaf0c1	remove so many logs of parallel executor, test=develop (#21105 )	6 years ago
Yiqun Liu	35f17ae28f	Add the check of lod_level between compile-time and runtime. (#20961 ) * Add the check of lod_level between compile-time and runtime. test=develop * Fix bug in check_compile_vs_runtime. test=develop * Fix the check of output when it is dispensiable or intermediate. test=develop * Share lod of x to out in match_matrix_tensor op in compile-time. * Implement GetLoDLevel in InferShapeContext. * Set the default value of check_compile_vs_runtime to False and enable it in test_sequence_pad_op. test=develop * Enable check_compile_vs_runtime in test_match_matrix_tensor. * Add the implementation of SetLoDLevel in InferShapeContext. * Remove the implementation of IncreaseLoDLevel and call Get/SetLoDLevel instead. * Remove the implementation of DecreaseLoDLevel and call Set/GetLoDLevel instead. * Refine some ops and unittests. test=develop * Fix a typo. test=develop * Remove the check of var type, and change int to int32_t. test=develop * Add unittest for Get/SetLoDLevel. test=develop	6 years ago
Chen Weihang	826254f664	Add pre-condition check for fuse optimizer op pass (#21005 ) * add pre condition check for fuse optimizer op pass, test=develop * add log & set init to zero, test=develop * fix test_fuse_all_reduce_pass failed, test=develop * polish details, test=develop * refine PADDLE_ENFORCE & remove needless VLOG, test=develop * refactor op check method, test=develop	6 years ago
Yiqun Liu	9091f8cdf9	Support generating code for grad_op (#21066 ) * Add the definition of operation in fusion_group. * Use operations in OperationMap to detect fusion_group of elementwise pattern. * Add namespace fusion_group in code_generator. * Use operations recorded in OperationMap to generate code. * Remove implementation codes to .cc file. * Refine Operation and CodeGenerator to make it easier to generate code for grad_op. Refine the unittest for better reuse. * Avoid recording the template's keyword in a array. * Support the generating of code for grad_op and add unittest. test=develop * Remove replaced_element_in_order and use use number instead. test=develop	6 years ago
joanna.wozna.intel	77c2083586	Add transpose2 INT8 for mkl-dnn (#19424 ) * Add transpose2 INT8 for mkl-dnn test=develop * Fix test_transpose_int8_mkldnn test=develop * Revert "Merge branch 'develop' into transpose_int8_mkldnn_2" This reverts commit 34011bdba4c859abb945e062ab13124f70508054, reversing changes made to 2ce6473f144da298aba4a43d46918f27d463cf7c. * Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"" This reverts commit 23754dd78ca47ae56881161172b2aacd349aba90. * Add template to TransposeMKLDNNHandler test=develop * Resolve conflict test=develop * Restore get_size and refactor test=develop	6 years ago
Chen Weihang	7ee25189c3	Enrich the type of error and declare the error type interfaces (#21024 ) * Enrich the type of error and declare the error type interfaces, test=develop * adjust tests to adapt new form, test=develop * add inference deps with error_codes.pb.h, test=develop * restore stack iter start pos, test=develop * polish code based review comments, test=develop	6 years ago
Zeng Jinle	5aae595902	fix no_need_buffer_vars_dep, test=develop, test=document_fix (#21007 )	6 years ago
xujiaqi01	1d1a07937a	simplify master+patch，remove ins when size != merge_size or has conflict slot (#20913 ) * remove duplicate code and duplicate config of master+patch * drop all ins which has conflict slot or size < merge_size * user only need to set merge size，if ins num of same id is not equal to merge size, just drop these ins * user must make sure master data and patch data has no same slot whose feasigns are both non-zero, otherwise these ins will be dropped. (slot list should still be the same of both master and patch) * test=develop	6 years ago
Zeng Jinle	878a40f57d	Support NoNeedBufferVarsInference in dygraph backward (#20868 ) * support no need buffer vars in dygraph, test=develop * fix inference compilation error, test=develop * update no_need_buffer_vars_inference, test=develop * add unittests for no_need_buffer_vars_context, test=develop * refine no_need_buffer_vars by return ref, test=develop * polish some codes, test=develop	6 years ago
Wilber	c534149642	fix squared_mat_sub_fuse_pass when elementwise_op input is from persistable param test=develop (#20960 ) fix squared_mat_sub_fuse_pass when elementwise_op input is from persistable param	6 years ago
WangXi	eec4fa9099	And Enforce to fuse pass for DGC doesn't support fuse for now, test=develop (#20935 )	6 years ago
Zeng Jinle	b0c0ffb9ae	refine pe when exception raises, test=develop (#20894 )	6 years ago
123malin	20cdff0e02	Optimize decay (#20816 ) * update pserver decay blocks * update distributed notify handler	6 years ago
hong	8c4573a3cb	GradMaker for dygraph (#19706 ) * refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * optimize grad maker; test=develop * optimize grad maker * test * grad make optim; test=develop * fix unittest bugs; test=develop * add dygraph grad op maker and split_op * grad op maker refactor; test=develop * add dygraph grad maker; test=develop * fix op deformable_conv_v1_op bug; test=develop * fix deformable_conv prroi pool bugs; * fix new op grad op maker bug; test=develop * fix split by ref bug; test=develop * fix dygraph auto prune bug; test=develop * fix test_trace bug; test=develop * fix fused emb seq pool bug; test=develop * remove useless code in op_desc file; test=develop * remove useless code, StrVarBaseNode; test=develop * fix review issues; test=develop * fix rank_loss grad maker; test=develop * remove flag in VarBase; test=develop * fix distributed_notify_op compile bug ; test=develop * fix reshape op double grad; test=develop * fix expand as op; test=develop * add impertive type_defs.h for demo_train; test=develop * fix inference lib cmake; test=develop * fix inference lib; test=develop * fix infernce_lib; test=develop * fix inference cmake; test=develop * fix inference lib; test=develop * fix inference lib; test=develop * remove condition dygraph grad maker, modify local name; test=develop * fix split grad maker bug; test=develop * fix pyramid_op bug; test=develop * change travis time out limit; test=develop * restore travis; test=develop * change timeout limit; test=develop	6 years ago
Thunderbrook	59bcdc8a19	support dump param of model into afs (#20302 ) * support dump param to afs test=develop * code style test=develop * code style test=develop * dump param test=develop * dump param test=develop * dump param test=develop * dump param test=develop	6 years ago
Yiqun Liu	16e4d02675	Refine the cache of program, context and scope in executor. (#18483 ) * Refine the cache of program, context and scope in executor. test=develop * Refine the unittest test_executor_and_use_program_cache. * Add the test the PaddingRNN with use_program_cache=True. test=develop * Remove a check. test=develop * Refine the unittest to check whether it is correct when setting use_program_cache=True. test=develop	6 years ago
hong	ff0886a92a	save load problem fix and new feature add (#20823 ) * fix persistable; * fix save load bugs; test=develop * fix bug; test=develop * add example for new io api; test=develop * addd example; test=develop	6 years ago
Yiqun Liu	6fcfd32e6c	Check and correct the output's lod_level in DynamicRNN related operators (#19144 ) * Refine the InferShape of ReadFrom and WriteTo op, and add comment to explain why not call ShareLoD for runtime. test=develop * Add comment for ReorderLoDTensorByRank op. * Add comment for lod_tensor_to_tensor_array op to explain why only call DecreaseLoDLevel for compile time. test=develop * ShrinkRNNMemory op should call ShareLoD for compile time. test=develop * Add the implementation of IncreaseLoDLevel and add the compile-time check of lod_level in InferShape of sequence_pool. test=develop * Refine the unittest of DynamicRNN. test=develop * Change PADDLE_ENFORCE to PADDLE_ENFORCE_NE. test=develop	6 years ago
Yiqun Liu	b5f3be8330	Implement a pass detect fusion group of elementwise op (#19884 ) * Add fusion_group_pass and elementwise pattern. * Rewrite the detector of elementwise group. test=develop * Add a comment in codegen. * Add more unittest cases. test=develop * Move code_generator related code to fusion_group directory. * Correct the including path. * Add the definition of SubGraph and finish the insert of fusion_group op in pass. * Insert graph_vis_pass in tester to visualize the graph for debug.	6 years ago
Huihuang Zheng	95ba4bd2ab	Add shape and type check at read_op (#20754 )	6 years ago
Zeng Jinle	98103d3003	remove some unnecessary logs in pe, test=develop (#20848 )	6 years ago
Chen Weihang	26cc1fe508	Replace risky GetInputType method with secure IndicateVarDataType interface (#20668 ) * replace part of the old implementation, test=develop * restore concat op, test=develop * update all ops implemention & delete GetDataTypeOfVar func, test=develop	6 years ago
xujiaqi01	48669aa8f0	fix several sparse table issuses (#20686 ) * no longer need to define all embedding layers (no one less) of all slots in each program. make trainer_param repeated in ps.proto. * add find_distributed_lookup_table_grads instead of hard code GRAD * support embedding stop gradient. push sparse has error before fix this.* * fix fill sparse, skip slots which do not have embedding. each slot's embedding in a sparse table should be used in all training programs before fix this. * fix pull sparse, skip slots which do not have embedding. * fix collect feasign label info, skip slots which do not have embedding. * support when there are multi sparse tables in one or multi training programs, each program can pull/push its own related sparse tables instead of all sparse tables. * test=develop	6 years ago
Chen Weihang	1d1552d106	Make formatted ENFORCE stack adapt to more situations (#20826 ) * Make formatted ENFORCE stack adapt to more situations and polish details, test=develop * restore template message position, test=develop	6 years ago
Zeng Jinle	ac813bbaf4	Add more error debug message to Operator::Run (#20793 ) * add more err msg, test=develop * add more unittests, test=develop	6 years ago
wangchaochaohu	ba45dce35d	fix codetest for windows make test=develop (#20796 )	6 years ago
zhongpu	72d1d72c09	fix ExecutionContext::HasInput and ExecutionContext::HasOutput depend on the scope structure, test=develop (#20721 )	6 years ago
石晓伟	48a774c713	fix ts_sort's bug, test=develop (#20720 )	6 years ago
wopeizl	9e5948230e	add support to gcc8, add docker env test=develop (#19807 ) * add support to gcc8, add docker env test=develop	6 years ago
xujiaqi01	5223b0dd9d	add check nan / inf in downpour worker (#20694 ) * add check nan / inf in downpour worker during training * test=develop	6 years ago
WangXi	507afa8a8a	Fix dgc nan by stripping nccl from sparseReduce. (#20630 )	6 years ago
Zeng Jinle	4eeda9d676	fix tensor_util, test=develop (#20699 )	6 years ago
Zeng Jinle	ab575de725	Fix op run log when memory optimization strategy is enabled (#20695 )	6 years ago
Jacek Czaja	a1cd27f13f	[MKL-DNN] Added mkl-dnn cache clearing when creating Executor instance (#20241 ) * - Flushing mkl-dnn cache test=develop - Disabled clearing cache for LoadModel - Added clearing of mkl-dnn cache when Executor is created test=develop - Do not clear for GPU places test=develop - compilation fix test=develop * - Moved clearing of mkl-dnn cache in destructor of executor test=develop * - Compilation fix test=develop - Reverted conditional clearing of mkl-dnn cache in Executors's destructor test=develop - compilation fix	6 years ago
Zeng Jinle	10505faf4e	polish codes, test=develop (#20672 )	6 years ago
Chen Weihang	003f369bb2	Add IndicateVarDataType interface to block tensor is not initialized problem in OP GetExceptedKernelType (#20044 ) * add indicate_var_data_type inferface, test=develop * add unittests & polish error message, test=develop * remove needless include, test=develop * extract public function & polish message, test=develop * delete empty var check, test=develop * change data_type to pointer parameter, test=develop * polish details, test=develop	6 years ago
Chengmo	940c6ff1c8	Fix communicator slow bug & fix communicator stop bug (#20366 ) * test=develop,Fix communicator slow bug * test=develop, delete if() in stop_worker() * test=develop * fix UT, test=develop * fix bug in fetch handler, test=develop * fix bug in fetch handler, test=develop * test=develop, fix fetch barrier bug * test=develop, bug fix * test=develop, bug fix * test=develop, fix bug	6 years ago
WangXi	cadc6a9704	fix dgc test and bug when not set trainers_endpoints_, test=develop (#20617 )	6 years ago
Thunderbrook	f76a32df4a	dump fix dov vec file num (#20539 ) * support dump multi file test=develop * dump fix num file test=develop	6 years ago
633WHU	12e4be0382	Dlpack support (#20039 ) * support dlpack to tensor and implement python interface test=develop * add unittest for _to_dlpack and from_dlpack test=develop	6 years ago
Pei Yang	443f604c3b	add DisableGlogInfo() to AnalysisConfig, test=develop (#20581 )	6 years ago
Zeng Jinle	a9c8bdad7b	refine pe codes, test=develop (#20479 )	6 years ago
Zeng Jinle	76b321872a	fix cuda dev_ctx by event, test=develop (#20553 )	6 years ago
zhaoyuchen2018	b8333edef6	Add Multihead matmul fuse pass (#20167 ) * Add multihead fuse pass for ernie opt * Refine softmax test=develop * Refine cuda kernel * Refine cuda version * Refine cmake test=develop * refine header file * refine test case and pass * refine comments	6 years ago
Adam	7faa3e9555	Add ConvTranspose + BatchNorm fuse pass (#20161 ) * Add ConvTranspose + BatchNorm fuse pass test=develop * Add tests for conv+bn and conv_transpose+bn passes test=develop	6 years ago
xujiaqi01	22b80e1246	fix parse content in CreatePreLoadReaders (#20258 ) Fix parse content in CreatePreLoadReaders. Before this fix, if you use dataset.set_parse_content and dataset.preload, parse content didn't work.	6 years ago
hong	fa43e80e19	New save load interface (#20148 ) * add new save load interface; test=develop * add new save interface; test=develop * add save load interface ; * fix save load error; * fix dygraph set dict bug; * add save load unit test; test=develop * fix test_imperative_optimizer bug; test=develop * fix unitest optimizer bug; test=develop * fix code coverage; test=develop * fix converage; test=develop * add document for apis; test=develop * fix unitest error; test=develop * fix save load unit test error; test=develop * fix error message; test=develop * change set_parameter set_optimizer to save_dygraph; test=develop * add load_graph check; test=develop * fix api spec; test=develop	6 years ago
Zeng Jinle	c20b11ba11	simplify op_info.h, test=develop (#20195 )	6 years ago
hong	0ec2c081d9	update op compatible list; test=develop (#20175 )	6 years ago
tangwei12	c9139c3db3	trainer from dataset fetch targets (#19760 ) add executor.FetchHandler for train/infer from the dataset	6 years ago
chengduo	bfa55c9ddb	Add place deps for fused_all_reduce_op_handle (#20077 ) test=develop	6 years ago
Zeng Jinle	5fef859c65	remove map type from var_type_traits.h, test=develop (#20090 )	6 years ago
Zeng Jinle	4ad66c779c	fix op_compatiable_compile_error, test=develop (#20076 )	6 years ago
qingqing01	1a3eef026c	Enable users to create custom cpp op outside framework. (#19256 ) * How to write custom op needs to follow framework OP spec. * Package fluid_framework.so and headers into whl. * Add paddle.sysconfig.get_include() and paddle.sysconfig.get_lib() to get include dir and lib dir. * Export some C-APIs to merge OpInfo between core.so and custom_op.so. * Add unit testing. * Update API.spec.	6 years ago
bingyanghuang	9de6772510	Follow comment of Merged QAT PR 18970 (#19979 ) * Follow Wangzhen's comment in PR 18970, test=develop * Review comments, test=develop * Leave fake quantization around mul test=develop * Replace Fake with Real Quantized Mul test=develop * Fix bug in quantize placement pass Nodes in the graph now have checked type instead of node name when they are to be marked for quantization test=develop	6 years ago
石晓伟	01b9d07963	update operator compatible info, test=develop (#19978 ) * update operator compatible info, test=develop * revert cmake/version.cmake, test=develop * add unit_tests and fix bugs, test=develop * update ../paddle/fluid/framework/framework.proto, test=develop * fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop * update paddle/fluid/framework/version_test.cc, test=develop * add comments and rename interfaces, test=develop	6 years ago
joanna.wozna.intel	f5221ac19f	Disable conv requant squash (#20041 ) * Fix conv2d+dequantize squash for residual fusion test=develop * Disable conv-requant squash test=develop	6 years ago
wangchaochaohu	c9ea317b36	codegen code for reconstruction (#19728 ) * codegen code for reconstruction test=develop * fix the cmake test=develop * fix review advice test=develop	6 years ago
tangwei12	8f0b3c0516	the integrated communicator (#19849 ) * add a base class for the Communicator * add AsyncCommunicator Impl for async distributed training	6 years ago
Chen Weihang	b916335025	Paddle error message stack shaping and optimization (#19895 ) * shape and optimize paddle error message stack, test=develop * limit exception type & add unittest, test=develop * fix multi-platform problem, test=develop * fix related unnitest failed, test=develop * add doc & fix unittest errors, test=develop * fix function name error, test=develop * update tensor test exception msg compare, test=develop * remove unittest on win32, the dir format is different, test=develop * remove useless package, test=develop * add paddle enforce handler unittest, test=develop * add exception checkout, test=develop * fix coverage failed, test=develop * fix op registry test failed, test=develop * refactor whole pr, test=develop * remove test in CMakelist, test=develop * fix coverage, test=develop	6 years ago
chengduo	2450d15b78	disable fuse_all_optimizer_ops (#19966 ) test=develop	6 years ago
chengduo	101a2b610a	Add dtype for coalesce_tensor_op (#20016 ) Add dtype for coalesce_tensor_op	6 years ago
Huihuang Zheng	88af4ab650	Add new data layer (#19916 ) The new "fluid.data" changes old "fluid.layers.data": 1. Add shape and dtype check. 2. Remove "append_batch_size" parameter. We won't offer this in the new data layer because other deep learning platforms don't have this kind of data layer pre-processing. It may confuse users. 3. Remove "stop gradient" parameter because the data layer doesn't do back-propagation TODO： Now data layer feeded by executor is checked, will we want to check the feed data of readers in the future?	6 years ago
xujiaqi01	f50e701b3b	fix memory leak in HogwildWorker (#19956 ) fix memory leak in HogwildWorker, whose ops are explicitly deleted in destructor	6 years ago
xujiaqi01	cedc04775c	support change shuffle and train thread num (#19841 ) * support change shuffle thread num * support change train thread num * fix receive shuffle data of each channel * data norm stop gradient * add check thread_tensor type and root_tensor type when merge metric * remove sleep in shuffle, add config * add config of pslib client to client communication * fix xbox str * add data norm op testcase * add flush in trainer finalize	6 years ago

... 3 4 5 6 7 ...

3040 Commits (89530384008b023dc1e8c51e5a8e7e710718efff)