* sequential reader stage 1, test=develop
* fix ut, test=develop
* fix iterable=False reset bug, add some logs and polish code, test=develop
* inference feed partial data, test=develop
* Turn on keep_order=True for test, test=develop
* enhance ut to test more cases, test=develop
* test commit for reverting
* Revert "test commit for reverting", test=develop
This reverts commit 80aef42ef52ba1ee79627d6f663a624ec4f12f58.
* add ut of merged and unmerged results, test=develop
* add more uts for coverage and add en doc of api, test=develop
* follow comments, test=develop
* change note style, test=develop
* update ScopeBufferedSSAGraphExecutor, AsyncSSAGraphExecutor, ThreadedSSAGraphExecutor, FastThreadedSSAGraphExecutor, ParallelSSAGraphExecutor and ParallelExecutor for fetching unmerged results.
* add the unit test for fetch_unmerged.
* update ut for multi-card and multi-cpu.
* add the error message and the user suggestion in FetchOpHandle. test=develop
* Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator.
* Add a config for the DynamicAdjustChannelNum function to denote whether we will discard the remaining instances when they cannot be distributed evenly.
* Remove CPU code in Pull/PushSparse; we will add it back after it is fully tested.
* Fix some known issues, such as copying persistable vars after one epoch of running.
Refine PaddleBox Framework. Main functions:
* Add MetricMsg util class, which can calculate metrics like AUC, bucket_error, COPC.
* Replace FeedPass with new interface: BeginFeedPass & EndFeedPass
* Refactor Pull/Push Sparse Function in box_wrapper.
* Use CUDA kernels to copy keys and feasigns between tensors and the boxps struct (a sketch follows this list).
* Cache copied keys in pull sparse in order to reuse it in push period.
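A hypothetical sketch of such a key-copy kernel, assuming each slot's keys sit in their own device buffer and that slot_lens holds prefix-summed lengths; all names and the struct layout here are illustrative, not PaddleBox's actual code:

```cpp
// Gather per-slot int64 keys into one contiguous uint64 buffer for boxps.
// slot_lens has slot_num + 1 prefix-summed entries, with
// slot_lens[slot_num] == total_len (assumed layout, for illustration).
#include <cstdint>

__global__ void CopyKeysKernel(const int64_t* const* src_keys,
                               uint64_t* dest_keys, const int* slot_lens,
                               int slot_num, int total_len) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= total_len) return;
  // Linear scan to find the slot that owns flattened index i.
  int slot = 0;
  while (slot < slot_num - 1 && i >= slot_lens[slot + 1]) ++slot;
  dest_keys[i] = static_cast<uint64_t>(src_keys[slot][i - slot_lens[slot]]);
}
```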
* Add the first implementation of fusion_group op #19621 (#3)
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop
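As a minimal sketch of the flow the two commits above describe (compile CUDA C++ with nvrtc at runtime, then launch through the driver API), with a toy kernel and most error checks elided; this is illustrative, not fusion_group's actual code:

```cpp
// Compile a CUDA C++ string to PTX with nvrtc, then load and launch it
// through the CUDA driver API. Kernel source and names are illustrative.
#include <cuda.h>
#include <nvrtc.h>
#include <cstdio>
#include <string>

int main() {
  const char* source = R"(
    extern "C" __global__ void scale(float* x, float a, int n) {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < n) x[i] *= a;
    })";

  // Runtime compilation: report failure via status instead of hanging up.
  nvrtcProgram prog;
  nvrtcCreateProgram(&prog, source, "scale.cu", 0, nullptr, nullptr);
  if (nvrtcCompileProgram(prog, 0, nullptr) != NVRTC_SUCCESS) {
    size_t log_size;
    nvrtcGetProgramLogSize(prog, &log_size);
    std::string log(log_size, '\0');
    nvrtcGetProgramLog(prog, &log[0]);
    std::fprintf(stderr, "nvrtc error: %s\n", log.c_str());
    return 1;
  }
  size_t ptx_size;
  nvrtcGetPTXSize(prog, &ptx_size);
  std::string ptx(ptx_size, '\0');
  nvrtcGetPTX(prog, &ptx[0]);
  nvrtcDestroyProgram(&prog);

  // Driver API: load the PTX and launch the compiled kernel.
  cuInit(0);
  CUdevice dev;
  cuDeviceGet(&dev, 0);
  CUcontext ctx;
  cuCtxCreate(&ctx, 0, dev);
  CUmodule mod;
  cuModuleLoadData(&mod, ptx.c_str());
  CUfunction fn;
  cuModuleGetFunction(&fn, mod, "scale");

  int n = 1024;
  float a = 2.0f;
  CUdeviceptr x;
  cuMemAlloc(&x, n * sizeof(float));
  void* args[] = {&x, &a, &n};
  cuLaunchKernel(fn, (n + 255) / 256, 1, 1, 256, 1, 1, 0, nullptr, args,
                 nullptr);
  cuCtxSynchronize();
  cuMemFree(x);
  cuCtxDestroy(ctx);
  return 0;
}
```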
* Disable for mac and windows.
test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop
* Refine the CUDA kernel to support large dims.
test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation of fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test.
test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop
* Disable fusion_group op for mac and windows.
test=develop
* Make the compiling of device code return status instead of hanging up.
test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
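A sketch of that graceful check using the standard dlopen/dlsym pattern (Paddle's dynload layer wraps each driver symbol in a similar way); the structure, not the exact code, is the point:

```cpp
// Probe for the CUDA driver library at runtime and fail gracefully
// instead of core dumping when it is absent.
#include <dlfcn.h>
#include <cstdio>

typedef int (*cuInit_t)(unsigned int);  // CUresult approximated as int

int main() {
  void* handle = dlopen("libcuda.so.1", RTLD_NOW | RTLD_GLOBAL);
  if (handle == nullptr) {
    std::fprintf(stderr, "CUDA driver library not found: %s\n", dlerror());
    return 0;  // degrade gracefully, no core dump
  }
  auto init = reinterpret_cast<cuInit_t>(dlsym(handle, "cuInit"));
  if (init == nullptr || init(0) != 0 /* CUDA_SUCCESS */) {
    std::fprintf(stderr, "failed to initialize the CUDA driver\n");
  }
  dlclose(handle);
  return 0;
}
```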
* Unify fusion_group_op's input and output names.
test=develop
* Add the check of CUDA driver library in unittest.
test=develop
* Enable generating code for a given subgraph. #21126 (#4)
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearrangement of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions.
test=develop
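For intuition, the generated code for a tiny subgraph such as tmp = relu(x + y) might look roughly like the kernel below, with loads and stores separated from the computation and a grid-stride loop to support large dims; the real generator's naming and structure differ:

```cpp
// Illustrative shape of a generated fused elementwise kernel.
extern "C" __global__ void fused_elementwise_0(int n, const float* x,
                                               const float* y, float* tmp) {
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
       i += gridDim.x * blockDim.x) {
    // loads
    float x_i = x[i];
    float y_i = y[i];
    // computation
    float add_i = x_i + y_i;
    float relu_i = add_i > 0.0f ? add_i : 0.0f;
    // stores
    tmp[i] = relu_i;
  }
}
```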
* Improve the loading statements in generated codes.
test=develop
* Remove unused arguments from formal list.
test=develop
* Enable the detection of subgraph of grad ops.
* Generate code for detected subgraph in fusion_group_pass.
* Add an option in BuildStrategy to enable fusion_group_pass and add unittest.
test=develop
* Fix a bug when checking whether the shapes of all inputs are the same.
* Add debug information.
* Move subgraph_detector from inference/analysis to the common framework/ir directory. (#5)
test=develop
* Call subgraph_detector in fusion_group pass.
test=develop
* Disable fusion_group when WITH_GPU is OFF.
test=develop
* Refine all PADDLE_ENFORCE messages.
test=develop
* Fix the case where some inputs are not defined in grad ops, and set op_role for the fused op.
test=develop
* Follow review comments.
test=develop
* Enable quantize to reorder to nchw as well
* Correct FC MKL-DNN input dim requirements to accept 3D
* Improve DNNL FC format, error and 3D input handling
test=develop
* Improve error checking in FC
test=develop
* Improve PADDLE_ENFORCE messages in fc-related files
* Remove data layout attribute from obligatory pass args
test=develop
* Fix message in fc_mkldnn_pass to be logically correct
test=develop
* Implement a common python unittest to test the ir passes.
test=develop
* Save the results in np.array and support starting up on CPU.
test=develop
* Fix the unittest.
test=develop
* Add check_program to check whether the optimized program is different from the original one.
test=develop
* Remove the interface all_ops.
test=develop
* Add exception test in pass_test.
test=develop
* add bn and relu fuse pass
* add op attr assert and dtype assert
* fix some input/output bugs for the fused op and pattern.
* add the unittest for fuse_bn_act_pass. test=develop
* use normative enforce statements. test=develop
* add the cpu test. test=develop
* add the support of batch_size=1 for the bn with relu op. test=develop
* add the error type for paddle throws. test=develop
* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unused_vars_white_list. test=develop
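As a minimal sketch of what the fused op evaluates per element in the inference case (the actual fused op also covers training and the gradient path):

```cpp
// Fused batch_norm + relu, inference formula per element:
// y = max(0, (x - mean) / sqrt(var + eps) * scale + bias)
#include <cmath>
#include <cstdio>

float fused_bn_relu(float x, float scale, float bias, float mean, float var,
                    float eps = 1e-5f) {
  float y = (x - mean) / std::sqrt(var + eps) * scale + bias;
  return y > 0.0f ? y : 0.0f;  // relu folded into the same pass
}

int main() {
  std::printf("%f\n", fused_bn_relu(1.0f, 1.0f, 0.0f, 0.0f, 1.0f));
  return 0;
}
```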
* Polish the PADDLE_ENFORCE in fusion_group pass related codes.
test=develop
* Correct the unittest because of the change of relu_grad's formula.
test=develop
* Refine the calling of PADDLE_ENFORCE.
test=develop
* optimize adam speed by removing _finish_update test=develop
* fix SparseAdamFunctor param list test=develop
* Remove scale_op in expect_list of adam_op test=develop
* fix test optimizer loss assert error test=develop
* fix test optimizer loss assert error test=develop
* modify PADDLE_ENFORCE usage test=develop
* fix op_type in lamb_op.cc test=develop
* fix ostream format bug in errors test=develop
* add betaPowOut in ngraph op test=develop
* fix ngraph::op api for gcc8 test=develop
* clean code test=develop
* modify struct into class test=develop
* remove code of beta1Tensor in lamb_op test=develop
* fc-dequantize squash
test=develop
* change according to reviews
test=develop
* change PADDLE_ENFORCE
test=develop
* add a second test for the case where fc-dequant does not fuse
test=develop
* change all related PADDLE_ENFORCE
test=develop
* add fake init for the trainer, fix large memory footprint in the trainer
* do not merge recv vars from a remote endpoint, test=develop
* add recv and save op, merge sliced vars in one op, save memory
* remove hsigmoid with pull sparse, test=develop
* add file check_op_desc.py and add interface to get default value. test=develop
* add test for c++ coverage rate. test=develop
* Correct typo. test=develop
* Commit before merging develop
test=develop
* Backup after working with Huihuang logs
* Commit before deleting Huihuang debug loggings
* Commit before debug
test=develop
* Fix bug commit
test=develop
* Backup of fixing bugs
test=develop
* Clean up code
test=develop
* Fix a bug in sum_op
test=develop
* Implement Int8 FC
* Integrate FC into INT8v2
test=develop
* int8 FC: transpose weights before computing scales
test=develop
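An illustrative computation of per-output-channel int8 scales after the transpose, so each row of W corresponds to one output channel; the 127 symmetric range is the usual convention, and this is not the quantizer's exact code:

```cpp
// Per-output-channel int8 scales for FC weights laid out as [out, in].
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
  const int in = 3, out = 2;
  std::vector<float> w = {0.5f, -2.0f, 1.0f,    // output channel 0
                          0.25f, 0.1f, -0.4f};  // output channel 1
  for (int o = 0; o < out; ++o) {
    float max_abs = 0.0f;
    for (int i = 0; i < in; ++i)
      max_abs = std::fmax(max_abs, std::fabs(w[o * in + i]));
    float scale = 127.0f / max_abs;  // map the largest weight to int8 range
    std::printf("channel %d scale = %f\n", o, scale);
  }
  return 0;
}
```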
* Add support for activation_type string in FC
test=develop
* Disable MKL-DNN's FC in VGG16 and 19
test=develop
* Disable FC quantization when mkldnn FC is disabled
test=develop
* Solve PADDLE_ENFORCES in FC int8
* Fix Paddle enforces and remove const cast
test=develop
* Fix style changes
test=develop
* Fix quantizer_tester test and add fc quantization
test=develop
* Fix FC test fail on CUDA
* Remove unnecessary log from quantize placement pass
test=develop
* Add Thread ID to FC hash key
test=develop
* Add comments to MKL-DNN FC Kernel
test=develop
* Refactor quantizer
test=develop
* Fix linter issues
test=develop
* Fix crash in slim googlenet
test=develop
* Fix PADDLE_ENFORCE messages
test=develop
* Add fc padding to solve the MKL performance issue
test=develop
* fix gpu pass and error information
test=develop
* fix fc_fuse_pass_test
test=develop
* fix error information
test=develop
* fix error information
test=develop
* fix name and add fc op padding test
test=develop
* fix attributes
test=develop
* optimize fc padding
test=develop
* fix test
test=develop
* fix fetch handler problem and refactor
When a user defines a FetchHandler class, he or she should initialize the handler with a variable dict. The key of the variable dict is a user-defined name, and the value is a Variable generated from the Python API. For each fetch, the user should implement a handler function in which fetched_result_dict is available, so the user can access the fetched values with the user-defined keys.
* Disable fusion_group pass for windows and mac. We will do some experiments on Linux first.
test=develop
* Print the subgraph when check failed.
test=develop