Paddle

Commit Graph

Author	SHA1	Message	Date
Wilber	f686310d81	fix concat_mkldnn op. test=develop (#22692 ) fix concat_mkldnn op when encounter extreame conditions.	5 years ago
hong	5191e54494	reduce default attrs for dynamic graph (#22850 ) * reduce default attrs for dynamic graph, test=develop * add some explanations for explicit attr, test=develop * tweak explicit attr comments, test=develop	5 years ago
Zhaolong Xing	1a533ed2de	[BUG]: Multihead matmul op's ouput size should be BxSx(N*H) (#22848 ) test=develop	5 years ago
hong	c736fef93b	dygraph backward engine accelerate (#22808 ) * fix loaded program load bug; test=develop * first version * speed backward engin; test=develop * remove useless code; test=develop * reconvery io.py; test=develop * remove useless code; test=develop * remove useless code; test=develop	5 years ago
Zeng Jinle	d41d802ba3	Add flags to limit gpu memory (#22793 ) * add recorded cuda memory apis, fix typo, test=develop * add more ut, test=develop * follow comments, test=develop * fix py35 incompatible issues, test=develop	5 years ago
石晓伟	1861ca88f1	serialize the PaddleTensor, test=develop (#22810 ) * encapsulate the PaddleTensorToLoDTensor, test=develop * serialize the pd_tensor, test=develop * serialize tensors to file, test=develop	5 years ago
Zhang Ting	72ff5a09c3	fix print bug of profile, test=develop (#22804 )	5 years ago
Zhang Ting	4e8bc02461	add fluid.device_guard to specify the device type for Op (#22254 ) * add fluid.device_guard to specify the device type for Op	5 years ago
石晓伟	ddb9b46fec	change the function in op_teller, test=develop (#22794 ) * change the function in op_teller, test=develop * correct the commit-id, test=develop	5 years ago
Zhen Wang	89cfa49156	Unmerged fetch list (#22635 ) * update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results. * add the unit test for fetch_unmerged. * update ut for multi-card and multi-cpu. * add the error message and the user suggestion in FetchOpHandle. test=develop	5 years ago
wangchaochaohu	8456c3f4dd	polish the profiler_help code (#22811 )	5 years ago
zhongpu	2fd1ec1e3e	fix docker build for paddle openblas, test=develop (#22795 )	5 years ago
Chen Weihang	7d8d573453	Speed up dygraph DataLoader based on shared memory and LoDTensor serialization (#22541 ) * add lodtensor share memory & serialization, test=develop * fix windows compile error, test=develop * deal vartype pickle & fix unittest matching error message, test=develop * update timeout variable name, test=develop * refactor memory map implement, test=develop * clear mmap file discripter when exit unexpectedly, test=develop * remove the child process fd in advance, test=develop * remove mmap fds after Queue.put in child process, test=develop * add hard unittests for register exit func, test=develop * fix python2 compatibility problem in unittest, test=develop * fix exception unittest error, test=develop * polish code based review comment, test=develop	5 years ago
liu zhengxi	324f2b3922	Fix inference c api PD_GetZeroCopyOutput lod (#22768 ) * fix inference c api lod, test=develop * fix capi lod problem and enrich tests, test=develop * delete useless header files and alter const_cast, test=develop	5 years ago
wangchaochaohu	7578fcbac4	Profile code refine (#22800 ) * add profiler_help.h to refine the code test=develop	5 years ago
hutuxian	53a2b68f4e	support customized download command in dataset (#22782 ) * user can call dataset.set_download_cmd to set its customized download cmd * add UT to cover this scenario	5 years ago
wangchaochaohu	ca9e77a8d4	add sum op support for fusion group (#22771 ) * Add the codegen and auto fusion for sum Op in fusion group	5 years ago
tianshuo78520a	433cef03e5	fix typo word (#22784 )	5 years ago
Kaipeng Deng	ebc7ffc300	fix detection_map. test=develop (#22705 )	5 years ago
zhaoyuchen2018	72dde4abde	Refine adam op to improve performance, test=develop (#22346 ) * Refine adam op, test=develop * Fuse kernels together to reduce cpu time. * Refine paddle enforce, test=develop * Remove some comments, test=develop * Refine code,test=develop * Refine cuda kernel, test=develop * Refine code according to comments, test=develop	5 years ago
wangguanzhong	f2d1cd119a	fix lod level, test=develop (#22755 )	5 years ago
FlyingQianMM	79d712346f	Correct CPU gradients of the argsort op (#22739 ) * Correct CPU gradients of the argsort op, form a network to test its forward and backward process, test=develop * fix dynamic threshold error in test_argsort_op, test=develop	5 years ago
Adam	2b80e9a719	Add cpu_info without XBYAK (#22716 )	5 years ago
guofei	ae8b5f11a3	Change ShareDataWith() to TensorCopy() in ref_by_trainer_id (#22717 ) As the title	5 years ago
liu zhengxi	71ab0458e1	Fix pointer and c-api encapsulation (#22663 ) * refine pointer and c-api prototype, test=develop * fix new c api profile bug, test=develop * add unit tests, test=develop	5 years ago
Leo Chen	b2c1be851a	support cond in clone, test=develop (#22657 ) * support cond in clone, test=develop * refine code, test=develop * refine code, test=develop * follow comments, test=develop * refine code, test=develop	5 years ago
Zhang Ting	f97f3f9301	add framework overhead ratio in profile report (#22590 ) * add framework overhead ratio, test=develop * print GpuMemcpy overhead, test=develop	5 years ago
zhouwei25	160d0f1308	fix the CI risk that network cannot be connected (#22736 )	5 years ago
chengjuntao	15c2667143	register fp16 for assign op (#22744 ) * register fp16 for assign op, test=develop * add op test for fp16, test=develop	5 years ago
zhangchunle	882e7f7c3b	Directly getting API.spec for tools/sampcd_processor.py (#22728 )	5 years ago
dyning	1c0653462d	fix generate_mask_labels lod level (#22743 )	5 years ago
GaoWei8	ba140222d6	fix compile&runtime lod_equality of lod_reset (#22737 )	5 years ago
hutuxian	175954d894	PaddleBox Framework Part2 (#22466 ) * Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator. * Add a config for DynamicAdjustChannelNum function to denote whether we will discard the remaining instances when they are not be distributed evenly. * Remove CPU code in Pull/PushSparse and we will add it back when testing it fully. * Fix some known issues: such as copying persistable vars after one epoch running.	5 years ago
ShenLiang	3132681e8a	add partial_sum op in contrib (#22292 ) * add partial_sum_op, test=develop * modify the Paddle Error Message, test=develop * modify the Paddle Error Message, test=develop * modify the bug for python3, test=develop * modify the ut for ci, test=develop * mv to contrib, test=develop * use check_variable_and_dtype, test=develop * fix ci, test=develop * fix conflict, test=dvelop * add partial concat, test=develop * fix the conflict, test=develop * fix the error, test=develop * rm SSE4, test=develop	5 years ago
wangchaochaohu	611411b90e	Fusion group profile support (#22718 ) * add support for the driver api callback and fix the profiler name show bug	5 years ago
ShenLiang	e136661304	add partial_concat op in contrib (#22528 ) * add partial_concat, test=develop * fix the grids and blocks, test=develop * fix the Paddle_Enforce, test=develop * fix the doc of op, test=develop * fix the doc, test=develop * fix the doc of the op, test=develop * replace -1 with None, test=develop	5 years ago
GaoWei8	cdf5f6fb8c	Add an inference interface to disable FC padding (#22097 ) * Add an interface of disabling FC padding * fix bert regression * polish fc padding interface * recover pass function * fix argument error * fix mkldnn error	5 years ago
tianshuo78520a	d2ba91aad1	fix typo words (#22653 )	5 years ago
Yibing Liu	6e7bfe30a6	register fp16 kernel for some ops (#22650 ) (#22696 ) test=develop	5 years ago
tangwei12	66a3150135	SYNC with communicaotor (#22344 ) * add sync communicator and implement	5 years ago
Yiqun Liu	22bbd54719	Add the support of fp16 in fusion_group (#22239 )	5 years ago
flame	d97475d53b	fix CPU C inference API compile bug (#22702 )	5 years ago
Huihuang Zheng	adfa5b8354	Add PADDLE_ENFORCE to Check Sequence Length of RecurrentOp (#22673 ) 1. Add PADDLE_ENFORCE to Check Sequence Length of RecurrentOp. 2. Also enrich PADDLE_ENFORCE error messages.	5 years ago
flame	74eb82de19	fix go api bug (#22669 )	5 years ago
wangchaochaohu	a089072c8b	fix the profile print error (#22665 ) * fix the profile print error test=develop	5 years ago
lidanqing	d926214535	[UT coverage] improve the mul_mkldnn_op line coverage (#22408 ) * improve the mul_mkldnn_op line coverage test=develop * remove fp32 mul mkldnn kernel test=develop * locally refactoring test=develop * change according to reviews test=develop	5 years ago
wangchaochaohu	c65c6ae534	add flag to control profile level in python API (#22319 ) * add python flag to control profile level test=develop	5 years ago
123malin	00594c1c88	support dumping params/grads in transpiler mode (#22490 )	5 years ago
Zhaolong Xing	a06d75a280	[Paddle-TRT] Refine the error log about runtime batch and max_batch_size. (#22535 ) * fix trt log test=develop * fix comments test=develop	5 years ago
Adam	608447bfd5	Update MKLDNN to v1.2 (#22521 )	5 years ago
Adam	ab610a34ff	transpose_mkldnn code change to meet Paddle standards (#22591 )	5 years ago
Jiawei Wang	8f035fb637	Add TopK Op Grad CPU&GPU Kernel test=develop (#22628 ) * Add TopK Op Grad CPU&GPU Kernel test=develop * Add TopK Op Grad, modify grad op maker test=develop * Add TopK Op Grad, modify grad op maker test=develop * Add TopK Op Grad, modify PADDLE_ENFORCE test=develop * Add TopK Op Grad, modify PADDLE_THROW test=develop * Add TopK Op Grad, modify unittest test=develop * fix ngraph top k op unittest test=develop	5 years ago
Steffy-zxf	90ee366653	update ops's unittest data type from float32 to float64 and shape over 100 (#22544 ) * update ops's unittest of elementwise_pow, elementwise_max, elementwise_min, scale and sqrt 1. update elementwise_pow, elementwise_max and scale's unitests with input data type (float32 -> float64) 2. fix bug that the elementwise_pow doesn't meet threshold requirements with tackling float64 data 3. remove sqrt from op_accuracy_white_list.py 4. update the unittests of elementwise_pow, elementwise_max and elementwise_min ops that their input data shape over 100 5. test=develop * modify the writing style according suggestions test=develop	5 years ago
flame	f7eafca828	remove python inference warning (#22602 )	5 years ago
Chen Weihang	fe685cc185	fix enforce test error, test=develop (#22610 )	5 years ago
Wilber	9a8203aa25	fix fc_lstm_fuse when multi sub-graph use same fc_bias. test=develop (#22551 ) When a model contains multiple fc_lstm subgraphs whose fc ops share the same persistable bias, the bias node should not be deleted; only the non-persistable nodes should be removed.	5 years ago
Chen Weihang	266106da75	Fix mismatch with plus sign in the line (#22588 ) * reproduce match error, test=develop, test=document_fix * fix mismatch error, test=develop, test=document_fix	5 years ago
flame	1d503e6a9e	Golang inference API (#22503 ) * support golang inference	5 years ago
Zhaolong Xing	8acd745c25	[Ernie GPU Optim]: Fuse three fc to multihtead matmul (#22486 ) * 1. optim multihead matmul: fuse three fc to multihtead matmul test=develop * fix conflict test=develop * fix comments test=develop	5 years ago
Yiqun Liu	96770f519e	Disable fusion_group for windows and mac in build_strategy. (#22549 ) test=develop	5 years ago
Zeng Jinle	08033c8634	fix traced layer with non persistable vars, test=develop (#22552 )	5 years ago
Guo Sheng	31b5464632	Add support for dynamic_decode(while) training. (#22231 ) * Add support for dynamic_decode(while) training. test=develop * Fix assign_op and tensor_array_read_write_op after solving conflict. test=develop * Fix test_rnn_decode_api.py. test=develop * Refine docs for apis in rnn.py. test=develop * Adjust outputs of dynamic_decode. test=develop * Remove the force_cpu update in assign_op. test=develop * Remove the force_cpu update in assign_op. test=develop * Make RNNCell.get_initial_states support batch_dim_idx argument. test=develop * Rename _create_array_outof_while as _create_array_out_of_while in rnn.py. test=develop	5 years ago
tangwei12	b0675c8193	fix bug with compiledProgram (#22495 ) * add thread barrier for the compiled program	5 years ago
Wojciech Uss	4cddb43c5c	Add support for Ernie NLP model to the Slim QAT (#22506 ) * a test for Ernie QAT INT8 accuracy check test=develop * Remove NLP comparison test to split PRs test=develop * Fix typo and tabs, delete commented lines test=develop * re-combine the 2 PRs, test=develop Co-authored-by: Michał Gallus <sand3r@interia.eu> Co-authored-by: bingyanghuang <33643817+bingyanghuang@users.noreply.github.com>	5 years ago
Double_V	58d99247f4	support slice double grad, test=develop (#22166 ) * support slice double grad, test=develop * merge two doublegradopmaker to one doublegradopmaker,test=develop * change the shape of slice_OP's unittest, test=develop	5 years ago
hutuxian	1a7962be97	Paddlebox about box_wrapper (#22497 ) Refine PaddleBox Framework, Main functions: * Add MetricMsg util class, which can calculate metrics like AUC, bucket_error, COPC. * Replace FeedPass with new interface: BeginFeedPass & EndFeedPass * Refactor Pull/Push Sparse Function in box_wrapper. * Use CUDA Kernel to copy keys and copy feasign between tensor and boxps struct. * Cache copied keys in pull sparse in order to reuse it in push period.	5 years ago
huzhiqiang	9e29d3ebed	【OpPorting Example】DEMO OF FIX COMPILE&RUNTIME LOD_EQUALITY (#22460 )	5 years ago
yaoxuefeng	2235ee1a5e	multi-loss optimization by adding a DownpourOpt worker (#22025 ) * update * update test=develop * update compile set test=develop * update compile set test=develop * update test=develop * update test=develop * update test=develop * update compile setting test=develop * update compile setting test=develop * update run demo test=develop * update test=develop * update test=develop * fix test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update format test=develop * update format test=develop * update style test=develop * update style test=develop * change style test=develop * change style test=develop * change style test=develop * add dataset unittest test=develop * update test=develop * update for record test=develop * udpate style for record test=develop * update for record test=develop * update for record test=develop * update for record test=develop * fix format test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop	5 years ago
zhaoyuchen2018	54970444ce	Improve transpose performance with tile sm copy, test=develop (#22311 ) * Refine code, fix select tile error,test=develop * Refine element type and some comments, test=develop * Refine comments and gpu utils, test=develop * Remove some useless condition * Refine floor and ceil, test=develop * refine for loop. test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
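Wilber	a90fa54092	Compile without nccl deps. [1/2] (#22509 ) Support compiling without depending on nccl. [1/2] In a multi-card setup, if the build is compiled without the WITH_NCCL switch turned on, the cards cannot communicate, so only a single card can be used. Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	5 years ago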
guofei	3a59a7a11f	Make assign op support LoDTensorArray and modify while_loop API (#22309 ) This PR makes assign op support LoDTensorArray and enable the loop_vars in while_loop to support tuple or list.	5 years ago
Zhaolong Xing	54a325a52f	[Refine Paddle-TRT INT8]: Support PaddleSlim's Resnet50, Mobilenetv1, Yolov3 models for Inference. (#22483 ) * add int8 op teller for trt. * refine trt int8 * add int8 op teller for trt. test=develop	5 years ago
zhongpu	5739eeb9fa	add cp27-cp27m-gcc82 and cp27-cp27mu-gcc82 branch to support gcc8.2 compile for paddle, test=develop (#22504 )	5 years ago
Wilber	de009152a7	Compile without nccl deps. [2/2] (#22484 ) Compile without nccl deps. [1/2] Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	5 years ago
Yiqun Liu	4b2227e958	Fix dismatch of std::max's arguments type on windows. (#22507 ) test=develop	5 years ago
Wilber	870f465887	fix test_fusion_seqpool_concat lod level between compile and runtime (#22488 )	5 years ago
Zhong Hui	a61d09527b	Fix the integer overflow problem of sequence2batch (#22479 ) Fix the integer overflow problem in the op of sequence2batch, change the int32_t to size_t， In the /paddle/fluid/operators/math/sequence2batch.h#L122.	5 years ago
cc	197913ebe1	Add weight quantization in post_training_quanzitaion (#22445 ) * support weight quantization in post_training_quanzitaion, test=develop * add test for weight quantization, test=develop	5 years ago
Yiqun Liu	dcfb603897	Enable the detection of subgraph composed of grad ops (#21223 ) * Add the first implememtation of fusion_group op #19621 (#3) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop * Add DeviceCodePool to manage all device codes. * Add the first implementation fusion_group op. * Add unit-test for fusion_group op. * Add the check of result. * Add the check of nvrtc in unit-test. test=develop * Add comment to explain the inputs, outputs and features of fusion_group op. test=develop * Disable fusion_group op for mac and windows. test=develop * Make the compiling of device code return status instead of hanging up. test=develop * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API. * Unify fusion_group_op's input and output names. test=develop * Add the check of CUDA driver library in unittest. test=develop * Enable generating code for a given subgraph. #21126 (#4) * Enable generating code for a given subgraph. * Support sorting the subgraph. * Remove the rearange of expressions because we use the sorted subgraph directly. * Enable generating code for a subgraph which is composed of grad ops. * Use expression information to check the accuracy in unittest. * Separate load and store from computation expressions. test=develop * Improve the loading statements in generated codes. test=develop * Remove unused arguments from formal list. test=develop * Enable the detection of subgraph of grad ops. * Generate code for detected subgraph in fusion_group_pass. * Add an option in BuildStrategy to enable fusion_group_pass and add unittest. test=develop * Fix a bug when checking whether the shape of all inputs are the same. * Add debug information. * Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#5) test=develop * Call subgraph_detector in fusion_group pass. test=develop * Disable fusion_group when WITH_GPU is OFF. test=develop * Refine all PADDLE_ENFORCE message. test=develop * Fix the case that some inputs are not defined in grad ops, and set op_role for fused op. test=develop * Follow review comments. test=develop	5 years ago
Tao Luo	7c9ce097f1	refine reshape_op shape error message (#22480 ) test=develop	5 years ago
LielinJiang	2b1386b2b2	optimize performance of interpolate op (#22436 ) * optimize interpolate op, test=develop	5 years ago
wangchaochaohu	77dd0d97bb	use enum class to replace the usage of enum in some condition test=develop (#22464 )	5 years ago
Yiqun Liu	44b45b9f07	Correct the use of DeviceContext in unittest sequence_pooling_test and sequence_padding_test (#22456 ) * Add log in memory::Copy for debug purpose. * Change to use context in DeviceContextPool directly in sequence_pooling_test, instead to new one. * Change to use context in DeviceContextPool directly in sequence_padding_test, instead to new one. test=develop * Change the type of second_dim from size_t to int64_t. test=develop	5 years ago
joanna.wozna.intel	17f2c0899f	Add dequant-scale squash (#22409 ) * Add dequant scale squash test=develop * Correct dequant-scale squash test test=develop	5 years ago
mapingshuo	9c4deedbc2	update readme of imdb training demo (#22455 ) * update readme * test=develop	5 years ago
Zhaolong Xing	ceda0b9b1a	[Fix BUG]: Core when multi thread + clone + paddle-trt (#22442 ) * add mutex for trt engine test=develop * add the test for copy_to_cpu test=develop	5 years ago
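Wilber	7bc4b09500	add WITH_NCCL option for cmake. (#22384 ) Added WITH_NCCL to the cmake options to explicitly specify whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned off if WITH_GPU is OFF. Added the PADDLE_WITH_NCCL definition. Single-machine, single-card setups can disable NCCL compilation; multi-card setups need NCCL enabled (the default), and if NCCL is disabled only a single card can be used. Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	5 years ago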
Tao Luo	943cb8c664	fix sigmoid cudnn bug (#22439 ) * Sigmoid bug fix, test=develop * fix code format test=develop Co-authored-by: Manjunath Bhat <manjunathbhat9920@gmail.com>	5 years ago
xujiaqi01	d51ffe860a	fix copy table bug (#22432 ) * fix copy table bug of lost some feasign * test=develop	5 years ago
Leo Chen	822e5b36ec	Support int16 for Tensor (#22423 ) * add int16 support, test=develop * add test, test=develop * fix typo, test=develop * fix dtype error in slice, test=develop	5 years ago
石晓伟	e1b0d7cbb1	remove anakin from code, test=develop (#22420 )	5 years ago
liu zhengxi	0404e7a985	Update the precision of pad, pad2d, pad_constant_like's unit tests from fp32 to fp64 (#22394 ) * update the ut precision of pad pad2d pad_constant_like from fp32 to fp64, test=develop	5 years ago
xujiaqi01	371f377bea	add GeneralRoleMaker (#22295 ) * add GeneralRoleMaker which is for general usage * test=develop	5 years ago
Michał Gallus	269db0d1d1	[DNNL] Fix accuracy in INT8 FC (#22404 ) * Enable quantize to reorder to nchw as well * Correct FC MKL-DNN input dim requirements to accept 3D * Improve DNNL FC format, error and 3D input handling test=develop * Improve error checking in FC test=develop * Improve PADDLE_ENFORCE messages in fc-related files * Remove data layout attribute from obligatory pass args test=develop * Fix message in fc_mkldnn_pass to be logically correct test=develop	5 years ago
joanna.wozna.intel	fb3086fd57	[UT coverage]Remove unnecessary transpose op registration (#22402 )	5 years ago
lidanqing	ade5022681	[UT Coverage]Improve sum_mkldnn_op line coverage (#22275 )	5 years ago
joanna.wozna.intel	3099d9d47c	Restore requantize squash (#22399 )	5 years ago
Wojciech Uss	92462e948d	improve elementwise_add_mkldnn_op test code coverage (#22359 )	5 years ago
ceci3	20f30dd604	add benchmark flag for conv_transpose (#22389 )	5 years ago
Leo Chen	b96c7c9a7a	polish code, test=develop (#22380 ) remove unnecessary template.	5 years ago
Chengmo	8f36c39537	Fix GEO-SGD init & send Bug (#22375 ) * test=develop, fix geo Send & Init	5 years ago
zhupengyang	c6f888e5a5	update unittest accuracy to float64 for relu, prelu, maxout (#22273 )	5 years ago
wangchaochaohu	0d8b222b79	Optimize the depthwise op test=develop (#22265 )	5 years ago
Leo Chen	aaa4fe491a	use function instead of lambda, test=develop (#22348 ) * use function instead of lambda, test=develop * follow comments, test=develop	5 years ago
Adam	e7a9f6bbb7	[Bugfix] Preserve shape in inpalce operators (#22360 )	5 years ago
qingqing01	2d20869c94	Fix infer_shape in compling for elementwise_op (#22291 )	5 years ago
Yiqun Liu	b7cac50b64	Implement a common python unittest to test the ir passes. (#22209 ) * Implement a common python unittest to test the ir passes. test=develop * Save the results in np.array and support to startup on CPU. test=develop * Fix the unittest. test=develop * Add check_program to check whether the optimized program is different from the origin one. test=develop * Remove the inferface all_ops. test=develop * Add exception test in pass_test. test=develop	5 years ago
tangwei12	82bc814a57	integrated HALF_ASYNC to communicator (#21869 ) * add half_async in the communicator * fix DistributedStrategy	5 years ago
wangchaochaohu	1e932eccfa	remove unused code test=develop (#22327 )	5 years ago
Leo Chen	3e5744aa65	Remove unused inputs for some operators (#22284 ) * remove unused inputs, test=develop * remove unused inputs, test=develop * update dtype, test=develop * remove unused inputs, test=develop * update op_use_default_grad_op_maker, tese=develop * resolve conflicts, test=develop * follow comments, test=develop * update center_loss_grad, test=develop	5 years ago
zhangchunle	805328e13b	fix typo in error message (#22312 )	5 years ago
lidanqing	895f8da7d6	change std::cout to log(INFO), vlog (#22316 )	5 years ago
石晓伟	8cb04664b9	revert paddle_fluid.map, test=develop (#22236 )	5 years ago
Chen Weihang	35efbe6d95	Speeding up dygraph DataLoader with multiprocessing (#21762 ) * add multiprocess for dygraph data loader, test=develop * polish code & add safe gurad, test=develop * refactor dygraph dataloader & add signal handler, test=develop * fix member initializer compile error on ci, test=develop * fix member initializer compile error one more, test=develop * remove useless config, test=develop * skip windows incompatible problem, test=develop * add unittest for coverage, test=coverage * add more exception unittest case, test=develop * deal with signal handler coverage, test=develop * polish code & add signal handler tests, test=develop * deal with coverage ci problem, test=develop * split data loader test & coverage ci fix, test=develop * remove test_imperative_data_loader_with_exception, test=develop * remove singal process except test case, test=develop * add exception tests again & remove sample list test, test=develop * split normal and exception unittests to diff class, test=develop * polish doc for use_multiprocess effect in static mode, test=develop	5 years ago
Zeng Jinle	9435533adf	remove op_use_default_grad_op_maker.spec, test=develop, test=document_fix (#22300 )	5 years ago
wangchaochaohu	7b76a76495	fix the conda build confilict test=develop (#22279 )	5 years ago
Zeng Jinle	5e601a92ad	polish grad op check (#22290 ) * polish grad op check, test=develop, test=document_fix * keep op_use_default_grad_maker.spec to avoid conflict, test=develop, test=document_fix	5 years ago
Bai Yifan	faba4b116a	Remove disable flag in test_fsp_op.py (#22171 ) * fix fsp_op, test=develop * fix fsp grad op maker, test=develop * update op_use_default_grad_op_maker.spec, test=develop	5 years ago
Zhen Wang	e40cfb1010	fix the bug of assert_is_op_output. test=develop (#22262 )	5 years ago
Wojciech Uss	d3a6647372	improve placement pass tests code coverage (#22197 )	5 years ago
liu zhengxi	07afc29e90	Make api.cc malloc consistent with paddle_api.h for PaddleBuf (#22255 )	5 years ago
silingtong123	4f1da4adcb	remove the useless third_party library from C++ inference library (#22021 ) * remove the useless third_party library from C++ inference library * revert removing the install directory	5 years ago
zhouwei25	549e6de7ac	faster build by reduce by-product, reduce linking library and fix compile warning of std=c++11 (#22164 )	5 years ago
xujiaqi01	e3a457d34b	add collective communication library in fleet (#22211 ) * add collective communication library in fleet to replace mpi * test=develop	5 years ago
Zhen Wang	f2522e91c4	fix the type error caused by setting bool attr in OpDesc. test=develop (#22257 )	5 years ago
songyouwei	0ba1d140d4	Add CI check for sequence ops' unittests (#21615 )	5 years ago
Zeng Jinle	1b76e789cf	remove cuda allocator ctor, test=develop (#22212 )	5 years ago
Adam	9942d9ed5c	Add caching mechanizm to requantize_mkldnn_op (#22223 )	5 years ago
Wilber	1230c110cb	[fluid-lite] adjust to relative error (#22232 ) - replace the precision comparison between fluid and lite with relative error	5 years ago
123malin	985bceac53	Bug fix for sparse recorder (#21969 ) * test=develop, bug fix for sparse recorder	5 years ago
Chen Weihang	fc0b21e17b	Polish fetch error message of parallel executor (#22206 ) * polish error message of parallel executor, test=develop * change PADDLE_ENFORCE, test=develop	5 years ago
Wojciech Uss	2e90c4eb0a	improve mkldnn_quantizer_config test code coverage (#22216 )	5 years ago
Wilber	5750152e80	support fluid-lite subgraph run resnet test=develop (#22191 ) - add a unit test that runs resnet in fluid-lite subgraph mode - update the git commit id of the Lite dependency	5 years ago
wangchaochaohu	621d3e0b66	fix the bug of profile update (#22207 ) * fix the bug of profile update test=develop	5 years ago
FlyingQianMM	443a713c9e	add backward gradient computation for op argsort (#22203 ) * add backward gradient computation for op argsort test=developo * use pre-commit test=develop	5 years ago
Zhen Wang	46189b166d	Add bn and relu fuse pass (#22048 ) * add bn and relu fuse pass * add op attr assert and dtype assert * fix some inputs&&outputs bugs for the fused op and pattern. * add the unittest for fuse_bn_act_pass. test=develop * use normative enforce statements. test=develop * add the cpu test. test=develop * add the support of batch_size=1 for the bn with relu op. test=develop * add the error type for paddle throws. test=develop * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop	5 years ago
zhouwei25	2f3e2a84af	fix ci rule to show Shell variables (#22177 )	5 years ago
baojun	298ee7d28a	Improve ngraph file line coverage (#22155 )	5 years ago
zhongpu	d0f0a2520c	test Optimizer in dygraph (#21949 ) * test Optimizer in dygraph, test=develop * add optest for Optimizer in dygraph, test=develop * fix adagrad optimizer, test=develop * fix dpsgd optimizer, test=develop * fix test_optimizer.py, test=develop * fix dpsgd optimizer, this op only support cpu, test=develop * add optest for optimizer, test=develop * add description for dpsgd, test=develop * add rmsprop to white_list in unused_var_check.cc, test=develop * polish code style, test=develop * polish code style, test=develop * delete seed attribute for DpsgdOptimizer, test=develop * change testing to debugging, test=develop	5 years ago
石晓伟	ad0dfb17c1	[Feature] Lite subgraph (#22114 )	5 years ago
joanna.wozna.intel	5b2e98aa17	Add multiple quantize operators fuse (#22062 )	5 years ago
Yiqun Liu	96980c2244	Polish the PADDLE_ENFORCE in fusion_group pass related codes. (#22144 ) * Polish the PADDLE_ENFORCE in fusion_group pass related codes. test=develop * Correct the unittest because of the change relu_grad's formula. test=develop	5 years ago
wangchaochaohu	c3876cf82d	add support for nested profiling event and printing in different level (#22061 ) * add support for nested profiling event and printing in different level	5 years ago
Zeng Jinle	c3bcd3c1e2	fix dygraph non zero gpu bug, test=develop (#22165 )	5 years ago
zhaoyuchen2018	3d4f2aa689	Refine stack op to improve xlnet performance, test=develop (#22142 ) stack's wait cost a lot of cpu time, use cuda kernel to do memory copy will reduce cpu time. Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
zhongpu	cf475f95df	Remove FC in dygraph, modify FC to Linear in sample code (#22082 ) * modify fc to linear in sample code, test=develop * remove FC, test=develop * remove warnings, test=develop * drop fluid/imperative/README.md , test=develop * change fc to linear, test=develop * polish code style, test=develop	5 years ago
liu zhengxi	64a4044292	add double register op_data_type of pad2d and fix compile error, test=develop (#22075 )	5 years ago
Liu Xudong	7ba7acd197	Add coverage tools (#21975 ) Add coverage data processing tools.	5 years ago
Double_V	6ea3809143	Support prroi_pool_op with Tensor and LoDTensor rois (#20649 ) 1. Add a new input named batch_roi_nums for prroi_pool_op. batch_roi_nums includes the number of roi for each image in batch when rois is Tensor. This information is saved in rois's lod when rois is LoDTensor. 2. add grad check to prroi_pool_op and solve unnormal X grad diff in CPU.	5 years ago
Pei Yang	d8a9b134e3	fix trt instance_norm serialize bug. test=develop (#22152 )	5 years ago


16700 Commits (24a063f6ac0ba1122b5b6bec524c6ec659197e5f)