Paddle

Commit Graph

Author	SHA1	Message	Date
Zeng Jinle	29337f4e17	fix conflict of inferne partial feed with gpu parallel ssa graph executor, test=develop (#23400 )	5 years ago
Pei Yang	7e439780d9	add full paddle_analysis_config.h APIs. (#23215 )	5 years ago
zhongpu	bfb07aafe8	Revert "Exhaustive search (#22821 )", test=develop (#23401 ) This reverts commit `48144e4099`.	5 years ago
liym27	b7b0b3595b	Add unittest for transformer prediction in dygraph_to_static (#23207 ) * Add unittest for transformer prediction in dygraph_to_static. * fix bug in fill_constant api. * Make transpose support size 0. test=develop	5 years ago
xujiaqi01	93ea9dd27a	fix stat var in hogwild worker (#23367 ) * fix stat var in hogwild worker * test=develop	5 years ago
joanna.wozna.intel	8c463700e1	Add default pass attributes (#23042 )	5 years ago
zhongpu	48144e4099	Exhaustive search (#22821 ) * use global conv cache; test=develop * use singleton cache; test=develop * fix format error; test=develop * add cudnn helper header; test=develop * fix header error; test=develop * fix mac unitest; test=develop * fix mac unitest; test=develop * fix file format; test=develop * fix include file error, test=develop * remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop * fix test_elementwise_mul_op_dim, test=develop Co-authored-by: phlrain <phliuhongyu@126.com>	5 years ago
Adam	da7c73f847	Delete is_test attribute from activation operators (#23318 ) * Delete is_test from activation operators test=develop * Revent unneeded changes test=develop	5 years ago
Kaipeng Deng	21d95be0db	Add inplace abn op (#22806 ) * add inplace_abn_op. test=develop	5 years ago
Yi Liu	821534efd3	add paralell_executor dependancy to collective_helper (#23380 ) test=develop	5 years ago
Zeng Jinle	3a21980b78	add reader dependency pass, test=develop (#23301 )	5 years ago
wangchaochaohu	69e3f99362	refine the error message (#23212 ) * refine the error message of tensor_array_read_write Op	5 years ago
石晓伟	5c59d2139e	reverts the commit 23177, test=develop (#23363 )	5 years ago
wangchaochaohu	d280106007	Add support for attr type Op and add fill_constant Op and scale Op (#23163 ) * add attr support for fusion group and add support for fill_constant and scale Op	5 years ago
xujiaqi01	3a45767d49	add fleet pslib pull and push sparse op and push dense op (#23139 ) * add fleet pslib pull and push sparse op and push dense op * test=develop	5 years ago
songyouwei	99d30bfc36	speedup slice impl (#23340 ) test=develop	5 years ago
Zhaolong Xing	1a6ce8b910	add swish split gelu plugin dynamic support (#23305 ) test=develop	5 years ago
Jacek Czaja	2bb1b0e89e	[DNNL] Added MKL-DNN inplace pass for C-API inference (#23315 )	5 years ago
Yi Liu	0471476a18	fix nccl comm double free bug (#23344 ) As nccl comm is not created by CUDADeviceContext, it should be destroyed by the creator as the best practice of RAII.	5 years ago
wangchaochaohu	1ee2a9a424	Profiler refine (#23294 ) * refine output of profiler for child event	5 years ago
Leo Chen	488b2387e2	Feature/expand params in auto-generated pybind functions for dygraph operators (#23181 ) * expand parameters, test=develop * support resnet, test=develop * fix resnet, test=develop * support duplicable out, test=develop * support ptb * fix bugs, test=develop * support null input, test=develop * fix bugs, test=develop * fix batchNorm is_test, test=develop * refine code, test=develop * follow comments, test=develop * follow comments, test=develop * follow comments, test=develop * follow comments, test=develop	5 years ago
GaoWei8	20eed5401a	Change fluid.layers.where's C++ operator name (#23250 )	5 years ago
Yi Liu	2169e6fb58	Initialize global nccl_comm in PE (#23275 )	5 years ago
Jacek Czaja	012886df79	[DNNL] Softmax mkldnn op inplace support (#23197 )	5 years ago
石晓伟	75ebb48a91	supports thread-binding stream, test=develop (#23177 )	5 years ago
石晓伟	708ded584e	pause the io_utils_test of int64 and resume after repair, test=develop (#23234 )	5 years ago
Zeng Jinle	babda94c8a	Distinguish public/private global vars (#23269 ) * distinguish public/private vars, test=develop * fix windows issues, test=develop	5 years ago
zhaoyuchen2018	58615a6272	Improve elementwise performance. (#23001 ) * Improve elementwise performance. Elementwise performace is poor as walk into CommonGradBroadcastCUDA, add some new kernels for different data pattern. * Add some cuda kernel to speedup common broadcast cases. test=develop * Add more test cases and fix cuda kernel bug. test=develop * Remove tests as cpu percision fails.test=develop * Refine SplitDims, test=develop * Change file mode, test=develop	5 years ago
Wojciech Uss	f836c8aa8f	add check for scales and a message (#23119 )	5 years ago
Zeng Jinle	8bfd62ffb7	Expose dygraph.grad api (#23124 ) * expose dygraph.grad api, test=develop, test=document_fix * add more parameter in dygraph.grad API, test=develop * add only_inputs=True parameter, test=develop * follow comments, test=develop, test=document_fix * fix typo, test=develop, test=document_fix	5 years ago
Wilber	0129f4b568	Add some inference API comments for AnalysisPredictor (#23242 ) * add inference api doc. test=develop	5 years ago
Tao Luo	c00d427d52	simplify the cmake log of ir/CMakeLists.txt (#23262 ) test=develop	5 years ago
Zeng Jinle	77b4dc80c9	code polish for adding const qualifier, test=develop, test=document_fix (#23248 )	5 years ago
Zhaolong Xing	430b0099c9	[Paddle-TRT]: Ernie Dynamic shape support. (#23138 ) * add dynamic plugin support. test=develop * change emb eltwise layernorm to math function test=develop * add emb eltwise layernorm test=develop * can run dynamic shape ernie test=develop * fix ci test=develop * add ut for trt ernie dynamic test=develop * refine dynamic shape c++ interface. test=develop * fix comments test=develop * fix comments test=develop	5 years ago
xujiaqi01	68ea1ad55b	add clear one table (#23089 ) * add clear_one_table * test=develop	5 years ago
danleifeng	ae3bb16d06	add MaskAucCalculator in paddlebox (#23157 ) * add maskauc in paddlebox; test=develop	5 years ago
liym27	6af480ca33	Support int64 for op assign_value. test=develop (#23179 )	5 years ago
Zeng Jinle	53e6f8e1da	rename macro, test=develop (#23161 )	5 years ago
Zeng Jinle	bba740710d	add cuda resource pool for BufferedReader, test=develop (#23152 )	5 years ago
Zeng Jinle	7d8d50b6cc	rename no_need_buffer_vars macro, test=develop (#23160 )	5 years ago
Liufang Sang	a486a739e1	fix compile error in win gpu (#23196 ) * fix compile error in win gpu test=develop * fix compile error in win gpu test=develop * fix compile error in win gpu test=develop	5 years ago
Zeng Jinle	7ca77a90ac	add Tensor::IsSharedBufferWith method, test=develop (#23175 )	5 years ago
Zeng Jinle	b8886bf122	rename no_need_buffer_vars_macro, test=develop (#23159 )	5 years ago
Zeng Jinle	bae5930ba1	fix graph attr copy issues, test=develop (#23191 )	5 years ago
wangchaochaohu	b721e23b25	transpose cudnn using cudnn v7 api (#19738 ) * refine the transopose conv using v7 to choose algorithm	5 years ago
Pei Yang	46b8d282dc	Add some inference API comments for AnalysisConfig (#23117 ) * add some API comments in paddle_analysis_config.h, test=develop * add some API comments in paddle_analysis_config.h, test=develop	5 years ago
Adam	4f5e4540f8	Improve SGD jit code to work with large data (#23120 )	5 years ago
Liufang Sang	4db031902d	add dequantize_log_op and make pyramid hash support int8 weight (#22548 ) * add dequantize_log_op and make pyramid hash support int8 weight test=develop * add unittest and update pyramid hash op test=develop * remove paddle_enforce test=develop * fix error message test=develop * remove incorrent commit test=develop * fix error message in log_dequantize test=develop * change 2019 to 2020 test=develop * remove useless check_grad test=develop	5 years ago
Zeng Jinle	e5fef8f38a	[Dygraph double grad]Code polish (#23121 ) * fix dygraph double grad, test=develop * fix unpack constructor, test=develop	5 years ago
Zeng Jinle	9258e96094	fix read op comments, test=develop, test=document_fix (#23122 )	5 years ago
Zeng Jinle	acfc9b8a70	Reader sequential and inference partial feed (#22699 ) * sequential reader stage 1, test=develop * fix ut, test=develop * fix iterable=False reset bug, add some logs and polish code, test=develop * inference feed partial data, test=develop * Turn on keep_order=True for test, test=develop * enhance ut to test more cases, test=develop * test commit for reverting * Revert "test commit for reverting", test=develop This reverts commit 80aef42ef52ba1ee79627d6f663a624ec4f12f58. * add ut of merged and unmerged results, test=develop * add more uts for coverages and add en doc of api, test=develop * follow comments, test=develop * change note style, test=develop	5 years ago
Wilber	95b356a069	update embedding_eltwise_layernorm fuse and kernel. test=develop (#23114 ) update embedding_eltwise_layernorm fuse pass and fused kernel, to support multi input	5 years ago
Zeng Jinle	a31d7328b7	Add dygraph double grad implementation (#22939 ) * add double grad implementation for dygraph, test=develop * polish code, add uts, test=develop * fix place bug, test=develop * polish codes, add more uts for coverages, test=develop * add no_grad_set, test=develop * add star gan ut, test=develop * follow comments, test=develop	5 years ago
Yiqun Liu	3af4771122	Add the detection and code-generation of sqrt and square in fusion_group (#23095 )	5 years ago
hutuxian	0c30098f8b	Add need_save_delta parameter to solve OOM (#23097 )	5 years ago
songyouwei	2e2da7124b	high-performance dygraph slice (#22879 ) * move __getitem__ to cpp * bug fix * add type check and gil release * support negative step with omitted ends test=develop * code refine test=develop * bug fix test=develop * slice always return different pyobj test=develop	5 years ago
Sylwester Fraczek	abee05a8c8	added mkldnn swish activation (#23041 )	5 years ago
Zhaolong Xing	8c6fde9e69	fix align error (#23090 ) test=develop	5 years ago
Liufang Sang	915b892a15	Fix div zero in fake quantize op (#22966 ) * fix div zero test=develop * fix div zero test=develop * add hostdevice function test=develop * add eps when is zero test=develop	5 years ago
Yi Liu	121b2aed4d	initialize global nccl context in dygraph (#23037 ) initialize global nccl context in dygraph test=develop	5 years ago
Zhang Ting	880eb04d93	skip PrepareData when it is unnecessary (#22839 ) * remove unnecessary prepare data, test=develop * Op in while block will not skip PrepareData, test=develop	5 years ago
Feiyu Chan	01ab8a0619	add approximation for gelu, test=develop (#22961 ) add approximation for gelu, default value is False (only kernel with eigen is added, remove code for computing gelu with MKLDNN temporarily)	5 years ago
Adam	5842ae6785	Revert "Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695 )" (#22985 )	5 years ago
Pei Yang	24db750386	fix trt int8 calib precision bug. test=develop (#23036 )	5 years ago
GaoWei8	1dc1f9270e	Fix lod error of concat op for axis = 0 (#22538 )	5 years ago
yaoxuefeng	660ff18488	fix datsset test=develop (#23043 )	5 years ago
Zhang Ting	714b0076b6	Override GetKernelTypeForVar to avoid device transform, test=develop (#23032 )	5 years ago
wangchaochaohu	112e3edbf6	fix the conv group problem test=develop (#23025 )	5 years ago
Wilber	db40ee86db	fix unittets. test=develop (#23018 )	5 years ago
wangchaochaohu	99db0cf762	remove debug log test=develop (#22994 )	5 years ago
wangchaochaohu	3757e0687c	Add Unittest for backward of fusion group (#22932 ) * add fusion group test for backward and refine code	5 years ago
chengjuntao	63f3ada7b9	fix bug which input shape (#22965 ) * fix bug which input shape, test=develop * add error type,test=develop	5 years ago
Zhang Ting	137d6563fc	add check for assigned data, test=develop (#22960 )	5 years ago
wangchaochaohu	f0d193a23c	Cast fusion for fusion group (#22876 ) * add support for expression type convert and add cast Op support in fusion group	5 years ago
yaoxuefeng	29a7a52d38	Fix instag (#22632 ) * update * update test=develop * update compile set test=develop * update compile set test=develop * update test=develop * update test=develop * update test=develop * update compile setting test=develop * update compile setting test=develop * update run demo test=develop * update test=develop * update test=develop * fix test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update format test=develop * update format test=develop * update style test=develop * update style test=develop * change style test=develop * change style test=develop * change style test=develop * add dataset unittest test=develop * update test=develop * update for record test=develop * udpate style for record test=develop * update for record test=develop * update for record test=develop * update for record test=develop * fix format test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * fix compile warning test=develop * add attr default test=develop * add unittest test=develop * fix style test=develop * fix style test=develop * change out_val_ifempty to out_val_if_empty test=develop	5 years ago
wangchaochaohu	c979c9f2b0	refine the profiler print test=develop (#22968 )	5 years ago
Wilber	ff3ddbb502	add skip_layernorm pass. test=develop (#22895 ) * add skip_layernorm pass. test=develop	5 years ago
wawltor	f154d5860f	Speed up the matmul op, use the gemm replace the batch gemm (#22926 ) In the op of gemm, we use the gemm to replace batch gemm, speed up the matmul op	5 years ago
Adam	056edf3929	Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695 )	5 years ago
Zhaolong Xing	8d6dc102fe	[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494 ) * 1. add embedding eltwise layernorm fuse 2. add embedding eltwise layernorm op 3. refine inplace_add_relu 4. refine fc_eltwise_layernorm test=develop * 1. refine fc test=develop * fix comments test=develop * fix comments test=develop	5 years ago
guofei	3d8571e884	modify assign op and add unittest of assign op (#22769 ) As the title.	5 years ago
Zeng Jinle	d33c4343e1	Imperative tracer refactoring (#22457 ) * refine grad maker, test=develop * refactor tracer stage 1, test=develop * merge develop to solve conflict third times, test=develop	5 years ago
liu zhengxi	61fef9754b	Fix fc padding bug during inference fusion (#22860 ) * fix fc padding during fusion, test=develop * fix optim model inference after SaveOptimModel, test=develop	5 years ago
tangwei12	ad9c8f6d2d	fix communicator when break under pyreder mode (#22911 ) * fix communicator when breaking under PyReader mode, test=develop * revert some vlog level to 0, test=develop	5 years ago
mapingshuo	5ba9dfc16a	add lookup_table_dequant_op (#22900 ) add lookup_table_dequant_op	5 years ago
zhaoyuchen2018	a020a25797	Fix model int8 quant fail, test=develop (#22891 ) As model fails when enable int8 quant, so disable allocate memory in cpu for small variable.	5 years ago
Zhaolong Xing	dd67d44a50	[Paddle-TRT] : (Part1) Dynamic shape support (#22868 ) * change the ci trt from version 5. to 6.0 * paddle-trt dynamic shape support init * conv+bias or conv+bn dynamic shape support test=develop * modity trt engine opconvert test=develop * fix ci error test=develop	5 years ago
tangwei12	07e13b84cd	remove vlog, test=develop (#22898 )	5 years ago
Zhang Ting	ca9c8b417d	fix compute ratio of profile, test=develop (#22872 )	5 years ago
wangchaochaohu	dbb0b9b3b6	refine the profiler print (#22823 ) * refine the profiler print test=develop	5 years ago
Michał Gallus	0038bfbd1d	Prevent loading of warmup data in analyzer_int8 if enable_int8 is set to false (#22857 )	5 years ago
Chen Weihang	1644926a6c	Polish detail implement of dygraph data loader (#22878 ) * polish detail implement of data loader, test=develop * solve coverage ci problem, test=develop	5 years ago
Wilber	f686310d81	fix concat_mkldnn op. test=develop (#22692 ) fix concat_mkldnn op when encounter extreame conditions.	5 years ago
hong	5191e54494	reduce default attrs for dynamic graph (#22850 ) * reduce default attrs for dynamic graph, test=develop * add some explanations for explicit attr, test=develop * tweak explicit attr comments, test=develop	5 years ago
Zhaolong Xing	1a533ed2de	[BUG]: Multihead matmul op's ouput size should be BxSx(N*H) (#22848 ) test=develop	5 years ago
hong	c736fef93b	dygraph backward engine accelerate (#22808 ) * fix loaded program load bug; test=develop * first version * speed backward engin; test=develop * remove useless code; test=develop * reconvery io.py; test=develop * remove useless code; test=develop * remove useless code; test=develop	5 years ago
Zeng Jinle	d41d802ba3	Add flags to limit gpu memory (#22793 ) * add recorded cuda memory apis, fix typo, test=develop * add more ut, test=develop * follow comments, test=develop * fix py35 incompatible issues, test=develop	5 years ago
石晓伟	1861ca88f1	serialize the PaddleTensor, test=develop (#22810 ) * encapsulate the PaddleTensorToLoDTensor, test=develop * serialize the pd_tensor, test=develop * serialize tensors to file, test=develop	5 years ago
Zhang Ting	72ff5a09c3	fix print bug of profile, test=develop (#22804 )	5 years ago
Zhang Ting	4e8bc02461	add fluid.device_guard to specify the device type for Op (#22254 ) * add fluid.device_guard to specify the device type for Op	5 years ago
石晓伟	ddb9b46fec	change the function in op_teller, test=develop (#22794 ) * change the function in op_teller, test=develop * correct the commit-id, test=develop	5 years ago
Zhen Wang	89cfa49156	Unmerged fetch list (#22635 ) * update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results. * add the unit test for fetch_unmerged. * update ut for multi-card and multi-cpu. * add the error message and the user suggestion in FetchOpHandle. test=develop	5 years ago
wangchaochaohu	8456c3f4dd	polish the profiler_help code (#22811 )	5 years ago
zhongpu	2fd1ec1e3e	fix docker build for paddle openblas, test=develop (#22795 )	5 years ago
Chen Weihang	7d8d573453	Speed up dygraph DataLoader based on shared memory and LoDTensor serialization (#22541 ) * add lodtensor share memory & serialization, test=develop * fix windows compile error, test=develop * deal vartype pickle & fix unittest matching error message, test=develop * update timeout variable name, test=develop * refactor memory map implement, test=develop * clear mmap file discripter when exit unexpectedly, test=develop * remove the child process fd in advance, test=develop * remove mmap fds after Queue.put in child process, test=develop * add hard unittests for register exit func, test=develop * fix python2 compatibility problem in unittest, test=develop * fix exception unittest error, test=develop * polish code based review comment, test=develop	5 years ago
liu zhengxi	324f2b3922	Fix inference c api PD_GetZeroCopyOutput lod (#22768 ) * fix inference c api lod, test=develop * fix capi lod problem and enrich tests, test=develop * delete useless header files and alter const_cast, test=develop	5 years ago
wangchaochaohu	7578fcbac4	Profile code refine (#22800 ) * add profiler_help.h to refine the code test=develop	5 years ago
hutuxian	53a2b68f4e	support customized download command in dataset (#22782 ) * user can call dataset.set_download_cmd to set its customized download cmd * add UT to cover this scenario	5 years ago
wangchaochaohu	ca9e77a8d4	add sum op support for fusion group (#22771 ) * Add the codegen and auto fusion for sum Op in fusion group	5 years ago
tianshuo78520a	433cef03e5	fix typo word (#22784 )	5 years ago
Kaipeng Deng	ebc7ffc300	fix detection_map. test=develop (#22705 )	5 years ago
zhaoyuchen2018	72dde4abde	Refine adam op to improve performance, test=develop (#22346 ) * Refine adam op, test=develop * Fuse kernels together to reduce cpu time. * Refine paddle enforce, test=develop * Remove some comments, test=develop * Refine code,test=develop * Refine cuda kernel, test=develop * Refine code according to comments, test=develop	5 years ago
wangguanzhong	f2d1cd119a	fix lod level, test=develop (#22755 )	5 years ago
FlyingQianMM	79d712346f	Correct CPU gradients of the argsort op (#22739 ) * Correct CPU gradients of the argsort op, form a network to test its forward and backward process, test=develop * fix dynamic threshold error in test_argsort_op, test=develop	5 years ago
Adam	2b80e9a719	Add cpu_info without XBYAK (#22716 )	5 years ago
guofei	ae8b5f11a3	Change ShareDataWith() to TensorCopy() in ref_by_trainer_id (#22717 ) As the title	5 years ago
liu zhengxi	71ab0458e1	Fix pointer and c-api encapsulation (#22663 ) * refine pointer and c-api prototype, test=develop * fix new c api profile bug, test=develop * add unit tests, test=develop	5 years ago
Leo Chen	b2c1be851a	support cond in clone, test=develop (#22657 ) * support cond in clone, test=develop * refine code, test=develop * refine code, test=develop * follow comments, test=develop * refine code, test=develop	5 years ago
Zhang Ting	f97f3f9301	add framework overhead ratio in profile report (#22590 ) * add framework overhead ratio, test=develop * print GpuMemcpy overhead, test=develop	5 years ago
zhouwei25	160d0f1308	fix the CI risk that network cannot be connected (#22736 )	5 years ago
chengjuntao	15c2667143	register fp16 for assign op (#22744 ) * register fp16 for assign op, test=develop * add op test for fp16, test=develop	5 years ago
zhangchunle	882e7f7c3b	Directly getting API.spec for tools/sampcd_processor.py (#22728 )	5 years ago
dyning	1c0653462d	fix generate_mask_labels lod level (#22743 )	5 years ago
GaoWei8	ba140222d6	fix compile&runtime lod_equality of lod_reset (#22737 )	5 years ago
hutuxian	175954d894	PaddleBox Framework Part2 (#22466 ) * Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator. * Add a config for DynamicAdjustChannelNum function to denote whether we will discard the remaining instances when they are not be distributed evenly. * Remove CPU code in Pull/PushSparse and we will add it back when testing it fully. * Fix some known issues: such as copying persistable vars after one epoch running.	5 years ago
ShenLiang	3132681e8a	add partial_sum op in contrib (#22292 ) * add partial_sum_op, test=develop * modify the Paddle Error Message, test=develop * modify the Paddle Error Message, test=develop * modify the bug for python3, test=develop * modify the ut for ci, test=develop * mv to contrib, test=develop * use check_variable_and_dtype, test=develop * fix ci, test=develop * fix conflict, test=dvelop * add partial concat, test=develop * fix the conflict, test=develop * fix the error, test=develop * rm SSE4, test=develop	5 years ago
wangchaochaohu	611411b90e	Fusion group profile support (#22718 ) * add support for the driver api callback and fix the profiler name show bug	5 years ago
ShenLiang	e136661304	add partial_concat op in contrib (#22528 ) * add partial_concat, test=develop * fix the grids and blocks, test=develop * fix the Paddle_Enforce, test=develop * fix the doc of op, test=develop * fix the doc, test=develop * fix the doc of the op, test=develop * replace -1 with None, test=develop	5 years ago
GaoWei8	cdf5f6fb8c	Add an inference interface to disable FC padding (#22097 ) * Add an interface of disabling FC padding * fix bert regression * polish fc padding interface * recover pass function * fix argument error * fix mkldnn error	5 years ago
tianshuo78520a	d2ba91aad1	fix typo words (#22653 )	5 years ago
Yibing Liu	6e7bfe30a6	register fp16 kernel for some ops (#22650 ) (#22696 ) test=develop	5 years ago
tangwei12	66a3150135	SYNC with communicaotor (#22344 ) * add sync communicator and implement	5 years ago
Yiqun Liu	22bbd54719	Add the support of fp16 in fusion_group (#22239 )	5 years ago
flame	d97475d53b	fix CPU C inference API compile bug (#22702 )	5 years ago
Huihuang Zheng	adfa5b8354	Add PADDLE_ENFORCE to Check Sequence Length of RecurrentOp (#22673 ) 1. Add PADDLE_ENFORCE to Check Sequence Length of RecurrentOp. 2. Also enrich PADDLE_ENFORCE error messages.	5 years ago
flame	74eb82de19	fix go api bug (#22669 )	5 years ago
wangchaochaohu	a089072c8b	fix the profile print error (#22665 ) * fix the profile print error test=develop	5 years ago
lidanqing	d926214535	[UT coverage] improve the mul_mkldnn_op line coverage (#22408 ) * improve the mul_mkldnn_op line coverage test=develop * remove fp32 mul mkldnn kernel test=develop * locally refactoring test=develop * change according to reviews test=develop	5 years ago
wangchaochaohu	c65c6ae534	add flag to control profile level in python API (#22319 ) * add python flag to control profile level test=develop	5 years ago
123malin	00594c1c88	support dumping params/grads in transpiler mode (#22490 )	5 years ago
Zhaolong Xing	a06d75a280	[Paddle-TRT] Refine the error log about runtime batch and max_batch_size. (#22535 ) * fix trt log test=develop * fix comments test=develop	5 years ago
Adam	608447bfd5	Update MKLDNN to v1.2 (#22521 )	5 years ago
Adam	ab610a34ff	transpose_mkldnn code change to meet Paddle standards (#22591 )	5 years ago
Jiawei Wang	8f035fb637	Add TopK Op Grad CPU&GPU Kernel test=develop (#22628 ) * Add TopK Op Grad CPU&GPU Kernel test=develop * Add TopK Op Grad, modify grad op maker test=develop * Add TopK Op Grad, modify grad op maker test=develop * Add TopK Op Grad, modify PADDLE_ENFORCE test=develop * Add TopK Op Grad, modify PADDLE_THROW test=develop * Add TopK Op Grad, modify unittest test=develop * fix ngraph top k op unittest test=develop	5 years ago
Steffy-zxf	90ee366653	update ops's unittest data type from float32 to float64 and shape over 100 (#22544 ) * update ops's unittest of elementwise_pow, elementwise_max, elementwise_min, scale and sqrt 1. update elementwise_pow, elementwise_max and scale's unitests with input data type (float32 -> float64) 2. fix bug that the elementwise_pow doesn't meet threshold requirements with tackling float64 data 3. remove sqrt from op_accuracy_white_list.py 4. update the unittests of elementwise_pow, elementwise_max and elementwise_min ops that their input data shape over 100 5. test=develop * modify the writing style according suggestions test=develop	5 years ago
flame	f7eafca828	remove python inference warning (#22602 )	5 years ago
Chen Weihang	fe685cc185	fix enforce test error, test=develop (#22610 )	5 years ago
Wilber	9a8203aa25	fix fc_lstm_fuse when multi sub-graph use same fc_bias. test=develop (#22551 ) When a model contains multiple fc_lstm subgraphs whose fc layers share the same persistable bias, the bias node should not be deleted; only the non-persistable nodes should be removed.	5 years ago
Chen Weihang	266106da75	Fix mismatch with plus sign in the line (#22588 ) * reproduce match error, test=develop, test=document_fix * fix mismatch error, test=develop, test=document_fix	5 years ago
flame	1d503e6a9e	Golang inference API (#22503 ) * support golang inference	5 years ago

16792 Commits (54d3b5a1ebdb11686d60c024a4fc82c1e2927709)