Paddle

Author	SHA1	Message	Date
liym27	0fff930667	Fix bug for set_value op when input dtype is not float32 (#31411 )	4 years ago
jakpiase	5b4f8aac82	Added LSTM BF16 and fixed GRU BF16 (#31234 )	4 years ago
Qi Li	7cdf6ea770	[ROCM] update fluid elementwise op for rocm (part10), test=develop (#31361 ) * [ROCM] update fluid elementwise op for rocm (part10), test=develop * update, test=develop * address review comments, test=develop	4 years ago
Qi Li	84639b6193	[ROCM] update fluid operators for rocm (part3), test=develop (#31213 ) * [ROCM] update fluid operators for rocm (part3), test=develop * fix clang format error, test=develop	4 years ago
Qi Li	3b9db17199	[ROCM] update fluid operators for rocm (part7), test=develop (#31307 )	4 years ago
Qi Li	db50fb6766	[ROCM] fix softmax with loss and update python scripts, test=develop (#31373 )	4 years ago
Pei Yang	32211fe9c4	TRT conv2d converter support SAME padding (#31379 )	4 years ago
Qi Li	e312a1ff6e	[ROCM] update fluid operators for rocm (part9), test=develop (#31338 )	4 years ago
Qi Li	6626c6a6ad	fix bert cu file compiler error, test=develop (#31389 )	4 years ago
Zhou Wei	13e4280f82	[Custom OP]polish doc of custom OP (#31369 )	4 years ago
Qi Li	946dbdae8c	[ROCM] update fluid operators for rocm (part6), test=develop (#31301 )	4 years ago
Shang Zhizhou	77c44e2f1b	change prelu plugin to tensorRT layer (#30210 )	4 years ago
Qi Li	59940cb383	[ROCM] update fluid operators for rocm (part8), test=develop (#31309 )	4 years ago
tangwei12	5d7a8b05f8	fix sycn training error (#31357 ) * fix sycn training error Change-Id: Ie2feebcf0b5b2984fd59cfcdde0c817840e203d2	4 years ago
Qi Li	ec72f5b235	fix ELU output for nan, test=develop (#31132 )	4 years ago
Qi Li	65bcaeb004	[ROCM] update fluid operators for rocm (part5), test=develop (#31258 ) * [ROCM] update fluid operators for rocm (part5), test=develop * address review comments, test=develop * fix typo, test=develop	4 years ago
YUNSHEN XIE	2111d912d4	Decrease threshold for failed ut retry (#30903 ) * Decrease threshold for failed ut retry * retry Method upgrade * second method upgrade * fix error * Remove the comment lines * test for modified_retry_times * fix error * fix some error * fix error * fix error * remove test content * fix error * Reduce duplicate code * fix more than 10 ut failed bug * fix more than 10 ut failed bug on mac	4 years ago
Pei Yang	2e9e3fad15	add n-d input support for trt scale converter (#31316 ) * add n-d input support for trt scale converter * add flatten for ut * fix dims	4 years ago
Shang Zhizhou	6404c43814	support trt serialize when load model from memory (#31342 ) * support trt serialize when load model from memory * delete conv_bn_fuse_pass before tensorrt, with which trt serialize engine id is not stable * Revert "delete conv_bn_fuse_pass before tensorrt, with which trt serialize engine id is not stable" performance degradation, fix in the future This reverts commit fa6cd17e60b15df351efda379ddd00e9e9c1fea9. * add delete conv_bn * delete path when delete_cache_files	4 years ago
Gradie	d79fdc3d62	lamb_op_xpu;test=kunlun (#31012 ) * lamb_op_xpu;test=kunlun * modify lamb_op_xpu.cc;test=kunlun * delete atol lamb_op_xpu; test=kunlun * update xpu.cmake;test=kunlun * test_error 1e-5,lamb_op_xpu;test=kunlun * error1e-5,lamb_op_xpu,test=kunlun * delete atol lamb_xpu;test=kunlun * modify atol,lamb_op_xpy;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu, XPUOptest;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu,modify xpu_cmake; test=kunlun * lamb_op_xpu;test=kunlun * lamb_op_xpu,modify xpucmake;test=kunlun	4 years ago
danleifeng	d1075df2e8	topo and memory performance for heterps (#30440 ) * topo and memory performance for heterps; test=develop * add trainwithprofiler in heter trainier; test=develop	4 years ago
Qi Li	72d99c5dcd	[ROCM] update fluid operators for rocm (part4), test=develop (#31225 )	4 years ago
cucuzg	91635de390	opt matmul and matmul_v2 on kunlun, test=kunlun (#31326 ) add clip_by_norm on kunlun, test=kunlun opt matmul and matmul_v2 on kunlun, *test=kunlun	4 years ago
Wilber	e20234094c	Fix xpu compile and cipher symbol problem. (#31271 )	4 years ago
wuhuanzhou	30858d8974	fix compilation errors for missing brpc header files, test=develop (#31325 )	4 years ago
石晓伟	625482f752	inference modification for custom operator, test=develop (#31312 )	4 years ago
wuhuanzhou	a13f1d6930	optimize unity build (#31119 ) * optimize unity build, test=develop * fix compilation error on Windows, test=develop * fix compilation error, test=develop * fix code style error, test=develop	4 years ago
jiangcheng	8f4ac6b525	optimize topk op through limit SortTopK kernel entrance, test=develop (#30403 )	4 years ago
alncat	bfb8a64234	updated conv bn fuse pass to make it compatible with latest batch_norm op (#31272 )	4 years ago
Chen Weihang	5610c1717e	fix dtype unmatched (#31305 )	4 years ago
Qi Li	9b016c7cb7	[ROCM] update fluid operators for rocm (part2), test=develop (#31211 )	4 years ago
niuliling123	2fd999d979	Optimized the adaptive_avg_pool2d op when output_size == 1 (#31197 ) * Optimized the adaptive_avg_pool2d op when output_size == 1	4 years ago
石晓伟	1da3280660	inference modification for custom operator, test=develop (#31283 )	4 years ago
Zhou Wei	af9066e89c	[Custom OP]add PD_THROW and PD_CHECK for User Error message (#31253 ) * [Custom OP]add PD_THROW and PD_CHECK for User error message * PD_THROW and PD_CHECK, fix comment * fix Windows error message * fix Windows error message * fix CI	4 years ago
石晓伟	8c94d8cb4c	[Custom OP] change the user header file format, test=develop (#31274 )	4 years ago
Jiabin Yang	038ce70d69	[Custom OP] Support stream set on Custom Op (#31257 )	4 years ago
Jiabin Yang	0c38708a90	[Custom Op] Remove unsupport dtypes (#31232 ) * remove remove_unsupport_dtype * remove remove_unsupport_dtype * remove test dtype * add more include * change dtype.h's enum as enum class to avoid conflict with inference lib * make enum as enum class * remove additional test * merge develop * polish code	4 years ago
WangXi	b8bce682e0	xpu support fuse allreduce (#31104 )	4 years ago
Chen Weihang	126633c50f	[CustomOp] Split build op marco & polish details (#31229 ) * split build op marco & polish details * revert register api del * fix other unittest	4 years ago
tangwei12	903235945b	loglevel adjustment for distributed training (#31205 ) Change-Id: I6210ce9c60bed48f3323c47b16500302b66cedf2	4 years ago
Qi Li	28b356b9a2	[ROCM] update fluid framework for rocm (part6), test=develop (#31015 )	4 years ago
Qi Li	c8fac5ee30	[ROCM] update fluid framework for rocm (part5), test=develop (#31014 )	4 years ago
Qi Li	580447d019	[ROCM] update fluid framework for rocm (part4), test=develop (#31013 )	4 years ago
Wilber	7d91974c91	enable lite ut. (#30890 )	4 years ago
Guanghua Yu	d18c5e47f3	fix ignore_index check in softmax_with_cross_entropy (#31201 )	4 years ago
chentianyu03	ca3b6bcf78	add cache for VariableWrapper (#30880 ) * add cache for VariableWrapper * modify args names and vlog level * format code style * add log when set cache to variable_wrapper * add log when set cache to variable_wrapper * add comment to variableWrapper cache * format code style	4 years ago
wangchaochaohu	f114c3f8ca	fix the branch of code choose (#31200 )	4 years ago
joanna.wozna.intel	d11602481c	Add bf16 gru model test (#31158 )	4 years ago
jakpiase	2f1165342b	OneDNN hardswish integration (#30211 )	4 years ago
Chen Weihang	e8cdb49aa9	[CustomOp] Support attributes as func input in custom op (#31128 ) * add simple attr support and test * add int, float attr support * support other attribute * add custom attrs test in cmake * polish details * fix test failed * add backward test * update test flags	4 years ago
Zhou Wei	ffbf71359a	modify custom op dependent from paddle_framework to paddle_custom_op (#31195 )	4 years ago
Leo Chen	0f1fde5102	fix the modification of set_expected_place (#31177 ) * revert the modification of set_expected_place * set device before op run * add ut	4 years ago
lilong12	dc8dfba35b	align the default value of some configuration for fleet to that of single cards (#30740 ) * update, test=develop	4 years ago
lilong12	a373aa7645	fix the bug in expand_v2 op (#30984 ) * update, test=develop	4 years ago
Thunderbrook	c4f279fe8d	support multi node in heterps (#31102 ) * push multi node * multi node * MultiThread * remove log * solve bug in 30829	4 years ago
liu zhengxi	ae2be49f40	Add cublas_handle() to expose cublas_handle to ops (#31157 ) * add get_cublas_handle() api * update format * add unittests * alter function name	4 years ago
Pei Yang	00b09e86ac	[Paddle-TRT] support group_norm (#31040 ) * add group norm plugin * fix compile problems * move concat axis check to trt op teller * add nbDims for scale and bias nv dims * add group norm unit test * fix unittest * add trt version restriction for group norm op teller * fix unittest	4 years ago
Chen Weihang	1ce96fa118	[CustomOp] Add new paddle custom op so (#31141 ) * add new custom op so * fix use new method error * fix test failed	4 years ago
tangwei12	ebbdf52557	fix entry (#31079 ) * fix entry * fix distributed lookup table fuse case * fix entry bug at first time * move entry from paddle.fluid -> paddle.distributed * fix ut with paddle.enable_static() Co-authored-by: malin10 <malin10@baidu.com>	4 years ago
Qi Li	ee76ea72de	[ROCM] update fluid collective op for rocm, test=develop (#31075 )	4 years ago
yaoxuefeng	d8fa65a3a8	fix heter compile (#30518 )	4 years ago
Zhou Wei	4b220550ef	[Custom OP]Fix problem of custom op unitests on Windows CI (#31114 ) * fix some problem of Windows custom op * fix some problem of Windows custom op * fix some problem of Windows custom op	4 years ago
Zhou Wei	be61c2d06b	support build whl and inference library nightly,test=windows3 (#30616 )	4 years ago
alncat	5d6a8c7b73	added support for fake_quantize_dequantize_abs_max op in quantization… (#30896 ) * added support for fake_quantize_dequantize_abs_max op in quantization inference pass * remove const_cast to pass ci * remove compare operator to pass ci-coverage * added detailed error message for unregistered tensorrt_subgrah_pass	4 years ago
Jacek Czaja	d3f09ad702	Update of onednn to 2.2 (#31067 )	4 years ago
Guanghua Yu	24ba5ee05c	merge develop conflict (#31122 )	4 years ago
Qi Li	cced930b61	[ROCM] update fluid operators for rocm (part1), test=develop (#31077 )	4 years ago
wangchaochaohu	364cfa2686	fix windows for optimization of elementwise_add Op (#31068 ) * fix windows for optimization of elementwise_add Op	4 years ago
joanna.wozna.intel	781df300d0	Unification of BF16 enablement process (#31034 ) * Unification of bfloat16 enablement process and refactor * Remove unnecessary function * Standardize the output name search	4 years ago
Zhong Hui	16fe11d71e	fix softmax cross entropy integer overflow (#30590 ) [BUG FIX] Fix softmax cross entropy overflow problem.	4 years ago
Zhou Wei	44ee251fde	fix UNIX cmake problem (#31113 )	4 years ago
Qi Li	a60d93fb77	[ROCM] update fluid framework for rocm (part2), test=develop (#31010 )	4 years ago
Thunderbrook	565354f676	support save multi sparse table in one path (#31108 ) * save multi table one path * format	4 years ago
Qi Li	50967135a5	[ROCM] update fluid framework for rocm (part3), test=develop (#31011 )	4 years ago
Qi Li	8fe09faf14	[ROCM] update fluid framework for rocm (part1), test=develop (#31009 )	4 years ago
Qi Li	334296306c	[ROCM] update fluid platform for rocm39 (part4), test=develop (#30936 )	4 years ago
Shang Zhizhou	a5c56d83a1	update trt int8 calibrator to IEntropyCalibratorV2 (#31060 ) * update trt int8 calibrator to IEntropyCalibratorV2 * add delele opt_cache for trt_split_converter_test	4 years ago
Zhou Wei	adaec0073d	[2.0Custom OP]Support New Custom OP on Windows (#31063 ) * [2.0.1]Support New Custom OP on windows * fix CI * fix code style * fix CI * fix CI * fix coverage * fix CI * fix CI	4 years ago
Qi Li	1d996637e6	[ROCM] update fluid imperative for rocm (part1), test=develop (#31017 ) * [ROCM] update fluid imperative for rocm (part1), test=develop * [ROCM] update reducer.cc after merge, test=develop * update reducer cmake after merge, test=develop	4 years ago
JamesLim	b95eb38b8a	fix the bug in backward OP of index_sample. (#31026 )	4 years ago
Chengmo	6b3371e0c7	Remove PE special profiler (#30886 ) * remove pe special profiler * add profiler info	4 years ago
Chen Weihang	6beeafe797	[CustomOp] Add more dispatch marco for users (#31058 ) * add more dispatch marco * add more dispatch marco * add more tests * revert unneeded change * add timeout for test dispatch * add float and complex test * remove and marco	4 years ago
TTerror	d5323dab41	add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel executor;optimize lookup_table op (#31056 ) * add squeeze_op/unsqueeze_op on kunlun; fix conv op and parallel executor on kunlun; optimize lookup_table op on kunlun * update squeeze/unsqueeze op	4 years ago
123malin	16b4260b2f	test=develop, save/load, shrink (#30625 ) * test=develop, save/load, shrink Co-authored-by: seiriosPlus <tangwei12@baidu.com>	4 years ago
Jiabin Yang	628451af06	hide useless headers and add complex support (#31074 )	4 years ago
Wilber	463eae0383	update paddle_fluid.so to paddle_inference.so (#30850 ) * update paddle_fluid.so to paddle_inference.so	4 years ago
liym27	5b367dab44	[static setitem] Support the index is Tensor; step>1; step<0 .(#30949 ) * [static setitem] support the index step > 1. tensor_a[::3] = value * [static setitem] support the index step < 0. Eg: tensor_a[::-3] = value * [static setitem] support the index is Tensor. eg: tensor_a[tensor_3:0:-1] = value * Add op version.	4 years ago
Qi Li	eb3050fa9a	[ROCM] update fluid inference for rocm (part1), test=develop (#31018 )	4 years ago
Jacek Czaja	f7465641c3	Added reshape grad bf16 (#31035 ) * - added Reshape grad bf16 * - Added reshape grad bf16 * - cosmetics in py	4 years ago
Wojciech Uss	615d8a2264	Modify relu native implementation 2 (#30996 ) * Modify relu native implementation * fix GPU performance	4 years ago
ShenLiang	9401173e3a	Remove scale loss before reduce in dygraph (#30807 )	4 years ago
Wilber	0020d91506	fix python pass builder error. (#30946 )	4 years ago
Wilber	39aeaa160e	fix jetson problem (#30939 )	4 years ago
Wilber	01ccfbcde9	update trt error message when input height or width is -1 (#31019 )	4 years ago
Wilber	cf8b8f9c5e	resolve memory leak in cudnn8.0 (#31029 )	4 years ago
Guanghua Yu	5b267474a9	add offset parameter in roi_align,generate_proposals.etc ops (#30864 ) * add parameter in roi_align op	4 years ago
Chen Weihang	75f81233ae	fix regex error & simplify marco name (#31031 )	4 years ago
Zhang Ting	f0ee159280	enable exhaustive_search for forward and backward algos when dtype is float16 (#30959 ) * enable exhaustive_search for input_grad when dtype is float16 * enable exhaustive_search for forward algos	4 years ago
Pei Yang	9b54fe4154	add trt transpose and flatten converter (#31022 )	4 years ago
joanna.wozna.intel	caf9d39839	Add Conv Transpose BF16 (#30877 ) * Add conv transpose BF16 * Share function GetWeightsTz * Adjust to review and fix op compatibility * Add bias to unique handler name * Remove errors related to paddle enforce * Add conv2d_transpose to bf16 list and kernel refator	4 years ago

18504 Commits (test_benchmark_ci)