Paddle

Commit Graph

Author	SHA1	Message	Date
Jiabin Yang	0c38708a90	[Custom Op] Remove unsupport dtypes (#31232 ) * remove remove_unsupport_dtype * remove remove_unsupport_dtype * remove test dtype * add more include * change dtype.h's enum as enum class to avoid conflict with inference lib * make enum as enum class * remove additional test * merge develop * polish code	4 years ago
WangXi	b8bce682e0	xpu support fuse allreduce (#31104 )	4 years ago
Aurelius84	59b00e8c45	[CustomOP]Support Incremental compilation and Add Version management (#31228 ) * Support Incremental compilation and Add Version management * replace hash with hashlib	4 years ago
Chen Weihang	126633c50f	[CustomOp] Split build op macro & polish details (#31229 ) * split build op macro & polish details * revert register api del * fix other unittest	4 years ago
Aurelius84	e8d24b546a	[CustomOp] Add Modeling with Custom op unittest (#31218 ) * add unittest for static/dygraph/dy2stat * add PE unittest * remove useless code * add unittest in CMakeLists.txt	4 years ago
littletomatodonkey	ad50fa710b	add int pad support for Pad1D/2D/3D (#31209 ) * add int pad support for Pad1D/2D/3D * fix type * fix format	4 years ago
jakpiase	2f1165342b	OneDNN hardswish integration (#30211 )	4 years ago
Aurelius84	912022fa0c	[CustomOp]Add cpp_extension en doc (#31187 ) * add cpp_extension en doc * remove cuda_cflags and add optional in doc * refine style * fix indent problem * add default None	4 years ago
Chen Weihang	e8cdb49aa9	[CustomOp] Support attributes as func input in custom op (#31128 ) * add simple attr support and test * add int, float attr support * support other attribute * add custom attrs test in cmake * polish details * fix test failed * add backward test * update test flags	4 years ago
Zhou Wei	ffbf71359a	modify custom op dependent from paddle_framework to paddle_custom_op (#31195 )	4 years ago
lilong12	dc8dfba35b	align the default value of some configuration for fleet to that of single cards (#30740 ) * update, test=develop	4 years ago
lilong12	a373aa7645	fix the bug in expand_v2 op (#30984 ) * update, test=develop	4 years ago
Thunderbrook	c4f279fe8d	support multi node in heterps (#31102 ) * push multi node * multi node * MultiThread * remove log * solve bug in 30829	4 years ago
Aurelius84	406f4a7513	[CustomOp] Support specifying extra_cflags and extra_cuda_flags independently (#31059 ) * split cxx/nvcc compile flags * enhance input argument check * rename extra_cflags into extrac_cxx_flags * add name checking in setup * fix test_dispatch failed * fix word typo and rm useless import statement * refine import statement * fix unittest failed * fix cuda flags error	4 years ago
qingqing01	572cc8bd0f	Update doc for 2.0 API and some callback (#31180 ) test=document_fix	4 years ago
Pei Yang	00b09e86ac	[Paddle-TRT] support group_norm (#31040 ) * add group norm plugin * fix compile problems * move concat axis check to trt op teller * add nbDims for scale and bias nv dims * add group norm unit test * fix unittest * add trt version restriction for group norm op teller * fix unittest	4 years ago
Chen Weihang	c209751c8d	change test_multiprocess_reader_exception cmake (#31174 )	4 years ago
YUNSHEN XIE	153121457f	fix ut timeout (#31061 )	4 years ago
Chen Weihang	1ce96fa118	[CustomOp] Add new paddle custom op so (#31141 ) * add new custom op so * fix use new method error * fix test failed	4 years ago
tangwei12	ebbdf52557	fix entry (#31079 ) * fix entry * fix distributed lookup table fuse case * fix entry bug at first time * move entry from paddle.fluid -> paddle.distributed * fix ut with paddle.enable_static() Co-authored-by: malin10 <malin10@baidu.com>	4 years ago
Aurelius84	dce2db4857	[CustomOp] Split build directory for each setup.py (#31124 ) * split build directory for each setup.py * fix template string	4 years ago
Zhou Wei	4b220550ef	[Custom OP]Fix problem of custom op unitests on Windows CI (#31114 ) * fix some problem of Windows custom op * fix some problem of Windows custom op * fix some problem of Windows custom op	4 years ago
chentianyu03	70131b475f	add warning message when dtypes of operator are not same (#31136 ) * add error msg when dtypes of operator are not same * add error msg when dtypes of operator are not same * change error msg to warning msg when dtypes of operator are not same * modify test case to fit for python2	4 years ago
Chen Weihang	e60fd1f6a8	[CustomOp] Split test and add inference test (#31078 ) * split test & add inference test * add timeout config * change to setup install * change to jit compile * add verbose for test * fix load setup name repeat * polish details * resolve conflict * fix code format error	4 years ago
xiemoyuan	edacb6293c	Optimization of Transformer API (#30957 ) * Support 'bool' and 'int' for attention mask. * Update docs. * Add unittest for Transformer. * fix bugs.	4 years ago
WeiXin	ee1801c1ad	Save load/save pickle protocol (#31044 ) * add default argument for paddle.save/static.save * edit documentation of * Add comments for special processing for protocol=2 and protocol=3. * Update python/paddle/fluid/io.py Co-authored-by: lanxianghit <47554610+lanxianghit@users.noreply.github.com> Co-authored-by: lanxianghit <47554610+lanxianghit@users.noreply.github.com>	4 years ago
yukavio	99fd9815b6	fix flops api (#31081 ) * remove PrettyTable dependence from paddle.flops * fix bug in python2.7 * fix flops * fix flops * fix bug * fix bug	4 years ago
Zhou Wei	44ee251fde	fix UNIX cmake problem (#31113 )	4 years ago
Thunderbrook	565354f676	support save multi sparse table in one path (#31108 ) * save multi table one path * format	4 years ago
Huihuang Zheng	cf43a321a8	[Dy2stat] Refactoring tensor_shape_transformer.py to Fix Change after Assign Bug (#31082 ) Problem In our old shape transformer logic, if the user writes: ``` s = tensor.shape ... y = paddle.some_api(s) ``` Dy2stat will change it to ``` ... y = paddle.some_api(convert_var_shape(tensor)) ``` However, this causes a fatal bug if the user changes the shape of `tensor` after the assignment. For example: ``` s = tensor.shape ... tensor = paddle.some_change_shape_api(tensor) ... y = paddle.some_api(s) ``` Then Dy2stat gets the wrong result because the code is translated into: ``` tensor = paddle.some_change_shape_api(tensor) ... y = paddle.some_api(convert_var_shape(tensor)) # tensor shape has been changed, not origin `s` value ``` Solution Logic This cannot be solved in the old logic, so I refactored the tensor_shape_transformer logic. Now we use `s` to store the shape attribute and generate a variable `s__STATIC_CONVERT_VAR_SHAPE_SUFFIX` to store the result of the static shape API `shape(tensor)`: ``` s = tensor.shape ... y = paddle.some_api(s) ``` Dy2stat will change it to ``` s = tensor.shape s__STATIC_CONVERT_VAR_SHAPE_SUFFIX = shape(tensor) ... y = paddle.some_api(choose_shape_attr_or_api(s, s__STATIC_CONVERT_VAR_SHAPE_SUFFIX )) ``` In this case, the code is consistent with the original dygraph meaning, and the change-after-assign bug is fixed. Code Key Note To help reviewers: the key change of this PR is changing `self.name_to_var_shape` from "mapping name to shape node" to "mapping name to its STATIC_CONVERT_VAR_SHAPE_SUFFIX name"; then, if a variable name has the SUFFIX, we can choose to use the shape attribute or the shape API. Other changes follow from the key change. Consideration The issue with this PR is that we store an extra static `shape` API result. Will it harm the speed of Dy2stat? In some cases it will, but we argue that the benefit is greater than the cost. 1. The extra call to the static `shape` API happens when the coder assigns among shape variables. Take the following dygraph code as an instance: ``` s1 = tensor.shape s2 = s1 s3 = s2 ... ``` Here we call the extra static `shape` API again and again; however, users seldom write code like this. 2. If the shape variable is used a lot, for example: ``` s = tensor.shape y1 = paddle.some_api1(s) y2 = paddle.some_api2(s) y3 = paddle.some_api3(s) ``` our old logic creates 3 shape APIs, but now just 1. This is a more common user code pattern. In fact, if reviewers look at the current unit tests in this PR, they can see that the op numbers decrease after this PR. So we argue that this PR can also improve speed for this code pattern.	4 years ago
tangwei12	0e4b154298	fix dist fleet ctr ut (#31087 ) * fix dist fleet ctr ut Change-Id: I59bf5123c7bd47bd0e8f1ca2a26295257597c0f5 * fix dist fleet ctr ut Change-Id: Iafcdd172364be47fe67b753774ce09af050bcbce * Update CMakeLists.txt	4 years ago
Zhou Wei	adaec0073d	[2.0Custom OP]Support New Custom OP on Windows (#31063 ) * [2.0.1]Support New Custom OP on windows * fix CI * fix code style * fix CI * fix CI * fix coverage * fix CI * fix CI	4 years ago
Chen Weihang	2168f08ac8	add optional for param attr args, test=document_fix (#31105 )	4 years ago
Chen Weihang	6beeafe797	[CustomOp] Add more dispatch macro for users (#31058 ) * add more dispatch macro * add more dispatch macro * add more tests * revert unneeded change * add timeout for test dispatch * add float and complex test * remove and macro	4 years ago
TTerror	d5323dab41	add squeeze_op/unsqueeze_op on kunlun;fix conv op and parallel executor;optimize lookup_table op (#31056 ) * add squeeze_op/unsqueeze_op on kunlun; fix conv op and parallel executor on kunlun; optimize lookup_table op on kunlun * update squeeze/unsqueeze op	4 years ago
123malin	16b4260b2f	test=develop, save/load, shrink (#30625 ) * test=develop, save/load, shrink Co-authored-by: seiriosPlus <tangwei12@baidu.com>	4 years ago
Shibo Tao	4424aac608	export paddle.static.normalize_program method. (#31072 ) * export paddle.static.normalize_program method. test=develop * fix ut coverage.test=develop	4 years ago
liym27	5b367dab44	[static setitem] Support the index is Tensor; step>1; step<0 .(#30949 ) * [static setitem] support the index step > 1. tensor_a[::3] = value * [static setitem] support the index step < 0. Eg: tensor_a[::-3] = value * [static setitem] support the index is Tensor. eg: tensor_a[tensor_3:0:-1] = value * Add op version.	4 years ago
Jack Zhou	6df1ca54c8	add detail about states index in rnn result, test=document_fix (#31048 )	4 years ago
Huihuang Zheng	ef627ac5b9	Fix that convert_var_shape doesn't support slice like [0:], test=develop (#31051 ) As the title, when slice_node like 1:3 being passed to idx of convert_var_shape, it will cause syntax error because a function cannot take this as argument. This PR fixed it.	4 years ago
Jacek Czaja	f7465641c3	Added reshape grad bf16 (#31035 ) * - added Reshape grad bf16 * - Added reshape grad bf16 * - cosmetics in py	4 years ago
Aurelius84	4dbe16c48f	[CustomOp] Refine name argument in setup (#31049 ) * refine setup name usage * fix unittest failed	4 years ago
Aurelius84	f2dc29a9fa	[CustomOp] Support output dtypes in generated Python API (#31045 )	4 years ago
ShenLiang	9401173e3a	Remove scale loss before reduce in dygraph (#30807 )	4 years ago
Kaipeng Deng	c4ddc3ab0d	fix dataloader collate return list mix tensor and numpy array (#30904 ) * fix dataloader collate return list mix tensor and numpy array. test=develop	4 years ago
Guanghua Yu	5b267474a9	add offset parameter in roi_align,generate_proposals.etc ops (#30864 ) * add parameter in roi_align op	4 years ago
Chen Weihang	75f81233ae	fix regex error & simplify macro name (#31031 )	4 years ago
Pei Yang	9b54fe4154	add trt transpose and flatten converter (#31022 )	4 years ago
Aurelius84	4c9f96c902	[CustomOp] Support Compile multi ops at same time (#30920 ) * add more unittest for ABI compatibility * add more unittest * refine warning style * support compile multi custom ops in same time * fix not import paddle in unittest * fix typo * add more unittest * add comment for details	4 years ago
joanna.wozna.intel	caf9d39839	Add Conv Transpose BF16 (#30877 ) * Add conv transpose BF16 * Share function GetWeightsTz * Adjust to review and fix op compatibility * Add bias to unique handler name * Remove errors related to paddle enforce * Add conv2d_transpose to bf16 list and kernel refactor	4 years ago
Huihuang Zheng	cbbe127483	Refine fake_interface Error Message (#30981 ) Refine fake_interface Error Message	4 years ago
Huihuang Zheng	c137578341	Add Support for Tuple in for Loop (#30998 ) Dy2stat didn't support tuple as iteration variable in the past. This PR added three main cases: 1). Non-enumerate case: for var1, var2 in var|var.numpy() will be rewritten as: for FOR_ITER_TUPLE_PREFIX_x in var | var.numpy(): var1 = FOR_ITER_TUPLE_PREFIX_x[0] var2 = FOR_ITER_TUPLE_PREFIX_x[1] 2). Enumerate out tuple case: for t in enumerate(var|var.numpy) will be rewritten as: for FOR_ITER_TUPLE_INDEX_PREFIX_x, FOR_ITER_TUPLE_PREFIX_x in enumerate(var|var.numpy): t = (FOR_ITER_TUPLE_INDEX_PREFIX_x, FOR_ITER_TUPLE_PREFIX_x) 3). Enumerate inner tuple case: for i, (var1, (var2, var3)) in enumerate(var|var.numpy()) will be rewritten as: for i, FOR_ITER_TUPLE_PREFIX_x in var | var.numpy(): var1 = FOR_ITER_TUPLE_PREFIX_x[0] var2 = FOR_ITER_TUPLE_PREFIX_x[1][0] var3 = FOR_ITER_TUPLE_PREFIX_x[1][1]	4 years ago
Wojciech Uss	2497f4392f	Handle missing symlink method on Windows (#31006 )	4 years ago
Aurelius84	5653c3a488	[CustomOp] Check Compiler ABI compatibility (#30869 ) * support setup.py to compile custom op * move file into paddle.utils.cpp_extension * support python setup.py install * refine code style * Enrich code and add unittest	4 years ago
huangjun12	20e300e2df	fix lrn bug in reshape size, test=develop (#30968 )	4 years ago
WeiXin	8ab29f4bea	delay timeout of unittest 'test_static_save_load'. (#30975 )	4 years ago
Chen Weihang	f649442ddd	New custom operator extension mechanism (#30690 ) * initial commit: simple demo * polish copyright format * add grap op simple demo * adapt uncertain number of argument * change trait macro name * add place & dtype support for add kernel * add dispatch and infershape func * polish code & add notes * add dynamic_loader dep for paddle_framework * add new custom op test dir * polish impl details * add unittest for new custom op * fix failed unittest * Custom op (#1) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * make place const * make Tensor copy * debug CustomTensor core * remove additional head of framework * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * Remove ShareData from user && Change CustomTensor to Tensor && Support more data type (#2) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * make place const * make Tensor copy * debug CustomTensor core * remove additional head of framework * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * hid share data from and to * rename CustomTensor to Tensor * refactor register design & add test * change op_funtion to op_meta_info * split op meta info into .h and .cc * move get methods into friend class * move OpMetaInfoHelper into framework space * move CustomTensorUtils into framework space * change pybind api name * move PD C API into op meta info * add register custom op api * remove inference cmake change * refactor copy to api && change Reshape to lowercase && support more dtype && add more test (#3) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * make place const * make Tensor copy * debug CustomTensor core * remove additional head of framework * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * hid share data from and to * rename CustomTensor to Tensor * support multi dtype * remove lod, make reshape lowercase, add copy test and refactor copy api * fix copy to error * add more test * polish detail & error message * polish test details * Add cast api && Change copy related api to copy_to && add more test (#4) * fix compile error * wrap framework tensor with LoDTensor * fix compile error * add CustomTensor default constructor * add size() for CustomTensor * make size const for CustomTensor * refactor place related api to circle the concept * fix compile error * make place const * make Tensor copy * debug CustomTensor core * remove additional head of framework * use back to shared ptr for custom tensor * add gpu test * merge latest cwh code in * adjust ut code of custom op * hid share data from and to * rename CustomTensor to Tensor * support multi dtype * remove lod, make reshape lowercase, add copy test and refactor copy api * fix copy to error * add more test * add type cast * add cast and make copy to api * merge cwh code * add more error log * polish code * used for test * remove test comment * fix uint8 type error * fix lost uint8 type error * add test for coverage * polish details by reviewer comments * add prefix for DISABLE_COPY_AND_ASSIGN Co-authored-by: Jiabin Yang <360788950@qq.com>	4 years ago
chajchaj	f5ca2db2cc	support label with float input of cross_entropy, test=develop (#30929 ) * support label with float input of cross_entropy, test=develop * fix code style in nn/functional/loss.py, test=develop	4 years ago
Huihuang Zheng	8e72e031fc	Update gast requirement, test=develop (#30932 ) The gast version can conflict with other software that users have installed, so we set the required version to be higher than 0.3.3	4 years ago
Chen Weihang	010f2caa23	try to fix reader and signal test failed (#30960 )	4 years ago
liym27	12c15bebe4	[Static setitem] Support index is ellipsis for setitem in static mode (#30836 )	4 years ago
liuyuhui	87197f8c2e	[kunlun]fix sync in multi kunlun xpu dygraph training. (#30943 )	4 years ago
wanghuancoder	823f499a8a	fix a bug of Sequential::__getitem__ (#30899 ) * fix a bug of Sequential::__getitem__, test=develop * add testcase, test=develop	4 years ago
Jacek Czaja	9e527d9956	[oneDNN] Added basic changes for elementwise_add_grad bf16 (#30925 )	4 years ago
liuyuhui	4a8b8b4547	[Kunlun] add gen_bkcl_id_op, support multi XPU cards training using multiprocess (#30858 )	4 years ago
wanghuancoder	90d92111cf	let LayerList could add [None], test=develop (#30911 )	4 years ago
taixiurong	24873f4f77	dyngraph (#30892 )	4 years ago
Zhen Wang	71acde9afc	Use correct master weights in AdamW. (#30895 ) * Use correct master weights in AdamW. * Just modify the master weight. * Update for CI Coverage.	4 years ago
Jacek Czaja	abfa822650	[oneDNN]Extended adaptive pooling support for oneDNN pool kernel (#30757 )	4 years ago
Zhang Ting	e97905c5fa	improve performance of momentum (#30881 )	4 years ago
cucuzg	ac2e2e6b7f	add clip_by_norm on kunlun, *test=kunlun (#30862 )	4 years ago
Kaipeng Deng	302427170f	remove numpy array check in single-process dataloader. test=develop (#30861 )	4 years ago
wawltor	b7560a59ab	fix the broadcast for the large second input (#30818 ) fix the broadcast for the large second input	4 years ago
JamesLim	6e1e036a75	Implement cuda kernel for index_sample. (#30380 )	4 years ago
AshburnLee	666efc2336	Call new cudnn batch norm API regardless of data type and data layout (#30157 )	4 years ago
石晓伟	2ac4143b6c	support xpu with analysis predictor, test=develop (#30832 ) * support xpu inference with analysis predictor, test=develop * merge the cmake of the xpu toolchain, test=develop * add c-apis, test=develop * fix a bug in extern_xpu, test=develop	4 years ago
joejiong	05d2b7a37f	Update paddle.static.Print with paddle2.0 api (#30846 ) As the title	4 years ago
Aurelius84	e49d0746dd	[CustomOp] Support install as Package and Add load interface (#30798 ) * support setup.py to compile custom op * move file into paddle.utils.cpp_extension * support python setup.py install * refine code style * Enrich code and add unittest * Polish code and api doc * fix cpp_extension not include in package * fix relative import * fix os.makedirs exist_ok param compatibility PY2 * add compile flags in test_jit_load	4 years ago
Adam Osewski	4f066e316e	Layer normalization fuse pass. (#30721 )	4 years ago
WangXi	b1026f64af	【kunlun】dygraph supports multi xpu card training (#30671 )	4 years ago
LielinJiang	3a3ff75c52	Fix unittest random failed of test_datasets (#30804 ) * fix test_datasets unittest	4 years ago
Shang Zhizhou	b909450994	fix trt plugin clone and initialize bugs in TRT7.1+ (#30709 ) * fix trt plugin clone and initialize bugs * fix unit test error * enable trt in ci py3 * update unittest timeout	4 years ago
Shang Zhizhou	200ee33df8	fix unittest random error (#30808 )	4 years ago
xiemoyuan	db87087283	Optimize the encoder of Transformer. (#30439 ) * Add cache for Transformer encoder. * Bug fixed. * add unittests for transformer encoder.	4 years ago
WangXi	31ed9c9eed	Fleet distributed strategy support pure fp16 (#30754 )	4 years ago
Aurelius84	2c974cc316	【CustomOp】support setup.py to compile custom op (#30753 )	4 years ago
Jiaqi Liu	65a9744cfd	fix paddle.static.acc and auc sample code bug, test=document_fix (#30715 )	4 years ago
Wojciech Uss	fc00240575	A fix for oneDNN matmul kernel. Fixes issue #30309 (#30723 )	4 years ago
tianshuo78520a	a12b6bb9cb	add readme in whl package (#30726 )	4 years ago
WeiXin	3491acfb1e	Split unittest. (#30727 )	4 years ago
liu zhengxi	a87d78f1a9	update gather_tree doc (#30693 ) * update gather_tree doc, test=document_fix * update sample code, test=document_fix * remove tensor type, test=document_fix	4 years ago
liu zhengxi	fef3654b4e	upgrade gather_tree to core.ops (#30697 ) * upgrade gather_tree to core.ops * update gather_tree unittests	4 years ago
jakpiase	f8da5536ed	REUPLOAD Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30719 ) * added external reorder to profiler * resolved conflict * added enable_static * initial version of lstm, not working yet * added lstm to operators.cmake * added vanilla lstm mkldnn op * added peephole weights integration * minor changes * added formatting * added fusion_lstm_mkldnn to static_whitelist * added formatting * removed comment * moved use_peepholes attribute inside is_cached block * reverted wrong changes * minor formatting change * minor changes * changed stream handling * minor change * added datatype to GetExpectedKernelType() * added reading stream from TLS	4 years ago
liym27	13ef444fa6	[Dy2Stat] Fix error message when the message has more than one lines. (#30714 )	4 years ago
Tao Luo	824a79d383	Revert "Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30661 )" (#30708 ) This reverts commit `d834f4e6e8`.	4 years ago
jakpiase	d834f4e6e8	Added vanilla LSTM and LSTM with peepholes oneDNN fp32 kernel (#30661 ) * added external reorder to profiler * resolved conflict * added enable_static * initial version of lstm, not working yet * added lstm to operators.cmake * added vanilla lstm mkldnn op * added peephole weights integration * minor changes * added formatting * added fusion_lstm_mkldnn to static_whitelist * added formatting * removed comment * moved use_peepholes attribute inside is_cached block * reverted wrong changes * minor formatting change * minor changes	4 years ago
Leo Chen	1a13626f5f	polish printing dtype (#30682 ) * polish printing dtype * fix special case	4 years ago
WangXi	a28a202603	fix test_gen_nccl_id_op failed (#30686 )	4 years ago
123malin	164275704d	test=develop, fix nonzero astuple=true (#30647 )	4 years ago
yingshengBD	0eea5d714f	post quantize support insert fake_quantize_dequantize node before the OPs that will be used in VIS's faceid models (#30659 ) test=develop	4 years ago
123malin	06a3e31148	test=develop, fix test_lookahead (#30677 ) * test=develop, fix test_lookahead	4 years ago
yukavio	8c5f158172	remove PrettyTable dependence from paddle.flops (#30675 )	4 years ago
chentianyu03	fb7fbc7a5d	fix abs bug and add abs test case (#30637 ) * add abs test case * use std::abs to fix abs bug * fix the abs bug * fix abs bug	4 years ago
ShenLiang	9514b4aa5f	Fix scatter grad bug (#30604 )	4 years ago
Qi Li	1f5841c2a0	[ROCM] update cmake and dockerfile, test=develop (#30598 )	4 years ago
Zhen Wang	4a9de931a2	Fix the bug in fleet amp_init. (#30606 ) * Fix the bug in fleet amp_init. * Fix the amp_init unit test.	4 years ago
cnn	7e9f336b58	update document of paddle.vision.dataset, test=document (#30414 ) * update document of paddle.vision.dataset, test=document * update document of paddle.vision.dataset, test=document	4 years ago
guofei	430f8449f1	Fix the error of save_quantized_model (#30583 ) * Fix the error of save_quantized_model	4 years ago
TTerror	10271ddfc4	support reduce_max op on kunlun (#30581 ) * support reduce_max op on kunlun * support reduce_max op on kunlun * support reduce_max op on kunlun * support reduce_max op on kunlun	4 years ago
WeiXin	ca33821475	Extend the timeout of unit test 'test_static_save_load' (#30599 ) * delay the 'timeout' of 'test_static_save_load'. * delay the 'timeout' of 'test_static_save_load'.	4 years ago
chentianyu03	358106fcb0	make abs op support complex types (#30375 ) * rewrite abs op * rewrite abs op and remove abs in activation * remove abs register in old codes * fix abs_grad type error * fix abs double_grad output name error * modify abs_grad, abs_grad_grad functor for windows building * format code style * fix the bug of result is nan when the divisor is zero * add missing abs attr and add abs for float16	4 years ago
huangxu96	138620084c	Add fleet amp_init() (#30572 ) * add fleet amp.init() * add unittest for fleet_amp_init	4 years ago
wanghuancoder	27a5c0cff6	fix layers train eval bug (#30580 ) * delete empty line of pybing.cc, test=develop * fix layers train eval bug, test=develop	4 years ago
lilong12	8126a41d73	fix the bug of all_reduce pipeline gradient multiple times (#30437 ) * update, test=develop	4 years ago
Aurelius84	621bc4f771	[Dy2static]Fix paddle prefix in is_paddle_api (#30569 ) * add paddle. * add unittest	4 years ago
tangwei12	c9e78a22c5	add trainers for pserver (#30523 ) * add trainers for pserver Change-Id: I1a75793ec81ce126d07f4c47cae09b95d530bbc8	4 years ago
Aurelius84	5067e3a8d2	[Dy2Static]Enhance check of TracedLayers out vars (#30576 )	4 years ago
liym27	ff25c5b36f	Fix bug: GetAttrValue should deal with attr with attrType vector<double> (#30536 )	4 years ago
WangXi	572c466d19	[Prepare for MultiProcess xpu] unified gen nccl id, refine imperative reducer (#30455 )	4 years ago
ykkk2333	549855ac20	add rmsprop_op_xpu test=kunlun (#30493 ) * add rmsprop_op_xpu test=kunlun * modified rmsprop_op_xpu error code. test=kunlun	4 years ago
Leo Chen	7043b8cfc6	support layer_norm fp16 in dygraph amp (#30430 ) * support layer_norm fp16 in dygraph amp * add ut * refine code	4 years ago
Zhang Ting	66c514ce83	[2.0 API] device guard (#30307 ) * add 2.0 API: device_guard	4 years ago
WangXi	7a0a576e51	fix adamw lr_to_coeff is fixed when dygraph (#30526 )	4 years ago
cc	ce6777fcdf	Fix bug of supporting channelwise dygraph quantized model, test=develop (#30531 )	4 years ago
WeiXin	c0fb03a0dc	Supplement PR29988(https://github.com/PaddlePaddle/Paddle/pull/29988 ) (#30507 )	4 years ago
hutuxian	9fec1618d2	Ascend Framework Part3: Ascend Parser (#30391 )	4 years ago
hutuxian	40ede12631	Ascend Framework Part1: OP & Wrapper (#30281 )	4 years ago
Zhang Ting	34bf8dfc40	avoid calling cast twice (#30527 )	4 years ago
gongweibao	bdae7ed326	Fix potential port conflicts. (#30508 ) Fix potential port conflicts	4 years ago
QingshuChen	8489d4f76f	optimize batch_norm & pool op for kunlun (#30490 )	4 years ago
taixiurong	5e5c2827a3	fix range op crash in dygraph xpu place (#30469 )	4 years ago
WeiXin	18ecd433f5	Avoid bug on 'MAC python3.5/6'. (#30485 ) * Avoid bug on 'MAC python3.5/6'. * Choose the saving method according to the OS. * smaller length of '_unpack_saved_dict' for MAC OS. * add version information of Python. * Edit comment.	4 years ago
JZ-LIANG	16ba0abc79	Recompute Offload: fixed bug in memcpy (#30484 )	4 years ago
lijianshe02	d8a9ba56ef	fix random seed in nll_loss unittest test=develop (#30468 )	4 years ago
cc	5d8d463cf7	Collect weight threshold for lstm op in post_training_quantization (#28701 ) * Collect weight threshold of lstm, test=develop	4 years ago
guofei	11e78ebaa3	Modify the calculation logic of LambOptimizer (#29313 ) * Modify the calculation logic of LambOptimizer	4 years ago
LielinJiang	1d7bf1de2b	Update voc dataset url (#30450 ) * update voc url	4 years ago
pangyoki	13d757362c	Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103 ) * add view strategy on squeeze,unsqueeze,reshape,flatten * add squeeze unittest * add unittests * use View strategy as name rather than Reuse Allacation * fix view api doc * fix format * use core.ops when input of reshape2 is Tensor * fix test_cross_entropy_loss error because of reshape2 * fix test_cross_entropy_loss error because of reshape2 * add inplace strategy * add elementwise_add sub * let backward op not use inplace * grad op do not use inplace * fix memory increase error and add leaf error message * delete selected_rows * change op_function * little change * solve HandleViewBetweenInputAndOutput * add unittest and leaf error message * merge view error * optimize op_function_generator format and support sum inplace op * fix format of basic_engine * fix format for framework * little change of variable wrapper * add reshape, squeeze, unsqueeze, scatter api * add relu elu tanh softmax inplace api * fix test_squeeze_op unittest * fix test_relu_op unittest * fix comment problems * delete sample code of inplace api * add reference of grad_pending_nodes in basic_engine * fix unittest name * add inplace apis into wlist * fix error message * add PADDLE_ENFORCE for set grad op twice * fix head file error	4 years ago
WeiXin	e5bb4edb2c	perfect 'var_list' of static.load/fluid.load (#30457 )	4 years ago
123malin	05f06d9ae1	test=develop, fix fleet.metric (#30438 ) * test=develop, fix fleet.metrics(mse, rmse, mae)	4 years ago
taixiurong	6a3c8725b0	support transformer v2.0 (#30381 )	4 years ago
Zhou Wei	c94a4b9468	Separate AVX and NO_AVX compilation, enhance installation error message (#30413 )	4 years ago
Jiaqi Liu	e395bcd1e0	add auc into 'all' list (#30310 ) * add auc into 'all' list * alias acc, expose to users * update sample code	4 years ago
Chengmo	859431aadb	fix ps init(#30397 ) Co-authored-by: seiriosPlus <tangwei12@baidu.com>	4 years ago
123malin	2a98e9323a	test=develop, add distributed_infer (#30300 ) * test=develop, add distributed_infer	4 years ago
Wilber	96784ed6c8	fix compile error on ARM (#30398 )	4 years ago
Chen Weihang	ae1f32091a	fix prune input bug (#30384 )	4 years ago
WeiXin	5ff4f1ad5e	move 'load_op_library','LayerHelper' to 'paddle/incubate' (#30339 )	4 years ago
Huihuang Zheng	cd5f11b822	Decrease Batch Size for Windows CI, test=develop (#30331 ) As the title	4 years ago
cc	8e3a294045	skip quantizing ops in cpu inference (#30342 ) * skip quantizing ops in cpu inference, test=develop	4 years ago


12587 Commits (73a6fa3ed0fe2bbbfe72c05f42faabccd3bbadb7)