* add more unittests for ABI compatibility
* add more unittest
* refine warning style
* support compiling multiple custom ops at the same time
* fix missing paddle import in unittest
* fix typo
* add more unittest
* add comment for details
* Add conv transpose BF16
* Share function GetWeightsTz
* Adjust to review and fix op compatibility
* Add bias to unique handler name
* Remove errors related to paddle enforce
* Add conv2d_transpose to bf16 list and kernel refactor
Dy2stat didn't support tuple as iteration variable in the past. This PR adds three main cases:
1). Non-enumerate case: for var1, var2 in var|var.numpy() will be re-written as:
for FOR_ITER_TUPLE_PREFIX_x in var | var.numpy():
var1 = FOR_ITER_TUPLE_PREFIX_x[0]
var2 = FOR_ITER_TUPLE_PREFIX_x[1]
2). Enumerate out tuple case: for t in enumerate(var|var.numpy()) will be rewritten as:
for FOR_ITER_TUPLE_INDEX_PREFIX_x, FOR_ITER_TUPLE_PREFIX_x in enumerate(var|var.numpy()):
t = (FOR_ITER_TUPLE_INDEX_PREFIX_x, FOR_ITER_TUPLE_PREFIX_x)
3). Enumerate inner tuple case: for i, (var1, (var2, var3)) in enumerate(var|var.numpy()) will
be re-written as:
for i, FOR_ITER_TUPLE_PREFIX_x in enumerate(var | var.numpy()):
var1 = FOR_ITER_TUPLE_PREFIX_x[0]
var2 = FOR_ITER_TUPLE_PREFIX_x[1][0]
var3 = FOR_ITER_TUPLE_PREFIX_x[1][1]
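As an illustration, the non-enumerate rewrite (case 1) preserves the loop semantics. The sketch below uses plain Python tuples rather than Paddle internals; `FOR_ITER_TUPLE_PREFIX_x` stands in for the generated temporary variable:

```python
# Illustrative sketch of the case-1 rewrite (plain Python, not Paddle internals).
var = [(1, 2), (3, 4)]

# Original loop: tuple unpacking directly in the for target.
result_before = []
for var1, var2 in var:
    result_before.append(var1 + var2)

# After the Dy2stat rewrite: iterate a single temporary, then index-unpack.
result_after = []
for FOR_ITER_TUPLE_PREFIX_x in var:
    var1 = FOR_ITER_TUPLE_PREFIX_x[0]
    var2 = FOR_ITER_TUPLE_PREFIX_x[1]
    result_after.append(var1 + var2)
```

Both loops produce the same results, which is what lets the transformer substitute one form for the other.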
* support setup.py to compile custom op
* move file into paddle.utils.cpp_extension
* support python setup.py install
* refine code style
* Enrich code and add unittest
* initial commit: simple demo
* polish copyright format
* add grad op simple demo
* adapt to variable number of arguments
* change trait macro name
* add place & dtype support for add kernel
* add dispatch and infershape func
* polish code & add notes
* add dynamic_loader dep for paddle_framework
* add new custom op test dir
* polish impl details
* add unittest for new custom op
* fix failed unittest
* Custom op (#1)
* fix compile error
* wrap framework tensor with LoDTensor
* fix compile error
* add CustomTensor default constructor
* add size() for CustomTensor
* make size const for CustomTensor
* refactor place related api to circle the concept
* fix compile error
* make place const
* make Tensor copy
* debug CustomTensor core
* remove additional head of framework
* use back to shared ptr for custom tensor
* add gpu test
* merge latest cwh code in
* adjust ut code of custom op
* Remove ShareData from user && Change CustomTensor to Tensor && Support more data type (#2)
* hid share data from and to
* rename CustomTensor to Tensor
* refactor register design & add test
* change op_function to op_meta_info
* split op meta info into .h and .cc
* move get methods into friend class
* move OpMetaInfoHelper into framework space
* move CustomTensorUtils into framework space
* change pybind api name
* move PD C API into op meta info
* add register custom op api
* remove inference cmake change
* refactor copy to api && change Reshape to lowercase && support more dtype && add more test (#3)
* support multi dtype
* remove lod, make reshape lowercase, add copy test and refactor copy api
* fix copy to error
* add more test
* polish detail & error message
* polish test details
* Add cast api && Change copy related api to copy_to && add more test (#4)
* add type cast
* add cast and make copy to api
* merge cwh code
* add more error log
* polish code
* used for test
* remove test comment
* fix uint8 type error
* fix lost uint8 type error
* add test for coverage
* polish details by reviewer comments
* add prefix for DISABLE_COPY_AND_ASSIGN
Co-authored-by: Jiabin Yang <360788950@qq.com>
* support xpu inference with analysis predictor, test=develop
* merge the cmake of the xpu toolchain, test=develop
* add c-apis, test=develop
* fix a bug in extern_xpu, test=develop
* Polish code and api doc
* fix cpp_extension not include in package
* fix relative import
* fix os.makedirs exist_ok param compatibility PY2
* add compile flags in test_jit_load
* rewrite abs op
* rewrite abs op and remove abs in activation
* remove abs register in old codes
* fix abs_grad type error
* fix abs double_grad output name error
* modify abs_grad, abs_grad_grad functor for windows building
* format code style
* fix the bug that the result is NaN when the divisor is zero
* add missing abs attr and add abs for float16
* Avoid a bug on macOS with Python 3.5/3.6.
* Choose the saving method according to the OS.
* smaller length of '_unpack_saved_dict' for macOS.
* add version information of Python.
* Edit comment.
* add view strategy on squeeze, unsqueeze, reshape, flatten
* add squeeze unittest
* add unittests
* use View strategy as name rather than Reuse Allocation
* fix view api doc
* fix format
* use core.ops when input of reshape2 is Tensor
* fix test_cross_entropy_loss error because of reshape2
* add inplace strategy
* add elementwise_add sub
* let backward op not use inplace
* grad op do not use inplace
* fix memory increase error and add leaf error message
* delete selected_rows
* change op_function
* little change
* solve HandleViewBetweenInputAndOutput
* add unittest and leaf error message
* merge view error
* optimize op_function_generator format and support sum inplace op
* fix format of basic_engine
* fix format for framework
* little change of variable wrapper
* add reshape, squeeze, unsqueeze, scatter api
* add relu elu tanh softmax inplace api
* fix test_squeeze_op unittest
* fix test_relu_op unittest
* fix comment problems
* delete sample code of inplace api
* add reference of grad_pending_nodes in basic_engine
* fix unittest name
* add inplace apis into wlist
* fix error message
* add PADDLE_ENFORCE for set grad op twice
* fix head file error
* set expected place in child thread for dataloader
* set device id when set tensor from numpy
* revert tensor_py change
* add compile guard
* fix ci
* fix bug
* Implemented AddQuantDequantPass in imperative quantization.
* Supported LeakyReLU Quantization
* For meeting coverage rate.
* Changed the file name of test of AddQuantDequant
* Implemented more Quantized NoWeightLayers.
* Fix the problem that the loss cannot be aligned between static and dynamic model quantization; add swish as a supported quantized layer in imperative quantization.
* remove noweight_list
* support 2.0 API such as Pool2D and ReLU
* upgrade oneDNN version to 2.0 master branch
* Added workarounds for new oneDNN lib changes
* fix regex
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
* fix bug of using ignore_index and reduction, test=develop
* fix bug of celoss when using ignore_index and reduction, test=develop
* improve performance when ignore_index=-100, test=develop
* add test in test_cross_entropy_loss.py for coverage rate, test=develop
* rm comment in test_cross_entropy_loss.py, test=develop
* del hard code of "float64" in python/paddle/nn/functional/loss.py, test=develop
* change mask to a more simplified implementation, test=develop
* del comment in python/paddle/nn/functional/loss.py, test=develop
* del hard code and change mask to a more simplified implementation, test=develop
* add cast ops before and after unsupported fp16 ops.
* Keep partial net in FP32 pattern.
* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
* Add fp16 support for adam op.
* add multi precision attr for adam.
* Fix the bug of test_multi_precision_fp16_train UT.
* Code format for CI.
* Fix the redefine error about MPTypeTrait on windows.
* fix bugs of the _create_accumulators func in Momentum.
* fix bug when inserting post cast op.
* Add the update_loss_scaling op in allow_set of UnusedVarCheck.
* Update for ci coverage.
* Add some doc for OptimizerWithMixedPrecision.
* Fix the code style.
* Improve the doc of `amp_init`.
* Change for fp16 testing if users have the infer program defined in separate way.
1. When x is a Variable, call nn.shape(x) only in the following cases:
1) The shape of x is used in a control flow condition.
2) The dim to be used is negative.
2. When x is a Variable but x.shape or x.shape[idx] doesn't contain a negative value, don't convert to paddle.shape().
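The decision rule above can be sketched as a small helper. This is a hypothetical illustration, not Paddle's actual implementation; `shape` is assumed to be the statically known shape list, where a negative entry marks an unknown dim:

```python
def should_use_dynamic_shape(shape, idx=None, used_in_control_flow=False):
    """Hypothetical sketch of the rule above; not Paddle's real code.

    shape: statically known shape, where a negative entry means "unknown".
    idx:   the specific dim being accessed, if any.
    """
    if used_in_control_flow:
        # Case 1.1: the shape feeds a control flow condition -> always dynamic.
        return True
    if idx is not None:
        # Case 1.2: only convert when the accessed dim is unknown (negative).
        return shape[idx] < 0
    # Case 2: keep the static shape unless some dim is unknown.
    return any(d < 0 for d in shape)
```

The point of the rule is to keep cheap static shapes whenever they are fully known, and fall back to a runtime shape op only when necessary.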
* change to tensor copy sync
* make copy_to safe when use TensorCopy
* refine code
* add ut
* add cudapinned garbagecollector
* add testcase: cpu place -> cuda pinned place
1. when slice_item is a slice:
1) the start of __getitem__ should be std::max(start, 0)
2) the end of __getitem__ should be std::min(end, dim)
2. when slice_item is an integer, it should be in [-dim_len, dim_len)
3. Fix error message to use accurate data
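A minimal sketch of the clamping rules above, written as a hypothetical Python helper (the actual fix lives in C++; the function name and return shape here are made up for illustration):

```python
def normalize_getitem(item, dim_len):
    """Hypothetical sketch of the __getitem__ rules described above."""
    if isinstance(item, slice):
        # Rule 1.1: clamp the start to at least 0.
        start = 0 if item.start is None else max(item.start, 0)
        # Rule 1.2: clamp the end to at most dim_len.
        end = dim_len if item.stop is None else min(item.stop, dim_len)
        return start, end
    # Rule 2: an integer index must lie in [-dim_len, dim_len).
    if not (-dim_len <= item < dim_len):
        raise IndexError(
            "index %d is out of range for a dim of length %d" % (item, dim_len))
    return item
```

For example, slicing a length-5 dim with `[-3:100]` clamps to `(0, 5)`, and an out-of-range integer index raises with the accurate bounds in the message.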
* Support storage of large parameters
* Reduce the complexity of the unittest
* Reduce the complexity of the unittest, commented out unittest for
* add unittest for static.save/load
* Increase the timeout threshold of 'test_static_save_load'
* Increase the timeout threshold of 'test_static_save_load' and 'test_paddle_save_load'
* dot op support complex types
* matmul support complex types
* add test case
* matmul broadcast gradient support complex
* move conjFunctor to complex_functor.h