Paddle

Commit Graph

Author	SHA1	Message	Date
liym27	b10ecd9d3a	[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows (#29267 )	5 years ago
Chen Weihang	9ad800ebb2	Support type promote for basic math ops (quantum required) (#29265 ) * basic impl of type promote * add comment & another testcase * fix complex bugs & support python op promote type * fix failed unittests & polish code * add unittest for coverage * change to only promote complex type * polish code details * polish several comments	5 years ago
Zhen Wang	be3777a50a	Add pure fp16 training with master weights. (#27712 ) * add the weight decay func for the momentum op * Add the multi_precision function in Momentum Optimizer. * Make sure that the initial value of master weights are same with the fp16 weights. * add static loss scaling. * add the rescale_grad function in the pure fp16 training. * use the original momentum updating method. * Polish some codes, such as variable names. * add docstring for apis. * update the var creation details of _create_master_weight. * not modify codes about imperative momentum updating. * Fix the error of test_dist_sparse_tensor_load_momentum UT. * add unit test for multi precision fp16 training. * add more unit tests for CI. * Use lower threshold values for allclose comparing in test_multi_precision_fp16_train UT. * For CI Coverage Checking.	5 years ago
chentianyu03	8f45d14263	add complex64 and complex128 type; add +-/@ and slice opreator for c… (#29199 ) add complex64 and complex128 type; add +-/@ and slice opreator for complex types add test cases for complex elementwise, matmul and getitem unittest * add test cases for complex types * add test cases for complex matmul unittest	5 years ago
Zhou Wei	c0a991c874	accumulate gradient for leaf tensor with previous graph and expose leaf tensor concept (#28429 ) * The leaf tensor concept is exposed and the gradient accumulation of leaf tensor * The leaf tensor concept is exposed and the gradient accumulation of leaf tensor * fix coverage * fix api doc * fix CI unittest * fix CI unittest * fix unitest * empty tensor does’t need inner_var_ * fix some error message	5 years ago
liym27	865a45984f	Check whether there is any inplace operation affecting gradient calculation. (#27901 ) * Add a class TensorInplaceVersion to count the inplace version and put it in framework::Tensor instead of Allocation or Variable. * Add a new attribute `_inplace_version` for VarBase. * Raise exception if an inplace operation can result in incorrect gradient computation. * Add a new interface _bump_inplace_version() for VarBase to bump the version whenever the Tensor is modified through an inplace operation. * For api assign, call _bump_inplace_version() when it's an inplace operation inn dynamic mode. * Use original var_wrapper if the inplace_version is not changed. * Replace SnapshotVarWrapperList with SnapshotVarWrapper to optimize performane.	5 years ago
ShenLiang	e2d01eb650	Support dynamic graph distributed (#28997 ) * add reducer * refine envent for memorycopy * add concat&split for allreduce * apply concat & split for fuse tensor * fix nccl dep * fix the untest, compile problem and ddp initialize problem * fix untest for mac & add some comments & solve the repeated param in sublayers * fix untest for windows & fix document	5 years ago
Leo Chen	770395cb93	Split train_mode and has_grad for tracer (#29064 ) * split train_mode and has_grad * fix format * fix ci problems * fix sample code	5 years ago
Zhou Wei	8ca0a8a859	fix tensor detach to zero copy (#27921 ) * fix tensor detach to zero copy * fix tensor detach to zero copy	5 years ago
Chen Weihang	768dab441e	polish two api doc detail, test=document_fix (#28971 )	5 years ago
gongweibao	1dad8ceaab	Fix gpu memory allocation bug. (#28703 )	5 years ago
Zhou Wei	3b0dd5f620	fix bug that to_tensor not support paddle.Place (#28717 )	5 years ago
Leo Chen	3d09929b1f	Add check for non-dispensable input (#28666 ) * Add check for non-dispensable input * fix typo	5 years ago
Zhou Wei	bf6e7cba7a	updata 2.0 API english doc (#28525 ) * make Numpy version is below 1.19.3 * fix 2.0 doc	5 years ago
Wilber	1bf4836580	[Inference] Add TryShrinkMemory interface. (#28409 )	5 years ago
石晓伟	c41fd033e5	check op_version_registry in CI test, test=develop (#28402 )	5 years ago
Leo Chen	8b2436a776	Add broadcast_shape api (#28257 ) * add broadcast_shape api * add ut * follow comments * add example code, test=dodument_fix * update example code, test=document_fix	5 years ago
石晓伟	21a63f6f90	enhance the op_version_registry, test=develop (#28347 ) * enhance the op_version_registry, test=develop * add unittests, test=develop * enhance the op_version_registry, test=develop * fix bugs, test=develop * revert pybind_boost_headers.h, test=develop * fix a attribute bug, test=develop	5 years ago
Shang Zhizhou	ea851796e5	Optimize inference performance of the ernie model in TensorRT and support variable-length input (#28367 ) * fp16 result ok * change -DWITH_NVINFER_PLUGIN toconfig.EnableTensorRtOSS * auto detect special slice op converter for ernie with trt oss * ernie oss only support fp16 * fix special_slice_plugin serialize bug * matmul in tensorrt ok * ernie unittest ok * add matmul tensorrt unittest * remove demo code	5 years ago
Wilber	6f0f45f69c	copy_to_cpu support uint8 (#28372 )	5 years ago
wangguanzhong	5262b02585	add generate_proposals_v2 op (#28214 ) * add generate_proposals_v2 op	5 years ago
石晓伟	d9b5f1261c	update the version of pybind, test=develop (#28284 ) * update version pybind to v2.4.3, test=develop * update unittests, test=develop	5 years ago
wangguanzhong	1c385e26f9	add op_function_generator for box_coder (#28303 ) * add op_function_generator for box_coder * fix format	5 years ago
Guanghua Yu	e8f2614da5	Enhance multiclass_nms op to support LoD for dygraph mode (#28276 ) * Enhance multiclass_nms to support LoD for dygraph mode * fix some error in multiclass_nms * update GetLodFromRoisNum to GetNmsLodFromRoisNum	5 years ago
wangxinxin08	41d26a8287	update matrix nms op to api 2.0 (#28265 ) * update matrix nms op to api 2.0 * modify code according to review	5 years ago
Zhang Ting	fdc06f2158	add Fuse bn add act pass (#28196 ) * add fuse_bn_add_act pass	5 years ago
Chen Weihang	813b2ade34	Enrich the python error types of paddle & polish format (#28124 ) * add multiple exception type * define all exception & polish compile pystack * mapping paddle error to python exception * polish static mode error format * fix failed unittests * fix dytostatic test_error * fix check_nan_inf failed * add unittest for coverage * revert some code try to solve compile error * refactor enforce & error change * polish code & add unittest	5 years ago
Zhou Wei	fb7f85291b	fix print tensor place,add cpu/cuda/pin_memory API for Tensor (#28200 )	5 years ago
Wilber	f935ca8a50	[lite-xpu-subgraph] Fix xpu compile and test xpu ci. (#27932 )	5 years ago
chentianyu03	05fd49e974	change paddle.fluid.layers.reduce_sum to paddle.sum in sample codes (#27998 ) * change paddle.fluid.layers.reduce_sum to paddle.sum in sample codes * format codes	5 years ago
tangwei12	202bfab1be	Feature/large scale kv save base/delta (#27470 ) * add size method for large scale * add large scale UT * add ut for checkpoint	5 years ago
Zhou Wei	bf412f4665	add tensor clone (#27953 ) * add tensor clone * fix unittest test_var_base	5 years ago
guofei	6bbb6e7f45	Implement the function of OutScaleForTraining/OutScaleForInference in dygraph (#26601 ) * Implement the function of OueScaleForTraining/OutScaleForInference in dygraph test=develop	5 years ago
chentianyu03	d05058d268	Remove and reorganize the alias of APIs (#27717 ) * modify cond while_loop to paddle.static.nn.cond * modify crop_tensor to paddle.crop * modify Variable to paddle.static.Variable * remove nn.beam_search, nn.beam_search_decode, nn.gather_tree * remove bpr_loss, center_loss, rank_loss, smooth_l1, teacher_student_sigmoid_loss, edit_distance, sampled_softmax_with_cross_entropy in nn.functional * remove apis in nn.functional.learn_rate.py * remove pool2d, pool3d, adaptive_pool2d, adaptive_pool3d in nn.functional * remove apis in nn.functional.vision * remove erf, soft_relu in nn.functional.activation * remove apis in nn.functional.extension * remove nn.functional.rnn * remove hash from nn.functional.lod * remove row_conv from nn.functional.extension * remove one_hot, pad2d, pad_constant_like from nn.functional.common * remove nn.gather_tree, nn.BilinearTensorProduct, nn.Pool2D, nn.Pad2D * remove apis from optimizer.__init * remove tensor.creation.fill_constant * remove elementwise_mul in nn.functional.common and modify to paddle.multiply * remove tensor.stat.reduce_mean * remove reduce_all, reduce_any in tensor.logic * remove apis in tensor.math * remove apis in tensor.__init__ * remove has_inf, has_nan in tensor.search * remove apis in framework.__init__ * remove apis in paddle.__init__ * remove apis in nn.functional.__init__ * modify removed alias apis to raw api in doc and unittests * fix remove grid_sample bug * modify removed alias apis to raw api in doc and unittests * modify removed alias apis to raw api in doc and unittests * modify removed alias apis to raw api in doc and unittests * modify removed alias apis to raw api in doc and unittests * modify removed alias apis to raw api in doc and unittests * modify removed alias apis to raw api in doc and unittests * delete alias api relastions in doc * reserve paddle.compat, paddle.sysconfig * remove unittest for paddle.reduce_all, paddle.reduce_any * modify removed alias apis to raw api in doc and unittests * recover paddle.save and paddle.load * resolve conflicts * fix sample code missing paddle.enable_static() bug * fix sample code missing paddle.enable_static() bug * fix to_string sample code error	5 years ago
Leo Chen	9a2a4b5f65	Support setting xpu place in dygraph mode (#27909 ) * support setting xpu place * add ut, test=kunlun	5 years ago
Leo Chen	049696bf67	Refine the format of printing tensor (#27673 ) * add sumary feature * refine printting tensor * add sci_mode * add sample code * fix indent error * fix _format_item * polish code * support item indent * add ut * set place for ut * fix py2 issue * fix ut	5 years ago
joanna.wozna.intel	ddcd1b5381	Add bfloat16 resnet50 test (#27755 )	5 years ago
Wilber	9005c5a260	Lite subgraph support arm cpu. (#27827 )	5 years ago
yongqiangma	e8a5aefbbd	update CUDAPlace doc. test=document_fix (#27711 )	5 years ago
zhupengyang	659d04df2c	hsigmoid -> hsigmoid_loss/HSigmoidLoss; refine docs (#27745 )	5 years ago
石晓伟	0d27591642	save operator version infomation to program desc, test=develop (#27668 )	5 years ago
joanna.wozna.intel	0cd4907eba	Add avx512 core instructions check (#27732 ) * Add avx instructions check * Small fix * Change function name * Change uint to unsigned int	5 years ago
Zhang Ting	d2369dd91f	modify docs of CPUPlace and CUDAPinnedPlace, test=document_fix (#27587 )	5 years ago
Chen Weihang	b14ecb8632	Polish api BuildStrategy/ExecutionStrategy doc & code example (#27662 ) * polish BuildStrategy api doc & example * polish ExecutionStrategy api doc & example * polish details	5 years ago
lilong12	bbc2add703	Initialize gloo for low level collective apis (#27672 ) * add gloo initializer, test=develop	5 years ago
arlesniak	0ecf441af1	Add support for mkldnn ops types selection with FLAGS in dygraph (#27482 ) * Add support for mkldnn ops types selection with FLAGS in dygraph * use regex to match DNNL verbose * python3 encoding fix	5 years ago
lilong12	36c0410223	Revert "Initialize gloo for low level collective apis (#27356 )", test=document_fix (#27665 )	5 years ago
wanghuancoder	c68a0313a5	add paddle.fluid._cuda_synchronize (#27595 ) * add paddle.fluid._cuda_synchronize, test=develop * fix bug about core_avx core_noavx, test=develop * delete CPUPlace and XPUPlace, test=develop	5 years ago
liym27	074a71bd25	Support assignment to a Variable in dynamic mode but not deal with backward. (#27471 ) * Support assignment to a Variable in dynamic mode. Note: not deal with backward. * Rewrite VarBase __setitem__ for high-performance. * try to test 3 means to do __setitem__ and test the performance of 3 means. * Retain the means of the highest performance: C++ code and don't trace op.	5 years ago
lilong12	fa73e4a284	Initialize gloo for low level collective apis (#27356 ) * add gloo initializer, test=develop	5 years ago
Li Fuchen	1501a80f74	add support to float64 input of warpctc op. (#27399 ) * add float64 input to ctc_loss * modified error message of warpctc * update repo and tag of warpctc * add test for warpctc with float64 input * modified warpctc.cmake to make sure build always * resolved sample code bug of warpctc * add core.ops in warpctc dygraph * fix a bug of test	5 years ago
joanna.wozna.intel	b0ee1405f7	Add conv2d bfloat16 support (#27325 )	5 years ago
Zhou Wei	1e1ae5c54d	Make the Bind Method of Tensor more automatic (#27270 ) * Makes the Bind Method more intelligent * Makes the Bind Method more intelligent * fix unittest * fix unittest * fix conflict	5 years ago
Leo Chen	aba759ba16	[Feature] Enhance inplace addto strategy for gradient accumulation in static graph (#27112 ) * support use add instead of sum to do gradient accumulation * add inplace addto pass * add grad_add op and inplace addto pass * remove debug code * code refine * fix bug when sereral sum ops inserts at same op_idx * fix Flags type * add addto attribute for conv3d * fix ut * code clean * fix type	5 years ago
huangxu96	02606d45ef	Quant op dev (#25932 ) * Finished ChannelWiseQuantDequantAbsMaxOp and Passed unittests. * Finished channel-wise quantize strategy in imperative quantization. * Added Cuda code of ChannelWiseQuantDequantMaxAbsOP Add Cuda code of ChannelWiseQuantDequantMaxAbsOp * Add quant_axis for channel_wise quant. * fixed a bug in unnitests, which will not trigger axis = 1 case and cannot meet the coverage rate requirement. * Added some assert infomation and fixed some coding style mistakes.	5 years ago
Wilber	f827665ae6	[Pass Compatible] Bind python compatible. (#27262 )	5 years ago
joanna.wozna.intel	1483ea2304	Add bfloat16 passes (#26999 )	5 years ago
Zhen Wang	d708b21074	Update amp_check_finite_and_scale_op and add an updating_loss_scaling op for static graph amp training. (#26240 ) * update amp_check_finite_and_scale_op for static_amp. * use amp_check_finite_and_scale in static graph amp. * update grads to zero when grads own infinite values(as for amp_checkout_finite_and_scale op). * add update_loss_scaling op in cpp. * add update_loss_scaling_op unit test. * update the doc of the check_finite_and_unscale op * Update the process of gradients updating skipping if the gradients have infinite values. * update the way to zero grads. * update test_update_loss_scaling_op.py * add log info when find infinite grads. * add the unit test for UpdateLossScaling Layer.	5 years ago
wangguanzhong	a28ae86e11	Enhance ops to support LoD as input for dygraph detection models. (#25316 ) * enhance collect_op for dygraph, test=develop * enhance detection ops with lod, test=develop * support none bbox left in generate_proposals, test=develop * unfiy MultiLevelRoisNum, test=develop * update core.ops, test=develop * add op register for new input & output, test=develop	5 years ago
Wilber	632125415c	Refine python inference api (#26958 )	5 years ago
yaoxuefeng	7f3e6ca596	add cuda generator (#26786 )	5 years ago
joanna.wozna.intel	95e1434bb2	Add bfloat16 data type (#25402 )	5 years ago
joanna.wozna.intel	0627a319b0	Restore "Add mkldnn bfloat16 option to C-API " (#26882 ) * Add mkldnn bfloat16 option to C-API * Add test for bfloat16 gpu * Change coverage test * Repair capi_gpu test	5 years ago
石晓伟	ced6e87eee	Revert "Add mkldnn bfloat16 option to C-API (#26676 )" (#26854 ) This reverts commit `02083bda40`.	5 years ago
arlesniak	885c61f086	Add use of global flag 'use_mkldnn' to layer_helper (#26497 ) * get use of global 'use_mkldnn' in layer_helper * update for CI * update for CI, relu test * update for CI, relu test added, make FLAGS_use_mkldnn a public flag * added more strict tests, fixes after review * fixes after review * fixes after review, CI stuff	5 years ago
yaoxuefeng	a47d92d868	fleet add save with whitelist test=develop (#23376 )	5 years ago
Wilber	68e0560c2f	refine paddle inference api (#26774 ) * refine paddle inference api Co-authored-by: nhzlx <nhzlx.dragon@gmail.com>	5 years ago
Leo Chen	844583c8fd	Refine paddle.manual_seed (#26496 ) * refine manual seed * fix ci problem * fix unittests * fix unittest * set is_init_py=false in manual_seed * fix unittest * fix bernoulli_op * fix(unittest): change random_seed to manual_seed * 🐞fix(unittest): fix manual_seed * trigger ci * fix test_sentiment * fix test_imperative_save_load * fix test_uniform_random_op * fix test_uniform_random_op * fix test_jit_save_load * merge develop * fix manual_seed * fix manual_seed * use global engine * use shared_ptr * fix double free * fix bug * fix bug * fix bug * fix test bug * fix test bug * fix test bug * fix ci	5 years ago
joanna.wozna.intel	02083bda40	Add mkldnn bfloat16 option to C-API (#26676 ) * Add mkldnn bfloat16 option to C-API * Add test for bfloat16 gpu * Change coverage test	5 years ago
Zhen Wang	f9066e6a6f	Update the demo code and the doc of varbase.backward. (#26506 ) * update the demo code and the doc of varbase.backward. * update the doc of the fake interface `paddle.fluid.Variable`. * remove BackwardStrategy.	5 years ago
lilong12	1c68138327	[api 2.0] add collective op for cpu using gloo and paddle.distributed.* apis (#26552 ) add collective op for cpu using gloo and paddle.distributed.* apis	5 years ago
Zhang Ting	0a895bc0df	improve unique op (#26537 ) * add unique_v2 op * remove unique_v2 op * update doc	5 years ago
wanghuancoder	c1f5df5269	optimized transformation form tensor to numpy (#26447 ) * optimized transformation form tensor to numpy, test=develop * optimized transformation form tensor to numpy, pass pre-commit, test=develop * modify fetchophandle zerocopy to deepcopy in PE&CUP, test=develop * modify py:array construct, test=develop * fix _fetch_var to use deep copy, test=develop	5 years ago
wanghuancoder	422a162019	api2.0 paddle.nn.Bilinear and paddle.nn.functional.bilinear (#26399 ) * api2.0 paddle.nn.Bilinear and paddle.nn.functional.bilinear, test=develop * api2.0 fix code examples, test=develop * modify test_bilinear_api, about place,to_tensor , test=develop * re pass pre-commit, test=develop * Update common.py * fix BilinearTensorProduct ci error, test=develop	5 years ago
wanghuancoder	6e823cfec3	add op_function_generator.exe retry in windows, test=develop (#26591 ) add op_function_generator.exe retry in windows	5 years ago
wangchaochaohu	ebf9b2125e	add paddle.gather for API2.0 (#26455 )	5 years ago
QingshuChen	138ecf24aa	support Baidu Kunlun AI Accelerator (#25959 ) * support Baidu AI Accelerator * test=kunlun * minor * test=kunlun * support xpu op in separate file * test=kunlun * update XPU error message and remove duplicated code * test=kunlun * minor * test=kunlun * minor * test=kunlun	5 years ago
ceci3	56890dc729	Add SyncBatchNorm (#26032 ) * add SyncBatchNorm,test=develop	5 years ago
Leo Chen	049ac56c08	Print user-friendly error message in core.ops [part 2] (#26377 )	5 years ago
yaoxuefeng	23261ff44b	add cpu random Generator (#26013 )	5 years ago
Leo Chen	672578a797	Print user-friendly error message in core.ops (#26261 ) * print user-friendly error message * adjust error sumary	5 years ago
wangchaochaohu	0b81d76310	[API2.0] add op for cudnn version query test=develop (#26180 )	5 years ago
wangchaochaohu	bb11cbc250	[API2.0] add Device api (set_device and get_device)(#26103 )	5 years ago
Zhou Wei	6de463d3d1	expose and unify the Tensor concepts to the user (#25978 ) * expose and unify the Tensor concepts to the user * expose tensor to user * add copy place for Tensor * add copy place for Tensor * add note * add macro PADDLE_WITH_CUDA * remove RUN_TYPE=DIST * fix some error	5 years ago
Zhou Wei	20147ace3f	fix_copy_if_different (#25868 )	5 years ago
Leo Chen	2d95280e1f	Feature/Enable Auto-Mixed-Precision in dynamic graph (#24903 ) * add auto_cast, test=develop * add loss scaler, test=develop * add comments, test=develop * refine code, test=develop * refine code, test=develop * do not set flags automatically, test=develop * fix custom op bug, test=develop * add more test, test=develop * refine enable logic, test=develop * enable amp test with GPU, test=develop * add unittest * add test for found_inf * follow comments * follow comments * remove global variable, use singleton * add some notes * update comments * update comments * update comments * add use_dynamic_loss_scaling argument * refine found_inf * refine found_inf	5 years ago
Chen Weihang	838e36e9ed	Fix loaded variable suffix repeat error (#26169 ) * fix loaded var suffix repeat error * use new dygraph name for loaded param	5 years ago
Jack Zhou	dea41da715	add nll loss API for the paddlepaddle api2.0 * add nll loss API, update demo code of the comment	5 years ago
Chen Weihang	3c8daa9b89	Add pin memory control for BufferedReader (#26026 ) * add pin memory control * fix buffered reader init problem * fix unittest error * add unittest for coverage	5 years ago
Leo Chen	751305ecf0	Add flags to control call stack of error message (#25997 ) * add flags_call_stack_level * update * refine code	5 years ago
Thunderbrook	0cb60c700d	add heter ps mode (#25682 ) * add heter ps mode * code style test=develop * add with_pslib test=develop * unitest test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * code style test=develop * test monitor test=develop * prepare trainer test=develop * code style test=develop	5 years ago
tangwei12	caa90a6510	Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) (#22957 ) * Integrated Trainer of Parameter Server	5 years ago
Zhou Wei	b484a59c39	fix copy file random fail on windows (#25731 )	5 years ago
Pei Yang	55b6205ddf	add set_mkldnn_cache_capacity python api(#25524 )	5 years ago
Zhen Wang	cea5086853	Fix the double grad bug for the star gan. (#25655 ) * fix the double grad bug for the star gan. test=develop * update the retain_graph parameter doc. test=develop * add the unit test for the retain_graph parameter. test=develop	5 years ago
石晓伟	7206417259	supports xpu runtime, test=develop (#25554 ) * update ResetHolder, test=develop * add TensorShare for lite engine, test=develop * tensor data changed from copying to sharing, test=develop * supports xpu runtime, test=develop * fix code styles, test=develop	5 years ago
Pei Yang	43f9f180e5	Add api to clear intermediate tensors in AnalysisPredictor (#25069 ) * add api to clear intemediate tensors in analysis predictor. test=develop * add python api. test=develop	5 years ago
Zhen Wang	bb45af02ac	add the c++ part of Imperative QAT. test=develop (#25446 )	5 years ago
ceci3	52be62c5ae	fix instance norm in dy (#24717 ) * fix bn & in in dy, test=develop * update instance_norm,test=develop * fix bugs,test=develop * add more case in unittest,test=develop * fix,test=develop * fix,test=develop	5 years ago
gongweibao	80f1c50738	Fix typo in interface. (#24779 )	5 years ago

885 Commits (a37658daff841f670d557b2ec2aee09ca8feec75)