Paddle

Commit Graph

Author	SHA1	Message	Date
Jack Zhou	c791df09cf	Add elementwise XPU OP kernel for KUNLUN core, including (but still cannot process common broadcast	4 years ago
wangchaochaohu	c5fcc96d5b	xpu support for fill_constant Op (#27675 )	4 years ago
tianshuo78520a	a820871669	Change PR-CI-Kunlun Test Number (#27923 )	4 years ago
Chengmo	328cb289ed	【paddle.fleet】fix sparse load (#27680 ) * add sparse tensor load method	4 years ago
tangwei12	cf70d5b350	fix paddle error informations (#27889 )	4 years ago
wawltor	95aa53425d	update the code for the topk message optimize	4 years ago
Chen Weihang	4ba977c720	Polish some error message in opeators (#27876 ) * polish some error message * add white list * revert shell script change	4 years ago
123malin	a4f850748a	【paddle.fleet】bug fix for parameter_recv (#27838 ) * test=develop, bug fix for parameter_recv * test=develop, for unittest, test_fleet_rolemaker_new	4 years ago
QingshuChen	2712d07644	support kunlun matmul_v2 (#27910 ) *test=kunlun	4 years ago
zhang wenhui	5a83496c8d	Multi task (#26002 ) * add multitask * add multitask, test=develop * fix code style, test=develop * add partail push dense, test=develop * fix has_kay in py3, test=develop * fix, test=develop * fix, test=develop * fix, test=develop	4 years ago
zhang wenhui	7a58431c0a	fix norm api doc, test=develop (#27652 ) * fix norm api doc, test=develop * fix error message, test=develop * fix api norm, test=develop * add adagrad, test=develop * fix bug, test=develop * fix bug, test=develop * add spetral_norm, test=develop * fix adagrad, test=develop * merge , test=develop	4 years ago
yinhaofeng	3eb106da6d	Lookup table v2 xpu (#27888 ) * add lookup_table_v2_op_xpu, test=kunlun * add lookup_table_v2_op_xpu, test=kunlun * change some Tips ,test=kunlun	4 years ago
Zhang Ting	d5cc144c60	tune backward filter algorithm for float16 (#27529 ) * use exhaustive_search for float16 * tune algo only when dtype is float16	4 years ago
wanghuancoder	41aad9bfcd	revert 4 files, from clear include by iwyu, test=develop (#27895 )	4 years ago
hutuxian	3f2a6ab65d	fix error msg (#27887 )	4 years ago
xiaoting	ae01801f0a	Add dropout and log_loss for kunlun (#27790 ) * add dropout,log_loss, test=kunlun * fix dropout, test=kunlun * polish error message, test=kunlun * change boost::get to BOOST_GET_CONST, test=kunlun * fix copyright, test=kunlun	4 years ago
Guanghua Yu	70c8c31371	support mean,softmax_with_cross_entropy on Baidu Kunlun (#27792 ) * support mean,softmax_with_cross_entropy on Baidu Kunlun,test=kunlun * fix unittests error,test=kunlun * delete boost::get,test=kunlun	4 years ago
Chengmo	1607e87cb9	add xpu sgd & momentum (#27728 ) * add xpu sgd & momentum	4 years ago
Leo Chen	049696bf67	Refine the format of printing tensor (#27673 ) * add sumary feature * refine printting tensor * add sci_mode * add sample code * fix indent error * fix _format_item * polish code * support item indent * add ut * set place for ut * fix py2 issue * fix ut	4 years ago
hong19860320	c90d35564b	Add batch_norm and layer_norm XPU kernels (#27818 )	4 years ago
joanna.wozna.intel	ddcd1b5381	Add bfloat16 resnet50 test (#27755 )	4 years ago
xiaoting	6da7a7458b	add conv for xpu, test=kunlun (#27809 ) * add conv for xpu, test=kunlun * polish error_message, test=kunlun * polish error_message, test=kunlun * fix copyrigth, test=kunlun	4 years ago
Thunderbrook	04be37c57f	add xpu slice op (#27349 ) * add xpu slice op test=xpu * add slice xpu op test=xpu * code style test=kunlun * style test=kunlun * format test=kunlun	4 years ago
Thunderbrook	8c25dfaacc	op error info (#27856 ) * op error info * style * code format	4 years ago
Wilber	345574a6ed	Demo CMakeLists add openmp flag. (#27848 )	4 years ago
ShenLiang	6d63cd2b93	add gather_op xpu, test=kunlun (#27822 ) * add gather_op xpu, test=develop, test=kunlun * fix ut, test=develop, test=kunlun * fix the ut,test=develop, test=kunlun	4 years ago
Feiyu Chan	1d95a0fbc3	fix error message for nce_op (#27863 )	4 years ago
gongweibao	4237fefeb4	Add shellcheck tools and modify copyright hook (#27722 )	4 years ago
Chengmo	c5f2802d56	【paddle.fleet】Update fleetrun & ps-heter (#27472 ) * refine fleetrun.ps_launch * update fleet run for multi device support * ps_graph support ps-gpu * fix heter save * add heter save unittest * fix unittest & simple code * update fleetrun * fix fleetrun * fix launch barrier * fix role maker * add paddlecloud rolemaker unittest * rename heter_worker_device_guard	4 years ago
Shang Zhizhou	bbc837ee72	add info log for trt input dynamic shape check (#27796 ) * add info log for trt input dynamic shape check * fix error msg error	4 years ago
guofei	2e1bca99ca	Refine the gradient calculation errors caused by renaming in while_grad (#27814 ) test=develop	4 years ago
wanghuancoder	8fa4c09889	add load_op_xpu for Baidu Kunlun (#27817 ) * add load_op_xpu for Baidu Kunlun, test=kunlun * add is_compiled_with_xpu for unit test, test=kunlun * add is_compiled_with_xpu for unit test, test=kunlun	4 years ago
Wilber	9005c5a260	Lite subgraph support arm cpu. (#27827 )	4 years ago
Jacek Czaja	55e63763ec	[oneDNN] adaptive pool support (#27747 )	4 years ago
chen zhiyu	6335e6a0a6	add musl option (#27798 )	4 years ago
yongqiangma	e8a5aefbbd	update CUDAPlace doc. test=document_fix (#27711 )	4 years ago
Zhang Ting	16999ae49d	use IndexList to improve performance of instance_norm op (#25132 ) * use IndexList to improve performance, test=develop * remove EIGEN_HAS_INDEX_LIST, test=develop * use IndexList only when EIGEN_HAS_INDEX_LIST is true	4 years ago
GaoWei8	36bb056ed6	Add flattern weight of lstm (#27192 ) * add flattern weight of lstm	4 years ago
Guanghua Yu	7779790c61	error message optimization in softmax_with_cross_entropy_op (#27772 ) * error message optimization in softmax_with_cross_entropy_op * fix some unsuited comment	4 years ago
zhupengyang	659d04df2c	hsigmoid -> hsigmoid_loss/HSigmoidLoss; refine docs (#27745 )	4 years ago
TeslaZhao	070ac9590c	Add double grad in Squeeze and Unsqueeze (#27810 ) * Add double grad in Squeeze and Unsqueeze * Add double grad in Squeeze and Unsqueeze	4 years ago
Jack Zhou	d4359b0f39	add the kunlun kernel for the paddle 2.0 Add xpu kernel for KUNLUN core: * accuracy op * sign op * scale op * sum op Add default atol in xpu unittest.	4 years ago
mapingshuo	840d54de9b	add XPU support for shape op and reshape op (#27804 )	4 years ago
cc	8fabb1c32f	Add test attribute in channelwise_quant op, test=develop (#27742 ) * Add test attribute in channelwise_quant op, test=develop	4 years ago
wangxinxin08	ad99e638fd	add double grad op for matmul (#27776 ) * add matmul doublegrad op * fix compile errors * modify code according to review * delete float16	4 years ago
zhupengyang	0025e0d87b	refine APIs: brelu, hardsigmoid, hardswish, maxout (#27658 )	4 years ago
zhupengyang	5098891fdf	add softmax xpu kernel (#27700 )	4 years ago
Double_V	f6ad2375be	fix pool3d bug, test=develop (#27718 ) * fix pool3d bug, test=develop * fix unitest, test=develop * fix test and fix pool2d bug, test=develop	4 years ago
石晓伟	0d27591642	save operator version infomation to program desc, test=develop (#27668 )	4 years ago
Qi Li	b8d2a021f0	fix ut error of test_recognize_digits, test=develop (#27791 )	4 years ago
Jacek Czaja	631c1f3018	- Fix to 27398 (#27770 ) test=develop - compilation fix test=develop	4 years ago
Feiyu Chan	0a7bab4e34	fix error mesage for negative_positive_pair_op and nce_op (#27779 )	4 years ago
zhupengyang	395cb561aa	refine logsumexp error message and docs (#27713 )	4 years ago
smallv0221	057e28bc8f	API(lstm_unit, lstmp, sequence_mask, sequence_enumerate, sequence_conv) error message enhancement (#27572 ) * API(Compute) error message enhancement on line 44, 50, 53. * lstm_unit error message enhancement. lstmp error message enhancement. sequence_conv error message enhencement. sequence_enumerate error message enhencement. sequence_mask error message enhencement. * Update lstm_unit_op.cc * Update lstm_unit_op.h * error msg enhancement. * Update sequence_conv_op.cc * Update lstm_unit_op.cc * Update sequence_conv_op.cc * Update sequence_enumerate_op.cc * Update sequence_enumerate_op.cu * Update sequence_enumerate_op.h * Update sequence_pool_op.h * error message enhencement. * error message enhancement.	4 years ago
Jacek Czaja	606611d351	[oneDNN] GRU BF16 kernel (#27731 )	4 years ago
xiemoyuan	6c1acf34ed	Optimize the error message for OP (#27617 ) * Optimize the error message for OPs. * Optimize the error message for OPs in details.	4 years ago
cc	ec7d11a492	refine fused_elemwise_activation error message (#27734 )	4 years ago
Zhen Wang	365c2c9c89	fix error message showing in UpdateLossScalingOp (#27596 )	4 years ago
LielinJiang	9089841b6e	Fix bilateral inference shape bug (#26822 ) * fix bilateral bug	4 years ago
Yiqun Liu	65207b4560	Polish the error message of fc, fused_fc_elementwise_layernorm and fused_embedding_seq_pool. (#27692 ) * Polish the error message of fc_op. * Polish the error message of fused_fc_elementwise_layer_norm op. * Polish an error message in fused_embedding_seq_pool_op.	4 years ago
Wojciech Uss	f399bed8d9	Add an option to set number of warmup iterations (#27739 )	4 years ago
Jacek Czaja	b9fda2ff09	Fix to issue #25537 (#27546 ) * - condidate fix to issue #25537 test=develop * - UT for transpose NHWC test=develop	4 years ago
Wojciech Uss	966447e338	Added support for quantization of fusion_gru (#27518 )	4 years ago
joanna.wozna.intel	0cd4907eba	Add avx512 core instructions check (#27732 ) * Add avx instructions check * Small fix * Change function name * Change uint to unsigned int	4 years ago
hong19860320	7a96d5788d	Optimize the error messages of the CUDA implementation of activation ops (#27741 ) test=develop	4 years ago
tangwei12	fd616fadc2	repen heartbeat ut (#27684 )	4 years ago
Qi Li	f373269df0	update histogram op for performance optimization, test=develop (#24912 )	4 years ago
tianshuo78520a	4d5ddbf106	add xpu test (#27622 ) * add xpu test * notest;add kunlun_test * notest;add kunlun_test * notest;test=kunlun * notest;test=kunlun * notest;test=kunlun * notest;test=kunlun * test=kunlun	4 years ago
MRXLT	20fb01fb00	fix distributed error info (#27206 ) * fix distributed error info * bug fix; notest * error info refine * update error info * update error info * update error info * bug fix * bug fix * bug fix * bug fix	4 years ago
pangyoki	7cd2c13f1b	add multinomial op (#27219 ) * add multinomial cpu kernel * fix C++ notype error * fix windows ci array len error * let array len be const * change array to vector * add cuda kernrl with num_distribution is 1, and not support replacement=False * add multinomial python api * support num_distribution different multinomial distributions * add multinomial python api unittest * change output dtype to int64 * fix coverage prob * optimize format * fix dtype of output error, should be int64_t	4 years ago
Zhang Ting	d2369dd91f	modify docs of CPUPlace and CUDAPinnedPlace, test=document_fix (#27587 )	4 years ago
iducn	7c69e36131	add pip new requirements to windows (#27697 ) * add pip new requirements to windows * Increase the conditions that restrict system installation	4 years ago
Wojciech Uss	42d175385d	Add support for (de/re)quantization with shift (#27481 )	4 years ago
123malin	cc780b1977	test=develop, optimize geo communicator (#26857 ) * test=develop, optimize geo communicator	4 years ago
Pei Yang	8a4f85feb9	Add unittests and OP version registry for quant_conv2d_dequant_fuse_pass (#27689 )	4 years ago
yukavio	7b46fb0f14	fix generate_proposals and affine grid error info (#27636 )	4 years ago
Chen Weihang	b14ecb8632	Polish api BuildStrategy/ExecutionStrategy doc & code example (#27662 ) * polish BuildStrategy api doc & example * polish ExecutionStrategy api doc & example * polish details	4 years ago
AshburnLee	c3a3df6466	Add cuda support for unique op (#27646 ) * unique op for cuda is added * add support for cuda * Add cuda support for unique op. * Add support for int32_t and int64_t. * For old version, process by cpu * Add VisitDataType for thrust	4 years ago
lilong12	bbc2add703	Initialize gloo for low level collective apis (#27672 ) * add gloo initializer, test=develop	4 years ago
wawltor	29f4922906	optimize the error meesage for detetion_map_op	4 years ago
whs	daf5aa9b8b	Fix round in grid sample op (#27657 )	4 years ago
arlesniak	0ecf441af1	Add support for mkldnn ops types selection with FLAGS in dygraph (#27482 ) * Add support for mkldnn ops types selection with FLAGS in dygraph * use regex to match DNNL verbose * python3 encoding fix	4 years ago
Wilber	2bc70ab2e2	Fix lite_resnet50 unit test. (#27611 )	4 years ago
ysh329	2f9cdd9038	API/OP clip_by_norm_op error message enhancement. test=develop (#27614 ) * Fix clip_by_norm_op error message. test=develop * test=develop * test=develop	4 years ago
yongqiangma	aac57159c9	enhance array_to_lod_tensor_op lod_tensor_to_array_op errors informaiton (#27386 ) * enhance array_to_lod_tensor_op lod_tensor_to_array_op errors information. test=develop	4 years ago
lilong12	36c0410223	Revert "Initialize gloo for low level collective apis (#27356 )", test=document_fix (#27665 )	4 years ago
xiemoyuan	99e3337368	Optimize the error message of OP. (#27478 ) * iCafe 9009: Optimize the error message of OP. * Optimize the error message of GatherTreeOP.	4 years ago
ShenLiang	e8f873df88	optimize the speed&memory of matmul op (#27610 ) * fix the speed&memory of matmul * fix the comment * fix the memory copy * fix the windows ci	4 years ago
Pei Yang	ae6e40a7fd	Add unittests and OP version registry for tensorrt_subgraph_pass (#27544 ) * add unittests and op version register for tensorrt_subgraph_pass * rename to test_trt_subgraph_pass.py * fix softmax converter diff when padding dim=1	4 years ago
tangwei12	9704582eef	fix op error (#27599 ) * fix error * fix error * fix error * merge develop	4 years ago
wanghuancoder	c68a0313a5	add paddle.fluid._cuda_synchronize (#27595 ) * add paddle.fluid._cuda_synchronize, test=develop * fix bug about core_avx core_noavx, test=develop * delete CPUPlace and XPUPlace, test=develop	4 years ago
yaoxuefeng	c9a8801325	enhance error messages of lookup_tale, merge_ids, data_norm (#27619 ) * enhance error messages of lookup_tale, merge_ids, data_norm * fix * fix error msg in .cu	4 years ago
whs	9cc5603d56	Make grid support stopping graients. (#27630 )	4 years ago
liym27	074a71bd25	Support assignment to a Variable in dynamic mode but not deal with backward. (#27471 ) * Support assignment to a Variable in dynamic mode. Note: not deal with backward. * Rewrite VarBase __setitem__ for high-performance. * try to test 3 means to do __setitem__ and test the performance of 3 means. * Retain the means of the highest performance: C++ code and don't trace op.	4 years ago
lilong12	5218b7af6b	add ncclSend and ncclRecv (#27621 ) * include ncclRecv and ncclSend, test=develop	4 years ago
lilong12	fa73e4a284	Initialize gloo for low level collective apis (#27356 ) * add gloo initializer, test=develop	4 years ago
furnace	d01f626944	update mv op according PR#27024 (#27474 )	4 years ago
Double_V	9d783aeddd	Error message opt, test=develop (#27467 ) * Error message opt, test=develop * solve comments, test=develop * fix typo, test=develop	4 years ago
Li Fuchen	1501a80f74	add support to float64 input of warpctc op. (#27399 ) * add float64 input to ctc_loss * modified error message of warpctc * update repo and tag of warpctc * add test for warpctc with float64 input * modified warpctc.cmake to make sure build always * resolved sample code bug of warpctc * add core.ops in warpctc dygraph * fix a bug of test	4 years ago
QingshuChen	6b727e08b1	support elementwise add, activation, matmul on Baidu Kunlun (#27143 ) * support elementwise add, activation, matmul on Baidu Kunlun * test=kunlun * minor * test=kunlun * reconstuct the xpu directory * test=kunlun * minor * test=kunlun * minor * test=kunlun * minor * test=kunlun * minor * test=kunlun * minor * test=kunlun	4 years ago

17783 Commits (e8db4412d00b9fb72f9a0a04d90f15fbf861c1fa)