Paddle

Commit Graph

Author	SHA1	Message	Date
Leo Zhao	10eeed93d1	Revert "use static variable to do cache instead of thread local in thread frequent switching case (#18428 )" (#18879 ) This reverts commit `ce38bb5341`. test=develop	6 years ago
tianshuo78520a	6cd1b71208	add DEFINE_int32/DEFINE_bool/DEFINE_string flag (#18869 )	6 years ago
Huihuang Zheng	0d3f16f53e	Try to modify external gflags to solve CI compilation (#18872 )	6 years ago
Zeng Jinle	8008ab4e6b	Remove legacy C++ memory optimization codes (#18834 ) * remove legacy memory optimization codes, test=develop * follow huihuang's comments,test=develop * follow luotao's comments, test=develop	6 years ago
Thunderbrook	52c1431eee	add clear_model interface in fleetwrapper (#18815 ) * dump slot * test * proto * dump slot * test * proto * code style * code style * code style * style * add delete after unseen days * add unseen days * code style * conflict solve test=develop * add clear model * code style test=develop * code style test=develop	6 years ago
Zeng Jinle	9a8a7a1ddc	fix affine_channel no_need buffer bug, test=develop (#18844 )	6 years ago
lvmengsi	829ef26281	Fix drop deconv (#18813 ) * replace link * update api.spec * fix mistake	6 years ago
Huihuang Zheng	cfce4994cf	Merge cuda 9/10 dockerfile with root dockerfile (#18693 ) Also fix a dependency error which may cause compile error	6 years ago
chengduo	4140fe11a4	Open fuse optimization ops (#18741 ) * open fuse optimization ops test=develop	6 years ago
Adam	ee02227949	Add LeakyReLU MKLDNN support (#18762 )	6 years ago
lidanqing	b05bdda0cf	remove unused TransposeINT8Op for higher UT coverage (#18791 ) test=develop	6 years ago
Zeng Jinle	a802da650b	Feature/mem opt pass refactor (#18735 ) * first version memory optimize pass, test=develop * remove move_tensor_sharing_pass, test=develop * refine code comments, add unittests, test=develop * turn off memory_optimize by default, test=develop * follow huihuang's comments, test=develop * follow chengduoZH's comments, test=develop * fix grammar error, add const qualifier, fix pass_test exception message, test=develop * follow chengduoZH's comments 2nd, test=develop	6 years ago
Physher	c5f47c2107	fix mul_mkldnn_op build failure (#18816 )	6 years ago
Physher	a5c986301c	clarify MKLDNN INT8 Mul Op attributes (#18685 )	6 years ago
FDInSky	cff5e2c173	fix roi_align_op cpu backward's bug (#18789 ) * test=develop fix cpu roi_align_op backward bug	6 years ago
石晓伟	9dbb62eeb9	Fix examples of API (#18092 ) * fix logical APIs test=develop test=document_preview * fix isfinite * update matmul comments * update API.spec test=document_preview test=develop * update API.spec test=document_preview test=develop * update API.spec test=document_preview test=develop	6 years ago
chengduo	292dfbce63	fix build strategy doc (#18725 ) test=develop	6 years ago
fuyinno4	c167a4b4dd	Fix shrink-dense and add scale-datanorm (#18746 ) Fix FleetWrapper: 1. fix shrink dense: just scale show 2. add datanorm scale: divide datanorm's gradient by batch_size	6 years ago
Bai Yifan	d3ac561d65	fix deformable_conv_op compile error, test=develop (#18793 )	6 years ago
lidanqing	9ecd8ee789	change ComputeINT8 to template version to remove checking dst_datatype code (#18756 ) * change INT8 to template so that checking dst_dt with if-else could be removed. CI will be enabled after fixing reviews * reverse user_residual_memory_p and user_bias_memory_p declaration scope test=develop	6 years ago
JesseyXujin	d9e7b5b5e9	fix bug of swish op formula,test=develop (#18772 )	6 years ago
Bob Zhu	220eef602e	Extend Matmul to support matrix multiplication with multiple heads (#18570 ) * extend matmul op to support multiple head multiplication With the support of multiple head, the multiplication of two big matrixes is split into multiplication of several (head_number) small matrixes. e.g. if Mat A is [3, 24] and Mat B is [24, 4], when multiple A and B with head_number as 4, Mat A will be split as 4 matrix of [3, 6] and Mat B will be 4 matrix of [6, 4]. The result of final matrix will be 4 matrix of [3, 4], i.e. [3, 16].	6 years ago
whs	075e1cf78e	Add python API for appending LoD level (#18702 ) * Make lod reset op support for append lod level. * Fix API.spec test=develop * Fix unitest. test=develop * Add python api for lod append. test=develop * Fix API.spec test=develop * Fix format of doc. test=develop * Fix unitest. test=develop * Fix doc. test=develop	6 years ago
chengduo	8259f1418f	Enhance backward process (#18700 ) * prun backward ops test=develop	6 years ago
JesseyXujin	25c9b57bcd	Modify auc doc. Add output variable description, previously was the scalar type, now changed to the tuple type.test=develop (#18771 )	6 years ago
Zhaolong Xing	26ae6d49e4	Update trt5 for paddle-trt (#18645 ) * update paddle-trt for: 1. fix bug: when batch > 2, core in split plugin. 2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.) 3. add new attr to dropout. 4. shuffle channel, swish, relu6 support test=develop * 1. fix ci test=develop	6 years ago
Thunderbrook	d8396281ef	add slot to sparse table (#18686 ) The change includes 2 things: 1. save delta model and shrink table are control by the same parameter before, now add delete_after_unseen_days to control shrink table. 2. value in sparse table has no slot before, now add slot in sparse table, and add DownpureCtrAccessor to support the new meta. test=develop	6 years ago
Jacek Czaja	95c1816ec0	[MKL-DNN] Extended LRN with reusing via Acquire API (#18675 ) test=develop - compileation fix - Yet another compilation fix - Even yet another compilation fix - Surprise! Again compilation fix - lint fixes test=develop - Fix to workspace acquire of LRN test=develop - Fix to hash of BWD LRN test=develop - fix to lrn BWD PD acquire test=develop - Fixing LRN PD creation test=develop - cosmetic fix in comment test=develop - Fixes after review test=develop	6 years ago
jiaqi	d18aabb472	support patch data, add load_one_table, fix bug (#18509 ) （1）support patch data （merge slots of instances of same line id, modify dense layer which changes its size）（2）add fleet load_one_table interface, support load from paddle model and load from pslib model （3）fix push sparse bug which cause push sparse cost more time（about 10% in my testcase）（4）when some slots are not in one of your network (join/update, etc.)，data feed、collect label info、push/pull sparse will skip these slots， instead of throw error. （5）add more debug info in TrainFilesWithProfiler	6 years ago
chengduo	fd3aad6cb3	Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664 ) * support sparse gradients test=develop	6 years ago
wangchaochaohu	6b78e00da4	Cudnn convolution reconstruction (#18284 ) * rewrite the conv_op using cudnn_conv_helper * add workspace limit for v7 test=develop * fix test=develop * add half float test=develop * fix test=develop * fix test=develop * revise code style test=develop * fix test=develop	6 years ago
Yi Liu	157211c4e1	supports distributed classification (#18690 ) * supports distributed classification training * update API.spec * fix evenly division in python3 * change "index_range" to "index_num" in shard_index operator test=document_preview test=develop	6 years ago
qingqing01	3429e65aa8	Fix CPU implementation of roi_align_op backward (#18728 )	6 years ago
Tao Luo	bd22453f20	Revert "Add LeakyRelu MKLDNN support (#18656 )" (#18723 ) test=develop	6 years ago
tianshuo78520a	58469186c3	Change api approval people name (#18699 )	6 years ago
whs	189b08dc0d	Make infer shape of pad2d support for input with negative dims in compile time. (#18695 ) test=develop	6 years ago
Bai Yifan	7e3963f295	add license, test=develop (#18709 )	6 years ago
cjt222	ccf06a48b0	test=develop (#18701 ) add license	6 years ago
wangguanzhong	185b3acea1	fix clip_by_norm doc (#18688 ) * fix clip_by_norm doc, test=develop	6 years ago
Huihuang Zheng	89bc3fd841	Support memory eager deletion on recurrent OP (#17710 ) Test PaddingRNN on V100 GPU device. Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU. GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR) Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)	6 years ago
Jacek Czaja	0d8e6c9b8b	MKL-DNN upgrade to 0.20 (#18370 ) test=develop	6 years ago
Adam	d6b6a337a9	Add LeakyRelu MKLDNN support (#18656 ) test=develop	6 years ago
zhouwei25	772e09560e	Optimize the content of error reporting information, print error code and official document web sites (#18671 ) optimize the error reporting information of cuda related API index on develop: 130ac17 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into develop	6 years ago
Zeng Jinle	ae58afc546	Feature/auto_growth_allocator (#18561 ) * feature/auto_growth_allocator, test=develop * add unittest of AlignedAllocator, test=develop * try to turn on auto_growth to test on CI, test=develop * fix segmentation fault in mixed_vector.h, test=develop * add unittests, test=develop	6 years ago
hutuxian	bb2f5d24a2	hash_op support int64 hash_size (#18674 ) * hash_op support int64 hash_size * add corresponding UT	6 years ago
guru4elephant	5ed713d519	remove ctr reader, all functions are satisfied in dataset (#18672 ) * remove ctr reader, all functions are satisfied in dataset	6 years ago
guru4elephant	d714bf037c	remove async executor and add data_feed.proto to the deps of train demo (#18659 ) * remove async executor and add data_feed.proto to the deps of train demo	6 years ago
Yang Zhang	ce1ec33299	Add cuda implementation for `prelu` backward pass (#18633 ) * Add GPU implementation for `prelu` backward pass test=develop * Fix logic error in `prelu` GPU backward and simplify a bit test=develop * Fix `prelu` backward CUDA implementation test=develop CPU version was not used actually, so test passed	6 years ago
石晓伟	25d8079140	Fix Bitmain Predictor::Clone() (#18599 ) * update anakin-engine interfaces for content-dnn test=develop * support only-gpu mode of Anakin modify eltwise parse test=develop * modification for thread-safe test=develop * Integrated template instance test=develop * increase template parameters test=develop * support MLU predictor test=develop * update anakin cmake files test=develop * update TargetWrapper::set_device * update the initialization of anakin subgraph test=develop * use the default constructor of base class test=develop * load model from buffer with length test=develop * modify the access level of class test=develop * support anakin for bitmain arch test=develop * remove files * checkout cmakelists test=develop * modify interfaces test=develop * add cmake dependments test=develop * enforce the outputs of net test=develop	6 years ago
Yihua Xu	97549a4f13	[CPU] Fix the compiling issue with AVX512F macro. (#18634 )	6 years ago
baojun	256ba7cbb8	[NGraph] handle dim element 0 of ngraph op (#18568 )	6 years ago
chengduo	a6d468a265	fix PE fetch bug (#18644 ) test=develop	6 years ago
liuwei1031	759530966c	print out error code of cudaGetDeviceProperties if failed (#18643 )	6 years ago
Jacek Czaja	71d883b8ef	[MKL-DNN] Reimplemented pool2d mkl-dnn to use Acquire API (#18585 ) * - Added partial draft of pooling acquire - Workspace support - compilation fix - Added draft of pooling backward reimplementation - Segfault fix - reverted 'any' for diff_dst crewation in pooling - Lint fixes test=develop - lint fixes test=develop - Further lint fixes test=develop * - Fixes after review test=develop * - Lint fixes test=develop * - Even more lint fixes test=develop	6 years ago
chengduo	f4ec7d54c8	fix bug of scatter op (#18640 ) test=develop	6 years ago
tianshuo78520a	112cf850b7	change pip install whl;test=develop (#18635 )	6 years ago
guru4elephant	ab57d3893e	make auc op compatible with 1 dim (#18551 ) * make auc op compatible with 1 dim	6 years ago
tianshuo78520a	de22215c8f	change const_cast error message (#18620 )	6 years ago
Leo Zhao	ff77dea969	not use transferscope cache in cpu case (#18578 ) * not use transferscope cache in cpu case test=develop * adjust variable name and add comments test=develop * use correct format for class member in operator.h * use correct format for class member in operator.cc test=develop	6 years ago
123malin	b414645a65	fix #17430 : int64类型的attr训练非预期 (#18264 ) * fix int64_t * update fill constant op unittest * add empty line	6 years ago
tangwei12	db212bb932	delete AllocatorFacade destructor (#18606 ) * delete m, test=develop	6 years ago
Kevin	995d7d8600	Modify embedding_op input dtype to int64 (#18598 )	6 years ago
kh2se2013	9ad57f2dfd	1）change to parallel mode on python coverage run (#18594 ) 2）add pip install coverage in Dockerfile.tmp test=develop	6 years ago
Tao Luo	076f833110	add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy (#18580 ) * add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy test=develop * enhance MkldnnPostReset test=develop * add comments for mkldnn_cache_capacity field test=develop	6 years ago
Hongyu Liu	a20b2b43fc	fix cudnn lstm shape bug; test=develop (#18492 )	6 years ago
gongweibao	c0a82748cf	Polish backwards optimizer dependency codes and use more default values. (#18255 )	6 years ago
Zeng Jinle	d3003a1620	Feature/buffer_shared_inplace (#17911 ) * feature/buffer_shared_inplace, test=develop * refine code, test=develop * fix elementwise_add op cpu inplace and sum inplace bug, test=develop * add unittest and debug log, test=develop * fix parallel_executor scope bug, polish code, test=develop * fix sum op, activation op, single_in_place_inference bug, test=develop * remove kLocalExecScopeName, test=develop * fix unittest,test=develop * fix out_var first version bug, test=develop * follow comments,test=develop	6 years ago
tianshuo78520a	1c10dac4f2	Add code example in CI (#18228 ) * test api example * update python * add sampcd_processor.py * add if 0 * sort * test paddle * test paddle * test paddle * add whitelist * change sampcd_processor.py * change sampcd_processor.py * change sampcd_processor.py * add exit * test=develop * test=develop	6 years ago
Zeng Jinle	be24e5b391	Clean unused code of dim and place (#18565 ) * clean code of dim and place, test=develop * fix failed unittests, test=develop	6 years ago
Jacek Czaja	8869d7f735	Activations MKLDNN ops refactoring (#18191 )	6 years ago
lujun	b6d5c74f69	update dygraph api doc for web (#18550 ) remove dygraph.enable from __all__ hidden dygraph. profiler add doc to dygraph. no_grad	6 years ago
Yibing Liu	b86234fc0b	Register fp16 for concat_op (#18563 )	6 years ago
Physher	5e1220ef37	fix compile error which caused by gcc4.8 related commit;test=develop (#18567 )	6 years ago
Jiabin Yang	667f88f9a6	Fix/gcc 4.8 ubt link error (#18558 ) * test=develop, fix docker with paddle nccl problem * test=develop, fix/gcc_4.8_ubt_link_error * test=develop, fix code format	6 years ago
Physher	0caa08ea40	Add mkldnn int8 mul-op kernel (#17834 )	6 years ago
LielinJiang	24d1c44a0c	Fix roi_perspective_transform_op bug (#18522 ) * fix transform matrix bug, test=develop * modify API.spec	6 years ago
Zhaolong Xing	88b52a27fe	Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532 ) * Fix Mask rcnn predictor 1. refine memory optim algorithm to support the model with the block op. 2. output diff : modify the affine channel fuse 3. add condition_block_infer op add interface for setting trt calib table dir test=develop * add the missing files. test=develop	6 years ago
石晓伟	1529154821	Support Bitmain Anakin (#18542 ) * update anakin-engine interfaces for content-dnn test=develop * support only-gpu mode of Anakin modify eltwise parse test=develop * modification for thread-safe test=develop * Integrated template instance test=develop * increase template parameters test=develop * support MLU predictor test=develop * update anakin cmake files test=develop * update TargetWrapper::set_device * update the initialization of anakin subgraph test=develop * use the default constructor of base class test=develop * load model from buffer with length test=develop * modify the access level of class test=develop * support anakin for bitmain arch test=develop * remove files * checkout cmakelists test=develop	6 years ago
tianshuo78520a	9b3d3b8387	Cancel jacquesqiao approval authority (#18538 )	6 years ago
Leo Zhao	ce38bb5341	use static variable to do cache instead of thread local in thread frequent switching case (#18428 )	6 years ago
gongweibao	160ddc980c	Regroup fusion by date type. (#18496 )	6 years ago
Tao Luo	fe32879d2a	add mkldnn shapeblob cache clear strategy (#18513 ) * add mkldnn shapeblob cache clear strategy test=develop * refine with comments test=develop * make cache clear strategy more safey test=develop * add lock for GetShapeBlobSize test=develop	6 years ago
chengduo	e576f2667b	update docker build (#18523 ) test=develop	6 years ago
zhaoyuchen2018	832d8191ff	Fix topk cannot handle 1D vector bug (#18466 ) * Fix topk cannot handle 1D vector bug Add path to handle 1D vector test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	6 years ago
石晓伟	280a8784f7	Remove the obsolete cmake options (#18493 ) * remove the obsolete cmake options, test=develop * remove unittests, test=develop * delete options in paddle/scripts/paddle_build.sh	6 years ago
LielinJiang	43e17c7951	Add distributions of normal and uniform (#18023 ) * add_distributions_of_normal_and_uniform * paddle/fluid/API.spec * modify API.spec * modified paddle/fluid/API.spec, test=develop * modify paddle/fluid/API.spec, test=develop * modify paddle/fluid/API.spec, test=develop * fix some comment, test=develop * modify API.spec, test=develop * add comment for init function, modify hard code, test=develop * modify API.spec, test=develop * modify API.spec, test=develop * make unit test function shorter, test=develop * modify paddle/fluid/API.spec	6 years ago
bingyanghuang	3fe6bf5ee6	fix command line bug in int8v2 readme (#18507 )	6 years ago
tensor-tang	4828a5e008	core remove pycpuinfo (#18479 ) remove pycpuinfo deps in core	6 years ago
qingqing01	7ac4818a98	Refine Infershape in activation_op for double_grad. (#18485 ) * Refine Infershape in activation_op for double_grad.	6 years ago
qingqing01	602cb6a5b4	Enhance linear_lr_warmup (#18463 ) * make it support float/int learning as input.	6 years ago
chengduo	7453857324	Make fuse_all_reduce_op_pass support mix_precision (#17652 )	6 years ago
chengduo	55baeceddb	Enhance execution error info (#18482 ) * enhance execution error info test=develop	6 years ago
石晓伟	047bba855b	Remove the obsolete cmake options (#18481 ) * remove the obsolete cmake options, test=develop * remove unittests, test=develop	6 years ago
pkpk	e9c7e218f2	Nan debugger init (#18401 ) test=develop	6 years ago
Jiabin Yang	f72ced8814	test=develop, fix docker with paddle nccl problem (#18451 )	6 years ago
Tao Luo	3f3112ceb0	add shape_blob for cache mkldnn primitive (#18454 ) test=develop	6 years ago
Tao Luo	d234aa02cd	add transfer_scope_cache unit-test (#18467 ) test=develop	6 years ago
zhoukunsheng	7c6f2350b9	support Tensor input for edit_distance op (#18162 )	6 years ago
zhoukunsheng	26318544d2	support Tensor input for chunk_eval op (#18226 ) * test=develop support Tensor input for chunk_eval op * test=develop fix testcase for chunk_eval op * test=develop fix typos in nn.py	6 years ago
zhoukunsheng	206c44e2a8	add unique kernel and op (#17557 )	6 years ago

1 2 3 4 5 ...

15562 Commits (f86fead6938efc8735412bd3489dc17a609e373c)