Paddle

Commit Graph

Author	SHA1	Message	Date
hutuxian	c756b5d231	Paddlebox Framework (#18982 ) * Support looking up embeddings from BoxPS. * Add a _pull_box_sparse op, for now this op is not exposed to users. * Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on. * Add 'BoxPSDataset' in python code. * Add a compile options WITH_BOX_PS and a MACRO PADDLE_WITH_BOX_PS. * Add UT. * More concrete information pls refer to: https://github.com/PaddlePaddle/Paddle/pull/18982	6 years ago
Thunderbrook	1fe468d319	support debug each output of each ins (#19004 ) * dump slot * test * proto * dump slot * test * proto * code style * code style * code style * style * add delete after unseen days * add unseen days * code style * conflict solve test=develop * add clear model * code style test=develop * code style test=develop * support debug tensor of each ins test=develop * support debug tensor of each ins test=develop * learning rate * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style test=develop * code style test=develop * unitest * style * style * multi phase * add channel * code style * style * style * unitest * style * define * define test=develop * style test=develop * rm define test=develop * linux * linux test=develop * style test=develop * output format test=develop * windows ci test=develop	6 years ago
Leo Chen	6fb310ae29	Fix bug of getting bool Flags from os.environ (#19349 ) * fix bug of getting bool Flags from os.environ, test=develop * add empty loss_name in CompiledProgram for inplace grad test, test=develop	6 years ago
liu zhengxi	32598ffd8f	Python infer api update and add unit test (#19353 ) * python inference api supports numpy and add unit test, fix unit test fail in test_slim_int8_googlenet and test_slim_int8_mobilenet	6 years ago
Leo Chen	a9d5fc5142	Enhance OpTest to check the consistency of operators when using and not using inplace (#19101 ) * add pybind interface to get all inplace ops, test=develop * enhance OpTest to check whether the consistency of operator when using and not using inplace, test=develop * handle corner cases in op_test, test=develop * support outputs without tensor holder_, like XShape in reshape_op, test=develop * fix bug, some op has GradOpMaker, but actually no grad_op in OpInfoMap, test=develop * use reshape_grad instead of reshape in FlattenGradOp, test=develop * fix error debug dims info for variables like XShape, test=develop * change computational order in sum_op to relieve computation difference using inplace, test=develop * add inplace_atol to check group_norm, and skip inplace_grad for mkldnn, test=develop * follow sneaxiy's comments, test=develop * remove unused DefaultGradOpDescMaker in mkldnn op, test=develop	6 years ago
Zeng Jinle	5b6673c44d	merge develop to solve conflict, also fix API doc, test=develop (#18823 )	6 years ago
Tao Luo	5f5648a8ff	Revert "Python inference API support numpy (#19009 )" (#19160 ) test=develop	6 years ago
flame	b7e1a1d7e7	Python inference API support numpy (#19009 ) test=develop	6 years ago
yaoxuefeng	9150cf50fc	add save cache model api in fleet& add slots shuffle in dataset module & add metric op to calculate ctr related metrics (#18871 ) * add ctr related metric layer test=develop * add save cache and slots shuffle test=develop * add save cache and slots shuffle test=develop * fix error * fix error * fix style for ci * fix for comments * change SlotsShuffle input to std::strinf for generality * fix style * fix style * fix style * fix style * fix style * fix style * fix stylr * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * change non-const reference to pointer * fix style * fix style * fix style test=develop * fix style test=develop * add return ins num in ctr metric op * change dtype to float in metric_op.py * fix error test=develop * fix style test=develop * fix API spec * fix API spec * fix API spec test=develop * add UT test=develop	6 years ago
Zeng Jinle	88f111f885	remove unused inplace act codes, test=develop (#19079 )	6 years ago
jiaqi	a99bc64c63	add fleet util, add some interface in hdfs util (#18752 ) * add fleet util (fleet/utils/fleet_util.py): functions for users' convenience * add some interface in hdfs util : hdfs is_file, hdfs cat	6 years ago
Leo Chen	8f53735437	Fix memory overwriting of tensors returned by executor (#19030 ) * fix memory overlapping of fetch var (return of executor.run), test=develop * fix wrong usage of ParallelExecutor in op_test, test=develop * remove useless parameter and simplify code * avoid tensor destruct untimely, test=develop * add testcase independent of OpTest, test=develop	6 years ago
liuwei1031	a43a763b54	fix warpctc.dll not found issue (#18761 ) * fix warpctc.dll not found issue, test=develop * revert the linux platform change, test=develop * delete warpctc_lib_path.h.in, test=develop * add SetPySitePackagePath function * fix warpctc.dylib not found issue on Mac, test=develop * improve the paddle lib path setting logic, test=develop * fix mac ci issue caused by test_warpctc_op unittest, test=develop * tweak code, test=develop	6 years ago
flame	65d987527d	python inference enable_memory_optim(#18817 ) python inference API support enable_memory_optim	6 years ago
Zhaolong Xing	61238d31f7	Trt fp16 support (#18860 ) * Fix Mask rcnn predictor 1. refine memory optim algorithm to support the model with the block op. 2. output diff : modify the affine channel fuse 3. add condition_block_infer op add interface for setting trt calib table dir test=develop * add the missing files. test=develop * 1 add trt fp16 support test=develop	6 years ago
chengduo	20859c08e8	[DyGraph] Make multi-card program faster (#18892 ) * update parallel.py test=develop	6 years ago
Zeng Jinle	8008ab4e6b	Remove legacy C++ memory optimization codes (#18834 ) * remove legacy memory optimization codes, test=develop * follow huihuang's comments,test=develop * follow luotao's comments, test=develop	6 years ago
Thunderbrook	52c1431eee	add clear_model interface in fleetwrapper (#18815 ) * dump slot * test * proto * dump slot * test * proto * code style * code style * code style * style * add delete after unseen days * add unseen days * code style * conflict solve test=develop * add clear model * code style test=develop * code style test=develop	6 years ago
chengduo	292dfbce63	fix build strategy doc (#18725 ) test=develop	6 years ago
jiaqi	d18aabb472	support patch data, add load_one_table, fix bug (#18509 ) (1) support patch data (merge slots of instances of same line id, modify dense layer which changes its size) (2) add fleet load_one_table interface, support load from paddle model and load from pslib model (3) fix push sparse bug which cause push sparse cost more time (about 10% in my testcase) (4) when some slots are not in one of your network (join/update, etc.), data feed, collect label info, push/pull sparse will skip these slots, instead of throw error. (5) add more debug info in TrainFilesWithProfiler	6 years ago
chengduo	fd3aad6cb3	Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664 ) * support sparse gradients test=develop	6 years ago
Zeng Jinle	ae58afc546	Feature/auto_growth_allocator (#18561 ) * feature/auto_growth_allocator, test=develop * add unittest of AlignedAllocator, test=develop * try to turn on auto_growth to test on CI, test=develop * fix segmentation fault in mixed_vector.h, test=develop * add unittests, test=develop	6 years ago
guru4elephant	d714bf037c	remove async executor and add data_feed.proto to the deps of train demo (#18659 ) * remove async executor and add data_feed.proto to the deps of train demo	6 years ago
123malin	b414645a65	fix #17430 : unexpected training behavior with int64-type attr (#18264 ) * fix int64_t * update fill constant op unittest * add empty line	6 years ago
gongweibao	c0a82748cf	Polish backwards optimizer dependency codes and use more default values. (#18255 )	6 years ago
Zeng Jinle	d3003a1620	Feature/buffer_shared_inplace (#17911 ) * feature/buffer_shared_inplace, test=develop * refine code, test=develop * fix elementwise_add op cpu inplace and sum inplace bug, test=develop * add unittest and debug log, test=develop * fix parallel_executor scope bug, polish code, test=develop * fix sum op, activation op, single_in_place_inference bug, test=develop * remove kLocalExecScopeName, test=develop * fix unittest,test=develop * fix out_var first version bug, test=develop * follow comments,test=develop	6 years ago
Zhaolong Xing	88b52a27fe	Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532 ) * Fix Mask rcnn predictor 1. refine memory optim algorithm to support the model with the block op. 2. output diff : modify the affine channel fuse 3. add condition_block_infer op add interface for setting trt calib table dir test=develop * add the missing files. test=develop	6 years ago
Yi Liu	a873fa84ce	supports collective training with programs (#18392 ) 1. Since allreduce op has 4 reduce types, We split these four reduce types into four ops 2. We also refined the collective op code, e.g. we separated the collective op kernel into CPUKernel and CUDAKernel, and remove the device specified DeviceContext parameter in template as we already knew the target DeviceContext 3. We remove the newly added Collective op role to reduce the complexity of program and graph analysis	6 years ago
xsrobin	47e2ef38e9	add "import paddle.fluid as fluid" to examples lack of it	6 years ago
lujun	fd6631ef2f	Fix dygraph show style (#18297 ) Fix dygraph show style for FluidDoc.	6 years ago
tangwei12	999d9a59a5	fix communicator with pyreader (#18350 ) * add is_runnning in communicator, test=develop	6 years ago
HaoRen	b7128bac5f	supports collective communicated training (#18175 ) * fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop * supports collective training in executor * make fetch_list runable with variables, add more unittest for use_program_cache test=develop * fix comment test=develop * use unique name for nccl_id * supports output to stream in program_to_code * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code * set op role in collective training * add collective op role * remove orig file * add build optimizer by strategy * add collective strategy * refine collective strategy * add multi-process role maker * refine strategy building factory so that we can easily plugin more strategy * scale loss grad in collective sgd transpiler * add support for distributed fc * code format * revert some features for dist fc * add support for distributed fc training * fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop * supports collective training in executor * make fetch_list runable with variables, add more unittest for use_program_cache test=develop * use unique name for nccl_id * supports output to stream in program_to_code * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code * set op role in collective training * add collective op role * fix comment test=develop * remove orig file * add build optimizer by strategy * add collective strategy * refine collective strategy * add multi-process role maker * refine strategy building factory so that we can easily plugin more strategy * scale loss grad in collective sgd transpiler * add support for distributed fc * code format * revert some features for dist fc * add support for distributed fc training * test=develop add collective op unittest standard * test=develop remove the test_collective directory * test=develop remove the test_collective directory * remove slicegather test * code format for reducescatter * update attr of shard_index_op * Modify macro nccl_helper * remove test without distribute * macro collective_helper * marcro update * test=develop update support python3.5 * test=develop change gpu memory use to 0.1 when test * test=develop update ut equal func * test=develop set flags to 1.5 * test=develop fix pickle dumple py35 * test=develop fix divide in slice and add sync_comm_stream update atol and rtol to 1e-05 rm shard_index op and test modify read input from file to read from memory remove origin_program in framework and add i/o in c_sync_calc_stream * test=develop update unittest sync operator I/O	6 years ago
Zeng Jinle	5826b72e06	Refine CUDAPlace error message. (#18343 ) * refine cuda place error msg, test=develop * use LOG(ERROR)+exit(-1), test=develop	6 years ago
jiaqi	3f8031e256	dataset (#17973 ) (1) use channel instead of vector/BlockingQueue in Dataset, to keep same with existing implementation, and make code more readable and flexible (dataset single output channel or multi output channel). one previous memory out of limit problem is cause by not release memory after training. (2) add Record because MultiSlotType costs too much memory (80B), fix memory out of limit problem. (3) add Channel, Archive in paddle/fluid/framework (4) change dataset from shared_ptr to unique_ptr in pybind (5) move create/destroy readers from trainer to dataset (6) move shuffle from datafeed to dataset. dataset holds memory, datafeed is only for load data and feed data to network. (7) fix thread num bug of Dataset when filelist size < thread num (8) support set_queue_num in InMemoryDataset	6 years ago
chengduo	25f3cd6486	Update execution_strategy option default value (#18183 ) * update execution_strategy option default value test=develop * fix doc error test=develop	6 years ago
Zeng Jinle	25ab23be28	Fix dygraph mem leak (#18082 ) * fix dygraph mem leak, test=develop * polish msg, test=develop	6 years ago
Sylwester Fraczek	accb132f0f	fix slim int8 mkldnn multithreading issue (#18009 )	6 years ago
tensor-tang	5c06bff222	combine noavx and avx package (#17889 ) * support avx and noavx core * add catch and give some log test=develop * fix build test=develop * add missing package test=develop * fix pybind name test=develop * fix import error test=develop * conbime noavx core test=develop * add requirements test=develop * fix unkown message test=develop * fix api spec test=develop * refine and clean test=develop * update * pass dist ut * follow comments test=develop * refine scripts test=develop	6 years ago
Jiabin Yang	4d5f6937c3	Feature/refine api for dygraph (#17907 ) * WIP * WIP * test=develop, add api doc and example code for dygraph	6 years ago
gongweibao	fbbdc9ccad	Add backward and optimizer operator dependency pass. (#17746 )	6 years ago
wopeizl	453a49b1bc	Make ParallelExecutor support Windows GPU (#17787 ) * fix the ParallelExecutor on Windows test=develop * restrict to use one GPU only under windows	6 years ago
翟飞跃	993c703bcc	INT8 MKL-DNN v2 integrate to slim (#17634 ) * refactor PR 16865 * delete mergetool files * test=develop * test=develop * test=develop * test=develop * create dir for int8 model before call SaveOptimModel * test=develop * mkldnn int8 only support linux; test=develop * refine code; test=develop * remove comment; test=develop * refine code; test=develop * fix bug; test=develop * add exception for mkldnn_post_training_strategy * reuse int8v2 CAPI dataset; test=develop * fix accuracy check bug; test=develop * remove tab * convert files to unix format * test=develop * reduce CI time;test=develop * reduce CI time and refine code;test=develop * refine comment; test=develop * add cmake FLAGS;test=develop * remove predict_num;test=develop	6 years ago
wopeizl	841553e13f	use pyreader to read data in dygraph mode (#17314 ) * use pyreader to read data * add return_list to PyReader to support return value represented as list	6 years ago
Zeng Jinle	674e0ce2d6	Use Python C-API to speed up dygraph trace (#17837 ) * use python api to reduce python time cost, test=develop * fix travis ci, test=develop * fix Py_None error,test=develop	6 years ago
Jiabin Yang	3b70f870e2	Using Smart pointer to optimizer memory usage of dyGraph (#17768 ) * for debug * test=develop, memory optimize for dygraph using shared_ptr * test=develop, fix travis ci showed error * test=develop, fix bug for recurrent usage of varbase * test=develop, init varbase when it need to be Add	6 years ago
guru4elephant	d52391094d	fix prepare context redundant code problem, optimize executor by cach… (#17743 ) * fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop * cache sub_scope, program, var when use_program_cache=True is set * make fetch_list runable with variables, add more unittest for use_program_cache	6 years ago
Zeng Jinle	432ac70124	clean code of py_layer in dygraph mode,test=develop (#17661 )	6 years ago
gongweibao	65bbf950ee	Add multi-ncclcomm and 2D ncclallreduce support. (#17263 )	6 years ago
Zhaolong Xing	61221ebc28	TRT: Support set dynamic range in int8 mode. (#17524 ) * fluid int8 train and trt int8 predict align. trt int8 predict init op converter * 2. align fluid int8 train and trt int8 inference. enhance quant dequant fuse pass enhance op converter, trt engine, trt engine op, trt subgraph pass. * 3. add delete_quant_dequant_pass for trt test=develop * 4. add the missing file test=develop * 5. i modify the c++ interface, but forget to modify the pybind code fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter test=develop	6 years ago
wopeizl	6724a652f3	add __str__ method for tensor and lodtensor to support print test=dev… (#17588 ) * add __str__ method for tensor and lodtensor to support print test=develop	6 years ago
guru4elephant	326bf8291a	add Run Prepared Ctx (#17616 ) add Run Prepared Ctx, fix pybind problem	6 years ago
flame	2280f185d7	BuildStrategy api comment (#17348 ) Python examples of fluid.layers.io.double_buffer and some BuildStrategy's methods.	6 years ago
guru4elephant	7f8bc49d00	polish_executor_and_add_ctx_cache (#17536 ) * polish_executor_and_add_ctx_cache	6 years ago
Zeng Jinle	c6189637cd	Fix allocator bug (#16712 ) * Revert "Revert "Fix allocator bug"" This reverts commit `174d0d0b90`. * Revert "fix travis ci" This reverts commit `5656fa9f7c`. test=develop * add inlined_vector.h, test=develop * add inlined_vector_test,test=develop	6 years ago
Qiao Longfei	92e7d5d7cc	fix distribute doc test=develop (#17318 ) * fix distribute doc	6 years ago
Qiao Longfei	58f7695ab2	Async exe support communicator (#17386 ) Async exe support communicator	6 years ago
Tao Luo	32da5e9c3d	remove unused expected_kernel_cache_pass (#17486 ) test=develop	6 years ago
Yan Xu	0217555530	polish parallel dygraph code (#17164 ) * add var grad hook test=develop	6 years ago
Jiabin Yang	d7df4e5e5b	Fix/Fix memory leak in dygraph (#17394 ) * test=develop, add gradient sort backward strategy * test=develop, fix test by add FLAGS_cudnn_deterministic on new tests * test=develop, fix memory leak in dygraph mode * test=develop, fix memory leak in dygraph mode * test=develop, polish code * test=develop, polish code * test=develop, polish code	6 years ago
Zhen Wang	4a1b7fec96	Add setting Scope function for the graph class (#17417 ) * add set_not_owned function for graph * add scope set. test=develop * add scope_ptr enforce not null before setting.test=develop	6 years ago
jiaqi	66d51206b1	add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118 ) * add save/load model, shrink table, cvm, config file & fix pull dense bug test=develop * fix global shuffle bug, fix pull dense bug, fix release memeory bug, fix shrink error add client flush, add get data size test=develop * fix global shuffle bug test=develop * fix global shuffle bug test=develop * fix code style test=develop * fix code style & modify pslib cmake test=develop * fix error of _role_maker test=develop * fix code style test=develop * fix code style test=develop * fix code style test=develop * fix code style test=develop * fix code style test=develop * fix windows compile error of fleet test=develop * fix global shuffle bug * add comment test=develop * update pslib.cmake test=develop * fix fill sparse bug test=develop * fix push sparse bug test=develop	6 years ago
Tao Luo	68ec0a6f74	make parallel_executor support FLAGS_use_mkldnn (#17341 ) * make parallel_executor support FLAGS_use_mkldnn test=develop * add warning when set mkldnn_enabled_op_types_ in non-mkldnn env test=develop	6 years ago
Jiabin Yang	4624d7c642	test=develop, add gradient sort backward strategy (#17125 ) * test=develop, add gradient sort backward strategy * test=develop, fix test by add FLAGS_cudnn_deterministic on new tests	6 years ago
chengduo	bc833945a4	Add DropLocalExeScopes in ParallelExecutor (#17297 ) * reset drop local scope counter test=develop	6 years ago
qingqing01	e32c9888f5	Double backward of conv2d. (#17211 ) * Add conv2d_grad_grad_op * Extract the cuDNN conv algo searching code in conv_cudnn_helper.h. - Now use it in conv2d_grad_grad. - Will simplify the searching code in conv2d and conv2d_grad in next PR. * Enhance and fix bug in unit testing of gradient_checker. * Support to fetch empty variables, return None in Python.	6 years ago
lujun	e388a1fb66	Repair api example (#17221 ) Fix the following API examples: paddle.fluid.scope_guard paddle.fluid.backward.append_backward paddle.fluid.cpu_places paddle.fluid.cuda_pinned_places paddle.fluid.cuda_places paddle.fluid.in_dygraph_mode paddle.fluid.CUDAPlace paddle.fluid.CPUPlace paddle.fluid.CUDAPinnedPlace	6 years ago
chengduo	04bd413acb	Code Clean: Move all pass to paddle::framework::ir (#17228 ) * move pass to ir * polish code test=develop * fix dependency test=develop	6 years ago
Zeng Jinle	f2fa3f7300	fix api doc,test=develop (#17241 )	6 years ago
石晓伟	a72dbe9abf	Cherry-pick benchmark related changes from release/1.4 (#17156 ) * cherry-pick commit from `8877054` * cherry-pick commit from `3f0b97d` * cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn (cherry picked from commit `8643dbc233`) * Cherry-Pick from 16662 : Anakin subgraph cpu support (cherry picked from commit `7ad182e16c`) * Cherry-pick from 1662, 16797.. : add anakin int8 support (cherry picked from commit `e14ab180fe`) * Cherry-pick from 16813 : change singleton to graph RegistBlock test=release/1.4 (cherry picked from commit `4b9fa42307`) * Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2 Support ShuffleNet and MobileNet-v2, test=release/1.4 (cherry picked from commit `a6fb066f90`) * Cherry-pick : anakin subgraph add opt config layout argument #16846 test=release/1.4 (cherry picked from commit `8121b3eccb`) * 1. add shuffle_channel_detect (cherry picked from commit `6efdea8997`) * update shuffle_channel op convert, test=release/1.4 (cherry picked from commit `e4726a066f`) * Modify symbol export rules test=develop	6 years ago
Zeng Jinle	c5eeecca7c	Fix tensor_py.h (#17195 ) * fix tensor_py,test=develop * change class name,test=develop	6 years ago
Zeng Jinle	5dfe2ab9e8	Fix mem leak when converting Tensor to numpy array (#17182 ) * fix mem leak when converting Tensor to numpy array test=develop * remove unused unittest,test=develop * follow comments, test=develop * fix dygraph bug,test=develop	6 years ago
Yan Xu	0b07eef118	ParallelDyGraph with GPU collective mode (#16827 ) implement dygraph.parallel.DataParallel to hook reduce op.	6 years ago
guru4elephant	03d469ad98	Merge pull request #17005 from wopeizl/fix_ncclwrapper_win1 fix nccl wrapper on windows	6 years ago
liuwei1031	a770ce0615	add doc for memory_optimize, test=develop (#17010 ) * add doc for memory_optimize, test=develop * update doc, test=develop * doc update, test=develop	6 years ago
qingqing01	ea42e431f8	Speed unit testing. (#16978 ) * Speed affine_channel_op unit testing * Add check in tensor_py * Fix ONLY_CPU Compiling	6 years ago
wopeizl	51a0243a56	fix nccl wrapper on windows test=develop	6 years ago
Zeng Jinle	1202d3fc74	Refine model gpu memory (#16993 ) * speedup gc and inplace softmax_with_cross_entropy_grad test=develop * refine models gpu mem Merge skip vars and warning messages of mem opt remove relu mem opt test=develop * follow comments test=develop	6 years ago
guru4elephant	bbc6c5714f	Merge pull request #16887 from guru4elephant/add_nccl_context_pybind Add nccl context pybind	6 years ago
gongweibao	cbdb8a17b1	Polish DGC code (#16818 )	6 years ago
dongdaxiang	466d177d09	add pybind dependency test=develop	6 years ago
dongdaxiang	4aa6f679b5	add pybind dependency test=develop	6 years ago
dongdaxiang	b091139049	add nccl wrapper for python API	6 years ago
Yiqun Liu	112f16143b	Add an option to enable the cache of expected kernel in train phase. (#16724 ) * Add an option to enable the cache of expected kernel in train phase. test=develop * Change the default value of cache_expected_kernel to true.	6 years ago
chengduo	55b15db5af	Add unit test for fuse all_reduce ops (#16699 ) * test fuse all_reduce	6 years ago
Yiqun Liu	3fe8cb0dd7	Enable the runtime_context_cache pass in train phase (#16640 ) * Try to enable the runtime_context_cache pass in train phase. * Put the append of runtime_context_cache pass ahead of multi_dev passes. test=develop	6 years ago
guru4elephant	7d653f0aed	Merge pull request #16652 from xjqbest/dataset_merge_develop fix dataset bug	6 years ago
xjqbest	6a57e8075a	remove trainer_id in datafeed and dataset test=develop	6 years ago
Yan Xu	b4c3a6aa0b	[Imperative] implement imperative NCCLParallelContext (#16477 ) add NCCLParallelContext for parallel dygraph	6 years ago
xjqbest	271b7147cc	fix dataset bug test=develop	6 years ago
chengduo	b75a69bad6	Add Stream for fetch op handle (#16600 ) * expose fuse broadcast ops	6 years ago
乔龙飞 Qiao Longfei	21622ca30b	Merge pull request #16172 from jacquesqiao/add-async-ssa-graph-executor-communicator Add async ssa graph executor communicator	6 years ago
sneaxiy	10249c0b78	Merge develop test=develop	6 years ago
Qiao Longfei	adf272bcec	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator test=develop	6 years ago
xjqbest	9b84e8e66b	fix code style test=develop	6 years ago
xjqbest	a99c8d0c29	fix client to client communication bug test=develop	6 years ago
sneaxiy	33473890f3	Merge develop test=develop	6 years ago
dongdaxiang	720647e17f	rebase current develop and fix conflict test=develop	6 years ago
dongdaxiang	45eb6f0765	run pre-commit check files and fix code style problem test=develop	6 years ago
xjqbest	e95cafd9a7	fix code style & add dataset testcase test=develop	6 years ago
xjqbest	be74de2c61	fix code style & fix register bug & add release_memory test=develop	6 years ago
xujiaqi01	a5b1a0e12b	support multi dataset && add init model && fix bug	6 years ago
dongdaxiang	b7a202aa38	add distributed optimizer factory	6 years ago
dongdaxiang	f612877797	add incubate for unified API	6 years ago
dongdaxiang	317eb0aad3	add incubate for unified API	6 years ago
xujiaqi01	ecfc7df913	add dataset factory && fix style	6 years ago
xujiaqi01	3cea00bd52	store memory data in Dataset && fix bug	6 years ago
dongdaxiang	cc4def6ba5	fix some conflict for compilation	6 years ago
heqiaozhi	9bca1926c1	refactor & fix bug	6 years ago
xjqbest	2e9a836c6f	add DataSet and InMemoryDataFeed, support load data into memory and shuffle data	6 years ago
dongdaxiang	e36bbcc871	fix some typo and CMakefile.txt	6 years ago
xjqbest	824b84d185	add DataSet and InMemoryDataFeed, support load data into memory and shuffle data	6 years ago
dongdaxiang	be757096da	add pybind for fleet	6 years ago
Qiao Longfei	d8974e6da0	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator test=develop	6 years ago
chengduo	1096746cbf	Fuse Adam And SGD ops (#15933 ) * fuse optimizer	6 years ago
sneaxiy	2c836ff914	check default grad maker test=develop	6 years ago
Zeng Jinle	69cb9792ea	Merge pull request #16506 from sneaxiy/revert-16424-fix_allocator_bug Revert "Fix allocator bug"	6 years ago
chengduo	ed61d67c73	Fix the interface of Pass::Apply (#16484 ) * modify the interface of Pass::Allay test=develop * Polish code test=develop * Fix Travis CI test=develop * fix Pass::Apply interface test=develop * Fix Travis CI test=develop	6 years ago
Zeng Jinle	174d0d0b90	Revert "Fix allocator bug" add include headers to fix travis-ci test=develop	6 years ago
gongweibao	eb83abeac3	Add DGC(Deep Gradient Compression) interface. (#15841 )	6 years ago
Zeng Jinle	644e8af4cf	Merge pull request #16424 from sneaxiy/fix_allocator_bug Fix allocator bug	6 years ago
Zeng Jinle	c7c6eeb44e	Merge pull request #16409 from sneaxiy/feature/advance_gc Enhance gc to support deleting tensor buffer in advance	6 years ago
wopeizl	c300b1ba69	Tensor index (#16223 ) * extend the slice function for python test=develop	6 years ago
Xin Pan	f8c279b11c	Merge pull request #16454 from panyx0718/imperative2 polish deepCF model to support real dataset	6 years ago
Qiao Longfei	30618409db	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator	6 years ago
chengduo	4f2278f032	Add doc for CPUPlace CUDAPlace CUDAPinPlace (#16442 ) test=develop	6 years ago
sneaxiy	78fb3a62e0	fix env variable settting bug test=develop	6 years ago
sneaxiy	2d92b6be98	merge develop test=develop	6 years ago
Xin Pan	fd24ab47ab	polish test=develop	6 years ago
sneaxiy	a7d0ac50b8	Merge develop	6 years ago
sneaxiy	7000ec85d9	fix some op grad maker fix ctest eager deletion disable bug test=develop	6 years ago
sneaxiy	f8ed2c229e	try to fix ci error test=develop	6 years ago
sneaxiy	c20db6357b	split PR test=develop	6 years ago
sneaxiy	2f54d9f995	Merge develop test=develop	6 years ago
sneaxiy	a93a9eef8f	add op registry type refine gc code test=develop	6 years ago
sneaxiy	953214ad97	add more unittest modify allocator strategy remove changes of legacy buddy_allocator test=develop	6 years ago
chengduo	f26ba5bddd	Fuse AllReduce (#15921 ) * fuse all_reduce test=develop * add fuse_parameter_groups_size test=develop * Polish code test=develop * Fix travis-ci test=develop * Add SetGroupAccordingToLayers and SetGroupAccordingToGroupSize test=develop * Add SetGroupAccordingToMemorySize test=develop * fix multi_devices_graph test=develop * reset params_grads test=develop * Polish code test=develop	6 years ago
Tao Luo	7d2740db83	Revert "cache runtime_context"	6 years ago
sneaxiy	fd23262e0c	merge develop, fix conflict test=develop	6 years ago
Qiyang Min	c7f1f3ed0c	Merge pull request #16214 from velconia/imperative_infer_var_type Implement imperative infer var type	6 years ago
Tao Luo	dbb92ee4b1	Merge pull request #16002 from luotao1/runtime_context cache runtime_context	6 years ago
sneaxiy	161b8ddcaa	Merge develop	6 years ago
minqiyang	b40e41fbd1	Polish code style test=develop	6 years ago
Qiyang Min	8e4ad008fb	Merge pull request #16198 from velconia/imperative_train_speed Improve imperative mode training speed	6 years ago
minqiyang	36dce65bb3	Take DataType and VarType apart test=develop	6 years ago
minqiyang	438bca9c3d	Implement Runtime Var Type Inference test=develop	6 years ago
luotao1	1b59bed989	Merge branch 'develop' into runtime_context	6 years ago
qingqing01	8ad672a287	Support sync batch norm. (#16121 ) * Support Sync Batch Norm. * Note, do not enable it in one device. Usage: build_strategy = fluid.BuildStrategy() build_strategy.sync_batch_norm = True binary = fluid.compiler.CompiledProgram(tp).with_data_parallel( loss_name=loss_mean.name, build_strategy=build_strategy)	6 years ago
minqiyang	7355d41834	1. Add imperative gperf profiler 2. Add binutils 2.27 in manylinux support test=develop	6 years ago
luotao1	b2898c0f57	Merge branch 'develop' into runtime_context test=develop	6 years ago
minqiyang	98dfb492bb	Release GIL lock	6 years ago
sneaxiy	ac0e0f5181	merge develop test=develop	6 years ago
minqiyang	42e96a029f	Accelerate CPU part	6 years ago
sneaxiy	682f2dbf29	merge develop test=develop	6 years ago
sneaxiy	2c4fcaa683	merge develop	6 years ago
luotao1	d94fd97230	add runtime_context_cache_pass test=develop	6 years ago
Yan Xu	30568473ec	fix broadcast on mp mode (#15951) * fix broadcast with mp mode * polish code test=develop * fix bcast strategy test=develop * fic cpplint test=develop * fix py3 failed test=develop * fix comment test=develop * update comment test=develop	6 years ago
baojun	e3c37bd564	remove const_cast and refactor ngraph engine code (#15925) * remove concast_cast and refactor code test=develop * reduce flag use test=develop	6 years ago
Zhen Wang	ac6ef06ffa	Add the Clone method in Graph. test=develop	6 years ago
Zhen Wang	01eddf125c	Not add graph copy construction method. test=develop	6 years ago
Zhen Wang	1b9c8d5f06	add clone function for IrGraph. test=develop	6 years ago
Qiyang Min	1f4aa7a202	Imperative remove all descs (#16045) * Remove Desc in Forward Pass * Refactor VarBase * Add dbg info * Only check type in imperative mode * Polish code and support optimizer test=develop * Fix stop gradient problem in PyLayer test=develop	6 years ago
Zeng Jinle	472f16b5aa	Merge pull request #16063 from sneaxiy/enhance_gc Enhance gc	6 years ago
wopeizl	a38db3cb99	Fixrecordio (#16124) * fix recordio on win test=develop * test=develop * test=develop * fix code style test=develop * test=develop	6 years ago
sneaxiy	b80d76f784	merge develop	6 years ago
sneaxiy	732fa00eaf	disable gc in recurrent_op currently test=develop	6 years ago
Tao Luo	6f2581e4c5	Merge pull request #16090 from lidanqing-intel/paddle-int32 Add PaddleDType INT32 support	6 years ago
Zhaolong Xing	3d63aa0a11	Merge pull request #15729 from NHZlX/add_static_model_load_for_trt Four points for enhancing Paddle-TRT	6 years ago
nhzlx	a9ed427749	cant not pass ci add if use static engine for trt test=develop	6 years ago
lidanqing	4aeb261da9	Add INT32 support. INT32 in last switch case test=develop	6 years ago
sneaxiy	2a639d5c2a	add allocator chain to fix bug test=develop	6 years ago
Qiao Longfei	8744f9a083	fix parallel executor async mode	6 years ago
Qiao Longfei	e70b1727ef	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor	6 years ago
Qiao Longfei	847e4f4e85	pure async mode train	6 years ago
sneaxiy	3334c279d0	add sample_generator test=develop	6 years ago
Qiyang Min	187cffd019	Merge pull request #15928 from velconia/imperative_backward_hooks Imperative backward hooks	6 years ago
minqiyang	ac88c62a5b	Reset output var's pre_op pointer when op was destructed	6 years ago
sneaxiy	69b1ebdfa5	merge develop test=develop	6 years ago
mozga-intel	68a9ead17a	The flag of mkldnn is enabled iff it is necessary test=develop	6 years ago
Zhen Wang	e00c7a2e26	Merge pull request #15830 from wzzju/add_ir_node_encapsulation add IrNode&IrVarNode&IrOpNode. test=develop	6 years ago
Qiao Longfei	f768fbf715	support multi graph test=develop	6 years ago
minqiyang	efb2f2baf8	Fix bugs test=develop	6 years ago
Qiao Longfei	cf0511f21e	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor test=develop	6 years ago
Zhen Wang	548931456c	update some functions' names according to the suggestion. test=develop	6 years ago
sneaxiy	c545f1ed8f	unify API test=develop	6 years ago
minqiyang	b420ec3a92	invoke backward_hooks after reduce op's depcounts map test=develop	6 years ago
Qiyang Min	4bd28b304b	Merge pull request #15831 from velconia/imperative_engine Imperative training network to the end	6 years ago
sneaxiy	b17541a9c1	fix hang bug	6 years ago
minqiyang	84bf4d7b06	Move ClearBlock into OpBase and VarBase's destructor test=develop	6 years ago
minqiyang	2b3510bc50	Add imperative python tracer	6 years ago
minqiyang	a15a3fc314	Polish code test=develop	6 years ago
sneaxiy	1e4c0a6f72	merge develop	6 years ago
minqiyang	9dc64edfd9	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_engine test=develop	6 years ago
Xin Pan	32d5a16036	resolve conflicts test=develop	6 years ago
Xin Pan	26e32e095a	allow compiler to use graph test=develop	6 years ago
minqiyang	8fe0c0c52c	implement backward refs	6 years ago
Qiao Longfei	cc71e89499	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor test=develop	6 years ago
minqiyang	74551758cc	Polish code test=develop	6 years ago
minqiyang	f53e1d5c4b	implement ClearBlock	6 years ago
sneaxiy	7160cb0f32	decoupled reader test=develop	6 years ago
sneaxiy	d331e97af8	fix compiler place compare test=develop	6 years ago

735 Commits (d5e40d1ba911a35f1094e9d04260e6c8d85fa68b)