Paddle

Commit Graph

Author	SHA1	Message	Date
jiaqi	66d51206b1	add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118 ) * add save/load model, shrink table, cvm, config file & fix pull dense bug test=develop * fix global shuffle bug, fix pull dense bug, fix release memeory bug, fix shrink error add client flush, add get data size test=develop * fix global shuffle bug test=develop * fix global shuffle bug test=develop * fix code style test=develop * fix code style & modify pslib cmake test=develop * fix error of _role_maker test=develop * fix code style test=develop * fix code style test=develop * fix code style test=develop * fix code style test=develop * fix code style test=develop * fix windows compile error of fleet test=develop * fix global shuffle bug * add comment test=develop * update pslib.cmake test=develop * fix fill sparse bug test=develop * fix push sparse bug test=develop	6 years ago
Tao Luo	68ec0a6f74	make parallel_executor support FLAGS_use_mkldnn (#17341 ) * make parallel_executor support FLAGS_use_mkldnn test=develop * add warning when set mkldnn_enabled_op_types_ in non-mkldnn env test=develop	6 years ago
chengduo	bc833945a4	Add DropLocalExeScopes in ParallelExecutor (#17297 ) * reset drop local scope counter test=develop	6 years ago
qingqing01	e32c9888f5	Double backward of conv2d. (#17211 ) * Add conv2d_grad_grad_op * Extracte the cuDNN conv algo searching code in conv_cudnn_helper.h. - Now use it in conv2d_grad_grad. - Will simply the searching code in conv2d and conv2d_grad in next PR. * Enhance and fix bug in unit testing of gradient_checker. * Support to fetch empty variables，return None in Python.	6 years ago
Zeng Jinle	5e5e7b3305	fix data_type error message (#17312 ) test=develop	6 years ago
guru4elephant	5d6a1fcf16	fix infer_from_dataset and train_from_dataset (#17243 ) * fix train_from_dataset and infer_from_dataset example * add inductive dim for data_reader, example: shape=[-1, 1], then -1 will be inducted through run-time reading of number of elements	6 years ago
chengduo	516317cf91	use sync copy (#17291 ) test=develop	6 years ago
Hongyu Liu	c3195de522	Fix concat shape check (#17247 ) * fix shape_check; test=develop * fix format; test=develop * fix format; test=develop * fix ddim bug; test=develop * fix c++ format; test=develop * change function name; test=develop	6 years ago
chengduo	04bd413acb	Code Clean: Move all pass to paddle::framework::ir (#17228 ) * move pass to ir * polish code test=develop * fix dependency test=develop	6 years ago
Zeng Jinle	4f8594088d	Enhance inplace/mem-opt pass and enhance softmax_with_cross_entropy op inplace (#17225 ) * add use_cuda to inplace pass,test=develop * add test softmax_with_xe_inplace test,test=develop * fix potential inplace bug test=develop * add more skip vars in mem opt pass,test=develop * follow comment,test=develop * follow comments,move duplicate out arg check to program->graph,test=develop	6 years ago
songhao	c2e20e2a29	fix build warning like 'comparison between signed and unsigned (#17240 ) integer', test=develop	6 years ago
石晓伟	a72dbe9abf	Cherry-pick benchmark related changes from release/1.4 (#17156 ) * cherry-pick commit from `8877054` * cherry-pick commit from `3f0b97d` * cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn (cherry picked from commit `8643dbc233`) * Cherry-Pick from 16662 : Anakin subgraph cpu support (cherry picked from commit `7ad182e16c`) * Cherry-pick from 1662, 16797.. : add anakin int8 support (cherry picked from commit `e14ab180fe`) * Cherry-pick from 16813 : change singleton to graph RegistBlock test=release/1.4 (cherry picked from commit `4b9fa42307`) * Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2 Support ShuffleNet and MobileNet-v2, test=release/1.4 (cherry picked from commit `a6fb066f90`) * Cherry-pick : anakin subgraph add opt config layout argument #16846 test=release/1.4 (cherry picked from commit `8121b3eccb`) * 1. add shuffle_channel_detect (cherry picked from commit `6efdea8997`) * update shuffle_channel op convert, test=release/1.4 (cherry picked from commit `e4726a066f`) * Modify symbol export rules test=develop	6 years ago
Zeng Jinle	ee2028a110	Add use_cuda to inplace pass (#17205 ) * add use_cuda to inplace pass,test=develop * add test softmax_with_xe_inplace test,test=develop	6 years ago
chengduo	950aec55fd	It doesn't need sync when fetch_list nit not empty (#17201 ) test=develop	6 years ago
tensor-tang	79ed1c76cd	fix bn fuse vardesc and add model saver (#17143 ) * fix bn fuse vardesc and add model saver test=develop * unify save model in test helper test=develop * fix mkdir on windows test=develop * remove magic number use bn bias var desc test=develop	6 years ago
Zeng Jinle	4e1bc6e805	Rewrite inplace pass and fix gc bug (#17126 ) * fix op graph view test=develop * rewrite inplace pass and fix reference count pass bug test=develop * fix unittest failed test=develop * follow comments, test=develop	6 years ago
chengduo	794a195881	fix fuse optimizer ops (#17102 ) test=develop	6 years ago
Tao Luo	aca60e9a20	remove unnecessary prepare_data (#17080 ) test=develop	6 years ago
Zeng Jinle	842ded14b0	fix reference_count_pass,test=develop (#17060 ) test=develop	6 years ago
Tao Luo	d9cd989825	Merge pull request #17048 from luotao1/fix_runtime_cache_bug fix runtime_context_cache bug when gpu model has an op runs only on cpu	6 years ago
chengduo	cc31681687	use fast executor as default (#17044 ) test=develop	6 years ago
chengduo	a2be4b4d91	Add fuse momenutum ops (#16745 ) * Add fuse momenutum ops	6 years ago
luotao1	490e746269	fix runtime_context_cache bug when gpu model has an op runs only on cpu test=develop	6 years ago
wopeizl	51a0243a56	fix nccl wrapper on windows test=develop	6 years ago
Zeng Jinle	1202d3fc74	Refine model gpu memory (#16993 ) * speedup gc and inplace softmax_with_cross_entropy_grad test=develop * refine models gpu mem Merge skip vars and warning messages of mem opt remove relu mem opt test=develop * follow comments test=develop	6 years ago
Yibing Liu	3c375751f8	Support seq len equal to 0 in sequence ops (#16935 ) * Support seq len equal to 0 in sequence ops test=develop * Add more test cases * Fix some comments test=develop * Fix py3 error test=develop	6 years ago
jiaqi	8bcba3db84	Merge pull request #16896 from xjqbest/develop fix bug of num > INT_MAX	6 years ago
guru4elephant	bbc6c5714f	Merge pull request #16887 from guru4elephant/add_nccl_context_pybind Add nccl context pybind	6 years ago
gongweibao	cbdb8a17b1	Polish DGC code (#16818 )	6 years ago
dongdaxiang	2ab2869c2d	fix GPU compile error problem	6 years ago
dongdaxiang	466d177d09	add pybind dependency test=develop	6 years ago
xjqbest	10991e00a9	fix bug of num > INT_MAX	6 years ago
xjqbest	241120d94d	fix bug of num > INT_MAX	6 years ago
xjqbest	dac70ad4c5	fix bug of num > INT_MAX	6 years ago
xjqbest	74471397cf	fix bug of num > INT_MAX	6 years ago
dongdaxiang	b091139049	add nccl wrapper for python API	6 years ago
dongdaxiang	fff795e5c8	add nccl_wrapper	6 years ago
乔龙飞 Qiao Longfei	82cff5ec42	Merge pull request #16762 from jacquesqiao/add-async_sparse_param_update_recorder Add async sparse param update recorder	6 years ago
Yibing Liu	4267a81afc	Correct the lod level of compiled time in lod_reset (#16790 ) test=develop	6 years ago
chengduo	e9409665f7	Refine Fuse Optimize Ops (#16810 ) * fix bug of fuse optimize ops	6 years ago
chengduo	d105c06b50	Replace ThreadedExecutor with FastThreadedExecutor (#16650 ) * replace ThreadedExecutor with FastThreadedExecutor test=develop * Fix Travise CI test=develop * Test FastThreadedSSAGraphExecutor test=develop * refine parallel_ssa_graph_executor.cc test=develop	6 years ago
Qiao Longfei	1526a3e4da	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async_sparse_param_update_recorder test=develop	6 years ago
Yihua Xu	93cedfdb9c	Fix the order while sorting the operators (#16756 ) * Fix the order when sorting operators. test=develop * Enable transfomer compare test item. test=develop * Use set to replace vector. test=develop	6 years ago
Qiao Longfei	afc56949c1	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async_sparse_param_update_recorder	6 years ago
liuwei1031	85363848a1	Security issue (#16774 ) * disable memory_optimize and inpalce strategy by default, test=develop * fix security issue http://newicafe.baidu.com:80/issue/PaddleSec-3/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-8/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-12/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-32/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-35/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-37/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-40/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-43/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-44/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-45/show?from=page test=develop * revert piece.cc, test=develop * adjust api.cc,test=develop	6 years ago
guru4elephant	aa46caf3d9	Merge pull request #16765 from guru4elephant/gpu_dataset_train add gpu training for Executor.train_from_dataset	6 years ago
dongdaxiang	3c2d236815	remove all warnings test=develop	6 years ago
Yiqun Liu	112f16143b	Add an option to enable the cache of expected kernel in train phase. (#16724 ) * Add an option to enable the cache of expected kernel in train phase. test=develop * Change the default value of cache_expected_kernel to true.	6 years ago
liuwei1031	2e07c19a9c	disable memory_optimize and inpalce strategy by default, test=develop (#16760 )	6 years ago
dongdaxiang	ea07eb8cd2	remove comment in data_feed.cc develop=test	6 years ago
dongdaxiang	05464e7c5c	add gpu training for Executor.train_from_dataset test=develop	6 years ago
Qiao Longfei	0608f8ca56	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async_sparse_param_update_recorder	6 years ago
Zeng Jinle	9f7b027dce	fix activation grad op desc maker (#16715 ) test=develop	6 years ago
liuwei1031	fdb719a1bf	avoid optimize variable used in subblock, test=develop (#16739 )	6 years ago
liuwei1031	a18ef10c87	only use the latest version variable for inplace strategy (#16736 ) * bug-fix, test=develop * tweak code, test=develop	6 years ago
Tao Luo	5c364cda3c	Merge pull request #16711 from luotao1/has_attr reduce hasAttr elapsed time in RunImpl	6 years ago
chengduo	55b15db5af	Add unit test for fuse all_reduce ops (#16699 ) * test fuse all_reduce	6 years ago
luotao1	4098ba29ed	reduce hasAttr elapsed time in RunImpl test=develop	6 years ago
luotao1	f89a9c5d95	Merge branch 'develop' into has_attr	6 years ago
Tao Luo	ad4a1bd13c	Merge pull request #16339 from luotao1/core_opt_choose_kernel Cache the chosen kernel of operators	6 years ago
luotao1	6afc97ca6b	reduce hasAttr elapsed time in RunImpl test=develop	6 years ago
gongweibao	8b793d0efd	Fix DGC bug. (#16697 )	6 years ago
Yiqun Liu	3fe8cb0dd7	Enable the runtime_context_cache pass in train phase (#16640 ) * Try to enable the runtime_context_cache pass in train phase. * Put the append of runtime_context_cache pass ahead of multi_dev passes. test=develop	6 years ago
xjqbest	6a57e8075a	remove trainer_id in datafeed and dataset test=develop	6 years ago
luotao1	695f2db6a0	update expected_kernel_cache_pass test=develop	6 years ago
luotao1	226596a296	Merge branch 'develop' into core_opt_choose_kernel	6 years ago
xjqbest	5e5139283b	fix runtime error test=develop	6 years ago
xjqbest	271b7147cc	fix dataset bug test=develop	6 years ago
Zeng Jinle	1c526e1d1a	Fix some grad op desc makers (#16633 ) * fix some grad op desc maker test=develop * fix grad op desc makers test=develop	6 years ago
chengduo	ea2a2f778a	Fix the bug of AllReduceDepPass (#16393 )	6 years ago
chengduo	b75a69bad6	Add Stream for fetch op handle (#16600 ) * expose fuse broadcast ops	6 years ago
chengduo	1342e2ea04	Fix the bug of the fast threaded executor (#16514 ) * Fix the bug of the fast threaded executor. I	6 years ago
gongweibao	423bc515da	fix batch merge bug (#16601 )	6 years ago
liuwei1031	bd193781df	fix the bug of reusing different types of variables in memory_optimiz… (#16547 ) * fix the bug of reusing different types of variables in memory_optimize_pass, test=develop * disable SELECTED_ROWS AND LOD_TENSOR_ARRAY reusage, test=develop	6 years ago
乔龙飞 Qiao Longfei	21622ca30b	Merge pull request #16172 from jacquesqiao/add-async-ssa-graph-executor-communicator Add async ssa graph executor communicator	6 years ago
sneaxiy	10249c0b78	Merge develop test=develop	6 years ago
Qiao Longfei	9861a92f6f	change the return type of NewTempScope to unique ptr test=develop	6 years ago
Qiao Longfei	fb6cc3a1bd	follow commnet, optimize code and add comment test=develop	6 years ago
Qiao Longfei	adf272bcec	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator test=develop	6 years ago
guru4elephant	76b49f02ee	Merge pull request #16539 from guru4elephant/train_with_pipe_reader_merge_develop Train with pipe reader merge develop	6 years ago
Qiao Longfei	baf02328b2	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator test=develop	6 years ago
Qiao Longfei	9db1a9e128	change log level test=develop	6 years ago
gongweibao	a61ed9782e	fix log level test=develop (#16554 )	6 years ago
Qiao Longfei	8342f12e31	fix set remote_prefetch test=develop	6 years ago
Qiao Longfei	df45c8c538	update nce and hierarchical_sigmoid remote_prefetch test=develop	6 years ago
Qiao Longfei	a1821a0449	remote remote_prefetch in embedding layer test=develop	6 years ago
dongdaxiang	718ea6dbd5	fix fleet code style test=develop	6 years ago
xjqbest	782ab2e2bd	add some doc test=develop	6 years ago
xjqbest	a99c8d0c29	fix client to client communication bug test=develop	6 years ago
gongweibao	fea91164b7	Fix windows compilation error! (#16546 ) * fix compiled test=develop * follow comments test=develop	6 years ago
Zhaolong Xing	3e6aa498d6	Merge pull request #16526 from NHZlX/refine_trt_anakin refine subgraph trt and anakin	6 years ago
sneaxiy	33473890f3	Merge develop test=develop	6 years ago
dongdaxiang	ade9337486	fix API.spec test=develop	6 years ago
liuwei1031	278debab71	fix comments of 16410, test=develop (#16499 ) * fix comments of 16410, test=develop * modify inplace_op_inference_test according to pass interface change, test=develop	6 years ago
dongdaxiang	720647e17f	rebase current develop and fix conflict test=develop	6 years ago
dongdaxiang	98dda08a85	fix pull sparse slow problem test=develop	6 years ago
dongdaxiang	d739bab844	fix async_executor problem and remove some unnecessary testcase, fix trainer_desc import problem test=develop	6 years ago
dongdaxiang	241d8808be	add timer to distributed executor test=develop	6 years ago
dongdaxiang	3c73859eec	add trainer_desc.proto to distributed executor test=develop	6 years ago
dongdaxiang	60b7bf6fa6	add infer_from_dataset for inference	6 years ago
xjqbest	030c7e7e9d	fix FillSparseValue error test=develop	6 years ago
dongdaxiang	88880d9b69	fix import trainer_desc_pb2 error test=develop	6 years ago
dongdaxiang	0030eb2a61	fix distributed building test=develop	6 years ago
dongdaxiang	ed31874397	undefine rand_r() test=develop	6 years ago
dongdaxiang	f7e4813804	add WIN32 for rand_r and usleep test=develop	6 years ago
dongdaxiang	cedbc161da	add more _LINUX maroc on data_feed.cc for mac and window compile test=develop	6 years ago
dongdaxiang	c5980c3566	add _LINUX macro test=develop	6 years ago
dongdaxiang	433301fbc2	remove glog in shell.h test=develop	6 years ago
dongdaxiang	9e51ad4a65	fix io and fs compile on mac test=develop	6 years ago
dongdaxiang	6eca88ac76	fix io and fs compile on mac test=develop	6 years ago
dongdaxiang	2708108a08	fix fleet_wrapper compile on windows test=develop	6 years ago
dongdaxiang	4ce35815fb	fix windows GLOG problem test=develop	6 years ago
dongdaxiang	e3107a6ae0	fix windows compile problem test=develop	6 years ago
dongdaxiang	398004ece0	disable sys/wait.h to fix windows compile problem, include scope in lodtensor_printer test=develop	6 years ago
dongdaxiang	d4514949bf	remove local random engine in fleet with rand_r() test=develop	6 years ago
dongdaxiang	45eb6f0765	run pre-commit check files and fix code style problem test=develop	6 years ago
dongdaxiang	d87ba58c14	refine document of python API, make device_worker and trainer's API private test=develop	6 years ago
dongdaxiang	5687f234bf	fix trainer_desc.proto error	6 years ago
dongdaxiang	b95b80bc76	add doc string for executor and update API.spec test=develop	6 years ago
dongdaxiang	6be9f719e2	make string_helper dependency work test=develop	6 years ago
xjqbest	e95cafd9a7	fix code style & add dataset testcase test=develop	6 years ago
dongdaxiang	ba15d6b164	move root_scope->DropKids() into Finalize() so that we do not have to drop all the kids test=develop	6 years ago
xjqbest	be74de2c61	fix code style & fix register bug & add release_memory test=develop	6 years ago
dongdaxiang	a0b59773af	fix code style	6 years ago
dongdaxiang	f39b323ed7	remove trainer_library in CMakeLists test=develop	6 years ago
dongdaxiang	365be5d559	support win32 flag in io.cc shell.cc, fix code style problem in fleet_wrapper, fix lodtensor_printer_test problem test=develop	6 years ago
dongdaxiang	6bf796df14	refine print fetch list	6 years ago
xjqbest	589467f24c	fix bug	6 years ago
xjqbest	b7940c2918	fix bug of gen_worker_desc and set_filelist, add some doc	6 years ago
dongdaxiang	68d7bf3de5	add fetch var function test=develop	6 years ago
xjqbest	a34fe6248f	add some doc	6 years ago
xujiaqi01	f5c6a14b54	fix runtime error	6 years ago
xujiaqi01	a5b1a0e12b	support multi dataset && add init model && fix bug	6 years ago
dongdaxiang	3c65cc1bbd	add document for role_maker and fleet parameter, data_generator	6 years ago
dongdaxiang	f6c9232a3d	fix dataset float32 type problem	6 years ago
dongdaxiang	73b1f396d7	add data_generator into paddle.fluid.incubate.data_generator, add op run log in hogwild_device_worker and downpour_device_worker test=develop	6 years ago
dongdaxiang	73544e8b8d	add training speed log	6 years ago
dongdaxiang	9419de521f	add IO percent for multi_trainer	6 years ago
dongdaxiang	6af697adb0	add trainfileswithprofiler for downpour worker	6 years ago
dongdaxiang	2644b88685	add comment for MPI Symetric role maker test=develop	6 years ago
dongdaxiang	cf45c54340	add distributed optimizer factory	6 years ago
dongdaxiang	b7a202aa38	add distributed optimizer factory	6 years ago
xujiaqi01	70a5d4f797	fix error	6 years ago
xujiaqi01	d25389fefd	add some log && fix error	6 years ago
dongdaxiang	317eb0aad3	add incubate for unified API	6 years ago
xujiaqi01	39449ba0b9	fix bug && add DestroyReaders in trainer	6 years ago
dongdaxiang	e657c127a8	hide opt_info in distirbuted optimizer	6 years ago
xujiaqi01	ecfc7df913	add dataset factory && fix style	6 years ago
dongdaxiang	328f11b8b6	refactor downpour optimization test=develop	6 years ago
xujiaqi01	3cea00bd52	store memory data in Dataset && fix bug	6 years ago
dongdaxiang	ff87698a44	refactor downpour optimization	6 years ago
dongdaxiang	b66f0074b6	fix data reading bugs in api, add VLOG(3) log for setup	6 years ago
dongdaxiang	b415ec27e8	make Dataset* as an argument	6 years ago
xjqbest	dd67ad08a2	modify c++ and python dataset related code & fix bug	6 years ago
dongdaxiang	cc4def6ba5	fix some conflict for compilation	6 years ago
heqiaozhi	9bca1926c1	refactor & fix bug	6 years ago
xjqbest	2e9a836c6f	add DataSet and InMemoryDataFeed, support load data into memory and shuffle data	6 years ago
dongdaxiang	2486389793	add RunFromDataset in executor	6 years ago
dongdaxiang	e36bbcc871	fix some typo and CMakefile.txt	6 years ago
xjqbest	824b84d185	add DataSet and InMemoryDataFeed, support load data into memory and shuffle data	6 years ago
dongdaxiang	08c25995a2	add run from dataset in executor.	6 years ago
dongdaxiang	c28bbdf8ba	add dataset_generator.py dataset_generator.py is a framework for generating data with python the generated data with a fixed format will be feeded into c++ reader test=develop	6 years ago
dongdaxiang	be757096da	add pybind for fleet	6 years ago
dongdaxiang	687cb79dbb	add pipe command io interface	6 years ago
dongdaxiang	1fe54416c9	move fs.cc and shell.cc into paddle/fluid/framework/io test=develop	6 years ago
dongdaxiang	53fbab5d33	add fs_local_open example	6 years ago
dongdaxiang	afaf937010	add fs_local_open example	6 years ago
dongdaxiang	cf1360643f	add printer for fetch variable	6 years ago
dongdaxiang	d65cb13ad5	add pslib flag on fleet_wrapper CMakefile	6 years ago
dongdaxiang	6de9ebc65c	refine VLOG in fleet_wrapper.h test=develop	6 years ago
dongdaxiang	97d5cd30f0	make pull dense worker work	6 years ago
dongdaxiang	39014b9f9f	fix class register problem	6 years ago
dongdaxiang	f0dd1201cc	fix destructor problem test=develop	6 years ago
dongdaxiang	f2bde9c241	fix destructor problem	6 years ago
dongdaxiang	54f047a126	fix ngraph compile option	6 years ago
dongdaxiang	dd1dc9bcf0	add common.h.in back	6 years ago
dongdaxiang	378037c535	make s_instance_ private to ensure singleton	6 years ago
dongdaxiang	a446d26e8a	add todo for asynce executor	6 years ago
dongdaxiang	c165012031	refine device_worker and trainer code test=develop	6 years ago
dongdaxiang	8a335b50be	add downpour device_worker pb configuration	6 years ago
dongdaxiang	24a8001142	make -DWITH_PSLIB=ON compilable	6 years ago
dongdaxiang	67b1d6d721	add dist_multi_trainer for distributed training, add trainer_factory and device_worker_factory so that we can easily extend new training mode, add pull dense worker which is a singleton for parameter fetching	6 years ago
dongdaxiang	855bf579d2	add dist_multi_trainer for distributed training, add trainer_factory and device_worker_factory so that we can easily extend new training mode, add pull dense worker which is a singleton for parameter fetching	6 years ago
Qiao Longfei	d8974e6da0	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator test=develop	6 years ago
chengduo	1096746cbf	Fuse Adam And SGD ops (#15933 ) * fuse optimizer	6 years ago
Jacek Czaja	2632327429	[MKL-DNN] Tensor modifications revert (#16462 ) * Revert "[MKL-DNN] Fix to crash of Transformer when mkldnn is to be used (#16233)" This reverts commit `13816dd4ac`. Apart from enabling transformer for MKL-DNN * Revert "- MKL-DNN pooling updated to set_prim_desc" This reverts commit `c63f6b2039`. Conflicts: paddle/fluid/operators/mkldnn/concat_mkldnn_op.cc * Revert "[MKL-DNN] MKL-DNN specific Tensor modification (#15429)" test=develop This reverts commit `dec9cf53c8`. * - concat compilation fix - lint test=develop - Lint fixes test=develop - Lint fixes test=develop - Fix Transpose MKLDNN op test=develop	6 years ago
chengduo	2265d091e6	Fix threaded executor bug (#16508 ) * fix threaded executor bug test=develop * change the order of class member test=develop * Fix Travis CI test=develop	6 years ago
sneaxiy	2c836ff914	check default grad maker test=develop	6 years ago
nhzlx	d065b5bf2b	Anakin ssd support refine trt first run add quant dequant fuse pass omit simplify_anakin_priorbox_detection template omit transpose_flatten_concat_fuse template test=develop	6 years ago
Zeng Jinle	69cb9792ea	Merge pull request #16506 from sneaxiy/revert-16424-fix_allocator_bug Revert "Fix allocator bug"	6 years ago
chengduo	ed61d67c73	Fix the interface of Pass::Apply (#16484 ) * modify the interface of Pass::Allay test=develop * Polish code test=develop * Fix Travis CI test=develop * fix Pass::Apply interface test=develop * Fix Travis CI test=develop	6 years ago
Zeng Jinle	2aa18e2bda	Merge pull request #16496 from sneaxiy/fix_gc_bug Fix gc bug	6 years ago
Zeng Jinle	174d0d0b90	Revert "Fix allocator bug" add include headers to fix travis-ci test=develop	6 years ago
gongweibao	eb83abeac3	Add DGC(Deep Gradient Compression) interface. (#15841 )	6 years ago
Zeng Jinle	644e8af4cf	Merge pull request #16424 from sneaxiy/fix_allocator_bug Fix allocator bug	6 years ago
sneaxiy	c4c6205268	fix gc bug test=develop	6 years ago
Zeng Jinle	c7c6eeb44e	Merge pull request #16409 from sneaxiy/feature/advance_gc Enhance gc to support deleting tensor buffer in advance	6 years ago
Qiao Longfei	33be014535	fix distribute compile problem test=develop	6 years ago
Qiao Longfei	b542639dc0	code clean test=develop	6 years ago
liuwei1031	8d22bc17a4	Memory optimize (#16410 ) * fix cdn issue, test=develop * fix memory optimize bugs, test=develop * fix memory optimize bugs, test=develop * remove add/sub_2 op, test=develop * disable memory_optimize by default, test=develop * disable inplace activation in python, test=develop * fix unittests, test=develop * fix unittests, test=develop * bug-fix, test=develop	7 years ago
Zhaolong Xing	fa1796a30a	Merge pull request #16330 from NHZlX/merge_anakin_branch_to_dev Cherry-pick from PaddlePaddle:feature/anakin-engine: Anakin subgraph support.	7 years ago
sneaxiy	a0f4fefb60	delete source file no_need_buffer_vars_inference.cc test=develop	7 years ago
Qiao Longfei	392e97aae5	fix cpplint test=develop	7 years ago
Qiao Longfei	37f6b9ab7a	fix build test=develop	7 years ago
Qiao Longfei	30618409db	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator	7 years ago
Wu Yi	9ffd5eecef	test fix fetch bar place for ce (#16406 ) * test fix fetch bar place for ce * fix ps mode dist train in develop test=develop * fix style check test=develop * update test=develop	7 years ago
nhzlx	953bdde058	Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD test=develop	7 years ago
Tao Luo	e0a3a49096	Merge pull request #16438 from wojtuss/wojtuss/move-cpu-quantize-passes Move cpu_quantize_* passes into mkldnn subfolder	7 years ago
gongweibao	ec6519e806	Fix allreducedep bug (#16443 )	7 years ago
sneaxiy	78fb3a62e0	fix env variable settting bug test=develop	7 years ago
sneaxiy	2d92b6be98	merge develop test=develop	7 years ago
sneaxiy	7000ec85d9	fix some op grad maker fix ctest eager deletion disable bug test=develop	7 years ago
sneaxiy	f8ed2c229e	try to fix ci error test=develop	7 years ago
Wojciech Uss	46677fb080	Move cpu_quantize_* passes into mkldnn subfolder test=develop	7 years ago
sneaxiy	c20db6357b	split PR test=develop	7 years ago
Qiao Longfei	be0c482304	update trainer_id	7 years ago
sneaxiy	072d95d8f6	Merge develop test=develop	7 years ago
sneaxiy	a93a9eef8f	add op registry type refine gc code test=develop	7 years ago
chengduo	a6a3b2fbbc	[Speed]Refine ParallelExecutor (#16190 ) * refine parallelExecutor test=develop * Polish op_handle test=develop * Remove unnecessary op_handle test=develop * Fix Travis CI test=develop * Fix fetch bug test=develop * Remove WaitInputVarGenerated * Fix OpHandleBase::Run test=develop * debug test=develop * use origin fetch_op_handle test=develop * Revert op_handle_base.cc test=develop * Polish code test=develop * Fix OpHandleBase::Run test=develop * code refine * test CI and CE test=develop * fix OpHandle::Run test=develop * refine AllReduceOpHandle test=develop * Polish code test=develop	7 years ago
nhzlx	3df7b98a0f	Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD	7 years ago
chengduo	33965527fd	Add unit test for fuse all reduce (#16354 ) * refine fused_all_reduce_op * add unit test in test_parallel_executor_seresnext test=develop	7 years ago
sneaxiy	953214ad97	add more unittest modify allocator strategy remove changes of legacy buddy_allocator test=develop	7 years ago
luotao1	056599a738	add expected_kernel_cache_pass test=develop	7 years ago
Wojciech Uss	cbe2dbf0db	Add enabling quantization (#16326 ) * Add enabling quantization test=develop * remove unused (here) function	7 years ago
Tao Luo	9a05859179	Merge pull request #16322 from wojtuss/wojtuss/fix_cpu_quantize_pass fix pattern maching conv2d with(out) ResidualData	7 years ago
nhzlx	c407dfa3cb	cherry-pick from feature/anakin-engine: refine paddle-anakin to new interface. #16276	7 years ago
nhzlx	a25331bc26	cherry-pick from feature/anakin-engine: deal the changing shape when using anakin #16189	7 years ago
nhzlx	69d37f81d7	cherry-pick from feature/anakin-engine: refine anakin subgraph. #16157 support change input size	7 years ago
nhzlx	a1d200a5de	cherry-pick from feature/anakin-engine: Anakin support facebox #16111	7 years ago
luotao1	bfdab00e5b	Merge branch 'develop' into core_opt_choose_kernel	7 years ago
Tao Luo	a5124ee0bb	Merge pull request #16301 from luotao1/runtime_context_pass add runtime_context_cache_pass	7 years ago
luotao1	6c6a39222b	Merge branch 'core_opt_choose_kernel' of https://github.com/Xreki/Paddle into core_opt_choose_kernel	7 years ago
chengduo	f26ba5bddd	Fuse AllReduce (#15921 ) * fuse all_reduce test=develop * add fuse_parameter_groups_size test=develop * Polish code test=develop * Fix travis-ci test=develop * Add SetGroupAccordingToLayers and SetGroupAccordingToGroupSize test=develop * Add SetGroupAccordingToMemorySize test=develop * fix multi_devices_graph test=develop * reset params_grads test=develop * Polish code test=develop	7 years ago
Zeng Jinle	d0ef682552	Merge pull request #16274 from sneaxiy/fix_grad_maker Remove unused variables in op grad maker	7 years ago
Wojciech Uss	104a9f1e27	fix pattern maching conv2d with(out) ResidualData test=develop	7 years ago
Wu Yi	6382b62f6b	Collective ops (#15572 ) * wip allreduce in op * wip * wip * wip * wip adding test * wip for conflict with mp mode * fix tests test=develop * fix cpu build test=develop * fix travis clang format test=develop * fix cpu build test=develop * update api.spec test=develop * delete comment test=develop * fix cpplint test=develop * fix test=develop * follow comment test=develop * add file test=develop * fix build test=develop * update test=develop * to be compatible with sync_bn, and fix mp mode in develop test=develop	7 years ago
sneaxiy	023a3a3d62	fix op grad maker test=develop	7 years ago
luotao1	82af8031d9	add runtime_context_cache_pass test=develop	7 years ago
Tao Luo	7d2740db83	Revert "cache runtime_context"	7 years ago
sneaxiy	fd23262e0c	merge develop, fix conflict test=develop	7 years ago
Qiyang Min	c7f1f3ed0c	Merge pull request #16214 from velconia/imperative_infer_var_type Implement imperative infer var type	7 years ago
Jacek Czaja	13816dd4ac	[MKL-DNN] Fix to crash of Transformer when mkldnn is to be used (#16233 ) * - Fix to crash of Transformer when mkldnn is to be used Desc: TensorCopy was not setting MKLDNN primitive descriptor when layout was to be kMKLDNN test=develop * - Enable transformer for mkl-dnn test=develo * - Compilation fix test=develop * - Removed manual selection of MKL-DNN ops to be used in Transformer test test=develop	7 years ago
Wojciech Uss	af03008890	Add cpu_quantize_placement_pass for C-API quantization (#16265 ) * Add cpu_quantize_placement_pass for C-API quantization test=develop * added a comment on required pass attributes test=develop	7 years ago
Tao Luo	dbb92ee4b1	Merge pull request #16002 from luotao1/runtime_context cache runtime_context	7 years ago
minqiyang	b40e41fbd1	Polish code style test=develop	7 years ago
Qiyang Min	8e4ad008fb	Merge pull request #16198 from velconia/imperative_train_speed Improve imperative mode training speed	7 years ago
minqiyang	36dce65bb3	Take DataType and VarType apart test=develop	7 years ago
luotao1	cc0ae1f1a1	refine with comments test=develop	7 years ago
luotao1	a275fd6e0c	Merge branch 'develop' into runtime_context	7 years ago
Wojciech Uss	2579ade45f	Add cpu_quantize_pass for C-API quantization (#16127 ) * Add cpu_quantize_pass for C-API quantization test=develop * add cpu_quantize_pass test * fix lint: add include memory unorderd_map and unordered_set test=develop * fuse_relu 1 test=develop * tuned 2 without squash * fixes test=develop * remove unused vars test=develop * refactored test=develop * fix lint c-style cast -> C++ style cast test=develop * remove QuantMax and c style casts test=develop * last usage of QuantMax removed test=develop * Fix Analysis Predictor UT Check if memory_optimize_pass has already been added to the analysis config before adding a new one, so that it is not added multiple times. test=develop * change map to unordered_map fix the forgotten part of cpu_quantize_pass_tester.cc test=develop * removed quantized attribute * fixed cpu_quantize_pass_tester and op attr comments test=develop * removed redundant line test=debug * removed gmock test=develop * fix after merge	7 years ago

2620 Commits (569951c418fb3c9f82cbdde9fda3910cc7033bff)