Paddle

Commit Graph

Author	SHA1	Message	Date
Zhaolong Xing	88b52a27fe	Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532 ) * Fix Mask rcnn predictor 1. refine memory optim algorithm to support the model with the block op. 2. output diff : modify the affine channel fuse 3. add condition_block_infer op add interface for setting trt calib table dir test=develop * add the missing files. test=develop	6 years ago
石晓伟	1529154821	Support Bitmain Anakin (#18542 ) * update anakin-engine interfaces for content-dnn test=develop * support only-gpu mode of Anakin modify eltwise parse test=develop * modification for thread-safe test=develop * Integrated template instance test=develop * increase template parameters test=develop * support MLU predictor test=develop * update anakin cmake files test=develop * update TargetWrapper::set_device * update the initialization of anakin subgraph test=develop * use the default constructor of base class test=develop * load model from buffer with length test=develop * modify the access level of class test=develop * support anakin for bitmain arch test=develop * remove files * checkout cmakelists test=develop	6 years ago
tianshuo78520a	9b3d3b8387	Cancel jacquesqiao approval authority (#18538 )	6 years ago
Leo Zhao	ce38bb5341	use static variable to do cache instead of thread local in thread frequent switching case (#18428 )	6 years ago
gongweibao	160ddc980c	Regroup fusion by data type. (#18496 )	6 years ago
Tao Luo	fe32879d2a	add mkldnn shapeblob cache clear strategy (#18513 ) * add mkldnn shapeblob cache clear strategy test=develop * refine with comments test=develop * make cache clear strategy safer test=develop * add lock for GetShapeBlobSize test=develop	6 years ago
chengduo	e576f2667b	update docker build (#18523 ) test=develop	6 years ago
zhaoyuchen2018	832d8191ff	Fix topk cannot handle 1D vector bug (#18466 ) * Fix topk cannot handle 1D vector bug Add path to handle 1D vector test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	6 years ago
石晓伟	280a8784f7	Remove the obsolete cmake options (#18493 ) * remove the obsolete cmake options, test=develop * remove unittests, test=develop * delete options in paddle/scripts/paddle_build.sh	6 years ago
LielinJiang	43e17c7951	Add distributions of normal and uniform (#18023 ) * add_distributions_of_normal_and_uniform * paddle/fluid/API.spec * modify API.spec * modified paddle/fluid/API.spec, test=develop * modify paddle/fluid/API.spec, test=develop * modify paddle/fluid/API.spec, test=develop * fix some comment, test=develop * modify API.spec, test=develop * add comment for init function, modify hard code, test=develop * modify API.spec, test=develop * modify API.spec, test=develop * make unit test function shorter, test=develop * modify paddle/fluid/API.spec	6 years ago
bingyanghuang	3fe6bf5ee6	fix command line bug in int8v2 readme (#18507 )	6 years ago
tensor-tang	4828a5e008	core remove pycpuinfo (#18479 ) remove pycpuinfo deps in core	6 years ago
qingqing01	7ac4818a98	Refine Infershape in activation_op for double_grad. (#18485 ) * Refine Infershape in activation_op for double_grad.	6 years ago
qingqing01	602cb6a5b4	Enhance linear_lr_warmup (#18463 ) * make it support float/int learning as input.	6 years ago
chengduo	7453857324	Make fuse_all_reduce_op_pass support mix_precision (#17652 )	6 years ago
chengduo	55baeceddb	Enhance execution error info (#18482 ) * enhance execution error info test=develop	6 years ago
石晓伟	047bba855b	Remove the obsolete cmake options (#18481 ) * remove the obsolete cmake options, test=develop * remove unittests, test=develop	6 years ago
pkpk	e9c7e218f2	Nan debugger init (#18401 ) test=develop	6 years ago
Jiabin Yang	f72ced8814	test=develop, fix docker with paddle nccl problem (#18451 )	6 years ago
Tao Luo	3f3112ceb0	add shape_blob for cache mkldnn primitive (#18454 ) test=develop	6 years ago
Tao Luo	d234aa02cd	add transfer_scope_cache unit-test (#18467 ) test=develop	6 years ago
zhoukunsheng	7c6f2350b9	support Tensor input for edit_distance op (#18162 )	6 years ago
zhoukunsheng	26318544d2	support Tensor input for chunk_eval op (#18226 ) * test=develop support Tensor input for chunk_eval op * test=develop fix testcase for chunk_eval op * test=develop fix typos in nn.py	6 years ago
zhoukunsheng	206c44e2a8	add unique kernel and op (#17557 )	6 years ago
zhoukunsheng	71af72b1c2	upgrade hash op to support Tensor and LoDTensor input (#17998 )	6 years ago
zhoukunsheng	d3b3443d10	add ones_like op (#17388 )	6 years ago
zhoukunsheng	67b48d7fe7	add size op (#17412 )	6 years ago
Leo Zhao	8f5fffca0a	rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id() (#18453 ) * rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id() test=develop * update session id definition and adjust logic for default behavior test=develop * reset logic in mkldnn reuse as most of cases work in default. test=develop	6 years ago
Tao Luo	3123d18787	remove unused AnalysisPredictor::SetMkldnnThreadID() (#18444 ) test=develop	6 years ago
Yi Liu	a873fa84ce	supports collective training with programs (#18392 ) 1. Since allreduce op has 4 reduce types, We split these four reduce types into four ops 2. We also refined the collective op code, e.g. we separated the collective op kernel into CPUKernel and CUDAKernel, and remove the device specified DeviceContext parameter in template as we already knew the target DeviceContext 3. We remove the newly added Collective op role to reduce the complexity of program and graph analysis	6 years ago
tianshuo78520a	85b49d8473	fix the api.spec file does not get the class comment problem (#18439 ) * fix the api.spec file does not get the class comment problem * cat new.spec * check api.spec * test=develop	6 years ago
chengduo	e0d8c6ac68	Add find_no_grad_vars in backward.py (#17942 ) * add not_been_used_vars to no_grad_set test=develop	6 years ago
LielinJiang	449c7a9f98	Make roi_perspective_transform op return mask and transform matrix (#18371 ) * modify roi_perspective_transform_op to output mask and transform matrix * modify comment * modify comment * modify API.spec * update API.spec * remove no use header, test=develop * resolve conflict	6 years ago
tensor-tang	a3bc804f5f	fix mac ci random fail (#18430 ) * fix mac ci random fail * use platform instead	6 years ago
Michał Gallus	7023a86c3a	Fix Pooling output scale (#18186 ) * Int8: Fix Pooling output scale test=develop * Update scales quantization for certain operators These include: concat, transpose, pool and reshape. test=develop * Move concat minimum scale finding to quantizer test=develop	6 years ago
Brian Liu	4bc2987d2f	Fix bug in quantize kernel which cause crash in vgg16/19 model (#17964 ) * Fix bug in quantize kernel which cause crash in vgg16/19 model test=develop * refine the code to reduce verbose code; test=develop * remove useless code; test=develop	6 years ago
xsrobin	47e2ef38e9	add "import paddle.fluid as fluid" to examples that lack it	6 years ago
tianshuo78520a	92ecb305c2	test=develop (#18426 )	6 years ago
hutuxian	8a39e5c110	update api format (#18413 ) * update api format test=develop * update API.spec test=develop	6 years ago
jiaqi	93a2b317f7	fix data feed ptr error (#18419 ) fix data feed ptr runtime error, pipeline trainer will core in some cases, so set it nullptr as default value.	6 years ago
tensor-tang	ce7a024c6d	fix py-cpuinfo mac random fail (#18383 ) * fix py-cpuinfo mac random fail * differentiate version on windows	6 years ago
Jie Fang	2b4ef509ea	init custom black white list (#18377 ) test=develop	6 years ago
Leo Zhao	681d3553f1	Fix potential mkldnn concat/pool/conv kernel issues (#18393 ) 1. some key generation method is not aligned with PR#17965 2. enlarge ptr lifetime to avoid memory release if SetBlob fails otherwise it will get core dump. test=develop	6 years ago
tianshuo78520a	052b044873	Fix mac build nproc command not found (#18362 ) * change nproc 8	6 years ago
Zeng Jinle	f5641000bb	Add a unittest to inplace elementwise_add (#18385 ) * add_elementwise_add_inplace_test,test=develop * rename file, test=develop	6 years ago
Jiabin Yang	43f64a177e	Fix/program doc (#17908 ) * test=develop, add some comments for Program.clone * test=develop, add API.spec * test=develop, refine comments * refine Program doc and clone doc * test=develop, refine doc	6 years ago
Jiabin Yang	af874a1f1d	test=develop, fix multigpu hang on latest docker (#18379 )	6 years ago
chengduo	871cc15e6a	Add is_compiled_with_cuda (#18356 ) * add cuda_is_available test=develop * Fix api.spec test=develop * fix api doc test=develop	6 years ago
lujun	fd6631ef2f	Fix dygraph show style (#18297 ) Fix dygraph show style for FluidDoc.	6 years ago
HaoRen	9931bc64f5	add dependency of collective_helper (#18365 ) * add dependency of collective_helper * test=develop fix dependency of collective_helper	6 years ago
翟飞跃	19da59ed3f	Remove all the code, API and doc of MKL-DNN INT8v1 (#18347 )	6 years ago
chengduo	8ed33bf91f	Fix Bug-prone code of PE (#18354 ) * update pe reduce config test=develop * drop the local_exe_scopes of the previous parallel_executor test=develop	6 years ago
tangwei12	999d9a59a5	fix communicator with pyreader (#18350 ) * add is_runnning in communicator, test=develop	6 years ago
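tianshuo78520a	cff2c2d83f	add combine_avx_noavx build to dockerfile A Dockerfile needs to be generated during the avx_noavx build. After generating the whl with the combine_avx_noavx parameter, the image could not be built because no Dockerfile had been generated. An option to generate the Dockerfile needs to be added.	6 years ago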
kh2se2013	27fb9cad65	add WITH_COVERAGE option, default OFF (#17872 ) * add WITH_COVERAGE option, default OFF test=develop * add coverage for python sdk test=develop * fix code style * fix COVERAGE_FILE path test=develop * remove coverage package test=develop * test = develop, run coverage as module	6 years ago
Michał Gallus	8409693272	Reset DeviceContext after quantization warmup (#18182 ) test=develop	6 years ago
HaoRen	b7128bac5f	supports collective communicated training (#18175 ) * fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop * supports collective training in executor * make fetch_list runable with variables, add more unittest for use_program_cache test=develop * fix comment test=develop * use unique name for nccl_id * supports output to stream in program_to_code * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code * set op role in collective training * add collective op role * remove orig file * add build optimizer by strategy * add collective strategy * refine collective strategy * add multi-process role maker * refine strategy building factory so that we can easily plugin more strategy * scale loss grad in collective sgd transpiler * add support for distributed fc * code format * revert some features for dist fc * add support for distributed fc training * fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop * supports collective training in executor * make fetch_list runable with variables, add more unittest for use_program_cache test=develop * use unique name for nccl_id * supports output to stream in program_to_code * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code * set op role in collective training * add collective op role * fix comment test=develop * remove orig file * add build optimizer by strategy * add collective strategy * refine collective strategy * add multi-process role maker * refine strategy building factory so that we can easily plugin more strategy * scale loss grad in collective sgd transpiler * add support for distributed fc * code format * revert some features for dist fc * add support for distributed fc training * test=develop add collective op unittest standard * test=develop remove the test_collective directory * test=develop remove the test_collective directory * remove slicegather test * code format for reducescatter * update attr of shard_index_op * Modify macro nccl_helper * remove test without distribute * macro collective_helper * marcro update * test=develop update support python3.5 * test=develop change gpu memory use to 0.1 when test * test=develop update ut equal func * test=develop set flags to 1.5 * test=develop fix pickle dumple py35 * test=develop fix divide in slice and add sync_comm_stream update atol and rtol to 1e-05 rm shard_index op and test modify read input from file to read from memory remove origin_program in framework and add i/o in c_sync_calc_stream * test=develop update unittest sync operator I/O	6 years ago
Sylwester Fraczek	9252e8fa08	add int8 mkldnn prior_box (#17242 ) add prior_box quantization code add scale algo rules for prior box test=develop	6 years ago
lidanqing	5fd68ac154	some fixes for int8 mobilenet_ssd tester (#18112 ) * some fixes for int8 mobilenet_ssd tester test=develop * change wrong data file name test=develop * change test images bin file from 200 images to 100 images * change directory existence to file existence during downloading test=develop * reuse download_data test=develop * run full dataset when iterations=0 test=develop	6 years ago
Jacek Czaja	c2efdfd5bc	[MKL-DNN] Extending reusing to Elementwise_add_mkldnn op (#18146 ) * - Reusing of reuder used in elementwise_add_mkldnn - Added MKL-DNN sum prim reusing test=develop - Compilation fixes test=develop - Yet another compilation fix test=develop - Yet another compilation fix test=develo - Yet another linking fix test=develop - Final compilation fix test=develop - lint fixes test=develop - Lint fixes test=develop * - Fixes after review test=develop	6 years ago
qingqing01	9047ac687e	Simplify multi_box_head API in detection.py and remove assign op. (#18310 ) * Simplify multi_box_head API in detection.py and remove assign op.	6 years ago
Zeng Jinle	5826b72e06	Refine CUDAPlace error message. (#18343 ) * refine cuda place error msg, test=develop * use LOG(ERROR)+exit(-1), test=develop	6 years ago
Tao Luo	3c9755bbb9	remove unused jemalloc option (#18314 ) test=develop	6 years ago
Yibing Liu	23941e43ec	Update lamb optimizer (#18333 ) * Update lamb optimizer test=develop, test=document_preview * Regenerate api spec test=develop, test=document_preview	6 years ago
chengduo	135a59ed45	update reduce config (#18334 ) test=develop	6 years ago
tensor-tang	81ec538279	fix softrelu doc (#18324 ) * fix softrelu doc test=develop * update API doc test=develop	6 years ago
Hongyu Liu	df2eee71d8	Sequence mask support tensor (#18249 ) * sequence mask support max length tensor input; test=develop * add rnn_impl.py; test=develop * add basic gru lstm unittest; test=develop * fix api spec; test=develop * fix sequence_mask op bug; test=develop test=document_preview * change +-x to elementwise_op; test=develop add mkl flag; test=develop * fix rnn impl bug; test=develop * update api spec; test=develop * fix doc bug; test=develop * fix lstm bugs; test=develop	6 years ago
Qiao Longfei	0e08e91c18	optimize communicator merge sparse gradient test=develop (#18159 ) * optimize communicator merge sparse gradient test=develop * revert multithread selected rows merge add test=develop * follow comment test=develop	6 years ago
chengduo	e06c69c788	Fix default value of fluid.memory_optimize (#18295 ) * fix default value of fluid.memory_optimize test=develop * fix api.spec test=develop	6 years ago
Zhaolong Xing	6978b2e48e	fix split and sampled softmax (#18280 ) test=develop	6 years ago
Yibing Liu	f57ee3693b	Fix the bug of sequence_unpad op (#18290 ) * Use TensorCopySync for sequence_unpad op test=develop * Fix the tensor memory alloc bug test=develop	6 years ago
chengduo	5489216eba	Clean build strategy (#18148 ) * clean build_strategy test=develop * DataBalanceOpHandle has been removed test=develop * debug * update build_strategy. test=develop	6 years ago
chengduo	14e1e165df	update alloc_continuous_space_for_grad_pass (#18287 ) test=develop	6 years ago
lujun	7e61baaa94	add Dygraph api to api.spec (#18235 ) add Dygraph api to api.spec	6 years ago
liuwei1031	a736c03b10	improve doc of lstm, sequence_enumerate, softmax_with_cross_entropy, space_to_depth APIs (#18261 ) * improve doc of lstm, sequence_enumerate, softmax_with_cross_entropy, space_to_depth APIs, test=develop * update API.spec, test=develop	6 years ago
flame	fdf798f95a	fix double buffer example (#18169 ) test=develop test=document_preview	6 years ago
Bai Yifan	23b8b18e56	fix api doc example, test=develop (#18266 )	6 years ago
xiaoting	2f0d68261c	fix yolo_box example,test=develop (#18247 )	6 years ago
songhao	6b3d96254d	fix some bug when merge sparse embedding parameters, test=develop (#18223 ) 1. fix the bug that out_put_var in SaveSelectedRows would be empty string 2. use merge_sparse_lookup_table to replace sum op for load_persistables_for_inference 3. fix the bug in _clone_var_in_block_ when the var is SELECTED_ROWS.	6 years ago
jiaqi	3f8031e256	dataset (#17973 ) (1) use channel instead of vector/BlockingQueue in Dataset, to stay consistent with the existing implementation and make the code more readable and flexible (dataset single output channel or multi output channel). One previous out-of-memory problem was caused by not releasing memory after training. (2) add Record because MultiSlotType costs too much memory (80B), which fixes the out-of-memory problem. (3) add Channel, Archive in paddle/fluid/framework (4) change dataset from shared_ptr to unique_ptr in pybind (5) move create/destroy readers from trainer to dataset (6) move shuffle from datafeed to dataset. dataset holds memory, datafeed is only for loading data and feeding data to the network. (7) fix thread num bug of Dataset when filelist size < thread num (8) support set_queue_num in InMemoryDataset	6 years ago
liuwei1031	5d54ed4a84	improve the doc of DataFeeder and default_main_program (#18241 ) * improve the doc of DataFeeder and default_main_program * update API.spec, test=develop	6 years ago
xiaoting	b58bb80248	set src_idx > 0 for bilinear_interp_op (#18238 ) * set src_idx > 0, test=develop * add unittest and cu, test=develop	6 years ago
wopeizl	daa32d5383	fix package generation for inference test=develop (#18220 )	6 years ago
Shuai Yuan	9a32dad811	[DOC] Fix comment code of API create_py_reader_by_data (#18193 ) * [DOC] Fix comment code of API create_py_reader_by_data. test=develop, test=document_preview * Fix code style of API comment. test=develop,test=document_preview Fix code style of API comment. test=develop,test=document_preview * update api spec of api create_py_reader_by_data * remove default config code. test=develop * remove useless code. test=develop * update create_py_reader_by_data api. test=develop	6 years ago
Hongyu Liu	cefd0fb598	Fix slice op shape=-1 bug (#18107 ) * fix slice op bug; test=develop * fix variable test bug; test=develop * remove slice while true; test=develop	6 years ago
lijianshe02	ff4279e3b2	fix paddle.fluid.layers.io.open_files api doc bug test=develop (#18203 ) * fix paddle.fluid.layers.io.open_files api doc bug test=develop	6 years ago
chengduo	5588b923f3	Add multi process reader (#18115 ) * add multi process reader test=develop	6 years ago
wangchaochaohu	a9dc534f48	fix API example (#18153 ) * API.spec test=develop * update * update test=develop * update test=develop * update * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * test=develop * update * update test=develop * update test=develop * fix test=develop	6 years ago
翟飞跃	de42fe8fd5	Change int8v2 CAPI unit test name and add log in the prediction stage (#18200 ) * fix issue 18111;test=develop * fix timer;test=develop * refine code;test=develop	6 years ago
翟飞跃	802ea50956	fix spelling errors (#17941 ) * fix spelling errors; test=develop * Update API.spec update md5 * Update API.spec * change the order of api;test=develop	6 years ago
zhoukunsheng	0569ff78fa	Fix doc example for greater_equal, greater_than, less_equal, not_equal, rank, reduce_all, reduce_any, sign, where, diag (#18167 ) * test=develop fix greater_than, greater_equal, less_equal, not_equal, rank, reduce_all, reduce_any, sign, where, diag doc example * test=develop fix API.spec conflict	6 years ago
Huihuang Zheng	bbc292920c	Fix API example code (#18176 ) The fixed APIs: 6 Methods in paddle.fluid.io.PyReader paddle.fluid.layers.Preprocessor paddle.fluid.layers.py_reader paddle.fluid.io.save_params paddle.fluid.io.save_persistables test=develop test=document_preview	6 years ago
翟飞跃	78441c5449	add mkldnn Int8v2 slim doc (#17909 )	6 years ago
lvmengsi	d658f1133b	Fix doc for transpose, conv3d and batch_norm. (#18035 ) * update some op doc, test=develop	6 years ago
FlyingQianMM	944c3165ec	fix type error of std::pow in sigmoid_focal_loss_op.cu and sigmoid_focal_loss_op.h (#18152 ) * test=develop fix type error of std::pow in sigmoid_focal_loss_op.cu and sigmoid_focal_loss_op.h * test=develop fix wrong code style in sigmoid_focal_loss_op.cu and sigmoid_focal_loss_op.h	6 years ago
chengduo	25f3cd6486	Update execution_strategy option default value (#18183 ) * update execution_strategy option default value test=develop * fix doc error test=develop	6 years ago
chengduo	4978db2c10	Remove nccl dep when the number of GPU is 1 (#18158 ) * remove nccl dep when the number of GPU is 1 test=develop	6 years ago
Zeng Jinle	25ab23be28	Fix dygraph mem leak (#18082 ) * fix dygraph mem leak, test=develop * polish msg, test=develop	6 years ago
tensor-tang	1c6e560607	core replace x86cpu with py cpuinfo (#18151 ) test=develop	6 years ago
Zeng Jinle	6eec66a1b1	Fix py_reader iterable bug (#18108 ) * fix py_reader iterable bug, test=develop * move data from buffered_reader,test=develop	6 years ago
qingqing01	80d2e66f9e	Update backward appending strategy to support double backward and fix some bugs. (#18104 ) * Update backward.py: - If there is no input grad var in all outputs of previous ops, do not append this op into graph. - Only apply this strategy when double backward. * Update some double backward op. * Update sum_op to judge whether a tensor is empty by numel or IsInitialized().	6 years ago
Wojciech Uss	ca5642c850	unify FP32 vs. INT8 comparison tests output (#18111 ) test=develop	6 years ago
Wojciech Uss	c26130f3a9	reuse C-API INT8 unit test application (#18077 ) * reuse C-API INT8 unit test application test=develop * updates after review test=develop	6 years ago
FlyingQianMM	ff83655f7e	add detection output operator for supporting retinanet (#17896 ) * test=develop add detection output for supporting retinanet * test=develop add test_layers.py * test=develop add API.spec * test=develop alter test_retinanet_detection_output.py * test=develop alter round 2 * test=develop alter retinanet_detection_output * test=develop alter paddle/fluid/API.spec * test=develop alter detection.py * test=develop alter retinanet_detection_output * test=develop alter paddle/fluid/API.spec * test=develop alter detection.py * test=develop alter API.spec * test=develop alter retinanet_detection_output * test=develop alter paddle/fluid/API.spec * test=develop alter python/paddle/fluid/tests/unittests/test_retinanet_detection_output.py * test=develop alter python/paddle/fluid/tests/unittests/test_retinanet_detection_output.py * test=develop fix grammar error * test=develop fix grammar error * test=develop fix grammar error * test=develop alter python/paddle/fluid/tests/unittests/test_layers.py * test=develop alter paddle/fluid/API.spec	6 years ago
FlyingQianMM	0aee1f0074	add sigmoid focal loss operator for supporting retinanet (#17895 ) * test=develop add sigmoid_focal_loss for supporting retinanet * test=develop add test_layers * test=develop add API.spec * test=develop alter sigmoid_focal_loss_op.cc * test=develop alter detection.py * test=develop alter API.spec * test=develop alter round 1 * test=develop alter sigmoid_focal_loss * test=develop alter sigmoid_focal_loss_op.cc * test=develop alter test_layers.py * test=develop alter paddle/fluid/API.spec * test=develop alter sigmoid_focal_loss_op.cu * test=develop alter paddle/fluid/operators/detection/sigmoid_focal_loss_op.cc	6 years ago
FDInSky	9e4b9d9798	Update generate_proposal_labels_op to support CascadeRCNN. (#17200 ) * Update generate_proposal_labels_op to support CascadeRCNN.	6 years ago
FlyingQianMM	9ed2f936f1	add target assign operator for supporting retinanet (#17893 ) * test=develop add target assign for retinanet * test=develop run ci * test=develop add test_layers * test=develop add API.spec * test=develop alter round 1 * test=develop alter rpn_target_assign_op.cc * test=develop alter test_rpn_target_assign_op.py * test=develop alter rpn_target_assign_op.cc * test=develop alter API.spec * test=develop alter paddle/fluid/operators/detection/rpn_target_assign_op.cc * test=develop alter rpn_target_assign_op.cc * test=develop alter python/paddle/fluid/layers/detection.py * test=develop alter paddle/fluid/API.spec	6 years ago
Huihuang Zheng	7faf095618	Sync Dockerfile change of PR#17889 (#18072 ) Jian Tang made change on latest-dev Dockerfile, so sync the change in the cuda9/10 Dockerfile test=develop	6 years ago
Sylwester Fraczek	accb132f0f	fix slim int8 mkldnn multithreading issue (#18009 )	6 years ago
tianshuo78520a	2e1d8cf7c8	add approval to requirements.txt add luotao to approval requirements.txt	6 years ago
chengduo	24e988a471	Fix bug of scope_buffered_ssa_graph_executor (#18100 ) * fix code bug test=develop	6 years ago
Huihuang Zheng	3f55ab0f89	Modify format of GPU allocation failure log. (#18034 ) As title test=develop	6 years ago
gongweibao	f5caf3443c	Fix reinitialized ncclid error! (#18025 )	6 years ago
whs	354643d8d9	Add warning for cudnn warpctc kernel in CUDA9\CUDA10. (#18046 ) test=develop	6 years ago
qingqing01	e81756f1ba	Hide paddle.fluid.layers.detection_map. (#18033 ) * Remove layers.detection_map API * Users can use fluid.metrics.DetectionMAP to calculate both current-batch and cumulative-batch mAP, whereas layers.detection_map can only calculate current-batch mAP.	6 years ago
Yiqun Liu	660c1a65f3	Optimize fused_elewise_activation_grad op. (#18041 ) test=develop	6 years ago
lidanqing	466254151a	add Mobilenet ssd int8 analyzer tester (#18075 ) * add pascalvoc preprocess script and mobilenet-ssd analyzer_tester, wait 17737 * change converting local dataset to downloading and converting tarfile test=develop * change the test data_path test=develop * change copyright (c) 2016 to copyright (c) 2019 test=develop	6 years ago
石晓伟	42f12a4aca	fix ci test cmake test=develop (#18060 )	6 years ago
chengduo	b5a1c1463d	Update CPU_NUM config (#18059 ) * update CPU_NUM config test=develop	6 years ago
lidanqing	f8ecc3de89	refactor the function ConvFwdPrimitiveDesc (#17897 ) * refactor the function ConvFwdPrimitiveDesc test=develop * change according to review test=develop * use pointer way without boost::optional test=develop * pass vector to function by reference instead of raw vector test=develop * change pointer to shared_ptr test=develop	6 years ago
Michał Gallus	8462e2b805	Disable MKLDNN FC in Resnet50 test (#18030 )	6 years ago
Wojciech Uss	78e932862c	Added unit test for QAT FP32 & INT8 comparison (#17814 ) * added unit test for QAT FP32 & INT8 comparison test=develop * enabled other models and updated filenames test=develop * added accuracy check and multiple batch handling test=develop * removed quantization_mkldnn_pass.py test=develop * cleanup test=develop * updated model paths test=develop * renamed tests without MKL-DNN test=develop * fix reusing mkldnn pool2d primitive test=develop * add performance measuring test=develop * fix accuracy statistics test=develop * removed non-mkldnn tests test=develop * added conv2d_depthwise->conv2d mkldnn transformation test=develop * format update test=develop * fixed creating key for pool2d grad test=develop * added pass * Fix the accuracy issue while using float precision to get the scale. test=develop * Fix the format issue when 'X' is not nchw. test=develop * removed output comparing and changed number of images test=develop * cmake and comment fix test=develop * updated acc threshold for QAT comparison tests test=develop * added OMP_NUM_THREADS setting test=develop * enable all QAT INT8 tests test=develop * restored upstream version of a file test=develop * modified directory names test=develop	6 years ago
tensor-tang	566bf2ec56	concat op support negative axis (#18045 ) test=develop	6 years ago
Yiqun Liu	7e463c84a6	Optimize the concat and split cuda implementation for cases when the number of inputs/outputs is less than 5. (#17979 ) test=develop	6 years ago
tangwei12	101f74cb19	fix save/load in fleet (#17675 ) * fix save/load in Fleet * add UT framework of Fleet	6 years ago
hutuxian	f1d458daf0	add trainer_desc proto DEPS (#18019 )	6 years ago
Guo Sheng	a06b316b94	Fix GetExpectedKernelType of add_position_encoding_op (#17935 ) * Fix the GetExpectedKernelType of add_position_encoding_op. test=develop * Fix the doc of lstm_unit outputs in nn.py. test=develop	6 years ago
tensor-tang	5c06bff222	combine noavx and avx package (#17889 ) * support avx and noavx core * add catch and give some log test=develop * fix build test=develop * add missing package test=develop * fix pybind name test=develop * fix import error test=develop * combine noavx core test=develop * add requirements test=develop * fix unknown message test=develop * fix api spec test=develop * refine and clean test=develop * update * pass dist ut * follow comments test=develop * refine scripts test=develop	6 years ago
wawltor	8eb134c3c1	Fix scatter and gather op when has duplicate index (#17952 ) * test=develop The scatter op has a calculation bug when the indices contain duplicates: it uses overwrite mode for the repeated index. Fix this bug by using accumulate mode for the repeated index. The gather op has the same bug when it calculates the grad. We also use the open-blas and eigen libraries to reduce the time cost of accumulate mode. * test=develop Fix some code format problems and add test cases for the gather and scatter ops	6 years ago
lujun	75fcd29220	update load_error_info, test=develop (#18000 ) Repair error prompt: users are prompted to check whether the model or parameter files are damaged when loading parameters fails.	6 years ago
石晓伟	04ea7cb069	modify the access level of anakin engine (#18015 ) test=develop	6 years ago
wawltor	2ae8decc90	test=develop (#17984 ) Fix bug in sequence_unpad op: when the allocated output memory does not match the actual memory, the memory check fails. Fix this bug by allocating the output memory at the correct code position.	6 years ago
ruri	9d6640ff44	Fix edit distance doc (#17947 ) * fix im2sequence padding bug, test=develop * fix edit_distance, test=develop * add API.spec,test=develop	6 years ago
Zeng Jinle	a1bdf25ecb	Add shape not match doc to data layer (#17936 ) * add shape not match doc to data layer, test=develop * fix API.spec md5 test=develop	6 years ago
cjt222	871af28d6c	add deformable psroi pooling (#17827 ) * add deformable psroi pooling * test=develop * test=develop * test=develop modify format * fix bug * test=develop run ci * test=develop add API.spec * add test_layers.py * run ci again * test=develop run ci again * run ci again * test=develop run ci again * test=develop run ci again * test=develop run ci again * add space between two lines * test=develop add space between two lines * test=develop add space between lines * test=develop modify comment in nn.py * test=develop add space between two lines * test=develop add space between two lines * update API.spec * run ci again * test=develop run ci again * rerun ci * test=develop rerun ci * change input shape * run ci * test=develop run ci * modify format of nn.py * test=develop * test=develop * test=develop update API.spec * test=develop fix API doc * modify API comment * modify API comment * test=develop update API.spec * test=develop modify comment * test=develop modify comment * test=develop modify comment * test=develop update API.spec * test=develop modify comment * test=develop add inference in nn.py * test=develop update API.spec * test=develop resolve conflict * test=develop update API.spec	6 years ago
SunGaofeng	40885c225b	add unfold op (new op),test=develop (#17944 ) * add unfold op test=develop * fix divide bug in python3 when calculating output width and height test=develop * add name=None in python api, move redundant code into inline function * try to trigger ci for this code test=develop	6 years ago
Jacek Czaja	84bb45c054	[MKL-DNN] Thread-Safety for MKL-DNN reusing Part 1 (#17965 ) * - removed is_reusing_ * - Added TID to keys for reusing apart from softmax PD * - compilation fix * - Yet another compilation fix * - Batch Norm and Conv adapted * - Fix to softmax MT * - Fixes to MT code of MKL-DNN * - Lint fixes test=develop	6 years ago
gongweibao	da9143c1cc	Polish codes of old prs. (#17938 )	6 years ago
石晓伟	bce259e5bf	Update the Anakin interfaces for content-dnn and MLU (#17890 ) * update anakin-engine interfaces for content-dnn test=develop * support only-gpu mode of Anakin modify eltwise parse test=develop * modification for thread-safe test=develop * Integrated template instance test=develop * increase template parameters test=develop * support MLU predictor test=develop * update anakin cmake files test=develop * update TargetWrapper::set_device * update the initialization of anakin subgraph test=develop * use the default constructor of base class test=develop	6 years ago
tianshuo78520a	410907f624	added monitoring of python/requirements.txt file (#17957 )	6 years ago
hutuxian	969e6378b9	Pipeline Concurrency (#17402 ) Add Pipeline Concurrency Train Mode: - Cpp: pipeline_trainer & section_worker - Python: PipelineOptimizer - Add a new data_feed type: PrivateInstantDataFeed - Add a test demo of pipeline trainer and the test model is gnn - Do not support win32 now	6 years ago
Zhaolong Xing	4e8d5a034f	Light mem reuse strategy for inference. (#17925 ) * fix: when using the load-model-from-memory mode, RAM usage is high test=develop * light mem reuse test=develop * fix cpplint test=develop	6 years ago
Tao Luo	53fd507bae	fix merge conflict of 'Remove attribute in Allocator::Allocate' and elementwise_add_mkldnn_op (#17949 ) test=develop	6 years ago
zhaoyuchen2018	3847d9fc2c	refine sum stack api doc (#17923 ) test=develop	6 years ago
jerrywgz	aab4d12c0e	refine GetExpectedKernelType in concat op, test=develop (#17934 )	6 years ago
Zeng Jinle	3ece61f71e	Remove attribute in Allocator::Allocate (#17878 ) * remove attribute in Allocator::Allocate, test=develop * fix travis ci error, test=develop	6 years ago
Yibing Liu	33d1e56506	Enable seq_pool op to accept len 0 input (#17284 ) * Enable seq_pool op to accept len 0 input test=develop * Update sequence_pool's api test=develop * Add more unittest cases for seq_pool op test=develop * Remove legacy comments test=develop * Don't use template in op maker test=develop	6 years ago
Yihua Xu	9b5017366a	Fix the format issue when 'X' is not nchw. (#17833 ) test=develop	6 years ago
Hongyu Liu	8062bd510c	Reshape support tensor attribute (#17781 ) * add reshape support tensor; test=develop * fix reshape bug; test=develop * change reshape attribute default value; test=develop * fix reshape input name; test=develop * fix reshape unitest; test=develop * check dim tensor shape; test=develop	6 years ago
gongweibao	972c54cd70	Fix FLAGS_fuse_parameter_memory_size unit from Bytes to MBytes. (#17924 )	6 years ago
Zeng Jinle	0a96ec699c	fix conv v7 workspace size limit error, test=develop (#17902 )	6 years ago
Jiabin Yang	4d5f6937c3	Feature/refine api for dygraph (#17907 ) * WIP * WIP * test=develop, add api doc and example code for dygraph	6 years ago
gongweibao	dd4cd352c7	Fix sync_batch_norm_op ncclallreduce error! (#17918 )	6 years ago
whs	5df65e506d	Add Light-NAS for PaddleSlim (#17679 ) * Add auto pruning strategy. 1. Fix compressor. 2. Enhance graph executor. 3. Add SAController 4. Add auto pruning strategy. 5. Add unittest for auto pruning strategy. test=develop * Init light-nas * Add light nas. * Some fix. test=develop * Fix sa controller. test=develop * Fix unittest of light nas. test=develop * Fix setup.py.in and API.spec. test=develop * Fix unittest. 1. Fix unittest on windows. 2. Fix package importing in tests directory. * 1. Remove unused comments. 2. Expose eval_epoch option. 3. Remove unused function in search_agent. 4. Expose max_client_num to yaml file. 5. Move flops constraint to on_epoch_begin function test=develop * Fix light nas strategy. test=develop * Make controller server stable. test=develop * 1. Add try exception to compressor. 2. Remove unittest of light-nas for windows. test=develop * Add comments Enhance controller test=develop * Fix comments. test=develop	6 years ago
Zeng Jinle	3925bd81e8	Fix cuda/cudnn version detection error (#17853 ) * fix cuda/cudnn version detection error, test=develop * fix again, test=develop	6 years ago
Yihua Xu	14a32bf0c4	Fix the accuracy issue while using float precision to get the scale. (#17884 ) test=develop	6 years ago
gongweibao	fbbdc9ccad	Add backward and optimizer operator dependency pass. (#17746 )	6 years ago
mozga-intel	c1379bf238	[NGraph] Bert model for a capi, ngraph's support test=develop (#17844 )	6 years ago
baojun	e2c1b7c354	[NGraph] cache compiled function instead test=develop (#17845 )	6 years ago
石晓伟	d008260fa8	update the initialization of anakin subgraph (#17880 ) test=develop	6 years ago
Zhaolong Xing	ae576f3c68	fix: when using the load-model-from-memory mode, RAM usage is high (#17788 ) test=develop	6 years ago
Zhaolong Xing	5efe8c7287	fix bug: the lod_tensor_to_array op will apply a new var but not release it when doing inference (#17856 ) test=develop	6 years ago
Jiabin Yang	022dfed4fc	Add optimizer save and load (#16986 ) * save optimizer related vars in dygraph * test=develop, add optimizer save and load * test=develop, add optimizer save and load * test=develop, merge code and add multi-optimizer save and load * test=develop, fix test_imperative_checkpoint * test=develop, fix include error * test=develop, fix include error * test=develop, renew api spec * test=develop, refine code * test=develop, set default value for checkpoint * test=develop, fix ci error * test=develop, change API.spec and make api more readable * test=develop, refine version and time stamp * test=develop, add example code and refine code * test=develop, refine doc * test=develop, change version	6 years ago
wopeizl	453a49b1bc	Make ParallelExecutor support Windows GPU (#17787 ) * fix the ParallelExecutor on Windows test=develop * restrict to use one GPU only under windows	6 years ago
pawelpiotrowicz	39bc8a55a4	[NGraph] Enable ngraph layer_norm operator (#17599 ) * Enable ngraph layer_norm operator test=develop * Disable/Enable cuda, new unit-test test=develop * Fix use_cudnn test=develop * Fixed test_layer test, new function is added test=develop * set use_cudnn by default test=develop	6 years ago
翟飞跃	993c703bcc	INT8 MKL-DNN v2 integrate to slim (#17634 ) * refactor PR 16865 * delete mergetool files * test=develop * test=develop * test=develop * test=develop * create dir for int8 model before call SaveOptimModel * test=develop * mkldnn int8 only support linux; test=develop * refine code; test=develop * remove comment; test=develop * refine code; test=develop * fix bug; test=develop * add exception for mkldnn_post_training_strategy * reuse int8v2 CAPI dataset; test=develop * fix accuracy check bug; test=develop * remove tab * convert files to unix format * test=develop * reduce CI time;test=develop * reduce CI time and refine code;test=develop * refine comment; test=develop * add cmake FLAGS;test=develop * remove predict_num;test=develop	6 years ago
wopeizl	841553e13f	use pyreader to read data in dygraph mode (#17314 ) * use pyreader to read data * add return_list to PyReader to support return value represented as list	6 years ago
chengduo	5436d66667	close socket connect (#17862 ) test=develop	6 years ago
baojun	a4c528a31c	[NGraph] some ngraph updates to enable bert (#17739 ) * delay infershape test=develop * fall back subblock to paddle test=develop * fix edge cases test=develop * remove output duplicates test=develop * handle reshape2_grad infershape test=develop	6 years ago
Jiabin Yang	3d3f5506d2	Feature/Fix recurrent usage of Varbase in Dygraph (#17838 ) * for debug * test=develop, memory optimize for dygraph using shared_ptr * test=develop, fix travis ci showed error * test=develop, fix bug for recurrent usage of varbase * test=develop, init varbase when it need to be Add * test=develop, fix problem of recurrent gradient * test=develop, add gradient test for recurrent varbase usage	6 years ago
Zeng Jinle	674e0ce2d6	Use Python C-API to speed up dygraph trace (#17837 ) * use python api to reduce python time cost, test=develop * fix travis ci, test=develop * fix Py_None error,test=develop	6 years ago
tianshuo78520a	47cc1b51ad	Change Linux CI check API	6 years ago
jerrywgz	5e4f99dd74	refine doc for prelu (#17810 ) * refine doc for prelu	6 years ago
chengduo	d1169afaa3	remove InstallFailureSignalHandler (#17828 ) test=develop	6 years ago
chengduo	437520474c	fix DropLocalExeScopes (#17829 ) test=develop	6 years ago
Leo Zhao	50326563d5	enable mkldnn primitive reuse for platform reorder (#17826 ) test=develop	6 years ago
baojun	7611208ab7	[NGraph] added gather_grad to ngraph test=develop (#17646 )	6 years ago
tensor-tang	557452e778	update and polish hash op doc (#17809 ) * update and polish hash op doc test=develop * update api spec test=develop	6 years ago
jerrywgz	92d9bdfce2	fix api doc in slice op, test=develop (#17804 )	6 years ago
Hongyu Liu	dfec676270	expand op support tensor attribute (#17773 ) * expand support tensor attribute; test=develop * fix bug ; test=develop * fix unit test bug; test=develop * fix copy bug; test=develop * refine expand_times default value; test=develop	6 years ago
Jiabin Yang	3b70f870e2	Using Smart pointer to optimizer memory usage of dyGraph (#17768 ) * for debug * test=develop, memory optimize for dygraph using shared_ptr * test=develop, fix travis ci showed error * test=develop, fix bug for recurrent usage of varbase * test=develop, init varbase when it need to be Add	6 years ago
Hongyu Liu	82358bfdc1	one hot support tensor depth (#16972 ) * support some input tensor remain on cpu; test=develop * fix input = none; test=develop * fix unfound bug; test=develop * fix proto None case; test=develop * fix bug; test=develop * fix proto null bug; test=develop * remove conv check; test=develop * fix test bug; test=develop * move fill constant; test=develop * no change in proto; test=develop * fix bug; test=develop * change attr depth name; test=develop * remove remain cpu; test=develop * fix bug; test=develop * merge develop; test=develop * fix one_hot bug; test=develop * fix bug; test=develop * fix bug; test=develop * fix bug; test=develop * fix python api bug; test=develop	6 years ago
Brian Liu	7cfddf22c8	Optimize bilinear interpolate op with OpenMP (#17800 ) Refactor the code to be OpenMP friendly test=develop	6 years ago
Yibing Liu	d6d33fd748	Add update method for ema (#17812 )	6 years ago
wangchaochaohu	c10157a5df	revise the cudnn conv choose algorithm to improve the performance(mask rcnn benchmark) (#17753 ) * revise conv layer cudnn algo choose test=develop * update for code style test=develop * update for code style test=develop	6 years ago
chengduo	863c75168c	polish error doc (#17772 ) test=develop	6 years ago
Tao Luo	e089e454a1	make omp thread num default 1 after inference run (#17801 ) test=develop	6 years ago
mozga-intel	6a6bf597f7	[NGraph] Enable elementwise_div operator test=develop (#17515 ) * Enable elementwise_div operator test=develop * Fix update date test=develop	6 years ago
Huihuang Zheng	931698a54a	Modify doc of program_guard, py_reader, data, and clone (#17727 ) Note the append_batch_size variable is doing prepend. We should change the name, but due to backward compatibility, I suggest to change at v2.0. Not now. test=develop	6 years ago
lidanqing	d7c5c2bd64	Add input format in Transpose GetHash (#17737 ) * fix the bug of mobilenet-ssd INT8 inference without overloading GetHash test=develop * remove the out_grad->format() in TransposeMKLDNNGradOpKernel test=develop	6 years ago
tangwei12	659b72a97c	fix document of python api get_startup_program() (#17764 ) * add example to get_startup_program() * fix example to get_startup_program()	6 years ago
AIFollowers	93de124cec	modify some initializer api (#17301 ) * test=develop modify some initializer api * test=develop modify API.spec * test=develop modify API.spec * test=develop modify API.spec * test=develop modify API.spec	6 years ago
guru4elephant	d52391094d	fix prepare context redundant code problem, optimize executor by cach… (#17743 ) * fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop * cache sub_scope, program, var when use_program_cache=True is set * make fetch_list runable with variables, add more unittest for use_program_cache	6 years ago
baojun	2c58f1a83c	[NGraph] Added lookup table to ngraph engine test=develop (#17647 )	6 years ago
pawelpiotrowicz	bacc822492	[NGraph] Enable transpose ngraph operator (#17636 ) test=develop	6 years ago
lujun	ed9d603a8a	fix api doc: Optimizer.ModelAverage (#17395 )	6 years ago
baojun	90eae0b39a	[NGraph] Added slice op to ngraph test=develop (#17648 )	6 years ago
baojun	2fbaa5c075	[NGraph] added matmul op to ngraph engine test=develop (#17645 )	6 years ago
hong19860320	68dcb1bd7b	fix API examples of assign, reverse and array_write, etc. (#17287 ) * fix API examples of assign, reverse and array_write test=develop * update API.spec test=develop * update API examples for array_length, array_read, array_write, assign, hard_sigmoid, hsigmoid, increment, ones, pow, reverse, uniform_random and zeros * update API.spec for assign, reverse and array_write, etc.(#17287) * test=develop	6 years ago
tianshuo78520a	f144740b73	change ci ctest exit code (#17745 )	6 years ago
chengduo	67c8dade58	Add Event in ScopeBuffer Executor (#17667 ) * add event for fast executor and add threads for scopebuffer executor test=develop	6 years ago
Bai Yifan	bba57cdd82	Add deformable conv v2 op,test=develop (#17145 ) * unit commits, test=develop * update API.spec, test=develop	6 years ago
wangchaochaohu	bd48950c7e	fix paddlepaddle API examples (#17306 ) * API.spec test=develop * update * update test=develop * update test=develop * update * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * test=develop * update * update test=develop * update test=develop	6 years ago
YishengCheng	bd15912d65	fix bug for ctr_reader for svm data (#17575 ) * fix bug for ctr_reader test=develop * fix svm data test=develop fix svm data test=develop	6 years ago
Yiqun Liu	8fd39f3e99	Enhance fused_elementwise_activation op and add python api in contrib.layers (#17236 ) * Enhance fused_elementwise_activation op. test=develop * Move the api fused_elementwise_activation to contrib. test=develop * Add including files. test=develop * Add the support of sigmoid in fused_elementwise_activetion op. * Update API.spec. test=develop	6 years ago
yaoxuefeng	ac92e4c066	fix distributed_transpiler.py api test=develop (#17668 )	6 years ago
Yiqun Liu	2704479bb2	Optimize recurrent_op using Prepare and RunPreparedContext, avoiding create operators in every iter. (#17689 ) test=develop	6 years ago
pawelpiotrowicz	9b99876442	Enable less_than ngraph operator (#17642 ) * Enable less_than ngraph operator test=develop * Added compare unit-tests test=develop * Update: date && removed import test=develop	6 years ago
Zhaolong Xing	a9a531fa5f	Refine python api code example note: (#17369 ) * fix: 1. infernce multi card occupy 2. facebox model inference occupy too much test=develop * refine python api comments: shuffle, while, scale, sampled_softmax_with_cross_entropy, scatter, round, sin, sqrt, shape, split, soft_relu, slice, selu, ifelse, switch. test=develodp * fix conflict error. test=develop	6 years ago
Jiabin Yang	effc555955	test=develop, lazy init Grad (#17653 )	6 years ago
hutuxian	4ff87c049d	remove useless input 'Softmax@GRAD' from softmax_with_cross_entropy op (#17612 )	6 years ago
Tao Luo	b4b169467b	add fc_mkldnn_pass in compare_mkldnn (#17712 ) test=develop	6 years ago
pawelpiotrowicz	70a887af63	[NGraph] Add reduce_sum operator for Ngraph (#17450 ) test=develop	6 years ago
baojun	29baca0dd8	add depthwise_conv2d op to ngraph engine (#17454 ) * add depthwise_conv2d test=develop * use cpu for ngraph test=develop	6 years ago
gongweibao	0d561ef442	fix 2dconn test=develop (#17681 )	6 years ago
mozga-intel	ccf9e2327b	[Lite] Enable cast operator test=develop (#17294 )	6 years ago
tangwei12	0d3c48e0a8	fix doc in transpiler, test=develop (#17313 ) * fix doc in transpiler, test=develop	6 years ago
Hongyu Liu	9f85f21880	Add new grad clip [old gradient clip not supported in dygraph] (#17523 ) * add gradient clip in minimize; test=develop * fix bug; test=develop * fix format; test=develop * move new grad clip to dygraph/grad_clip.py; test=develop * fix lr decay and grad clip test; test=develop * separate dygraph grad clip; test=develop * fix grad clip test; develop * fix api spec bug; test=develop * add blank line, test=develop,test=document_preview to fix format problem	6 years ago
Zhaolong Xing	4337009b92	fix trt ci timeout error (#17701 ) test=develop	6 years ago
mozga-intel	5eb81fe595	Capi for a ngraph engine (#17037 )	6 years ago
Yiqun Liu	5782dddad0	Optimize the concat and split kernel for special cases when the number of inputs/outputs is 2 (#17415 ) * Optimize the concat and split kernel for special cases that the number of inputs/outputs is 2. test=develop * Refine codes. test=develop * Correct the condition. test=develop * Move the define of tmp_data outside the if statement. * Print the cudnn minor version. test=develop * Fix the case when in_num/o_num is 1 in concat/split op. test=develop * Remove const_cast. test=develop	6 years ago
石晓伟	acbb4bf38d	update python API examples (#17351 ) * update python APIs test=document_preview test=develop * update API.spec test=document_preview test=develop * update merge_selected_rows * update API.spec test=document_preview test=develop * update API.spec test=document_preview test=develop * fix the comment of less_than test=develop test=document_preview * update API.spec test=develop test=document_preview * update API.spec test=develop test=document_preview * update API.spec test=develop test=document_preview * update API.spec test=develop * update API test=develop	6 years ago
Jiabin Yang	7a401da52f	test=develop, fix mac ci will not uninstall dependency files when error occurs (#17688 )	6 years ago
lidanqing	04b6c29ee0	Improve mobilenetv2 INT8 performance by using INT8 relu as post-op (#17570 ) * add INT8 conv+relu6 fuse and enable mobilenetv2 INT8 test test=develop * change false and 0.0 to fuse_brelu and brelu_threshold test=develop change the "fuse_relu||fuse_brelu" to "unsigned_output" test=develop * Use relu instead of brelu as INT8 post-op because INT8 brelu is not enabled in mkldnn v0.18 test=develop * continuous-integration fix test=develop	6 years ago
Jacek Czaja	6d8075ecef	[MKL-DNN] conv_transpose mkldnn bias pass (#17644 ) * - changes to graph detector - Changes to pass - Added ut for new pass - use_pass - Added pass to mkldnn passes - fix to registration - improved verbose messaging for conv bias passes - Lint fixes test=develop * - Lint fixes test=develop	6 years ago
Shuai Yuan	41f1186c6b	[DOC][PYTHON] Fix api docs, test=develop, test=document_preview (#17629 ) * [DOC] Fix api docs, test=develop, test=document_preview * [DOC] Fix api annotation: fluid.layers.tensor_array_to_tensor. test=develop, test=document_preview * test=develop, test=document_preview update MD5 of tensor_array_to_tensor	6 years ago
wopeizl	058f1f1e1b	fix the api example for create_global_var, create_parameter, SGDOptim… (#17371 ) * fix the api example for create_global_var, create_parameter, SGDOptimizer, RMSPropOptimizer, MomentumOptimizer, LarsMomentumOptimizer, FtrlOptimizer test=develop * add example for adamoptimizer fix API.spec test=develop * test=develop * test=develop	6 years ago
Yibing Liu	4f4f0993c1	Bias correction for exponential moving average (#17677 ) * Bias correction for exponential moving average test=develop, test=document_preview * Fix docs test=develop, test=document_preview	6 years ago
Tao Luo	962eed6f82	Revert "Enable SQRT operator for the nGraph Bridge (#17549 )" (#17680 ) This reverts commit `f34830e2aa`.	6 years ago
Tao Luo	67a6297a9f	update unique_name notes and examples (#17671 ) test=develop	6 years ago
Krzysztof Binias	f34830e2aa	Enable SQRT operator for the nGraph Bridge (#17549 ) * Enable sqrt operator for the nGraph Bridge. test=develop * Update activation_op.h	6 years ago
Sylwester Fraczek	96845d2168	add Concat quantization (#17448 ) * add Concat quantization add unit test for quantizing concat fix for wrong value when the input is not in map of calculated scales add use_quantizer to concat_op.cc add scale_algo rules for concat test=develop * missing fix for multiple inputs quantize-squash * wojtuss review fix: adding comment test=develop	6 years ago
Zeng Jinle	432ac70124	clean code of py_layer in dygraph mode,test=develop (#17661 )	6 years ago
gongweibao	65bbf950ee	Add multi-ncclcomm and 2D ncclallreduce support. (#17263 )	6 years ago
Krzysztof Binias	b1bd483a7d	[NGraph] Enable gelu operator for the nGraph Bridge. (#17547 ) test=develop	6 years ago
Zhen Wang	8bd651b7ed	Fix the bug in the AnalysisPredictor and add more directions about io APIs. (#17639 ) * fix the bug that sub_scope_ may be null in AnalysisPredictor::Run. * add more directions about io APIs' docs. * update the API.spec. test=develop test=document_preview	6 years ago
chengduo	343017324e	Polish Print Op (#17651 ) * enhance print	6 years ago
Zeng Jinle	4aa931dd85	Code clean of Allocator (#17602 ) * Revert "Revert "Fix allocator bug"" This reverts commit `174d0d0b90`. * Revert "fix travis ci" This reverts commit `5656fa9f7c`. test=develop * add inlined_vector.h, test=develop * add inlined_vector_test,test=develop * clean code of allocator,test=develop * delete zero_size_allocator.h,test=develop * fix failed unittest,test=develop	6 years ago
Guo Sheng	430e25654b	Fix the usage of out_grad lod in sequence_slice_op. (#17625 ) test=develop	6 years ago
Huihuang Zheng	afc3d85da2	Remove Docker build for CI tasks (#17650 ) * Add Dockerfile for cuda9 and cuda10	6 years ago
Bai Yifan	bbd6e438fc	fix conflicts,test=develop (#17186 )	6 years ago
bdzhuxiaoning	9f85afb7b6	test=develop (#17643 )	6 years ago
chengduo	9322216170	Add data distributed_sampler (#17573 ) * add data parallel batch	6 years ago
hutuxian	1670db5e86	Gather Op Index Support int64_t datatype (#17610 ) * gather_op support int64_t index by adding a template typename * add UT and rename typename test=develop	6 years ago
Huihuang Zheng	febc07f047	Add Dockerfile for cuda9 and cuda10 (#17600 ) * Add Dockerfile for cuda9 and cuda10 Add Dockerfile for building cuda9 cuda10 images.	6 years ago
mozga-intel	2b83d75bfa	Enable elementwise pow operator for ngraph (#17526 )	6 years ago
Zhaolong Xing	61221ebc28	TRT: Support set dynamic range in int8 mode. (#17524 ) * fluid int8 train and trt int8 predict align. trt int8 predict init op converter * 2. align fluid int8 train and trt int8 inference. enhance quant dequant fuse pass enhance op converter, trt engine, trt engine op, trt subgraph pass. * 3. add delete_quant_dequant_pass for trt test=develop * 4. add the missing file test=develop * 5. i modify the c++ interface, but forget to modify the pybind code fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter test=develop	6 years ago
Michał Gallus	0c39b97b4e	[MKL-DNN] Add Fully Connected Op for inference only (#15226 ) * fuse mul and elementwise add to fc * Reimplement the FC forward operator * Fix FC MKLDNN integration by transposing weights * Add FC MKLDNN Pass test=develop * FC MKLDNN Pass: change memcpy to std::copy * Fix MKLDNN FC handling of mismatch input and weights dims * Lower tolerance for MKL-DNN in resnet50 test test=develop * Adjust FC to support MKLDNN Op placement test=develop * Adjust Placement Op to set use_mkldnn attribute for graph test=develop * MKLDNN FC: fix weights format so that gemm version is called test=develop * FC MKL-DNN: Remove tolerance decrease from tester_helper * FC MKL-DNN: Refactor the code, change input reorder to weight reorder * MKL-DNN FC: Introduce operator caching test=develop * FC MKL-DNN: Fix the tensor type in ExpectedKernelType test=develop * FC MKL-DNN: fix style changes test=develop * FC MKL-DNN: fallback to native on non-supported dim sizes test=develop * FC MKLDNN: fix CMake paths test=develop * FC MKLDNN: Refine placement pass graph mkldnn attribute test=develop * Fix Transpiler error for fuse_conv_eltwise test=develop * Fix missing STL includes in files test=develop * FC MKL-DNN: Enable new output size computation Also, refine pass to comply with newest interface. test=develop * FC MKL-DNN: enable only when fc_mkldnn_pass is enabled * FC MKL-DNN: Allow Weights to use oi or io format * FC MKL-DNN: Adjust UT to work with correct dims test=develop * Enable MKL DEBUG for resnet50 analyzer test=develop * FC MKL-DNN: Improve Hashing function test=develop * FC MKL-DNN: Fix shape for fc weights in transpiler * FC MKL-DNN: Update input pointer in re-used fc primitive * Add log for not handling fc fuse for unsupported dims test=develop * FC MKL-DNN: Move transpose from pass to Op Kernel test=develop * FC MKL-DNN: Disable transpose in unit test test=develop * FC MKL-DNN: Remove fc_mkldnn_pass from default list * Correct Flag for fake data analyzer tests test=develop * FC MKL-DNN: Add comment about fc mkldnn pass disablement test=develop * FC MKL-DNN: Disable fc in int8 tests test=develop	6 years ago
wopeizl	6724a652f3	add __str__ method for tensor and lodtensor to support print test=dev… (#17588 ) * add __str__ method for tensor and lodtensor to support print test=develop	6 years ago
Krzysztof Binias	e9216d0602	Enable logical operators for the nGraph Bridge. (#17543 ) test=develop	6 years ago
Hongyu Liu	cbaf9e5344	Fix api example [ lstm, sequence_enumerate, sequence_expand,sequence_expand_as ] (#17210 ) * fix example; test=develop * fix api spec; test=develop * fix api spec; test=develop * add doc check test=develop test=document_preview * test=develop,test=document_preview add blank line to fix format, add one more "import" * fix bug; test=develop * fix bug; test=develop	6 years ago
guru4elephant	326bf8291a	add Run Prepared Ctx (#17616 ) add Run Prepared Ctx, fix pybind problem	6 years ago
Yibing Liu	e8990e64f6	Fix trust ratio in lamb (#17614 ) test=develop	6 years ago
Guo Sheng	2a7b321110	Fix the example code in some Python API. (#17343 ) * Fix the example code in some Python API. test=develop * Fix the example code in some Python API by adding import. test=develop	6 years ago
chengduo	b5f4d5ed0e	Add broadcast operators (#17503 ) * This PR adds broadcast for multi-process. And it could be used in dynamic graph to broadcast parameters.	6 years ago
flame	2280f185d7	BuildStrategy api comment (#17348 ) Python examples of fluid.layers.io.double_buffer and some BuildStrategy's methods.	6 years ago
Sylwester Fraczek	5b2a3c4b12	Conv concat relu quantization (#17466 ) * add conv_concat_relu fuse test=develop * add test code test=develop * added missing include with unordered_map test=develop * review fixes for wojtuss test=develop * remove 'should (not) be fused' comment statements one of them was invalid anyway test=develop	6 years ago
Sylwester Fraczek	bccb0ba49a	fix quantize_squash_pass segfault when no tensor linked to Bias (#17292 ) * fix quantize_squash_pass segfault when there is no tensor linked do Bias input test=develop * add googlenet test test=develop * fix concat CreateKey not using input format test=develop	6 years ago
chengduo	2dc1c6f25c	Add profiler in tracer (#17076 ) * add profiler in tracer.cc * add profiler in layer.cc test=develop * add profiler in Layer.cc test=develop	6 years ago
mozga-intel	0d4cbdad91	[NGraph] Enable elementwise mul operator (#17552 )	6 years ago
tianshuo78520a	cee9dcc383	Delete LoDTensorset in API.spec (#17577 ) * test=develop * test=develop * test=develop * del #	6 years ago
mozga-intel	f2694e122d	[NGraph] Enable assign operator for a ngraph, test=develop (#17437 ) * Enable assign operator for a ngraph, test=develop * Cross_entropy operators needs to be updated	6 years ago
mozga-intel	cf02cb5e98	Enable elementwise sub operator for ngraph (#17527 )	6 years ago
guru4elephant	7f8bc49d00	polish_executor_and_add_ctx_cache (#17536 ) * polish_executor_and_add_ctx_cache	6 years ago
tensor-tang	7ae461eb13	[CPU] refine cpu softmax bwd (#17534 ) * refine softmax fwd test=develop * refine cpu softmax bwd test=develop * fix batch size test=develop * fix compile issue with gpu test=develop * add value clip	6 years ago
Yibing Liu	6e11f97708	Add exponential moving average (#17562 ) * Add exponential moving average test=develop, test=document_preview * Polish documents test=develop, test=document_preview * Update API spec test=develop, test=document_preview	6 years ago
tensor-tang	0600b370ea	[CPU] refine softmax op fwd on CPU (#17522 ) * refine softmax fwd test=develop * fix compile issue with gpu test=develop * add value clip to avoid exp	6 years ago
Zeng Jinle	c6189637cd	Fix allocator bug (#16712 ) * Revert "Revert "Fix allocator bug"" This reverts commit `174d0d0b90`. * Revert "fix travis ci" This reverts commit `5656fa9f7c`. test=develop * add inlined_vector.h, test=develop * add inlined_vector_test,test=develop	6 years ago
mozga-intel	035771512d	Enable elementwise min operator for ngraph (#17521 )	6 years ago
Kaipeng Deng	cf60e5a2db	fix API python example (#17226 ) * fix api example. test=develop * fix API.spec. test=develop * fix spectral_norm format. test=develpp * merge develop * add import. test=develop * fix indent. test=develop * fix indent. test=develop * add import fluid. test=develop	6 years ago
Qiao Longfei	92e7d5d7cc	fix distribute doc test=develop (#17318 ) * fix distribute doc	6 years ago
jerrywgz	c1aae8b8d2	Fix GetExpectedKernelType in Concat op (#17459 ) * fix concat op vartype check, test=develop	6 years ago
Qiao Longfei	58f7695ab2	Async exe support communicator (#17386 ) Async exe support communicator	6 years ago
Zhaolong Xing	38da103034	fix trt ci bug temporary. (#17565 ) ban all trt ut. will fix it later. test=develop	6 years ago
mozga-intel	109b5aed5a	[NGraph] Enable reshape operator test=develop (#17512 )	6 years ago
zhang wenhui	9bb6a421e3	fix bpr_loss data_norm teacher_student_sigmoid_loss api & fix continuous_value_model (#17331 ) * fix bpr data_norm teacher_student_sigmoid , test=develop test=document_preview Fixed the three APIs bpr_loss, data_norm, and teacher_student_sigmoid_loss, and also fixed English spelling errors in the continuous_value_model documentation	6 years ago
lijianshe02	300bd7504d	fix api-doc related bugs test=develop test=document_preview (#17360 ) * fix api doc according to the reviewer's comment test=develop	6 years ago
lijianshe02	daf88968e2	fix bug that saved optimal model path in test_analyzer_save_model con… (#17555 ) * modify saved model path in analyzer_save_model.cc test=develop	6 years ago
Krzysztof Binias	43d15b9d96	Enable square operator for the nGraph Bridge. (#17551 ) test=develop	6 years ago
Sevin F. Varoglu	f86f49e779	[NGraph] add increment op to ngraph engine (#16929 ) * add increment op to ngraph engine test=develop * fix style errors test=develop	6 years ago
baojun	8923612b10	NGraph enable parse serialized graph test=develop (#17453 )	6 years ago
Yiqun Liu	cf5d271c5a	Fix examples of fluid.layers.sums and fluid.layers.DynamicRNN (#17308 ) * Fix examples of fluid.layers.sums. test=document_preview * Correct the example of DynamicRNN and its functions. test=develop * Add 'import paddle.fluid as fluid' to examples. test=develop * Update API.spec. test=develop * Add space lines. test=develop * Update the API.spec. test=develop	6 years ago
guomingz	2281ebf0f3	Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130 ) * Relu6 is the bottleneck op for Mobilenet-v2. As mkldnn supports the conv/relu6 fusion, we implement this fusion via the cpass way. Because int8 enabling for this fusion will be supported in MKLDNN v0.20, this PR focuses on the fp32 optimization. Benchmark (FPS) measured on skx-8180 (28 cores): batch size 1: 214.7 with fusion, 53.4 without fusion; batch size 50: 1219.727 with fusion, 137.280 without fusion. test=develop * Fix the format issue test=develop * Add the missing nolint comments. test=develop * Fix the typos. test=develop * Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine. test=develop * Adjust the indentation. test=develop * Add the test_conv_brelu_mkldnn_fuse_pass case. test=develop * Slightly update the code per Baidu comments. Let the parameter definition be embedded into the code. That will make the code easier to understand. test=develop	6 years ago
Yibing Liu	f9796b1249	Add LAMB Optimizer support (#17489 ) * Add LAMB optimizer * Expose LAMB Optimizer's APIs test=develop, test=document_preview * Cleanup code & doc test=develop, test=document_preview * Update lamb optimizer's formula test=develop	6 years ago
mozga-intel	99ab57123c	Enabled ngraph elementwise max operator (#17517 )	6 years ago
Tao Luo	3d19f44a89	remove unused SERIAL compiler option (#17500 ) test=develop	6 years ago
zhaoyuchen2018	dfdcd91869	Add api doc code examples (#17285 ) * Add api doc code examples add or fix topk, squeeze, stack, StaticRNN, StaticRNN memory in doc test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Add squeeze md5. test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Add import package test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	6 years ago
mozga-intel	1eb151752e	Enable abs operator for a ngraph test=develop (#17436 )	6 years ago
lidanqing	36757ed203	Enabling resnet101, vgg16, vgg19 INT8v2 model tests (#17468 ) * Add 6 models tests support in CMake * enabling resnet101, vgg16, vgg19 INT8v2 model tests test=develop * remove SERIAL test=develop	6 years ago
liuwei1031	ba70cc499e	fix security bugs : (#17464 ) http://newicafe.baidu.com:80/issue/PaddleSec-33/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-28/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-25/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-24/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-21/show?from=page http://newicafe.baidu.com:80/issue/PaddleSec-20/show?from=page test=develop	6 years ago
Zhaolong Xing	ff7f911b4d	add quant_dequant_moving_avg_max_abs op (#17480 ) * add quant_dequant_moving_avg_max_abs op test=develop * add more note for quantdequant op test=develop	6 years ago
Qiao Longfei	287de41c04	Optimize communicator flags (#17494 ) * optimize communicator flag * change flags in init py test=develop	6 years ago
liuwei1031	c3949f5699	remove two useless flags: enable_subgraph_optimize, memory_optimize_debug, test=develop (#17491 )	6 years ago
liuwei1031	f82e4d75e7	improve the doc of paddle.fluid.memory_optimize, test=develop (#17473 ) * improve the doc of paddle.fluid.memory_optimize, test=develop * fix typo, test=develop	6 years ago
Tao Luo	32da5e9c3d	remove unused expected_kernel_cache_pass (#17486 ) test=develop	6 years ago
wopeizl	ca3ba378c7	fix the random compilation failure on windows test=develop (#17475 ) * fix the random compilation failure on windows	6 years ago
lvmengsi	10b23a72c1	Double backward elementwise div (#17416 ) * double backward, elementwise_div * fix dx empty. test=develop * bug fix (#17392) fix secure bug * Eanble stack operator for a Ngraph, test=develop (#17406) * fix sqrt_grad_grad unittest. test=develop (#17410) * fix sqrt_grad_grad unittest. test=develop * disable sqrt_grad_grad unittest. test=develop * test=develop, fix unittest * test=develop, fix unittest * test=develop, fix unittest * test=develop, fix bug * fix unittest. test=develop * fix unittest dx. test=develop * tmp fix! for test... test=develop * reduce tmp, test=develop * test=develop, reduce tmp * fix broadcast unittest. test=develop * fix format. test=develop * refine code. test=develop * refine code. test=develop * refine GetDoubleGradSafeTensor. test=develop * fix format. test=develop	6 years ago
qingqing01	97f0ec2357	Fix compiling error with cuDNN 5.1 (#17458 ) test=develop	6 years ago
Zeng Jinle	3d4e8268c6	fix recurrent fwd bug when no backward and scope clear (#17460 )	6 years ago
lvmengsi	977e9fcb27	support elementwise_sub double backward (#17476 ) add elementwise_sub_grad_grad op for backward of backward calculation	6 years ago

15686 Commits (1c2aae567a8863c9cdb666fc3b553b8f01281d15)