Paddle

Commit Graph

Author	SHA1	Message	Date
chengduo	8281497030	Fix warning info of build_strategy (#19805 ) * fix warning info test=develop * fix bug of all_reduce_deps_pass test=develop	6 years ago
Zeng Jinle	b34933d9ee	fix retry allocator bug, test=develop (#19794 )	6 years ago
Yiqun Liu	c67c8758cb	Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733 ) * Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop * Enhance fc_fuse_pass to enable fusing relu. * Allow print the shapes of var_desc in graph. test=develop * Enhance fc_fuse_pass_tester. * Remove the use of PADDLE_ENFORCE. test=develop * Correct the number of ops after fusing. test=develop * Fix a typo. test=develop * Set activation_type to null when there is no relu in fc. test=develop * Refine fc_fuse_pass's codes. * Enable the set of shape for tensor. * Refine repeated_fc_relu_pass and add unittest. test=develop	6 years ago
Zeng Jinle	32b1151f5e	reduce default value of cudnn workspace size, test=develop (#19780 )	6 years ago
zhongpu	52673956de	add kernel for squeeze_op, test=develop (#19656 ) * add kernel for squeeze_op, test=develop * delete comment, test=develop	6 years ago
zhongpu	2a81c3679a	add kernel for unstack_op, test=develop (#19538 ) * add kernel for unstack_op, test=develop * add kernel for unstack_op, test=develop * add kernel for unstack_op, test=develop * adjust the code format, test=develop * modify some comment, test=develop	6 years ago
Chen Weihang	00d5375e0c	Add prune_backward function to cover complicated test_program.clone situation (#19772 )	6 years ago
Kaipeng Deng	99c78b772a	fix softmax axis!=-1. test=develop (#19800 )	6 years ago
tianshuo78520a	38f1c2fe28	change approve site (#19791 ) * change approve site ;test=develop * test=develop	6 years ago
Adam	d4413a54bc	Add common CreateKey for mkldnn handlers (#19767 ) test=develop	6 years ago
Yihua Xu	0d6ea52958	Fix the definition issue when used mkl_scsrmm and mkl_dcsrmm functions. (#19774 ) test=develop	6 years ago
chengduo	056fdedde3	Open fuse all reduce option (#19765 ) * Open fuse all reduce op test=develop * Add Fuse optimization op log * Add log in fuse_optimizer op pass and fuse all_reduce op pass * replace with boost::optional<bool> test=develop * Polish code test=develop * fix code coverage test=develop	6 years ago
Aurelius84	8c7e411908	Remove constraint that last dimension is forced to be 1 by adding one_hot_v2 (#19716 ) * add one_hot_v2_op to remove last_dims==1 test=develop * add api unittest code for CI_Coverage test=develop * improve CI_Coverage rate by adding test_with_depth test=develop	6 years ago
JesseyXujin	e352467c1c	modify activation op API, delete use_cudnn args, test=develop, (#19758 )	6 years ago
Jacek Czaja	9e4c958552	Refactoring activation mkldnn op (#19748 ) test=develop - fix to BWD test=develop	6 years ago
Huihuang Zheng	12542320c5	Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989 ) TemporaryAllocator is a singleton used for allocating memory for Cudnn. Since it is a singleton, we can delete it for better performance in memory. We replace TemporaryAllocator by CUDADeviceContextAllocator and CUDADeviceContextAllocation, which uses stream callback to delete the memory allocated for the stream to avoid singleton. Also added data_feed_proto to operator to fix CI in CPU compilation	6 years ago
Zeng Jinle	0daa5c9772	Make leaky relu inplacable (#19676 ) * make leaky relu inplacable, test=develop * force add unittests to pass coverage, test=develop	6 years ago
Zeng Jinle	078a678219	refine math_op_patch, test=develop (#19727 )	6 years ago
chengduo	e506c99c20	Open fuse broadcast option (#18833 ) * fix vlog level and fuse option type test=develop	6 years ago
Jacek Czaja	47f670d58c	- Softmax mkl-dnn refactoring (#19615 ) test=develop - Cosmetic fixes test=develop	6 years ago
Yiqun Liu	a65c728e5d	Implement the GPU kernel of fc operator (#19687 ) * Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop	6 years ago
Aurelius84	22301115d0	Remove constraint that last dimension is forced to be 1 in huber_loss op (#19562 ) * Remove constraint that last dimension is forced to be 1 in huber_loss test=develop * add y[rank-1] == 1 when x_rank=y_rank test=develop * modify into contain_unknown_dim test=develop	6 years ago
chengduo	5866a7a5fe	Enable fused_all_reduce_op_handle support GPU and CPU Gradients (#19418 ) * Enable fused_all_reduce_op_handle support GPU and CPU Gradients	6 years ago
Youwei Song	3e5fb6361b	fix api-doc error for dygraph and backward (#19721 ) * update dygraph api-doc and backward api-doc, test=develop * update dygraph api-doc and backward api-doc, update api.spec, test=develop * update dygraph api-doc and backward api-doc, update api.spec, test=develop * update API.spec, test=develop	6 years ago
Tao Luo	ec9bc1bd9f	paddle::framework::vectorize() templatization (#19730 ) remove unused accuracy-diff warpctc-cudnn implementation test=develop	6 years ago
Zeng Jinle	bb4f8dee83	add logs to left var memory size, test=develop (#19722 )	6 years ago
Adam	428b2b9e17	MKLDNN handler cleanup (#19713 ) * MKLDNN handler cleanup * MKLDNN handler cleanup test=develop	6 years ago
XiaoguangHu	27235cf222	Add document annotations for FLAGS that need to be open to external developers test=develop (#19692 ) Add document annotations for FLAGS that need to be open to external developers	6 years ago
Zeng Jinle	1c25c88aba	refine memory usage of some operators, test=develop (#19700 )	6 years ago
wangguanzhong	25dcd74d34	merge empty lod tensor, test=develop (#19228 ) * merge_empty_lod_tensor, test=develop * fix multiclass_nms, test=develop * refine API.spec, test=develop * add unittest case for fetch, test=develop * add lod tensor test, test=develop * return index for multiclass_nms, test=develop * add api for multiclass_nms2 * update API.spc, test=develop * refine api doc, test=develop * fix test_detection.py, test=develop * polish code, test=develop * add more unittest case, test=develop	6 years ago
yaoxuefeng	c6756ed225	fix instag op (#19591 ) * fix instag op * fix instag bug: Some tiny logical error, occurring when ins_tag (2nd input) is multiple. test=develop	6 years ago
gongweibao	6c2bc29cc0	Fix float16 optimizer. (#19682 ) Fix float16 optimizer	6 years ago
Zeng Jinle	713c05dd60	refine tensor.mutable_data, test=develop (#19680 )	6 years ago
Chen Weihang	c78a4781bf	Fix train error when test_program.clone is executed after optimizer.minimize (#19397 ) * add prune when test_program.clone is executed after optimizer.minimize * add unittest, test=develop * add resnet and transformer test case, test=develop * add regularization for optimizer & program compare function, test=develop * add lstm unittest, test=develop * polish code based on review comment, test=develop * adapt to interface change in framework._prune, test=develop * update API.spec, test=develop	6 years ago
zhongpu	5f627488db	add kernel for unsqueeze_op and Add unsqueezed op test, test=develop (#19436 ) * add kernel for unsqueeze_op, test=develop * add kernel for unsqueeze_op, test=develop * add kernel for unsqueeze_op, test=develop	6 years ago
Zeng Jinle	a7691603a5	add gpu_allocator_try_time config, test=develop (#19675 )	6 years ago
JesseyXujin	0b06db9413	delete transmission args in linear_chain_crf op (#19619 ) * delete args on linear_chain_crf_op doc * delete args on linear_chain_crf_op doc * delete args on linear_chain_crf_op doc * add code example * fix api doc * fix doc of crf * fix doc of crf * add test=develop * modify API.spec, test=develop	6 years ago
Tao Luo	f05d2c519d	paddle::framework::vectorize() templatization [PART3] (#19643 ) * paddle::framework::vectorize() templatization test=develop * update pybind/imperative.cc test=develop * revert update on unsqueeze_op.cc and warpctc_cudnn_op.cu.cc test=develop	6 years ago
hutuxian	1ca6ea0318	fix cmakelist deps (#19668 ) fix cmakelist deps: remove unnecessary deps and add proper op deps	6 years ago
Tao Luo	bcddbc78d4	remove -Wmaybe-uninitialized warning (#19653 ) * remove -Wmaybe-uninitialized warning test=develop * remove uninitialized op_handle_ in scale_loss_grad_op_handle.cc test=develop	6 years ago
Zeng Jinle	2db40d9f60	reduce thread num of retry_allocator_test,test=develop (#19638 )	6 years ago
wangchaochaohu	4440d7ced0	test=develop cuda realization of label smooth op (#19175 )	6 years ago
chengduo	31c5a5ee26	Remove linear_chain_crf_op.cu (#19645 ) test=develop	6 years ago
123malin	a25a716e87	Optimize fleet API: add input check for some interfaces (#18971 ) * fleet api add input check, test=develop	6 years ago
wangchaochaohu	ed8f44ea21	codegen for fused elementwise operation (#19520 ) * test=develop codegen for fused elementwise operation * fix test=develop	6 years ago
Chen Weihang	73daa3d6c0	Code Cleanup: delete three useless raw variables in Conv2D (#19644 ) * delete useless raw variables in Conv2D, test=develop * adjust the vars number in test_graph_wrapper to pass unittest, test=develop	6 years ago
123malin	2f037c3189	fix the diff between async mode and async_half mode (#19535 ) * test=develop, communicator merge add => merge average	6 years ago
Jiabin Yang	e9233d1c1e	Refactor dygraph (#19107 ) * refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * test=develop, refoctor name to make it easier to understand * test=develop, refoctor name to make it easier to understand * test=develop, fix multi-gpu failed problem , add Tracer tests, change PADDLEENFORCE to PADDLEENFORCE_EQ * test=develop, fix ut failed on parallel se-resnext * test=develop, change one more PADDLE_ENFORCE	6 years ago
mapingshuo	dca9b6c5b0	add feed_var_names to Prune interface (#19589 ) * Fix bug: add feed_vars to the prune function	6 years ago
tangwei12	f45cb1c2ca	fix bug of communicator flag, test=develop (#19635 )	6 years ago
Yiqun Liu	42b5bec6f9	Integrate NVRTC to support compiling CUDA kernel at runtime (#19422 ) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop	6 years ago
Tao Luo	3ae939e48a	unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631 ) * remove assert.h * change PADDLE_ASSERT_MSG to PADDLE_ENFORCE test=develop * fix tensorrt paddle_enforce test=develop	6 years ago
Leo Chen	af692c9140	update reduce_sum and reduce_mean to save memory, test=develop (#19608 )	6 years ago
tensor-tang	e3e98ed678	fix scope lock bug on infer (#19624 )	6 years ago
Aurelius84	6364ebc4dd	Add distributions of Categorical and MultivariateNormal (#18263 ) * add_distributions_of_normal_and_uniform * paddle/fluid/API.spec * modify API.spec * modified paddle/fluid/API.spec, test=develop * modify paddle/fluid/API.spec, test=develop * modify paddle/fluid/API.spec, test=develop * fix some comment, test=develop * modify API.spec, test=develop * Add distributions of Categorical and MultivariateNormal test=develop * fix pylint codestyle test=develop * fix conflict file test=develop * edit API.spec test=develop * improve sample code test=develop * modify api.spec test=develop	6 years ago
Zeng Jinle	710767d894	Enable inplace support for some ops (#19612 ) * enable inplace for affine_channel op, dropout op, test=develop * remove dropout inplace for ngraph fails, test=develop	6 years ago
FDInSky	a18cf5e119	add a argument for softshrink python api (#19396 ) * test=develop add a argument for softshrink python api * test=develop fix doc format test=develop fix doc format * test=develop fix API.spec test=develop fix API.spec	6 years ago
Tao Luo	d6c85c96dc	paddle::framework::vectorize() templatization (#19627 ) test=develop	6 years ago
danleifeng	8672e15363	elementwise broadcast function enhancement (#19536 ) elementwise broadcast function enhancement	6 years ago
Chen Weihang	8cb54ede8c	Add user-friendly error message in optimizer ops to give a hint about the position sensitive problem of run(startup_program) (#19605 ) * add extra error message hint in optimizer ops * polish format & delete useless change, test=develop * extract init judue from shape compare, test=develop	6 years ago
zhongpu	118bb897cf	add kernel for flatten_op, test=develop (#19472 ) * add kernel for flatten_op, test=develop * add kernel for flatten_op, test=develop * fix the license and remove redundant code, test=develop	6 years ago
Tao Luo	0a46d34538	refine some PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19607 ) test=develop	6 years ago
baojun	a3a4b6e570	Enable ngraph through build_strategy (#19266 ) * enable ngraph throught build_strategy test=develop * add unittest test=develop * put use_ngraph unconditional test=develop * remove paddle_enforce test=develop * remove paddle_enforce test=develop * fix copyright test=develop * limit for ngraph only test=develop	6 years ago
ShenLiang	2cd3fa3e9a	add scatter_nd op and scatter_nd_add op (#19571 ) * add scatter_nd op, test=document_preview test=develop * fixed the document, test=document_preview test=develop * modify the notes, test=document_preview test=develop * remove the ShareDataWith, test=develop	6 years ago
wawltor	364c44422e	Add the support the int64 data type of `scatter_op` input Index(#18804 ) (#19508 ) * test=develop Fix the scatter op bug when use the add mode, and support the int64 data type of scatter_op Index(#18804). * test=develop Remove the PADDLE_ENFORCE and use PADDLE_ENFORCE_EQ * test=develop Remove the fix bug of scatter_add, and just add the support of int64 in scatter_add * test=develop Add the test case for scatter op, the test case just for index int64	6 years ago
Adam	8d6d95cc2b	paddle::framework::vectorize() templatization (#19611 ) test=develop	6 years ago
Tao Luo	75d1571995	refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603 ) test=develop	6 years ago
Yiqun Liu	c5548178b0	Add a pass to enable the use of cudnn (#19346 ) * Add an interface to enable cudnn for inference. * Add cudnn_placement_pass. test=develop * Set the default value of cudnn_enabled_op_types to null. test=develop * Write the common basic class, placement_pass_base, to refine the codes. test=develop * Call EnableCUDNN in unittest. test=develop * Refine cudnn_placement_pass tester. * Enable the testing of cudnn_placement_pass in inference's unittest. test=develop * Add the check of op kernels. test=develop	6 years ago
zhongpu	cc443675e9	modify paddle_build.sh for Paddle python3 runtime image generation, test=develop (#19218 ) * modify paddle_build.sh for Paddle python3 version image generation, test=develop * modify paddle_build.sh for Paddle python3 image generation, test=develop	6 years ago
Adam	e94b26daf5	using MKLDNNMemoryFormat = mkldnn::memory::format changes (#19568 ) * using MKLDNNMemoryFormat = mkldnn::memory::format changes test=develop * PADDLE_ENFORCE update test=develop	6 years ago
Zeng Jinle	e045aadf9a	fix retry_allocator_test by removing glog envs, test=develop (#19596 )	6 years ago
baojun	f2ad30c4dd	Some ngraph op and unittest fix (#19515 ) * update ngraph ops test=develop * update unittest test=develop * increase coverage test=develop	6 years ago
Tao Luo	49523ea189	replace PADDLE_ASSERT with PADDLE_ASSERT_MSG (#19586 ) * remove unused PADDLE_ASSERT(_IS_NOT_ERROR) * replace PADDLE_ASSERT with PADDLE_ASSERT_MSG test=develop	6 years ago
gongweibao	abaf87be2b	Change backward_guard to optimize_guard to maximize the allreduce overlap. (#19506 ) Change backward_guard to optimize_guard to maximize the allreduce overlap	6 years ago
Zeng Jinle	578cccd48c	fix parallel compilation error of allocator (#19581 )	6 years ago
Zeng Jinle	f4562c3468	fix typo of allocator, test=develop (#19578 )	6 years ago
xiaoting	7a86706309	modified multiclass_nms example (#19553 ) test=develop, test=document_preview	6 years ago
gongweibao	57f0f0f2dc	Delete pserver complete file before executor running. (#19468 )	6 years ago
JesseyXujin	4a7e6deb63	add padding in linear_chain_crf op (#19583 ) * add padding in linear_chain_crf op * modify API.spec * add linear_chain_crf_op.cc and linear_chain_crf_op.h * remove useless unit test , test=develop * modify API.spec, test=develop * remove some blanks in nn.py , test=develop * fix some bugs on nn.py and API.spec ,test=develop * fix nn.py, test=develop * fix API.spec ,test=develop * fix bug of CI test in test_linear_chain_crf_op.py * fix bug of CI test in test_linear_chain_crf_op.py, test=develop * remove paddle_enforce, test=develop * remove paddle_enforce, test=develop * remove paddle_enforce, test=develop * remove paddle_enforce, test=develop * remove paddle_enforce, test=develop * remove paddle_enforce, test=develop * modify nn.py, test=develop * fix API.spec, test=develop * fix unittest bug, test=develop	6 years ago
Zeng Jinle	19474019c2	fix fast pe to run highest priority ops first, test=develop (#19575 )	6 years ago
zhouwei25	84c728013c	fix the compilation issue on windows caused by mkl_CSRMM (#19533 )	6 years ago
mapingshuo	f4ee60b7d0	Imdb train demo2 (#19572 ) * add place to reader, remove snappy dependency * remove snappy dependency from train demo test=develop	6 years ago
Zeng Jinle	0af8549750	fix seg fault of share lod, test=develop (#19573 )	6 years ago
Jacek Czaja	cef95ee30d	[MKL-DNN] Refactoring Softmax (#19312 ) * - First set of modifications - Compilation fixes - compilation fix - Another compilation fix - Moved AcquireSoftmaxPrimitiveDescriptor call into handler - MKL-DNN Softmax PD refactor test=develop - Compilation fix test=develop - another compilation fix - cosmetcis test=develop - Compilation fix - Fix to crash when softmax backward is created * - Fixes after review of softmax refactoring test=develop	6 years ago
Zeng Jinle	0a73f7202a	Add retry_allocator for gpu (#19409 ) * add retry_allocator for gpu, test=develop * follow chengduoZH's comments, test=develop * follow huihuang's comments,test=develop * change f,l in enforce.h to be file,line, test=develop * increase code coverage by adding unittests, test=develop * fix CMakeLists.txt, test=develop	6 years ago
hutuxian	c756b5d231	Paddlebox Framework (#18982 ) * Support looking up embeddings from BoxPS. * Add a _pull_box_sparse op, for now this op is not exposed to users. * Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on. * Add 'BoxPSDataset' in python code. * Add a compile options WITH_BOX_PS and a MACRO PADDLE_WITH_BOX_PS. * Add UT. * More concrete information pls refer to: https://github.com/PaddlePaddle/Paddle/pull/18982	6 years ago
Zeng Jinle	5dce1da680	remove reset recordio usage (#19519 )	6 years ago
ShenLiang	85914f7a88	add gather_nd op and unit test (#19366 ) * fixed the code for coverage * fixed the document,test=document_preview test=develop	6 years ago
Jacek Czaja	ecd9f330c9	[MKL-DNN] Fix to face model on AVX512 platforms (#19282 ) - Refactor step 1 - Compilation fix - Yet another compilation fix - Even more compilation fix - Lint fixes test=develop - Removed deprectaed PADDLE_ENFORCE occurance test=develop - Candidate fix to BN forward - Lint fixes test=develop - Refactoring in data_layout_transform - compilation fix - Another comppilation fix - Step further into darkness - Yet another compilation fix - Yet another compilation fix - missing header - compilation fix - Added MKLDNN -> Paddle conversion in fetch op test=develop - Compilation fix test=develop - Lint test=develop - Mul fix - Fix to MKLDNN MUL op and Elementwise MUL UT test=develop - Workaround for diffrent weights with groups representation Paddle vs MKL-DNN. test=develop - Candidate fix for 5D convolution with groups - Refactor of fix for conv3d and conv2d in fetch op test=develop - Compilation fix - Still same compilation fix - Compilation fix - Compilation fix - Reverted refactoring of fixes - Adapted test_conv2d_int8_mkldnn so it exects data in NCHW format not NHWC test=develop - minor fix in UT test=develop - Lint fixes test=develop	6 years ago
GaoWei8	e8405e5c61	Modify the dropout op to multi-thread (#19504 ) * Modify the dropout op to multi-thread test=develop * define parallel test=develop	6 years ago
Huihuang Zheng	2916caa2c4	Change ugly PADDLE_ENFORCE_EQ in recurrent_op.cc (#19470 ) test=develop	6 years ago
Liufang Sang	9dde564097	change var name padding_num to padding_value (#19498 )	6 years ago
Aurelius84	5b5379b32a	Add sequence_topk_avg_pooling Op (#19442 ) * add topk_avg_pooling * refine api doc and modify api.spec test=develop	6 years ago
yaoxuefeng	10ca3f9609	add thread scope stat accurate metrics test=develop (#19480 ) * add thread scope stat accurate metrics test=develop * fix style * fix style * fix style * fix style test=develop * fix style test=develop * fix style test=develop * fix style test=develop * fix style test=develop * fix style test=develop * fix style test=develop * fix conflict * fix style * fix style test=develop * fix error test=develop * fix error test=develop	6 years ago
liuwei1031	d6cb1a4122	add dynamic C runtime support on windows, test=develop (#19502 )	6 years ago
Bai Yifan	6d99842bb8	fix mean_iou api example, test=develop, test=document_preview (#19503 ) Fix mean_iou api misleading example	6 years ago
Tao Luo	02270b3eb1	remove unused assert.h (#19529 ) test=develop	6 years ago
chengduo	e340df013e	Support feed single persistable variable to PE (#19417 ) * update executor feed	6 years ago
Yiqun Liu	fcec365d29	Add a pass to replace dropout_op with scale_op when is_test is true (#19297 ) * Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true. test=develop * Delete dropout_op directly when upscale_in_train is true. test=develop * Improve the debug string, adding the print of op_desc information. * Fix the case when dropout's input x is reused as the next op's output. * Add the pass to inference. test=develop * Change the log level. test=develop * Add unittest for inplace case. * Add comment to explain the pass. * Apply the pass for CPU inference. test=develop * Fix the typo. test=develop * Add the check of AttrType. test=develop	6 years ago
hong	e169538886	fix kernel config bug in dygraph mode; test=develop (#19532 )	6 years ago
Zeng Jinle	c2c5b1b941	remove signal raise msg, test=develop (#19527 )	6 years ago
lidanqing	ba368bf696	clean up intel labeled TODOs (#19476 ) test=develop	6 years ago
Zeng Jinle	11f2f78458	fix softmax seg fault in AVX, test=develop (#19487 )	6 years ago
Thunderbrook	1fe468d319	support debug each output of each ins (#19004 ) * dump slot * test * proto * dump slot * test * proto * code style * code style * code style * style * add delete after unseen days * add unseen days * code style * conflict solve test=develop * add clear model * code style test=develop * code style test=develop * support debug tensor of each ins test=develop * support debug tensor of each ins test=develop * learning rate * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style * code style test=develop * code style test=develop * unitest * style * style * multi phase * add channel * code style * style * style * unitest * style * define * define test=develop * style test=develop * rm define test=develop * linux * linux test=develop * style test=develop * output format test=develop * windows ci test=develop	6 years ago
Zeng Jinle	5c8f210ce3	refine inplace inference registry, test=develop (#19032 )	6 years ago
chengduo	b6d1d8901f	Increase num_iteration_per_drop_scope (#19075 ) * increase num_iteration_per_drop_scope test=develop * Fix bug of while_op test=develop * fix bug of whileOp test=develop	6 years ago
Double_V	1d0f04315a	fix row_conv_op to force it support lodtensor and tensor input simultaneously, test=develop (#19412 ) Support Tensor input for row_conv_op	6 years ago
Jiabin Yang	1ce0a09e60	fix con2d transpose bias by create and init it in build_once (#18968 ) * fix con2d transpose bias by create and init it in build_onee * fix API spec * test=develop, invoke ci * fix bias_attr and act has no effect error on layer norm, conv2dTranpose, billinearTensorProduct, sequece_conv. fix original_mode not used error on GRUunit. fix sample_weight not set error on NCE. Add ut for all thoese layer * test=develop, change success standard for conv2dTranspose * test=develop, fix test_layers to invoke some error branch * test=develop, fix sample code * test=develop, fix BilinearTensorProduct failed in dygraph mode * test=develop, fix test_layers segment fault error	6 years ago
tangwei12	65c7368400	Fix the correctness of async mode at distributed training (#18863 ) * fix correctness of the communicator * fix a bug in send thread when sending var context is empty, test=develop * add lookup_table_prefetch_op and prefetch optimize, test=develop * remove remote prefetch GPU supported * word2vec force with CPU, test=develop * test dist remote lookup table force with CPU, test=develop	6 years ago
baojun	6421c61ae2	Update ngraph engine for multiple threading (#19155 ) * update for multiple threading test=develop * remove PADDLE_ENFORCE test=develop	6 years ago
Zeng Jinle	caf59d0f3f	Add signal message to stderr (#19421 ) * add signal message to stderr, test=develop * add unittests for ugly SignalHandle, test=develop	6 years ago
Yi Liu	efb05ba258	supports multiple NCCL communicators preserved in NCCLCommContext (#19407 ) * supports multiple NCCL communicators preserved in NCCLCommContext test=develop * add ut for c_comm_init_all operator and fix cuda resource release problem test=develop	6 years ago
Huihuang Zheng	56dd76538c	Delete useless ex-scope in recurrent op (#19426 )	6 years ago
wopeizl	b8aa37d529	save the callstack information to file when exception throws test=dev… (#19324 ) * save the callstack information to file when exception throws test=develop	6 years ago
Aurelius84	a9cd513680	improve sequence_conv api doc (#19316 ) * improve sequence_conv api doc test=develop * add warning for padding param test=develop modify into deprecated	6 years ago
joanna.wozna.intel	2e3ec66be0	Add conv dequant squash for int8 (#18905 )	6 years ago
vincentXiyu	482ce818bb	Support Tensor input with padding for warpctc op (#19322 ) * support tensor input with padding for warpctc op * merge with develop * test=develop * modified python API examples test=develop * nn.py is modified for code coverage test=develop * update documents info about warpctc op in API.spec test=develop * add test_warpctc_with_padding in test_layers test=develop * add warning log for cuda_version back to warpctc_op.cc * modify API.spec for warpctc op test=develop * modify API.spec * update warpctc test to new CompiledProgram API test=develop * modify code examples for warpctc op test=develop * modify API.spec for warpctc op test=develop * modify API.spec for warpctc op test=develop	6 years ago
Leo Chen	6fb310ae29	Fix bug of getting bool Flags from os.environ (#19349 ) * fix bug of getting bool Flags from os.environ, test=develop * add empty loss_name in CompiledProgram for inplace grad test, test=develop	6 years ago
tianshuo78520a	8048992042	add cuda10 support in fast_install.sh and add dynamic get version for release (#19106 ) add cuda10 support in fast_install.sh and add dynamic get version for release, then remove useless ave check for MacOS install check	6 years ago
liu zhengxi	32598ffd8f	Python infer api update and add unit test (#19353 ) * python inference api supports numpy and add unit test, fix unit test fail in test_slim_int8_googlenet and test_slim_int8_mobilenet	6 years ago
mapingshuo	d5ac87ec22	Lookahead optimizer (#19386 ) * Add lookahead optimizer * add unittest for lookahead optimizer test=develop * add doc string for LookaheadOptimizer test=develop test=document_preview * add API spec for lookahead test=develop test=document_preview * modify api spec test=develop test=document_preview * modified doc string * modify the test file test=develop test=document_preview * modify doc string test=develop test=document_preview	6 years ago
Huihuang Zheng	12d29f4d2a	Change TensorCopy in recurrent_op to ShareDataWith (#19319 )	6 years ago
tangwei12	19dac67e9f	fix distribute transpiler GRPC error code 4, RPC Deadline (#18984 ) * fix sync mode hang in transpiler * remove sync mode in send/recv * replace PADDLE_ENFORCE with PADDLE_ENFORCE_NE	6 years ago
Yibing Liu	5d1575cfe8	Fix arg do_model_average in param_attr (#19376 ) * Fix arg do_model_average in param_attr test=develop * Update api spec test=develop	6 years ago
Tao Luo	c82280e445	remove unused conv_elementwise_add2_act_fuse.cc (#19344 ) test=develop	6 years ago
chengduo	4278518fb0	Update CompiledProgram (#18919 ) * use PE for compiler test=develop	6 years ago
lidanqing	9240e5325c	add local user data conversion into full_pascalvoc_test_preprocess.py (#19283 ) * add local user data conversion into full_pascalvoc_test_preprocess.py test=develop * change PADDLE_ENFORCE to PADDLE_ENFORCE_GE test=develop * change according to reviews test=develop	6 years ago
翟飞跃	2e3ee57954	Use sparse matrix to implement FusedEmbeddingSeqPoolGradKernel (#19153 ) * Implement the operator with sprase matrix multiply * Update the URL of mklml library. test=develop * Disable MKLML implematation when using no-linux. test=develop * optimize bp with mkl sparse matrix test=develop	6 years ago
Leo Chen	a9d5fc5142	Enhance OpTest to check the consistency of operators when using and not using inplace (#19101 ) * add pybind interface to get all inplace ops, test=develop * enhance OpTest to check whether the consistency of operator when using and not using inplace, test=develop * handle corner cases in op_test, test=develop * support outputs without tensor holder_, like XShape in reshape_op, test=develop * fix bug, some op has GradOpMaker, but actually no grad_op in OpInfoMap, test=develop * use reshape_grad instead of reshape in FlattenGradOp, test=develop * fix error debug dims info for variables like XShape, test=develop * change computational order in sum_op to relieve computation difference using inplace, test=develop * add inplace_atol to check group_norm, and skip inplace_grad for mkldnn, test=develop * follow sneaxiy's comments, test=develop * remove unused DefaultGradOpDescMaker in mkldnn op, test=develop	6 years ago
Aurelius84	0d29cf18f4	Supports diagonal initialization in uniform_random op (#19299 ) * add diag init in Uniform_random op test=develop * modify api.spec test=develop * fix unform_batch_size_like maker test=develop * add diag_num and diag_step assert check test=develop	6 years ago
Tao Luo	e3c68bde78	strengthen the error message of tensor's mutable_data (#19303 ) * strengthen the error message of tensor's mutable_data test=develop * update error message test=develop	6 years ago
tianshuo78520a	188a5caf2e	Split and enhance assert_api_spec_approvals (#19292 )	6 years ago
chengduo	a8a9823dae	add memory profiler (#19320 ) test=develop	6 years ago
Adam	97d1db1874	Add generalized Conv+Activation MKLDNN fuse pass creation Part2 (#19237 ) * Add generalized Conv+Activation MKLDNN fuse pass creation Part2 test=develop * Undefined behaviour of GetAttrIfExists<> FIX test=develop	6 years ago
wangguanzhong	37428952c6	fix generate mask fpn, test=develop (#19301 )	6 years ago
zhaoyuchen2018	5296294dae	Fix elementwise performance poor issue (#19278 ) For small case use 1D block is better than 2D block. Refer to this issue: #19275	6 years ago
Tao Luo	6527a7df67	replace part of PADDLE_ASSERT to PADDLE_ENFORCE (#19285 ) * replace part of PADDLE_ASSERT to PADDLE_ENFORCE test=develop * remove unused fallback_alloc_size_ * add unit-test of CUDAPinnedAllocator test=develop	6 years ago
xiaoting	62facc7e47	fix yolo_box python example (#18925 ) test=develop, test=document_preview	6 years ago
Yihua Xu	b920395842	Use sparse matrix to implement fused emb_seq_pool operator (#19064 ) * Implement the operator with sparse matrix multiply * Update the URL of mklml library. test=develop * Disable MKLML implementation when using non-linux. test=develop * Ignore the deprecated status for windows test=develop	6 years ago
wangchaochaohu	6e326ca2c6	optimize the realization of cuda dropout (#19136 ) * cuda optimize for dropout * remove tmp swp file * fix compile error test=develop * test=develop optimize the cuda realization of dropout op * remove unused code test=develop * remove tmp file test=develop	6 years ago
Zhaolong Xing	76c95af000	Fix BUG: Mask RCNN inference diff When using AnalysisPredictor. (#19213 ) * fix mask rcnn bug: 1. affine channel fuse (diff) 2. condition block op (memory leak) 3. merge lod tensor op (diff) 4. memory optim (diff) test=develop * fix ci about PADDLE_ENFORCE fix merge lod infer op ut test=develop	6 years ago
qingqing01	5fc8de449a	Remove warning in batch_norm_op (#19260 )	6 years ago
lvmengsi	d08d5ab519	Fix the mistake of convolution (#19274 )	6 years ago
Aurelius84	78a3d837f8	Add match_matrix_tensor op (#18525 ) * add match_matrix_tensor op test=develop * fix ignore unittest if with_mkl=off test=develop * clean code and rm is_test param test=develop * modify API.spec test=develop * rm useless code in search_compute.h test=develop * modify api.spec test=develop * modify default_grad.spec test=develop * Add API test code test=develop * clean code in search_computer.h * modify PADDLE_ENFORCE and clean search_compute.h test=develop * fix code style test=develop	6 years ago
Zeng Jinle	5b6673c44d	merge develop to solve conflict, also fix API doc, test=develop (#18823 )	6 years ago
liuwei1031	50582071dc	fix compilation issue in windows vs2017 (#19183 ) * fix compilation issue in windows vs2017, test=develop * fix gtest lib not found issue, test=develop	6 years ago
zhang wenhui	539c870753	add fl_listen_and_serv &fl_transpiler,test=develop (#19091 ) add fl_listen_and_serv op for Federated_learning and fl_distribute_transpiler add this op to pserver program . This op just listen the endpoint and sum&scale.	6 years ago
juncaipeng	5368b36512	remove the warning for reminding user to avoid using the OriginProgram method, test=develop (#19244 ) This log information may annoy users who don't need to care about it.	6 years ago
silingtong123	af0fbd9012	change PADDLE_ENFORCE to PADDLE_ENFORCE_CUDA_SUCCESS (#19205 ) * print error code if cuda related API fails	6 years ago
Zeng Jinle	91a0911ca3	Make PADDLE_ENFORCE_EQ support types that cannot be converted to std::string (#19243 ) * make PADDLE_ENFORCE_EQ support cannot to string types, test=develop * follow huihuang's comments, test=develop	6 years ago

15853 Commits (a7512db2bc6a5bc6badaaf5b3a32abb42b9463aa)