Paddle

Commit Graph

Author	SHA1	Message	Date
yaoxuefeng	2235ee1a5e	multi-loss optimization by adding a DownpourOpt worker (#22025 ) * update * update test=develop * update compile set test=develop * update compile set test=develop * update test=develop * update test=develop * update test=develop * update compile setting test=develop * update compile setting test=develop * update run demo test=develop * update test=develop * update test=develop * fix test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update format test=develop * update format test=develop * update style test=develop * update style test=develop * change style test=develop * change style test=develop * change style test=develop * add dataset unittest test=develop * update test=develop * update for record test=develop * udpate style for record test=develop * update for record test=develop * update for record test=develop * update for record test=develop * fix format test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop	5 years ago
石晓伟	e1b0d7cbb1	remove anakin from code, test=develop (#22420 )	5 years ago
Wilber	55b403e8a8	Modify lite commit id. (#22371 ) * modify lite commit id to support var_conv_2d cascade. test=develop * modify lite commit id. test=develop	5 years ago
石晓伟	24f9037e62	update external lite, test=develop (#22347 ) * update external lite, test=develop * switch WITH_TESTING to OFF, test=develop	5 years ago
Wilber	36afdbd3e1	modify lite commit id to support var_conv_2d cascade. test=develop (#22299 ) Updated the commit id of the lite dependency: lite now supports cascaded use of var_conv_2d	5 years ago
Leo Chen	032e49c494	fix compile issue, test=develop (#22001 ) * fix compile issue, test=develop * force link libiomp5 when mklml is enabled, test=develop	5 years ago
silingtong123	4f1da4adcb	remove the useless third_party library from C++ inference library (#22021 ) * remove the useless third_party library from C++ inference library * revert removing the install directory	5 years ago
zhouwei25	549e6de7ac	faster build by reduce by-product, reduce linking library and fix compile warning of std=c++11 (#22164 )	5 years ago
xujiaqi01	e3a457d34b	add collective communication library in fleet (#22211 ) * add collective communication library in fleet to replace mpi * test=develop	5 years ago
Wilber	5750152e80	support fluid-lite subgraph run resnet test=develop (#22191 ) - Added a unit test that runs resnet in fluid-lite subgraph mode - Updated the git commit id of the Lite dependency	5 years ago
Zhen Wang	46189b166d	Add bn and relu fuse pass (#22048 ) * add bn and relu fuse pass * add op attr assert and dtype assert * fix some inputs&&outputs bugs for the fused op and pattern. * add the unittest for fuse_bn_act_pass. test=develop * use normative enforce statements. test=develop * add the cpu test. test=develop * add the support of batch_size=1 for the bn with relu op. test=develop * add the error type for paddle throws. test=develop * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop	5 years ago
baojun	f8516ccb53	Upgrade nGraph to use mkldnn v1.1 (#22154 )	5 years ago
石晓伟	ad0dfb17c1	[Feature] Lite subgraph (#22114 )	5 years ago
zhouwei25	4f7a2bd0d1	tweak the interface of cache_third_party function - expose the SOURCE_DIR for each external library (#21899 )	5 years ago
Adam	700fdb1819	MKL-DNN 1.1 for Windows (#22089 )	5 years ago
Adam	c112b645c4	Update MKL-DNN to 1.1 (#21754 )	5 years ago
Yiqun Liu	d48320777e	Add the first implememtation of fusion_group op (#19621 ) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop * Add DeviceCodePool to manage all device codes. * Add the first implementation fusion_group op. * Add unit-test for fusion_group op. * Add the check of result. * Add the check of nvrtc in unit-test. test=develop * Add comment to explain the inputs, outputs and features of fusion_group op. test=develop * Disable fusion_group op for mac and windows. test=develop * Make the compiling of device code return status instead of hanging up. test=develop * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API. * Unify fusion_group_op's input and output names. test=develop * Add the check of CUDA driver library in unittest. test=develop * Refine the calling of PADDLE_ENFORCE. test=develop	5 years ago
zhouwei25	8b15acd71d	remove patch command and file of warpctc to Improved quality of Paddle Repo (#21929 )	5 years ago
zhouwei25	2df4be5d35	Fix openblas bug to support compile on windows when WITH_MKL=OFF (#21902 ) * Fix openblas to support compile on Windows when WITH_MKL=OFF	5 years ago
zhouwei25	cad058ce19	remove patch command and file of grpc to Improved quality of Paddle Repo (#21778 )	5 years ago
zhouwei25	a01663ca1f	remove patch command and file of cares to Improved quality of Paddle Repo (#21776 )	5 years ago
zhouwei25	3e1404d208	fix cp bug of warpctc repository,test=develop (#21901 )	5 years ago
xujiaqi01	37896e9050	fix compile error when WITH_PSLIB=ON (#21702 ) * fix compile error when WITH_PSLIB=ON * test=develop	5 years ago
zhouwei25	34dc710641	fix wrong commitID with patch file of warpctc (#21755 )	5 years ago
zhouwei25	03133c2c58	fix the bug that cannot patch command for the second time (#21596 )	5 years ago
baojun	45d2fa4e26	update ngraph to v0.27 test=develop (#21677 )	5 years ago
Adam	e81f0228df	MKL-DNN 1.0 Update (#20162 ) * MKLDNN v1.0 rebase to Paddle 1.6 test=develop * Add hacky paddle::string::to_string() implementation * vectorize<int64-t>() -> vectorize() cleanup test=develop * PADDLE_ENFORCE and void_cast fixes test=develop * Rebase changes test=develop * Cosmetics test=develop * Delete MKL from mkldnn.cmake test=develop * CMake debug commands test=develop * Delete MKLDNN_VERBOSE and rebase fixes test=develop * Rebase fixes test=develop * Temporarily disable int8 resnet101 vgg16 and vgg19 tests test=develop * Add libmkldnn.so.1 to python setup test=develop * Add libmkldnn.so.1 to inference_lib cmake after rebase test=develop * Post rebase fixes + FC int8 changes test=develop * Fix LRN NHWC test=develop * Fix NHWC conv3d test=develop * Windows build fix + next conv3d fix test=develop * Fix conv2d on AVX2 machines test=develop	5 years ago
Leo Chen	84b7267100	dygraph_grad_maker supports varbase without grad_var (#21524 ) * dygraph_grad_maker supports varbase without grad_var, test=develop * fix compile, test=develop * fix test_tracer, test=develop * follow comments, test=develop	5 years ago
Leo Chen	cdd46d7e02	Split VarBase from Python Variable for Dygraph (#21359 ) * test=develop, fix docker with paddle nccl problem * don't expose numerous Tensor.set(), test=develop * fix condition, test=develop * fix float16 bug, test=develop * feed should be Tensor or np.array, not Variable or number, test=develop * use forcecast to copy numpy slice to new array, test=develop * remove float16-uint16 hacking, test=develop * add variable method to varbase and refactor to_variable to support return varbase * support kwargs in varbase constructor * add VarBase constructor to support default python args * refine varbase initial method * reset branch * fix ut for change VarBase error info to PaddleEnforce * cherry is parameter change before * overload isinstance to replace too many change of is_variable * rm useless files * rm useless code merged by git * test=develop, fix some ut failed error * test=develop, fix test_graph_wrapper * add some tests, test=develop * refine __getitem__, test=develop * add tests, test=develop * fix err_msg, test=develop	5 years ago
silingtong123	4640178629	modify the personal repo address of eigen and warpctc (#21445 ) * modify the repo address of eigen and warpctc * fix the eigen not work on windows * fix the eigen and warpctc can't recompile	5 years ago
Zhaolong Xing	c5f0293cf3	NV jetson(nano, tx2, xavier) inference compile support (#21393 ) * add jeston compile support test=develop * refine the cmake test=develop	5 years ago
Tao Luo	060bf8d0d5	Revert "revert flags.cmake (#21437 )" (#21485 ) This reverts commit `c93c9e5bfe`. test=develop	5 years ago
gongweibao	c93c9e5bfe	revert flags.cmake test=develop (#21437 )	5 years ago
Zhaolong Xing	6aa13f46cb	update openblas version (#21450 ) test=develop	5 years ago
zhouwei25	fce24315fb	fix cub/threadpool include_dir to match setup.py.in,test=develop (#21436 )	5 years ago
Tao Luo	c0656dcb1a	remove -Wno-error=sign-compare, make warning as error (#21358 ) * remove -Wno-error=sign-compare, make warning as error test=develop test=document_fix * fix exist compile warning test=develop	5 years ago
zhouwei25	b39f947698	Eliminate the impact on incremental compilation (#21410 )	5 years ago
Michał Gallus	5d7d548275	INT8 Fully-connected (#17641 ) * Implement Int8 FC * Integrate FC into INT8v2 test=develop * int8 FC: transpose weights before computing scales test=develop * Add support for activation_type string in FC test=develop * Disable MKL-DNN's FC in VGG16 and 19 test=develop * Disable FC quantization when mkldnn FC is disabled test=develop * Solve PADDLE_ENFORCES in FC int8 * Fix Paddle enforces and remove const cast test=develop * Fix style changes test=develop * Fix quantizer_tester test and add fc quantization test=develop * Fix FC test fail on CUDA * Remove unnecessary log from quantize placement pass test=develop * Add Thread ID to FC hash key test=develop * Add comments to MKL-DNN FC Kernel test=develop * Refactor quantizer test=develop * Fix linter issues test=develop * Fix crash in slim googlenet test=develop * Fix PADDLE_ENFORCE messages test=develop	5 years ago
Tao Luo	d8e7d25274	make CUDA_ARCH_NAME default Auto (#21352 ) * make CUDA_ARCH_NAME default Auto test=develop * refine warning test=develop	5 years ago
silingtong123	4b429c190d	package the CAPI inference library and third_party (#21299 )	5 years ago
zhouwei25	345b67b5e2	remove warning LNK4006 and warning LNK4221 (#21226 )	5 years ago
zhouwei25	341dee0657	Cache 3rd source code, improve stability, reduce the compilation time (#21190 )	5 years ago
Zeng Jinle	925280b96c	Change GCC version to be 8.2 in Dockerfile.GCC8 (#21222 ) * make Docker to gcc 8.2, test=develop * add -std=c11 to grpc.cmake, test=develop	5 years ago
zhouwei25	c0dcb090a3	Determine whether to copy and link inference lib by ON_INFER (#20931 )	5 years ago
Zeng Jinle	cdb3d27985	Fix warn of gcc8 (#21205 ) * fix warnings oof gcc 8 compilation, test=develop * fix boost::bad_get, test=develop * refine PADDLE_ENFORCE, test=develop	5 years ago
zhouwei25	5d821578d9	fix bug when build openblas with a computer that has installed openblas before,test=develop (#21160 )	5 years ago
Jeng Bai-Cheng	330b173c38	Better TensorRT support (#20858 ) * Fix TensorRT detection bug 1. Add new search path for TensorRT at tensorrt.cmake 2. Add better debug message 3. Fix the bug of detection of TensorRT version In NVIDIA official docker image, TensorRT headers are located at `/usr/include/x86_64-linux-gnu` and TensorRT libraries are located at `/usr/lib/x86_64-linux-gnu`, so using `-DTENSORRT_ROOT` will fail to detect TensorRT. There is no debug/warning message to tell developer that TensorRT is failed to be detected. In later version of TensorRT (e.g. v6), `NV_TENSORRT_MAJOR` is defined at `NvInferVersion.h` instead of `NvInfer.h`, so add compatibility fix. * Fix TensorRT variables in CMake 1. Replace `${TENSORRT_ROOT}/include` with `${TENSORRT_INCLUDE_DIR}` 2. Replace `${TENSORRT_ROOT}/lib` with `${TENSORRT_LIBRARY}` Manually type path may locate incorrect path of TensorRT. Use the paths detected by system instead. * Fix TensorRT library path 1. Add new variable - `${TENSORRT_LIBRARY_DIR}` 2. Fix TensorRT library path inference_lib.cmake and setup.py.in need the path of TensorRT library instead of the file of TensorRT library, so add new variable to fix it. * Add more general search rule for TensoRT Let system detect architecture instead of manually assign it, so replace `x86_64-linux-gnu` with `${CMAKE_LIBRARY_ARCHITECTURE}`. * Add more general search rule for TensorRT Remove duplicate search rules for TensorRT libraries. Use `${TENSORRT_LIBRARY_DIR}` to get full path of libnvinfer.so test=develop	5 years ago
zhouwei25	d257355089	Remove useless code of openblas and fix the previous incorrect message (#21092 )	5 years ago
Michał Gallus	6cc544aa28	Add Shallow clone to ExternalProjects (#21060 ) test=develop	5 years ago
joanna.wozna.intel	77c2083586	Add transpose2 INT8 for mkl-dnn (#19424 ) * Add transpose2 INT8 for mkl-dnn test=develop * Fix test_transpose_int8_mkldnn test=develop * Revert "Merge branch 'develop' into transpose_int8_mkldnn_2" This reverts commit 34011bdba4c859abb945e062ab13124f70508054, reversing changes made to 2ce6473f144da298aba4a43d46918f27d463cf7c. * Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"" This reverts commit 23754dd78ca47ae56881161172b2aacd349aba90. * Add template to TransposeMKLDNNHandler test=develop * Resolve conflict test=develop * Restore get_size and refactor test=develop	5 years ago
zhouwei25	89bc18eec0	move more third party library related logic to third_party.cmake (#20927 )	5 years ago
Chen Weihang	7ee25189c3	Enrich the type of error and declare the error type interfaces (#21024 ) * Enrich the type of error and declare the error type interfaces, test=develop * adjust tests to adapt new form, test=develop * add inference deps with error_codes.pb.h, test=develop * restore stack iter start pos, test=develop * polish code based review comments, test=develop	5 years ago
Zeng Jinle	878a40f57d	Support NoNeedBufferVarsInference in dygraph backward (#20868 ) * support no need buffer vars in dygraph, test=develop * fix inference compilation error, test=develop * update no_need_buffer_vars_inference, test=develop * add unittests for no_need_buffer_vars_context, test=develop * refine no_need_buffer_vars by return ref, test=develop * polish some codes, test=develop	5 years ago
zhouwei25	394edd8647	fix mklml and cblas bug,test=develop (#20970 )	5 years ago
hong	8c4573a3cb	GradMaker for dygraph (#19706 ) * refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * optimize grad maker; test=develop * optimize grad maker * test * grad make optim; test=develop * fix unittest bugs; test=develop * add dygraph grad op maker and split_op * grad op maker refactor; test=develop * add dygraph grad maker; test=develop * fix op deformable_conv_v1_op bug; test=develop * fix deformable_conv prroi pool bugs; * fix new op grad op maker bug; test=develop * fix split by ref bug; test=develop * fix dygraph auto prune bug; test=develop * fix test_trace bug; test=develop * fix fused emb seq pool bug; test=develop * remove useless code in op_desc file; test=develop * remove useless code, StrVarBaseNode; test=develop * fix review issues; test=develop * fix rank_loss grad maker; test=develop * remove flag in VarBase; test=develop * fix distributed_notify_op compile bug ; test=develop * fix reshape op double grad; test=develop * fix expand as op; test=develop * add impertive type_defs.h for demo_train; test=develop * fix inference lib cmake; test=develop * fix inference lib; test=develop * fix infernce_lib; test=develop * fix inference cmake; test=develop * fix inference lib; test=develop * fix inference lib; test=develop * remove condition dygraph grad maker, modify local name; test=develop * fix 
split grad maker bug; test=develop * fix pyramid_op bug; test=develop * change travis time out limit; test=develop * restore travis; test=develop * change timeout limit; test=develop	5 years ago
zhouwei25	b741761098	Integration of third_party compilation structure (#20887 )	5 years ago
wopeizl	3b31b74e20	remove the warning issue test=develop (#20718 )	5 years ago
zhouwei25	bcd77e147c	Cmake_generotor support has been added to enable multi-version VS support (#20755 )	5 years ago
wopeizl	9e5948230e	add support to gcc8, add docker env test=develop (#19807 ) * add support to gcc8, add docker env test=develop	5 years ago
WangXi	507afa8a8a	Fix dgc nan by stripping nccl from sparseReduce. (#20630 )	5 years ago
石晓伟	48b27229a8	fix version.cmake, test=develop (#20606 )	5 years ago
633WHU	12e4be0382	Dlpack support (#20039 ) * support dlpack to tensor and implement python interface test=develop * add unittest for _to_dlpack and from_dlpack test=develop	5 years ago
tangwei12	c9139c3db3	trainer from dataset fetch targets (#19760 ) add executor.FetchHandler for train/infer from the dataset	5 years ago
zhaoyuchen2018	e867366805	Add multihead op for ernie opt (#19933 ) * Add multihead op for ernie opt test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine softmax test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine kernel. test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine cuda kernel test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine cuda version test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine cmake test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
liym27	3aa331d97e	fix conv2d and conv3d: (#20042 ) 1.support asymmetric padding; 2.support padding algorithm:"SAME" and "VALID"; 3.support channel_last: data_format NHWC and NDHWC; 4.change doc of python API and c++; test=develop, test=document_preview	5 years ago
石晓伟	01b9d07963	update operator compatible info, test=develop (#19978 ) * update operator compatible info, test=develop * revert cmake/version.cmake, test=develop * add unit_tests and fix bugs, test=develop * update ../paddle/fluid/framework/framework.proto, test=develop * fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop * update paddle/fluid/framework/version_test.cc, test=develop * add comments and rename interfaces, test=develop	5 years ago
gongweibao	ae593e57fa	Add dgc source code to bos platform. (#19892 ) * add dgc.tgz to bos	5 years ago
Yiqun Liu	3cd985a669	Add a pass to fuse fc+elementwise_add+layernorm (#19776 ) * Add fc_elementwise_layernorm_fuse pass and unittest. * Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop * Apply fc_elementwise_layernorm_fuse_pass to GPU inference. * Add the setting of attrs in the definition of binary_op. test=develop * Add comment. * Implement the unittest. test=develop * Change the unittest name of layer_norm. test=develop	5 years ago
chengjuntao	00efd1d8a9	add deformable conv v1 op and cpu version of deformable conv v2 (#18500 ) * add deformable conv v1 op, test=develop	5 years ago
zhouwei25	b5a5d93bbe	fix the dependencies of third party and inference lib (#19684 )	5 years ago
Huihuang Zheng	12542320c5	Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989 ) TemporaryAllocator is a singleton used for allocating memory for Cudnn. Since it is a singleton, we can delete it for better performance in memory. We replace TemporaryAllocator by CUDADeviceContextAllocator and CUDADeviceContextAllocation, which uses stream callback to delete the memory allocated for the stream to avoid singleton. Also added data_feed_proto to operator to fix CI in CPU compilation	6 years ago
Yiqun Liu	a65c728e5d	Implement the GPU kernel of fc operator (#19687 ) * Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop	6 years ago
baojun	87f13f7569	upgrade ngraph to support mkldnn v1.0 (#19689 )	6 years ago
Tao Luo	bcddbc78d4	remove -Wmaybe-uninitialized warning (#19653 ) * remove -Wmaybe-uninitialized warning test=develop * remove uninitialized op_handle_ in scale_loss_grad_op_handle.cc test=develop	6 years ago
Tao Luo	3aaea4c545	fix inference_lib deps error (#19632 ) test=develop	6 years ago
liuwei1031	9c88570881	fix the warning caused by mismatched arguments of flags.cmake (#19576 )	6 years ago
silingtong123	e79cf3bce7	Enable online compilation of openblas on windows (#19602 ) * test=develop, Support for online compilation of openblas * test=develop, Modify the prefix of openblas static library	6 years ago
hutuxian	c756b5d231	Paddlebox Framework (#18982 ) * Support looking up embeddings from BoxPS. * Add a _pull_box_sparse op, for now this op is not exposed to users. * Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on. * Add 'BoxPSDataset' in python code. * Add a compile options WITH_BOX_PS and a MACRO PADDLE_WITH_BOX_PS. * Add UT. * More concrete information pls refer to: https://github.com/PaddlePaddle/Paddle/pull/18982	6 years ago
liuwei1031	d6cb1a4122	add dynamic C runtime support on windows, test=develop (#19502 )	6 years ago
Yihua Xu	b920395842	Use sparse matrix to implement fused emb_seq_pool operator (#19064 ) * Implement the operator with sprase matrix multiply * Update the URL of mklml library. test=develop * Disable MKLML implematation when using no-linux. test=develop * Ignore the deprecated status for windows test=develop	6 years ago
Zeng Jinle	5b6673c44d	merge develop to solve conflict, also fix API doc, test=develop (#18823 )	6 years ago
liuwei1031	50582071dc	fix compilation issue in windows vs2017 (#19183 ) * fix compilation issue in windows vs2017, test=develop * fix gtest lib not found issue, test=develop	6 years ago
zhouwei25	2f0dc8463a	fix the bug that PYTHON_EXECUTABLE not exists (#19225 ) * test=develop,fix the inference library compilation bug on windows * test=develop,Fix the inference library compilation bug on windows * test=develop,fix the bug that PYTHON_EXECUTABLE not exists	6 years ago
zhouwei25	ef46918ad1	Fix the inference library compilation bug on windows (#19190 ) * test=develop,fix the inference library compilation bug on windows	6 years ago
Tao Luo	32a670badc	remove WITH_FAST_MATH option (#19149 ) test=develop	6 years ago
wopeizl	80b7ef6fc8	add tensorrt support for windows (#19084 ) * add tensorrt support for windows	6 years ago
Krzysztof Binias	e1b5833b88	[PROPOSAL] Add support for dynamic code analysis (Sanitizers) (#18303 ) * Add support for dynamic code analysis (Sanitizers) test=develop * Move options to one option test=develop * Missing check test=develop	6 years ago
baojun	adcfc53b18	upgrade ngraph version and simplify ngraph engine (#18853 ) * upgrade ngraph to v0.24 test=develop * simplify io test=develop	6 years ago
Huihuang Zheng	0d3f16f53e	Try to modify external gflags to solve CI compilation (#18872 )	6 years ago
Tao Luo	8de5aa1bde	remove package.cmake (#18760 ) test=develop	6 years ago
Tao Luo	0ae45f0b53	remove unused cmake file (#18744 ) test=develop	6 years ago
Tao Luo	c457a69db5	remove unused gzstream.cmake (#18705 ) test=develop	6 years ago
Jacek Czaja	0d8e6c9b8b	MKL-DNN upgrade to 0.20 (#18370 ) test=develop	6 years ago
gongweibao	ec1000cca9	Change to use brpc rdma branch instead of personal branch. (#18683 )	6 years ago
Jiabin Yang	898237c19a	Downgrade gcc to 4.8 (#18614 ) * test=develop, fix docker with paddle nccl problem * test=develop, downgrade gcc to 4.8 for latest-dev * test=develop, downgrade gcc to 4.8 for latest-dev * test=develop, modify cmake to renew all third_party * test=develop, invoke ci * test=develop, invoke ci * test=develop, complie python with wide-unicode * test=deveop, refine env settings * test=deveop, refine env settings	6 years ago
guru4elephant	d714bf037c	remove async executor and add data_feed.proto to the deps of train demo (#18659 ) * remove async executor and add data_feed.proto to the deps of train demo	6 years ago
kh2se2013	9ad57f2dfd	1) change to parallel mode on python coverage run (#18594 ) 2) add pip install coverage in Dockerfile.tmp test=develop	6 years ago
kh2se2013	ac81c81be1	unset CMAKE_BUILD_TYPE when WITH_COVERAGE = ON (#18541 ) install coverage package in develop image test = develop	6 years ago
石晓伟	1529154821	Support Bitmain Anakin (#18542 ) * update anakin-engine interfaces for content-dnn test=develop * support only-gpu mode of Anakin modify eltwise parse test=develop * modification for thread-safe test=develop * Integrated template instance test=develop * increase template parameters test=develop * support MLU predictor test=develop * update anakin cmake files test=develop * update TargetWrapper::set_device * update the initialization of anakin subgraph test=develop * use the default constructor of base class test=develop * load model from buffer with length test=develop * modify the access level of class test=develop * support anakin for bitmain arch test=develop * remove files * checkout cmakelists test=develop	6 years ago
石晓伟	047bba855b	Remove the obsolete cmake options (#18481 ) * remove the obsolete cmake options, test=develop * remove unittests, test=develop	6 years ago
guru4elephant	ef81ff742a	update pslib library path (#18415 ) change url of pslib.tar.gz	6 years ago
kh2se2013	27fb9cad65	add WITH_COVERAGE option, default OFF (#17872 ) * add WITH_COVERAGE option, default OFF test=develop * add coverage for python sdk test=develop * fix code style * fix COVERAGE_FILE path test=develop * remove coverage package test=develop * test = develop, run coverage as module	6 years ago
Tao Luo	3c9755bbb9	remove unused jemalloc option (#18314 ) test=develop	6 years ago
wopeizl	daa32d5383	fix package generation for inference test=develop (#18220 )	6 years ago
Wojciech Uss	c26130f3a9	reuse C-API INT8 unit test application (#18077 ) * reuse C-API INT8 unit test application test=develop * updates after review test=develop	6 years ago
Michał Gallus	8462e2b805	Disable MKLDNN FC in Resnet50 test (#18030 )	6 years ago
tensor-tang	5c06bff222	combine noavx and avx package (#17889 ) * support avx and noavx core * add catch and give some log test=develop * fix build test=develop * add missing package test=develop * fix pybind name test=develop * fix import error test=develop * conbime noavx core test=develop * add requirements test=develop * fix unkown message test=develop * fix api spec test=develop * refine and clean test=develop * update * pass dist ut * follow comments test=develop * refine scripts test=develop	6 years ago
石晓伟	bce259e5bf	Update the Anakin interfaces for content-dnn and MLU (#17890 ) * update anakin-engine interfaces for content-dnn test=develop * support only-gpu mode of Anakin modify eltwise parse test=develop * modification for thread-safe test=develop * Integrated template instance test=develop * increase template parameters test=develop * support MLU predictor test=develop * update anakin cmake files test=develop * update TargetWrapper::set_device * update the initialization of anakin subgraph test=develop * use the default constructor of base class test=develop	6 years ago
wopeizl	3d0e1204d6	add support for cuda9 on windows test=develop (#17594 ) * add support for cuda9 on windows test=develop * use different git address for cuda9 compatible on windows	6 years ago
wopeizl	82b834cbdb	use the bj as default address instead of cdn test=develop (#17795 ) The cdn.bcebos.com can be unstable randomly for unknown reason, restore it to bj.bcebos.com.	6 years ago
wopeizl	f893914f1f	fix the dll not found issue on windows (#17750 ) * fix the dll not found issue on windows	6 years ago
baojun	2c58f1a83c	[NGraph] Added lookup table to ngraph engine test=develop (#17647 )	6 years ago
Bai Yifan	bba57cdd82	Add deformable conv v2 op,test=develop (#17145 ) * unit commits, test=develop * update API.spec, test=develop	6 years ago
Yiqun Liu	5782dddad0	Optimize the concat and split kernel for specical cases when the number of inputs/outputs is 2 (#17415 ) * Optimize the concat and split kernel for special cases that the number of inputs/outputs is 2. test=develop * Refine codes. test=develop * Correct the condition. test=develop * Move the define of tmp_data outside the if statement. * Print the cudnn minor version. test=develop * Fix the case when in_num/o_num is 1 in concat/split op. test=develop * Remove const_cast. test=develop	6 years ago
Michał Gallus	0c39b97b4e	[MKL-DNN] Add Fully Connected Op for inference only(#15226 ) * fuse mul and elementwise add to fc * Reimplement the FC forward operator * Fix FC MKLDNN integration by transposing weights * Add FC MKLDNN Pass test=develop * FC MKLDNN Pass: change memcpy to std::copy * Fix MKLDNN FC handling of mismatch input and weights dims * Lower tolerance for MKL-DNN in resnet50 test test=develop * Adjust FC to support MKLDNN Op placement test=develop * Adjust Placement Op to set use_mkldnn attribute for graph test=develop * MKLDNN FC: fix weights format so that gemm version is called test=develop * FC MKLDNN: Remove tolerance decrease from tester_helper * FC MKL-DNN: Refactor the code, change input reorder to weight reorder * MKL-DNN FC: Introduce operator caching test=develop * FC MKL-DNN: Fix the tensor type in ExpectedKernelType test=develop * FC MKL-DNN: fix style changes test=develop * FC MKL-DNN: fallback to native on non-supported dim sizes test=develop * FC MKLDNN: fix CMake paths test=develop * FC MKLDNN: Refine placement pass graph mkldnn attribute test=develop * Fix Transpiler error for fuse_conv_eltwise test=develop * Fix missing STL includes in files test=develop * FC MKL-DNN: Enable new output size computation Also, refine pass to comply with newest interface. 
test=develop * FC MKL-DNN: enable only when fc_mkldnn_pass is enabled * FC MKL-DNN: Allow Weights to use oi or io format * FC MKL-DNN: Adjust UT to work with correct dims test=develop * Enable MKL DEBUG for resnet50 analyzer test=develop * FC MKL-DNN: Improve Hashing function test=develop * FC MKL-DNN: Fix shape for fc weights in transpiler * FC MKL-DNN: Update input pointer in re-used fc primitive * Add log for not handling fc fuse for unsupported dims test=develop * FC MKL-DNN: Move transpose from pass to Op Kernel test=develop * FC MKL-DNN: Disable transpose in unit test test=develop * FC MKL-DNN: Remove fc_mkldnn_pass from default list * Correct Flag for fake data analyzer tests test=develop * FC MKL-DNN: Add comment about fc mkldnn pass disablement test=develop * FC MKL-DNN: Disable fc in int8 tests test=develop	6 years ago
mozga-intel	6101fd57ad	update ngraph to v0.19 test=develop (#17582 )	6 years ago
Tao Luo	3d19f44a89	remove unused SERIAL compiler option (#17500 ) test=develop	6 years ago
wopeizl	ca3ba378c7	fix the random compilation failure on windows test=develop (#17475 ) * fix the random compilation failure on windows	6 years ago
jiaqi	66d51206b1	add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118 ) * add save/load model, shrink table, cvm, config file & fix pull dense bug test=develop * fix global shuffle bug, fix pull dense bug, fix release memeory bug, fix shrink error add client flush, add get data size test=develop * fix global shuffle bug test=develop * fix global shuffle bug test=develop * fix code style test=develop * fix code style & modify pslib cmake test=develop * fix error of _role_maker test=develop * fix code style test=develop * fix code style test=develop * fix code style test=develop * fix code style test=develop * fix code style test=develop * fix windows compile error of fleet test=develop * fix global shuffle bug * add comment test=develop * update pslib.cmake test=develop * fix fill sparse bug test=develop * fix push sparse bug test=develop	6 years ago
Jiabin Yang	c843e64cf5	Revert "rename the default version from '0.0.0' to 'latest' (#17304 )" (#17356 ) This reverts commit `f456c8beb8`.	6 years ago
wopeizl	f456c8beb8	rename the default version from '0.0.0' to 'latest' (#17304 ) * rename the default version from '0.0.0' to 'latest'	6 years ago
Tao Luo	ff1661f12a	remove unused FLAGS_warpctc_dir (#17162 ) * remove unused FLAGS_warpctc_dir test=develop * remove FLAGS_warpctc_dir test=develop	6 years ago
石晓伟	a72dbe9abf	Cherry-pick benchmark related changes from release/1.4 (#17156 ) * cherry-pick commit from `8877054` * cherry-pick commit from `3f0b97d` * cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn (cherry picked from commit `8643dbc233`) * Cherry-Pick from 16662 : Anakin subgraph cpu support (cherry picked from commit `7ad182e16c`) * Cherry-pick from 1662, 16797.. : add anakin int8 support (cherry picked from commit `e14ab180fe`) * Cherry-pick from 16813 : change singleton to graph RegistBlock test=release/1.4 (cherry picked from commit `4b9fa42307`) * Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2 Support ShuffleNet and MobileNet-v2, test=release/1.4 (cherry picked from commit `a6fb066f90`) * Cherry-pick : anakin subgraph add opt config layout argument #16846 test=release/1.4 (cherry picked from commit `8121b3eccb`) * 1. add shuffle_channel_detect (cherry picked from commit `6efdea8997`) * update shuffle_channel op convert, test=release/1.4 (cherry picked from commit `e4726a066f`) * Modify symbol export rules test=develop	6 years ago
baojun-nervana	855bb4d408	update ngraph to v0.18 test=develop	6 years ago
gongweibao	cbdb8a17b1	Polish DGC code (#16818 )	6 years ago
wopeizl	b6150e1fa7	disable the share lib for protobuf test=develop (#16778 )	6 years ago
Chen Weihang	0b2aec14b6	Revert "Model data cryption link all lib (#16555 )" test=develop This reverts commit `c38c7c5619`.	6 years ago
Chen Weihang	c38c7c5619	Model data cryption link all lib (#16555 ) * link the libwbaes.so into paddle * polish detail, test=develop * try fix mac_pr_ci error, test=develop * add compile option, test=develop * fix ci error, test=develop * ignore failed to find mac lib, test=develop * change cdn to bj, cdn can't get the latest version * trigger ci, test=develop * temporary delete win32 lib linking, test=develop * change https to http, test=develop * turn compile option on to off * turn compile option off to on, test=develop * try lib compiled by gcc4.8, test=develop * update lib version, test=develop * link other lib, test=develop * add setup config * delete false, test=develop * delete no_soname, test=develop * recover so name set * fix, test=develop * adjust make config, test=develop * remove link to wbaes, test=develop * remove useless define, test=develop	6 years ago
石晓伟	5dea0bdd1b	Merge pull request #16498 from Shixiaowei02/feature/anakin-engine merge feature/anakin-engine to develop	6 years ago
gongweibao	fea91164b7	Fix windows compilation error! (#16546 ) * fix compiled test=develop * follow comments test=develop	6 years ago
Shixiaowei02	bddb2cd315	resolve conflicts with the develop branch test=develop	6 years ago
gongweibao	eb83abeac3	Add DGC(Deep Gradient Compression) interface. (#15841 )	6 years ago
baojun	b1d2605152	fix compile issue test=develop (#16447 )	6 years ago
nhzlx	953bdde058	Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD test=develop	6 years ago
liuwei1031	de3b70a101	fix cdn issue, test=develop (#16423 ) * fix cdn issue, test=develop * fix cdn issue, test=develop	6 years ago
nhzlx	f3a2e4b3d8	1. Add ANAKIN_ROOT compile option 2. refine trt code test=develop	6 years ago
qingqing01	8ad672a287	Support sync batch norm. (#16121 ) * Support Sync Batch Norm. * Note, do not enable it in one device. Usage: build_strategy = fluid.BuildStrategy() build_strategy.sync_batch_norm = True binary = fluid.compiler.CompiledProgram(tp).with_data_parallel( loss_name=loss_mean.name, build_strategy=build_strategy)	6 years ago
Brian Liu	db120b9392	Upgrade MKLDNN to v0.18-rc and fix issue caused by lib/lib64 (#15861 ) * Upgrade MKLDNN to v0.18-rc and fix issue caused by lib/lib64 Upgrade MKLDNN to v0.18-rc Also fix the issue during upgrade test=develop * Rebase MKLDNN to rls-v0.18 branch Some issues in v0.18-rc which caused INT8 conv op unit test failure was fixed in rls-v0.18 branch test=develop * Upgrade MKLDNN from v0.18rc to formal v0.18 tag test=develop * Fix the windows compile issue. test=develop	6 years ago
Tao Luo	344f098a34	Merge pull request #15963 from baojun-nervana/ngraph_v14 Fix lib64 issue on centos	6 years ago
Tao Luo	4efdebc6f6	Merge pull request #15931 from yihuaxu/develop_2c5c7b2a7_gelu_mkl_opt Optimize gelu operation with mkl erf	6 years ago
baojun-nervana	b51e4dc0a4	fix lib64 test=develop	6 years ago
Tao Luo	47d36b2008	Merge pull request #15924 from baojun-nervana/ngraph_v14 Update ngraph version to v0.14	6 years ago
dzhwinter	225c11a91f	polish cudnn related code and fix bug. (#15164 ) * staged. * polish code * polish code. test=develop * polish code. test=develop * api change. test=develop * fix default value. test=develop * fix default value. test=develop	6 years ago
Yihua Xu	7396788694	Optimize gelu operation with mkl erf. test=develop	6 years ago
baojun-nervana	2ffacdebc2	Update ngraph version to v0.14 test=develop	6 years ago
liangan1	4acc522087	Enable function coverage for U8/S8 ConvMKLDNNOpKernel test=develop	6 years ago
tensor-tang	ee2321debd	Revert 15770 develop `a6910f900` gelu mkl opt (#15872 ) * Revert "Optimze Gelu with MKL Erf function (#15770)" This reverts commit `676995c86c`. * test=develop	6 years ago
Yihua Xu	676995c86c	Optimze Gelu with MKL Erf function (#15770 ) * Optimize for gelu operator * Set up the low accuracy mode of MKL ERF function. test=develop * Only enable MKLML ERF when OS is linux * Use the speical mklml version included vmsErf function to verify gelu mkl kernel. test=develop * Add the CUDA macro to avoid NVCC's compile issue. test=develop * Add the TODO comments for mklml library modification. test=develop * Clean Code test=develop * Add the comment of marco for NVCC compiler. test=develop	6 years ago
JiabinYang	ba38be7242	test=develop, fix protobuf runtime update and keep lib in 3.1.0	6 years ago
Tao Luo	50ffed27f6	Merge pull request #15813 from luotao1/legacy_any remove legacy any.cmake	6 years ago
Tao Luo	60cb0b9781	remove legacy $external_project_dependencies variable test=develop	6 years ago
Tao Luo	c797a1f050	remove legacy any.cmake	6 years ago
Tao Luo	f52d372876	remove legacy EXTERNAL_LIBS variable test=develop	6 years ago
Tao Luo	0d38817cf4	remove legacy EIGEN_USE_THREADS, WITH_ARM_FP16 options	6 years ago
Tao Luo	978599154f	remove legacy WITH_GOLANG, GLIDE_INSTALL options	6 years ago
Tao Luo	f522b4417f	remove legacy WITH_TIMER, WITH_DOC, ON_TRAVIS options	6 years ago
Tao Luo	ff2a8386a0	remove legacy USE_EIGEN_FOR_BLAS option	6 years ago
Tao Luo	688023ede0	remove legacy WITH_RDMA option	6 years ago
Tao Luo	6311ae5df9	remove legacy WITH_DOUBLE option	6 years ago
JiabinYang	48cf979a21	test=develop, install requirements before start for Linux	6 years ago
JiabinYang	fe7ffedc1a	test=develop, update protobuf	6 years ago
dzhwinter	02a585b5c7	add details. test=develop	6 years ago
dzhwinter	04e9776aef	add details. test=develop	6 years ago
wopeizl	3614dadf23	Merge pull request #15631 from wopeizl/windows/fixci fix ci broken randomly and disable some warnings	6 years ago
peizhilin	805d505f14	disable warnings for third parties test=develop	6 years ago
Yan Xu	c356bd01e9	fix invalide paddle_version on tag branch test=develop (#15551 )	6 years ago
peizhilin	3a4110f960	fix ci broken randomly and disable some warnings test=develop	6 years ago
Krzysztof Binias	b1bdcd4de8	Make separate folders for mkldnn codes test=develop	6 years ago
Tao Luo	c42ef5bf05	remove legacy WITH_DOC option test=develop	6 years ago
chengduo	7166b52a6e	add limit_of_tmp_allocation for CI (#15513 ) test=develop	6 years ago
Tao Luo	df92d05ef3	remove legacy IOS option test=develop	6 years ago
Tao Luo	cf29ea1592	remove legacy ANDROID option	6 years ago
Tao Luo	3ce10dba15	remove legacy USE_NNPACK option	6 years ago
Tao Luo	2d529186f1	remove legacy CMAKE_CROSSCOMPILING option	6 years ago
Tao Luo	9353bc58dd	remove legacy MOBILE_INFERENCE option	6 years ago
Tao Luo	b4ccae75c0	remove legacy target in cmake/util.cmake	6 years ago
Tao Luo	e000d17a0c	remove legacy WITH_SWIG_PY option	6 years ago
Tao Luo	561ae9d507	remove legacy WITH_C_API option	6 years ago
Wu Yi	7e651a38dd	fix mac cmake version 3.13 build (#15386 ) * fix mac cmake version 3.13 test=develop * fix again test=develop	6 years ago
Yiqun Liu	568cc2ffa8	Optimize while_op for test (#14764 ) * Simplify the compare op for CPU. * Use asynchronous tensor copy in reshape_op's kernel. * Optimize while_op for test, avoiding creating variables every time. test=develop * Enable the cache of kernel type and kernel function. test=develop * Enable profiling with gperftools. * Remove flags for testing, and fix the linking error. test=develop * Delete the codes of ChooseKernel. test=develop * Fix bug when preparing ExecutorPrepareContext for while_op. * Fix missing depending on grpc libraries. * Remove the redundant print. test=develop * Follow comments. * Remove the codes related to prepare the ExecutorPrepareContext for while_op. test=develop	6 years ago
peizhilin	439691f5bd	adjust the shlwapi on windows test=develop	6 years ago
peizhilin	92da467c99	Merge remote-tracking branch 'upstream/develop' into windows/fixgpuissue	6 years ago
Sang Ik Lee	9181dea9f3	Set correct TBB library name in debug build and remove warning related to rpath dependency from symlink. test=develop	6 years ago
baojun-nervana	bb9f7a14a0	Fix cmake warning test=develop	6 years ago
Tao Luo	f23a257e90	use the new MKLDNN repo url test=develop	6 years ago
chengduo	55a0672378	fix compute_75 of cuda_cmake (#15209 ) test=develop	6 years ago
Jiabin Yang	7b8b42689a	Merge pull request #15190 from luotao1/mklml_update update mklml version	6 years ago
xuezhong	c0bc818688	Merge pull request #15188 from velconia/add_pyramid_dnn_support Add no lock optimization pass	6 years ago
Tao Luo	49c31e5da4	disable mkl for mac test=develop	6 years ago
chengduo	b1ea335f60	add sm_75 support (#15198 ) test=develop	6 years ago
minqiyang	68a07328fa	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_pyramid_dnn_support test=develop	6 years ago
Tao Luo	ee59e60f77	update mklml version test=develop	6 years ago
minqiyang	4bfa110fd8	Add no lock optimize pass test=develop	6 years ago
Qiyang Min	1df2399e00	Merge pull request #15180 from velconia/add_pyramid_dnn_support Add JeMalloc	6 years ago
Yan Chunwei	875a07c32d	refactor inference analysis api (#14634 )	6 years ago
minqiyang	583f7ce173	Add dynamic jemalloc modules test=develop	6 years ago
baojun-nervana	f0cde74564	Update ngraph with elt-wise relu test=develop	6 years ago
peizhilin	25523bb8e6	test=develop	6 years ago
peizhilin	9ae50dd07d	fix gpu buils issue on windows test=develop	6 years ago
Jiabin Yang	adc96e06d9	Merge pull request #15107 from luotao1/mkl_version_update update mkl version, and add mkl-mac version	6 years ago

1316 Commits (27bdbec7fc16f5d66d8a0458bb6cfb68898204d1)