Paddle

Commit Graph

Author	SHA1	Message	Date
Shibo Tao	f8d5fd6f9b	generate dummy file using cmake configure_file function to avoid re-generating it. (#25161 ) * generate dummy file using cmake configure_file function to avoid re-generating it. test=develop * add cmake/dummy.c.in. test=develop	5 years ago
Shibo Tao	19c4db1b56	don't re-generate header file if content doesn't change (#25130 ) * don't re-generate header file if content doesn't change. test=develop * add copy_if_different function. test=develop	5 years ago
T8T9	7046165670	remove ${CMAKE_VERSION} VERSION_LESS "3.3.0". (#25128 )	5 years ago
石晓伟	6783441e70	fix repeat definitions in liengine.cc, test=develop (#25020 )	5 years ago
T8T9	a73a4a8fe7	don't support cmake 3.12, 3.13, 3.14 (#25021 )	5 years ago
Zhou Wei	3e04ed2227	fix bug in CUDA_NVCC_FALS and CMAKE_CUDA_FLAGS, and eliminate some warning,test=develop (#24982 ) fix bug in CUDA_NVCC_FALS and CMAKE_CUDA_FLAGS	5 years ago
Zhang Ting	0cb0318253	update cub to 1.9.8, test=develop (#24895 )	5 years ago
Yanghello	2ca2b90d62	fix cryptopp lib building bug in gcc8 (#24945 )	5 years ago
T8T9	90d420b13c	add -DPADDLE_CUDA_BINVER (#24928 ) * add -DPADDLE_CUDA_BINVER. test=develop, test=win_gpu * nvcc will use add_compile_options, avoid using it if you don't want to pass arguments to nvcc. test=develop * test=develop, test=win_gpu	5 years ago
Chen Weihang	4a702ef361	Support SelelctedRows allreduce in multi-cards imperative mode (#24690 ) * support selectedrows allreduce in multi-cards dygraph, test=develop * remove useless import modules in unittests, test=develop * add nccl cmake to get nccl version, test=develop * add if-condition to compiled correctly, test=develop * add detail version parseing for old nccl, test=develop * polish camke details, test=develop * fix remove test cmake error, test=develop * fix cmake condition, test=develop * change unittest camke list, test=develop * fix unittest cmake rule, test=develop, test=framep0	5 years ago
T8T9	211ef78c1e	Builtin cuda (#24904 ) * support CUDA using cmake built-in way (#24395) * support CUDA using cmake built-in way. test=develop * test=develop * cmake_minimum_required 3.10 * test=develop	5 years ago
silingtong123	fc4435174b	test=develop, fix the bug of tensorrt package can't compile on windows (#24860 ) * test=develop, fix a bug * test=develop, remove the macro of PADDLE_DLL_INFERENCE	5 years ago
Yanghello	aa47356b74	Add crypto python (#24836 ) * add crypto helper for paddle, test=develop * cryptopp.cmake bug fixed, test=develop * remove debug build type, test=develop * fixed CMakeLists for new target, test=develop * fix CI bug, test=develop * add cmake option flag DWITH_CRYPTO, test=develop * add crypto api for python, test=develop * Revert "add crypto api for python, test=develop" This reverts commit 3a1cfa9d055fab357f46e653a8786f96336f6b47. * Revert "Add crypto api (#24694)" This reverts commit `5a7a517cde`. * Revert "Revert "Add crypto api (#24694)"" This reverts commit f952b19fa7e8b7f9c57d31d78b9ffee1041c43ed. * fixed cryptopp cmake building error, test=develop * change WITH_CRYPTO building option to OFF, test=develop * fixed cipher test failed, test=develop * "add crypto api for python, test=develop" This reverts commit 83fb55c0668d59afad2ad1e7e04d425c7c7dd189. * travis CI bug fixed, test=develop * fixed test in python3 * test=develop * fixed unittest, test=develop	5 years ago
Yanghello	62b4ff7dd2	Aes_cipher_test and cipher_utils_test failed fixed (#24816 )	5 years ago
Wilber	f8e370ac7f	[Inference] [unittest] Inference unit tests rely on dynamic libraries (#24743 )	5 years ago
Zhou Wei	8a9f06e62d	fix bug when compile CPU inference library (#24800 )	5 years ago
silingtong123	126d3d693b	support C++ inference shared library on windows (#24672 ) * add SetCommandLineOption * add the print_FLAGS function * remove the test demo * modify the location of macro * add the 'WITH_STATIC_LIB' option on windows * modify the macro of PD_INFER_DECL * modify the the fuction name * modify the unittest * modify the code style	5 years ago
Zhou Wei	d1047d0a69	add WITH_GPU for cudaerror download (#24056 )	5 years ago
Zhou Wei	80ec2fe71c	fix windows bug that compile .cu files use MSVC dynamic C runtime (#24729 )	5 years ago
Yanghello	5a7a517cde	Add crypto api (#24694 )	5 years ago
Pei Yang	21ad122a4a	add more info to version.txt, test=develop (#24551 )	5 years ago
Jinhua Liang	1ad6317bc4	fix compile error about cub (#24648 )	5 years ago
Jacek Czaja	3292f0ef58	[onednn] elementwise add broadcasting support (#24594 )	5 years ago
Wilber	4ec7287602	fix compile when with_nccl=off. test=develop (#24444 )	5 years ago
Shibo Tao	30efee339a	Revert "support CUDA using cmake built-in way (#24395 ). test=develop" (#24468 ) This reverts commit `068d3690c6`.	5 years ago
Shibo Tao	068d3690c6	support CUDA using cmake built-in way (#24395 ) * support CUDA using cmake built-in way. test=develop * test=develop	5 years ago
Pei Yang	8c296dea75	fix compile error(cpuid.h not found) on nvidia jetson platforms. test=develop (#24329 )	5 years ago
Tao Luo	9eedf05d2f	solve mklml memory leak on windows (#24015 ) * solve mklml memory leak on windows test=develop * remove unused msvcr120.dll test=develop	5 years ago
Guo Sheng	1fc6cc502a	Fix cusolver loader for Windows (#24157 ) * Fix cusolver loader for Windows in dynamic_loader.cc. test=develop * Fix missing CUSOLVER_ROUTINE_EACH_R1. test=gpu test=develop * Add unsupprot for cusolver on Windows temporarily. test=develop * Fix GetCusolverDsoHandle error message. test=develop	5 years ago
Tao Luo	34122e665e	update mklml.cmake to 2019.0.5 (#24179 ) test=develop	5 years ago
Tao Luo	e3179ea2f5	refine ccache statistics show (#24167 ) test=develop	5 years ago
Tao Luo	29e1968d63	Revert "update mklml.cmake to 2019.0.5 (#24022 )" (#24147 ) This reverts commit `652e804b41`. test=develop	5 years ago
Tao Luo	652e804b41	update mklml.cmake to 2019.0.5 (#24022 ) * update mklml.cmake to 2019.0.5 test=develop * update mklml.cmake with new version test=develop	5 years ago
Zhou Wei	6f5669f9bf	Add note about the time cost and change HTTPS to HTTP to avoid unable to download(#24043 )	5 years ago
Zeng Jinle	d053dfd5fc	fix cuda arch detection (#24036 )	5 years ago
Zhou Wei	7817003795	Optimize the error messages of paddle CUDA API (#23816 ) * Optimize the error messages of paddle CUDA API, test=develop * fix the error messages of paddle CUDA API, test=develop * Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL,test=develop * remove build_ex_string,test=develop * merge conflict,test=develop	5 years ago
Zhang Ting	b89dd86fb6	Update eigen (#23203 ) * update eigen, test=develop * remove patches, test=develop * add definition of -fabi-version, test=develop * add patch for TensorBlock.h, test=develop * test windows, test=develop * only update eigen for Linux, test=develop * add code comments, test=develop	5 years ago
WangXi	752636f94f	cache dgc package (#23941 )	5 years ago
Zhaolong Xing	c113302826	fix cuda9, volta, turing compile error (#23730 )	5 years ago
zhangchunle	faf284a9b3	modify cmake/external/*.cmake (#23710 )	5 years ago
mozga-intel	3baaee9aab	Remove: NGraph engine from PDPD repository (#23545 ) * Remove the NGraph engine from PDPD repository 1. Each operator was removed from the operator's directory 2. Each test was removed from the unittest directory 3. The parallel executor support was removed from the PDPD 4. The CMake file was removed from the PDPD 5. The NG flags were removed from the repository test=develop * Remove ngraph from: 1. Cmake file 2. Python file test=develop	5 years ago
石晓伟	9b82e4c183	change the cmake and apis of lite engine, test=develop (#22934 ) * change the cmake and apis of lite engine, test=develop * change the cmake of lite engine, test=develop	5 years ago
channings	a2e10930cf	update linspace, equal operators to API 2.0 (#23274 ) * update linspace, equal operators to API 2.0, test=develop * equal support higher performance CUDA kernel, test=develop * update comment of equal&linspace operator, test=develop * update comment of equal&linspace operator, test=develop	5 years ago
Adam	487f43bbcb	Update DNNL version to 1.3 (#23204 )	5 years ago
Zhaolong Xing	430b0099c9	[Paddle-TRT]: Ernie Dynamic shape support. (#23138 ) * add dynamic plugin support. test=develop * change emb eltwise layernorm to math function test=develop * add emb eltwise layernorm test=develop * can run dynamic shape ernie test=develop * fix ci test=develop * add ut for trt ernie dynamic test=develop * refine dynamic shape c++ interface. test=develop * fix comments test=develop * fix comments test=develop	5 years ago
xujiaqi01	d0413e58d3	support get pslib version (#22835 ) * get pslib version * test=develop	5 years ago
Zhaolong Xing	8d6dc102fe	[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494 ) * 1. add embedding eltwise layernorm fuse 2. add embedding eltwise layernorm op 3. refine inplace_add_relu 4. refine fc_eltwise_layernorm test=develop * 1. refine fc test=develop * fix comments test=develop * fix comments test=develop	5 years ago
石晓伟	ddb9b46fec	change the function in op_teller, test=develop (#22794 ) * change the function in op_teller, test=develop * correct the commit-id, test=develop	5 years ago
zhou wei	0fb5ea7814	fix bug that sourcecode of third_party can't be cached correctly,and add cache for xbyak and openblas (#22772 )	5 years ago
tianshuo78520a	433cef03e5	fix typo word (#22784 )	5 years ago
hutuxian	175954d894	PaddleBox Framework Part2 (#22466 ) * Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator. * Add a config for DynamicAdjustChannelNum function to denote whether we will discard the remaining instances when they are not be distributed evenly. * Remove CPU code in Pull/PushSparse and we will add it back when testing it fully. * Fix some known issues: such as copying persistable vars after one epoch running.	5 years ago
zhouwei25	7cf648b315	fix bug of the cmake variable protobuf_MSVC_STATIC_CRT (#22598 )	5 years ago
Adam	608447bfd5	Update MKLDNN to v1.2 (#22521 )	5 years ago
flame	1d503e6a9e	Golang inference API (#22503 ) * support golang inference	5 years ago
石晓伟	53be3f07e9	update internal header files, test=develop (#22379 )	5 years ago
Pei Yang	5a1a9a1e59	remove copying trt to inference lib, test=develop (#22470 )	5 years ago
yaoxuefeng	2235ee1a5e	multi-loss optimization by adding a DownpourOpt worker (#22025 ) * update * update test=develop * update compile set test=develop * update compile set test=develop * update test=develop * update test=develop * update test=develop * update compile setting test=develop * update compile setting test=develop * update run demo test=develop * update test=develop * update test=develop * fix test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update format test=develop * update format test=develop * update style test=develop * update style test=develop * change style test=develop * change style test=develop * change style test=develop * add dataset unittest test=develop * update test=develop * update for record test=develop * udpate style for record test=develop * update for record test=develop * update for record test=develop * update for record test=develop * fix format test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop	5 years ago
石晓伟	e1b0d7cbb1	remove anakin from code, test=develop (#22420 )	5 years ago
Wilber	55b403e8a8	Modify lite commit id. (#22371 ) * modify lite commit id to support var_conv_2d cascade. test=develop * modify lite commit id. test=develop	5 years ago
石晓伟	24f9037e62	update external lite, test=develop (#22347 ) * update external lite, test=develop * switch WITH_TESTING to OFF, test=develop	5 years ago
Wilber	36afdbd3e1	modify lite commit id to support var_conv_2d cascade. test=develop (#22299 ) Updated the commit id of the lite dependency: lite now supports cascaded use of var_conv_2d	5 years ago
Leo Chen	032e49c494	fix compile issue, test=develop (#22001 ) * fix compile issue, test=develop * force link libiomp5 when mklml is enabled, test=develop	5 years ago
silingtong123	4f1da4adcb	remove the useless third_party library from C++ inference library (#22021 ) * remove the useless third_party library from C++ inference library * revert removing the install directory	5 years ago
zhouwei25	549e6de7ac	faster build by reduce by-product, reduce linking library and fix compile warning of std=c++11 (#22164 )	5 years ago
xujiaqi01	e3a457d34b	add collective communication library in fleet (#22211 ) * add collective communication library in fleet to replace mpi * test=develop	5 years ago
Wilber	5750152e80	support fluid-lite subgraph run resnet test=develop (#22191 ) - Added a unit test that runs resnet in fluid-lite subgraph mode - Updated the git commit id of the Lite dependency	5 years ago
Zhen Wang	46189b166d	Add bn and relu fuse pass (#22048 ) * add bn and relu fuse pass * add op attr assert and dtype assert * fix some inputs&&outputs bugs for the fused op and pattern. * add the unittest for fuse_bn_act_pass. test=develop * use normative enforce statements. test=develop * add the cpu test. test=develop * add the support of batch_size=1 for the bn with relu op. test=develop * add the error type for paddle throws. test=develop * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop	5 years ago
baojun	f8516ccb53	Upgrade nGraph to use mkldnn v1.1 (#22154 )	5 years ago
石晓伟	ad0dfb17c1	[Feature] Lite subgraph (#22114 )	5 years ago
zhouwei25	4f7a2bd0d1	tweak the interface of cache_third_party function - expose the SOURCE_DIR for each external library (#21899 )	5 years ago
Adam	700fdb1819	MKL-DNN 1.1 for Windows (#22089 )	5 years ago
Adam	c112b645c4	Update MKL-DNN to 1.1 (#21754 )	5 years ago
Yiqun Liu	d48320777e	Add the first implememtation of fusion_group op (#19621 ) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop * Add DeviceCodePool to manage all device codes. * Add the first implementation fusion_group op. * Add unit-test for fusion_group op. * Add the check of result. * Add the check of nvrtc in unit-test. test=develop * Add comment to explain the inputs, outputs and features of fusion_group op. test=develop * Disable fusion_group op for mac and windows. test=develop * Make the compiling of device code return status instead of hanging up. test=develop * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API. * Unify fusion_group_op's input and output names. test=develop * Add the check of CUDA driver library in unittest. test=develop * Refine the calling of PADDLE_ENFORCE. test=develop	5 years ago
zhouwei25	8b15acd71d	remove patch command and file of warpctc to Improved quality of Paddle Repo (#21929 )	5 years ago
zhouwei25	2df4be5d35	Fix openblas bug to support compile on windows when WITH_MKL=OFF (#21902 ) * Fix openblas to support compile on Windows when WITH_MKL=OFF	5 years ago
zhouwei25	cad058ce19	remove patch command and file of grpc to Improved quality of Paddle Repo (#21778 )	5 years ago
zhouwei25	a01663ca1f	remove patch command and file of cares to Improved quality of Paddle Repo (#21776 )	5 years ago
zhouwei25	3e1404d208	fix cp bug of warpctc repository,test=develop (#21901 )	5 years ago
xujiaqi01	37896e9050	fix compile error when WITH_PSLIB=ON (#21702 ) * fix compile error when WITH_PSLIB=ON * test=develop	5 years ago
zhouwei25	34dc710641	fix wrong commitID with patch file of warpctc (#21755 )	5 years ago
zhouwei25	03133c2c58	fix the bug that cannot pathch command for the second time (#21596 )	5 years ago
baojun	45d2fa4e26	update ngraph to v0.27 test=develop (#21677 )	5 years ago
Adam	e81f0228df	MKL-DNN 1.0 Update (#20162 ) * MKLDNN v1.0 rebase to Paddle 1.6 test=develop * Add hacky paddle::string::to_string() implementation * vectorize<int64-t>() -> vectorize() cleanup test=develop * PADDLE_ENFORCE and void_cast fixes test=develop * Rebase changes test=develop * Cosmetics test=develop * Delete MKL from mkldnn.cmake test=develop * CMake debug commands test=develop * Delete MKLDNN_VERBOSE and rebase fixes test=develop * Rebase fixes test=develop * Temporarily disable int8 resnet101 vgg16 and vgg19 tests test=develop * Add libmkldnn.so.1 to python setup test=develop * Add libmkldnn.so.1 to inference_lib cmake after rebase test=develop * Post rebase fixes + FC int8 changes test=develop * Fix LRN NHWC test=develop * Fix NHWC conv3d test=develop * Windows build fix + next conv3d fix test=develop * Fix conv2d on AVX2 machines test=develop	5 years ago
Leo Chen	84b7267100	dygraph_grad_maker supports varbase without grad_var (#21524 ) * dygraph_grad_maker supports varbase without grad_var, test=develop * fix compile, test=develop * fix test_tracer, test=develop * follow comments, test=develop	5 years ago
Leo Chen	cdd46d7e02	Split VarBase from Python Variable for Dygraph (#21359 ) * test=develop, fix docker with paddle nccl problem * don't expose numerous Tensor.set(), test=develop * fix condition, test=develop * fix float16 bug, test=develop * feed should be Tensor or np.array, not Variable or number, test=develop * use forcecast to copy numpy slice to new array, test=develop * remove float16-uint16 hacking, test=develop * add variable method to varbase and refactor to_variable to support return varbase * support kwargs in varbase constructor * add VarBase constructor to support default python args * refine varbase initial method * reset branch * fix ut for change VarBase error info to PaddleEnforce * cherry is parameter change before * overload isinstance to replace too many change of is_variable * rm useless files * rm useless code merged by git * test=develop, fix some ut failed error * test=develop, fix test_graph_wrapper * add some tests, test=develop * refine __getitem__, test=develop * add tests, test=develop * fix err_msg, test=develop	5 years ago
silingtong123	4640178629	modify the personal repo address of eigen and warpctc (#21445 ) * modify the repo address of eigen and warpctc * fix the eigen not work on windows * fix the eigen and warpctc can't recompile	5 years ago
Zhaolong Xing	c5f0293cf3	NV jetson(nano, tx2, xavier) inference compile support (#21393 ) * add jeston compile support test=develop * refine the cmake test=develop	5 years ago
Tao Luo	060bf8d0d5	Revert "revert flags.cmake (#21437 )" (#21485 ) This reverts commit `c93c9e5bfe`. test=develop	5 years ago
gongweibao	c93c9e5bfe	revert flags.cmake test=develop (#21437 )	5 years ago
Zhaolong Xing	6aa13f46cb	update openblas version (#21450 ) test=develop	5 years ago
zhouwei25	fce24315fb	fix cub/threadpool include_dir to match setup.py.in,test=develop (#21436 )	5 years ago
Tao Luo	c0656dcb1a	remove -Wno-error=sign-compare, make warning as error (#21358 ) * remove -Wno-error=sign-compare, make warning as error test=develop test=document_fix * fix exist compile warning test=develop	5 years ago
zhouwei25	b39f947698	Eliminate the impact on incremental compilation (#21410 )	5 years ago
Michał Gallus	5d7d548275	INT8 Fully-connected (#17641 ) * Implement Int8 FC * Integrate FC into INT8v2 test=develop * int8 FC: transpose weights before computing scales test=develop * Add support for activation_type string in FC test=develop * Disable MKL-DNN's FC in VGG16 and 19 test=develop * Disable FC quantization when mkldnn FC is disabled test=develop * Solve PADDLE_ENFORCES in FC int8 * Fix Paddle enforces and remove const cast test=develop * Fix style changes test=develop * Fix quantizer_tester test and add fc quantization test=develop * Fix FC test fail on CUDA * Remove unnecessary log from quantize placement pass test=develop * Add Thread ID to FC hash key test=develop * Add comments to MKL-DNN FC Kernel test=develop * Refactor quantizer test=develop * Fix linter issues test=develop * Fix crash in slim googlenet test=develop * Fix PADDLE_ENFORCE messages test=develop	5 years ago
Tao Luo	d8e7d25274	make CUDA_ARCH_NAME default Auto (#21352 ) * make CUDA_ARCH_NAME default Auto test=develop * refine warning test=develop	5 years ago
silingtong123	4b429c190d	package the CAPI inference library and third_party (#21299 )	5 years ago
zhouwei25	345b67b5e2	remove warning LNK4006 and warning LNK4221 (#21226 )	5 years ago
zhouwei25	341dee0657	Cache 3rd source code, improve stability, reduce the compilation time (#21190 )	5 years ago
Zeng Jinle	925280b96c	Change GCC version to be 8.2 in Dockerfile.GCC8 (#21222 ) * make Docker to gcc 8.2, test=develop * add -std=c11 to grpc.cmake, test=develop	5 years ago
zhouwei25	c0dcb090a3	Determine whether to copy and link inference lib by ON_INFER (#20931 )	5 years ago
Zeng Jinle	cdb3d27985	Fix warn of gcc8 (#21205 ) * fix warnings oof gcc 8 compilation, test=develop * fix boost::bad_get, test=develop * refine PADDLE_ENFORCE, test=develop	5 years ago
zhouwei25	5d821578d9	fix bug when build openblas with a computer that has installed openblas before,test=develop (#21160 )	5 years ago
Jeng Bai-Cheng	330b173c38	Better TensorRT support (#20858 ) * Fix TensorRT detection bug 1. Add new search path for TensorRT at tensorrt.cmake 2. Add better debug message 3. Fix the bug of detection of TensorRT version In NVIDIA official docker image, TensorRT headers are located at `/usr/include/x86_64-linux-gnu` and TensorRT libraries are located at `/usr/lib/x86_64-linux-gnu`, so using `-DTENSORRT_ROOT` will fail to detect TensorRT. There is no debug/warning message to tell developer that TensorRT is failed to be detected. In later version of TensorRT (e.g. v6), `NV_TENSORRT_MAJOR` is defined at `NvInferVersion.h` instead of `NvInfer.h`, so add compatibility fix. * Fix TensorRT variables in CMake 1. Replace `${TENSORRT_ROOT}/include` with `${TENSORRT_INCLUDE_DIR}` 2. Replace `${TENSORRT_ROOT}/lib` with `${TENSORRT_LIBRARY}` Manually type path may locate incorrect path of TensorRT. Use the paths detected by system instead. * Fix TensorRT library path 1. Add new variable - `${TENSORRT_LIBRARY_DIR}` 2. Fix TensorRT library path inference_lib.cmake and setup.py.in need the path of TensorRT library instead of the file of TensorRT library, so add new variable to fix it. * Add more general search rule for TensoRT Let system detect architecture instead of manually assign it, so replace `x86_64-linux-gnu` with `${CMAKE_LIBRARY_ARCHITECTURE}`. * Add more general search rule for TensorRT Remove duplicate search rules for TensorRT libraries. Use `${TENSORRT_LIBRARY_DIR}` to get full path of libnvinfer.so test=develop	5 years ago
zhouwei25	d257355089	Remove useless code of openblas and fix the previous incorrect message (#21092 )	5 years ago
Michał Gallus	6cc544aa28	Add Shallow clone to ExternalProjects (#21060 ) test=develop	5 years ago
joanna.wozna.intel	77c2083586	Add transpose2 INT8 for mkl-dnn (#19424 ) * Add transpose2 INT8 for mkl-dnn test=develop * Fix test_transpose_int8_mkldnn test=develop * Revert "Merge branch 'develop' into transpose_int8_mkldnn_2" This reverts commit 34011bdba4c859abb945e062ab13124f70508054, reversing changes made to 2ce6473f144da298aba4a43d46918f27d463cf7c. * Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"" This reverts commit 23754dd78ca47ae56881161172b2aacd349aba90. * Add template to TransposeMKLDNNHandler test=develop * Resolve conflict test=develop * Restore get_size and refactor test=develop	5 years ago
zhouwei25	89bc18eec0	move more third party library related logic to third_party.cmake (#20927 )	5 years ago
Chen Weihang	7ee25189c3	Enrich the type of error and declare the error type interfaces (#21024 ) * Enrich the type of error and declare the error type interfaces, test=develop * adjust tests to adapt new form, test=develop * add inference deps with error_codes.pb.h, test=develop * restore stack iter start pos, test=develop * polish code based review comments, test=develop	5 years ago
Zeng Jinle	878a40f57d	Support NoNeedBufferVarsInference in dygraph backward (#20868 ) * support no need buffer vars in dygraph, test=develop * fix inference compilation error, test=develop * update no_need_buffer_vars_inference, test=develop * add unittests for no_need_buffer_vars_context, test=develop * refine no_need_buffer_vars by return ref, test=develop * polish some codes, test=develop	5 years ago
zhouwei25	394edd8647	fix mklml and cblas bug,test=develop (#20970 )	5 years ago
hong	8c4573a3cb	GradMaker for dygraph (#19706 ) * refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * optimize grad maker; test=develop * optimize grad maker * test * grad make optim; test=develop * fix unittest bugs; test=develop * add dygraph grad op maker and split_op * grad op maker refactor; test=develop * add dygraph grad maker; test=develop * fix op deformable_conv_v1_op bug; test=develop * fix deformable_conv prroi pool bugs; * fix new op grad op maker bug; test=develop * fix split by ref bug; test=develop * fix dygraph auto prune bug; test=develop * fix test_trace bug; test=develop * fix fused emb seq pool bug; test=develop * remove useless code in op_desc file; test=develop * remove useless code, StrVarBaseNode; test=develop * fix review issues; test=develop * fix rank_loss grad maker; test=develop * remove flag in VarBase; test=develop * fix distributed_notify_op compile bug ; test=develop * fix reshape op double grad; test=develop * fix expand as op; test=develop * add impertive type_defs.h for demo_train; test=develop * fix inference lib cmake; test=develop * fix inference lib; test=develop * fix infernce_lib; test=develop * fix inference cmake; test=develop * fix inference lib; test=develop * fix inference lib; test=develop * remove condition dygraph grad maker, modify local name; test=develop * fix split grad maker bug; test=develop * fix pyramid_op bug; test=develop * change travis time out limit; test=develop * restore travis; test=develop * change timeout limit; test=develop	5 years ago
zhouwei25	b741761098	Integration of third_party compilation structure (#20887 )	5 years ago
wopeizl	3b31b74e20	remove the warning issue test=develop (#20718 )	5 years ago
zhouwei25	bcd77e147c	Cmake_generotor support has been added to enable multi-version VS support (#20755 )	5 years ago
wopeizl	9e5948230e	add support to gcc8, add docker env test=develop (#19807 ) * add support to gcc8, add docker env test=develop	5 years ago
WangXi	507afa8a8a	Fix dgc nan by stripping nccl from sparseReduce. (#20630 )	5 years ago
石晓伟	48b27229a8	fix version.cmake, test=develop (#20606 )	5 years ago
633WHU	12e4be0382	Dlpack support (#20039 ) * support dlpack to tensor and implement python interface test=develop * add unittest for _to_dlpack and from_dlpack test=develop	5 years ago
tangwei12	c9139c3db3	trainer from dataset fetch targets (#19760 ) add executor.FetchHandler for train/infer from the dataset	5 years ago
zhaoyuchen2018	e867366805	Add multihead op for ernie opt (#19933 ) * Add multihead op for ernie opt test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine softmax test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine kernel. test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine cuda kernel test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine cuda version test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine code test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com> * Refine cmake test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
liym27	3aa331d97e	fix conv2d and conv3d: (#20042 ) 1.support asymmetric padding; 2.support padding algorithm:"SAME" and "VALID"; 3.support channel_last: data_format NHWC and NDHWC; 4.change doc of python API and c++; test=develop, test=document_preview	5 years ago
石晓伟	01b9d07963	update operator compatible info, test=develop (#19978 ) * update operator compatible info, test=develop * revert cmake/version.cmake, test=develop * add unit_tests and fix bugs, test=develop * update ../paddle/fluid/framework/framework.proto, test=develop * fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop * update paddle/fluid/framework/version_test.cc, test=develop * add comments and rename interfaces, test=develop	5 years ago
gongweibao	ae593e57fa	Add dgc source code to bos platform. (#19892 ) * add dgc.tgz to bos	5 years ago
Yiqun Liu	3cd985a669	Add a pass to fuse fc+elementwise_add+layernorm (#19776 ) * Add fc_elementwise_layernorm_fuse pass and unittest. * Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop * Apply fc_elementwise_layernorm_fuse_pass to GPU inference. * Add the setting of attrs in the definition of binary_op. test=develop * Add comment. * Implement the unittest. test=develop * Change the unittest name of layer_norm. test=develop	5 years ago
chengjuntao	00efd1d8a9	add deformable conv v1 op and cpu version of deformable conv v2 (#18500 ) * add deformable conv v1 op, test=develop	5 years ago
zhouwei25	b5a5d93bbe	fix the dependencies of third party and inference lib (#19684 )	6 years ago
Huihuang Zheng	12542320c5	Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989 ) TemporaryAllocator is a singleton used for allocating memory for Cudnn. Since it is a singleton, we can delete it for better performance in memory. We replace TemporaryAllocator by CUDADeviceContextAllocator and CUDADeviceContextAllocation, which uses stream callback to delete the memory allocated for the stream to avoid singleton. Also added data_feed_proto to operator to fix CI in CPU compilation	6 years ago
Yiqun Liu	a65c728e5d	Implement the GPU kernel of fc operator (#19687 ) * Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop	6 years ago
baojun	87f13f7569	upgrade ngraph to support mkldnn v1.0 (#19689 )	6 years ago
Tao Luo	bcddbc78d4	remove -Wmaybe-uninitialized warning (#19653 ) * remove -Wmaybe-uninitialized warning test=develop * remove uninitialized op_handle_ in scale_loss_grad_op_handle.cc test=develop	6 years ago
Tao Luo	3aaea4c545	fix inference_lib deps error (#19632 ) test=develop	6 years ago
liuwei1031	9c88570881	fix the warning caused by mistach arguments of flags.cmake (#19576 )	6 years ago
silingtong123	e79cf3bce7	Enable online compilation of openblas on windows (#19602 ) * test=develop, Support for online compilation of openblas * test=develop, Modify the prefix of openblas static library	6 years ago
hutuxian	c756b5d231	Paddlebox Framework (#18982 ) * Support looking up embeddings from BoxPS. * Add a _pull_box_sparse op, for now this op is not exposed to users. * Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on. * Add 'BoxPSDataset' in python code. * Add a compile options WITH_BOX_PS and a MACRO PADDLE_WITH_BOX_PS. * Add UT. * More concrete information pls refer to: https://github.com/PaddlePaddle/Paddle/pull/18982	6 years ago
liuwei1031	d6cb1a4122	add dynamic C runtime support on windows, test=develop (#19502 )	6 years ago
Yihua Xu	b920395842	Use sparse matrix to implement fused emb_seq_pool operator (#19064 ) * Implement the operator with sprase matrix multiply * Update the URL of mklml library. test=develop * Disable MKLML implematation when using no-linux. test=develop * Ignore the deprecated status for windows test=develop	6 years ago
Zeng Jinle	5b6673c44d	merge develop to solve conflict, also fix API doc, test=develop (#18823 )	6 years ago
liuwei1031	50582071dc	fix compilation issue in windows vs2017 (#19183 ) * fix compilation issue in windows vs2017, test=develop * fix gtest lib not found issue, test=develop	6 years ago
zhouwei25	2f0dc8463a	fix the bug that PYTHON_EXECUTABLE not exists (#19225 ) * test=develop,fix the inference library compilation bug on windows * test=develop,Fix the inference library compilation bug on windows * test=develop,fix the bug that PYTHON_EXECUTABLE not exists	6 years ago
zhouwei25	ef46918ad1	Fix the inference library compilation bug on windows (#19190 ) * test=develop,fix the inference library compilation bug on windows	6 years ago
Tao Luo	32a670badc	remove WITH_FAST_MATH option (#19149 ) test=develop	6 years ago
wopeizl	80b7ef6fc8	add tensorrt support for windows (#19084 ) * add tensorrt support for windows	6 years ago
Krzysztof Binias	e1b5833b88	[PROPOSAL] Add support for dynamic code analysis (Sanitizers) (#18303 ) * Add support for dynamic code analysis (Sanitizers) test=develop * Move options to one option test=develop * Missing check test=develop	6 years ago
baojun	adcfc53b18	upgrade ngraph version and simplify ngraph engine (#18853 ) * upgrade ngraph to v0.24 test=develop * simplify io test=develop	6 years ago
Huihuang Zheng	0d3f16f53e	Try to modify external gflags to solve CI compilation (#18872 )	6 years ago
Tao Luo	8de5aa1bde	remove package.cmake (#18760 ) test=develop	6 years ago
Tao Luo	0ae45f0b53	remove unused cmake file (#18744 ) test=develop	6 years ago
Tao Luo	c457a69db5	remove unused gzstream.cmake (#18705 ) test=develop	6 years ago
Jacek Czaja	0d8e6c9b8b	MKL-DNN upgrade to 0.20 (#18370 ) test=develop	6 years ago
gongweibao	ec1000cca9	Change to use brpc rdma branch instead of personal branch. (#18683 )	6 years ago

1322 Commits (08dc5bc27e3fc1e3822e73c36d8a66b53daa5118)