Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into my_unpool_max_2d

release/0.11.0
sweetsky0901 7 years ago
commit 5b449b6021

@@ -42,7 +42,7 @@ ExternalProject_Add(
   # Disable -Werror, otherwise the compile will fail in MacOS.
   # It seems that we cannot configure that by make command.
   # Just dry run make command and remove `-Werror`, then use a shell to run make commands
-  BUILD_COMMAND ${BUILD_CMD}
+  BUILD_COMMAND ${BUILD_CMD} HAS_SYSTEM_PROTOBUF=false -s -j8 static grpc_cpp_plugin
   INSTALL_COMMAND make prefix=${GRPC_INSTALL_DIR} install
 )

(File diff suppressed because it is too large.)

@@ -1,4 +1,4 @@
-Build PaddlePaddle from Source
+Build from Source
 ======================
 .. _build_step:
@@ -7,8 +7,11 @@
 ----------------
 PaddlePaddle mainly uses `CMake <https://cmake.org>`_ together with GCC and G++ as its build tools.
-We recommend using a PaddlePaddle build-environment image for the build, which saves you from installing the build dependencies yourself. The available build environments
+We recommend using a PaddlePaddle Docker build-environment image for the build, which saves you from installing the build dependencies yourself. The available build-environment Docker images
 can be found `here <https://hub.docker.com/r/paddlepaddle/paddle_manylinux_devel/tags/>`_.
+If you choose not to use a Docker image, you need to install the `Compile Dependencies`_ listed in the sections below on your machine before starting the build.
 To build PaddlePaddle, run:
 .. code-block:: bash
@@ -22,7 +25,6 @@ PaddlePaddle mainly uses `CMake <https://cmake.org>`_ together with GCC and G++
    cd build
    cmake -DWITH_GPU=OFF -DWITH_TESTING=OFF ..
    make
 After the build finishes, the output whl package is generated under the build/python/dist directory; it can be installed on the current machine or copied to the target machine and installed there:
@@ -31,7 +33,33 @@ PaddlePaddle mainly uses `CMake <https://cmake.org>`_ together with GCC and G++
    pip install python/dist/*.whl
-.. _build_step:
+.. _run_test:
+Run Unit Tests
+----------------
+If you want to run all unit tests as soon as the build finishes, follow the steps below:
+When using Docker, setting :code:`RUN_TEST=ON` and :code:`WITH_TESTING=ON` runs the unit tests immediately after the build completes.
+Enabling :code:`WITH_GPU=ON` also runs the unit tests on the GPU.
+.. code-block:: bash
+   docker run -it -v $PWD:/paddle -e "WITH_GPU=OFF" -e "WITH_TESTING=ON" -e "RUN_TEST=ON" paddlepaddle/paddle_manylinux_devel:cuda8.0_cudnn5 bash -x paddle/scripts/docker/build.sh
+If you are not using Docker, simply run the ctest command:
+.. code-block:: bash
+   mkdir build
+   cd build
+   cmake -DWITH_GPU=OFF -DWITH_TESTING=ON ..
+   make
+   ctest
+   # run a single specified unit test, test_mul_op
+   ctest -R test_mul_op
+.. _compile_deps:
 Compile Dependencies
 ----------------

@@ -1,4 +1,4 @@
-Build PaddlePaddle from Sources
+Build from Sources
 ==========================
 .. _build_step:
@@ -9,14 +9,18 @@ How To Build
 PaddlePaddle mainly uses `CMake <https://cmake.org>`_ and GCC, G++ as compile
 tools. We recommend you to use our pre-built Docker image to run the build
 to avoid installing dependencies by yourself. We have several build environment
-Docker images `here <https://hub.docker.com/r/paddlepaddle/paddle_manylinux_devel/tags/>`_.
+Docker images `here <https://hub.docker.com/r/paddlepaddle/paddle_manylinux_devel/tags/>`_ .
+If you choose not to use a Docker image for your build, you need to install the
+`Compile Dependencies`_ listed below before running the build.
 Then run:
 .. code-block:: bash
    git clone https://github.com/PaddlePaddle/Paddle.git
    cd Paddle
-   # run the following command to build CPU-Only binaries if you are using docker
+   # run the following command to build CPU-only binaries if you are using Docker
    docker run -it -v $PWD:/paddle -e "WITH_GPU=OFF" -e "WITH_TESTING=OFF" paddlepaddle/paddle_manylinux_devel:cuda8.0_cudnn5 bash -x paddle/scripts/docker/build.sh
    # else run these commands
    mkdir build
@@ -32,7 +36,35 @@ machine or copy it to the target machine.
    pip install python/dist/*.whl
-.. _build_step:
+.. _run_test:
+Run Tests
+----------------
+If you wish to run the tests, you may follow the steps below:
+When using Docker, setting :code:`RUN_TEST=ON` and :code:`WITH_TESTING=ON` will run the tests immediately after the build.
+Setting :code:`WITH_GPU=ON` will also run the tests on the GPU.
+.. code-block:: bash
+   docker run -it -v $PWD:/paddle -e "WITH_GPU=OFF" -e "WITH_TESTING=ON" -e "RUN_TEST=ON" paddlepaddle/paddle_manylinux_devel:cuda8.0_cudnn5 bash -x paddle/scripts/docker/build.sh
+If you don't use Docker, simply running ctest will start the tests:
+.. code-block:: bash
+   mkdir build
+   cd build
+   cmake -DWITH_GPU=OFF -DWITH_TESTING=ON ..
+   make
+   ctest
+   # run a single test like test_mul_op
+   ctest -R test_mul_op
+.. _compile_deps:
 Compile Dependencies
 ----------------

@@ -1,4 +1,4 @@
-Install and Run PaddlePaddle Using Docker
+Install and Run Using Docker
 ================================
 Installing and running PaddlePaddle with Docker means you do not need to worry about the dependency environment, and it also works inside Docker on Windows.

@@ -1,4 +1,4 @@
-PaddlePaddle in Docker Containers
+Run in Docker Containers
 =================================
 Run PaddlePaddle in Docker container so that you don't need to care about

@@ -1,4 +1,4 @@
-Install PaddlePaddle Using pip
+Install Using pip
 ================================
 PaddlePaddle can be installed with the commonly used Python package management

@@ -1,4 +1,4 @@
-Install PaddlePaddle Using pip
+Install Using pip
 ================================
 You can use current widely used Python package management

@@ -19,7 +19,6 @@
 .. toctree::
    :maxdepth: 1
-   dev/build_cn.rst
    dev/write_docs_cn.rst
 Model Configuration

@@ -18,7 +18,6 @@ Development
 .. toctree::
    :maxdepth: 1
-   dev/build_en.rst
    dev/new_layer_en.rst
    dev/contribute_to_paddle_en.md

@@ -63,7 +63,7 @@ class CudnnConvOpKernel : public framework::OpKernel<T> {
     cudnnConvolutionDescriptor_t cudnn_conv_desc =
         conv_desc.descriptor<T>(paddings, strides, dilations);
-#if CUDNN_VERSION_MIN(7, 0, 0)
+#if CUDNN_VERSION_MIN(7, 0, 1)
     // cudnn 7 can support groups, no need to do it manually
     // FIXME(typhoonzero): find a better way to disable groups
     // rather than setting it to 1.
@@ -180,7 +180,7 @@ class CudnnConvGradOpKernel : public framework::OpKernel<T> {
     cudnnConvolutionDescriptor_t cudnn_conv_desc =
         conv_desc.descriptor<T>(paddings, strides, dilations);
-#if CUDNN_VERSION_MIN(7, 0, 0)
+#if CUDNN_VERSION_MIN(7, 0, 1)
     // cudnn 7 can support groups, no need to do it manually
     // FIXME(typhoonzero): find a better way to disable groups
     // rather than setting it to 1.
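For context, the guarded block delegates grouped convolution to cuDNN only from 7.0.1 onward. At the Python layer, a grouped convolution would be requested roughly as below; a hedged sketch that assumes fluid's layers.conv2d accepts a groups argument and that the input channel count is divisible by groups:

import paddle.v2.fluid as fluid
import paddle.v2.fluid.layers as layers

data = layers.data(name='data', shape=[8, 28, 28], dtype='float32')
# groups=2 splits the 8 input channels into two groups of 4; on cuDNN >= 7.0.1
# this corresponds to the cudnnSetConvolutionGroupCount path guarded above.
conv = layers.conv2d(data, num_filters=16, filter_size=3, groups=2)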

@@ -0,0 +1,53 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#pragma once

#include <cuda_profiler_api.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <array>   // std::array
#include <fstream>  // std::ofstream
#include <string>
#include <vector>

#include "paddle/platform/enforce.h"  // PADDLE_ENFORCE*

namespace paddle {
namespace platform {

void CudaProfilerInit(std::string output_file, std::string output_mode,
                      std::vector<std::string> config_flags) {
  // Create a unique temporary file to hold the profiler config.
  std::array<char, 128> buf;
  std::string tmpl = "/tmp/cuda_profile_config.XXXXXX";
  PADDLE_ENFORCE_LT(tmpl.size(), buf.size());
  // Copy including the trailing '\0' so mktemp sees a C string.
  memcpy(buf.data(), tmpl.data(), tmpl.size() + 1);
  auto result = mktemp(buf.data());
  PADDLE_ENFORCE(strlen(result) != 0);
  std::string config_file = result;

  {
    // Write one counter/option per line, as the profiler expects.
    std::ofstream ofs(config_file, std::ios::out | std::ios::trunc);
    PADDLE_ENFORCE(ofs.is_open(), "ofstream: ", ofs.rdstate());
    for (const auto& line : config_flags) {
      ofs << line << std::endl;
    }
  }

  PADDLE_ENFORCE(output_mode == "kvp" || output_mode == "csv");
  cudaOutputMode_t mode = output_mode == "csv" ? cudaCSV : cudaKeyValuePair;
  PADDLE_ENFORCE(
      cudaProfilerInitialize(config_file.c_str(), output_file.c_str(), mode));
}

void CudaProfilerStart() { PADDLE_ENFORCE(cudaProfilerStart()); }

void CudaProfilerStop() { PADDLE_ENFORCE(cudaProfilerStop()); }

}  // namespace platform
}  // namespace paddle

@@ -37,6 +37,10 @@ CUDNN_DNN_ROUTINE_EACH_AFTER_R4(DEFINE_WRAP);
 CUDNN_DNN_ROUTINE_EACH_R5(DEFINE_WRAP);
 #endif

+#ifdef CUDNN_DNN_ROUTINE_EACH_R7
+CUDNN_DNN_ROUTINE_EACH_R7(DEFINE_WRAP);
+#endif
+
 }  // namespace dynload
 }  // namespace platform
 }  // namespace paddle

@@ -135,6 +135,12 @@ CUDNN_DNN_ROUTINE_EACH_AFTER_R4(DECLARE_DYNAMIC_LOAD_CUDNN_WRAP)
 CUDNN_DNN_ROUTINE_EACH_R5(DECLARE_DYNAMIC_LOAD_CUDNN_WRAP)
 #endif

+#if CUDNN_VERSION >= 7001
+#define CUDNN_DNN_ROUTINE_EACH_R7(__macro) \
+  __macro(cudnnSetConvolutionGroupCount);
+CUDNN_DNN_ROUTINE_EACH_R7(DECLARE_DYNAMIC_LOAD_CUDNN_WRAP)
+#endif
+
 }  // namespace dynload
 }  // namespace platform
 }  // namespace paddle

@@ -37,6 +37,7 @@ limitations under the License. */
 #ifdef PADDLE_WITH_CUDA
 #include "paddle/operators/nccl/nccl_gpu_common.h"
+#include "paddle/platform/cuda_profiler.h"
 #include "paddle/platform/gpu_info.h"
 #endif
@@ -460,6 +461,10 @@ All parameter, weight, gradient are variables in Paddle.
   m.def("op_support_gpu", OpSupportGPU);
 #ifdef PADDLE_WITH_CUDA
   m.def("get_cuda_device_count", platform::GetCUDADeviceCount);
+  m.def("nvprof_init", platform::CudaProfilerInit);
+  m.def("nvprof_start", platform::CudaProfilerStart);
+  m.def("nvprof_stop", platform::CudaProfilerStop);
 #endif

   return m.ptr();
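For reference, the three bindings registered above are plain free functions and can be driven directly from Python. A minimal sketch, assuming a CUDA-enabled build; the fluid profiler module added below wraps exactly this init/start/stop sequence:

import paddle.v2.fluid.core as core

# CudaProfilerInit takes the output file, the output mode ("kvp" or "csv"),
# and the list of counter/option lines written to the temporary config file.
core.nvprof_init("cuda_profile.txt", "csv",
                 ["gpustarttimestamp", "gpuendtimestamp"])
core.nvprof_start()  # enable collection by the CUDA profiling tool
# ... launch GPU work here, e.g. run a fluid program ...
core.nvprof_stop()   # disable collection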

@ -0,0 +1,46 @@
import paddle.v2.fluid.core as core
from contextlib import contextmanager
__all__ = ['CudaProfiler']
NVPROF_CONFIG = [
"gpustarttimestamp",
"gpuendtimestamp",
"gridsize3d",
"threadblocksize",
"streamid",
"enableonstart 0",
"conckerneltrace",
]
@contextmanager
def cuda_profiler(output_file, output_mode=None, config=None):
"""The CUDA profiler.
This fuctions is used to profile CUDA program by CUDA runtime application
programming interface. The profiling result will be written into
`output_file` with Key-Value pair format or Comma separated values format.
The user can set the output mode by `output_mode` argument and set the
counters/options for profiling by `config` argument. The default config
caontains 'gpustarttimestamp', 'gpustarttimestamp', 'gridsize3d',
'threadblocksize', 'streamid', 'enableonstart 0', 'conckerneltrace'.
Args:
output_file (string) : The output file name, the result will be
written into this file.
output_mode (string) : The output mode has Key-Value pair format and
Comma separated values format. It should be 'kv' or 'csv'.
config (string) : The profiler options and counters can refer to
"Compute Command Line Profiler User Guide".
"""
if output_mode is None:
output_mode = 'csv'
if output_mode not in ['kv', 'csv']:
raise ValueError("The output mode must be 'key-value' or 'csv'.")
config = NVPROF_CONFIG if config is None else config
core.nvprof_init(output_file, output_mode, config)
# Enables profiler collection by the active CUDA profiling tool.
core.nvprof_start()
yield
# Disables profiler collection.
core.nvprof_stop()

@ -0,0 +1,28 @@
import unittest
import numpy as np
import paddle.v2.fluid as fluid
import paddle.v2.fluid.profiler as profiler
import paddle.v2.fluid.layers as layers
class TestProfiler(unittest.TestCase):
def test_nvprof(self):
if not fluid.core.is_compile_gpu():
return
epoc = 8
dshape = [4, 3, 28, 28]
data = layers.data(name='data', shape=[3, 28, 28], dtype='float32')
conv = layers.conv2d(data, 20, 3, stride=[1, 1], padding=[1, 1])
place = fluid.GPUPlace(0)
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
with profiler.cuda_profiler("cuda_profiler.txt", 'csv') as nvprof:
for i in range(epoc):
input = np.random.random(dshape).astype("float32")
exe.run(fluid.default_main_program(), feed={'data': input})
if __name__ == '__main__':
unittest.main()
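As a usage illustration, the context manager can also be run in key-value mode with a custom counter list. A sketch varying the test above (like the test, it requires a GPU build):

import numpy as np
import paddle.v2.fluid as fluid
import paddle.v2.fluid.layers as layers
import paddle.v2.fluid.profiler as profiler

# Build a small network, mirroring the unit test above.
data = layers.data(name='data', shape=[3, 28, 28], dtype='float32')
conv = layers.conv2d(data, 20, 3, stride=[1, 1], padding=[1, 1])

exe = fluid.Executor(fluid.GPUPlace(0))
exe.run(fluid.default_startup_program())

# 'kv' selects key-value pair output; the config list overrides NVPROF_CONFIG.
with profiler.cuda_profiler("profile.txt", 'kv',
                            config=["gpustarttimestamp", "gpuendtimestamp"]):
    exe.run(fluid.default_main_program(),
            feed={'data': np.random.random([4, 3, 28, 28]).astype("float32")})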