From f22ece9273b54f1a248f7a787e252eb04a5acea3 Mon Sep 17 00:00:00 2001 From: Yi Wang Date: Thu, 24 Aug 2017 19:44:19 -0700 Subject: [PATCH 1/9] Add a document on building using Docker --- Dockerfile | 4 +- doc/howto/dev/build_en.md | 83 ++++++++++++++++++++++++++++++++++ paddle/scripts/docker/build.sh | 6 +-- 3 files changed, 87 insertions(+), 6 deletions(-) create mode 100644 doc/howto/dev/build_en.md diff --git a/Dockerfile b/Dockerfile index 98f61ba586..136db772cc 100644 --- a/Dockerfile +++ b/Dockerfile @@ -10,13 +10,11 @@ RUN /bin/bash -c 'if [[ -n ${UBUNTU_MIRROR} ]]; then sed -i 's#http://archive.ub ARG WITH_GPU ARG WITH_AVX ARG WITH_DOC -ARG WITH_STYLE_CHECK ENV WOBOQ OFF -ENV WITH_GPU=${WITH_GPU:-OFF} +ENV WITH_GPU=${WITH_GPU:-ON} ENV WITH_AVX=${WITH_AVX:-ON} ENV WITH_DOC=${WITH_DOC:-OFF} -ENV WITH_STYLE_CHECK=${WITH_STYLE_CHECK:-OFF} ENV HOME /root # Add bash enhancements diff --git a/doc/howto/dev/build_en.md b/doc/howto/dev/build_en.md new file mode 100644 index 0000000000..80488a147d --- /dev/null +++ b/doc/howto/dev/build_en.md @@ -0,0 +1,83 @@ +# Build PaddlePaddle from Source Code and Run Unit Test + +## What Developers Need + +To contribute to PaddlePaddle, you need + +1. A computer -- Linux, BSD, Windows, MacOS, and +1. Docker. + +Nothing else. Not even Python and GCC, because you can install all build tools into a Docker image. + +## General Process + +1. Retrieve source code. + + ```bash + git clone https://github.com/paddlepaddle/paddle + ``` + +2. Install build tools. + + ```bash + cd paddle; docker build -t paddle:dev . + ``` + +3. Build from source. + + ```bash + docker run -v $PWD:/paddle paddle:dev + ``` + +4. Run unit tests. + + ```bash + docker run -v $PWD:/paddle paddle:dev "cd/build; ctest" + ``` + + +## Docker, Or Not? + +- What is Docker? + + If you haven't heard of it, consider it something like Python's virtualenv. + +- Docker or virtual machine? 
+ + Some people compare Docker with VMs, but Docker doesn't virtualize any hardware, and it doesn't run a guest OS. + +- Why Docker? + + Using a Docker image of build tools standardize the building environment, and easier for others to reproduce your problem, if there is any, and help. + + Also, some build tools don't run on Windows or Mac or BSD, but Docker runs almost everywhere, so developers can use whatever computer they want. + +- Can I don't use Docker? + + Sure, you don't have to install build tools into a Docker image; instead, you can install them onto your local computer. This document exists because Docker would make the development way easier. + +- How difficult is it to learn Docker? + + It takes you ten minutes to read https://docs.docker.com/get-started/ and saves you more than one hour to install all required build tools, configure them, and upgrade them when new versions of PaddlePaddle require some new tools. + +- Docker requires sudo + + An owner of a computer has the administrative privilege, a.k.a., sudo. If you use a shared computer for development, please ask the administrator to install and configure Docker. We will do our best to support rkt, another container technology that doesn't require sudo. + +- Can I use my favorite IDE? + + Yes, of course. The source code resides on your local computer, and you can edit it using whatever editor you like. + + Many PaddlePaddle developers are using Emacs. They add the following few lines into their `~/.emacs` configure file: + + ```emacs + (global-set-key "\C-cc" 'compile) + (setq compile-command + "docker run --rm -it -v $(git rev-parse --show-toplevel):/paddle paddle:dev") + ``` + + so they could type `Ctrl-C` and `c` to build PaddlePaddle from source. + +- How many parallel building processes does the Docker container run? 
+ + Our building Docker image runs a Bash script https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh, which calls `make -j$(nproc)` to starts as many processes as the number of your processors. diff --git a/paddle/scripts/docker/build.sh b/paddle/scripts/docker/build.sh index 2941662f34..7bab814ae8 100644 --- a/paddle/scripts/docker/build.sh +++ b/paddle/scripts/docker/build.sh @@ -38,7 +38,7 @@ Configuring cmake in /paddle/build ... -DWITH_SWIG_PY=${WITH_SWIG_PY:-ON} -DCUDNN_ROOT=/usr/ -DWITH_STYLE_CHECK=${WITH_STYLE_CHECK:-OFF} - -DWITH_TESTING=${WITH_TESTING:-OFF} + -DWITH_TESTING=${WITH_TESTING:-ON} -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ======================================== EOF @@ -56,8 +56,8 @@ cmake .. \ -DWITH_C_API=${WITH_C_API:-OFF} \ -DWITH_PYTHON=${WITH_PYTHON:-ON} \ -DCUDNN_ROOT=/usr/ \ - -DWITH_STYLE_CHECK=${WITH_STYLE_CHECK:-OFF} \ - -DWITH_TESTING=${WITH_TESTING:-OFF} \ + -DWITH_STYLE_CHECK=${WITH_STYLE_CHECK:-ON} \ + -DWITH_TESTING=${WITH_TESTING:-ON} \ -DCMAKE_EXPORT_COMPILE_COMMANDS=ON cat < Date: Thu, 24 Aug 2017 20:37:39 -0700 Subject: [PATCH 2/9] Update unit test running and CUDA --- doc/howto/dev/build_en.md | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/doc/howto/dev/build_en.md b/doc/howto/dev/build_en.md index 80488a147d..de0733f963 100644 --- a/doc/howto/dev/build_en.md +++ b/doc/howto/dev/build_en.md @@ -29,12 +29,25 @@ Nothing else. Not even Python and GCC, because you can install all build tools docker run -v $PWD:/paddle paddle:dev ``` + This builds a CUDA-enabled version and writes all binary outputs to directory `./build` of the local computer, other than the Docker container. If we want to build only the CPU part, we can type + + ```bash + docker run -e WITH_GPU=OFF -v $PWD:/paddle paddle:dev + ``` + 4. Run unit tests. 
+ To run all unit tests using the first GPU of a node: + ```bash - docker run -v $PWD:/paddle paddle:dev "cd/build; ctest" + NV_GPU=0 nvidia-docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest" ``` + If we used `WITH_GPU=OFF` at build time, it generates only CPU-based unit tests, and we don't need nvidia-docker to run them. We can just run + + ```bash + docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest" + ``` ## Docker, Or Not? From 1e61d91f24e9213ab43edc62cf2c6f9e47a62d1f Mon Sep 17 00:00:00 2001 From: Yi Wang Date: Thu, 24 Aug 2017 21:38:13 -0700 Subject: [PATCH 3/9] Update index and add Chinese version --- doc/howto/dev/build_cn.md | 100 ++++++++++++++++++++++++++++++++++++++ doc/howto/dev/build_en.md | 6 ++- doc/howto/index_cn.rst | 1 + doc/howto/index_en.rst | 1 + 4 files changed, 107 insertions(+), 1 deletion(-) create mode 100644 doc/howto/dev/build_cn.md diff --git a/doc/howto/dev/build_cn.md b/doc/howto/dev/build_cn.md new file mode 100644 index 0000000000..dc372de9fa --- /dev/null +++ b/doc/howto/dev/build_cn.md @@ -0,0 +1,100 @@ +# 编译PaddlePaddle和运行单元测试 + +## 需要的软硬件 + +为了开发PaddlePaddle,我们需要 + +1. 一台电脑,可以装的是 Linux, BSD, Windows 或者 MacOS 操作系统,以及 +1. Docker。 + +不需要其他任何软件了。即便是 Python 和 GCC 都不需要,因为我们会把所有编译工具都安装进一个 Docker image 里。 + +## 总体流程 + +1. 获取源码 + + ```bash + git clone https://github.com/paddlepaddle/paddle + ``` + +2. 安装工具 + + ```bash + cd paddle; docker build -t paddle:dev . + ``` + +3. 编译 + + ```bash + docker run -v $PWD:/paddle paddle:dev + ``` + + 这个命令编译出一个 CUDA-enabled 版本。所有二进制文件会被写到本机的 `./build` 目录,而不是写到 Docker container 里。如果我们只需要编译一个只支持 CPU 的版本,可以用 + + ```bash + docker run -e WITH_GPU=OFF -v $PWD:/paddle paddle:dev + ``` + +4. 
运行单元测试 + + 用本机的第一个 GPU 来运行包括 GPU 单元测试在内的所有单元测试: + + ```bash + NV_GPU=0 nvidia-docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest" + ``` + + 如果编译的时候我们用了 `WITH_GPU=OFF` 选项,那么编译过程只会产生 CPU-based 单元测试,那么我们也就不需要 nvidia-docker 来运行单元测试了。我们只需要: + + ```bash + docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest" + ``` + +## 为什么要 Docker 呀? + +- 什么是 Docker? + + 如果您没有听说 Docker,可以把它想象为一个类似 virtualenv 的系统,但是虚拟的不仅仅是 Python 的运行环境。 + +- Docker 还是虚拟机? + + 有人用虚拟机来类比 Docker。需要强调的是:Docker 不会虚拟任何硬件,Docker container 里运行的编译工具实际上都是在本机的 CPU 和操作系统上直接运行的,性能和把编译工具安装在本机运行基本一样。 + +- 为什么用 Docker? + + 把工具和配置都安装在一个 Docker image 里可以标准化编译环境。这样如果遇到问题,其他人可以复现问题以便帮助。 + + 另外,对于习惯使用Windows和MacOS的开发者来说,使用Docker就不用配置交叉编译环境了。 + +- 我可以选择不用Docker吗? + + 当然可以。大家可以用把开发工具安装进入 Docker image 一样的方式,把这些工具安装到本机。这篇文档介绍基于 Docker 的开发流程,是因为这个流程比其他方法都更简便。 + +- 学习 Docker 有多难? + + 理解 Docker 并不难,大概花十分钟看一遍 https://zhuanlan.zhihu.com/p/19902938 即可。这可以帮您省掉花一小时安装和配置各种开发工具,以及切换机器时需要新安装的辛苦。别忘了 PaddlePaddle 更新可能导致需要新的开发工具。更别提简化问题复现带来的好处了。 + +- Docker 需要 sudo + + 如果用自己的电脑开发,自然也就有管理员权限(sudo)了。如果用公用的电脑开发,需要请管理员安装和配置好 Docker。此外,PaddlePaddle 项目在努力开始支持其他不需要 sudo 的集装箱技术,比如 rkt。 + +- 我可以用 IDE 吗? + + 当然可以,因为源码就在本机上。IDE 默认调用 make 之类的程序来编译源码,我们只需要配置 IDE 来调用 Docker 命令编译源码即可。 + + 很多 PaddlePaddle 开发者使用 Emacs。他们在自己的 `~/.emacs` 配置文件里加两行 + + ```emacs + (global-set-key "\C-cc" 'compile) + (setq compile-command + "docker run --rm -it -v $(git rev-parse --show-toplevel):/paddle paddle:dev") + ``` + + 就可以按 `Ctrl-C` 和 `c` 键来启动编译了。 + +- 可以并行编译吗? + + 是的。我们的 Docker image 运行一个 Bash 脚本 https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh 。这个脚本调用 `make -j$(nproc)` 来启动和 CPU 核一样多的进程来并行编译。 + +- Docker on Windows/MacOS? 
+ + Docker 在 Windows 和 MacOS 都可以运行。不过实际上是运行在一个 Linux 虚拟机上。可能需要注意给这个虚拟机多分配一些 CPU 和内存,以保证编译高效。具体做法请参考 https://github.com/PaddlePaddle/Paddle/issues/627 。 diff --git a/doc/howto/dev/build_en.md b/doc/howto/dev/build_en.md index de0733f963..640d126018 100644 --- a/doc/howto/dev/build_en.md +++ b/doc/howto/dev/build_en.md @@ -91,6 +91,10 @@ Nothing else. Not even Python and GCC, because you can install all build tools so they could type `Ctrl-C` and `c` to build PaddlePaddle from source. -- How many parallel building processes does the Docker container run? +- Does Docker do parallel building? Our building Docker image runs a Bash script https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh, which calls `make -j$(nproc)` to starts as many processes as the number of your processors. + +- Docker on Windows/MacOS? + + On Windows and MacOS, Docker containers run in a Linux VM. You might want to give this VM some more memory and CPUs so to make the building efficient. Please refer to https://github.com/PaddlePaddle/Paddle/issues/627 for details. diff --git a/doc/howto/index_cn.rst b/doc/howto/index_cn.rst index 26449a6365..0608aa3096 100644 --- a/doc/howto/index_cn.rst +++ b/doc/howto/index_cn.rst @@ -19,6 +19,7 @@ .. toctree:: :maxdepth: 1 + dev/build_cn.rst dev/write_docs_cn.rst dev/contribute_to_paddle_cn.md diff --git a/doc/howto/index_en.rst b/doc/howto/index_en.rst index 1fbfcd260b..1b6034be4e 100644 --- a/doc/howto/index_en.rst +++ b/doc/howto/index_en.rst @@ -18,6 +18,7 @@ Development .. 
toctree:: :maxdepth: 1 + dev/build_en.rst dev/new_layer_en.rst dev/contribute_to_paddle_en.md From c8d0c9af865cd0ac47d1cd7461c24793d833eeff Mon Sep 17 00:00:00 2001 From: Yi Wang Date: Fri, 25 Aug 2017 11:24:48 -0700 Subject: [PATCH 4/9] In response to comments from Luo Tao --- doc/howto/dev/build_cn.md | 6 +++--- doc/howto/dev/build_en.md | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/doc/howto/dev/build_cn.md b/doc/howto/dev/build_cn.md index dc372de9fa..7c95579636 100644 --- a/doc/howto/dev/build_cn.md +++ b/doc/howto/dev/build_cn.md @@ -71,7 +71,7 @@ - 学习 Docker 有多难? - 理解 Docker 并不难,大概花十分钟看一遍 https://zhuanlan.zhihu.com/p/19902938 即可。这可以帮您省掉花一小时安装和配置各种开发工具,以及切换机器时需要新安装的辛苦。别忘了 PaddlePaddle 更新可能导致需要新的开发工具。更别提简化问题复现带来的好处了。 + 理解 Docker 并不难,大概花十分钟看一下[这篇文章](https://zhuanlan.zhihu.com/p/19902938)。这可以帮您省掉花一小时安装和配置各种开发工具,以及切换机器时需要新安装的辛苦。别忘了 PaddlePaddle 更新可能导致需要新的开发工具。更别提简化问题复现带来的好处了。 - Docker 需要 sudo @@ -93,8 +93,8 @@ - 可以并行编译吗? - 是的。我们的 Docker image 运行一个 Bash 脚本 https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh 。这个脚本调用 `make -j$(nproc)` 来启动和 CPU 核一样多的进程来并行编译。 + 是的。我们的 Docker image 运行一个 [Bash 脚本](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh)。这个脚本调用 `make -j$(nproc)` 来启动和 CPU 核一样多的进程来并行编译。 - Docker on Windows/MacOS? - Docker 在 Windows 和 MacOS 都可以运行。不过实际上是运行在一个 Linux 虚拟机上。可能需要注意给这个虚拟机多分配一些 CPU 和内存,以保证编译高效。具体做法请参考 https://github.com/PaddlePaddle/Paddle/issues/627 。 + Docker 在 Windows 和 MacOS 都可以运行。不过实际上是运行在一个 Linux 虚拟机上。可能需要注意给这个虚拟机多分配一些 CPU 和内存,以保证编译高效。具体做法请参考[这个issue](https://github.com/PaddlePaddle/Paddle/issues/627)。 diff --git a/doc/howto/dev/build_en.md b/doc/howto/dev/build_en.md index 640d126018..3be2405ea7 100644 --- a/doc/howto/dev/build_en.md +++ b/doc/howto/dev/build_en.md @@ -71,7 +71,7 @@ Nothing else. Not even Python and GCC, because you can install all build tools - How difficult is it to learn Docker? 
- It takes you ten minutes to read https://docs.docker.com/get-started/ and saves you more than one hour to install all required build tools, configure them, and upgrade them when new versions of PaddlePaddle require some new tools. + It takes you ten minutes to read [an introductory article](https://docs.docker.com/get-started) and saves you more than one hour to install all required build tools, configure them, and upgrade them when new versions of PaddlePaddle require some new tools. - Docker requires sudo @@ -93,8 +93,8 @@ Nothing else. Not even Python and GCC, because you can install all build tools - Does Docker do parallel building? - Our building Docker image runs a Bash script https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh, which calls `make -j$(nproc)` to starts as many processes as the number of your processors. + Our building Docker image runs a [Bash script](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh), which calls `make -j$(nproc)` to starts as many processes as the number of your processors. - Docker on Windows/MacOS? - On Windows and MacOS, Docker containers run in a Linux VM. You might want to give this VM some more memory and CPUs so to make the building efficient. Please refer to https://github.com/PaddlePaddle/Paddle/issues/627 for details. + On Windows and MacOS, Docker containers run in a Linux VM. You might want to give this VM some more memory and CPUs so to make the building efficient. Please refer to [this issue](https://github.com/PaddlePaddle/Paddle/issues/627) for details. 
From f71f3935e3ce05a8e90edc971f5ab08d71ed2966 Mon Sep 17 00:00:00 2001 From: Yi Wang Date: Fri, 25 Aug 2017 11:51:53 -0700 Subject: [PATCH 5/9] In response to comments from Chen Xi --- doc/howto/dev/build_cn.md | 20 +++++++++++++------- doc/howto/dev/build_en.md | 34 ++++++++++++++++++++-------------- 2 files changed, 33 insertions(+), 21 deletions(-) diff --git a/doc/howto/dev/build_cn.md b/doc/howto/dev/build_cn.md index 7c95579636..0077d90118 100644 --- a/doc/howto/dev/build_cn.md +++ b/doc/howto/dev/build_cn.md @@ -23,13 +23,17 @@ cd paddle; docker build -t paddle:dev . ``` + 请注意这个命令结尾处的 `.`;它表示 `docker build` 应该读取当前目录下的 [`Dockerfile`文件](https://github.com/PaddlePaddle/Paddle/blob/develop/Dockerfile),按照其内容创建一个名为 `paddle:dev` 的 Docker image,并且把各种开发工具安装进去。 + 3. 编译 + 以下命令启动一个 Docker container 来执行 `paddle:dev` 这个 Docker image,同时把当前目录(源码树根目录)映射为 container 里的 `/paddle` 目录,并且运行 `Dockerfile` 描述的默认入口程序 [`build.sh`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh)。这个脚本调用 `cmake` 和 `make` 来编译 `/paddle` 里的源码,结果输出到 `/paddle/build`,也就是本地的源码树根目录里的 `build` 子目录。 + ```bash docker run -v $PWD:/paddle paddle:dev ``` - 这个命令编译出一个 CUDA-enabled 版本。所有二进制文件会被写到本机的 `./build` 目录,而不是写到 Docker container 里。如果我们只需要编译一个只支持 CPU 的版本,可以用 + 上述命令编译出一个 CUDA-enabled 版本。如果我们只需要编译一个只支持 CPU 的版本,可以用 ```bash docker run -e WITH_GPU=OFF -v $PWD:/paddle paddle:dev @@ -57,7 +61,7 @@ - Docker 还是虚拟机? - 有人用虚拟机来类比 Docker。需要强调的是:Docker 不会虚拟任何硬件,Docker container 里运行的编译工具实际上都是在本机的 CPU 和操作系统上直接运行的,性能和把编译工具安装在本机运行基本一样。 + 有人用虚拟机来类比 Docker。需要强调的是:Docker 不会虚拟任何硬件,Docker container 里运行的编译工具实际上都是在本机的 CPU 和操作系统上直接运行的,性能和把编译工具安装在本机运行一样。 - 为什么用 Docker? @@ -73,10 +77,6 @@ 理解 Docker 并不难,大概花十分钟看一下[这篇文章](https://zhuanlan.zhihu.com/p/19902938)。这可以帮您省掉花一小时安装和配置各种开发工具,以及切换机器时需要新安装的辛苦。别忘了 PaddlePaddle 更新可能导致需要新的开发工具。更别提简化问题复现带来的好处了。 -- Docker 需要 sudo - - 如果用自己的电脑开发,自然也就有管理员权限(sudo)了。如果用公用的电脑开发,需要请管理员安装和配置好 Docker。此外,PaddlePaddle 项目在努力开始支持其他不需要 sudo 的集装箱技术,比如 rkt。 - - 我可以用 IDE 吗? 
当然可以,因为源码就在本机上。IDE 默认调用 make 之类的程序来编译源码,我们只需要配置 IDE 来调用 Docker 命令编译源码即可。 @@ -95,6 +95,12 @@ 是的。我们的 Docker image 运行一个 [Bash 脚本](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh)。这个脚本调用 `make -j$(nproc)` 来启动和 CPU 核一样多的进程来并行编译。 -- Docker on Windows/MacOS? +## 可能碰到的问题 + +- Docker 需要 sudo + + 如果用自己的电脑开发,自然也就有管理员权限(sudo)了。如果用公用的电脑开发,需要请管理员安装和配置好 Docker。此外,PaddlePaddle 项目在努力开始支持其他不需要 sudo 的集装箱技术,比如 rkt。 + +- 在 Windows/MacOS 上编译很慢 Docker 在 Windows 和 MacOS 都可以运行。不过实际上是运行在一个 Linux 虚拟机上。可能需要注意给这个虚拟机多分配一些 CPU 和内存,以保证编译高效。具体做法请参考[这个issue](https://github.com/PaddlePaddle/Paddle/issues/627)。 diff --git a/doc/howto/dev/build_en.md b/doc/howto/dev/build_en.md index 3be2405ea7..95752beba0 100644 --- a/doc/howto/dev/build_en.md +++ b/doc/howto/dev/build_en.md @@ -7,7 +7,7 @@ To contribute to PaddlePaddle, you need 1. A computer -- Linux, BSD, Windows, MacOS, and 1. Docker. -Nothing else. Not even Python and GCC, because you can install all build tools into a Docker image. +Nothing else. Not even Python and GCC, because you can install all build tools into a Docker image. We run all the tools by running this image. ## General Process @@ -17,19 +17,23 @@ Nothing else. Not even Python and GCC, because you can install all build tools git clone https://github.com/paddlepaddle/paddle ``` -2. Install build tools. +2. Install build tools into a Docker image. ```bash cd paddle; docker build -t paddle:dev . ``` + Please be aware of the `.` at the end of the command: it tells `docker build` to use the current directory as its build context, which is where it finds the [`Dockerfile`](https://github.com/PaddlePaddle/Paddle/blob/develop/Dockerfile). `docker build` follows the instructions in this file to create a Docker image named `paddle:dev` and installs the build tools into it. + 3. Build from source. 
+ The following command starts a Docker container that executes the Docker image `paddle:dev`, mapping the current directory to `/paddle/` in the container, and runs the default entry-point [`build.sh`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh) as specified in the Dockerfile. `build.sh` invokes `cmake` and `make` to build the PaddlePaddle source code, which has been mapped to `/paddle`, and writes outputs to `/paddle/build`, which maps to `build` in the current source directory on the computer. + ```bash docker run -v $PWD:/paddle paddle:dev ``` - This builds a CUDA-enabled version and writes all binary outputs to directory `./build` of the local computer, other than the Docker container. If we want to build only the CPU part, we can type + The above command builds a CUDA-enabled version. If we want to build a CPU-only version, we can type ```bash docker run -e WITH_GPU=OFF -v $PWD:/paddle paddle:dev @@ -57,25 +61,21 @@ Nothing else. Not even Python and GCC, because you can install all build tools - Docker or virtual machine? - Some people compare Docker with VMs, but Docker doesn't virtualize any hardware, and it doesn't run a guest OS. + Some people compare Docker with VMs, but Docker neither virtualizes any hardware nor runs a guest OS, which means there is little to no performance overhead. - Why Docker? - Using a Docker image of build tools standardize the building environment, and easier for others to reproduce your problem, if there is any, and help. + Using a Docker image of build tools standardizes the building environment, which makes it easier for others to reproduce your problems and to help. Also, some build tools don't run on Windows or Mac or BSD, but Docker runs almost everywhere, so developers can use whatever computer they want. -- Can I don't use Docker? +- Can I choose not to use Docker? - Sure, you don't have to install build tools into a Docker image; instead, you can install them onto your local computer. 
This document exists because Docker would make the development way easier. + Sure, you don't have to install build tools into a Docker image; instead, you can install them on your local computer. This document exists because Docker makes development much easier. - How difficult is it to learn Docker? - It takes you ten minutes to read [an introductory article](https://docs.docker.com/get-started) and saves you more than one hour to install all required build tools, configure them, and upgrade them when new versions of PaddlePaddle require some new tools. - -- Docker requires sudo - - An owner of a computer has the administrative privilege, a.k.a., sudo. If you use a shared computer for development, please ask the administrator to install and configure Docker. We will do our best to support rkt, another container technology that doesn't require sudo. + It takes about ten minutes to read [an introductory article](https://docs.docker.com/get-started), and doing so saves you the hour or more it would take to install and configure all required build tools, especially when new versions of PaddlePaddle require new tools. Not to mention the time saved when other people try to reproduce an issue you have. - Can I use my favorite IDE? @@ -93,8 +93,14 @@ Nothing else. Not even Python and GCC, because you can install all build tools - Does Docker do parallel building? - Our building Docker image runs a [Bash script](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh), which calls `make -j$(nproc)` to starts as many processes as the number of your processors. + Our building Docker image runs a [Bash script](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh), which calls `make -j$(nproc)` to start as many processes as you have CPU cores. 
+ +## Some Gotchas + +- Docker requires sudo + + An owner of a computer has the administrative privilege, a.k.a., sudo, and Docker requires this privilege to work properly. If you use a shared computer for development, please ask the administrator to install and configure Docker. We will do our best to support rkt, another container technology that doesn't require sudo. -- Docker on Windows/MacOS? +- Docker on Windows/MacOS builds slowly On Windows and MacOS, Docker containers run in a Linux VM. You might want to give this VM some more memory and CPUs so to make the building efficient. Please refer to [this issue](https://github.com/PaddlePaddle/Paddle/issues/627) for details. From 4b0235c1f2792cdecfe7d8f3e0bb1d0c57c6f361 Mon Sep 17 00:00:00 2001 From: Yi Wang Date: Fri, 25 Aug 2017 14:31:02 -0700 Subject: [PATCH 6/9] Update build.sh --- paddle/scripts/docker/build.sh | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/paddle/scripts/docker/build.sh b/paddle/scripts/docker/build.sh index 7bab814ae8..1798642022 100644 --- a/paddle/scripts/docker/build.sh +++ b/paddle/scripts/docker/build.sh @@ -63,12 +63,11 @@ cmake .. \ cat < Date: Fri, 25 Aug 2017 14:43:29 -0700 Subject: [PATCH 7/9] Run a specific test --- doc/howto/dev/build_cn.md | 6 ++++++ doc/howto/dev/build_en.md | 6 ++++++ 2 files changed, 12 insertions(+) diff --git a/doc/howto/dev/build_cn.md b/doc/howto/dev/build_cn.md index 0077d90118..79b4ff9d5a 100644 --- a/doc/howto/dev/build_cn.md +++ b/doc/howto/dev/build_cn.md @@ -53,6 +53,12 @@ docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest" ``` + 有时候我们只想运行一个特定的单元测试,比如 `memory_test`,我们可以 + + ```bash + docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest -V -R memory_test" + ``` + ## 为什么要 Docker 呀? - 什么是 Docker? diff --git a/doc/howto/dev/build_en.md b/doc/howto/dev/build_en.md index 95752beba0..e1b55929f9 100644 --- a/doc/howto/dev/build_en.md +++ b/doc/howto/dev/build_en.md @@ -53,6 +53,12 @@ Nothing else. 
Not even Python and GCC, because you can install all build tools docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest" ``` + To run a specific unit test, say `memory_test`, we can run + + ```bash + docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest -V -R memory_test" + ``` + ## Docker, Or Not? - What is Docker? From 852f341615808b6a5e6249b3b7c1f5f20fd22ec9 Mon Sep 17 00:00:00 2001 From: Yi Wang Date: Fri, 25 Aug 2017 16:48:52 -0700 Subject: [PATCH 8/9] Add clean build section --- doc/howto/dev/build_cn.md | 10 +++++++++- doc/howto/dev/build_en.md | 10 +++++++++- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/doc/howto/dev/build_cn.md b/doc/howto/dev/build_cn.md index 79b4ff9d5a..d9d520893f 100644 --- a/doc/howto/dev/build_cn.md +++ b/doc/howto/dev/build_cn.md @@ -56,7 +56,15 @@ 有时候我们只想运行一个特定的单元测试,比如 `memory_test`,我们可以 ```bash - docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest -V -R memory_test" + nvidia-docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest -V -R memory_test" + ``` + +5. 清理 + + 有时候我们会希望清理掉已经下载的第三方依赖以及已经编译的二进制文件。此时只需要: + + ```bash + rm -rf build ``` ## 为什么要 Docker 呀? diff --git a/doc/howto/dev/build_en.md b/doc/howto/dev/build_en.md index e1b55929f9..318bf3d384 100644 --- a/doc/howto/dev/build_en.md +++ b/doc/howto/dev/build_en.md @@ -56,7 +56,15 @@ Nothing else. Not even Python and GCC, because you can install all build tools To run a specific unit test, say `memory_test`, we can run ```bash - docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest -V -R memory_test" + nvidia-docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest -V -R memory_test" + ``` + +5. Clean Build. + + Sometimes we might want to remove all downloaded third-party dependencies and built binaries. To do so, just run + + ```bash + rm -rf build ``` ## Docker, Or Not? 
From ec5e20c9f12e89e13b52978b8bb27997c77f059c Mon Sep 17 00:00:00 2001 From: Yi Wang Date: Fri, 25 Aug 2017 17:14:28 -0700 Subject: [PATCH 9/9] Remove stopped containers and dangling images --- doc/howto/dev/build_cn.md | 18 +++++++++++------- doc/howto/dev/build_en.md | 4 ++++ 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/doc/howto/dev/build_cn.md b/doc/howto/dev/build_cn.md index d9d520893f..0b911f7b75 100644 --- a/doc/howto/dev/build_cn.md +++ b/doc/howto/dev/build_cn.md @@ -7,7 +7,7 @@ 1. 一台电脑,可以装的是 Linux, BSD, Windows 或者 MacOS 操作系统,以及 1. Docker。 -不需要其他任何软件了。即便是 Python 和 GCC 都不需要,因为我们会把所有编译工具都安装进一个 Docker image 里。 +不需要依赖其他任何软件了。即便是 Python 和 GCC 都不需要,因为我们会把所有编译工具都安装进一个 Docker image 里。 ## 总体流程 @@ -17,7 +17,7 @@ git clone https://github.com/paddlepaddle/paddle ``` -2. 安装工具 +2. 安装开发工具到 Docker image 里 ```bash cd paddle; docker build -t paddle:dev . @@ -30,13 +30,13 @@ 以下命令启动一个 Docker container 来执行 `paddle:dev` 这个 Docker image,同时把当前目录(源码树根目录)映射为 container 里的 `/paddle` 目录,并且运行 `Dockerfile` 描述的默认入口程序 [`build.sh`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh)。这个脚本调用 `cmake` 和 `make` 来编译 `/paddle` 里的源码,结果输出到 `/paddle/build`,也就是本地的源码树根目录里的 `build` 子目录。 ```bash - docker run -v $PWD:/paddle paddle:dev + docker run --rm -v $PWD:/paddle paddle:dev ``` 上述命令编译出一个 CUDA-enabled 版本。如果我们只需要编译一个只支持 CPU 的版本,可以用 ```bash - docker run -e WITH_GPU=OFF -v $PWD:/paddle paddle:dev + docker run --rm -e WITH_GPU=OFF -v $PWD:/paddle paddle:dev ``` 4. 
运行单元测试 @@ -44,19 +44,19 @@ 用本机的第一个 GPU 来运行包括 GPU 单元测试在内的所有单元测试: ```bash - NV_GPU=0 nvidia-docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest" + NV_GPU=0 nvidia-docker run --rm -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest" ``` 如果编译的时候我们用了 `WITH_GPU=OFF` 选项,那么编译过程只会产生 CPU-based 单元测试,那么我们也就不需要 nvidia-docker 来运行单元测试了。我们只需要: ```bash - docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest" + docker run --rm -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest" ``` 有时候我们只想运行一个特定的单元测试,比如 `memory_test`,我们可以 ```bash - nvidia-docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest -V -R memory_test" + nvidia-docker run --rm -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest -V -R memory_test" ``` 5. 清理 @@ -118,3 +118,7 @@ - 在 Windows/MacOS 上编译很慢 Docker 在 Windows 和 MacOS 都可以运行。不过实际上是运行在一个 Linux 虚拟机上。可能需要注意给这个虚拟机多分配一些 CPU 和内存,以保证编译高效。具体做法请参考[这个issue](https://github.com/PaddlePaddle/Paddle/issues/627)。 + +- 磁盘不够 + + 本文中的例子里,`docker run` 命令里都用了 `--rm` 参数,这样保证运行结束之后的 containers 不会保留在磁盘上。可以用 `docker ps -a` 命令看到停止后但是没有删除的 containers。`docker build` 命令有时候会产生一些中间结果,是没有名字的 images,也会占用磁盘。可以参考[这篇文章](https://zaiste.net/posts/removing_docker_containers/)来清理这些内容。 diff --git a/doc/howto/dev/build_en.md b/doc/howto/dev/build_en.md index 318bf3d384..d0048e3714 100644 --- a/doc/howto/dev/build_en.md +++ b/doc/howto/dev/build_en.md @@ -118,3 +118,7 @@ Nothing else. Not even Python and GCC, because you can install all build tools - Docker on Windows/MacOS builds slowly On Windows and MacOS, Docker containers run in a Linux VM. You might want to give this VM some more memory and CPUs so to make the building efficient. Please refer to [this issue](https://github.com/PaddlePaddle/Paddle/issues/627) for details. + +- Not enough disk space + + Examples in this article uses option `--rm` with the `docker run` command. This option ensures that stopped containers do not exist on hard disks. 
We can use `docker ps -a` to list all containers, including stopped ones. Sometimes `docker build` leaves behind intermediate dangling images, which also take up disk space. To clean them up, please refer to [this article](https://zaiste.net/posts/removing_docker_containers/).
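
As a rough illustration of the cleanup described above, here is a minimal sketch. It assumes Docker 1.13 or later, where the `container prune` and `image prune` subcommands are available. The `DOCKER` variable and the function names are hypothetical helpers introduced only for this example; they are not part of PaddlePaddle or of the linked article.

```bash
#!/usr/bin/env bash
# DOCKER is a hypothetical override hook: set DOCKER=echo for a dry run
# that prints the arguments instead of calling Docker.
DOCKER="${DOCKER:-docker}"

# List containers that have exited but were not removed.
list_stopped_containers() {
  "$DOCKER" ps -a --filter status=exited
}

# Remove all stopped containers (-f skips the confirmation prompt).
remove_stopped_containers() {
  "$DOCKER" container prune -f
}

# Remove dangling (untagged) images, such as intermediate results of `docker build`.
remove_dangling_images() {
  "$DOCKER" image prune -f
}
```

After sourcing these definitions, calling `list_stopped_containers` shows what would be removed, and the two `remove_*` functions reclaim the disk space. Both prune commands delete data permanently, so it may be worth trying them with `DOCKER=echo` first.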