commit
45ced9da46
@ -0,0 +1,124 @@
# Build PaddlePaddle from Source Code and Run Unit Test

## What Developers Need

To contribute to PaddlePaddle, you need

1. A computer running Linux, BSD, Windows, or MacOS, and
1. Docker.

Nothing else. Not even Python or GCC, because all build tools are installed into a Docker image, and we run them by running this image.

## General Process

1. Retrieve source code.

   ```bash
   git clone https://github.com/paddlepaddle/paddle
   ```

2. Install build tools into a Docker image.

   ```bash
   cd paddle; docker build -t paddle:dev .
   ```

   Note the `.` at the end of the command: it sets the build context to the current directory, which contains the [`./Dockerfile` file](https://github.com/PaddlePaddle/Paddle/blob/develop/Dockerfile). `docker build` follows the instructions in this file to create a Docker image named `paddle:dev` and installs the build tools into it.

3. Build from source.

   The following command starts a Docker container from the image `paddle:dev`, maps the current directory to `/paddle/` in the container, and runs the default entry point [`build.sh`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh) as specified in the Dockerfile. `build.sh` invokes `cmake` and `make` to build the PaddlePaddle source code mapped to `/paddle`, and writes outputs to `/paddle/build`, which maps to `build` in the current source directory on the host.

   ```bash
   docker run -v $PWD:/paddle paddle:dev
   ```

   The above command builds a CUDA-enabled version. To build a CPU-only version, run

   ```bash
   docker run -e WITH_GPU=OFF -v $PWD:/paddle paddle:dev
   ```

4. Run unit tests.

   To run all unit tests using the first GPU of a node:

   ```bash
   NV_GPU=0 nvidia-docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest"
   ```

   If we set `WITH_GPU=OFF` at build time, only CPU-based unit tests are generated, and we don't need nvidia-docker to run them. We can just run

   ```bash
   docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest"
   ```

   To run a specific unit test, say `memory_test`, we can run

   ```bash
   nvidia-docker run -v $PWD:/paddle paddle:dev bash -c "cd /paddle/build; ctest -V -R memory_test"
   ```

5. Clean Build.

   Sometimes we might want to remove all third-party dependencies and built binaries. To do so, just run

   ```bash
   rm -rf build
   ```
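
The `docker run` invocations in steps 3 and 4 differ only in the launcher (`docker` or `nvidia-docker`), the optional `WITH_GPU=OFF` setting, and the optional command passed to `bash -c`. As a rough sketch, a small Python helper could assemble them. The function name `docker_run_args` and the path `/home/me/paddle` are hypothetical and not part of PaddlePaddle; the helper only builds the argument list and does not start Docker.

```python
import shlex


def docker_run_args(source_dir, with_gpu=True, nvidia=False, inner_command=None):
    """Assemble one of the `docker run` command lines from the steps above.

    Hypothetical helper: it returns the argument list only.
    """
    args = ["nvidia-docker" if nvidia else "docker", "run"]
    if not with_gpu:
        # Step 3: pass WITH_GPU=OFF into the container for a CPU-only build.
        args += ["-e", "WITH_GPU=OFF"]
    # Map the source tree on the host to /paddle in the container.
    args += ["-v", "{}:/paddle".format(source_dir), "paddle:dev"]
    if inner_command is not None:
        # Step 4: run a command of our own instead of the default one.
        args += ["bash", "-c", inner_command]
    return args


cpu_build = docker_run_args("/home/me/paddle", with_gpu=False)
one_test = docker_run_args(
    "/home/me/paddle",
    nvidia=True,
    inner_command="cd /paddle/build; ctest -V -R memory_test",
)
print(" ".join(shlex.quote(a) for a in cpu_build))
print(" ".join(shlex.quote(a) for a in one_test))
```

To actually run a command, the returned list can be passed to `subprocess.run(args, check=True)`.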

## Docker, Or Not?

- What is Docker?

  If you haven't heard of it, consider it something like Python's virtualenv.

- Docker or virtual machine?

  Some people compare Docker with VMs, but Docker neither virtualizes hardware nor runs a guest OS, so there is no compromise on performance.

- Why Docker?

  Using a Docker image of build tools standardizes the build environment, which makes it easier for others to reproduce your problems and to help.

  Also, some build tools don't run on Windows, Mac, or BSD, but Docker runs almost everywhere, so developers can use whatever computer they want.

- Can I choose not to use Docker?

  Sure, you don't have to install build tools into a Docker image; instead, you can install them on your local computer. This document exists because Docker makes development much easier.

- How difficult is it to learn Docker?

  It takes about ten minutes to read [an introductory article](https://docs.docker.com/get-started), and that saves you more than an hour of installing and configuring all required build tools, especially when new versions of PaddlePaddle require new tools. Not to mention the time saved when other people try to reproduce an issue you have.

- Can I use my favorite IDE?

  Yes, of course. The source code resides on your local computer, and you can edit it using whatever editor you like.

  Many PaddlePaddle developers use Emacs. They add the following few lines to their `~/.emacs` configuration file:

  ```emacs
  (global-set-key "\C-cc" 'compile)
  (setq compile-command
        "docker run --rm -it -v $(git rev-parse --show-toplevel):/paddle paddle:dev")
  ```

  so they can type `Ctrl-C` and `c` to build PaddlePaddle from source.

- Does Docker do parallel building?

  Our build Docker image runs a [Bash script](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/build.sh), which calls `make -j$(nproc)` to start as many processes as you have CPU cores.

## Some Gotchas

- Docker requires sudo

  The owner of a computer has administrative privileges, a.k.a. sudo, and Docker requires these privileges to work properly. If you use a shared computer for development, please ask the administrator to install and configure Docker. We will do our best to support rkt, another container technology that doesn't require sudo.

- Docker on Windows/MacOS builds slowly

  On Windows and MacOS, Docker containers run in a Linux VM. You might want to give this VM more memory and CPUs to make the build efficient. Please refer to [this issue](https://github.com/PaddlePaddle/Paddle/issues/627) for details.

- Not enough disk space

  Examples in this article use the `--rm` option with the `docker run` command. This option ensures that stopped containers are removed and do not remain on disk. We can use `docker ps -a` to list all containers, including stopped ones. Sometimes `docker build` generates intermediate dangling images, which also take disk space. To clean them up, please refer to [this article](https://zaiste.net/posts/removing_docker_containers/).
@ -0,0 +1,86 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License. */

#include "paddle/operators/scatter_op.h"
#include "paddle/framework/ddim.h"

namespace paddle {
namespace operators {

class ScatterOp : public framework::OperatorWithKernel {
 public:
  using framework::OperatorWithKernel::OperatorWithKernel;

 protected:
  void InferShape(const framework::InferShapeContext &ctx) const override {
    PADDLE_ENFORCE_EQ(ctx.Input<Tensor>("Index")->dims().size(), 1,
                      "Update Index should be 1-D.");
    PADDLE_ENFORCE_EQ(ctx.Input<Tensor>("Ref")->dims().size(),
                      ctx.Input<Tensor>("Updates")->dims().size(),
                      "Reference and Updates should have the same shape size");
    PADDLE_ENFORCE_EQ(ctx.Input<Tensor>("Updates")->dims()[0],
                      ctx.Input<Tensor>("Index")->dims()[0],
                      "Updates and Index should have same batch-size.");
    framework::DDim data_dim(ctx.Input<Tensor>("Updates")->dims());
    // The trailing dimensions of Updates must match those of Ref.
    for (int i = 1; i < data_dim.size(); ++i)
      PADDLE_ENFORCE_EQ(data_dim[i], ctx.Input<Tensor>("Ref")->dims()[i]);
    ctx.Output<Tensor>("Out")->Resize(ctx.Input<Tensor>("Ref")->dims());
  }
};

class ScatterGradOp : public framework::OperatorWithKernel {
 public:
  using framework::OperatorWithKernel::OperatorWithKernel;

 protected:
  void InferShape(const framework::InferShapeContext &ctx) const override {
    auto *dUpdates = ctx.Output<Tensor>(framework::GradVarName("Updates"));
    auto *Updates = ctx.Input<Tensor>("Updates");
    auto *dRef = ctx.Output<Tensor>(framework::GradVarName("Ref"));
    auto *Ref = ctx.Input<Tensor>("Ref");

    dRef->Resize(Ref->dims());
    dUpdates->Resize(Updates->dims());
  }
};

class ScatterOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  ScatterOpMaker(framework::OpProto *proto,
                 framework::OpAttrChecker *op_checker)
      : OpProtoAndCheckerMaker(proto, op_checker) {
    AddInput("Ref", "The source input of scatter op");
    AddInput("Index",
             "The index input of scatter op where Ref will be updated");
    AddInput("Updates", "The update values of scatter op");
    AddOutput("Out", "The output of scatter op");
    AddComment(R"DOC(
Scatter Operator by selecting from the first axis,

Out = Ref
Out[Index] = Ref[Index] + Updates
)DOC");
  }
};
}  // namespace operators
}  // namespace paddle

namespace ops = paddle::operators;
REGISTER_OP(scatter, ops::ScatterOp, ops::ScatterOpMaker, scatter_grad,
            ops::ScatterGradOp);
REGISTER_OP_CPU_KERNEL(scatter,
                       ops::ScatterOpKernel<paddle::platform::CPUPlace, float>);
REGISTER_OP_CPU_KERNEL(
    scatter_grad,
    ops::ScatterGradientOpKernel<paddle::platform::CPUPlace, float>);
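
The operator comment above defines the forward pass as `Out = Ref` followed by `Out[Index] = Ref[Index] + Updates`, and `InferShape` checks the input shapes before that. A minimal NumPy sketch of the same semantics follows. The function name `scatter_reference` is hypothetical, and the sketch copies `Ref` rather than sharing its buffer as the C++ kernel does.

```python
import numpy as np


def scatter_reference(ref, index, updates):
    """NumPy model of the scatter op: Out = Ref; Out[Index] += Updates."""
    # Shape checks modelled on ScatterOp::InferShape.
    assert index.ndim == 1, "Index should be 1-D"
    assert ref.ndim == updates.ndim, "Ref and Updates should have the same rank"
    assert updates.shape[0] == index.shape[0], "one row of Updates per index"
    assert updates.shape[1:] == ref.shape[1:], "trailing dimensions should match"

    out = ref.copy()
    # Same expression as in the unit test. With repeated indices, NumPy applies
    # only one update per row here; how the C++ kernel treats repeated indices
    # is not shown in this diff.
    out[index] += updates
    return out


ref = np.ones((3, 3), dtype="float32")
index = np.array([1, 2], dtype="int32")
updates = np.arange(6, dtype="float32").reshape(2, 3)
out = scatter_reference(ref, index, updates)
print(out)
```

Row 0 of `out` is untouched because it does not appear in `index`; rows 1 and 2 receive the two rows of `updates`.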
@ -0,0 +1,20 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License. */

#define EIGEN_USE_GPU
#include "paddle/operators/scatter_op.h"

namespace ops = paddle::operators;
REGISTER_OP_GPU_KERNEL(scatter,
                       ops::ScatterOpKernel<paddle::platform::GPUPlace, float>);
@ -0,0 +1,60 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License. */

#pragma once
#include "gather.h"
#include "paddle/framework/eigen.h"
#include "paddle/framework/op_registry.h"
#include "scatter.h"

namespace paddle {
namespace operators {

using Tensor = framework::Tensor;

template <typename Place, typename T>
class ScatterOpKernel : public framework::OpKernel {
 public:
  void Compute(const framework::ExecutionContext &ctx) const override {
    auto *Ref = ctx.Input<Tensor>("Ref");
    auto *Index = ctx.Input<Tensor>("Index");
    auto *Updates = ctx.Input<Tensor>("Updates");
    auto *Out = ctx.Output<Tensor>("Out");

    // In place output: Out = Ref, Out[Index] += Updates
    Out->ShareDataWith<T>(*Ref);
    // Apply ScatterUpdate: Out[index] += Updates[:]
    ScatterUpdate<T>(ctx.GetPlace(), Updates, Index, Out);
  }
};

template <typename Place, typename T>
class ScatterGradientOpKernel : public framework::OpKernel {
 public:
  void Compute(const framework::ExecutionContext &ctx) const override {
    auto *dRef = ctx.Output<Tensor>(framework::GradVarName("Ref"));
    auto *dUpdates = ctx.Output<Tensor>(framework::GradVarName("Updates"));
    auto *Index = ctx.Input<Tensor>("Index");
    auto *dOut = ctx.Input<Tensor>(framework::GradVarName("Out"));

    // In place gradient: dRef = dO
    dRef->ShareDataWith<T>(*dOut);
    dUpdates->mutable_data<T>(ctx.GetPlace());
    // Gradient by Gather: dUpdates = dO[Index]
    Gather<T>(ctx.GetPlace(), dOut, Index, dUpdates);
  }
};

}  // namespace operators
}  // namespace paddle
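
The gradient kernel above follows directly from the forward definition: every element of `Ref` passes through to `Out` unchanged, so `dRef = dOut`, and row `k` of `Updates` is added to row `Index[k]` of `Out`, so `dUpdates[k] = dOut[Index[k]]`. A minimal NumPy sketch, with the hypothetical name `scatter_grad_reference`, and with a copy where the C++ kernel shares the buffer of `dOut`:

```python
import numpy as np


def scatter_grad_reference(d_out, index):
    """NumPy model of the scatter gradient: dRef = dOut, dUpdates = dOut[Index]."""
    d_ref = d_out.copy()      # identity path from Ref to Out
    d_updates = d_out[index]  # gather the rows that received an update
    return d_ref, d_updates


d_out = np.arange(9, dtype="float32").reshape(3, 3)
index = np.array([1, 2], dtype="int32")
d_ref, d_updates = scatter_grad_reference(d_out, index)
print(d_ref)
print(d_updates)
```

Here `d_updates` consists of rows 1 and 2 of `d_out`, in the order given by `index`.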
@ -0,0 +1,38 @@
import unittest
from op_test_util import OpTestMeta
from gradient_checker import GradientChecker, create_op
import numpy
import paddle.v2.framework.core as core
from paddle.v2.framework.op import Operator


class TestScatterOp(unittest.TestCase):
    __metaclass__ = OpTestMeta

    def setUp(self):
        self.type = "scatter"
        ref_np = numpy.ones((3, 3)).astype("float32")
        index_np = numpy.array([1, 2]).astype("int32")
        updates_np = numpy.random.random((2, 3)).astype("float32")
        output_np = numpy.copy(ref_np)
        output_np[index_np] += updates_np
        self.inputs = {'Ref': ref_np, 'Index': index_np, 'Updates': updates_np}
        self.outputs = {'Out': output_np}


class TestScatterGradOp(GradientChecker):
    def test_scatter_grad(self):
        op = create_op("scatter")
        # test data setup
        ref_np = numpy.ones((3, 10)).astype("float32")
        index_np = numpy.array([1, 2]).astype("int32")
        updates_np = numpy.random.random((2, 10)).astype("float32")
        output_np = numpy.copy(ref_np)
        output_np[index_np] += updates_np
        inputs = {'Ref': ref_np, 'Index': index_np, 'Updates': updates_np}
        self.check_grad(
            op, inputs, set(["Updates", "Ref"]), "Out", in_place=True)


if __name__ == "__main__":
    unittest.main()
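
The gradient test above delegates to `GradientChecker`, whose internals are not part of this diff. As a rough illustration of the general idea only, and not of the actual `GradientChecker` implementation, the analytic gradients `dRef = dOut` and `dUpdates = dOut[Index]` can be compared against central differences in plain NumPy. The names `scatter_forward`, `numeric_grad`, `weights`, and `loss` are hypothetical, and the sketch uses float64 to keep the finite differences accurate.

```python
import numpy as np


def scatter_forward(ref, index, updates):
    """Out = Ref; Out[Index] += Updates (same expression as the unit test)."""
    out = ref.copy()
    out[index] += updates
    return out


def numeric_grad(loss, x, eps=1e-6):
    """Central-difference estimate of d loss / d x; x is perturbed in place."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        saved = x[idx]
        x[idx] = saved + eps
        plus = loss()
        x[idx] = saved - eps
        minus = loss()
        x[idx] = saved  # restore the original value
        grad[idx] = (plus - minus) / (2 * eps)
    return grad


rng = np.random.default_rng(0)
ref = np.ones((3, 10))
index = np.array([1, 2])
updates = rng.random((2, 10))
weights = rng.random((3, 10))  # plays the role of dOut


def loss():
    # Scalar loss whose gradient with respect to Out is `weights`.
    return float(np.sum(scatter_forward(ref, index, updates) * weights))


d_ref = numeric_grad(loss, ref)
d_updates = numeric_grad(loss, updates)
print(np.allclose(d_ref, weights, atol=1e-6))
print(np.allclose(d_updates, weights[index], atol=1e-6))
```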