Merge branch 'develop' into fix_neon_memory_out_of_range

update-doc-pybind
Liu Yiqun 8 years ago
commit a310946f8f

@ -51,19 +51,19 @@ Please refer to our [release announcement](https://github.com/PaddlePaddle/Paddl
- **Connected to Products**
In addition, PaddlePaddle is also designed to be easily deployable. At Baidu,
PaddlePaddle has been deployed into products or service with a vast number
PaddlePaddle has been deployed into products and services with a vast number
of users, including ad click-through rate (CTR) prediction, large-scale image
classification, optical character recognition(OCR), search ranking, computer
virus detection, recommendation, etc. It is widely utilized in products at
Baidu and it has achieved a significant impact. We hope you can also exploit
the capability of PaddlePaddle to make a huge impact for your product.
Baidu and it has achieved a significant impact. We hope you can also explore
the capability of PaddlePaddle to make an impact on your product.
## Installation
It is recommended to check out the
[Docker installation guide](http://doc.paddlepaddle.org/develop/doc/getstarted/build_and_install/docker_install_en.html)
before looking into the
[build from source guide](http://doc.paddlepaddle.org/develop/doc/getstarted/build_and_install/build_from_source_en.html)
[build from source guide](http://doc.paddlepaddle.org/develop/doc/getstarted/build_and_install/build_from_source_en.html).
## Documentation
@ -72,7 +72,7 @@ We provide [English](http://doc.paddlepaddle.org/develop/doc/) and
- [Deep Learning 101](http://book.paddlepaddle.org/index.html)
You might want to start from this online interactive book that can run in Jupyter Notebook.
You might want to start from this online interactive book that can run in a Jupyter Notebook.
- [Distributed Training](http://doc.paddlepaddle.org/develop/doc/howto/usage/cluster/cluster_train_en.html)

@ -0,0 +1,73 @@
# Design for TensorArray
`TensorArray` is a concept borrowed from TensorFlow.
It is meant to be used with dynamic iteration primitives such as `while_loop` and `map_fn`.
It can support our new design of dynamic operations and help refactor existing layers that handle variable-length sentences,
such as `RecurrentGradientMachine`.
In [our design for dynamic RNN](https://github.com/PaddlePaddle/Paddle/pull/4401),
`TensorArray` is used to segment inputs and store the states of all time steps.
Because it provides methods similar to those of a C++ array,
state-based dynamic models such as RNNs can be defined more naturally and flexibly.
## Dynamic-Related Methods
The following basic methods are proposed:
### stack()
Pack the values in a `TensorArray` into a tensor with rank one higher than each tensor in `values`.
### unstack(axis=0)
Unpack the given dimension of a rank-`R` tensor into rank-`(R-1)` tensors.
### concat()
Return the values in the `TensorArray` as a concatenated Tensor.
### write(index, value, data_shared=true)
Write `value` at location `index` of the `TensorArray`.
### read(index)
Read the value at location `index` in the `TensorArray`.
### size()
Return the number of values.
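
To make the intended semantics of these methods concrete, here is a minimal sketch. It is not the proposed implementation: `ToyTensor` and `ToyTensorArray` are hypothetical names, a tensor is modeled as a flat `std::vector<float>`, and the `data_shared` flag is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor: a flat buffer of floats.
using ToyTensor = std::vector<float>;

// Illustrative only; not the TensorArray proposed in this design.
class ToyTensorArray {
 public:
  // Write `value` at `index`, growing the array if needed.
  void write(std::size_t index, const ToyTensor& value) {
    if (index >= values_.size()) values_.resize(index + 1);
    values_[index] = value;
  }

  // Read the value stored at `index`.
  const ToyTensor& read(std::size_t index) const {
    assert(index < values_.size());
    return values_[index];
  }

  // Number of stored values.
  std::size_t size() const { return values_.size(); }

  // Stack equally sized values into one buffer laid out as [size(), len],
  // i.e. the result has one more dimension than each element.
  ToyTensor stack() const {
    ToyTensor out;
    for (const auto& v : values_) {
      assert(v.size() == values_.front().size());
      out.insert(out.end(), v.begin(), v.end());
    }
    return out;
  }

 private:
  std::vector<ToyTensor> values_;
};
```

In this toy model, `write(step, x)` followed by `read(step)` returns `x`, and `stack()` lays the per-step values out contiguously in step order.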
## LoDTensor-related Supports
The `RecurrentGradientMachine` in Paddle serves as a flexible RNN layer that takes variable-length sequences as input.
Because each RNN step can only take a tensor-represented batch of data as input,
the inputs need some preprocessing, such as sorting the sentences by length in descending order, cutting them into words, and packing the words into new batches.
Such cut-like operations can be embedded into `TensorArray` as general methods called `unpack` and `pack`.
With these two methods, an RNN over variable-length sentences can be implemented as follows:
```c++
// sentence_input is the variable-length data
LoDTensor sentence_input(xxx);
TensorArray ta;
Tensor indice_map;
Tensor boot_state = xxx; // to initialize rnn's first state
TensorArray::unpack(sentence_input, 1/*level*/, true/*sort_by_length*/, &ta, &indice_map);
TensorArray step_outputs;
TensorArray states;
for (int step = 0; step < ta.size(); step++) {
  auto state = states.read(step);
  // rnnstep is a function which acts like a step of RNN
  auto step_input = ta.read(step);
  auto step_output = rnnstep(step_input, state);
  step_outputs.write(step, step_output, true/*data_shared*/);
}
// rnn_output is the final output of an rnn
LoDTensor rnn_output = step_outputs.pack(1/*level*/, indice_map);
```
The code above shows that by embedding the LoDTensor-related preprocessing operations into `TensorArray`,
the implementation of an RNN that supports variable-length sentences is far more concise than `RecurrentGradientMachine`, because the latter mixes all the code together and is hard to read and extend.
Some details are as follows.
### unpack(level, sort_by_length)
Split a LoDTensor at the given `level` and generate batches; if `sort_by_length` is set, the sequences are sorted by length first.
Returns:
- a new `TensorArray`, whose values are LoDTensors that represent batches of data.
- an int32 Tensor, which stores the map from the new batches' indices to the original LoDTensor.
### pack(level, indices_map)
Recover the original LoD-arranged LoDTensor from the values in a `TensorArray`, using `level` and `indices_map`.
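
The sort-and-cut idea behind `unpack` and `pack` can be sketched on plain vectors. This is only an illustration under simplifying assumptions: sentences are `std::vector<int>` rather than LoDTensors, there is a single LoD level, and the names `Unpack`, `Pack`, `Unpacked`, `Sequence`, and `Batch` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

using Sequence = std::vector<int>;  // one sentence, one int per word
using Batch = std::vector<int>;     // the words consumed at one time step

struct Unpacked {
  // steps[t] holds word t of every sequence that is still running at step t.
  std::vector<Batch> steps;
  // index_map[i] is the original index of the i-th longest sequence.
  std::vector<std::size_t> index_map;
};

// Sort sequences by length (descending), then cut them step by step.
Unpacked Unpack(const std::vector<Sequence>& seqs) {
  Unpacked out;
  out.index_map.resize(seqs.size());
  std::iota(out.index_map.begin(), out.index_map.end(), 0);
  std::stable_sort(out.index_map.begin(), out.index_map.end(),
                   [&](std::size_t a, std::size_t b) {
                     return seqs[a].size() > seqs[b].size();
                   });
  std::size_t max_len = seqs.empty() ? 0 : seqs[out.index_map[0]].size();
  for (std::size_t t = 0; t < max_len; ++t) {
    Batch batch;
    for (std::size_t i : out.index_map) {
      if (seqs[i].size() <= t) break;  // sorted, so all later ones are shorter
      batch.push_back(seqs[i][t]);
    }
    out.steps.push_back(batch);
  }
  return out;
}

// Undo Unpack: position i in every batch belongs to sequence index_map[i].
std::vector<Sequence> Pack(const Unpacked& u) {
  std::vector<Sequence> seqs(u.index_map.size());
  for (const Batch& batch : u.steps) {
    for (std::size_t i = 0; i < batch.size(); ++i) {
      seqs[u.index_map[i]].push_back(batch[i]);
    }
  }
  return seqs;
}
```

Because the sequences are sorted by length, each step's batch is a prefix of the sorted order, so the batch shrinks over time and the index map alone is enough to restore the original arrangement.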

@ -182,7 +182,7 @@ Note that **different devices (CPU, GPU)share an Op definition; whether or not t
`MulOp`'s CPU and GPU share the same `Kernel`. A non-sharing `OpKernel` example can be seen in [`OnehotCrossEntropyOpKernel`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/cross_entropy_op.h#L43).
To ease the writing of `OpKernel` compute, and for reusing code cross-device, `Eigen unsupported Tensor` module is used to implement `Compute` interface. To learn about how the Eigen library is used in PaddlePaddle, please see [usage document](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/howto/dev/use_eigen_cn.md).
To ease the writing of `OpKernel` compute, and for reusing code cross-device, [`Eigen-unsupported Tensor`](https://bitbucket.org/eigen/eigen/src/default/unsupported/Eigen/CXX11/src/Tensor/README.md?fileviewer=file-view-default) module is used to implement `Compute` interface. To learn about how the Eigen library is used in PaddlePaddle, please see [usage document](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/howto/dev/use_eigen_cn.md).
This concludes the forward implementation of an operator. Next its operation and kernel need to be registered in a `.cc` file.

@ -0,0 +1,146 @@
## How to use Eigen in Paddle
Essentially, a neural network is a compute graph. The data needed for the computation is stored in `Tensor`s and its computation procedure is described by `Operator`s. An `Operator` calls the `Compute` interface in its corresponding `OpKernel` and operates on the `Tensor`.
### Eigen Tensor Module
The Eigen Tensor module supports powerful element-wise computation. In addition, a piece of code written using it can be run on both the CPU and the GPU.
Note that Eigen Tensor is still being actively developed, so its test coverage is incomplete and its documentation may be sparse.
For details on Eigen Tensor module, please see [doc 1](https://github.com/RLovelett/eigen/blob/master/unsupported/Eigen/CXX11/src/Tensor/README.md) and [doc 2](https://bitbucket.org/eigen/eigen/src/default/unsupported/Eigen/CXX11/src/Tensor/README.md).
### paddle::framework::Tensor
Paddle's Tensor is defined in the framework directory with the following interface:
```cpp
class Tensor {
public:
/*! Return a pointer to mutable memory block. */
template <typename T>
inline T* data();
/**
* @brief Return a pointer to mutable memory block.
* @note If not exist, then allocation.
*/
template <typename T>
inline T* mutable_data(platform::Place place);
/**
* @brief Return a pointer to mutable memory block.
*
* @param[in] dims The dimensions of the memory block.
* @param[in] place The place of the memory block.
*
* @note If not exist, then allocation.
*/
template <typename T>
inline T* mutable_data(DDim dims, platform::Place place);
/*! Resize the dimensions of the memory block. */
inline Tensor& Resize(const DDim& dims);
/*! Return the dimensions of the memory block. */
inline const DDim& dims() const;
private:
/*! holds the memory block if allocated. */
std::shared_ptr<Placeholder> holder_;
/*! points to dimensions of memory block. */
DDim dim_;
};
```
`Placeholder` is used to delay memory allocation; that is, we can first define a tensor, use `Resize` to configure its shape, and then call `mutable_data` to allocate the actual memory.
```cpp
paddle::framework::Tensor t;
paddle::platform::CPUPlace place;
// set size first
t.Resize({2, 3});
// allocate memory on CPU later
t.mutable_data<float>(place);
```
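
The delayed-allocation pattern can be illustrated without Paddle. The sketch below is a hypothetical toy (`LazyTensor` is not `paddle::framework::Tensor`); it assumes CPU memory only and a fixed `float` element type, and uses a `std::shared_ptr` to a buffer in the role that `holder_` plays above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical toy, not paddle::framework::Tensor.
class LazyTensor {
 public:
  // Record the shape only; no memory is touched here.
  LazyTensor& Resize(const std::vector<std::size_t>& dims) {
    dims_ = dims;
    return *this;
  }

  // Allocate on first use, or again if the current block is too small.
  float* mutable_data() {
    std::size_t n = numel();
    if (!holder_ || holder_->size() < n) {
      holder_ = std::make_shared<std::vector<float>>(n);
    }
    return holder_->data();
  }

  // True once a memory block has been allocated.
  bool allocated() const { return holder_ != nullptr; }

  // Product of all dimensions.
  std::size_t numel() const {
    std::size_t n = 1;
    for (std::size_t d : dims_) n *= d;
    return n;
  }

 private:
  std::shared_ptr<std::vector<float>> holder_;  // memory block, if allocated
  std::vector<std::size_t> dims_;               // shape set by Resize
};
```

In this toy, calling `Resize({2, 3})` leaves `allocated()` false; the first `mutable_data()` call allocates six floats, and later calls return the same pointer as long as the shape still fits.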
### paddle::framework::Tensor Usage
`AddOp` demonstrates Tensor's usage.
- InferShape
When computing a neural network's compute graph, first call every `Operator`'s `InferShape` method, and use `Resize` to configure the size of the output tensor.
```cpp
void InferShape(const framework::InferShapeContext &ctx) const override {
PADDLE_ENFORCE_EQ(ctx.Input<Tensor>("X")->dims(),
ctx.Input<Tensor>("Y")->dims(),
"Two input of Add Op's dimension must be same.");
ctx.Output<Tensor>("Out")->Resize(ctx.Input<Tensor>("X")->dims());
}
```
- Run
```cpp
void Compute(const framework::ExecutionContext& context) const override {
auto* input0 = context.Input<Tensor>("X");
auto* input1 = context.Input<Tensor>("Y");
auto* output = context.Output<Tensor>("Out");
output->mutable_data<T>(context.GetPlace());
auto x = EigenVector<T>::Flatten(*input0);
auto y = EigenVector<T>::Flatten(*input1);
auto z = EigenVector<T>::Flatten(*output);
auto place = context.GetEigenDevice<Place>();
z.device(place) = x + y;
}
```
### Transformation from paddle::framework::Tensor to EigenTensor
As shown above, in actual computation, we need to transform the input and output `Tensor`s into formats Eigen supports. [eigen.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/eigen.h) provides functions that implement the transformation from `paddle::framework::Tensor` to `EigenTensor/EigenMatrix/EigenVector/EigenScalar`.
Using EigenTensor as an example:
```cpp
Tensor t;
float* p = t.mutable_data<float>(make_ddim({1, 2, 3}), platform::CPUPlace());
for (int i = 0; i < 1 * 2 * 3; i++) {
p[i] = static_cast<float>(i);
}
EigenTensor<float, 3>::Type et = EigenTensor<float, 3>::From(t);
```
`From` is an interfacing method provided by the EigenTensor template, which implements the transformation from a `paddle::framework::Tensor` object to an EigenTensor. Since `rank` is a template parameter, it needs to be explicitly specified at the time of the transformation.
In Eigen, tensors with different ranks are different types, with `Vector` being a rank-1 instance. Note that `EigenVector<T>::From` transforms a 1-dimensional Paddle tensor into a 1-dimensional Eigen tensor, while `EigenVector<T>::Flatten` reshapes a Paddle tensor of any rank and flattens it into a 1-dimensional Eigen tensor. Both resulting tensors are still typed EigenVector.
For more transformations, see the [unit tests](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/eigen_test.cc) in the `eigen_test.cc` file.
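
The difference between `From` and `Flatten`, and the reason the rank must be given as a template argument, can be sketched with a toy view type. This is not Eigen or Paddle code: `ToyTensor`, `View`, `VectorView`, and the free function `Flatten` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical toy tensor: runtime shape plus flat storage.
struct ToyTensor {
  std::vector<std::size_t> dims;
  std::vector<float> data;
};

// A non-owning view whose rank is part of its type.
template <std::size_t Rank>
struct View {
  std::array<std::size_t, Rank> dims;
  float* data;

  // From: the tensor's runtime rank must already equal Rank.
  static View From(ToyTensor& t) {
    assert(t.dims.size() == Rank);
    View v;
    for (std::size_t i = 0; i < Rank; ++i) v.dims[i] = t.dims[i];
    v.data = t.data.data();
    return v;
  }
};

// Views of different rank are unrelated types.
static_assert(!std::is_same<View<1>, View<3>>::value,
              "ranks are distinct types");

using VectorView = View<1>;

// Flatten: accept any rank and expose all elements as one dimension.
// The tensor's own dims are left untouched.
inline VectorView Flatten(ToyTensor& t) {
  VectorView v;
  v.dims[0] = t.data.size();
  v.data = t.data.data();
  return v;
}
```

Since the views do not own memory, writing through a flattened view changes the tensor's data while its recorded shape stays the same.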
### Implementing Computation
During computation, the EigenTensor on the left-hand side of an assignment needs the device interface (`.device(place)`). Note that computation between EigenTensors only changes the data held by the underlying Tensor; it does not change any of the Tensor's shape information.
```cpp
auto x = EigenVector<T>::Flatten(*input0);
auto y = EigenVector<T>::Flatten(*input1);
auto z = EigenVector<T>::Flatten(*output);
auto place = context.GetEigenDevice<Place>();
z.device(place) = x + y;
```
In this code segment, input0/input1/output can be Tensors of arbitrary dimension. We are calling Flatten from EigenVector, transforming a tensor of any dimension into a 1-dimensional EigenVector. After completing computation, input0/input1/output will retain the same shape information, and they can be resized using the `Resize` interface.
Because the Eigen Tensor module is under-documented, please refer to `OpKernel`'s computation code in TensorFlow's [kernel module documentation](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/kernels).

@ -72,6 +72,22 @@ bool operator==(const LoD& a, const LoD& b) {
return true;
}
size_t LoDTensor::NumElements(size_t level, size_t idx) const {
PADDLE_ENFORCE_LT(level, NumLevels());
PADDLE_ENFORCE_LT(idx, NumElements(level));
// the last level of LoD, just return number of records in Tensor
if (level == NumLevels() - 1) {
return lod_[level][idx + 1] - lod_[level][idx];
}
// high level of LoD, and there is another lower level, return number of
// lower-level elements
auto tmp = SliceInLevel(lod_, level, idx, idx + 1);
PADDLE_ENFORCE_GE(tmp.size(), 2);
// there is a 0 as a placeholder stored in LoD, so the number of elements
// equals lod.size() - 1
return tmp[1].size() - 1;
}
void LoDTensor::ShrinkLevels(size_t level_begin, size_t level_end) {
auto new_lod = framework::SliceLevels(lod_, level_begin, level_end);
lod_ = new_lod;

@ -38,6 +38,18 @@ using Vector = thrust::host_vector<
T, thrust::system::cuda::experimental::pinned_allocator<T>>;
#endif
/*
* 3-level LoD stores
*
* 0 10 20
* 0 5 10 15 20
* 0 2 5 7 10 12 15 20
*
* - in a level, each element indicates offset in the underlying Tensor
* - the first element should be 0 and that indicates that this sequence start
* from 0
* - each sequence's begin and end(no-inclusive) is level[id, id+1]
*/
using LoD = std::vector<Vector<size_t>>;
LoD SliceLevels(const LoD& in, size_t level_begin, size_t level_end);
@ -65,11 +77,8 @@ class LoDTensor : public Tensor {
* Get a element from LoD.
*/
size_t lod_element(size_t level, size_t elem) const {
PADDLE_ENFORCE(level < NumLevels(), "level [%d] out of range [%d]", level,
NumLevels());
PADDLE_ENFORCE(elem < NumElements(level),
"element begin [%d] out of range [%d]", elem,
NumElements(level));
PADDLE_ENFORCE_LT(level, NumLevels());
PADDLE_ENFORCE_LT(elem, NumElements(level));
return (lod_)[level][elem];
}
@ -82,12 +91,23 @@ class LoDTensor : public Tensor {
* Number of elements in a level.
*/
size_t NumElements(size_t level = 0) const {
PADDLE_ENFORCE(level < NumLevels(), "level [%d] out of range [%d]", level,
NumLevels());
PADDLE_ENFORCE_LT(level, NumLevels());
// the last offset is the end of last element
return (lod_)[level].size() - 1;
}
/*
* Number of lower-level elements.
* For example, a 2-level lod-tensor
*
* 0-th level | |
* 1-th level || |||
*
* NumElements(0, 0) get 2
* NumElements(0, 1) get 3
*/
size_t NumElements(size_t level, size_t idx) const;
/*
* Shrink levels[level_begin:level_end]
*/
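
Reviewer note: the offset-based LoD layout and the two-argument `NumElements` can be checked with a small standalone sketch. It assumes, as the comment on `LoD` states, that every level stores offsets into the underlying tensor. `NumElementsSketch` is a hypothetical simplification that counts lower-level sequences directly instead of calling `SliceInLevel`; it is not the code in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using LoD = std::vector<std::vector<std::size_t>>;

// Simplified stand-in for NumElements(level, idx); not the Paddle code.
std::size_t NumElementsSketch(const LoD& lod, std::size_t level,
                              std::size_t idx) {
  assert(level < lod.size());
  assert(idx + 1 < lod[level].size());
  std::size_t begin = lod[level][idx];
  std::size_t end = lod[level][idx + 1];
  // Last level: the sequence spans end - begin records of the tensor.
  if (level + 1 == lod.size()) return end - begin;
  // Otherwise count lower-level sequences that start inside [begin, end).
  // The final offset of a level only marks the end, so it is skipped.
  std::size_t count = 0;
  const auto& lower = lod[level + 1];
  for (std::size_t i = 0; i + 1 < lower.size(); ++i) {
    if (lower[i] >= begin && lower[i] < end) ++count;
  }
  return count;
}
```

With the 3-level LoD from the header comment (`0 10 20`, `0 5 10 15 20`, `0 2 5 7 10 12 15 20`), the first top-level sequence covers offsets `[0, 10)` and therefore contains the two second-level sequences starting at 0 and 5.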

@ -56,6 +56,12 @@ TEST_F(LoDTensorTester, NumElements) {
ASSERT_EQ(lod_tensor_.NumElements(2), 8UL);
}
TEST_F(LoDTensorTester, NumElements2) {
ASSERT_EQ(lod_tensor_.NumElements(0, 0), 2UL);
ASSERT_EQ(lod_tensor_.NumElements(0, 1), 2UL);
ASSERT_EQ(lod_tensor_.NumElements(1, 1), 2UL);
}
TEST_F(LoDTensorTester, ShrinkLevels) {
// slice 1 level
for (size_t level = 0; level < 3UL; ++level) {
@ -65,7 +71,7 @@ TEST_F(LoDTensorTester, ShrinkLevels) {
ASSERT_EQ(new_lod_tensor.NumElements(0), lod_tensor_.NumElements(level));
ASSERT_EQ(new_lod_tensor.data<float>(), lod_tensor_.data<float>());
}
// slice 2 level
// shrink 2 level
for (size_t level = 0; level < 2UL; ++level) {
LoDTensor new_lod_tensor = lod_tensor_;
new_lod_tensor.ShrinkLevels(level, level + 2);

@ -36,8 +36,8 @@ TEST(LoDTensor, LoDInGPU) {
lod_tensor.mutable_data<float>(place);
lod_tensor.set_lod(src_lod);
CHECK_EQ(lod_tensor.lod_element(0, 2), 4);
CHECK_EQ(lod_tensor.lod_element(0, 4), 8);
CHECK_EQ(lod_tensor.lod_element(0, 2), 4UL);
CHECK_EQ(lod_tensor.lod_element(0, 4), 8UL);
auto lod = lod_tensor.lod();

@ -28,7 +28,7 @@ bool MKLDNNConvLayer::init(const LayerMap& layerMap,
if (!MKLDNNLayer::init(layerMap, parameterMap)) {
return false;
}
CHECK_EQ(inputLayers_.size(), 1) << "Only support one input layer yet";
CHECK_EQ(inputLayers_.size(), 1UL) << "Only support one input layer yet";
CHECK_EQ(inputLayers_.size(), parameters_.size());
CHECK(config_.shared_biases()) << "Only support shared biases yet";

@ -28,7 +28,7 @@ bool MKLDNNFcLayer::init(const LayerMap& layerMap,
return false;
}
CHECK_EQ(inputLayers_.size(), 1) << "Only support one input layer yet";
CHECK_EQ(inputLayers_.size(), 1UL) << "Only support one input layer yet";
CHECK_EQ(inputLayers_.size(), parameters_.size());
CHECK(!parameters_[0]->isSparse()) << "Do not support sparse yet";

@ -228,7 +228,7 @@ void genGroundTruth(vector<SingleBeamExpansion>& beamExpansions,
curBeam.groundTruth[j] = *(start + n);
curBeam.inBeam[j] = 1;
} else {
CHECK_LE(curBeam.rowIdxInBeam[j] + 1,
CHECK_LE((size_t)curBeam.rowIdxInBeam[j] + 1,
curBeam.subSeqStartPos.size() - 1);
int start = curBeam.subSeqStartPos[curBeam.rowIdxInBeam[j]];
int end = curBeam.subSeqStartPos[curBeam.rowIdxInBeam[j] + 1];

@ -1,5 +0,0 @@
---
Language: Cpp
BasedOnStyle: Google
Standard: Cpp11
...

@ -0,0 +1 @@
../framework/.clang-format

@ -1,5 +0,0 @@
---
Language: Cpp
BasedOnStyle: Google
Standard: Cpp11
...

@ -0,0 +1 @@
../framework/.clang-format

@ -42,7 +42,7 @@ class MultiplexGPUKernel : public framework::OpKernel {
for (auto i = 0; i < rows; i++) {
int32_t k = index[i];
PADDLE_ENFORCE_GE(k, 0, "index must be nonnegative.");
PADDLE_ENFORCE_LT(k, ins.size(),
PADDLE_ENFORCE_LT((size_t)k, ins.size(),
"index exceeds the number of candidate tensors.");
memory::Copy(place, out->data<T>() + i * cols, place,
ins[k]->data<T>() + i * cols, cols * sizeof(T), stream);

@ -0,0 +1 @@
../framework/.clang-format

@ -1,6 +1,6 @@
if(WITH_PYTHON)
cc_library(paddle_pybind SHARED
SRCS pybind.cc protobuf.cc
SRCS pybind.cc exception.cc protobuf.cc
DEPS pybind python backward
${GLOB_OP_LIB})
endif(WITH_PYTHON)

@ -0,0 +1,34 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/pybind/exception.h"
namespace paddle {
namespace pybind {
void BindException(pybind11::module& m) {
static pybind11::exception<platform::EnforceNotMet> exc(m, "EnforceNotMet");
pybind11::register_exception_translator([](std::exception_ptr p) {
try {
if (p) std::rethrow_exception(p);
} catch (const platform::EnforceNotMet& e) {
exc(e.what());
}
});
m.def("__unittest_throw_exception__", [] { PADDLE_THROW("test exception"); });
}
} // namespace pybind
} // namespace paddle

@ -0,0 +1,23 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include "paddle/platform/enforce.h"
#include "pybind11/pybind11.h"
namespace paddle {
namespace pybind {
extern void BindException(pybind11::module& m);
} // namespace pybind
} // namespace paddle

@ -46,8 +46,7 @@ struct variant_caster<V<Ts...>> {
template <typename T>
typename std::enable_if<
!std::is_same<T, boost::detail::variant::void_>::value,
bool>::type
!std::is_same<T, boost::detail::variant::void_>::value, bool>::type
try_load(handle src, bool convert) {
auto caster = make_caster<T>();
if (!load_success_ && caster.load(src, convert)) {
@ -71,8 +70,7 @@ struct variant_caster<V<Ts...>> {
return load_success_;
}
static handle cast(Type const &src,
return_value_policy policy,
static handle cast(Type const &src, return_value_policy policy,
handle parent) {
variant_caster_visitor visitor(policy, parent);
return boost::apply_visitor(visitor, src);
@ -101,8 +99,8 @@ inline std::vector<T> RepeatedToVector(
const google::protobuf::RepeatedField<T> &repeated_field) {
std::vector<T> ret;
ret.reserve(repeated_field.size());
std::copy(
repeated_field.begin(), repeated_field.end(), std::back_inserter(ret));
std::copy(repeated_field.begin(), repeated_field.end(),
std::back_inserter(ret));
return ret;
}
@ -134,7 +132,7 @@ class VarDescBind;
// read/write speed. Only when we want the protobuf message, the local changes
// will be synchronized (by `Sync` method).
class VarDescBind {
public:
public:
explicit VarDescBind(const std::string &name) { desc_.set_name(name); }
VarDesc *Proto() { return &desc_; }
@ -157,12 +155,12 @@ public:
return desc_.lod_tensor().data_type();
}
private:
private:
VarDesc desc_;
};
class OpDescBind {
public:
public:
OpDesc *Proto() {
Sync();
return &op_desc_;
@ -174,8 +172,8 @@ public:
const std::vector<std::string> &Input(const std::string &name) const {
auto it = inputs_.find(name);
PADDLE_ENFORCE(
it != inputs_.end(), "Input %s cannot be found in Op %s", name, Type());
PADDLE_ENFORCE(it != inputs_.end(), "Input %s cannot be found in Op %s",
name, Type());
return it->second;
}
@ -196,10 +194,8 @@ public:
const std::vector<std::string> &Output(const std::string &name) const {
auto it = outputs_.find(name);
PADDLE_ENFORCE(it != outputs_.end(),
"Output %s cannot be found in Op %s",
name,
Type());
PADDLE_ENFORCE(it != outputs_.end(), "Output %s cannot be found in Op %s",
name, Type());
return it->second;
}
@ -258,7 +254,7 @@ public:
return boost::get<BlockDesc *>(it->second)->idx();
}
private:
private:
struct SetAttrDescVisitor : public boost::static_visitor<void> {
explicit SetAttrDescVisitor(OpDesc::Attr *attr) : attr_(attr) {}
mutable OpDesc::Attr *attr_;
@ -325,7 +321,7 @@ private:
};
class BlockDescBind {
public:
public:
BlockDescBind(ProgramDescBind *prog, BlockDesc *desc)
: prog_(prog), desc_(desc), need_update_(false) {}
@ -349,8 +345,8 @@ public:
VarDescBind *Var(py::bytes name_bytes) const {
std::string name = name_bytes;
auto it = vars_.find(name);
PADDLE_ENFORCE(
it != vars_.end(), "Can not find variable %s in current block.", name);
PADDLE_ENFORCE(it != vars_.end(),
"Can not find variable %s in current block.", name);
return it->second.get();
}
@ -398,7 +394,7 @@ public:
BlockDesc *RawPtr() { return desc_; }
private:
private:
ProgramDescBind *prog_; // not_own
BlockDesc *desc_; // not_own
bool need_update_;
@ -412,7 +408,7 @@ using ProgDescMap =
static ProgDescMap *g_bind_map = nullptr;
class ProgramDescBind {
public:
public:
static ProgramDescBind &Instance(ProgramDesc *prog) {
if (g_bind_map == nullptr) {
g_bind_map = new ProgDescMap();
@ -449,7 +445,7 @@ public:
return prog_;
}
private:
private:
explicit ProgramDescBind(ProgramDesc *prog) : prog_(prog) {
for (auto &block : *prog->mutable_blocks()) {
blocks_.emplace_back(new BlockDescBind(this, &block));
@ -492,8 +488,7 @@ void BindProgramDesc(py::module &m) {
return &ProgramDescBind::Instance(prog_desc);
},
py::return_value_policy::reference)
.def("append_block",
&ProgramDescBind::AppendBlock,
.def("append_block", &ProgramDescBind::AppendBlock,
py::return_value_policy::reference)
.def("block", &ProgramDescBind::Block, py::return_value_policy::reference)
.def("__str__", &ProgramDescBind::DebugString)
@ -504,20 +499,16 @@ void BindBlockDesc(py::module &m) {
py::class_<BlockDescBind>(m, "BlockDesc", "")
.def_property_readonly("id", &BlockDescBind::ID)
.def_property_readonly("parent", &BlockDescBind::Parent)
.def("append_op",
&BlockDescBind::AppendOp,
.def("append_op", &BlockDescBind::AppendOp,
py::return_value_policy::reference)
.def("prepend_op",
&BlockDescBind::PrependOp,
.def("prepend_op", &BlockDescBind::PrependOp,
py::return_value_policy::reference)
.def("new_var", &BlockDescBind::NewVar,
py::return_value_policy::reference)
.def(
"new_var", &BlockDescBind::NewVar, py::return_value_policy::reference)
.def("var", &BlockDescBind::Var, py::return_value_policy::reference)
.def("all_vars",
&BlockDescBind::AllVars,
.def("all_vars", &BlockDescBind::AllVars,
py::return_value_policy::reference)
.def("all_ops",
&BlockDescBind::AllOps,
.def("all_ops", &BlockDescBind::AllOps,
py::return_value_policy::reference);
}

@ -21,6 +21,7 @@ limitations under the License. */
#include "paddle/operators/recurrent_op.h"
#include "paddle/platform/enforce.h"
#include "paddle/platform/place.h"
#include "paddle/pybind/exception.h"
#include "paddle/pybind/pybind.h"
#include "paddle/pybind/tensor_py.h"
#include "paddle/string/to_string.h"
@ -47,6 +48,8 @@ PYBIND11_PLUGIN(core) {
// not cause namespace pollution.
using namespace paddle::framework; // NOLINT
BindException(m);
py::class_<Tensor>(m, "Tensor", py::buffer_protocol())
.def_buffer(
[](Tensor &self) -> py::buffer_info { return CastToPyBuffer(self); })
@ -161,8 +164,7 @@ All parameter, weight, gradient are variables in Paddle.
py::return_value_policy::reference)
.def("find_var", &Scope::FindVar, py::return_value_policy::reference)
.def(py::init<>())
.def("new_scope",
[](Scope &self) -> Scope * { return &self.NewScope(); },
.def("new_scope", [](Scope &self) -> Scope * { return &self.NewScope(); },
py::return_value_policy::reference)
.def("drop_kids", &Scope::DropKids);
@ -230,8 +232,7 @@ All parameter, weight, gradient are variables in Paddle.
})
.def("infer_shape", &OperatorBase::InferShape)
.def("run",
[](OperatorBase &self,
const Scope &scope,
[](OperatorBase &self, const Scope &scope,
const platform::DeviceContext &dev_ctx) {
self.Run(scope, dev_ctx);
dev_ctx.Wait();
@ -259,10 +260,8 @@ All parameter, weight, gradient are variables in Paddle.
retv->SetType("plain_net");
return retv;
})
.def("append_op",
[](operators::NetOp &self, const OperatorBase &op) {
self.AppendOp(op);
})
.def("append_op", [](operators::NetOp &self,
const OperatorBase &op) { self.AppendOp(op); })
.def("complete_add_op", &operators::NetOp::CompleteAddOp)
.def("complete_add_op", [](std::shared_ptr<operators::NetOp> &self) {
self->CompleteAddOp();
@ -282,9 +281,10 @@ All parameter, weight, gradient are variables in Paddle.
auto rnn_op = OpRegistry::CreateOp(desc);
return static_cast<operators::RecurrentOp *>(rnn_op.release());
})
.def("set_stepnet",
[](operators::RecurrentOp &self, const operators::NetOp &net)
-> void { self.set_stepnet(net.Clone()); });
.def("set_stepnet", [](operators::RecurrentOp &self,
const operators::NetOp &net) -> void {
self.set_stepnet(net.Clone());
});
// cond_op
py::class_<operators::CondOp, OperatorBase>(m, "CondOp")

@ -63,11 +63,8 @@ struct CastToPyBufferImpl<true, I, ARGS...> {
}
return py::buffer_info(
dst_tensor.mutable_data<CUR_TYPE>(dst_tensor.holder_->place()),
sizeof(CUR_TYPE),
py::format_descriptor<CUR_TYPE>::format(),
(size_t)framework::arity(dst_tensor.dims()),
dims_outside,
strides);
sizeof(CUR_TYPE), py::format_descriptor<CUR_TYPE>::format(),
(size_t)framework::arity(dst_tensor.dims()), dims_outside, strides);
} else {
constexpr bool less = I + 1 < std::tuple_size<std::tuple<ARGS...>>::value;
return CastToPyBufferImpl<less, I + 1, ARGS...>()(tensor);
@ -110,8 +107,8 @@ void PyCUDATensorSetFromArray(
self.Resize(framework::make_ddim(dims));
auto *dst = self.mutable_data<T>(place);
paddle::platform::GpuMemcpySync(
dst, array.data(), sizeof(T) * array.size(), cudaMemcpyHostToDevice);
paddle::platform::GpuMemcpySync(dst, array.data(), sizeof(T) * array.size(),
cudaMemcpyHostToDevice);
}
#endif

@ -0,0 +1 @@
../framework/.clang-format

@ -30,7 +30,7 @@ namespace string {
// its syntax is simple as it doesn't own/manage the string, it is
// cheap to construct Pieces and pass them around.
class Piece {
public:
public:
static const size_t npos = static_cast<size_t>(-1);
// We provide non-explicit singleton constructors so users can
@ -57,7 +57,7 @@ public:
// Return a string that contains the copy of the referenced data.
std::string ToString() const { return std::string(data_, size_); }
private:
private:
const char* data_;
size_t size_;

Some files were not shown because too many files have changed in this diff.