# Contribute Code
We sincerely appreciate your contribution. This document explains our workflow and work style.
## Workflow
PaddlePaddle uses this [Git branching model]( The following steps guide usual contributions.
1. Fork
Our development community has been growing fast; it doesn't make sense for everyone to write into the official repo. So, please file Pull Requests from your fork. To make a fork, just head over to the GitHub page and click the ["Fork" button](
1. Clone
To make a copy of your fork on your local computer, please run
git clone
cd paddle
1. Create the local feature branch
For daily work like adding a new feature or fixing a bug, please open your feature branch before coding:
git checkout -b my-cool-stuff
1. Commit
Before issuing your first `git commit` command, please install [`pre-commit`]( by running the following commands:
pip install pre-commit
pre-commit install
Our pre-commit configuration requires clang-format 3.8 for auto-formatting C/C++ code and yapf for Python.
Once installed, `pre-commit` checks the style of code and documentation in every commit. We will see something like the following when you run `git commit`:
➜ git commit
CRLF end-lines remover...............................(no files to check)Skipped
yapf.................................................(no files to check)Skipped
Check for added large files..............................................Passed
Check for merge conflicts................................................Passed
Check for broken symlinks................................................Passed
Detect Private Key...................................(no files to check)Skipped
Fix End of Files.....................................(no files to check)Skipped
clang-formater.......................................(no files to check)Skipped
[my-cool-stuff c703c041] add test file
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 233
1. Build and test
Users can build PaddlePaddle natively on Linux and Mac OS X. But to unify the building environment and to make it easy for debugging, the recommended way is [using Docker](
1. Keep pulling
An experienced Git user pulls from the official repo often, daily or even hourly, so they notice conflicts with others' work early, and it's easier to resolve smaller conflicts.
git remote add upstream
git pull upstream develop
1. Push and file a pull request
You can "push" your local work into your forked repo:
git push origin my-cool-stuff
The push allows you to create a pull request, requesting owners of this [official repo]( to pull your change into the official one.
To create a pull request, please follow [these steps](
If your change fixes an issue, please write ["Fixes <issue-URL>"]( in the description section of your pull request. GitHub will close the issue when the owners merge your pull request.
Please remember to specify some reviewers for your pull request. If you don't know who the right reviewers are, please follow GitHub's recommendation.
1. Delete local and remote branches
To keep your local workspace and your fork clean, you might want to remove merged branches:
git push origin :my-cool-stuff
git checkout develop
git pull upstream develop
git branch -d my-cool-stuff
### Code Review
- Please feel free to ping your reviewers by sending them the URL of your pull request via IM or email. Please do this after your pull request passes the CI.
- Please answer every reviewer comment. If you follow the suggestion, please write "Done"; otherwise, please give a reason.
- If you don't want your reviewers to get overwhelmed by email notifications, you might reply to their comments [in a batch](
- Reduce unnecessary commits. Some developers commit often. It is recommended to fold a sequence of small changes into one commit by running `git commit --amend` instead of `git commit`.
## Coding Standard
### Code Style
Our C/C++ code follows the [Google style guide](
Our Python code follows the [PEP8 style guide](
Our build process helps to check the code style. In [``](, the entry point of our [builder Docker image](, the CMake argument `WITH_STYLE_CHECK` is set to `ON` by default. This flag is on
Please install pre-commit, which automatically reformats the changes to C/C++ and Python code whenever we run `git commit`. To check the whole codebase, we can run the command `pre-commit run -a`, as in the [`` file](, which is invoked by [our Travis CI configuration](
### Unit Tests
Please remember to add related unit tests.
- For C/C++ code, please follow [`google-test` Primer](
- For Python code, please use [Python's standard `unittest` package](
### Writing Logs
We use [glog]( for logging in our C/C++ code.
For general information, please use `LOG`. For debug information, please use [`VLOG`]( The reason is explained [here](
`VLOG` requires a *verbose level* parameter. For example:
VLOG(3) << "Operator FC is taking " << num_inputs << " inputs.";
When we run a PaddlePaddle application or test, we can specify a verbose threshold. For example:
GLOG_vmodule=buddy_allocator=2 \
GLOG_v=10 \
python \
This enables VLOG messages generated by `buddy_allocator.{h,cc}` in the verbose range of 0 to 3, so you will see the example VLOG message above, which is at level 3. This suggests that we output overall messages at lower verbose levels, so they are displayed more often. When coding in C++, please follow this verbose level convention:
- verbose level 1: [framework](
- verbose level 3: [operators](
- verbose level 5: [memory](, [platform](
- verbose level 7: [math](

@ -12,24 +12,22 @@ The topology is saved as a plain text in a detailed self-contain protobuf file.
The parameters are saved as a binary file. As we all know, the protobuf message has a limit of [64M size]( We have done a [benchmark experiment](, which shows that protobuf is not fit for the task.
As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](, and has a description information proto of [LoDTensorDesc]( We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, the `name` of the tensor, and the `LoD` information in [LoDTensor]( A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,
As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](, and has a description information proto of [LoDTensorDesc]( We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, and the `LoD` information in [LoDTensor]( A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,
The table below shows a tensor's byte view in detail. Note that all the signed values are written in the little-endian format.
[offset] [type] [description]
0004 4 bytes integer HeaderLength, the length of LoDTensorDesc
0008 4 bytes integer ContentLength, the length of LodTensor Buffer
0009 1 bytes char TensorDesc
00010 1 bytes char TensorDesc
00100 1 bytes char TensorValue
00101 1 bytes char TensorValue
00102 1 bytes char TensorValue ..
|field name | type | description |
| --- | --- | --- |
| version | uint32_t | Version of saved file. Always 0 now. |
| tensor desc length | uint32_t | TensorDesc(Protobuf message) length in bytes. |
| tensor desc | void* | TensorDesc protobuf binary message |
| tensor data | void* | Tensor's data in binary format. The length of `tensor_data` is decided by `TensorDesc.dims()` and `TensorDesc.data_type()` |
| lod_level | uint64_t | Level of LoD |
| length of lod[0] | uint64_t | [Optional] length of lod[0] in bytes. |
| data of lod[0] | uint64_t* | [Optional] lod[0].data() |
| ... | ... | ... |
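
The field order in the table can be illustrated with a short Go sketch. It is not Paddle's implementation: it treats `TensorDesc` as opaque bytes, covers only the first four fields (no LoD section), and assumes little-endian integers, which the table itself does not state. `writeTensor`, `readTensor`, and `roundTrip` are hypothetical names, and the data length is passed in explicitly because deriving it from `TensorDesc.dims()` and `TensorDesc.data_type()` would require parsing the protobuf message.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// writeTensor writes version, desc length, desc bytes, and raw tensor data,
// in the order listed in the table. Little-endian is an assumption.
func writeTensor(w io.Writer, desc, data []byte) error {
	if err := binary.Write(w, binary.LittleEndian, uint32(0)); err != nil { // version, always 0
		return err
	}
	if err := binary.Write(w, binary.LittleEndian, uint32(len(desc))); err != nil {
		return err
	}
	if _, err := w.Write(desc); err != nil {
		return err
	}
	_, err := w.Write(data)
	return err
}

// readTensor reads the same four fields back. dataLen must be supplied by
// the caller in this sketch.
func readTensor(r io.Reader, dataLen int) (version uint32, desc, data []byte, err error) {
	if err = binary.Read(r, binary.LittleEndian, &version); err != nil {
		return
	}
	var descLen uint32
	if err = binary.Read(r, binary.LittleEndian, &descLen); err != nil {
		return
	}
	desc = make([]byte, descLen)
	if _, err = io.ReadFull(r, desc); err != nil {
		return
	}
	data = make([]byte, dataLen)
	_, err = io.ReadFull(r, data)
	return
}

// roundTrip serializes into an in-memory buffer and reads the result back.
func roundTrip(desc, data []byte) (uint32, []byte, []byte, error) {
	var buf bytes.Buffer
	if err := writeTensor(&buf, desc, data); err != nil {
		return 0, nil, nil, err
	}
	return readTensor(&buf, len(data))
}

func main() {
	version, desc, data, err := roundTrip([]byte("opaque-desc"), []byte{1, 2, 3, 4})
	fmt.Println(version, string(desc), data, err)
}
```

Writing the length before the variable-size descriptor is what lets a reader allocate the right buffer without scanning ahead.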
## Summary

@ -1,39 +1,36 @@
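# Build the PaddlePaddle Library for Raspberry Pi
Raspberry Pi users can log in to the Raspberry Pi system via SSH or a similar method and build the PaddlePaddle library for the Raspberry Pi platform directly, as described in the [Build PaddlePaddle from Source]( documentation.
There are generally two ways to build a Raspberry Pi version:
Users can also cross-compile on a development platform they are familiar with. This document uses the Linux x86-64 platform as an example to describe the method and steps for cross-compiling PaddlePaddle for the Raspberry Pi platform.
1. Log in to the Raspberry Pi system via SSH or a similar method and build there. The required development tools and third-party libraries are listed in [`/Dockerfile`](
## Prepare the Cross-Compiling Environment
1. The other way is cross-compiling. This document describes the method and steps for cross-compiling PaddlePaddle for the Raspberry Pi platform on Linux/x64.
To cross-compile PaddlePaddle from source, users need to prepare the cross-compiling environment in advance. Users can download the C/C++ cross-compiling toolchain for the Raspberry Pi platform from [github]( themselves, or obtain it with the following command:
## Install the Cross-Compiler
Clone the following Github repo
git clone
This github repository contains several prebuilt toolchains for different platforms. If the host is a Linux x86-64 environment, use the one under `arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64` as the build toolchain; the compiler used is arm-linux-gnueabihf-gcc 4.8.3.
The cross-compiler arm-linux-gnueabihf-gcc 4.8.3 can then be found in the `./tools/tree/master/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64` directory. Running this toolchain requires a Linux x64 machine with glibc 2.14 or later.
## Configure Cross-Compiling Arguments
CMake [supports cross-compiling](. The configuration for PaddlePaddle for Raspberry Pi is in [cmake/cross_compiling/raspberry_pi.cmake](.
Some arguments must be configured when cross-compiling the Raspberry Pi version of the PaddlePaddle library:
- `CMAKE_SYSTEM_NAME`: the target platform of the CMake build; must be set to `RPi`. Only after `CMAKE_SYSTEM_NAME=RPi` is set does PaddlePaddle's CMake system treat the build as a cross-compilation for Raspberry Pi and automatically build the host version of the protoc executable, the target version of the protobuf library, and the target version of the OpenBLAS library.
Optional arguments for the Raspberry Pi platform:
- `CMAKE_SYSTEM_NAME`: the target platform of the CMake build; must be set to `RPi`. Only after `CMAKE_SYSTEM_NAME=RPi` is set does PaddlePaddle's CMake system treat the build as a cross-compilation for Raspberry Pi and automatically build the host version of the protoc executable, the target version of the protobuf library, and the target version of the OpenBLAS library.
- `RPI_TOOLCHAIN`: the absolute path of the toolchain, or its path relative to the build directory. PaddlePaddle's CMake system uses this value to set the cross-compilers automatically; otherwise users must set them manually when running cmake. No default value.
- `RPI_ARM_NEON`: whether to use NEON instructions. Currently must be set to `ON`; the default is `ON`.
- `RPI_TOOLCHAIN`: the absolute path of the toolchain, or its path relative to the build directory. PaddlePaddle's CMake system uses this value to set the cross-compilers automatically; otherwise users must set them manually when running cmake. No default value.
- `RPI_ARM_NEON`: whether to use NEON instructions. Currently must be set to `ON`; the default is `ON`.
- `HOST_C/CXX_COMPILER`: the host C/C++ compiler, needed when building the host version of the protoc executable and the target version of the OpenBLAS library. Defaults to the value of the environment variable `CC`; if `CC` is not set, defaults to the `cc` compiler.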
@ -47,7 +44,9 @@ cmake -DCMAKE_SYSTEM_NAME=RPi \
## Build and Install
@ -60,6 +59,4 @@ make install
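Note: if you have previously built the PaddlePaddle library for another platform in the source directory, first delete the `third_party` and `build` directories with `rm -rf` to ensure that all third-party dependencies and PaddlePaddle code are rebuilt for the new CMake configuration.
After the install command finishes, because `WITH_C_API` was set to `ON` in the cmake configuration of the previous step, the `your/path/to/install` directory will contain `include` and `lib` directories; `include` contains the C-API header files, and `lib` contains a Raspberry Pi version of the library.
After the install command finishes, the `your/path/to/install` directory will contain `include` and `lib` directories; `include` contains the C-API header files, and `lib` contains a Raspberry Pi version of the library.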

@ -0,0 +1,62 @@
# Build PaddlePaddle for Raspberry Pi
You may use any of the following two approaches to build the inference library of PaddlePaddle for Raspberry Pi:
1. Build using SSH: Log in to a Raspberry Pi using SSH and build the library. The required development tools and third-party dependencies are listed here: [`/Dockerfile`](
1. Cross-compile: This article describes in detail how to cross-compile PaddlePaddle for Raspberry Pi on a Linux/x64 machine.
## The Cross-Compiling Toolchain
Step 1. Clone the Github repo by running the following command.
git clone
Step 2. Use the pre-built cross-compiler found in `./tools/tree/master/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64`. To run it on a Linux computer, glibc version >= 2.14 is needed.
## CMake Arguments
CMake supports [cross-compiling]( All CMake configuration arguments required for the cross-compilation for Raspberry Pi can be found in [`cmake/cross_compiling/raspberry_pi.cmake`](
Some important arguments that need to be set:
- `CMAKE_SYSTEM_NAME`: The target platform. Must be `RPi`.
- `RPI_TOOLCHAIN`: The absolute path of the cross-compiling toolchain.
- `RPI_ARM_NEON`: Use ARM NEON Intrinsics. This argument is required and defaults to `ON`.
- `HOST_C/CXX_COMPILER`: The C/C++ compiler for the host. It is used to build tools that run on the host, for example, protoc.
A commonly-used CMake configuration is as follows:
-DRPI_TOOLCHAIN=your/path/to/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64 \
-DCMAKE_INSTALL_PREFIX=your/path/to/install \
To build the inference library, please set the argument `WITH_C_API` to `ON`: `WITH_C_API=ON`.
You can add more arguments. For example, to minimize the size of the generated inference library, you may use `CMAKE_BUILD_TYPE=MinSizeRel`. For performance optimization, you may use `CMAKE_BUILD_TYPE=Release`.
## Build and Install
The following commands build the inference library of PaddlePaddle for Raspberry Pi and third-party dependencies.
make install
The intermediate files will be stored in `build`. Third-party libraries will be located in `build/third_party`. If you have already built it for other platforms like Android or iOS, you may want to clear these directories by running the command: `rm -rf build`.
The inference library will be in `your/path/to/install/lib`, with related header files in `your/path/to/install/include`.

@ -67,7 +67,7 @@ func main() {
cp, err = pserver.LoadCheckpoint(e, idx)
if err != nil {
if err == pserver.ErrCheckpointNotFound {
log.Info("Could not find the pserver checkpoint.")
log.Info("load checkpoint error", "error", err)
} else {
@ -99,7 +99,7 @@ func main() {
go func() {
log.Info("starting pserver", log.Ctx{"port": *port})
log.Info("serving pserver", log.Ctx{"port": *port})
err = http.Serve(l, nil)

go/glide.lock generated

@ -1,5 +1,5 @@
hash: 51d9e2e46d7fd9173ff11ecada40f7b7728756be18d5e2f032535f66465e6e15
updated: 2017-10-24T15:04:09.987751592-07:00
hash: 107c058cf5c9163a75d40eef2273a793c36112683c25d72aa8288827fdde3a19
updated: 2017-10-30T03:46:19.137696069Z
- name:
version: bae2f1293d092fd8167939d5108d1b025eaef9de

@ -30,3 +30,4 @@ import:
version: v2.13
- package:
version: v1.6.0
- package:

@ -123,7 +123,8 @@ func paddle_set_dataset(client C.paddle_master_client, path **C.char, size
err := c.SetDataset(paths)
if err != nil {
log.Error("error set dataset", log.Ctx{"error": err})
log.Error("error set dataset",
log.Ctx{"error": err, "paths": paths})

@ -121,6 +121,7 @@ func (c *Client) StartGetRecords(passID int) {
func (c *Client) getRecords(passID int) {
i := 0
for {
t, err := c.getTask(passID)
if err != nil {
@ -130,12 +131,20 @@ func (c *Client) getRecords(passID int) { <- record{nil, err}
if err.Error() == ErrPassAfter.Error() {
// wait until last pass finishes
time.Sleep(time.Second * 3)
if i%60 == 0 {
log.Debug("getTask of passID error.",
log.Ctx{"error": err, "passID": passID})
i = 0
log.Error("getTask error.", log.Ctx{"error": err})
// if err.Error() == ErrPassAfter.Error()
// wait until last pass finishes
// if other error such as network error
// wait to reconnect or task time out
time.Sleep(time.Second * 3)
i += 3
for _, chunk := range t.Chunks {
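
The hunk above makes `getRecords` retry every 3 seconds while keeping the log quiet by reporting only when a counter wraps. A minimal, self-contained sketch of that pattern follows. It is not the client's code: `retryQuietly`, `errTemporary`, and the bounded `maxTries` are hypothetical, whereas the loop in the diff appears to keep retrying.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTemporary is a hypothetical stand-in for a retryable error.
var errTemporary = errors.New("temporary failure")

// retryQuietly calls op until it succeeds or maxTries is reached. It sleeps
// for wait between tries and reports only every logEvery-th failure
// (logEvery must be > 0). It returns the number of calls made.
func retryQuietly(op func() error, wait time.Duration, logEvery, maxTries int, report func(error)) (int, error) {
	var err error
	for try := 0; try < maxTries; try++ {
		if err = op(); err == nil {
			return try + 1, nil
		}
		if try%logEvery == 0 {
			report(err) // throttled: most failures are not reported
		}
		time.Sleep(wait)
	}
	return maxTries, err
}

func main() {
	calls := 0
	op := func() error {
		calls++
		if calls <= 5 {
			return errTemporary
		}
		return nil
	}
	n, err := retryQuietly(op, time.Millisecond, 2, 10, func(e error) {
		fmt.Println("still failing:", e)
	})
	fmt.Println("calls:", n, "err:", err)
}
```

Throttling by try count rather than by wall-clock time keeps the sketch deterministic, which also makes it easy to test with a zero wait.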

@ -117,6 +117,7 @@ func TestNextRecord(t *testing.T) {
if e != nil {
// test for n passes
for pass := 0; pass < 10; pass++ {

@ -0,0 +1,4 @@
# Ignore everything in this directory
# Except this file

@ -13,5 +13,5 @@
# limitations under the License.
go_test(pserver_test DEPS paddle_go_optimizer)
go_test(pserver_test DEPS paddle_go_optimizer gen_proto_go)

@ -71,9 +71,15 @@ func newOptimizer(paramWithConfigs ParameterWithConfig, State []byte) *optimizer
cstate = unsafe.Pointer(&s[0])
var cptr (*C.uchar)
if len(c) > 0 {
cptr = (*C.uchar)(&c[0])
} else {
log.Error("empty config", "param name", paramWithConfigs.Param.Name)
o.config = c
o.opt = C.paddle_create_optimizer(

@ -17,21 +17,25 @@ package pserver
import (
uuid ""
pb ""
log ""
@ -40,7 +44,7 @@ type ElementType int
// ErrCheckpointNotFound indicates that the pserver checkpoint could
// not be found.
var ErrCheckpointNotFound = errors.New("checkpoint not found")
var ErrCheckpointNotFound = errors.New("checkpoint not found in etcd")
// RPC error message.
const (
@ -66,6 +70,46 @@ type Parameter struct {
Content []byte
func float32ToString(b []byte) string {
f := make([]float32, len(b)/4)
buf := bytes.NewReader(b)
err := binary.Read(buf, binary.LittleEndian, &f)
if err != nil {
return ""
return fmt.Sprintf("%v", f)
func float32ByteToString(c []byte) string {
var a []byte
var b []byte
if len(c) <= 80 {
a = c
} else {
a = c[0:40]
b = c[len(c)-40:]
var s string
s = float32ToString(a)
if b == nil {
return s
s = strings.Replace(s, "]", "", -1) + "..." + strings.Replace(float32ToString(b), "[", "", -1)
return s
func (p Parameter) String() string {
if p.ElementType != Float32 {
return fmt.Sprintf("name:%v ElementType:%v",
p.Name, p.ElementType)
return float32ByteToString(p.Content)
// ParameterWithConfig contains the parameter and the configuration.
type ParameterWithConfig struct {
Param Parameter
@ -76,7 +120,7 @@ type ParameterWithConfig struct {
type checkpointMeta struct {
UUID string `json:"uuid"`
Path string `json:"path"`
MD5 string `json:"md5"`
CRC32 uint32 `json:"crc32"`
Timestamp int64 `json:"timestamp"`
@ -92,7 +136,7 @@ type Service struct {
idx int
checkpointInterval time.Duration
checkpointPath string
client *EtcdClient
client KVStore
mu sync.Mutex
optMap map[string]*optimizer
@ -104,7 +148,12 @@ type parameterCheckpoint struct {
State []byte
func loadMeta(e *EtcdClient, idx int) (meta checkpointMeta, err error) {
type KVStore interface {
GetKey(key string, timeout time.Duration) ([]byte, error)
PutKey(key string, value []byte, timeout time.Duration, withLease bool) error
func loadMeta(e KVStore, idx int) (meta checkpointMeta, err error) {
v, err := e.GetKey(PsCheckpoint+strconv.Itoa(idx), 3*time.Second)
if err != nil {
@ -123,7 +172,7 @@ func loadMeta(e *EtcdClient, idx int) (meta checkpointMeta, err error) {
// LoadCheckpoint loads checkpoint from file.
func LoadCheckpoint(e *EtcdClient, idx int) (Checkpoint, error) {
func LoadCheckpoint(e KVStore, idx int) (Checkpoint, error) {
log.Info("Loading checkpoint", "pserver index", idx)
defer traceTime(time.Now(), "load checkpoint")
@ -137,11 +186,8 @@ func LoadCheckpoint(e *EtcdClient, idx int) (Checkpoint, error) {
return nil, err
// TODO(helin): change MD5 to CRC since CRC is better for file
// checksum in our use case (emphasize speed over security).
h := md5.New()
md5 := hex.EncodeToString(h.Sum(content))
if md5 != cpMeta.MD5 {
crc32 := crc32.ChecksumIEEE(content)
if crc32 != cpMeta.CRC32 {
return nil, errors.New(WrongChecksum)
@ -150,12 +196,13 @@ func LoadCheckpoint(e *EtcdClient, idx int) (Checkpoint, error) {
if err = dec.Decode(&cp); err != nil {
return nil, err
return cp, nil
// NewService creates a new service, will bypass etcd registration if no
// endpoints specified. It will recover from the checkpoint file if a specified checkpoint exists.
func NewService(idx int, interval time.Duration, path string, client *EtcdClient, cp Checkpoint) (*Service, error) {
func NewService(idx int, interval time.Duration, path string, client KVStore, cp Checkpoint) (*Service, error) {
s := &Service{
idx: idx,
checkpointInterval: interval,
@ -173,6 +220,7 @@ func NewService(idx int, interval time.Duration, path string, client *EtcdClient
s.optMap[p.Param.Name] = newOptimizer(p, item.State)
return s, nil
@ -186,7 +234,9 @@ func (s *Service) InitParam(paramWithConfigs ParameterWithConfig, _ *int) error
// TODO(helin): parse parameter config
c := &pb.OptimizerConfig{}
proto.Unmarshal(paramWithConfigs.Config, c)
log.Debug(fmt.Sprintf("OptimizerConfig:%v", c))
@ -221,7 +271,7 @@ func (s *Service) FinishInitParams(_ int, _ *int) error {
for range t {
err := s.checkpoint()
if err != nil {
log.Error("finish init params error", log.Ctx{"error": err})
log.Error("checkpoint error", log.Ctx{"error": err})
@ -236,7 +286,8 @@ func (s *Service) SendGrad(g Gradient, _ *int) error {
select {
case <-s.initialized:
log.Warn("received gradient before initialization.", "name", g.Name, "size", len(g.Content), "type", g.ElementType)
log.Warn("received gradient before initialization.",
"name", g.Name, "size", len(g.Content), "type", g.ElementType)
return errors.New(Uninitialized)
@ -245,10 +296,14 @@ func (s *Service) SendGrad(g Gradient, _ *int) error {
o, ok := s.optMap[g.Name]
if !ok {
log.Warn("received gradient but can't find name.",
"name", g.Name, "size", len(g.Content), "type", g.ElementType)
return fmt.Errorf("parameter: %s does not exist", g.Name)
log.Info("received gradient from trainer, updating gradient.", "name", g.Name, "size", len(g.Content), "type", g.ElementType)
log.Info("received gradient from trainer, updating gradient.",
"name", g.Name, "size", len(g.Content), "type", g.ElementType)
return o.UpdateParameter(g)
@ -274,6 +329,7 @@ func (s *Service) GetParam(name string, parameter *Parameter) error {
parameter.Name = name
parameter.ElementType = opt.elementType
parameter.Content = opt.GetWeights()
log.Info("sending parameter to the trainer", "name", parameter.Name, "size", len(parameter.Content), "type", parameter.ElementType)
return nil
@ -354,20 +410,29 @@ func (s *Service) checkpoint() (err error) {
oldMeta, err := loadMeta(s.client, s.idx)
if err == ErrCheckpointNotFound {
log.Info("Do not have existing checkpoint.")
log.Info("old meta not found, skip removing old meta")
err = nil
} else if err == nil {
log.Info("removing old meta")
if oldMeta.Path != "" {
rmErr := os.Remove(oldMeta.Path)
if rmErr != nil {
// log error, but still treat checkpoint as
// successful.
log.Error("remove old meta file error", log.Ctx{"error": rmErr})
if err != nil {
h := md5.New()
md5 := hex.EncodeToString(h.Sum(buf.Bytes()))
crc32 := crc32.ChecksumIEEE(buf.Bytes())
cpMeta := checkpointMeta{
UUID: id,
Timestamp: time.Now().UnixNano(),
MD5: md5,
CRC32: crc32,
Path: p,
@ -381,14 +446,5 @@ func (s *Service) checkpoint() (err error) {
if oldMeta.Path != "" {
rmErr := os.Remove(oldMeta.Path)
if rmErr != nil {
// log error, but still treat checkpoint as
// successful.
log.Error("remove old meta file error", log.Ctx{"error": rmErr})
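
The two ideas in this file's diff, replacing the MD5 field with a CRC32 checksum and depending on a small `KVStore` interface instead of `*EtcdClient`, can be shown together in a standalone sketch. The `KVStore` method signatures are copied from the diff; everything else (`memStore`, `meta`, `saveMeta`, `verify`, the key name) is hypothetical and simplified, not the pserver's actual checkpoint logic.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"hash/crc32"
	"time"
)

// KVStore mirrors the interface introduced in the diff.
type KVStore interface {
	GetKey(key string, timeout time.Duration) ([]byte, error)
	PutKey(key string, value []byte, timeout time.Duration, withLease bool) error
}

// memStore is a hypothetical in-memory KVStore, useful for tests.
type memStore struct{ m map[string][]byte }

func (s *memStore) GetKey(key string, _ time.Duration) ([]byte, error) {
	return s.m[key], nil // reading a nil map is safe and yields nil
}

func (s *memStore) PutKey(key string, value []byte, _ time.Duration, _ bool) error {
	if s.m == nil {
		s.m = make(map[string][]byte)
	}
	s.m[key] = value
	return nil
}

// meta is a reduced, hypothetical version of checkpointMeta.
type meta struct {
	CRC32 uint32 `json:"crc32"`
}

// saveMeta stores the CRC32 (IEEE) checksum of content under key.
func saveMeta(kv KVStore, key string, content []byte) error {
	b, err := json.Marshal(meta{CRC32: crc32.ChecksumIEEE(content)})
	if err != nil {
		return err
	}
	return kv.PutKey(key, b, 3*time.Second, false)
}

// verify recomputes the checksum of content and compares it with the stored one.
func verify(kv KVStore, key string, content []byte) error {
	b, err := kv.GetKey(key, 3*time.Second)
	if err != nil {
		return err
	}
	if len(b) == 0 {
		return errors.New("checkpoint meta not found")
	}
	var m meta
	if err := json.Unmarshal(b, &m); err != nil {
		return err
	}
	if crc32.ChecksumIEEE(content) != m.CRC32 {
		return errors.New("checksum mismatch")
	}
	return nil
}

func main() {
	kv := &memStore{}
	content := []byte("checkpoint bytes")
	fmt.Println(saveMeta(kv, "ps_checkpoint_0", content))
	fmt.Println(verify(kv, "ps_checkpoint_0", content))
	fmt.Println(verify(kv, "ps_checkpoint_0", []byte("corrupted")))
}
```

Because `saveMeta` and `verify` accept the interface, the same code runs against etcd in production and against a map in unit tests, which is how the new test file in this diff uses `myKV`.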

@ -0,0 +1,86 @@
package pserver
import (
const testDir = "./test_data"
type myKV struct {
m map[string][]byte
func (m *myKV) GetKey(key string, timeout time.Duration) ([]byte, error) {
if m.m == nil {
m.m = make(map[string][]byte)
return m.m[key], nil
func (m *myKV) PutKey(key string, value []byte, timeout time.Duration, withLease bool) error {
if m.m == nil {
m.m = make(map[string][]byte)
m.m[key] = value
return nil
func TestCheckpoint(t *testing.T) {
kv := &myKV{}
s, err := NewService(0, time.Hour, testDir, kv, nil)
assert.Nil(t, err)
err = s.checkpoint()
assert.Nil(t, err)
_, err = LoadCheckpoint(kv, 0)
assert.Nil(t, err)
func float32ToByte(f float32) []byte {
var buf bytes.Buffer
err := binary.Write(&buf, binary.LittleEndian, f)
if err != nil {
fmt.Println("binary.Write failed:", err)
return buf.Bytes()
func TestCheckpointWithData(t *testing.T) {
kv := &myKV{}
s, err := NewService(0, time.Hour, testDir, kv, nil)
assert.Nil(t, err)
var content []byte
for i := 0; i < 50000; i++ {
content = append(content, float32ToByte(float32(i))...)
p1 := Parameter{Name: "p1", ElementType: 1, Content: content}
err = s.InitParam(ParameterWithConfig{Param: p1}, nil)
assert.Nil(t, err)
err = s.FinishInitParams(0, nil)
assert.Nil(t, err)
var p2 Parameter
err = s.GetParam(p1.Name, &p2)
assert.Nil(t, err)
assert.Equal(t, p1, p2)
err = s.checkpoint()
assert.Nil(t, err)
cp, err := LoadCheckpoint(kv, 0)
assert.Nil(t, err)
s1, err := NewService(0, time.Hour, testDir, kv, cp)
assert.Nil(t, err)
var p3 Parameter
err = s1.GetParam(p1.Name, &p3)
assert.Nil(t, err)
assert.Equal(t, p1, p3)

@ -15,6 +15,7 @@
package pserver_test
import (
@ -179,6 +180,32 @@ func TestBlockUntilInitialized(t *testing.T) {
func TestCheckpointSpeed(t *testing.T) {
//TODO(zhihong): test speed
func TestGradientString(t *testing.T) {
g := pserver.Parameter{}
g.ElementType = pserver.Float32
g.Content = []byte{0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40, 0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40}
if g.String() != "[3.3702806e+12 2.142699 3.3702806e+12 2.142699]" {
t.Fatal("get float data error!")
g.Content = []byte{0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40,
0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40}
if g.String() != "[3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699...3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699 3.3702806e+12 2.142699]" {
t.Fatal("get float data error!", g.String())
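
The test above relies on two behaviours: raw bytes are decoded as little-endian `float32` values, and long contents are printed as a head and a tail joined by `...`. The sketch below illustrates both in isolation. It is an independent illustration, not the `float32ByteToString` from the diff: `decodeFloat32` and `summarize` are hypothetical names, and `summarize` counts elements where the original slices bytes.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
	"strings"
)

// decodeFloat32 interprets b as consecutive little-endian float32 values.
// Trailing bytes that do not fill a whole float32 are ignored.
func decodeFloat32(b []byte) []float32 {
	f := make([]float32, len(b)/4)
	for i := range f {
		f[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[4*i:]))
	}
	return f
}

// summarize prints f in full when it has at most 2*n elements; otherwise it
// prints the first n and last n elements joined by "...".
func summarize(f []float32, n int) string {
	if len(f) <= 2*n {
		return fmt.Sprint(f)
	}
	head := strings.TrimSuffix(fmt.Sprint(f[:n]), "]")
	tail := strings.TrimPrefix(fmt.Sprint(f[len(f)-n:]), "[")
	return head + "..." + tail
}

func main() {
	// The same 8-byte pattern used by TestGradientString.
	raw := []byte{0x18, 0x2d, 0x44, 0x54, 0xfb, 0x21, 0x09, 0x40}
	fmt.Println(decodeFloat32(raw))
	fmt.Println(summarize([]float32{1, 2, 3, 4, 5, 6}, 2))
}
```

The 8-byte pattern is the little-endian `float64` encoding of pi; read as two `float32` values it yields the unrelated-looking numbers the test expects.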

@ -64,12 +64,18 @@ paddle_error paddle_gradient_machine_create_for_inference_with_parameters(
modelConfigProtobuf.resize(modelConfigSize);[0], modelConfigSize);
paddle::TrainerConfig config;
paddle::ModelConfig modelConfig;
if (!config.ParseFromString(modelConfigProtobuf) || !config.IsInitialized()) {
if (!modelConfig.ParseFromString(modelConfigProtobuf) ||
!modelConfig.IsInitialized()) {
} else {
modelConfig = config.model_config();
auto ptr = new paddle::capi::CGradientMachine();
config.model_config(), CREATE_MODE_TESTING, {paddle::PARAMETER_VALUE}));
std::vector<paddle::ParameterPtr>& parameters = ptr->machine->getParameters();
for (auto& para : parameters) {

@ -1,6 +1,5 @@
# ddim lib
proto_library(framework_proto SRCS framework.proto)
proto_library(saver_proto SRCS framework.proto saver.proto)
cc_library(ddim SRCS DEPS eigen3)
cc_test(ddim_test SRCS DEPS ddim)
@ -10,13 +9,13 @@ cc_library(tensor SRCS DEPS ddim place paddle_memory device_context)
cc_test(tensor_test SRCS DEPS tensor)
cc_test(eigen_test SRCS DEPS tensor)
cc_library(lod_tensor SRCS DEPS ddim place tensor saver_proto framework_proto)
cc_library(lod_tensor SRCS DEPS ddim place tensor framework_proto)
cc_test(lod_tensor_test SRCS DEPS lod_tensor paddle_memory)
nv_test(lod_tensor_gpu_test SRCS DEPS lod_tensor)
cc_test(variable_test SRCS
cc_library(scope SRCS
cc_library(scope SRCS DEPS glog)
cc_test(scope_test SRCS DEPS scope)
@ -25,9 +24,10 @@ cc_test(program_desc_test SRCS DEPS proto_desc)
cc_library(op_proto_maker SRCS DEPS framework_proto attribute)
cc_test(op_proto_maker_test SRCS DEPS op_proto_maker)
cc_library(op_info SRCS DEPS attribute framework_proto)
cc_library(operator SRCS DEPS op_info device_context tensor scope glog)
cc_library(shape_inference SRCS DEPS ddim attribute)
cc_library(operator SRCS DEPS op_info device_context tensor scope glog shape_inference)
cc_test(operator_test SRCS DEPS operator op_registry)
cc_library(proto_desc SRCS DEPS attribute ddim op_info operator)
cc_library(proto_desc SRCS DEPS shape_inference op_info operator glog)
cc_library(op_registry SRCS DEPS op_proto_maker op_info operator glog proto_desc)
cc_test(op_registry_test SRCS DEPS op_registry)
@ -43,7 +43,7 @@ add_custom_command(TARGET framework_py_proto POST_BUILD
cc_library(backward SRCS DEPS net_op)
cc_test(backward_test SRCS DEPS backward recurrent_op device_context)
cc_test(backward_test SRCS DEPS backward recurrent_op device_context fill_constant_op)
cc_library(executor SRCS DEPS op_registry device_context scope framework_proto backward glog)

@ -315,6 +315,7 @@ static void CreateGradVarInBlock(
return false; /* not break */
if (need_infer_shape) {
@ -452,11 +453,16 @@ ParamGradInfoMap AppendBackward(
std::transform(target_shape_desc.begin(), target_shape_desc.end(),
[](int64_t dim) { return static_cast<int>(dim); });
VLOG(3) << "backward from loss=" << target.Name()
<< " data_type=" << target.GetDataType();
std::unique_ptr<OpDescBind> fill_one_op(
new OpDescBind("fill_constant", {}, {{"Out", {fill_one_op_out}}},
{{"shape", target_shape},
{"value", static_cast<float>(1.0)},
{"data_type", framework::DataType::FP32}}));
{"data_type", target.GetDataType()}}));
// infer var type of fill_one_op
size_t forward_op_num = root_block->OpSize();
size_t forward_block_num = program_desc.Size();
@ -475,8 +481,7 @@ ParamGradInfoMap AppendBackward(
std::unordered_map<std::string, GradVarInfo> retv;
auto var = root_block->Var(fill_one_op_out);
// FIXME(qiao) infer the data type
auto& target_grad = retv[target.Name()];
target_grad.name_ = fill_one_op_out;

@ -21,6 +21,8 @@
#include "paddle/framework/var_desc.h"
#include "paddle/operators/net_op.h"
namespace paddle {
namespace framework {

