From d7bde1ff2d46452b012698fb362e6cc187e23d57 Mon Sep 17 00:00:00 2001 From: gongweibao Date: Thu, 11 May 2017 17:50:57 +0800 Subject: [PATCH 01/39] first time to add --- doc/howto/dev/introduction_to_pr.md | 122 ++++++++++++++++++++++++++++ 1 file changed, 122 insertions(+) create mode 100644 doc/howto/dev/introduction_to_pr.md diff --git a/doc/howto/dev/introduction_to_pr.md b/doc/howto/dev/introduction_to_pr.md new file mode 100644 index 0000000000..39d8101c66 --- /dev/null +++ b/doc/howto/dev/introduction_to_pr.md @@ -0,0 +1,122 @@ +# Desgin Doc的一点总结 +## 前言 +故事还要从前两天我提交的[FileManager](https://github.com/PaddlePaddle/Paddle/pull/2013)的Design Doc PR开始。 + +我肯定是用心写了,我也清楚地记得,提交之前也仔细的检查了(这点自觉性还是有的),结果,我突然发现,我的PR被Comments刷屏了;这还是其次,更要命的是,当网络速度稍慢的时候会提示我:Comments 太多啦!页面打不开!!哦。简直有点人间惨剧的味道!我要把这些和之前的Comments以及一些典型的Comments总结一下,避免同样的问题,同时也希望对您有用。 + +我觉得里边有几个基本的原则: + +- 做事情要精益求精 + 这个是做事情的基础,也是我们讨论的基础。美,多是相似的,丑才是千奇百怪。 + + 精益求精是一种态度,态度比什么都重要。[yi](#yi) +- 节约别人的时间 + 同时也是给自己节约时间 + +- 有礼貌,有感恩的心 + > 我发现诸多之前提了comment,但是没有改,也没有回复。这个不太好吧。[ying](#ying) + > 每个comment都必须回复。这是开源社区的基本礼貌。别人帮了忙,应该说谢谢。[yi](#yi) + +我的理解,写Doc精益求精要从基础开始,首先要规范;其次要有提纲挈领的东西,让人一眼基本明白要做的事情;然后,讲述事情结构要清晰,要分层,用最小化涉及的原则,该讲的要解释,不该讲的不讲;接下来逻辑要清晰,要流畅,不要太突兀。。。当然还有很多,比如文笔要好!:) + +锻炼一下,中国人办事也不一定那么糙的,对吧! :) + +## 基础规范 + +- 语言选择:中文 or 英文? + 最好用英文,因为外国同学无法阅读中文。但是,语言是用来交流的,如果您认为英文无法完善的表达自己的意思的时候,还是用中文吧! + + 重要的是,不管哪种文,***英文的拼写要对,中文要没有错别字***! + + 如果文章用的是中文,那么请用中文的标点符号。反之,用英文的标点符号。***而且要保持统一、而且要保持完整***。[wuyi](#wuyi) + + 例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r114951817) + + 还有这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115093563) + + (请原谅我多加了一个Here做锚文本,Sphinx有一个bug,用中文的会报错。下边的处理类似。) + +- 缩写的全称 + 一个缩写第一次出现的时候,要把他们的未缩写形式写出来。如果该单词代表的概念比较复杂,请在术语解释中把他写清楚! + + 例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115093329) + +- 英语写法错误 + [yi](#yi)总结了一下在code review的时候经常见的一些英语写法错误: Python写成python,TensorFlow写成Tensorflow,Travis CI 写成 Travis-CI,ReLU写成Relu,unit test写成unittest,MD5写成md5。 + + 大小写简单的规则如下: + - 英文的缩写是要用大写的。 + 例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115091985) + - 英文的句子的首字母是要大写的。 + - 一个专有的名词第一个字母一般是要大写的。 + + yiwang推荐了一个工具:[grammer](https://www.grammarly.com/),可以作为插件安装到Chrome中,自动检查拼写和语法问题。 + +- 不要提交没有用的文件 + 例如这个 [Here](https://github.com/PaddlePaddle/Paddle/pull/1964#discussion_r114414822) + +- 提交的代码要经过验证 + 提交的代码都不能通过编译是几个意思? 
+ +- 参考资料要设置超链,链接到对方 + 不设置的话俗称Copy。可以用`[name](#name)`设置MarkDown文件中的引用。一如此文中很多地方出现的那样。 + +- 给别人的东西步骤是要可以执行的 + 看这个[Here](https://github.com/wangkuiyi/ipynb)是如何写步骤的,或者这个[Here](https://github.com/PaddlePaddle/Paddle/pull/1602#issuecomment-285964510)(虽然只是一个Comment)。 + +- 链接可以直接跳到精确的位置 + 如果能一步到位就不要让别人走两步,节约大家彼此的时间。 + +## 提纲挈领 +### 架构图 +一个系统或者模块的设计文档,首先要有架构图。***要有架构图,要有架构图,要有架构图***。。。。这句话可以多重复几遍。俗称,无图无真相或者一图胜千言。最起码有一个架构图说明各个模块的关系。图的表现能力远远超过了文字,特别是模块关系相关的和流程相关的场景下。 + +可以看一下这个文档中的图 [link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train) + +顺便提一句,里边的图都是用[OmniGraffle](https://www.omnigroup.com/omnigraffle)画的,就是贵了点:$99。“99你买了不吃亏,99你买了不上当。”[wuyi](#wuyi) + +### 层次结构清晰 +代码层面上,我们为了开发大的项目,很自然的把代码分模块、分文件。文档也是如此。每个文档或者章节说明他应该要想说的事情,尽量的减少涉及的范围。涉及的范围越少,需要阐述、解释的东西就越少,理解起来需要的背景知识就越少,就越好理解,写的人出错的概率也越少。。。 + +- 整体分层 + - 系统设计 + - 模块设计 + - 接口设计 + - 部署 + +相互之间需要尽量少的“越权”。例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115147388) + +另外一个容易忽视的问题是,文档内部的层次结构。MarkDown文件不想Doc一样可以设置目录页,一个部分如果太多,就会让看得人失去层次感。以这个文档为例 [link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train)。这个文档我也写过,只是把`Fault Recovery`的部分和前边正常的`Training Job`合到一起去了,结果发现越写越乱,后来看到Helin写的分层之后的文档,感觉流畅多了。 + +## 逻辑要清晰、流畅 +- 概念的出现不要突兀。 + 文档的最前边有一个术语的解释,把文档中提到的概念先解释一下,这样,别人在看到那些概念的时候不会觉得很突兀。同时,前后要呼应。 + + 例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r114952115) + + 如果概念或者名词后边没有出现,该删除还是删除了吧! + +- 多种方案的选择要简述原因。 + 例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115147115) + +- Design doc中不应该有“?” [wuyi](#wuyi) + 应该都是陈述句的描述。有不确定的问题可以提Issue来讨论获得结论。 + 对于自己不确定的地方,与其含混而过不如找人讨论先搞一个靠谱的可以讨论的。绝大多数情况下,这种含混而过的都会被Reivew给纠出来。即便就不出来,岂不是自己给自己埋一个坑? + +- 文档当中不要黏贴大量的代码 + 代码一般都是细节,改变的非常的快,文档可能很快就失效了,需要重新的修正。另外,最主要的是大段的代码会让人的思路中断,陷于实现的细节当中。 + +- 不准备实现的就不要写了 + 最多放到放到`Future`中展望一下。 + +## 文笔要好 +啊呀,不想当作家的程序员不是好程序员。这个当然比较难,要看大家的“学好数理化,走遍天下都不怕”的基本功的“深厚”程度了:) + +推荐一下公众号:老万故事会[link](https://freewechat.com/profile/MzI1MDQ3NTAxOQ==),一个文章和代码写的一样好的人[yi](#yi)。 + +## 参考资料 +- WangYi +- WuYi +- Helin +- YanXu +- CaoYing From f23ee80daa682833604bbc22874aa25c1d50bfdd Mon Sep 17 00:00:00 2001 From: gongweibao Date: Thu, 11 May 2017 17:59:44 +0800 Subject: [PATCH 02/39] Fix bugs --- doc/howto/dev/introduction_to_pr.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/doc/howto/dev/introduction_to_pr.md b/doc/howto/dev/introduction_to_pr.md index 39d8101c66..663e81eeb9 100644 --- a/doc/howto/dev/introduction_to_pr.md +++ b/doc/howto/dev/introduction_to_pr.md @@ -2,7 +2,7 @@ ## 前言 故事还要从前两天我提交的[FileManager](https://github.com/PaddlePaddle/Paddle/pull/2013)的Design Doc PR开始。 -我肯定是用心写了,我也清楚地记得,提交之前也仔细的检查了(这点自觉性还是有的),结果,我突然发现,我的PR被Comments刷屏了;这还是其次,更要命的是,当网络速度稍慢的时候会提示我:Comments 太多啦!页面打不开!!哦。简直有点人间惨剧的味道!我要把这些和之前的Comments以及一些典型的Comments总结一下,避免同样的问题,同时也希望对您有用。 +这个文档我肯定是用心写了,我也清楚地记得,提交之前也仔细的检查了(这点自觉性还是有的),结果,我突然发现,我的PR被Comments刷屏了;这还是其次,更要命的是,当网络速度稍慢的时候会提示我:Comments 太多啦!页面打不开!!哦。简直有点人间惨剧的味道!我要把这些和之前的Comments以及一些典型的Comments总结一下,避免同样的问题,同时也希望对您有用。 我觉得里边有几个基本的原则: @@ -103,18 +103,18 @@ 应该都是陈述句的描述。有不确定的问题可以提Issue来讨论获得结论。 对于自己不确定的地方,与其含混而过不如找人讨论先搞一个靠谱的可以讨论的。绝大多数情况下,这种含混而过的都会被Reivew给纠出来。即便就不出来,岂不是自己给自己埋一个坑? 
-- 文档当中不要黏贴大量的代码 +- 文档当中不要黏贴大量的代码 代码一般都是细节,改变的非常的快,文档可能很快就失效了,需要重新的修正。另外,最主要的是大段的代码会让人的思路中断,陷于实现的细节当中。 -- 不准备实现的就不要写了 +- 不准备实现的就不要写了 最多放到放到`Future`中展望一下。 ## 文笔要好 啊呀,不想当作家的程序员不是好程序员。这个当然比较难,要看大家的“学好数理化,走遍天下都不怕”的基本功的“深厚”程度了:) -推荐一下公众号:老万故事会[link](https://freewechat.com/profile/MzI1MDQ3NTAxOQ==),一个文章和代码写的一样好的人[yi](#yi)。 +顺便推荐一下公众号:老万故事会[link](https://freewechat.com/profile/MzI1MDQ3NTAxOQ==),一个文章和代码写的一样好的人[yi](#yi)。 -## 参考资料 +## 参考 - WangYi - WuYi - Helin From e3a37a7913b7bd2a2f2612105e885334e74743b7 Mon Sep 17 00:00:00 2001 From: gongweibao Date: Thu, 11 May 2017 18:09:16 +0800 Subject: [PATCH 03/39] Fix bugs --- doc/howto/dev/introduction_to_pr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/howto/dev/introduction_to_pr.md b/doc/howto/dev/introduction_to_pr.md index 663e81eeb9..bb9ff53322 100644 --- a/doc/howto/dev/introduction_to_pr.md +++ b/doc/howto/dev/introduction_to_pr.md @@ -56,7 +56,7 @@ 例如这个 [Here](https://github.com/PaddlePaddle/Paddle/pull/1964#discussion_r114414822) - 提交的代码要经过验证 - 提交的代码都不能通过编译是几个意思? + 提交的代码都没有验证过是几个意思? - 参考资料要设置超链,链接到对方 不设置的话俗称Copy。可以用`[name](#name)`设置MarkDown文件中的引用。一如此文中很多地方出现的那样。 From c0794374d27267bf1af60980f311aeea1acfbf45 Mon Sep 17 00:00:00 2001 From: gongweibao Date: Fri, 12 May 2017 09:32:11 +0800 Subject: [PATCH 04/39] fix bugs and add sections --- doc/howto/dev/introduction_to_pr.md | 43 ++++++++++++++++++++--------- 1 file changed, 30 insertions(+), 13 deletions(-) diff --git a/doc/howto/dev/introduction_to_pr.md b/doc/howto/dev/introduction_to_pr.md index bb9ff53322..71cbbdf861 100644 --- a/doc/howto/dev/introduction_to_pr.md +++ b/doc/howto/dev/introduction_to_pr.md @@ -2,8 +2,8 @@ ## 前言 故事还要从前两天我提交的[FileManager](https://github.com/PaddlePaddle/Paddle/pull/2013)的Design Doc PR开始。 -这个文档我肯定是用心写了,我也清楚地记得,提交之前也仔细的检查了(这点自觉性还是有的),结果,我突然发现,我的PR被Comments刷屏了;这还是其次,更要命的是,当网络速度稍慢的时候会提示我:Comments 太多啦!页面打不开!!哦。简直有点人间惨剧的味道!我要把这些和之前的Comments以及一些典型的Comments总结一下,避免同样的问题,同时也希望对您有用。 - +[FileManager](https://github.com/PaddlePaddle/Paddle/pull/2013)这个文档我肯定是用心写了,我也清楚地记得,提交之前也仔细的检查了(这点自觉性还是有的),结果,我突然发现,我的PR被Comments刷屏了;这还是其次,更要命的是,当网络速度稍慢的时候会提示我:Comments 太多啦!页面打不开!!哦。简直有点人间惨剧的味道!我要把这些和之前的Comments以及一些典型的Comments总结一下,避免同样的问题,同时也希望对您有用。 +link 我觉得里边有几个基本的原则: - 做事情要精益求精 @@ -30,30 +30,30 @@ 如果文章用的是中文,那么请用中文的标点符号。反之,用英文的标点符号。***而且要保持统一、而且要保持完整***。[wuyi](#wuyi) - 例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r114951817) + 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r114951817) - 还有这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115093563) + 还有这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115093563) - (请原谅我多加了一个Here做锚文本,Sphinx有一个bug,用中文的会报错。下边的处理类似。) + (请原谅我多加了一个link做锚文本,Sphinx有一个bug:用中文的会报错。下边的处理类似。) - 缩写的全称 一个缩写第一次出现的时候,要把他们的未缩写形式写出来。如果该单词代表的概念比较复杂,请在术语解释中把他写清楚! 
- 例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115093329) + 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115093329) - 英语写法错误 [yi](#yi)总结了一下在code review的时候经常见的一些英语写法错误: Python写成python,TensorFlow写成Tensorflow,Travis CI 写成 Travis-CI,ReLU写成Relu,unit test写成unittest,MD5写成md5。 大小写简单的规则如下: - 英文的缩写是要用大写的。 - 例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115091985) + 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115091985) - 英文的句子的首字母是要大写的。 - 一个专有的名词第一个字母一般是要大写的。 yiwang推荐了一个工具:[grammer](https://www.grammarly.com/),可以作为插件安装到Chrome中,自动检查拼写和语法问题。 - 不要提交没有用的文件 - 例如这个 [Here](https://github.com/PaddlePaddle/Paddle/pull/1964#discussion_r114414822) + 例如这个 [link](https://github.com/PaddlePaddle/Paddle/pull/1964#discussion_r114414822) - 提交的代码要经过验证 提交的代码都没有验证过是几个意思? @@ -62,12 +62,16 @@ 不设置的话俗称Copy。可以用`[name](#name)`设置MarkDown文件中的引用。一如此文中很多地方出现的那样。 - 给别人的东西步骤是要可以执行的 - 看这个[Here](https://github.com/wangkuiyi/ipynb)是如何写步骤的,或者这个[Here](https://github.com/PaddlePaddle/Paddle/pull/1602#issuecomment-285964510)(虽然只是一个Comment)。 + 看这个[link](https://github.com/wangkuiyi/ipynb)是如何写步骤的,或者这个[link](https://github.com/PaddlePaddle/Paddle/pull/1602#issuecomment-285964510)(虽然只是一个Comment)。 - 链接可以直接跳到精确的位置 如果能一步到位就不要让别人走两步,节约大家彼此的时间。 ## 提纲挈领 +### 目标要明确 +一般开头的一段话就要用简短的几句话把文档要描述的目标写清楚,别人看了之后,心中有了一个大概:这个文档准备讲什么。 + +例如这个[link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train#objective) ### 架构图 一个系统或者模块的设计文档,首先要有架构图。***要有架构图,要有架构图,要有架构图***。。。。这句话可以多重复几遍。俗称,无图无真相或者一图胜千言。最起码有一个架构图说明各个模块的关系。图的表现能力远远超过了文字,特别是模块关系相关的和流程相关的场景下。 @@ -84,20 +88,20 @@ - 接口设计 - 部署 -相互之间需要尽量少的“越权”。例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115147388) +相互之间需要尽量少的“越权”。例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115147388) -另外一个容易忽视的问题是,文档内部的层次结构。MarkDown文件不想Doc一样可以设置目录页,一个部分如果太多,就会让看得人失去层次感。以这个文档为例 [link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train)。这个文档我也写过,只是把`Fault Recovery`的部分和前边正常的`Training Job`合到一起去了,结果发现越写越乱,后来看到Helin写的分层之后的文档,感觉流畅多了。 +另外一个容易忽视的问题是,文档内部的层次结构。MarkDown文件不像Doc一样可以自动生成目录页,一个部分如果太多,就会让看得人失去层次感。以这个文档为例 [link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train)。这个文档我也写过,只是把`Fault Recovery`的部分和前边正常的`Training Job`合到一起去了,结果发现越写越乱,后来看到Helin写的分层之后的文档,感觉流畅多了。 ## 逻辑要清晰、流畅 - 概念的出现不要突兀。 文档的最前边有一个术语的解释,把文档中提到的概念先解释一下,这样,别人在看到那些概念的时候不会觉得很突兀。同时,前后要呼应。 - 例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r114952115) + 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r114952115) 如果概念或者名词后边没有出现,该删除还是删除了吧! 
- 多种方案的选择要简述原因。 - 例如这个[Here](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115147115) + 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115147115) - Design doc中不应该有“?” [wuyi](#wuyi) 应该都是陈述句的描述。有不确定的问题可以提Issue来讨论获得结论。 @@ -114,6 +118,19 @@ 顺便推荐一下公众号:老万故事会[link](https://freewechat.com/profile/MzI1MDQ3NTAxOQ==),一个文章和代码写的一样好的人[yi](#yi)。 +## 如何提高写文档的效率 +这段其实本来没想到加的,开完组会之后听老大讲提高工作效率的事情有点感想,就写了。 + +我不是很怀疑自己写代码的效率,但是严重怀疑自己写文档的效率。感觉写文档比写代码烧脑多了。现在想想,最主要的点在于文档的结构和分层问题。 + +提高效率最好的办法是什么?是确定范围,很多东西不用讲了,当然效率就提高上去了。 + +写系统设计文档,主要需要表现模块间的关系和通信,特别是特定场景下的他们是如何配合的。这个过程中需要把关键的概念说清楚,例如这个[link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train)中,`Trainjob`和`Fault Recovery`主要是讲模块间关系和配合的,[`task`](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train#task)作为一个关键的概念讲了。其他的部分,稍显细节的都可以放到模块设计中去讲。 + +另外一个,认真≠ 犹豫,也≠ 纠结。 + +如果感到了纠结,那说明没找到问题的根本。我在写文件上传的时候对做不做缓存优化纠结了很长时间,请教了Helin一会就讨论完毕了。如果感到了纠结,那是需要跟别人请教。不纠结的地方,下决断要果断。 + ## 参考 - WangYi - WuYi From cb6436b50ce40985609cf18ae81ef308c32c8602 Mon Sep 17 00:00:00 2001 From: dangqingqing Date: Thu, 1 Jun 2017 00:56:07 +0800 Subject: [PATCH 05/39] CPU implementation of row convolution --- paddle/function/RowConvOp.cpp | 172 ++++++++++++++++++++++++ paddle/function/RowConvOp.h | 42 ++++++ paddle/gserver/layers/RowConvLayer.cpp | 105 +++++++++++++++ paddle/gserver/layers/RowConvLayer.h | 46 +++++++ paddle/gserver/tests/test_LayerGrad.cpp | 20 +++ proto/ModelConfig.proto | 5 + 6 files changed, 390 insertions(+) create mode 100644 paddle/function/RowConvOp.cpp create mode 100644 paddle/function/RowConvOp.h create mode 100644 paddle/gserver/layers/RowConvLayer.cpp create mode 100644 paddle/gserver/layers/RowConvLayer.h diff --git a/paddle/function/RowConvOp.cpp b/paddle/function/RowConvOp.cpp new file mode 100644 index 0000000000..f92b286c69 --- /dev/null +++ b/paddle/function/RowConvOp.cpp @@ -0,0 +1,172 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
*/ + +#include "RowConvOp.h" +#include "paddle/math/Vector.h" + +namespace paddle { + +template <> +void RowConv(CpuMatrix& out, + const CpuMatrix& in, + const CpuMatrix& filter, + const CpuIVector& seq) { + const int* starts = seq.getData(); + const size_t numSeq = seq.getSize() - 1; + const size_t contextLength = filter.getHeight(); + for (size_t i = 0; i < numSeq; ++i) { + size_t begin = starts[i]; + size_t end = starts[i + 1]; + for (size_t j = begin; j < end; ++j) { + MatrixPtr x; + MatrixPtr w; + if ((j + contextLength) < end) { + x = (const_cast(in)).subMatrix(j, contextLength); + w = (const_cast(filter)).subMatrix(0, contextLength); + } else { + x = (const_cast(in)).subMatrix(j, end - j); + w = (const_cast(filter)).subMatrix(0, end - j); + } + MatrixPtr y = out.subMatrix(j, 1); + y->addDotMulVMM(*x, *w); + } + } +} + +template <> +void RowConvGrad(const CpuMatrix& outG, + const CpuMatrix& in, + const CpuMatrix& filter, + CpuMatrix& inG, + CpuMatrix& filterG, + const CpuIVector& seq) { + // gradient w.r.t filter + const int* starts = seq.getData(); + const size_t numSeq = seq.getSize() - 1; + const size_t contextLength = filter.getHeight(); + if (filterG) { + for (size_t i = 0; i < numSeq; ++i) { + size_t begin = starts[i]; + size_t end = starts[i + 1]; + size_t steps = end - begin; + for (size_t j = 0; j < contextLength; ++j) { + MatrixPtr x = + (const_cast(in)).subMatrix(begin + j, steps - j); + MatrixPtr dy = + (const_cast(outG)).subMatrix(begin, steps - j); + MatrixPtr dw = filterG.subMatrix(j, 1); + dw->addDotMulVMM(*dy, *x); + } + } + } + + // gradient w.r.t input feature + if (inG) { + for (size_t i = 0; i < numSeq; ++i) { + size_t begin = starts[i]; + size_t end = starts[i + 1]; + size_t steps = end - begin; + for (size_t j = 0; j < steps; ++j) { + MatrixPtr dx = inG.subMatrix(begin + j, 1); + for (size_t t = 0; t < contextLength; ++t) { + if ((int(j) - int(t)) >= 0) { + MatrixPtr dy = + (const_cast(outG)).subMatrix(begin + j - t, 1); + MatrixPtr w = (const_cast(filter)).subMatrix(t, 1); + dx->addDotMul(*dy, *w, 1.0, 1.0); + } + } + } + } + } +} + +/** + * \brief TODO(qingqing) + * + */ + +template +class RowConvFunc : public FunctionBase { +public: + void init(const FuncConfig& config) override {} + + void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { + // check + CHECK_EQ(2UL, inputs.size()); + CHECK_EQ(1UL, outputs.size()); + CHECK_EQ(outputs[0].getArgType(), ADD_TO); + CHECK(inputs[0].isSequenceArg() && outputs[0].isSequenceArg()) + << "SequenceArg required here."; + const auto in = dynamic_cast(inputs[0]); + auto out = dynamic_cast(outputs[0]); + auto w = inputs[1]; + CHECK(in.data() && out.data() && in.getSequenceId().data()); + CHECK_EQ(in.shape().ndims(), 2UL); + CHECK_EQ(out.shape().ndims(), 2UL); + CHECK_EQ(in.shape()[1], out.shape()[1]); + CHECK_EQ(in.shape()[0], out.shape()[0]); + CHECK_EQ(w.shape()[1], in.shape()[1]); + + auto outMat = out.matrix(); + const auto inMat = in.matrix(); + const auto wMat = w.matrix(); + const auto seqId = in.getSequenceId().vector(); + + RowConv(outMat, inMat, wMat, seqId); + } +}; + +/** + * \brief The backward propagation of padding Function. Remove the elements + * in the padding positions of forward. 
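+ * (In this Function the gradients of the row convolution are computed: the
+ * gradient with respect to the filter and the gradient with respect to the
+ * input activations.)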
+ * + * Argument in this Function: + */ + +template +class RowConvGradFunc : public FunctionBase { +public: + void init(const FuncConfig& config) override {} + + void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { + const auto outGrad = dynamic_cast(inputs[0]); + const auto in = dynamic_cast(inputs[1]); + const auto w = inputs[2]; + auto inGrad = dynamic_cast(outputs[0]); + auto wGrad = outputs[1]; + + const auto outGMat = outGrad.matrix(); + const auto inMat = in.matrix(); + const auto wMat = w.matrix(); + auto inGMat = inGrad.data() + ? inGrad.matrix() + : typename Tensor::Matrix(nullptr, 0, 0); + auto wGMat = wGrad.data() + ? wGrad.matrix() + : typename Tensor::Matrix(nullptr, 0, 0); + const auto seqId = in.getSequenceId().vector(); + + RowConvGrad(outGMat, inMat, wMat, inGMat, wGMat, seqId); + } +}; + +REGISTER_TYPED_FUNC(RowConv, CPU, RowConvFunc); +REGISTER_TYPED_FUNC(RowConvGrad, CPU, RowConvGradFunc); +#ifndef PADDLE_ONLY_CPU +REGISTER_TYPED_FUNC(RowConv, GPU, RowConvFunc); +REGISTER_TYPED_FUNC(RowConvGrad, GPU, PadGradFunc); +#endif + +} // namespace paddle diff --git a/paddle/function/RowConvOp.h b/paddle/function/RowConvOp.h new file mode 100644 index 0000000000..cd78ec724a --- /dev/null +++ b/paddle/function/RowConvOp.h @@ -0,0 +1,42 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. */ + +#pragma once + +#include "Function.h" + +namespace paddle { + +/** + * \brief TODO(qingqing) + * + */ +template +void RowConv(typename Tensor::Matrix& out, + const typename Tensor::Matrix& in, + const typename Tensor::Matrix& filter, + const typename Tensor::Vector& seq); + +/** + * \brief TODO(qingqing) + * + */ +template +void RowConvGrad(const typename Tensor::Matrix& outG, + const typename Tensor::Matrix& in, + const typename Tensor::Matrix& filter, + typename Tensor::Matrix& inG, + typename Tensor::Matrix& filterG, + const typename Tensor::Vector& seq); +} // namespace paddle diff --git a/paddle/gserver/layers/RowConvLayer.cpp b/paddle/gserver/layers/RowConvLayer.cpp new file mode 100644 index 0000000000..d4b1406297 --- /dev/null +++ b/paddle/gserver/layers/RowConvLayer.cpp @@ -0,0 +1,105 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
*/ + +#include "RowConvLayer.h" +#include "paddle/utils/Stat.h" + +namespace paddle { + +REGISTER_LAYER(row_conv, RowConvLayer); + +bool RowConvLayer::init(const LayerMap& layerMap, + const ParameterMap& parameterMap) { + /* Initialize the basic parent class */ + Layer::init(layerMap, parameterMap); + + contexLength_ = config_.inputs(0).row_conv_conf().context_length(); + + CHECK_EQ(inputLayers_.size(), 1UL); + weight_.reset(new Weight(contexLength_, getSize(), parameters_[0])); + createFunction(forward_, "RowConv", FuncConfig()); + createFunction(backward_, "RowConvGrad", FuncConfig()); + + return true; +} + +void RowConvLayer::forward(PassType passType) { + Layer::forward(passType); + MatrixPtr input = getInputValue(0); + size_t height = input->getHeight(); + size_t width = input->getWidth(); + CHECK_EQ(width, getSize()); + resetOutput(height, width); + + const auto startPos = getInput(0).sequenceStartPositions->getVector(useGpu_); + wDims_ = TensorShape({contexLength_, width}); + + MatrixPtr outV = getOutputValue(); + BufferArgs inputs; + BufferArgs outputs; + inputs.addArg(*getInputValue(0), *startPos); + inputs.addArg(*weight_->getW(), wDims_); + outputs.addArg(*getOutputValue(), *startPos, ADD_TO); + + { + REGISTER_TIMER_INFO("RowConvForward", getName().c_str()); + forward_[0]->calc(inputs, outputs); + } + + /* activation */ { + REGISTER_TIMER_INFO("FwAtvTimer", getName().c_str()); + forwardActivation(); + } +} + +void RowConvLayer::backward(const UpdateCallback& callback) { + /* Do derivation */ { + REGISTER_TIMER_INFO("BpAvtTimer", getName().c_str()); + backwardActivation(); + } + + const auto startPos = getInput(0).sequenceStartPositions->getVector(useGpu_); + + BufferArgs inputs; + BufferArgs outputs; + inputs.addArg(*getOutputGrad(), *startPos); + inputs.addArg(*getInputValue(0), *startPos); + inputs.addArg(*weight_->getW(), *startPos); + + MatrixPtr inGrad = getInputGrad(0); + MatrixPtr wGrad = weight_->getWGrad(); + size_t h = getInputValue(0)->getHeight(); + size_t w = getInputValue(0)->getWidth(); + outputs.addArg( + inGrad ? (*inGrad) : *(Matrix::create(nullptr, h, w, false, useGpu_)), + *startPos, + ADD_TO); + outputs.addArg( + wGrad ? (*wGrad) + : *(Matrix::create(nullptr, contexLength_, w, false, useGpu_)), + wDims_, + ADD_TO); + + { + REGISTER_TIMER_INFO("RowConvBackward", getName().c_str()); + backward_[0]->calc(inputs, outputs); + } + + { + REGISTER_TIMER_INFO("WeightUpdate", getName().c_str()); + weight_->getParameterPtr()->incUpdate(callback); + } +} + +} // namespace paddle diff --git a/paddle/gserver/layers/RowConvLayer.h b/paddle/gserver/layers/RowConvLayer.h new file mode 100644 index 0000000000..05be6ca6a9 --- /dev/null +++ b/paddle/gserver/layers/RowConvLayer.h @@ -0,0 +1,46 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. */ + +#pragma once + +#include "Layer.h" + +namespace paddle { + +/** + * \brief Row Convolution Layer. 
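+ *
+ * The layer performs a lookahead (row) convolution over the input sequence.
+ * With a weight matrix W of shape context_length x layer_size, the output at
+ * time step t is computed per feature dimension i as
+ *   out(t, i) = sum_{j=0}^{context_length - 1} W(j, i) * in(t + j, i),
+ * where terms that reach past the end of the sequence are dropped.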
+ */ +class RowConvLayer : public Layer { +public: + explicit RowConvLayer(const LayerConfig& config) : Layer(config) {} + + ~RowConvLayer() {} + + bool init(const LayerMap& layerMap, + const ParameterMap& parameterMap) override; + void forward(PassType passType) override; + void backward(const UpdateCallback& callback = nullptr) override; + +protected: + // Row convolution weight, context_lenght_ * fan_out. + // fan_out is the size of output feature. + std::unique_ptr weight_; + + // std::unique_ptr biases_; + + // how many steps to look ahead + size_t contexLength_; + TensorShape wDims_; +}; +} // namespace paddle diff --git a/paddle/gserver/tests/test_LayerGrad.cpp b/paddle/gserver/tests/test_LayerGrad.cpp index e1e8e7fae7..6adffcf53b 100644 --- a/paddle/gserver/tests/test_LayerGrad.cpp +++ b/paddle/gserver/tests/test_LayerGrad.cpp @@ -1705,6 +1705,26 @@ TEST(Layer, TransLayer) { } } +TEST(Layer, RowConvLayer) { + const int context = 3; + const int size = 512; + + TestConfig config; + config.layerConfig.set_type("row_conv"); + config.layerConfig.set_size(size); + config.layerConfig.set_active_type("sigmoid"); + + config.inputDefs.push_back( + {INPUT_SEQUENCE_DATA, "layer_0", size, context * size}); + LayerInputConfig* input = config.layerConfig.add_inputs(); + RowConvConfig* conv = input->mutable_row_conv_conf(); + conv->set_context_length(context); + + for (auto useGpu : {false, true}) { + testLayerGrad(config, "row_conv", 100, false, useGpu, false); + } +} + int main(int argc, char** argv) { testing::InitGoogleTest(&argc, argv); initMain(argc, argv); diff --git a/proto/ModelConfig.proto b/proto/ModelConfig.proto index 4f9b53d6f6..29270829bb 100644 --- a/proto/ModelConfig.proto +++ b/proto/ModelConfig.proto @@ -194,6 +194,10 @@ message MaxOutConfig { required uint32 groups = 2; } +message RowConvConfig { + required uint32 context_length = 1; +} + message ProjectionConfig { required string type = 1; required string name = 2; @@ -279,6 +283,7 @@ message LayerInputConfig { optional SppConfig spp_conf = 12; optional PriorBoxConfig priorbox_conf = 13; optional PadConfig pad_conf = 14; + optional RowConvConfig row_conv_conf = 15; } message LayerConfig { From b3ac51ff90093aaa1168a3f75b4e931c5b34eb9e Mon Sep 17 00:00:00 2001 From: dangqingqing Date: Sat, 3 Jun 2017 22:48:07 +0800 Subject: [PATCH 06/39] GPU implementation of row conv. --- paddle/function/CMakeLists.txt | 1 + paddle/function/RowConvOp.cpp | 37 +++- paddle/function/RowConvOpGpu.cu | 329 ++++++++++++++++++++++++++++++ paddle/function/RowConvOpTest.cpp | 69 +++++++ 4 files changed, 432 insertions(+), 4 deletions(-) create mode 100644 paddle/function/RowConvOpGpu.cu create mode 100644 paddle/function/RowConvOpTest.cpp diff --git a/paddle/function/CMakeLists.txt b/paddle/function/CMakeLists.txt index 233a53709a..1f54ac1231 100644 --- a/paddle/function/CMakeLists.txt +++ b/paddle/function/CMakeLists.txt @@ -28,6 +28,7 @@ if(WITH_TESTING) add_simple_unittest(PadOpTest) add_simple_unittest(MulOpTest) add_simple_unittest(CosSimOpTest) + add_simple_unittest(RowConvOpTest) endif() endif() diff --git a/paddle/function/RowConvOp.cpp b/paddle/function/RowConvOp.cpp index f92b286c69..24b7e3cdff 100644 --- a/paddle/function/RowConvOp.cpp +++ b/paddle/function/RowConvOp.cpp @@ -13,6 +13,7 @@ See the License for the specific language governing permissions and limitations under the License. 
*/ #include "RowConvOp.h" +#include #include "paddle/math/Vector.h" namespace paddle { @@ -127,10 +128,8 @@ public: RowConv(outMat, inMat, wMat, seqId); } }; - /** - * \brief The backward propagation of padding Function. Remove the elements - * in the padding positions of forward. + * \brief TODO(qingqing) * * Argument in this Function: */ @@ -158,7 +157,37 @@ public: : typename Tensor::Matrix(nullptr, 0, 0); const auto seqId = in.getSequenceId().vector(); + std::cout << "in:" << std::endl; + for (int i = 0; i < inMat.getHeight(); ++i) { + for (int j = 0; j < inMat.getWidth(); ++j) { + std::cout << outGMat.getElement(i, j) << " "; + } + std::cout << std::endl; + } + + std::cout << "w:" << std::endl; + for (int i = 0; i < wMat.getHeight(); ++i) { + for (int j = 0; j < wMat.getWidth(); ++j) { + std::cout << wMat.getElement(i, j) << " "; + } + std::cout << std::endl; + } + + std::cout << "w:" << std::endl; + for (int i = 0; i < seqId.getSize(); ++i) { + std::cout << seqId.getElement(i) << " "; + } + std::cout << std::endl; + RowConvGrad(outGMat, inMat, wMat, inGMat, wGMat, seqId); + + std::cout << std::endl << "out:" << std::endl; + for (int i = 0; i < inGMat.getHeight(); ++i) { + for (int j = 0; j < inGMat.getWidth(); ++j) { + std::cout << inGMat.getElement(i, j) << " "; + } + std::cout << std::endl; + } } }; @@ -166,7 +195,7 @@ REGISTER_TYPED_FUNC(RowConv, CPU, RowConvFunc); REGISTER_TYPED_FUNC(RowConvGrad, CPU, RowConvGradFunc); #ifndef PADDLE_ONLY_CPU REGISTER_TYPED_FUNC(RowConv, GPU, RowConvFunc); -REGISTER_TYPED_FUNC(RowConvGrad, GPU, PadGradFunc); +REGISTER_TYPED_FUNC(RowConvGrad, GPU, RowConvGradFunc); #endif } // namespace paddle diff --git a/paddle/function/RowConvOpGpu.cu b/paddle/function/RowConvOpGpu.cu new file mode 100644 index 0000000000..5b0e065a21 --- /dev/null +++ b/paddle/function/RowConvOpGpu.cu @@ -0,0 +1,329 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. */ + +#include "hl_base.h" +#include "RowConvOp.h" + +namespace paddle { + +template +__global__ void KeRowConv(real* y, const real* x, const real* w, + const int* starts, const int height, const int width, + const int numSeq, const int context) { + + const int tidx = threadIdx.x; + const int tidy = threadIdx.y; + const int blky = blockDim.y; + const int gidx = blockIdx.x * blockDim.x; + + __shared__ real sw[BLOCK_H][BLOCK_W]; + + for (int i = tidy; i < context; i += blky) { + sw[i][tidx] = gidx + tidx < width ? w[i*width + gidx + tidx] : 0.0; + } + + __syncthreads(); + + for (int i = 0; i < numSeq; ++i) { + const int start = starts[i]; + const int end = starts[i + 1]; + const int steps = end - start; + for (int j = tidy; j < steps; j += blky) { + real sum = 0; + int off = (start + j) * width; + for (int t = 0; t < context; ++t) { + if ((start + j + t) < end) { + int xoff = off + t * width; + real xVal = gidx + tidx < width ? 
x[xoff + gidx + tidx] : 0.0; + sum += sw[t][tidx] * xVal; + } + } + if (gidx + tidx < width) { + y[off + gidx + tidx] += sum; + } + } + } +} + +__global__ void KeRowConv2(real* y, const real* x, const real* w, + const int* starts, const int height, const int width, + const int numSeq, const int context) { + const int tidx = threadIdx.x; + const int tidy = threadIdx.y; + const int blky = blockDim.y; + const int gidx = blockIdx.x * blockDim.x; + + for (int i = 0; i < numSeq; ++i) { + const int start = starts[i]; + const int end = starts[i + 1]; + const int steps = end - start; + for (int j = tidy; j < steps; j += blky) { + int off = (start + j) * width; + real sum = 0; + for (int t = 0; t < context && (start + j + t) < end; ++t) { + int xoff = off + t * width; + real xd = gidx + tidx < width ? x[xoff + gidx + tidx] : 0.0; + real wd = gidx + tidx < width ? w[t * width + gidx + tidx] : 0.0; + sum += wd * xd; + } + if (gidx + tidx < width) { + y[off + gidx + tidx] += sum; + } + } + } +} + + + +template <> +void RowConv(GpuMatrix& out, + const GpuMatrix& in, + const GpuMatrix& filter, + const GpuIVector& seq) { + const size_t numSeq = seq.getSize() - 1; + const size_t contextLength = filter.getHeight(); + const size_t height = in.getHeight(); + const size_t width = in.getWidth(); + + LOG(INFO) << numSeq; + LOG(INFO) << contextLength; + LOG(INFO) << height; + LOG(INFO) << width; + + real* y = out.getData(); + const real* x = in.getData(); + const real* w = filter.getData(); + const int* starts = seq.getData(); + + dim3 dimBlock(32, 32); + dim3 dimGrid(DIVUP(width, dimBlock.x), 1); + LOG(INFO) << dimGrid.x; + + if (contextLength <= 32) { + KeRowConv<32, 32><<>> + (y, x, w, starts, height, width, numSeq, contextLength); + } else { + KeRowConv2<<>> + (y, x, w, starts, height, width, numSeq, contextLength); + } + CHECK_SYNC("RowConv"); +} + + +template +__global__ void KeRowConvBwWeight(real* dw, const real* x, const real* dy, + const int* starts, const int height, const int width, const int numSeq, + const int context) { + + const int tidx = threadIdx.x; + const int tidy = threadIdx.y; + const int blky = blockDim.y; + const int gidx = blockIdx.x * blockDim.x; + + __shared__ real sh_x[BLOCK_H][BLOCK_W]; + __shared__ real sh_dy[BLOCK_H][BLOCK_W]; + __shared__ real sh_dw[CONTEXT][BLOCK_W]; + + for (int t = tidy; t < context; t += blky) { + sh_dw[t][tidx] = 0.0; + } + __syncthreads(); + + for (int i = 0; i < numSeq; ++i) { + const int start = starts[i]; + const int end = starts[i + 1]; + const int steps = end - start; + for (int j = tidy; j < steps; j += BLOCK_H) { + int xoff = gidx + tidx; + int yoff = start + j; + + // transpose + sh_x[tidx][tidy] = xoff < width && yoff < end ? x[yoff * width + xoff] : 0.0; + sh_dy[tidx][tidy] = xoff < width && yoff < end ? dy[yoff * width + xoff] : 0.0; + __syncthreads(); + + for (int t = 0; t < context; t++) { + real val = tidx + t < blockDim.x ? sh_x[tidy][tidx + t] * sh_dy[tidy][tidx]: 0.0; + // warp size and blockDim.x is 32. 
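+        // The loop below performs a warp-level tree reduction with
+        // __shfl_down: after the offsets 16, 8, 4, 2 and 1, lane 0 of the
+        // warp holds the sum of `val` over all 32 lanes, and that partial
+        // filter gradient is accumulated into the shared tile sh_dw.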
+ for (int offset = 16; offset > 0; offset /= 2) { + val += __shfl_down(val, offset); + } + if (tidx == 0) { + sh_dw[t][tidy] += val; + } + __syncthreads(); + } + } + } + + for (int t = tidy; t < context && (gidx + tidx) < width; t += blky) { + dw[t * width + gidx + tidx] += sh_dw[t][tidx]; + } +} + +template +__global__ void KeRowConvBwWeight2(real* dw, const real* x, const real* dy, + const int* starts, const int height, const int width, const int numSeq, + const int context) { + + const int tidx = threadIdx.x; + const int tidy = threadIdx.y; + const int gidx = blockIdx.x * blockDim.x; + + __shared__ real sh_x[BLOCK_H][BLOCK_W]; + __shared__ real sh_dy[BLOCK_H][BLOCK_W]; + + for (int i = 0; i < numSeq; ++i) { + const int start = starts[i]; + const int end = starts[i + 1]; + const int steps = end - start; + for (int j = 0; j < steps; j += BLOCK_H) { + int xoff = gidx + tidx; + int yoff = start + j; + + // transpose + sh_x[tidx][tidy] = xoff < width && yoff < end ? x[yoff * width + xoff] : 0.0; + sh_dy[tidx][tidy] = xoff < width && yoff < end ? dy[yoff * width + xoff] : 0.0; + __syncthreads(); + + for (int t = 0; t < context; t++) { + real val = tidx + t < blockDim.x ? sh_x[tidy][tidx + t] * sh_dy[tidy][tidx]: 0.0; + // warp size and blockDim.x is 32. + for (int offset = 16; offset > 0; offset /= 2) { + val += __shfl_down(val, offset); + } + if (tidx == 0 && (gidx + tidy) < width) { + dw[t*width + gidx + tidy] += val; + } + } + } + } +} + +template +__global__ void KeRowConvBwData(real* dx, const real* w, const real* dy, + const int* starts, const int height, const int width, const int numSeq, + const int context) { + + const int tidx = threadIdx.x; + const int tidy = threadIdx.y; + const int blky = blockDim.y; + const int gidx = blockIdx.x * blockDim.x; + + __shared__ real sw[BLOCK_H][BLOCK_W]; + + for (int i = tidy; i < context; i += blky) { + sw[i][tidx] = gidx + tidx < width ? w[i*width + gidx + tidx] : 0.0; + } + + __syncthreads(); + + for (int i = 0; i < numSeq; ++i) { + const int start = starts[i]; + const int end = starts[i + 1]; + const int steps = end - start; + for (int j = tidy; j < steps; j += blky) { + real sum = 0; + int off = (start + j) * width; + for (int t = 0; t < context && (j - t) >= 0; ++t) { + int dyOff = off - t * width; + real dyVal = gidx + tidx < width ? dy[dyOff + gidx + tidx] : 0.0; + sum += sw[t][tidx] * dyVal; + } + if (gidx + tidx < width) { + dx[off + gidx + tidx] += sum; + } + } + } +} + +__global__ void KeRowConvBwData2(real* dx, const real* w, const real* dy, + const int* starts, const int height, const int width, const int numSeq, + const int context) { + + const int tidx = threadIdx.x; + const int tidy = threadIdx.y; + const int blky = blockDim.y; + const int gidx = blockIdx.x * blockDim.x; + + for (int i = 0; i < numSeq; ++i) { + const int start = starts[i]; + const int end = starts[i + 1]; + const int steps = end - start; + for (int j = tidy; j < steps; j += blky) { + real sum = 0; + int off = (start + j) * width; + for (int t = 0; t < context && (j - t) >= 0; ++t) { + int dyOff = off - t * width; + real dyVal = gidx + tidx < width ? dy[dyOff + gidx + tidx] : 0.0; + real wVal = gidx + tidx < width ? 
w[t * width + gidx + tidx] : 0.0; + sum += wVal * dyVal; + } + if (gidx + tidx < width) { + dx[off + gidx + tidx] += sum; + } + } + } +} + + +template <> +void RowConvGrad(const GpuMatrix& outG, + const GpuMatrix& in, + const GpuMatrix& filter, + GpuMatrix& inG, + GpuMatrix& filterG, + const GpuIVector& seq) { + const size_t numSeq = seq.getSize() - 1; + const size_t contextLength = filter.getHeight(); + const size_t height = in.getHeight(); + const size_t width = in.getWidth(); + + const real* dy = outG.getData(); + const real* x = in.getData(); + const real* w = filter.getData(); + real* dx = inG.getData(); + real* dw = filterG.getData(); + const int* starts = seq.getData(); + + dim3 dimBlock(32, 32); + dim3 dimGrid(DIVUP(width, dimBlock.x), 1); + + if (contextLength <= 16) { + KeRowConvBwWeight<32, 32, 16> + <<>> + (dw, x, dy, starts, height, width, numSeq, contextLength); + } else { + KeRowConvBwWeight2<32, 32> + <<>> + (dw, x, dy, starts, height, width, numSeq, contextLength); + } + + + dim3 dimBlock2(32, 32); + dim3 dimGrid2(DIVUP(width, dimBlock2.x), 1); + if (contextLength <= 64) { + KeRowConvBwData<32, 64> + <<>> + (dx, w, dy, starts, height, width, numSeq, contextLength); + } else { + KeRowConvBwData2 + <<>> + (dx, w, dy, starts, height, width, numSeq, contextLength); + } + + CHECK_SYNC("RowConvGrad"); +} + +} // namespace paddle diff --git a/paddle/function/RowConvOpTest.cpp b/paddle/function/RowConvOpTest.cpp new file mode 100644 index 0000000000..9898df1a97 --- /dev/null +++ b/paddle/function/RowConvOpTest.cpp @@ -0,0 +1,69 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
*/ + +#include +#include "FunctionTest.h" + +namespace paddle { + +void testRowConvFw(size_t batchSize, size_t dim, size_t contextLength) { + FunctionCompare test("RowConv", FuncConfig()); + + test.addSequence(SequenceIdArg(TensorShape{batchSize})); + test.addInputs(SequenceArg(VALUE_TYPE_FLOAT, TensorShape{batchSize, dim})); + test.addInputs(BufferArg(VALUE_TYPE_FLOAT, TensorShape{contextLength, dim})); + + test.addOutputs(SequenceArg(VALUE_TYPE_FLOAT, TensorShape{batchSize, dim}), + ADD_TO); + + test.run(); +} + +void testRowConvBw(size_t batchSize, size_t dim, size_t contextLength) { + FunctionCompare test("RowConvGrad", FuncConfig()); + + test.addSequence(SequenceIdArg(TensorShape{batchSize})); + test.addInputs(SequenceArg(VALUE_TYPE_FLOAT, TensorShape{batchSize, dim})); + test.addInputs(SequenceArg(VALUE_TYPE_FLOAT, TensorShape{batchSize, dim})); + test.addInputs(BufferArg(VALUE_TYPE_FLOAT, TensorShape{contextLength, dim})); + + test.addOutputs(SequenceArg(VALUE_TYPE_FLOAT, TensorShape{batchSize, dim}), + ADD_TO); + test.addOutputs(BufferArg(VALUE_TYPE_FLOAT, TensorShape{contextLength, dim}), + ADD_TO); + + test.run(); +} + +TEST(RowConv, real) { + // for (size_t numSamples : {17, 129}) { + // for (size_t dim : {16, 248}) { + // for (size_t context: {3, 7, 65}) { + LOG(INFO) << "==========="; + // for (size_t numSamples : {17}) { + // for (size_t dim : {16}) { + // for (size_t context: {3}) { + size_t numSamples = 17; + size_t dim = 16; + size_t context = 3; + LOG(INFO) << " numSamples=" << numSamples << " dim=" << dim + << " context length=" << context; + testRowConvFw(numSamples, dim, context); + // testRowConvBw(numSamples, dim, context); + // } + // } + // } +} + +} // namespace paddle From 18cd1f2558338e3e999670cdfac7e1c61c1ea428 Mon Sep 17 00:00:00 2001 From: dangqingqing Date: Sun, 4 Jun 2017 13:29:16 +0800 Subject: [PATCH 07/39] Fix bug and Python API. 
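For reference, the computation touched by this patch can be sketched for a
single sequence in plain NumPy. This is only an illustrative sketch intended to
mirror the CPU kernels in RowConvOp.cpp; the helper names below are made up for
the example and are not part of the Paddle API.

    import numpy as np

    def row_conv_forward(x, w):
        # x: (T, d) activations of one sequence, w: (k + 1, d) lookahead filter.
        # out[t] = sum_{j=0}^{k} w[j] * x[t + j], dropping terms with t + j >= T.
        T = x.shape[0]
        out = np.zeros_like(x)
        for t in range(T):
            for j in range(w.shape[0]):
                if t + j < T:
                    out[t] += w[j] * x[t + j]
        return out

    def row_conv_filter_grad(x, dout, context):
        # dw[j] = sum_t dout[t] * x[t + j], the gradient w.r.t the filter.
        T, d = x.shape
        dw = np.zeros((context, d), dtype=x.dtype)
        for t in range(T):
            for j in range(context):
                if t + j < T:
                    dw[j] += dout[t] * x[t + j]
        return dw

    # Tiny check: 5 time steps, feature dim 3, context length 2.
    x = np.random.randn(5, 3).astype('float32')
    w = np.random.randn(2, 3).astype('float32')
    out = row_conv_forward(x, w)
    dw = row_conv_filter_grad(x, np.ones_like(out), w.shape[0])
    print(out.shape, dw.shape)  # (5, 3) (2, 3)

In the batched Paddle kernels the sequences of a mini-batch are concatenated
along the first axis and the sequence start positions are used so that each
lookahead window stops at its own sequence boundary.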
--- paddle/function/RowConvOp.cpp | 93 ++++++++------ paddle/function/RowConvOp.h | 18 ++- paddle/function/RowConvOpGpu.cu | 113 ++++++++++-------- paddle/function/RowConvOpTest.cpp | 27 ++--- paddle/gserver/layers/RowConvLayer.cpp | 2 +- paddle/gserver/layers/RowConvLayer.h | 4 +- python/paddle/trainer/config_parser.py | 17 +++ .../paddle/trainer_config_helpers/layers.py | 76 ++++++++++++ .../tests/configs/file_list.sh | 2 +- .../configs/protostr/test_row_conv.protostr | 41 +++++++ .../tests/configs/test_row_conv.py | 9 ++ 11 files changed, 295 insertions(+), 107 deletions(-) create mode 100644 python/paddle/trainer_config_helpers/tests/configs/protostr/test_row_conv.protostr create mode 100644 python/paddle/trainer_config_helpers/tests/configs/test_row_conv.py diff --git a/paddle/function/RowConvOp.cpp b/paddle/function/RowConvOp.cpp index 24b7e3cdff..c3abb64971 100644 --- a/paddle/function/RowConvOp.cpp +++ b/paddle/function/RowConvOp.cpp @@ -61,7 +61,7 @@ void RowConvGrad(const CpuMatrix& outG, size_t begin = starts[i]; size_t end = starts[i + 1]; size_t steps = end - begin; - for (size_t j = 0; j < contextLength; ++j) { + for (size_t j = 0; j < contextLength && (begin + j) < end; ++j) { MatrixPtr x = (const_cast(in)).subMatrix(begin + j, steps - j); MatrixPtr dy = @@ -81,7 +81,7 @@ void RowConvGrad(const CpuMatrix& outG, for (size_t j = 0; j < steps; ++j) { MatrixPtr dx = inG.subMatrix(begin + j, 1); for (size_t t = 0; t < contextLength; ++t) { - if ((int(j) - int(t)) >= 0) { + if (int(j - t) >= 0) { MatrixPtr dy = (const_cast(outG)).subMatrix(begin + j - t, 1); MatrixPtr w = (const_cast(filter)).subMatrix(t, 1); @@ -94,8 +94,37 @@ void RowConvGrad(const CpuMatrix& outG, } /** - * \brief TODO(qingqing) + * \brief The row convolution is called lookahead convolution. It is firstly + * introduced in deep-speech2 system. The bidirectional RNN that learns + * representation for a sequence by performing a forward and a backward pass + * through the entire sequence. However, unlike unidirectional RNNs, + * bidirectional RNNs are challenging to deploy in an online and low-latency + * setting. The lookahead convolution incorporates information from future + * subsequences in a computationally efficient manner to improve unidirectional + * recurrent neural networks. * + * The connection of row convolution is different form the 1D sequence + * convolution. Assumed that, the future context-length is k, that is to say, + * it can get the output at timestep t by using the the input feature from t-th + * timestep to (t+k)-th timestep. Assumed that the hidden dim of input + * activations are d, the activations r_t for the new layer at time-step t are: + * + * + * -- k + 1 + * r(t,i) = > W(i,j) * h(t+j-1, i), for (1 <= i <= d) + * -- j = 1 + * + * + * The weight shape is: (k + 1) x d + * Function Arguments: + * + * \param inputs[0] The input activations. + * \param inputs[0] The filter (or weight) and shape is (k+1) x d. + * \param outputs[1] The output activations. + * + * [1] Dario Amodei, etc. Deep Speech 2 : End-to-End Speech Recognition in + * English + * and Mandarin. https://arxiv.org/abs/1512.02595 */ template @@ -128,10 +157,21 @@ public: RowConv(outMat, inMat, wMat, seqId); } }; + /** - * \brief TODO(qingqing) + * \brief The backward of row convolution function. This function calculated + * the gradient w.r.t filter and the gradient w.r.t input activations(or data). * * Argument in this Function: + * + * \param inputs[0] The gradient w.r.t output activations. 
+ * \param inputs[1] The input activations. + * \param inputs[2] The filter (or weight) and shape is (k+1) x d. + * \param outputs[0] The gradient w.r.t input activations. + * \param outputs[1] The gradient w.r.r filter. + * + * Abbreviation: + * w.r.t: with respect to. */ template @@ -140,12 +180,27 @@ public: void init(const FuncConfig& config) override {} void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { + // check + CHECK_EQ(3UL, inputs.size()); + CHECK_EQ(2UL, outputs.size()); + CHECK_EQ(outputs[0].getArgType(), ADD_TO); + CHECK_EQ(outputs[1].getArgType(), ADD_TO); + CHECK(inputs[0].isSequenceArg() && inputs[1].isSequenceArg() && + outputs[0].isSequenceArg()) + << "SequenceArg required here."; + const auto outGrad = dynamic_cast(inputs[0]); const auto in = dynamic_cast(inputs[1]); const auto w = inputs[2]; auto inGrad = dynamic_cast(outputs[0]); auto wGrad = outputs[1]; + CHECK_EQ(in.shape().ndims(), 2UL); + CHECK_EQ(outGrad.shape().ndims(), 2UL); + CHECK_EQ(in.shape()[1], outGrad.shape()[1]); + CHECK_EQ(in.shape()[0], outGrad.shape()[0]); + CHECK_EQ(wGrad.shape()[1], in.shape()[1]); + const auto outGMat = outGrad.matrix(); const auto inMat = in.matrix(); const auto wMat = w.matrix(); @@ -157,37 +212,7 @@ public: : typename Tensor::Matrix(nullptr, 0, 0); const auto seqId = in.getSequenceId().vector(); - std::cout << "in:" << std::endl; - for (int i = 0; i < inMat.getHeight(); ++i) { - for (int j = 0; j < inMat.getWidth(); ++j) { - std::cout << outGMat.getElement(i, j) << " "; - } - std::cout << std::endl; - } - - std::cout << "w:" << std::endl; - for (int i = 0; i < wMat.getHeight(); ++i) { - for (int j = 0; j < wMat.getWidth(); ++j) { - std::cout << wMat.getElement(i, j) << " "; - } - std::cout << std::endl; - } - - std::cout << "w:" << std::endl; - for (int i = 0; i < seqId.getSize(); ++i) { - std::cout << seqId.getElement(i) << " "; - } - std::cout << std::endl; - RowConvGrad(outGMat, inMat, wMat, inGMat, wGMat, seqId); - - std::cout << std::endl << "out:" << std::endl; - for (int i = 0; i < inGMat.getHeight(); ++i) { - for (int j = 0; j < inGMat.getWidth(); ++j) { - std::cout << inGMat.getElement(i, j) << " "; - } - std::cout << std::endl; - } } }; diff --git a/paddle/function/RowConvOp.h b/paddle/function/RowConvOp.h index cd78ec724a..2c5de6151a 100644 --- a/paddle/function/RowConvOp.h +++ b/paddle/function/RowConvOp.h @@ -19,7 +19,14 @@ limitations under the License. */ namespace paddle { /** - * \brief TODO(qingqing) + * \brief The forward of row convolution. + * + * \param[out] out The output data and shape is h x d. h is the sum of + * time steps of all samples in one mini-batch. + * \param[in] in The input data and shape is h x d. + * \param[in] filter The filter and shape is k x d. The lookahead step + * number plus one equals k. + * \param[in] seq The sequence start positions. * */ template @@ -29,7 +36,14 @@ void RowConv(typename Tensor::Matrix& out, const typename Tensor::Vector& seq); /** - * \brief TODO(qingqing) + * \brief The backward of row convolution. + * + * \param[in] outG The gradient w.r.t output data. + * \param[in] in The input data. + * \param[in] filter The filter. + * \param[out] inG The gradient w.r.t input data. + * \param[out] filterG The gradient w.r.t filter. + * \param[in] seq The sequence start positions. 
* */ template diff --git a/paddle/function/RowConvOpGpu.cu b/paddle/function/RowConvOpGpu.cu index 5b0e065a21..06e2c2baac 100644 --- a/paddle/function/RowConvOpGpu.cu +++ b/paddle/function/RowConvOpGpu.cu @@ -96,11 +96,6 @@ void RowConv(GpuMatrix& out, const size_t height = in.getHeight(); const size_t width = in.getWidth(); - LOG(INFO) << numSeq; - LOG(INFO) << contextLength; - LOG(INFO) << height; - LOG(INFO) << width; - real* y = out.getData(); const real* x = in.getData(); const real* w = filter.getData(); @@ -108,7 +103,6 @@ void RowConv(GpuMatrix& out, dim3 dimBlock(32, 32); dim3 dimGrid(DIVUP(width, dimBlock.x), 1); - LOG(INFO) << dimGrid.x; if (contextLength <= 32) { KeRowConv<32, 32><<>> @@ -131,12 +125,12 @@ __global__ void KeRowConvBwWeight(real* dw, const real* x, const real* dy, const int blky = blockDim.y; const int gidx = blockIdx.x * blockDim.x; - __shared__ real sh_x[BLOCK_H][BLOCK_W]; - __shared__ real sh_dy[BLOCK_H][BLOCK_W]; + __shared__ real sh_x[BLOCK_W][BLOCK_H]; + __shared__ real sh_dy[BLOCK_W][BLOCK_H + CONTEXT - 1]; __shared__ real sh_dw[CONTEXT][BLOCK_W]; - for (int t = tidy; t < context; t += blky) { - sh_dw[t][tidx] = 0.0; + if (tidy < context) { + sh_dw[tidy][tidx] = 0.0; } __syncthreads(); @@ -144,21 +138,31 @@ __global__ void KeRowConvBwWeight(real* dw, const real* x, const real* dy, const int start = starts[i]; const int end = starts[i + 1]; const int steps = end - start; - for (int j = tidy; j < steps; j += BLOCK_H) { + const int size = ((steps + BLOCK_H - 1)/BLOCK_H) * BLOCK_H; + for (int j = tidy; j < size; j += BLOCK_H) { int xoff = gidx + tidx; int yoff = start + j; // transpose - sh_x[tidx][tidy] = xoff < width && yoff < end ? x[yoff * width + xoff] : 0.0; - sh_dy[tidx][tidy] = xoff < width && yoff < end ? dy[yoff * width + xoff] : 0.0; + sh_x[tidx][tidy] = (xoff < width && yoff < end) ? x[yoff * width + xoff] : 0.0; + sh_dy[tidx][tidy + context - 1] = (xoff < width && yoff < end) ? dy[yoff * width + xoff] : 0.0; + __syncthreads(); + if (tidy < (context - 1)) { + yoff = yoff - context + 1; + sh_dy[tidx][tidy] = (xoff < width && yoff >= start) ? dy[yoff * width + xoff] : 0.0; + } __syncthreads(); for (int t = 0; t < context; t++) { - real val = tidx + t < blockDim.x ? sh_x[tidy][tidx + t] * sh_dy[tidy][tidx]: 0.0; + real val = sh_x[tidy][tidx] * sh_dy[tidy][tidx + context - 1 - t]; + __syncthreads(); // warp size and blockDim.x is 32. - for (int offset = 16; offset > 0; offset /= 2) { - val += __shfl_down(val, offset); - } + val += __shfl_down(val, 16); + val += __shfl_down(val, 8); + val += __shfl_down(val, 4); + val += __shfl_down(val, 2); + val += __shfl_down(val, 1); + __syncthreads(); if (tidx == 0) { sh_dw[t][tidy] += val; } @@ -167,7 +171,7 @@ __global__ void KeRowConvBwWeight(real* dw, const real* x, const real* dy, } } - for (int t = tidy; t < context && (gidx + tidx) < width; t += blky) { + for (int t = tidy; (t < context) && ((gidx + tidx) < width); t += blky) { dw[t * width + gidx + tidx] += sh_dw[t][tidx]; } } @@ -188,21 +192,30 @@ __global__ void KeRowConvBwWeight2(real* dw, const real* x, const real* dy, const int start = starts[i]; const int end = starts[i + 1]; const int steps = end - start; - for (int j = 0; j < steps; j += BLOCK_H) { + + const int size = ((steps + BLOCK_H - 1)/BLOCK_H) * BLOCK_H; + for (int j = tidy; j < size; j += BLOCK_H) { int xoff = gidx + tidx; int yoff = start + j; // transpose - sh_x[tidx][tidy] = xoff < width && yoff < end ? x[yoff * width + xoff] : 0.0; - sh_dy[tidx][tidy] = xoff < width && yoff < end ? 
dy[yoff * width + xoff] : 0.0; + sh_x[tidx][tidy] = (xoff < width && yoff < end) ? x[yoff * width + xoff] : 0.0; __syncthreads(); for (int t = 0; t < context; t++) { - real val = tidx + t < blockDim.x ? sh_x[tidy][tidx + t] * sh_dy[tidy][tidx]: 0.0; + sh_dy[tidx][tidy] = (xoff < width && (yoff - t) >= start && yoff - t < end) ? dy[(yoff - t) * width + xoff] : 0.0; + __syncthreads(); + + real val = sh_x[tidy][tidx] * sh_dy[tidy][tidx]; + __syncthreads(); // warp size and blockDim.x is 32. - for (int offset = 16; offset > 0; offset /= 2) { - val += __shfl_down(val, offset); - } + val += __shfl_down(val, 16); + val += __shfl_down(val, 8); + val += __shfl_down(val, 4); + val += __shfl_down(val, 2); + val += __shfl_down(val, 1); + __syncthreads(); + if (tidx == 0 && (gidx + tidy) < width) { dw[t*width + gidx + tidy] += val; } @@ -293,34 +306,36 @@ void RowConvGrad(const GpuMatrix& outG, const real* dy = outG.getData(); const real* x = in.getData(); const real* w = filter.getData(); - real* dx = inG.getData(); - real* dw = filterG.getData(); const int* starts = seq.getData(); - dim3 dimBlock(32, 32); - dim3 dimGrid(DIVUP(width, dimBlock.x), 1); - - if (contextLength <= 16) { - KeRowConvBwWeight<32, 32, 16> - <<>> - (dw, x, dy, starts, height, width, numSeq, contextLength); - } else { - KeRowConvBwWeight2<32, 32> - <<>> - (dw, x, dy, starts, height, width, numSeq, contextLength); + if (filterG) { + dim3 dimBlock(32, 32); + dim3 dimGrid(DIVUP(width, dimBlock.x), 1); + real* dw = filterG.getData(); + if (contextLength <= 16) { + KeRowConvBwWeight<32, 32, 16> + <<>> + (dw, x, dy, starts, height, width, numSeq, contextLength); + } else { + KeRowConvBwWeight2<32, 32> + <<>> + (dw, x, dy, starts, height, width, numSeq, contextLength); + } } - - dim3 dimBlock2(32, 32); - dim3 dimGrid2(DIVUP(width, dimBlock2.x), 1); - if (contextLength <= 64) { - KeRowConvBwData<32, 64> - <<>> - (dx, w, dy, starts, height, width, numSeq, contextLength); - } else { - KeRowConvBwData2 - <<>> - (dx, w, dy, starts, height, width, numSeq, contextLength); + if (inG) { + real* dx = inG.getData(); + dim3 dimBlock2(32, 32); + dim3 dimGrid2(DIVUP(width, dimBlock2.x), 1); + if (contextLength <= 64) { + KeRowConvBwData<32, 64> + <<>> + (dx, w, dy, starts, height, width, numSeq, contextLength); + } else { + KeRowConvBwData2 + <<>> + (dx, w, dy, starts, height, width, numSeq, contextLength); + } } CHECK_SYNC("RowConvGrad"); diff --git a/paddle/function/RowConvOpTest.cpp b/paddle/function/RowConvOpTest.cpp index 9898df1a97..1c95d3ff2c 100644 --- a/paddle/function/RowConvOpTest.cpp +++ b/paddle/function/RowConvOpTest.cpp @@ -47,23 +47,16 @@ void testRowConvBw(size_t batchSize, size_t dim, size_t contextLength) { } TEST(RowConv, real) { - // for (size_t numSamples : {17, 129}) { - // for (size_t dim : {16, 248}) { - // for (size_t context: {3, 7, 65}) { - LOG(INFO) << "==========="; - // for (size_t numSamples : {17}) { - // for (size_t dim : {16}) { - // for (size_t context: {3}) { - size_t numSamples = 17; - size_t dim = 16; - size_t context = 3; - LOG(INFO) << " numSamples=" << numSamples << " dim=" << dim - << " context length=" << context; - testRowConvFw(numSamples, dim, context); - // testRowConvBw(numSamples, dim, context); - // } - // } - // } + for (size_t numSamples : {17, 129, 2020}) { + for (size_t dim : {16, 512, 2560}) { + for (size_t context : {3, 19, 65}) { + VLOG(3) << " numSamples=" << numSamples << " dim=" << dim + << " context length=" << context; + testRowConvFw(numSamples, dim, context); + testRowConvBw(numSamples, 
dim, context); + } + } + } } } // namespace paddle diff --git a/paddle/gserver/layers/RowConvLayer.cpp b/paddle/gserver/layers/RowConvLayer.cpp index d4b1406297..5302e0e1a8 100644 --- a/paddle/gserver/layers/RowConvLayer.cpp +++ b/paddle/gserver/layers/RowConvLayer.cpp @@ -75,7 +75,7 @@ void RowConvLayer::backward(const UpdateCallback& callback) { BufferArgs outputs; inputs.addArg(*getOutputGrad(), *startPos); inputs.addArg(*getInputValue(0), *startPos); - inputs.addArg(*weight_->getW(), *startPos); + inputs.addArg(*weight_->getW(), wDims_); MatrixPtr inGrad = getInputGrad(0); MatrixPtr wGrad = weight_->getWGrad(); diff --git a/paddle/gserver/layers/RowConvLayer.h b/paddle/gserver/layers/RowConvLayer.h index 05be6ca6a9..b3bdda2f35 100644 --- a/paddle/gserver/layers/RowConvLayer.h +++ b/paddle/gserver/layers/RowConvLayer.h @@ -37,9 +37,7 @@ protected: // fan_out is the size of output feature. std::unique_ptr weight_; - // std::unique_ptr biases_; - - // how many steps to look ahead + // The step number to look ahead plus one equals contexLength_. size_t contexLength_; TensorShape wDims_; }; diff --git a/python/paddle/trainer/config_parser.py b/python/paddle/trainer/config_parser.py index 5d540664a7..9066ce05f3 100644 --- a/python/paddle/trainer/config_parser.py +++ b/python/paddle/trainer/config_parser.py @@ -2081,6 +2081,23 @@ class MaxOutLayer(LayerBase): g_layer_map[input_layer.name].width, out_channels) +@config_layer('row_conv') +class RowConvLayer(LayerBase): + def __init__(self, name, inputs, context_length, **xargs): + super(RowConvLayer, self).__init__( + name, 'maxout', 0, inputs=inputs, **xargs) + config_assert( + len(self.inputs) == 1, + 'TransLayer must have one and only one input') + input_layer = self.get_input_layer(0) + row_conv_conf = self.config.inputs[0].row_conv_conf + row_conv_conf.context_length = context_length + self.set_layer_size(input_layer.size) + psize = context_length * input_layer.size + dims = [context_length, input_layer.size] + self.create_input_parameter(0, psize, dims) + + # key: cost type # value: cost class g_cost_map = {} diff --git a/python/paddle/trainer_config_helpers/layers.py b/python/paddle/trainer_config_helpers/layers.py index 81cce31fec..47b6232877 100755 --- a/python/paddle/trainer_config_helpers/layers.py +++ b/python/paddle/trainer_config_helpers/layers.py @@ -120,6 +120,7 @@ __all__ = [ 'smooth_l1_cost', 'layer_support', 'multiplex_layer', + 'row_conv_layer', ] @@ -187,6 +188,7 @@ class LayerType(object): SPP_LAYER = "spp" PAD_LAYER = "pad" MULTIPLEX_LAYER = "multiplex" + ROW_CONV_LAYER = "row_conv" PRINT_LAYER = "print" PRIORBOX_LAYER = "priorbox" @@ -5528,3 +5530,77 @@ def multiplex_layer(input, name=None, layer_attr=None): layer_type=LayerType.MULTIPLEX_LAYER, parents=input, size=l.config.size) + + +@wrap_name_default() +@wrap_act_default(act=LinearActivation()) +@wrap_param_attr_default() +@layer_support(DROPOUT) +def row_conv_layer(input, + context_len, + act=None, + name=None, + param_attr=None, + layer_attr=None): + """ + + The row convolution is called lookahead convolution. It is firstly + introduced in paper of `Deep Speech 2: End-toEnd Speech Recognition + in English and Mandarin `_ . + + The bidirectional RNN that learns representation for a sequence by + performing a forward and a backward pass through the entire sequence. + However, unlike unidirectional RNNs, bidirectional RNNs are challenging + to deploy in an online and low-latency setting. 
The lookahead convolution
+    incorporates information from future subsequences in a computationally
+    efficient manner to improve unidirectional recurrent neural networks.
+
+    The connection of row convolution is different from the 1D sequence
+    convolution. Assume the future context length is k; that is to say, the
+    output at time step t is computed from the input features at the t-th
+    time step through the (t+k)-th time step. Assume the hidden dimension of
+    the input activations is d; the activations r_t for the new layer at
+    time step t are:
+
+    .. math::
+
+        r_{t,i} = \sum_{j=1}^{k + 1} {w_{i,j}h_{t+j-1, i}}
+                  \quad \text{for} \quad  (1 \leq i \leq d)
+
+    Note:
+        The `context_len` is `k + 1`. That is to say, the lookahead step
+        number plus one equals context_len.
+
+
+    ..  code-block:: python
+
+       row_conv = row_conv_layer(input=input_layer, context_len=3)
+
+
+    :param input: The input layer.
+    :type input: LayerOutput
+    :param context_len: The context length equals the lookahead step number
+                        plus one.
+    :type context_len: int
+    :param act: Activation Type. Default is linear activation.
+    :type act: BaseActivation
+    :param param_attr: The Parameter Attribute. If None, the parameter will be
+                       initialized smartly. It is better to set it yourself.
+    :type param_attr: ParameterAttribute
+    :param layer_attr: Extra Layer config.
+    :type layer_attr: ExtraLayerAttribute|None
+    :return: LayerOutput object.
+    :rtype: LayerOutput
+
+    """
+    assert isinstance(input, LayerOutput)
+    assert context_len > 0, "the context_len must be greater than 0."
+
+    Layer(
+        inputs=[Input(input.name, **param_attr.attr)],
+        name=name,
+        context_length=context_len,
+        type=LayerType.ROW_CONV_LAYER,
+        active_type=act.name,
+        **ExtraLayerAttribute.to_kwargs(layer_attr))
+    return LayerOutput(
+        name, LayerType.ROW_CONV_LAYER, input, activation=act, size=input.size)
diff --git a/python/paddle/trainer_config_helpers/tests/configs/file_list.sh b/python/paddle/trainer_config_helpers/tests/configs/file_list.sh
index 981ccbf248..db3d3c6550 100755
--- a/python/paddle/trainer_config_helpers/tests/configs/file_list.sh
+++ b/python/paddle/trainer_config_helpers/tests/configs/file_list.sh
@@ -5,6 +5,6 @@ last_first_seq test_expand_layer test_ntm_layers test_hsigmoid
 img_layers img_trans_layers util_layers simple_rnn_layers unused_layers test_cost_layers
 test_rnn_group shared_fc shared_lstm shared_gru test_cost_layers_with_weight
 test_spp_layer test_bilinear_interp test_maxout test_bi_grumemory math_ops
-test_seq_concat_reshape test_pad test_smooth_l1 test_multiplex_layer)
+test_seq_concat_reshape test_pad test_smooth_l1 test_multiplex_layer test_row_conv)
 export whole_configs=(test_split_datasource)
diff --git a/python/paddle/trainer_config_helpers/tests/configs/protostr/test_row_conv.protostr b/python/paddle/trainer_config_helpers/tests/configs/protostr/test_row_conv.protostr
new file mode 100644
index 0000000000..9ec15d2a19
--- /dev/null
+++ b/python/paddle/trainer_config_helpers/tests/configs/protostr/test_row_conv.protostr
@@ -0,0 +1,41 @@
+type: "nn"
+layers {
+  name: "data"
+  type: "data"
+  size: 2560
+  active_type: ""
+}
+layers {
+  name: "__row_conv_layer_0__"
+  type: "maxout"
+  size: 2560
+  active_type: "relu"
+  inputs {
+    input_layer_name: "data"
+    input_parameter_name: "___row_conv_layer_0__.w0"
+    row_conv_conf {
+      context_length: 19
+    }
+  }
+}
+parameters {
+  name: "___row_conv_layer_0__.w0"
+  size: 48640
+  initial_mean: 0.0
+  initial_std: 0.229415733871
+  dims: 19
+  dims: 2560
+  initial_strategy: 0
+  initial_smart: true
+} +input_layer_names: "data" +output_layer_names: "__row_conv_layer_0__" +sub_models { + name: "root" + layer_names: "data" + layer_names: "__row_conv_layer_0__" + input_layer_names: "data" + output_layer_names: "__row_conv_layer_0__" + is_recurrent_layer_group: false +} + diff --git a/python/paddle/trainer_config_helpers/tests/configs/test_row_conv.py b/python/paddle/trainer_config_helpers/tests/configs/test_row_conv.py new file mode 100644 index 0000000000..ab33c496b0 --- /dev/null +++ b/python/paddle/trainer_config_helpers/tests/configs/test_row_conv.py @@ -0,0 +1,9 @@ +from paddle.trainer_config_helpers import * + +settings(batch_size=1000, learning_rate=1e-5) + +data = data_layer(name='data', size=2560) + +row_conv = row_conv_layer(input=data, context_len=19, act=ReluActivation()) + +outputs(row_conv) From 6e8c5665b4a2d47ea9de2d347408d0d86d47e50d Mon Sep 17 00:00:00 2001 From: dangqingqing Date: Tue, 6 Jun 2017 16:52:35 +0800 Subject: [PATCH 08/39] Add layers into doc. --- doc/api/v2/config/layer.rst | 11 +++++++++++ paddle/function/RowConvOpGpu.cu | 4 ++-- 2 files changed, 13 insertions(+), 2 deletions(-) diff --git a/doc/api/v2/config/layer.rst b/doc/api/v2/config/layer.rst index 1efa74ecda..7c22b67775 100644 --- a/doc/api/v2/config/layer.rst +++ b/doc/api/v2/config/layer.rst @@ -59,6 +59,11 @@ context_projection .. autoclass:: paddle.v2.layer.context_projection :noindex: +row_conv +-------- +.. autoclass:: paddle.v2.layer.row_conv + :noindex: + Image Pooling Layer =================== @@ -346,6 +351,12 @@ sampling_id .. autoclass:: paddle.v2.layer.sampling_id :noindex: +multiplex +--------- +.. autoclass:: paddle.v2.layer.multiplex + :noindex: + + Slicing and Joining Layers ========================== diff --git a/paddle/function/RowConvOpGpu.cu b/paddle/function/RowConvOpGpu.cu index 06e2c2baac..c0b947e224 100644 --- a/paddle/function/RowConvOpGpu.cu +++ b/paddle/function/RowConvOpGpu.cu @@ -312,8 +312,8 @@ void RowConvGrad(const GpuMatrix& outG, dim3 dimBlock(32, 32); dim3 dimGrid(DIVUP(width, dimBlock.x), 1); real* dw = filterG.getData(); - if (contextLength <= 16) { - KeRowConvBwWeight<32, 32, 16> + if (contextLength <= 32) { + KeRowConvBwWeight<32, 32, 32> <<>> (dw, x, dy, starts, height, width, numSeq, contextLength); } else { From 9011f9e52c614c1f357a4220ffbb16ee5155f0df Mon Sep 17 00:00:00 2001 From: gongweibao Date: Wed, 7 Jun 2017 15:52:06 +0800 Subject: [PATCH 09/39] add precommit --- python/paddle/v2/dataset/common.py | 44 +++++++++++++++++++ python/paddle/v2/dataset/tests/common_test.py | 16 +++++++ 2 files changed, 60 insertions(+) diff --git a/python/paddle/v2/dataset/common.py b/python/paddle/v2/dataset/common.py index 418b592a5a..89675080e2 100644 --- a/python/paddle/v2/dataset/common.py +++ b/python/paddle/v2/dataset/common.py @@ -149,3 +149,47 @@ def cluster_files_reader(files_pattern, yield line return reader + + +def convert(output_path, eader, num_shards, name_prefix): + import recordio + import cPickle as pickle + """ + Convert data from reader to recordio format files. + + :param output_path: directory in which output files will be saved. + :param reader: a data reader, from which the convert program will read data instances. + :param num_shards: the number of shards that the dataset will be partitioned into. + :param name_prefix: the name prefix of generated files. 
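+
+    A minimal usage sketch (the reader and output directory below are only
+    placeholders; see the accompanying unit test in common_test.py):
+
+    ..  code-block:: python
+
+        def reader():
+            for x in xrange(10):
+                yield x
+
+        convert("/tmp/output", reader, 4, "random_images")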
+ """ + + def open_needs(idx): + n = "%s/%s-%05d" % (output_path, name_prefix, idx) + w = recordio.writer(n) + f = open(n, "w") + idx += 1 + + return w, f, idx + + def close_needs(w, f): + if w is not None: + w.close() + + if f is not None: + f.close() + + idx = 0 + w = None + f = None + + for i, d in enumerate(reader()): + if w is None: + w, f, idx = open_needs(idx) + + w.write(pickle.dumps(d, pickle.HIGHEST_PROTOCOL)) + + if i % num_shards == 0 and i >= num_shards: + close_needs(w, f) + w, f, idx = open_needs(idx) + + close_needs(w, f) diff --git a/python/paddle/v2/dataset/tests/common_test.py b/python/paddle/v2/dataset/tests/common_test.py index f9815d4f9e..3120026e1e 100644 --- a/python/paddle/v2/dataset/tests/common_test.py +++ b/python/paddle/v2/dataset/tests/common_test.py @@ -57,6 +57,22 @@ class TestCommon(unittest.TestCase): for idx, e in enumerate(reader()): self.assertEqual(e, str("0")) + def test_convert(self): + def test_reader(): + def reader(): + for x in xrange(10): + yield x + + return reader + + path = tempfile.mkdtemp() + + paddle.v2.dataset.common.convert(path, + test_reader(), 4, 'random_images') + + files = glob.glob(temp_path + '/random_images-*') + self.assertEqual(len(files), 3) + if __name__ == '__main__': unittest.main() From 96a56b962acffd0be7790c0641ab9bc87c7e98a1 Mon Sep 17 00:00:00 2001 From: gongweibao Date: Wed, 7 Jun 2017 15:59:12 +0800 Subject: [PATCH 10/39] rm not need --- doc/howto/dev/introduction_to_pr.md | 139 ---------------------------- 1 file changed, 139 deletions(-) delete mode 100644 doc/howto/dev/introduction_to_pr.md diff --git a/doc/howto/dev/introduction_to_pr.md b/doc/howto/dev/introduction_to_pr.md deleted file mode 100644 index 71cbbdf861..0000000000 --- a/doc/howto/dev/introduction_to_pr.md +++ /dev/null @@ -1,139 +0,0 @@ -# Desgin Doc的一点总结 -## 前言 -故事还要从前两天我提交的[FileManager](https://github.com/PaddlePaddle/Paddle/pull/2013)的Design Doc PR开始。 - -[FileManager](https://github.com/PaddlePaddle/Paddle/pull/2013)这个文档我肯定是用心写了,我也清楚地记得,提交之前也仔细的检查了(这点自觉性还是有的),结果,我突然发现,我的PR被Comments刷屏了;这还是其次,更要命的是,当网络速度稍慢的时候会提示我:Comments 太多啦!页面打不开!!哦。简直有点人间惨剧的味道!我要把这些和之前的Comments以及一些典型的Comments总结一下,避免同样的问题,同时也希望对您有用。 -link -我觉得里边有几个基本的原则: - -- 做事情要精益求精 - 这个是做事情的基础,也是我们讨论的基础。美,多是相似的,丑才是千奇百怪。 - - 精益求精是一种态度,态度比什么都重要。[yi](#yi) -- 节约别人的时间 - 同时也是给自己节约时间 - -- 有礼貌,有感恩的心 - > 我发现诸多之前提了comment,但是没有改,也没有回复。这个不太好吧。[ying](#ying) - > 每个comment都必须回复。这是开源社区的基本礼貌。别人帮了忙,应该说谢谢。[yi](#yi) - -我的理解,写Doc精益求精要从基础开始,首先要规范;其次要有提纲挈领的东西,让人一眼基本明白要做的事情;然后,讲述事情结构要清晰,要分层,用最小化涉及的原则,该讲的要解释,不该讲的不讲;接下来逻辑要清晰,要流畅,不要太突兀。。。当然还有很多,比如文笔要好!:) - -锻炼一下,中国人办事也不一定那么糙的,对吧! :) - -## 基础规范 - -- 语言选择:中文 or 英文? - 最好用英文,因为外国同学无法阅读中文。但是,语言是用来交流的,如果您认为英文无法完善的表达自己的意思的时候,还是用中文吧! - - 重要的是,不管哪种文,***英文的拼写要对,中文要没有错别字***! - - 如果文章用的是中文,那么请用中文的标点符号。反之,用英文的标点符号。***而且要保持统一、而且要保持完整***。[wuyi](#wuyi) - - 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r114951817) - - 还有这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115093563) - - (请原谅我多加了一个link做锚文本,Sphinx有一个bug:用中文的会报错。下边的处理类似。) - -- 缩写的全称 - 一个缩写第一次出现的时候,要把他们的未缩写形式写出来。如果该单词代表的概念比较复杂,请在术语解释中把他写清楚! 
- - 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115093329) - -- 英语写法错误 - [yi](#yi)总结了一下在code review的时候经常见的一些英语写法错误: Python写成python,TensorFlow写成Tensorflow,Travis CI 写成 Travis-CI,ReLU写成Relu,unit test写成unittest,MD5写成md5。 - - 大小写简单的规则如下: - - 英文的缩写是要用大写的。 - 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115091985) - - 英文的句子的首字母是要大写的。 - - 一个专有的名词第一个字母一般是要大写的。 - - yiwang推荐了一个工具:[grammer](https://www.grammarly.com/),可以作为插件安装到Chrome中,自动检查拼写和语法问题。 - -- 不要提交没有用的文件 - 例如这个 [link](https://github.com/PaddlePaddle/Paddle/pull/1964#discussion_r114414822) - -- 提交的代码要经过验证 - 提交的代码都没有验证过是几个意思? - -- 参考资料要设置超链,链接到对方 - 不设置的话俗称Copy。可以用`[name](#name)`设置MarkDown文件中的引用。一如此文中很多地方出现的那样。 - -- 给别人的东西步骤是要可以执行的 - 看这个[link](https://github.com/wangkuiyi/ipynb)是如何写步骤的,或者这个[link](https://github.com/PaddlePaddle/Paddle/pull/1602#issuecomment-285964510)(虽然只是一个Comment)。 - -- 链接可以直接跳到精确的位置 - 如果能一步到位就不要让别人走两步,节约大家彼此的时间。 - -## 提纲挈领 -### 目标要明确 -一般开头的一段话就要用简短的几句话把文档要描述的目标写清楚,别人看了之后,心中有了一个大概:这个文档准备讲什么。 - -例如这个[link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train#objective) -### 架构图 -一个系统或者模块的设计文档,首先要有架构图。***要有架构图,要有架构图,要有架构图***。。。。这句话可以多重复几遍。俗称,无图无真相或者一图胜千言。最起码有一个架构图说明各个模块的关系。图的表现能力远远超过了文字,特别是模块关系相关的和流程相关的场景下。 - -可以看一下这个文档中的图 [link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train) - -顺便提一句,里边的图都是用[OmniGraffle](https://www.omnigroup.com/omnigraffle)画的,就是贵了点:$99。“99你买了不吃亏,99你买了不上当。”[wuyi](#wuyi) - -### 层次结构清晰 -代码层面上,我们为了开发大的项目,很自然的把代码分模块、分文件。文档也是如此。每个文档或者章节说明他应该要想说的事情,尽量的减少涉及的范围。涉及的范围越少,需要阐述、解释的东西就越少,理解起来需要的背景知识就越少,就越好理解,写的人出错的概率也越少。。。 - -- 整体分层 - - 系统设计 - - 模块设计 - - 接口设计 - - 部署 - -相互之间需要尽量少的“越权”。例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115147388) - -另外一个容易忽视的问题是,文档内部的层次结构。MarkDown文件不像Doc一样可以自动生成目录页,一个部分如果太多,就会让看得人失去层次感。以这个文档为例 [link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train)。这个文档我也写过,只是把`Fault Recovery`的部分和前边正常的`Training Job`合到一起去了,结果发现越写越乱,后来看到Helin写的分层之后的文档,感觉流畅多了。 - -## 逻辑要清晰、流畅 -- 概念的出现不要突兀。 - 文档的最前边有一个术语的解释,把文档中提到的概念先解释一下,这样,别人在看到那些概念的时候不会觉得很突兀。同时,前后要呼应。 - - 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r114952115) - - 如果概念或者名词后边没有出现,该删除还是删除了吧! - -- 多种方案的选择要简述原因。 - 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115147115) - -- Design doc中不应该有“?” [wuyi](#wuyi) - 应该都是陈述句的描述。有不确定的问题可以提Issue来讨论获得结论。 - 对于自己不确定的地方,与其含混而过不如找人讨论先搞一个靠谱的可以讨论的。绝大多数情况下,这种含混而过的都会被Reivew给纠出来。即便就不出来,岂不是自己给自己埋一个坑? 
- -- 文档当中不要黏贴大量的代码 - 代码一般都是细节,改变的非常的快,文档可能很快就失效了,需要重新的修正。另外,最主要的是大段的代码会让人的思路中断,陷于实现的细节当中。 - -- 不准备实现的就不要写了 - 最多放到放到`Future`中展望一下。 - -## 文笔要好 -啊呀,不想当作家的程序员不是好程序员。这个当然比较难,要看大家的“学好数理化,走遍天下都不怕”的基本功的“深厚”程度了:) - -顺便推荐一下公众号:老万故事会[link](https://freewechat.com/profile/MzI1MDQ3NTAxOQ==),一个文章和代码写的一样好的人[yi](#yi)。 - -## 如何提高写文档的效率 -这段其实本来没想到加的,开完组会之后听老大讲提高工作效率的事情有点感想,就写了。 - -我不是很怀疑自己写代码的效率,但是严重怀疑自己写文档的效率。感觉写文档比写代码烧脑多了。现在想想,最主要的点在于文档的结构和分层问题。 - -提高效率最好的办法是什么?是确定范围,很多东西不用讲了,当然效率就提高上去了。 - -写系统设计文档,主要需要表现模块间的关系和通信,特别是特定场景下的他们是如何配合的。这个过程中需要把关键的概念说清楚,例如这个[link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train)中,`Trainjob`和`Fault Recovery`主要是讲模块间关系和配合的,[`task`](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train#task)作为一个关键的概念讲了。其他的部分,稍显细节的都可以放到模块设计中去讲。 - -另外一个,认真≠ 犹豫,也≠ 纠结。 - -如果感到了纠结,那说明没找到问题的根本。我在写文件上传的时候对做不做缓存优化纠结了很长时间,请教了Helin一会就讨论完毕了。如果感到了纠结,那是需要跟别人请教。不纠结的地方,下决断要果断。 - -## 参考 -- WangYi -- WuYi -- Helin -- YanXu -- CaoYing From f904e794103174e4fefcdf39bdd9db424bdc0cbc Mon Sep 17 00:00:00 2001 From: gongweibao Date: Wed, 7 Jun 2017 16:04:34 +0800 Subject: [PATCH 11/39] rm not need --- doc/howto/dev/introduction_to_pr.md | 139 ---------------------------- 1 file changed, 139 deletions(-) delete mode 100644 doc/howto/dev/introduction_to_pr.md diff --git a/doc/howto/dev/introduction_to_pr.md b/doc/howto/dev/introduction_to_pr.md deleted file mode 100644 index 71cbbdf861..0000000000 --- a/doc/howto/dev/introduction_to_pr.md +++ /dev/null @@ -1,139 +0,0 @@ -# Desgin Doc的一点总结 -## 前言 -故事还要从前两天我提交的[FileManager](https://github.com/PaddlePaddle/Paddle/pull/2013)的Design Doc PR开始。 - -[FileManager](https://github.com/PaddlePaddle/Paddle/pull/2013)这个文档我肯定是用心写了,我也清楚地记得,提交之前也仔细的检查了(这点自觉性还是有的),结果,我突然发现,我的PR被Comments刷屏了;这还是其次,更要命的是,当网络速度稍慢的时候会提示我:Comments 太多啦!页面打不开!!哦。简直有点人间惨剧的味道!我要把这些和之前的Comments以及一些典型的Comments总结一下,避免同样的问题,同时也希望对您有用。 -link -我觉得里边有几个基本的原则: - -- 做事情要精益求精 - 这个是做事情的基础,也是我们讨论的基础。美,多是相似的,丑才是千奇百怪。 - - 精益求精是一种态度,态度比什么都重要。[yi](#yi) -- 节约别人的时间 - 同时也是给自己节约时间 - -- 有礼貌,有感恩的心 - > 我发现诸多之前提了comment,但是没有改,也没有回复。这个不太好吧。[ying](#ying) - > 每个comment都必须回复。这是开源社区的基本礼貌。别人帮了忙,应该说谢谢。[yi](#yi) - -我的理解,写Doc精益求精要从基础开始,首先要规范;其次要有提纲挈领的东西,让人一眼基本明白要做的事情;然后,讲述事情结构要清晰,要分层,用最小化涉及的原则,该讲的要解释,不该讲的不讲;接下来逻辑要清晰,要流畅,不要太突兀。。。当然还有很多,比如文笔要好!:) - -锻炼一下,中国人办事也不一定那么糙的,对吧! :) - -## 基础规范 - -- 语言选择:中文 or 英文? - 最好用英文,因为外国同学无法阅读中文。但是,语言是用来交流的,如果您认为英文无法完善的表达自己的意思的时候,还是用中文吧! - - 重要的是,不管哪种文,***英文的拼写要对,中文要没有错别字***! - - 如果文章用的是中文,那么请用中文的标点符号。反之,用英文的标点符号。***而且要保持统一、而且要保持完整***。[wuyi](#wuyi) - - 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r114951817) - - 还有这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115093563) - - (请原谅我多加了一个link做锚文本,Sphinx有一个bug:用中文的会报错。下边的处理类似。) - -- 缩写的全称 - 一个缩写第一次出现的时候,要把他们的未缩写形式写出来。如果该单词代表的概念比较复杂,请在术语解释中把他写清楚! - - 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115093329) - -- 英语写法错误 - [yi](#yi)总结了一下在code review的时候经常见的一些英语写法错误: Python写成python,TensorFlow写成Tensorflow,Travis CI 写成 Travis-CI,ReLU写成Relu,unit test写成unittest,MD5写成md5。 - - 大小写简单的规则如下: - - 英文的缩写是要用大写的。 - 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115091985) - - 英文的句子的首字母是要大写的。 - - 一个专有的名词第一个字母一般是要大写的。 - - yiwang推荐了一个工具:[grammer](https://www.grammarly.com/),可以作为插件安装到Chrome中,自动检查拼写和语法问题。 - -- 不要提交没有用的文件 - 例如这个 [link](https://github.com/PaddlePaddle/Paddle/pull/1964#discussion_r114414822) - -- 提交的代码要经过验证 - 提交的代码都没有验证过是几个意思? 
- -- 参考资料要设置超链,链接到对方 - 不设置的话俗称Copy。可以用`[name](#name)`设置MarkDown文件中的引用。一如此文中很多地方出现的那样。 - -- 给别人的东西步骤是要可以执行的 - 看这个[link](https://github.com/wangkuiyi/ipynb)是如何写步骤的,或者这个[link](https://github.com/PaddlePaddle/Paddle/pull/1602#issuecomment-285964510)(虽然只是一个Comment)。 - -- 链接可以直接跳到精确的位置 - 如果能一步到位就不要让别人走两步,节约大家彼此的时间。 - -## 提纲挈领 -### 目标要明确 -一般开头的一段话就要用简短的几句话把文档要描述的目标写清楚,别人看了之后,心中有了一个大概:这个文档准备讲什么。 - -例如这个[link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train#objective) -### 架构图 -一个系统或者模块的设计文档,首先要有架构图。***要有架构图,要有架构图,要有架构图***。。。。这句话可以多重复几遍。俗称,无图无真相或者一图胜千言。最起码有一个架构图说明各个模块的关系。图的表现能力远远超过了文字,特别是模块关系相关的和流程相关的场景下。 - -可以看一下这个文档中的图 [link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train) - -顺便提一句,里边的图都是用[OmniGraffle](https://www.omnigroup.com/omnigraffle)画的,就是贵了点:$99。“99你买了不吃亏,99你买了不上当。”[wuyi](#wuyi) - -### 层次结构清晰 -代码层面上,我们为了开发大的项目,很自然的把代码分模块、分文件。文档也是如此。每个文档或者章节说明他应该要想说的事情,尽量的减少涉及的范围。涉及的范围越少,需要阐述、解释的东西就越少,理解起来需要的背景知识就越少,就越好理解,写的人出错的概率也越少。。。 - -- 整体分层 - - 系统设计 - - 模块设计 - - 接口设计 - - 部署 - -相互之间需要尽量少的“越权”。例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115147388) - -另外一个容易忽视的问题是,文档内部的层次结构。MarkDown文件不像Doc一样可以自动生成目录页,一个部分如果太多,就会让看得人失去层次感。以这个文档为例 [link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train)。这个文档我也写过,只是把`Fault Recovery`的部分和前边正常的`Training Job`合到一起去了,结果发现越写越乱,后来看到Helin写的分层之后的文档,感觉流畅多了。 - -## 逻辑要清晰、流畅 -- 概念的出现不要突兀。 - 文档的最前边有一个术语的解释,把文档中提到的概念先解释一下,这样,别人在看到那些概念的时候不会觉得很突兀。同时,前后要呼应。 - - 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r114952115) - - 如果概念或者名词后边没有出现,该删除还是删除了吧! - -- 多种方案的选择要简述原因。 - 例如这个[link](https://github.com/PaddlePaddle/Paddle/pull/2013#discussion_r115147115) - -- Design doc中不应该有“?” [wuyi](#wuyi) - 应该都是陈述句的描述。有不确定的问题可以提Issue来讨论获得结论。 - 对于自己不确定的地方,与其含混而过不如找人讨论先搞一个靠谱的可以讨论的。绝大多数情况下,这种含混而过的都会被Reivew给纠出来。即便就不出来,岂不是自己给自己埋一个坑? 
- -- 文档当中不要黏贴大量的代码 - 代码一般都是细节,改变的非常的快,文档可能很快就失效了,需要重新的修正。另外,最主要的是大段的代码会让人的思路中断,陷于实现的细节当中。 - -- 不准备实现的就不要写了 - 最多放到放到`Future`中展望一下。 - -## 文笔要好 -啊呀,不想当作家的程序员不是好程序员。这个当然比较难,要看大家的“学好数理化,走遍天下都不怕”的基本功的“深厚”程度了:) - -顺便推荐一下公众号:老万故事会[link](https://freewechat.com/profile/MzI1MDQ3NTAxOQ==),一个文章和代码写的一样好的人[yi](#yi)。 - -## 如何提高写文档的效率 -这段其实本来没想到加的,开完组会之后听老大讲提高工作效率的事情有点感想,就写了。 - -我不是很怀疑自己写代码的效率,但是严重怀疑自己写文档的效率。感觉写文档比写代码烧脑多了。现在想想,最主要的点在于文档的结构和分层问题。 - -提高效率最好的办法是什么?是确定范围,很多东西不用讲了,当然效率就提高上去了。 - -写系统设计文档,主要需要表现模块间的关系和通信,特别是特定场景下的他们是如何配合的。这个过程中需要把关键的概念说清楚,例如这个[link](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train)中,`Trainjob`和`Fault Recovery`主要是讲模块间关系和配合的,[`task`](https://github.com/PaddlePaddle/Paddle/tree/develop/doc/design/cluster_train#task)作为一个关键的概念讲了。其他的部分,稍显细节的都可以放到模块设计中去讲。 - -另外一个,认真≠ 犹豫,也≠ 纠结。 - -如果感到了纠结,那说明没找到问题的根本。我在写文件上传的时候对做不做缓存优化纠结了很长时间,请教了Helin一会就讨论完毕了。如果感到了纠结,那是需要跟别人请教。不纠结的地方,下决断要果断。 - -## 参考 -- WangYi -- WuYi -- Helin -- YanXu -- CaoYing From 99dc60642d6b94a5ccc92be21917cfa866d6e7f8 Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Wed, 7 Jun 2017 23:42:11 +0800 Subject: [PATCH 12/39] new parameterupdater use paddle pserver cclient of go --- CMakeLists.txt | 1 + .../cluster_train/remote_parameter_updater.md | 21 ++++ go/cmake/golang.cmake | 8 +- go/pserver/cclient/CMakeLists.txt | 12 +- go/pserver/cclient/test/CMakeLists.txt | 13 ++- go/pserver/cclient/test/main.c | 19 ++-- go/pserver/cclient/test/test_train.py | 60 ++++++++++ paddle/api/CMakeLists.txt | 4 +- paddle/api/Paddle.i | 1 + paddle/api/PaddleAPI.h | 2 + paddle/api/ParameterUpdater.cpp | 9 ++ paddle/trainer/CMakeLists.txt | 11 +- paddle/trainer/NewRemoteParameterUpdater.cpp | 88 +++++++++++++++ paddle/trainer/NewRemoteParameterUpdater.h | 105 ++++++++++++++++++ python/paddle/v2/optimizer.py | 15 ++- python/paddle/v2/trainer.py | 7 +- 16 files changed, 352 insertions(+), 24 deletions(-) create mode 100644 doc/design/cluster_train/remote_parameter_updater.md create mode 100644 go/pserver/cclient/test/test_train.py create mode 100644 paddle/trainer/NewRemoteParameterUpdater.cpp create mode 100644 paddle/trainer/NewRemoteParameterUpdater.h diff --git a/CMakeLists.txt b/CMakeLists.txt index 79210d0436..c2218be5ef 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -127,6 +127,7 @@ endif(WITH_GPU) add_subdirectory(proto) add_subdirectory(paddle) add_subdirectory(python) +add_subdirectory(go/pserver/cclient) if(WITH_DOC) add_subdirectory(doc) diff --git a/doc/design/cluster_train/remote_parameter_updater.md b/doc/design/cluster_train/remote_parameter_updater.md new file mode 100644 index 0000000000..6e8e593845 --- /dev/null +++ b/doc/design/cluster_train/remote_parameter_updater.md @@ -0,0 +1,21 @@ +# Design Doc: Remote Parameter Updater for Cluster Train + +For an overview of distribute training, please refer to [distributed training design doc](README.md). In this design doc, we will discuss the parameter updater that will use parameter server cclient [The Client Library of Parameter Server Design Doc](pserver_client.md) to manage and update parameters. + +## Parameter Updater + +Parameter Updater is used by trainer to manage and update parameter, there are mainly two kind of parameter updater: local and remote, since this design is for cluster train, we will only discuss remote parameter updater here. 
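+
+As a sketch of the intended user-facing flow (mirroring the `test_train.py` script added in this change), a trainer selects the remote parameter updater by passing `is_local=False` together with the parameter server address:
+
+```python
+trainer = paddle.trainer.SGD(cost=cost,
+                             parameters=parameters,
+                             update_equation=optimizer,
+                             is_local=False,
+                             pserver_spec="localhost:3000")
+```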
+
+### Remote Parameter Updater
+
+The remote parameter updater manages parameters through the remote parameter server, using the client that communicates with the pserver ([The Client Library of Parameter Server Design Doc](pserver_client.md)).
+
+In the PaddlePaddle Python V2 API, the trainer is implemented in Python, and it holds an instance of the parameter updater and calls its functions directly. In this design, we will also expose the API of RemoteParameterUpdater to Python with SWIG.
+
+#### Sparse Remote Parameter Updater
+
+Since we will only implement dense parameter management for now, the mechanism for sparse parameters will be discussed in the next stage.
+
+### Interface Design
+
+TBD
diff --git a/go/cmake/golang.cmake b/go/cmake/golang.cmake
index d38d06de23..7c85fb6298 100644
--- a/go/cmake/golang.cmake
+++ b/go/cmake/golang.cmake
@@ -17,7 +17,7 @@ function(GO_LIBRARY NAME BUILD_TYPE)
   endif()
 
   file(GLOB GO_SOURCE RELATIVE "${CMAKE_CURRENT_SOURCE_DIR}" "*.go")
-  file(RELATIVE_PATH rel ${CMAKE_BINARY_DIR} ${CMAKE_CURRENT_SOURCE_DIR})
+  file(RELATIVE_PATH rel ${CMAKE_CURRENT_BINARY_DIR} ${CMAKE_CURRENT_SOURCE_DIR})
 
   # find Paddle directory.
   get_filename_component(PARENT_DIR ${CMAKE_CURRENT_SOURCE_DIR} DIRECTORY)
@@ -32,12 +32,14 @@ function(GO_LIBRARY NAME BUILD_TYPE)
   # will use the local changes in Paddle rather than checkout Paddle
   # in github.
   add_custom_target(copyPaddle
-    COMMAND ln -sf ${PADDLE_DIR} ${PADDLE_IN_GOPATH})
+    COMMAND rm -rf ${PADDLE_IN_GOPATH}/Paddle
+    COMMAND ln -sf ${PADDLE_DIR} ${PADDLE_IN_GOPATH}/Paddle)
   add_dependencies(goGet copyPaddle)
 
   add_custom_command(OUTPUT ${OUTPUT_DIR}/.timestamp
     COMMAND env GOPATH=${GOPATH} ${CMAKE_Go_COMPILER} build ${BUILD_MODE}
-    -o "${CMAKE_CURRENT_BINARY_DIR}/${LIB_NAME}"
+    -gcflags=-shared -asmflags=-shared -installsuffix=_shared -a
+    -o "${CMAKE_CURRENT_BINARY_DIR}/${LIB_NAME}"
     ${CMAKE_GO_FLAGS} ${GO_SOURCE}
     WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR})
 
diff --git a/go/pserver/cclient/CMakeLists.txt b/go/pserver/cclient/CMakeLists.txt
index c017d74656..e00dd6b14a 100644
--- a/go/pserver/cclient/CMakeLists.txt
+++ b/go/pserver/cclient/CMakeLists.txt
@@ -9,5 +9,15 @@ project(cxx_go C Go)
 include(golang)
 include(flags)
 
-go_library(client STATIC)
+go_library(paddle_pserver_cclient STATIC)
+
+if(PROJ_ROOT)
+  add_custom_command(OUTPUT ${PROJ_ROOT}/paddle/trainer/libpaddle_pserver_cclient.a
+    COMMAND cp ${CMAKE_BINARY_DIR}/go/pserver/cclient/libpaddle_pserver_cclient.h ${PROJ_ROOT}/paddle/trainer/
+    COMMAND cp ${CMAKE_BINARY_DIR}/go/pserver/cclient/libpaddle_pserver_cclient.a ${PROJ_ROOT}/paddle/trainer/
+    WORKING_DIRECTORY ${PROJ_ROOT}/paddle
+    DEPENDS paddle_pserver_cclient)
+  add_custom_target(paddle_pserver_cclient_lib ALL DEPENDS ${PROJ_ROOT}/paddle/trainer/libpaddle_pserver_cclient.a)
+endif(PROJ_ROOT)
+
 add_subdirectory(test)
diff --git a/go/pserver/cclient/test/CMakeLists.txt b/go/pserver/cclient/test/CMakeLists.txt
index 16f84648c1..762772812f 100644
--- a/go/pserver/cclient/test/CMakeLists.txt
+++ b/go/pserver/cclient/test/CMakeLists.txt
@@ -1,11 +1,16 @@
 cmake_minimum_required(VERSION 3.0)
 
-include_directories(${CMAKE_BINARY_DIR})
-
 add_executable(main main.c)
-add_dependencies(main client)
+add_dependencies(main paddle_pserver_cclient)
 
 if(APPLE)
   set(CMAKE_EXE_LINKER_FLAGS "-framework CoreFoundation -framework Security")
 endif()
-target_link_libraries(main ${CMAKE_BINARY_DIR}/libclient.a)
+
+if(PROJ_ROOT)
+  include_directories(${CMAKE_BINARY_DIR}/go/pserver/cclient/)
+  target_link_libraries(main
${CMAKE_BINARY_DIR}/go/pserver/cclient/libpaddle_pserver_cclient.a pthread) +else(PROJ_ROOT) + include_directories(${CMAKE_BINARY_DIR}) + target_link_libraries(main ${CMAKE_BINARY_DIR}/libpaddle_pserver_cclient.a pthread) +endif(PROJ_ROOT) diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index f75a2110b9..0ad890daa2 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -1,6 +1,6 @@ #include -#include "libclient.h" +#include "libpaddle_pserver_cclient.h" void fail() { // TODO(helin): fix: gtest using cmake is not working, using this @@ -14,10 +14,11 @@ int main() { client c = paddle_new_pserver_client(addr, 1); retry: if (paddle_begin_init_params(c)) { + paddle_parameter param; char name_a[] = "param_a"; char name_b[] = "param_b"; - unsigned char content[] = {0x00, 0x11, 0x22}; + unsigned char content[] = {0x00, 0x00, 0x00}; param.element_type = PADDLE_ELEMENT_TYPE_FLOAT32; param.name = name_a; param.content = content; @@ -32,6 +33,7 @@ retry: if (paddle_init_param(c, param, NULL, 0) != 0) { goto retry; } + if (paddle_finish_init_params(c) != 0) { goto retry; } @@ -41,30 +43,31 @@ retry: unsigned char content[] = {0x00, 0x11, 0x22}; paddle_gradient grads[2] = { - {"param_a", PADDLE_ELEMENT_TYPE_INT32, content, 3}, - {"param_b", PADDLE_ELEMENT_TYPE_FLOAT32, content, 3}}; + {"param_a", PADDLE_ELEMENT_TYPE_FLOAT32, content, 3}, + {"param_b", PADDLE_ELEMENT_TYPE_INT32, content, 3}}; - if (!paddle_send_grads(c, grads, 2)) { + if (paddle_send_grads(c, grads, 2) != 0) { fail(); } paddle_parameter* params[2] = {NULL, NULL}; char* names[] = {"param_a", "param_b"}; - if (!paddle_get_params(c, names, params, 2)) { + if (paddle_get_params(c, names, params, 2) != 0) { fail(); } // get parameters again by reusing the allocated parameter buffers. 
- if (!paddle_get_params(c, names, params, 2)) { + if (paddle_get_params(c, names, params, 2) != 0) { fail(); } paddle_release_param(params[0]); paddle_release_param(params[1]); - if (!paddle_save_model(c, "/tmp/")) { + if (paddle_save_model(c, "/tmp/") != 0) { fail(); } + printf("test success!\n"); return 0; } diff --git a/go/pserver/cclient/test/test_train.py b/go/pserver/cclient/test/test_train.py new file mode 100644 index 0000000000..ddd6371e0c --- /dev/null +++ b/go/pserver/cclient/test/test_train.py @@ -0,0 +1,60 @@ +import paddle.v2 as paddle +import paddle.v2.dataset.uci_housing as uci_housing + + +def main(): + # init + paddle.init(use_gpu=False, trainer_count=1, trainer_id=1) + + # network config + x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(13)) + y_predict = paddle.layer.fc(input=x, + param_attr=paddle.attr.Param(name='w'), + size=1, + act=paddle.activation.Linear(), + bias_attr=paddle.attr.Param(name='b')) + y = paddle.layer.data(name='y', type=paddle.data_type.dense_vector(1)) + cost = paddle.layer.mse_cost(input=y_predict, label=y) + + # create parameters + parameters = paddle.parameters.create(cost) + + # create optimizer + optimizer = paddle.optimizer.Momentum(momentum=0) + + trainer = paddle.trainer.SGD(cost=cost, + parameters=parameters, + update_equation=optimizer, + is_local=False, + pserver_spec="localhost:3000") + + # event_handler to print training and testing info + def event_handler(event): + if isinstance(event, paddle.event.EndIteration): + if event.batch_id % 100 == 0: + print "Pass %d, Batch %d, Cost %f" % ( + event.pass_id, event.batch_id, event.cost) + + if isinstance(event, paddle.event.EndPass): + if (event.pass_id + 1) % 10 == 0: + result = trainer.test( + reader=paddle.batch( + uci_housing.test(), batch_size=2), + feeding={'x': 0, + 'y': 1}) + print "Test %d, %.2f" % (event.pass_id, result.cost) + + # training + trainer.train( + reader=paddle.batch( + paddle.reader.shuffle( + uci_housing.train(), buf_size=500), + batch_size=2), + feeding={'x': 0, + 'y': 1}, + event_handler=event_handler, + num_passes=30) + + +if __name__ == '__main__': + main() diff --git a/paddle/api/CMakeLists.txt b/paddle/api/CMakeLists.txt index e147659566..c258a15240 100644 --- a/paddle/api/CMakeLists.txt +++ b/paddle/api/CMakeLists.txt @@ -16,7 +16,7 @@ set(API_HEADER Internal.h) add_library(paddle_api STATIC ${API_SOURCES}) -add_dependencies(paddle_api gen_proto_cpp) +add_dependencies(paddle_api gen_proto_cpp paddle_pserver_cclient_lib) INCLUDE(${SWIG_USE_FILE}) INCLUDE_DIRECTORIES(${PROJ_ROOT}/paddle) @@ -44,7 +44,7 @@ SET(SWIG_MODULE_swig_paddle_EXTRA_DEPS ) IF(APPLE) - SET(MACOS_LD_FLAGS "-undefined dynamic_lookup -Wl,-all_load") + SET(MACOS_LD_FLAGS "-undefined dynamic_lookup -Wl,-all_load -framework CoreFoundation -framework Security") ELSE(APPLE) SET(START_GROUP "-Xlinker -start-group") SET(END_GROUP "-Xlinker -end-group") diff --git a/paddle/api/Paddle.i b/paddle/api/Paddle.i index 068ba286c0..3237e73745 100644 --- a/paddle/api/Paddle.i +++ b/paddle/api/Paddle.i @@ -179,6 +179,7 @@ namespace std { %newobject ParameterOptimizer::needSpecialTraversal; %newobject ParameterUpdater::createLocalUpdater; %newobject ParameterUpdater::createRemoteUpdater; +%newobject ParameterUpdater::createNewRemoteUpdater; %feature("director") UpdateCallback; %feature("autodoc", 1); // To generate method stub, for code hint in ide diff --git a/paddle/api/PaddleAPI.h b/paddle/api/PaddleAPI.h index da0f157abd..7565ea51fe 100644 --- a/paddle/api/PaddleAPI.h +++ 
b/paddle/api/PaddleAPI.h @@ -841,6 +841,8 @@ public: static ParameterUpdater* createRemoteUpdater(OptimizationConfig* config, int passCount, bool useSparseUpdater); + static ParameterUpdater* createNewRemoteUpdater( + OptimizationConfig* config, const std::string pserverSpec); ~ParameterUpdater(); /** diff --git a/paddle/api/ParameterUpdater.cpp b/paddle/api/ParameterUpdater.cpp index 79921ea6e7..eaf8518ae2 100644 --- a/paddle/api/ParameterUpdater.cpp +++ b/paddle/api/ParameterUpdater.cpp @@ -15,6 +15,7 @@ limitations under the License. */ #include "PaddleAPI.h" #include "PaddleAPIPrivate.h" +#include "paddle/trainer/NewRemoteParameterUpdater.h" #include "paddle/trainer/RemoteParameterUpdater.h" #include "paddle/trainer/ThreadParameterUpdater.h" @@ -28,6 +29,14 @@ ParameterUpdater *ParameterUpdater::createLocalUpdater( return updater; } +ParameterUpdater *ParameterUpdater::createNewRemoteUpdater( + OptimizationConfig *config, const std::string pserverSpec) { + auto updater = new ParameterUpdater(); + updater->m->updater.reset(new paddle::NewRemoteParameterUpdater( + config->m->getConfig(), pserverSpec)); + return updater; +} + ParameterUpdater *ParameterUpdater::createRemoteUpdater( OptimizationConfig *config, int passCount, bool useSparseUpdater) { auto updater = new ParameterUpdater(); diff --git a/paddle/trainer/CMakeLists.txt b/paddle/trainer/CMakeLists.txt index 06c019f0a9..9d246b6690 100644 --- a/paddle/trainer/CMakeLists.txt +++ b/paddle/trainer/CMakeLists.txt @@ -4,6 +4,7 @@ set(TRAINER_SOURCES ParameterUpdater.cpp ParamUtil.cpp RemoteParameterUpdater.cpp + NewRemoteParameterUpdater.cpp Tester.cpp Trainer.cpp TrainerInternal.cpp @@ -16,6 +17,7 @@ set(TRAINER_HEADERS ParameterUpdater.h ParamUtil.h RemoteParameterUpdater.h + NewRemoteParameterUpdater.h Tester.h TesterConfig.h Trainer.h @@ -32,7 +34,7 @@ add_style_check_target(paddle_trainer_lib add_style_check_target(paddle_trainer_lib ${TRAINER_HEADERS}) add_dependencies(paddle_trainer_lib - gen_proto_cpp) + gen_proto_cpp paddle_pserver_cclient_lib) macro(add_paddle_exe TARGET_NAME) add_executable(${TARGET_NAME} ${ARGN}) @@ -56,3 +58,10 @@ install(TARGETS paddle_trainer paddle_merge_model set_target_properties(paddle_trainer PROPERTIES INSTALL_RPATH_USE_LINK_PATH TRUE) set_target_properties(paddle_merge_model PROPERTIES INSTALL_RPATH_USE_LINK_PATH TRUE) + +if(APPLE) + set(CMAKE_EXE_LINKER_FLAGS "-framework CoreFoundation -framework Security") +endif() + +target_link_libraries(paddle_trainer ${CMAKE_CURRENT_SOURCE_DIR}/libpaddle_pserver_cclient.a) +target_link_libraries(paddle_trainer_lib ${CMAKE_CURRENT_SOURCE_DIR}/libpaddle_pserver_cclient.a) diff --git a/paddle/trainer/NewRemoteParameterUpdater.cpp b/paddle/trainer/NewRemoteParameterUpdater.cpp new file mode 100644 index 0000000000..9060052e11 --- /dev/null +++ b/paddle/trainer/NewRemoteParameterUpdater.cpp @@ -0,0 +1,88 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
*/ + +#include "NewRemoteParameterUpdater.h" +#include "Trainer.h" +#include "paddle/utils/Stat.h" + +DECLARE_int32(trainer_id); +DECLARE_string(save_dir); + +namespace paddle { +NewRemoteParameterUpdater::NewRemoteParameterUpdater( + const OptimizationConfig &config, const std::string pserverSpec) + : pserverSpec_(pserverSpec) {} + +void NewRemoteParameterUpdater::init( + const std::vector ¶meters) { + ParameterUpdater::init(parameters); + LOG(INFO) << "NewRemoteParameterUpdater init in"; + + for (auto ¶ : parameters_) { + para->getBuf(PARAMETER_VALUE)->zeroMem(); + para->getBuf(PARAMETER_GRADIENT)->zeroMem(); + } + + // create parameter server client. + parameterClient_ = + paddle_new_pserver_client((char *)pserverSpec_.c_str(), FLAGS_trainer_id); + + // init names_ for get parameter through paddle_cclient + names_ = (char **)malloc(parameterSize() * sizeof(char *)); + for (int i = 0; i < parameterSize(); ++i) { + names_[i] = (char *)parameters_[i]->getName().c_str(); + } + + // init new parameter and gradient. + initNewParameter(newParameters_, PARAMETER_VALUE); + initNewParameter(newGradients_, PARAMETER_GRADIENT); + + // init parameter, one trainer will get the opportunity to int parameter and + // send them to parameter server. Others will get the initialized parameter + // from parameter server + if (paddle_begin_init_params(parameterClient_)) { + LOG(INFO) << "paddle_begin_init_params start"; + for (int i = 0; i < parameterSize(); ++i) { + paddle_init_param(parameterClient_, *newParameters_[i], NULL, 0); + } + paddle_finish_init_params(parameterClient_); + LOG(INFO) << "paddle_begin_init_params done"; + } else { + paddle_get_params( + parameterClient_, names_, newParameters_, (int)parameters_.size()); + } + + LOG(INFO) << "NewRemoteParameterUpdater initialized"; +} + +void NewRemoteParameterUpdater::updateImpl(Parameter *para) {} + +void NewRemoteParameterUpdater::finishBatch(real cost) { + LOG(INFO) << "finishBatch in, cost: " << cost; + + // send gradient to parameter server. + paddle_send_grads(parameterClient_, *newGradients_, parameterSize()); + // get the updated parameter from parameterClient. + paddle_get_params(parameterClient_, names_, newParameters_, parameterSize()); + + // clear gradient after update parameter. + for (auto ¶ : parameters_) { + para->getBuf(PARAMETER_GRADIENT)->zeroMem(); + } +} + +void NewRemoteParameterUpdater::startPass() {} + +bool NewRemoteParameterUpdater::finishPass() { return true; } +} diff --git a/paddle/trainer/NewRemoteParameterUpdater.h b/paddle/trainer/NewRemoteParameterUpdater.h new file mode 100644 index 0000000000..33640bc8a3 --- /dev/null +++ b/paddle/trainer/NewRemoteParameterUpdater.h @@ -0,0 +1,105 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
*/ + +#pragma once + +#include +#include +#include "ParameterUpdater.h" +#include "libpaddle_pserver_cclient.h" +#include "paddle/pserver/ParameterClient2.h" +#include "paddle/utils/Queue.h" +#include "paddle/utils/Util.h" + +namespace paddle { + +/** + * New remote parameter updater for dense parameters that use cclient of go. + */ +class NewRemoteParameterUpdater : public ParameterUpdater { +public: + NewRemoteParameterUpdater(const OptimizationConfig& config, + const std::string pserverSpec); + ~NewRemoteParameterUpdater() { + if (newGradients_) { + paddle_pserver_client_release(parameterClient_); + } + } + + /** + * initialize the internal parameter client and itself. + */ + virtual void init(const std::vector& parameters); + /** + * @brief start batch + * + * @note one batch training exhibits stateful feature to help + * to do performance tuning, sgd optimization if necessary. + */ + virtual PassType startBatch(int64_t batchSize) { return PASS_TRAIN; } + + /** + * send parameters to pservers and get returned parameters + * from all pservers if necessary. + */ + virtual void finishBatch(real cost); + virtual void startPass(); + virtual bool finishPass(); + + int parameterSize() { return (int)parameters_.size(); } + + /** + * init parameter of paddle pserver cclient. + * @param new_paras + * @param type + */ + void initNewParameter(paddle_parameter**& new_paras, ParameterType type) { + new_paras = + (paddle_parameter**)malloc(sizeof(paddle_parameter*) * parameterSize()); + for (int i = 0; i < parameterSize(); ++i) { + new_paras[i] = (paddle_parameter*)malloc(sizeof(paddle_parameter)); + memset(new_paras[i], 0, sizeof(paddle_parameter)); + } + + for (int i = 0; i < parameterSize(); ++i) { + ParameterPtr para = parameters_[i]; + new_paras[i]->content_len = 10; + new_paras[i]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32; + new_paras[i]->name = (char*)para->getName().c_str(); + new_paras[i]->content = + (unsigned char*)(para->getBuf(type).get()->getData()); + new_paras[i]->content_len = (int)para->getBuf(type).get()->getSize(); + } + } + +protected: + /** + * work need to do after finishBatch + */ + virtual void updateImpl(Parameter* para); + +protected: + /// internal parameter client object for exchanging data with pserver + client parameterClient_ = -1; + /// the parameters for new pserver client + paddle_parameter** newParameters_; + /// the gradinets for new pserver client + paddle_parameter** newGradients_; + /// the names for new parameters. + char** names_; + /// the specification of parameter server "host1:port,host1:port" + std::string pserverSpec_; +}; + +} // namespace paddle diff --git a/python/paddle/v2/optimizer.py b/python/paddle/v2/optimizer.py index 5e99d4a241..1ef2dceca9 100644 --- a/python/paddle/v2/optimizer.py +++ b/python/paddle/v2/optimizer.py @@ -45,7 +45,12 @@ class Optimizer(object): return swig_api.ParameterUpdater.createRemoteUpdater( self.__opt_conf__, pass_num, use_sparse_updater) - def create_updater(self, is_local, num_passes, use_sparse_updater): + def __create_new_remote_updater__(self, pserver_spec): + return swig_api.ParameterUpdater.createNewRemoteUpdater( + self.__opt_conf__, pserver_spec) + + def create_updater(self, is_local, num_passes, use_sparse_updater, + pserver_spec): """ create proper parameter_updater by configuration. 
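+        :param pserver_spec: the parameter server addresses, e.g.
+                             "localhost:3000". When set (and is_local is
+                             False), the new remote updater backed by the Go
+                             pserver cclient is created.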
:param is_local: create local or remote parameter updater @@ -64,8 +69,12 @@ class Optimizer(object): if is_local: parameter_updater = self.__create_local_updater__() else: - parameter_updater = self.__create_remote_updater__( - num_passes, use_sparse_updater) + if pserver_spec is None: + parameter_updater = self.__create_remote_updater__( + num_passes, use_sparse_updater) + else: + parameter_updater = self.__create_new_remote_updater__( + pserver_spec) return parameter_updater diff --git a/python/paddle/v2/trainer.py b/python/paddle/v2/trainer.py index 8fdb67cc26..f9658a8c5d 100644 --- a/python/paddle/v2/trainer.py +++ b/python/paddle/v2/trainer.py @@ -49,7 +49,8 @@ class SGD(object): parameters, update_equation, extra_layers=None, - is_local=True): + is_local=True, + pserver_spec=None): if not isinstance(parameters, v2_parameters.Parameters): raise TypeError('parameters should be parameters') @@ -63,6 +64,7 @@ class SGD(object): self.__parameters__ = parameters self.__topology_in_proto__ = topology.proto() self.__is_local__ = is_local + self.__pserver_spec__ = pserver_spec self.__use_sparse_updater__ = self.__topology__.use_sparse_updater() # # In local mode, disable sparse_remote_update. @@ -126,7 +128,8 @@ class SGD(object): __check_train_args__(**locals()) self.__parameter_updater__ = self.__optimizer__.create_updater( - self.__is_local__, num_passes, self.__use_sparse_updater__) + self.__is_local__, num_passes, self.__use_sparse_updater__, + self.__pserver_spec__) self.__parameter_updater__.init(self.__gradient_machine__) self.__gradient_machine__.start() From f18d83f3c492711271c0a44dcb6ef4ac62cc61c7 Mon Sep 17 00:00:00 2001 From: dangqingqing Date: Thu, 8 Jun 2017 16:11:40 +0800 Subject: [PATCH 13/39] follow comments --- paddle/function/RowConvOp.cpp | 11 +++++------ paddle/gserver/layers/RowConvLayer.cpp | 5 +++-- python/paddle/trainer_config_helpers/layers.py | 8 ++++++++ 3 files changed, 16 insertions(+), 8 deletions(-) diff --git a/paddle/function/RowConvOp.cpp b/paddle/function/RowConvOp.cpp index c3abb64971..b6501e8f4d 100644 --- a/paddle/function/RowConvOp.cpp +++ b/paddle/function/RowConvOp.cpp @@ -136,6 +136,7 @@ public: // check CHECK_EQ(2UL, inputs.size()); CHECK_EQ(1UL, outputs.size()); + // TODO(qingqing): support ASSIGN_TO. 
CHECK_EQ(outputs[0].getArgType(), ADD_TO); CHECK(inputs[0].isSequenceArg() && outputs[0].isSequenceArg()) << "SequenceArg required here."; @@ -144,9 +145,7 @@ public: auto w = inputs[1]; CHECK(in.data() && out.data() && in.getSequenceId().data()); CHECK_EQ(in.shape().ndims(), 2UL); - CHECK_EQ(out.shape().ndims(), 2UL); - CHECK_EQ(in.shape()[1], out.shape()[1]); - CHECK_EQ(in.shape()[0], out.shape()[0]); + CHECK(in.shape() == out.shape()); CHECK_EQ(w.shape()[1], in.shape()[1]); auto outMat = out.matrix(); @@ -176,6 +175,7 @@ public: template class RowConvGradFunc : public FunctionBase { + // TODO(qingqing): split into RowConvDataFunc and RowConvWeightFunc public: void init(const FuncConfig& config) override {} @@ -196,9 +196,8 @@ public: auto wGrad = outputs[1]; CHECK_EQ(in.shape().ndims(), 2UL); - CHECK_EQ(outGrad.shape().ndims(), 2UL); - CHECK_EQ(in.shape()[1], outGrad.shape()[1]); - CHECK_EQ(in.shape()[0], outGrad.shape()[0]); + CHECK(in.shape() == inGrad.shape()); + CHECK(in.shape() == outGrad.shape()); CHECK_EQ(wGrad.shape()[1], in.shape()[1]); const auto outGMat = outGrad.matrix(); diff --git a/paddle/gserver/layers/RowConvLayer.cpp b/paddle/gserver/layers/RowConvLayer.cpp index 5302e0e1a8..54d77999ad 100644 --- a/paddle/gserver/layers/RowConvLayer.cpp +++ b/paddle/gserver/layers/RowConvLayer.cpp @@ -43,13 +43,14 @@ void RowConvLayer::forward(PassType passType) { resetOutput(height, width); const auto startPos = getInput(0).sequenceStartPositions->getVector(useGpu_); - wDims_ = TensorShape({contexLength_, width}); + MatrixPtr w = weight_->getW(); + wDims_ = TensorShape({w->getHeight(), w->getWidth()}); MatrixPtr outV = getOutputValue(); BufferArgs inputs; BufferArgs outputs; inputs.addArg(*getInputValue(0), *startPos); - inputs.addArg(*weight_->getW(), wDims_); + inputs.addArg(*w, wDims_); outputs.addArg(*getOutputValue(), *startPos, ADD_TO); { diff --git a/python/paddle/trainer_config_helpers/layers.py b/python/paddle/trainer_config_helpers/layers.py index 1fd62cda50..1c3d776a4f 100755 --- a/python/paddle/trainer_config_helpers/layers.py +++ b/python/paddle/trainer_config_helpers/layers.py @@ -191,6 +191,14 @@ class LayerType(object): PAD_LAYER = "pad" MULTIPLEX_LAYER = "multiplex" ROW_CONV_LAYER = "row_conv" + + PRINT_LAYER = 'print' + PRIORBOX_LAYER = 'priorbox' + + CTC_LAYER = 'ctc' + WARP_CTC_LAYER = 'warp_ctc' + CRF_LAYER = 'crf' + CRF_DECODING_LAYER = 'crf_decoding' NCE_LAYER = 'nce' RANK_COST = 'rank-cost' From 6f1c91da992b9f7b230633c0ac56db184d4df5c2 Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Thu, 8 Jun 2017 22:38:30 +0800 Subject: [PATCH 14/39] refine code --- go/pserver/cclient/test/main.c | 1 - paddle/trainer/NewRemoteParameterUpdater.cpp | 6 +- paddle/trainer/NewRemoteParameterUpdater.h | 68 ++++++++++++-------- 3 files changed, 43 insertions(+), 32 deletions(-) diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index 0ad890daa2..b95abf96b1 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -14,7 +14,6 @@ int main() { client c = paddle_new_pserver_client(addr, 1); retry: if (paddle_begin_init_params(c)) { - paddle_parameter param; char name_a[] = "param_a"; char name_b[] = "param_b"; diff --git a/paddle/trainer/NewRemoteParameterUpdater.cpp b/paddle/trainer/NewRemoteParameterUpdater.cpp index 9060052e11..13110adb45 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.cpp +++ b/paddle/trainer/NewRemoteParameterUpdater.cpp @@ -45,8 +45,8 @@ void NewRemoteParameterUpdater::init( } // init new parameter and 
gradient. - initNewParameter(newParameters_, PARAMETER_VALUE); - initNewParameter(newGradients_, PARAMETER_GRADIENT); + newParameters_ = initNewParameter(PARAMETER_VALUE); + newGradients_ = initNewParameter(PARAMETER_GRADIENT); // init parameter, one trainer will get the opportunity to int parameter and // send them to parameter server. Others will get the initialized parameter @@ -60,7 +60,7 @@ void NewRemoteParameterUpdater::init( LOG(INFO) << "paddle_begin_init_params done"; } else { paddle_get_params( - parameterClient_, names_, newParameters_, (int)parameters_.size()); + parameterClient_, names_, newParameters_, parameterSize()); } LOG(INFO) << "NewRemoteParameterUpdater initialized"; diff --git a/paddle/trainer/NewRemoteParameterUpdater.h b/paddle/trainer/NewRemoteParameterUpdater.h index 33640bc8a3..5fd404dcf8 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.h +++ b/paddle/trainer/NewRemoteParameterUpdater.h @@ -32,9 +32,9 @@ public: NewRemoteParameterUpdater(const OptimizationConfig& config, const std::string pserverSpec); ~NewRemoteParameterUpdater() { - if (newGradients_) { - paddle_pserver_client_release(parameterClient_); - } + releaseNewParameter(newParameters_); + releaseNewParameter(newGradients_); + if (parameterClient_ >= 0) paddle_pserver_client_release(parameterClient_); } /** @@ -57,37 +57,49 @@ public: virtual void startPass(); virtual bool finishPass(); - int parameterSize() { return (int)parameters_.size(); } - +protected: /** - * init parameter of paddle pserver cclient. - * @param new_paras - * @param type + * work need to do after finishBatch */ - void initNewParameter(paddle_parameter**& new_paras, ParameterType type) { - new_paras = - (paddle_parameter**)malloc(sizeof(paddle_parameter*) * parameterSize()); - for (int i = 0; i < parameterSize(); ++i) { - new_paras[i] = (paddle_parameter*)malloc(sizeof(paddle_parameter)); - memset(new_paras[i], 0, sizeof(paddle_parameter)); + virtual void updateImpl(Parameter* para); + +private: + int parameterSize() { + return (int)parameters_.size(); } - for (int i = 0; i < parameterSize(); ++i) { - ParameterPtr para = parameters_[i]; - new_paras[i]->content_len = 10; - new_paras[i]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32; - new_paras[i]->name = (char*)para->getName().c_str(); - new_paras[i]->content = - (unsigned char*)(para->getBuf(type).get()->getData()); - new_paras[i]->content_len = (int)para->getBuf(type).get()->getSize(); + /** + * init parameter of paddle pserver cclient. 
+ * @param new_params + * @param type + */ + paddle_parameter** initNewParameter(ParameterType type) { + paddle_parameter** new_params = + (paddle_parameter**)malloc(sizeof(paddle_parameter*) * parameterSize()); + for (int i = 0; i < parameterSize(); ++i) { + new_params[i] = (paddle_parameter*)malloc(sizeof(paddle_parameter)); + memset(new_params[i], 0, sizeof(paddle_parameter)); + } + + for (int i = 0; i < parameterSize(); ++i) { + ParameterPtr param = parameters_[i]; + new_params[i]->content_len = 10; + new_params[i]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32; + new_params[i]->name = (char*)param->getName().c_str(); + new_params[i]->content = + (unsigned char*)(param->getBuf(type).get()->getData()); + new_params[i]->content_len = (int)param->getBuf(type).get()->getSize(); + } + return new_params; } - } -protected: - /** - * work need to do after finishBatch - */ - virtual void updateImpl(Parameter* para); + void releaseNewParameter(paddle_parameter** newParams) { + if (newParams != NULL) { + for (int i = 0; i < parameterSize(); ++i) { + paddle_release_param(newParams[i]); + } + } + } protected: /// internal parameter client object for exchanging data with pserver From 28476f5f6e81a219914cf70f92a6c0fde2a6c203 Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Fri, 9 Jun 2017 14:41:22 +0800 Subject: [PATCH 15/39] fix the problem of paddle_send_grad --- go/pserver/cclient/cclient.go | 4 +- go/pserver/cclient/test/main.c | 50 ++++++++++++++-- paddle/trainer/NewRemoteParameterUpdater.cpp | 8 ++- paddle/trainer/NewRemoteParameterUpdater.h | 62 ++++++++++---------- 4 files changed, 85 insertions(+), 39 deletions(-) diff --git a/go/pserver/cclient/cclient.go b/go/pserver/cclient/cclient.go index 0b4aa79806..662e925427 100644 --- a/go/pserver/cclient/cclient.go +++ b/go/pserver/cclient/cclient.go @@ -164,10 +164,10 @@ func paddle_finish_init_params(client C.client) C.int { } //export paddle_send_grads -func paddle_send_grads(client C.client, grads *C.paddle_gradient, total C.int) C.int { +func paddle_send_grads(client C.client, grads **C.paddle_gradient, total C.int) C.int { var gs []pserver.Gradient for i := 0; i < int(total); i++ { - grad := (*C.paddle_gradient)(unsafe.Pointer((uintptr(unsafe.Pointer(grads)) + uintptr(i)*unsafe.Sizeof(*grads)))) + grad := *(**C.paddle_gradient)(unsafe.Pointer((uintptr(unsafe.Pointer(grads)) + uintptr(i)*unsafe.Sizeof(*grads)))) et := pserver.ElementType(grad.element_type) name := C.GoString(grad.name) content := cArrayToSlice(unsafe.Pointer(grad.content), int(grad.content_len)) diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index b95abf96b1..1039139307 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -1,4 +1,5 @@ #include +#include #include "libpaddle_pserver_cclient.h" @@ -9,6 +10,21 @@ void fail() { exit(-1); } +void print_parameter(paddle_gradient* param) { + if (param == NULL) { + printf("param is NULL!!\n"); + } else { + printf("==== parameter ====\n"); + printf("name: %s\n", param->name); + printf("content_len: %d\n", param->content_len); + printf("content_type: %d\n", param->element_type); + for (int i = 0; i < param->content_len; ++i) { + printf("0x%x ", param->content[i]); + } + printf("\n"); + } +} + int main() { char addr[] = "localhost:3000"; client c = paddle_new_pserver_client(addr, 1); @@ -40,12 +56,27 @@ retry: fail(); } - unsigned char content[] = {0x00, 0x11, 0x22}; - paddle_gradient grads[2] = { - {"param_a", PADDLE_ELEMENT_TYPE_FLOAT32, content, 3}, - {"param_b", 
PADDLE_ELEMENT_TYPE_INT32, content, 3}}; + unsigned char content1[] = {0x12, 0x23, 0x34}; + unsigned char content2[] = {0x45, 0x56, 0x67}; + + paddle_gradient** new_params = + (paddle_gradient**)malloc(sizeof(paddle_gradient*) * 2); + new_params[0] = (paddle_gradient*)malloc(sizeof(paddle_gradient)); + new_params[0]->name = "param_a"; + new_params[0]->content = content1; + new_params[0]->content_len = 3; + new_params[0]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32; - if (paddle_send_grads(c, grads, 2) != 0) { + new_params[1] = (paddle_gradient*)malloc(sizeof(paddle_gradient)); + new_params[1]->name = "param_b"; + new_params[1]->content = content2; + new_params[1]->content_len = 3; + new_params[1]->element_type = PADDLE_ELEMENT_TYPE_INT32; + + print_parameter(new_params[0]); + print_parameter(new_params[1]); + + if (paddle_send_grads(c, new_params, 2) != 0) { fail(); } @@ -55,6 +86,15 @@ retry: fail(); } + print_parameter(params[0]); + print_parameter(params[1]); + + /// change name of parameter. + char* names2[] = {"param_1", "param_2"}; + if (paddle_get_params(c, names2, params, 2) == 0) { + fail(); + } + // get parameters again by reusing the allocated parameter buffers. if (paddle_get_params(c, names, params, 2) != 0) { fail(); diff --git a/paddle/trainer/NewRemoteParameterUpdater.cpp b/paddle/trainer/NewRemoteParameterUpdater.cpp index 13110adb45..35df973897 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.cpp +++ b/paddle/trainer/NewRemoteParameterUpdater.cpp @@ -22,7 +22,11 @@ DECLARE_string(save_dir); namespace paddle { NewRemoteParameterUpdater::NewRemoteParameterUpdater( const OptimizationConfig &config, const std::string pserverSpec) - : pserverSpec_(pserverSpec) {} + : parameterClient_(-1), + newParameters_(nullptr), + newGradients_(nullptr), + names_(nullptr), + pserverSpec_(pserverSpec) {} void NewRemoteParameterUpdater::init( const std::vector ¶meters) { @@ -72,7 +76,7 @@ void NewRemoteParameterUpdater::finishBatch(real cost) { LOG(INFO) << "finishBatch in, cost: " << cost; // send gradient to parameter server. - paddle_send_grads(parameterClient_, *newGradients_, parameterSize()); + paddle_send_grads(parameterClient_, newGradients_, parameterSize()); // get the updated parameter from parameterClient. paddle_get_params(parameterClient_, names_, newParameters_, parameterSize()); diff --git a/paddle/trainer/NewRemoteParameterUpdater.h b/paddle/trainer/NewRemoteParameterUpdater.h index 5fd404dcf8..ed82de3f99 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.h +++ b/paddle/trainer/NewRemoteParameterUpdater.h @@ -32,6 +32,7 @@ public: NewRemoteParameterUpdater(const OptimizationConfig& config, const std::string pserverSpec); ~NewRemoteParameterUpdater() { + LOG(INFO) << "~NewRemoteParameterUpdater in"; releaseNewParameter(newParameters_); releaseNewParameter(newGradients_); if (parameterClient_ >= 0) paddle_pserver_client_release(parameterClient_); @@ -64,46 +65,47 @@ protected: virtual void updateImpl(Parameter* para); private: - int parameterSize() { - return (int)parameters_.size(); - } + int parameterSize() { return (int)parameters_.size(); } - /** - * init parameter of paddle pserver cclient. 
- * @param new_params - * @param type - */ - paddle_parameter** initNewParameter(ParameterType type) { - paddle_parameter** new_params = - (paddle_parameter**)malloc(sizeof(paddle_parameter*) * parameterSize()); - for (int i = 0; i < parameterSize(); ++i) { - new_params[i] = (paddle_parameter*)malloc(sizeof(paddle_parameter)); - memset(new_params[i], 0, sizeof(paddle_parameter)); - } + /** + * init parameter of go paddle pserver cclient. + * @param new_params + * @param type + */ + paddle_parameter** initNewParameter(ParameterType type) { + paddle_parameter** new_params = + (paddle_parameter**)malloc(sizeof(paddle_parameter*) * parameterSize()); + for (int i = 0; i < parameterSize(); ++i) { + new_params[i] = (paddle_parameter*)malloc(sizeof(paddle_parameter)); + memset(new_params[i], 0, sizeof(paddle_parameter)); + } - for (int i = 0; i < parameterSize(); ++i) { - ParameterPtr param = parameters_[i]; - new_params[i]->content_len = 10; - new_params[i]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32; - new_params[i]->name = (char*)param->getName().c_str(); - new_params[i]->content = - (unsigned char*)(param->getBuf(type).get()->getData()); - new_params[i]->content_len = (int)param->getBuf(type).get()->getSize(); - } - return new_params; + for (int i = 0; i < parameterSize(); ++i) { + ParameterPtr param = parameters_[i]; + new_params[i]->content_len = 10; + new_params[i]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32; + new_params[i]->name = (char*)param->getName().c_str(); + new_params[i]->content = + (unsigned char*)(param->getBuf(type).get()->getData()); + new_params[i]->content_len = (int)param->getBuf(type).get()->getSize(); } + return new_params; + } - void releaseNewParameter(paddle_parameter** newParams) { - if (newParams != NULL) { - for (int i = 0; i < parameterSize(); ++i) { - paddle_release_param(newParams[i]); + void releaseNewParameter(paddle_parameter** newParams) { + if (newParams != nullptr) { + for (int i = 0; i < parameterSize(); ++i) { + auto param = newParams[i]; + if (param != nullptr) { + paddle_release_param(param); } } } + } protected: /// internal parameter client object for exchanging data with pserver - client parameterClient_ = -1; + client parameterClient_; /// the parameters for new pserver client paddle_parameter** newParameters_; /// the gradinets for new pserver client From 966bf9ae1f090d404f56033e1c9f51c15eb6c2ad Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Fri, 9 Jun 2017 16:34:24 +0800 Subject: [PATCH 16/39] fix the problem in cclient when malloc paddle_parameter --- go/pserver/cclient/cclient.go | 7 +++++-- go/pserver/cclient/test/main.c | 9 +++++---- go/pserver/service.go | 4 ++++ paddle/trainer/NewRemoteParameterUpdater.h | 4 ++-- 4 files changed, 16 insertions(+), 8 deletions(-) diff --git a/go/pserver/cclient/cclient.go b/go/pserver/cclient/cclient.go index 662e925427..be16a143d8 100644 --- a/go/pserver/cclient/cclient.go +++ b/go/pserver/cclient/cclient.go @@ -42,6 +42,7 @@ import ( "strings" "sync" "unsafe" + "fmt" "github.com/PaddlePaddle/Paddle/go/pserver" ) @@ -204,12 +205,14 @@ func paddle_get_params(client C.client, names **C.char, dst **C.paddle_parameter } p := ps[i] - param := *(**C.paddle_parameter)(unsafe.Pointer((uintptr(unsafe.Pointer(dst)) + uintptr(i)*unsafe.Sizeof(*dst)))) + paramPtr := (**C.paddle_parameter)(unsafe.Pointer((uintptr(unsafe.Pointer(dst)) + uintptr(i)*unsafe.Sizeof(*dst)))) + param := *paramPtr nameReady := false contentAllocated := false if unsafe.Pointer(param) == nullPtr { - param = (*C.paddle_parameter)(C.calloc(1, 
C.size_t(unsafe.Sizeof(*param)))) + *paramPtr = (*C.paddle_parameter)(C.calloc(1, C.size_t(unsafe.Sizeof(*param)))) + param = *paramPtr } else { if unsafe.Pointer(param.name) != nullPtr { if n := C.GoString(param.name); n != p.Name { diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index 1039139307..59cf5756fd 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -21,7 +21,7 @@ void print_parameter(paddle_gradient* param) { for (int i = 0; i < param->content_len; ++i) { printf("0x%x ", param->content[i]); } - printf("\n"); + printf("\n\n"); } } @@ -33,17 +33,18 @@ retry: paddle_parameter param; char name_a[] = "param_a"; char name_b[] = "param_b"; - unsigned char content[] = {0x00, 0x00, 0x00}; + unsigned char content1[] = {0x01, 0x02, 0x03}; param.element_type = PADDLE_ELEMENT_TYPE_FLOAT32; param.name = name_a; - param.content = content; + param.content = content1; param.content_len = 3; if (paddle_init_param(c, param, NULL, 0) != 0) { goto retry; } + unsigned char content2[] = {0x04, 0x05, 0x06}; param.element_type = PADDLE_ELEMENT_TYPE_INT32; param.name = name_b; - param.content = content; + param.content = content2; param.content_len = 3; if (paddle_init_param(c, param, NULL, 0) != 0) { goto retry; diff --git a/go/pserver/service.go b/go/pserver/service.go index d5787b9708..ab814662b6 100644 --- a/go/pserver/service.go +++ b/go/pserver/service.go @@ -29,6 +29,10 @@ type Parameter struct { Content []byte } +func (p *Parameter) toString() { + fmt.Println(p.Name, p.ElementType, p.Content) +} + // ParameterWithConfig contains the parameter and the configuration. type ParameterWithConfig struct { Param Parameter diff --git a/paddle/trainer/NewRemoteParameterUpdater.h b/paddle/trainer/NewRemoteParameterUpdater.h index ed82de3f99..1dbb3658fb 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.h +++ b/paddle/trainer/NewRemoteParameterUpdater.h @@ -33,8 +33,8 @@ public: const std::string pserverSpec); ~NewRemoteParameterUpdater() { LOG(INFO) << "~NewRemoteParameterUpdater in"; - releaseNewParameter(newParameters_); - releaseNewParameter(newGradients_); +// releaseNewParameter(newParameters_); +// releaseNewParameter(newGradients_); if (parameterClient_ >= 0) paddle_pserver_client_release(parameterClient_); } From ce3408a7a610e564b1ef08719fbe82b600685a8f Mon Sep 17 00:00:00 2001 From: Helin Wang Date: Fri, 9 Jun 2017 22:28:57 +0000 Subject: [PATCH 17/39] modify pserver client C API, create better test Please refer to the change design doc for what in API have changed. 
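A minimal sketch of the call sequence under the revised API (assuming the
generated libclient.h header; return-code checks and the retry loop used by
the updated main.c test are left out):

```c
#include "libclient.h"

void example(void) {
  char addr[] = "localhost:3000";
  /* selected == 1: this trainer is picked to initialize the parameters. */
  paddle_pserver_client c = paddle_new_pserver_client(addr, 1);

  char name[] = "param_a";
  unsigned char value[2000] = {0};
  paddle_parameter p = {name, PADDLE_ELEMENT_TYPE_FLOAT32, value, 2000};

  if (paddle_begin_init_params(c)) {
    paddle_init_param(c, p, NULL, 0); /* NULL parameter config in this sketch */
    paddle_finish_init_params(c);
  }

  unsigned char grad[2000] = {0};
  paddle_gradient g = {name, PADDLE_ELEMENT_TYPE_FLOAT32, grad, 2000};
  paddle_send_grads(c, &g, 1);

  /* paddle_get_params no longer allocates: the receive buffer is owned by
     the caller, and name, content and content_len must be filled in first. */
  paddle_parameter* dst[1] = {&p};
  paddle_get_params(c, dst, 1);

  paddle_pserver_client_release(c);
}
```

The main behavioral change is that paddle_get_params stops allocating on behalf
of the caller; the trainer pre-populates the paddle_parameter structs and keeps
ownership of the buffers.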
--- doc/design/cluster_train/pserver_client.md | 37 +++++--- go/pserver/cclient/cclient.go | 98 +++++++++------------- go/pserver/cclient/test/CMakeLists.txt | 2 + go/pserver/cclient/test/main.c | 96 ++++++++++++--------- go/pserver/client_test.go | 2 +- go/pserver/service.go | 12 +-- go/pserver/service_test.go | 10 +-- 7 files changed, 139 insertions(+), 118 deletions(-) diff --git a/doc/design/cluster_train/pserver_client.md b/doc/design/cluster_train/pserver_client.md index b3e4079010..474b8c572c 100644 --- a/doc/design/cluster_train/pserver_client.md +++ b/doc/design/cluster_train/pserver_client.md @@ -74,14 +74,25 @@ typedef enum { typedef struct { char* name; paddle_element_type element_type; - void* content; + unsigned char* content; int content_len; } paddle_parameter, paddle_gradient; -typedef struct paddle_pserver_client paddle_pserver_client; +typedef int paddle_pserver_client; -paddle_pserver_client* paddle_new_pserver_client(); -void paddle_pserver_client_release(paddle_pserver_client* client); +/** + * @brief creates a pserver client that talks to etcd for coordination. + */ +paddle_pserver_client paddle_new_etcd_pserver_client(char* etcd_addr); + +/** + * @brief creates a pserver client given pserver addresses. + * + * @param pserver_addrs comma-separated pserver addresses. + * @param selected if current pserver client is selected to initialize all parameter servers. + */ +paddle_pserver_client paddle_new_pserver_client(char* pserver_addrs, int selected); +void paddle_pserver_client_release(paddle_pserver_client c); /** * @brief paddle_begin_init_params begins to initialize parameters on @@ -95,7 +106,7 @@ void paddle_pserver_client_release(paddle_pserver_client* client); * @return 1 if the trainer is selected to initialize parameter * servers, otherwise 0. */ -int paddle_begin_init_params(paddle_pserver_client* client); +int paddle_begin_init_params(paddle_pserver_client client); /** * @brief paddle_init_param initializes the parameter on parameter @@ -109,7 +120,7 @@ int paddle_begin_init_params(paddle_pserver_client* client); * @paddle_begin_init_param). Or simply exit the program and wait for * the cluster management system to restart the trainer. */ -int paddle_init_param(paddle_pserver_client* client, paddle_parameter param, const unsigned char* param_config_proto, int config_len); +int paddle_init_param(paddle_pserver_client client, paddle_parameter param, const unsigned char* param_config_proto, int config_len); /** * @brief paddle_finish_init_params tells parameter servers client has @@ -120,7 +131,7 @@ int paddle_init_param(paddle_pserver_client* client, paddle_parameter param, con * @paddle_begin_init_param). Or simply exit the program and wait for * the cluster management system to restart the trainer. */ -int paddle_finish_init_params(paddle_pserver_client* client); +int paddle_finish_init_params(paddle_pserver_client client); /** * @brief paddle_send_grads sends gradients to parameter servers for @@ -131,7 +142,7 @@ int paddle_finish_init_params(paddle_pserver_client* client); * @param learning_rate the learning rate for the gradients. * @return 0 if successful, otherwise -1. */ -int paddle_send_grads(paddle_pserver_client* client, const paddle_gradient* grads, int len); +int paddle_send_grads(paddle_pserver_client client, const paddle_gradient* grads, int len); /** * @brief paddle_get_params gets parameters from parameter servers. 
@@ -139,13 +150,15 @@ int paddle_send_grads(paddle_pserver_client* client, const paddle_gradient* grad * paddle_get_params will block until parameters are initialized on * the parameter servers. * - * @param names the array of names of the parameters to get. - * @param dst the destination array of parameters to save to. + * @param dst the destination array of parameter pointers to save to. + * The parameter pointer must be pre-popullated with required parameter name, + * and the content of parameter must be pre-allocated of the size of required + * parameter on pserver. * @param len the length of the names array and the paddle_parameter * array. * @return 0 if successful, otherwise -1. */ -int paddle_get_params(paddle_pserver_client* client, const char** names, paddle_parameter* dst, int len); +int paddle_get_params(paddle_pserver_client client, paddle_parameter** dst, int len); /** * @brief paddle_save_model indicates parameters to save the parameter @@ -154,5 +167,5 @@ int paddle_get_params(paddle_pserver_client* client, const char** names, paddle_ * @param path the path to save parameters. * @return 0 if successful, otherwise -1. */ -int paddle_save_model(paddle_pserver_client* client, const char* path); +int paddle_save_model(paddle_pserver_client client, const char* path); ``` diff --git a/go/pserver/cclient/cclient.go b/go/pserver/cclient/cclient.go index 0b4aa79806..c87f3853c5 100644 --- a/go/pserver/cclient/cclient.go +++ b/go/pserver/cclient/cclient.go @@ -19,21 +19,7 @@ typedef struct { int content_len; } paddle_parameter, paddle_gradient; -static inline void paddle_release_param(paddle_parameter* param) { - if (param != NULL) { - if (param->name != NULL) { - free(param->name); - } - - if (param->content != NULL) { - free(param->content); - } - - free(param); - } -} - -typedef int client; +typedef int paddle_pserver_client; */ import "C" @@ -48,10 +34,10 @@ import ( var nullPtr = unsafe.Pointer(uintptr(0)) var mu sync.Mutex -var handleMap = make(map[C.client]*pserver.Client) -var curHandle C.client +var handleMap = make(map[C.paddle_pserver_client]*pserver.Client) +var curHandle C.paddle_pserver_client -func add(c *pserver.Client) C.client { +func add(c *pserver.Client) C.paddle_pserver_client { mu.Lock() defer mu.Unlock() client := curHandle @@ -60,13 +46,13 @@ func add(c *pserver.Client) C.client { return client } -func get(client C.client) *pserver.Client { +func get(client C.paddle_pserver_client) *pserver.Client { mu.Lock() defer mu.Unlock() return handleMap[client] } -func remove(client C.client) *pserver.Client { +func remove(client C.paddle_pserver_client) *pserver.Client { mu.Lock() defer mu.Unlock() h := handleMap[client] @@ -100,7 +86,7 @@ func (l lister) List() []pserver.Server { } //export paddle_new_pserver_client -func paddle_new_pserver_client(addrs *C.char, selected int) C.client { +func paddle_new_pserver_client(addrs *C.char, selected int) C.paddle_pserver_client { a := C.GoString(addrs) as := strings.Split(a, ",") servers := make([]pserver.Server, len(as)) @@ -113,18 +99,18 @@ func paddle_new_pserver_client(addrs *C.char, selected int) C.client { } //export paddle_new_etcd_pserver_client -func paddle_new_etcd_pserver_client(etcd_addr *C.char) C.client { +func paddle_new_etcd_pserver_client(etcd_addr *C.char) C.paddle_pserver_client { // TODO(helin): fault tolerant pserver client using etcd. 
panic("not implemented.") } //export paddle_pserver_client_release -func paddle_pserver_client_release(client C.client) { +func paddle_pserver_client_release(client C.paddle_pserver_client) { remove(client) } //export paddle_begin_init_params -func paddle_begin_init_params(client C.client) C.int { +func paddle_begin_init_params(client C.paddle_pserver_client) C.int { c := get(client) if selected := c.BeginInitParams(); selected { return 1 @@ -133,7 +119,7 @@ func paddle_begin_init_params(client C.client) C.int { } //export paddle_init_param -func paddle_init_param(client C.client, param C.paddle_parameter, param_config unsafe.Pointer, config_len C.int) C.int { +func paddle_init_param(client C.paddle_pserver_client, param C.paddle_parameter, param_config unsafe.Pointer, config_len C.int) C.int { et := pserver.ElementType(param.element_type) name := C.GoString(param.name) content := cArrayToSlice(unsafe.Pointer(param.content), int(param.content_len)) @@ -143,7 +129,12 @@ func paddle_init_param(client C.client, param C.paddle_parameter, param_config u } c := get(client) err := c.InitParam(pc) + if err != nil { + if err.Error() == pserver.AlreadyInitialized { + log.Println("parameter", name, "already initialized, treat paddle_init_param as sucessful.") + return 0 + } log.Println(err) return -1 } @@ -152,10 +143,15 @@ func paddle_init_param(client C.client, param C.paddle_parameter, param_config u } //export paddle_finish_init_params -func paddle_finish_init_params(client C.client) C.int { +func paddle_finish_init_params(client C.paddle_pserver_client) C.int { c := get(client) err := c.FinishInitParams() if err != nil { + if err.Error() == pserver.AlreadyInitialized { + log.Println("parameters already initialized, treat paddle_finish_init_params as sucessful.") + return 0 + } + log.Println(err) return -1 } @@ -164,7 +160,7 @@ func paddle_finish_init_params(client C.client) C.int { } //export paddle_send_grads -func paddle_send_grads(client C.client, grads *C.paddle_gradient, total C.int) C.int { +func paddle_send_grads(client C.paddle_pserver_client, grads *C.paddle_gradient, total C.int) C.int { var gs []pserver.Gradient for i := 0; i < int(total); i++ { grad := (*C.paddle_gradient)(unsafe.Pointer((uintptr(unsafe.Pointer(grads)) + uintptr(i)*unsafe.Sizeof(*grads)))) @@ -185,11 +181,11 @@ func paddle_send_grads(client C.client, grads *C.paddle_gradient, total C.int) C } //export paddle_get_params -func paddle_get_params(client C.client, names **C.char, dst **C.paddle_parameter, total C.int) C.int { +func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter, total C.int) C.int { var ns []string for i := 0; i < int(total); i++ { - name := *(**C.char)(unsafe.Pointer((uintptr(unsafe.Pointer(names)) + uintptr(i)*unsafe.Sizeof(*names)))) - ns = append(ns, C.GoString(name)) + param := *(**C.paddle_parameter)(unsafe.Pointer((uintptr(unsafe.Pointer(dst)) + uintptr(i)*unsafe.Sizeof(*dst)))) + ns = append(ns, C.GoString(param.name)) } c := get(client) ps, err := c.GetParams(ns) @@ -198,44 +194,32 @@ func paddle_get_params(client C.client, names **C.char, dst **C.paddle_parameter return -1 } - for i := 0; i < int(total); i++ { - if i >= len(ps) { - break + if len(ps) != len(ns) { + return -1 + } + + for i := range ps { + if ns[i] != ps[i].Name { + return -1 } + } + for i := 0; i < int(total); i++ { p := ps[i] param := *(**C.paddle_parameter)(unsafe.Pointer((uintptr(unsafe.Pointer(dst)) + uintptr(i)*unsafe.Sizeof(*dst)))) - nameReady := false - contentAllocated := false if 
unsafe.Pointer(param) == nullPtr { - param = (*C.paddle_parameter)(C.calloc(1, C.size_t(unsafe.Sizeof(*param)))) + log.Println("Error: must pre-allocate parameter.") + return -1 } else { - if unsafe.Pointer(param.name) != nullPtr { - if n := C.GoString(param.name); n != p.Name { - log.Println("Warning: the pre-allocated parameter name does not match the parameter name, it will be freed.", n, p.Name) - C.free(unsafe.Pointer(param.name)) - } else { - nameReady = true - } - } - if unsafe.Pointer(param.content) != nullPtr { - if int(param.content_len) == len(p.Content) { - contentAllocated = true - } else { - log.Println("Warning: the pre-allocated content len does not match parameter content len, the pre-allocated content will be freed.", param.content_len, len(p.Content)) - C.free(unsafe.Pointer(param.content)) + if int(param.content_len) != len(p.Content) { + log.Println("Error: the pre-allocated content len does not match parameter content len.", param.content_len, len(p.Content)) + return -1 } } } - if !nameReady { - param.name = C.CString(p.Name) - } - if !contentAllocated { - param.content = (*C.uchar)(C.malloc(C.size_t(len(p.Content)))) - } C.memcpy(unsafe.Pointer(param.content), unsafe.Pointer(&p.Content[0]), C.size_t(len(p.Content))) param.content_len = C.int(len(p.Content)) param.element_type = C.paddle_element_type(p.ElementType) @@ -245,7 +229,7 @@ func paddle_get_params(client C.client, names **C.char, dst **C.paddle_parameter } //export paddle_save_model -func paddle_save_model(client C.client, path *C.char) C.int { +func paddle_save_model(client C.paddle_pserver_client, path *C.char) C.int { p := C.GoString(path) c := get(client) err := c.Save(p) diff --git a/go/pserver/cclient/test/CMakeLists.txt b/go/pserver/cclient/test/CMakeLists.txt index 16f84648c1..77bf250b7c 100644 --- a/go/pserver/cclient/test/CMakeLists.txt +++ b/go/pserver/cclient/test/CMakeLists.txt @@ -7,5 +7,7 @@ add_dependencies(main client) if(APPLE) set(CMAKE_EXE_LINKER_FLAGS "-framework CoreFoundation -framework Security") +else() + set(CMAKE_EXE_LINKER_FLAGS "-pthread") endif() target_link_libraries(main ${CMAKE_BINARY_DIR}/libclient.a) diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index f75a2110b9..63ae7aa92a 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -2,67 +2,87 @@ #include "libclient.h" -void fail() { - // TODO(helin): fix: gtest using cmake is not working, using this - // hacky way for now. - printf("test failed.\n"); +// TODO(helin): fix: gtest using cmake is not working, using this +// hacky way for now. +#define fail() \ + fprintf(stderr, "info: %s:%d: ", __FILE__, __LINE__); \ exit(-1); + +void sendGrads(paddle_pserver_client c) { + unsigned char grad_a[2000] = {2}; + unsigned char grad_b[3000] = {3}; + paddle_gradient grads[2] = { + {"param_a", PADDLE_ELEMENT_TYPE_FLOAT32, grad_a, 2000}, + {"param_b", PADDLE_ELEMENT_TYPE_FLOAT32, grad_b, 3000}}; + + if (paddle_send_grads(c, grads, 2)) { + fail(); + } +} + +void getParams(paddle_pserver_client c) { + paddle_parameter param_a; + paddle_parameter param_b; + char name_a[] = "param_a"; + char name_b[] = "param_b"; + // must pre-allocate the content for parameter to receive. 
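+  // (These buffers stay owned by the caller: with the revised API,
+  //  paddle_get_params only copies the returned values into them, and it
+  //  fails if content_len does not match the size stored on the pserver.)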
+ unsigned char content_a[2000] = {}; + unsigned char content_b[3000] = {}; + param_a.element_type = PADDLE_ELEMENT_TYPE_FLOAT32; + param_a.name = name_a; + param_a.content = content_a; + param_a.content_len = 2000; + param_b.element_type = PADDLE_ELEMENT_TYPE_FLOAT32; + param_b.name = name_b; + param_b.content = content_b; + param_b.content_len = 3000; + + paddle_parameter* params[2] = {¶m_a, ¶m_b}; + if (paddle_get_params(c, params, 2)) { + fail(); + } } int main() { char addr[] = "localhost:3000"; - client c = paddle_new_pserver_client(addr, 1); + paddle_pserver_client c = paddle_new_pserver_client(addr, 1); retry: if (paddle_begin_init_params(c)) { paddle_parameter param; char name_a[] = "param_a"; char name_b[] = "param_b"; - unsigned char content[] = {0x00, 0x11, 0x22}; + unsigned char content_a[2000] = {1}; + unsigned char content_b[3000] = {0}; param.element_type = PADDLE_ELEMENT_TYPE_FLOAT32; param.name = name_a; - param.content = content; - param.content_len = 3; - if (paddle_init_param(c, param, NULL, 0) != 0) { + param.content = content_a; + param.content_len = 2000; + int error = paddle_init_param(c, param, NULL, 0); + if (error != 0) { goto retry; } - param.element_type = PADDLE_ELEMENT_TYPE_INT32; + + param.element_type = PADDLE_ELEMENT_TYPE_FLOAT32; param.name = name_b; - param.content = content; - param.content_len = 3; - if (paddle_init_param(c, param, NULL, 0) != 0) { + param.content = content_b; + param.content_len = 3000; + error = paddle_init_param(c, param, NULL, 0); + if (error != 0) { goto retry; } - if (paddle_finish_init_params(c) != 0) { + + error = paddle_finish_init_params(c); + if (error != 0) { goto retry; } - } else { - fail(); - } - - unsigned char content[] = {0x00, 0x11, 0x22}; - paddle_gradient grads[2] = { - {"param_a", PADDLE_ELEMENT_TYPE_INT32, content, 3}, - {"param_b", PADDLE_ELEMENT_TYPE_FLOAT32, content, 3}}; - - if (!paddle_send_grads(c, grads, 2)) { - fail(); - } - - paddle_parameter* params[2] = {NULL, NULL}; - char* names[] = {"param_a", "param_b"}; - if (!paddle_get_params(c, names, params, 2)) { - fail(); } - // get parameters again by reusing the allocated parameter buffers. - if (!paddle_get_params(c, names, params, 2)) { - fail(); + for (int i = 0; i < 100; i++) { + sendGrads(c); + getParams(c); } - paddle_release_param(params[0]); - paddle_release_param(params[1]); - - if (!paddle_save_model(c, "/tmp/")) { + if (paddle_save_model(c, "/tmp/")) { fail(); } diff --git a/go/pserver/client_test.go b/go/pserver/client_test.go index a9a0948a51..d0371a26a1 100644 --- a/go/pserver/client_test.go +++ b/go/pserver/client_test.go @@ -117,7 +117,7 @@ func TestClientFull(t *testing.T) { for i := range params { if names[i] != params[i].Name { - t.Fatalf("order of returned parameter does not required: parameter name: %s, required name: %s", names[i], params[i]) + t.Fatalf("order of returned parameter does not required: parameter name: %s, required name: %s", names[i], params[i].Name) } } } diff --git a/go/pserver/service.go b/go/pserver/service.go index d5787b9708..33e0eb5c5a 100644 --- a/go/pserver/service.go +++ b/go/pserver/service.go @@ -9,8 +9,10 @@ import ( // ElementType is the type of elements of a Parameter. 
type ElementType int -var ErrAlreadyInitialized = errors.New("pserver already initialized") -var ErrUninitialized = errors.New("pserver not fully initialized") +const ( + AlreadyInitialized = "pserver already initialized" + Uninitialized = "pserver not fully initialized" +) // Supported element types const ( @@ -59,7 +61,7 @@ func NewService() *Service { func (s *Service) InitParam(paramWithConfigs ParameterWithConfig, dummy *int) error { select { case <-s.initialized: - return ErrAlreadyInitialized + return errors.New(AlreadyInitialized) default: } @@ -80,7 +82,7 @@ func (s *Service) InitParam(paramWithConfigs ParameterWithConfig, dummy *int) er func (s *Service) FinishInitParams(dummy0 int, dummy1 *int) error { select { case <-s.initialized: - return ErrAlreadyInitialized + return errors.New(AlreadyInitialized) default: } @@ -94,7 +96,7 @@ func (s *Service) SendGrad(g Gradient, dummy *int) error { select { case <-s.initialized: default: - return ErrUninitialized + return errors.New(Uninitialized) } s.mu.Lock() diff --git a/go/pserver/service_test.go b/go/pserver/service_test.go index 4c9fac4536..796492ffb4 100644 --- a/go/pserver/service_test.go +++ b/go/pserver/service_test.go @@ -16,7 +16,7 @@ func TestFull(t *testing.T) { p.Content = []byte{1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0} p.ElementType = pserver.Int32 var dummy int - err := s.InitParam(pserver.ParameterWithConfig{p, nil}, &dummy) + err := s.InitParam(pserver.ParameterWithConfig{Param: p, Config: nil}, &dummy) if err != nil { t.FailNow() } @@ -25,7 +25,7 @@ func TestFull(t *testing.T) { p1.Name = "param_b" p1.Content = []byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} p1.ElementType = pserver.Float32 - err = s.InitParam(pserver.ParameterWithConfig{p1, nil}, &dummy) + err = s.InitParam(pserver.ParameterWithConfig{Param: p1, Config: nil}, &dummy) if err != nil { t.FailNow() } @@ -81,7 +81,7 @@ func TestMultipleInit(t *testing.T) { } err = s.FinishInitParams(0, &dummy) - if err != pserver.ErrAlreadyInitialized { + if err.Error() != pserver.AlreadyInitialized { t.FailNow() } } @@ -90,7 +90,7 @@ func TestUninitialized(t *testing.T) { s := pserver.NewService() var dummy int err := s.SendGrad(pserver.Gradient{}, &dummy) - if err != pserver.ErrUninitialized { + if err.Error() != pserver.Uninitialized { t.FailNow() } } @@ -135,7 +135,7 @@ func TestBlockUntilInitialized(t *testing.T) { p.Content = []byte{1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0} p.ElementType = pserver.Int32 var dummy int - err := s.InitParam(pserver.ParameterWithConfig{p, nil}, &dummy) + err := s.InitParam(pserver.ParameterWithConfig{Param: p, Config: nil}, &dummy) if err != nil { t.FailNow() } From 83c852c12b4c40eef5961d1da9e0978f1ba6777f Mon Sep 17 00:00:00 2001 From: Helin Wang Date: Fri, 9 Jun 2017 22:39:32 +0000 Subject: [PATCH 18/39] better logging --- go/pserver/cclient/cclient.go | 32 +++++++++++++++++++++++++++++--- 1 file changed, 29 insertions(+), 3 deletions(-) diff --git a/go/pserver/cclient/cclient.go b/go/pserver/cclient/cclient.go index c87f3853c5..6a76ec920e 100644 --- a/go/pserver/cclient/cclient.go +++ b/go/pserver/cclient/cclient.go @@ -132,7 +132,7 @@ func paddle_init_param(client C.paddle_pserver_client, param C.paddle_parameter, if err != nil { if err.Error() == pserver.AlreadyInitialized { - log.Println("parameter", name, "already initialized, treat paddle_init_param as sucessful.") + log.Printf("parameter %s already initialized, treat paddle_init_param as sucessful.\n", name) return 0 } log.Println(err) @@ -194,12 +194,38 @@ func paddle_get_params(client 
C.paddle_pserver_client, dst **C.paddle_parameter, return -1 } + names := func() (string, string) { + retNames := "" + for _, p := range ps { + if retNames == "" { + retNames = p.Name + } else { + retNames = ", " + p.Name + } + } + + requestedNames := "" + for _, n := range ns { + if requestedNames == "" { + requestedNames = n + } else { + requestedNames = ", " + n + } + } + + return requestedNames, retNames + } + if len(ps) != len(ns) { + requestedNames, retNames := names() + log.Printf("pserver returned wrong number of parameters. Requested: %s, returned: %s.\n", retNames, requestedNames) return -1 } for i := range ps { if ns[i] != ps[i].Name { + requestedNames, retNames := names() + log.Printf("pserver returned wrong parameters, or not in requested order. Requested: %s, returned: %s.\n", retNames, requestedNames) return -1 } } @@ -209,12 +235,12 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter, param := *(**C.paddle_parameter)(unsafe.Pointer((uintptr(unsafe.Pointer(dst)) + uintptr(i)*unsafe.Sizeof(*dst)))) if unsafe.Pointer(param) == nullPtr { - log.Println("Error: must pre-allocate parameter.") + log.Println("must pre-allocate parameter.") return -1 } else { if unsafe.Pointer(param.content) != nullPtr { if int(param.content_len) != len(p.Content) { - log.Println("Error: the pre-allocated content len does not match parameter content len.", param.content_len, len(p.Content)) + log.Printf("the pre-allocated content len does not match parameter content len. Pre-allocated len: %d, returned len: %d", param.content_len, len(p.Content)) return -1 } } From 0e71ab29db2db20160689097dcfc88220def928d Mon Sep 17 00:00:00 2001 From: Helin Wang Date: Fri, 9 Jun 2017 22:49:38 +0000 Subject: [PATCH 19/39] fix comment --- go/pserver/cclient/test/main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index 63ae7aa92a..09af316e17 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -2,7 +2,7 @@ #include "libclient.h" -// TODO(helin): fix: gtest using cmake is not working, using this +// TODO(helin): Fix: gtest using cmake is not working, using this // hacky way for now. #define fail() \ fprintf(stderr, "info: %s:%d: ", __FILE__, __LINE__); \ @@ -25,7 +25,7 @@ void getParams(paddle_pserver_client c) { paddle_parameter param_b; char name_a[] = "param_a"; char name_b[] = "param_b"; - // must pre-allocate the content for parameter to receive. + // Must pre-allocate the prameter content before calling paddle_get_params. 
unsigned char content_a[2000] = {}; unsigned char content_b[3000] = {}; param_a.element_type = PADDLE_ELEMENT_TYPE_FLOAT32; From c44f5dd883b49d063d336dda874f1794270c2982 Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Sat, 10 Jun 2017 20:46:26 +0800 Subject: [PATCH 20/39] add simple updater, this version can train a model --- go/pserver/cclient/cclient.go | 1 - go/pserver/cclient/test/CMakeLists.txt | 4 + go/pserver/cclient/test/main.c | 71 ++++-------- go/pserver/cclient/test/test_cclient.c | 114 +++++++++++++++++++ go/pserver/optimizer.c | 8 +- paddle/trainer/NewRemoteParameterUpdater.cpp | 2 - paddle/trainer/NewRemoteParameterUpdater.h | 2 +- 7 files changed, 146 insertions(+), 56 deletions(-) create mode 100644 go/pserver/cclient/test/test_cclient.c diff --git a/go/pserver/cclient/cclient.go b/go/pserver/cclient/cclient.go index be16a143d8..7fdf9f0ec2 100644 --- a/go/pserver/cclient/cclient.go +++ b/go/pserver/cclient/cclient.go @@ -42,7 +42,6 @@ import ( "strings" "sync" "unsafe" - "fmt" "github.com/PaddlePaddle/Paddle/go/pserver" ) diff --git a/go/pserver/cclient/test/CMakeLists.txt b/go/pserver/cclient/test/CMakeLists.txt index 762772812f..e7d0a74237 100644 --- a/go/pserver/cclient/test/CMakeLists.txt +++ b/go/pserver/cclient/test/CMakeLists.txt @@ -2,6 +2,8 @@ cmake_minimum_required(VERSION 3.0) add_executable(main main.c) add_dependencies(main paddle_pserver_cclient) +add_executable(test_cclient test_cclient.c) +add_dependencies(test_cclient paddle_pserver_cclient) if(APPLE) set(CMAKE_EXE_LINKER_FLAGS "-framework CoreFoundation -framework Security") @@ -10,7 +12,9 @@ endif() if(PROJ_ROOT) include_directories(${CMAKE_BINARY_DIR}/go/pserver/cclient/) target_link_libraries(main ${CMAKE_BINARY_DIR}/go/pserver/cclient/libpaddle_pserver_cclient.a pthread) + target_link_libraries(test_cclient ${CMAKE_BINARY_DIR}/go/pserver/cclient/libpaddle_pserver_cclient.a pthread) else(PROJ_ROOT) include_directories(${CMAKE_BINARY_DIR}) target_link_libraries(main ${CMAKE_BINARY_DIR}/libpaddle_pserver_cclient.a pthread) + target_link_libraries(test_cclient ${CMAKE_BINARY_DIR}/libpaddle_pserver_cclient.a pthread) endif(PROJ_ROOT) diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index 59cf5756fd..a074808b16 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -1,5 +1,4 @@ #include -#include #include "libpaddle_pserver_cclient.h" @@ -10,21 +9,6 @@ void fail() { exit(-1); } -void print_parameter(paddle_gradient* param) { - if (param == NULL) { - printf("param is NULL!!\n"); - } else { - printf("==== parameter ====\n"); - printf("name: %s\n", param->name); - printf("content_len: %d\n", param->content_len); - printf("content_type: %d\n", param->element_type); - for (int i = 0; i < param->content_len; ++i) { - printf("0x%x ", param->content[i]); - } - printf("\n\n"); - } -} - int main() { char addr[] = "localhost:3000"; client c = paddle_new_pserver_client(addr, 1); @@ -33,23 +17,21 @@ retry: paddle_parameter param; char name_a[] = "param_a"; char name_b[] = "param_b"; - unsigned char content1[] = {0x01, 0x02, 0x03}; + unsigned char content[] = {0x00, 0x11, 0x22}; param.element_type = PADDLE_ELEMENT_TYPE_FLOAT32; param.name = name_a; - param.content = content1; + param.content = content; param.content_len = 3; if (paddle_init_param(c, param, NULL, 0) != 0) { goto retry; } - unsigned char content2[] = {0x04, 0x05, 0x06}; param.element_type = PADDLE_ELEMENT_TYPE_INT32; param.name = name_b; - param.content = content2; + param.content = content; 
param.content_len = 3; if (paddle_init_param(c, param, NULL, 0) != 0) { goto retry; } - if (paddle_finish_init_params(c) != 0) { goto retry; } @@ -57,27 +39,22 @@ retry: fail(); } - unsigned char content1[] = {0x12, 0x23, 0x34}; - unsigned char content2[] = {0x45, 0x56, 0x67}; - - paddle_gradient** new_params = - (paddle_gradient**)malloc(sizeof(paddle_gradient*) * 2); - new_params[0] = (paddle_gradient*)malloc(sizeof(paddle_gradient)); - new_params[0]->name = "param_a"; - new_params[0]->content = content1; - new_params[0]->content_len = 3; - new_params[0]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32; - - new_params[1] = (paddle_gradient*)malloc(sizeof(paddle_gradient)); - new_params[1]->name = "param_b"; - new_params[1]->content = content2; - new_params[1]->content_len = 3; - new_params[1]->element_type = PADDLE_ELEMENT_TYPE_INT32; - - print_parameter(new_params[0]); - print_parameter(new_params[1]); - - if (paddle_send_grads(c, new_params, 2) != 0) { + unsigned char content[] = {0x00, 0x11, 0x22}; + paddle_gradient** grads = + (paddle_gradient**)malloc(sizeof(paddle_gradient*) * 2); + grads[0] = (paddle_gradient*)malloc(sizeof(paddle_gradient)); + grads[0]->name = "param_a"; + grads[0]->content = content; + grads[0]->content_len = 3; + grads[0]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32; + + grads[1] = (paddle_gradient*)malloc(sizeof(paddle_gradient)); + grads[1]->name = "param_b"; + grads[1]->content = content; + grads[1]->content_len = 3; + grads[1]->element_type = PADDLE_ELEMENT_TYPE_INT32; + + if (paddle_send_grads(c, grads, 2) != 0) { fail(); } @@ -87,15 +64,6 @@ retry: fail(); } - print_parameter(params[0]); - print_parameter(params[1]); - - /// change name of parameter. - char* names2[] = {"param_1", "param_2"}; - if (paddle_get_params(c, names2, params, 2) == 0) { - fail(); - } - // get parameters again by reusing the allocated parameter buffers. if (paddle_get_params(c, names, params, 2) != 0) { fail(); @@ -109,5 +77,6 @@ retry: } printf("test success!\n"); + return 0; } diff --git a/go/pserver/cclient/test/test_cclient.c b/go/pserver/cclient/test/test_cclient.c new file mode 100644 index 0000000000..4d6fca29fb --- /dev/null +++ b/go/pserver/cclient/test/test_cclient.c @@ -0,0 +1,114 @@ +#include +#include + +#include "libpaddle_pserver_cclient.h" + +typedef float real; + +void fail() { + // TODO(helin): fix: gtest using cmake is not working, using this + // hacky way for now. 
+ printf("test failed.\n"); + exit(-1); +} + +void print_parameter(paddle_gradient* param) { + if (param == NULL) { + printf("param is NULL!!\n"); + } else { + printf("==== parameter ====\n"); + printf("name: %s\n", param->name); + printf("content_len: %d\n", param->content_len); + printf("content_type: %d\n", param->element_type); + for (int i = 0; i < param->content_len/sizeof(real); ++i) { + printf("%f ", ((float *)param->content)[i]); + } + printf("\n\n"); + } +} + +int main() { + char addr[] = "localhost:3000"; + client c = paddle_new_pserver_client(addr, 1); + + char* names[] = {"param_a", "param_b"}; +retry: + + if (paddle_begin_init_params(c)) { + paddle_parameter param; + real param_content1[] = {0.1, 0.2, 0.3}; + param.element_type = PADDLE_ELEMENT_TYPE_FLOAT32; + param.name = names[0]; + param.content = (unsigned char*)param_content1; + param.content_len = 3 * sizeof(real); + if (paddle_init_param(c, param, NULL, 0) != 0) { + goto retry; + } + real param_content2[] = {0.4, 0.5, 0.6}; + param.element_type = PADDLE_ELEMENT_TYPE_INT32; + param.name = names[1]; + param.content = (unsigned char*)param_content2; + param.content_len = 3 * sizeof(real); + if (paddle_init_param(c, param, NULL, 0) != 0) { + goto retry; + } + + if (paddle_finish_init_params(c) != 0) { + goto retry; + } + } else { + fail(); + } + + printf("get initialized parameters from pserver:\n"); + paddle_parameter* param_ptrs[2] = {NULL, NULL}; + if (paddle_get_params(c, names, param_ptrs, 2) != 0) { + fail(); + } + print_parameter(param_ptrs[0]); + print_parameter(param_ptrs[1]); + + printf("send gradient to pserver:\n"); + real gradient_content1[] = {0.01, 0.02, 0.03}; + real gradinet_content2[] = {0.04, 0.05, 0.06}; + + paddle_gradient** grads = + (paddle_gradient**)malloc(sizeof(paddle_gradient*) * 2); + grads[0] = (paddle_gradient*)malloc(sizeof(paddle_gradient)); + grads[0]->name = names[0]; + grads[0]->content = (unsigned char*)gradient_content1; + grads[0]->content_len = 3 * sizeof(real); + grads[0]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32; + + grads[1] = (paddle_gradient*)malloc(sizeof(paddle_gradient)); + grads[1]->name = names[1]; + grads[1]->content = (unsigned char*)gradinet_content2; + grads[1]->content_len = 3 * sizeof(real); + grads[1]->element_type = PADDLE_ELEMENT_TYPE_INT32; + + print_parameter(grads[0]); + print_parameter(grads[1]); + + if (paddle_send_grads(c, grads, 2) != 0) { + fail(); + } + + printf("get updated parameters from pserver:\n"); + // get parameters again by reusing the allocated parameter buffers. 
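+  // With the simple SGD rule in optimizer.c below
+  // (param -= learning_rate * grad), the values printed here should come
+  // back slightly smaller than the ones sent during init.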
+ if (paddle_get_params(c, names, param_ptrs, 2) != 0) { + fail(); + } + + print_parameter(param_ptrs[0]); + print_parameter(param_ptrs[1]); + + paddle_release_param(param_ptrs[0]); + paddle_release_param(param_ptrs[1]); + + if (paddle_save_model(c, "/tmp/") != 0) { + fail(); + } + + printf("test success!\n"); + return 0; +} diff --git a/go/pserver/optimizer.c b/go/pserver/optimizer.c index b8da3ec959..5d0b1017ce 100644 --- a/go/pserver/optimizer.c +++ b/go/pserver/optimizer.c @@ -32,7 +32,13 @@ int update_SGD(void* optimizer, const void* gradient, int num_bytes) { SGD_optimizer* o = (SGD_optimizer*)optimizer; - // TODO + // TODO(a simple SGD implement) + float* parameter = (float *)buffer; + float* grad = (float *)gradient; + + for(int i = 0; i < num_bytes/sizeof(float); ++i) { + parameter[i] -= o->learning_rate * grad[i]; + } return 0; } diff --git a/paddle/trainer/NewRemoteParameterUpdater.cpp b/paddle/trainer/NewRemoteParameterUpdater.cpp index 35df973897..0f879dbde0 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.cpp +++ b/paddle/trainer/NewRemoteParameterUpdater.cpp @@ -73,8 +73,6 @@ void NewRemoteParameterUpdater::init( void NewRemoteParameterUpdater::updateImpl(Parameter *para) {} void NewRemoteParameterUpdater::finishBatch(real cost) { - LOG(INFO) << "finishBatch in, cost: " << cost; - // send gradient to parameter server. paddle_send_grads(parameterClient_, newGradients_, parameterSize()); // get the updated parameter from parameterClient. diff --git a/paddle/trainer/NewRemoteParameterUpdater.h b/paddle/trainer/NewRemoteParameterUpdater.h index 1dbb3658fb..d3a032badc 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.h +++ b/paddle/trainer/NewRemoteParameterUpdater.h @@ -87,7 +87,7 @@ private: new_params[i]->name = (char*)param->getName().c_str(); new_params[i]->content = (unsigned char*)(param->getBuf(type).get()->getData()); - new_params[i]->content_len = (int)param->getBuf(type).get()->getSize(); + new_params[i]->content_len = (int)param->getBuf(type).get()->getSize() * sizeof(real); } return new_params; } From 39d0b3de99112e6b90c22147b6c15917f5a3e1d5 Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Sat, 10 Jun 2017 21:18:03 +0800 Subject: [PATCH 21/39] add test file mnist_test.py, free resource of newRemoteParameterUpdater properly --- go/pserver/cclient/test/mnist_test.py | 134 +++++++++++++++++++++ paddle/trainer/NewRemoteParameterUpdater.h | 14 +-- 2 files changed, 141 insertions(+), 7 deletions(-) create mode 100644 go/pserver/cclient/test/mnist_test.py diff --git a/go/pserver/cclient/test/mnist_test.py b/go/pserver/cclient/test/mnist_test.py new file mode 100644 index 0000000000..c77af49130 --- /dev/null +++ b/go/pserver/cclient/test/mnist_test.py @@ -0,0 +1,134 @@ +import paddle.v2 as paddle +import gzip + + +def softmax_regression(img): + predict = paddle.layer.fc(input=img, + size=10, + act=paddle.activation.Softmax()) + return predict + + +def multilayer_perceptron(img): + # The first fully-connected layer + hidden1 = paddle.layer.fc(input=img, size=128, act=paddle.activation.Relu()) + # The second fully-connected layer and the according activation function + hidden2 = paddle.layer.fc(input=hidden1, + size=64, + act=paddle.activation.Relu()) + # The thrid fully-connected layer, note that the hidden size should be 10, + # which is the number of unique digits + predict = paddle.layer.fc(input=hidden2, + size=10, + act=paddle.activation.Softmax()) + return predict + + +def convolutional_neural_network(img): + # first conv layer + conv_pool_1 = 
paddle.networks.simple_img_conv_pool( + input=img, + filter_size=5, + num_filters=20, + num_channel=1, + pool_size=2, + pool_stride=2, + act=paddle.activation.Tanh()) + # second conv layer + conv_pool_2 = paddle.networks.simple_img_conv_pool( + input=conv_pool_1, + filter_size=5, + num_filters=50, + num_channel=20, + pool_size=2, + pool_stride=2, + act=paddle.activation.Tanh()) + # The first fully-connected layer + fc1 = paddle.layer.fc(input=conv_pool_2, + size=128, + act=paddle.activation.Tanh()) + # The softmax layer, note that the hidden size should be 10, + # which is the number of unique digits + predict = paddle.layer.fc(input=fc1, + size=10, + act=paddle.activation.Softmax()) + return predict + + +def main(): + paddle.init(use_gpu=False, trainer_count=1, trainer_id=1) + + # define network topology + images = paddle.layer.data( + name='pixel', type=paddle.data_type.dense_vector(784)) + label = paddle.layer.data( + name='label', type=paddle.data_type.integer_value(10)) + + # Here we can build the prediction network in different ways. Please + # choose one by uncomment corresponding line. + predict = softmax_regression(images) + #predict = multilayer_perceptron(images) + #predict = convolutional_neural_network(images) + + cost = paddle.layer.classification_cost(input=predict, label=label) + parameters = paddle.parameters.create(cost) + + optimizer = paddle.optimizer.Momentum( + learning_rate=0.1 / 128.0, + momentum=0.9, + regularization=paddle.optimizer.L2Regularization(rate=0.0005 * 128)) + + trainer = paddle.trainer.SGD(cost=cost, + parameters=parameters, + update_equation=optimizer, + is_local=False, + pserver_spec="localhost:3000") + + lists = [] + + def event_handler(event): + if isinstance(event, paddle.event.EndIteration): + if event.batch_id % 1000 == 0: + print "Pass %d, Batch %d, Cost %f, %s" % ( + event.pass_id, event.batch_id, event.cost, event.metrics) + + with gzip.open('params.tar.gz', 'w') as f: + parameters.to_tar(f) + + elif isinstance(event, paddle.event.EndPass): + result = trainer.test(reader=paddle.batch( + paddle.dataset.mnist.test(), batch_size=128)) + print "Test with Pass %d, Cost %f, %s\n" % ( + event.pass_id, result.cost, result.metrics) + lists.append((event.pass_id, result.cost, + result.metrics['classification_error_evaluator'])) + + trainer.train( + reader=paddle.batch( + paddle.reader.shuffle( + paddle.dataset.mnist.train(), buf_size=8192), + batch_size=128), + event_handler=event_handler, + num_passes=100) + + # find the best pass + best = sorted(lists, key=lambda list: float(list[1]))[0] + print 'Best pass is %s, testing Avgcost is %s' % (best[0], best[1]) + print 'The classification accuracy is %.2f%%' % (100 - float(best[2]) * 100) + + test_creator = paddle.dataset.mnist.test() + test_data = [] + for item in test_creator(): + test_data.append((item[0], )) + if len(test_data) == 100: + break + + # output is a softmax layer. It returns probabilities. 
+ # Shape should be (100, 10) + probs = paddle.infer( + output_layer=predict, parameters=parameters, input=test_data) + print probs.shape + + +if __name__ == '__main__': + main() diff --git a/paddle/trainer/NewRemoteParameterUpdater.h b/paddle/trainer/NewRemoteParameterUpdater.h index d3a032badc..db9b44af19 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.h +++ b/paddle/trainer/NewRemoteParameterUpdater.h @@ -32,9 +32,11 @@ public: NewRemoteParameterUpdater(const OptimizationConfig& config, const std::string pserverSpec); ~NewRemoteParameterUpdater() { - LOG(INFO) << "~NewRemoteParameterUpdater in"; -// releaseNewParameter(newParameters_); -// releaseNewParameter(newGradients_); + if (names_ != nullptr) { + free(names_); + } + releaseNewParameter(newParameters_); + releaseNewParameter(newGradients_); if (parameterClient_ >= 0) paddle_pserver_client_release(parameterClient_); } @@ -95,11 +97,9 @@ private: void releaseNewParameter(paddle_parameter** newParams) { if (newParams != nullptr) { for (int i = 0; i < parameterSize(); ++i) { - auto param = newParams[i]; - if (param != nullptr) { - paddle_release_param(param); - } + free(newParams[i]); } + free(newParams); } } From 4f366be4e298db78c7438f60d72fddf39b2b6ccc Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Sat, 10 Jun 2017 21:23:02 +0800 Subject: [PATCH 22/39] clang format --- go/pserver/cclient/test/main.c | 2 +- go/pserver/cclient/test/test_cclient.c | 4 ++-- go/pserver/optimizer.c | 6 +++--- paddle/trainer/NewRemoteParameterUpdater.h | 3 ++- 4 files changed, 8 insertions(+), 7 deletions(-) diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index a074808b16..72ec359076 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -41,7 +41,7 @@ retry: unsigned char content[] = {0x00, 0x11, 0x22}; paddle_gradient** grads = - (paddle_gradient**)malloc(sizeof(paddle_gradient*) * 2); + (paddle_gradient**)malloc(sizeof(paddle_gradient*) * 2); grads[0] = (paddle_gradient*)malloc(sizeof(paddle_gradient)); grads[0]->name = "param_a"; grads[0]->content = content; diff --git a/go/pserver/cclient/test/test_cclient.c b/go/pserver/cclient/test/test_cclient.c index 4d6fca29fb..82cef386d7 100644 --- a/go/pserver/cclient/test/test_cclient.c +++ b/go/pserver/cclient/test/test_cclient.c @@ -20,8 +20,8 @@ void print_parameter(paddle_gradient* param) { printf("name: %s\n", param->name); printf("content_len: %d\n", param->content_len); printf("content_type: %d\n", param->element_type); - for (int i = 0; i < param->content_len/sizeof(real); ++i) { - printf("%f ", ((float *)param->content)[i]); + for (int i = 0; i < param->content_len / sizeof(real); ++i) { + printf("%f ", ((float*)param->content)[i]); } printf("\n\n"); } diff --git a/go/pserver/optimizer.c b/go/pserver/optimizer.c index 5d0b1017ce..47fe1efbf5 100644 --- a/go/pserver/optimizer.c +++ b/go/pserver/optimizer.c @@ -33,10 +33,10 @@ int update_SGD(void* optimizer, int num_bytes) { SGD_optimizer* o = (SGD_optimizer*)optimizer; // TODO(a simple SGD implement) - float* parameter = (float *)buffer; - float* grad = (float *)gradient; + float* parameter = (float*)buffer; + float* grad = (float*)gradient; - for(int i = 0; i < num_bytes/sizeof(float); ++i) { + for (int i = 0; i < num_bytes / sizeof(float); ++i) { parameter[i] -= o->learning_rate * grad[i]; } return 0; diff --git a/paddle/trainer/NewRemoteParameterUpdater.h b/paddle/trainer/NewRemoteParameterUpdater.h index db9b44af19..b7c0425982 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.h 
+++ b/paddle/trainer/NewRemoteParameterUpdater.h @@ -89,7 +89,8 @@ private: new_params[i]->name = (char*)param->getName().c_str(); new_params[i]->content = (unsigned char*)(param->getBuf(type).get()->getData()); - new_params[i]->content_len = (int)param->getBuf(type).get()->getSize() * sizeof(real); + new_params[i]->content_len = + (int)param->getBuf(type).get()->getSize() * sizeof(real); } return new_params; } From da3e84a6d25fe75f63a624e4e523aba7a8c378c6 Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Sat, 10 Jun 2017 21:43:59 +0800 Subject: [PATCH 23/39] change trainer_id --- go/pserver/cclient/test/mnist_test.py | 5 +---- go/pserver/cclient/test/test_train.py | 2 +- paddle/trainer/NewRemoteParameterUpdater.cpp | 4 ++-- 3 files changed, 4 insertions(+), 7 deletions(-) diff --git a/go/pserver/cclient/test/mnist_test.py b/go/pserver/cclient/test/mnist_test.py index c77af49130..c3a3af55e2 100644 --- a/go/pserver/cclient/test/mnist_test.py +++ b/go/pserver/cclient/test/mnist_test.py @@ -56,7 +56,7 @@ def convolutional_neural_network(img): def main(): - paddle.init(use_gpu=False, trainer_count=1, trainer_id=1) + paddle.init(use_gpu=False, trainer_count=1) # define network topology images = paddle.layer.data( @@ -92,9 +92,6 @@ def main(): print "Pass %d, Batch %d, Cost %f, %s" % ( event.pass_id, event.batch_id, event.cost, event.metrics) - with gzip.open('params.tar.gz', 'w') as f: - parameters.to_tar(f) - elif isinstance(event, paddle.event.EndPass): result = trainer.test(reader=paddle.batch( paddle.dataset.mnist.test(), batch_size=128)) diff --git a/go/pserver/cclient/test/test_train.py b/go/pserver/cclient/test/test_train.py index ddd6371e0c..3f8d5d793b 100644 --- a/go/pserver/cclient/test/test_train.py +++ b/go/pserver/cclient/test/test_train.py @@ -4,7 +4,7 @@ import paddle.v2.dataset.uci_housing as uci_housing def main(): # init - paddle.init(use_gpu=False, trainer_count=1, trainer_id=1) + paddle.init(use_gpu=False, trainer_count=1) # network config x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(13)) diff --git a/paddle/trainer/NewRemoteParameterUpdater.cpp b/paddle/trainer/NewRemoteParameterUpdater.cpp index 0f879dbde0..d554e09759 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.cpp +++ b/paddle/trainer/NewRemoteParameterUpdater.cpp @@ -39,8 +39,8 @@ void NewRemoteParameterUpdater::init( } // create parameter server client. 
- parameterClient_ = - paddle_new_pserver_client((char *)pserverSpec_.c_str(), FLAGS_trainer_id); + parameterClient_ = paddle_new_pserver_client((char *)pserverSpec_.c_str(), + FLAGS_trainer_id == 0); // init names_ for get parameter through paddle_cclient names_ = (char **)malloc(parameterSize() * sizeof(char *)); From 283bdc5062be0ba14b0ae3ca6cc211ddaf25fd1c Mon Sep 17 00:00:00 2001 From: gongweibao Date: Mon, 12 Jun 2017 10:29:35 +0800 Subject: [PATCH 24/39] fix by helin's comments --- paddle/parameter/tests/test_argument.cpp | 2 +- python/paddle/v2/dataset/common.py | 58 +++++++++++-------- python/paddle/v2/dataset/tests/common_test.py | 26 +++++++-- 3 files changed, 56 insertions(+), 30 deletions(-) diff --git a/paddle/parameter/tests/test_argument.cpp b/paddle/parameter/tests/test_argument.cpp index 81fe4ee397..98ab013548 100644 --- a/paddle/parameter/tests/test_argument.cpp +++ b/paddle/parameter/tests/test_argument.cpp @@ -42,7 +42,7 @@ TEST(Argument, poolSequenceWithStride) { CHECK_EQ(outStart[3], 4); CHECK_EQ(outStart[4], 7); - CHECK_EQ(stridePositions->getSize(), 8); + CHECK_EQ(stridePositions->getSize(), 8UL); auto result = reversed ? strideResultReversed : strideResult; for (int i = 0; i < 8; i++) { CHECK_EQ(stridePositions->getData()[i], result[i]); diff --git a/python/paddle/v2/dataset/common.py b/python/paddle/v2/dataset/common.py index 89675080e2..8023fa3cf8 100644 --- a/python/paddle/v2/dataset/common.py +++ b/python/paddle/v2/dataset/common.py @@ -151,9 +151,14 @@ def cluster_files_reader(files_pattern, return reader -def convert(output_path, eader, num_shards, name_prefix): +def convert(output_path, + reader, + num_shards, + name_prefix, + max_lines_to_shuffle=10000): import recordio import cPickle as pickle + import random """ Convert data from reader to recordio format files. @@ -161,35 +166,40 @@ def convert(output_path, eader, num_shards, name_prefix): :param reader: a data reader, from which the convert program will read data instances. :param num_shards: the number of shards that the dataset will be partitioned into. :param name_prefix: the name prefix of generated files. + :param max_lines_to_shuffle: the max lines numbers to shuffle before writing. 
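+
+    Each shard is written to "<output_path>/<name_prefix>-<i>-of-<num_shards-1>"
+    with zero-padded indices; e.g. num_shards=4 yields files ending in
+    -00000-of-00003 through -00003-of-00003.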
""" - def open_needs(idx): - n = "%s/%s-%05d" % (output_path, name_prefix, idx) - w = recordio.writer(n) - f = open(n, "w") - idx += 1 + assert num_shards >= 1 + assert max_lines_to_shuffle >= 1 - return w, f, idx + def open_writers(): + w = [] + for i in range(0, num_shards): + n = "%s/%s-%05d-of-%05d" % (output_path, name_prefix, i, + num_shards - 1) + w.append(recordio.writer(n)) - def close_needs(w, f): - if w is not None: - w.close() + return w - if f is not None: - f.close() + def close_writers(w): + for i in range(0, num_shards): + w[i].close() - idx = 0 - w = None - f = None + def write_data(w, lines): + random.shuffle(lines) + for i, d in enumerate(lines): + d = pickle.dumps(d, pickle.HIGHEST_PROTOCOL) + w[i % num_shards].write(d) - for i, d in enumerate(reader()): - if w is None: - w, f, idx = open_needs(idx) - - w.write(pickle.dumps(d, pickle.HIGHEST_PROTOCOL)) + w = open_writers() + lines = [] - if i % num_shards == 0 and i >= num_shards: - close_needs(w, f) - w, f, idx = open_needs(idx) + for i, d in enumerate(reader()): + lines.append(d) + if i % max_lines_to_shuffle == 0 and i >= max_lines_to_shuffle: + write_data(w, lines) + lines = [] + continue - close_needs(w, f) + write_data(w, lines) + close_writers(w) diff --git a/python/paddle/v2/dataset/tests/common_test.py b/python/paddle/v2/dataset/tests/common_test.py index 3120026e1e..cfa194eba3 100644 --- a/python/paddle/v2/dataset/tests/common_test.py +++ b/python/paddle/v2/dataset/tests/common_test.py @@ -58,20 +58,36 @@ class TestCommon(unittest.TestCase): self.assertEqual(e, str("0")) def test_convert(self): + record_num = 10 + num_shards = 4 + def test_reader(): def reader(): - for x in xrange(10): + for x in xrange(record_num): yield x return reader path = tempfile.mkdtemp() - paddle.v2.dataset.common.convert(path, - test_reader(), 4, 'random_images') + test_reader(), num_shards, + 'random_images') - files = glob.glob(temp_path + '/random_images-*') - self.assertEqual(len(files), 3) + files = glob.glob(path + '/random_images-*') + self.assertEqual(len(files), num_shards) + + recs = [] + for i in range(0, num_shards): + n = "%s/random_images-%05d-of-%05d" % (path, i, num_shards - 1) + r = recordio.reader(n) + while True: + d = r.read() + if d is None: + break + recs.append(d) + + recs.sort() + self.assertEqual(total, record_num) if __name__ == '__main__': From dc458a0d5ac7d5e88047856b773da1052eed80d8 Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Mon, 12 Jun 2017 14:18:05 +0800 Subject: [PATCH 25/39] change go version --- .travis.yml | 1 + go/pserver/cclient/test/test_cclient.c | 3 ++- go/pserver/optimizer.c | 3 ++- 3 files changed, 5 insertions(+), 2 deletions(-) diff --git a/.travis.yml b/.travis.yml index 44b755ee32..f9b4a7e083 100644 --- a/.travis.yml +++ b/.travis.yml @@ -50,6 +50,7 @@ before_install: # protobuf version. 
- pip install numpy wheel 'protobuf==3.1' sphinx==1.5.6 recommonmark sphinx-rtd-theme==0.1.9 virtualenv pre-commit requests==2.9.2 LinkChecker - pip install rarfile + - eval "$(GIMME_GO_VERSION=1.8.3 gimme)" - | function timeout() { perl -e 'alarm shift; exec @ARGV' "$@"; } script: diff --git a/go/pserver/cclient/test/test_cclient.c b/go/pserver/cclient/test/test_cclient.c index 82cef386d7..50ba2d5597 100644 --- a/go/pserver/cclient/test/test_cclient.c +++ b/go/pserver/cclient/test/test_cclient.c @@ -20,7 +20,8 @@ void print_parameter(paddle_gradient* param) { printf("name: %s\n", param->name); printf("content_len: %d\n", param->content_len); printf("content_type: %d\n", param->element_type); - for (int i = 0; i < param->content_len / sizeof(real); ++i) { + int i; + for (i = 0; i < param->content_len / sizeof(real); ++i) { printf("%f ", ((float*)param->content)[i]); } printf("\n\n"); diff --git a/go/pserver/optimizer.c b/go/pserver/optimizer.c index 47fe1efbf5..48bbceb343 100644 --- a/go/pserver/optimizer.c +++ b/go/pserver/optimizer.c @@ -36,7 +36,8 @@ int update_SGD(void* optimizer, float* parameter = (float*)buffer; float* grad = (float*)gradient; - for (int i = 0; i < num_bytes / sizeof(float); ++i) { + int i; + for (i = 0; i < num_bytes / sizeof(float); ++i) { parameter[i] -= o->learning_rate * grad[i]; } return 0; From 37594eae1737ad7c95016f48a385521ceb0de529 Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Tue, 13 Jun 2017 08:15:12 +0800 Subject: [PATCH 26/39] add paramConfig for each parameter --- go/pserver/cclient/test/main.c | 2 -- go/pserver/cclient/test/test_cclient.c | 3 +-- go/pserver/optimizer.c | 1 - go/pserver/service.go | 6 +----- paddle/trainer/NewRemoteParameterUpdater.cpp | 8 ++++++-- paddle/trainer/NewRemoteParameterUpdater.h | 1 - 6 files changed, 8 insertions(+), 13 deletions(-) diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index 72ec359076..6adc3c9b53 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -76,7 +76,5 @@ retry: fail(); } - printf("test success!\n"); - return 0; } diff --git a/go/pserver/cclient/test/test_cclient.c b/go/pserver/cclient/test/test_cclient.c index 50ba2d5597..9083064eee 100644 --- a/go/pserver/cclient/test/test_cclient.c +++ b/go/pserver/cclient/test/test_cclient.c @@ -21,7 +21,7 @@ void print_parameter(paddle_gradient* param) { printf("content_len: %d\n", param->content_len); printf("content_type: %d\n", param->element_type); int i; - for (i = 0; i < param->content_len / sizeof(real); ++i) { + for (i = 0; i < param->content_len / (int)sizeof(real); ++i) { printf("%f ", ((float*)param->content)[i]); } printf("\n\n"); @@ -110,6 +110,5 @@ retry: fail(); } - printf("test success!\n"); return 0; } diff --git a/go/pserver/optimizer.c b/go/pserver/optimizer.c index 48bbceb343..f16ba2cbf8 100644 --- a/go/pserver/optimizer.c +++ b/go/pserver/optimizer.c @@ -32,7 +32,6 @@ int update_SGD(void* optimizer, const void* gradient, int num_bytes) { SGD_optimizer* o = (SGD_optimizer*)optimizer; - // TODO(a simple SGD implement) float* parameter = (float*)buffer; float* grad = (float*)gradient; diff --git a/go/pserver/service.go b/go/pserver/service.go index ab814662b6..7d2a1fea86 100644 --- a/go/pserver/service.go +++ b/go/pserver/service.go @@ -29,10 +29,6 @@ type Parameter struct { Content []byte } -func (p *Parameter) toString() { - fmt.Println(p.Name, p.ElementType, p.Content) -} - // ParameterWithConfig contains the parameter and the configuration. 
type ParameterWithConfig struct {
	Param Parameter
@@ -53,7 +49,7 @@ type Service struct {
// NewService creates a new service.
func NewService() *Service {
-	s := &Service{opt: newOptimizer(sgd, 0.01)}
+	s := &Service{opt: newOptimizer(sgd, 0.005)}
	s.paramMap = make(map[string]Parameter)
	s.initialized = make(chan struct{})
	return s
diff --git a/paddle/trainer/NewRemoteParameterUpdater.cpp b/paddle/trainer/NewRemoteParameterUpdater.cpp
index d554e09759..b3655d9d02 100644
--- a/paddle/trainer/NewRemoteParameterUpdater.cpp
+++ b/paddle/trainer/NewRemoteParameterUpdater.cpp
@@ -31,7 +31,6 @@ NewRemoteParameterUpdater::NewRemoteParameterUpdater(
void NewRemoteParameterUpdater::init(
    const std::vector<ParameterPtr> &parameters) {
  ParameterUpdater::init(parameters);
-  LOG(INFO) << "NewRemoteParameterUpdater init in";
  for (auto &para : parameters_) {
    para->getBuf(PARAMETER_VALUE)->zeroMem();
@@ -58,7 +57,12 @@ void NewRemoteParameterUpdater::init(
  if (paddle_begin_init_params(parameterClient_)) {
    LOG(INFO) << "paddle_begin_init_params start";
    for (int i = 0; i < parameterSize(); ++i) {
-      paddle_init_param(parameterClient_, *newParameters_[i], NULL, 0);
+      auto paramConfig = parameters_[i]->getConfig();
+      std::string bytes = paramConfig.SerializeAsString();
+      const char *array = bytes.data();
+      int size = (int)bytes.size();
+      paddle_init_param(
+          parameterClient_, *newParameters_[i], (void *)array, size);
    }
    paddle_finish_init_params(parameterClient_);
    LOG(INFO) << "paddle_begin_init_params done";
diff --git a/paddle/trainer/NewRemoteParameterUpdater.h b/paddle/trainer/NewRemoteParameterUpdater.h
index b7c0425982..1f22c15cef 100644
--- a/paddle/trainer/NewRemoteParameterUpdater.h
+++ b/paddle/trainer/NewRemoteParameterUpdater.h
@@ -84,7 +84,6 @@ private:
    for (int i = 0; i < parameterSize(); ++i) {
      ParameterPtr param = parameters_[i];
-      new_params[i]->content_len = 10;
      new_params[i]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32;
      new_params[i]->name = (char*)param->getName().c_str();
      new_params[i]->content =

From f23ee80daa682833604bbc22874aa25c1d50bfdd Mon Sep 17 00:00:00 2001
From: gongweibao
Date: Tue, 13 Jun 2017 14:41:56 +0800
Subject: [PATCH 27/39] modify 10000 to 1000

---
 python/paddle/v2/dataset/common.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/paddle/v2/dataset/common.py b/python/paddle/v2/dataset/common.py
index 8023fa3cf8..9c614914b5 100644
--- a/python/paddle/v2/dataset/common.py
+++ b/python/paddle/v2/dataset/common.py
@@ -155,7 +155,7 @@ def convert(output_path,
            reader,
            num_shards,
            name_prefix,
-            max_lines_to_shuffle=10000):
+            max_lines_to_shuffle=1000):
    import recordio
    import cPickle as pickle
    import random

From 22b5a388cfe9366876bba3a0349bcae24fd20a70 Mon Sep 17 00:00:00 2001
From: Helin Wang
Date: Tue, 13 Jun 2017 22:41:08 +0000
Subject: [PATCH 28/39] fix according to comments

---
 go/pserver/cclient/cclient.go | 72 ++++++++++++++---------------
 1 file changed, 29 insertions(+), 43 deletions(-)

diff --git a/go/pserver/cclient/cclient.go b/go/pserver/cclient/cclient.go
index 6a76ec920e..e753b461bc 100644
--- a/go/pserver/cclient/cclient.go
+++ b/go/pserver/cclient/cclient.go
@@ -20,6 +20,8 @@ typedef struct {
} paddle_parameter, paddle_gradient;
typedef int paddle_pserver_client;
+#define PSERVER_ERROR -1
+#define PSERVER_OK 0
*/
import "C"
@@ -115,7 +117,7 @@ func paddle_begin_init_params(client C.paddle_pserver_client) C.int {
	if selected := c.BeginInitParams(); selected {
		return 1
	}
-	return 0
+	return C.PSERVER_OK
}
//export paddle_init_param
@@ -133,13 +135,13 @@ func paddle_init_param(client C.paddle_pserver_client, param C.paddle_parameter,
	if err != nil {
		if err.Error() == pserver.AlreadyInitialized {
			log.Printf("parameter %s already initialized, treat paddle_init_param as sucessful.\n", name)
-			return 0
+			return C.PSERVER_OK
		}
		log.Println(err)
-		return -1
+		return C.PSERVER_ERROR
	}
-	return 0
+	return C.PSERVER_OK
}
//export paddle_finish_init_params
func paddle_finish_init_params(client C.paddle_pserver_client) C.int {
	if err != nil {
		if err.Error() == pserver.AlreadyInitialized {
			log.Println("parameters already initialized, treat paddle_finish_init_params as sucessful.")
-			return 0
+			return C.PSERVER_OK
		}
		log.Println(err)
-		return -1
+		return C.PSERVER_ERROR
	}
-	return 0
+	return C.PSERVER_OK
}
//export paddle_send_grads
@@ -174,10 +176,10 @@ func paddle_send_grads(client C.paddle_pserver_client, grads *C.paddle_gradient,
	err := c.SendGrads(gs)
	if err != nil {
		log.Println(err)
-		return -1
+		return C.PSERVER_ERROR
	}
-	return 0
+	return C.PSERVER_OK
}
//export paddle_get_params
@@ -191,42 +193,26 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter,
	ps, err := c.GetParams(ns)
	if err != nil {
		log.Println(err)
-		return -1
-	}
-
-	names := func() (string, string) {
-		retNames := ""
-		for _, p := range ps {
-			if retNames == "" {
-				retNames = p.Name
-			} else {
-				retNames = ", " + p.Name
-			}
-		}
-
-		requestedNames := ""
-		for _, n := range ns {
-			if requestedNames == "" {
-				requestedNames = n
-			} else {
-				requestedNames = ", " + n
-			}
-		}
-
-		return requestedNames, retNames
+		return C.PSERVER_ERROR
	}
	if len(ps) != len(ns) {
-		requestedNames, retNames := names()
-		log.Printf("pserver returned wrong number of parameters. Requested: %s, returned: %s.\n", retNames, requestedNames)
-		return -1
+		pn := make([]string, len(ps))
+		for i, p := range ps {
+			pn[i] = p.Name
+		}
+		log.Printf("pserver returned wrong number of parameters. Requested: %s, returned: %s.\n", strings.Join(pn, ", "), strings.Join(ns, ", "))
+		return C.PSERVER_ERROR
	}
	for i := range ps {
		if ns[i] != ps[i].Name {
-			requestedNames, retNames := names()
-			log.Printf("pserver returned wrong parameters, or not in requested order. Requested: %s, returned: %s.\n", retNames, requestedNames)
-			return -1
+			pn := make([]string, len(ps))
+			for i, p := range ps {
+				pn[i] = p.Name
+			}
+			log.Printf("pserver returned wrong parameters, or not in requested order. Requested: %s, returned: %s.\n", strings.Join(pn, ", "), strings.Join(ns, ", "))
+			return C.PSERVER_ERROR
		}
	}
@@ -236,12 +222,12 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter,
	if unsafe.Pointer(param) == nullPtr {
		log.Println("must pre-allocate parameter.")
-		return -1
+		return C.PSERVER_ERROR
	} else {
		if unsafe.Pointer(param.content) != nullPtr {
			if int(param.content_len) != len(p.Content) {
				log.Printf("the pre-allocated content len does not match parameter content len. Pre-allocated len: %d, returned len: %d", param.content_len, len(p.Content))
-				return -1
+				return C.PSERVER_ERROR
			}
		}
	}
@@ -251,7 +237,7 @@ func paddle_get_params(client C.paddle_pserver_client, dst **C.paddle_parameter,
		param.element_type = C.paddle_element_type(p.ElementType)
	}
-	return 0
+	return C.PSERVER_OK
}
//export paddle_save_model
@@ -261,10 +247,10 @@ func paddle_save_model(client C.paddle_pserver_client, path *C.char) C.int {
	err := c.Save(p)
	if err != nil {
		log.Println(err)
-		return -1
+		return C.PSERVER_ERROR
	}
-	return 0
+	return C.PSERVER_OK
}
func main() {} // Required but ignored

From f05649afb79b746e4c3d07112f79ce7a3cce0344 Mon Sep 17 00:00:00 2001
From: Helin Wang
Date: Thu, 8 Jun 2017 20:57:04 +0000
Subject: [PATCH 29/39] move connection to higher hierarchy, master package need to use it too.

---
 go/{pserver/internal => }/connection/conn.go | 0
 go/pserver/client.go | 2 +-
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename go/{pserver/internal => }/connection/conn.go (100%)

diff --git a/go/pserver/internal/connection/conn.go b/go/connection/conn.go
similarity index 100%
rename from go/pserver/internal/connection/conn.go
rename to go/connection/conn.go
diff --git a/go/pserver/client.go b/go/pserver/client.go
index f8bd0aa59f..4f35141a9f 100644
--- a/go/pserver/client.go
+++ b/go/pserver/client.go
@@ -6,7 +6,7 @@ import (
	"sort"
	"time"
-	"github.com/PaddlePaddle/Paddle/go/pserver/internal/connection"
+	"github.com/PaddlePaddle/Paddle/go/connection"
)
// TODO(helin): add RPC call retry logic

From 72a73ab6d2139fae73dc922505acad6d8aa41ec4 Mon Sep 17 00:00:00 2001
From: Helin Wang
Date: Thu, 8 Jun 2017 22:18:36 +0000
Subject: [PATCH 30/39] implement master server client, RPC part.

---
 go/cmd/master/master.go | 2 --
 go/connection/conn.go | 15 ++++++++
 go/master/client.go | 74 ++++++++++++++++++++++++++++++++++++
 go/master/client_test.go | 78 ++++++++++++++++++++++++++++++++++++++++
 go/master/service.go | 11 ++++--
 go/pserver/client.go | 19 +++++++---
 6 files changed, 190 insertions(+), 9 deletions(-)
 create mode 100644 go/master/client.go
 create mode 100644 go/master/client_test.go

diff --git a/go/cmd/master/master.go b/go/cmd/master/master.go
index d1f3d7d76c..65548b7b68 100644
--- a/go/cmd/master/master.go
+++ b/go/cmd/master/master.go
@@ -50,7 +50,6 @@ func main() {
		panic("no valid datset specified.")
	}
-	idx := 0
	for _, path := range paths {
		f, err := os.Open(path)
		if err != nil {
@@ -66,7 +65,6 @@ func main() {
		count := index.NumChunks()
		for i := 0; i < count; i++ {
			chunk := master.Chunk{
-				Idx: idx,
				Path: path,
				Index: *index.ChunkIndex(i),
			}
diff --git a/go/connection/conn.go b/go/connection/conn.go
index 1c04f11725..0bab2def1d 100644
--- a/go/connection/conn.go
+++ b/go/connection/conn.go
@@ -21,6 +21,18 @@ func New() *Conn {
	return c
}
+// Close closes the connection.
+func (c *Conn) Close() error {
+	c.mu.Lock()
+	defer c.mu.Unlock()
+
+	if c.client == nil {
+		return nil
+	}
+
+	return c.client.Close()
+}
+
// Connect connects the connection to a address.
func (c *Conn) Connect(addr string) error {
	c.mu.Lock()
@@ -56,6 +68,9 @@ func (c *Conn) Connect(addr string) error {
	return nil
}
+// TODO(helin): refactor Call to be able to perform given retry
+// policy.
+
// Call make a RPC call.
// // Call will be blocked until the connection to remote RPC service diff --git a/go/master/client.go b/go/master/client.go new file mode 100644 index 0000000000..23ef18f9e2 --- /dev/null +++ b/go/master/client.go @@ -0,0 +1,74 @@ +package master + +import ( + "log" + "time" + + "github.com/PaddlePaddle/Paddle/go/connection" +) + +// Addresser provide the address of the master server. +type Addresser interface { + Address() string +} + +// Client is the client of the master server. +type Client struct { + conn *connection.Conn +} + +// NewClient creates a new Client. +func NewClient(addr Addresser) *Client { + c := &Client{} + c.conn = connection.New() + go c.monitorMaster(addr) + return c +} + +func (c *Client) monitorMaster(addr Addresser) { + lastMaster := "" + monitor := func() { + curMaster := addr.Address() + if curMaster != lastMaster { + if curMaster == "" { + err := c.conn.Close() + if err != nil { + log.Println(err) + } + } else { + err := c.conn.Connect(curMaster) + if err != nil { + log.Println(err) + + // connect to addr failed, set + // to last known addr in order + // to retry next time. + curMaster = lastMaster + } + + } + } + + lastMaster = curMaster + } + + monitor() + ticker := time.NewTicker(10 * time.Second) + for _ = range ticker.C { + monitor() + } +} + +// GetTask gets a new task from the master server. +func (c *Client) GetTask() (Task, error) { + var dummy int + var t Task + err := c.conn.Call("Service.GetTask", dummy, &t) + return t, err +} + +// TaskFinished tells the master server a task is finished. +func (c *Client) TaskFinished(taskID int) error { + var dummy int + return c.conn.Call("Service.TaskFinished", taskID, &dummy) +} diff --git a/go/master/client_test.go b/go/master/client_test.go new file mode 100644 index 0000000000..4603bdc4d6 --- /dev/null +++ b/go/master/client_test.go @@ -0,0 +1,78 @@ +package master_test + +import ( + "fmt" + "net" + "net/http" + "net/rpc" + "strconv" + "strings" + "testing" + "time" + + "github.com/PaddlePaddle/Paddle/go/master" +) + +const ( + totalTask = 20 + chunkPerTask = 10 +) + +var port int + +func init() { + l, err := net.Listen("tcp", ":0") + if err != nil { + panic(err) + } + + ss := strings.Split(l.Addr().String(), ":") + p, err := strconv.Atoi(ss[len(ss)-1]) + if err != nil { + panic(err) + } + port = p + + go func(l net.Listener) { + chunks := make([]master.Chunk, totalTask) + s := master.NewService(chunks, chunkPerTask, time.Second, 1) + server := rpc.NewServer() + err := server.Register(s) + if err != nil { + panic(err) + } + + mux := http.NewServeMux() + mux.Handle(rpc.DefaultRPCPath, server) + err = http.Serve(l, mux) + if err != nil { + panic(err) + } + }(l) +} + +type addresser string + +func (a addresser) Address() string { + return string(a) +} + +func TestClientFull(t *testing.T) { + c := master.NewClient(addresser(fmt.Sprintf(":%d", port))) + + for i := 0; i < 5*totalTask/chunkPerTask; i++ { + task, err := c.GetTask() + if err != nil { + panic(err) + } + + if len(task.Chunks) != chunkPerTask { + t.Fatal("wrong number of chunk per task", len(task.Chunks)) + } + + err = c.TaskFinished(task.ID) + if err != nil { + panic(err) + } + } +} diff --git a/go/master/service.go b/go/master/service.go index ab17a62f38..8d6bbecc49 100644 --- a/go/master/service.go +++ b/go/master/service.go @@ -75,9 +75,8 @@ func NewService(chunks []Chunk, chunksPerTask int, timeoutDur time.Duration, tim // Chunk is a chunk of data consisted of several data instances. 
type Chunk struct { - Idx int // index of the chunk within the file Path string - Index recordio.Index // block index + Index recordio.Index // chunk index } // Task is the basic unit of data instances assigned to trainers. @@ -123,6 +122,8 @@ func (s *Service) GetTask(dummy int, task *Task) error { return err } + *task = t.Task + time.AfterFunc(s.timeoutDur, func(taskID int, epoch int) func() { return func() { s.mu.Lock() @@ -174,5 +175,11 @@ func (s *Service) TaskFinished(taskID int, dummy *int) error { t.NumTimeout = 0 s.taskQueues.Done = append(s.taskQueues.Done, t) delete(s.taskQueues.Pending, taskID) + + if len(s.taskQueues.Todo) == 0 { + s.taskQueues.Todo = s.taskQueues.Done + s.taskQueues.Done = nil + } + return s.snapshot() } diff --git a/go/pserver/client.go b/go/pserver/client.go index 4f35141a9f..7930f012c3 100644 --- a/go/pserver/client.go +++ b/go/pserver/client.go @@ -47,7 +47,7 @@ func NewClient(l Lister, pserverNum int, sel Selector) *Client { // monitorPservers monitors pserver addresses, and updates connection // when the address changes. func (c *Client) monitorPservers(l Lister, pserverNum int) { - knownServers := make([]Server, pserverNum) + lastServers := make([]Server, pserverNum) ticker := time.NewTicker(10 * time.Second) monitor := func() { curServers := make([]Server, pserverNum) @@ -56,8 +56,17 @@ func (c *Client) monitorPservers(l Lister, pserverNum int) { curServers[l.Index] = l } - for i := range knownServers { - if knownServers[i].Addr != curServers[i].Addr { + for i := range lastServers { + if lastServers[i].Addr != curServers[i].Addr { + if curServers[i].Addr == "" { + err := c.pservers[i].Close() + if err != nil { + log.Println(err) + } + + continue + } + err := c.pservers[i].Connect(curServers[i].Addr) if err != nil { log.Println(err) @@ -65,12 +74,12 @@ func (c *Client) monitorPservers(l Lister, pserverNum int) { // connect to addr failed, set // to last known addr in order // to retry next time. 
- curServers[i].Addr = knownServers[i].Addr + curServers[i].Addr = lastServers[i].Addr } } } - knownServers = curServers + lastServers = curServers } monitor() From 54e8263cae3ffcc597d977330f78fb5020c1dba2 Mon Sep 17 00:00:00 2001 From: Helin Wang Date: Fri, 9 Jun 2017 01:43:38 +0000 Subject: [PATCH 31/39] implement master server client, remove unnecessary dummy variable --- go/cmd/master/master.go | 50 +-------------- go/master/client.go | 14 +++-- go/master/client_test.go | 49 +++++++++++---- go/master/service.go | 121 ++++++++++++++++++++++++++++++++----- go/pserver/client.go | 12 ++-- go/pserver/service_test.go | 31 ++++------ 6 files changed, 174 insertions(+), 103 deletions(-) diff --git a/go/cmd/master/master.go b/go/cmd/master/master.go index 65548b7b68..25cd1cafcd 100644 --- a/go/cmd/master/master.go +++ b/go/cmd/master/master.go @@ -1,78 +1,32 @@ package main import ( - "fmt" "net" "net/http" "net/rpc" - "os" - "path/filepath" "strconv" - "strings" "time" "github.com/namsral/flag" "github.com/PaddlePaddle/Paddle/go/master" - "github.com/PaddlePaddle/recordio" ) func main() { port := flag.Int("port", 8080, "port of the master server.") - dataset := flag.String("training_dataset", "", "dataset: comma separated path to RecordIO paths, supports golb patterns.") + faultTolerance := flag.Bool("fault_tolerance", false, "enable fault tolerance (requires etcd).") taskTimeoutDur := flag.Duration("task_timout_dur", 20*time.Minute, "task timout duration.") taskTimeoutMax := flag.Int("task_timeout_max", 3, "max timtout count for each task before it being declared failed task.") chunkPerTask := flag.Int("chunk_per_task", 10, "chunk per task.") flag.Parse() - if *dataset == "" { - panic("no dataset specified.") - } - if *faultTolerance { panic("fault tolernance not implemented.") - } - - var chunks []master.Chunk - var paths []string - ss := strings.Split(*dataset, ",") - fmt.Println(ss) - for _, s := range ss { - match, err := filepath.Glob(s) - if err != nil { - panic(err) - } - paths = append(paths, match...) - } - - if len(paths) == 0 { - panic("no valid datset specified.") - } - - for _, path := range paths { - f, err := os.Open(path) - if err != nil { - panic(err) - } - - index, err := recordio.LoadIndex(f) - if err != nil { - panic(err) - } - f.Close() - count := index.NumChunks() - for i := 0; i < count; i++ { - chunk := master.Chunk{ - Path: path, - Index: *index.ChunkIndex(i), - } - chunks = append(chunks, chunk) - } } - s := master.NewService(chunks, *chunkPerTask, *taskTimeoutDur, *taskTimeoutMax) + s := master.NewService(*chunkPerTask, *taskTimeoutDur, *taskTimeoutMax) err := rpc.Register(s) if err != nil { panic(err) diff --git a/go/master/client.go b/go/master/client.go index 23ef18f9e2..791db5a975 100644 --- a/go/master/client.go +++ b/go/master/client.go @@ -59,16 +59,22 @@ func (c *Client) monitorMaster(addr Addresser) { } } +// SetDataset set dataset for the master server to dispatch. +// +// SetDataset can be call multiple times from different nodes. But +// only the first call will be honored. +func (c *Client) SetDataset(globPaths []string) error { + return c.conn.Call("Service.SetDataset", globPaths, nil) +} + // GetTask gets a new task from the master server. func (c *Client) GetTask() (Task, error) { - var dummy int var t Task - err := c.conn.Call("Service.GetTask", dummy, &t) + err := c.conn.Call("Service.GetTask", 0, &t) return t, err } // TaskFinished tells the master server a task is finished. 
func (c *Client) TaskFinished(taskID int) error { - var dummy int - return c.conn.Call("Service.TaskFinished", taskID, &dummy) + return c.conn.Call("Service.TaskFinished", taskID, nil) } diff --git a/go/master/client_test.go b/go/master/client_test.go index 4603bdc4d6..5abad0d820 100644 --- a/go/master/client_test.go +++ b/go/master/client_test.go @@ -5,12 +5,14 @@ import ( "net" "net/http" "net/rpc" + "os" "strconv" "strings" "testing" "time" "github.com/PaddlePaddle/Paddle/go/master" + "github.com/PaddlePaddle/recordio" ) const ( @@ -34,8 +36,7 @@ func init() { port = p go func(l net.Listener) { - chunks := make([]master.Chunk, totalTask) - s := master.NewService(chunks, chunkPerTask, time.Second, 1) + s := master.NewService(chunkPerTask, time.Second, 1) server := rpc.NewServer() err := server.Register(s) if err != nil { @@ -58,21 +59,47 @@ func (a addresser) Address() string { } func TestClientFull(t *testing.T) { + const p = "/tmp/master_client_test_0" + f, err := os.Create(p) + if err != nil { + panic(err) + } + + for i := 0; i < totalTask*chunkPerTask; i++ { + w := recordio.NewWriter(f, -1, -1) + w.Write(nil) + // call Close to force RecordIO writing a chunk. + w.Close() + } + f.Close() + c := master.NewClient(addresser(fmt.Sprintf(":%d", port))) + c.SetDataset([]string{p}) - for i := 0; i < 5*totalTask/chunkPerTask; i++ { - task, err := c.GetTask() - if err != nil { - panic(err) + checkOnePass := func(i int) { + var tasks []master.Task + for i := 0; i < totalTask; i++ { + task, err := c.GetTask() + if err != nil { + t.Fatal(i, err) + } + tasks = append(tasks, task) } - if len(task.Chunks) != chunkPerTask { - t.Fatal("wrong number of chunk per task", len(task.Chunks)) + _, err = c.GetTask() + if err == nil { + t.Fatal(i, "should get error.") } - err = c.TaskFinished(task.ID) - if err != nil { - panic(err) + for _, task := range tasks { + err = c.TaskFinished(task.ID) + if err != nil { + t.Fatal(i, err) + } } } + + for i := 0; i < 10; i++ { + checkOnePass(i) + } } diff --git a/go/master/service.go b/go/master/service.go index 8d6bbecc49..c80037a3b3 100644 --- a/go/master/service.go +++ b/go/master/service.go @@ -3,6 +3,8 @@ package master import ( "errors" "log" + "os" + "path/filepath" "sync" "time" @@ -13,18 +15,15 @@ const ( targetTaskCount = 300 ) -// errors -var ( - ErrNoMoreTask = errors.New("no more task for current pass") - ErrPendingTaskNotFound = errors.New("pending task not found") -) - // Service is the master server service. type Service struct { - timeoutDur time.Duration - timeoutMax int + chunksPerTask int + timeoutDur time.Duration + timeoutMax int + ready chan struct{} mu sync.Mutex + initBegan bool taskQueues taskQueues } @@ -63,13 +62,14 @@ func partition(chunks []Chunk, chunksPerTask int) []taskEntry { } // NewService creates a new service. -func NewService(chunks []Chunk, chunksPerTask int, timeoutDur time.Duration, timeoutMax int) *Service { +func NewService(chunksPerTask int, timeoutDur time.Duration, timeoutMax int) *Service { s := &Service{} + s.chunksPerTask = chunksPerTask s.timeoutDur = timeoutDur s.timeoutMax = timeoutMax s.taskQueues = taskQueues{} s.taskQueues.Pending = make(map[int]taskEntry) - s.taskQueues.Todo = partition(chunks, chunksPerTask) + s.ready = make(chan struct{}) return s } @@ -104,13 +104,102 @@ func (s *Service) snapshot() error { return nil } +// SetDataset sets dataset to dispatch for the master server. +// +// SetDataset can be call multiple times. But only the first call will +// be honored. 
+func (s *Service) SetDataset(globPaths []string, dummy *int) error { + if len(globPaths) == 0 { + return errors.New("no dataset specified") + } + + s.mu.Lock() + defer s.mu.Unlock() + if s.initBegan { + // SetDataset already called. All trainer will call + // SetDataset, but we only handle the first one. Treat + // other calls as successful but do nothing. + return nil + } + + s.initBegan = true + + var chunks []Chunk + var paths []string + + for _, s := range globPaths { + match, err := filepath.Glob(s) + if err != nil { + panic(err) + } + paths = append(paths, match...) + } + + if len(paths) == 0 { + return errors.New("no valid datset specified") + } + + for _, path := range paths { + f, err := os.Open(path) + if err != nil { + panic(err) + } + + index, err := recordio.LoadIndex(f) + if err != nil { + return err + } + err = f.Close() + if err != nil { + return err + } + + count := index.NumChunks() + for i := 0; i < count; i++ { + chunk := Chunk{ + Path: path, + Index: *index.ChunkIndex(i), + } + chunks = append(chunks, chunk) + } + } + + s.taskQueues.Todo = partition(chunks, s.chunksPerTask) + + err := s.snapshot() + if err != nil { + return err + } + + close(s.ready) + return nil +} + // GetTask gets a new task from the service. func (s *Service) GetTask(dummy int, task *Task) error { + select { + case <-s.ready: + } + s.mu.Lock() defer s.mu.Unlock() if len(s.taskQueues.Todo) == 0 { - return ErrNoMoreTask + if len(s.taskQueues.Done) == 0 { + if len(s.taskQueues.Pending) == 0 { + return errors.New("all task failed") + } + + // TODO(helin): client need to retry in this + // error case. Gotcha: RPC client can't + // compare returned error with predefined + // erros like io.EOF. Because interface don't + // have same dynamic value when in different + // process. + return errors.New("no more available task") + } + s.taskQueues.Todo = s.taskQueues.Done + s.taskQueues.Todo = nil } t := s.taskQueues.Todo[0] @@ -163,12 +252,16 @@ func (s *Service) GetTask(dummy int, task *Task) error { // TaskFinished tell the service that a task is finished. func (s *Service) TaskFinished(taskID int, dummy *int) error { + select { + case <-s.ready: + } + s.mu.Lock() defer s.mu.Unlock() t, ok := s.taskQueues.Pending[taskID] if !ok { - return ErrPendingTaskNotFound + return errors.New("pending task not found") } // task finished, reset timeout @@ -176,8 +269,8 @@ func (s *Service) TaskFinished(taskID int, dummy *int) error { s.taskQueues.Done = append(s.taskQueues.Done, t) delete(s.taskQueues.Pending, taskID) - if len(s.taskQueues.Todo) == 0 { - s.taskQueues.Todo = s.taskQueues.Done + if len(s.taskQueues.Pending) == 0 { + s.taskQueues.Todo = append(s.taskQueues.Todo, s.taskQueues.Done...) s.taskQueues.Done = nil } diff --git a/go/pserver/client.go b/go/pserver/client.go index 7930f012c3..bbe93cbb6b 100644 --- a/go/pserver/client.go +++ b/go/pserver/client.go @@ -102,16 +102,14 @@ func (c *Client) BeginInitParams() bool { // InitParam initializes the parameter on parameter servers. func (c *Client) InitParam(paramWithConfigs ParameterWithConfig) error { - var dummy int - return c.pservers[c.partition(paramWithConfigs.Param.Name)].Call("Service.InitParam", paramWithConfigs, &dummy) + return c.pservers[c.partition(paramWithConfigs.Param.Name)].Call("Service.InitParam", paramWithConfigs, nil) } // FinishInitParams tells parameter servers client has sent all // parameters to parameter servers as initialization. 
func (c *Client) FinishInitParams() error { for _, p := range c.pservers { - var dummy int - err := p.Call("Service.FinishInitParams", dummy, &dummy) + err := p.Call("Service.FinishInitParams", 0, nil) if err != nil { return err } @@ -125,8 +123,7 @@ func (c *Client) SendGrads(grads []Gradient) error { errCh := make(chan error, len(grads)) for _, g := range grads { go func(g Gradient) { - var dummy int - err := c.pservers[c.partition(g.Name)].Call("Service.SendGrad", g, &dummy) + err := c.pservers[c.partition(g.Name)].Call("Service.SendGrad", g, nil) errCh <- err }(g) } @@ -205,8 +202,7 @@ func (c *Client) Save(path string) error { errCh := make(chan error, len(c.pservers)) for _, p := range c.pservers { - var dummy int - err := p.Call("Service.Save", path, &dummy) + err := p.Call("Service.Save", path, nil) errCh <- err } diff --git a/go/pserver/service_test.go b/go/pserver/service_test.go index 796492ffb4..c40cecd0b6 100644 --- a/go/pserver/service_test.go +++ b/go/pserver/service_test.go @@ -15,8 +15,7 @@ func TestFull(t *testing.T) { p.Name = "param_a" p.Content = []byte{1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0} p.ElementType = pserver.Int32 - var dummy int - err := s.InitParam(pserver.ParameterWithConfig{Param: p, Config: nil}, &dummy) + err := s.InitParam(pserver.ParameterWithConfig{p, nil}, nil) if err != nil { t.FailNow() } @@ -25,12 +24,12 @@ func TestFull(t *testing.T) { p1.Name = "param_b" p1.Content = []byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} p1.ElementType = pserver.Float32 - err = s.InitParam(pserver.ParameterWithConfig{Param: p1, Config: nil}, &dummy) + err = s.InitParam(pserver.ParameterWithConfig{p1, nil}, nil) if err != nil { t.FailNow() } - err = s.FinishInitParams(0, &dummy) + err = s.FinishInitParams(0, nil) if err != nil { t.FailNow() } @@ -46,11 +45,11 @@ func TestFull(t *testing.T) { } g1, g2 := pserver.Gradient(p1), pserver.Gradient(p) - err = s.SendGrad(g1, &dummy) + err = s.SendGrad(g1, nil) if err != nil { t.FailNow() } - err = s.SendGrad(g2, &dummy) + err = s.SendGrad(g2, nil) if err != nil { t.FailNow() @@ -74,23 +73,21 @@ func TestFull(t *testing.T) { func TestMultipleInit(t *testing.T) { s := pserver.NewService() - var dummy int - err := s.FinishInitParams(0, &dummy) + err := s.FinishInitParams(0, nil) if err != nil { t.FailNow() } - err = s.FinishInitParams(0, &dummy) - if err.Error() != pserver.AlreadyInitialized { + err = s.FinishInitParams(0, nil) + if err != pserver.ErrAlreadyInitialized { t.FailNow() } } func TestUninitialized(t *testing.T) { s := pserver.NewService() - var dummy int - err := s.SendGrad(pserver.Gradient{}, &dummy) - if err.Error() != pserver.Uninitialized { + err := s.SendGrad(pserver.Gradient{}, nil) + if err != pserver.ErrUninitialized { t.FailNow() } } @@ -112,8 +109,7 @@ func TestBlockUntilInitialized(t *testing.T) { wg.Add(1) go func() { - var dummy int - err := s.Save("", &dummy) + err := s.Save("", nil) if err != nil { t.FailNow() } @@ -134,13 +130,12 @@ func TestBlockUntilInitialized(t *testing.T) { p.Name = "param_a" p.Content = []byte{1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0} p.ElementType = pserver.Int32 - var dummy int - err := s.InitParam(pserver.ParameterWithConfig{Param: p, Config: nil}, &dummy) + err := s.InitParam(pserver.ParameterWithConfig{p, nil}, nil) if err != nil { t.FailNow() } - err = s.FinishInitParams(0, &dummy) + err = s.FinishInitParams(0, nil) if err != nil { t.FailNow() } From 41af738af5f7928bc04a73a7fe05f410333dc259 Mon Sep 17 00:00:00 2001 From: Helin Wang Date: Mon, 12 Jun 2017 20:30:49 +0000 Subject: [PATCH 
32/39] fix according to comments --- go/connection/conn.go | 1 + go/master/service.go | 68 +++++++++++++++++++++++-------------------- 2 files changed, 37 insertions(+), 32 deletions(-) diff --git a/go/connection/conn.go b/go/connection/conn.go index 0bab2def1d..ea6bf972f6 100644 --- a/go/connection/conn.go +++ b/go/connection/conn.go @@ -62,6 +62,7 @@ func (c *Conn) Connect(addr string) error { c.waitConn = nil } } else { + client.Close() return errors.New("client already set from a concurrent goroutine") } diff --git a/go/master/service.go b/go/master/service.go index c80037a3b3..30859d9296 100644 --- a/go/master/service.go +++ b/go/master/service.go @@ -11,10 +11,6 @@ import ( "github.com/PaddlePaddle/recordio" ) -const ( - targetTaskCount = 300 -) - // Service is the master server service. type Service struct { chunksPerTask int @@ -23,7 +19,7 @@ type Service struct { ready chan struct{} mu sync.Mutex - initBegan bool + initDone bool taskQueues taskQueues } @@ -104,54 +100,35 @@ func (s *Service) snapshot() error { return nil } -// SetDataset sets dataset to dispatch for the master server. -// -// SetDataset can be call multiple times. But only the first call will -// be honored. -func (s *Service) SetDataset(globPaths []string, dummy *int) error { - if len(globPaths) == 0 { - return errors.New("no dataset specified") - } - - s.mu.Lock() - defer s.mu.Unlock() - if s.initBegan { - // SetDataset already called. All trainer will call - // SetDataset, but we only handle the first one. Treat - // other calls as successful but do nothing. - return nil - } - - s.initBegan = true - +func getChunks(globPaths []string) ([]Chunk, error) { var chunks []Chunk var paths []string for _, s := range globPaths { match, err := filepath.Glob(s) if err != nil { - panic(err) + return nil, err } paths = append(paths, match...) } if len(paths) == 0 { - return errors.New("no valid datset specified") + return nil, errors.New("no valid datset specified") } for _, path := range paths { f, err := os.Open(path) if err != nil { - panic(err) + return nil, err } index, err := recordio.LoadIndex(f) if err != nil { - return err + return nil, err } err = f.Close() if err != nil { - return err + return nil, err } count := index.NumChunks() @@ -164,14 +141,41 @@ func (s *Service) SetDataset(globPaths []string, dummy *int) error { } } + return chunks, nil +} + +// SetDataset sets dataset to dispatch for the master server. +// +// SetDataset can be call multiple times. But only the first call will +// be honored. +func (s *Service) SetDataset(globPaths []string, dummy *int) error { + if len(globPaths) == 0 { + return errors.New("no dataset specified") + } + + s.mu.Lock() + defer s.mu.Unlock() + if s.initDone { + // Already initialized. All trainer will call + // SetDataset, but we only handle the first one. Treat + // other calls as successful but do nothing. + return nil + } + + chunks, err := getChunks(globPaths) + if err != nil { + return err + } + s.taskQueues.Todo = partition(chunks, s.chunksPerTask) - err := s.snapshot() + err = s.snapshot() if err != nil { return err } close(s.ready) + s.initDone = true return nil } @@ -193,7 +197,7 @@ func (s *Service) GetTask(dummy int, task *Task) error { // TODO(helin): client need to retry in this // error case. Gotcha: RPC client can't // compare returned error with predefined - // erros like io.EOF. Because interface don't + // errors like io.EOF. Because interface don't // have same dynamic value when in different // process. 
return errors.New("no more available task") From 9bd74ddac342601ead87a07763d14b353d0d5726 Mon Sep 17 00:00:00 2001 From: Helin Wang Date: Mon, 12 Jun 2017 20:40:13 +0000 Subject: [PATCH 33/39] fix some linter warnings --- go/connection/conn.go | 7 ++++++- go/master/service.go | 1 - go/pserver/client.go | 2 +- go/pserver/service_test.go | 13 ++++++++----- 4 files changed, 15 insertions(+), 8 deletions(-) diff --git a/go/connection/conn.go b/go/connection/conn.go index ea6bf972f6..bc9b5f0617 100644 --- a/go/connection/conn.go +++ b/go/connection/conn.go @@ -2,6 +2,7 @@ package connection import ( "errors" + "log" "net/rpc" "sync" ) @@ -62,7 +63,11 @@ func (c *Conn) Connect(addr string) error { c.waitConn = nil } } else { - client.Close() + err := client.Close() + if err != nil { + log.Println(err) + } + return errors.New("client already set from a concurrent goroutine") } diff --git a/go/master/service.go b/go/master/service.go index 30859d9296..3edbb7e9c0 100644 --- a/go/master/service.go +++ b/go/master/service.go @@ -50,7 +50,6 @@ func partition(chunks []Chunk, chunksPerTask int) []taskEntry { if len(cur.Task.Chunks) > 0 { cur.Task.ID = id - id++ result = append(result, cur) } diff --git a/go/pserver/client.go b/go/pserver/client.go index bbe93cbb6b..d8c65b2e13 100644 --- a/go/pserver/client.go +++ b/go/pserver/client.go @@ -83,7 +83,7 @@ func (c *Client) monitorPservers(l Lister, pserverNum int) { } monitor() - for _ = range ticker.C { + for range ticker.C { monitor() } } diff --git a/go/pserver/service_test.go b/go/pserver/service_test.go index c40cecd0b6..175c3c3ad8 100644 --- a/go/pserver/service_test.go +++ b/go/pserver/service_test.go @@ -15,7 +15,7 @@ func TestFull(t *testing.T) { p.Name = "param_a" p.Content = []byte{1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0} p.ElementType = pserver.Int32 - err := s.InitParam(pserver.ParameterWithConfig{p, nil}, nil) + err := s.InitParam(pserver.ParameterWithConfig{Param: p, Config: nil}, nil) if err != nil { t.FailNow() } @@ -24,7 +24,7 @@ func TestFull(t *testing.T) { p1.Name = "param_b" p1.Content = []byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} p1.ElementType = pserver.Float32 - err = s.InitParam(pserver.ParameterWithConfig{p1, nil}, nil) + err = s.InitParam(pserver.ParameterWithConfig{Param: p1, Config: nil}, nil) if err != nil { t.FailNow() } @@ -95,13 +95,14 @@ func TestUninitialized(t *testing.T) { func TestBlockUntilInitialized(t *testing.T) { s := pserver.NewService() ch := make(chan struct{}, 2) + errCh := make(chan error, 2) var wg sync.WaitGroup wg.Add(1) go func() { var param pserver.Parameter err := s.GetParam("param_a", ¶m) if err != nil { - t.FailNow() + errCh <- err } wg.Done() ch <- struct{}{} @@ -111,7 +112,7 @@ func TestBlockUntilInitialized(t *testing.T) { go func() { err := s.Save("", nil) if err != nil { - t.FailNow() + errCh <- err } wg.Done() ch <- struct{}{} @@ -123,6 +124,8 @@ func TestBlockUntilInitialized(t *testing.T) { case <-ch: // some function returned before initialization is completed. 
t.FailNow() + case <-errCh: + t.FailNow() default: } @@ -130,7 +133,7 @@ func TestBlockUntilInitialized(t *testing.T) { p.Name = "param_a" p.Content = []byte{1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0} p.ElementType = pserver.Int32 - err := s.InitParam(pserver.ParameterWithConfig{p, nil}, nil) + err := s.InitParam(pserver.ParameterWithConfig{Param: p, Config: nil}, nil) if err != nil { t.FailNow() } From f6148eb24830ed9fab0f7ea18ea4437a006c692e Mon Sep 17 00:00:00 2001 From: Helin Wang Date: Mon, 12 Jun 2017 20:42:06 +0000 Subject: [PATCH 34/39] rename getChunks to readChunks --- go/master/service.go | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/go/master/service.go b/go/master/service.go index 3edbb7e9c0..6d6a4e30ab 100644 --- a/go/master/service.go +++ b/go/master/service.go @@ -99,7 +99,7 @@ func (s *Service) snapshot() error { return nil } -func getChunks(globPaths []string) ([]Chunk, error) { +func readChunks(globPaths []string) ([]Chunk, error) { var chunks []Chunk var paths []string @@ -161,7 +161,7 @@ func (s *Service) SetDataset(globPaths []string, dummy *int) error { return nil } - chunks, err := getChunks(globPaths) + chunks, err := readChunks(globPaths) if err != nil { return err } From 0bebaa05beda6aca1d9cbedc3fb87c9978cd7df6 Mon Sep 17 00:00:00 2001 From: Helin Wang Date: Tue, 13 Jun 2017 21:41:57 +0000 Subject: [PATCH 35/39] fix according to comments --- go/master/client.go | 2 + go/master/client_test.go | 15 ++++++ go/master/service.go | 110 +++++++++++++++++++++++---------------- go/pserver/client.go | 33 ++++++------ 4 files changed, 101 insertions(+), 59 deletions(-) diff --git a/go/master/client.go b/go/master/client.go index 791db5a975..20c66340dc 100644 --- a/go/master/client.go +++ b/go/master/client.go @@ -28,6 +28,8 @@ func NewClient(addr Addresser) *Client { func (c *Client) monitorMaster(addr Addresser) { lastMaster := "" monitor := func() { + // get the lastest address of the master server, + // connect to the new address once address changed. 
curMaster := addr.Address() if curMaster != lastMaster { if curMaster == "" { diff --git a/go/master/client_test.go b/go/master/client_test.go index 5abad0d820..df708ad791 100644 --- a/go/master/client_test.go +++ b/go/master/client_test.go @@ -11,6 +11,8 @@ import ( "testing" "time" + log "github.com/sirupsen/logrus" + "github.com/PaddlePaddle/Paddle/go/master" "github.com/PaddlePaddle/recordio" ) @@ -23,6 +25,8 @@ const ( var port int func init() { + log.SetLevel(log.ErrorLevel) + l, err := net.Listen("tcp", ":0") if err != nil { panic(err) @@ -91,6 +95,17 @@ func TestClientFull(t *testing.T) { t.Fatal(i, "should get error.") } + err = c.TaskFinished(tasks[0].ID) + if err != nil { + t.Fatal(err) + } + tasks = tasks[1:] + task, err := c.GetTask() + if err != nil { + t.Fatal(err) + } + tasks = append(tasks, task) + for _, task := range tasks { err = c.TaskFinished(task.ID) if err != nil { diff --git a/go/master/service.go b/go/master/service.go index 6d6a4e30ab..c2ab3cc6d8 100644 --- a/go/master/service.go +++ b/go/master/service.go @@ -2,12 +2,13 @@ package master import ( "errors" - "log" "os" "path/filepath" "sync" "time" + log "github.com/sirupsen/logrus" + "github.com/PaddlePaddle/recordio" ) @@ -112,7 +113,7 @@ func readChunks(globPaths []string) ([]Chunk, error) { } if len(paths) == 0 { - return nil, errors.New("no valid datset specified") + return nil, errors.New("no valid dataset specified") } for _, path := range paths { @@ -170,6 +171,7 @@ func (s *Service) SetDataset(globPaths []string, dummy *int) error { err = s.snapshot() if err != nil { + log.Errorln(err) return err } @@ -178,6 +180,43 @@ func (s *Service) SetDataset(globPaths []string, dummy *int) error { return nil } +func (s *Service) checkTimeoutFunc(taskID int, epoch int) func() { + return func() { + s.mu.Lock() + defer s.mu.Unlock() + + t, ok := s.taskQueues.Pending[taskID] + if !ok { + return + } + + if t.Epoch != epoch { + // new epoch, task launched after the + // schedule of this timeout check. + return + } + + defer func() { + err := s.snapshot() + if err != nil { + log.Errorln(err) + } + }() + + delete(s.taskQueues.Pending, t.Task.ID) + + t.NumTimeout++ + if t.NumTimeout > s.timeoutMax { + log.Warningf("Task %v failed %d times, discard.\n", t.Task, t.NumTimeout) + s.taskQueues.Failed = append(s.taskQueues.Failed, t.Task) + return + } + + log.Warningf("Task %v failed %d times, retry.\n", t.Task, t.NumTimeout) + s.taskQueues.Todo = append(s.taskQueues.Todo, t) + } +} + // GetTask gets a new task from the service. func (s *Service) GetTask(dummy int, task *Task) error { select { @@ -190,19 +229,25 @@ func (s *Service) GetTask(dummy int, task *Task) error { if len(s.taskQueues.Todo) == 0 { if len(s.taskQueues.Done) == 0 { if len(s.taskQueues.Pending) == 0 { - return errors.New("all task failed") + err := errors.New("all task failed") + log.Warningln(err) + return err } // TODO(helin): client need to retry in this // error case. Gotcha: RPC client can't // compare returned error with predefined - // errors like io.EOF. Because interface don't + // errors like io.EOF, because interface don't // have same dynamic value when in different - // process. - return errors.New("no more available task") + // process. So we need to figure out a way for + // client to check this error correctly. 
+ err := errors.New("no more available task") + log.Warningln(err) + return err } s.taskQueues.Todo = s.taskQueues.Done - s.taskQueues.Todo = nil + s.taskQueues.Done = nil + log.Infoln("No more todo task, but trainer is requesting task to do. Move all done task to todo.") } t := s.taskQueues.Todo[0] @@ -215,41 +260,9 @@ func (s *Service) GetTask(dummy int, task *Task) error { } *task = t.Task + log.Infof("Task #%d dispatched\n", task.ID) - time.AfterFunc(s.timeoutDur, func(taskID int, epoch int) func() { - return func() { - s.mu.Lock() - defer s.mu.Unlock() - - t, ok := s.taskQueues.Pending[taskID] - if !ok { - return - } - - if t.Epoch != epoch { - // new epoch, task launched after the - // schedule of this timeout check. - return - } - - defer func() { - err := s.snapshot() - if err != nil { - log.Println(err) - } - }() - - delete(s.taskQueues.Pending, t.Task.ID) - - t.NumTimeout++ - if t.NumTimeout > s.timeoutMax { - s.taskQueues.Failed = append(s.taskQueues.Failed, t.Task) - return - } - - s.taskQueues.Todo = append(s.taskQueues.Todo, t) - } - }(t.Task.ID, t.Epoch)) + time.AfterFunc(s.timeoutDur, s.checkTimeoutFunc(t.Task.ID, t.Epoch)) return nil } @@ -262,9 +275,13 @@ func (s *Service) TaskFinished(taskID int, dummy *int) error { s.mu.Lock() defer s.mu.Unlock() + log.Infof("Task %d finished\n", taskID) + t, ok := s.taskQueues.Pending[taskID] if !ok { - return errors.New("pending task not found") + err := errors.New("pending task not found") + log.Warningln(err) + return err } // task finished, reset timeout @@ -272,10 +289,15 @@ func (s *Service) TaskFinished(taskID int, dummy *int) error { s.taskQueues.Done = append(s.taskQueues.Done, t) delete(s.taskQueues.Pending, taskID) - if len(s.taskQueues.Pending) == 0 { + if len(s.taskQueues.Pending) == 0 && len(s.taskQueues.Todo) == 0 { + log.Infoln("No more todo and pending task, start a new pass.") s.taskQueues.Todo = append(s.taskQueues.Todo, s.taskQueues.Done...) s.taskQueues.Done = nil } - return s.snapshot() + err := s.snapshot() + if err != nil { + log.Errorln(err) + } + return err } diff --git a/go/pserver/client.go b/go/pserver/client.go index d8c65b2e13..afe1eecd01 100644 --- a/go/pserver/client.go +++ b/go/pserver/client.go @@ -57,26 +57,29 @@ func (c *Client) monitorPservers(l Lister, pserverNum int) { } for i := range lastServers { - if lastServers[i].Addr != curServers[i].Addr { - if curServers[i].Addr == "" { - err := c.pservers[i].Close() - if err != nil { - log.Println(err) - } - - continue - } + if lastServers[i].Addr == curServers[i].Addr { + continue + } - err := c.pservers[i].Connect(curServers[i].Addr) + if curServers[i].Addr == "" { + err := c.pservers[i].Close() if err != nil { log.Println(err) - - // connect to addr failed, set - // to last known addr in order - // to retry next time. - curServers[i].Addr = lastServers[i].Addr } + + continue } + + err := c.pservers[i].Connect(curServers[i].Addr) + if err != nil { + log.Println(err) + + // connect to addr failed, set + // to last known addr in order + // to retry next time. 
+ curServers[i].Addr = lastServers[i].Addr + } + } lastServers = curServers From 102e10afa148b0c509f76fd3f78d9df7b1aa71fe Mon Sep 17 00:00:00 2001 From: Helin Wang Date: Tue, 13 Jun 2017 22:21:36 +0000 Subject: [PATCH 36/39] improve comment --- go/master/service.go | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/go/master/service.go b/go/master/service.go index c2ab3cc6d8..1e2a34972b 100644 --- a/go/master/service.go +++ b/go/master/service.go @@ -237,10 +237,11 @@ func (s *Service) GetTask(dummy int, task *Task) error { // TODO(helin): client need to retry in this // error case. Gotcha: RPC client can't // compare returned error with predefined - // errors like io.EOF, because interface don't - // have same dynamic value when in different - // process. So we need to figure out a way for - // client to check this error correctly. + // errors like io.EOF, because the error + // instance deserialized from RPC is a + // different instance than the error defined + // in package. So we need to figure out a way + // for client to check this error correctly. err := errors.New("no more available task") log.Warningln(err) return err From 13867a0cb53f0169887c4dc6373a6c4a61986955 Mon Sep 17 00:00:00 2001 From: Helin Wang Date: Wed, 14 Jun 2017 01:08:03 +0000 Subject: [PATCH 37/39] fix build issue caused by rebase --- go/pserver/service_test.go | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/go/pserver/service_test.go b/go/pserver/service_test.go index 175c3c3ad8..b746d13e1c 100644 --- a/go/pserver/service_test.go +++ b/go/pserver/service_test.go @@ -79,7 +79,7 @@ func TestMultipleInit(t *testing.T) { } err = s.FinishInitParams(0, nil) - if err != pserver.ErrAlreadyInitialized { + if err.Error() != pserver.AlreadyInitialized { t.FailNow() } } @@ -87,7 +87,7 @@ func TestMultipleInit(t *testing.T) { func TestUninitialized(t *testing.T) { s := pserver.NewService() err := s.SendGrad(pserver.Gradient{}, nil) - if err != pserver.ErrUninitialized { + if err.Error() != pserver.Uninitialized { t.FailNow() } } From ebba2b139bec7fe44fb4d14032011271b68a3fe2 Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Wed, 14 Jun 2017 10:08:01 +0800 Subject: [PATCH 38/39] update code with new cclient --- go/pserver/cclient/test/main.c | 7 ++- go/pserver/cclient/test/test_cclient.c | 57 ++++++++++---------- paddle/trainer/NewRemoteParameterUpdater.cpp | 11 +--- paddle/trainer/NewRemoteParameterUpdater.h | 7 +-- 4 files changed, 36 insertions(+), 46 deletions(-) diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index c8aed0f2e8..d052f4f5a8 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -11,10 +11,9 @@ void sendGrads(paddle_pserver_client c) { unsigned char grad_a[2000] = {2}; unsigned char grad_b[3000] = {3}; - paddle_gradient grads[2] = { - {"param_a", PADDLE_ELEMENT_TYPE_FLOAT32, grad_a, 2000}, - {"param_b", PADDLE_ELEMENT_TYPE_FLOAT32, grad_b, 3000}}; - + paddle_gradient grad1 = {"param_a", PADDLE_ELEMENT_TYPE_FLOAT32, grad_a, 2000}; + paddle_gradient grad2 = {"param_b", PADDLE_ELEMENT_TYPE_FLOAT32, grad_b, 3000}; + paddle_gradient* grads[2] = {&grad1, &grad2}; if (paddle_send_grads(c, grads, 2)) { fail(); } diff --git a/go/pserver/cclient/test/test_cclient.c b/go/pserver/cclient/test/test_cclient.c index 9083064eee..6830479fe9 100644 --- a/go/pserver/cclient/test/test_cclient.c +++ b/go/pserver/cclient/test/test_cclient.c @@ -30,30 +30,36 @@ void print_parameter(paddle_gradient* param) { int main() { char addr[] 
= "localhost:3000"; - client c = paddle_new_pserver_client(addr, 1); + paddle_pserver_client c = paddle_new_pserver_client(addr, 1); char* names[] = {"param_a", "param_b"}; + retry: + printf("init parameter to pserver:\n"); + + real param_content1[] = {0.1, 0.2, 0.3}; + real param_content2[] = {0.4, 0.5, 0.6}; + paddle_parameter** params = + (paddle_parameter**)malloc(sizeof(paddle_parameter*) * 2); + params[0] = (paddle_parameter*)malloc(sizeof(paddle_parameter)); + params[0]->name = names[0]; + params[0]->content = (unsigned char*)param_content1; + params[0]->content_len = 3 * sizeof(real); + params[0]->element_type = PADDLE_ELEMENT_TYPE_FLOAT32; + + params[1] = (paddle_parameter*)malloc(sizeof(paddle_parameter)); + params[1]->name = names[1]; + params[1]->content = (unsigned char*)param_content2; + params[1]->content_len = 3 * sizeof(real); + params[1]->element_type = PADDLE_ELEMENT_TYPE_INT32; if (paddle_begin_init_params(c)) { - paddle_parameter param; - real param_content1[] = {0.1, 0.2, 0.3}; - param.element_type = PADDLE_ELEMENT_TYPE_FLOAT32; - param.name = names[0]; - param.content = (unsigned char*)param_content1; - param.content_len = 3 * sizeof(real); - if (paddle_init_param(c, param, NULL, 0) != 0) { + if (paddle_init_param(c, *params[0], NULL, 0) != 0) { goto retry; } - real param_content2[] = {0.4, 0.5, 0.6}; - param.element_type = PADDLE_ELEMENT_TYPE_INT32; - param.name = names[1]; - param.content = (unsigned char*)param_content2; - param.content_len = 3 * sizeof(real); - if (paddle_init_param(c, param, NULL, 0) != 0) { + if (paddle_init_param(c, *params[1], NULL, 0) != 0) { goto retry; } - if (paddle_finish_init_params(c) != 0) { goto retry; } @@ -61,13 +67,13 @@ retry: fail(); } - printf("get initialized parameters from pserver:\n"); - paddle_parameter* param_ptrs[2] = {NULL, NULL}; - if (paddle_get_params(c, names, param_ptrs, 2) != 0) { + printf("get inited parameters from pserver:\n"); + // get parameters again by reusing the allocated parameter buffers. + if (paddle_get_params(c, params, 2) != 0) { fail(); } - print_parameter(param_ptrs[0]); - print_parameter(param_ptrs[1]); + print_parameter(params[0]); + print_parameter(params[1]); printf("send gradient to pserver:\n"); real gradient_content1[] = {0.01, 0.02, 0.03}; @@ -87,6 +93,7 @@ retry: grads[1]->content_len = 3 * sizeof(real); grads[1]->element_type = PADDLE_ELEMENT_TYPE_INT32; + printf("print gradient sent to pserver:\n"); print_parameter(grads[0]); print_parameter(grads[1]); @@ -96,15 +103,11 @@ retry: printf("get updated parameters from pserver:\n"); // get parameters again by reusing the allocated parameter buffers. 
-  if (paddle_get_params(c, names, param_ptrs, 2) != 0) { fail(); }
-
-  print_parameter(param_ptrs[0]);
-  print_parameter(param_ptrs[1]);
-
-  paddle_release_param(param_ptrs[0]);
-  paddle_release_param(param_ptrs[1]);
+  if (paddle_get_params(c, params, 2) != 0) { fail(); }
+  print_parameter(params[0]);
+  print_parameter(params[1]);
  if (paddle_save_model(c, "/tmp/") != 0) {
    fail();
diff --git a/paddle/trainer/NewRemoteParameterUpdater.cpp b/paddle/trainer/NewRemoteParameterUpdater.cpp
index b3655d9d02..3d4d23afc7 100644
--- a/paddle/trainer/NewRemoteParameterUpdater.cpp
+++ b/paddle/trainer/NewRemoteParameterUpdater.cpp
@@ -25,7 +25,6 @@ NewRemoteParameterUpdater::NewRemoteParameterUpdater(
    : parameterClient_(-1),
      newParameters_(nullptr),
      newGradients_(nullptr),
-      names_(nullptr),
      pserverSpec_(pserverSpec) {}
void NewRemoteParameterUpdater::init(
@@ -41,12 +40,6 @@ void NewRemoteParameterUpdater::init(
  parameterClient_ = paddle_new_pserver_client((char *)pserverSpec_.c_str(), FLAGS_trainer_id == 0);
-  // init names_ for get parameter through paddle_cclient
-  names_ = (char **)malloc(parameterSize() * sizeof(char *));
-  for (int i = 0; i < parameterSize(); ++i) {
-    names_[i] = (char *)parameters_[i]->getName().c_str();
-  }
-
  // init new parameter and gradient.
  newParameters_ = initNewParameter(PARAMETER_VALUE);
  newGradients_ = initNewParameter(PARAMETER_GRADIENT);
@@ -68,7 +61,7 @@ void NewRemoteParameterUpdater::init(
    LOG(INFO) << "paddle_begin_init_params done";
  } else {
    paddle_get_params(
-        parameterClient_, names_, newParameters_, parameterSize());
+        parameterClient_, newParameters_, parameterSize());
  }
  LOG(INFO) << "NewRemoteParameterUpdater initialized";
@@ -80,7 +73,7 @@ void NewRemoteParameterUpdater::finishBatch(real cost) {
  // send gradient to parameter server.
  paddle_send_grads(parameterClient_, newGradients_, parameterSize());
  // get the updated parameter from parameterClient.
-  paddle_get_params(parameterClient_, names_, newParameters_, parameterSize());
+  paddle_get_params(parameterClient_, newParameters_, parameterSize());
  // clear gradient after update parameter.
  for (auto &para : parameters_) {
diff --git a/paddle/trainer/NewRemoteParameterUpdater.h b/paddle/trainer/NewRemoteParameterUpdater.h
index 1f22c15cef..f735185f62 100644
--- a/paddle/trainer/NewRemoteParameterUpdater.h
+++ b/paddle/trainer/NewRemoteParameterUpdater.h
@@ -32,9 +32,6 @@ public:
  NewRemoteParameterUpdater(const OptimizationConfig& config, const std::string pserverSpec);
  ~NewRemoteParameterUpdater() {
-    if (names_ != nullptr) { free(names_); }
    releaseNewParameter(newParameters_);
    releaseNewParameter(newGradients_);
    if (parameterClient_ >= 0) paddle_pserver_client_release(parameterClient_);
@@ -105,13 +102,11 @@ private:
protected:
  /// internal parameter client object for exchanging data with pserver
-  client parameterClient_;
+  paddle_pserver_client parameterClient_;
  /// the parameters for new pserver client
  paddle_parameter** newParameters_;
  /// the gradinets for new pserver client
  paddle_parameter** newGradients_;
  /// the names for new parameters.
- char** names_; /// the specification of parameter server "host1:port,host1:port" std::string pserverSpec_; }; From c093a2433600b7666fd8f46aca01f4d5e40b02f6 Mon Sep 17 00:00:00 2001 From: qiaolongfei Date: Wed, 14 Jun 2017 11:27:19 +0800 Subject: [PATCH 39/39] clang format check --- go/pserver/cclient/test/main.c | 9 ++++++--- go/pserver/cclient/test/test_cclient.c | 2 +- paddle/trainer/NewRemoteParameterUpdater.cpp | 3 +-- 3 files changed, 8 insertions(+), 6 deletions(-) diff --git a/go/pserver/cclient/test/main.c b/go/pserver/cclient/test/main.c index d052f4f5a8..07e1b86b43 100644 --- a/go/pserver/cclient/test/main.c +++ b/go/pserver/cclient/test/main.c @@ -11,8 +11,10 @@ void sendGrads(paddle_pserver_client c) { unsigned char grad_a[2000] = {2}; unsigned char grad_b[3000] = {3}; - paddle_gradient grad1 = {"param_a", PADDLE_ELEMENT_TYPE_FLOAT32, grad_a, 2000}; - paddle_gradient grad2 = {"param_b", PADDLE_ELEMENT_TYPE_FLOAT32, grad_b, 3000}; + paddle_gradient grad1 = { + "param_a", PADDLE_ELEMENT_TYPE_FLOAT32, grad_a, 2000}; + paddle_gradient grad2 = { + "param_b", PADDLE_ELEMENT_TYPE_FLOAT32, grad_b, 3000}; paddle_gradient* grads[2] = {&grad1, &grad2}; if (paddle_send_grads(c, grads, 2)) { fail(); @@ -76,7 +78,8 @@ retry: } } - for (int i = 0; i < 100; i++) { + int i; + for (i = 0; i < 100; i++) { sendGrads(c); getParams(c); } diff --git a/go/pserver/cclient/test/test_cclient.c b/go/pserver/cclient/test/test_cclient.c index 6830479fe9..0f9c2ef801 100644 --- a/go/pserver/cclient/test/test_cclient.c +++ b/go/pserver/cclient/test/test_cclient.c @@ -40,7 +40,7 @@ retry: real param_content1[] = {0.1, 0.2, 0.3}; real param_content2[] = {0.4, 0.5, 0.6}; paddle_parameter** params = - (paddle_parameter**)malloc(sizeof(paddle_parameter*) * 2); + (paddle_parameter**)malloc(sizeof(paddle_parameter*) * 2); params[0] = (paddle_parameter*)malloc(sizeof(paddle_parameter)); params[0]->name = names[0]; params[0]->content = (unsigned char*)param_content1; diff --git a/paddle/trainer/NewRemoteParameterUpdater.cpp b/paddle/trainer/NewRemoteParameterUpdater.cpp index 3d4d23afc7..f25ce2f7f0 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.cpp +++ b/paddle/trainer/NewRemoteParameterUpdater.cpp @@ -60,8 +60,7 @@ void NewRemoteParameterUpdater::init( paddle_finish_init_params(parameterClient_); LOG(INFO) << "paddle_begin_init_params done"; } else { - paddle_get_params( - parameterClient_, newParameters_, parameterSize()); + paddle_get_params(parameterClient_, newParameters_, parameterSize()); } LOG(INFO) << "NewRemoteParameterUpdater initialized";