Merge branch 'release/0.10.0' into release_note

release/0.10.0
Yu Yang 8 years ago committed by GitHub
commit a0fbc1e1c9

@ -7,6 +7,11 @@
* Support rectangle input for CNN.
* Support stride pooling for seqlastin and seqfirstin.
* Expose seq_concat_layer/seq_reshape_layer in `trainer_config_helpers`.
* Add dataset package
- CIFAR, MNIST, IMDB, WMT14, CONLL05, movielens, imikolov.
* Add Priorbox layer for Single Shot Multibox Detection.
* Add smooth L1 cost.
* Add data reader creator and data reader decorator for v2 API.
* Add the cpu implementation of cmrnorm-projection.
## Improvements
@ -19,6 +24,13 @@
* Reorganize the catalog of doc/ and refine several docs.
* Add Travis-CI for checking dead links.
* Add a example for explaining sparse_vector.
* Add Relu in layer_math.py
* Simplify data processing flow for quick start.
* Support CUDNN Deconv.
* Add data feeder for v2 API.
* Support predicting the samples from sys.stdin for sentiment demo.
* Provide multi-proccess interface for image preprocessing.
* Add benchmark document for v1 API.
* Add Relu in layer_math.py.
* Add packages for automatically downloading public datasets.
* Rename Argument::sumCost to Argument::sum since Argument does not have to have any relationship with cost.
@ -49,6 +61,9 @@
* Fix LogActivation which is not defined.
* Fix bug when run test_layerHelpers multiple times.
* Fix protobuf size limit on seq2seq demo.
* Fix bug for dataprovider converter in GPU mode.
* Fix bug in GatedRecurrentLayer which only occurs in predicting or `job=test` mode.
* Fix bug for BatchNorm when testing more than models in test mode.
* Fix unit test of paramRelu.
* Fix some warning about CpuSparseMatrix.
* Fix MultiGradientMachine error if trainer_count > batch_size.

@ -69,7 +69,8 @@ def gru_encoder_decoder(data_conf,
encoder_size=512,
decoder_size=512,
beam_size=3,
max_length=250):
max_length=250,
error_clipping=50):
"""
A wrapper for an attention version of GRU Encoder-Decoder network
is_generating: whether this config is used for generating
@ -90,9 +91,19 @@ def gru_encoder_decoder(data_conf,
input=src_word_id,
size=word_vector_dim,
param_attr=ParamAttr(name='_source_language_embedding'))
src_forward = simple_gru(input=src_embedding, size=encoder_size)
src_forward = simple_gru(
input=src_embedding,
size=encoder_size,
naive=True,
gru_layer_attr=ExtraLayerAttribute(
error_clipping_threshold=error_clipping))
src_backward = simple_gru(
input=src_embedding, size=encoder_size, reverse=True)
input=src_embedding,
size=encoder_size,
reverse=True,
naive=True,
gru_layer_attr=ExtraLayerAttribute(
error_clipping_threshold=error_clipping))
encoded_vector = concat_layer(input=[src_forward, src_backward])
with mixed_layer(size=decoder_size) as encoded_proj:
@ -117,11 +128,13 @@ def gru_encoder_decoder(data_conf,
decoder_inputs += full_matrix_projection(input=context)
decoder_inputs += full_matrix_projection(input=current_word)
gru_step = gru_step_layer(
gru_step = gru_step_naive_layer(
name='gru_decoder',
input=decoder_inputs,
output_mem=decoder_mem,
size=decoder_size)
size=decoder_size,
layer_attr=ExtraLayerAttribute(
error_clipping_threshold=error_clipping))
with mixed_layer(
size=target_dict_dim, bias_attr=True,

@ -2,7 +2,8 @@
============
.. toctree::
:maxdepth: 2
:maxdepth: 1
build_and_install/index_cn.rst
basic_usage/index_cn.rst
- `深度学习入门课程 <http://book.paddlepaddle.org/>`_

@ -2,7 +2,8 @@ GET STARTED
============
.. toctree::
:maxdepth: 2
:maxdepth: 1
build_and_install/index_en.rst
basic_usage/index_en.rst
- `Deep Learning 101 <http://book.paddlepaddle.org/index.en.html>`_

@ -19,18 +19,18 @@
在 PaddlePaddle中下面这些Layer能够接受双层序列作为输入完成相应的计算。
pooling_layer
==============
pooling
========
pooling_layer 的使用示例如下,详细见 :ref:`api_trainer_config_helpers_layers_pooling_layer` 配置API。
pooling 的使用示例如下,详细见 :ref:`api_v2.layer_pooling` 配置API。
.. code-block:: bash
seq_pool = pooling_layer(input=layer,
pooling_type=AvgPooling(),
agg_level=AggregateLevel.EACH_SEQUENCE)
seq_pool = pooling(input=layer,
pooling_type=pooling.Max(),
agg_level=AggregateLevel.EACH_SEQUENCE)
- `pooling_type` 目前支持两种,分别是:MaxPooling()和AvgPooling()。
- `pooling_type` 目前支持两种,分别是:pooling.Max()和pooling.Avg()。
- `agg_level=AggregateLevel.EACH_TIMESTEP` 时(默认值):
@ -47,7 +47,7 @@ pooling_layer 的使用示例如下,详细见 :ref:`api_trainer_config_helpers
last_seq 和 first_seq
=====================
last_seq 的使用示例如下( :ref:`api_trainer_config_helpers_layers_first_seq` 类似),详细见 :ref:`api_trainer_config_helpers_layers_last_seq` 配置API。
last_seq 的使用示例如下( :ref:`api_v2.layer_first_seq` 类似),详细见 :ref:`api_v2.layer_last_seq` 配置API。
.. code-block:: bash
@ -65,16 +65,16 @@ last_seq 的使用示例如下( :ref:`api_trainer_config_helpers_layers_first_
- 输入:必须是一个双层序列
- 输出一个单层序列其中每个元素是双层序列中每个subseq最后一个或第一个元素。
expand_layer
============
expand
======
expand_layer 的使用示例如下,详细见 :ref:`api_trainer_config_helpers_layers_expand_layer` 配置API。
expand 的使用示例如下,详细见 :ref:`api_v2.layer_expand` 配置API。
.. code-block:: bash
expand = expand_layer(input=layer1,
expand_as=layer2,
expand_level=ExpandLevel.FROM_TIMESTEP)
ex = expand(input=layer1,
expand_as=layer2,
expand_level=ExpandLevel.FROM_TIMESTEP)
- `expand_level=ExpandLevel.FROM_TIMESTEP` 时(默认值):

@ -4,7 +4,6 @@ RNN相关模型
.. toctree::
:maxdepth: 1
rnn_config_cn.rst
recurrent_group_cn.md
hierarchical_layer_cn.rst
hrnn_rnn_api_compare_cn.rst

@ -1,7 +1,2 @@
RNN Models
==========
.. toctree::
:maxdepth: 1
rnn_config_en.rst

File diff suppressed because it is too large Load Diff

@ -14,7 +14,7 @@
- [*PersistentVolume*](https://kubernetes.io/docs/user-guide/persistent-volumes/): 和[*PersistentVolumeClaim*](https://kubernetes.io/docs/user-guide/persistent-volumes/#persistentvolumeclaims)结合将外部的存储服务在Kubernetes中描述成为统一的资源形式便于存储资源管理和Pod引用。
# 部署Kubernetes集群
## 部署Kubernetes集群
Kubernetes提供了多种集群部署的方案本文档内不重复介绍。这里给出集中常见的部署方法
@ -25,7 +25,7 @@ Kubernetes提供了多种集群部署的方案本文档内不重复介绍。
可以参考[这个表格](https://kubernetes.io/docs/getting-started-guides/#table-of-solutions)选择适合您的场景的合适方案。
# 选择存储方案
## 选择存储方案
容器不会保留在运行时生成的数据job或者应用程序在容器中运行时生成的数据会在容器销毁时消失。为了完成分布式机器学习训练任务需要有一个外部的存储服务来保存训练所需数据和训练输出。
常见的可选存储服务包括:
@ -35,9 +35,9 @@ Kubernetes提供了多种集群部署的方案本文档内不重复介绍。
- [*Ceph*](http://docs.ceph.com/docs/master/): 分布式文件系统支持rbdPOSIX API接口(ceph fs)和对象存储API参考[这里](https://kubernetes.io/docs/user-guide/volumes/#rbd)。
- [*MooseFS*](https://moosefs.com/documentation.html): 一个分布式的存储系统。需要先挂载到服务器Node上再通过kubernetes hostPath Volume挂载到容器中。
# 配置kubectl
## 配置kubectl
## 安装kubectl
### 安装kubectl
```
# OS X
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl
@ -49,7 +49,7 @@ curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s htt
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/windows/amd64/kubectl.exe
```
## 配置kubectl访问你的kubernetes集群
### 配置kubectl访问你的kubernetes集群
编辑`~/.kube/config`这个配置文件,修改`Master-IP`的地址。如果使用SSL认证则需要配置`certificate-authority`和`users`中的用户证书。如果是使用非SSL方式访问比如通过8080端口也可以去掉这些证书的配置。
```

@ -5,7 +5,6 @@ PaddlePaddle 文档
:maxdepth: 1
getstarted/index_cn.rst
tutorials/index_cn.md
howto/index_cn.rst
api/index_cn.rst
faq/index_cn.rst

@ -5,8 +5,6 @@ PaddlePaddle Documentation
:maxdepth: 1
getstarted/index_en.rst
tutorials/index_en.md
howto/index_en.rst
api/index_en.rst
about/index_en.rst

@ -114,10 +114,7 @@
</ul>
</div>
<ul class="site-page-links">
<li><a>Home</a></li>
<li><a>Get Started</a></li>
<li class="active"><a>Documentation</a></li>
<li><a>About Us</a></li>
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
@ -137,7 +134,7 @@
{{ toctree }}
{% endblock %}
</nav>
{% if toc %}
{% if False %}
<nav class="local-toc">{{ toc }}</nav>
{% endif %}
<section class="doc-content-wrap">
@ -168,7 +165,8 @@
VERSION:'{{ release|e }}',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'{{ '' if no_search_suffix else file_suffix }}',
HAS_SOURCE: {{ has_source|lower }}
HAS_SOURCE: {{ has_source|lower }},
SOURCELINK_SUFFIX: ".txt",
};
</script>
{%- for scriptfile in script_files %}

@ -48,8 +48,7 @@ lstm = lstmemory_group(
size=hidden_dim,
act=TanhActivation(),
gate_act=SigmoidActivation(),
state_act=TanhActivation(),
lstm_layer_attr=ExtraLayerAttribute(error_clipping_threshold=50))
state_act=TanhActivation())
lstm_last = last_seq(input=lstm)

@ -51,8 +51,7 @@ def lstm_group(lstm_group_input):
size=hidden_dim,
act=TanhActivation(),
gate_act=SigmoidActivation(),
state_act=TanhActivation(),
lstm_layer_attr=ExtraLayerAttribute(error_clipping_threshold=50))
state_act=TanhActivation())
return lstm_output

@ -60,6 +60,7 @@ function deploy_docs() {
deploy_docs "master" "."
deploy_docs "develop" "./develop/"
deploy_docs "release/0.10.0" "./release/0.10.0/"
# Check is there anything changed.
set +e

@ -17,14 +17,17 @@ add_test(NAME test_Trainer
WORKING_DIRECTORY ${PROJ_ROOT}/paddle/)
############### test_TrainerOnePass ##########################
add_unittest_without_exec(test_TrainerOnePass
test_TrainerOnePass.cpp)
add_test(NAME test_TrainerOnePass
COMMAND ${PROJ_ROOT}/paddle/.set_python_path.sh -d
${PROJ_ROOT}/python/:${PROJ_ROOT}/paddle/trainer/tests
${PROJ_ROOT}/paddle/.set_port.sh -p port ${CMAKE_CURRENT_BINARY_DIR}/test_TrainerOnePass
WORKING_DIRECTORY ${PROJ_ROOT}/paddle/)
if(WITH_PYTHON)
# only run test_TrainerOnePass when PYTHON is enabled, because train one pass
# is using PyDataProvider2.
add_unittest_without_exec(test_TrainerOnePass
test_TrainerOnePass.cpp)
add_test(NAME test_TrainerOnePass
COMMAND ${PROJ_ROOT}/paddle/.set_python_path.sh -d
${PROJ_ROOT}/python/:${PROJ_ROOT}/paddle/trainer/tests
${PROJ_ROOT}/paddle/.set_port.sh -p port ${CMAKE_CURRENT_BINARY_DIR}/test_TrainerOnePass
WORKING_DIRECTORY ${PROJ_ROOT}/paddle/)
endif()
################ test_CompareTwoNets ######################
add_unittest_without_exec(test_CompareTwoNets
test_CompareTwoNets.cpp)

@ -24,9 +24,12 @@ add_custom_target(paddle_python ALL DEPENDS
${OUTPUT_DIR}/.timestamp)
add_subdirectory(paddle/trainer_config_helpers/tests)
add_subdirectory(paddle/v2/tests)
add_subdirectory(paddle/v2/reader/tests)
add_subdirectory(paddle/v2/plot/tests)
if (WITH_SWIG_PY)
# enable v2 API unittest only when paddle swig api is compiled
add_subdirectory(paddle/v2/tests)
add_subdirectory(paddle/v2/reader/tests)
add_subdirectory(paddle/v2/plot/tests)
endif()
install(DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/dist/
DESTINATION opt/paddle/share/wheels

@ -208,12 +208,15 @@ class ExtraLayerAttribute(object):
drop_rate=None,
device=None):
self.attr = dict()
if isinstance(error_clipping_threshold, float):
assert error_clipping_threshold > 0
self.attr["error_clipping_threshold"] = error_clipping_threshold
if isinstance(drop_rate, float):
assert drop_rate > 0
if error_clipping_threshold is not None:
error_clipping_threshold = float(error_clipping_threshold)
if error_clipping_threshold < 0:
raise ValueError("Error clipping must > 0")
self.attr['error_clipping_threshold'] = error_clipping_threshold
if drop_rate is not None:
drop_rate = float(drop_rate)
if drop_rate < 0:
raise ValueError("Dropout rate must > 0")
self.attr["drop_rate"] = drop_rate
if isinstance(device, int):

@ -84,6 +84,7 @@ __all__ = [
'GeneratedInput',
'SubsequenceInput',
'gru_step_layer',
'gru_step_naive_layer',
'recurrent_layer',
'BaseGeneratedInput',
'conv_operator',
@ -2284,7 +2285,7 @@ def img_pool_layer(input,
type_name = pool_type.name + '-projection' \
if (
isinstance(pool_type, AvgPooling) or isinstance(pool_type, MaxPooling)) \
isinstance(pool_type, AvgPooling) or isinstance(pool_type, MaxPooling)) \
else pool_type.name
pool_size_y = pool_size if pool_size_y is None else pool_size_y
@ -3084,6 +3085,78 @@ def gru_step_layer(input,
activation=act)
@wrap_bias_attr_default()
@wrap_param_attr_default()
@wrap_act_default(param_names=['gate_act'], act=SigmoidActivation())
@wrap_act_default(act=TanhActivation())
@wrap_name_default('gru_step_naive')
@layer_support(ERROR_CLIPPING, DROPOUT)
def gru_step_naive_layer(input,
output_mem,
size=None,
name=None,
act=None,
gate_act=None,
bias_attr=None,
param_attr=None,
layer_attr=None):
"""
GRU Step Layer, but using MixedLayer to generate. It support ERROR_CLIPPING
and DROPOUT.
:param input:
:param output_mem:
:param size:
:param name:
:param act:
:param gate_act:
:param bias_attr:
:param param_attr:
:param layer_attr:
:return:
"""
if input.size % 3 != 0:
raise ValueError("GruStep input size must be divided by 3")
if size is None:
size = input.size / 3
def __gate__(gate_name, offset):
with mixed_layer(
name=name + "_" + gate_name,
size=size,
layer_attr=layer_attr,
bias_attr=bias_attr,
act=gate_act) as gate:
gate += identity_projection(input=input, offset=offset)
gate += full_matrix_projection(
input=output_mem, param_attr=param_attr)
return gate
update_gate = __gate__("update", 0)
reset_gate = __gate__("reset", size)
with mixed_layer(
name=name + "_reset_output", bias_attr=False) as reset_output:
reset_output += dotmul_operator(a=output_mem, b=reset_gate)
with mixed_layer(
name=name + "_output_candidate",
size=size,
layer_attr=layer_attr,
bias_attr=bias_attr,
act=act) as output_candidate:
output_candidate += identity_projection(input=input, offset=2 * size)
output_candidate += full_matrix_projection(
input=reset_output, param_attr=param_attr)
with mixed_layer(name=name) as output:
output += identity_projection(output_mem)
output += dotmul_operator(a=output_mem, b=update_gate, scale=-1.0)
output += dotmul_operator(a=output_candidate, b=update_gate)
return output
@wrap_name_default()
@layer_support()
def get_output_layer(input, arg_name, name=None, layer_attr=None):

@ -825,7 +825,8 @@ def gru_unit(input,
gru_param_attr=None,
act=None,
gate_act=None,
gru_layer_attr=None):
gru_layer_attr=None,
naive=False):
"""
Define calculations that a gated recurrent unit performs in a single time
step. This function itself is not a recurrent layer, so that it can not be
@ -857,7 +858,12 @@ def gru_unit(input,
out_mem = memory(name=name, size=size)
gru_out = gru_step_layer(
if naive:
__step__ = gru_step_naive_layer
else:
__step__ = gru_step_layer
gru_out = __step__(
name=name,
input=input,
output_mem=out_mem,
@ -879,7 +885,8 @@ def gru_group(input,
gru_param_attr=None,
act=None,
gate_act=None,
gru_layer_attr=None):
gru_layer_attr=None,
naive=False):
"""
gru_group is a recurrent layer group version of Gated Recurrent Unit. It
does exactly the same calculation as the grumemory layer does. A promising
@ -928,7 +935,8 @@ def gru_group(input,
gru_param_attr=gru_param_attr,
act=act,
gate_act=gate_act,
gru_layer_attr=gru_layer_attr)
gru_layer_attr=gru_layer_attr,
naive=naive)
return recurrent_group(
name='%s_recurrent_group' % name,
@ -949,7 +957,8 @@ def simple_gru(input,
gru_param_attr=None,
act=None,
gate_act=None,
gru_layer_attr=None):
gru_layer_attr=None,
naive=False):
"""
You maybe see gru_step_layer, grumemory in layers.py, gru_unit, gru_group,
simple_gru in network.py. The reason why there are so many interfaces is
@ -1018,7 +1027,8 @@ def simple_gru(input,
gru_param_attr=gru_param_attr,
act=act,
gate_act=gate_act,
gru_layer_attr=gru_layer_attr)
gru_layer_attr=gru_layer_attr,
naive=naive)
@wrap_name_default('simple_gru2')

@ -320,6 +320,7 @@ layers {
}
}
drop_rate: 0.5
error_clipping_threshold: 40.0
}
parameters {
name: "___embedding_0__.w0"

@ -356,6 +356,9 @@ def mixed(size=0,
return MixedLayerV2(size, input, name, act, bias_attr, layer_attr)
mixed.__doc__ = conf_helps.mixed_layer.__doc__
class RecurrentLayerInput(Layer):
def __init__(self, recurrent_name, index, parent_layers):
parents_len = len(parent_layers)
@ -404,6 +407,8 @@ data.__name__ = 'data'
AggregateLevel = conf_helps.layers.AggregateLevel
ExpandLevel = conf_helps.layers.ExpandLevel
memory = MemoryV2
memory.__name__ = 'memory'
memory.__doc__ = conf_helps.memory.__doc__
def __layer_name_mapping__(inname):
@ -512,6 +517,9 @@ def recurrent_group(step, input, name=None):
return retv
recurrent_group.__doc__ = conf_helps.recurrent_group.__doc__
@wrap_name_default()
def beam_search(step,
input,
@ -579,6 +587,8 @@ def beam_search(step,
return tmp
beam_search.__doc__ = conf_helps.beam_search.__doc__
__projection_names__ = filter(lambda x: x.endswith('_projection'),
dir(conf_helps))

Loading…
Cancel
Save