@@ -2317,19 +2317,28 @@ def layer_norm(input,
     Args:
         input(Variable): The input tensor variable.
         scale(bool): Whether to learn the adaptive gain :math:`g` after
-            normalization.
+            normalization. Default True.
         shift(bool): Whether to learn the adaptive bias :math:`b` after
-            normalization.
-        begin_norm_axis(bool): The normalization will be performed along
+            normalization. Default True.
+        begin_norm_axis(int): The normalization will be performed along
             dimensions from :attr:`begin_norm_axis` to :attr:`rank(input)`.
+            Default 1.
         epsilon(float): The small value added to the variance to prevent
-            division by zero.
+            division by zero. Default 1e-05.
         param_attr(ParamAttr|None): The parameter attribute for the learnable
-            gain :math:`g`.
+            gain :math:`g`. If :attr:`scale` is False, :attr:`param_attr` is
+            omitted. If :attr:`scale` is True and :attr:`param_attr` is None,
+            a default :code:`ParamAttr` would be added as scale. The
+            :attr:`param_attr` is initialized as 1 if it is added. Default None.
         bias_attr(ParamAttr|None): The parameter attribute for the learnable
-            bias :math:`b`.
+            bias :math:`b`. If :attr:`shift` is False, :attr:`bias_attr` is
+            omitted. If :attr:`shift` is True and :attr:`param_attr` is None,
+            a default :code:`ParamAttr` would be added as bias. The
+            :attr:`bias_attr` is initialized as 0 if it is added. Default None.
         act(str): Activation to be applied to the output of layer normalizaiton.
-        name (str): The name of this layer. It is optional.
+            Default None.
+        name(str): The name of this layer. It is optional. Default None, and a
+            unique name would be generated automatically.

     Returns:
         ${y_comment}
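To make the documented defaults concrete, here is a minimal NumPy sketch of what the arguments describe. It is not the Paddle kernel: the function name `layer_norm_reference` and its `gain` and `bias` parameters are hypothetical, and it assumes the usual layer-normalization formula (population mean and variance over the normalized dimensions). `gain` and `bias` stand in for the learnable :math:`g` and :math:`b`, which the docstring says are initialized to 1 and 0.

```python
import numpy as np


def layer_norm_reference(x, begin_norm_axis=1, epsilon=1e-05, gain=None, bias=None):
    """Illustrative layer normalization; a sketch, not Paddle's implementation."""
    x = np.asarray(x, dtype=np.float64)
    # Normalize over every dimension from begin_norm_axis up to rank(x).
    axes = tuple(range(begin_norm_axis, x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    # epsilon is added to the variance to prevent division by zero.
    y = (x - mean) / np.sqrt(var + epsilon)
    if gain is not None:  # corresponds to scale=True
        y = y * gain
    if bias is not None:  # corresponds to shift=True
        y = y + bias
    return y


x = np.arange(24).reshape(2, 3, 4)
y = layer_norm_reference(x)  # begin_norm_axis=1: statistics over the last two dims
```

With `begin_norm_axis=1` each of the two samples in `x` is normalized over its 3 x 4 trailing block; with `begin_norm_axis=2` each row of four values is normalized on its own.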