!10519 Update the documentation of warmup_lr, F1, RMSProp and BatchNorm2d, and add links to pictures of the activation functions.

From: @wangshuide2020
Reviewed-by: @liangchenghui,@wuxuejian
Signed-off-by: @liangchenghui
pull/10519/MERGE
mindspore-ci-bot committed 4 years ago (via Gitee)
commit 4dfd143483

@@ -343,13 +343,12 @@ def warmup_lr(learning_rate, total_step, step_per_epoch, warmup_epoch):
Args:
learning_rate (float): The initial value of learning rate.
warmup_steps (int): The warm up steps of learning rate.
Inputs:
Tensor. The current step number.
total_step (int): The total number of steps.
step_per_epoch (int): The number of steps in each epoch.
warmup_epoch (int): A value that determines the number of epochs over which the learning rate is warmed up.
Returns:
Tensor. The learning rate value for the current step.
list[float]. The size of list is `total_step`.
Examples:
>>> learning_rate = 0.1
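
Since the hunk above only shows the corrected Args/Returns, here is a minimal, self-contained sketch of a linear per-epoch warmup that produces a list[float] of length `total_step`. It mirrors the documented signature and return type; the exact ramp MindSpore uses may differ slightly, so treat it as an illustration rather than the implementation.

import math

def warmup_lr_sketch(learning_rate, total_step, step_per_epoch, warmup_epoch):
    """Illustrative linear warmup: the rate ramps up epoch by epoch until it
    reaches `learning_rate`, then stays flat."""
    lr = []
    for step in range(total_step):
        current_epoch = math.floor(step / step_per_epoch)
        ratio = min(current_epoch, warmup_epoch) / warmup_epoch
        lr.append(learning_rate * ratio)
    return lr

print(warmup_lr_sketch(0.1, total_step=6, step_per_epoch=2, warmup_epoch=2))
# e.g. [0.0, 0.0, 0.05, 0.05, 0.1, 0.1]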

@@ -142,6 +142,9 @@ class ELU(Cell):
\text{alpha} * (\exp(x_i) - 1), &\text{otherwise.}
\end{cases}
The picture of the ELU function can be found at
`ELU <https://en.wikipedia.org/wiki/Activation_function#/media/File:Activation_elu.svg>`_.
Args:
alpha (float): The coefficient of negative factor whose type is float. Default: 1.0.
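
For reference, the ELU formula shown in this hunk can be reproduced in a few lines of NumPy; this is an illustration of the documented formula, not the MindSpore kernel.

import numpy as np

def elu(x, alpha=1.0):
    """ELU as defined above: identity for x >= 0, alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=np.float32)
    return np.where(x >= 0, x, alpha * (np.exp(x) - 1))

print(elu([-2.0, -1.0, 0.0, 2.0]))  # negative inputs saturate towards -alpha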
@@ -178,6 +181,9 @@ class ReLU(Cell):
element-wise :math:`\max(0, x)`; specifically, the neurons with negative output
will be suppressed and the active neurons will stay the same.
The picture of the ReLU function can be found at
`ReLU <https://en.wikipedia.org/wiki/Activation_function#/media/File:Activation_rectified_linear.svg>`_.
Inputs:
- **input_data** (Tensor) - The input of ReLU.
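
The element-wise :math:`\max(0, x)` described above amounts to the following one-liner (illustration only):

import numpy as np

def relu(x):
    """max(0, x): negative activations are suppressed, positive ones pass through."""
    return np.maximum(0, np.asarray(x, dtype=np.float32))

print(relu([-1.0, -0.5, 0.0, 2.0]))  # [0. 0. 0. 2.]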
@@ -335,6 +341,9 @@ class GELU(Cell):
:math:`GELU(x_i) = x_i*P(X < x_i)`, where :math:`P` is the cumulative distribution function
of the standard Gaussian distribution and :math:`x_i` is an element of the input.
The picture of the GELU function can be found at
`GELU <https://en.wikipedia.org/wiki/Activation_function#/media/File:Activation_gelu.png>`_.
Inputs:
- **input_data** (Tensor) - The input of GELU.
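
The GELU definition quoted above, x_i * P(X < x_i) with P the standard Gaussian CDF, can be written directly with the error function; again, an illustration rather than MindSpore's kernel.

import math

def gelu(x):
    """GELU(x) = x * Phi(x), with Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Unlike ReLU, small negative inputs are not zeroed out completely.
print([round(gelu(v), 4) for v in (-1.0, 0.0, 1.0)])  # [-0.1587, 0.0, 0.8413]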
@@ -410,6 +419,9 @@ class Sigmoid(Cell):
The Sigmoid function is defined as:
:math:`\text{sigmoid}(x_i) = \frac{1}{1 + \exp(-x_i)}`, where :math:`x_i` is an element of the input.
The picture of the Sigmoid function can be found at
`Sigmoid <https://en.wikipedia.org/wiki/Sigmoid_function#/media/File:Logistic-curve.svg>`_.
Inputs:
- **input_data** (Tensor) - The input of Sigmoid.
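
The sigmoid formula above squashes any real input into (0, 1); a quick illustration:

import math

def sigmoid(x):
    """sigmoid(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

print([round(sigmoid(v), 4) for v in (-2.0, 0.0, 2.0)])  # [0.1192, 0.5, 0.8808]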
@@ -448,6 +460,9 @@ class PReLU(Cell):
Parameter :math:`w` has dimensionality of the argument channel. If called without argument
channel, a single parameter :math:`w` will be shared across all channels.
The picture of the PReLU function can be found at
`PReLU <https://en.wikipedia.org/wiki/Activation_function#/media/File:Activation_prelu.svg>`_.
Args:
channel (int): The dimension of input. Default: 1.
w (float): The initial value of w. Default: 0.25.
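
The hunk above does not spell out the PReLU formula, so the sketch below assumes the standard definition, max(0, x) + w * min(0, x), with one learnable slope per channel of an NCHW input; it is an assumption based on the usual PReLU, not taken from this diff.

import numpy as np

def prelu(x, w):
    """PReLU: identity for positive inputs, a learnable per-channel slope w for negative ones."""
    x = np.asarray(x, dtype=np.float32)
    w = np.asarray(w, dtype=np.float32).reshape(1, -1, 1, 1)  # broadcast over N, H, W
    return np.maximum(x, 0) + w * np.minimum(x, 0)

x = np.random.randn(2, 3, 4, 4).astype(np.float32)   # NCHW input
out = prelu(x, w=[0.25, 0.25, 0.25])                  # initial slope 0.25 per channel
print(out.shape)  # (2, 3, 4, 4)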

@@ -340,6 +340,9 @@ class BatchNorm2d(_BatchNorm):
Note:
The implementation of BatchNorm is different in graph mode and pynative mode, therefore the mode cannot be
changed after the net is initialized.
Note that the formula for updating the running_mean and running_var is
:math:`\hat{x}_\text{new} = (1 - \text{momentum}) \times x_t + \text{momentum} \times \hat{x}`,
where :math:`\hat{x}` is the estimated statistic and :math:`x_t` is the new observed value.
Args:
num_features (int): `C` from an expected input of size (N, C, H, W).
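
The running-statistics note added above can be made concrete with a small NumPy sketch. The momentum convention and the NCHW layout are taken from the surrounding docstring; the helper itself is illustrative, not the BatchNorm2d implementation.

import numpy as np

def update_running_stats(x, running_mean, running_var, momentum=0.9):
    """One update of the running statistics, per channel of an NCHW batch."""
    batch_mean = x.mean(axis=(0, 2, 3))   # x_t: newly observed statistic
    batch_var = x.var(axis=(0, 2, 3))
    # x_hat_new = (1 - momentum) * x_t + momentum * x_hat
    new_mean = (1 - momentum) * batch_mean + momentum * running_mean
    new_var = (1 - momentum) * batch_var + momentum * running_var
    return new_mean, new_var

x = np.random.randn(8, 3, 16, 16).astype(np.float32)
mean, var = update_running_stats(x, np.zeros(3), np.ones(3))
print(mean.shape, var.shape)  # (3,) (3,)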

@@ -122,7 +122,7 @@ class Fbeta(Metric):
class F1(Fbeta):
r"""
Calculates the F1 score. F1 is a special case of Fbeta when beta is 1.
Refer to class `Fbeta` for more details.
Refer to :class:`mindspore.nn.Fbeta` for more details.
.. math::
F_1=\frac{2\cdot true\_positive}{2\cdot true\_positive + false\_negative + false\_positive}
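
The F1 formula above reduces to a one-liner; a quick sanity check on made-up counts:

def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FN + FP), i.e. Fbeta with beta = 1."""
    return 2 * tp / (2 * tp + fn + fp)

# e.g. 8 true positives, 2 false positives, 4 false negatives
print(f1_score(tp=8, fp=2, fn=4))  # 0.7272...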

@@ -42,13 +42,6 @@ class RMSProp(Optimizer):
"""
Implements Root Mean Squared Propagation (RMSProp) algorithm.
Note:
When separating parameter groups, the weight decay in each group will be applied on the parameters if the
weight decay is positive. When not separating parameter groups, the `weight_decay` in the API will be applied
on the parameters without 'beta' or 'gamma' in their names if `weight_decay` is positive.
To improve the performance of parameter groups, a customized order of parameters is supported.
Update `params` according to the RMSProp algorithm.
The equation is as follows:
@@ -88,6 +81,13 @@ class RMSProp(Optimizer):
:math:`\\eta` is the learning rate, represented by `learning_rate`. :math:`\\nabla Q_{i}(w)` is the gradient,
represented by `gradients`.
Note:
When separating parameter groups, the weight decay in each group will be applied on the parameters if the
weight decay is positive. When not separating parameter groups, the `weight_decay` in the API will be applied
on the parameters without 'beta' or 'gamma' in their names if `weight_decay` is positive.
To improve the performance of parameter groups, a customized order of parameters is supported.
Args:
params (Union[list[Parameter], list[dict]]): When the `params` is a list of `Parameter` which will be updated,
the element in `params` must be class `Parameter`. When the `params` is a list of `dict`, the "params",
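
The excerpt cuts off before RMSProp's actual equations, so the sketch below uses the textbook non-centered RMSProp rule as a stand-in; MindSpore's optimizer additionally supports a centered variant and momentum, which are not modelled here.

import numpy as np

def rmsprop_step(w, grad, mean_square, learning_rate=0.01, decay=0.9, epsilon=1e-10):
    """One non-centered RMSProp update:
        s_t = decay * s_{t-1} + (1 - decay) * g_t^2
        w_t = w_{t-1} - lr * g_t / (sqrt(s_t) + epsilon)
    """
    mean_square = decay * mean_square + (1 - decay) * grad ** 2
    w = w - learning_rate * grad / (np.sqrt(mean_square) + epsilon)
    return w, mean_square

w = np.array([1.0, -2.0])
s = np.zeros_like(w)
for _ in range(3):
    grad = 2 * w            # gradient of sum(w**2)
    w, s = rmsprop_step(w, grad, s)
print(w)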
