- The forward computation of an Operator is easy to verify because it has a clear definition. Backpropagation, however, is notoriously difficult to debug and get right:
  1. The backpropagation formula must be derived correctly from the forward computation.
  2. That formula must then be implemented correctly in C++.
  3. Test data is difficult to prepare.
- Automatic gradient checking computes a numeric gradient using only the forward Operator and uses it as a reference for the backward Operator's result. It has several advantages:
  1. The numeric gradient checker needs only the forward Operator.
  2. The user needs to prepare input data only for the forward Operator.
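
The idea above can be illustrated with a minimal, framework-free sketch. It assumes a toy "forward operator" that maps a list of floats to one float; `numeric_gradient`, `sum_of_squares`, and `sum_of_squares_grad` are hypothetical names chosen for this example and are not part of the real code base.

```python
def numeric_gradient(forward, x, delta=1e-5):
    """Approximate d forward(x) / d x[i] for every i by central differences.

    `forward` is a hypothetical stand-in for a forward Operator: it maps a
    list of floats to a single float.
    """
    grad = []
    for i in range(len(x)):
        original = x[i]
        x[i] = original + delta
        plus = forward(x)
        x[i] = original - delta
        minus = forward(x)
        x[i] = original  # restore the input before moving on
        grad.append((plus - minus) / (2.0 * delta))
    return grad


def sum_of_squares(x):
    """Toy forward computation: f(x) = sum(x_i ** 2)."""
    return sum(v * v for v in x)


def sum_of_squares_grad(x):
    """Hand-written backward computation for the toy operator: 2 * x_i."""
    return [2.0 * v for v in x]


x = [1.0, -2.0, 3.5]
numeric = numeric_gradient(sum_of_squares, list(x))
analytic = sum_of_squares_grad(x)

# The backward result is accepted if it stays close to the numeric reference.
max_abs_diff = max(abs(n - a) for n, a in zip(numeric, analytic))
```

Only the forward function and its input data are needed to produce the reference gradient. The hand-written backward function is the thing under test.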
## Mathematical Theory
The following two documents from Stanford explain in detail how to compute a numeric gradient and why it is useful.
- [Gradient checking and advanced optimization(en)](http://deeplearning.stanford.edu/wiki/index.php/Gradient_checking_and_advanced_optimization)
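
For quick reference, the two-sided (central) difference that numeric gradient checking usually relies on can be written as below. The symbols $f$, $x$, $e_i$, and $\delta$ are notation chosen here for illustration and are not taken from the linked documents.

$$
\frac{\partial f}{\partial x_i}(x) \approx \frac{f(x + \delta e_i) - f(x - \delta e_i)}{2\delta}
$$

Here $e_i$ is the $i$-th unit vector and $\delta$ is a small step size. The truncation error of this approximation is $O(\delta^2)$.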
## Numeric Gradient Implementation
### Python Interface
```python
def get_numeric_gradient(op,
                         input_values,
                         ...  # remaining parameters and body elided in the source
```
### Explanation:
- Why is `output_name` needed?
  - An Operator may have multiple outputs, and an independent gradient can be computed from each one. The user must therefore specify which output to use.
- Why is `input_to_check` needed?
  - An Operator may have multiple inputs. The gradient Operator can compute the gradients of all of them at once, but the numeric gradient must be computed one input at a time. `get_numeric_gradient` is therefore designed to compute the gradient for a single input; to check several inputs, call it once per input.
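
The two points above can be sketched with a toy operator that has two inputs and two outputs. This is a simplified illustration, not the real `get_numeric_gradient`: `numeric_gradient_for_input` and `dot_and_sum` are hypothetical, and the operator is assumed to map a dict of named float lists to a dict of named scalar outputs.

```python
def numeric_gradient_for_input(forward, inputs, output_name, input_to_check,
                               delta=1e-5):
    """Central-difference gradient of one scalar output w.r.t. one named input.

    Hypothetical stand-in, not the real `get_numeric_gradient`:
    `forward` maps a dict of named float lists to a dict of named floats.
    """
    values = inputs[input_to_check]
    grad = []
    for i in range(len(values)):
        original = values[i]
        values[i] = original + delta
        plus = forward(inputs)[output_name]
        values[i] = original - delta
        minus = forward(inputs)[output_name]
        values[i] = original  # restore the perturbed element
        grad.append((plus - minus) / (2.0 * delta))
    return grad


def dot_and_sum(inputs):
    """Toy operator with two inputs (X, Y) and two outputs (Dot, Sum)."""
    x, y = inputs["X"], inputs["Y"]
    return {
        "Dot": sum(a * b for a, b in zip(x, y)),
        "Sum": sum(x) + sum(y),
    }


inputs = {"X": [1.0, 2.0], "Y": [3.0, 4.0]}

# One output is selected, and the checker is called once per input.
grads = {
    name: numeric_gradient_for_input(dot_and_sum, inputs, "Dot", name)
    for name in ("X", "Y")
}
```

For the `Dot` output, the gradient with respect to `X` should be close to the values of `Y`, and the gradient with respect to `Y` close to the values of `X`. Choosing `Sum` as `output_name` would give a different gradient from the same inputs.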