* add new API: MultiStepDecay, a new learning rate strategy, test=develop
* add base class LearningRateEpochDecay, plus MultiStepDecay and StepDecay, test=develop
* fix doc to add coverage,test=develop
* add new api: optimizer.set_lr, test=develop
* add API doc and example code for optimizer.set_lr,test=develop
* Modified doc to :api_attr: imperative,test=develop
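For readers unfamiliar with the two schedules named above, here is a minimal pure-Python sketch of the usual multi-step and step decay rules. The function names and parameters (`multi_step_lr`, `step_lr`, `decay_rate`) are illustrative assumptions, not the signatures of the Paddle classes added in these commits.

```python
def multi_step_lr(base_lr, milestones, epoch, decay_rate=0.1):
    """Learning rate at `epoch` under a multi-step schedule (illustrative).

    The base rate is multiplied by `decay_rate` once for every milestone
    epoch that has already been reached.
    """
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * decay_rate ** passed


def step_lr(base_lr, step_size, epoch, decay_rate=0.1):
    """Learning rate at `epoch` when decaying once every `step_size` epochs."""
    return base_lr * decay_rate ** (epoch // step_size)


if __name__ == "__main__":
    # The rate drops at epochs 30 and 60 and stays constant in between.
    for epoch in (0, 29, 30, 59, 60):
        print(epoch, multi_step_lr(0.5, [30, 60], epoch))
```

Both helpers are stateless functions of the epoch, which keeps the sketch easy to check by hand; a real scheduler would typically track the current epoch itself.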
Support various-length return grammar in Dy2stat. This PR is a follow-up to https://github.com/PaddlePaddle/Paddle/pull/25176 .
The basic idea is to pad each `return` statement with no-value placeholder variables so that all `return` statements have the same length, which gives the static graph a fixed fetch output (code in return_transformer.py). The placeholders are then removed when the dygraph result is finally returned (code in partial_program.py).
However, various-length return in the Bert model is still not supported. Dy2stat rewrites the code as intended, but some ops that check shapes at compile time (e.g. Reshape, MatMul) throw errors because a no-value placeholder may not have the required shape. Does this matter? In my view, the placeholders are replaced at run time by real values that meet the shape requirements, so the solution should be on the compile-time checking side. More generally, dynamic shapes often cause problems in dy2stat, and we should find a way to handle them in the future.
Fixing various-length return in Bert is on my TODO list, and I will also look for other existing models to verify against.
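The padding idea described above can be illustrated with a small pure-Python sketch. It is only an analogy: the sentinel `NO_VALUE` and the helpers `pad_returns` and `strip_placeholders` are hypothetical names, not the code in return_transformer.py or partial_program.py.

```python
class _NoValue:
    """Hypothetical sentinel standing in for a no-value placeholder variable."""

    def __repr__(self):
        return "<no value>"


NO_VALUE = _NoValue()


def pad_returns(values, target_len):
    """Pad a tuple of return values with placeholders up to `target_len`."""
    values = tuple(values)
    return values + (NO_VALUE,) * (target_len - len(values))


def strip_placeholders(padded):
    """Drop placeholders before handing the result back to the caller."""
    real = tuple(v for v in padded if v is not NO_VALUE)
    if len(real) == 0:
        return None
    if len(real) == 1:
        return real[0]
    return real


def fixed_length_version(x):
    # Both branches now return exactly two slots, so a static graph
    # could fetch a fixed number of outputs.
    if x > 0:
        return pad_returns((x,), 2)
    return pad_returns((x, -x), 2)


if __name__ == "__main__":
    print(fixed_length_version(3), strip_placeholders(fixed_length_version(3)))
    print(fixed_length_version(-2), strip_placeholders(fixed_length_version(-2)))
```

The sketch also hints at the compile-time problem mentioned above: the placeholder carries no shape, so anything that inspects it before it is stripped has nothing valid to check.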
This PR added basic support for 'return' grammar in dy2stat. It supports the control flow of 'return'.
The basic idea is to use a return-value variable to store the value of an early return statement, plus boolean state variables with if-else to skip the statements that follow the return statement.
**This PR provides only very basic support. There are some corner cases I haven't developed or tested**. For example, 'return None', 'return different length of variables', 'return non-tensor and tensor together', 'no return statement'. **These corner cases will be handled in my next PRs**. Target date is this week.
**Note**:
1. For the unit test, I changed test_program_translator.py because the StaticCode of `dyfunc_with_if_else` changes. To guarantee the correctness of `dyfunc_with_if_else`, I also run it in `TestRecursiveReturn` in test_return.py.
2. I commented out the early return code in bert_dygraph_model.py because 'return different length of variables' is not supported yet. I also know that some other models use early return and that we didn't enable them in the unit tests. I will add support in the next PRs and then re-enable those tests.
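To make the flag-based rewrite of `return` concrete, here is a hand-written before/after pair in plain Python. The variable names `return_flag` and `return_value` are assumptions for illustration; the names and exact shape of the code generated by dy2stat will differ.

```python
def original(x):
    # Early return in the middle of the function body.
    if x < 0:
        return -x
    x = x + 1
    return x * 2


def transformed(x):
    # Hypothetical hand-written equivalent with a single exit point.
    return_flag = False   # becomes True once a `return` has been "taken"
    return_value = None   # stores what that `return` would have returned

    if x < 0:
        return_flag = True
        return_value = -x

    # Statements after the early return run only if it was not taken.
    if not return_flag:
        x = x + 1

    if not return_flag:
        return_flag = True
        return_value = x * 2

    return return_value


if __name__ == "__main__":
    for v in (-3, 0, 4):
        print(v, original(v), transformed(v))
```

With a single exit point, every path through the function produces the same output variable, which is what a static graph needs.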
* add new api (set_global_initializer/reset_global_initializer),test=develop
* fix doc and example code of set_global_initializer,test=develop
* Temporarily allow the argument of append() to be a non-Tensor.
* Add Seq2Seq as ProgramTranslator Unit Test.
* set dtype of vocab_size_tensor to int64 to pass Windows-CI.
* Add a StatValue class in the backend to represent a stat.
* Add a singleton StatRegistry to maintain the collection of stats.
* For the sake of code neatness, we only support type of int and float, which can cover most of the scenarios.
* Support int and long: int or long -> six.integer_types.
* Modify test_tensor_shape: fix bug and modify comment.
* Support convert_var_shape to convert var.shape stmt
* Modify code in ifelse_simple_func.py because returning a non-Tensor from a Tensor-dependent 'if' statement is not currently supported.
* Convert the return variables of Tensor-dependent 'if' statements to Tensor if they are not already. test=develop
* Move function 'convert_len' to file convert_operators.py
* Support transforming for statements into while statements.
* Fix bug: raise None -> return None.
* Support variable loaded and created in loop.
* Use int64 in Py2 and Py3 in function to_static_variable.
* Support LoDTensorArray in reverse_op test=develop
* polish en doc and unittest code test=develop
* refine sample code test=develop
* add example of LoDTensorArray test=develop
* fix typo test=develop
* cast var in convert_logical_XX.
* Add convert_ifelse function in convert_operators.py
* Add logical_transformer. Remove LogicalTransformer from loop_transformer.py
* Revert modified tests in PR24799(convert_while_stmt).
* Comment and modify code that doesn't support `return` statement.
* Remove unnecessary class: MergeAssignTransformer, NodeTestTransformer and IfConditionVisitor in ifelse_transformer.
The random failure on Windows may be because some random gt_boxes cause certain values in YoloV3 to become negative, which leads to invalid memory access. This PR tries to fix that.
* Support return variable in only one of if body or else.
* remove after_visit in IfElseTransformer.
* Modify the result of get_name_ids in test_ifelse_basic.py
* Add unittest to test the new case.
* Modify code according to reviews.
* Support convert_while_loop.
* Comment out code in test_break_continue that relies on unsupported 'if' statements.
* Convert int into tensor to support 'if' stmt in for/while loop.
* Add unittest to test all cases of convert_logical_XX.
* Add unittest to test all cases of convert_while_loop.
* Fix bug in LogicalOpTransformer. test=develop
* add histc operator, test=develop
* update english doc to 2.0 API, test=develop
* update API from histc to histogram, test=develop
Co-authored-by: root <root@yq01-gpu-255-129-15-00.epc.baidu.com>
PR https://github.com/PaddlePaddle/Paddle/pull/24651 seems to cause a new random failure in the unit test test_parallel_executor_seresnext_base_cpu. The reason is that the smaller batch size makes the optimization of the neural network more random. I set separate batch sizes for CPU and GPU to fix the unit test.
DataLoader makes the data differ on a CUDA place even when the reader's data is the same. To pass the test, this PR does not use DataLoader; we will switch back to DataLoader after that is fixed.
As discussed with QA, we will use P4 machines for unit tests, and the GPUs on those machines may not have enough memory, which can cause "test_parallel_executor_seresnext_base_gpu" to fail. So I decreased the batch size.
In the past, test_cond failed with about 2% probability and the failure was easy to reproduce.
Now I have re-run it 300 times with no failure. If the 2% failure rate still held, the probability of 300 consecutive passes would be (1 - 2%) ^ 300 ~= 0.002. So we can say the random failure has disappeared. Maybe someone fixed some bugs in PE.
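The arithmetic behind that claim can be checked directly. This is a small sketch that assumes the runs are independent and that each run fails with the same probability; `prob_all_pass` is a hypothetical helper name.

```python
def prob_all_pass(failure_rate, runs):
    """Probability that `runs` independent runs all pass."""
    return (1.0 - failure_rate) ** runs


if __name__ == "__main__":
    p = prob_all_pass(0.02, 300)
    # Roughly 0.0023: 300 clean runs would be very unlikely
    # if the test still failed 2% of the time.
    print(f"{p:.4f}")
```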
* Refactor code for dump_field & dump_param: abstracting the common function in base class.
* Support dump randomly & random with lineid
* Support specifying the random interval, which avoids printing too many logs.
* Support to create LoDTensorArray in control flow (cond and while_loop)
* Fix bug: return LoDTensorArray in while_loop
* Change code in list_transformer.py to accommodate the new features.
* Make attrs of op slice/strided_slice compatible with both int32 and int64. test=develop
* Polish code in nn.py test=develop
* Fix bug: set the same dtype for the inputs of elementwise_add. test=develop
* Convert int32 to int64 in slice op to avoid data overflow. test=develop
* Convert int32 to int64 in strided_slice_op to avoid data overflow. test=develop
* add alias in paddle.nn and paddle.tensor test=develop
* add alias in paddle.nn and paddle.tensor dir test=develop
* fix some conflicts manually test=develop
* update fc and dygraph alias test=develop
* fix initializer.py typo test=develop
* error message of cross_entropy_op, test=develop
* fix bug : can't use platform::errors::InvalidArgument in HOSTDEVICE, test=develop
* fix bug: restore the check_variable_and_dtype for rank_loss and bpr_loss, test=develop
* Add hapi.text and corresponding unit test.
test=develop
* Remove hapi.text apis' reuse parameter args for coverage.
test=develop
* Fix TransformerCell and TransformerBeamSearchDecoder example codes.
test=develop
* Fix example codes in hapi.text.
test=develop
* Add some apis in hapi.text into example code white list.
test=develop
* Fix example code of DynamicDecode in hapi.text.
test=develop
* Rename Model.self as model in test_text.py
test=develop
* Merge hapi into Paddle
Hapi is a high level API for training and inference.
The main modules include Model, Loss, Metrics, Dataset.
Also includes common modules and models in NLP and computer vision, such as BERT, ResNet.
These modules are developed by:
0YuanZhang0, guoshengCS, heavengate, LielinJiang, qingqing01, xyzhou-puck, huangjun12, wangxiao1021, zhangyang.
* fix numpy ndarray mul var base error; test=develop
* add comment for __array_ufunc__ ; test=develop
* move unittest from imperative math op path to test_math_op_patch_var_base;
test=develop
* support training in static mode
* support to independent decorator
* remove in_dygraph_mode condition in ProgramTranslator
* fix import param_guard and add train/eval test=develop
* Modify into ShareVarsFromScope and rm __all__ in partial_program test=develop
1. To make ProgramTranslator support the `assert` grammar, this PR adds an `assert` Python API and the corresponding C++ code.
2. Fix a bug: graph_pattern_detector.h has #include <gtest/gtest_prod.h> but did not declare the dependency in CMakeLists, which can cause a single-target build to fail.
3. Refactor `Formatter` in print_op to make it reusable, and reuse it for printing in the assert op.
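Currently, while while_loop executes, the variables in loop_vars are copied in every iteration. However, variables of type LoDTensorArray have already been read/written (that is, updated) inside the while loop body, so copying them again is redundant. This PR therefore skips the per-iteration copy of LoDTensorArray variables in loop_vars.
Performance test on inference with the PaddleCV/ocr_recognition attention model:
|Metric|with this PR|without this PR|Improvement|
|---|---|---|---|
|Time|4957.4ms|4978.47ms|0.4%|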
* Replace dygraph_to_static_func with @declarative or program_translator.get_func in test_list.py
* Add comments in ConditionalBlock.
* Support list pop last item.
* Support pop the i-th item.
* Support an empty tensor array as Input in assign op and set the kernel type to float.