* Support LoDTensorArray as the input (x) of the stack API. test=develop
* Support LoDTensorArray as the input of the concat API. test=develop
* Add tests for stack/concat called in dygraph_to_static. test=develop
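For context, a minimal sketch of the 1-D semantics that concat and stack give a list of tensors — plain Python lists stand in for LoDTensorArray entries here; this is illustrative only, not Paddle's implementation:

```python
def concat(tensor_array):
    # concat joins the entries along the existing axis:
    # [[1, 2], [3, 4]] -> [1, 2, 3, 4]
    out = []
    for t in tensor_array:
        out.extend(t)
    return out

def stack(tensor_array):
    # stack adds a new leading axis instead:
    # [[1, 2], [3, 4]] -> [[1, 2], [3, 4]] with shape (2, 2)
    return [list(t) for t in tensor_array]

print(concat([[1, 2], [3, 4]]))  # [1, 2, 3, 4]
print(stack([[1, 2], [3, 4]]))   # [[1, 2], [3, 4]]
```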
1. Add basic support for `for in range` loop
2. Move `test_dygraph_to_static_*` to `dygraph_to_static` dir and rename them
3. Add test case for dict in while_loop
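A minimal sketch of the lowering this implies: a `for i in range(n)` loop is equivalent to a `while` loop over an explicit counter (illustrative only, not the actual transformer output):

```python
def sum_with_for(n):
    total = 0
    for i in range(n):
        total += i
    return total

def sum_with_while(n):
    # the same loop after a for-in-range -> while rewrite
    total = 0
    i = 0
    while i < n:
        total += i
        i += 1
    return total

assert sum_with_for(10) == sum_with_while(10) == 45
```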
1. copy.deepcopy in NameVisitor should be changed to copy.copy so that hashing and set membership work
2. read_context should be an instance such as gast.Load()/gast.AugLoad(), not the class gast.Load/gast.AugLoad
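Both points can be demonstrated with the stdlib `ast` module as a stand-in for gast (a sketch, not the NameVisitor code itself):

```python
import ast
import copy

node = ast.parse("x").body[0].value  # a Name node with a Load context

# copy.deepcopy creates brand-new objects, so identity-based hashing and
# set membership break; copy.copy keeps the nested nodes shared.
seen = {node.ctx}
assert copy.copy(node).ctx in seen          # shallow copy shares ctx
assert copy.deepcopy(node).ctx not in seen  # deep copy does not

# an expression context is an *instance* like ast.Load(),
# not the class object ast.Load itself
assert isinstance(node.ctx, ast.Load)
assert node.ctx is not ast.Load
```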
Add basic support for while in translating dygraph to static
1. Analyze variable liveness in the NameVisitor class
2. Replace the `while` keyword with the `while_loop` API
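As a rough guide to what the rewrite targets, here is an eager-mode sketch of the semantics a `while_loop(cond, body, loop_vars)`-style API provides (illustrative only; the real API builds a static-graph op):

```python
def while_loop(cond, body, loop_vars):
    # run body while cond holds; both take and return the loop vars
    while cond(*loop_vars):
        loop_vars = body(*loop_vars)
    return loop_vars

# `while i < 10: s += i; i += 1` expressed through the API:
i, s = while_loop(
    cond=lambda i, s: i < 10,
    body=lambda i, s: (i + 1, s + i),
    loop_vars=(0, 0),
)
print(i, s)  # 10 45
```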
* update ScopeBufferedSSAGraphExecutor, AsyncSSAGraphExecutor, ThreadedSSAGraphExecutor, FastThreadedSSAGraphExecutor, ParallelSSAGraphExecutor and ParallelExecutor to fetch unmerged results.
* add the unit test for fetch_unmerged.
* update the unit tests for multi-card and multi-CPU.
* add the error message and the user suggestion in FetchOpHandle. test=develop
* test ResNet decorated by dygraph_to_static_output in static mode. test=develop
* Follow review comments and change dygraph_to_static_output to dygraph_to_static_graph. test=develop
* Support fetch and run program in the process of dygraph_to_static_output. test=develop
* fix to_source(gast) and remove dygraph API such as Conv2D, Linear. test=develop
* Correct the CPU gradients of the argsort op and build a network to test its forward and backward passes, test=develop
* fix dynamic threshold error in test_argsort_op, test=develop
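The backward rule being tested can be sketched in pure Python: the gradient of the sorted output is scattered back through the sorting permutation to the positions the values came from (an illustrative sketch, not the op's kernel):

```python
def argsort(x):
    idx = sorted(range(len(x)), key=lambda i: x[i])
    return [x[i] for i in idx], idx

def argsort_grad(dout, idx):
    # scatter each upstream gradient back to the position its
    # value originally came from
    dx = [0.0] * len(idx)
    for pos, i in enumerate(idx):
        dx[i] = dout[pos]
    return dx

vals, idx = argsort([3.0, 1.0, 2.0])
print(vals, idx)  # [1.0, 2.0, 3.0] [1, 2, 0]
dx = argsort_grad([0.1, 0.2, 0.3], idx)
print(dx)  # [0.3, 0.1, 0.2]
```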
1. Considering functions, I have to change the node type from a single value to a set, because a Python function is allowed to return different types; the set represents all possible types.
2. I added scope_name and scope_type to AstVarScope because, inside Python functions, variables may have different scopes. For example:
```
a = 3
def foo(b):
    a = 9
    return a + b
```
The `a` inside `foo` is different from the `a` outside `foo`, similar to a class field. The scope_name lets me know the enclosing function's name when static analysis finds a `return` statement.
* Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator.
* Add a config for the DynamicAdjustChannelNum function to denote whether we discard the remaining instances when they cannot be distributed evenly.
* Remove the CPU code in Pull/PushSparse; we will add it back once it is fully tested.
* Fix some known issues, such as copying persistable vars after running one epoch.
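The discard-remainder behavior can be sketched as follows; `adjust_channels` and its parameter names are hypothetical stand-ins for the real DynamicAdjustChannelNum config, shown only to illustrate the even-split semantics:

```python
def adjust_channels(instances, num_channels, discard_remaining=True):
    # split instances evenly across channels; optionally drop the
    # remainder that cannot be distributed evenly
    per_channel = len(instances) // num_channels
    channels = [
        instances[i * per_channel:(i + 1) * per_channel]
        for i in range(num_channels)
    ]
    if not discard_remaining:
        for i, inst in enumerate(instances[num_channels * per_channel:]):
            channels[i].append(inst)
    return channels

print(adjust_channels(list(range(7)), 3))
# [[0, 1], [2, 3], [4, 5]] -- instance 6 is discarded
print(adjust_channels(list(range(7)), 3, discard_remaining=False))
# [[0, 1, 6], [2, 3], [4, 5]]
```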
* add partial_concat, test=develop
* fix the grids and blocks, test=develop
* fix the PADDLE_ENFORCE, test=develop
* fix the doc of op, test=develop
* fix the doc, test=develop
* fix the doc of the op, test=develop
* replace -1 with None, test=develop
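A sketch of the semantics implied above: partial_concat concatenates a slice of each input row, and the "-1 replaced with None" change makes None mean "to the end". This is an illustrative pure-Python model, not the op's kernel:

```python
def partial_concat(inputs, start_index=0, length=None):
    # concatenate the slice [start_index, start_index + length) of each
    # input; length=None (previously -1) means "to the end of the row"
    end = None if length is None else start_index + length
    out = []
    for x in inputs:
        out.extend(x[start_index:end])
    return out

print(partial_concat([[1, 2, 3], [4, 5, 6]], start_index=1))
# [2, 3, 5, 6]
print(partial_concat([[1, 2, 3], [4, 5, 6]], start_index=0, length=2))
# [1, 2, 4, 5]
```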
* support dygraph to static graph for simple case.
* add test for dygraph API recognition. test=develop
* support basic to_variable api. test=develop
* update dict: dygraph_class_to_static_api
* add all tests of dygraph api. test=develop
* use gast/astor instead of ast/codegen for the compatibility of PY2 and PY3. test=develop
* add arg 'num_flatten_dims' for fc ast node. test=develop
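The idea behind a mapping like dygraph_class_to_static_api can be sketched with a stdlib `ast.NodeTransformer` (stdlib `ast` stands in for gast/astor, `ast.unparse` needs Python 3.9+, and the mapping entries here are illustrative, not the real dict):

```python
import ast

# illustrative mapping, not the real dygraph_class_to_static_api dict
DYGRAPH_TO_STATIC = {"FC": "fc", "Conv2D": "conv2d"}

class ApiTransformer(ast.NodeTransformer):
    def visit_Name(self, node):
        # rename any call target found in the mapping
        if node.id in DYGRAPH_TO_STATIC:
            node.id = DYGRAPH_TO_STATIC[node.id]
        return node

tree = ast.parse("y = FC(x)")
new_tree = ast.fix_missing_locations(ApiTransformer().visit(tree))
print(ast.unparse(new_tree))  # y = fc(x)
```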
* Rename classes to CamelCase.
* support nested if/else
* support deriving the returned parameter list automatically
* polish the transform function of slice
* fix modifying x.numpy()[i] via the slice function
* support transforming an ast node into a callable function
* fix get_name_ids bug and add more unittest test=develop
* fix requirements.txt test=develop
* remove useless import statement test=develop
* Fix version compatibility issues in function parameters test=develop
* use a decorator to test ast_to_func test=develop
* add textwrap.dedent for source_code test=develop
* polish code comment
* fix compatibility with python2 and python3 test=develop
* fix gast version error test=develop
* fix gast repo test=develop
* polish transfer_from_node_type code test=develop
* add nested_if_else unittest test=develop
* split IfElseTransformer test=develop
* specify gast version test=develop
* fix ast_to_func root type test=develop
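The ast-to-callable step plus the textwrap.dedent fix mentioned above can be sketched like this (a minimal stand-in for the real ast_to_func, using stdlib ast only):

```python
import ast
import textwrap

def ast_to_func(source, func_name):
    # dedent so code extracted from an indented context still parses,
    # then compile the AST and pull the function out of a fresh namespace
    tree = ast.parse(textwrap.dedent(source))
    namespace = {}
    exec(compile(tree, filename="<ast>", mode="exec"), namespace)
    return namespace[func_name]

source = """
    def double(x):
        return x * 2
"""
double = ast_to_func(source, "double")
print(double(21))  # 42
```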
* improve the mul_mkldnn_op line coverage
test=develop
* remove fp32 mul mkldnn kernel
test=develop
* local refactoring
test=develop
* change according to reviews
test=develop
* add get delay for multiprocess data loader, test=develop
* add unittest for coverage ci, test=develop
* add timeout unittest, test=develop
* increase the delay time, test=develop
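The "get delay" amounts to reading from the loader's queue with a timeout instead of blocking forever; a sketch using a thread-safe `queue.Queue` as a stand-in for the inter-process queue (the function name is hypothetical):

```python
import queue

def get_with_delay(q, timeout=0.05):
    # wait up to `timeout` seconds for the next batch instead of
    # blocking forever when workers have produced nothing yet
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return None

q = queue.Queue()  # stands in for the loader's inter-process queue
q.put("batch-0")
print(get_with_delay(q))  # batch-0
print(get_with_delay(q))  # None (empty queue times out)
```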
* Add TopK Op Grad CPU&GPU Kernel test=develop
* Add TopK Op Grad, modify grad op maker test=develop
* Add TopK Op Grad, modify grad op maker test=develop
* Add TopK Op Grad, modify PADDLE_ENFORCE test=develop
* Add TopK Op Grad, modify PADDLE_THROW test=develop
* Add TopK Op Grad, modify unittest test=develop
* fix ngraph top k op unittest test=develop
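The top-k backward rule added above can be sketched in pure Python: upstream gradients flow only to the selected elements, everything else gets zero (illustrative only, not the CPU/GPU kernel):

```python
def topk(x, k):
    idx = sorted(range(len(x)), key=lambda i: x[i], reverse=True)[:k]
    return [x[i] for i in idx], idx

def topk_grad(dout, idx, input_len):
    # scatter the k upstream gradients to the selected positions;
    # unselected positions receive zero gradient
    dx = [0.0] * input_len
    for pos, i in enumerate(idx):
        dx[i] = dout[pos]
    return dx

vals, idx = topk([0.1, 0.9, 0.5, 0.7], k=2)
print(vals, idx)  # [0.9, 0.7] [1, 3]
dx = topk_grad([1.0, 2.0], idx, 4)
print(dx)  # [0.0, 1.0, 0.0, 2.0]
```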
* update the unittests of the elementwise_pow, elementwise_max, elementwise_min, scale and sqrt ops
1. update the unittests of elementwise_pow, elementwise_max and scale with the input data type changed from float32 to float64
2. fix the bug that elementwise_pow does not meet the threshold requirements when handling float64 data
3. remove sqrt from op_accuracy_white_list.py
4. update the unittests of the elementwise_pow, elementwise_max and elementwise_min ops so that their input data shapes exceed 100 elements
5. test=develop
* modify the writing style according to suggestions
test=develop
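A sketch of why float64 inputs matter for these threshold checks: gradient unittests typically compare against a central-difference numeric gradient, whose accuracy depends on the input precision (illustrative only, not the framework's checker):

```python
def numeric_grad(f, x, eps=1e-6):
    # central difference; its error shrinks with higher precision, which
    # is why float64 inputs let an op meet tighter thresholds than float32
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

g = numeric_grad(lambda v: v ** 3, 2.0)  # analytic gradient is 12.0
print(abs(g - 12.0) < 1e-4)  # True
```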
* Add support for dynamic_decode(while) training. test=develop
* Fix assign_op and tensor_array_read_write_op after solving conflict. test=develop
* Fix test_rnn_decode_api.py. test=develop
* Refine docs for apis in rnn.py. test=develop
* Adjust outputs of dynamic_decode. test=develop
* Remove the force_cpu update in assign_op. test=develop
* Remove the force_cpu update in assign_op. test=develop
* Make RNNCell.get_initial_states support batch_dim_idx argument. test=develop
* Rename _create_array_outof_while as _create_array_out_of_while in rnn.py.
test=develop
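For orientation, the overall shape of a dynamic decode loop can be sketched with a toy cell (hypothetical names and a greedy loop; the real dynamic_decode builds a while_loop over RNNCell states):

```python
def dynamic_decode(cell, initial_state, start_token, end_token, max_steps):
    # step the cell until it emits end_token or hits max_steps
    state, token, outputs = initial_state, start_token, []
    for _ in range(max_steps):
        token, state = cell(token, state)
        outputs.append(token)
        if token == end_token:
            break
    return outputs, state

# toy "cell": counts up and stops at 3
toy_cell = lambda token, state: (token + 1, state + 1)
outputs, steps = dynamic_decode(toy_cell, 0, 0, 3, max_steps=10)
print(outputs)  # [1, 2, 3]
print(steps)    # 3
```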
* support slice double grad, test=develop
* merge the two double grad op makers into one, test=develop
* change the input shape in slice op's unittest, test=develop
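The math behind the slice double grad can be sketched in pure Python: the backward of slice zero-pads the gradient, and the backward of that padding is simply slicing again (illustrative only, not the op's kernel):

```python
def slice_forward(x, start, end):
    return x[start:end]

def slice_grad(dout, start, end, input_len):
    # backward of slice: pad dout with zeros outside [start, end)
    return [0.0] * start + list(dout) + [0.0] * (input_len - end)

def slice_double_grad(ddx, start, end):
    # double grad of slice: the grad of zero-padding is the same slice
    return ddx[start:end]

dx = slice_grad([1.0, 2.0], 1, 3, 5)
print(dx)  # [0.0, 1.0, 2.0, 0.0, 0.0]
print(slice_double_grad([0.1, 0.2, 0.3, 0.4, 0.5], 1, 3))  # [0.2, 0.3]
```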
This PR provides a very basic and simple framework for transforming Dygraph to Static Graph.
API names and final outputs are not determined yet. Feel free to modify or add classes/functions/types if you find the framework is not extensible enough for you.
* Add the first implementation of fusion_group op #19621 (#3)
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop
* Disable for mac and windows.
test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop
* Refine the CUDA kernel to support large dims.
test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation of fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test.
test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop
* Disable fusion_group op for mac and windows.
test=develop
* Make the compiling of device code return status instead of hanging up.
test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names.
test=develop
* Add the check of CUDA driver library in unittest.
test=develop
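The caching idea behind DeviceCodePool can be sketched in Python (a toy model of the C++ class: compile once per place and kernel name, reuse afterwards; nvrtc does the real compiling):

```python
class DeviceCodePool:
    """Toy sketch: cache compiled device code per (place, kernel name)."""

    def __init__(self):
        self._pool = {}

    def get(self, place, name, compile_fn):
        key = (place, name)
        if key not in self._pool:
            # compile once (nvrtc in the real op), then reuse
            self._pool[key] = compile_fn()
        return self._pool[key]

calls = []
pool = DeviceCodePool()
compile_fn = lambda: calls.append("compiled") or "ptx-blob"
print(pool.get("gpu:0", "fused_kernel", compile_fn))  # ptx-blob
print(pool.get("gpu:0", "fused_kernel", compile_fn))  # ptx-blob
print(calls)  # ['compiled'] -- compiled exactly once
```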
* Enable generating code for a given subgraph. #21126 (#4)
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearrangement of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions.
test=develop
* Improve the loading statements in generated codes.
test=develop
* Remove unused arguments from formal list.
test=develop
* Enable the detection of subgraph of grad ops.
* Generate code for detected subgraph in fusion_group_pass.
* Add an option in BuildStrategy to enable fusion_group_pass and add unittest.
test=develop
* Fix a bug when checking whether the shapes of all inputs are the same.
* Add debug information.
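Sorting the subgraph so that every producer op precedes its consumers is a topological sort; a stdlib sketch (op names here are illustrative):

```python
from collections import deque

def topo_sort(nodes, edges):
    # edges: (producer, consumer) pairs; returns nodes in an order
    # where every producer precedes its consumers
    indegree = {n: 0 for n in nodes}
    succs = {n: [] for n in nodes}
    for a, b in edges:
        succs[a].append(b)
        indegree[b] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in succs[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    return order

print(topo_sort(["mul", "add", "relu"], [("mul", "add"), ("add", "relu")]))
# ['mul', 'add', 'relu']
```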
* Move subgraph_detector from inference/analysis to the common framework/ir directory. (#5)
test=develop
* Call subgraph_detector in fusion_group pass.
test=develop
* Disable fusion_group when WITH_GPU is OFF.
test=develop
* Refine all PADDLE_ENFORCE message.
test=develop
* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op.
test=develop
* Follow review comments.
test=develop
* Implement a common python unittest to test the ir passes.
test=develop
* Save the results in an np.array and support starting up on CPU.
test=develop
* Fix the unittest.
test=develop
* Add check_program to check whether the optimized program is different from the origin one.
test=develop
* Remove the interface all_ops.
test=develop
* Add exception test in pass_test.
test=develop
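The check_program idea mentioned above can be sketched roughly like this, comparing op lists before and after the pass (a toy model with illustrative op names, not the real pass_test helper):

```python
def check_program(origin_ops, optimized_ops, fused_op="fusion_group"):
    # the pass must have changed the program and introduced the fused op
    assert origin_ops != optimized_ops, "pass did not modify the program"
    assert fused_op in optimized_ops, "fused op missing after the pass"

check_program(
    origin_ops=["mul", "elementwise_add", "relu"],
    optimized_ops=["mul", "fusion_group"],
)
print("check passed")
```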