* reopen python coverage --include for test, test=develop
* if no .py file is modified, do not use coverage run, test=develop
* remove test code, test=develop
* add WITH_INCREMENTAL_COVERAGE, test=develop
* refine if else, test=develop
1. Type of index: int, slice (step must be 1).
2. Type of value:
(1) int32, int64, float32, bool;
(2) numpy.array(int32, int64, float32, bool); (note: float64 is not supported)
(3) paddle.Tensor(int32, int64, float32, float64, bool);
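As a sketch, the index/value combinations listed above behave like their numpy counterparts (numpy is used here purely for illustration; Paddle's `Tensor.__setitem__` is the real subject):

```python
import numpy as np

x = np.zeros([4, 4], dtype=np.float32)

# 1. int index
x[0] = 1.0
# 1. slice index -- the rules above require step == 1
x[1:3] = 2.0

# 2. value may be a scalar of a supported dtype, or a numpy array
# (float64 arrays are noted above as unsupported)
x[3] = np.ones([4], dtype=np.float32)

assert float(x[0, 0]) == 1.0
assert float(x[2, 1]) == 2.0
assert float(x[3, 3]) == 1.0
```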
* add heter box
* add trainer, worker, wrapper...
* format
* for ci
* format
* remove boost get
* boost & copyright
* rename
* rename
* format
* format
* format
Co-authored-by: yaoxuefeng6 <yaoxuefeng@baidu.com>
* add conj op for complex types
* add conj for complex types
* add more test case
* add conj_op test
* modify conj api and impl
* add complex type for fill_constant_op xpu
* add setConstant for complex type
* remove complex conj test file
* use user-defined grad for test_conj_op
* add test case for static mode of conj api
* modify conj doc
* change input args name to x
* remove useless codes
* conj support real types
* add conj test case for real number
* delete no-need-to-calculate inputs in dygraph op_test
* delete no-need-to-calculate inputs in dygraph op_test
* modify grad of mul for complex types
* fix the grads of inputs args order not match bug
* merge amp-related functions in Momentum from paddle.fluid.contrib.optimizer into paddle.optimizer.
* Add unittest for 2.0 Momentum API.
* fix some bugs in weight_decay.
Before this commit, test_slice used the old API `dygraph_to_static_func` for dynamic-to-static conversion and used Executor explicitly, which is not recommended for users.
After the fix, it uses the recommended API `paddle.jit.to_static` to replace `dygraph_to_static_func`, which won't trigger the random exception on coverage CI.
* add conj op for complex types
* add conj for complex types
* add more test case
* add conj_op test
* modify conj api and impl
* add complex type for fill_constant_op xpu
* add setConstant for complex type
* remove complex conj test file
* use user-defined grad for test_conj_op
* add test case for static mode of conj api
* modify conj doc
* change input args name to x
* remove useless codes
* conj support real types
* add conj test case for real number
* add complex real op & api & unittest
* add imag op & api & unittest
* refactor op impl
* revert simplified writing due to compile failure
* polish details
* polish grad op code
* fix expand && concat/transpose to new api
* update xpu_header
* update activation op on kunlun
* update activation op on kunlun
* update activation op on kunlun
* update activation op on kunlun
* update activation op on kunlun
* add nearest_interp on kunlun
* update error message
* newly added UTs should not exceed 15s
* fix error
* the 15s UT limit check is executed first
* fix error
* fix error with CI_SKIP_CPP_TEST
* modified timeout setting
* fix error
1. Fix an error in _build_cond_stmt for for-range statements.
2. Support negative step values in for-range statements.
3. Fix code to handle the differences between Py2 and Py3.
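The negative-step fix in item 2 can be illustrated in plain Python: the loop condition built for a for-range statement must depend on the sign of the step (the helper names below are illustrative, not Paddle's actual functions):

```python
def build_cond(stop, step):
    # For a positive step the loop runs while i < stop;
    # for a negative step it must run while i > stop.
    if step > 0:
        return lambda i: i < stop
    return lambda i: i > stop

def run_range(start, stop, step):
    i, out = start, []
    cond = build_cond(stop, step)
    while cond(i):
        out.append(i)
        i += step
    return out

# Matches Python's range() for both step signs
assert run_range(0, 5, 1) == list(range(0, 5, 1))
assert run_range(5, 0, -2) == list(range(5, 0, -2))
```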
Fix 3 Windows Unittests
test_fuse_all_reduce_pass: Paddle cannot run multiple-GPU on Windows so set single visible GPU flag
test_feed_data_check_shape_type: Paddle cannot run multiple-GPU on Windows so set single visible GPU flag
test_tsm: Windows GPU memory is not enough, so decrease batch size and data size.
* Fix a bug when running on an operating system without "bash."
* add execution condition
* for ci-coverage
* get cpu information to check the precision problem
* Update compilation environment for musl version
* update dependencies
* remove test code
check cpu info
remove test code
review
* update alpine and third_party dependencies
* add newline for ci Code format
* Fix api docs in RNN, Transformer, layer_norm, WeightNormParamAttr.
test=develop
* Fix api doc for print in label_smooth.
test=develop
* Update api docs according to review comments.
Add name argument in RNN back.
test=develop
* add complex64 and complex128 types; add +-*/@ and slice operators for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
* kron, reshape, transpose support complex types
* sum and trace op support complex types
* add test case of sum and trace op
* fix the bug that the imaginary part of complex tensors was not initialized
* format file
* format code style
* kron support type promotion; modify test cases
* basic impl of type promote
* add comment & another testcase
* fix complex bugs & support python op promote type
* fix failed unittests & polish code
* add unittest for coverage
* change to only promote complex type
* polish code details
* polish several comments
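The "only promote complex type" rule above can be illustrated with numpy's analogous promotion behavior (numpy is only a stand-in for illustration here; Paddle's promotion logic is the actual subject):

```python
import numpy as np

# Mixing a real operand with a complex operand promotes the
# result to the complex type, preserving precision width.
a = np.ones(2, dtype=np.float32)
b = np.ones(2, dtype=np.complex64)
assert (a + b).dtype == np.complex64
assert (a * b).dtype == np.complex64
```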
Usage scenarios: if a function can run successfully in static mode, you can use this to decorate that function in the following cases:
1. An unknown error occurs in the dynamic-to-static conversion process of the function;
2. In the internal implementation of the function, it has two branches: dynamic branch and static branch;
3. Users don't want to convert the function in the process of dynamic to static.
* add the weight decay func for the momentum op
* Add the multi_precision function in Momentum Optimizer.
* Make sure that the initial values of master weights are the same as the fp16 weights.
* add static loss scaling.
* add the rescale_grad function in the pure fp16 training.
* use the original momentum updating method.
* Polish some codes, such as variable names.
* add docstring for apis.
* update the var creation details of _create_master_weight.
* not modify codes about imperative momentum updating.
* Fix the error of test_dist_sparse_tensor_load_momentum UT.
* add unit test for multi precision fp16 training.
* add more unit tests for CI.
* Use lower threshold values for allclose comparison in test_multi_precision_fp16_train UT.
* For CI Coverage Checking.
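The master-weight idea behind the multi-precision training above can be sketched in numpy (illustrative only; `momentum_update` below is a hypothetical helper, not Paddle's actual kernel):

```python
import numpy as np

def momentum_update(param16, master32, velocity, grad16, lr=0.1, mu=0.9):
    # Keep an fp32 "master" copy of each fp16 parameter, apply the
    # momentum update in fp32, then cast back to fp16.
    grad32 = grad16.astype(np.float32)
    velocity[:] = mu * velocity + grad32
    master32[:] = master32 - lr * velocity
    param16[:] = master32.astype(np.float16)

param16 = np.ones(4, dtype=np.float16)
# Master weights are initialized from the fp16 weights, as described above
master32 = param16.astype(np.float32)
velocity = np.zeros(4, dtype=np.float32)
grad16 = np.full(4, 0.5, dtype=np.float16)

momentum_update(param16, master32, velocity, grad16)
assert param16.dtype == np.float16 and master32.dtype == np.float32
# velocity = 0.9*0 + 0.5 = 0.5; master = 1 - 0.1*0.5 = 0.95
assert np.allclose(master32, 0.95)
```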
* add fp16 for layer_norm op
* revert layernorm api
* fix forward
* fix forward
* fix backward for layernorm with fp16
* fix unit test for layernorm with fp16
* fix with_mkldnn compile error for layernorm with fp16
* 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
* fix with_mkldnn compile error for layernorm with fp16
* fix with_mkldnn compile error for layernorm with fp16
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
This PR fixes several problems in dy2stat for the Deoldify model in PaddleGan.
1. In the model, the engineer wrote `if x.shape == y.shape`. The Tensor shape is a tuple in dygraph, so `==` returns True/False, but in static graph `==` becomes an element-wise comparison, which is a different behavior. In this PR we reduce the element-wise comparison result to a single boolean.
2. If the engineer writes computations that use parameters in hooks, the static graph can lose the parameter variable, because we put param_guard at the forward of a Layer. In this PR we made param_guard cover pre-hooks and post-hooks.
3. In PaddleGan, the engineer calculated some parameter values in __init__ by running dygraph code. That code also runs during dy2stat, so some variables may be assigned as a VarBase (Tensor) first and then as a Variable, which raised an error. We fixed this bug in this PR by handling the case.
TODO: We only added a test case for 1 (shape comparison). Test cases for 2 and 3 should be added, but since we are chasing 2.0RC, I will do it in a near-future PR.
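The shape-comparison pitfall (item 1) can be sketched in plain Python/numpy, where numpy arrays stand in for static-graph shape tensors (illustrative only, not Paddle's actual implementation):

```python
import numpy as np

# In dygraph, x.shape is a Python tuple/list, so == yields a single bool.
assert (2, 3) == (2, 3)

# In a static graph, shapes become tensors and == is element-wise,
# so the result must be reduced back to a single boolean.
shape_x = np.array([2, 3])
shape_y = np.array([2, 3])
elementwise = shape_x == shape_y   # array([ True,  True])
assert bool(elementwise.all())     # reduce, as the fix described above does
```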
* add complex64 and complex128 types; add +-*/@ and slice operators for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
* Expose the leaf tensor concept and the gradient accumulation of leaf tensors
* fix coverage
* fix api doc
* fix CI unittest
* fix CI unittest
* fix unittest
* empty tensor doesn't need inner_var_
* fix some error message
* Add a class TensorInplaceVersion to count the inplace version and put it in framework::Tensor instead of Allocation or Variable.
* Add a new attribute `_inplace_version` for VarBase.
* Raise exception if an inplace operation can result in incorrect gradient computation.
* Add a new interface _bump_inplace_version() for VarBase to bump the version whenever the Tensor is modified through an inplace operation.
* For api assign, call _bump_inplace_version() when it's an inplace operation in dynamic mode.
* Use original var_wrapper if the inplace_version is not changed.
* Replace SnapshotVarWrapperList with SnapshotVarWrapper to optimize performance.
* Fixed a variable name error
* Add comments
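A minimal Python sketch of the inplace-version bookkeeping described above (illustrative only; Paddle's TensorInplaceVersion lives in C++, and the classes below are simplified stand-ins):

```python
class TensorInplaceVersion:
    """Counts how many times a tensor has been modified inplace."""
    def __init__(self):
        self._version = 0

    def bump(self):
        self._version += 1

    @property
    def version(self):
        return self._version


class VarBase:
    def __init__(self, data):
        self.data = list(data)
        self._counter = TensorInplaceVersion()
        # Version observed when the backward graph was built
        self._snapshot = self._counter.version

    @property
    def _inplace_version(self):
        return self._counter.version

    def _bump_inplace_version(self):
        self._counter.bump()

    def assign_(self, data):
        # Inplace op: modify the data, then bump the version
        self.data[:] = data
        self._bump_inplace_version()

    def check_grad_safe(self):
        # Raise if an inplace op happened after the snapshot,
        # since the gradient computation would then be wrong.
        if self._inplace_version != self._snapshot:
            raise RuntimeError("tensor was modified by an inplace operation; "
                               "gradient computation would be incorrect")


v = VarBase([1, 2])
assert v._inplace_version == 0
v.assign_([3, 4])
assert v._inplace_version == 1
try:
    v.check_grad_safe()
    raised = False
except RuntimeError:
    raised = True
assert raised
```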
* Move member functions of TranslatedLayer out of function
* edit code according to review
* Edit input argument of '_run_static_graph'
* reset due to Segmentation fault
* rename variables when stitching graph
* modify code according CI
* Add comments to '__i_m_p_l__'
* remove blanks before 'Get...'
* edit code according to review
* Add a comment to '_execution_method_creator'
* Edit a comment to '_execution_method_creator'
* Generate code coverage reports only for incremental files, test=develop
* Generate code coverage reports only for incremental files, test=develop
* Generate code coverage reports only for incremental files, test=develop
* test for diff python file, test=develop
* fix no python diff report, test=develop
* add cc test file, test=develop
* fix bug in generic.cmake, test=develop
* for debug no cc report, test=develop
* change compare branch from test_pr to test, test=develop
* fix bug, test=develop
* test for h file changed, test=develop
* debug for redefinition of argument optimize error, test=develop
* close -O3 for test, test=develop
* remove -O3 for test, test=develop
* remove coverage option for nvcc, test=develop
* use CMAKE_CXX_FLAGS open coverage option when header file changed, test=develop
* reopen -O3, test=develop
* remove debug code, test=develop
* remove unused code, test=develop
test_mnist failed on CUDA11. After debugging, we found that it is due to PaddleInference IR Optimization. We disable it in this PR and will re-enable it after PaddleInference fixes it.
GridGenerator model failed because the output shape of `linspace` is (-1). The reason is that C++ InferShape fixes the shape to (-1):
5da3d514eb/paddle/fluid/operators/linspace_op.cc (L49)
We cannot set the shape in C++ InferShape because this Tensor may not be initialized during compile time, but when input `num` of `linspace` is an integer, we know the shape at compile time. This PR simply sets the shape in Python and adds GridGenerator as a unittest.
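The Python-side shape fix can be sketched as follows (`infer_linspace_shape` is a hypothetical helper for illustration, not Paddle's actual code):

```python
def infer_linspace_shape(num):
    # C++ InferShape cannot know the value of `num` at compile time,
    # so it leaves the output shape as [-1]. On the Python side, when
    # `num` is a plain integer the shape is known and can be set.
    if isinstance(num, int):
        return [num]
    return [-1]   # num is a Tensor whose value is unknown at compile time

assert infer_linspace_shape(5) == [5]
assert infer_linspace_shape(object()) == [-1]
```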
* add reducer
* refine event for memory copy
* add concat&split for allreduce
* apply concat & split for fuse tensor
* fix nccl dep
* fix the unittest, compile problem and ddp initialize problem
* fix unittest for mac & add some comments & solve the repeated param in sublayers
* fix unittest for windows & fix document