* fix bug when using ignore_index and reduction, test=develop
* fix bug of celoss when using ignore_index and reduction, test=develop
* improve performance when ignore_index=-100, test=develop
* add test in test_cross_entropy_loss.py for coverage rate, test=develop
* rm comment in test_cross_entropy_loss.py, test=develop
* del hard-coded "float64" in python/paddle/nn/functional/loss.py, test=develop
* change mask to a simpler implementation, test=develop
* del comment in python/paddle/nn/functional/loss.py, test=develop
* del hard-coded value and change mask to a simpler implementation, test=develop
* change mask to a simpler implementation, test=develop
* change mask to a simpler implementation, test=develop
* add view strategy on squeeze, unsqueeze, reshape, flatten
* add squeeze unittest
* add unittests
* use View strategy as the name rather than Reuse Allocation
* fix view api doc
* fix format
* use core.ops when input of reshape2 is Tensor
* fix test_cross_entropy_loss error because of reshape2
* delete selected_rows
* change op_function
* little change
* solve HandleViewBetweenInputAndOutput
* add cast ops before and after unsupported fp16 ops.
* Keep partial net in FP32 pattern.
* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
* Add fp16 support for adam op.
* add multi precision attr for adam.
* Fix the bug of test_multi_precision_fp16_train UT.
* Code format for CI.
* Fix the redefinition error of MPTypeTrait on Windows.
* fix bugs of the _create_accumulators func in Momentum.
* fix bug when inserting post cast op.
* Add the update_loss_scaling op in allow_set of UnusedVarCheck.
* Update for ci coverage.
* Add some doc for OptimizerWithMixedPrecision.
* Fix the code style.
* Improve the doc of `amp_init`.
* Change fp16 testing for the case where users have the infer program defined in a separate way.
1. When x is a Variable, call nn.shape(x) only in the following cases (see the sketch below):
1) The shape of x is used in a control flow condition.
2) The dim to be used is negative.
2. When x is a Variable but x.shape or x.shape[idx] contains no negative value, don't convert to paddle.shape().
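A minimal sketch of the rule above; `dim_value` is a hypothetical helper for illustration, not the converter's actual code:

```python
import paddle

def dim_value(x, idx):
    # Keep the compile-time value when x.shape[idx] is known (non-negative);
    # otherwise fall back to the runtime paddle.shape op.
    if x.shape[idx] >= 0:
        return x.shape[idx]        # plain Python int, no extra op inserted
    return paddle.shape(x)[idx]    # runtime shape tensor for unknown dims
```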
* change to tensor copy sync
* change to tensor copy sync
* make copy_to safe when using TensorCopy
* refine code
* add ut
* add cudapinned garbagecollector
* add testcase: cpu place -> cuda pinned place
1. When slice_item is a slice (see the clamping sketch below):
1) the start of __getitem__ should be std::max(start, 0);
2) the end of __getitem__ should be std::min(end, dim).
2. When slice_item is an integer, it should be in [-dim_len, dim_len).
3. Fix error messages to use accurate data.
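A minimal Python sketch of the clamping in 1.; `clamp_slice` is illustrative and assumes negative indices have already been normalized:

```python
def clamp_slice(start, end, dim_len):
    # Mirror of the C++ fix: clamp the slice bounds into [0, dim_len].
    start = max(start, 0)
    end = min(end, dim_len)
    return start, end

assert clamp_slice(-2, 10, 4) == (0, 4)   # out-of-range bounds are clamped
```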
* Support storage of large parameters
* Reduce the complexity of the unittest
* Reduce the complexity of the unittest, commented out unittest for
* add unittest for static.save/load
* Increase the timeout threshold of 'test_static_save_load'
* Increase the timeout threshold of 'test_static_save_load'
* Increase the timeout threshold of 'test_static_save_load' and 'test_paddle_save_load'
* Increase the timeout threshold of 'test_static_save_load' and 'test_paddle_save_load'
* dot op support complex types
* matmul support complex types
* add test case
* matmul broadcast gradient support complex
* move conjFunctor to complex_functor.h
* reopen python coverage --include for test, test=develop
* if no .py file is modified, don't use coverage run, test=develop
* remove test code, test=develop
* add WITH_INCREMENTAL_COVERAGE, test=develop
* refine if else, test=develop
1. Type of index: int, slice (step must be 1).
2. Type of value (see the usage sketch below):
(1) int32, int64, float32, bool;
(2) numpy.array (int32, int64, float32, bool); note: float64 is not supported;
(3) paddle.Tensor (int32, int64, float32, float64, bool).
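An illustrative usage sketch exercising the supported index/value types above (values and shapes are arbitrary):

```python
import numpy as np
import paddle

x = paddle.zeros([4, 4], dtype="float32")
x[0] = 1                                       # plain int value
x[1] = np.ones([4], dtype="float32")           # numpy.array; float64 would not be supported
x[2] = paddle.full([4], 2.0, dtype="float32")  # paddle.Tensor value
x[1:3] = 0                                     # slice index (step must be 1)
```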
* add heter box
* add trainer, worker, wrapper...
* format
* for ci
* format
* remove boost get
* boost & copyright
* rename
* rename
* format
* format
* format
Co-authored-by: yaoxuefeng6 <yaoxuefeng@baidu.com>
* add conj op for complex types
* add conj for complex types
* add more test case
* add conj_op test
* modify conj api and impl
* add complex type for fill_constant_op xpu
* add setConstant for complex type
* remove complex conj test file
* use user-defined grad for test_conj_op
* add test case for static mode of conj api
* modify conj doc
* change input args name to x
* remove useless codes
* conj support real types
* add conj test case for real number
* delete inputs that don't need to be calculated in dygraph op_test
* delete inputs that don't need to be calculated in dygraph op_test
* modify grad of mul for complex types
* fix the bug that the order of input args' grads did not match
* merge amp-related functions in Momentum from paddle.fluid.contrib.optimizer into paddle.optimizer.
* Add unittest for 2.0 Momentum API.
* fix some bugs in weight_decay.
Before this commit, test_slice used the old API `dygraph_to_static_func` for Dynamic-to-Static conversion and used the Executor explicitly, which is not recommended for users.
After the fix, it uses the recommended API `paddle.jit.to_static` (sketched below) to replace `dygraph_to_static_func`, which won't trigger the random exception on coverage CI.
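A minimal sketch of the recommended usage; `take_slice` is an illustrative function, not the actual test_slice code:

```python
import paddle

@paddle.jit.to_static
def take_slice(x):
    # Converted with Dynamic-to-Static; no explicit Executor needed.
    return x[1:3]

out = take_slice(paddle.arange(6))
```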
* add conj op for complex types
* add conj for complex types
* add more test case
* add conj_op test
* modify conj api and impl
* add complex type for fill_constant_op xpu
* add setConstant for complex type
* remove complex conj test file
* use user-defined grad for test_conj_op
* add test case for static mode of conj api
* modify conj doc
* change input args name to x
* remove useless codes
* conj support real types
* add conj test case for real number
* add complex real op & api & unittest
* add imag op & api & unittest
* refactor op impl
* revert simplified writing due to compile failure
* polish details
* polish grad op code
* fix expand && concat/transpose to new api
* update xpu_header
* update activation op on kunlun
* update activation op on kunlun
* update activation op on kunlun
* update activation op on kunlun
* update activation op on kunlun
* add nearest_interp on kunlun
* update error message
* newly added UTs should not exceed 15s
* fix error
* the 15s UT limit check is the first to be executed
* fix error
* fix error with CI_SKIP_CPP_TEST
* modified timeout setting
* fix error
1. Fix error in _build_cond_stmt of for-range stmts.
2. Support negative step values in for-range stmts.
3. Fix code for the differences between Py2 and Py3.
Fix 3 Windows unittests:
test_fuse_all_reduce_pass: Paddle cannot run multi-GPU on Windows, so set the single visible GPU flag.
test_feed_data_check_shape_type: Paddle cannot run multi-GPU on Windows, so set the single visible GPU flag.
test_tsm: Windows GPU memory is not enough, so decrease the batch size and data size.
* Fix a bug when running on an operating system without "bash."
* add execution condition
* for ci-coverage
* get cpu information to check the precision problem
* Update compilation environment for musl version
* update dependencies
* remove test code
check cpu info
remove test code
review
* update alpine and third_party dependencies
* add newline for CI code format
* Fix api docs in RNN, Transformer, layer_norm, WeightNormParamAttr.
test=develop
* Fix api doc for print in label_smooth.
test=develop
* Update api docs according to review comments.
Add name argument in RNN back.
test=develop
* add complex64 and complex128 types; add +-*/@ and slice operators for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
* kron, reshape, transpose support complex types
* sum and trace op support complex types
* add test case of sum and trace op
* fix the bug that the imag part of a complex tensor was not initialized
* format file
* format code style
* kron support type promotion; modify test cases
* basic impl of type promote
* add comment & another testcase
* fix complex bugs & support python op promote type
* fix failed unittests & polish code
* add unittest for coverage
* change to only promote complex type
* polish code details
* polish several comments
Usage scenarios: for a function that could already run successfully in static mode, you can use this to decorate the function in the following cases (see the sketch after this list):
1. An unknown error occurs during the function's dynamic-to-static conversion;
2. The function's internal implementation has two branches: a dynamic branch and a static branch;
3. Users don't want the function to be converted during dynamic-to-static.
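A minimal sketch, assuming the decorator described here is `paddle.jit.not_to_static`; the function bodies are illustrative:

```python
import paddle

@paddle.jit.not_to_static
def run_as_is(x):
    # Left unconverted: executed as plain dygraph code even inside
    # a to_static-converted caller.
    return x * 2

@paddle.jit.to_static
def forward(x):
    return run_as_is(x) + 1
```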
* add the weight decay func for the momentum op
* Add the multi_precision function in Momentum Optimizer.
* Make sure that the initial values of the master weights are the same as the fp16 weights.
* add static loss scaling.
* add the rescale_grad function in the pure fp16 training.
* use the original momentum updating method.
* Polish some codes, such as variable names.
* add docstring for apis.
* update the var creation details of _create_master_weight.
* not modify codes about imperative momentum updating.
* Fix the error of test_dist_sparse_tensor_load_momentum UT.
* add unit test for multi precision fp16 training.
* add more unit tests for CI.
* Use lower threshold values for allclose comparing in test_multi_precision_fp16_train UT.
* For CI Coverage Checking.
* add fp16 for layer_norm op
* revert layernorm api
* fix forward
* fix forward
* fix backward for layernorm with fp16
* fix unit test for layernorm with fp16
* fix with_mkldnn compile error for layernorm with fp16
* 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
* fix with_mkldnn compile error for layernorm with fp16
* fix with_mkldnn compile error for layernorm with fp16
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
This PR fixes several problems in dy2stat for the Deoldify model in PaddleGan.
1. In the model, engineers wrote `if x.shape == y.shape`; a Tensor's shape is a tuple in dygraph, so `==` returns True/False, but in static graph `==` becomes an element-wise comparison, which is a different behavior. In this PR we reduce the element-wise comparison result (see the sketch below).
2. If engineers write computations that use parameters in hooks, the static graph can lose the parameter variable because we put param_guard at the forward of a Layer. In this PR we made param_guard cover pre-hooks and post-hooks too.
3. In PaddleGan, engineers computed some parameter values in __init__ by running dygraph code. That code also runs during dy2stat, so a variable may be assigned first as a VarBase (Tensor) and then as a Variable, which raised an error. We fixed this in this PR by handling that case.
TODO: we only added a test case for 1 (the shape comparison); test cases for 2 and 3 should also be added. Since we are chasing 2.0RC, I will do it in a near-future PR.
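A sketch of the reduction in fix 1; `shapes_equal` is an illustrative helper (assuming x and y have the same rank):

```python
import paddle

def shapes_equal(x, y):
    # In static graph, == on shape tensors is element-wise, so reduce
    # the boolean vector to a single scalar with paddle.all.
    eq = paddle.equal(paddle.shape(x), paddle.shape(y))
    return paddle.all(eq)
```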
* add complex64 and complex128 types; add +-*/@ and slice operators for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
* The leaf tensor concept is exposed and gradient accumulation of leaf tensors is supported
* The leaf tensor concept is exposed and gradient accumulation of leaf tensors is supported
* fix coverage
* fix api doc
* fix CI unittest
* fix CI unittest
* fix unittest
* empty tensor doesn't need inner_var_
* fix some error message