* Avoid a bug on macOS with Python 3.5/3.6.
* Choose the saving method according to the OS.
* use a smaller chunk length in '_unpack_saved_dict' on macOS.
* add Python version information.
* Edit comment.
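The chunked-save idea behind these commits can be sketched in plain Python. This is a minimal illustration, not the actual Paddle code; the helper name `chunk_bytes` and the chunk size are assumptions, standing in for splitting a serialized buffer so each write stays below a platform-specific limit:

```python
def chunk_bytes(buf, chunk_len):
    """Split a bytes buffer into pieces of at most chunk_len bytes,
    so each individual write stays below a platform-specific limit
    (hypothetical helper mirroring the smaller macOS chunk length)."""
    return [buf[i:i + chunk_len] for i in range(0, len(buf), chunk_len)]

data = bytes(10)
parts = chunk_bytes(data, 4)
print([len(p) for p in parts])  # [4, 4, 2]
```

Joining the chunks back in order reproduces the original buffer, so loading only needs to concatenate the pieces.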
* add view strategy on squeeze,unsqueeze,reshape,flatten
* add squeeze unittest
* add unittests
* use 'View strategy' as the name rather than 'Reuse Allocation'
* fix view api doc
* fix format
* use core.ops when input of reshape2 is Tensor
* fix test_cross_entropy_loss error because of reshape2
* add inplace strategy
* add elementwise_add and elementwise_sub
* let backward op not use inplace
* grad op do not use inplace
* fix memory increase error and add leaf error message
* delete selected_rows
* change op_function
* little change
* solve HandleViewBetweenInputAndOutput
* add unittest and leaf error message
* merge view error
* optimize op_function_generator format and support sum inplace op
* fix format of basic_engine
* fix format for framework
* little change of variable wrapper
* add reshape, squeeze, unsqueeze, scatter api
* add relu elu tanh softmax inplace api
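The inplace strategy and the leaf-variable error message these commits mention can be sketched with a tiny stand-in class. This is an illustration of the idea only; `Var` and its fields are hypothetical, not Paddle's implementation:

```python
class Var:
    """Tiny stand-in for a tensor. Inplace ops mutate the variable's own
    storage, so they are forbidden on leaf variables that require grad
    (mirroring the 'leaf error message' added above)."""

    def __init__(self, data, requires_grad=False, is_leaf=True):
        self.data = data
        self.requires_grad = requires_grad
        self.is_leaf = is_leaf

    def relu_(self):
        # Trailing underscore marks the inplace variant of relu.
        if self.is_leaf and self.requires_grad:
            raise RuntimeError(
                "leaf variable that requires grad cannot be used in an inplace op")
        self.data = [max(0.0, v) for v in self.data]
        return self

v = Var([-1.0, 2.0])
print(v.relu_().data)  # [0.0, 2.0]
```

A leaf variable with `requires_grad=True` raises instead of being mutated, which is the error path the unittests above exercise.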
* fix test_squeeze_op unittest
* fix test_relu_op unittest
* fix comment problems
* delete sample code of inplace api
* add reference of grad_pending_nodes in basic_engine
* fix unittest name
* add inplace apis into wlist
* fix error message
* add PADDLE_ENFORCE for setting the grad op twice
* fix header file error
* set expected place in child thread for dataloader
* set device id when set tensor from numpy
* revert tensor_py change
* add compile guard
* fix ci
* fix bug
* Implemented AddQuantDequantPass in imperative quantization.
* Supported LeakyReLU Quantization
* Meet the coverage-rate requirement.
* Changed the file name of test of AddQuantDequant
* Implemented more Quantized NoWeightLayers.
* Fix the loss cannot align problem between static and dynamic model quantization, add swish as supported quantized layer in imperative quantization.
* remove noweight_list
* support 2.0 APIs such as Pool2D and ReLU
* upgrade oneDNN version to 2.0 master branch
* Added workarounds for changes in the new oneDNN library
* fix regex
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
* fix bug of using ignore_index and reduction, test=develop
* fix bug of celoss when using ignore_index and reduction, test=develop
* improve performance when ignore_index=-100, test=develop
* add test in test_cross_entropy_loss.py for coverage rate, test=develop
* rm comment in test_cross_entropy_loss.py, test=develop
* del hard code of "float64" in python/paddle/nn/functional/loss.py, test=develop
* change mask to a more simplified implementation, test=develop
* del comment in python/paddle/nn/functional/loss.py, test=develop
* del hard code and change mask to a more simplified implementation, test=develop
* add cast ops before and after unsupported fp16 ops.
* Keep partial net in FP32 pattern.
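The cast-insertion idea in the two commits above can be sketched as follows. This is a simplified sketch, not the actual AMP pass; the op names and the unsupported-op set are assumptions for illustration:

```python
# Hypothetical set of ops that must run in FP32 even in an FP16 net.
FP16_UNSUPPORTED = {"softmax_with_cross_entropy", "lookup_table"}

def insert_casts(ops):
    """Wrap each unsupported-fp16 op with cast ops so that op (and only
    that op) runs in FP32, keeping a partial net in the FP32 pattern."""
    out = []
    for op in ops:
        if op in FP16_UNSUPPORTED:
            out += ["cast_fp16_to_fp32", op, "cast_fp32_to_fp16"]
        else:
            out.append(op)
    return out

print(insert_casts(["matmul", "softmax_with_cross_entropy"]))
# ['matmul', 'cast_fp16_to_fp32', 'softmax_with_cross_entropy', 'cast_fp32_to_fp16']
```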
* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
* Add fp16 support for adam op.
* add multi precision attr for adam.
* Fix the bug of test_multi_precision_fp16_train UT.
* Code format for CI.
* Fix the redefine error about MPTypeTrait on windows.
* fix bugs of the _create_accumulators func in Momentum.
* fix bug when inserting post cast op.
* Add the update_loss_scaling op in allow_set of UnusedVarCheck.
* Update for ci coverage.
* Add some doc for OptimizerWithMixedPrecision.
* Fix the code style.
* Improve the doc of `amp_init`.
* Change fp16 testing for users who have the infer program defined in a separate way.
1. When x is a Variable, call nn.shape(x) only in the following cases:
1) The shape of x is used in a control-flow condition.
2) The dim to be used is negative.
2. When x is a Variable but x.shape or x.shape[idx] does not contain a negative value, do not convert to paddle.shape().
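The rules above can be sketched in plain Python. This is a minimal illustration; `needs_dynamic_shape` and its parameters are hypothetical names, not Paddle internals, and `-1` marks an unknown dimension as in static graphs:

```python
def needs_dynamic_shape(static_shape, idx=None, used_in_control_flow=False):
    """Decide whether x.shape must be converted to a runtime shape op.

    static_shape: list of ints, where -1 marks an unknown dimension.
    idx: the dimension index being accessed, or None for the whole shape.
    used_in_control_flow: True if the shape feeds a control-flow condition.
    """
    if used_in_control_flow:
        return True                          # rule 1.1: condition needs runtime value
    if idx is not None:
        return static_shape[idx] < 0         # rule 1.2: only if that dim is unknown
    return any(d < 0 for d in static_shape)  # rule 2: any unknown dim forces conversion

print(needs_dynamic_shape([2, 3, 4], idx=1))                      # False
print(needs_dynamic_shape([2, -1, 4], idx=1))                     # True
print(needs_dynamic_shape([2, 3, 4], used_in_control_flow=True))  # True
```

A fully known shape accessed statically needs no conversion, which avoids inserting unnecessary shape ops into the graph.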
* change to tensor copy sync
* make copy_to safe when use TensorCopy
* refine code
* add ut
* add CUDAPinned GarbageCollector
* add testcase: cpu place -> cuda pinned place
1. when slice_item is a slice:
1) the start of __getitem__ should be std::max(start, 0);
2) the end of __getitem__ should be std::min(end, dim).
2. when slice_item is an integer, it should be in [-dim_len, dim_len)
3. Fix error messages to use accurate data
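The clamping rules above can be sketched in plain Python. This is an illustration only; the helper names are hypothetical, and resolving negative indices before clamping is an assumption here, mirroring Python's own slice semantics:

```python
def normalize_slice(start, end, dim_len):
    """Clamp slice bounds the way __getitem__ should:
    start -> max(start, 0), end -> min(end, dim_len)."""
    if start < 0:
        start += dim_len          # resolve a negative start first (assumed)
    if end < 0:
        end += dim_len
    start = max(start, 0)         # rule 1.1
    end = min(end, dim_len)       # rule 1.2
    return start, end

def check_int_index(idx, dim_len):
    """Rule 2: an integer index must lie in [-dim_len, dim_len)."""
    if not (-dim_len <= idx < dim_len):
        raise IndexError(f"index {idx} out of range for dim of size {dim_len}")
    return idx + dim_len if idx < 0 else idx

print(normalize_slice(-10, 100, 5))  # (0, 5)
print(check_int_index(-2, 5))        # 3
```

An out-of-range slice is silently clamped to a valid (possibly empty) range, while an out-of-range integer index raises, which matches the accurate error messages mentioned above.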
* Support storage of large parameters
* Reduce the complexity of the unittest
* Reduce the complexity of the unittest; commented out unittest for
* add unittest for static.save/load
* Increase the timeout threshold of 'test_static_save_load'
* Increase the timeout threshold of 'test_static_save_load' and 'test_paddle_save_load'
* dot op support complex types
* matmul support complex types
* add test case
* matmul broadcast gradient support complex
* move conjFunctor to complex_functor.h
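The complex-gradient behavior these commits add can be sketched with Python's built-in complex type. This is a sketch under one common autograd convention (gradients as conjugated Wirtinger derivatives); `dot` and `dot_grad` are illustrative names, not the Paddle kernels:

```python
def dot(a, b):
    """Complex dot product sum(a_i * b_i); no conjugation in the forward pass."""
    return sum(x * y for x, y in zip(a, b))

def dot_grad(a, b, grad_out=1 + 0j):
    """Backward pass under the conjugated-Wirtinger convention (assumed):
    d/da = grad_out * conj(b), d/db = grad_out * conj(a)."""
    da = [grad_out * y.conjugate() for y in b]
    db = [grad_out * x.conjugate() for x in a]
    return da, db

print(dot([1 + 1j], [2 + 0j]))  # (2+2j)
```

For real inputs the conjugates are no-ops, so the same rule reduces to the familiar real-valued gradient.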
* reopen python coverage --include for test, test=develop
* if no .py file modified, not use coverage run, test=develop
* remove test code, test=develop
* add WITH_INCREMENTAL_COVERAGE, test=develop
* refine if else, test=develop
1. Type of index: int, slice (step must be 1).
2. Type of value:
(1) int32, int64, float32, bool;
(2) numpy.array (int32, int64, float32, bool); note that float64 is not supported;
(3) paddle.Tensor (int32, int64, float32, float64, bool).
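The constraints above can be sketched as a small validator in plain Python. This is a hypothetical illustration of the rules, not the actual Paddle `__setitem__`; `check_setitem` and the `value_kind` labels are invented for the sketch:

```python
# Allowed dtypes per value kind, as listed above; 'np_array' excludes float64.
ALLOWED = {
    "scalar":   {"int32", "int64", "float32", "bool"},
    "np_array": {"int32", "int64", "float32", "bool"},      # float64 not supported
    "tensor":   {"int32", "int64", "float32", "float64", "bool"},
}

def check_setitem(index, value_kind, dtype):
    """Validate an (index, value) pair for a hypothetical __setitem__."""
    if isinstance(index, slice):
        if index.step not in (None, 1):
            raise ValueError("slice step must be 1")
    elif not isinstance(index, int):
        raise TypeError("index must be an int or a slice")
    if dtype not in ALLOWED[value_kind]:
        raise TypeError(f"{dtype} not supported for {value_kind}")
    return True

print(check_setitem(slice(0, 3), "np_array", "float32"))  # True
```

A float64 numpy value or a stepped slice is rejected up front, while a float64 paddle.Tensor passes, matching the table above.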
* add heter box
* add trainer, worker, wrapper...
* format
* for ci
* format
* remove boost get
* boost & copyright
* rename
* format
Co-authored-by: yaoxuefeng6 <yaoxuefeng@baidu.com>
* add conj op for complex types
* add conj for complex types
* add more test case
* add conj_op test
* modify conj api and impl
* add complex type for fill_constant_op xpu
* add setConstant for complex type
* remove complex conj test file
* use user-defined grad for test_conj_op
* add test case for static mode of conj api
* modify conj doc
* change input args name to x
* remove useless codes
* conj support real types
* add conj test case for real number
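The conj semantics being added can be sketched with Python's built-in complex type; for real inputs, conj is the identity, matching the "conj support real types" commit. This elementwise sketch is an illustration, not the Paddle kernel:

```python
def conj(x):
    """Elementwise conjugate; real numbers are returned unchanged."""
    return [v.conjugate() if isinstance(v, complex) else v for v in x]

print(conj([1 + 2j, 3 - 4j]))  # [(1-2j), (3+4j)]
print(conj([1.0, 2.0]))        # [1.0, 2.0]
```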
* delete inputs that do not need gradient calculation in dygraph op_test
* modify grad of mul for complex types
* fix a bug where the order of input args' grads did not match
* merge AMP-related functions in Momentum from paddle.fluid.contrib.optimizer into paddle.optimizer.
* Add unittest for 2.0 Momentum API.
* fix some bugs in weight_decay.
Before this commit, test_slice used the old API `dygraph_to_static_func` for Dynamic-to-Static and used the Executor explicitly, which is not recommended for users.
After the fix, the recommended API `paddle.jit.to_static` replaces `dygraph_to_static_func`, which won't trigger the random exception on coverage CI.
* add complex real op & api & unittest
* add imag op & api & unittest
* refactor op impl
* revert simplified writing due to compile failure
* polish details
* polish grad op code
* fix expand && concat/transpose to new api
* update xpu_header
* update activation op on kunlun
* add nearest_interp on kunlun
* update error message
* newly added UTs should not exceed 15s
* fix error
* make the 15s UT limit the first check to be executed
* fix error
* fix error with CI_SKIP_CPP_TEST
* modified timeout setting
* fix error