* improve the API Sample of DataFeeder, memory_optimize and release_memory, test=develop
* update API.spec, test=develop, test=document_preview
* tweak the code format of feed API, test=develop
* update API.spec, test=develop
* improve doc for DataFeeder and default_main_program, test=develop
* init auto loss scaling
test=develop
* change API.spec
* change ifelse to switch and use reduce_sum to optimize checking isfinite
test=develop
* Remove redundant code
test=develop
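
The auto loss scaling commits above mention using reduce_sum to check isfinite. A minimal sketch of that idea in plain Python follows. The function names, parameter names, and default values are hypothetical and are not taken from the Paddle implementation. It only illustrates why one reduction can stand in for a per-element check, and how a scale might be adjusted afterwards.

```python
import math


def all_finite(grads):
    """Return True if the sum of the gradient values is finite.

    A single reduction replaces a per-element check: any inf or nan
    in the input propagates into the sum.
    """
    return math.isfinite(sum(grads))


def update_loss_scale(scale, good_steps, grads_finite,
                      incr_every_n=1000, incr_ratio=2.0, decr_ratio=0.5):
    """Return the new (scale, good_steps) pair after one training step.

    All parameter names and defaults here are hypothetical.
    """
    if not grads_finite:
        # Overflow detected: shrink the scale and restart the streak.
        return scale * decr_ratio, 0
    good_steps += 1
    if good_steps >= incr_every_n:
        # A long enough streak of finite steps: grow the scale.
        return scale * incr_ratio, 0
    return scale, good_steps
```

The sum-based check is conservative: a sum of very large finite values can itself overflow to inf, in which case the step is treated as non-finite even though every element was finite.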
* add double grad for elementwise_mul. test=develop
* remove comment. test=develop
* fix grad sum. test=develop
* fix for axis expand. test=develop
* add test for axis expand. test=develop
* Add conv2d_grad_grad_op
* Extract the cuDNN conv algo searching code into conv_cudnn_helper.h.
- Now use it in conv2d_grad_grad.
- Will simplify the searching code in conv2d and conv2d_grad in the next PR.
* Enhance and fix bug in unit testing of gradient_checker.
* Support fetching empty variables; return None in Python.
* fix train_from_dataset and infer_from_dataset example
* add inductive dim for data_reader, example: with shape=[-1, 1], the -1 is inferred at run time from the number of elements read
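
As an illustration of the inferred dimension described above, here is a small sketch in plain Python. The helper `infer_shape` is hypothetical and is not part of the data_reader code. It assumes exactly one -1 entry, which is resolved from the number of elements actually read.

```python
def infer_shape(shape, num_elements):
    """Replace the single -1 in `shape` using the element count.

    Hypothetical helper for illustration only.
    """
    unknown = [i for i, d in enumerate(shape) if d == -1]
    if len(unknown) != 1:
        raise ValueError("expected exactly one -1 dimension")
    # Product of all dimensions that are already known.
    known = 1
    for d in shape:
        if d != -1:
            known *= d
    if known <= 0 or num_elements % known != 0:
        raise ValueError("element count does not fit the known dimensions")
    resolved = list(shape)
    resolved[unknown[0]] = num_elements // known
    return resolved
```

For example, `infer_shape([-1, 1], 6)` resolves to `[6, 1]`.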
* Fix API example code of save_inference_model
test=develop
* Add "import" in example of save_inference_model
* Fix typo "exsample" -> "example"
test=develop
Fix the following API examples:
paddle.fluid.scope_guard
paddle.fluid.backward.append_backward
paddle.fluid.cpu_places
paddle.fluid.cuda_pinned_places
paddle.fluid.cuda_places
paddle.fluid.in_dygraph_mode
paddle.fluid.CUDAPlace
paddle.fluid.CPUPlace
paddle.fluid.CUDAPinnedPlace
* Fix data and reader related api doc
Review and fix the example code in the following reader related API docs.
Fix existing API example codes:
paddle.fluid.io.PyReader
paddle.fluid.layers.batch
paddle.fluid.layers.data
paddle.fluid.layers.Preprocessor
paddle.fluid.layers.py_reader
paddle.fluid.program_guard
Add new example codes:
paddle.fluid.io.PyReader.decorate_batch_generator
paddle.fluid.io.PyReader.decorate_sample_generator
paddle.fluid.io.PyReader.decorate_sample_list_generator
paddle.fluid.io.PyReader.reset
paddle.fluid.io.PyReader.start
test=develop
* Add changes to API.spec after changing doc.
test=develop
* Add blanks after python example code
test=develop
* Add blank line at py_reader example code
test=develop
* Merge API.spec
test=develop
* Modify reader.py based on reviewer's comment
test=develop
* Modify API.spec after changing doc
test=develop
* Change reader.py based on reviewer's comment
* Modify example code of decorate_sample_generator
test=develop
* Fix example code of PyReader based on reviewer
test=develop
* add use_cuda to inplace pass,test=develop
* add softmax_with_xe_inplace test,test=develop
* fix potential inplace bug
test=develop
* add more skip vars in mem opt pass,test=develop
* follow comment,test=develop
* follow comments,move duplicate out arg check to program->graph,test=develop
* fix api doc of hash, relu, concat, argmin, argmax, argsort and all activation funcs with no attrs
test=develop
* refine doc example code
test=develop
* remove >>> in doc example
test=develop
* refine python code block
test=develop
* update API spec
test=develop
* Add MovingAverageAbsMaxScale operator which is only used for calculating the quantization scale.
* test=develop
* change the output into inplace. test=develop
* Revert "test=develop"
This reverts commit 696cf62699ba1e1c98f61f7345ac7060010eb29a.
* Revert "change the output into inplace. test=develop"
This reverts commit a19acd20f07eee82622701a3015e6e9c073a5e0b.
* test=develop.
* update the MovingAverageAbsMaxScaleOp test. test=develop
test_distillation_strategy always fails on a machine with only 4 GPUs. Disable it temporarily; the root cause still needs to be found before adding it back.
Update the folder name generation mechanism for saving the quantized model and weights.
The folder name is made unique by appending a timestamp suffix.
test=develop
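
A sketch of the timestamp suffix idea, using only the Python standard library. The function name `unique_dir_name` and the timestamp format are assumptions; the commit does not state which format is actually used.

```python
import datetime


def unique_dir_name(base, now=None):
    """Append a timestamp suffix to `base` (hypothetical helper).

    `now` can be passed explicitly to make the result reproducible.
    """
    if now is None:
        now = datetime.datetime.now()
    # Assumed format: YYYYMMDD_HHMMSS, with one-second resolution.
    return "{}_{}".format(base, now.strftime("%Y%m%d_%H%M%S"))
```

With one-second resolution, two saves within the same second would still collide, so a real implementation may need a finer-grained suffix.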
* refine_dropout_mem,test=develop
* # This is a combination of 14 commits.
# The first commit's message is:
remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066)
# This is the 2nd commit message:
Fleet unify distributed training (#16791)
* implement distributed transpiler with fleet
# This is the 3rd commit message:
ParallelDyGraph with GPU collective mode (#16827)
implement dygraph.parallel.DataParallel to hook reduce op.
# This is the 4th commit message:
Init mixed precision training interface (#16856)
* Init mixed precision training interface
* Add fp16 test script
test=develop
* All initializers support float16
test=develop
* Code cleanup & add more code annotations
test=develop
* Update API spec
test=develop
* Add usage example in doc
test=develop
# This is the 5th commit message:
fix reference_count_pass,test=develop (#17060)
test=develop
# This is the 6th commit message:
Speed up roi_perspective_transform op by caching the information of linear interpolation in forward (#17090)
* Cache the information of linear interpolation in forward and use it in backward.
test=develop
* Fix cuda kernel.
test=develop
# This is the 7th commit message:
remove unnecessary prepare_data (#17080)
test=develop
# This is the 8th commit message:
fix interpolate cu. test=develop (#17101)
# This is the 9th commit message:
test=develop, double backward leaky_relu (#17067)
backward of backward: leaky_relu
# This is the 10th commit message:
fix fuse optimizer ops (#17102)
test=develop
# This is the 11th commit message:
truncated_gaussian_random supported in distributed training, test=develop (#17091)
# This is the 12th commit message:
Detailed coordinate description for yolov3 loss (#17007)
* Detailed coordinate description for yolov3 loss
test=develop
* modified api.spec
test=develop
* modified loss name
* fix api.spec
test=develop
* polish description
test=develop
* modified api.spec
test=develop
# This is the 13th commit message:
fix test_weight_decay (#17109)
test=develop
# This is the 14th commit message:
Path flag (#17105)
* fix python/paddle/fluid/__init__.py detecting problems
* resolve #17057
Fixed the bug that fuse_relu/fuse_residual option couldn't be passed to class TestConv2dInt8Op.
test=develop
* Fix the bug in the test_conv2d_int8_mkldnn case, which was caused by improper parameter passing.
test=develop
* Support backward of backward and a new gradient checker
* Rename decorators.py to decorator_helper.py, since Python on Windows CI has a decorators package.
1. Add ReluDoubleGradMaker when registering relu_grad.
2. Add a new gradient checker by comparing theoretical and numerical Jacobian. Check double gradients by double_grad_check.
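
The idea of comparing theoretical and numerical Jacobians can be sketched in plain Python as below. This is not the gradient_checker code from the commit. The function names, the central-difference scheme, and the tolerances are all assumptions chosen for illustration.

```python
def numerical_jacobian(f, x, eps=1e-6):
    """Estimate the Jacobian of f at x by central differences.

    f maps a list of floats to a list of floats. Entry [i][j] of the
    result approximates d f_i / d x_j.
    """
    num_outputs = len(f(x))
    jac = [[0.0] * len(x) for _ in range(num_outputs)]
    for j in range(len(x)):
        # Perturb one input coordinate at a time.
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        y_plus = f(x_plus)
        y_minus = f(x_minus)
        for i in range(num_outputs):
            jac[i][j] = (y_plus[i] - y_minus[i]) / (2.0 * eps)
    return jac


def jacobians_match(analytic, numeric, atol=1e-5):
    """Return True if every pair of entries differs by at most atol."""
    return all(
        abs(a - n) <= atol
        for row_a, row_n in zip(analytic, numeric)
        for a, n in zip(row_a, row_n)
    )
```

A double gradient could be checked the same way by treating the first-order gradient function as `f`, under the assumption that it is itself available as a callable.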
* move gc test to op_test
test=develop
* Revert "move gc test to op_test"
This reverts commit cf15da65c38f57c91f53b3d8b3c2365d4aa86016.
* enable gc test in some ops
test=develop