* Add a function to update FLAGS
test=develop
* expr flags
* distinguish public/private vars, test=develop
* fix windows issues, test=develop
* expr flag
* Add functions to get and set FLAGS
test=develop
* Add functions to get and set flags in Paddle
test=develop
Co-authored-by: sneaxiy <sneaxiy@126.com>
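A minimal usage sketch of the resulting API, assuming it landed as `fluid.get_flags`/`fluid.set_flags` (the flag name below is chosen purely for illustration):

```python
import paddle.fluid as fluid

# Read and update a public GFlag at runtime; FLAGS_eager_delete_tensor_gb
# is used here only as an example of a public flag.
fluid.set_flags({'FLAGS_eager_delete_tensor_gb': 1.0})
print(fluid.get_flags(['FLAGS_eager_delete_tensor_gb']))
```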
* modify fc to linear in sample code, test=develop
* remove FC, test=develop
* remove warnings, test=develop
* drop fluid/imperative/README.md, test=develop
* change fc to linear, test=develop
* polish code style, test=develop
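For reference, a sketch of the fc-to-linear replacement pattern in the sample code, assuming the dygraph `Linear` layer of this era:

```python
import numpy as np
import paddle.fluid as fluid
from paddle.fluid.dygraph import Linear, to_variable

with fluid.dygraph.guard():
    # Linear(input_dim, output_dim) replaces the removed FC layer in samples
    linear = Linear(32, 10)
    x = to_variable(np.random.rand(4, 32).astype('float32'))
    y = linear(x)
    print(y.shape)  # [4, 10]
```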
* append optimize op in the grad block of current block if current block is in control flow. test=develop
* add conditional grad op when optimizer used in control flow. test=develop
* add comment and modify typo. test=develop
* fix append_backward to support control flow. test=develop
* add test. test=develop
* fix copy_var_to_parent_block and conditional_block_grad. test=develop
* fix bug: revert to append conditional_block_grad vars to sub grad block. test=develop
* fix bug: revert to assign var to parent block even if var already is in parent block
* fix bug: consider outputs is empty. test=develop
* move _rename_grad_ out. test=develop
* modify code according to reviews from Huihuang. test=develop
* modify code according to reviews from Jinle. test=develop
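A minimal sketch of the scenario this enables: calling an optimizer on a loss computed inside `fluid.layers.cond` (the network shape and layer choices are illustrative, not the PR's test):

```python
import paddle.fluid as fluid

x = fluid.data(name='x', shape=[None, 1], dtype='float32')
label = fluid.data(name='label', shape=[None, 1], dtype='float32')
half = fluid.layers.fill_constant(shape=[1], dtype='float32', value=0.5)
pred = fluid.layers.less_than(fluid.layers.mean(x), half)

# Each branch builds its own sub-network inside a conditional_block.
out = fluid.layers.cond(pred,
                        lambda: fluid.layers.fc(x, size=1),
                        lambda: fluid.layers.fc(x, size=1, bias_attr=False))
loss = fluid.layers.mean(fluid.layers.square(out - label))
# append_backward/minimize now appends the optimize ops in the grad block
# of the conditional branch as needed.
fluid.optimizer.SGD(learning_rate=0.01).minimize(loss)
```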
* test=develop, fix docker with paddle nccl problem
* don't expose numerous Tensor.set(), test=develop
* fix condition, test=develop
* fix float16 bug, test=develop
* feed should be Tensor or np.array, not Variable or number, test=develop
* use forcecast to copy numpy slice to new array, test=develop
* remove float16-uint16 hacking, test=develop
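An illustrative feed after this change: fed values must be LoDTensors or numpy arrays, not Variables or bare Python numbers:

```python
import numpy as np
import paddle.fluid as fluid

x = fluid.data(name='x', shape=[None, 3], dtype='float32')
y = fluid.layers.mean(x)
exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())

# OK: a numpy array (a LoDTensor would also work)
out, = exe.run(feed={'x': np.ones((2, 3), dtype='float32')},
               fetch_list=[y])
# Feeding a Variable or a plain Python number now raises an error.
```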
* add variable method to VarBase and refactor to_variable to support returning VarBase
* support kwargs in varbase constructor
* add VarBase constructor to support default python args
* refine varbase initial method
* reset branch
* fix UT after changing VarBase error info to PaddleEnforce
* cherry-pick the parameter change from before
* overload isinstance to replace numerous is_variable changes
* rm useless files
* rm useless code merged by git
* test=develop, fix some failing UTs
* test=develop, fix test_graph_wrapper
* add some tests, test=develop
* refine __getitem__, test=develop
* add tests, test=develop
* fix err_msg, test=develop
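A rough sketch of the dygraph behaviour these commits cover, assuming `isinstance(x, fluid.framework.Variable)` was made to accept VarBase as described above:

```python
import numpy as np
import paddle.fluid as fluid

with fluid.dygraph.guard():
    x = fluid.dygraph.to_variable(
        np.arange(12).reshape(3, 4).astype('float32'))
    # __getitem__ supports basic slicing on the imperative VarBase
    print(x[1:3].numpy())
    # VarBase passes the overloaded isinstance check
    print(isinstance(x, fluid.framework.Variable))  # True
```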
* test=develop, fix docker with paddle nccl problem
* test=develop, refine en_doc for Variable and Program
* test=document_fix, fix English doc for Variable and Program
* test=document_fix, refine astype code block style
* test=document_fix, add example code for Variable properties
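The documented properties can be exercised like this (a small illustrative snippet):

```python
import paddle.fluid as fluid

x = fluid.layers.data(name='x', shape=[3, 4], dtype='float32')
# read-only properties covered by the refreshed English doc
print(x.name, x.shape, x.dtype, x.persistable, x.stop_gradient)
y = x.astype('float64')  # the astype example fixed above
```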
* Writing a custom op needs to follow the framework OP spec.
* Package fluid_framework.so and headers into whl.
* Add paddle.sysconfig.get_include() and paddle.sysconfig.get_lib() to get include dir and lib dir.
* Export some C-APIs to merge OpInfo between core.so and custom_op.so.
* Add unit testing.
* Update API.spec.
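Usage of the new helpers when compiling a custom op against the installed wheel:

```python
import paddle

# include and library directories shipped inside the whl package
include_dir = paddle.sysconfig.get_include()
lib_dir = paddle.sysconfig.get_lib()
print(include_dir)  # headers following the framework OP spec
print(lib_dir)      # contains fluid_framework.so
```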
* test=develop, fix docker with paddle nccl problem
* test=develop, Add Variable API and refine dygraph-related APIs
* test=develop, refine test for new api and error info
* test=develop, refine error info and test_layers
* test=develop, add API.spec
* test=develop, fix to_string Python 2/3 compat error and refine doc
* test=develop, add API spec
* test=develop, update API spec
* test=develop, invoke ci
* test=develop, fix example code
* test=develop, update API spec
* test=develop, add compat test and fix inplace compat dict error
The new "fluid.data" changes old "fluid.layers.data":
1. Add shape and dtype check.
2. Remove "append_batch_size" parameter. We won't offer this in the new data layer because other deep learning platforms don't have this kind of data layer pre-processing. It may confuse users.
3. Remove "stop gradient" parameter because the data layer doesn't do back-propagation
TODO:
Now data layer feeded by executor is checked, will we want to check the feed data of readers in the future?
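A side-by-side sketch of the new layer versus the old one:

```python
import numpy as np
import paddle.fluid as fluid

# Old: shape excludes the batch dim; append_batch_size prepends -1,
# and fed arrays are not shape/dtype checked.
x_old = fluid.layers.data(name='x_old', shape=[784], dtype='float32')

# New: declare the full shape (None = variable batch dim); the executor
# checks the shape and dtype of whatever is fed.
x_new = fluid.data(name='x_new', shape=[None, 784], dtype='float32')

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
exe.run(feed={'x_old': np.zeros((8, 784), np.float32),
              'x_new': np.zeros((8, 784), np.float32)},
        fetch_list=[x_new])
```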
* Add support for new QAT models
test=develop
Co-Authored-By: Michał Gallus <michal.gallus@intel.com>
Co-Authored-By: Wojciech Uss <wojciech.uss@intel.com>
* fixed FPS results
test=develop
* fix top5 accuracy drop problem
* updated for new QAT models
* skip quantizing average pooling - dirty but working
* add missing pass
* added missing conv+brelu fuse pass
* removed a call to non-existent pass
test=develop
* renamed pass
test=develop
* Adjust finding pooling scale to newest QAT models
* Remove unnecessary code from quantization_mkldnn_pass
* Copy Pooling input scale to output scale in QAT
* Refactor & remove unused code in QAT
* Incorporate fp32 FC into QAT
test=develop
* Enable graph drawing with debug flag
test=develop
* Add tests for QATv2
* Fix paths for QATv2 models
test=develop
* Add option to save transformed int8 qat model
test=develop
* Remove redundant lines from qat mkldnn pass
test=develop
* Delegate disablement of avg pooling to qat
test=develop
* fix CI bug, test=develop
* Follow Wangzhen's Review, test=develop
* Update API.spec
test=develop
* Name False in (is_unsigned, TensorScale) tuple
test=develop
* refactor dygraph,test=develop
* fix failed unittest,test=develop
* polish code,test=develop
* check windows ci error,test=develop
try to fix windows ci error by np.allclose,test=develop
* polish vlog and profiler, test=develop
* try to fix preceding ops order,test=develop
* test transformer in windows ci, test=develop
* use python c-api to speed up tracer.trace,test=develop
* test=develop, fix docker with paddle nccl problem
* test=develop, add ut for debug string and gradient_accumulator
* test=develop, add tests for layer/gradient_accumulator/prepared_op
* test=develop, fix compile error for test_prepared_op
* test=develop, add more ut for dygraph
* test=develop, create API.spec for dygraph api change
* test=develop, refactor names to make them easier to understand
* test=develop, fix multi-GPU failure, add Tracer tests, change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ
* test=develop, fix ut failed on parallel se-resnext
* test=develop, change one more PADDLE_ENFORCE
* support auto prune in dygraph mode
* test=develop, support auto prune
* test=develop, merge develop conflict
* test=develop, fix test_layer and test_tracer ut
* test=develop, fix bug which may cause stop_gradient disabled with a list of backward inputs
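A small dygraph sketch of the fixed behaviour: `stop_gradient` prunes one branch without disabling gradients for the other backward inputs:

```python
import numpy as np
import paddle.fluid as fluid

with fluid.dygraph.guard():
    a = fluid.dygraph.to_variable(np.ones([2, 2], dtype='float32'))
    b = fluid.dygraph.to_variable(np.ones([2, 2], dtype='float32'))
    a.stop_gradient = False
    b.stop_gradient = True          # this branch is auto-pruned in backward
    loss = fluid.layers.reduce_sum(a * 2 + b)
    loss.backward()
    print(a.gradient())             # grads still flow to `a`
```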
Add support for parameter inference when the starts or ends argument is a list containing both integers and tensor Variables;
test=develop,test=document_preview
improve slice op according to review (from hongyu). test=develop
fix slice op according to review: infer_flags, test=develop
fix slice op: improve the overloaded __getitem__ operator to support attrs (starts and ends) being Variables.
test=develop,test=document_preview
fix test_slice_op: add TestSliceOp_decs_dim_6 to resolve conflict with test_slice_ngraph_op. test=develop
add stop_gradient=True when attr(starts) or attr(ends) is a tensor Variable.
test=develop,test=document_preview
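A sketch of the extended interface, assuming elements of `starts`/`ends` may now be Variables:

```python
import paddle.fluid as fluid

x = fluid.data(name='x', shape=[4, 5], dtype='float32')
# a 1-element int32 tensor used as a dynamic start index
s = fluid.layers.fill_constant(shape=[1], dtype='int32', value=1)

# mixed list: a tensor Variable for starts, plain ints elsewhere;
# stop_gradient is set on such tensor attrs, per the notes above
out = fluid.layers.slice(x, axes=[0], starts=[s], ends=[3])
```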
* add to and detach for Variable in dygraph, test=develop
* add detach for Variable in dygraph, test=develop
* add exception check, test=develop
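Typical use of the new method (illustrative):

```python
import numpy as np
import paddle.fluid as fluid

with fluid.dygraph.guard():
    x = fluid.dygraph.to_variable(np.ones([2, 2], dtype='float32'))
    y = x.detach()              # same data, cut off from the autograd graph
    print(y.stop_gradient)      # True
```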
add fl_listen_and_serv op for federated learning; fl_distribute_transpiler adds this op to the pserver program. This op just listens on the endpoint and does sum & scale.
* fix prepare context redundant code problem, optimize executor by caching create_variables
test=develop
* supports collective training in executor
* make fetch_list runnable with variables, add more unittests for use_program_cache
test=develop
* fix comment
test=develop
* use unique name for nccl_id
* supports output to stream in program_to_code
* insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code
* set op role in collective training
* add collective op role
* remove orig file
* add build optimizer by strategy
* add collective strategy
* refine collective strategy
* add multi-process role maker
* refine strategy building factory so that we can easily plug in more strategies
* scale loss grad in collective sgd transpiler
* add support for distributed fc
* code format
* revert some features for dist fc
* add support for distributed fc training
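A sketch of the executor-side changes: fetch by Variable object and reuse the cached program context across steps:

```python
import numpy as np
import paddle.fluid as fluid

x = fluid.data(name='x', shape=[None, 3], dtype='float32')
loss = fluid.layers.mean(x)
exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())

for _ in range(10):
    out, = exe.run(feed={'x': np.random.rand(2, 3).astype('float32')},
                   fetch_list=[loss],        # Variables, not just names
                   use_program_cache=True)   # skip re-creating vars/context
```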
* test=develop
add collective op unittest standard
* test=develop
remove the test_collective directory
* remove slicegather test
* code format for reducescatter
* update attr of shard_index_op
* Modify macro nccl_helper
* remove test without distribute
* macro collective_helper
* macro update
* test=develop
update to support Python 3.5
* test=develop, change GPU memory use to 0.1 when testing
* test=develop
update UT equal func
* test=develop
set flags to 1.5
* test=develop, fix pickle dump on py35
* test=develop
fix divide in slice and add sync_comm_stream
update atol and rtol to 1e-05
rm shard_index op and test
read input from memory instead of from a file
remove origin_program in framework and add i/o in c_sync_calc_stream
* test=develop update unittest sync operator I/O
* Update backward.py:
- If there is no input grad var among all outputs of previous ops, do not append this op into the graph.
- Only apply this strategy during double backward.
* Update some double backward op.
* Update sum_op to judge whether a tensor is empty by numel or IsInitialized().
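The pruning applies when building second-order gradients, e.g. via `fluid.gradients` (a minimal sketch):

```python
import paddle.fluid as fluid

x = fluid.data(name='x', shape=[None, 1], dtype='float32')
x.stop_gradient = False
y = fluid.layers.square(x)
dx, = fluid.gradients(y, x)    # first-order grad
# Double backward: ops none of whose outputs carry a needed grad var
# are no longer appended to the graph.
ddx, = fluid.gradients(dx, x)
```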
Add Pipeline Concurrency Train Mode:
- Cpp: pipeline_trainer & section_worker
- Python: PipelineOptimizer
- Add a new data_feed type: PrivateInstantDataFeed
- Add a test demo of the pipeline trainer; the test model is GNN
- Win32 is not supported yet
Note: the append_batch_size parameter actually prepends. We should
change the name, but for backward compatibility I suggest changing it
in v2.0, not now.
test=develop
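What the prepend looks like in practice:

```python
import paddle.fluid as fluid

# append_batch_size=True actually *prepends* a -1 batch dimension
x = fluid.layers.data(name='x', shape=[784], dtype='float32')
print(x.shape)  # (-1, 784)
```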
* improve the API samples of DataFeeder, memory_optimize and release_memory, test=develop
* update API.spec, test=develop, test=document_preview
* tweak the code format of feed API, test=develop
* update API.spec, test=develop
* improve doc for DataFeeder and default_main_program, test=develop
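The improved DataFeeder sample boils down to this pattern (shapes illustrative):

```python
import numpy as np
import paddle.fluid as fluid

x = fluid.layers.data(name='x', shape=[3], dtype='float32')
feeder = fluid.DataFeeder(feed_list=[x], place=fluid.CPUPlace())
# each sample is a tuple with one entry per variable in feed_list
feed = feeder.feed([(np.random.rand(3).astype('float32'),)
                    for _ in range(4)])
```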
Fix the following API examples (a combined sketch follows the list):
paddle.fluid.scope_guard
paddle.fluid.backward.append_backward
paddle.fluid.cpu_places
paddle.fluid.cuda_pinned_places
paddle.fluid.cuda_places
paddle.fluid.in_dygraph_mode
paddle.fluid.CUDAPlace
paddle.fluid.CPUPlace
paddle.fluid.CUDAPinnedPlace
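A combined sketch touching several of these APIs:

```python
import paddle.fluid as fluid

with fluid.scope_guard(fluid.Scope()):   # run inside a fresh scope
    print(fluid.cpu_places(4))           # four CPUPlace instances
    print(fluid.in_dygraph_mode())       # False in static-graph mode
    place = fluid.CUDAPlace(0) if fluid.is_compiled_with_cuda() \
        else fluid.CPUPlace()
```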
* Fix data and reader related API doc
Review and fix the example code in some reader-related API docs (a brief usage sketch follows the lists).
Fix existing API example code:
paddle.fluid.io.PyReader
paddle.fluid.layers.batch
paddle.fluid.layers.data
paddle.fluid.layers.Preprocessor
paddle.fluid.layers.py_reader
paddle.fluid.program_guard
Add new example code:
paddle.fluid.io.PyReader.decorate_batch_generator
paddle.fluid.io.PyReader.decorate_sample_generator
paddle.fluid.io.PyReader.decorate_sample_list_generator
paddle.fluid.io.PyReader.reset
paddle.fluid.io.PyReader.start
test=develop
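The non-iterable PyReader pattern exercised by these examples, in brief (generator contents illustrative):

```python
import numpy as np
import paddle.fluid as fluid

image = fluid.layers.data(name='image', shape=[784], dtype='float32')
reader = fluid.io.PyReader(feed_list=[image], capacity=4, iterable=False)

def batch_generator():
    for _ in range(4):
        # a batch: list of per-sample tuples, one entry per feed var
        yield [(np.random.rand(784).astype('float32'),) for _ in range(2)]

reader.decorate_sample_list_generator(batch_generator)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
reader.start()
try:
    while True:
        exe.run(fetch_list=[image])
except fluid.core.EOFException:
    reader.reset()
```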
* Add changes to API.spec after changing doc.
test=develop
* Add blanks after python example code
test=develop
* Add blank line at py_reader example code
test=develop
* Merge API.spec
test=develop
* Modify reader.py based on reviewer's comment
test=develop
* Modify API.spec after changing doc
test=develop
* Change reader.py based on reviewer's comment
* Modify example code of decorate_sample_generator
test=develop
* Fix example code of PyReader based on reviewer
test=develop