* bugfix for warpctc
* fix warpctc commit id
* fix invalid WARPCTC_WITH_HIP
* Add logs to find out why libwarpctc.so cannot be dlopened
* fix warpctc commit id
* fix unit test test_warpctc_op
* Optimize failure log for dlopen
* Delete extra changes
* fix warpctc commit id
* Add is_compiled_with_rocm for test_warpctc_op
* fix warpctc commit id
* Cancel the dlopen failure-reason optimization and move it to the next PR, since it makes Windows CI fail
* fix code style problems
* support multihead_matmul_fuse_pass_v3
* fix compile problems
* embedding_eltwise_ln pass support lookup_table_v2
* support matmul and matmul_v2 in QKV matmul
* add deprecated for softmax_with_cross_entropy, test=develop
* test for deprecated in english doc, test=develop
* test deprecated for softmax_with_cross_entropy in english doc, test=develop
* fix readme and English doc for cross_entropy, test=develop
* rm test for softmax_with_cross_entropy deprecated, test=develop
* update readme for CrossEntropyLoss, test=develop
* fix readme format, test=develop
* fix readme format for cross_entropy, test=develop
* add softmax_switch and fix softlabel for cross_entropy, test=develop
* 1) restore softmax_with_cross_entropy in fluid 2) change softmax_switch to use_softmax 3) add a soft-label example for cross_entropy (see the sketch after these notes), test=develop
* fix Example number for cross_entropy, test=develop
* fix code format, test=develop
* fix for CI-Coverage, test=develop
* fix ci-coverage for Non-ASCII character '\xe2' in file, test=develop
* fix ci-coverage for Non-ASCII character '\xe2' in nn.layer.loss.py, test=develop
* update description for doc when use_softmax=False, test=develop
* fix some docs and code example for cross_entropy, test=develop
* delete redundant description for soft_label parameter of cross_entropy, test=develop
* fix some comment for test_cross_entropy_loss.py, test=develop
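A minimal sketch of the soft-label usage mentioned above, assuming the `paddle.nn.functional.cross_entropy` signature with the `soft_label` and `use_softmax` keyword arguments introduced in this series of changes:
```
import paddle
import paddle.nn.functional as F

logits = paddle.rand([4, 10])             # unnormalized scores
labels = F.softmax(paddle.rand([4, 10]))  # soft labels: each row sums to 1

# use_softmax=True (the renamed softmax_switch) applies softmax to the
# logits inside the loss; soft_label=True treats labels as probabilities.
loss = F.cross_entropy(logits, labels, soft_label=True, use_softmax=True)
```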
* add shape-related constructor and reshape warning
* change line numbers to fit the UT
* adjust the UT to fit
* remove useless code
* call resize directly in constructor
* add roi_align_plugin
* add roi align unit_test
* add roi align serialization
* remove roi align static plugin because of batch dim issue
* refine roi align unittest and add fp16/serialization
* add trt roi align condition to op_teller
* refine error message
* remove unnecessary reshape layer
* trt affine channel converter
* add trt affine channel base test
* add trt affine channel NHWC
* remove asterisk for python2 compatibility
* fix rebase
* move LoDTensor to Tensor
* add dbg info
* affine channel converter only support NCHW
* scale and bias are parameters; use the create_parameters api
* reduce test input size so it does not exceed the CI time limit
* refine affine channel unittest and add serialization/dynamic test
* change super to InferencePassTest for python2 compatibility
* fix affine channel fp16 serialize setting
* add multiclass_nms
* add multiclass_nms unittest
* add default enable_tensorrt_oss option
* refine multiclass nms unittest and add serialization/dynamic test
* change super to InferencePassTest for python2 compatibility
* refine multiclass nms unittest
* move out dynamic shape test due to the CI time limit
Our old `loop_body` function may return a single element when `loop_vars` contains only one element, which can cause a bug. The key point of this PR is forcing `loop_body` functions to always return a tuple.
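A minimal sketch of the normalization, using a hypothetical `ensure_tuple` helper name:
```
# Hypothetical helper illustrating the fix: wrap whatever a generated
# loop_body returns so callers can always unpack it as a tuple.
def ensure_tuple(loop_body_return):
    if isinstance(loop_body_return, tuple):
        return loop_body_return
    return (loop_body_return,)

# With a single loop variable, the body used to return the element itself;
# after the fix, unpacking like `[i] = ...` always sees a one-element tuple.
```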
* fix tensorrt output variable reshape
* move padding shape x 1 x 1 in ernie to qkv and fc
* update layer name
* fix softmax when input is dynamic; fc does not pad any more
* fix varlen
* move fc x_dim assert to op_teller
* nearest_interp op converter w/ dynamic/static
* fix data_layout include
* add trt nearest unit_test
* add nearest_interp NHWC test
* update trt nearest interp nhwc testcase
* remove asterisk for python2 compatibility
* add empty line to prevent conflict
* change the priority of out_h, out_w
* Support loading parameters from checkpoint to save quantized model
* Fix the unittest test_moving_average_abs_max_scale_op
* Add unittest of save_quantized_model from checkpoint
* Add comments to explain the function
* add softmax_switch for softmax_with_cross_entropy_op, test=develop
* delete using EigenMatrix in softmax_with_cross_entropy_op.h, test=develop
* add REGISTER_OP_VERSION for softmax_switch attr of softmax_with_cross_entropy_op, test=develop
* add precision on mac
* add judgment logic
* match file_ut.json on mac
* fix code format error
* fix error caused by the length of ut_lists exceeding the limit
* fix format error,notest,test=cpu
* fix code format error
* add a Windows check in get_pr_ut
Fix Read-Only Attribute as while_loop Output:
Usually, our convert_while_loop call looks like:
```
[a, b, c] = paddle.jit.dy2static.convert_while_loop(
condition_name, body_name, [a, b, c])
```
where a, b, c are in loop_var_names.
However, if loop_var_names contains a property such as foo.x, we cannot
assign the attribute as an output of convert_while_loop because a Python
property is a kind of read-only attribute. To handle this case, we replace
the attributes that are outputs of convert_while_loop with generated
variables; then, if we find at runtime that the attribute is not read-only,
we assign to it. The created statements are like:
```
[a, b, __attribute_variable_1] = paddle.jit.dy2static.convert_while_loop(
condition_name, body_name, [a, b, foo.x])
if not isinstance(getattr(type(foo), 'x', None), property):
    foo.x = __attribute_variable_1
```
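A small self-contained demonstration (with a hypothetical class `Foo`) of why the runtime check is needed:
```
class Foo:
    @property
    def x(self):  # read-only: no setter is defined
        return 1

foo = Foo()
__attribute_variable_1 = 2
# A direct `foo.x = 2` would raise AttributeError, so the generated guard
# assigns only when `x` is not a property on the object's type:
if not isinstance(getattr(type(foo), 'x', None), property):
    foo.x = __attribute_variable_1  # skipped here; runs for plain attributes
```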
* Decrease threshold for failed ut retry
* upgrade retry method
* second upgrade of retry method
* fix error
* Remove the comment lines
* test for modified_retry_times
* fix error
* fix some errors
* fix error
* remove test content
* fix error
* Reduce duplicate code
* fix bug when more than 10 UTs fail
* fix bug when more than 10 UTs fail on mac
* support TRT serialization when loading a model from memory
* delete conv_bn_fuse_pass before TensorRT, with which the serialized TRT engine id is not stable
* Revert "delete conv_bn_fuse_pass before tensorrt, with which trt serialize engine id is not stable"
It caused a performance degradation; this will be fixed in the future.
This reverts commit fa6cd17e60b15df351efda379ddd00e9e9c1fea9.
* add delete conv_bn
* delete path when delete_cache_files
* [Custom OP] add PD_THROW and PD_CHECK for user error messages
* PD_THROW and PD_CHECK, fix comment
* fix Windows error message
* fix CI
* remove remove_unsupport_dtype
* remove test dtype
* add more include
* change dtype.h's enum to an enum class to avoid conflict with the inference lib
* make the enum an enum class
* remove additional test
* merge develop
* polish code
* add cache for VariableWrapper
* modify arg names and vlog level
* format code style
* add log when setting cache to variable_wrapper
* add comment to VariableWrapper cache
* format code style
* add simple attr support and test
* add int, float attr support
* support other attribute
* add custom attrs test in cmake
* polish details
* fix test failed
* add backward test
* update test flags
message(FATAL_ERROR "cmake ${CMAKE_VERSION} is not supported when WITH_GPU=ON because of bug https://cmake.org/pipermail/cmake/2018-September/068195.html. "
For more information about installation, please see [Quick Install](https://www.paddlepaddle.org.cn/install/quick)
Now our developers can acquire Tesla V100 online computing resources for free. If you create a program on AI Studio, you will obtain 12 hours per day to train models online. If you keep doing so for five consecutive days, you will receive an extra 48 hours. [Click here to start](https://ai.baidu.com/support/news?action=detail&id=981).
Now our developers can acquire Tesla V100 online computing resources for free. If you create a program on AI Studio, you will obtain 10 hours per day to train models online. [Click here to start](https://aistudio.baidu.com/aistudio/index).