Zhaolong Xing
bcddefef39
[Fix Ut]: fix inference ut which exist bug on windows. ( #25814 )
...
* fix windows test
test=develop
* fix ci
test=develop
5 years ago
lilong12
5f30e57cdd
fix test_pipeline, test=develop ( #25808 )
...
* fix test_pipeline, test=develop
5 years ago
Chen Weihang
d47304e6d9
Refine paddle error stack format ( #25790 )
...
* refine error stack format
* polish compile traceback format
* polish detail format
5 years ago
tangwei12
caa90a6510
Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) ( #22957 )
...
* Integrated Trainer of Parameter Server
5 years ago
hong
c2a21ca9c9
Fix dygraph grad bugs ( #25781 )
...
* fix double grad visitid unit; test=develop
* change name hash_pair to HashPair; test=develop
* follow comment; test=develop
5 years ago
cc
42189be67b
[Quant] Remove the output for moving_average_abs_max_scale op ( #25697 )
...
* Remove the output for moving_average_abs_max_scale op, test=develop
5 years ago
Dong Daxiang
a96d54ac19
Generate final strategy ( #25782 )
...
* refine strategy compiler and meta optimizers
make async as a_sync
5 years ago
Chen Weihang
2469b578f5
Unified paddle error format when catch system signal ( #25765 )
...
* unified signal error format
* refine signal error message
5 years ago
Zhou Wei
b484a59c39
fix copy file random fail on windows ( #25731 )
5 years ago
Chen Weihang
23d1228c4d
remove ProgramTranslator.save_inference_model ( #25740 )
...
* remove ProgramTranslator.save_inference_model
* adapt save_quantized_model
* revert buffer check implementation
* remove useless import function
5 years ago
Chen Weihang
1b3081b1b4
Simplify BufferedReader to improve DataLoader performance ( #25648 )
...
* simplify buffered reader to improve DataLoader performance
* fix 22 failed unittests
* fix cuda pinned context condition
* fix test_reader_reset failed
* fix two failed unittests
* change unittest place
* polish error message
* polish cast op GetExpectedKernelType
* remove debug info in unittest
5 years ago
Pei Yang
55b6205ddf
add set_mkldnn_cache_capacity python api( #25524 )
5 years ago
Zhou Wei
e0a9115e28
fix random compile failure due to missing file ( #25661 )
5 years ago
Pei Yang
eef98b7f86
add macro check for using TRT api dynamicRangeIsSet() ( #25694 )
5 years ago
Pei Yang
f82baed866
fix trt instance norm plugin on gcc8. test=develop ( #25730 )
5 years ago
Dong Daxiang
920d998f1e
add more settings for distributed strategy ( #25685 )
...
* add more settings for distributed strategy
Basically, DistributedStrategy has several parts of configurations:
- BuildStrategy: the same as paddle.fluid.BuildStrategy, but the distributed arguments are moved out of BuildStrategy
- ExecutionStrategy: the same as paddle.fluid.ExecutionStrategy
- collective communication configs: nccl_comm_num, hierarchical allreduce and so on
- distributed algorithms: async_update(mainly used in PS), lars, lamb and so on
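The four groups of settings listed above can be pictured as a single container object. The following is only a minimal Python sketch of that grouping, not the actual `paddle.fleet.DistributedStrategy` implementation: the class name `StrategySketch`, the field layout, and all default values are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StrategySketch:
    """Hypothetical container mirroring the four groups in the commit message."""

    # Groups 1 and 2: plain dicts standing in for BuildStrategy and
    # ExecutionStrategy options (the real objects are not modelled here).
    build_strategy: dict = field(default_factory=dict)
    execution_strategy: dict = field(default_factory=dict)

    # Group 3: collective communication configs.
    nccl_comm_num: int = 1
    hierarchical_allreduce: bool = False

    # Group 4: distributed algorithms.
    async_update: bool = False
    lars: bool = False
    lamb: bool = False

    def enabled_algorithms(self):
        """Return the names of the distributed algorithms that are switched on."""
        names = ("async_update", "lars", "lamb")
        return [name for name in names if getattr(self, name)]


strategy = StrategySketch(async_update=True, lamb=True)
print(strategy.enabled_algorithms())
```

Keeping the communication and algorithm switches outside the build-strategy dict reflects the commit's note that the distributed arguments are moved out of BuildStrategy.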
5 years ago
Sylwester Fraczek
1aaa26f102
add dnnl sigmoid (logistic) activation ( #25745 )
5 years ago
Chen Weihang
c34c80d302
Polish framework error message part3 ( #25701 )
...
* polish framework error message part3
* polish details
* fix error message print error
5 years ago
arlesniak
e52df3b125
Added DNNL cache management for DyGraph ( #25624 )
...
* Added DNNL cache management for DyGraph
* move FLAGS_use_mkldnn to more general CMakeLists, make use of the flag in ClearGradients
* missing file
* Fixes after review
* Bringing back original idea of place for 'use_mkldnn' flag to be accessible from platform and imperative.
* Removed duplicate and added docs
* Fixes for CI
5 years ago
wangchaochaohu
1e4ab728fb
refine the concat Op for API 2.0 test=develop ( #25307 )
5 years ago
Zhen Wang
cea5086853
Fix the double grad bug for the star gan. ( #25655 )
...
* fix the double grad bug for the star gan. test=develop
* update the retain_graph parameter doc. test=develop
* add the unit test for the retain_graph parameter. test=develop
5 years ago
Chen Weihang
364cc53618
Polish paddle fluid framework error message - part2 ( #25667 )
...
* polish framework error msg part2
* polish details
5 years ago
Adam
98899b73d2
Fix FC + GRU fuse pass ( #25687 )
5 years ago
wanghuancoder
1917b38099
fix some errmsg report,in framework/ir/, about 21 files ( #25525 )
...
* fix error msg report in ir/, about 19 files, test=develop
* modified some unclear descriptions, test=develop
* modified some unclear descriptions, test=develop
* modify unit test pass_test.cc, because the error report in pass.cc is used by pass_test.cc, test=develop
5 years ago
Leo Chen
4ec1251a1e
Refine squeeze, test=develop ( #25281 )
...
* refine squeeze, test=develop
* update squeeze, test=develop
* refine compile-time infershape, test=develop
* add more unittest, test=develop
* follow comments, test=develop
* add update_api, test=develop
* follow comments, test=develop
5 years ago
joanna.wozna.intel
e5bbffa84c
Add NOMINMAX define due to windows.h max/min macro conflict ( #25637 )
...
test=develop
5 years ago
cnn
70cee22fde
New features, add sinh and cosh op, test=develop ( #25495 )
...
* New features, add sinh and cosh op, test=develop
* remove duplicate test function and remove out parameters, test=develop
* Add out parameters temporarily, remove later. test=develop
* remove out args, PR 25570, test=develop
* remove TestParameter, test=develop
* add test api for static dygraph, test=develop
* add backward unittests for sinh and cosh, test=develop
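For reference, the two functions and the relationship a backward test relies on can be sketched in plain Python. This is an illustration of the mathematics only (sinh x = (e^x - e^-x)/2, cosh x = (e^x + e^-x)/2, and d/dx sinh x = cosh x), not the Paddle operator code; the helper `numeric_grad` is a hypothetical name.

```python
import math


def sinh(x):
    """Hyperbolic sine from its exponential definition."""
    return (math.exp(x) - math.exp(-x)) / 2.0


def cosh(x):
    """Hyperbolic cosine from its exponential definition."""
    return (math.exp(x) + math.exp(-x)) / 2.0


def numeric_grad(f, x, eps=1e-6):
    """Central-difference estimate of f'(x), as a gradient check would use."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


x = 0.3
# The analytic gradients are d/dx sinh = cosh and d/dx cosh = sinh.
print(abs(numeric_grad(sinh, x) - cosh(x)) < 1e-6)
print(abs(numeric_grad(cosh, x) - sinh(x)) < 1e-6)
```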
5 years ago
Zhang Ting
a1350744eb
register fp16 kernel, test=develop ( #25630 )
5 years ago
mapingshuo
5453a912fe
add fp64 support in sequence_pool, test=develop ( #25662 )
...
add fp64 support in sequence_pool, test=develop
5 years ago
Leo Chen
417b243968
fix best_fit_allocator_test on windows, test=develop ( #25650 )
...
* fix best_fit_allocator_test on windows, test=develop
* enable best_fit_allocator_test and test_math_op_patch_var_base, test=develop
5 years ago
GaoWei8
6e86fd3750
fix concat dimension ( #25606 )
...
Fix the condition of concat dimension judgment.
5 years ago
donproc
95fa383df2
optimize embedding cuda kernel lookup_table_v2,test=develop ( #25587 )
5 years ago
石晓伟
7206417259
supports xpu runtime, test=develop ( #25554 )
...
* update ResetHolder, test=develop
* add TensorShare for lite engine, test=develop
* tensor data changed from copying to sharing, test=develop
* supports xpu runtime, test=develop
* fix code styles, test=develop
5 years ago
Chen Weihang
dfb3ae1b9b
Polish some error message in framework holder - Part 1 ( #25509 )
...
* polish some error message in framework, test=develop
* fix unittest error, test=develop
* replace PADDLE_ENFORCE, test=develop
* polish details based review comment, test=develop
5 years ago
Zhang Ting
30d1ff3bb4
call cublasGemmStridedBatchedEx when using fp16, test=develop ( #25553 )
5 years ago
Zhaolong Xing
9df18b08f3
Disable windows static library generation ( #25593 )
...
* fix windows ci
test=develop
* fix ci error
5 years ago
Aurelius84
ca1185d06b
[Dy2Stat] Fix scope in run_program_op ( #25579 )
...
* add reinforcement learning model test=develop
* align backward test=develop
* add gym in paddle_build.sh test=develop
* rm pip install in script test=develop
* refine paddle_build.sh test=develop
* fix sed error in macOS test=develop
* polish code test=develop
* fix scope problem
* refine code by reviewer comment
5 years ago
Chen Weihang
a6abd92dfd
Polish install error hint message ( #25531 )
...
* polish install error hint msg, test=develop
* fix variable error, test=develop
* polish hint message again
5 years ago
wanghuancoder
9b46fe0440
fix some errmsg report,in framework/ir/, about 5 files ( #25539 )
...
* fix error msg report in ir/, about 5 files, test=develop
* fix error msg report in ir/, about 5 files, test=develop
* fix error msg report in ir/, about 5 files, test=develop
5 years ago
Dong Daxiang
e657d7062d
fleet base initial implementation and the API ( #25442 )
...
refactor fleet api under paddle.fleet
update DistributedStrategy
5 years ago
Jacek Czaja
7dbc441eab
[oneDNN] cache cosmetics improvement ( #25576 )
5 years ago
hong
e362095e45
fix softmax with cross entropy out of bound; test=develop ( #25549 )
5 years ago
Huihuang Zheng
d8fe517bf8
Add Support for SelectedRows for Transpose OP and Fix a Bug That SelectedRows Cannot be Supported in SimNet ( #25536 )
...
This PR fixes a bug where SelectedRows was not supported in SimNet. The cause is that the dygraph basic_engine did not copy a var's type when the var needed to be accumulated during backward. So when a var is SelectedRows and needs to be accumulated, as in SimNet, which calls the net twice, the var's type is changed to the default LoDTensor and the bug occurs. The fix is to copy the type as well.
Without this PR, accumulated SelectedRows parameters in dygraph are changed into LoDTensor. After fixing SelectedRows support in SimNet, `test_imperative_lod_tensor_to_selected_rows` failed with an error that SelectedRows was not supported by the Transpose OP. To fix that too, this PR also adds SelectedRows support to the Transpose OP.
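The failure mode described above can be reproduced with a toy model. The sketch below is not the basic_engine code: `Var`, `make_accumulator`, and the string type tags are hypothetical stand-ins that only show how a freshly created accumulation variable falls back to the default type unless the type is copied explicitly.

```python
from dataclasses import dataclass

LOD_TENSOR = "LoDTensor"
SELECTED_ROWS = "SelectedRows"


@dataclass
class Var:
    """Toy variable: a value plus a type tag that defaults to LoDTensor."""

    value: float = 0.0
    type: str = LOD_TENSOR


def make_accumulator(grad, copy_type):
    """Create the temporary var used to accumulate `grad` during backward."""
    tmp = Var(value=grad.value)  # the type silently defaults to LoDTensor
    if copy_type:
        tmp.type = grad.type  # the fix: carry the original type over
    return tmp


grad = Var(value=1.0, type=SELECTED_ROWS)
print(make_accumulator(grad, copy_type=False).type)  # type information is lost
print(make_accumulator(grad, copy_type=True).type)   # type information is kept
```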
5 years ago
Wilber
848aca7ae8
[CI] [Lite-Subgraph] CI add lite subgraph check. ( #25346 )
5 years ago
wanghuancoder
e65c5b8e83
fix some errmsg report, in framework/ir/ ( #25471 )
...
* fix paddle/fluid/framework/ir/ error msg report, test=develop
* modify error msg report in ir/, about errortype, grammar, supplementary info, test=develop
* modified some unclear descriptions, test=develop
* Modify the problem that report msg is less than 20 characters, test=develop
5 years ago
Shibo Tao
71c71e684c
fix logical_* ops' doc ( #25479 )
...
* fix doc of logical_* op.
* fix doc of op pow.
* fix comment syntax error
* fix operator reciprocal demo.
* fix logical_* ops' doc. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
5 years ago
Aurelius84
4717bdbcfb
Fix hang in seq_topk_avg_pooling op ( #25522 )
...
* fix topk_avg_pool hang test=develop
* refactor get_topk_pos test=develop
* add check of channel_num and num_k test=develop
* add TopKPosPaddingId test=develop
5 years ago
LielinJiang
7129f544f0
Add bilateral_slice op ( #25401 )
...
* add bilateral slice op
5 years ago
GaoWei8
c10dcff12d
refine PADDLE_ENFORCE ( #25456 )
...
* Refine PADDLE_ENFORCE in paddle/fluid/platform
test=develop
5 years ago
wanghuancoder
6c0982b942
fix some errmsg report, in framework/ir/mkldnn ( #25467 )
...
* fix paddle/fluid/framework/ir/mkldnn/ error msg report, test=develop
* modify error msg report, about errortype, grammar, supplementary info, test=develop
* modified some error descriptions, test=develop
5 years ago
wanghuancoder
fce6466217
fix some errmsg report, in framework/ir/ subdir(memory,optimizer,multi_device) ( #25460 )
...
* fix paddle/fluid/framework/ir/multi_devices_graph_pass/ error msg report, test=develop
* fix paddle/fluid/framework/ir/memory_optimize_pass/ error msg report, test=develop
* fix paddle/fluid/framework/ir/fuse_optimizer_ops_pass/ error msg report, test=develop
* fix paddle/fluid/framework/ir/memory_optimize_pass/ error msg report about PADDLE_ENFORCE, test=develop
* modify error msg report, about errortype, grammar. test=develop
* modify error msg report, about PADDLE_ENFORCE to PADDLE_ENFORCE_XXX, test=develop
* modify error msg report, about PADDLE_ENFORCE to PADDLE_ENFORCE_XXX, and %s to %d, test=develop
* modified some error descriptions, test=develop
5 years ago
Zhang Ting
ca725c82f2
improve fp16 performance of slice_grad, test=develop ( #25523 )
5 years ago
yaoxuefeng
5d3766ff3d
modify flip test=develop ( #25312 )
...
According to paddle 2.0 standard
1, change flip api attr name 'dim' to 'axis'.
2, support empty axis
3, change example code to imperative mode.
5 years ago
Chen Weihang
41d2247275
[Dy2static] Refactor ProgramTranslator save_inference_model API ( #24989 )
...
* experimental refactoring, test=develop
* add TranslatedLayer & remove StaticModelRunner, test=develop
* revert tracedlayer change, test=develop
* fix test_mnist unittest error, test=develop
* add doc & examples, test=develop
* polish doc details, test=develop
* add imperative.jit module, test=develop
* change TranslatedLayer pos, test=develop
* adjust jit module import path, test=develop
* polish doc based review result
* add SaveLoadConfig.separate_params to save params separately
* add Layer.buffer support, test=develop
* polish doc details based review result, test=develop
* polish details based review comments, test=develop
* add empty str check for param, test=develop
* add unittests, test=develop
* polish details based review comment, test=develop
* remove blanks in comment, test=develop
* polish doc details, test=develop
* update imperative doc link, test=develop
* add api attr for load, test=develop
5 years ago
Pei Yang
43f9f180e5
Add api to clear intermediate tensors in AnalysisPredictor ( #25069 )
...
* add api to clear intermediate tensors in analysis predictor. test=develop
* add python api. test=develop
5 years ago
yaoxuefeng
aaa7cbd56f
modify trace api test=develop ( #25397 )
5 years ago
Huihuang Zheng
f9ac5fb992
[Dy2stat] Fix Memory Optimization in run_program_op and Add SimNet as Unit Test ( #25383 )
...
Add Similarity Net as a unit test. The unit test exposed three problems:
1. The run_program_op has a memory optimization error when running a dy2stat net multiple times.
2. The support for SelectedRows can cause problems in dy2stat.
3. The return grammar has a problem.
This PR fixes problem 1 but only modifies code for problems 2 and 3 to keep the PR small. Those two problems will be fixed in the next PR(s).
5 years ago
yaoxuefeng
c42d662e2a
modify roll test=develop ( #25321 )
5 years ago
Zhen Wang
548cdbc544
Quantization-aware training for dygraph ( #24634 )
...
* Add the imperative quantization aware training.
* This is the python part of Imperative QAT. test=develop
5 years ago
Chen Weihang
0b54d54fd8
Fix index overflow bug of the CUDA kernel loop increment ( #25435 )
...
* fix softmax_with_cross_entropy cuda kernel overflow bug, test=develop
* replace old macro & for condition, test=develop
* polish details, test=develop
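The kind of overflow named in the title can be illustrated without a GPU. The sketch below simulates a grid-stride loop whose index is a 32-bit signed integer: close to the int32 limit, adding the stride wraps to a negative value, so the `i < n` check still passes and the loop would go on to touch invalid indices. This is a Python simulation under stated assumptions, not the kernel code from the PR; `wrap_int32` and `visited_indices` are hypothetical helpers, and the wraparound models what typically happens in practice rather than anything the C++ standard guarantees.

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def wrap_int32(value):
    """Wrap a Python int the way two's-complement 32-bit arithmetic would."""
    return (value - INT32_MIN) % (2 ** 32) + INT32_MIN


def visited_indices(start, n, stride, use_int64, max_steps=4):
    """Return the first few indices a grid-stride loop would visit."""
    visited = []
    i = start
    while i < n and len(visited) < max_steps:
        visited.append(i)
        # A 64-bit index cannot overflow here; a 32-bit one wraps around.
        i = i + stride if use_int64 else wrap_int32(i + stride)
    return visited


n = INT32_MAX
start = INT32_MAX - 100
stride = 1024
print(visited_indices(start, n, stride, use_int64=True))   # stops after one index
print(visited_indices(start, n, stride, use_int64=False))  # continues with negative indices
```

Widening the loop index to 64 bits is one common way to avoid this; whether the PR does exactly that is not stated in the commit message.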
5 years ago
zlsh80826
e528392de9
[Paddle-TRT] SkipLayernorm vectorized memory optimization ( #25117 )
...
* add explicit specialization
* add skiplayernorm vector load if available
* test=develop
5 years ago
Chen Weihang
4061aa6488
Polish ParallelExecutor exception process logic ( #25449 )
...
* polish pe exception process logic, test=develop
* fix unittest, test=develop
* add unittests, test=develop
5 years ago
Jeng Bai-Cheng
fc93266b0a
Improve qkv transpose performance ( #23919 )
...
Use vector instruction (LDG.128) to improve qkv transpose. It provides 1.4X speedup at same GPU base frequency.
test=develop
5 years ago
zhupengyang
5b573c58e2
randperm API: remove out, device, stop_gradient; add name ( #25410 )
5 years ago
Chen Weihang
7be285a66f
remove useless property, test=develop ( #25461 )
...
remove useless property
5 years ago
Jacek Czaja
a5d1592f6c
Added missing oneDNN format ( #25450 )
...
test=develop
5 years ago
Chen Weihang
172d4ecb6c
remove WITH_DSO compile option ( #25444 )
5 years ago
Zhen Wang
bb45af02ac
add the c++ part of Imperative QAT. test=develop ( #25446 )
5 years ago
Jacek Czaja
050a9bf79d
[oneDNN] LRN cleanup ( #25416 )
5 years ago
GaoWei8
1974aadcf0
fix concat shape error ( #25414 )
...
* fix concat shape error
test=develop
5 years ago
tangwei12
4b3778a3ee
Revert/barrier for sync ( #25417 )
...
* add retry for prefetch
* Revert "Fix/sync barrier (#25016 )"
This reverts commit be6a315fbd.
* reopen dist UT, test=develop
* remove fl UT, test=develop
5 years ago
ceci3
52be62c5ae
fix instance norm in dy ( #24717 )
...
* fix bn & in in dy, test=develop
* update instance_norm,test=develop
* fix bugs,test=develop
* add more case in unittest,test=develop
* fix,test=develop
* fix,test=develop
5 years ago
lilong12
e39aa70ec7
add the support for pipeline ( #24560 )
...
* add device_worker for pipeline, test=develop
5 years ago
hong
70d7d07fea
catch bad alloc exception ( #25140 )
...
* catch bad alloc exception; test=develop
* add unittest; test=develop
* move bad alloc catch to the first place; test=develop
* polish error message; test=develop
* polish error message; test=develop
* add mutex header; test=develop
5 years ago
gongweibao
80f1c50738
Fix typo in interface. ( #24779 )
5 years ago
Zhaolong Xing
7b7e605189
[Fix BUGs]: fix multihead matmul pass's unstable bug ( #25123 )
...
* fix multihead matmul's instability
test=develop
* fix multihead matmul bug
test=develop
* fix coverage problem
test=develop
5 years ago
zhupengyang
eb3173e2b6
rand API: remove out, device, stop_gradient; add name ( #25246 )
5 years ago
GaoWei8
ea7e532598
Refine PADDLE_ENFORCE ( #25369 )
...
* refine PADDLE_ENFORCE
test=develop
5 years ago
zhupengyang
6de75082cb
fix test_hsigmoid windows ci ( #25311 )
5 years ago
Dong Daxiang
d5e40d1ba9
Paddle fleet distributed strategy ( #25379 )
...
* add paddle.fleet.DistributedStrategy for 2.0
5 years ago
WuHaobo
f593c3fb2f
fix the formula of floor OP and ceil OP ( #25292 )
5 years ago
Wojciech Uss
d0a921ba98
Quant2 updates and fixes ( #25313 )
5 years ago
Zhang Ting
bc7610583b
use eval() to improve CPU performance ( #25243 )
5 years ago
lilong12
3d96601b82
modify pipeline optimizer to only support the mode of sync pipeline training ( #25065 )
...
* modify pipeline optimizer, test=develop
5 years ago
Kaipeng Deng
74468bf428
add mish op. ( #24565 )
...
* add mish op. test=develop
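Mish is usually defined as mish(x) = x · tanh(softplus(x)), with softplus(x) = ln(1 + e^x). Assuming the op follows that standard definition, a scalar reference version might look like the sketch below. It is not the Paddle kernel, and the numerically stable form of softplus is a choice made here for the illustration.

```python
import math


def softplus(x):
    """ln(1 + e^x), written so that large |x| does not overflow exp()."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Scalar Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


for value in (-2.0, 0.0, 1.0, 50.0):
    print(value, mish(value))
```

For large positive inputs the function approaches the identity, and for large negative inputs it approaches zero.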
5 years ago
Chen Weihang
f07b25d8e5
fix DataLoader.generator using error, test=develop ( #25355 )
5 years ago
GaoWei8
fb70682f00
fix PADDLE_ENFORCE ( #25297 )
...
* fix PADDLE_ENFORCE and refine the description
test=develop
5 years ago
Yang Zhang
6d6efafeeb
Add `matrix_nms_op` ( #24400 )
...
* Add `matrix_nms_op`
test=develop
* Make ci happy
test=develop
* Exit early when no detection
test=develop
* Fix license year
test=develop
* Output index as well
test=develop
* Match nms2 lod behavior and add `return_index` flag
test=develop
* Make CI happy
test=develop
* Fix wording
test=develop
5 years ago
Chen Weihang
5a959f6e6e
Refactor dynamic dso search functions ( #25214 )
...
* refactor dynamic dso search func, test=develop
* polish details, test=develop
* polish detail based review comments, test=develop
* revert string type change, test=develop
5 years ago
Jacek Czaja
17c751bec6
[oneDNN] Fix to #25078 ( #25256 )
5 years ago
MRXLT
3b8f0a64c2
Encryption infer ( #25119 )
...
* add encrypt api for inference lib
5 years ago
Wilber
4474fc1033
fix compile on windows. test=develop ( #25310 )
5 years ago
Aurelius84
bc2bd3c1ed
modify into eager_tmp of Base Class test=develop ( #25323 )
5 years ago
Chengmo
e85fcaa712
Fix fluid.embedding in Distributed Training ( #25174 )
...
* test=develop, fix_embedding
5 years ago
Aurelius84
494cb36d09
Modify tmp var name prefix in dygraph ( #25280 )
...
* Modify tmp var name prefix in dygraph test=develop
* refine comment test=develop
5 years ago
Wilber
0371cf6f94
fix compile for lite subgraph. test=develop ( #25285 )
5 years ago
Yiqun Liu
c00f827843
Avoid data transforming ShapeTensor from CPU to GPU in fill_constant op. ( #25267 )
5 years ago
Wojciech Uss
23a4f54b73
rename qat into quant ( #24948 )
...
test=develop
5 years ago
123malin
f1a9593d69
test=develop, bug fix for index_select and roll op ( #25251 )
5 years ago
FDInSky
c2e072587c
test=develop fix generate_proposals's error ( #25227 )
5 years ago
Sylwester Fraczek
36abeff44f
adding elementwiseadd quantization ( #25178 )
5 years ago
Wojciech Uss
56fa3880e3
rename qat into quant in filenames only ( #25194 )
...
test=develop
5 years ago
Wilber
4c964abdf7
support build on arm. test=develop ( #25212 )
5 years ago
Wilber
f78e161ea3
remove paddle_use_kernel and paddle_use_op. test=develop ( #25189 )
5 years ago
liym27
1458cc0c68
Fix bug: Don't check dims if contain_unknown_dim of cross_entropy_grad_op in compile time ( #25221 )
5 years ago
liu zhengxi
68e93d8a17
Fix beam_search InferShape ( #25169 )
...
* fix beam_search infershape, test=develop
* fix beam search op unittest, test=develop
5 years ago
Chen Weihang
353ea9e8ad
Add default cudnn lib path ( #25175 )
...
* add default cudnn lib path, test=develop
* change default path in func, test=develop
* move to linux branch, test=develop
* fix var error in other plat, test=develop
5 years ago
Leo Chen
ff5be2fb77
Refine error message in memory folder ( #25095 )
...
* refine PADDLE_THROW, test=develop
* refine error msg, test=develop
* refine cuda error, test=develop
* follow comments, test=develop
* fix compile problem, test=develop
* fix bug, test=develop
5 years ago
Adam
bd0b38e671
Refactor of conv fp32 oneDNN operator ( #25137 )
...
* Refactor of conv fp32 oneDNN operator
test=develop
* Formatting fix
test=develop
* Return Enforces
test=develop
* GetWeights improvements
test=develop
5 years ago
Pei Yang
b2f5a149e7
[Paddle-TRT] Better Paddle-TensorRT support for PaddleSlim quant models ( #25097 )
...
* Paddle-TensorRT support slim QAT. test=develop
* add comments. test=develop
* use RenameInput instead of ResetInputs. test=develop
5 years ago
Tao Luo
2996315fc9
fix profiler_test on win32 ( #25073 )
...
* remove disable profiler_test on win32
* add log
* enlarge the elapsed time
* Revert "add log"
test=develop
5 years ago
Shibo Tao
19c4db1b56
don't re-generate header file if content doesn't change ( #25130 )
...
* don't re-generate header file if content doesn't change. test=develop
* add copy_if_different function. test=develop
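The idea behind this change is that rewriting a generated header with identical content still updates its timestamp, which makes the build system recompile everything that includes it. A copy-if-different step avoids that. The sketch below is a Python analogue of the idea only; the commit adds its own `copy_if_different` function in the build scripts, and the implementation and file names here are assumptions for illustration.

```python
import os
import shutil


def _read_bytes(path):
    """Return the full contents of a file as bytes."""
    with open(path, "rb") as handle:
        return handle.read()


def copy_if_different(src, dst):
    """Copy src over dst only if dst is missing or its content differs.

    Returns True when a copy happened, False when dst was left untouched
    (so its modification time, and anything depending on it, is preserved).
    """
    if os.path.exists(dst) and _read_bytes(src) == _read_bytes(dst):
        return False
    shutil.copyfile(src, dst)
    return True
```

A generator would write to a temporary file first and then call `copy_if_different(tmp_path, header_path)`, so an unchanged header never triggers a rebuild.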
5 years ago
iducn
f282599229
disable unittest for gcc8( #25134 )
5 years ago
tianshuo78520a
1eb9ee242b
delete buddy_allocator_test_data to make repo clean ( #25046 )
5 years ago
Chen Weihang
b23801a262
polish tensor set error message, test=develop ( #25113 )
5 years ago
Jacek Czaja
a7944904d3
[oneDNN]elementwise_add and elementwise_mul int8 support ( #24984 )
...
* Start implementing int8 eltwise add
test=develop
* - Fix to Michal PR
* - Fix
test=develop
* - Lint fixes
test=develop
* - Added checking if elementwise_mul can be used
test=develop
* - Added attribs to skip_attrs_set
test=develop
* - Improved broadcasting
test=develop
- fixes to compilation
- fix
- fix
- Lint fixes
test=develop
* - removed redundant condition
test=develop
Co-authored-by: Michal Gallus <michal.gallus@intel.com>
5 years ago
Zhaolong Xing
843581154f
fix emb eltwise layernorm ( #24873 )
...
test=develop
5 years ago
石晓伟
9ab3cf039c
remove useless test_dot, test=develop ( #24957 )
5 years ago
石晓伟
6783441e70
fix repeat definitions in liengine.cc, test=develop ( #25020 )
5 years ago
Leo Chen
fa657b3dbb
fix bug of prelu when rank not equal 4, test=develop ( #25067 )
...
* fix bug of prelu when rank not equal 4, test=develop
* fix prelu inference, test=develop
* fix api, test=develop
* fix shape when mode is channel, test=develop
* remove debug code, test=develop
* add unittest, test=develop
5 years ago
zlsh80826
479c8834f7
[Paddle-TRT] Fixes #24731 , opt for SoftmaxKernelWithEltadd kernel, test=develop ( #24834 )
...
* blockReduce opt
* launch threads align to warpSize
* reduce unnecessary shared memory for broadcast reduced value
* vectorize SoftmaxKernelWithEltadd
* add fp16 constraint
* test=develop
5 years ago
hutuxian
5822862d8a
Monitor Framework ( #24079 )
...
* Add a StatValue class in the backend to represent a stat.
* Add a singleton StatRegistry to maintain the collection of stats.
* For the sake of code neatness, we only support type of int and float, which can cover most of the scenarios.
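The design described above, a value class plus a singleton registry restricted to int and float stats, can be sketched as follows. The commit implements this in the C++ backend; the Python below is only an illustrative analogue, and the method names (`increase`, `get`, `reset`, `instance`) and the locking scheme are assumptions, not the actual interface.

```python
import threading


class StatValue:
    """A single named statistic holding an int or a float."""

    def __init__(self, initial=0):
        self._value = initial
        self._lock = threading.Lock()

    def increase(self, delta):
        """Add delta to the stat and return the new value."""
        with self._lock:
            self._value += delta
            return self._value

    def get(self):
        with self._lock:
            return self._value

    def reset(self, value=0):
        with self._lock:
            self._value = value


class StatRegistry:
    """Singleton that owns the collection of stats, keyed by name."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self):
        self._stats = {}
        self._lock = threading.Lock()

    @classmethod
    def instance(cls):
        """Return the single shared registry, creating it on first use."""
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def get(self, name, initial=0):
        """Return the stat called `name`, registering it if needed."""
        # Only int and float are supported, matching the commit's stated scope.
        if isinstance(initial, bool) or not isinstance(initial, (int, float)):
            raise TypeError("only int and float stats are supported")
        with self._lock:
            return self._stats.setdefault(name, StatValue(initial))


registry = StatRegistry.instance()
registry.get("total_feasign_num").increase(3)  # hypothetical stat name
print(StatRegistry.instance().get("total_feasign_num").get())
```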
5 years ago
Leo Chen
028de857d4
fix dtype error of compare op, test=develop ( #25059 )
5 years ago
Jeng Bai-Cheng
bef4afa6de
bugfix for unique_ptr of IOptimizationProfile ( #23917 )
...
This commit fixes the compile bug regarding unique_ptr of IOptimizationProfile. IOptimizationProfile has a protected dtor and is controlled by TensorRT internally. The application shouldn't delete the pointer to IOptimizationProfile.
See the TensorRT documentation: https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_builder.html#a9ac47e100454151d8206ac91d543299a
test=develop
5 years ago
zlsh80826
49e4ee27e1
[Paddle-TRT] slice kernel optimization ( #24783 )
...
* parallel move shared data test=develop
* test=develop
5 years ago
tianshuo78520a
770c11a117
fix make device_context error ( #25045 )
...
* test=develop
* test=develop
* fix bug
* test=develop
* test=develop
5 years ago
tangwei12
be6a315fbd
Fix/sync barrier ( #25016 )
...
* fix sync barrier with barrier monitor, test=develop
5 years ago
ceci3
8db66fc3f6
fix cos_sim, test=develop ( #25017 )
5 years ago
Leo Chen
25a4dac4c2
Use allow list instead of white list ( #25002 )
...
* use allow list instead of white list, test=develop
* reduce include, test=develop
5 years ago
Zhang Ting
621b638550
improve performance of instance_norm, test=develop ( #25005 )
5 years ago
hutuxian
1c224e26af
support CMatchAuc ( #24990 )
...
Support CMatchAucCalculator based on CMatchRankAucCalculator with a new parameter ignore_rank
5 years ago
Leo Chen
bfa46c38d5
bn supports reverse_space, test=develop ( #24988 )
5 years ago
wangchaochaohu
613303dbf6
refine the slice Op to improve the performance of xlnet for fp16 training ( #24967 )
5 years ago
silingtong123
37bdb5269f
test=develop, add log message in the function UpdateDllFlag ( #24937 )
...
* test=develop, add log message in the function UpdateDllFlag
* test=develop, add the test
5 years ago
Chen Weihang
d152d7231e
clear old var in scope, test=develop ( #24976 )
5 years ago
Sylwester Fraczek
53d563a0fe
Reshape transpose matmul coverage ( #24970 )
...
* remove gmock from ut
test=develop
* coverage enabled for r+t+m fuse pass
test=develop
5 years ago
wawltor
0eb1b0bc01
Add 5d and 6d tensor support for the reduce ops
...
Add 5d and 6d tensor support for the reduce ops.
At the same time, the compile time went from 22 minutes to 21 minutes after the fix.
5 years ago
liuwei1031
8603b5fb72
fix randomly hang issue of PaddleDetection training task on windows ( #24977 )
5 years ago
silingtong123
640196c446
test=develop, remove the tensorrt dll file from windows package ( #24922 )
5 years ago
wangchaochaohu
feba131893
fix the segment fault error of profiler in seqseq model test=develop ( #24952 )
5 years ago
Sylwester Fraczek
a7ee634b45
fix WARNING: ThreadSanitizer: heap-use-after-free ( #24929 )
...
test=develop
5 years ago
mapingshuo
24e24987f0
fixes the place info in the Print op ( #24934 )
...
fixes the CUDAPlace info in the Print op
5 years ago
Aurelius84
6be0ee159e
Support LoDTensorArray in reverse_op ( #24797 )
...
* Support LoDTensorArray in reverse_op test=develop
* polish en doc and unittest code test=develop
* refine sample code test=develop
* add example of LoDTensorArray test=develop
* fix typo test=develop
5 years ago
Leo Chen
6190023ac9
Refine error message in pybind folder ( #24886 )
...
* refine err_msg of pybind.cc, test=develop
* refine err_msg in tensor_py.h, test=develop
* refine error msg, test=develop
* fix test_exception, test=develop
* follow comments, test=develop
5 years ago
Zhou Wei
4058e736ff
temporarily disable these unittests failed on windows ( #24942 )
5 years ago
Leo Chen
a7cb97a1a5
Fix/isfinite on windows ( #24927 )
...
* refine isfinite, test=develop
* use namespace std of isfinite, test=develop, test=win_gpu
5 years ago
silingtong123
ef9b36873d
test=develop, remove the gflags/gflags.h from paddle_api.h ( #24921 )
5 years ago
whs
4c01d6d53e
Enhance checking in some operator. ( #24473 )
5 years ago
Chen Weihang
4a702ef361
Support SelectedRows allreduce in multi-cards imperative mode ( #24690 )
...
* support selectedrows allreduce in multi-cards dygraph, test=develop
* remove useless import modules in unittests, test=develop
* add nccl cmake to get nccl version, test=develop
* add if-condition to compiled correctly, test=develop
* add detail version parsing for old nccl, test=develop
* polish cmake details, test=develop
* fix remove test cmake error, test=develop
* fix cmake condition, test=develop
* change unittest cmake list, test=develop
* fix unittest cmake rule, test=develop, test=framep0
5 years ago
Pei Yang
14b8540551
add default ctor for AnalysisConfig python api. test=develop ( #24924 )
5 years ago