Zhou Wei
1f74b94d3f
Fix compile warning on Windows MSVC, make paddle_build.bat safer ( #25933 )
...
* Fixed compile warning about incorrect compile options, fix paddle_build.bat
* make paddle_build.bat safer
5 years ago
tangwei12
c14ec8782b
【paddle.fleet】Feature/fleet ps api 2.0 ( #25857 )
...
* add paddle.fleet.AsyncOptimizer
Co-authored-by: dongdaxiang <dongdaxiang@baidu.com>
5 years ago
Chen Weihang
3c8daa9b89
Add pin memory control for BufferedReader ( #26026 )
...
* add pin memory control
* fix buffered reader init problem
* fix unittest error
* add unittest for coverage
5 years ago
Chen Weihang
ad4a0466a5
Add cuda pinned place branch in slice op GetExpectedKernelType ( #26027 )
...
* add cuda pinned place branch
* add unittest
* add skip when not gpu
5 years ago
zhangchunle
86794cccbd
separate approve ( #26035 )
5 years ago
Feiyu Chan
e853ece0a2
update document template for unary elementwise layers ( #25896 )
...
1. update document template for unary elementwise layers(a.k.a. activation layer);
2. remove generate_op_noattr and use generate_activation instead; remove redundant function copies;
3. minor update for docstring to fix rst format errors.
4. fix doc for Rsqrt OP
5. add sample code for each activation separately;
6. remove the unused deprecated decorator.
5 years ago
joanna.wozna.intel
734cf1c3e9
Change use_quantizer attribute name and data type ( #25838 )
...
* Change use_quantizer attribute name and data type
* Fix problem with setting attribute
* Add changes due to review
* Small change in function
* Restore use_quantizer attr for compatibility
5 years ago
Leo Chen
5258d53d65
refine unsqueeze, test=develop ( #25470 )
...
* refine unsqueeze, test=develop
* update unsqueeze, test=develop
* refine unsqueeze, test=develop
* refine unsqueeze, test=develop
* update
* remove None, test=develop
* follow comments
* support bool
* update doc
* follow comments
* merge develop
5 years ago
tangwei12
3755564ae1
Fix/large scale fix ( #25999 )
...
* fix large scale KV
* fix single training using async ssa graph
5 years ago
Leo Chen
751305ecf0
Add flags to control call stack of error message ( #25997 )
...
* add flags_call_stack_level
* update
* refine code
5 years ago
Thunderbrook
fd2947babf
fix compile error with mkl ( #26030 )
...
test=develop
5 years ago
Leo Chen
0a47387bd8
Use static local variable instead of global variable for safety ( #26018 )
...
* remove global variable
* refine code
5 years ago
Pei Yang
beb0ca5fab
Fix TRT plugin registry without TRT lib ( #25982 )
...
* fix trt plugin registry without trt lib
* support trt4
* refine code style
5 years ago
123malin
2191a08317
【paddle.fleet】fleet_util move to paddle.fleet ( #25805 )
...
* test=develop,test=document_fix, remove the out args
* fleet_util move to paddle.fleet
Co-authored-by: WuHaobo <wuhaobo1994@gmail.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>
5 years ago
yaoxuefeng
224620071b
add new flatten op test=develop ( #25393 )
5 years ago
Adam
68c6160e63
Add oneDNN fusion_gru kernel ( #25594 )
...
* Add oneDNN fusion_gru kernel and fix fc+gru pass
test=develop
* Formatting changes
test=develop
* Lint fixes
test=develop
* Add memory::format_tag::any to GRU weights
test=develop
* Fix build with CUDA
* Fix build with CUDA v2
5 years ago
Thunderbrook
0cb60c700d
add heter ps mode ( #25682 )
...
* add heter ps mode
* code style
test=develop
* add with_pslib
test=develop
* unittest
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* test monitor
test=develop
* prepare trainer
test=develop
* code style
test=develop
5 years ago
Zhong Hui
dca56f47f5
fix invalid read of pnorm gradient function
...
fix invalid read of pnorm gradient function and delete the unused code
5 years ago
WangXi
2c9d0f3cb9
【paddle.fleet】Add dgc to fleet meta optimizer ( #25738 )
...
Add dgc to fleet meta optimizer, rm dgc from optimizer all
5 years ago
Zhaolong Xing
358bc06c72
[CUDNN8 support] : support CUDNN8 ( #25664 )
...
* cudnn8 support
test=develop
* fix ci error
test=develop
5 years ago
Zhaolong Xing
5970871a64
add eltwise clip cuda impl. ( #25689 )
...
test=develop
5 years ago
Zhen Wang
82374dc12f
Add some error messages for the op without double grads. ( #25951 )
...
* Add some error messages for the op without double grads.
* fix the test_imperative_double_grad UT.
5 years ago
danleifeng
3dd2e3801a
【paddle.fleet】add fleetrun command for distributed running ( #25806 )
...
* add fleetrun command for distributed running; test=develop
5 years ago
Pei Yang
b717895f64
Fix registering trt plugin ( #25744 )
...
* develop dynamic shape serialization
* add test param for gelu
* fix bugs
* delete redundant comments
* debug
* fix conflict. test=develop
* fix bug. test=develop
* add trt dynamic shape serialized support
* fix ernie serialized bug
test=develop
* fix codestyle
test=develop
* fix bug
test=develop
* fix bug.test=develop
* modify cmakelist test=develop
* fix bug
test=develop
* fix error message. test=develop
* fix trt register plugin based on pr#25003
* add trt dynload
* fix deserialization bug of not finding plugin registration
* refine code style
* recover engine key in tensorrt_subgraph_pass
* for ci coverage
* add unittest for deserialization
Co-authored-by: haozech <chenhaoze94@gmail.com>
5 years ago
wawltor
a697e94693
Update the code of the compare ops for the broadcast function
...
Update the code for the compare ops for the broadcast function
5 years ago
Chen Weihang
9b5a65b819
refine init signal handler meg dumper ( #25911 )
5 years ago
wangchaochaohu
ff717d5158
Add support for tuple of concat Op test=develop ( #25800 )
5 years ago
tangwei12
253fd407e8
Fix/distibuted heart beat ( #25902 )
...
* disable heart beat UT
5 years ago
WangXi
a6c87fd091
Add amp to fleet meta optimizer, test=develop ( #25770 )
5 years ago
Pei Yang
9e9a569dae
add trt int8 support for elementwise_mul and scale ( #25676 )
5 years ago
xujiaqi01
d11c140e28
fix dump, fix cvm check ( #25400 )
...
* fix dump, fix cvm check
test=develop
* fix
test=develop
* fix
test=develop
* fix
test=develop
5 years ago
JZ-LIANG
8ebffc78c9
add lars to fleet meta optimizer ( #25884 )
5 years ago
Dong Daxiang
8d2896f1fe
【paddle.fleet】Fleet run graph in Executor and add two more strategies ( #25844 )
...
* split meta optimizer files
* add graph execution in execution, update two properties in DistributedStrategy, unit tests for these features
5 years ago
Zhang Ting
6486fe8a94
improve GPU performance of transpose, test=develop ( #25862 )
5 years ago
Zhang Ting
2d24f56a7a
avoid data transfer, test=develop ( #25810 )
5 years ago
ShenLiang
bca303165a
fix inverse bug ( #25641 )
...
* fix inverse bug, test=develop
* fix the unittest, test=develop
* add singular checking, test=develop
* fix the unittest, test=develop
* use memory::copy, test=develop
* fix boost_get, test=develop
* fix position, test=develop
5 years ago
Chen Weihang
48b9a56f1c
Polish framework error message - part 4 ( #25807 )
...
* polish framework error message part 4
* fix type error
* fix message error
* polish by review comments
5 years ago
Aurelius84
e52dae6ef6
Using input.place() in GetExpectedKernel in slice_op ( #25595 )
...
* modify GetExpectedKernelType
* use input place
* add ENFORCE check
5 years ago
wawltor
595a719795
Update the api for the compare_ops
...
Update the code for the compare_ops, update the api and doc
5 years ago
wangchaochaohu
32b9577b2a
refine the split op for API 2.0 test=develop ( #25320 )
5 years ago
lilong12
ce506930c3
Fix the bug that Input(Offsets) and attr(offsets) cannot be set at the same time. ( #24975 )
...
* bug fix, test=develop
5 years ago
tangwei12
2d9dbd31ad
Fix/mkl dnn ( #25835 )
5 years ago
Zhaolong Xing
bcddefef39
[Fix Ut]: fix inference ut which exist bug on windows. ( #25814 )
...
* fix windows test
test=develop
* fix ci
test=develop
5 years ago
lilong12
5f30e57cdd
fix test_pipeline, test=develop ( #25808 )
...
* fix test_pipeline, test=develop
5 years ago
Chen Weihang
d47304e6d9
Refine paddle error stack format ( #25790 )
...
* refine error stack format
* polish compile traceback format
* polish detail format
5 years ago
tangwei12
caa90a6510
Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) ( #22957 )
...
* Integrated Trainer of Parameter Server
5 years ago
hong
c2a21ca9c9
Fix dygraph grad bugs ( #25781 )
...
* fix double grad visitid unit; test=develop
* change name hash_pair to HashPair; test=develop
* follow comment; test=develop
5 years ago
cc
42189be67b
[Quant] Remove the output for moving_average_abs_max_scale op ( #25697 )
...
* Remove the output for moving_average_abs_max_scale op, test=develop
5 years ago
Dong Daxiang
a96d54ac19
Generate final strategy ( #25782 )
...
* refine strategy compiler and meta optimizers
make async as a_sync
5 years ago
Chen Weihang
2469b578f5
Unified paddle error format when catch system signal ( #25765 )
...
* unified signal error format
* refine signal error message
5 years ago
tianshuo78520a
818d38f150
Update conda_build.py for opencv dependency( #25654 )
5 years ago
Zhou Wei
b484a59c39
fix copy file random fail on windows ( #25731 )
5 years ago
Chen Weihang
23d1228c4d
remove ProgramTranslator.save_inference_model ( #25740 )
...
* remove ProgramTranslator.save_inference_model
* adapt save_quantized_model
* revert buffer check implementation
* remove useless import function
5 years ago
Chen Weihang
1b3081b1b4
Simplify BufferedReader to improve DataLoader performance ( #25648 )
...
* simplify buffered reader to improve DataLoader performance
* fix 22 failed unittests
* fix cuda pinned context condition
* fix test_reader_reset failed
* fix two failed unittests
* change unittest place
* polish error message
* polish cast op GetExpectedKernelType
* remove debug info in unittest
5 years ago
Pei Yang
55b6205ddf
add set_mkldnn_cache_capacity python api( #25524 )
5 years ago
Zhou Wei
e0a9115e28
fix random compile failure due to missing file ( #25661 )
5 years ago
Pei Yang
eef98b7f86
add macro check for using TRT api dynamicRangeIsSet() ( #25694 )
5 years ago
Pei Yang
f82baed866
fix trt instance norm plugin on gcc8. test=develop ( #25730 )
5 years ago
Dong Daxiang
920d998f1e
add more settings for distributed strategy ( #25685 )
...
* add more settings for distributed strategy
Basically, DistributedStrategy has several parts of configurations:
- BuildStrategy: the same as paddle.fluid.BuildStrategy, but the distributed arguments are moved out of BuildStrategy
- ExecutionStrategy: the same as paddle.fluid.ExecutionStrategy
- collective communication configs: nccl_comm_num, hierarchical allreduce and so on
- distributed algorithms: async_update(mainly used in PS), lars, lamb and so on
5 years ago
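The four groups of settings listed in this commit message can be pictured as one container object. The following is a minimal sketch in plain Python under stated assumptions: the class name `StrategySketch`, the field layout, and all default values are hypothetical, and only the names `nccl_comm_num`, hierarchical allreduce, `async_update`, `lars`, and `lamb` come from the commit message. It is not the actual `DistributedStrategy` implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class StrategySketch:
    """Hypothetical stand-in for a distributed strategy; not Paddle's API."""

    # Graph-building options (distributed arguments are kept out of here).
    build_strategy: Dict[str, Any] = field(default_factory=dict)
    # Executor options.
    execution_strategy: Dict[str, Any] = field(default_factory=dict)
    # Collective communication configs (defaults are assumptions).
    nccl_comm_num: int = 1
    hierarchical_allreduce: bool = False
    # Distributed algorithms (defaults are assumptions).
    async_update: bool = False  # mainly used with a parameter server
    lars: bool = False
    lamb: bool = False

    def enabled_algorithms(self) -> List[str]:
        """Return the names of the distributed algorithms switched on."""
        names = ("async_update", "lars", "lamb")
        return [name for name in names if getattr(self, name)]


strategy = StrategySketch(nccl_comm_num=2, lars=True)
print(strategy.enabled_algorithms())
```

Keeping the algorithm switches as flat boolean fields mirrors the list in the commit message; a real strategy object would likely carry per-algorithm configuration as well.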
Sylwester Fraczek
1aaa26f102
add dnnl sigmoid (logistic) activation ( #25745 )
5 years ago
Chen Weihang
c34c80d302
Polish framework error message part3 ( #25701 )
...
* polish framework error message part3
* polish details
* fix error message print error
5 years ago
arlesniak
e52df3b125
Added DNNL cache management for DyGraph ( #25624 )
...
* Added DNNL cache management for DyGraph
* move FLAGS_use_mkldnn to more general CMakeLists, make use of the flag in ClearGradients
* missing file
* Fixes after review
* Bringing back original idea of place for 'use_mkldnn' flag to be accessible from platform and imperative.
* Removed duplicate and added docs
* Fixes for CI
5 years ago
wangchaochaohu
1e4ab728fb
refine the concat Op for API 2.0 test=develop ( #25307 )
5 years ago
Zhen Wang
cea5086853
Fix the double grad bug for the star gan. ( #25655 )
...
* fix the double grad bug for the star gan. test=develop
* update the retain_graph parameter doc. test=develop
* add the unit test for the retain_graph parameter. test=develop
5 years ago
Chen Weihang
364cc53618
Polish paddle fluid framework error message - part2 ( #25667 )
...
* polish framework error msg part2
* polish details
5 years ago
Adam
98899b73d2
Fix FC + GRU fuse pass ( #25687 )
5 years ago
wanghuancoder
1917b38099
fix some errmsg report,in framework/ir/, about 21 files ( #25525 )
...
* fix error msg report in ir/, about 19 files, test=develop
* modified some unclear descriptions, test=develop
* modified some unclear descriptions, test=develop
* modify unit test pass_test.cc, because the error report in pass.cc is used by pass_test.cc, test=develop
5 years ago
Leo Chen
4ec1251a1e
Refine squeeze, test=develop ( #25281 )
...
* refine squeeze, test=develop
* update squeeze, test=develop
* refine compile-time infershape, test=develop
* add more unittest, test=develop
* follow comments, test=develop
* add update_api, test=develop
* follow comments, test=develop
5 years ago
joanna.wozna.intel
e5bbffa84c
Add NOMINMAX define due to windows.h max/min macro conflict ( #25637 )
...
test=develop
5 years ago
cnn
70cee22fde
New features, add sinh and cosh op, test=develop ( #25495 )
...
* New features, add sinh and cosh op, test=develop
* remove duplicate test function and remove out parameters, test=develop
* Add out parameters temporarily, remove later. test=develop
* remove out args, PR 25570, test=develop
* remove TestParameter, test=develop
* add test api for static dygraph, test=develop
* add backward unittests for sinh and cosh, test=develop
5 years ago
Zhang Ting
a1350744eb
register fp16 kernel, test=develop ( #25630 )
5 years ago
mapingshuo
5453a912fe
add fp64 support in sequence_pool, test=develop ( #25662 )
...
add fp64 support in sequence_pool, test=develop
5 years ago
Leo Chen
417b243968
fix best_fit_allocator_test on windows, test=develop ( #25650 )
...
* fix best_fit_allocator_test on windows, test=develop
* enable best_fit_allocator_test and test_math_op_patch_var_base, test=develop
5 years ago
GaoWei8
6e86fd3750
fix concat dimension ( #25606 )
...
Fix the condition of concat dimension judgment.
5 years ago
donproc
95fa383df2
optimize embedding cuda kernel lookup_table_v2,test=develop ( #25587 )
5 years ago
石晓伟
7206417259
supports xpu runtime, test=develop ( #25554 )
...
* update ResetHolder, test=develop
* add TensorShare for lite engine, test=develop
* tensor data changed from copying to sharing, test=develop
* supports xpu runtime, test=develop
* fix code styles, test=develop
5 years ago
Chen Weihang
dfb3ae1b9b
Polish some error message in framework holder - Part 1 ( #25509 )
...
* polish some error message in framework, test=develop
* fix unittest error, test=develop
* replace PADDLE_ENFORCE, test=develop
* polish details based review comment, test=develop
5 years ago
Leo Chen
1ab4101d6c
add ci check for changing op-related api without core.ops, test=develop ( #25596 )
...
* add ci check for changing op-related api without core.ops, test=develop
* generate api_source_md5 file when build, test=develop
* add failed example, test=develop
* add failed example, test=develop
* handle exception, test=develop
5 years ago
Zhang Ting
30d1ff3bb4
call cublasGemmStridedBatchedEx when using fp16, test=develop ( #25553 )
5 years ago
Zhaolong Xing
9df18b08f3
Disable windows static library generation ( #25593 )
...
* fix windows ci
test=develop
* fix ci error
5 years ago
Aurelius84
ca1185d06b
[Dy2Stat] Fix scope in run_program_op ( #25579 )
...
* add reinforcement learning model test=develop
* align backward test=develop
* add gym in paddle_build.sh test=develop
* rm pip install in script test=develop
* refine paddle_build.sh test=develop
* fix sed error in macOS test=develop
* polish code test=develop
* fix scope problem
* refine code by reviewer comment
5 years ago
Chen Weihang
a6abd92dfd
Polish install error hint message ( #25531 )
...
* polish install error hint msg, test=develop
* fix variable error, test=develop
* polish hint message again
5 years ago
wanghuancoder
9b46fe0440
fix some errmsg report,in framework/ir/, about 5 files ( #25539 )
...
* fix error msg report in ir/, about 5 files, test=develop
* fix error msg report in ir/, about 5 files, test=develop
* fix error msg report in ir/, about 5 files, test=develop
5 years ago
Zhou Wei
1ab60544f2
windows CI scripts for xly,test=develop,test=win ( #25533 )
...
windows CI scripts for xly
5 years ago
Dong Daxiang
e657d7062d
fleet base initial implementation and the API ( #25442 )
...
refactor fleet api under paddle.fleet
update DistributedStrategy
5 years ago
zhangchunle
3382f395ed
example EXcode ( #25578 )
5 years ago
Jacek Czaja
7dbc441eab
[oneDNN] cache cosmetics improvement ( #25576 )
5 years ago
Aurelius84
1a5d3defb1
[Dy2stat] Add Reinforcement learning unittest ( #25445 )
...
* add reinforcement learning model test=develop
* align backward test=develop
* add gym in paddle_build.sh test=develop
* rm pip install in script test=develop
* refine paddle_build.sh test=develop
* fix sed error in macOS test=develop
* polish code test=develop
5 years ago
zhangchunle
1a4a4219cb
mac build exitcode ( #25540 )
5 years ago
hong
e362095e45
fix softmax with cross entropy out of bound; test=develop ( #25549 )
5 years ago
Huihuang Zheng
d8fe517bf8
Add Support for SelectedRows for Transpose OP and Fix a Bug That SelectedRows Cannot be Supported in SimNet ( #25536 )
...
This PR fixes a bug where SelectedRows was not supported in SimNet. The cause is that the dygraph basic_engine did not copy a var's type when the var needed to be accumulated during backward. So when a var is SelectedRows and needs to be accumulated, as in SimNet, which calls the net twice, the var's type is changed to the default LoDTensor and the bug occurs. The fix is to copy the type as well.
Without this PR, the accumulated SelectedRows parameters in dygraph are changed into LoDTensor. After fixing SelectedRows support in SimNet, `test_imperative_lod_tensor_to_selected_rows` failed with an error that SelectedRows was not supported by the Transpose OP. To fix that too, this PR also adds SelectedRows support to the Transpose OP.
5 years ago
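The type-loss bug described in this commit can be illustrated with a toy model. This is a minimal sketch in plain Python, not Paddle code: `ToyVar`, `accumulate`, and the `copy_type` switch are hypothetical names invented for the illustration, and only the type names `SelectedRows` and `LoDTensor` come from the commit message.

```python
class ToyVar:
    """Hypothetical variable with a type tag and a value; not a Paddle class."""

    def __init__(self, var_type="LoDTensor", value=0.0):
        self.var_type = var_type
        self.value = value


def accumulate(grad_a, grad_b, copy_type):
    """Sum two gradients into a freshly created variable.

    A fresh variable starts with the default type, so unless the type is
    copied from the inputs the result silently becomes a LoDTensor.
    """
    out = ToyVar()  # default type: "LoDTensor"
    if copy_type:
        out.var_type = grad_a.var_type  # the fix: carry the type over
    out.value = grad_a.value + grad_b.value
    return out


g1 = ToyVar("SelectedRows", 1.0)
g2 = ToyVar("SelectedRows", 2.0)
buggy = accumulate(g1, g2, copy_type=False)
fixed = accumulate(g1, g2, copy_type=True)
print(buggy.var_type, fixed.var_type)
```

The values are summed correctly in both cases; only the type tag differs, which is why the problem surfaced later, in an operator that did not accept the unexpected type.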
Wilber
848aca7ae8
[CI] [Lite-Subgraph] CI add lite subgraph check. ( #25346 )
5 years ago
wanghuancoder
e65c5b8e83
fix some errmsg report, in framework/ir/ ( #25471 )
...
* fix paddle/fluid/framework/ir/ error msg report, test=develop
* modify error msg report in ir/, about errortype, grammar, supplementary info, test=develop
* modified some unclear descriptions, test=develop
* Modify the problem that report msg is less than 20 characters, test=develop
5 years ago
Shibo Tao
71c71e684c
fix logical_* ops' doc ( #25479 )
...
* fix doc of logical_* op.
* fix doc of op pow.
* fix comment syntax error
* fix operator reciprocal demo.
* fix logical_* ops' doc. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
* bug fix. test=develop,test=document_fix
5 years ago
Aurelius84
4717bdbcfb
Fix hang in seq_topk_avg_pooling op ( #25522 )
...
* fix topk_avg_pool hang test=develop
* refactor get_topk_pos test=develop
* add check of channel_num and num_k test=develop
* add TopKPosPaddingId test=develop
5 years ago
LielinJiang
7129f544f0
Add bilateral_slice op ( #25401 )
...
* add bilateral slice op
5 years ago
GaoWei8
c10dcff12d
refine PADDLE_ENFORCE ( #25456 )
...
* Refine PADDLE_ENFORCE in paddle/fluid/platform
test=develop
5 years ago
wanghuancoder
6c0982b942
fix some errmsg report, in framework/ir/mkldnn ( #25467 )
...
* fix paddle/fluid/framework/ir/mkldnn/ error msg report, test=develop
* modify error msg report, about errortype, grammar, supplementary info, test=develop
* modified some error descriptions, test=develop
5 years ago
wanghuancoder
fce6466217
fix some errmsg report, in framework/ir/ subdir(memory,optimizer,multi_device) ( #25460 )
...
* fix paddle/fluid/framework/ir/multi_devices_graph_pass/ error msg report, test=develop
* fix paddle/fluid/framework/ir/memory_optimize_pass/ error msg report, test=develop
* fix paddle/fluid/framework/ir/fuse_optimizer_ops_pass/ error msg report, test=develop
* fix paddle/fluid/framework/ir/memory_optimize_pass/ error msg report about PADDLE_ENFORCE, test=develop
* modify error msg report, about errortype, grammar. test=develop
* modify error msg report, about PADDLE_ENFORCE to PADDLE_ENFORCE_XXX, test=develop
* modify error msg report, about PADDLE_ENFORCE to PADDLE_ENFORCE_XXX, and %s to %d, test=develop
* modified some error descriptions, test=develop
5 years ago
Zhang Ting
ca725c82f2
improve fp16 performance of slice_grad, test=develop ( #25523 )
5 years ago
yaoxuefeng
5d3766ff3d
modify flip test=develop ( #25312 )
...
According to paddle 2.0 standard
1, change flip api attr name 'dim' to 'axis'.
2, support empty axis
3, change example code to imperative mode.
5 years ago
Chen Weihang
41d2247275
[Dy2static] Refactor ProgramTranslator save_inference_model API ( #24989 )
...
* experimental refactoring, test=develop
* add TranslatedLayer & remove StaticModelRunner, test=develop
* revert tracedlayer change, test=develop
* fix test_mnist unittest error, test=develop
* add doc & examples, test=develop
* polish doc details, test=develop
* add imperative.jit module, test=develop
* change TranslatedLayer pos, test=develop
* adjust jit module import path, test=develop
* polish doc based review result
* add SaveLoadConfig.separate_params to save params separately
* add Layer.buffer support, test=develop
* polish doc details based review result, test=develop
* polish details based review comments, test=develop
* add empty str check for param, test=develop
* add unittests, test=develop
* polish details based review comment, test=develop
* remove blanks in comment, test=develop
* polish doc details, test=develop
* update imperative doc link, test=develop
* add api attr for load, test=develop
5 years ago
Pei Yang
43f9f180e5
Add api to clear intermediate tensors in AnalysisPredictor ( #25069 )
...
* add api to clear intermediate tensors in analysis predictor. test=develop
* add python api. test=develop
5 years ago
zhangchunle
6bfbb6abab
exitcode normalize ( #25487 )
5 years ago
zhangchunle
cf6eb0e175
summary failedtests ( #25388 )
5 years ago
yaoxuefeng
aaa7cbd56f
modify trace api test=develop ( #25397 )
5 years ago
Huihuang Zheng
f9ac5fb992
[Dy2stat] Fix Memory Optimization in run_program_op and Add SimNet as Unit Test ( #25383 )
...
Add Similarity Net as unit test. During the unit test, we found three problems:
1. The run_program_op has memory optimization error when running dy2stat net multiple times.
2. The support for SelectedRows can cause problem in dy2stat.
3. The return grammar has problem.
This PR fixes only the first problem; for the second and third it modifies the code to work around them so that the PR stays small. I will fix those two problems in the next PR(s).
5 years ago
yaoxuefeng
c42d662e2a
modify roll test=develop ( #25321 )
5 years ago
Zhen Wang
548cdbc544
Quantization-aware training for dygraph ( #24634 )
...
* Add the imperative quantization aware training.
* This is the python part of Imperative QAT. test=develop
5 years ago
Chen Weihang
0b54d54fd8
Fix index overflow bug of the CUDA kernel loop increment ( #25435 )
...
* fix softmax_with_cross_entropy cuda kernel overflow bug, test=develop
* replace old macro & for condition, test=develop
* polish details, test=develop
5 years ago
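The class of bug named in this commit title can be reproduced outside CUDA. The sketch below is an illustration only, not the patch itself: it simulates in Python what typically happens when a 32-bit loop index is incremented by a stride near the top of its range. `wrap_int32` and `next_index` are hypothetical helpers, and the stride value is an arbitrary assumption.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(x):
    """Reduce a Python int to the value a 32-bit two's-complement int would hold."""
    return (x + 2**31) % 2**32 - 2**31


def next_index(i, stride, use_int64):
    """Advance a loop index by `stride`, with or without 32-bit wraparound."""
    nxt = i + stride
    return nxt if use_int64 else wrap_int32(nxt)


i = INT32_MAX - 10   # last valid index region of a very large tensor
stride = 1024        # assumed total number of threads in the grid

narrow = next_index(i, stride, use_int64=False)
wide = next_index(i, stride, use_int64=True)
print(narrow, wide)
```

With the narrow index the increment wraps to a negative number, so a loop condition of the form `index < n` stays true and the kernel would read or write out of bounds; a 64-bit index moves past `n` and the loop ends as intended.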
zlsh80826
e528392de9
[Paddle-TRT] SkipLayernorm vectorized memory optimization ( #25117 )
...
* add explicit specialization
* add skiplayernorm vector load if available
* test=develop
5 years ago
Chen Weihang
4061aa6488
Polish ParallelExecutor exception process logic ( #25449 )
...
* polish pe exception process logic, test=develop
* fix unittest, test=develop
* add unittests, test=develop
5 years ago
Jeng Bai-Cheng
fc93266b0a
Improve qkv transpose performance ( #23919 )
...
Use vector instruction (LDG.128) to improve qkv transpose. It provides 1.4X speedup at same GPU base frequency.
test=develop
5 years ago
zhupengyang
5b573c58e2
randperm API: remove out, device, stop_gradient; add name ( #25410 )
5 years ago
Chen Weihang
7be285a66f
remove useless property, test=develop ( #25461 )
...
remove useless property
5 years ago
tianshuo78520a
2d028389e4
Fix Cpu CI error( #25457 )
5 years ago
Jacek Czaja
a5d1592f6c
Added missing oneDNN format ( #25450 )
...
test=develop
5 years ago
Chen Weihang
172d4ecb6c
remove WITH_DSO compile option ( #25444 )
5 years ago
Zhen Wang
bb45af02ac
add the c++ part of Imperative QAT. test=develop ( #25446 )
5 years ago
Jacek Czaja
050a9bf79d
[oneDNN] LRN cleanup ( #25416 )
5 years ago
GaoWei8
1974aadcf0
fix concat shape error ( #25414 )
...
* fix concat shape error
test=develop
5 years ago
tangwei12
4b3778a3ee
Revert/barrier for sync ( #25417 )
...
* add retry for prefetch
* Revert "Fix/sync barrier (#25016 )"
This reverts commit be6a315fbd.
* reopen dist UT, test=develop
* remove fl UT, test=develop
5 years ago
ceci3
52be62c5ae
fix instance norm in dy ( #24717 )
...
* fix bn & in in dy, test=develop
* update instance_norm,test=develop
* fix bugs,test=develop
* add more case in unittest,test=develop
* fix,test=develop
* fix,test=develop
5 years ago
lilong12
e39aa70ec7
add the support for pipeline ( #24560 )
...
* add device_worker for pipeline, test=develop
5 years ago
hong
70d7d07fea
catch bad alloc exception ( #25140 )
...
* catch bad alloc exception; test=develop
* add unittest; test=develop
* move bad alloc catch to the first place; test=develop
* polish error message; test=develop
* polish error message; test=develop
* add mutex header; test=develop
5 years ago
gongweibao
80f1c50738
Fix typo in interface. ( #24779 )
5 years ago
Zhaolong Xing
7b7e605189
[Fix BUGs]: fix multihead matmul pass's unstable bug ( #25123 )
...
* fix multihead matmul's instability
test=develop
* fix multihead matmul bug
test=develop
* fix coverage problem
test=develop
5 years ago
zhupengyang
eb3173e2b6
rand API: remove out, device, stop_gradient; add name ( #25246 )
5 years ago
GaoWei8
ea7e532598
Refine PADDLE_ENFORCE ( #25369 )
...
* refine PADDLE_ENFORCE
test=develop
5 years ago
zhupengyang
6de75082cb
fix test_hsigmoid windows ci ( #25311 )
5 years ago
Dong Daxiang
d5e40d1ba9
Paddle fleet distributed strategy ( #25379 )
...
* add paddle.fleet.DistributedStrategy for 2.0
5 years ago
WuHaobo
f593c3fb2f
fix the formula of floor OP and ceil OP ( #25292 )
5 years ago
Wojciech Uss
d0a921ba98
Quant2 updates and fixes ( #25313 )
5 years ago
Zhang Ting
bc7610583b
use eval() to improve CPU performance ( #25243 )
5 years ago
lilong12
3d96601b82
modify pipeline optimizer to only support the mode of sync pipeline training ( #25065 )
...
* modify pipeline optimizer, test=develop
5 years ago
Kaipeng Deng
74468bf428
add mish op. ( #24565 )
...
* add mish op. test=develop
5 years ago
Chen Weihang
f07b25d8e5
fix DataLoader.generator using error, test=develop ( #25355 )
5 years ago
GaoWei8
fb70682f00
fix PADDLE_ENFORCE ( #25297 )
...
* fix PADDLE_ENFORCE and refine the description
test=develop
5 years ago
Yang Zhang
6d6efafeeb
Add `matrix_nms_op` ( #24400 )
...
* Add `matrix_nms_op`
test=develop
* Make ci happy
test=develop
* Exit early when no detection
test=develop
* Fix license year
test=develop
* Output index as well
test=develop
* Match nms2 lod behavior and add `return_index` flag
test=develop
* Make CI happy
test=develop
* Fix wording
test=develop
5 years ago
Chen Weihang
5a959f6e6e
Refactor dynamic dso search functions ( #25214 )
...
* refactor dynamic dso search func, test=develop
* polish details, test=develop
* polish detail based review comments, test=develop
* revert string type change, test=develop
5 years ago
Jacek Czaja
17c751bec6
[oneDNN] Fix to #25078 ( #25256 )
5 years ago
MRXLT
3b8f0a64c2
Encryption infer ( #25119 )
...
* add encrypt api for inference lib
5 years ago
Wilber
4474fc1033
fix compile on windows. test=develop ( #25310 )
5 years ago
Aurelius84
bc2bd3c1ed
modify into eager_tmp of Base Class test=develop ( #25323 )
5 years ago
Chengmo
e85fcaa712
Fix fluid.embedding in Distributed Training ( #25174 )
...
* test=develop, fix_embedding
5 years ago
Aurelius84
494cb36d09
Modify tmp var name prefix in dygraph ( #25280 )
...
* Modify tmp var name prefix in dygraph test=develop
* refine comment test=develop
5 years ago
Wilber
0371cf6f94
fix compile for lite subgraph. test=develop ( #25285 )
5 years ago
Yiqun Liu
c00f827843
Avoid data transforming ShapeTensor from CPU to GPU in fill_constant op. ( #25267 )
5 years ago
Wojciech Uss
23a4f54b73
rename qat into quant ( #24948 )
...
test=develop
5 years ago
123malin
f1a9593d69
test=develop, bug fix for index_select and roll op ( #25251 )
5 years ago