hutuxian
123255cf9f
change InitializeGPU to InitializeGPUAndLoadModel ( #24377 )
...
* Add InitializeGPUAndLoadModel to solve random hang when downloading sparse parameters.
* Update SaveBase to solve test problem.
5 years ago
Chen Weihang
aa0f254fbe
Add macro BOOST_GET to enrich the error information of boost :: get ( #24175 )
...
* add new macro BOOST_GET_SAFELY & unittests, test=develop
* add different macro type, test=develop
* fix get macro type in executor, test=develop
* four macro part change backup
* using one macro for all case, test=develop
* revert attribute change, test=develop
* change to three func to solve gcc4.8 bug, test=develop
* polish some details, test=develop
5 years ago
Wojciech Uss
db052009c7
Enabled quantize all and skip missing in QAT ( #24281 )
...
* Enabled quantize all and skip missing in QAT
5 years ago
Huihuang Zheng
8a1a2af82e
Add Assert Op ( #24280 )
...
1. To make ProgramTranslator support `assert` grammar, this PR adds the `assert` python API and C++ code.
2. Fix a bug: graph_pattern_detector.h #include <gtest/gtest_prod.h> but didn't declare the dependency in CMakeLists, which can cause a single build failure.
3. Refactor `Formatter` in print_op to make it reusable, and reuse the formatter to print in assert op.
5 years ago
joanna.wozna.intel
356f5ee220
[Refactoring] Unify op-dequant squashes ( #24277 )
5 years ago
liym27
ac9a7eeea4
[Dy2Stat]Support list pop ( #24250 )
...
* Replace dygraph_to_static_func with @declarative or program_translator.get_func in test_list.py
* Add comments in ConditionalBlock.
* Support list pop last item.
* Support pop the i-th item.
* Support an empty tensor array as Input in assign op and set the kernel type to float.
5 years ago
xujiaqi01
1034ca316f
add timeout and http store in communication ( #23436 )
...
* add timeout and http store in communication, add revert and confirm in fleet
* test=develop
5 years ago
wawltor
d1e1d85881
add the graph batch reader for pslib mode ( #24178 )
...
Add the pslib graph batch reader mode, add the test case for this change
5 years ago
joanna.wozna.intel
b43b46e619
[INT8] Add requant-op squash ( #24143 )
5 years ago
hutuxian
3e2bc8715f
Try to fix UT Random Fail ( #24223 )
5 years ago
Sylwester Fraczek
e1a7a88057
added reshape transpose matmul fuse pass ( #23754 )
5 years ago
Chen Weihang
9b851ba216
[dy2static] Add print transformer and unify print format ( #24068 )
...
* add print transformer & unify print format, test=develop
* remove using of dygraph_to_static_func, test=develop
* remove python stdout capture, test=develop
* fix compatibility problems for PY2, test=develop
* fix detail error, test=develop
* fix type analysis bug, test=develop
* fix print tuple compatible error in PY2, test=develop
* replace get_func to declarative, test=develop
* fix detail bug, test=develop
* fix some detail problems, test=develop
* change visit_call in print transformer, test=develop
5 years ago
wangchaochaohu
fa43d74a3a
fix the intermediate node of graph for fusion group test=develop ( #24184 )
5 years ago
Yiqun Liu
ecfddebbef
Add the implementation of inverse ( #23310 )
5 years ago
liuwei1031
9a93f6aae0
improve efficiency of runtime InferVarType ( #22778 )
...
* save InferVarType changes, test=develop
* remove code comments, test=develop
* tweak code, test=develop
* fix compilation warning, update merge_ids_op split_ids_op to new interface, test=develop
* modify fused_bn_activation_op, test=develop
* fix error of fused_bn_activation_op, test=develop
* fix PADDLE_ENFORCE and unittest coverage issue, test=develop
* tweak PADDLE_ENFORCE messages, test=develop
* improve unittest coverage, test=develop
* add StaticGraphInferVarType class, test=develop
* rebase develop branch, test=develop
* fix unittest error, test=develop
* remove comments, test=develop
* improve unittest coverage, test=develop
* improve error message and improve unittest coverage, test=develop
* upgrade InferVarType API, test=develop
* tweak pyfunc error message, test=develop
* fix compilation conflict - save_combine_op, test=develop
5 years ago
wangchaochaohu
2270864019
Fusion group optimize for cuda codegen( #23940 )
5 years ago
ShenLiang
94dfb7d770
opt the postprocess, test=develop ( #24155 )
5 years ago
Jacek Czaja
eb411613e9
[DNNL] refine activations Inplace support ( #24145 )
5 years ago
Jacek Czaja
461e6a01ec
[DNNL] activations Inplace support ( #24123 )
5 years ago
Zhang Ting
fb0729ee7f
avoid warnings in MAC compile ( #24124 )
5 years ago
arlesniak
d31a174f51
added fusing matmul-transpose-reshape pass ( #23866 )
5 years ago
Zeng Jinle
a67eea9f00
polish code by adding final, test=develop, test=develop ( #24114 )
5 years ago
Zeng Jinle
acef55df04
fix isolated var fetch bug, test=develop ( #24070 )
5 years ago
Jacek Czaja
c6c65c65c7
[DNNL] Added elementwise_add mkl-dnn inplace ( #23477 )
5 years ago
hutuxian
9ff558a46f
Optimize DataFeed ( #23957 )
...
* Make batch_float_feasigns & batch_uint64_feasigns as member variable
5 years ago
Zhou Wei
7817003795
Optimize the error messages of paddle CUDA API ( #23816 )
...
* Optimize the error messages of paddle CUDA API, test=develop
* fix the error messages of paddle CUDA API, test=develop
* Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL,test=develop
* remove build_ex_string,test=develop
* merge conflict,test=develop
5 years ago
ShenLiang
7f0b2c7407
fix memory leaking problem of dataset, test=develop ( #23955 )
5 years ago
guofei
2b896c1f6b
Support LoDTensorArray in fetch ( #23645 )
...
* Support LoDTensorArray in fetch op
test=develop
* Support LoDTensorArray in fetch
test=develop
5 years ago
Yiqun Liu
071a702060
Fix the error misjudgment when there are control nodes in graph. ( #23943 )
5 years ago
hutuxian
df64a96686
support set_test_mode and set comlog level( #23905 )
5 years ago
Zhang Ting
b88662254b
use 32 bit index to improve expand op ( #23899 )
...
* use 32 bit index to improve expand op, test=develop
* remove redundant code, test=develop
5 years ago
yiicy
a1e7387919
Variable error message enhancement, test=develop ( #23548 )
5 years ago
yaoxuefeng
5b69242fab
modify datanorm op test=develop ( #23030 )
5 years ago
Zeng Jinle
c49791362f
Correct reader device index ( #23802 )
...
* correct reader device index, test=develop
* fix async executor scope var initialization, test=develop
5 years ago
joanna.wozna.intel
12ba05ce0c
Add scale-matmul fuse pass ( #23734 )
5 years ago
Chen Weihang
532079a222
API (CompiledProgram) error message enhancement ( #23559 )
...
* api compiled program error polish, test=develop
* fix coverage problem, test=develop
* fix details & add unittests, test=develop
* add test for coverage, test=develop
5 years ago
wawltor
f3d7db98f1
Add the support of bool list for assign_value op ( #23774 )
...
* Add the support of bool list for assign value, test=develop
* Fix the assign op test case for bool dtype, test=develop
5 years ago
zhongpu
b4b6763ab2
fix bug for exhaustive_search in conv_fusion_op, test=develop ( #23727 )
5 years ago
Yiqun Liu
9e85d02373
Avoid crash when calling ctx->HasInputs and add the check of shape in fill_constant op. ( #23698 )
5 years ago
Huihuang Zheng
1d3b0134ca
Error Message Enhancement ( #23483 )
...
This PR enhances error messages of several API/OPs:
ParallelExecutor (python && C++)
Executor (python && C++)
StaticRNN (python)
IfElse (python)
cond (python)
split_lod_tensor (python && C++)
5 years ago
wangchaochaohu
fb34bdb40c
API/OP(fill_constant) error message enhancement ( #23584 )
5 years ago
liuwei1031
2fd728a978
add new dot op( #23418 )
5 years ago
chenhaoze
9b06dd8628
Add three passes and api reference of paddle_pass_builder. test=develop ( #23741 )
...
* Add three passes and api reference of paddle_pass_builder.h
5 years ago
xujiaqi01
d98084e7ec
add save with prefix ( #23449 )
...
* add save with prefix
* test=develop
5 years ago
joanna.wozna.intel
5ee099ca57
Op-requant squash ( #23665 )
...
* Op-requant squash
test=develop
* Add matmul to op-requant test
test=develop
5 years ago
hutuxian
94a3789fd0
Add AfsAPI in PaddleBox ( #23419 )
...
* Introduces AfsAPI to resolve slow downloading.
* Mainly used in PaddleBox
5 years ago
liym27
06d4aa4e73
API (BuildStrategy) error message enhancement. ( #23462 )
5 years ago
Zhen Wang
84cd45f674
Solve the conflict of ops with the same name, test for CI. ( #23573 )
...
* solve the conflict of ops with the same name. test=develop
5 years ago
Zeng Jinle
7f3e0eaad1
refine error msg, test=develop ( #23589 )
5 years ago
mozga-intel
3baaee9aab
Remove: NGraph engine from PDPD repository ( #23545 )
...
* Remove the NGraph engine from PDPD repository
1. Each operator was removed from the operator's directory
2. Each test was removed from the unittest directory
3. The parallel executor support was removed from the PDPD
4. The CMake file was removed from the PDPD
5. The NG flags were removed from the repository
test=develop
* Remove ngraph from:
1. Cmake file
2. Python file
test=develop
5 years ago
Zhang Ting
1b8fe70e48
fix VLOG, test=develop ( #23327 )
5 years ago
Chen Weihang
45880f604b
API(Program) error message enhancement ( #23519 )
...
* polish api program error message, test=develop
* fix condition error, test=develop
* fix test prune error, test=develop
* fix coverage problem, test=develop
5 years ago
joanna.wozna.intel
3cb5623dad
Add matmul dequant squash ( #23505 )
...
test=develop
5 years ago
wangchaochaohu
c1187cd6f4
Fp16 refine for fusion group ( #23472 )
5 years ago
joanna.wozna.intel
ce08fdcf2b
Add support for INT8 matmul in C-API quantization ( #23463 )
...
* Integrate matmul with cpu_quantize_pass
test=develop
* Add matmul checking scales
test=develop
* Change condition of matmul quantization
test=develop
* Remove redundant var
test=develop
5 years ago
Aurelius84
8674a82c03
Op (Scope) error message enhancement ( #23458 )
...
* Op (Scope) error message enhancement test=develop
5 years ago
wangchaochaohu
d085f79228
fix runtime fail for output var stop_gradient=True for fusion group ( #23317 )
5 years ago
qingqing01
6162cf2f2e
Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO. ( #23426 )
...
* Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO
5 years ago
ShenLiang
5223e2bbc4
Add a new DataFeed named PaddleBoxDataFeed ( #23321 )
...
* add paddleboxdatafeed
* add ifdef linux and boxps
* add unittest for datafeed
* fix unittest of test_paddlebox_datafeed
* fix unittest
* rename function
5 years ago
Chen Weihang
75bd350710
Implement StaticModelRunner to support dygraph fine-tune static graph pre-training model ( #23171 )
...
* static model runner basic implement, test=develop
* add run program op to execute loaded program, test=develop
* refactor static model runner & run program op, test=develop
* reset engine.cc to resolve conflict
* adapt the change of dygraph double grad, test=develop
* refactor impl to solve control flow error, test=develop
* clear debug code, test=develop
* fix ci str compatible error & checkout dygraph grad maker & add example, test=develop
* hide api & add op test, test=develop
* fix run program op test places error, test=develop
* fix program by review comment, test=develop
* delete change var desc name, test=develop
* fix other program by review comment, test=develop
* remove _static_graph_guard, test=develop
* add selectedrows test, test=develop
* remove desc parser, test=develop
* fix detail program, test=develop
* change scope create & add test, test=develop
5 years ago
Kaipeng Deng
d223a24904
Fix inplace_abn compile error on Windows ( #23464 )
...
* fix inplace_abn windows compile error. test=develop
5 years ago
Tao Luo
0b583235f5
Revert "Solve the conflict of ops with the same name. ( #23199 )" ( #23494 )
...
This reverts commit abe3e6906d.
test=develop
5 years ago
wawltor
6577f91b74
Add the sum op to API 2.0, add some parameters for new api
...
* Add the sum op to API 2.0, test=develop
* Fix the import message in common_ops_import
5 years ago
Zhen Wang
abe3e6906d
Solve the conflict of ops with the same name. ( #23199 )
...
* solve the conflict of ops with the same name. test=develop
5 years ago
tianshuo78520a
d8a21ef6f3
test=develop;fix error ( #23467 )
5 years ago
zhongpu
dbfbd7eac4
support Exhaustive search in dygraph ( #23415 )
...
* use global conv cache; test=develop
* use singleton cache; test=develop
* fix format error; test=develop
* add cudnn helper header; test=develop
* fix header error; test=develop
* fix mac unittest; test=develop
* fix mac unittest; test=develop
* fix file format; test=develop
* fix include file error, test=develop
* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop
* fix test_elementwise_mul_op_dim, test=develop
* fix compile error, test=develop
Co-authored-by: phlrain <phliuhongyu@126.com>
5 years ago
gongweibao
24a063f6ac
Add fleet checkpoint on local fs and remote fs(such as hdfs) for EDL ( #22586 )
5 years ago
wangchaochaohu
5c60778731
polish the code of fusion group test=develop ( #23370 )
5 years ago
Leo Chen
a62599a888
[feature] prune program by feed and fetch_list automatically ( #22474 )
...
* prune train program by fetch_list, test=develop
* add unittest for prune, test=develop
* fix pruned feed, test=develop
* support ParallelExecutor and feed prune, test=develop
* add comments, test=develop
* update unittest, test=develop
* update unittests, test=develop
* remove debug code, test=develop
* support cond in clone, test=develop
* support cond in prune, test=develop
* support multiple minimize, test=develop
* support cache, test=develop
* fix _copy_param_info_from, test=develop
* support python2 str, test=develop
* remove debug code, test=develop
* fix bug of caching CompiledProgram, test=develop
* fix multi_device issue, test=develop
* tmp
* support tuple in fetch_list and overriding use_prune, test=develop
* dont use nonlocal in python2, test=develop
* remove nonlocal, test=develop
* code clean, test=develop
* code clean, test=develop
* feed list, test=develop
* test adam, test=develop
* follow comments, test=develop
* reduce duplicate code, test=develop
* update comments, test=develop
5 years ago
Yiqun Liu
bc2981e998
Disable test_code_generator and test_post_training_quantization_mobilenetv1 ( #23440 )
5 years ago
Zeng Jinle
29337f4e17
fix conflict of inference partial feed with gpu parallel ssa graph executor, test=develop ( #23400 )
5 years ago
zhongpu
bfb07aafe8
Revert "Exhaustive search ( #22821 )", test=develop ( #23401 )
...
This reverts commit 48144e4099.
5 years ago
xujiaqi01
93ea9dd27a
fix stat var in hogwild worker ( #23367 )
...
* fix stat var in hogwild worker
* test=develop
5 years ago
joanna.wozna.intel
8c463700e1
Add default pass attributes ( #23042 )
5 years ago
zhongpu
48144e4099
Exhaustive search ( #22821 )
...
* use global conv cache; test=develop
* use singleton cache; test=develop
* fix format error; test=develop
* add cudnn helper header; test=develop
* fix header error; test=develop
* fix mac unittest; test=develop
* fix mac unittest; test=develop
* fix file format; test=develop
* fix include file error, test=develop
* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop
* fix test_elementwise_mul_op_dim, test=develop
Co-authored-by: phlrain <phliuhongyu@126.com>
5 years ago
Kaipeng Deng
21d95be0db
Add inplace abn op ( #22806 )
...
* add inplace_abn_op. test=develop
5 years ago
Yi Liu
821534efd3
add parallel_executor dependency to collective_helper ( #23380 )
...
test=develop
5 years ago
Zeng Jinle
3a21980b78
add reader dependency pass, test=develop ( #23301 )
5 years ago
wangchaochaohu
d280106007
Add support for attr type Op and add fill_constant Op and scale Op ( #23163 )
...
* add attr support for fusion group and add support for fill_constant and scale Op
5 years ago
xujiaqi01
3a45767d49
add fleet pslib pull and push sparse op and push dense op ( #23139 )
...
* add fleet pslib pull and push sparse op and push dense op
* test=develop
5 years ago
Jacek Czaja
2bb1b0e89e
[DNNL] Added MKL-DNN inplace pass for C-API inference ( #23315 )
5 years ago
Wojciech Uss
f836c8aa8f
add check for scales and a message ( #23119 )
5 years ago
Tao Luo
c00d427d52
simplify the cmake log of ir/CMakeLists.txt ( #23262 )
...
test=develop
5 years ago
xujiaqi01
68ea1ad55b
add clear one table ( #23089 )
...
* add clear_one_table
* test=develop
5 years ago
danleifeng
ae3bb16d06
add MaskAucCalculator in paddlebox ( #23157 )
...
* add maskauc in paddlebox; test=develop
5 years ago
Zeng Jinle
53e6f8e1da
rename macro, test=develop ( #23161 )
5 years ago
Zeng Jinle
7ca77a90ac
add Tensor::IsSharedBufferWith method, test=develop ( #23175 )
5 years ago
Zeng Jinle
b8886bf122
rename no_need_buffer_vars_macro, test=develop ( #23159 )
5 years ago
Zeng Jinle
bae5930ba1
fix graph attr copy issues, test=develop ( #23191 )
5 years ago
Zeng Jinle
acfc9b8a70
Reader sequential and inference partial feed ( #22699 )
...
* sequential reader stage 1, test=develop
* fix ut, test=develop
* fix iterable=False reset bug, add some logs and polish code, test=develop
* inference feed partial data, test=develop
* Turn on keep_order=True for test, test=develop
* enhance ut to test more cases, test=develop
* test commit for reverting
* Revert "test commit for reverting", test=develop
This reverts commit 80aef42ef52ba1ee79627d6f663a624ec4f12f58.
* add ut of merged and unmerged results, test=develop
* add more uts for coverages and add en doc of api, test=develop
* follow comments, test=develop
* change note style, test=develop
5 years ago
Wilber
95b356a069
update embedding_eltwise_layernorm fuse and kernel. test=develop ( #23114 )
...
update embedding_eltwise_layernorm fuse pass and fused kernel, to support multi input
5 years ago
Zeng Jinle
a31d7328b7
Add dygraph double grad implementation ( #22939 )
...
* add double grad implementation for dygraph, test=develop
* polish code, add uts, test=develop
* fix place bug, test=develop
* polish codes, add more uts for coverages, test=develop
* add no_grad_set, test=develop
* add star gan ut, test=develop
* follow comments, test=develop
5 years ago
Yiqun Liu
3af4771122
Add the detection and code-generation of sqrt and square in fusion_group ( #23095 )
5 years ago
hutuxian
0c30098f8b
Add need_save_delta parameter to solve OOM ( #23097 )
5 years ago
Sylwester Fraczek
abee05a8c8
added mkldnn swish activation ( #23041 )
5 years ago
Zhang Ting
880eb04d93
skip PrepareData when it is unnecessary ( #22839 )
...
* remove unnecessary prepare data, test=develop
* Op in while block will not skip PrepareData, test=develop
5 years ago
Adam
5842ae6785
Revert "Change ShareDataWith() to TensorCopy() in conv_mkldnn ( #22695 )" ( #22985 )
5 years ago
yaoxuefeng
660ff18488
fix dataset test=develop ( #23043 )
5 years ago
wangchaochaohu
3757e0687c
Add Unittest for backward of fusion group ( #22932 )
...
* add fusion group test for backward and refine code
5 years ago
wangchaochaohu
f0d193a23c
Cast fusion for fusion group ( #22876 )
...
* add support for expression type convert and add cast Op support in fusion group
5 years ago
Wilber
ff3ddbb502
add skip_layernorm pass. test=develop ( #22895 )
...
* add skip_layernorm pass. test=develop
5 years ago
Adam
056edf3929
Change ShareDataWith() to TensorCopy() in conv_mkldnn ( #22695 )
5 years ago
Zhaolong Xing
8d6dc102fe
[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse ( #22494 )
...
* 1. add embedding eltwise layernorm fuse
2. add embedding eltwise layernorm op
3. refine inplace_add_relu
4. refine fc_eltwise_layernorm
test=develop
* 1. refine fc
test=develop
* fix comments
test=develop
* fix comments
test=develop
5 years ago
Zeng Jinle
d33c4343e1
Imperative tracer refactoring ( #22457 )
...
* refine grad maker, test=develop
* refactor tracer stage 1, test=develop
* merge develop to solve conflict a third time, test=develop
5 years ago
liu zhengxi
61fef9754b
Fix fc padding bug during inference fusion ( #22860 )
...
* fix fc padding during fusion, test=develop
* fix optim model inference after SaveOptimModel, test=develop
5 years ago
wangchaochaohu
dbb0b9b3b6
refine the profiler print ( #22823 )
...
* refine the profiler print test=develop
5 years ago
hong
5191e54494
reduce default attrs for dynamic graph ( #22850 )
...
* reduce default attrs for dynamic graph, test=develop
* add some explanations for explicit attr, test=develop
* tweak explicit attr comments, test=develop
5 years ago
Zhang Ting
72ff5a09c3
fix print bug of profile, test=develop ( #22804 )
5 years ago
Zhang Ting
4e8bc02461
add fluid.device_guard to specify the device type for Op ( #22254 )
...
* add fluid.device_guard to specify the device type for Op
5 years ago
Zhen Wang
89cfa49156
Unmerged fetch list ( #22635 )
...
* update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results.
* add the unit test for fetch_unmerged.
* update ut for multi-card and multi-cpu.
* add the error message and the user suggestion in FetchOpHandle. test=develop
5 years ago
hutuxian
53a2b68f4e
support customized download command in dataset ( #22782 )
...
* user can call dataset.set_download_cmd to set its customized download cmd
* add UT to cover this scenario
5 years ago
wangchaochaohu
ca9e77a8d4
add sum op support for fusion group ( #22771 )
...
* Add the codegen and auto fusion for sum Op in fusion group
5 years ago
tianshuo78520a
433cef03e5
fix typo word ( #22784 )
5 years ago
Leo Chen
b2c1be851a
support cond in clone, test=develop ( #22657 )
...
* support cond in clone, test=develop
* refine code, test=develop
* refine code, test=develop
* follow comments, test=develop
* refine code, test=develop
5 years ago
hutuxian
175954d894
PaddleBox Framework Part2 ( #22466 )
...
* Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator.
* Add a config for the DynamicAdjustChannelNum function to denote whether to discard the remaining instances when they cannot be distributed evenly.
* Remove CPU code in Pull/PushSparse and we will add it back when testing it fully.
* Fix some known issues: such as copying persistable vars after one epoch running.
5 years ago
GaoWei8
cdf5f6fb8c
Add an inference interface to disable FC padding ( #22097 )
...
* Add an interface of disabling FC padding
* fix bert regression
* polish fc padding interface
* recover pass function
* fix argument error
* fix mkldnn error
5 years ago
tianshuo78520a
d2ba91aad1
fix typo words ( #22653 )
5 years ago
tangwei12
66a3150135
SYNC with communicator ( #22344 )
...
* add sync communicator and implement
5 years ago
Yiqun Liu
22bbd54719
Add the support of fp16 in fusion_group ( #22239 )
5 years ago
wangchaochaohu
c65c6ae534
add flag to control profile level in python API ( #22319 )
...
* add python flag to control profile level test=develop
5 years ago
123malin
00594c1c88
support dumping params/grads in transpiler mode ( #22490 )
5 years ago
flame
f7eafca828
remove python inference warning ( #22602 )
5 years ago
Wilber
9a8203aa25
fix fc_lstm_fuse when multi sub-graph use same fc_bias. test=develop ( #22551 )
...
When a model contains multiple fc_lstm subgraphs whose fc ops share the same persistable bias, the bias node should not be deleted; only the non-persistable nodes need to be removed.
5 years ago
Zhaolong Xing
8acd745c25
[Ernie GPU Optim]: Fuse three fc to multihead matmul ( #22486 )
...
* 1. optim multihead matmul: fuse three fc to multihead matmul
test=develop
* fix conflict
test=develop
* fix comments
test=develop
5 years ago
Yiqun Liu
96770f519e
Disable fusion_group for windows and mac in build_strategy. ( #22549 )
...
test=develop
5 years ago
tangwei12
b0675c8193
fix bug with compiledProgram ( #22495 )
...
* add thread barrier for the compiled program
5 years ago
hutuxian
1a7962be97
Paddlebox about box_wrapper ( #22497 )
...
Refine PaddleBox Framework, Main functions:
* Add MetricMsg util class, which can calculate metrics like AUC, bucket_error, COPC.
* Replace FeedPass with new interface: BeginFeedPass & EndFeedPass
* Refactor Pull/Push Sparse Function in box_wrapper.
* Use CUDA Kernel to copy keys and copy feasign between tensor and boxps struct.
* Cache copied keys in pull sparse in order to reuse it in push period.
5 years ago
yaoxuefeng
2235ee1a5e
multi-loss optimization by adding a DownpourOpt worker ( #22025 )
...
* update
* update test=develop
* update compile set test=develop
* update compile set test=develop
* update test=develop
* update test=develop
* update test=develop
* update compile setting test=develop
* update compile setting test=develop
* update run demo test=develop
* update test=develop
* update test=develop
* fix test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update format test=develop
* update format test=develop
* update style test=develop
* update style test=develop
* change style test=develop
* change style test=develop
* change style test=develop
* add dataset unittest test=develop
* update test=develop
* update for record test=develop
* update style for record test=develop
* update for record test=develop
* update for record test=develop
* update for record test=develop
* fix format test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
5 years ago
zhaoyuchen2018
54970444ce
Improve transpose performance with tile sm copy, test=develop ( #22311 )
...
* Refine code, fix select tile error,test=develop
* Refine element type and some comments, test=develop
* Refine comments and gpu utils, test=develop
* Remove some useless condition
* Refine floor and ceil, test=develop
* refine for loop. test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
5 years ago
Wilber
a90fa54092
Compile without nccl deps. [1/2] ( #22509 )
...
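Support compiling without the NCCL dependency. [1/2]
In a multi-card setup, if the build is compiled without the WITH_NCCL switch turned on, the cards cannot communicate with each other, so only a single card can be used.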
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
5 years ago
guofei
3a59a7a11f
Make assign op support LoDTensorArray and modify while_loop API ( #22309 )
...
This PR makes assign op support LoDTensorArray and enable the loop_vars in
while_loop to support tuple or list.
5 years ago
Yiqun Liu
dcfb603897
Enable the detection of subgraph composed of grad ops ( #21223 )
...
* Add the first implementation of fusion_group op #19621 (#3 )
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop
* Disable for mac and windows.
test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop
* Refine the CUDA kernel to support large dims.
test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test.
test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop
* Disable fusion_group op for mac and windows.
test=develop
* Make the compiling of device code return status instead of hanging up.
test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names.
test=develop
* Add the check of CUDA driver library in unittest.
test=develop
* Enable generating code for a given subgraph. #21126 (#4 )
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearrangement of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions.
test=develop
* Improve the loading statements in generated codes.
test=develop
* Remove unused arguments from formal list.
test=develop
* Enable the detection of subgraph of grad ops.
* Generate code for detected subgraph in fusion_group_pass.
* Add an option in BuildStrategy to enable fusion_group_pass and add unittest.
test=develop
* Fix a bug when checking whether the shapes of all inputs are the same.
* Add debug information.
* Move subgraph_detector from inference/analysis to the common framework/ir directory. (#5 )
test=develop
* Call subgraph_detector in fusion_group pass.
test=develop
* Disable fusion_group when WITH_GPU is OFF.
test=develop
* Refine all PADDLE_ENFORCE message.
test=develop
* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op.
test=develop
* Follow review comments.
test=develop
5 years ago
joanna.wozna.intel
17f2c0899f
Add dequant-scale squash ( #22409 )
...
* Add dequant scale squash
test=develop
* Correct dequant-scale squash test
test=develop
5 years ago
Wilber
7bc4b09500
add WITH_NCCL option for cmake. ( #22384 )
...
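Added a WITH_NCCL option to cmake that explicitly specifies whether to compile the NCCL-related code. WITH_NCCL is ON by default, but it is turned off if WITH_GPU is OFF.
Added the PADDLE_WITH_NCCL definition.
A single-machine, single-card build can disable NCCL compilation; multi-card use requires NCCL to stay enabled (the default). If NCCL is disabled, only a single card can be used.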
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
5 years ago
xujiaqi01
d51ffe860a
fix copy table bug ( #22432 )
...
* fix copy table bug of lost some feasign
* test=develop
5 years ago
石晓伟
e1b0d7cbb1
remove anakin from code, test=develop ( #22420 )
5 years ago
xujiaqi01
371f377bea
add GeneralRoleMaker ( #22295 )
...
* add GeneralRoleMaker which is for general usage
* test=develop
5 years ago
Michał Gallus
269db0d1d1
[DNNL] Fix accuracy in INT8 FC ( #22404 )
...
* Enable quantize to reorder to nchw as well
* Correct FC MKL-DNN input dim requirements to accept 3D
* Improve DNNL FC format, error and 3D input handling
test=develop
* Improve error checking in FC
test=develop
* Improve PADDLE_ENFORCE messages in fc-related files
* Remove data layout attribute from obligatory pass args
test=develop
* Fix message in fc_mkldnn_pass to be logically correct
test=develop
5 years ago
joanna.wozna.intel
3099d9d47c
Restore requantize squash ( #22399 )
5 years ago
Adam
e7a9f6bbb7
[Bugfix] Preserve shape in inplace operators ( #22360 )
5 years ago
Yiqun Liu
b7cac50b64
Implement a common python unittest to test the ir passes. ( #22209 )
...
* Implement a common python unittest to test the ir passes.
test=develop
* Save the results in np.array and support to startup on CPU.
test=develop
* Fix the unittest.
test=develop
* Add check_program to check whether the optimized program is different from the origin one.
test=develop
* Remove the interface all_ops.
test=develop
* Add exception test in pass_test.
test=develop
5 years ago
tangwei12
82bc814a57
integrated HALF_ASYNC to communicator ( #21869 )
...
* add half_async in the communicator
* fix DistributedStrategy
5 years ago
Leo Chen
3e5744aa65
Remove unused inputs for some operators ( #22284 )
...
* remove unused inputs, test=develop
* remove unused inputs, test=develop
* update dtype, test=develop
* remove unused inputs, test=develop
* update op_use_default_grad_op_maker, test=develop
* resolve conflicts, test=develop
* follow comments, test=develop
* update center_loss_grad, test=develop
5 years ago
lidanqing
895f8da7d6
change std::cout to log(INFO), vlog ( #22316 )
5 years ago
Zhen Wang
e40cfb1010
fix the bug of assert_is_op_output. test=develop ( #22262 )
5 years ago
Wojciech Uss
d3a6647372
improve placement pass tests code coverage ( #22197 )
5 years ago
zhouwei25
549e6de7ac
faster build by reducing by-products and linked libraries, and fix compile warning of std=c++11 ( #22164 )
5 years ago
xujiaqi01
e3a457d34b
add collective communication library in fleet ( #22211 )
...
* add collective communication library in fleet to replace mpi
* test=develop
5 years ago
Zhen Wang
f2522e91c4
fix the type error caused by setting bool attr in OpDesc. test=develop ( #22257 )
5 years ago
Chen Weihang
fc0b21e17b
Polish fetch error message of parallel executor ( #22206 )
...
* polish error message of parallel executor, test=develop
* change PADDLE_ENFORCE, test=develop
5 years ago
wangchaochaohu
621d3e0b66
fix the bug of profile update ( #22207 )
...
* fix the bug of profile update test=develop
5 years ago
Zhen Wang
46189b166d
Add bn and relu fuse pass ( #22048 )
...
* add bn and relu fuse pass
* add op attr assert and dtype assert
* fix some inputs&&outputs bugs for the fused op and pattern.
* add the unittest for fuse_bn_act_pass. test=develop
* use normative enforce statements. test=develop
* add the cpu test. test=develop
* add the support of batch_size=1 for the bn with relu op. test=develop
* add the error type for paddle throws. test=develop
* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop
5 years ago
zhongpu
d0f0a2520c
test Optimizer in dygraph ( #21949 )
...
* test Optimizer in dygraph, test=develop
* add optest for Optimizer in dygraph, test=develop
* fix adagrad optimizer, test=develop
* fix dpsgd optimizer, test=develop
* fix test_optimizer.py, test=develop
* fix dpsgd optimizer, this op only support cpu, test=develop
* add optest for optimizer, test=develop
* add description for dpsgd, test=develop
* add rmsprop to white_list in unused_var_check.cc, test=develop
* polish code style, test=develop
* polish code style, test=develop
* delete seed attribute for DpsgdOptimizer, test=develop
* change testing to debugging, test=develop
5 years ago
joanna.wozna.intel
5b2e98aa17
Add multiple quantize operators fuse ( #22062 )
5 years ago
Yiqun Liu
96980c2244
Polish the PADDLE_ENFORCE in fusion_group pass related codes. ( #22144 )
...
* Polish the PADDLE_ENFORCE in fusion_group pass related codes.
test=develop
* Correct the unittest because of the change to relu_grad's formula.
test=develop
5 years ago
wangchaochaohu
c3876cf82d
add support for nested profiling event and printing in different level ( #22061 )
...
* add support for nested profiling event and printing in different level
5 years ago
liu zhengxi
724b13e459
fix xception precision problem, test=develop ( #22124 )
5 years ago
Yiqun Liu
b1401fb74d
Remove subgraph_detector from inference/analysis to the common framework/ir directory. ( #22094 )
...
test=develop
5 years ago
bingyanghuang
4b4a9cc88f
fix format in operator.cc ( #22101 )
5 years ago
silingtong123
6c20e7c4e6
test=develop, remove unused parameter from class RuntimeInferShapeContext constructors ( #22046 )
5 years ago
Jacek Czaja
b0b27ff699
[MKL-DNN] Conv grad and Batch Norm grad NHWC support ( #22088 )
5 years ago
Huihuang Zheng
dd4361568e
Add ParallelExecutor Test for Cond API and Fix PE Checks Shape Bug ( #22029 )
5 years ago
Jacek Czaja
ad8a9cb82c
[MKL-DNN] Pool & LRN Grad Ops NHWC support ( #21747 )
5 years ago
Yiqun Liu
d48320777e
Add the first implementation of fusion_group op ( #19621 )
...
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop
* Disable for mac and windows.
test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop
* Refine the CUDA kernel to support large dims.
test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test.
test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop
* Disable fusion_group op for mac and windows.
test=develop
* Make the compiling of device code return status instead of hanging up.
test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names.
test=develop
* Add the check of CUDA driver library in unittest.
test=develop
* Refine the calling of PADDLE_ENFORCE.
test=develop
5 years ago
Michał Gallus
6192108408
[DNNL] 3D Fully-Connected ( #21746 )
5 years ago
liu zhengxi
196e20dfbb
Fix multi-threads memory out of bounds error for passes ( #21920 )
...
* fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop
* fix attention_lstm_fuse_pass during multi-threads inference, test=develop
* fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop
* fix fc_lstm_fuse_pass during multi-threads inference, test=develop
* fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop
5 years ago
石晓伟
03479469a7
fix multi-thread error of fc_gru_fuse_pass.cc, test=develop ( #21841 )
...
* fix multi-thread error of fc_gru_fuse_pass.cc, test=develop
* export FLAGS and GLOG symbols, test=develop
5 years ago
Pei Yang
3e5008ad01
fix trt calib not working bug, test=develop ( #21934 )
5 years ago
qingqing01
2066745847
Pack imperative/layer into paddle_framework.so ( #21921 )
...
* Pack imperative/layer into paddle_framework.so
5 years ago
Aurelius84
51a86d2b6b
Optimize adam speed ( #21777 )
...
* optimize adam speed by removing _finish_update test=develop
* fix SparseAdamFunctor param list test=develop
* Remove scale_op in expect_list of adam_op test=develop
* fix test optimizer loss assert error test=develop
* fix test optimizer loss assert error test=develop
* modify PADDLE_ENFORCE usage test=develop
* fix op_type in lamb_op.cc test=develop
* fix errors ostream format bug test=develop
* add betaPowOut in ngraph op test=develop
* fix ngraph::op api for gcc8 test=develop
* clean code test=develop
* modify struct into class test=develop
* remove code of beta1Tensor in lamb_op test=develop
5 years ago
Thunderbrook
c3cf42d0f7
add table id in cache shuffle ( #21585 )
...
* general table
* add sparse table
test=develop
* no cvm
test=develop
* add no_cvm
test=develop
* add note
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* add key of optimizer
test=develop
* solve pslib stop core
test=develop
* barrier
test=develop
* add notes
test=develop
* add table id in cache shuffle
test=develop
* table id
test=develop
* code style
test=develop
5 years ago
WangXi
17299b8d21
fix batch_norm_grad infer shape=0 & add allreduce enforce shape, test=develop ( #21801 )
5 years ago
Huihuang Zheng
557bce77da
Fix Backward Bugs in Conditional Block ( #21809 )
...
The fixed bugs:
1. The condition sub-graph is not pruned
2. When backward graph is extremely simple, the whole backward ops are pruned.
5 years ago
xujiaqi01
0eb4d990c4
fix compile error when with_pslib=on ( #21769 )
...
* fix compile error of butil when with_pslib=on and with_testing=on
* test=develop
5 years ago
lidanqing
d3a96632fa
Add fc-dequantize squash in cpu_quantize_squash_pass for ernie model ( #21714 )
...
* fc-dequantize squash
test=develop
* change according to reviews
test=develop
* change PADDLE_ENFORCE
test=develop
* add second test when fc-dequant do not fuse
test=develop
* change all related PADDLE_ENFORCE
test=develop
5 years ago
WangXi
8754cbd1f2
fix std::min type in nan_inf, test=develop ( #21725 )
5 years ago
joanna.wozna.intel
d419b859c0
Add reshape int8 mkldnn op ( #21428 )
...
* Add reshape int8 op
test=develop
* Change test to CPUPlace
test=develop
* Correct tests
test=develop
5 years ago
WangXi
8a0f611b64
Rewrite check nan inf tools ( #21076 )
5 years ago
tangwei12
9ad940fdfe
memory leak for cpu ( #21174 )
...
* add fake init for the trainer, fix large memory hold in the trainer
* do not merge recv vars from a remote endpoint, test=develop
* add recv and save op, merge slice var in one op, save memory
* remove hsigmoid with pull sparse, test=develop
5 years ago
Zeng Jinle
73461a7ae6
Make OperatorWithKernel::InferShape abstract ( #21633 )
...
* make OperatorWithKernel::InferShape virtual, test=develop
* fix test_prepare_op by relu, test=develop
5 years ago
Zeng Jinle
6828f3684b
fix op_registry, add ignore op_function_impl.h, test=develop ( #21654 )
5 years ago
Adam
e81f0228df
MKL-DNN 1.0 Update ( #20162 )
...
* MKLDNN v1.0 rebase to Paddle 1.6
test=develop
* Add hacky paddle::string::to_string() implementation
* vectorize<int64_t>() -> vectorize() cleanup
test=develop
* PADDLE_ENFORCE and void_cast fixes
test=develop
* Rebase changes
test=develop
* Cosmetics
test=develop
* Delete MKL from mkldnn.cmake
test=develop
* CMake debug commands
test=develop
* Delete MKLDNN_VERBOSE and rebase fixes
test=develop
* Rebase fixes
test=develop
* Temporarily disable int8 resnet101 vgg16 and vgg19 tests
test=develop
* Add libmkldnn.so.1 to python setup
test=develop
* Add libmkldnn.so.1 to inference_lib cmake after rebase
test=develop
* Post rebase fixes + FC int8 changes
test=develop
* Fix LRN NHWC
test=develop
* Fix NHWC conv3d
test=develop
* Windows build fix + next conv3d fix
test=develop
* Fix conv2d on AVX2 machines
test=develop
5 years ago
xujiaqi01
f404157205
fix master patch when slot is dense ( #21580 )
...
* fix master patch when slot is dense
* test=develop
5 years ago
xujiaqi01
c05706fe73
fix code style of fleet_wrapper ( #21639 )
...
* fix code style of fleet_wrapper
* test=develop
5 years ago
xujiaqi01
88960684aa
rm optimize_for in framework.proto ( #21571 )
...
* remove optimize_for in framework.proto
* test=develop
5 years ago
Zeng Jinle
0f8888360e
Polish op registry codes ( #21561 )
...
* polish infer shape registry, test=develop
* modify some operators registry, test=develop
5 years ago
hutuxian
c5aec2fe68
Paddlebox Related to Framework ( #21586 )
...
* Add a single_process_multi_thread transpiler.
* Add some UTs.
* Fix some API description.
5 years ago
liym27
9da7e6b4d4
add file check_op_desc.py and add interface to get default value. ( #21530 )
...
* add file check_op_desc.py and add interface to get default value. test=develop
* add test for c++ coverage rate. test=develop
* Correct typo. test=develop
5 years ago
Pei Yang
122b37ce62
make config option DisableGlogInfo() able to mute all inference logs ( #21318 )
...
* make DisableGlogInfo able to mute all logs in inference.
5 years ago
Jacek Czaja
18a5d30754
[MKL-DNN] Conv2d and Conv2d transpose MKL-DNN NHWC support ( #21466 )
5 years ago
Zhaolong Xing
c5f0293cf3
NV jetson(nano, tx2, xavier) inference compile support ( #21393 )
...
* add jetson compile support
test=develop
* refine the cmake
test=develop
5 years ago
Tao Luo
01fa4ead61
fix -Wno-error=sign-compare warning in gcc8 ( #21434 )
...
* fix -Wno-error=sign-compare warning in gcc8
test=develop
* fix warning in distributed codes
test=develop
5 years ago
wangchaochaohu
d4776ec027
fix the correctness of memcpy profiling result test=develop ( #21458 )
5 years ago
Jie Fang
5e813b53c5
nhwc optimization for batchnorm ( #21090 )
5 years ago
Leo Chen
e0c9d856fb
add unused input vars check for OpWithKernel, test=develop ( #21169 )
...
* add unused input vars check for OpWithKernel, test=develop
* remove unused vars in some ops, test=develop
* fix batch_norm, test=develop
* add white list, test=develop
* add CI check for white list, test=develop
* move white list to c++, test=develop
* solve failure of CI, test=develop
* add unittest for unused_var_check, test=develop
* refine code, enable check in operator_test, test=develop
* skip mkldnn, test=develop
* extend white list, test=develop
* refine condition of mkldnn, test=develop
* fix paddle_build, test=develop
* follow comments, test=develop
* fix GetExpectedKernelType
* add wiki ref to err_msg, test=develop
* follow comment, test=develop
5 years ago
Huihuang Zheng
630be31952
Fix Cond Bug for Nested Control Flow ( #21340 )
...
* Commit before merging develop
test=develop
* Backup after working with Huihuang logs
* Commit before deleting Huihuang debug loggings
* Commit before debug
test=develop
* Fix bug commit
test=develop
* Backup of fixing bugs
test=develop
* Clean up code
test=develop
* Fix a bug in sum_op
test=develop
5 years ago
Jacek Czaja
cd43c4440e
[MKL-DNN] LRN and Pool2d (FWD) NHWC support ( #21375 )
5 years ago
Zeng Jinle
6b09b73e17
add explicit conversion to NoNeedBufferVarsFunctor, test=develop ( #21430 )
5 years ago
hong
ac8546701d
Add dygraph execution context ( #20157 )
...
* add_dygraph_execution_context
* add dygraph infershape context and execution context; test=develop
* fix imperative bug; test=develop
* remove inputs and outputs interface from execution context,
because it has the same function as inputNames;
test=develop
* remove tracer_test ctest; test=develop
* fix split op bug; test=develop
* fix unittests bug; test=develop
* fix distribute test bug; test=develop
* fix ngraph compile bug; test=develop
* fix grad maker bug; test=develop
* fix load op bugs; test=develop
* fix operator.cc construct bug; test=develop
* remove useless name find in operator; test=develop
* add tracer_test; test=develop
* fix concat, split bug; test=develop
* remove tracer_test unittest; test=develop
* fix attribute check bug; test=develop
* add test code to fix coverage; test=develop
* remove useless code, change check backward input in engine; test=develop
* unlock var type infer shape;test=develop
* add ShareAllLoD api; test=develop
* add dygraph infershape context unittest; test=develop
* remove increase and decrease lod in dygraph; test=develop
* add override; test=develop
* fix increase decrease lod; test=develop
* fix paddle_enforce; test=develop
* disable lod op dygraph check; test=develop
* fix paddle enforce error; test=develop
* add comment for op_registry and OperatorBase; test=develop
* optimize the comment of op_registry; test=develop
* fix format of comment; test=develop
* fix format of comment; test=develop
* optimize the format of comment; test=develop
* optimize the format of the comment; test=develop
* optimize comment of op_registry; test=develop
5 years ago
Zeng Jinle
09696d5df8
Use system allocator in OpTest ( #21335 )
...
* use system allocator in unittests, test=develop
* fix op bugs, test=develop
* fix tensor copy bug when src and dst are the same, test=develop
5 years ago