Paddle

Commit Graph

Author	SHA1	Message	Date
Yanghello	62b4ff7dd2	Aes_cipher_test and cipher_utils_test failed fixed (#24816 )	5 years ago
Yanghello	5a7a517cde	Add crypto api (#24694 )	5 years ago
Chen Weihang	19e5f7879c	Append error op hint for GradOpMaker (#24750 ) * append error op hint for grad op maker, test=develop * add unittests for coverage, test=develop	5 years ago
ShenLiang	950892044f	fix conflict, test=develop (#24238 )	5 years ago
hutuxian	e6b87b3193	Support AucRunner in PaddleBox (#22884 ) * Support AucRunner in PaddleBox * update some code style	5 years ago
wangchaochaohu	dbfe5333c5	Add pe profiler Event (#24611 )	5 years ago
Wilber	ba2f8f0ce4	fix embedding_eltwise_layernorm_fuse_pass. test=develop (#24592 )	5 years ago
hutuxian	0ec3a42e97	Random Dump (#24477 ) * Refactor code for dump_field & dump_param: abstracting the common function in base class. * Support dump randomly & random with lineid * Support specify the random interval, which avoids printing too much logs.	5 years ago
Yiqun Liu	6b464f969a	Add an operator node in unittest to make the fusing result unique. (#24617 )	5 years ago
Yiqun Liu	560c815390	Add some check for CUDA Driver API and NVRTC (#22719 ) * Add the check for whether CUDA Driver and NVRTC is available for the runtime system. * Call cuInit to initialize the CUDA Driver API before all CUDA callings. test=develop * Change the behavior when libnvrtc.so can not be found, printing a warning instead of exiting. test=develop * Do not initialize CUDA Driver API for windows and macos. test=develop * Remove the call of cuInit when entering paddle and enable the test_code_generator. test=develop * Add some built-in functions for __half. test=develop * Change save_intermediate_out to false in unittest. test=develop * Fix error reference to tempropary variable when seting including path for device_code. test=develop	5 years ago
pawelpiotrowicz	db2b6b6568	Hide globals & redesign restore PR (#24279 ) test=develop	5 years ago
Jacek Czaja	8b88cd5167	[oneDNN] Fix to inplace pass (#24442 ) * - Disabling inplace pass test=develop - Disable cycles test=develop - fix test=develop - Enhancement to in-place - Lint fixes test=develop * - Lint fixes test=develop	5 years ago
hutuxian	123255cf9f	change InitializeGPU to InitializeGPUAndLoadModel (#24377 ) * Add InitializeGPUAndLoadModel to solve random hang when downloading sparse parameters. * Update SaveBase to solve test problem.	5 years ago
Chen Weihang	aa0f254fbe	Add macro BOOST_GET to enrich the error information of boost :: get (#24175 ) * add new macro BOOST_GET_SAFELY & unittests, test=develop * add different macro type, test=develop * fix get macro type in executor, test=develop * four macro part change backup * using one macro for all case, test=develop * revert attribute change, test=develop * change to three func to solve gcc4.8 bug, test=develop * polish some details, test=develop	5 years ago
Wojciech Uss	db052009c7	Enabled quantize all and skip missing in QAT (#24281 ) * Enabled quantize all and skip missing in QAT	5 years ago
Huihuang Zheng	8a1a2af82e	Add Assert Op (#24280 ) 1. To make ProgramTranslator to support `assert` grammar, this PR adds `assert` python API and C++ code. 2. Fix a bug: graph_pattern_detector.h #include <gtest/gtest_prod.h> but didn't declared dependency at CMakeLists, which can cause single build failure. 3. Refactoring `Formatter` in print_op to make it reusable and reuse the formatter to print in assert op.	5 years ago
joanna.wozna.intel	356f5ee220	[Refactoring] Unify op-dequant squashes (#24277 )	5 years ago
liym27	ac9a7eeea4	[Dy2Stat]Support list pop (#24250 ) * Replace dygraph_to_static_func with @declarative or program_translator.get_func in test_list.py * Add comments in ConditionalBlock. * Support list pop last item. * Support pop the i-th item. * Support an empty tensor array as Input in assign op and set the kernel type is float.	5 years ago
xujiaqi01	1034ca316f	add timeout and http store in communication (#23436 ) * add timeout and http store in communication, add revert and confirm in fleet * test=develop	5 years ago
wawltor	d1e1d85881	add the graph batch reader for pslib mode (#24178 ) Add the pslib graph batch reader mode, add the test case for this change	5 years ago
joanna.wozna.intel	b43b46e619	[INT8] Add requant-op squash (#24143 )	5 years ago
hutuxian	3e2bc8715f	Try to fix UT Random Fail (#24223 )	5 years ago
Sylwester Fraczek	e1a7a88057	added reshape transpose matmul fuse pass (#23754 )	5 years ago
Chen Weihang	9b851ba216	[dy2static] Add print transformer and unify print format (#24068 ) * add print transformer & unify print format, test=develop * remove using of dygraph_to_static_func, test=develop * remove python stdout capture, test=develop * fix compatibility problems for PY2, test=develop * fix detail error, test=develop * fix type analysis bug, test=develop * fix print tuple compatible error in PY2, test=develop * replace get_func to declarative, test=develop * fix detail bug, test=develop * fix some detail problems, test=develop * change visit_call in print transformer, test=develop	5 years ago
wangchaochaohu	fa43d74a3a	fix the intermediate node of graph for fusion group test=develop (#24184 )	5 years ago
Yiqun Liu	ecfddebbef	Add the implementation of inverse (#23310 )	5 years ago
liuwei1031	9a93f6aae0	improve efficiency of runtime InferVarType (#22778 ) * save InferVarType changes, test=develop * remove code comments, test=develop * tweak code, test=develop * fix compilation warning, update merge_ids_op split_ids_op to new interface, test=develop * modify fused_bn_activation_op, test=develop * fix error of fused_bn_activation_op, test=develop * fix PADDLE_ENFORCE and unittest coverage issue, test=develop * tweak PADDLE_ENFORCE messages, test=develop * improve unittest coverage, test=develop * add StaticGraphInferVarType class, test=develop * rebase develop branch, test=develop * fix unittest error, test=develop * remove comments, test=develop * improve unittest coverage, test=develop * imporve error message and imporve unittest coverage, test=develop * upgrade InferVarType API, test=develop * tweak pyfunc error message, test=develop * fix compilation conflict - save_combine_op, test=develop	5 years ago
wangchaochaohu	2270864019	Fusion group optimize for cuda codegen(#23940 )	5 years ago
ShenLiang	94dfb7d770	opt the postprocess, test=develop (#24155 )	5 years ago
Jacek Czaja	eb411613e9	[DNNL] refine activations Inplace support (#24145 )	5 years ago
Jacek Czaja	461e6a01ec	[DNNL] activations Inplace support (#24123 )	5 years ago
Zhang Ting	fb0729ee7f	avoid warnings in MAC compile (#24124 )	5 years ago
arlesniak	d31a174f51	added fusing matmul-transpose-reshape pass (#23866 )	5 years ago
Zeng Jinle	a67eea9f00	polish code by adding final, test=develop, test=develop (#24114 )	5 years ago
Zeng Jinle	acef55df04	fix isolated var fetch bug, test=develop (#24070 )	5 years ago
Jacek Czaja	c6c65c65c7	[DNNL] Added elementwise_add mkl-dnn inplace (#23477 )	5 years ago
hutuxian	9ff558a46f	Optimize DataFeed (#23957 ) * Make batch_float_feasigns & batch_uint64_feasigns as member variable	5 years ago
Zhou Wei	7817003795	Optimize the error messages of paddle CUDA API (#23816 ) * Optimize the error messages of paddle CUDA API, test=develop * fix the error messages of paddle CUDA API, test=develop * Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL,test=develop * remove build_ex_string,test=develop * merge conflict,test=develop	5 years ago
ShenLiang	7f0b2c7407	fix memory leaking problem of dataset, test=develop (#23955 )	5 years ago
guofei	2b896c1f6b	Support LoDTensorArray in fetch (#23645 ) * Support LoDTEnsorArray in fetch op test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop * Support LoDTensorArray in fetch test=develop	5 years ago
Yiqun Liu	071a702060	Fix the error misjudgment when there are control nodes in graph. (#23943 )	5 years ago
hutuxian	df64a96686	support set_test_mode and set comlog level(#23905 )	5 years ago
Zhang Ting	b88662254b	use 32 bit index to improve expand op (#23899 ) * use 32 bit index to improve expand op, test=develop * remove redundant code, test=develop	5 years ago
yiicy	a1e7387919	Variable error message enhancement, test=develop (#23548 )	5 years ago
yaoxuefeng	5b69242fab	modify datanorm op test=develop (#23030 )	5 years ago
Zeng Jinle	c49791362f	Correct reader device index (#23802 ) * correct reader device index, test=develop * fix async executor scope var initialization, test=develop	5 years ago
joanna.wozna.intel	12ba05ce0c	Add scale-matmul fuse pass (#23734 )	5 years ago
Chen Weihang	532079a222	API (CompiledProgram) error message enhancement (#23559 ) * api compild program error polish, test=develop * fix coverage problem, test=develop * fix details & add unittests, test=develop * add test for coverage, test=develop	5 years ago
wawltor	f3d7db98f1	Add the support of bool list for assign_value op (#23774 ) * Add the support of bool list for assign value, test=develop * Fix the assign op test case for bool dtype, test=develop	5 years ago
zhongpu	b4b6763ab2	fix bug for exhaustive_search in conv_fusion_op, test=develop (#23727 )	5 years ago
Yiqun Liu	9e85d02373	Avoid crash when calling ctx->HasInputs and add the check of shape in fill_copnstant op. (#23698 )	5 years ago
Huihuang Zheng	1d3b0134ca	Error Message Enhancement (#23483 ) This PR enhances error messages of several API/OPs: ParallelExecutor (python && C++) Executor (python && C++) StaticRNN (python) IfElse (python) cond (python) split_lod_tensor (python && C++)	5 years ago
wangchaochaohu	fb34bdb40c	API/OP(fill_constant) error message enhancement (#23584 )	5 years ago
liuwei1031	2fd728a978	add new dot op(#23418 )	5 years ago
chenhaoze	9b06dd8628	Add three passes and api reference of paddle_pass_builder. test=develop (#23741 ) * Add three passes and api reference of paddle_pass_builder.h	5 years ago
xujiaqi01	d98084e7ec	add save with prefix (#23449 ) * add save with prefix * test=develop	5 years ago
joanna.wozna.intel	5ee099ca57	Op-requant squash (#23665 ) * Op-requant squash test=develop * Add matmul to op-requant test test=develop	5 years ago
hutuxian	94a3789fd0	Add AfsAPI in PaddleBox (#23419 ) * Involves AfsAPI to resolve slow downloading. * Mainly used in PaddleBox	5 years ago
liym27	06d4aa4e73	API (BuildStrategy) error message enhancement. (#23462 )	5 years ago
Zhen Wang	84cd45f674	Solve the conflict of ops with the same name, test for CI. (#23573 ) * solve the conflict of ops with the same name. test=develop	5 years ago
Zeng Jinle	7f3e0eaad1	refine error msg, test=develop (#23589 )	5 years ago
mozga-intel	3baaee9aab	Remove: NGraph engine from PDPD repository (#23545 ) * Remove the NGraph engine from PDPD repository 1. Each operator was removed from the operator's directory 2. Each test was removed from the unittest directory 3. The parallel executor support was removed from the PDPD 4. The CMake file was removed from the PDPD 5. The NG flags were removed from the repository test=develop * Remove ngraph from: 1. Cmake file 2. Python file test=develop	5 years ago
Zhang Ting	1b8fe70e48	fix VLOG, test=develop (#23327 )	5 years ago
Chen Weihang	45880f604b	API(Program) error message enhancement (#23519 ) * polish api program error message, test=develop * fix condition error, test=develop * fix test prune error, test=develop * fix coverage problem, test=develop	5 years ago
joanna.wozna.intel	3cb5623dad	Add matmul dequant squash (#23505 ) test=develop	5 years ago
wangchaochaohu	c1187cd6f4	Fp16 refine for fusion group (#23472 )	5 years ago
joanna.wozna.intel	ce08fdcf2b	Add support for INT8 matmul in C-API quantization (#23463 ) * Integrate matmul with cpu_quantize_pass test=develop * Add matmul checking scales test=develop * Change condition of matmul quantization test=develop * Remove redundant var test=develop	5 years ago
Aurelius84	8674a82c03	Op (Scope) error message enhancement (#23458 ) * Op (Scope) error message enhancement test=develop	5 years ago
wangchaochaohu	d085f79228	fix untime fail for output var stop_gradient=True for fusion group (#23317 )	5 years ago
qingqing01	6162cf2f2e	Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO. (#23426 ) * Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO	5 years ago
ShenLiang	5223e2bbc4	Add a new DataFeed named PaddleBoxDataFeed (#23321 ) * add paddleboxdatafeed * add ifdef linux and boxps * add untest for datafeed * fix untest of test_paddlebox_datafeed * fix untest * rename function	5 years ago
Chen Weihang	75bd350710	Implement StaticModelRunner to support dygraph fine-tune static graph pre-training model (#23171 ) * static model runner basic implement, test=develop * add run program op to execute loaded program, test=develop * refactor static model runner & run program op, test=develop * reset engine.cc to resolve conflict * adapt the change of dygraph double grad, test=develop * refactor impl to solve control flow error, test=develop * clear debug code, test=develop * fix ci str compatible error & checkout dygraph grad maker & add example, test=develop * hide api & add op test, test=develop * fix run program op test places error, test=develop * fix program by review comment, test=develop * delete change var desc name, test=develop * fix other program by review comment, test=develop * remove _static_graph_guard, test=develop * add selectedrows test, test=develop * remove desc parser, test=develop * fix detail program, test=develop * change socpe create & add test, test=develop	5 years ago
Kaipeng Deng	d223a24904	Fix inplace_abn compile error on Windows (#23464 ) * fix inplace_abn windows compile error. test=develop	5 years ago
Tao Luo	0b583235f5	Revert "Solve the conflict of ops with the same name. (#23199 )" (#23494 ) This reverts commit `abe3e6906d`. test=develop	5 years ago
wawltor	6577f91b74	Add the sum op to API 2.0， add some parameters for new api * Add the sum op to API 2.0, test=develop * Fix the import meesage in common_ops_import	5 years ago
Zhen Wang	abe3e6906d	Solve the conflict of ops with the same name. (#23199 ) * solve the conflict of ops with the same name. test=develop	5 years ago
tianshuo78520a	d8a21ef6f3	test=develop;fix error (#23467 )	5 years ago
zhongpu	dbfbd7eac4	support Exhaustive search in dygraph (#23415 ) * use global conv cache; test=develop * use singleton cache; test=develop * fix format error; test=develop * add cudnn helper header; test=develop * fix header error; test=develop * fix mac unitest; test=develop * fix mac unitest; test=develop * fix file format; test=develop * fix include file error, test=develop * remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop * fix test_elementwise_mul_op_dim, test=develop * fix compile error, test=develop Co-authored-by: phlrain <phliuhongyu@126.com>	5 years ago
gongweibao	24a063f6ac	Add fleet checkpoint on local fs and remote fs(such as hdfs) for EDL (#22586 )	5 years ago
wangchaochaohu	5c60778731	polish the code of fusion group test=develop (#23370 )	5 years ago
Leo Chen	a62599a888	[feature] prune program by feed and fetch_list automatically (#22474 ) * prune train program by fetch_list, test=develop * add unittest for prune, test=develop * fix pruned feed, test=develop * support ParallelExecutor and feed prune, test=develop * add comments, test=develop * update unittest, test=develop * update unittests, test=develop * remove debug code, test=develop * support cond in clone, test=develop * support cond in prune, test=develop * support multiple minimize, test=develop * support cache, test=develop * fix _copy_param_info_from, test=develop * support python2 str, test=develop * remove debug code, test=develop * fix bug of caching CompiledProgram, test=develop * fix multi_device issue, test=develop * tmp * support tuple in fetch_list and overriding use_prune, test=develop * dont use nonlocal in python2, test=develop * remove nonlocal, test=develop * code clean, test=develop * code clean, test=develop * feed list, test=develop * test adam, test=develop * follow comments, test=develop * reduce duplicate code, test=develop * update comments, test=develop	5 years ago
Yiqun Liu	bc2981e998	Disable test_code_generator and test_post_training_quantization_mobilenetv1 (#23440 )	5 years ago
Zeng Jinle	29337f4e17	fix conflict of inferne partial feed with gpu parallel ssa graph executor, test=develop (#23400 )	5 years ago
zhongpu	bfb07aafe8	Revert "Exhaustive search (#22821 )", test=develop (#23401 ) This reverts commit `48144e4099`.	5 years ago
xujiaqi01	93ea9dd27a	fix stat var in hogwild worker (#23367 ) * fix stat var in hogwild worker * test=develop	5 years ago
joanna.wozna.intel	8c463700e1	Add default pass attributes (#23042 )	5 years ago
zhongpu	48144e4099	Exhaustive search (#22821 ) * use global conv cache; test=develop * use singleton cache; test=develop * fix format error; test=develop * add cudnn helper header; test=develop * fix header error; test=develop * fix mac unitest; test=develop * fix mac unitest; test=develop * fix file format; test=develop * fix include file error, test=develop * remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop * fix test_elementwise_mul_op_dim, test=develop Co-authored-by: phlrain <phliuhongyu@126.com>	5 years ago
Kaipeng Deng	21d95be0db	Add inplace abn op (#22806 ) * add inplace_abn_op. test=develop	5 years ago
Yi Liu	821534efd3	add paralell_executor dependancy to collective_helper (#23380 ) test=develop	5 years ago
Zeng Jinle	3a21980b78	add reader dependency pass, test=develop (#23301 )	5 years ago
wangchaochaohu	d280106007	Add support for attr type Op and add fill_constant Op and scale Op (#23163 ) * add attr support for fusion group and add support for fill_constant and scale Op	5 years ago
xujiaqi01	3a45767d49	add fleet pslib pull and push sparse op and push dense op (#23139 ) * add fleet pslib pull and push sparse op and push dense op * test=develop	5 years ago
Jacek Czaja	2bb1b0e89e	[DNNL] Added MKL-DNN inplace pass for C-API inference (#23315 )	5 years ago
Wojciech Uss	f836c8aa8f	add check for scales and a message (#23119 )	5 years ago
Tao Luo	c00d427d52	simplify the cmake log of ir/CMakeLists.txt (#23262 ) test=develop	5 years ago
xujiaqi01	68ea1ad55b	add clear one table (#23089 ) * add clear_one_table * test=develop	5 years ago
danleifeng	ae3bb16d06	add MaskAucCalculator in paddlebox (#23157 ) * add maskauc in paddlebox; test=develop	5 years ago
Zeng Jinle	53e6f8e1da	rename macro, test=develop (#23161 )	5 years ago
Zeng Jinle	7ca77a90ac	add Tensor::IsSharedBufferWith method, test=develop (#23175 )	5 years ago
Zeng Jinle	b8886bf122	rename no_need_buffer_vars_macro, test=develop (#23159 )	5 years ago
Zeng Jinle	bae5930ba1	fix graph attr copy issues, test=develop (#23191 )	5 years ago
Zeng Jinle	acfc9b8a70	Reader sequential and inference partial feed (#22699 ) * sequential reader stage 1, test=develop * fix ut, test=develop * fix iterable=False reset bug, add some logs and polish code, test=develop * inference feed partial data, test=develop * Turn on keep_order=True for test, test=develop * enhance ut to test more cases, test=develop * test commit for reverting * Revert "test commit for reverting", test=develop This reverts commit 80aef42ef52ba1ee79627d6f663a624ec4f12f58. * add ut of merged and unmerged results, test=develop * add more uts for coverages and add en doc of api, test=develop * follow comments, test=develop * change note style, test=develop	5 years ago
Wilber	95b356a069	update embedding_eltwise_layernorm fuse and kernel. test=develop (#23114 ) update embedding_eltwise_layernorm fuse pass and fused kernel, to support multi input	5 years ago
Zeng Jinle	a31d7328b7	Add dygraph double grad implementation (#22939 ) * add double grad implementation for dygraph, test=develop * polish code, add uts, test=develop * fix place bug, test=develop * polish codes, add more uts for coverages, test=develop * add no_grad_set, test=develop * add star gan ut, test=develop * follow comments, test=develop	5 years ago
Yiqun Liu	3af4771122	Add the detection and code-generation of sqrt and square in fusion_group (#23095 )	5 years ago
hutuxian	0c30098f8b	Add need_save_delta parameter to solve OOM (#23097 )	5 years ago
Sylwester Fraczek	abee05a8c8	added mkldnn swish activation (#23041 )	5 years ago
Zhang Ting	880eb04d93	skip PrepareData when it is unnecessary (#22839 ) * remove unnecessary prepare data, test=develop * Op in while block will not skip PrepareData, test=develop	5 years ago
Adam	5842ae6785	Revert "Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695 )" (#22985 )	5 years ago
yaoxuefeng	660ff18488	fix datsset test=develop (#23043 )	5 years ago
wangchaochaohu	3757e0687c	Add Unittest for backward of fusion group (#22932 ) * add fusion group test for backward and refine code	5 years ago
wangchaochaohu	f0d193a23c	Cast fusion for fusion group (#22876 ) * add support for expression type convert and add cast Op support in fusion group	5 years ago
Wilber	ff3ddbb502	add skip_layernorm pass. test=develop (#22895 ) * add skip_layernorm pass. test=develop	5 years ago
Adam	056edf3929	Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695 )	5 years ago
Zhaolong Xing	8d6dc102fe	[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494 ) * 1. add embedding eltwise layernorm fuse 2. add embedding eltwise layernorm op 3. refine inplace_add_relu 4. refine fc_eltwise_layernorm test=develop * 1. refine fc test=develop * fix comments test=develop * fix comments test=develop	5 years ago
Zeng Jinle	d33c4343e1	Imperative tracer refactoring (#22457 ) * refine grad maker, test=develop * refactor tracer stage 1, test=develop * merge develop to solve conflict third times, test=develop	5 years ago
liu zhengxi	61fef9754b	Fix fc padding bug during inference fusion (#22860 ) * fix fc padding during fusion, test=develop * fix optim model inference after SaveOptimModel, test=develop	5 years ago
wangchaochaohu	dbb0b9b3b6	refine the profiler print (#22823 ) * refine the profiler print test=develop	5 years ago
hong	5191e54494	reduce default attrs for dynamic graph (#22850 ) * reduce default attrs for dynamic graph, test=develop * add some explanations for explicit attr, test=develop * tweak explicit attr comments, test=develop	5 years ago
Zhang Ting	72ff5a09c3	fix print bug of profile, test=develop (#22804 )	5 years ago
Zhang Ting	4e8bc02461	add fluid.device_guard to specify the device type for Op (#22254 ) * add fluid.device_guard to specify the device type for Op	5 years ago
Zhen Wang	89cfa49156	Unmerged fetch list (#22635 ) * update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results. * add the unit test for fetch_unmerged. * update ut for multi-card and multi-cpu. * add the error message and the user suggestion in FetchOpHandle. test=develop	5 years ago
hutuxian	53a2b68f4e	support customized download command in dataset (#22782 ) * user can call dataset.set_download_cmd to set its customized download cmd * add UT to cover this scenario	5 years ago
wangchaochaohu	ca9e77a8d4	add sum op support for fusion group (#22771 ) * Add the codegen and auto fusion for sum Op in fusion group	5 years ago
tianshuo78520a	433cef03e5	fix typo word (#22784 )	5 years ago
Leo Chen	b2c1be851a	support cond in clone, test=develop (#22657 ) * support cond in clone, test=develop * refine code, test=develop * refine code, test=develop * follow comments, test=develop * refine code, test=develop	5 years ago
hutuxian	175954d894	PaddleBox Framework Part2 (#22466 ) * Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator. * Add a config for DynamicAdjustChannelNum function to denote whether we will discard the remaining instances when they are not be distributed evenly. * Remove CPU code in Pull/PushSparse and we will add it back when testing it fully. * Fix some known issues: such as copying persistable vars after one epoch running.	5 years ago
GaoWei8	cdf5f6fb8c	Add an inference interface to disable FC padding (#22097 ) * Add an interface of disabling FC padding * fix bert regression * polish fc padding interface * recover pass function * fix argument error * fix mkldnn error	5 years ago
tianshuo78520a	d2ba91aad1	fix typo words (#22653 )	5 years ago
tangwei12	66a3150135	SYNC with communicaotor (#22344 ) * add sync communicator and implement	5 years ago
Yiqun Liu	22bbd54719	Add the support of fp16 in fusion_group (#22239 )	5 years ago
wangchaochaohu	c65c6ae534	add flag to control profile level in python API (#22319 ) * add python flag to control profile level test=develop	5 years ago
123malin	00594c1c88	support dumping params/grads in transpiler mode (#22490 )	5 years ago
flame	f7eafca828	remove python inference warning (#22602 )	5 years ago
Wilber	9a8203aa25	fix fc_lstm_fuse when multi sub-graph use same fc_bias. test=develop (#22551 ) When a model contains multiple fc_lstm subgraphs whose fc ops share the same persistable bias, the bias node should not be deleted; only the non-persistable nodes need to be removed.	5 years ago
Zhaolong Xing	8acd745c25	[Ernie GPU Optim]: Fuse three fc to multihtead matmul (#22486 ) * 1. optim multihead matmul: fuse three fc to multihtead matmul test=develop * fix conflict test=develop * fix comments test=develop	5 years ago
Yiqun Liu	96770f519e	Disable fusion_group for windows and mac in build_strategy. (#22549 ) test=develop	5 years ago
tangwei12	b0675c8193	fix bug with compiledProgram (#22495 ) * add thread barrier for the compiled program	5 years ago
hutuxian	1a7962be97	Paddlebox about box_wrapper (#22497 ) Refine PaddleBox Framework, Main functions: * Add MetricMsg util class, which can calculate metrics like AUC, bucket_error, COPC. * Replace FeedPass with new interface: BeginFeedPass & EndFeedPass * Refactor Pull/Push Sparse Function in box_wrapper. * Use CUDA Kernel to copy keys and copy feasign between tensor and boxps struct. * Cache copied keys in pull sparse in order to reuse it in push period.	5 years ago
yaoxuefeng	2235ee1a5e	multi-loss optimization by adding a DownpourOpt worker (#22025 ) * update * update test=develop * update compile set test=develop * update compile set test=develop * update test=develop * update test=develop * update test=develop * update compile setting test=develop * update compile setting test=develop * update run demo test=develop * update test=develop * update test=develop * fix test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update format test=develop * update format test=develop * update style test=develop * update style test=develop * change style test=develop * change style test=develop * change style test=develop * add dataset unittest test=develop * update test=develop * update for record test=develop * udpate style for record test=develop * update for record test=develop * update for record test=develop * update for record test=develop * fix format test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop	5 years ago
zhaoyuchen2018	54970444ce	Improve transpose performance with tile sm copy, test=develop (#22311 ) * Refine code, fix select tile error,test=develop * Refine element type and some comments, test=develop * Refine comments and gpu utils, test=develop * Remove some useless condition * Refine floor and ceil, test=develop * refine for loop. test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>	5 years ago
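Wilber	a90fa54092	Compile without nccl deps. [1/2] (#22509 ) Support compiling without depending on NCCL. [1/2] In a multi-card setup, if the build was not compiled with the WITH_NCCL switch turned on, the cards cannot communicate, so only a single card can be selected for use. Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	5 years ago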
guofei	3a59a7a11f	Make assign op support LoDTensorArray and modify while_loop API (#22309 ) This PR makes assign op support LoDTensorArray and enable the loop_vars in while_loop to support tuple or list.	5 years ago
Yiqun Liu	dcfb603897	Enable the detection of subgraph composed of grad ops (#21223 ) * Add the first implememtation of fusion_group op #19621 (#3) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop * Add DeviceCodePool to manage all device codes. * Add the first implementation fusion_group op. * Add unit-test for fusion_group op. * Add the check of result. * Add the check of nvrtc in unit-test. test=develop * Add comment to explain the inputs, outputs and features of fusion_group op. test=develop * Disable fusion_group op for mac and windows. test=develop * Make the compiling of device code return status instead of hanging up. test=develop * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API. * Unify fusion_group_op's input and output names. test=develop * Add the check of CUDA driver library in unittest. test=develop * Enable generating code for a given subgraph. #21126 (#4) * Enable generating code for a given subgraph. * Support sorting the subgraph. * Remove the rearange of expressions because we use the sorted subgraph directly. * Enable generating code for a subgraph which is composed of grad ops. * Use expression information to check the accuracy in unittest. * Separate load and store from computation expressions. test=develop * Improve the loading statements in generated codes. test=develop * Remove unused arguments from formal list. test=develop * Enable the detection of subgraph of grad ops. * Generate code for detected subgraph in fusion_group_pass. * Add an option in BuildStrategy to enable fusion_group_pass and add unittest. 
test=develop * Fix a bug when checking whether the shape of all inputs are the same. * Add debug information. * Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#5) test=develop * Call subgraph_detector in fusion_group pass. test=develop * Disable fusion_group when WITH_GPU is OFF. test=develop * Refine all PADDLE_ENFORCE message. test=develop * Fix the case that some inputs are not defined in grad ops, and set op_role for fused op. test=develop * Follow review comments. test=develop	5 years ago
joanna.wozna.intel	17f2c0899f	Add dequant-scale squash (#22409 ) * Add dequant scale squash test=develop * Correct dequant-scale squash test test=develop	5 years ago
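Wilber	7bc4b09500	add WITH_NCCL option for cmake. (#22384 ) Added WITH_NCCL to the cmake options to explicitly specify whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned off if WITH_GPU is OFF. Added the PADDLE_WITH_NCCL definition. Single-machine, single-card setups can disable NCCL compilation; multi-card setups need NCCL enabled by default, and if NCCL is disabled only a single card can be used. Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>	5 years ago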
xujiaqi01	d51ffe860a	fix copy table bug (#22432 ) * fix copy table bug of lost some feasign * test=develop	5 years ago
石晓伟	e1b0d7cbb1	remove anakin from code, test=develop (#22420 )	5 years ago
xujiaqi01	371f377bea	add GeneralRoleMaker (#22295 ) * add GeneralRoleMaker which is for general usage * test=develop	5 years ago
Michał Gallus	269db0d1d1	[DNNL] Fix accuracy in INT8 FC (#22404 ) * Enable quantize to reorder to nchw as well * Correct FC MKL-DNN input dim requirements to accept 3D * Improve DNNL FC format, error and 3D input handling test=develop * Improve error checking in FC test=develop * Improve PADDLE_ENFORCE messages in fc-related files * Remove data layout attribute from obligatory pass args test=develop * Fix message in fc_mkldnn_pass to be logically correct test=develop	5 years ago


2992 Commits (4d7d661249652f957ee918c02760213cd3681799)