* Support assignment to a Variable in dynamic mode. Note: backward is not handled.
* Rewrite VarBase __setitem__ for high performance.
* Tried three approaches to implementing __setitem__ and benchmarked each.
* Kept the fastest approach: C++ code that does not trace an op (see the sketch below).
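A minimal sketch of the resulting behavior, assuming the fluid dygraph API of this period:

```python
# Minimal sketch: slice assignment on a dygraph Tensor now runs through
# C++ without tracing an op (fluid dygraph API assumed).
import numpy as np
import paddle.fluid as fluid

with fluid.dygraph.guard():
    x = fluid.dygraph.to_variable(np.ones((3, 4), dtype='float32'))
    x[0] = 2.0           # __setitem__ fast path; note: no backward support
    x[1:3, 0:2] = 0.0    # slice assignment takes the same path
```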
* add float64 input to ctc_loss
* modified error message of warpctc
* update repo and tag of warpctc
* add test for warpctc with float64 input
* modified warpctc.cmake to ensure it always builds
* resolved sample code bug of warpctc
* add core.ops in warpctc dygraph
* fix a bug of test
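A minimal sketch of the newly supported float64 path, assuming the paddle.nn.functional.ctc_loss API:

```python
# Minimal sketch: ctc_loss now accepts float64 inputs alongside float32
# (shapes and values are illustrative).
import numpy as np
import paddle

log_probs = paddle.to_tensor(
    np.random.rand(5, 2, 10).astype('float64'))            # (T, N, classes)
labels = paddle.to_tensor(np.array([[1, 2], [3, 4]], dtype='int32'))
input_lengths = paddle.to_tensor(np.array([5, 5], dtype='int64'))
label_lengths = paddle.to_tensor(np.array([2, 2], dtype='int64'))

loss = paddle.nn.functional.ctc_loss(log_probs, labels,
                                     input_lengths, label_lengths,
                                     blank=0, reduction='mean')
```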
* support using add instead of sum for gradient accumulation
* add inplace addto pass
* add grad_add op and inplace addto pass
* remove debug code
* code refine
* fix bug when several sum ops are inserted at the same op_idx
* fix Flags type
* add addto attribute for conv3d
* fix ut
* code clean
* fix type
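A minimal sketch of enabling the pass from Python, assuming the BuildStrategy.enable_addto switch introduced here:

```python
# Minimal sketch: gradient accumulation uses in-place add (grad_add/addto)
# instead of a separate sum op when enable_addto is on.
import paddle.fluid as fluid

build_strategy = fluid.BuildStrategy()
build_strategy.enable_addto = True   # also gated by FLAGS_max_inplace_grad_add

compiled_prog = fluid.CompiledProgram(
    fluid.default_main_program()).with_data_parallel(
        loss_name='loss',            # 'loss' names an illustrative variable
        build_strategy=build_strategy)
```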
* Finished ChannelWiseQuantDequantAbsMaxOp and passed unittests.
* Finished channel-wise quantize strategy in imperative quantization.
* Added CUDA code of ChannelWiseQuantDequantMaxAbsOp
* Add quant_axis for channel_wise quant.
* fixed a bug in unittests which did not trigger the axis = 1 case and could not meet the coverage requirement.
* Added some assert information and fixed some coding style mistakes.
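A plain NumPy sketch of the semantics (not the Op itself): quant_axis picks the dimension that gets one abs-max scale per slice.

```python
# Minimal sketch: channel-wise abs-max quant/dequant keeps one scale per
# slice along quant_axis (quant_axis=0 for conv weights, quant_axis=1 for
# mul/matmul weights).
import numpy as np

def channel_wise_quant_dequant_abs_max(x, quant_axis=0, bits=8):
    bnt = (1 << (bits - 1)) - 1                    # 127 for int8
    x_t = np.moveaxis(x, quant_axis, 0)            # channels first
    flat = x_t.reshape(x_t.shape[0], -1)
    scales = np.abs(flat).max(axis=1)              # per-channel abs-max
    quant = np.round(flat / scales[:, None] * bnt)
    dequant = quant * scales[:, None] / bnt
    return np.moveaxis(dequant.reshape(x_t.shape), 0, quant_axis), scales
```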
* update amp_check_finite_and_scale_op for static_amp.
* use amp_check_finite_and_scale in static graph amp.
* set grads to zero when they contain infinite values (for the amp_check_finite_and_scale op).
* add update_loss_scaling op in cpp.
* add update_loss_scaling_op unit test.
* update the doc of the check_finite_and_unscale op
* Skip the gradient update when the gradients contain infinite values.
* update the way to zero grads.
* update test_update_loss_scaling_op.py
* add log info when find infinite grads.
* add the unit test for UpdateLossScaling Layer.
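A minimal sketch of the control logic these ops implement (constants are illustrative, not the ops' defaults):

```python
# Minimal sketch of dynamic loss scaling: shrink the scale and skip the
# update when inf/nan gradients are found, grow it after a streak of
# finite steps.
def update_loss_scaling(found_inf, scale, good_steps,
                        incr_every_n=1000, incr_ratio=2.0, decr_ratio=0.5):
    if found_inf:
        return scale * decr_ratio, 0   # grads are zeroed, step is skipped
    good_steps += 1
    if good_steps >= incr_every_n:
        return scale * incr_ratio, 0
    return scale, good_steps
```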
* enhance collect_op for dygraph, test=develop
* enhance detection ops with lod, test=develop
* support the case of no bbox left in generate_proposals, test=develop
* unify MultiLevelRoisNum, test=develop
* update core.ops, test=develop
* add op register for new input & output, test=develop
* make use of the global 'use_mkldnn' flag in layer_helper
* update for CI
* update for CI, relu test
* update for CI, relu test added, make FLAGS_use_mkldnn a public flag
* added more strict tests, fixes after review
* fixes after review
* fixes after review, CI stuff
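A minimal sketch of the resulting workflow, assuming the flag is read at op-construction time:

```python
# Minimal sketch: FLAGS_use_mkldnn becomes a public flag that LayerHelper
# consults when building ops.
import os
os.environ['FLAGS_use_mkldnn'] = '1'   # set before constructing the program

import paddle.fluid as fluid

x = fluid.data(name='x', shape=[None, 16], dtype='float32')
y = fluid.layers.relu(x)   # LayerHelper can pick use_mkldnn from the flag
```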
* optimized transformation from tensor to numpy, test=develop
* modify FetchOpHandle zero-copy to deep copy in PE & CPU, test=develop
* modify py:array construct, test=develop
* fix _fetch_var to use deep copy, test=develop
* support Baidu AI Accelerator, test=kunlun
* support xpu op in separate file, test=kunlun
* update XPU error message and remove duplicated code, test=kunlun
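A minimal sketch of running on the new device, assuming an XPU build of Paddle:

```python
# Minimal sketch: execute on a Baidu Kunlun accelerator through the new
# XPU place.
import paddle.fluid as fluid

place = fluid.XPUPlace(0)   # first Kunlun device
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
```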
* expose and unify the Tensor concepts to the user
* expose tensor to user
* add copy place for Tensor
* add note
* add macro PADDLE_WITH_CUDA
* remove RUN_TYPE=DIST
* fix some errors
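A minimal sketch of the user-facing Tensor workflow this series unifies; the `_copy` helper name is an assumption about the pybind interface of this period:

```python
# Minimal sketch: set a Tensor from numpy and copy it to a place
# (the _copy name is an assumption, not a confirmed API).
import numpy as np
import paddle.fluid as fluid

t = fluid.LoDTensor()
t.set(np.arange(12, dtype='float32').reshape(3, 4), fluid.CPUPlace())
t2 = t._copy(fluid.CPUPlace())     # copy to a given place
print(np.array(t2))
```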
* fix the double grad bug for the star gan. test=develop
* update the retain_graph parameter doc. test=develop
* add the unit test for the retain_graph parameter. test=develop
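A minimal sketch of the documented parameter: retain_graph=True keeps the intermediate graph alive so a second backward pass (as in the StarGAN double-grad case) can run over it.

```python
# Minimal sketch: two backward passes over (part of) the same graph.
import numpy as np
import paddle.fluid as fluid

with fluid.dygraph.guard():
    x = fluid.dygraph.to_variable(np.ones([2, 2], dtype='float32'))
    x.stop_gradient = False
    y = x * x
    loss1 = fluid.layers.reduce_sum(y)
    loss2 = fluid.layers.reduce_mean(y)
    loss1.backward(retain_graph=True)   # keep graph for the next pass
    loss2.backward()
```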
* fix batch_norm & instance_norm in dygraph, test=develop
* update instance_norm,test=develop
* fix bugs,test=develop
* add more case in unittest,test=develop
* fix, test=develop
* Add a StatValue class in the backend to represent a stat.
* Add a singleton StatRegistry to maintain the collection of stats.
* For the sake of code neatness, we only support int and float types, which cover most scenarios.
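A Python rendering of the C++ design, for illustration only (the real classes live in the backend):

```python
# Minimal sketch: a singleton registry mapping stat names to int/float
# StatValue objects.
class StatValue:
    def __init__(self, value=0):
        self.value = value            # int or float

    def increase(self, n):
        self.value += n

class StatRegistry:
    _instance = None

    @classmethod
    def instance(cls):
        if cls._instance is None:
            cls._instance = cls()
            cls._instance._stats = {}
        return cls._instance

    def get(self, name):
        return self._stats.setdefault(name, StatValue())

StatRegistry.instance().get('peak_memory').increase(42)
```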
* add new macro BOOST_GET_SAFELY & unittests, test=develop
* add different macro type, test=develop
* fix get macro type in executor, test=develop
* four macro part change backup
* using one macro for all case, test=develop
* revert attribute change, test=develop
* change to three func to solve gcc4.8 bug, test=develop
* polish some details, test=develop
* Support LoDTensorArray in fetch op
test=develop
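A minimal sketch of the enabled usage, assuming the fluid control-flow API:

```python
# Minimal sketch: a LoDTensorArray can now appear directly in fetch_list
# (fetch as tensors, not numpy).
import numpy as np
import paddle.fluid as fluid

x = fluid.data(name='x', shape=[None, 10], dtype='float32')
i = fluid.layers.zeros(shape=[1], dtype='int64')
arr = fluid.layers.array_write(x, i=i)          # build a LoDTensorArray

exe = fluid.Executor(fluid.CPUPlace())
out, = exe.run(feed={'x': np.ones([2, 10], dtype='float32')},
               fetch_list=[arr], return_numpy=False)
# out is a LoDTensorArray; convert elements with np.array(out[0])
```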
* Remove the NGraph engine from PDPD repository
1. Each operator was removed from the operators directory
2. Each test was removed from the unittest directory
3. The parallel executor support was removed from PDPD
4. The CMake file was removed from PDPD
5. The NG flags were removed from the repository
test=develop
* Remove ngraph from:
1. Cmake file
2. Python file
test=develop
* Add a function to update FLAGS
test=develop
* expr flags
* distinguish public/private vars, test=develop
* fix windows issues, test=develop
* expr flag
* Add functions to get and set flags in Paddle
test=develop
Co-authored-by: sneaxiy <sneaxiy@126.com>
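A minimal sketch of the resulting interface (exposed as fluid.get_flags / fluid.set_flags; only public, writable flags are accepted):

```python
# Minimal sketch: read and write a GFlags value from Python.
import paddle.fluid as fluid

fluid.set_flags({'FLAGS_eager_delete_tensor_gb': 1.0})
print(fluid.get_flags(['FLAGS_eager_delete_tensor_gb']))
```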
* static model runner basic implement, test=develop
* add run program op to execute loaded program, test=develop
* refactor static model runner & run program op, test=develop
* reset engine.cc to resolve conflict
* adapt the change of dygraph double grad, test=develop
* refactor impl to solve control flow error, test=develop
* clear debug code, test=develop
* fix ci str compatible error & checkout dygraph grad maker & add example, test=develop
* hide api & add op test, test=develop
* fix run program op test places error, test=develop
* fix program by review comment, test=develop
* delete change var desc name, test=develop
* fix other program by review comment, test=develop
* remove _static_graph_guard, test=develop
* add selectedrows test, test=develop
* remove desc parser, test=develop
* fix detail program, test=develop
* change scope create & add test, test=develop
* sequential reader stage 1, test=develop
* fix ut, test=develop
* fix iterable=False reset bug, add some logs and polish code, test=develop
* inference feed partial data, test=develop
* Turn on keep_order=True for test, test=develop
* enhance ut to test more cases, test=develop
* add ut of merged and unmerged results, test=develop
* add more uts for coverages and add en doc of api, test=develop
* follow comments, test=develop
* change note style, test=develop
* change the CI TensorRT version from 5 to 6.0
* paddle-trt dynamic shape support init
* conv+bias or conv+bn dynamic shape support
test=develop
* modify trt engine op converter
test=develop
* fix ci error
test=develop
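A minimal sketch of configuring dynamic shapes in the inference API; the `image` input name and shapes are illustrative:

```python
# Minimal sketch: dynamic-shape TensorRT needs min/max/opt shape hints per
# input (AnalysisConfig inference API assumed).
from paddle.fluid.core import AnalysisConfig

config = AnalysisConfig('./model_dir')                # illustrative path
config.enable_use_gpu(100, 0)                         # 100 MB pool, GPU 0
config.enable_tensorrt_engine(
    precision_mode=AnalysisConfig.Precision.Float32)
config.set_trt_dynamic_shape_info(
    {'image': [1, 3, 112, 112]},                      # min shape
    {'image': [1, 3, 448, 448]},                      # max shape
    {'image': [1, 3, 224, 224]})                      # optimal shape
```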
* update ScopeBufferedSSAGraphExecutor, AsyncSSAGraphExecutor, ThreadedSSAGraphExecutor, FastThreadedSSAGraphExecutor, ParallelSSAGraphExecutor and ParallelExecutor for fetching unmerged results.
* add the unit test for fetch_unmerged.
* update ut for multi-card and multi-cpu.
* add the error message and the user suggestion in FetchOpHandle. test=develop
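A minimal sketch of the user-facing switch, assuming the `return_merged` parameter this series adds to Executor.run:

```python
# Minimal sketch: with return_merged=False, fetch results stay per-device
# instead of being concatenated along the batch dimension.
import numpy as np
import paddle.fluid as fluid

x = fluid.data(name='x', shape=[None, 4], dtype='float32')
loss = fluid.layers.reduce_mean(fluid.layers.fc(x, size=2))

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
compiled = fluid.CompiledProgram(
    fluid.default_main_program()).with_data_parallel(loss_name=loss.name)

unmerged = exe.run(compiled,
                   feed={'x': np.ones([8, 4], dtype='float32')},
                   fetch_list=[loss.name],
                   return_merged=False)   # per-device (unmerged) results
```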
* Refine adam op, test=develop
* Fuse kernels together to reduce cpu time.
* Refine paddle enforce, test=develop
* Remove some comments, test=develop
* Refine code,test=develop
* Refine cuda kernel, test=develop
* Refine code according to comments, test=develop
* Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator.
* Add a config for the DynamicAdjustChannelNum function to denote whether to discard the remaining instances when they cannot be distributed evenly.
* Remove CPU code in Pull/PushSparse and we will add it back when testing it fully.
* Fix some known issues: such as copying persistable vars after one epoch running.
* Add the first implementation of fusion_group op #19621 (#3)
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop
* Disable for mac and windows.
test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop
* Refine the CUDA kernel to support large dims.
test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation of fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test.
test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop
* Disable fusion_group op for mac and windows.
test=develop
* Make the compiling of device code return status instead of hanging up.
test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names.
test=develop
* Add the check of CUDA driver library in unittest.
test=develop
* Enable generating code for a given subgraph. #21126 (#4)
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearrangement of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions.
test=develop
* Improve the loading statements in generated codes.
test=develop
* Remove unused arguments from formal list.
test=develop
* Enable the detection of subgraph of grad ops.
* Generate code for detected subgraph in fusion_group_pass.
* Add an option in BuildStrategy to enable fusion_group_pass and add unittest.
test=develop
* Fix a bug when checking whether the shapes of all inputs are the same.
* Add debug information.
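A minimal sketch of turning the pass on from Python, assuming the BuildStrategy option mentioned above is exposed as enable_auto_fusion (effective only in GPU builds):

```python
# Minimal sketch: enable fusion_group_pass through BuildStrategy.
import paddle.fluid as fluid

build_strategy = fluid.BuildStrategy()
build_strategy.enable_auto_fusion = True   # turn on fusion_group_pass

compiled_prog = fluid.CompiledProgram(
    fluid.default_main_program()).with_data_parallel(
        loss_name='loss',                  # 'loss' is illustrative
        build_strategy=build_strategy)
```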
* Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#5)
test=develop
* Call subgraph_detector in fusion_group pass.
test=develop
* Disable fusion_group when WITH_GPU is OFF.
test=develop
* Refine all PADDLE_ENFORCE message.
test=develop
* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op.
test=develop
* Follow review comments.
test=develop
* Implement a common python unittest to test the ir passes.
test=develop
* Save the results in np.array and support starting up on CPU.
test=develop
* Fix the unittest.
test=develop
* Add check_program to check whether the optimized program is different from the origin one.
test=develop
* Remove the interface all_ops.
test=develop
* Add exception test in pass_test.
test=develop
* add bn and relu fuse pass
* add op attr assert and dtype assert
* fix some input/output bugs for the fused op and pattern.
* add the unittest for fuse_bn_act_pass. test=develop
* use normative enforce statements. test=develop
* add the cpu test. test=develop
* add the support of batch_size=1 for the bn with relu op. test=develop
* add the error type for paddle throws. test=develop
* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unused_vars_white_list. test=develop
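A minimal sketch of the fused pattern and its switch, assuming the pass is wired to the fuse_bn_act_ops knob on BuildStrategy:

```python
# Minimal sketch: batch_norm followed by relu is the pattern fused into
# fused_batch_norm_act when fuse_bn_act_ops is on.
import paddle.fluid as fluid

x = fluid.data(name='x', shape=[None, 16, 8, 8], dtype='float32')
conv = fluid.layers.conv2d(x, num_filters=16, filter_size=3)
out = fluid.layers.batch_norm(conv, act='relu')   # bn + relu pattern

build_strategy = fluid.BuildStrategy()
build_strategy.fuse_bn_act_ops = True             # enable the fuse pass
```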
* modify fc to linear in sample code, test=develop
* remove FC, test=develop
* remove warnings, test=develop
* drop fluid/imperative/README.md , test=develop
* change fc to linear, test=develop
* polish code style, test=develop
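A minimal sketch of the migration described here: fluid.dygraph.Linear replaces the removed FC in sample code.

```python
# Minimal sketch: Linear takes explicit input/output dims instead of FC's
# inferred input shape.
import numpy as np
import paddle.fluid as fluid

with fluid.dygraph.guard():
    linear = fluid.dygraph.Linear(input_dim=32, output_dim=10, act='relu')
    y = linear(fluid.dygraph.to_variable(
        np.random.rand(4, 32).astype('float32')))
```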
* add file check_op_desc.py and add interface to get default value. test=develop
* add test for c++ coverage rate. test=develop
* Correct typo. test=develop