Paddle

Commit Graph

Author	SHA1	Message	Date
guomingz	2281ebf0f3	Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130 ) * Relu6 is the bottleneck op for Mobilenet-v2. As the mkldnn supports the conv/relu6 fusion, we implement it fusion via cpass way. Due to the int8 enabling for this fusion will be supported in MKLDNN v0.20, so this PR is focused on the fp32 optimization. Benchmark (FPS) measured on skx-8180 (28 cores): batch size 1: 214.7 with fusion, 53.4 without fusion; batch size 50: 1219.727 with fusion, 137.280 without fusion. test=develop * Fix the format issue test=develop * Add the missing nolint comments. test=develop * Fix the typos. test=develop * Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine. test=develop * Adjust the indentation. test=develop * Add the test_conv_brelu_mkldnn_fuse_pass case. test=develop * Slightly update the code per Baidu comments. Let the parameter definition embedded into the code. That's will make the code easy to understand. test=develop	6 years ago
Yibing Liu	f9796b1249	Add LAMB Optimizer support (#17489 ) * Add LAMB optimizer * Expose LAMB Optimizer's APIs test=develop, test=document_preview * Cleanup code & doc test=develop, test=document_preview * Update lamb optimizer's formula test=develop	6 years ago
mozga-intel	99ab57123c	Enabled ngraph elementwise max operator (#17517 )	6 years ago
Tao Luo	3d19f44a89	remove unused SERIAL compiler option (#17500 ) test=develop	6 years ago
mozga-intel	1eb151752e	Enable abs operator for a ngraph test=develop (#17436 )	6 years ago
Zhaolong Xing	ff7f911b4d	add quant_dequant_moving_avg_max_abs op (#17480 ) * add quant_dequant_moving_avg_max_abs op test=develop * add more note for quantdequant op test=develop	6 years ago
lvmengsi	10b23a72c1	Double backward elementwise div (#17416 ) * double backward, elementwise_div * fix dx empty. test=develop * bug fix (#17392) fix secure bug * Eanble stack operator for a Ngraph, test=develop (#17406) * fix sqrt_grad_grad unittest. test=develop (#17410) * fix sqrt_grad_grad unittest. test=develop * disable sqrt_grad_grad unittest. test=develop * test=develop, fix unittest * test=develop, fix unittest * test=develop, fix unittest * test=develop, fix bug * fix unittest. test=develop * fix unittest dx. test=develop * tmp fix! for test... test=develop * reduce tmp, test=develop * test=develop, reduce tmp * fix broadcast unittest. test=develop * fix format. test=develop * refine code. test=develop * refine code. test=develop * refine GetDoubleGradSafeTensor. test=develop * fix format. test=develop	6 years ago
Kaipeng Deng	14f223624f	fix sqrt unittest. test=develop (#17440 )	6 years ago
lvmengsi	977e9fcb27	support elementwise_sub double backward (#17476 ) add elementwise_sub_grad_grad op for backward of backward calculation	6 years ago
Yan Xu	0217555530	polish parallel dygraph code (#17164 ) * add var grad hook test=develop	6 years ago
chengduo	e336dc86bb	[Speed] Refine the Executor when the num_thread=1 (#17405 ) Refine the Executor when the num_thread=1	6 years ago
Kaipeng Deng	58d5c61a29	fix sqrt_grad_grad unittest. test=develop (#17410 ) * fix sqrt_grad_grad unittest. test=develop * disable sqrt_grad_grad unittest. test=develop	6 years ago
mozga-intel	6ee6700fac	Eanble stack operator for a Ngraph, test=develop (#17406 )	6 years ago
baojun	1ce7b45b9e	NGraph Added fill_zeros_like op test=develop (#17295 )	6 years ago
baojun	910196524d	NGraph Added dropout and dropout_grad to ngraph test=develop (#17320 )	6 years ago
mozga-intel	b189480734	Ngraph Enable gather operator test=develop (#17296 )	6 years ago
lvmengsi	4ef631013c	Double backward sqrt (#17387 ) * double backward sqrt * refine unittest. test=develop * refine test. test=develop * remove alpha in unittest. test=develop	6 years ago
lvmengsi	5d1ac41b00	Double backward reduce mean (#17372 ) * test=develop, double backward reduce_mean * add comment. test=develop * fix format. test=develop * rename GradGrad -> DoubleGrad. test=develop * fix op_use_default_grad_op_maker.spec. test=develop	6 years ago
Kaipeng Deng	bd9bef5a4e	add elementwise_add_grad_grad op (#17366 ) * add elementwise_add_grad_grad op. test=develop * use defined GradMaker. test=develop	6 years ago
jerrywgz	1c6d064627	add collect fpn proposals op,test=develop (#16074 ) * add collect fpn proposals op,test=develop	6 years ago
Kaipeng Deng	60be66e2c0	support fc_op double grad (#17317 ) * add double grad for mul_op. test=develop * fix format. test=develop * fix format. test=develop * fix format. test=develop * refine code. test=develop * remove setzero. test=develop * fix dx/dy init bug. test=develop * fix format. test=develop	6 years ago
Jiabin Yang	4624d7c642	test=develop, add gradient sort backward strategy (#17125 ) * test=develop, add gradient sort backward strategy * test=develop, fix test by add FLAGS_cudnn_deterministic on new tests	6 years ago
Jiabin Yang	c843e64cf5	Revert "rename the default version from '0.0.0' to 'latest' (#17304 )" (#17356 ) This reverts commit `f456c8beb8`.	6 years ago
Kaipeng Deng	8bae8590ac	add double grad for elementwise_mul op (#17255 ) * add double grad for elementwise_mul. test=develop * remove comment. test=develop * fix grad sum. test=develop * fix for axis expand. test=develop * add test for axis expand. test=develop	6 years ago
Kaipeng Deng	11d3a38f25	add double grad for square op (#17173 ) * add double grad for square. test=develop * formax code. test=develop * fix for grad sum. test=develop * refine shape. test=develop * refine extract. test=develop	6 years ago
chengduo	bc833945a4	Add DropLocalExeScopes in ParallelExecutor (#17297 ) * reset drop local scope counter test=develop	6 years ago
zhoukunsheng	d4b67e1692	Add Where Op(#16793 )	6 years ago
zhoukunsheng	1bfff02047	Add Diag Op(#17027 )	6 years ago
qingqing01	e32c9888f5	Double backward of conv2d. (#17211 ) * Add conv2d_grad_grad_op * Extracte the cuDNN conv algo searching code in conv_cudnn_helper.h. - Now use it in conv2d_grad_grad. - Will simply the searching code in conv2d and conv2d_grad in next PR. * Enhance and fix bug in unit testing of gradient_checker. * Support to fetch empty variables，return None in Python.	6 years ago
wopeizl	f456c8beb8	rename the default version from '0.0.0' to 'latest' (#17304 ) * rename the default version from '0.0.0' to 'latest'	6 years ago
baojun	7bd1d03ee5	Adding lrn op for ngraph engine (#17189 ) * added lrn op test=develop * Added CreateConstant method test=develop * avoid duplicates test=develop	6 years ago
Zeng Jinle	4f8594088d	Enhance inplace/mem-opt pass and enhance softmax_with_cross_entropy op inplace (#17225 ) * add use_cuda to inplace pass,test=develop * add test softmax_with_xe_inplace test,test=develop * fix potential inplace bug test=develop * add more skip vars in mem opt pass,test=develop * follow comment,test=develop * follow comments,move duplicate out arg check to program->graph,test=develop	6 years ago
baojun	e782b54b9c	update sofmax with axis arg test=develop (#17190 )	6 years ago
Tao Luo	ff1661f12a	remove unused FLAGS_warpctc_dir (#17162 ) * remove unused FLAGS_warpctc_dir test=develop * remove FLAGS_warpctc_dir test=develop	6 years ago
Kaipeng Deng	a71d8fdb87	Softmax_cross_entropy op add axis (#16806 ) * add attr axis infershape. test=develop * add CUDA kernel. test=develop * fix unittest. test=develop * fix unittest for soft_label. test=develop * fix fp16 unittest. test=develop * remove comment code. test=develop * refine test for axis. test=develop * add python api. test=develop * fix doc. test=develop * fix fp16 unittest. test=develop * fix ngraph test. test=develop * fix ENFORCE for test_imperative_transformer. test=develop * fit for ngraph test. test=develop * fix after rebase develop. test=develop * fix doc. test=develop * fix API.spec. test=develop * fix test_layers. test=develop * fix format. test=develop	6 years ago
Zhen Wang	a914d9b116	Quant output scale (#17215 ) * Add MovingAverageAbsMaxScale operator which is only used for calculating the quantization scale. * test=develop * change the output into inplace. test=develop * Revert "test=develop" This reverts commit 696cf62699ba1e1c98f61f7345ac7060010eb29a. * Revert "change the output into inplace. test=develop" This reverts commit a19acd20f07eee82622701a3015e6e9c073a5e0b. * test=develop. * update the MovingAverageAbsMaxScaleOp test. test=develop	6 years ago
jerrywgz	cc95a7516c	fix distribute fpn proposals, test=develop (#16152 ) * fix distribute fpn proposals, test=develop	6 years ago
Zeng Jinle	ee2028a110	Add use_cuda to inplace pass (#17205 ) * add use_cuda to inplace pass,test=develop * add test softmax_with_xe_inplace test,test=develop	6 years ago
jerrywgz	a72907bbf4	Enhance concat op to support empty input. (#17015 ) * enhance_concat, test=develop	6 years ago
wopeizl	83c4f7721f	use two GPUs to run the exclusive test test=develop (#17187 )	6 years ago
tianshuo78520a	8092c40560	Modify test timeout (#17181 ) * test=develop * test=deelop	6 years ago
guru4elephant	f938ccec62	remove async executor python api to fix document (#17174 ) * remove async executor python api test=develop * remove test_async_executor.py add executor train_from_dataset demo test=develop * fix import bug test=develop	6 years ago
Zeng Jinle	5dfe2ab9e8	Fix mem leak when converting Tensor to numpy array (#17182 ) * fix mem leak when converting Tensor to numpy array test=develop * remove unused unittest,test=develop * follow comments, test=develop * fix dygraph bug,test=develop	6 years ago
Zeng Jinle	4e1bc6e805	Rewrite inplace pass and fix gc bug (#17126 ) * fix op graph view test=develop * rewrite inplace pass and fix reference count pass bug test=develop * fix unittest failed test=develop * follow comments, test=develop	6 years ago
xiaoting	bc48453b73	polish the label_smooth (#17138 ) * polish the label_smooth test=develop * polish code test=develop	6 years ago
tangwei12	deb510d451	cvm op feature (#17081 ) cvm without LoD.	6 years ago
Jiancheng Li	554d3a71d2	test=develop fix bug: fix selected_indices in nms (#17140 )	6 years ago
Zeng Jinle	28d69d710a	Refine dropout gpu memory (#17095 ) * refine_dropout_mem,test=develop * # This is a combination of 14 commits. # The first commit's message is: remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066) # This is the 2nd commit message: Fleet unify distributed training (#16791) * implement distributed transpiler with fleet # This is the 3rd commit message: ParallelDyGraph with GPU collective mode (#16827) implement dygraph.parallel.DataParallel to hook reduce op. # This is the 4th commit message: Init mixed precision training interface (#16856) * Init mixed precision training interface * Add fp16 test script test=develop * All initializers support float16 test=develop * Code cleanup & add more code annotations test=develop * Update API spec test=develop * Add usage example in doc test=develop # This is the 5th commit message: fix reference_count_pass,test=develop (#17060) test=develop # This is the 6th commit message: Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090) * Cache the information of linear interpolation in forward and use it in backward. test=develop * Fix cuda kernel. test=develop # This is the 7th commit message: remove unnecessary prepare_data (#17080) test=develop # This is the 8th commit message: fix interpolate cu. test=develop (#17101) # This is the 9th commit message: test=develop, double backward leaky_relu (#17067) backward of backward: leaky_relu # This is the 10th commit message: fix fuse optimizer ops (#17102) test=develop # This is the 11th commit message: truncated_gaussian_random supported in distributed training, test=develop (#17091) # This is the 12th commit message: Detailed coordinate description for yolov3 loss (#17007) * Detailed coordinate description for yolov3 loss test=develop * modified api.spec test=develop * modified loss name * fix api.spec test=develop * polish description test=develop * modified api.spec test=develop # This is the 13th commit message: fix test_weight_decay (#17109) test=develop # This is the 14th commit message: Path flag (#17105) * fix python/paddle/fluid/__init__.py detecting problems	6 years ago
chengduo	9ccce576d6	fix test_weight_decay (#17109 ) test=develop	6 years ago
ceci3	258e000be6	test=develop, double backward leaky_relu (#17067 ) backward of backward: leaky_relu	6 years ago
Kaipeng Deng	10c487eb21	fix interpolate cu. test=develop (#17101 )	6 years ago
whs	55ce36e981	Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090 ) * Cache the information of linear interpolation in forward and use it in backward. test=develop * Fix cuda kernel. test=develop	6 years ago
Yibing Liu	beda78258f	Init mixed precision training interface (#16856 ) * Init mixed precision training interface * Add fp16 test script test=develop * All initializers support float16 test=develop * Code cleanup & add more code annotations test=develop * Update API spec test=develop * Add usage example in doc test=develop	6 years ago
Yan Xu	0b07eef118	ParallelDyGraph with GPU collective mode (#16827 ) implement dygraph.parallel.DataParallel to hook reduce op.	6 years ago
tangwei12	1a4a51db2b	Fleet unify distributed training (#16791 ) * implement distributed transpiler with fleet	6 years ago
tangwei12	e707119a89	remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066 )	6 years ago
guomingz	2deac4e447	Fix the bug of test_conv2d_int8_mkldnn case which raised by improper parameter passing (#17058 ) * resolve #17057 Fixed the bug that fuse_relu/fuse_residual option couldn't be passed to class TestConv2dInt8Op. test=develop * Fix the bug of test_conv2d_int8_mkldnn case which raised by improper parameter passing. test=develop	6 years ago
chengduo	a2be4b4d91	Add fuse momenutum ops (#16745 ) * Add fuse momenutum ops	6 years ago
chengduo	e296e0fead	fix test_parallel_executor_seresnet random fail (#17030 ) test=develop	6 years ago
Tao Luo	b3a11943c1	Merge pull request #17031 from luotao1/reduce_test_time reduce unittest time by rename testcuda to has_cuda	6 years ago
qingqing01	c1c2633a63	Support backward of backward for Relu and add a new gradient checker by comparing theoretical and numerical Jacobian. (#16862 ) * Support backward of backward and a new gradient checker * Rename decorators.py to decorator_helper.py, since Python on Windows CI has decorators package. 1. Add ReluDoubleGradMaker when register relu_grad. 2. Add a new gradient checker by comparing theoretical and numerical Jacobian. Check double gradients by double_grad_check.	6 years ago
Zeng Jinle	f188b3708e	Move gc test to each test of op (#16999 ) * move gc test to op_test test=develop * Revert "move gc test to op_test" This reverts commit cf15da65c38f57c91f53b3d8b3c2365d4aa86016. * enable gc test in some ops test=develop	6 years ago
chengduo	7c370e42f9	Fix test_recurrent_op (#17001 ) * fix ramdom fail test=develop	6 years ago
Tao Luo	9466e956a7	reduce unittest time by rename testcuda to has_cuda test=develop	6 years ago
wopeizl	d9991dccdd	add parallel build script to ci … (#16901 ) * add parallel build script to ci test=develop * 1. classify the test case as single card/two cards/multiple cards type 2. run test case according to the run type	6 years ago
qingqing01	ea42e431f8	Speed unit testing. (#16978 ) * Speed affine_channel_op unit testing * Add check in tensor_py * Fix ONLY_CPU Compiling	6 years ago
guomingz	ae7a2cb8e3	resolve #16988 (#16995 ) Update the filter generation mechanism that it could generate the negative parameter. The original calling(np.random.random()) couldn't simulate the conv/relu fusion case. test=develop	6 years ago
liuwei1031	765c70a1b0	Unittest improve, test=develop (#16941 ) * accelerate test_ir_memory_optimize_nlp, test=develop * accelerate test_ir_memory_optimize_nlp, test=develop	6 years ago
guomingz	23df084b32	resolve #16987 (#16994 ) Rename the testcuda function to has_cuda, it will elimate the unnecessary testing. test=develop	6 years ago
Zeng Jinle	1202d3fc74	Refine model gpu memory (#16993 ) * speedup gc and inplace softmax_with_cross_entropy_grad test=develop * refine models gpu mem Merge skip vars and warning messages of mem opt remove relu mem opt test=develop * follow comments test=develop	6 years ago
Zeng Jinle	af8a041bb6	reduce py_reader unittest time (#16996 ) test=develop	6 years ago
Yibing Liu	3c375751f8	Support seq len equal to 0 in sequence ops (#16935 ) * Support seq len equal to 0 in sequence ops test=develop * Add more test cases * Fix some comments test=develop * Fix py3 error test=develop	6 years ago
lujun	a3f17280a3	fix dy-load bug, test=develop	6 years ago
gongweibao	cbdb8a17b1	Polish DGC code (#16818 )	6 years ago
lujun	dbf66dd034	Merge pull request #16954 from junjun315/fix-dygraph-checkpoint Fix dygraph checkpoint bug	6 years ago
Tao Luo	aed702cea3	Merge pull request #16920 from qingqing01/test_profile Fix test_profiler when the machine has many cores.	6 years ago
Tao Luo	b596eed73a	Merge pull request #16824 from LeoZhao-Intel/mkldnn_mul disable test_elementwise_mul_mkldnn_op case	6 years ago
lujun	3beed54cdd	Merge pull request #16917 from velconia/dygraph_untrack_op imperative fix tracer train mode	6 years ago
lujun	a7c11979ba	fix dygraph save/load checkpoint error, test=develop	6 years ago
tangwei12	2b61db07d1	fix sampling id op bug (#16909 ) * fix sampling id op bug, test=develop	6 years ago
gongweibao	b7f20ed6af	Fix unittest dataset error (#16925 )	6 years ago
Hongyu Liu	d5a7c09856	Merge pull request #16798 from phlrain/softmax_cross_support_high_rank softmax cross entropy support high rank	6 years ago
Dang Qingqing	b73a71d11e	Fix test_profiler when the machine has many cores test=develop	6 years ago
Kaipeng Deng	5d45eb06f9	Merge pull request #16858 from heavengate/fix_yolo_param Fix yolo param	6 years ago
minqiyang	97aa1838bc	Fix dygraph train mode test=develop	6 years ago
Qiyang Min	102fc8596e	Merge pull request #16777 from velconia/dygraph_untrack_op Imperative tracer does not hold op any more	6 years ago
Leo Zhao	1edcd73115	remove unnecessary new line test = develop resolve #16764	6 years ago
Leo Zhao	61cc842a53	disable test_elementwise_mul_mkldnn_op case	6 years ago
Hongyu Liu	0701c2db47	Merge pull request #16518 from zhoukunsheng/rsqrt Rsqrt	6 years ago
phlrain	766c868199	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into softmax_cross_support_high_rank	6 years ago
Tao Luo	485bc6a055	Merge pull request #16868 from chengduoZH/speedup_test_parallel_executor_transformer Reduce the layer number of transfromer model	6 years ago
Tao Luo	d4b5510c00	Merge pull request #16860 from junjun315/fix-utest-vgg Fix bug: long vgg-utest testing time	6 years ago
Hongyu Liu	2de7f3cfc3	Merge pull request #16799 from phlrain/sigmoid_corss_entropy_support_high_rank supprt high rank	6 years ago
chengduozh	3349094fe2	reduce the layer number of transfromer test=develop	6 years ago
minqiyang	73cbdc2998	Add train mode test=develop	6 years ago
colourful-tree	434caab21b	Merge pull request #16741 from colourful-tree/dev add continuous value model op	6 years ago
lujun	4aea89faa2	fix vgg-test. test=develop	6 years ago
dengkaipeng	7b1702d9a1	fix unittest and API.spec. test=develop	6 years ago
Yibing Liu	4267a81afc	Correct the lod level of compiled time in lod_reset (#16790 ) test=develop	6 years ago
chengduo	c62674f475	Refine StaticRnn (#16707 ) * enable recurrent op test=develop	6 years ago
chengduo	6220b8b9d1	[Speed] Make test_dyn_rnn faster (#16761 ) * make test_dyn_rnn faster	6 years ago
Leo Zhao	a9694bd3d6	convert output to nchw format to align with native version in avx512 mode test = develop resolve #16764	6 years ago
heqiaozhi	759940786e	Merge remote-tracking branch 'upstream/develop' into dev test=develop	6 years ago
phlrain	8063f5867f	remove sigmoid change; test=develop	6 years ago
phlrain	468f8ccff9	supprt high rank; test=develop	6 years ago
phlrain	97d4622bdb	add softmax test unit test=develop	6 years ago
phlrain	bbfc82cc42	softmax corss entropy support high rank test=develop	6 years ago
zhoukunsheng	2b2b4ca21e	Merge branch 'develop' into rsqrt	6 years ago
heqiaozhi	afa64a5cfa	add cvm unittest test=develop	6 years ago
Hongyu Liu	e2897ba13a	Merge pull request #16432 from zhoukunsheng/linspace add linspace op	6 years ago
Hongyu Liu	afe0d64c9d	Merge pull request #16320 from zhoukunsheng/all_any add reduce_all, reduce_any op	6 years ago
Tao Luo	f96446cade	Merge pull request #16738 from luotao1/high_level_api_test reduce CI time of high_level_api tests	6 years ago
chengduo	610c6442e3	Make test_parallel_executor_seresnet.py Faster (#16701 ) * slimming test_parallel_executor_seresnet.py	6 years ago
Tao Luo	544f91deba	add WITH_HIGH_LEVEL_API option, default OFF test=develop	6 years ago
Tao Luo	7d0ed2a423	remove non-existent test_image_classification_resnet test=develop	6 years ago
Zeng Jinle	674aed6a6c	Fix unittests which takes too long time (#16713 ) * fix too long unittest recommit test=develop * add fake_reader.py test=develop	6 years ago
Jiabin Yang	7060c8d89f	test=develop, refine transformer (#16734 )	6 years ago
lujun	9bd44b94da	Merge pull request #16561 from junjun315/move-api-to-root Move dygraph api to root	6 years ago
Kaipeng Deng	ed97156461	Merge pull request #16439 from heavengate/resize_scale add attr scale. test=develop	6 years ago
Tao Luo	38f01b678a	rename high_level_api tests test=develop	6 years ago
Huihuang Zheng	2146293d26	Fix op registry (#16677 ) list of fixed ops: lookup_table_op space_to_depth_op squared_l2_distance_op squared_l2_norm_op teacher_student_sigmoid_loss_op tree_conv_op warpctc_op test=develop	6 years ago
lujun	14db0680c0	merge conflict, test=develop	6 years ago
lujun	92c8ac8a74	merge conflict, test=develop	6 years ago
Jiabin Yang	a06f4b2b2c	make less batch of tests to fit ci (#16706 ) * make less batch of tests to fit ci * test=develop, invoke ci add some comments back	6 years ago
chengduo	55b15db5af	Add unit test for fuse all_reduce ops (#16699 ) * test fuse all_reduce	6 years ago
Yan Xu	c6720990c0	fix seresnext unit test (#16689 ) comment np.array(x.get_tensor()) in imperaitve mode to avoid OOM.	6 years ago
guru4elephant	7d653f0aed	Merge pull request #16652 from xjqbest/dataset_merge_develop fix dataset bug	6 years ago
chengduo	ea8655dbd2	Add unit test for fuse_opt_ops (#16550 ) * add unit test for fuse_opt_ops test=develop	6 years ago
minqiyang	7c4b9b577a	Polish code test=develop	6 years ago
minqiyang	aa07814df3	Add 3 uts test=develop	6 years ago
minqiyang	2e0b871320	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_dqn	6 years ago
xjqbest	274477005e	fix dataset testcase error test=develop	6 years ago
Zeng Jinle	bb143052cb	fix gc bug in conditional block (#16673 ) test=develop	6 years ago
xjqbest	5e5139283b	fix runtime error test=develop	6 years ago
ruri	229dc93277	Add Pixel shuffle OP (#15782 ) * add pixel_shuffle op * add pixel_shuffle op, test=develop * rewrite code, test=develop * delete useless comment, test=develop * Refine pixel_shuffle_op and unit testing * refine code,test=develop * refine .cu,test=develop * fix unittest,test=develop * Fix unit testing test=develop * resolve conflict, test=develop * fix test, test=develop * fix API, test=develop * fix test datatype bug,test=develop * polish comments,test=develop * add API,test=develop * test=develop * Add Pixel_Shuffle OP,test=develop * support python3,test=develop * add include memory to travis CI bug,test=develop	6 years ago
lujun	38382f8e27	Merge pull request #16658 from JiabinYang/fix/transformer_random_failed test=develop, fix transformer in dygraph	6 years ago
lujun	99120698b7	merge confict, test=develop	6 years ago
lujun	01f4f2d7e4	merge confict, test=develop	6 years ago
minqiyang	b29249404b	Polish code test=develop	6 years ago
minqiyang	61fe139f34	Polish code	6 years ago
lujun	e11bf2a49e	merge branch, test=develop	6 years ago
Qiyang Min	cf307d0dfb	Imperative fix ptb rnn place bug (#16632 ) * Fix bug of gradient interface * shrink transformer * Right transformer * Change from width-first backward to deep-first backward process test=develop * Reverse iterator op's input test=develop * Polish code * Change the iteration direction in ingrads' map slots test=develop * Polish code test=develop * Add GPU place in static mode of ptb rnn test=develop * Polish code test=develop	6 years ago
JiabinYang	4c18b98fcd	test=develop, fix transformer in dygraph /	6 years ago
lujun	a32c6ffa96	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into move-api-to-root	6 years ago
lujun	78ff5d72d3	Merge pull request #16520 from PaddlePaddle/move-code add some dygraph op 1.Conv3d 2.Conv3dTranspose 3.RowConv 4.GroupNorm 5.SpectralNorm 6.TreeConv and utest	6 years ago
Xin Pan	c77fb9fed9	Merge pull request #16636 from panyx0718/imperative add a simple test	6 years ago
xjqbest	271b7147cc	fix dataset bug test=develop	6 years ago
Zeng Jinle	1c526e1d1a	Fix some grad op desc makers (#16633 ) * fix some grad op desc maker test=develop * fix grad op desc makers test=develop	6 years ago
minqiyang	e377d75977	Add UT for most layers without params test=develop	6 years ago
Xin Pan	d02e3e7f79	add a simple test test=develop	6 years ago
minqiyang	2839e22739	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_dqn test=develop	6 years ago
minqiyang	cd71d645c5	Polish code test=develop	6 years ago
minqiyang	2e09e95bfc	Add GPU place in static mode of ptb rnn test=develop	6 years ago
chengduo	1342e2ea04	Fix the bug of the fast threaded executor (#16514 ) * Fix the bug of the fast threaded executor. I	6 years ago
Zeng Jinle	d658244997	fix some grad op desc maker (#16581 ) test=develop	6 years ago
lujun	ad7c1a934f	merge conficts, test=develop	6 years ago
lujun	60e3e35575	merge branch, test=develop	6 years ago
Qiyang Min	12e36d38a5	Imperative deep-first backward process (#16605 ) * Fix bug of gradient interface * shrink transformer * Right transformer * Change from width-first backward to deep-first backward process test=develop * Reverse iterator op's input test=develop * Polish code * Change the iteration direction in ingrads' map slots test=develop * Polish code test=develop	6 years ago
Jiabin Yang	353244f4fc	test=develop, add FC and test (#16604 ) * test=develop, add FC and test * test=develop, refine code	6 years ago
lujun	e97ded835a	merge branch, test=develop	6 years ago
乔龙飞 Qiao Longfei	21622ca30b	Merge pull request #16172 from jacquesqiao/add-async-ssa-graph-executor-communicator Add async ssa graph executor communicator	6 years ago
minqiyang	51ca50897a	Change the iteration direction in ingrads' map slots test=develop	6 years ago
minqiyang	d16cb8ca11	Polish code	6 years ago
minqiyang	cce766d710	Reverse iterator op's input test=develop	6 years ago
minqiyang	1a55f7d38c	Change from width-first backward to deep-first backward process test=develop	6 years ago
Jiabin Yang	22e5bcd275	test=develop, ptb_rnn fix op (#16573 )	6 years ago
lujun	d3fc3d5520	move internal function, test=develop	6 years ago
minqiyang	a0478084f8	Right transformer	6 years ago
minqiyang	124f45c9f7	shrink transformer	6 years ago
lujun	717256755a	move dygraph.nn,dygraph.layer to fluid, test=develop	6 years ago
Yan Xu	2e1e76e70e	add SeResNeXt unittest (#16503 ) * add seresnet unittest test=develop * add dropout layer test=develop * fix ci test=develop * fix comment test=develop * fix comment test=develop * fix ci test=develop * fix ci test=develop * fix ci * fix module name test=develop * run imperative serenext unit test serially test=develop	6 years ago
zhoukunsheng	5edf4fb4fb	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into all_any	6 years ago
chengduo	feb1b54f9d	fix min and max bug (#16570 ) test=develop	6 years ago
Qiao Longfei	adf272bcec	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator test=develop	6 years ago
guru4elephant	76b49f02ee	Merge pull request #16539 from guru4elephant/train_with_pipe_reader_merge_develop Train with pipe reader merge develop	6 years ago
Qiao Longfei	baf02328b2	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator test=develop	6 years ago
Qiyang Min	d8d73ff3db	Merge pull request #15584 from velconia/imperative_lr_scheduler Support imperative learning rate scheduler	6 years ago
lujun	cf6238fbd9	fix merge for move dir, fix utest error, test=develop	6 years ago
qingqing01	1ebd7434d5	Add linear learning warmup method in learning rate scheduler. (#16563 ) * Add linear learning warmup method This warmup lr can be combinated with other learning rate strategies. For example: decayed_lr = fluid.layers.linear_lr_warmup( fluid.layers.piecewise_decay(boundaries, lr_steps), warmup_steps, start_lr, end_lr)	6 years ago
Wu Yi	22b02bfa62	Batch norm cudnn accurate (#16545 ) * fix cudnn batch norm accuracy test=develop * fix cudnn batch norm accuracy test=develop * disable failed test for later fix test=develop	6 years ago
xjqbest	a99c8d0c29	fix client to client communication bug test=develop	6 years ago
Kaipeng Deng	3d939d32ee	Merge pull request #16023 from heavengate/kl_div_loss KL div loss: add kldiv_loss op	6 years ago
Kaipeng Deng	54474637ae	Merge pull request #16057 from heavengate/softmax_axis Add attr 'axis' for softmax	6 years ago
Kaipeng Deng	63ac947e2f	Merge pull request #16135 from heavengate/shift Add temporal_shift op for TSM model	6 years ago
lujun	04c0b12c6e	fix test layers error, test=develop	6 years ago
lujun	cf642d478d	fix merge for move dir, fix utest error, test=develop	6 years ago
Qiao Longfei	61912e879d	test_dist_base set runtime_split_send_recv to false test=develop	6 years ago
lujun	1dcd28e819	move dygraph.nn,dygraph.layer to fluid, test=develop	6 years ago
wopeizl	e014950e87	add slice support for dim < 0 (#16494 ) * add slice support for dim < 0 test=develop	6 years ago
zhoukunsheng	5284213942	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rsqrt	6 years ago
minqiyang	fb7c787d34	Fix conflicts test=develop	6 years ago
minqiyang	3e57981294	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_lr_scheduler test=develop	6 years ago
lujun	d4d5052fa6	fix merge for move dir, fix utest error, test=develop	6 years ago
zhoukunsheng	3c4f5f0368	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into linspace	6 years ago
lujun	32146857a4	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into move-code	6 years ago
dongdaxiang	720647e17f	rebase current develop and fix conflict test=develop	6 years ago
dongdaxiang	93c3c7f9b3	fix dataset testcase problem test=develop	6 years ago
dongdaxiang	d739bab844	fix async_executor problem and remove some unnecessary testcase, fix trainer_desc import problem test=develop	6 years ago
xjqbest	1497ce388d	fix code style of test_dataset.py test=develop	6 years ago
xjqbest	7cdd57a474	fix code style of test_dataset.py test=develop	6 years ago
xjqbest	748d54cb46	fix code style of test_dataset.py test=develop	6 years ago
xjqbest	1073b4d8f9	fix code style of test_dataset.py test=develop	6 years ago
dongdaxiang	45eb6f0765	run pre-commit check files and fix code style problem test=develop	6 years ago
xjqbest	e57ac5ed17	fix code style test=develop	6 years ago
xjqbest	97c74e60c3	fix code style test=develop	6 years ago
xjqbest	a38b98cb32	fix code style & runtime error test=develop	6 years ago
xjqbest	d52586a97d	add doc string test=develop	6 years ago
xjqbest	e95cafd9a7	fix code style & add dataset testcase test=develop	6 years ago
Qiao Longfei	d8974e6da0	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator test=develop	6 years ago
lujun	de605cc0fc	Merge pull request #16523 from junjun315/tensor_api move imperative to dygraph	6 years ago
chengduo	1096746cbf	Fuse Adam And SGD ops (#15933 ) * fuse optimizer	6 years ago
lujun	1c9aaeebe0	move imperative to dygraph, test=develop	6 years ago
lujun	d980ba19bc	add some dygraph op, test=develop	6 years ago
zhoukunsheng	2f9e562100	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into linspace	6 years ago
zhoukunsheng	082822d417	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rsqrt	6 years ago
zhoukunsheng	c47f3cc7fe	test=develop add rsqrt op	6 years ago
minqiyang	42507d33c6	Change atol to default value	6 years ago
dengkaipeng	193185b840	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into shift	6 years ago
dengkaipeng	8a0023892a	fix unittest. test=develop	6 years ago
whs	59f75ec76e	Make unit test of fsp op faster and more stable. (#16502) * Make unit test of fsp op faster and more stable. test=develop * Skip unit test of fsp op. test=develop	6 years ago
minqiyang	35c89f38c3	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_lr_scheduler test=develop	6 years ago
gongweibao	eb83abeac3	Add DGC (Deep Gradient Compression) interface. (#15841)	6 years ago
zhoukunsheng	874b5d8362	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into linspace	6 years ago
zhoukunsheng	83c7bca13f	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into all_any	6 years ago
Qiao Longfei	b68f84090b	fix test_split_selected_rows_op test=develop	6 years ago
Zeng Jinle	c7c6eeb44e	Merge pull request #16409 from sneaxiy/feature/advance_gc Enhance gc to support deleting tensor buffer in advance	6 years ago
Jiabin Yang	54a73578a8	Feature/install check (#16044) * test=develop, add install check * test=develop, add install check scripts * test=develop, refine language * test=develop, add api spec * test=develop, change cdn to bj to pass ci	6 years ago
minqiyang	99128a5c72	Implement Cosine and Noam Decay test=develop	6 years ago
wopeizl	c300b1ba69	Tensor index (#16223) * extend the slice function for python test=develop	6 years ago
minqiyang	ec9c0874bc	Implement Exponential NatureExp Inversetime and Polynomial Decay	6 years ago
Jiabin Yang	0d9d25d40f	Feature/refactor layers to Layers (#16337) * test=develop, add some Layers and tests * test=develop, add more layers * test=develop, add more layers * test=develop, add force cpu option * Update test_layers.py remove pdb * test=develop, refine code	6 years ago
gongweibao	850b737112	Fix nparray.all() bug. (#16472)	6 years ago
Xin Pan	f8c279b11c	Merge pull request #16454 from panyx0718/imperative2 polish deepCF model to support real dataset	6 years ago
Qiao Longfei	30618409db	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator	6 years ago
sneaxiy	78fb3a62e0	fix env variable setting bug test=develop	6 years ago
minqiyang	4278be8c49	Merge branch 'imperative_lr_scheduler' of https://github.com/velconia/Paddle into imperative_lr_scheduler test=develop	6 years ago
minqiyang	b5bbb13ac1	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_lr_scheduler	6 years ago
dengkaipeng	7920e3be02	revert test_softmax_cudnn. test=develop	6 years ago
Jiabin Yang	7c5319ba12	Fix/test imperative ptb rnn (#16433) * test=develop, fix ptb rnn * test=develop, change cdn to bj to pass ci * test=develop, fix ci	6 years ago
Jiabin Yang	f735102eab	add layer norm to Layers, add transformer test in imperative mode (#16092) * add layer norm to Layers, add transformer prepare encoding * little change * finish encoder part * add decoder part * finish model part * add test case and part of data feed * add transformer test * add to_parameter, add remove in set_attr * test=develop, fix pos encoding bug, create_parameter with standard name * test=develop, rm dropout test in imperative * test=develop, fix cpu error * test=develop, fix minimize bug * test=develop, fix one hot not stop gradient * test=develop, fix one hot not stop gradient * test=develop, refine parameter name * test=develop, fix transformer test in imperative mode * test=develop, fix transformer test in imperative mode * test=develop, fix boost and mkl download error * test=develop, fix boost and mkl download error * test=develop, fix ci and refine code * test=develop, fix ci and refine code	6 years ago
Xin Pan	fd24ab47ab	polish test=develop	6 years ago
Xin Pan	1f89249a95	update DeepCF model test=develop	6 years ago
sneaxiy	a7d0ac50b8	Merge develop	6 years ago
sneaxiy	7000ec85d9	fix some op grad maker fix ctest eager deletion disable bug test=develop	6 years ago
dengkaipeng	cfef382a85	fix format. test=develop	6 years ago
Zeng Jinle	4cc9809cae	Merge pull request #15799 from sneaxiy/feature/decoupled_reader Try to decouple reader with program_desc	6 years ago
whs	e9bec9369b	[slim] Add quantization strategy and distillation strategy. (#16408) * Add fsp operator. 1. Add unit test. 2. Add python API. 3. Add layer test. * Add quantization strategy. 1. Add API. 2. Add unit test. * Add distillation strategy. * Add unit test config file for quantization * Fix Copyright test=develop * Fix setup.py * Fix document of layers.py. test=develop * Fix unit test in python3. test=develop * Fix documents. test=develop * 1. refine fsp op by batched gemm 2. remove unused import test=develop * Fix test_dist_se_resnext. 1. disable test distillation. 2. reset framework.py test=develop * Enable unit test of distillation after fixing Block._clone_variable test=develop * Fix cdn issue. test=develop	6 years ago
liuwei1031	de3b70a101	fix cdn issue, test=develop (#16423) * fix cdn issue, test=develop * fix cdn issue, test=develop	6 years ago
zhoukunsheng	d3d31a5894	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into all_any	6 years ago
zhoukunsheng	664c342ca0	test=develop split reduce_all_any_op.h into two files add unit test for reduce_all, reduce_any	6 years ago
zhoukunsheng	43060084a4	test=develop add linspace, modify interface comments in tensor.py, merge with develop branch	6 years ago
sneaxiy	f8ed2c229e	try to fix ci error test=develop	6 years ago
zhoukunsheng	8e9ebebcef	test=develop add linspace op	6 years ago
dengkaipeng	cfda1fdea7	add attr scale. test=develop	6 years ago
Xin Pan	b55dd32e9c	Merge pull request #16394 from panyx0718/imperative2 Add DeepCF model	6 years ago
sneaxiy	2f54d9f995	Merge develop test=develop	6 years ago
sneaxiy	072d95d8f6	Merge develop test=develop	6 years ago
sneaxiy	a93a9eef8f	add op registry type refine gc code test=develop	6 years ago
chengduo	c917c13af1	increase the time limit (#16405) test=develop	6 years ago
whs	18779b5b8f	[Operator] Add range op. (#15431) * Add range op. test=develop * Add more unit tests. test=develop * Fix API.spec test=develop * Fix API.spec test=develop * Fix API.spec test=develop	6 years ago
phlrain	6b971e1f19	remove test_dist_transplier; test=develop	6 years ago
phlrain	7dc4a7f4f8	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_var_name_in_opt_2	6 years ago
phlrain	d11d0e18c2	remove test_dist_transplier; test=develop	6 years ago
Xin Pan	55a7b98126	Add DeepCF model test=develop	6 years ago
Zhen Wang	ec11135d54	Merge pull request #16341 from wzzju/add_channel_wise_in_quant_pass Add channel wise in quant pass.	6 years ago
xiaolil1	e235882c18	Enable MKL-DNN INT8 Concat Kernel. (#16156) * Enable INT8 Concat Kernel to improve the performance of MobileNet-SSD. test=develop * Optimize UT format. test=develop * Fix UT file address issue. test=develop * Refine the license year. test=develop * Optimize code for new API. test=develop * Restructure INT8 Concat kernel. test=develop	6 years ago
Qiyang Min	171df5b56b	Merge pull request #16303 from junjun315/checkpoint for Checkpoint save and load	6 years ago
Hongyu Liu	e5478ab5c8	Merge pull request #16346 from phlrain/add_floordiv_and_mod add elementwise floordiv, mod	6 years ago
chengduo	33965527fd	Add unit test for fuse all reduce (#16354) * refine fused_all_reduce_op * add unit test in test_parallel_executor_seresnext test=develop	6 years ago
phlrain	5dc9b51994	fix time; test=develop	6 years ago
phlrain	56c2d384c7	add elementwise floordiv, mod; test=develop	6 years ago
Wu Yi	8bebfe5640	add resnet nccl2 dist training, mp training unit test (#16167) * add resnet nccl2 test=develop * test dist train test=develop * update test=develop * increase timeout test=develop * test on CI env test=develop	6 years ago
baojun	2de263a5d9	Add softmax_with_cross_entropy_op to ngraph engine (#16304) * Add softmax_with_cross_entropy_op test=develop * simplify implementation test=develop	6 years ago
chengduo	f26ba5bddd	Fuse AllReduce (#15921) * fuse all_reduce test=develop * add fuse_parameter_groups_size test=develop * Polish code test=develop * Fix travis-ci test=develop * Add SetGroupAccordingToLayers and SetGroupAccordingToGroupSize test=develop * Add SetGroupAccordingToMemorySize test=develop * fix multi_devices_graph test=develop * reset params_grads test=develop * Polish code test=develop	6 years ago
dengkaipeng	93701dba50	add jit kernel for softmax axis. test=develop	6 years ago
Wu Yi	6382b62f6b	Collective ops (#15572) * wip allreduce in op * wip * wip * wip * wip adding test * wip for conflict with mp mode * fix tests test=develop * fix cpu build test=develop * fix travis clang format test=develop * fix cpu build test=develop * update api.spec test=develop * delete comment test=develop * fix cpplint test=develop * fix test=develop * follow comment test=develop * add file test=develop * fix build test=develop * update test=develop * to be compatible with sync_bn, and fix mp mode in develop test=develop	6 years ago
lujun	bed0ecf3d2	checkpoint pr be moved here, test=develop	6 years ago
Zhen Wang	ec88b6cc5a	add channel wise quantization in ir pass.	6 years ago
sneaxiy	3a09693f5c	change API name test=develop	6 years ago
Yibing Liu	7e20e7691e	Fix the bug in fp16 backward kernel (#16269) test=develop	6 years ago
dengkaipeng	365e6cfd15	add mkldnn support. test=develop	6 years ago
dengkaipeng	217db27337	add mkldnn support. test=develop	6 years ago
dengkaipeng	6cb66721d2	add cudnn support. test=develop	6 years ago
sneaxiy	161b8ddcaa	Merge develop	6 years ago
xiaolil1	e818fa1004	Enable INT8 transpose kernel for MobileNet-SSD improvement. (#16159) * Enable INT8 transpose kernel for MobileNet-SSD improvement. test=develop * Refine the license year. test=develop * Delete redundant code. test=develop * Add axis check. test=develop	6 years ago
Xin Pan	3e9319f3ab	add more imperative layer tests. test=develop	6 years ago
Xin Pan	7458114b5b	Merge pull request #16228 from panyx0718/imperative graph neural network for imperative mode	6 years ago
Kaipeng Deng	b77ebb2af2	Merge pull request #15919 from heavengate/yolo_box add yolo_box for detection box calc in YOLOv3	6 years ago
Xin Pan	3be7e971ab	polish test=develop	6 years ago
Xin Pan	50ff898378	graph neural network for imperative mode test=develop	6 years ago
achao2013	81b4fad8b9	add moving average absmax op and fix bug (#15155) * Add moving average absmax op in quantization-aware training.	6 years ago
Kaipeng Deng	74037cc1c8	Merge branch 'develop' into yolo_box	6 years ago
Xin Pan	92b9ce3479	Merge pull request #16073 from heavengate/yolov3_loss_imporve Yolov3 loss: add mixup score and label smooth	6 years ago
qingqing01	8ad672a287	Support sync batch norm. (#16121) * Support Sync Batch Norm. * Note, do not enable it in one device. Usage: build_strategy = fluid.BuildStrategy(); build_strategy.sync_batch_norm = True; binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(loss_name=loss_mean.name, build_strategy=build_strategy)	6 years ago
Yibing Liu	4ae23cc3c5	Impl fp16 compute kernel for slice_op (#16206) * Impl fp16 compute kernel for slice_op test=develop * Use data() to replace mutable_data()	6 years ago
sneaxiy	5a92e4c097	revert revert 16144 test=develop	6 years ago
sneaxiy	ad5f0e6018	merge develop	6 years ago
sneaxiy	55ba7f610b	fix numeric error test=develop	6 years ago
Zeng Jinle	a91964c8fe	Revert "PaddingRNN model memory optimize" test=develop	6 years ago
Zeng Jinle	0b49e43d3a	Merge pull request #16144 from sneaxiy/rnn_mem_opt PaddingRNN model memory optimize	6 years ago

2762 Commits (d4413a54bc95e80d54403fd5c48261ca7313d125)