guomingz
2281ebf0f3
Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. ( #17130 )
...
* Relu6 is the bottleneck op for Mobilenet-v2. Since MKLDNN supports the conv/relu6 fusion, we implement this fusion via a cpass. INT8 support for this fusion will only arrive in MKLDNN v0.20, so this PR focuses on the FP32 optimization.
The table below shows the benchmark (FPS) measured on skx-8180 (28 cores):
Batch size | with fusion | without fusion
-- | -- | --
1 | 214.7 | 53.4
50 | 1219.727 | 137.280
test=develop
* Fix the format issue
test=develop
* Add the missing nolint comments.
test=develop
* Fix the typos.
test=develop
* Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine.
test=develop
* Adjust the indentation.
test=develop
* Add the test_conv_brelu_mkldnn_fuse_pass case.
test=develop
* Slightly update the code per Baidu comments.
Embed the parameter definitions in the code.
That makes the code easier to understand.
test=develop
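For readers unfamiliar with the op, the sketch below shows what relu6 (a bounded ReLU with an upper bound of 6) computes, and derives the speedup implied by the benchmark table in this commit message. The function names `relu6` and `speedup` are illustrative helpers, not Paddle or MKLDNN APIs, and the code says nothing about how the fused kernel itself is implemented.

```python
def relu6(x, threshold=6.0):
    """Bounded ReLU: clamp x to the range [0, threshold]."""
    return min(max(x, 0.0), threshold)


def speedup(fps_with_fusion, fps_without_fusion):
    """Ratio of fused throughput to unfused throughput."""
    return fps_with_fusion / fps_without_fusion


# FPS figures copied from the benchmark table in the commit message:
# batch size -> (with fusion, without fusion)
benchmark = {1: (214.7, 53.4), 50: (1219.727, 137.280)}

for batch_size, (fused, unfused) in benchmark.items():
    print(f"batch {batch_size}: {speedup(fused, unfused):.2f}x")
```

On the reported numbers this works out to roughly a 4x gain at batch size 1 and close to 9x at batch size 50.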
6 years ago
Yibing Liu
f9796b1249
Add LAMB Optimizer support ( #17489 )
...
* Add LAMB optimizer
* Expose LAMB Optimizer's APIs
test=develop, test=document_preview
* Cleanup code & doc
test=develop, test=document_preview
* Update lamb optimizer's formula
test=develop
6 years ago
mozga-intel
99ab57123c
Enabled ngraph elementwise max operator ( #17517 )
6 years ago
Tao Luo
3d19f44a89
remove unused SERIAL compiler option ( #17500 )
...
test=develop
6 years ago
mozga-intel
1eb151752e
Enable abs operator for a ngraph test=develop ( #17436 )
6 years ago
Zhaolong Xing
ff7f911b4d
add quant_dequant_moving_avg_max_abs op ( #17480 )
...
* add quant_dequant_moving_avg_max_abs op
test=develop
* add more note for quantdequant op
test=develop
6 years ago
lvmengsi
10b23a72c1
Double backward elementwise div ( #17416 )
...
* double backward, elementwise_div
* fix dx empty. test=develop
* bug fix (#17392 )
fix secure bug
* Enable stack operator for a Ngraph, test=develop (#17406 )
* fix sqrt_grad_grad unittest. test=develop (#17410 )
* fix sqrt_grad_grad unittest. test=develop
* disable sqrt_grad_grad unittest. test=develop
* test=develop, fix unittest
* test=develop, fix unittest
* test=develop, fix unittest
* test=develop, fix bug
* fix unittest. test=develop
* fix unittest dx. test=develop
* tmp fix! for test... test=develop
* reduce tmp, test=develop
* test=develop, reduce tmp
* fix broadcast unittest. test=develop
* fix format. test=develop
* refine code. test=develop
* refine code. test=develop
* refine GetDoubleGradSafeTensor. test=develop
* fix format. test=develop
6 years ago
Kaipeng Deng
14f223624f
fix sqrt unittest. test=develop ( #17440 )
6 years ago
lvmengsi
977e9fcb27
support elementwise_sub double backward ( #17476 )
...
add elementwise_sub_grad_grad op for backward of backward calculation
6 years ago
Yan Xu
0217555530
polish parallel dygraph code ( #17164 )
...
* add var grad hook test=develop
6 years ago
chengduo
e336dc86bb
[Speed] Refine the Executor when the num_thread=1 ( #17405 )
...
Refine the Executor when the num_thread=1
6 years ago
Kaipeng Deng
58d5c61a29
fix sqrt_grad_grad unittest. test=develop ( #17410 )
...
* fix sqrt_grad_grad unittest. test=develop
* disable sqrt_grad_grad unittest. test=develop
6 years ago
mozga-intel
6ee6700fac
Enable stack operator for a Ngraph, test=develop ( #17406 )
6 years ago
baojun
1ce7b45b9e
NGraph Added fill_zeros_like op test=develop ( #17295 )
6 years ago
baojun
910196524d
NGraph Added dropout and dropout_grad to ngraph test=develop ( #17320 )
6 years ago
mozga-intel
b189480734
Ngraph Enable gather operator test=develop ( #17296 )
6 years ago
lvmengsi
4ef631013c
Double backward sqrt ( #17387 )
...
* double backward sqrt
* refine unittest. test=develop
* refine test. test=develop
* remove alpha in unittest. test=develop
6 years ago
lvmengsi
5d1ac41b00
Double backward reduce mean ( #17372 )
...
* test=develop, double backward reduce_mean
* add comment. test=develop
* fix format. test=develop
* rename GradGrad -> DoubleGrad. test=develop
* fix op_use_default_grad_op_maker.spec. test=develop
6 years ago
Kaipeng Deng
bd9bef5a4e
add elementwise_add_grad_grad op ( #17366 )
...
* add elementwise_add_grad_grad op. test=develop
* use defined GradMaker. test=develop
6 years ago
jerrywgz
1c6d064627
add collect fpn proposals op,test=develop ( #16074 )
...
* add collect fpn proposals op,test=develop
6 years ago
Kaipeng Deng
60be66e2c0
support fc_op double grad ( #17317 )
...
* add double grad for mul_op. test=develop
* fix format. test=develop
* fix format. test=develop
* fix format. test=develop
* refine code. test=develop
* remove setzero. test=develop
* fix dx/dy init bug. test=develop
* fix format. test=develop
6 years ago
Jiabin Yang
4624d7c642
test=develop, add gradient sort backward strategy ( #17125 )
...
* test=develop, add gradient sort backward strategy
* test=develop, fix test by add FLAGS_cudnn_deterministic on new tests
6 years ago
Jiabin Yang
c843e64cf5
Revert "rename the default version from '0.0.0' to 'latest' ( #17304 )" ( #17356 )
...
This reverts commit f456c8beb8.
6 years ago
Kaipeng Deng
8bae8590ac
add double grad for elementwise_mul op ( #17255 )
...
* add double grad for elementwise_mul. test=develop
* remove comment. test=develop
* fix grad sum. test=develop
* fix for axis expand. test=develop
* add test for axis expand. test=develop
6 years ago
Kaipeng Deng
11d3a38f25
add double grad for square op ( #17173 )
...
* add double grad for square. test=develop
* format code. test=develop
* fix for grad sum. test=develop
* refine shape. test=develop
* refine extract. test=develop
6 years ago
chengduo
bc833945a4
Add DropLocalExeScopes in ParallelExecutor ( #17297 )
...
* reset drop local scope counter
test=develop
6 years ago
zhoukunsheng
d4b67e1692
Add Where Op( #16793 )
6 years ago
zhoukunsheng
1bfff02047
Add Diag Op( #17027 )
6 years ago
qingqing01
e32c9888f5
Double backward of conv2d. ( #17211 )
...
* Add conv2d_grad_grad_op
* Extract the cuDNN conv algo searching code into conv_cudnn_helper.h.
- Now use it in conv2d_grad_grad.
- Will simplify the searching code in conv2d and conv2d_grad in the next PR.
* Enhance and fix bug in unit testing of gradient_checker.
* Support fetching empty variables; return None in Python.
6 years ago
wopeizl
f456c8beb8
rename the default version from '0.0.0' to 'latest' ( #17304 )
...
* rename the default version from '0.0.0' to 'latest'
6 years ago
baojun
7bd1d03ee5
Adding lrn op for ngraph engine ( #17189 )
...
* added lrn op test=develop
* Added CreateConstant method test=develop
* avoid duplicates test=develop
6 years ago
Zeng Jinle
4f8594088d
Enhance inplace/mem-opt pass and enhance softmax_with_cross_entropy op inplace ( #17225 )
...
* add use_cuda to inplace pass,test=develop
* add test softmax_with_xe_inplace test,test=develop
* fix potential inplace bug
test=develop
* add more skip vars in mem opt pass,test=develop
* follow comment,test=develop
* follow comments,move duplicate out arg check to program->graph,test=develop
6 years ago
baojun
e782b54b9c
update softmax with axis arg test=develop ( #17190 )
6 years ago
Tao Luo
ff1661f12a
remove unused FLAGS_warpctc_dir ( #17162 )
...
* remove unused FLAGS_warpctc_dir
test=develop
* remove FLAGS_warpctc_dir
test=develop
6 years ago
Kaipeng Deng
a71d8fdb87
Softmax_cross_entropy op add axis ( #16806 )
...
* add attr axis infershape. test=develop
* add CUDA kernel. test=develop
* fix unittest. test=develop
* fix unittest for soft_label. test=develop
* fix fp16 unittest. test=develop
* remove comment code. test=develop
* refine test for axis. test=develop
* add python api. test=develop
* fix doc. test=develop
* fix fp16 unittest. test=develop
* fix ngraph test. test=develop
* fix ENFORCE for test_imperative_transformer. test=develop
* fit for ngraph test. test=develop
* fix after rebase develop. test=develop
* fix doc. test=develop
* fix API.spec. test=develop
* fix test_layers. test=develop
* fix format. test=develop
6 years ago
Zhen Wang
a914d9b116
Quant output scale ( #17215 )
...
* Add MovingAverageAbsMaxScale operator which is only used for calculating the quantization scale.
* test=develop
* change the output into inplace. test=develop
* Revert "test=develop"
This reverts commit 696cf62699ba1e1c98f61f7345ac7060010eb29a.
* Revert "change the output into inplace. test=develop"
This reverts commit a19acd20f07eee82622701a3015e6e9c073a5e0b.
* test=develop.
* update the MovingAverageAbsMaxScaleOp test. test=develop
6 years ago
jerrywgz
cc95a7516c
fix distribute fpn proposals, test=develop ( #16152 )
...
* fix distribute fpn proposals, test=develop
6 years ago
Zeng Jinle
ee2028a110
Add use_cuda to inplace pass ( #17205 )
...
* add use_cuda to inplace pass,test=develop
* add test softmax_with_xe_inplace test,test=develop
6 years ago
jerrywgz
a72907bbf4
Enhance concat op to support empty input. ( #17015 )
...
* enhance_concat, test=develop
6 years ago
wopeizl
83c4f7721f
use two GPUs to run the exclusive test test=develop ( #17187 )
6 years ago
tianshuo78520a
8092c40560
Modify test timeout ( #17181 )
...
* test=develop
* test=develop
6 years ago
guru4elephant
f938ccec62
remove async executor python api to fix document ( #17174 )
...
* remove async executor python api
test=develop
* remove test_async_executor.py
add executor train_from_dataset demo
test=develop
* fix import bug
test=develop
6 years ago
Zeng Jinle
5dfe2ab9e8
Fix mem leak when converting Tensor to numpy array ( #17182 )
...
* fix mem leak when converting Tensor to numpy array
test=develop
* remove unused unittest,test=develop
* follow comments, test=develop
* fix dygraph bug,test=develop
6 years ago
Zeng Jinle
4e1bc6e805
Rewrite inplace pass and fix gc bug ( #17126 )
...
* fix op graph view
test=develop
* rewrite inplace pass and fix reference count pass bug
test=develop
* fix unittest failed
test=develop
* follow comments, test=develop
6 years ago
xiaoting
bc48453b73
polish the label_smooth ( #17138 )
...
* polish the label_smooth
test=develop
* polish code
test=develop
6 years ago
tangwei12
deb510d451
cvm op feature ( #17081 )
...
cvm without LoD.
6 years ago
Jiancheng Li
554d3a71d2
test=develop fix bug: fix selected_indices in nms ( #17140 )
6 years ago
Zeng Jinle
28d69d710a
Refine dropout gpu memory ( #17095 )
...
* refine_dropout_mem,test=develop
* # This is a combination of 14 commits.
# The first commit's message is:
remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066 )
# This is the 2nd commit message:
Fleet unify distributed training (#16791 )
* implement distributed transpiler with fleet
# This is the 3rd commit message:
ParallelDyGraph with GPU collective mode (#16827 )
implement dygraph.parallel.DataParallel to hook reduce op.
# This is the 4th commit message:
Init mixed precision training interface (#16856 )
* Init mixed precision training interface
* Add fp16 test script
test=develop
* All initializers support float16
test=develop
* Code cleanup & add more code annotations
test=develop
* Update API spec
test=develop
* Add usage example in doc
test=develop
# This is the 5th commit message:
fix reference_count_pass,test=develop (#17060 )
test=develop
# This is the 6th commit message:
Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090 )
* Cache the information of linear interpolation in forward and use it in backward.
test=develop
* Fix cuda kernel.
test=develop
# This is the 7th commit message:
remove unnecessary prepare_data (#17080 )
test=develop
# This is the 8th commit message:
fix interpolate cu. test=develop (#17101 )
# This is the 9th commit message:
test=develop, double backward leaky_relu (#17067 )
backward of backward: leaky_relu
# This is the 10th commit message:
fix fuse optimizer ops (#17102 )
test=develop
# This is the 11th commit message:
truncated_gaussian_random supported in distributed training, test=develop (#17091 )
# This is the 12th commit message:
Detailed coordinate description for yolov3 loss (#17007 )
* Detailed coordinate description for yolov3 loss
test=develop
* modified api.spec
test=develop
* modified loss name
* fix api.spec
test=develop
* polish description
test=develop
* modified api.spec
test=develop
# This is the 13th commit message:
fix test_weight_decay (#17109 )
test=develop
# This is the 14th commit message:
Path flag (#17105 )
* fix python/paddle/fluid/__init__.py detecting problems
6 years ago
chengduo
9ccce576d6
fix test_weight_decay ( #17109 )
...
test=develop
6 years ago
ceci3
258e000be6
test=develop, double backward leaky_relu ( #17067 )
...
backward of backward: leaky_relu
6 years ago
Kaipeng Deng
10c487eb21
fix interpolate cu. test=develop ( #17101 )
6 years ago
whs
55ce36e981
Speedup roi_perspective_transform op by caching the information of linear interpolation in forward ( #17090 )
...
* Cache the information of linear interpolation in forward and use it in backward.
test=develop
* Fix cuda kernel.
test=develop
6 years ago
Yibing Liu
beda78258f
Init mixed precision training interface ( #16856 )
...
* Init mixed precision training interface
* Add fp16 test script
test=develop
* All initializers support float16
test=develop
* Code cleanup & add more code annotations
test=develop
* Update API spec
test=develop
* Add usage example in doc
test=develop
6 years ago
Yan Xu
0b07eef118
ParallelDyGraph with GPU collective mode ( #16827 )
...
implement dygraph.parallel.DataParallel to hook reduce op.
6 years ago
tangwei12
1a4a51db2b
Fleet unify distributed training ( #16791 )
...
* implement distributed transpiler with fleet
6 years ago
tangwei12
e707119a89
remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop ( #17066 )
6 years ago
guomingz
2deac4e447
Fix the bug in the test_conv2d_int8_mkldnn case caused by improper parameter passing ( #17058 )
...
* resolve #17057
Fixed the bug that fuse_relu/fuse_residual option couldn't be passed to class TestConv2dInt8Op.
test=develop
* Fix the bug in the test_conv2d_int8_mkldnn case caused by improper parameter passing.
test=develop
6 years ago
chengduo
a2be4b4d91
Add fuse momentum ops ( #16745 )
...
* Add fuse momentum ops
6 years ago
chengduo
e296e0fead
fix test_parallel_executor_seresnet random fail ( #17030 )
...
test=develop
6 years ago
Tao Luo
b3a11943c1
Merge pull request #17031 from luotao1/reduce_test_time
...
reduce unittest time by renaming testcuda to has_cuda
6 years ago
qingqing01
c1c2633a63
Support backward of backward for Relu and add a new gradient checker by comparing theoretical and numerical Jacobian. ( #16862 )
...
* Support backward of backward and a new gradient checker
* Rename decorators.py to decorator_helper.py, since Python on Windows CI has decorators package.
1. Add ReluDoubleGradMaker when register relu_grad.
2. Add a new gradient checker by comparing theoretical and numerical Jacobian. Check double gradients by double_grad_check.
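The idea behind a gradient checker that compares theoretical and numerical Jacobians can be sketched in plain Python. This is a minimal first-order illustration for an elementwise relu, under the assumption that central differences are used. It is not the code in gradient_checker, and all function names here are hypothetical.

```python
def relu(x):
    """Elementwise relu on a list of floats."""
    return [max(v, 0.0) for v in x]


def relu_jacobian(x):
    """Theoretical Jacobian of elementwise relu: diagonal of 0/1 entries."""
    n = len(x)
    return [[(1.0 if x[i] > 0 else 0.0) if i == j else 0.0
             for j in range(n)] for i in range(n)]


def numerical_jacobian(f, x, delta=1e-4):
    """Central-difference estimate, with jac[i][j] ~ d f_i / d x_j."""
    y = f(x)
    jac = [[0.0] * len(x) for _ in y]
    for j in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[j] += delta
        minus[j] -= delta
        f_plus, f_minus = f(plus), f(minus)
        for i in range(len(y)):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * delta)
    return jac


def jacobians_close(a, b, atol=1e-3):
    """True if every pair of entries differs by at most atol."""
    return all(abs(p - q) <= atol
               for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b))


# Points are chosen away from 0, where relu is not differentiable.
x = [-1.5, 0.7, 2.0]
print(jacobians_close(relu_jacobian(x), numerical_jacobian(relu, x)))
```

Checking a double gradient follows the same pattern, applied to the gradient function instead of the forward function.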
6 years ago
Zeng Jinle
f188b3708e
Move gc test to each test of op ( #16999 )
...
* move gc test to op_test
test=develop
* Revert "move gc test to op_test"
This reverts commit cf15da65c38f57c91f53b3d8b3c2365d4aa86016.
* enable gc test in some ops
test=develop
6 years ago
chengduo
7c370e42f9
Fix test_recurrent_op ( #17001 )
...
* fix random failure
test=develop
6 years ago
Tao Luo
9466e956a7
reduce unittest time by renaming testcuda to has_cuda
...
test=develop
6 years ago
wopeizl
d9991dccdd
add parallel build script to ci … ( #16901 )
...
* add parallel build script to ci test=develop
* 1. classify the test case as single card/two cards/multiple cards type
2. run test case according to the run type
6 years ago
qingqing01
ea42e431f8
Speed unit testing. ( #16978 )
...
* Speed affine_channel_op unit testing
* Add check in tensor_py
* Fix ONLY_CPU Compiling
6 years ago
guomingz
ae7a2cb8e3
resolve #16988 ( #16995 )
...
Update the filter generation mechanism so that it can generate negative parameters.
The original call (np.random.random()) couldn't simulate the conv/relu fusion case.
test=develop
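Why non-negative filters fail to exercise a conv/relu fusion can be illustrated with the standard library alone. The sketch assumes the test inputs are also non-negative (the commit message does not say so) and uses `random.random()` as a stand-in for `np.random.random()`, since both draw from [0, 1). The helper names are hypothetical.

```python
import random


def relu(x):
    return max(x, 0.0)


def conv_output(inputs, weights):
    """A single convolution output element is a dot product."""
    return sum(i * w for i, w in zip(inputs, weights))


def count_clipped(make_weight, trials=200, size=9, seed=0):
    """Count outputs that relu actually changes (negative pre-activations)."""
    rng = random.Random(seed)
    clipped = 0
    for _ in range(trials):
        inputs = [rng.random() for _ in range(size)]       # assumed non-negative
        weights = [make_weight(rng) for _ in range(size)]
        out = conv_output(inputs, weights)
        if relu(out) != out:
            clipped += 1
    return clipped


# Weights in [0, 1): every output is non-negative, so relu is a no-op.
nonneg = count_clipped(lambda rng: rng.random())
# Weights in [-1, 1]: some outputs are negative, so relu has an effect.
signed = count_clipped(lambda rng: rng.uniform(-1.0, 1.0))
print(nonneg, signed)
```

With non-negative weights the fused and unfused paths produce identical results whether or not the relu is applied, so a broken fusion would go unnoticed.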
6 years ago
liuwei1031
765c70a1b0
Unittest improve, test=develop ( #16941 )
...
* accelerate test_ir_memory_optimize_nlp, test=develop
* accelerate test_ir_memory_optimize_nlp, test=develop
6 years ago
guomingz
23df084b32
resolve #16987 ( #16994 )
...
Rename the testcuda function to has_cuda; this eliminates unnecessary testing.
test=develop
6 years ago
Zeng Jinle
1202d3fc74
Refine model gpu memory ( #16993 )
...
* speedup gc and inplace softmax_with_cross_entropy_grad
test=develop
* refine models gpu mem
Merge skip vars and warning messages of mem opt
remove relu mem opt
test=develop
* follow comments
test=develop
6 years ago
Zeng Jinle
af8a041bb6
reduce py_reader unittest time ( #16996 )
...
test=develop
6 years ago
Yibing Liu
3c375751f8
Support seq len equal to 0 in sequence ops ( #16935 )
...
* Support seq len equal to 0 in sequence ops
test=develop
* Add more test cases
* Fix some comments
test=develop
* Fix py3 error
test=develop
6 years ago
lujun
a3f17280a3
fix dy-load bug, test=develop
6 years ago
gongweibao
cbdb8a17b1
Polish DGC code ( #16818 )
6 years ago
lujun
dbf66dd034
Merge pull request #16954 from junjun315/fix-dygraph-checkpoint
...
Fix dygraph checkpoint bug
6 years ago
Tao Luo
aed702cea3
Merge pull request #16920 from qingqing01/test_profile
...
Fix test_profiler when the machine has many cores.
6 years ago
Tao Luo
b596eed73a
Merge pull request #16824 from LeoZhao-Intel/mkldnn_mul
...
disable test_elementwise_mul_mkldnn_op case
6 years ago
lujun
3beed54cdd
Merge pull request #16917 from velconia/dygraph_untrack_op
...
imperative fix tracer train mode
6 years ago
lujun
a7c11979ba
fix dygraph save/load checkpoint error, test=develop
6 years ago
tangwei12
2b61db07d1
fix sampling id op bug ( #16909 )
...
* fix sampling id op bug, test=develop
6 years ago
gongweibao
b7f20ed6af
Fix unittest dataset error ( #16925 )
6 years ago
Hongyu Liu
d5a7c09856
Merge pull request #16798 from phlrain/softmax_cross_support_high_rank
...
softmax cross entropy support high rank
6 years ago
Dang Qingqing
b73a71d11e
Fix test_profiler when the machine has many cores
...
test=develop
6 years ago
Kaipeng Deng
5d45eb06f9
Merge pull request #16858 from heavengate/fix_yolo_param
...
Fix yolo param
6 years ago
minqiyang
97aa1838bc
Fix dygraph train mode
...
test=develop
6 years ago
Qiyang Min
102fc8596e
Merge pull request #16777 from velconia/dygraph_untrack_op
...
Imperative tracer does not hold op any more
6 years ago
Leo Zhao
1edcd73115
remove unnecessary new line
...
test = develop
resolve #16764
6 years ago
Leo Zhao
61cc842a53
disable test_elementwise_mul_mkldnn_op case
6 years ago
Hongyu Liu
0701c2db47
Merge pull request #16518 from zhoukunsheng/rsqrt
...
Rsqrt
6 years ago
phlrain
766c868199
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into softmax_cross_support_high_rank
6 years ago
Tao Luo
485bc6a055
Merge pull request #16868 from chengduoZH/speedup_test_parallel_executor_transformer
...
Reduce the layer number of transformer model
6 years ago
Tao Luo
d4b5510c00
Merge pull request #16860 from junjun315/fix-utest-vgg
...
Fix bug: long vgg-utest testing time
6 years ago
Hongyu Liu
2de7f3cfc3
Merge pull request #16799 from phlrain/sigmoid_corss_entropy_support_high_rank
...
support high rank
6 years ago
chengduozh
3349094fe2
reduce the layer number of transformer
...
test=develop
6 years ago
minqiyang
73cbdc2998
Add train mode
...
test=develop
6 years ago
colourful-tree
434caab21b
Merge pull request #16741 from colourful-tree/dev
...
add continuous value model op
6 years ago
lujun
4aea89faa2
fix vgg-test. test=develop
6 years ago
dengkaipeng
7b1702d9a1
fix unittest and API.spec. test=develop
6 years ago
Yibing Liu
4267a81afc
Correct the lod level of compiled time in lod_reset ( #16790 )
...
test=develop
6 years ago
chengduo
c62674f475
Refine StaticRnn ( #16707 )
...
* enable recurrent op test=develop
6 years ago
chengduo
6220b8b9d1
[Speed] Make test_dyn_rnn faster ( #16761 )
...
* make test_dyn_rnn faster
6 years ago
Leo Zhao
a9694bd3d6
convert output to nchw format to align with native version in avx512 mode
...
test = develop
resolve #16764
6 years ago
heqiaozhi
759940786e
Merge remote-tracking branch 'upstream/develop' into dev
...
test=develop
6 years ago
phlrain
8063f5867f
remove sigmoid change; test=develop
6 years ago
phlrain
468f8ccff9
support high rank; test=develop
6 years ago
phlrain
97d4622bdb
add softmax test unit
...
test=develop
6 years ago
phlrain
bbfc82cc42
softmax cross entropy support high rank
...
test=develop
6 years ago
zhoukunsheng
2b2b4ca21e
Merge branch 'develop' into rsqrt
6 years ago
heqiaozhi
afa64a5cfa
add cvm unittest
...
test=develop
6 years ago
Hongyu Liu
e2897ba13a
Merge pull request #16432 from zhoukunsheng/linspace
...
add linspace op
6 years ago
Hongyu Liu
afe0d64c9d
Merge pull request #16320 from zhoukunsheng/all_any
...
add reduce_all, reduce_any op
6 years ago
Tao Luo
f96446cade
Merge pull request #16738 from luotao1/high_level_api_test
...
reduce CI time of high_level_api tests
6 years ago
chengduo
610c6442e3
Make test_parallel_executor_seresnet.py Faster ( #16701 )
...
* slimming test_parallel_executor_seresnet.py
6 years ago
Tao Luo
544f91deba
add WITH_HIGH_LEVEL_API option, default OFF
...
test=develop
6 years ago
Tao Luo
7d0ed2a423
remove non-existent test_image_classification_resnet
...
test=develop
6 years ago
Zeng Jinle
674aed6a6c
Fix unittests which takes too long time ( #16713 )
...
* fix too long unittest
recommit
test=develop
* add fake_reader.py
test=develop
6 years ago
Jiabin Yang
7060c8d89f
test=develop, refine transformer ( #16734 )
6 years ago
lujun
9bd44b94da
Merge pull request #16561 from junjun315/move-api-to-root
...
Move dygraph api to root
6 years ago
Kaipeng Deng
ed97156461
Merge pull request #16439 from heavengate/resize_scale
...
add attr scale. test=develop
6 years ago
Tao Luo
38f01b678a
rename high_level_api tests
...
test=develop
6 years ago
Huihuang Zheng
2146293d26
Fix op registry ( #16677 )
...
list of fixed ops:
lookup_table_op
space_to_depth_op
squared_l2_distance_op
squared_l2_norm_op
teacher_student_sigmoid_loss_op
tree_conv_op
warpctc_op
test=develop
6 years ago
lujun
14db0680c0
merge conflict, test=develop
6 years ago
lujun
92c8ac8a74
merge conflict, test=develop
6 years ago
Jiabin Yang
a06f4b2b2c
make less batch of tests to fit ci ( #16706 )
...
* make less batch of tests to fit ci
* test=develop, invoke ci
add some comments back
6 years ago
chengduo
55b15db5af
Add unit test for fuse all_reduce ops ( #16699 )
...
* test fuse all_reduce
6 years ago
Yan Xu
c6720990c0
fix seresnext unit test ( #16689 )
...
comment out np.array(x.get_tensor()) in imperative mode to avoid OOM.
6 years ago
guru4elephant
7d653f0aed
Merge pull request #16652 from xjqbest/dataset_merge_develop
...
fix dataset bug
6 years ago
chengduo
ea8655dbd2
Add unit test for fuse_opt_ops ( #16550 )
...
* add unit test for fuse_opt_ops
test=develop
6 years ago
minqiyang
7c4b9b577a
Polish code
...
test=develop
6 years ago
minqiyang
aa07814df3
Add 3 uts
...
test=develop
6 years ago
minqiyang
2e0b871320
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_dqn
6 years ago
xjqbest
274477005e
fix dataset testcase error
...
test=develop
6 years ago
Zeng Jinle
bb143052cb
fix gc bug in conditional block ( #16673 )
...
test=develop
6 years ago
xjqbest
5e5139283b
fix runtime error
...
test=develop
6 years ago
ruri
229dc93277
Add Pixel shuffle OP ( #15782 )
...
* add pixel_shuffle op
* add pixel_shuffle op, test=develop
* rewrite code, test=develop
* delete useless comment, test=develop
* Refine pixel_shuffle_op and unit testing
* refine code,test=develop
* refine .cu,test=develop
* fix unittest,test=develop
* Fix unit testing
test=develop
* resolve conflict, test=develop
* fix test, test=develop
* fix API, test=develop
* fix test datatype bug,test=develop
* polish comments,test=develop
* add API,test=develop
* test=develop
* Add Pixel_Shuffle OP,test=develop
* support python3,test=develop
* add include memory to travis CI bug,test=develop
6 years ago
lujun
38382f8e27
Merge pull request #16658 from JiabinYang/fix/transformer_random_failed
...
test=develop, fix transformer in dygraph
6 years ago
lujun
99120698b7
merge conflict, test=develop
6 years ago
lujun
01f4f2d7e4
merge conflict, test=develop
6 years ago
minqiyang
b29249404b
Polish code
...
test=develop
6 years ago
minqiyang
61fe139f34
Polish code
6 years ago
lujun
e11bf2a49e
merge branch, test=develop
6 years ago
Qiyang Min
cf307d0dfb
Imperative fix ptb rnn place bug ( #16632 )
...
* Fix bug of gradient interface
* shrink transformer
* Right transformer
* Change from width-first backward to deep-first backward process
test=develop
* Reverse iterator op's input
test=develop
* Polish code
* Change the iteration direction in ingrads' map slots
test=develop
* Polish code
test=develop
* Add GPU place in static mode of ptb rnn
test=develop
* Polish code
test=develop
6 years ago
JiabinYang
4c18b98fcd
test=develop, fix transformer in dygraph
...
/
6 years ago
lujun
a32c6ffa96
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into move-api-to-root
6 years ago
lujun
78ff5d72d3
Merge pull request #16520 from PaddlePaddle/move-code
...
add some dygraph op
1.Conv3d
2.Conv3dTranspose
3.RowConv
4.GroupNorm
5.SpectralNorm
6.TreeConv
and utest
6 years ago
Xin Pan
c77fb9fed9
Merge pull request #16636 from panyx0718/imperative
...
add a simple test
6 years ago
xjqbest
271b7147cc
fix dataset bug
...
test=develop
6 years ago
Zeng Jinle
1c526e1d1a
Fix some grad op desc makers ( #16633 )
...
* fix some grad op desc maker
test=develop
* fix grad op desc makers
test=develop
6 years ago
minqiyang
e377d75977
Add UT for most layers without params
...
test=develop
6 years ago
Xin Pan
d02e3e7f79
add a simple test
...
test=develop
6 years ago
minqiyang
2839e22739
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_dqn
...
test=develop
6 years ago
minqiyang
cd71d645c5
Polish code
...
test=develop
6 years ago
minqiyang
2e09e95bfc
Add GPU place in static mode of ptb rnn
...
test=develop
6 years ago
chengduo
1342e2ea04
Fix the bug of the fast threaded executor ( #16514 )
...
* Fix the bug of the fast threaded executor. I
6 years ago
Zeng Jinle
d658244997
fix some grad op desc maker ( #16581 )
...
test=develop
6 years ago
lujun
ad7c1a934f
merge conflicts, test=develop
6 years ago
lujun
60e3e35575
merge branch, test=develop
6 years ago
Qiyang Min
12e36d38a5
Imperative deep-first backward process ( #16605 )
...
* Fix bug of gradient interface
* shrink transformer
* Right transformer
* Change from width-first backward to deep-first backward process
test=develop
* Reverse iterator op's input
test=develop
* Polish code
* Change the iteration direction in ingrads' map slots
test=develop
* Polish code
test=develop
6 years ago
Jiabin Yang
353244f4fc
test=develop, add FC and test ( #16604 )
...
* test=develop, add FC and test
* test=develop, refine code
6 years ago
lujun
e97ded835a
merge branch, test=develop
6 years ago
乔龙飞 Qiao Longfei
21622ca30b
Merge pull request #16172 from jacquesqiao/add-async-ssa-graph-executor-communicator
...
Add async ssa graph executor communicator
6 years ago
minqiyang
51ca50897a
Change the iteration direction in ingrads' map slots
...
test=develop
6 years ago
minqiyang
d16cb8ca11
Polish code
6 years ago
minqiyang
cce766d710
Reverse iterator op's input
...
test=develop
6 years ago
minqiyang
1a55f7d38c
Change from width-first backward to deep-first backward process
...
test=develop
6 years ago
Jiabin Yang
22e5bcd275
test=develop, ptb_rnn fix op ( #16573 )
6 years ago
lujun
d3fc3d5520
move internal function, test=develop
6 years ago
minqiyang
a0478084f8
Right transformer
6 years ago
minqiyang
124f45c9f7
shrink transformer
6 years ago
lujun
717256755a
move dygraph.nn,dygraph.layer to fluid, test=develop
6 years ago
Yan Xu
2e1e76e70e
add SeResNeXt unittest ( #16503 )
...
* add seresnet unittest test=develop
* add dropout layer test=develop
* fix ci test=develop
* fix comment test=develop
* fix comment test=develop
* fix ci test=develop
* fix ci test=develop
* fix ci
* fix module name test=develop
* run imperative serenext unit test serially test=develop
6 years ago
zhoukunsheng
5edf4fb4fb
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into all_any
6 years ago
chengduo
feb1b54f9d
fix min and max bug ( #16570 )
...
test=develop
6 years ago
Qiao Longfei
adf272bcec
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator
...
test=develop
6 years ago
guru4elephant
76b49f02ee
Merge pull request #16539 from guru4elephant/train_with_pipe_reader_merge_develop
...
Train with pipe reader merge develop
6 years ago
Qiao Longfei
baf02328b2
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator
...
test=develop
6 years ago
Qiyang Min
d8d73ff3db
Merge pull request #15584 from velconia/imperative_lr_scheduler
...
Support imperative learning rate scheduler
6 years ago
lujun
cf6238fbd9
fix merge for move dir, fix utest error, test=develop
6 years ago
qingqing01
1ebd7434d5
Add linear learning warmup method in learning rate scheduler. ( #16563 )
...
* Add linear learning warmup method
This warmup lr can be combined with other learning rate strategies.
For example:
decayed_lr = fluid.layers.linear_lr_warmup(
    fluid.layers.piecewise_decay(boundaries, lr_steps),
    warmup_steps, start_lr, end_lr)
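The behaviour of such a combination can be sketched in plain Python. This assumes the usual definition of linear warmup (interpolate from start_lr to end_lr over warmup_steps, then hand over to the wrapped schedule). The functions `piecewise_lr` and `warmup_lr` are hypothetical stand-ins and do not reproduce the fluid.layers implementations.

```python
def piecewise_lr(step, boundaries, values):
    """Return values[i] for the first boundary that step is below, else values[-1]."""
    for boundary, value in zip(boundaries, values):
        if step < boundary:
            return value
    return values[-1]


def warmup_lr(step, base_lr, warmup_steps, start_lr, end_lr):
    """Linear ramp from start_lr to end_lr, then fall through to base_lr."""
    if step < warmup_steps:
        return start_lr + (end_lr - start_lr) * step / warmup_steps
    return base_lr


# Illustrative values only; they are not taken from the commit.
boundaries = [100, 200]
lr_steps = [0.1, 0.01, 0.001]   # needs len(boundaries) + 1 entries

for step in (0, 5, 10, 150, 250):
    base = piecewise_lr(step, boundaries, lr_steps)
    print(step, warmup_lr(step, base, warmup_steps=10, start_lr=0.0, end_lr=0.1))
```

Keeping the warmup as a wrapper around an arbitrary base schedule is what lets it combine with piecewise decay or any other strategy.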
6 years ago
Wu Yi
22b02bfa62
Batch norm cudnn accurate ( #16545 )
...
* fix cudnn batch norm accuracy test=develop
* fix cudnn batch norm accuracy test=develop
* disable failed test for later fix test=develop
6 years ago
xjqbest
a99c8d0c29
fix client to client communication bug
...
test=develop
6 years ago
Kaipeng Deng
3d939d32ee
Merge pull request #16023 from heavengate/kl_div_loss
...
KL div loss: add kldiv_loss op
6 years ago
Kaipeng Deng
54474637ae
Merge pull request #16057 from heavengate/softmax_axis
...
Add attr 'axis' for softmax
6 years ago
Kaipeng Deng
63ac947e2f
Merge pull request #16135 from heavengate/shift
...
Add temporal_shift op for TSM model
6 years ago
lujun
04c0b12c6e
fix test layers error, test=develop
6 years ago
lujun
cf642d478d
fix merge for move dir, fix utest error, test=develop
6 years ago
Qiao Longfei
61912e879d
test_dist_base set runtime_split_send_recv to false test=develop
6 years ago
lujun
1dcd28e819
move dygraph.nn,dygraph.layer to fluid, test=develop
6 years ago
wopeizl
e014950e87
add slice support for dim < 0 ( #16494 )
...
* add slice support for dim < 0 test=develop
6 years ago
zhoukunsheng
5284213942
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rsqrt
6 years ago
minqiyang
fb7c787d34
Fix conflicts
...
test=develop
6 years ago
minqiyang
3e57981294
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_lr_scheduler
...
test=develop
6 years ago
lujun
d4d5052fa6
fix merge for move dir, fix utest error, test=develop
6 years ago
zhoukunsheng
3c4f5f0368
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into linspace
6 years ago
lujun
32146857a4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into move-code
6 years ago
dongdaxiang
720647e17f
rebase current develop and fix conflict
...
test=develop
6 years ago
dongdaxiang
93c3c7f9b3
fix dataset testcase problem
...
test=develop
6 years ago
dongdaxiang
d739bab844
fix async_executor problem and remove some unnecessary testcase, fix trainer_desc import problem
...
test=develop
6 years ago
xjqbest
1497ce388d
fix code style of test_dataset.py
...
test=develop
6 years ago
xjqbest
7cdd57a474
fix code style of test_dataset.py
...
test=develop
6 years ago
xjqbest
748d54cb46
fix code style of test_dataset.py
...
test=develop
6 years ago
xjqbest
1073b4d8f9
fix code style of test_dataset.py
...
test=develop
6 years ago
dongdaxiang
45eb6f0765
run pre-commit check files and fix code style problem
...
test=develop
6 years ago
xjqbest
e57ac5ed17
fix code style
...
test=develop
6 years ago
xjqbest
97c74e60c3
fix code style
...
test=develop
6 years ago
xjqbest
a38b98cb32
fix code style & runtime error
...
test=develop
6 years ago
xjqbest
d52586a97d
add doc string
...
test=develop
6 years ago
xjqbest
e95cafd9a7
fix code style & add dataset testcase
...
test=develop
6 years ago
Qiao Longfei
d8974e6da0
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator
...
test=develop
6 years ago
lujun
de605cc0fc
Merge pull request #16523 from junjun315/tensor_api
...
move imperative to dygraph
6 years ago
chengduo
1096746cbf
Fuse Adam And SGD ops ( #15933 )
...
* fuse optimizer
6 years ago
lujun
1c9aaeebe0
move imperative to dygraph, test=develop
6 years ago
lujun
d980ba19bc
add some dygraph op, test=develop
6 years ago
zhoukunsheng
2f9e562100
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into linspace
6 years ago
zhoukunsheng
082822d417
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rsqrt
6 years ago
zhoukunsheng
c47f3cc7fe
test=develop
...
add rsqrt op
6 years ago
minqiyang
42507d33c6
Change atol to default value
6 years ago
dengkaipeng
193185b840
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into shift
6 years ago
dengkaipeng
8a0023892a
fix unittest. test=develop
6 years ago
whs
59f75ec76e
Make unit test of fsp op faster and more stable. ( #16502 )
...
* Make unit test of fsp op faster and more stable.
test=develop
* Skip unit test of fsp op.
test=develop
6 years ago
minqiyang
35c89f38c3
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_lr_scheduler
...
test=develop
6 years ago
gongweibao
eb83abeac3
Add DGC(Deep Gradient Compression) interface. ( #15841 )
6 years ago
zhoukunsheng
874b5d8362
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into linspace
6 years ago
zhoukunsheng
83c7bca13f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into all_any
6 years ago
Qiao Longfei
b68f84090b
fix test_split_selected_rows_op test=develop
6 years ago
Zeng Jinle
c7c6eeb44e
Merge pull request #16409 from sneaxiy/feature/advance_gc
...
Enhance gc to support deleting tensor buffer in advance
6 years ago
Jiabin Yang
54a73578a8
Feature/install check ( #16044 )
...
* test=develop, add install check
* test=develop, add install check scripts
* test=develop, refine language
* test=develop, add api spec
* test=develop, change cdn to bj to pass ci
6 years ago
minqiyang
99128a5c72
Implement Cosine and Noam Decay
...
test=develop
6 years ago
wopeizl
c300b1ba69
Tensor index ( #16223 )
...
* extend the slice function for python
test=develop
6 years ago
minqiyang
ec9c0874bc
Implement Exponential, NaturalExp, InverseTime and Polynomial Decay
6 years ago
Jiabin Yang
0d9d25d40f
Feature/refactor layers to Layers ( #16337 )
...
* test=develop, add some Layers and tests
* test=develop, add more layers
* test=develop, add more layers
* test=develop, add force cpu option
* Update test_layers.py
remove pdb
* test=develop, refine code
6 years ago
gongweibao
850b737112
Fix nparray.all() bug. ( #16472 )
6 years ago
Xin Pan
f8c279b11c
Merge pull request #16454 from panyx0718/imperative2
...
polish deepCF model to support real dataset
6 years ago
Qiao Longfei
30618409db
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor-communicator
6 years ago
sneaxiy
78fb3a62e0
fix env variable setting bug
...
test=develop
6 years ago
minqiyang
4278be8c49
Merge branch 'imperative_lr_scheduler' of https://github.com/velconia/Paddle into imperative_lr_scheduler
...
test=develop
6 years ago
minqiyang
b5bbb13ac1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_lr_scheduler
6 years ago
dengkaipeng
7920e3be02
revert test_softmax_cudnn. test=develop
6 years ago
Jiabin Yang
7c5319ba12
Fix/test imperative ptb rnn ( #16433 )
...
* test=develop, fix ptb rnn
* test=develop, change cdn to bj to pass ci
* test=develop, fix ci
6 years ago
Jiabin Yang
f735102eab
add layer norm to Layers, add transformer test in imperative mode ( #16092 )
...
* add layer norm to Layers, add transformer prepare encoding
* little change
* finish encoder part
* add decoder part
* finish model part
* add test case and part of data feed
* add transformer test
* add to_parameter, add remove in set_attr
* test=develop, fix pos encoding bug, create_parameter with standard name
* test=develop, rm dropout test in imperative
* test=develop, fix cpu error
* test=develop, fix minimize bug
* test=develop, fix one hot not stop gradient
* test=develop, fix one hot not stop gradient
* test=develop, refine parameter name
* test=develop, fix transformer test in imperative mode
* test=develop, fix transformer test in imperative mode
* test=develop, fix boost and mkl download error
* test=develop, fix boost and mkl download error
* test=develop, fix ci and refine code
* test=develop, fix ci and refine code
6 years ago
Xin Pan
fd24ab47ab
polish
...
test=develop
6 years ago
Xin Pan
1f89249a95
update DeepCF model
...
test=develop
6 years ago
sneaxiy
a7d0ac50b8
Merge develop
6 years ago
sneaxiy
7000ec85d9
fix some op grad maker
...
fix ctest eager deletion disable bug
test=develop
6 years ago
dengkaipeng
cfef382a85
fix format. test=develop
6 years ago
Zeng Jinle
4cc9809cae
Merge pull request #15799 from sneaxiy/feature/decoupled_reader
...
Try to decouple reader with program_desc
6 years ago
whs
e9bec9369b
[slim] Add quantization strategy and distillation strategy. ( #16408 )
...
* Add fsp operator.
1. Add unit test.
2. Add python API.
3. Add layer test.
* Add quantization strategy.
1. Add API.
2. Add unit test.
* Add distillation strategy.
* Add unit test config file for quantization
* Fix Copyright
test=develop
* Fix setup.py
* Fix document of layers.py.
test=develop
* Fix unit test in python3.
test=develop
* Fix documents.
test=develop
* 1. refine fsp op by batched gemm
2. remove unused import
test=develop
* Fix test_dist_se_resnext.
1. disable test distillation.
2. reset framework.py
test=develop
* Enable unit test of distillation after fixing Block._clone_variable
test=develop
* Fix cdn issue.
test=develop
6 years ago
liuwei1031
de3b70a101
fix cdn issue, test=develop ( #16423 )
...
* fix cdn issue, test=develop
* fix cdn issue, test=develop
6 years ago
zhoukunsheng
d3d31a5894
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into all_any
6 years ago
zhoukunsheng
664c342ca0
test=develop
...
split reduce_all_any_op.h into two files
add unit test for reduce_all, reduce_any
6 years ago
zhoukunsheng
43060084a4
test=develop
...
add linspace, modify interface comments in tensor.py, merge with develop branch
6 years ago
sneaxiy
f8ed2c229e
try to fix ci error
...
test=develop
6 years ago
zhoukunsheng
8e9ebebcef
test=develop
...
add linspace op
6 years ago
dengkaipeng
cfda1fdea7
add attr scale. test=develop
6 years ago
Xin Pan
b55dd32e9c
Merge pull request #16394 from panyx0718/imperative2
...
Add DeepCF model
6 years ago
sneaxiy
2f54d9f995
Merge develop
...
test=develop
6 years ago
sneaxiy
072d95d8f6
Merge develop
...
test=develop
6 years ago
sneaxiy
a93a9eef8f
add op registry type
...
refine gc code
test=develop
6 years ago
chengduo
c917c13af1
increase the time limit ( #16405 )
...
test=develop
6 years ago
whs
18779b5b8f
[Operator] Add range op. ( #15431 )
...
* Add range op.
test=develop
* Add more unit tests.
test=develop
* Fix API.spec
test=develop
* Fix API.spec
test=develop
* Fix API.spec
test=develop
6 years ago
phlrain
6b971e1f19
remove test_dist_transplier; test=develop
6 years ago
phlrain
7dc4a7f4f8
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_var_name_in_opt_2
6 years ago
phlrain
d11d0e18c2
remove test_dist_transplier; test=develop
6 years ago
Xin Pan
55a7b98126
Add DeepCF model
...
test=develop
6 years ago
Zhen Wang
ec11135d54
Merge pull request #16341 from wzzju/add_channel_wise_in_quant_pass
...
Add channel wise in quant pass.
6 years ago
xiaolil1
e235882c18
Enable MKL-DNN INT8 Concat Kernel. ( #16156 )
...
* Enable INT8 Concat Kernel to improve the performance of MobileNet-SSD.
test=develop
* Optimize UT format.
test=develop
* Fix UT file address issue.
test=develop
* Refine the license year.
test=develop
* Optimize code for new API.
test=develop
* Restructure INT8 Concat kernel.
test=develop
6 years ago
Qiyang Min
171df5b56b
Merge pull request #16303 from junjun315/checkpoint
...
for Checkpoint save and load
6 years ago
Hongyu Liu
e5478ab5c8
Merge pull request #16346 from phlrain/add_floordiv_and_mod
...
add elementwise floordiv, mod
6 years ago
chengduo
33965527fd
Add unit test for fuse all reduce ( #16354 )
...
* refine fused_all_reduce_op
* add unit test in test_parallel_executor_seresnext
test=develop
6 years ago
phlrain
5dc9b51994
fix time; test=develop
6 years ago
phlrain
56c2d384c7
add elementwise floordiv, mod; test=develop
6 years ago
Wu Yi
8bebfe5640
add resnet nccl2 dist training, mp training unit test ( #16167 )
...
* add resnet nccl2 test=develop
* test dist train test=develop
* update test=develop
* increase timeout test=develop
* test on CI env test=develop
6 years ago
baojun
2de263a5d9
Add softmax_with_cross_entropy_op to ngraph engine ( #16304 )
...
* Add softmax_with_cross_entropy_op test=develop
* simplify implementation test=develop
6 years ago
chengduo
f26ba5bddd
Fuse AllReduce ( #15921 )
...
* fuse all_reduce
test=develop
* add fuse_parameter_groups_size
test=develop
* Polish code
test=develop
* Fix travis-ci
test=develop
* Add SetGroupAccordingToLayers and SetGroupAccordingToGroupSize
test=develop
* Add SetGroupAccordingToMemorySize
test=develop
* fix multi_devices_graph
test=develop
* reset params_grads
test=develop
* Polish code
test=develop
6 years ago
dengkaipeng
93701dba50
add jit kernel for softmax axis. test=develop
6 years ago
Wu Yi
6382b62f6b
Collective ops ( #15572 )
...
* wip allreduce in op
* wip
* wip
* wip
* wip adding test
* wip for conflict with mp mode
* fix tests test=develop
* fix cpu build test=develop
* fix travis clang format test=develop
* fix cpu build test=develop
* update api.spec test=develop
* delete comment test=develop
* fix cpplint test=develop
* fix test=develop
* follow comment test=develop
* add file test=develop
* fix build test=develop
* update test=develop
* to be compatible with sync_bn, and fix mp mode in develop test=develop
6 years ago
lujun
bed0ecf3d2
checkpoint pr be moved here, test=develop
6 years ago
Zhen Wang
ec88b6cc5a
add channel wise quantization in ir pass.
6 years ago
sneaxiy
3a09693f5c
change API name
...
test=develop
6 years ago
Yibing Liu
7e20e7691e
Fix the bug in fp16 backward kernel ( #16269 )
...
test=develop
6 years ago
dengkaipeng
365e6cfd15
add mkldnn support. test=develop
6 years ago
dengkaipeng
217db27337
add mkldnn support. test=develop
6 years ago
dengkaipeng
6cb66721d2
add cudnn support. test=develop
6 years ago
sneaxiy
161b8ddcaa
Merge develop
6 years ago
xiaolil1
e818fa1004
Enable INT8 transpose kernel for MobileNet-SSD improvement. ( #16159 )
...
* Enable INT8 transpose kernel for MobileNet-SSD improvement.
test=develop
* Refine the license year.
test=develop
* Delete redundant code.
test=develop
* Add axis check.
test=develop
6 years ago
Xin Pan
3e9319f3ab
add more imperative layer tests.
...
test=develop
6 years ago
Xin Pan
7458114b5b
Merge pull request #16228 from panyx0718/imperative
...
graph neural network for imperative mode
6 years ago
Kaipeng Deng
b77ebb2af2
Merge pull request #15919 from heavengate/yolo_box
...
add yolo_box for detection box calc in YOLOv3
6 years ago
Xin Pan
3be7e971ab
polish
...
test=develop
6 years ago
Xin Pan
50ff898378
graph neural network for imperative mode
...
test=develop
6 years ago
achao2013
81b4fad8b9
add moving average absmax op and fix bug ( #15155 )
...
* Add moving average absmax op in quantization-aware training.
6 years ago
Kaipeng Deng
74037cc1c8
Merge branch 'develop' into yolo_box
6 years ago
Xin Pan
92b9ce3479
Merge pull request #16073 from heavengate/yolov3_loss_imporve
...
Yolov3 loss: add mixup score and label smooth
6 years ago
qingqing01
8ad672a287
Support sync batch norm. ( #16121 )
...
* Support Sync Batch Norm.
* Note: do not enable it on a single device.
Usage:
    build_strategy = fluid.BuildStrategy()
    build_strategy.sync_batch_norm = True
    binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(
        loss_name=loss_mean.name,
        build_strategy=build_strategy)
6 years ago
Yibing Liu
4ae23cc3c5
Impl fp16 compute kernel for slice_op ( #16206 )
...
* Impl fp16 compute kernel for slice_op
test=develop
* Use data() to replace mutable_data()
6 years ago
sneaxiy
5a92e4c097
revert revert 16144
...
test=develop
6 years ago
sneaxiy
ad5f0e6018
merge develop
6 years ago
sneaxiy
55ba7f610b
fix numeric error
...
test=develop
6 years ago
Zeng Jinle
a91964c8fe
Revert "PaddingRNN model memory optimize"
...
test=develop
6 years ago
Zeng Jinle
0b49e43d3a
Merge pull request #16144 from sneaxiy/rnn_mem_opt
...
PaddingRNN model memory optimize
6 years ago