Yibing Liu
e8990e64f6
Fix trust ratio in lamb ( #17614 )
...
test=develop
6 years ago
Guo Sheng
2a7b321110
Fix the example code in some Python API. ( #17343 )
...
* Fix the example code in some Python API.
test=develop
* Fix the example code in some Python API by adding import.
test=develop
6 years ago
chengduo
b5f4d5ed0e
Add broadcast operators ( #17503 )
...
* This PR adds broadcast for multi-process. And it could be used in dynamic graph to broadcast parameters.
6 years ago
flame
2280f185d7
BuildStrategy api comment ( #17348 )
...
Python examples of fluid.layers.io.double_buffer and some BuildStrategy's methods.
6 years ago
Sylwester Fraczek
5b2a3c4b12
Conv concat relu quantization ( #17466 )
...
* add conv_concat_relu fuse
test=develop
* add test code
test=develop
* added missing include with unordered_map
test=develop
* review fixes for wojtuss
test=develop
* remove 'should (not) be fused' comment statements
one of them was invalid anyway
test=develop
6 years ago
Sylwester Fraczek
bccb0ba49a
fix quantize_squash_pass segfault when no tensor linked to Bias ( #17292 )
...
* fix quantize_squash_pass segfault when there is no tensor linked to Bias input
test=develop
* add googlenet test
test=develop
* fix concat CreateKey not using input format
test=develop
6 years ago
chengduo
2dc1c6f25c
Add profiler in tracer ( #17076 )
...
* add profiler in tracer.cc
* add profiler in layer.cc
test=develop
* add profiler in Layer.cc
test=develop
6 years ago
mozga-intel
0d4cbdad91
[NGraph] Enable elementwise mul operator ( #17552 )
6 years ago
tianshuo78520a
cee9dcc383
Delete LoDTensorset in API.spec ( #17577 )
...
* test=develop
* test=develop
* test=develop
* del #
6 years ago
mozga-intel
f2694e122d
[NGraph] Enable assign operator for a ngraph, test=develop ( #17437 )
...
* Enable assign operator for a ngraph, test=develop
* Cross_entropy operator needs to be updated
6 years ago
mozga-intel
cf02cb5e98
Enable elementwise sub operator for ngraph ( #17527 )
6 years ago
guru4elephant
7f8bc49d00
polish_executor_and_add_ctx_cache ( #17536 )
...
* polish_executor_and_add_ctx_cache
6 years ago
tensor-tang
7ae461eb13
[CPU] refine cpu softmax bwd ( #17534 )
...
* refine softmax fwd
test=develop
* refine cpu softmax bwd
test=develop
* fix batch size
test=develop
* fix compile issue with gpu
test=develop
* add value clip
6 years ago
Yibing Liu
6e11f97708
Add exponential moving average ( #17562 )
...
* Add exponential moving average
test=develop, test=document_preview
* Polish documents
test=develop, test=document_preview
* Update API spec
test=develop, test=document_preview
6 years ago
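For readers unfamiliar with the technique this commit names, here is a minimal sketch of an exponential moving average update. It is a generic illustration, not the API added in the PR: the function names `ema_update` and `ema_series` and the `decay` default are assumptions made for the example.

```python
def ema_update(shadow, value, decay=0.999):
    """Blend the previous average with the new value (generic EMA rule)."""
    return decay * shadow + (1.0 - decay) * value


def ema_series(values, decay=0.999):
    """Return the running EMA of a sequence, seeded with its first value."""
    shadow = values[0]
    averaged = []
    for value in values:
        shadow = ema_update(shadow, value, decay)
        averaged.append(shadow)
    return averaged


if __name__ == "__main__":
    # A noisy parameter trajectory (hypothetical data) is smoothed by the average.
    print(ema_series([1.0, 3.0, 2.0, 4.0], decay=0.5))
```

A decay close to 1 makes the average react slowly to new values, which is why such averages are often kept over model parameters during training.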
tensor-tang
0600b370ea
[CPU] refine softmax op fwd on CPU ( #17522 )
...
* refine softmax fwd
test=develop
* fix compile issue with gpu
test=develop
* add value clip to avoid exp
6 years ago
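The "value clip" bullet above is cut short, so the exact change is not recoverable from this log. As a hedged illustration of the general idea, the sketch below shows a softmax that subtracts the row maximum and clips the shifted values to a lower bound before exponentiating. The function name `stable_softmax` and the bound `clip_min=-64.0` are assumptions for the example, not values taken from the PR.

```python
import math


def stable_softmax(row, clip_min=-64.0):
    """Softmax over one row with max-subtraction and a lower clip (assumed bound)."""
    row_max = max(row)
    # After subtracting the maximum every value is <= 0, so exp cannot overflow.
    # Clipping from below keeps exp() away from extremely small results.
    shifted = [max(x - row_max, clip_min) for x in row]
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(stable_softmax([1000.0, 0.0, -1000.0]))
```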
Zeng Jinle
c6189637cd
Fix allocator bug ( #16712 )
...
* Revert "Revert "Fix allocator bug""
This reverts commit 174d0d0b90.
* Revert "fix travis ci"
This reverts commit 5656fa9f7c.
test=develop
* add inlined_vector.h, test=develop
* add inlined_vector_test,test=develop
6 years ago
mozga-intel
035771512d
Enable elementwise min operator for ngraph ( #17521 )
6 years ago
Kaipeng Deng
cf60e5a2db
fix API python example ( #17226 )
...
* fix api example. test=develop
* fix API.spec. test=develop
* fix spectral_norm format. test=develop
* merge develop
* add import. test=develop
* fix indent. test=develop
* fix indent. test=develop
* add import fluid. test=develop
6 years ago
Qiao Longfei
92e7d5d7cc
fix distribute doc test=develop ( #17318 )
...
* fix distribute doc
6 years ago
jerrywgz
c1aae8b8d2
Fix GetExpectedKernelType in Concat op ( #17459 )
...
* fix concat op vartype check, test=develop
6 years ago
Qiao Longfei
58f7695ab2
Async exe support communicator ( #17386 )
...
Async exe support communicator
6 years ago
Zhaolong Xing
38da103034
temporarily fix trt ci bug. ( #17565 )
...
Disable all trt unit tests for now; will fix them later.
test=develop
6 years ago
mozga-intel
109b5aed5a
[NGraph] Enable reshape operator test=develop ( #17512 )
6 years ago
zhang wenhui
9bb6a421e3
fix bpr_loss data_norm teacher_student_sigmoid_loss api & fix continuous_value_model ( #17331 )
...
* fix bpr data_norm teacher_student_sigmoid , test=develop test=document_preview
Fixed the three APIs bpr_loss, data_norm and teacher_student_sigmoid_loss, and also fixed English spelling errors in the continuous_value_model documentation.
6 years ago
lijianshe02
300bd7504d
fix api-doc related bugs test=develop test=document_preview ( #17360 )
...
* fix api doc according to the reviewer's comment test=develop
6 years ago
lijianshe02
daf88968e2
fix bug that saved optimal model path in test_analyzer_save_model con… ( #17555 )
...
* modify saved model path in analyzer_save_model.cc test=develop
6 years ago
Krzysztof Binias
43d15b9d96
Enable square operator for the nGraph Bridge. ( #17551 )
...
test=develop
6 years ago
Sevin F. Varoglu
f86f49e779
[NGraph] add increment op to ngraph engine ( #16929 )
...
* add increment op to ngraph engine
test=develop
* fix style errors
test=develop
6 years ago
baojun
8923612b10
NGraph enable parse serialized graph test=develop ( #17453 )
6 years ago
Yiqun Liu
cf5d271c5a
Fix examples of fluid.layers.sums and fluid.layers.DynamicRNN ( #17308 )
...
* Fix examples of fluid.layers.sums.
test=document_preview
* Correct the example of DynamicRNN and its functions.
test=develop
* Add 'import paddle.fluid as fluid' to examples.
test=develop
* Update API.spec.
test=develop
* Add space lines.
test=develop
* Update the API.spec.
test=develop
6 years ago
guomingz
2281ebf0f3
Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. ( #17130 )
...
* Relu6 is the bottleneck op for MobileNet-v2. Since MKL-DNN supports conv/relu6 fusion, this PR implements the fusion as a pass. INT8 support for this fusion will only be available in MKL-DNN v0.20, so this PR focuses on the FP32 optimization.
The table below shows the benchmark (FPS) measured on skx-8180 (28 cores):
Batch size | With fusion (FPS) | Without fusion (FPS)
-- | -- | --
1 | 214.7 | 53.4
50 | 1219.727 | 137.280
test=develop
* Fix the format issue
test=develop
* Add the missing nolint comments.
test=develop
* Fix the typos.
test=develop
* Register the conv_brelu_mkldnn_fuse_pass for the MKLDNN engine.
test=develop
* Adjust the indentation.
test=develop
* Add the test_conv_brelu_mkldnn_fuse_pass case.
test=develop
* Slightly update the code per Baidu comments.
Embed the parameter definition in the code.
That will make the code easier to understand.
test=develop
6 years ago
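The benchmark figures in the commit above can be turned into speedup factors with a few lines of arithmetic. The sketch below only reuses the FPS numbers quoted in the commit message; the helper name `fusion_speedup` is made up for the example.

```python
# FPS figures copied from the benchmark table in the commit message:
# batch size -> (FPS with fusion, FPS without fusion)
benchmark_fps = {
    1: (214.7, 53.4),
    50: (1219.727, 137.280),
}


def fusion_speedup(with_fusion, without_fusion):
    """Ratio of fused to unfused throughput."""
    return with_fusion / without_fusion


speedups = {
    batch: fusion_speedup(*fps) for batch, fps in benchmark_fps.items()
}

if __name__ == "__main__":
    for batch, factor in speedups.items():
        print(f"batch size {batch}: {factor:.2f}x")
```

On these numbers the fusion gives roughly a 4x throughput gain at batch size 1 and close to 9x at batch size 50.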
Yibing Liu
f9796b1249
Add LAMB Optimizer support ( #17489 )
...
* Add LAMB optimizer
* Expose LAMB Optimizer's APIs
test=develop, test=document_preview
* Cleanup code & doc
test=develop, test=document_preview
* Update lamb optimizer's formula
test=develop
6 years ago
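For context on what a LAMB update involves, here is a minimal single-step sketch on plain Python lists. It follows the commonly published form of the rule (Adam-style moments, a weight-decay term, and a layer-wise trust ratio) and omits bias correction for brevity. It is not the operator added in this PR: the function names, hyperparameter defaults, and the fallback of 1.0 when a norm is zero are all assumptions for illustration.

```python
import math


def l2_norm(values):
    return math.sqrt(sum(x * x for x in values))


def lamb_step(w, g, m, v, lr=0.001, beta1=0.9, beta2=0.999,
              eps=1e-6, weight_decay=0.01):
    """One LAMB-style update for a single layer (bias correction omitted)."""
    m = [beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1.0 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Adam-style direction plus decoupled weight decay.
    r = [mi / (math.sqrt(vi) + eps) + weight_decay * wi
         for mi, vi, wi in zip(m, v, w)]
    w_norm, r_norm = l2_norm(w), l2_norm(r)
    # Trust ratio scales the step by ||w|| / ||r||; fall back to 1.0
    # when either norm is zero (an assumption of this sketch).
    trust = w_norm / r_norm if w_norm > 0.0 and r_norm > 0.0 else 1.0
    w = [wi - lr * trust * ri for wi, ri in zip(w, r)]
    return w, m, v


if __name__ == "__main__":
    w, m, v = lamb_step([1.0, 2.0], [0.1, -0.2], [0.0, 0.0], [0.0, 0.0])
    print(w)
```

With the trust ratio in place, the length of each step is `lr * ||w||` regardless of the gradient's magnitude, which is the property that lets LAMB use large batch sizes.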
mozga-intel
99ab57123c
Enabled ngraph elementwise max operator ( #17517 )
6 years ago
Tao Luo
3d19f44a89
remove unused SERIAL compiler option ( #17500 )
...
test=develop
6 years ago
zhaoyuchen2018
dfdcd91869
Add api doc code examples ( #17285 )
...
* Add api doc code examples
add or fix topk, squeeze, stack, StaticRNN,
StaticRNN memory in doc
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Add squeeze md5.
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Add import package
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
6 years ago
mozga-intel
1eb151752e
Enable abs operator for a ngraph test=develop ( #17436 )
6 years ago
lidanqing
36757ed203
Enabling resnet101, vgg16, vgg19 INT8v2 model tests ( #17468 )
...
* Add 6 models tests support in CMake
* enabling resnet101, vgg16, vgg19 INT8v2 model tests
test=develop
* remove SERIAL
test=develop
6 years ago
liuwei1031
ba70cc499e
fix security bugs : ( #17464 )
...
http://newicafe.baidu.com:80/issue/PaddleSec-33/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-28/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-25/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-24/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-21/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-20/show?from=page
test=develop
6 years ago
Zhaolong Xing
ff7f911b4d
add quant_dequant_moving_avg_max_abs op ( #17480 )
...
* add quant_dequant_moving_avg_max_abs op
test=develop
* add more note for quantdequant op
test=develop
6 years ago
Qiao Longfei
287de41c04
Optimize communicator flags ( #17494 )
...
* optimize communicator flag
* change flags in init py test=develop
6 years ago
liuwei1031
c3949f5699
remove two useless flags: enable_subgraph_optimize, memory_optimize_debug, test=develop ( #17491 )
6 years ago
liuwei1031
f82e4d75e7
improve the doc of paddle.fluid.memory_optimize, test=develop ( #17473 )
...
* improve the doc of paddle.fluid.memory_optimize, test=develop
* fix typo, test=develop
6 years ago
Tao Luo
32da5e9c3d
remove unused expected_kernel_cache_pass ( #17486 )
...
test=develop
6 years ago
wopeizl
ca3ba378c7
fix the random compilation failure on windows test=develop ( #17475 )
...
* fix the random compilation failure on windows
6 years ago
lvmengsi
10b23a72c1
Double backward elementwise div ( #17416 )
...
* double backward, elementwise_div
* fix dx empty. test=develop
* bug fix (#17392 )
fix secure bug
* Enable stack operator for a Ngraph, test=develop (#17406 )
* fix sqrt_grad_grad unittest. test=develop (#17410 )
* fix sqrt_grad_grad unittest. test=develop
* disable sqrt_grad_grad unittest. test=develop
* test=develop, fix unittest
* test=develop, fix unittest
* test=develop, fix unittest
* test=develop, fix bug
* fix unittest. test=develop
* fix unittest dx. test=develop
* tmp fix! for test... test=develop
* reduce tmp, test=develop
* test=develop, reduce tmp
* fix broadcast unittest. test=develop
* fix format. test=develop
* refine code. test=develop
* refine code. test=develop
* refine GetDoubleGradSafeTensor. test=develop
* fix format. test=develop
6 years ago
qingqing01
97f0ec2357
Fix compiling error with cuDNN 5.1 ( #17458 )
...
test=develop
6 years ago
Zeng Jinle
3d4e8268c6
fix recurrent fwd bug when no backward and scope clear ( #17460 )
6 years ago
lvmengsi
977e9fcb27
support elementwise_sub double backward ( #17476 )
...
add elementwise_sub_grad_grad op for backward of backward calculation
6 years ago
jiaqi
75cda4d9df
fix data_feed_desc.py example run error ( #17452 )
...
* fix data_feed_desc.py example run error
test=develop
test=document_preview
* fix data_feed_desc.py example display error
test=develop
test=document_preview
* update API.spec for DataFeedDesc
test=develop
test=document_preview
6 years ago
chengduo
5a6ab38013
Add record event And remove CSP ( #17447 )
...
* add record_event
test=develop
* remove csp
test=develop
6 years ago
Yan Xu
0217555530
polish parallel dygraph code ( #17164 )
...
* add var grad hook test=develop
6 years ago
Jiabin Yang
d7df4e5e5b
Fix/Fix memory leak in dygraph ( #17394 )
...
* test=develop, add gradient sort backward strategy
* test=develop, fix test by adding FLAGS_cudnn_deterministic on new tests
* test=develop, fix memory leak in dygraph mode
* test=develop, fix memory leak in dygraph mode
* test=develop, polish code
* test=develop, polish code
* test=develop, polish code
6 years ago
Qiao Longfei
728bbaa4e3
add cache_update_mutex_ for operator test=develop ( #17124 )
...
* add cache_update_mutex_ for operator
6 years ago
Bai Yifan
3a9ae28d32
fix assert,test=develop ( #17445 )
6 years ago
zhaoyuchen2018
b02f2aff04
Add conditional compile for gru opt ( #17368 )
...
* improve gru unit performance.
refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Add conditional compile for gru opt
Do not enable the gru optimization if compute capability < 700
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* refine code.
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
6 years ago
liuwei1031
6a53fa95e7
improve the API Sample of DataFeeder, memory_optimize and release_memory ( #17374 )
...
* improve the API Sample of DataFeeder, memory_optimize and release_memory, test=develop
* update API.spec, test=develop, test=document_preview
* tweak the code format of feed API, test=develop
* update API.spec, test=develop
* improve doc for DataFeeder and default_main_program, test=develop
6 years ago
guru4elephant
43c9561e9a
add inductive shape index ( #17435 )
...
add inductive shape index
6 years ago
Zeng Jinle
712bfb17cb
fix recurrent_op,test=develop ( #17433 )
6 years ago
Tao Luo
5babcd02dd
Revert "remove unnecessary prepare_data ( #17080 )" ( #17432 )
...
This reverts commit aca60e9a20.
6 years ago
chengduo
e336dc86bb
[Speed] Refine the Executor when the num_thread=1 ( #17405 )
...
Refine the Executor when the num_thread=1
6 years ago
Jie Fang
30e178fa2c
init auto loss scaling ( #17194 )
...
* init auto loss scaling
test=develop
* change API.spec
* change ifelse to switch and use reduce_sum to optimize checking isfinite
test=develop
* Remove redundant code
test=develop
6 years ago
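The commit above mentions using reduce_sum to speed up the isfinite check. A rough sketch of that idea, together with a typical dynamic loss-scale update, is given below. It is an illustration in plain Python, not the code from the PR: the names `all_finite` and `update_loss_scale` and the parameters `incr_every_n`, `incr_ratio`, and `decr_ratio` are assumptions.

```python
import math


def all_finite(grads):
    """Check finiteness with a single reduction instead of per-element tests.

    Any inf or nan entry makes the sum non-finite. The check is conservative:
    a sum of very large finite values may also overflow and be rejected.
    """
    return math.isfinite(sum(grads))


def update_loss_scale(scale, good_steps, grads,
                      incr_every_n=1000, incr_ratio=2.0, decr_ratio=0.5):
    """Return (new_scale, new_good_steps, apply_update)."""
    if all_finite(grads):
        good_steps += 1
        if good_steps >= incr_every_n:
            # Enough clean steps in a row: try a larger scale.
            return scale * incr_ratio, 0, True
        return scale, good_steps, True
    # Overflow detected: shrink the scale and skip this update.
    return max(scale * decr_ratio, 1.0), 0, False


if __name__ == "__main__":
    print(update_loss_scale(1024.0, 0, [0.5, float("inf")]))
```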
Zhen Wang
4a1b7fec96
Add setting Scope function for the graph class ( #17417 )
...
* add set_not_owned function for graph
* add scope set. test=develop
* add scope_ptr enforce not null before setting.test=develop
6 years ago
mozga-intel
6ee6700fac
Enable stack operator for a Ngraph, test=develop ( #17406 )
6 years ago
flame
e48dd92fc8
bug fix ( #17392 )
...
fix secure bug
6 years ago
jiaqi
66d51206b1
add save/load model, shrink table, cvm, config file & fix pull dense bug ( #17118 )
...
* add save/load model, shrink table, cvm, config file & fix pull dense bug
test=develop
* fix global shuffle bug, fix pull dense bug, fix release memory bug, fix shrink error
add client flush, add get data size
test=develop
* fix global shuffle bug
test=develop
* fix global shuffle bug
test=develop
* fix code style
test=develop
* fix code style & modify pslib cmake
test=develop
* fix error of _role_maker
test=develop
* fix code style
test=develop
* fix code style
test=develop
* fix code style
test=develop
* fix code style
test=develop
* fix code style
test=develop
* fix windows compile error of fleet
test=develop
* fix global shuffle bug
* add comment
test=develop
* update pslib.cmake
test=develop
* fix fill sparse bug
test=develop
* fix push sparse bug
test=develop
6 years ago
Krzysztof Binias
0823a7bc8b
Optimize the sequence padding op ( #17403 )
...
test=develop
6 years ago
baojun
1ce7b45b9e
NGraph Added fill_zeros_like op test=develop ( #17295 )
6 years ago
baojun
910196524d
NGraph Added dropout and dropout_grad to ngraph test=develop ( #17320 )
6 years ago
mozga-intel
b189480734
Ngraph Enable gather operator test=develop ( #17296 )
6 years ago
lvmengsi
4ef631013c
Double backward sqrt ( #17387 )
...
* double backward sqrt
* refine unittest. test=develop
* refine test. test=develop
* remove alpha in unittest. test=develop
6 years ago
JesseyXujin
829fcc98fb
Fix some APIs' example
...
* test=develop
* test=develop
* test=develop
6 years ago
Zeng Jinle
eab34b2df6
fix_dygraph_mem_leak, test=develop ( #17396 )
6 years ago
lvmengsi
5d1ac41b00
Double backward reduce mean ( #17372 )
...
* test=develop, double backward reduce_mean
* add comment. test=develop
* fix format. test=develop
* rename GradGrad -> DoubleGrad. test=develop
* fix op_use_default_grad_op_maker.spec. test=develop
6 years ago
jerrywgz
0cae5a36b6
enhance generate mask labels, test=develop ( #17380 )
6 years ago
Kaipeng Deng
bd9bef5a4e
add elementwise_add_grad_grad op ( #17366 )
...
* add elementwise_add_grad_grad op. test=develop
* use defined GradMaker. test=develop
6 years ago
jerrywgz
1c6d064627
add collect fpn proposals op,test=develop ( #16074 )
...
* add collect fpn proposals op,test=develop
6 years ago
Kaipeng Deng
60be66e2c0
support fc_op double grad ( #17317 )
...
* add double grad for mul_op. test=develop
* fix format. test=develop
* fix format. test=develop
* fix format. test=develop
* refine code. test=develop
* remove setzero. test=develop
* fix dx/dy init bug. test=develop
* fix format. test=develop
6 years ago
Zhen Wang
ad8bbe587e
Fix some api example codes' bugs and these APIs include load_inference_model, load_vars, save_vars, L1DecayRegularizer and L2DecayRegularizer. ( #17324 )
...
* fix some api example codes' bugs.
* update API.spec. test=develop test=document_preview
* add import fluid. test=develop test=document_preview
6 years ago
Tao Luo
68ec0a6f74
make parallel_executor support FLAGS_use_mkldnn ( #17341 )
...
* make parallel_executor support FLAGS_use_mkldnn
test=develop
* add warning when set mkldnn_enabled_op_types_ in non-mkldnn env
test=develop
6 years ago
liuwei1031
0863599323
Fix the uninitialized gru_value.output_value. ( #17197 )
...
test=develop
6 years ago
zhoukunsheng
2ff7ea3337
Expose sign op ( #17117 )
...
* test=develop
add sign op
* Update nn.py
test=develop
delete stop_gradient assignment
6 years ago
tianshuo78520a
f0acc36684
test=develop ( #17357 )
6 years ago
Yihua Xu
218d8d8f73
Optimize the computing kernel of sequence_reverse operator ( #17349 )
...
* Optimize the computing kernel of sequence_reverse operator.
test=develop
* Clean code
test=develop
* Fix for cpplint syntax checking.
test=develop
* Fix the compile warning issue.
test=develop
6 years ago
Yiqun Liu
dcda20233c
Optimize the elementwise op using eigen ( #15494 )
...
* Optimize the elementwise op with CUDA kernels.
test=develop
* Support setting of attr in op config file.
test=develop
* Add the support the setting dtype and initializer in config.
test=develop
* Save workspace.
* Add initializer "zeros".
test=develop
* Fix compiling error.
* Support the use of an existing file to initialize tensor in op_tester.
* Use eigen to optimize the elementwise_add/mul for the case that x and y have the same dims.
test=develop
6 years ago
Jiabin Yang
4624d7c642
test=develop, add gradient sort backward strategy ( #17125 )
...
* test=develop, add gradient sort backward strategy
* test=develop, fix test by adding FLAGS_cudnn_deterministic on new tests
6 years ago
qingqing01
1d0ba5e815
Fix the example code in some Python API ( #17333 )
...
* Fix the example code in some Python API
* Update paddle/fluid/API.spec
* Fix some examples format
6 years ago
Kaipeng Deng
8bae8590ac
add double grad for elementwise_mul op ( #17255 )
...
* add double grad for elementwise_mul. test=develop
* remove comment. test=develop
* fix grad sum. test=develop
* fix for axis expand. test=develop
* add test for axis expand. test=develop
6 years ago
Kaipeng Deng
11d3a38f25
add double grad for square op ( #17173 )
...
* add double grad for square. test=develop
* format code. test=develop
* fix for grad sum. test=develop
* refine shape. test=develop
* refine extract. test=develop
6 years ago
Jiabin Yang
31536016ea
test=develop, test=document_preview, fix 13 api doc and code ( #17293 )
...
* test=develop, test=document_preview, fix all 13 api doc and code
* test=develop, fix rst
* test=develop, refresh API.spec
6 years ago
chengduo
bc833945a4
Add DropLocalExeScopes in ParallelExecutor ( #17297 )
...
* reset drop local scope counter
test=develop
6 years ago
zhoukunsheng
d4b67e1692
Add Where Op( #16793 )
6 years ago
zhoukunsheng
1bfff02047
Add Diag Op( #17027 )
6 years ago
zhaoyuchen2018
8a2caacdbc
improve gru unit performance. ( #16338 )
...
refine code
fuse cublas calling and kernels into one cuda kernel.
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
6 years ago
SunGaofeng
ddb24d48c5
test=develop ( #17322 )
6 years ago
qingqing01
e32c9888f5
Double backward of conv2d. ( #17211 )
...
* Add conv2d_grad_grad_op
* Extract the cuDNN conv algo searching code into conv_cudnn_helper.h.
- Now use it in conv2d_grad_grad.
- Will simplify the searching code in conv2d and conv2d_grad in the next PR.
* Enhance and fix bug in unit testing of gradient_checker.
* Support fetching empty variables, returning None in Python.
6 years ago
Zeng Jinle
5e5e7b3305
fix data_type error message ( #17312 )
...
test=develop
6 years ago
Zeng Jinle
fff270eacd
follow comments,test=develop ( #17273 )
6 years ago
Zhaolong Xing
7a3bb061d8
fix: ( #17279 )
...
1. inference multi card occupy
2. facebox model inference occupy too much
test=develop
6 years ago
xiaoting
50ad9046c9
add import, test=develop ( #17229 )
6 years ago
zhoukunsheng
4292bd8687
Mod floordiv ( #17251 )
...
* test=develop
add elementwise_mod and elementwise_floordiv, fix equation problem in elementwise_mod
6 years ago
guru4elephant
5d6a1fcf16
fix infer_from_dataset and train_from_dataset ( #17243 )
...
* fix train_from_dataset and infer_from_dataset example
* add inductive dim for data_reader, example: with shape=[-1, 1], the -1 is inferred at run time from the number of elements read
6 years ago
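The "inductive dim" described in the commit above can be pictured as resolving a single -1 entry in a declared shape once the number of elements is known. The sketch below illustrates that resolution in plain Python. The function name `resolve_shape` is hypothetical and the code is not taken from the data reader changed in the PR.

```python
def resolve_shape(shape, num_elements):
    """Replace a single -1 in `shape` so the shape holds `num_elements` items."""
    unknown = [i for i, dim in enumerate(shape) if dim == -1]
    if len(unknown) > 1:
        raise ValueError("at most one dimension may be -1")

    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim

    if not unknown:
        if known != num_elements:
            raise ValueError("shape does not match the number of elements")
        return list(shape)

    if known == 0 or num_elements % known != 0:
        raise ValueError("number of elements is not divisible by known dims")

    resolved = list(shape)
    resolved[unknown[0]] = num_elements // known
    return resolved


if __name__ == "__main__":
    # Seven values read at run time fill a [-1, 1] slot as [7, 1].
    print(resolve_shape([-1, 1], 7))
```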
chengduo
516317cf91
use sync copy ( #17291 )
...
test=develop
6 years ago
Huihuang Zheng
2c4462711f
Fix API example code of save_inference_model ( #17274 )
...
* Fix API example code of save_inference_model
test=develop
* Add "import" in example of save_inference_model
* Fix typo "exsample" -> "example"
test=develop
6 years ago
xiaoting
9ed4aaada4
modified formula for Lrn ( #17281 )
...
* modified formula for lrn
test=develop
* modified api.spec
test=develop
6 years ago
zhaoyuchen2018
792443ef23
Refine elementwise kernel. ( #16952 )
...
* Refine elementwise kernel.
Add a simple cuda kernel if grad x and y both exist
Use 2D block cuda kernel to do broadcast.
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* refine code.
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* refine code.
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
6 years ago
lujun
e388a1fb66
Repair api example ( #17221 )
...
Fix the following API examples:
paddle.fluid.scope_guard
paddle.fluid.backward.append_backward
paddle.fluid.cpu_places
paddle.fluid.cuda_pinned_places
paddle.fluid.cuda_places
paddle.fluid.in_dygraph_mode
paddle.fluid.CUDAPlace
paddle.fluid.CPUPlace
paddle.fluid.CUDAPinnedPlace
6 years ago
Yiqun Liu
6b84688ba2
Optimize the cuda implementation of sum_op ( #17283 )
...
* Optimize the cuda implementation of sum_op, which adds two lod_tensors inplace.
test=develop
* Use eigen to add two tensors.
test=develop
6 years ago
chengduo
db5e74ab95
update assert ( #17282 )
...
test=develop
6 years ago
Hongyu Liu
c3195de522
Fix concat shape check ( #17247 )
...
* fix shape_check; test=develop
* fix format; test=develop
* fix format; test=develop
* fix ddim bug; test=develop
* fix c++ format; test=develop
* change function name; test=develop
6 years ago
lvmengsi
dab71e8d97
Fix api example ( #17231 )
...
* fix API examples, test=develop
6 years ago
whs
7d7e29957f
Fix bp of roi perspective transform op. ( #17216 )
6 years ago
baojun
7bd1d03ee5
Adding lrn op for ngraph engine ( #17189 )
...
* added lrn op test=develop
* Added CreateConstant method test=develop
* avoid duplicates test=develop
6 years ago
Wojciech Uss
984aa90583
improved unit test output ( #17266 )
...
added printing data type to differentiate int8 and fp32 latency results
test=develop
6 years ago
chengduo
8f534696b7
Polish Executor and Compiler doc ( #17262 )
...
* polish doc
test=develop
* update parallel executor doc
test=develop
* update API.spec
test=develop
* polish code
test=develop
6 years ago
tianshuo78520a
dd86b40058
document_preview ( #17166 )
...
* document_preview
* change name
* document
* add document_preview.sh
* add document_preview.sh
* add paddle_build.sh
* nohup python
* change port runserver
* test doc
* test=develop
* test=develop
* test=develop
* add git clone FluidDoc,PaddlePaddle.org
* change PaddlePaddle.org
* Add port comment
* change directory
* change PADDLE_ROOT
6 years ago
gongweibao
91784f8ec3
Fix code in document. ( #17237 )
6 years ago
chengduo
04bd413acb
Code Clean: Move all pass to paddle::framework::ir ( #17228 )
...
* move pass to ir
* polish code
test=develop
* fix dependency
test=develop
6 years ago
Huihuang Zheng
648320bb6c
Fix some data and reader related API code ( #17202 )
...
* Fix data and reader related api doc
* Fix data and reader related api doc
Review and fix the example code in some reader related API doc.
These APIs are:
Fix existing API example codes:
paddle.fluid.io.PyReader
paddle.fluid.layers.batch
paddle.fluid.layers.data
paddle.fluid.layers.Preprocessor
paddle.fluid.layers.py_reader
paddle.fluid.program_guard
Add new example codes:
paddle.fluid.io.PyReader.decorate_batch_generator
paddle.fluid.io.PyReader.decorate_sample_generator
paddle.fluid.io.PyReader.decorate_sample_list_generator
paddle.fluid.io.PyReader.reset
paddle.fluid.io.PyReader.start
test=develop
* Add changes to API.spec after changing doc.
test=develop
* Add blanks after python example code
test=develop
* Add blank line at py_reader example code
test=develop
* Merge API.spec
test=develop
* Modify reader.py based on reviewer's comment
test=develop
* Modify API.spec after changing doc
test=develop
* Change reader.py based on reviewer's comment
* Modify example code of decorate_sample_generator
test=develop
* Fix example code of PyReader based on reviewer
test=develop
6 years ago
Zeng Jinle
f2fa3f7300
fix api doc,test=develop ( #17241 )
6 years ago
Zeng Jinle
4f8594088d
Enhance inplace/mem-opt pass and enhance softmax_with_cross_entropy op inplace ( #17225 )
...
* add use_cuda to inplace pass,test=develop
* add test softmax_with_xe_inplace test,test=develop
* fix potential inplace bug
test=develop
* add more skip vars in mem opt pass,test=develop
* follow comment,test=develop
* follow comments,move duplicate out arg check to program->graph,test=develop
6 years ago
baojun
e782b54b9c
update softmax with axis arg test=develop ( #17190 )
6 years ago
tensor-tang
71f0c6d5bd
fix api doc of hash, relu, concat, argmin, argmax, argsort and all activations ( #17235 )
...
* fix api doc of hash, relu, concat, argmin, argmax, argsort and all activations funcs with no attrs
test=develop
* refine doc example code
test=develop
* remove >>> in doc example
test=develop
* refine python code block
test=develop
* update API spec
test=develop
6 years ago
Zeng Jinle
6fafd37e12
fix retry_allocator ( #17245 )
...
test=develop
6 years ago
Tao Luo
ff1661f12a
remove unused FLAGS_warpctc_dir ( #17162 )
...
* remove unused FLAGS_warpctc_dir
test=develop
* remove FLAGS_warpctc_dir
test=develop
6 years ago
Kaipeng Deng
a71d8fdb87
Softmax_cross_entropy op add axis ( #16806 )
...
* add attr axis infershape. test=develop
* add CUDA kernel. test=develop
* fix unittest. test=develop
* fix unittest for soft_label. test=develop
* fix fp16 unittest. test=develop
* remove comment code. test=develop
* refine test for axis. test=develop
* add python api. test=develop
* fix doc. test=develop
* fix fp16 unittest. test=develop
* fix ngraph test. test=develop
* fix ENFORCE for test_imperative_transformer. test=develop
* fit for ngraph test. test=develop
* fix after rebase develop. test=develop
* fix doc. test=develop
* fix API.spec. test=develop
* fix test_layers. test=develop
* fix format. test=develop
6 years ago
songhao
c2e20e2a29
fix build warning like 'comparison between signed and unsigned ( #17240 )
...
integer', test=develop
6 years ago
Zhen Wang
a914d9b116
Quant output scale ( #17215 )
...
* Add MovingAverageAbsMaxScale operator which is only used for calculating the quantization scale.
* test=develop
* change the output into inplace. test=develop
* Revert "test=develop"
This reverts commit 696cf62699ba1e1c98f61f7345ac7060010eb29a.
* Revert "change the output into inplace. test=develop"
This reverts commit a19acd20f07eee82622701a3015e6e9c073a5e0b.
* test=develop.
* update the MovingAverageAbsMaxScaleOp test. test=develop
6 years ago
zhaoyuchen2018
32b62c25af
optimize sum op ( #16820 )
...
* optimize sum op
fuse multi eigen kernel calls into one cuda kernel.
refine code
test=develop.
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine code.
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine code according to comments.
test=develop
* refine code
delete sum_op_gpu.h
test=develop
* Fix test error.
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* refine code in format.
test=develop.
* refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
6 years ago
石晓伟
a72dbe9abf
Cherry-pick benchmark related changes from release/1.4 ( #17156 )
...
* cherry-pick commit from 8877054
* cherry-pick commit from 3f0b97d
* cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn
(cherry picked from commit 8643dbc233)
* Cherry-Pick from 16662 : Anakin subgraph cpu support
(cherry picked from commit 7ad182e16c)
* Cherry-pick from 1662, 16797.. : add anakin int8 support
(cherry picked from commit e14ab180fe)
* Cherry-pick from 16813 : change singleton to graph RegistBlock
test=release/1.4
(cherry picked from commit 4b9fa42307)
* Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2
Support ShuffleNet and MobileNet-v2, test=release/1.4
(cherry picked from commit a6fb066f90)
* Cherry-pick : anakin subgraph add opt config layout argument #16846
test=release/1.4
(cherry picked from commit 8121b3eccb)
* 1. add shuffle_channel_detect
(cherry picked from commit 6efdea8997)
* update shuffle_channel op convert, test=release/1.4
(cherry picked from commit e4726a066f)
* Modify symbol export rules
test=develop
6 years ago
Tao Luo
16922e0093
fix api_example of tree_conv ( #17239 )
...
test=develop
6 years ago
jerrywgz
ef66baedc0
Refine api doc ( #17230 )
...
* refine api comment, test=develop
6 years ago
Leo Zhao
54636a1982
call SetNumThreads every time to avoid missing omp thread setting ( #17224 )
...
* call SetNumThreads every time to avoid missing omp thread setting
resolve #17153
test=develop
* add paddle_num_threads into config for test_analyzer_pyramid_dnn
resolve #17153
test=develop
6 years ago
Yibing Liu
6b0f27e802
Fix some APIs' example ( #17214 )
6 years ago
ruri
5817077c99
Fix unexecutable API examples ( #17218 )
...
* fix unexecutable API comments, test=develop
* add API.spec,test=develop
6 years ago
jerrywgz
cc95a7516c
fix distribute fpn proposals, test=develop ( #16152 )
...
* fix distribute fpn proposals, test=develop
6 years ago
Tao Luo
9ec4615deb
fix profiler and name_scope API examples ( #17212 )
...
* fix profiler and name_scope API examples
test=develop
* update API.spec
test=develop
6 years ago
Zeng Jinle
c5eeecca7c
Fix tensor_py.h ( #17195 )
...
* fix tensor_py,test=develop
* change class name,test=develop
6 years ago
Zeng Jinle
ee2028a110
Add use_cuda to inplace pass ( #17205 )
...
* add use_cuda to inplace pass,test=develop
* add test softmax_with_xe_inplace test,test=develop
6 years ago
chengduo
950aec55fd
It doesn't need sync when fetch_list is not empty ( #17201 )
...
test=develop
6 years ago
jerrywgz
a72907bbf4
Enhance concat op to support empty input. ( #17015 )
...
* enhance_concat, test=develop
6 years ago
wopeizl
83c4f7721f
use two GPUs to run the exclusive test test=develop ( #17187 )
6 years ago
chengduo
3c6ab799cd
Remove unnecessary set_devices ( #17158 )
...
* remove unnecessary set_devices
6 years ago
guru4elephant
f938ccec62
remove async executor python api to fix document ( #17174 )
...
* remove async executor python api
test=develop
* remove test_async_executor.py
add executor train_from_dataset demo
test=develop
* fix import bug
test=develop
6 years ago
Zeng Jinle
5dfe2ab9e8
Fix mem leak when converting Tensor to numpy array ( #17182 )
...
* fix mem leak when converting Tensor to numpy array
test=develop
* remove unused unittest,test=develop
* follow comments, test=develop
* fix dygraph bug,test=develop
6 years ago
Huihuang Zheng
e4a5332416
Fix a typo in gpu_info.cc ( #17175 )
...
test=develop
6 years ago
tensor-tang
79ed1c76cd
fix bn fuse vardesc and add model saver ( #17143 )
...
* fix bn fuse vardesc and add model saver
test=develop
* unify save model in test helper
test=develop
* fix mkdir on windows
test=develop
* remove magic number use bn bias var desc
test=develop
6 years ago
Zeng Jinle
4e1bc6e805
Rewrite inplace pass and fix gc bug ( #17126 )
...
* fix op graph view
test=develop
* rewrite inplace pass and fix reference count pass bug
test=develop
* fix unittest failed
test=develop
* follow comments, test=develop
6 years ago
Zeng Jinle
08773b6069
fix reader default stream,test=develop ( #17106 )
6 years ago
xiaoting
bc48453b73
polish the label_smooth ( #17138 )
...
* polish the label_smooth
test=develop
* polish code
test=develop
6 years ago
Leo Zhao
bf4b21fa3d
fix assertion failure issue when test_analyzer_bert uses ngraph ( #17148 )
...
resolve #17147
test=develop
6 years ago
tangwei12
deb510d451
cvm op feature ( #17081 )
...
cvm without LoD.
6 years ago
wopeizl
3acb3635c2
1. move the API check into CPU process ( #17110 )
...
* 1. move the API check into CPU process
2. adjust the check order
6 years ago
tianshuo78520a
92ce445227
Supplementary monitoring file reason explanation ( #17131 )
6 years ago
Zeng Jinle
28d69d710a
Refine dropout gpu memory ( #17095 )
...
* refine_dropout_mem,test=develop
* # This is a combination of 14 commits.
# The first commit's message is:
remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066 )
# This is the 2nd commit message:
Fleet unify distributed training (#16791 )
* implement distributed transpiler with fleet
# This is the 3rd commit message:
ParallelDyGraph with GPU collective mode (#16827 )
implement dygraph.parallel.DataParallel to hook reduce op.
# This is the 4th commit message:
Init mixed precision training interface (#16856 )
* Init mixed precision training interface
* Add fp16 test script
test=develop
* All initializers support float16
test=develop
* Code cleanup & add more code annotations
test=develop
* Update API spec
test=develop
* Add usage example in doc
test=develop
# This is the 5th commit message:
fix reference_count_pass,test=develop (#17060 )
test=develop
# This is the 6th commit message:
Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090 )
* Cache the information of linear interpolation in forward and use it in backward.
test=develop
* Fix cuda kernel.
test=develop
# This is the 7th commit message:
remove unnecessary prepare_data (#17080 )
test=develop
# This is the 8th commit message:
fix interpolate cu. test=develop (#17101 )
# This is the 9th commit message:
test=develop, double backward leaky_relu (#17067 )
backward of backward: leaky_relu
# This is the 10th commit message:
fix fuse optimizer ops (#17102 )
test=develop
# This is the 11th commit message:
truncated_gaussian_random supported in distributed training, test=develop (#17091 )
# This is the 12th commit message:
Detailed coordinate description for yolov3 loss (#17007 )
* Detailed coordinate description for yolov3 loss
test=develop
* modified api.spec
test=develop
* modified loss name
* fix api.spec
test=develop
* polish description
test=develop
* modified api.spec
test=develop
# This is the 13th commit message:
fix test_weight_decay (#17109 )
test=develop
# This is the 14th commit message:
Path flag (#17105 )
* fix python/paddle/fluid/__init__.py detecting problems
6 years ago
Huihuang Zheng
b9494058b3
Use CudnnWorkspaceHandle in exhaustive search ( #17082 )
...
1. Use CudnnWorkspaceHandle in exhaustive search of conv_cudnn.
2. For Ops using CudnnWorkspaceHandle in exhaustive search, release their GPU memory after exhaustive search.
test=develop
6 years ago
tianshuo78520a
2192e7bb61
Path flag ( #17105 )
...
* fix python/paddle/fluid/__init__.py detecting problems
6 years ago
xiaoting
7da7881c0e
Detailed coordinate description for yolov3 loss ( #17007 )
...
* Detailed coordinate description for yolov3 loss
test=develop
* modified api.spec
test=develop
* modified loss name
* fix api.spec
test=develop
* polish description
test=develop
* modified api.spec
test=develop
6 years ago
chengduo
794a195881
fix fuse optimizer ops ( #17102 )
...
test=develop
6 years ago
ceci3
258e000be6
test=develop, double backward leaky_relu ( #17067 )
...
backward of backward: leaky_relu
6 years ago
Kaipeng Deng
10c487eb21
fix interpolate cu. test=develop ( #17101 )
6 years ago
Tao Luo
aca60e9a20
remove unnecessary prepare_data ( #17080 )
...
test=develop
6 years ago
whs
55ce36e981
Speedup roi_perspective_transform op by caching the information of linear interpolation in forward ( #17090 )
...
* Cache the information of linear interpolation in forward and use it in backward.
test=develop
* Fix cuda kernel.
test=develop
6 years ago
Zeng Jinle
842ded14b0
fix reference_count_pass,test=develop ( #17060 )
...
test=develop
6 years ago
Yibing Liu
beda78258f
Init mixed precision training interface ( #16856 )
...
* Init mixed precision training interface
* Add fp16 test script
test=develop
* All initializers support float16
test=develop
* Code cleanup & add more code annotations
test=develop
* Update API spec
test=develop
* Add usage example in doc
test=develop
6 years ago
Yan Xu
0b07eef118
ParallelDyGraph with GPU collective mode ( #16827 )
...
implement dygraph.parallel.DataParallel to hook reduce op.
6 years ago
Tao Luo
d9cd989825
Merge pull request #17048 from luotao1/fix_runtime_cache_bug
...
fix runtime_context_cache bug when gpu model has an op runs only on cpu
6 years ago
wopeizl
f5d6937fe1
specify the cuda arch name and bin to decrease the compile time for i… ( #17020 )
...
1. specify the cuda arch name and bin to decrease the compile time for inference test=develop
2. simplify the script and add comments
3. remove the fluid process from cicheck
6 years ago
chengduo
cc31681687
use fast executor as default ( #17044 )
...
test=develop
6 years ago
chengduo
a2be4b4d91
Add fuse momentum ops ( #16745 )
...
* Add fuse momentum ops
6 years ago
guru4elephant
03d469ad98
Merge pull request #17005 from wopeizl/fix_ncclwrapper_win1
...
fix nccl wrapper on windows
6 years ago
tangwei12
13295d90d9
load persistables with selected rows, test=develop ( #17047 )
6 years ago
luotao1
490e746269
fix runtime_context_cache bug when gpu model has an op runs only on cpu
...
test=develop
6 years ago
Zeng Jinle
0c335dcd2c
Make conv cudnn workspace size configurable ( #17036 )
...
* make_conv_cudnn_ws_size_configurable, test=develop
* change std::max to std::min
test=develop
6 years ago
jerrywgz
ea3504c7ec
Merge pull request #17017 from jerrywgz/fix_potential_hung
...
fix potential hang in generate proposals, test=develop
6 years ago
qingqing01
c1c2633a63
Support backward of backward for Relu and add a new gradient checker by comparing theoretical and numerical Jacobian. ( #16862 )
...
* Support backward of backward and a new gradient checker
* Rename decorators.py to decorator_helper.py, since Python on Windows CI has decorators package.
1. Add ReluDoubleGradMaker when register relu_grad.
2. Add a new gradient checker by comparing theoretical and numerical Jacobian. Check double gradients by double_grad_check.
6 years ago
tangwei12
45136b1b41
fix bug in save, test=develop
6 years ago
jerrywgz
47013af0a6
Merge pull request #17011 from jerrywgz/enhance_generate_proposal_labels
...
enhance generate proposal labels, test=develop
6 years ago
tianshuo78520a
73a360b504
Cmakelists fix ( #17018 )
...
* fix cmakelist detecting problems
6 years ago
liuwei1031
a770ce0615
add doc for memory_optimize, test=develop ( #17010 )
...
* add doc for memory_optimize, test=develop
* update doc, test=develop
* doc update, test=develop
6 years ago
wopeizl
d9991dccdd
add parallel build script to ci … ( #16901 )
...
* add parallel build script to ci test=develop
* 1. classify the test case as single card/two cards/multiple cards type
2. run test case according to the run type
6 years ago
jerrywgz
b2df6de860
fix potential hang in generate proposals, test=develop
6 years ago
Zeng Jinle
24923f7604
fix py_reader demo ( #16997 )
...
test=develop
6 years ago
qingqing01
ea42e431f8
Speed unit testing. ( #16978 )
...
* Speed affine_channel_op unit testing
* Add check in tensor_py
* Fix ONLY_CPU Compiling
6 years ago
jerrywgz
d3a66fc616
enhance generate proposal labels, test=develop
6 years ago
wopeizl
51a0243a56
fix nccl wrapper on windows
...
test=develop
6 years ago
Zeng Jinle
1202d3fc74
Refine model gpu memory ( #16993 )
...
* speedup gc and inplace softmax_with_cross_entropy_grad
test=develop
* refine models gpu mem
Merge skip vars and warning messages of mem opt
remove relu mem opt
test=develop
* follow comments
test=develop
6 years ago
Yibing Liu
3c375751f8
Support seq len equal to 0 in sequence ops ( #16935 )
...
* Support seq len equal to 0 in sequence ops
test=develop
* Add more test cases
* Fix some comments
test=develop
* Fix py3 error
test=develop
6 years ago
Tao Luo
c017025531
Merge pull request #16981 from luotao1/disable_runtime_context_default
...
disable runtime_context_cache pass by default
6 years ago
Yibing Liu
36c05d36ab
Check some shapes only in runtime ( #16919 )
...
* Check some shapes only in runtime
test=develop
* Follow review comments
test=develop
* Update API spec
6 years ago
Tao Luo
aa7b975bf6
disable runtime_context_cache pass by default
...
test=develop
6 years ago
Zhaolong Xing
27cd3efdd1
Merge pull request #16969 from NHZlX/fix_trt_anakin_compile_rely
...
fix trt anakin subgraph compile rely
6 years ago
tianshuo78520a
3242e88b70
fix cmakelist detecting problems ( #16944 )
...
* fix cmakelist detecting problems
* test=develop
* test=develop
6 years ago
jiaqi
8bcba3db84
Merge pull request #16896 from xjqbest/develop
...
fix bug of num > INT_MAX
6 years ago
nhzlx
bc6b0ca1f4
fix trt anakin subgraph compile rely
...
test=develop
6 years ago
guru4elephant
bbc6c5714f
Merge pull request #16887 from guru4elephant/add_nccl_context_pybind
...
Add nccl context pybind
6 years ago
gongweibao
cbdb8a17b1
Polish DGC code ( #16818 )
6 years ago
lujun
dbf66dd034
Merge pull request #16954 from junjun315/fix-dygraph-checkpoint
...
Fix dygraph checkpoint bug
6 years ago
Tao Luo
aa9caa1691
Merge pull request #16951 from luotao1/reduce_ci_time
...
use multi-thread to speedup CI tests
6 years ago
Guo Sheng
9f1d4a152b
Merge pull request #16902 from guoshengCS/refine-infer-shape
...
Refine ENFORCE in infer_shape of gru_op and lstm_unit_op.
6 years ago
Guo Sheng
caf2848356
Merge pull request #16898 from Superjomn/fix/logical_op_infershape
...
fix logical op infershape
6 years ago