GaoWei8
d4dda8628e
optimize fc jit ( #21878 )
...
test=develop
5 years ago
zhouwei25
013225bb68
fix Execution order of ci_check_unittest, and add it to Linux_py35 ( #21640 )
5 years ago
Chen Weihang
2b941736f3
fix softmax_with_cross_entropy_fix bug, test=develop ( #21810 )
5 years ago
Thunderbrook
c3cf42d0f7
add table id in cache shuffle ( #21585 )
...
* general table
* add sparse table
test=develop
* no cvm
test=develop
* add no_cvm
test=develop
* add note
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* code style
test=develop
* add key of optimizer
test=develop
* solve pslib stop core
test=develop
* barrier
test=develop
* add notes
test=develop
* add table id in cache shuffle
test=develop
* table id
test=develop
* code style
test=develop
5 years ago
Michał Gallus
253e664275
Disable memory opt pass when DNNL is on ( #21826 )
...
* Disable memory opt pass when DNNL is on
* Refine comment above mem optimization pass enablement
test=develop
5 years ago
Chengmo
a86f11b5f5
Speed GEO dense calc & communication ( #21579 )
...
* test=develop, speed dense calc & communication
5 years ago
Wojciech Uss
666c3bb9b0
handle multi-inputs with empty inputs for mkldnn_concat_op ( #21827 )
...
test=develop
5 years ago
Zeng Jinle
aa4d6a5d6c
Add some debug flags to auto growth allocator ( #21766 )
...
* add some debug flags to auto growth allocator, test=develop
* add comments about auto growth, test=develop
5 years ago
guofei
8b7c50f49a
Make While Op runnable on GPU place and add while_loop unittest ( #21672 )
...
1. Make while_op accept GPU conditional data
2. Add more complex test cases for while_loop API
5 years ago
WangXi
17299b8d21
fix batch_norm_grad infer shape=0 & add allreduce enforce shape, test=develop ( #21801 )
5 years ago
Huihuang Zheng
557bce77da
Fix Backward Bugs in Conditional Block ( #21809 )
...
The fixed bugs:
1. The condition sub-graph is not pruned
2. When the backward graph is extremely simple, all backward ops are pruned.
5 years ago
xujiaqi01
0eb4d990c4
fix compiled error when with_pslib=on ( #21769 )
...
* fix compiled error of butil when with_pslib=on and with_testing=on
* test=develop
5 years ago
Huihuang Zheng
0677a1c1c1
Fix That conditional_block_op Doesn't Have InferShape ( #21733 )
5 years ago
zhaoyuchen2018
a5a8d14414
Fix softmax cuda bug ( #21720 )
...
* Fix softmax cuda bug
* Refine multihead log and softmax logic
5 years ago
Kaipeng Deng
943a44492b
yolo_box OP add Attr(clip_bbox). ( #21620 )
...
* yolo_box OP add Attr(clip_bbox). test=develop
5 years ago
Michał Gallus
a5159d8480
Re-enable vgg and resnet101 models download ( #21713 )
...
test=develop
5 years ago
Leo Chen
7181afd75c
Fix elementwise_pow bug on CUDA place with integer ( #21675 )
...
* fix elementwise_pow bug on integer, test=develop
* use llrint to support elementwise_pow_grad, test=develop
* add some tests, test=develop
* revert grad functor, test=develop
5 years ago
石晓伟
2bb135825e
fix analysis_predictor when func is called multiple times, test=release/1.6 ( #21665 )
5 years ago
lidanqing
d3a96632fa
Add fc-dequantize squash in cpu_quantize_squash_pass for ernie model ( #21714 )
...
* fc-dequantize squash
test=develop
* change according to reviews
test=develop
* change PADDLE_ENFORCE
test=develop
* add second test when fc-dequant does not fuse
test=develop
* change all related PADDLE_ENFORCE
test=develop
5 years ago
Chen Weihang
1fd1f06f11
Rename paddle throw error macro ( #21657 )
...
* rename paddle throw error macro, test=develop
* fix new error use case, test=develop
5 years ago
WangXi
8754cbd1f2
fix std::min type in nan_inf, test=develop ( #21725 )
5 years ago
Leo Chen
fbe3ac217e
polish cmake, test=develop ( #21681 )
...
* polish cmake, test=develop
* add current directory to LD_LIBRARY_PATH, test=develop
5 years ago
joanna.wozna.intel
d419b859c0
Add reshape int8 mkldnn op ( #21428 )
...
* Add reshape int8 op
test=develop
* Change test to CPUPlace
test=develop
* Correct tests
test=develop
5 years ago
WangXi
8a0f611b64
Rewrite check nan inf tools ( #21076 )
5 years ago
tangwei12
9ad940fdfe
memory leak for cpu ( #21174 )
...
* add fake init for the trainer, fix large memory hold in the trainer
* do not merge recv vars from a remote endpoint, test=develop
* add recv and save op, merge slice var in one op, save memory
* remove hsigmoid with pull sparse, test=develop
5 years ago
Zhaolong Xing
fbbd94a6ce
fix bug for inference using auto growth allocator ( #21621 )
...
test=develop
5 years ago
Zeng Jinle
73461a7ae6
Make OperatorWithKernel::InferShape abstract ( #21633 )
...
* make OperatorWithKernel::InferShape virtual, test=develop
* fix test_prepare_op by relu, test=develop
5 years ago
mapingshuo
686f0ecb6a
add `no_need_buffer_slots` interface to pybind ( #21575 )
...
* add no_need_buffer_slots interface to pybind
5 years ago
Zeng Jinle
6828f3684b
fix op_registry, add ignore op_function_impl.h, test=develop ( #21654 )
5 years ago
GaoWei8
5af0c7ba89
Modify padding strategy: remove weight copy in fc padding ( #21650 )
...
test=develop
5 years ago
Chen Weihang
d96acc3363
Refine dygraph DataLoader implementation ( #21634 )
...
* refine dygraph dataloader & polish related code, test=develop
* refine code based review comment, test=develop
5 years ago
wangchaochaohu
5eec8cf5af
fix the mean grad OP performance improvement test=develop ( #21658 )
5 years ago
Zeng Jinle
29f64c8c9e
refine some grad op makers, test=develop ( #21629 )
5 years ago
mapingshuo
e2d849b989
Dropout with seed ( #21590 )
...
* add seed op
5 years ago
Adam
e81f0228df
MKL-DNN 1.0 Update ( #20162 )
...
* MKLDNN v1.0 rebase to Paddle 1.6
test=develop
* Add hacky paddle::string::to_string() implementation
* vectorize<int64-t>() -> vectorize() cleanup
test=develop
* PADDLE_ENFORCE and void_cast fixes
test=develop
* Rebase changes
test=develop
* Cosmetics
test=develop
* Delete MKL from mkldnn.cmake
test=develop
* CMake debug commands
test=develop
* Delete MKLDNN_VERBOSE and rebase fixes
test=develop
* Rebase fixes
test=develop
* Temporarily disable int8 resnet101 vgg16 and vgg19 tests
test=develop
* Add libmkldnn.so.1 to python setup
test=develop
* Add libmkldnn.so.1 to inference_lib cmake after rebase
test=develop
* Post rebase fixes + FC int8 changes
test=develop
* Fix LRN NHWC
test=develop
* Fix NHWC conv3d
test=develop
* Windows build fix + next conv3d fix
test=develop
* Fix conv2d on AVX2 machines
test=develop
5 years ago
rensilin
7f5d532a9c
fix: fail to call ZeroCopyTensor::mutable_data() when device_id is no… ( #21461 )
...
* ZeroCopyTensor::mutable_data in the right device, test=develop
* add unittest for zerocopy, test=develop
5 years ago
xujiaqi01
f404157205
fix master patch when slot is dense ( #21580 )
...
* fix master patch when slot is dense
* test=develop
5 years ago
xujiaqi01
c05706fe73
fix code style of fleet_wrapper ( #21639 )
...
* fix code style of fleet_wrapper
* test=develop
5 years ago
wangchaochaohu
95b95a284b
Mean gpu optimize ( #21643 )
...
* accelerate mean op test=develop
5 years ago
Leo Chen
48600d7f17
Add op function generator for dygraph ( #21569 )
...
* add op function generator, test=develop
* add unittest, test=develop
* follow comments, test=develop
* fix windows compilation problem, test=develop
5 years ago
lidanqing
fbf9eca0d3
QAT Int8 document ( #21360 )
...
* update benchmark for int8v2, QAT1, QAT2 accuracy and performance
test=document_fix
* change according to reviews
test=develop test=document_fix
* improve some descriptions and some models
test=develop test=document_fix
* update models benchmark data
test=develop test=document_fix
* update int8v2 and qat2 performance
test=develop test=document_fix
5 years ago
liym27
be6a639655
Add CI for checking Input/Output/Attr of modified Ops ( #21522 )
...
* add shell scripts. test=develop
* rename test_pybind_inference to test_pybind_interface and print repeat process in check_op_desc.py. test=develop
* add approval RD. test=develop
5 years ago
Leo Chen
4f81d1bd5f
Refine VarBase init function ( #21587 )
...
* refine init function, test=develop
* add tests, test=develop
* remove extern, which may cause symbol error in gcc-4.8, test=develop
5 years ago
Leo Chen
84b7267100
dygraph_grad_maker supports varbase without grad_var ( #21524 )
...
* dygraph_grad_maker supports varbase without grad_var, test=develop
* fix compile, test=develop
* fix test_tracer, test=develop
* follow comments, test=develop
5 years ago
xujiaqi01
88960684aa
rm optimize_for in framework.proto ( #21571 )
...
* remove optimize_for in framework.proto
* test=develop
5 years ago
Zeng Jinle
0f8888360e
Polish op registry codes ( #21561 )
...
* polish infer shape registry, test=develop
* modify some operators registry, test=develop
5 years ago
Aurelius84
3d9dee575e
Set lod_level of Out in compile time of sequence_pool_op ( #21604 )
5 years ago
zhouwei25
346705967d
monitor changes of unittests; deleting a unittest now requires approval ( #21377 )
5 years ago
Zeng Jinle
97e76cb96d
refine dev_ctx.Wait() exception throw, test=develop ( #21600 )
5 years ago
Huihuang Zheng
1dcf6a7212
Add More Complex Tests and Fix Bugs for Control Flow cond API ( #21532 )
...
Add tests that use dy/dx to verify that the gradient values calculated by the control flow backward pass are correct. Also fix bugs detected by those tests.
Fix bugs:
1. Unlike sum_op, optimizer ops don't allow uninitialized input tensors. In conditional_block_grad_op, however, the conditional_block may not run, so the output gradient tensor may be uninitialized, which causes an error in the optimizer op. To fix this, we should either let optimizer ops support uninitialized input like sum_op, or assign 0 to the uninitialized gradient when the conditional_block_grad_op doesn't run. There are more than 10 optimizer ops. **To keep it simple, this PR just assigns 0 to the output gradient of the conditional_block_grad_op**. It can be explored further whether optimizer ops can support uninitialized input tensors like sum_op does, because in theory skipping the assignment in conditional_block_grad_op would be faster.
2. Infer parameter shapes during append_backward. All our parameters are in the global block. When op_desc infers shapes in a sub-block, it may not know the shape of the gradients of parameters whose shape information is in the global block. I fixed this by inferring the gradient shapes from the forward var.
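The first fix can be illustrated with a minimal pure-Python sketch. It is not Paddle code: the function names `conditional_block_grad` and `sgd_step`, the use of `None` to model an uninitialized tensor, and the list-based "tensors" are all assumptions made for illustration only.

```python
def conditional_block_grad(cond, upstream_grad, param_len):
    """Toy stand-in for conditional_block_grad_op (hypothetical, not Paddle's API)."""
    if cond:
        # The branch ran in the forward pass: pass the real gradient through.
        return list(upstream_grad)
    # The branch was skipped: emit zeros instead of leaving the gradient
    # uninitialized (modeled here as returning a zero-filled list, not None).
    return [0.0] * param_len


def sgd_step(param, grad, lr):
    """Toy optimizer step that, like the optimizer ops described above,
    rejects an uninitialized gradient (modeled as None)."""
    if grad is None:
        raise ValueError("optimizer received an uninitialized gradient")
    return [p - lr * g for p, g in zip(param, grad)]


param = [1.0, 2.0]
# Branch not taken: the zero gradient leaves the parameters unchanged.
skipped = sgd_step(param, conditional_block_grad(False, [0.5, 0.5], len(param)), lr=0.1)
# Branch taken: the real gradient is applied.
taken = sgd_step(param, conditional_block_grad(True, [0.5, 0.5], len(param)), lr=0.1)
```

With a zero-filled gradient the optimizer step is a no-op on the parameters, which is the behavior wanted when the conditional block did not run; the cost is the extra zero assignment that the message notes could in theory be avoided.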
This PR also did some code clean up:
1. Print the var name when sgd_op catches shape error so that it is easier to debug
2. Fix a typo: dicta -> dict
5 years ago