nhzlx
a5bfed3776
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_benchmark_for_trt
...
test=develop
6 years ago
nhzlx
afc51e6f82
add benchmark for trt
6 years ago
Zhaolong Xing
bc6d0a3427
Merge pull request #14762 from NHZlX/fix_bug_of_trt_pool
...
fix bug of trt pool2d converter
6 years ago
tensor-tang
53709e7e61
refine names
6 years ago
superjomn
edd1f5a92b
fix visualizer
...
test=develop
6 years ago
Brian Liu
9623b45f40
Remove unnecessary MKLDNN reorder ( #14799 )
...
When data flow from a MKLDNN OP kernel to a non-MKLDNN OP kernel,
data layout transform (via MKLDNN reorder) will occur even when
those two OP kernels share same layout. Add code to remove this
unnecessary reorder.
test=develop
6 years ago
frankwhzhang
90c7f9870e
fix 'name', test=develop
6 years ago
Qiao Longfei
abf140289f
split selected rows op should always init output selected rows
...
test=develop
6 years ago
nhzlx
019e8bbed2
fix comments test=develop
6 years ago
frankwhzhang
271c480822
update API, test=develop
6 years ago
frankwhzhang
c9a653820b
fix label_pos ,add test_layers.py, test=develop
6 years ago
Tao Luo
e99597d35c
Merge branch 'develop' into luotao1-has_attr
6 years ago
sneaxiy
66182abda6
add cuda cudnn version check
...
test=develop
6 years ago
Yu Yang
f0c0bf328d
Add gperftools supports for PE
6 years ago
frankwhzhang
a672b291e5
fix code style, test=develop
6 years ago
frankwhzhang
ea95f9c335
fix style bug, test=develop
6 years ago
frankwhzhang
68c2025844
fix nn.py&API.spec, test=develop
6 years ago
Xin Pan
748549b2e3
Revert "Merge pull request #14798 from PaddlePaddle/revert-14786-revert-14782-revert-14398-imperative"
...
This reverts commit b1d3a1c8b4, reversing changes made to f1fb64b17f.
6 years ago
bingyanghuang
943ad4781f
One possible solution to add flexibility for mkldnn placement pass ( #14768 )
...
* Choose to turn on use_mkldnn attribute v1
* Fix mkldnn_op empty bug
* format change test=develop
* fix ci test=develop
* fix ci test and add test in dam test=develop
* add example to dam compare test test=develop
* review changes test=develop
6 years ago
baojun-nervana
fddbd87c0a
Rename argument
...
test=develop
6 years ago
baojun-nervana
22ac2133e4
Rename class
...
test=develop
6 years ago
baojun-nervana
bfde5e10ce
Move ngraph compile control to cmake
...
test=develop
6 years ago
sneaxiy
2c6159a151
fix unittest
...
fix cmake
test=develop
6 years ago
Xin Pan
c049fa7cf7
Revert "Revert "Revert "Imperative"""
6 years ago
gongweibao
f1fb64b17f
Add reduce sparse tensor feature. ( #14757 )
6 years ago
sneaxiy
eb8252466b
polish code
...
add unittest model containing while_op
remove unnecessary codes
test=develop
6 years ago
Tao Luo
c83d5b7a16
Merge pull request #14709 from yihuaxu/develop_4f71a6ee2_conv3d_bias_fusion_mkldnn_impl
...
Implement the fusion of convolution 3D and bias for mkldnn
6 years ago
Zeng Jinle
add98c9e7d
Merge pull request #14745 from sneaxiy/fix_eigen_deallocate
...
Fix eigen deallocate bug
6 years ago
frankwhzhang
f4cc5881b0
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into bpr
6 years ago
frankwhzhang
97de98cd0a
update bpr_loss op code, test=develop
6 years ago
Xin Pan
6c80bb3ce9
Merge pull request #14786 from PaddlePaddle/revert-14782-revert-14398-imperative
...
Revert "Revert "Imperative""
6 years ago
heqiaozhi
575ae7c6c3
refine pslib interface & fix some bugs
6 years ago
Yihua Xu
3821fc3950
Merge branch 'develop' into develop_4f71a6ee2_conv3d_bias_fusion_mkldnn_impl
...
test=develop
6 years ago
Yihua Xu
240d974ac5
Clean Code
...
test=develop
6 years ago
Tao Luo
54fcafb5f6
Merge pull request #14707 from yihuaxu/develop_4f71a6ee2_conv3d_mkldnn_opt
...
Implement conv3d with mkldnn library
6 years ago
Xin Pan
2538ef64f1
Revert "Revert "Imperative""
6 years ago
guru4elephant
b82a44ea85
Merge pull request #14778 from wangguibao/async_executor_bugfix
...
Async executor bugfix: Tensor changed to LoDTensor
6 years ago
sneaxiy
8095fb5e68
fix code bug in CPU compilation
...
test=develop
7 years ago
sneaxiy
387bac46b5
refine code
...
test=develop
7 years ago
Tao Luo
cf66133857
Merge pull request #14734 from luotao1/memory_load
...
support loading from memory
7 years ago
Yihua Xu
155328a488
Clean Code
...
test=develop
7 years ago
Xin Pan
6217f42ab7
Revert "Imperative"
7 years ago
Tao Luo
743cb840f1
update with comments
...
test=develop
7 years ago
tensor-tang
ce674b685f
add readme doc and complete TODOs
7 years ago
wangguibao
5a2cd4505b
AsyncExecutor bugfix: Tensor to LoDTensor
...
test=develop
7 years ago
wangguibao
5f98d80039
AsyncExecutor bugfix: Tensor change to LoDTensor
7 years ago
flame
f6a877bc57
add tool to visualize inference model ( #14621 )
7 years ago
frankwhzhang
93551a3440
update API.spec
7 years ago
Tao Luo
42359e88a4
clean code
...
test=develop
7 years ago
Tao Luo
923b18877e
Merge branch 'develop' into memory_load
...
test=develop
7 years ago
Tao Luo
405b2486db
support loading from memory
...
test=develop
7 years ago
Xin Pan
b52f5d2870
Merge pull request #14398 from panyx0718/imperative
...
Imperative
7 years ago
frankwhzhang
272f3d3111
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into bpr
7 years ago
frankwhzhang
570d89ec84
add bpr_loss operator , test=develop
7 years ago
qingqing01
549f165b59
Speed conv_fusion_op for identity activation. ( #14744 )
...
* Refine conv_fusion_op for identity activation.
* Fix unit testing.
test=develop
7 years ago
tensor-tang
fab0ee8757
Merge remote-tracking branch 'ups/develop' into refine/jitkernel
7 years ago
Houjiang Chen
c6b39a0099
Merge pull request #14714 from NHZlX/add_prelu_gpu
...
add prelu cuda kernel for inference.
7 years ago
Jiabin Yang
8a111ac64d
Merge pull request #14763 from junjun315/fix-mac-build-check
...
fix the bug for mac build. python -c error. test=develop
7 years ago
tensor-tang
dbe451976b
Merge pull request #14753 from tensor-tang/refine/namespace
...
remove jit namespace
7 years ago
sneaxiy
0f96c2e80f
fix thread-safety bug
...
test=develop
7 years ago
lujun
5026741b82
fix the bug for mac build. python -c error. test=develop
7 years ago
nhzlx
722b0a805f
fix bug of trt pool
...
test=develop
7 years ago
Jiabin Yang
d9bb55a1f9
Merge pull request #14756 from JiabinYang/fix_hs_op
...
fix bug in dist train on hs, test=develop
7 years ago
Yihua Xu
65dbc7cca4
Merge branch 'develop' into develop_4f71a6ee2_conv3d_mkldnn_opt
7 years ago
JiabinYang
e05e1d7d88
fix bug in dist train on hs, test=develop
7 years ago
tensor-tang
a1eb21e704
refine names
7 years ago
tensor-tang
b523787f9f
remove jit namespace
...
test=develop
7 years ago
tensor-tang
191948c933
enable jitcode
7 years ago
tensor-tang
4a93db9288
remove jit namespace
...
test=develop
7 years ago
Hongyu Liu
8cda28f345
Merge pull request #14733 from phlrain/add_cudnn_5_support
...
Add cudnn 5 support
7 years ago
heqiaozhi
d3ca359e44
config init & adapt to interface
7 years ago
Xin Pan
73b4d1aa72
Merge pull request #14742 from panyx0718/infer2
...
support customized kernel selection
7 years ago
Jiabin Yang
21c0f8749e
Merge pull request #14728 from JiabinYang/optimize_hs_op
...
Optimize hs op
7 years ago
tensor-tang
45bfa70cb8
complete vmul jit kernel
7 years ago
tensor-tang
77236e33fc
init jitkernel
7 years ago
Xin Pan
82d68281c0
follow comments
...
test=develop
7 years ago
sneaxiy
900765224c
fix deallocate bug
...
test=develop
7 years ago
liuhongyu
b408fc4dac
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_5_support
7 years ago
liuhongyu
8b2898e201
fix bug of format; test=develop
7 years ago
Xin Pan
41c28d54c6
allow customize kernel selection
...
test=develop
7 years ago
Xin Pan
439af8d50a
Merge pull request #14717 from panyx0718/infer
...
fix a const_cast and avoid using stale program.
7 years ago
lujun
104a332a28
Merge pull request #14722 from junjun315/up-12-python-install
...
fix mac ci test step, test=develop
7 years ago
liuhongyu
773dc73fbf
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_5_support
7 years ago
liuhongyu
8daf67f90f
fix bugs; test=develop
7 years ago
chengduo
04539d4c5d
Fix clip.py ( #14718 )
...
* expose square
test=develop
* fix activation
test=develop
* Add square API
test=develop
* add necessary op
* code refine
* fix API.spec
test=develop
* fix unit test
test=develop
* add unit test sparse_grad_clip
test=develop
* fix API.spec
test=develop
* remove mac test for test_gradient_clip
test=develop
* remove selectedrows_mul_tensor
test=develop
7 years ago
sneaxiy
d0c8b9b9b3
remove timeout unittest
...
test=develop
7 years ago
heqiaozhi
419506f510
refine for compile pslib.so
7 years ago
Xin Pan
052cc5f538
Merge pull request #14725 from ZongwuYang/my-cool-stuff
...
My cool stuff
7 years ago
Michal Gallus
6fdbb365ce
Include MKL-DNN header to concat op only when flag is set
...
test=develop
7 years ago
Michal Gallus
f2a880421e
Fix style @ concat integration and tests
...
test=develop
7 years ago
Michal Gallus
738069e491
Refactor MKL-DNN Concat
...
test=develop
7 years ago
Michal Gallus
208f912512
Implement MKL-DNN Concat
...
test=develop
7 years ago
Wu Yi
29d9fb53fc
[Feature] multi process multi gpu dist training, boost v100 performance by 20% ( #14661 )
...
* wip multi process multi gpu dist training
* workable for p2p
* update test=develop
* change back env name test=develop
* fix alloc init
* fix cpu build test=devlop
* fix mac tests test=develop
* refine code
* refine test=develop
7 years ago
liuhongyu
e80402fd0e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_5_support
7 years ago
liuhongyu
968dd3c078
add cudnn 5 support; test=develop
7 years ago
sneaxiy
e694d0c2e4
fix while_op eager deletion bug
...
add unittest
test=develop
7 years ago
Xin Pan
461ca35be1
Merge pull request #14590 from panyx0718/fix4
...
enable API check for readers
7 years ago
gongweibao
50a698525d
Fix log level ( #14692 )
7 years ago
JiabinYang
8c75705984
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize_hs_op
...
, test=develop
7 years ago
Xin Pan
dc458b1482
Merge pull request #14713 from panyx0718/api
...
add more files to protected file list
7 years ago
JiabinYang
b387a19410
optimize op with blas
7 years ago
Zeng Jinle
ff4237309a
Merge pull request #14720 from sneaxiy/fix_seq_mask_op_infershape
...
Fix sequence_mask_op InferShape
7 years ago
heqiaozhi
2301abc481
cc library add pslib
7 years ago
ZongwuYang
1560eb4a6d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into my-cool-stuff
7 years ago
ZongwuYang
deb04809bd
test=develop
...
Fix the bug that profiler cannot trace the nccl allreduce operator
7 years ago
Xin Pan
da4e0bf1a1
add 2 more files
...
test=develop
7 years ago
Xin Pan
7c5289f68e
Merge pull request #14719 from PaddlePaddle/revert-14666-feature/estiminate_flops
...
Revert "Add EstiminateFlops"
7 years ago
lujun
9da5954a21
fix mac ci test step, test=develop
7 years ago
Kaipeng Deng
934f13a70a
Merge pull request #14371 from heavengate/yolo_loss
...
Add YOLOv3 loss operator for YOLOv3 model
7 years ago
sneaxiy
35a2578426
fix bug
...
test=develop
7 years ago
sneaxiy
65867d8989
test=develop
7 years ago
Jiabin Yang
6dcc6378b7
Merge pull request #14665 from JiabinYang/ci/add_import_check
...
add mac ci check on import
7 years ago
zhang wenhui
abbe382e1e
Revert "Add EstiminateFlops"
7 years ago
Xin Pan
0591ba96ec
fix hack
...
test=develop
7 years ago
sneaxiy
64ad051b9a
merge develop
...
test=develop
7 years ago
sneaxiy
c47c451a00
fix bug
7 years ago
heqiaozhi
a77fa67bbd
async_thread_trainer & libmct & pslib.cmake
7 years ago
Tao Luo
3437e17713
Merge branch 'has_attr' of https://github.com/luotao1/Paddle into luotao1-has_attr
7 years ago
nhzlx
e7abe6b654
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_prelu_gpu
...
test=develop
7 years ago
nhzlx
f75815b78c
add prelu gpu inference
7 years ago
Xin Pan
bcf36d8401
add more files to protected file list
...
test=develop
7 years ago
Tao Luo
0e3048db43
Merge pull request #14659 from luotao1/update_pass
...
update is_test_pass and mkldnn_placement_pass
7 years ago
Xin Pan
7e0801d4ed
Merge pull request #14441 from baojun-nervana/intel/ngraph_op
...
Implementing ngraph engine
7 years ago
Yihua Xu
82eefceabe
Add the profile_mkldnn flag for profile function(test=develop)
7 years ago
Xin Pan
35e6b5e16a
polish
...
test=develop
7 years ago
Yihua Xu
ea00270fe8
Remove the dims checking when the dim is 3 (test=develop)
7 years ago
Xin Pan
b80fe8264a
polish
...
test=develop
7 years ago
Yihua Xu
64e261c6cd
Implement the fusion of convolution and bias for mkldnn
...
(test=develop)
7 years ago
Tao Luo
8d6984eb9b
change OpHasAttr to RuntimeHasAttr, add some comments
...
test=develop
7 years ago
jerrywgz
96dc3d8326
Merge pull request #14511 from jerrywgz/ignore_index_for_sigmoid_cross_entropy
...
add ignore index for sigmoid cross entropy with logits op, test=develop
7 years ago
Tao Luo
a6ac42669c
Merge branch 'develop' into update_pass
7 years ago
Yihua Xu
669191c9cc
Implement conv3d with mkldnn library (test=develop)
7 years ago
Hongyu Liu
4f71a6ee2c
Merge pull request #14622 from PaddlePaddle/add_cudnn_lstm
...
Add cudnn lstm
7 years ago
Yibing Liu
c7382df80f
Print assert failure id in lookup_table_op ( #14698 )
7 years ago
Yu Yang
0f0e197914
Merge pull request #14666 from reyoung/feature/estiminate_flops
...
Add EstiminateFlops
7 years ago
Xin Pan
93c16d9628
polish the autograd (need to verify correctness)
...
test=develop
7 years ago
Xin Pan
c3236f82d6
polish
7 years ago
Xin Pan
e5d64fd4d1
initial imperative
...
test=develop
7 years ago
Xin Pan
4d0df1fea7
add fields for autograd
...
test=develop
7 years ago
Xin Pan
8138391631
add OpBase and unify with VarBase
...
test=develop
7 years ago
Xin Pan
f6f0692451
clean up
...
test=develop
7 years ago
Xin Pan
0318c95149
rebase develop
7 years ago
Xin Pan
aeb74af54c
allow operator to run imperatively
7 years ago
Xin Pan
b1f6fda5e5
run forward
7 years ago
Xin Pan
a6d23083f0
some tracing
...
test=develop
7 years ago
Xin Pan
dac92e560c
initial commit
7 years ago
barrierye
08233beed7
add the comment for CheckFile function. test=develop
7 years ago
barrierye
d62a3dd72d
add the comment for CheckFile function. test=develop
7 years ago
barrierye
d89108766c
update CheckFile function in data_feed for ignore the space at the end of each line of data(for example, it may be added '\t' character to the end of the reduce task output when processes data by hadoop, which does not affect the correctness of the data). test=develop
7 years ago
phlrain
9f7eae861d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
Tao Luo
61ae88b760
Revert "Fix for accuracy problem for inplace operators when MKL-DNN mode is enabled"
7 years ago
dongdaxiang
52a0be7bb4
add mct into CMakeLists.txt
7 years ago
phlrain
25df78eaf3
fix api spec; test=develop
7 years ago
phlrain
4c256ca6be
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
phlrain
b65722d3cf
fix unit test; test=develop
7 years ago
Tao Luo
99177b424b
Merge pull request #14693 from kbinias/fix-for-accuracy-problem-for-inlplace-operators
...
Fix for accuracy problem for inplace operators when MKL-DNN mode is enabled
7 years ago
heqiaozhi
3c239cd640
pslib
7 years ago
tangwei12
618f7620e2
add enforce for auc ( #14687 )
...
* add enforce for AUC, test=develop
7 years ago
Krzysztof Binias
bc7db6cec9
Fix for accuracy problem for inplace operators when MKL-DNN mode is enabled
...
test=develop
7 years ago
phlrain
2770ea1a73
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
chengduozh
3f4aca618f
code refine
...
test=develop
7 years ago
chengduozh
af8c2cec13
fix operator.cmake
...
test=develop
7 years ago
chengduozh
679d8fc6fe
rename op name
...
test=develop
7 years ago
chengduozh
1013d6d05d
Merge branch 'add_cudnn_lstm' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
jerrywgz
3df0538940
replace -100 to kIgnoreIndex
7 years ago
Wang Guibao
41e19eb431
AsyncExecutor ( #14627 )
...
* AsyncExecutor: C++ side
* Google naming conventions
* Rename MultiExecutor to AsyncExecutor
* pybind with async_executor
* Naming convention
* remove some flags and unused code
* add refactored file of async_executor and data_feed
* clear async executor interface and add data feed factory
* split async executor into executor_thread_worker and async_executor, refactor pybind, add datafeed and corresponding proto
* Fix async_executor interfaces: 1) Remove all protobufs; 2) Stop after each epoch
* refine async_executor_refactor.cc
* add some files about datafeed
* Revert "add some files about datafeed"
This reverts commit 8ee8133ab841196925a2812b76f18d2812a6701d.
* Interface rework
* add MultiSlotDataFeed
* Creating DataFeedDesc from .proto file, then manipulate it (add/del fields etc) from python side
* update data_feed for add MultiSlotDataFeed
* update datafeed and async_executor to run bow_net demo
* fix bug that finish_set_filelist failed in multithread
* delete finish_binding_memory_(flag), because it can not be marked under the current interface
* Fix bug
* update async_executor.py for support set_use_slots
* update async_executor.py for support set_use_slots and set set_dense_slots
* fix bug that when the number of files is less than the number of threads, it will fetch nan
* remove redundant code, and make executor exit when set a illegal queue size
* add batch_size check
* add MultiSlotDesc
* Revert "add MultiSlotDesc"
This reverts commit 2e72ebfad364ed6b5dcc75f38ffb2a1fdec83d8e.
* add some checkpoint in DataFeedDesc
* add CheckFile function in MultiSlotDataFeed
* update something error info
* fix deadlock bug
* Fix fetch variable
* Merge error
* fix code style in async_executor
* using one lock blocking queue replace two lock blocking queue because of some bugs
* update code style
* add utest for data_feed
* Fix fetch var
* update utest for data_feed for multithread
* update SetFileList info
* fix bug in utest of data_feed
* Add comments for python
* Add comments for python code
* Fix pybind.cc with new pybind11 version
* add note for DataFeedDesc's set_use_slots function
* Add save_model
* update data_feed_test for multi-type
* add comment for executor_thread_worker
* Remove unused code
* update data_feed_test for generate test data file
* removed unnecessary interfaces and add comments
* c++ style check
* update data_feed.cc
* commit for code style
* commit for code style
* commit for code style
* commit for code style
* Comment away __init__ in async_executor.py
* clang-format fix test=develop
* use PADDLE_THROW instead of exit(-1); use unique_ptr to manage scope var in data_feed_test.cc
* commit for update code style
* commit for update code style
* Add async_executor demo; Remove some methods
test=develop
* commit for update code style
* commit for update code style
* commit for update code style
* update API.spec
* AsyncExecutor
test=develop
* AsyncExecutor
test=develop
* AsyncExecutor
test=develop
* AsyncExecutor
test=develop
* Fix API.spec
test=develop
* Fix API.spec
test=develop
* Fix windows build error
test=develop
* Fix windows build error
test=develop
* Fix windows build error
test=develop
* Fix windows build error
test=develop
* Fix Windows Build
test=develop
* Fix Windows Build
test=develop
* Fix Windows Build
test=develop
* Fix code style
test=develop
* Fix code style
test=develop
* update datafeed
* Fix code style
test=develop
* update data_feed_test for test Tensor test=develop
* Fix code style
test=develop
* Fix windows build failure
test=develop
* Fix code style and windows build failure
test=develop
* Fix PYTHON3.5 build failure
test=develop
* AsyncExecutor API
test=develop
7 years ago
JiabinYang
a770d5c9db
fix error don't interrupt shell
...
, test=develop
7 years ago
whs
1b9753d109
Make pad2d support for variable paddings. ( #14667 )
...
* Make pad2d support for variable paddings.
test=develop
* Rename get_paddings and add inline modifier.
test=develop
* Fix comments.
7 years ago
Tao Luo
2af5762cf8
Merge pull request #14668 from wzzju/use_small_dam
...
support the small dam model. test=develop
7 years ago
Tao Luo
ff16c47898
Merge pull request #14671 from luotao1/box_coder
...
speedup box_coder_op for multi-threads
7 years ago
baojun-nervana
fc61bf1b16
Renamed methods
...
test=develope
7 years ago
sneaxiy
096673f675
refactor eager deletion
...
test=develop
7 years ago
ZhenWang
6e48e47406
test=develop
7 years ago
ZhenWang
e1da6cd754
add the normal dam and the small dam
7 years ago
luotao1
bcc90123f0
speedup box_coder_op for multi-threads
...
test=develop
7 years ago
ZhenWang
d5947b0ed7
test=develop
7 years ago
ZhenWang
33b4963505
unify the normal and small dam model.
7 years ago
Yan Chunwei
4b7617740e
fix container not cleared ( #14231 )
7 years ago
Tao Luo
c856ac8721
add OpHasAttr in node.h, update is_test_pass and mkldnn_placement_pass
...
test=develop
7 years ago
ZhenWang
8f2e556e65
support the small dam model. test=develop
7 years ago
phlrain
6ce4250172
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
Qiao Longfei
44debca844
Merge pull request #14589 from jacquesqiao/refactor-prefetch
...
Refactor prefetch
7 years ago
phlrain
bd94ab0ef3
rename op; test=develop
7 years ago
phlrain
92f5be1d82
remove inputvarname in operator; test=develop
7 years ago
Xin Pan
40f1c4a6f0
fix
...
test=develop
7 years ago
phlrain
cf1fe61004
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
luotao1
5db273d874
enhance HasAttr to fix ci
...
test=develop
7 years ago
Yu Yang
589b863b98
Add EstiminateFlops
...
test=develop
7 years ago
phlrain
4b9689379f
fix cudnn lstm; test=develop
7 years ago
phlrain
d1a17cadd4
fix cudnn rnn; test=develop
7 years ago
JiabinYang
4124253796
add mac ci check on import, test=develop
7 years ago
Qiao Longfei
9450048acb
add PADDLE_ENABLE_REMOTE_PREFETCH to enable remote prefetch
...
test=develop
7 years ago
Xin Pan
75939c2059
fix
...
test=develop
7 years ago
Tao Luo
20120d9c97
Merge pull request #14608 from jczaja/prv-conv2d-transpose-mkldnn
...
[MKL-DNN]conv2d transpose
7 years ago
Qiao Longfei
3e45a5a5ec
lookup_table gpu kernel support prefetch
...
test=develop
7 years ago
Zhaolong Xing
d215293c92
Merge pull request #14649 from NHZlX/add_params_sync_pass
...
Add params sync pass
7 years ago
Qiyang Min
055da6e00d
Merge pull request #14656 from velconia/disable_dist_transpiler_ut_in_mac
...
Change pip to correct version when install wheel package
7 years ago
qingqing01
731d45a39a
Enable BatchNorm to use global mean and variance during training ( #14630 )
...
* Enable BatchNorm to use global mean and variance during training
* Update doc and follow comments.
7 years ago
nhzlx
49c28b8c52
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_params_sync_pass
...
test=develop
7 years ago
nhzlx
3c83a2f720
fix comments
7 years ago
Xin Pan
ad6ed5b745
fix py3
...
test=develop
7 years ago
Xin Pan
0cc9ab3dc2
enable API check for readers
...
test=develop
7 years ago
luotao1
4a4daa8ab4
Merge branch 'develop' into has_attr
7 years ago
Qiao Longfei
75eba6108d
Add scope doc ( #14582 )
...
* add doc for scope
* update doc for force_init_on_cpu
test=develop
* follow comment test=develop
* update format test=develop
7 years ago
Tao Luo
ea47685f91
Merge pull request #14646 from jczaja/prv-softmax-mkl-sasum
...
Softmax for inference MKL further changes
7 years ago
Qiao Longfei
3a3cfc2d8d
prefetch support gpu
...
test=develop
7 years ago
minqiyang
fe0dee88d8
Change pip version to correct version when install wheel package
...
test=develop
7 years ago
baojun-nervana
d5ee05e6c3
Replaced VarIsTensor
...
test=develop
7 years ago
baojun-nervana
e6bd53be60
Named to RuntimeInferShape
...
test=develop
7 years ago
Sang Ik Lee
24e70920db
Refactor some build settings.
...
test=develop
7 years ago
baojun-nervana
a29696146c
Added annotation
...
test=develop
7 years ago
Sang Ik Lee
d6125a5eec
Include ngraph in inference demo build.
...
test=develop
7 years ago
baojun-nervana
caf4b937b3
Added RunInferShape
...
test=develop
7 years ago
baojun-nervana
1d19eb2bd4
Implemented ngraph engine
...
test=develop
7 years ago
Qiao Longfei
4b9082a4cd
follow comment
7 years ago
Tao Luo
b4de023ee1
Merge pull request #14636 from Superjomn/fix/word2vec
...
fix word2vec bug
7 years ago
luotao1
fe915901cd
update Opdesc's HasAttr
...
test=develop
7 years ago
chengduo
6776e92846
refine tensor_array_write_read ( #14643 )
...
test=develop
7 years ago
nhzlx
d3e140a572
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_params_sync_pass
...
test=develop
7 years ago
nhzlx
d666c8eb1d
fix benchmark
7 years ago
nhzlx
900fbb83f9
add params sync pass
7 years ago
superjomn
9c665c81ae
update
...
test=develop
7 years ago
Jacek Czaja
48e1b97e8e
- Coding style fixes
...
test=develop
7 years ago
Qiao Longfei
d32de7e6e1
fix code format test=develop
7 years ago
Qiao Longfei
5a660aee7d
update log level in parameter prefetch test=develop
7 years ago
Qiao Longfei
8ebde595c9
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
...
test=develop
7 years ago
Qiao Longfei
b9d3d75fc4
fix prefetch dependency test=develop
7 years ago
Qiao Longfei
145c535750
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
...
test=develop
7 years ago
minqiyang
9d7c3b18c0
Polish code
...
test=develop
7 years ago
minqiyang
2b430adaee
Polish code
...
test=develop
7 years ago
minqiyang
a02ce58f2c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
...
test=develop
7 years ago
Jiabin Yang
12e1719f96
Merge pull request #14352 from JiabinYang/enhance_hierachical_sigmod_op
...
Enhance hierarchical sigmoid op
7 years ago
Qiao Longfei
40f68b1349
unit test ready
7 years ago
Qiao Longfei
36e26a53b0
Optimize bilinear tensor product op ( #14485 )
...
* optimize bilinear_tensor_product
* add set zero to set grad to 0.
7 years ago
Tao Luo
4ec9de0122
Merge pull request #14628 from Sand3r-/mgallus/mkldnn-elementwise_mul
...
EltwiseMul: Changes from previous PR
7 years ago
Qiao Longfei
35b79ab865
Merge pull request #13983 from jacquesqiao/add-ctr-reader
...
Add ctr reader
7 years ago
wopeizl
b1dbbb7f88
Merge pull request #14629 from wopeizl/windows/port
...
fix the build issue on manylinux1
7 years ago
Qiao Longfei
da387720d7
fix infer compile test=develop
7 years ago
Jacek Czaja
cf40daee58
- Building fix to softmax for inference
7 years ago
Clementine
6c71c1f8f9
Add activation gelu ( #14569 )
7 years ago
Michal Gallus
9455be0ba5
EltwiseMul: Extract StringToFormat to MKLDNN helper
...
test=develop
7 years ago
peizhilin
351dc78e1c
code style fix
...
test=develop
7 years ago
Jacek Czaja
1540df51cf
- Fix to test_conv2d_transpose_mkldnn for GPU
...
test=develop
7 years ago
JiabinYang
eda069068d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into enhance_hierachical_sigmod_op
7 years ago
JiabinYang
a08dc83eb0
remove arg 'non_leaf_num', test=develop
7 years ago
chengduo
6648f5ed6f
add ShareLoD for dropout_grad ( #14616 )
...
test=develop
7 years ago
peizhilin
b6b8626e9c
fix the build issue on manylinux1
7 years ago
Qiao Longfei
18fd2d01b7
update embedding api
7 years ago
JiabinYang
7594787deb
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into enhance_hierachical_sigmod_op
7 years ago
JiabinYang
c469334cfb
polish python code and comment, test=develop
7 years ago
Xin Pan
3c77ce3751
Merge pull request #14593 from panyx0718/fix5
...
Protect important header files.
7 years ago
Qiao Longfei
92afbb923c
fix compile problem test=develop
7 years ago
Tao Luo
e8ef14d2a7
Merge pull request #14610 from Superjomn/revert/cache_fix
...
Revert "fix transfer cache thread_local bug (#14581 )"
7 years ago
Qiao Longfei
97cbec9b74
clean code
7 years ago
Qiao Longfei
1edd435da6
fix ci problem test=develop
7 years ago
JiabinYang
87648f8edf
merge develop, test=develop
7 years ago
Yiqun Liu
726f2cefe3
Fix bug of referencing a temporary variable. ( #14614 )
...
test=develop
7 years ago
wopeizl
db9284ecde
Merge pull request #14617 from wopeizl/windows/online
...
Windows/online
7 years ago
JiabinYang
c3c3c0b33c
polish code, test=develop
7 years ago
gongweibao
867c312bc4
Fix allreduce dependency order. ( #14586 )
7 years ago
Jacek Czaja
8bfa1fa9bb
- ASUM MKL integration
7 years ago
phlrain
487ee36aec
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
tangwei12
56a4912b76
Make NCE_OP more efficient and support SelectedRows ( #14469 )
...
* Fix truncated normal.
* Fix.
* Make nce support more distribution.
* Fix API.spec.
* Fix python API.
* Fix.
test=develop
* Fix API.spec
test=develop
* Fix sampler.
* Fix order of arguments in python API.
test=develop
* NCE add selectedrows support
* NCE update weighted sampling
* fix bugs in nce_op, and assign_value_op optimized
* fix bugs in nce_op, revert assign_value_op
* nce_op optimize
* nce_op optimize
* nce_op optimize
* add selectedRows test later
test=develop
* add selectedRows supported
* add selectedRows supported
test=develop
* add selectedRows supported
* add nce selectedRows supported, test=develop
* add nce selectedRows supported
* add nce selectedRows supported, test=develop
* fix height in nce, test=develop
* add ut
* add ut, test=develop
* make AutoGrownIndex inline
test=develop
* fix tiny error, test=develop
7 years ago
liuhongyu
1ffe41d722
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
Qiao Longfei
9589babe12
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
...
test=develop
7 years ago
liuhongyu
05917c3c79
add cudnn lstm; test=develop
7 years ago
Zeng Jinle
1c48d61442
Merge pull request #14599 from sneaxiy/fix_mac_unittest_bug
...
Fix Mac unittest bug
7 years ago
Qiao Longfei
f35f3fe77a
ctr reader can not be used in windows
...
test=develop
7 years ago
peizhilin
6a85dd3278
Merge remote-tracking branch 'upstream/develop' into windows/build
...
test=develop
7 years ago
peizhilin
38715e6fd0
minor fix
7 years ago
JiabinYang
7389597ce2
Update API.spec, test=develop
7 years ago
peizhilin
511cc9024a
fix for build issue
7 years ago
Qiao Longfei
6bef565dac
clean code test=develop
7 years ago
Qiao Longfei
e7d1f524f3
change log level
...
test=develop
7 years ago
JiabinYang
7e4bd695e6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into enhance_hierachical_sigmod_op
7 years ago
Qiao Longfei
fe54adf70c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-ctr-reader
7 years ago
JiabinYang
b10df8bcfa
refine code and add none bias ut, test=develop
7 years ago
Kaipeng Deng
251a1bb0f4
Merge pull request #14588 from heavengate/revert_interpolate
...
fix interpolate_op incompatible. test=develop
7 years ago
Qiao Longfei
668ae9083e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-ctr-reader
7 years ago
Qiyang Min
30e47bce8b
Merge branch 'develop' into revert_vlog
7 years ago
tensor-tang
3ae6692a0d
Merge pull request #14512 from tensor-tang/fea/jit/rnn
...
Fea/jit/rnn
7 years ago
superjomn
4babc6b06c
update
...
test=develop
7 years ago
sneaxiy
f3522a11d2
fix mac unittest bug
...
test=develop
7 years ago
Qiao Longfei
87e4edd2ea
fix grad_varname in remote prefetch
7 years ago
Qiyang Min
6232d1f1dd
Merge pull request #14578 from velconia/add_production_dockerfile
...
Add python3.6 and python3.7 support to production generated Dockerfile
7 years ago
superjomn
dc249d3b69
Revert "fix transfer cache thread_local bug ( #14581 )"
...
This reverts commit 5c073a4db2.
7 years ago
Qiao Longfei
d98c59fd2c
support none sliced variable
7 years ago
dengkaipeng
bb489d4cc9
add interp_method default bilinear. test=develop
7 years ago
dengkaipeng
78f563917c
revert interpolate_op to bilinear_interp_op & nearest_interp_op. test=develop
7 years ago
Jacek Czaja
fb24690a58
- conv2d transpose MKL-DNN
...
test=develop
- Added new header for MKLDNN reuse functionality
- Extended conv2d_transpose GetExpectedKernelType for MKL-DNN support
- Buildable conv transpose mkldnn and conv mkldnn using conv template
- Conv2d transpose roughly implemented and buildable
- Added modifications to conv2d transpose MKLDNN unit tests
- Fix to UT of conv2d transpose mkldnn op
- Wrong type of MKLDNN primitive was chosen for conv2d transpose
- Hacks for conv2d transpose
- UT enabled
- Replaced copying loop with memcpy
- Draft of passing lambda into AcquireMemory
- Made reorder (IOHW->OIHW) to be called only once
7 years ago
tensor-tang
7a91271436
Merge branch 'develop' into fea/jit/rnn
7 years ago
minqiyang
be04d99fe4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
...
test=develop
7 years ago
wopeizl
05b7ee7eeb
Merge pull request #14545 from wopeizl/windows/online
...
Windows/online
7 years ago
JiabinYang
81e145764d
refine code and comments, test=develop
7 years ago
minqiyang
bcaa8a3b67
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_production_dockerfile
...
test=develop
7 years ago
Qiao Longfei
af2f5fc824
fix some bugs
7 years ago
JiabinYang
2f6b529aff
refine code and comments, test=develop
7 years ago
Xin Pan
e32f4c5423
fix
...
test=develop
7 years ago
Xin Pan
3e665862b8
Protect important header files.
...
test=develop
7 years ago
minqiyang
e43f5bc77c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_dist_resnet_ut_in_py36
...
test=develop
7 years ago
minqiyang
53433d7f2e
Revert the changes of VLOG
...
test=develop
7 years ago
tensor-tang
1f0291a51e
add comments and follow comments
...
test=develop
7 years ago
tensor-tang
557229bd39
Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
Qiao Longfei
ed9fa4b301
can run
7 years ago
peizhilin
30849d1f20
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
qingqing01
6224e61fd9
Transpose-Flatten-Concat fusion operator. ( #14568 )
...
* Transpose-Flatten-Concat fusion operator.
* Add unit testing and fix bug.
7 years ago
Yan Chunwei
5c073a4db2
fix transfer cache thread_local bug ( #14581 )
7 years ago
Xin Pan
87332bb18d
Merge pull request #14579 from Superjomn/fix/transfer-cache-compile-error
...
fix compile
7 years ago
minqiyang
8b154c172f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_dist_resnet_ut_in_py36
...
test=develop
7 years ago
Qiao Longfei
686d15c8e0
update grpc_variable_response
7 years ago
Jiabin Yang
13bc7619f5
Merge pull request #14552 from JiabinYang/fix_mac/fix_pinned_memory
...
fix Mac unittest error on reading pinned memory flag
7 years ago
tangwei12
3639d99f99
Fix save and load lookup table/optimizer vars ( #14301 )
...
* fix mkdir conflict
* fix load/save lookup tables
test=develop
* add lookup_table_utils
* fix load optimize vars on pserver
* delete lookup table utils
* fix save and load lookup tables
* fix load optimizer var
* fix load optimizer var, test=develop
* fix python 3 style, test=develop
* move lookup_table_utils to contrib utils
7 years ago
peizhilin
36cd18b549
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
qingqing01
39ec80def4
Remove the memory copy of feeding data in C++ inference API ( #14577 )
...
* Remove the memory copy for feeding data in C++ inference API
* Fix compiling dependence
* Fix compiling in ONLY_CPU mode
7 years ago
peizhilin
b2f8d4183d
Use a different fraction_of_gpu_memory_to_use depending on platform
7 years ago
Qiao Longfei
d827881502
fix pserver and prefetch rpc
7 years ago
peizhilin
1afa9492af
Recover the profiler
7 years ago
Yiqun Liu
bf222f197d
Use sub scope in tensor_array_to_tensor op. ( #14524 )
...
test=develop
7 years ago
superjomn
4b40c0013b
fix compile
...
test=develop
7 years ago
JiabinYang
02d68051db
add sparsed bias grad, test=develop
7 years ago
dzhwinter
840c1b29ad
test=develop ( #14562 )
...
* test=develop
remove code.
* test=develop
7 years ago
Qiao Longfei
5856c2f332
change Var to FindVar
7 years ago
Yu Yang
26af9cf90c
Merge pull request #14565 from chengduoZH/fix_cublas_warp_error
...
Fix cublas warp error
7 years ago
Qiao Longfei
312b7786d9
clean code
7 years ago
Qiao Longfei
2b6c0c09d6
add unit test
7 years ago
Yan Chunwei
923c8e3332
add benchmark for inference ( #14571 )
7 years ago
minqiyang
c92c440fa1
Add python3.6 and python3.7 support to production generated Dockerfile
...
test=develop
7 years ago
Qiao Longfei
47280ef8b4
lookup table op support prefetch
7 years ago
Yan Chunwei
a7188d5bc7
fix executor transfer cache bug ( #14518 )
7 years ago
gongweibao
c1bf9664cd
Add options to disable SO_REUSEPORT of grpc. ( #14269 )
7 years ago
minqiyang
ee73810fd5
Fix API.spec
...
test=develop
7 years ago
Qiao Longfei
4ad5fd8f54
add parameter prefetch
7 years ago
Qiao Longfei
9d276fe8a8
add parameter prefetch
7 years ago
minqiyang
d2045260a5
Change visibilities of variant_visitor of pybind11
...
test=develop
7 years ago
minqiyang
b67229187e
Change to PYBIND11_MODULE because the deprecation of PYBIND11_PLUGIN
...
test=develop
7 years ago
minqiyang
81994e84e0
Change the include files because the version changes of pybind11
...
test=develop
7 years ago
Tao Luo
e90afec47b
Merge pull request #14543 from luotao1/threads
...
add thread related inference api
7 years ago
qingqing01
64ca3d176c
Add bias_attr in sequence_conv_pool API. ( #14553 )
7 years ago
chengduozh
f7847ca6a3
fix cublas warp error
...
test=develop
7 years ago
Zhaolong Xing
e52d90a35e
Merge pull request #14527 from hjchen2/develop
...
Refine split TensorRT plugin
7 years ago
Qiyang Min
4531281386
Merge pull request #14526 from velconia/add_python36and37_to_paddle_build
...
Add python 3.6 and python 3.7 support to paddle build
7 years ago
JiabinYang
47c4e65d60
test=develop
7 years ago
luotao1
116979a40a
refine api name
...
test=develop
7 years ago
luotao1
e66b4c6bff
adjust tester_helper to make multi-instance multi-thread work
...
test=develop
7 years ago
luotao1
a5c4b463c9
add SetMKLDNNThreadId api
7 years ago
luotao1
e21edb26f6
add Set/GetCPUNumThreads api
7 years ago
Qiao Longfei
9851a53478
add prefetch part in pserver
7 years ago
JiabinYang
5cd2fc9fd0
just for test
7 years ago
JiabinYang
42470f14b7
test=develop
7 years ago
peizhilin
445fff24dc
add the bigobj option to NVCC compile
...
fix code style
7 years ago