xuezhong
b0c75f1763
remove debug print
6 years ago
xuezhong
880836329d
add cell clip and proj clip, fix bug for h0
6 years ago
jerrywgz
4eb44380a6
Merge branch 'develop' into add_clip_op
6 years ago
Xin Pan
30cc8b7a92
Merge pull request #15554 from heavengate/yolo_loss_darknet
...
Yolo loss darknet
6 years ago
mozga-intel
312500dcb5
Enable pool2d operator for a ngraph engine ( #15395 )
...
* Enable pool2d operator for a ngraph engine
test=develop
* Update
test=develop
6 years ago
Tao Luo
ea92905be4
Merge pull request #15478 from kbinias/kbinias/seperate-folders-for-mkldnn
...
Make separate folders for mkldnn codes
6 years ago
Yibing Liu
170842cbb4
Some improvements to support bert mixed precision training ( #15585 )
...
* Some improvements to support bert mixed precision training
test=develop
* Revert the cast in layer_norm
test=develop
6 years ago
Yiqun Liu
16d54f7f23
Return parent_idx in beam_search op ( #15520 )
...
* Refine beam_search_op to output an extra parent_idx tensor.
test=develop
* Fix the unittest test_beam_search_op.
test=develop
* Fix the merging mistake.
test=develop
6 years ago
jerrywgz
72ee3c6232
Merge pull request #15398 from jerrywgz/add_axis_for_boxcoder
...
Add axis for boxcoder
6 years ago
jerrywgz
e402c0ec7d
test=develop
6 years ago
Kaipeng Deng
d3eeb92bba
Merge pull request #15491 from tink2123/new_align_corners
...
add align_corners and align_mode for image_resize
6 years ago
jerrywgz
3046799ecd
Merge branch 'develop' into add_clip_op
6 years ago
dzhwinter
1a44b2fbe8
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
Jiabin Yang
2d0ffdc485
test=develop, fix debug mode unitest, hsigmoid ( #15574 )
6 years ago
tensor-tang
2b0811c3fb
refine vadd jitkernel choice
...
test=develop
6 years ago
tensor-tang
a18c0d4242
cache fc kernel
...
test=develop
6 years ago
tensor-tang
6e1ee7fb57
cache softmax kernel func
...
test=develop
6 years ago
Krzysztof Binias
69b7c595d6
Small fix
...
test=develop
6 years ago
Krzysztof Binias
b1bdcd4de8
Make separate folders for mkldnn codes
...
test=develop
6 years ago
dzhwinter
06f2448848
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
dengkaipeng
23d34d1f7e
move yolov3_loss to detection. test=develop
6 years ago
tensor-tang
c7449227e8
Merge pull request #15563 from tensor-tang/jit/softmax
...
refine softmax kernel
6 years ago
dengkaipeng
733bb82ec0
downsample -> downsample_ratio. test=develop
6 years ago
dengkaipeng
ae0b0d5f93
fix doc. test=develop
6 years ago
dengkaipeng
56e21c558e
add comments and docs. test=develop
6 years ago
dengkaipeng
577424e5ec
use darknet loss and trick
6 years ago
dengkaipeng
042fecefab
use L2Loss. test=develop
6 years ago
dengkaipeng
af124dcdf6
fix API error
6 years ago
dengkaipeng
c945ffa7f8
fix label_smooth and mixup score
6 years ago
tink2123
2b89f59055
add attr use_label_smooth test=develop
6 years ago
dengkaipeng
8218e30176
add gtscore. test=develop
6 years ago
dengkaipeng
3c08f620c2
add label smooth. test=develop
6 years ago
dengkaipeng
cc01db6029
calc valid gt before loss calc. test=develop
6 years ago
dengkaipeng
32d533c2cd
cache obj_mask and gt_match_mask. test=develop
6 years ago
dengkaipeng
6c5a5d0789
format code. test=develop
6 years ago
dengkaipeng
e7e4f084e5
ignore pred overlap gt > 0.7. test=develop
6 years ago
dengkaipeng
db8ff57a61
remove useless code and update doc. test=develop
6 years ago
dengkaipeng
577a92d992
use typename DeviceContext. test=develop
6 years ago
dengkaipeng
0c4acc8305
imporve yolo loss implement. test=develop
6 years ago
dengkaipeng
2fbfef2ec9
fix no box expression. test=develop
6 years ago
dengkaipeng
c0fa8d2eec
use L1Loss for w, h. test=develop
6 years ago
dengkaipeng
3841983aa0
fix division error in mean process. test=develop
6 years ago
dengkaipeng
192d293854
use stable Sigmoid Cross Entropy implement. test=develop
6 years ago
tink2123
909f864a9b
remove unnecessary flags
...
test=develop
6 years ago
tink2123
6961a94e94
avoid out_size less than 1
...
test=develop
6 years ago
jerrywgz
7bc8481c62
Merge pull request #15418 from jerrywgz/refine_nms
...
Refine nms
6 years ago
tensor-tang
d59f733551
refine softmax and use with cache
...
test=develop
6 years ago
tensor-tang
7383eefd2d
add softmax mix and mkl code
...
test=develop
6 years ago
tensor-tang
50945685f2
add hmax, hsum jitcode
...
test=develop
6 years ago
tensor-tang
8117725852
add jit kernel hsum, hmax and softmax refer code
...
test=develop
6 years ago
Zeng Jinle
bf7dedcbc7
Merge pull request #15545 from sneaxiy/fix_debug_nccl_error
...
Fix nccl unittest error in debug mode
6 years ago
dzhwinter
ee3aae56cd
merge develop branch. test=develop
6 years ago
jerrywgz
cee2e1b089
refine code, test=develop
6 years ago
sneaxiy
ba4f43fd62
fix compile error in distributed mode
...
test=develop
6 years ago
tink2123
a0c63f1106
add align_flag
...
test=develop
6 years ago
Tao Luo
b919190232
Merge pull request #15531 from jczaja/prv-googlenet-fix
...
Performance and functional fixes to LRN
6 years ago
Zhaolong Xing
97b76c94c4
Merge pull request #15242 from NHZlX/trt_int8_ultimate_version
...
add trt int8 support
6 years ago
Kaipeng Deng
aeca5c50b2
fix grid_sampler PADDLE_ENFORCE error. test=develop ( #15542 )
6 years ago
乔龙飞 Qiao Longfei
5f89ce7fcd
Merge pull request #15536 from jacquesqiao/fix-prefetch-one-parameter
...
Fix prefetch one parameter
6 years ago
Jacek Czaja
5885c5cdf6
- Added explanation to LRN MKL-DNN op on alpha modification
...
test=develop
6 years ago
Jacek Czaja
4aa7ef3c13
- Compensation fix to LRN MKL-DNN op
...
test=develop
6 years ago
Qiao Longfei
806658d72b
add space after colon in commnet test=develop
6 years ago
nhzlx
b43ea40c51
delete the usage of the const_cast
...
test=develop
6 years ago
baojun-nervana
8e9308a51a
mv ngraph_bridge to ngraph directory test=develop
6 years ago
Qiao Longfei
4d13434443
fix a little problem test=develop
6 years ago
Qiao Longfei
9c3910f390
IncreaseBatchBarrier should be in the right condition test=develop
6 years ago
ruri
88bd7e1a61
Merge pull request #15027 from shippingwang/shufflechannel
...
Add Shuffle Channel Operator
6 years ago
Jacek Czaja
fa286b1052
LRN reengineering
...
Added reading dst mem pd from lrn pd
coding style fixes
test=develop
6 years ago
nhzlx
92cf4a4c6b
fix comments
...
test=develop
6 years ago
tensor-tang
e043ea9653
Merge pull request #15515 from tensor-tang/jit/benchmark
...
jit benchmark use tensor with alignment
6 years ago
Qiao Longfei
5a0c6593d5
revert RequestGetHandler
6 years ago
jerrywgz
466a10dcdd
refine code, test=develop
6 years ago
乔龙飞 Qiao Longfei
c58555067e
Merge pull request #14731 from jacquesqiao/optimize-cpp-reader
...
Optimize cpp reader
6 years ago
jerrywgz
a39240c3b6
add attr variance for box coder, test=develop
6 years ago
gongweibao
d54494ba87
cleanup test=develop ( #15347 )
6 years ago
Qiao Longfei
84220765a7
refine code, add more log
6 years ago
Qiao Longfei
c750be6d9d
add some log
6 years ago
gongweibao
fe8f28c957
Add GetVariableNoBarrier on brpc. ( #15488 )
6 years ago
tangwei12
981fc2bdba
fix bug in merge_ids ( #15503 )
...
* fix mistakes in merge_ids, test=develop
6 years ago
baojun
efce25673c
Adding ngraph_engine_op ( #14948 )
...
* enable ngraph_engine_op
test=develop
* merge develop test=develop
* avoid const_cast test=develop
* rm ngraph_operator test=develop
* Added TODO to move EnableNgraph test=develop
* Add TODO to remove const_cast test=develop
6 years ago
chengduo
f8f91fb4b3
Revert conv transpose cudnn ( #15514 )
...
* Revert "set constant for loss"
This reverts commit 167933f678ccbb3563e949710279efe004a27731.
* Revert "remove workspace_handle"
test=develop
This reverts commit b4aca8ede9e685bce1dfb1c59e63919f33432572.
6 years ago
tensor-tang
b67584a6e9
jit benchmark use tensor
...
test=develop
6 years ago
Yiqun Liu
3008fa1261
Add the CUDA kernel for beam_search op ( #15020 )
...
* Refine the beam_search op and test.
* A basic CUDA implementation of beam_search for small batch_size.
* Implement CUDA kernel for beam_search_op.
* Use multiple CUDA threads in the same block to select the top beam.
* Update the python api of beam_search op.
* Enable extend function in CPU kernel of beam_search op.
* Unify the CUDA codes.
test=develop
* Unify the CPU kernel of beam_search op.
* Ensure the seletced items of beam_search_op's CPU kernel sorted by scores.
* Update the description of beam_search in API.spec.
* Enable the use of CUDA kernel in beam_search op.
* Exclude the beam_search's CUDA unittest when there is no CUDA gpu, and delete some debuging statements.
test=develop
* Follow comments.
test=develop
* Call the CPU kernel for beam_search op when batch_size > 4.
test=develop
* Remove the except of is_empty op in PrepareData.
test=develop
6 years ago
tink2123
78145c7dff
modified some comments
...
test=develop
6 years ago
nhzlx
027d24c831
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_int8_ultimate_version
6 years ago
chengduo
bf91d11ed5
Clean elementwise_op_function ( #15502 )
...
test=develop
6 years ago
tangwei12
5cfc40dea8
nce add check sample lables, test=develop ( #15463 )
...
* nce add check sample lables, test=develop
6 years ago
tink2123
e448bdb298
modified some comments
...
test=develop
6 years ago
tink2123
88744e4ab8
fixed some errors
...
test=develop
6 years ago
jerrywgz
9eb2d7b3e1
refine code, test=develop
6 years ago
jerrywgz
6dfd789bfc
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refine_nms
6 years ago
jerrywgz
6928f8318f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_axis_for_boxcoder
6 years ago
jerrywgz
e60c8438fc
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_clip_op
6 years ago
tink2123
48cc484643
add align_corners and align_mode for image_resize
...
test=develop
6 years ago
jerrywgz
11f1baa406
refine code, test=develop
6 years ago
Zhaolong Xing
b7b68f2a8c
Merge pull request #15461 from NHZlX/fix_trt_stream_bug
...
fix trt stream bug.
6 years ago
tangwei12
8b50ad80ff
checkpoint at distributed training ( #14854 )
...
checkpoint for distributed training.
6 years ago
jerrywgz
57e5f61ec8
add gpu kernel, test=develop
6 years ago
jerrywgz
cc53453057
add comment and refine code, test=develop
6 years ago
qingqing01
07dc5a1506
Add generate_mask_labels_op to support Mask-RCNN and refine some code. ( #15371 )
...
* Add generate_mask_labels_op to support Mask-RCNN.
* Refine sigmoid_cross_entropy to support nomalize mode.
* Fix generator_proposals_label.
* Use DeviceTemporaryAllocator in roi_pool and roi_algin.
* Remove shape check in data_feeder.
6 years ago
Yiqun Liu
eaad3e4c3d
Add check of input in sequence_expand op. ( #15466 )
...
* Add check of input in sequence_expand op.
test=develop
* Correct the unittest of sequence_expand op.
test=develop
6 years ago
gongweibao
f4dec5cdee
Check collective server's data. ( #15449 )
6 years ago
jerrywgz
c12a969bd4
refine comment and unittest, test=develop
6 years ago
chengduo
5a8bd82c0c
Remove workspace_handle ( #15376 )
...
* remove workspace_handle
test=develop
* set constant for loss
test=develop
6 years ago
jerrywgz
1c558ad388
add gpu kernel for box clip, test=develop
6 years ago
nhzlx
5b92ddabe2
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_trt_stream_bug
...
test=develop
6 years ago
nhzlx
2f4aee361a
fix comments
...
test=develop
6 years ago
nhzlx
ec213730bc
fix trt stream bug.
...
BUG: After continuing to input different data, the output cannot be aligned
test=develop
6 years ago
wopeizl
a8aa79130b
Merge pull request #15453 from wopeizl/fix15313
...
fix pr 15313
6 years ago
gongweibao
7f8b40f68d
Fix brpc complation error. ( #15451 )
6 years ago
jerrywgz
0d4b60ab8b
add lod for slice op, test=develop
6 years ago
dzhwinter
8f3b252392
squash commits. test=develop
6 years ago
peizhilin
e6a3a3a31a
fix pr 15313
...
test=develop
6 years ago
jerrywgz
66bb5dd760
refine infer shape, test=develop
6 years ago
tensor-tang
266e625d2e
Merge pull request #15399 from tensor-tang/refine/seqpool/fc
...
fix cpu jitkernel test and refine benchmark test
6 years ago
Qiao Longfei
45578c1b48
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader
6 years ago
Yan Chunwei
885c4e57ab
fea/infer memory optim2 ( #14953 )
6 years ago
jerrywgz
0d91507859
fix share lod, test=develop
6 years ago
Tao Luo
6597ccb01f
Merge pull request #15413 from luotao1/legacy_code
...
remove legacy code
6 years ago
Dun
9f8f0fc2d3
Memory optimization of depthwise conv op and group norm op ( #15313 )
...
* mem opt
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine with cub test=develop
* fix mkldnn test && remove comments && test=develop
* polish code && test=develop
* add only_forward test && test=develop
6 years ago
jerrywgz
5246285e34
test=develop
6 years ago
jerrywgz
b10d84bc5a
fix bug when run on GPU, test=develop
6 years ago
whs
530869f829
Share LoD from Input(Rois). ( #15420 )
...
test=develop
6 years ago
gongweibao
7ab4af2716
Fix brpc compilation. ( #15417 )
6 years ago
Dun Liang
e5004f3c1c
fix ci && test=develop
6 years ago
tensor-tang
316e44b1b7
fix unused warnings
...
test=develop
6 years ago
Wu Yi
7e651a38dd
fix mac cmake version 3.13 build ( #15386 )
...
* fix mac cmake version 3.13 test=develop
* fix again test=develop
6 years ago
jerrywgz
b62a17bbae
add nms api
6 years ago
tensor-tang
579d758254
fix jitkernel tests and refine benchmark
...
test=develop
6 years ago
jerrywgz
f660553d77
enhance nms for mask rcnn, test=develop
6 years ago
shippingwang
14f2a1060d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into shufflechannel
6 years ago
jerrywgz
88ee56d0b2
enhance nms for mask rcnn
6 years ago
zhaozhehao
e2ba9668b4
Tree conv op ( #15217 )
...
* refactor tree2col operator with new memory mechanism test=develop
* test=develop
* test=develop
* Modified API according to panyx0718 test=develop
* fix API change according to heavengate test=develop
* Modify API comment test=develop
6 years ago
Tao Luo
3ede8b67e6
update CMakeLists.txt
6 years ago
Yiqun Liu
f413b6892b
Revert the modification of while_op in #14764 . ( #15372 )
...
* Revert the modification of while_op in #14764 .
test=develop
* Remove the dependency of GRPC_DEPS.
test=develop
6 years ago
jerrywgz
ab9d6a4f39
add comments, test=develop
6 years ago
jerrywgz
10dd3b37ad
add axis for box coder op
6 years ago
乔龙飞 Qiao Longfei
adba4384ec
Merge pull request #15161 from jacquesqiao/gru-add-mode
...
gru add origin mode
6 years ago
nhzlx
8817841c73
fix unit test bug
...
test=develop
6 years ago
jerrywgz
5fb2856584
test_develop
6 years ago
Xin Pan
3ecf6bb338
Merge pull request #15028 from yihuaxu/develop_641313ea7_elementwise_mul_mkldnn_bug_fix
...
Fix the exception when tensor format is x
6 years ago
jerrywgz
af448373c7
test=develop
6 years ago
nhzlx
b938324381
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into trt_int8_ultimate_version
...
test=develop
6 years ago
nhzlx
312fe0ece1
add trt int8 calibration support
...
fix comments
test=develop
6 years ago
wopeizl
994e73f685
Merge pull request #15351 from wopeizl/fixbuildissue
...
disable the parallel mode for adam op on windows test=develop
6 years ago
jerrywgz
481d8bce2f
add box clip op
6 years ago
Yiqun Liu
568cc2ffa8
Optimize while_op for test ( #14764 )
...
* Simplify the compare op for CPU.
* Use asynchronous tensor copy in reshape_op's kernel.
* Optimize while_op for test, avoiding creating variables every time.
test=develop
* Enable the cache of kernel type and kernel function.
test=develop
* Enable profiling with gperftools.
* Remove flags for testing, and fix the linking error.
test=develop
* Delete the codes of ChooseKernel.
test=develop
* Fix bug when preparing ExecutorPrepareContext for while_op.
* Fix missing depending on grpc libraries.
* Remove the redundant print.
test=develop
* Follow comments.
* Remove the codes related to prepare the ExecutorPrepareContext for while_op.
test=develop
6 years ago
tensor-tang
3759c1db8c
Merge pull request #14805 from mozga-intel/mozga-intel/element_wise_operator_ngraph
...
Enable element_wise_add operator for a ngraph engine
6 years ago
tensor-tang
904a39239d
Merge pull request #15254 from mozga-intel/mozga-intel/softmax_operator_ngraph
...
Enable softmax operator for a ngraph engine
6 years ago
peizhilin
cd562f8fb7
disable the parallel mode for adam op on windows test=develop
6 years ago
Xin Pan
16cb3ebd68
Merge pull request #15268 from xiaolil1/pool-int8
...
Enhance key generation for Pool INT8 test
6 years ago
tensor-tang
a7fc3d42a0
Merge pull request #15304 from tensor-tang/fuse/second_order_mul_sub
...
Fuse/second order mul sub and fuse repeated fc relu
6 years ago
mozga-intel
cba729404d
Enable softmax operator for a ngraph engine
...
test=develop
6 years ago
Qiao Longfei
cd31b90a46
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader
...
test=develop
6 years ago
Qiao Longfei
8c516a24e5
remote min_row_size_to_use_multithread in adam interface test=develop
6 years ago
Qiao Longfei
9b4fe283e1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
6 years ago
Qiyang Min
3f687765e6
Merge pull request #15281 from velconia/fix_expand_op_compile_time
...
Fix expand op compile time bug
6 years ago
minqiyang
c4cf5967db
Change backward op infershape
...
test=develop
6 years ago
tensor-tang
84b0ecdcce
Merge remote-tracking branch 'ups/develop' into fuse/second_order_mul_sub
...
test=develop
6 years ago
chengduo
46d01d798e
Revert "Revert "Remove workspace_handle in conv_cudnn ( #15186 )"" ( #15290 )
...
test=develop
This reverts commit 358e657f68.
6 years ago
Qiao Longfei
4d15515c40
fix gru_gpu_kernel test=develop
6 years ago
tensor-tang
93e75c5ae5
refine jitcode of vsub and vsquare
...
test=develop
6 years ago
tensor-tang
d618e48309
fix fuse square mat order and refine test
...
test=develop
6 years ago
Qiao Longfei
4feae25378
fix build problem test=develop
6 years ago
tensor-tang
38de1ff472
add fusion squared mat sub op
6 years ago
Qiao Longfei
e641ffe77b
change interface and api spec for dynamic_gru test=develop
6 years ago
tensor-tang
09c5786e22
add square jitkernel
6 years ago
Qiao Longfei
4c7be265d3
update avx gru grad kernel test=develop
6 years ago
Qiao Longfei
9b16e54064
update gru_grad_op
...
test=develop
6 years ago
Qiao Longfei
e477d789a1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gru-add-mode
6 years ago
tensor-tang
f347d6e4a1
add repeated fc relu unit test
...
test=develop
6 years ago
tensor-tang
99010e6eae
init repeated fc relu op
6 years ago
tensor-tang
266a5d2f52
implement matmul refer and mkl kernel
6 years ago
tensor-tang
c5623c87a3
init jit matmul kernel
6 years ago
Xin Pan
a1bfb35dd6
try fix py2
...
test=develop
6 years ago
Dun Liang
a900015c03
add async copy and pinned place
6 years ago
colourful-tree
576c740d5d
Merge pull request #14964 from colourful-tree/data_norm
...
add data norm op
6 years ago
colourful-tree
d5a8909131
Merge pull request #14950 from colourful-tree/develop
...
add teacher student sigmoid loss
6 years ago
minqiyang
bc3e0d6e01
Fix expand op compile time bug
...
test=develop
6 years ago
chengduozh
358e657f68
Revert "Remove workspace_handle in conv_cudnn ( #15186 )"
...
test=develop
This reverts commit 064512aa47.
6 years ago
tensor-tang
fc9fbab6a0
Merge pull request #15271 from tensor-tang/fix/typo
...
fix typo and refine
6 years ago
chengduo
064512aa47
Remove workspace_handle in conv_cudnn ( #15186 )
...
* remove workspace_handle in conv2d_cudnn
test=develop
* remove workspace_handle
test=develop
* fix bug
test=develop
* make test_conv2d_op SERIAL
test=develop
* save memory in conv_cudnn
test=develop
* enhance thread safety
test=develop
* enhance temporary allocator
test=develop
* Add excess fraction
test=develop
* follow comments
test=develop
* fix bug and code refine
test=develop
* fix memory size check
test=develop
* rename reuse_tmp_allocation_excess_fraction
test=develop
6 years ago
tensor-tang
c3a9f3c4b2
fix typo and refine
...
test=develop
6 years ago
tensor-tang
146e942c65
Merge pull request #15250 from tensor-tang/refine/seqpool/feed
...
Refine/seqpool/feed with infer zerocopytensor
6 years ago
xiaolil1
8f17c714de
Conv int8 residual ( #15145 )
...
* Enable basic MKL-DNN INT8 Conv OP
test=develop
* Modify test case
test=develop
* Clean unittest code
test=develop
* Fix test
test=develop
* Modify test
test=develop
* Enable MKL-DNN INT8 Conv with Relu Fusion OP
test=develop
* Enable INT8 Conv with residual fusion OP
test=develop
* Modify code.
test=develop
* Modify basic INT8 Conv
test=develop
* Modify Conv.
test=develop
* fix style
test=develop
* Fix style
test=develop
* Fix test
test=develop
* Modify code.
test=develop
* Fix test
test=develop
6 years ago
xiaoli.liu@intel.com
f34e779f4d
Enhance key generation for INT8 test.
...
test=develop
6 years ago
Wu Yi
fd85418329
[Feature] support mix precision training for resnet ( #14899 )
...
* clip softmax for fp16
* updates
* fuse xent support fp16 test=develop
* wip
* wip
* add simple row reduce
* wip fp16 accurate softmax
* add accurate softmax kernel for fp16 test=develop
* update test=develop
* fix cpu build test=develop
* update api.spec test=develop
* follow comments test=develop
* fix build test=develop
* fix trt build test=develop
* fix inference build test=develop
* fix merge test=develop
* update test=develop
* try fix build test=develop
* fix build test=develop
* rename real_exp test=develop
* fortest
* remove hacky kernels test=develop
* clean up test=develop
6 years ago
tensor-tang
ce909664d8
Merge remote-tracking branch 'ups/develop' into refine/seqpool/feed
6 years ago
乔龙飞 Qiao Longfei
5e74c4e88f
Merge pull request #15100 from jacquesqiao/fix-dist-sparse-decay
...
fix dist sparse l2 decay
6 years ago
tensor-tang
8e086a8521
follow comment and fix typo
...
test=develop
6 years ago
Qiao Longfei
653cd31971
remote unused code
6 years ago
Qiao Longfei
0a79d7a404
fix merge
6 years ago
Qiao Longfei
422449a945
fix style
6 years ago
Qiao Longfei
edad60e612
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-cpp-reader
6 years ago
nhzlx
4e3522e5b4
add trt int8 support
...
test=develop
6 years ago
Qiao Longfei
d0e3b24002
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay
...
test=develop
6 years ago
tensor-tang
f8c305b243
Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat_2
...
test=develop
6 years ago
tensor-tang
223c61ca5e
Merge pull request #15170 from tensor-tang/jit/seqpool
...
refine seqpool op
6 years ago
Qiao Longfei
c3b9edf958
follow comment test=develop
6 years ago
Zeng Jinle
e29f10d315
Merge pull request #15207 from sneaxiy/remove_op_handle_lock_and_fix_var
...
Remove op handle lock and fix var
6 years ago
mozga-intel
eff90eb941
PADDLE_WITH_NGRAPH was removed from the code
...
test=develop
6 years ago
mozga-intel
a42f8f4f6f
Enable element_wise_add operator for a ngraph
...
test=develop
6 years ago
mozga-intel
e4184008a4
PADDLE_WITH_NGRAPH was removed from the code
...
test=develop
6 years ago
Qiao Longfei
3ace486ebd
fix sum_op selected rows test=develop
6 years ago
tensor-tang
f702f8fd10
Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat
6 years ago
Qiao Longfei
b16e832d4d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-dist-sparse-decay
6 years ago
sneaxiy
ed409ac9f4
Revert "Revert "Remove op handle lock""
...
test=develop
6 years ago
Tao Luo
4d9aa1745a
Merge pull request #14806 from mozga-intel/mozga-intel/scale_operator_ngraph
...
Enable scale operator for a ngraph engine
6 years ago
Tao Luo
dc0c221426
Merge pull request #14803 from mozga-intel/mozga-intel/mean_operator_ngraph
...
Enable mean operator for a ngraph engine
6 years ago
Zeng Jinle
dacfaaa966
Revert "Remove op handle lock"
...
test=develop
6 years ago
Qiyang Min
317840d3ba
Merge pull request #14277 from velconia/add_fused_emb_seq_pool_op
...
Add fused emb seq pool op
6 years ago
tensor-tang
2dd331cc21
Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat
...
test=develop
6 years ago
tensor-tang
316636404f
add seqpool concat unit test
6 years ago
xiaolil1
c8f101e5da
Conv int8 relu ( #15130 )
...
* Enable basic MKL-DNN INT8 Conv OP
test=develop
* Modify test case
test=develop
* Clean unittest code
test=develop
* Fix test
test=develop
* Modify test
test=develop
* Enable MKL-DNN INT8 Conv with Relu Fusion OP
test=develop
* Modify basic INT8 Conv
test=develop
* fix type
test=develop
* Modify test
test=develop
6 years ago
tensor-tang
7923d7271f
add fusion seqpool concat op
6 years ago
Zeng Jinle
f3a13512fc
Merge pull request #15139 from sneaxiy/remove_op_handle_lock
...
Remove op handle lock
6 years ago
Qiao Longfei
44b300556d
change min_row_size_to_use_multithread to parameter of adam
...
test=develop
6 years ago
Qiao Longfei
87b4eb1da4
change min_param_size_to_use_multithread to min_row_size_to_use_multithread
6 years ago
minqiyang
0f94c1ac14
Polish code
...
test=develop
6 years ago
minqiyang
c09a379015
remove const_cast
...
test=develop
6 years ago
tensor-tang
102d93712e
Merge remote-tracking branch 'ups/develop' into jit/seqpool
...
test=develop
6 years ago
tensor-tang
123b98f417
refine heigth and codesize and support all pool
...
test=develop
6 years ago
tensor-tang
0145f40f45
use height from params of jitcode
6 years ago
tensor-tang
e0591deebc
enhance seqpool jitcode
6 years ago
Zeng Jinle
99e6e8b00f
Merge pull request #15179 from sneaxiy/fix_crf_grad_lod
...
Fix crf grad lod share
6 years ago
minqiyang
db8eb9b688
Polish code
...
test=develop
6 years ago
minqiyang
39b98709b1
Move fused ops to fused dir
...
test=develop
6 years ago
minqiyang
920d4a8b78
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_fused_emb_seq_pool_op
...
test=develop
6 years ago
乔龙飞 Qiao Longfei
7c891e1ecc
Merge pull request #15111 from jacquesqiao/fix-adam-tmp-var
...
Fix adam tmp var on cpu
6 years ago
mozga-intel
e77956c920
Enable mean operator for a ngraph
...
test=develop
6 years ago
mozga-intel
dd768714ab
Enable scale operator for a ngraph
...
test=develop
6 years ago
sneaxiy
be425461a1
fix crf grad lod share
...
test=develop
6 years ago
Qiao Longfei
3e1b914fcb
update gru op forward kernel
6 years ago
Qiao Longfei
7a81ab8607
complete gru_unite_op and test
6 years ago
Qiao Longfei
72618c8da5
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gru-add-mode
6 years ago
Qiao Longfei
17b1b660fc
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
6 years ago
Qiao Longfei
c15270c5b2
optimize multi thread adam
6 years ago
乔龙飞 Qiao Longfei
e1679b8847
Merge pull request #14893 from JiabinYang/feature/add_prefech_hs
...
Feature/add prefech hs
6 years ago
tensor-tang
92201d3956
support avg and sqrt pool and add mkl impl
...
test=develop
6 years ago
tensor-tang
c50060bb26
add jitcode impl and use it
6 years ago
tensor-tang
142bb41748
add seqpool jitkernel test and benchmark
6 years ago
tensor-tang
e58a569c6c
use seqpool jitkernel
6 years ago
tensor-tang
3e01a4048f
add refer seqpool jitkernel
6 years ago
xiaolil1
bbc9336878
Enable basic MKL-DNN INT8 Conv OP ( #15124 )
...
* Enable basic MKL-DNN INT8 Conv OP
test=develop
* Modify test case
test=develop
* Clean unittest code
test=develop
* Fix test
test=develop
* Modify test
test=develop
* Modify basic INT8 Conv
test=develop
6 years ago
Qiao Longfei
e10af895de
update gru grad op
...
test=develop
6 years ago
Qiao Longfei
78ec7c0f99
gru add origin mode
...
test=develop
6 years ago
Yan Xu
a1e60ab19b
Merge pull request #14791 from Yancey1989/parallel_graph_mode
...
[Feature] Add ParallelGraph executor mode in parallelexecutor to improve performance
6 years ago
Qiao Longfei
0e747e8d02
change the limit of thead num
6 years ago
qingqing01
c981bf0f9d
Fix compling error with cuDNN v5 ( #15148 )
...
test=develop
6 years ago
wopeizl
67093da398
Merge pull request #15122 from wopeizl/windows/fixhuberloss
...
fix the huber loss compile issue on windows test=develop
6 years ago
sneaxiy
d0a8a1e950
remove_op_handle_lock
...
test=develop
6 years ago
Xin Pan
087af6a686
Merge pull request #15131 from panyx0718/clean
...
hide temp tensor allocation
6 years ago
Yancey1989
e65436103f
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
...
test=develop
6 years ago
sneaxiy
6f06e6cdac
Merge remote origin
...
test=develop
6 years ago
xiaolil1
8eb1f26211
Enable INT8 pool OP ( #15046 )
...
* Enable INT8 pool OP
test=develop
* fix unittest
test=develop
* Clean unittest code.
test=develop
6 years ago
Xin Pan
9186451f60
hide GetTensor
...
test=develop
6 years ago
peizhilin
dba009dbbf
fix script issue
...
test=develop
6 years ago
peizhilin
cd2d60b4c8
fix build issue for density prior box op on windows test=develop
6 years ago
peizhilin
1f423f84ac
fix the huber loss compile issue on windows test=develop
6 years ago
sneaxiy
d25395fc98
remove tensor core lock
...
test=develop
6 years ago
peizhilin
b3688100ad
fix unittest
...
test=develop
6 years ago
peizhilin
5d8f281397
restore the memory mode
...
test=develop
6 years ago
peizhilin
33b7821a75
fix save and load ops on windows test=develop
6 years ago
Qiao Longfei
dfe85fb358
fix build
6 years ago
Qiao Longfei
f057bbd1d1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-adam-tmp-var
6 years ago
Qiao Longfei
f1c973b014
adam op should not create tmp var in compute
6 years ago
Yancey1989
0a885ac12a
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
...
test=develop
6 years ago
shippingwang
83f2e2c903
rewrite the comments, test=develop
6 years ago
gongweibao
ce70229ba6
Add max_body_size flags to brpc ( #15084 )
6 years ago
qingqing01
6f0a1d7b47
Inception fusion operator. ( #14968 )
...
* Inception fusion operator.
* Support horizontal layer fusion in conv_fusion_op.
* Search conv algo strategy for variable-length input.
search N times and cache the searched algos. For other input, choose the algo of input whose area is closest to this input.
6 years ago
Qiao Longfei
25d44d40ac
sum op support empty selected rows as input
6 years ago
Zeng Jinle
25b49a0896
Merge pull request #14933 from sneaxiy/rewrite_ddim
...
Rewrite ddim
6 years ago
Wu Yi
a8bc05b5ff
Refactor distributed RPC ( #15075 )
...
* wip
* wip
* refactor no.1 dir structure test=develop
* fix linking test=develop
* fix includes test=develop
* fix build test=develop
* fix build test=develop
6 years ago
Xin Pan
3e8408429d
Merge pull request #15053 from panyx0718/imperative_hold
...
refactor to avoid scope.
6 years ago
sneaxiy
73896eeb94
merge develop
...
test=develop
6 years ago
Yancey1989
4743c9cd5d
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
Xin Pan
f7294f8b25
register float16
...
test=develop
6 years ago
Zhaolong Xing
4048cfa9da
Merge pull request #15048 from NHZlX/add_affine_channel_fuse
...
Add conv+ affine channel fuse pass
6 years ago
Zeng Jinle
c0bcff00dc
Merge pull request #14962 from sneaxiy/rewrite_variable_type
...
Rewrite variable type
6 years ago
Qiao Longfei
d161215332
optimize adam multi thread
6 years ago
wopeizl
719ebe3786
Merge pull request #15070 from wopeizl/windows/testcasefix
...
fix test issues on windows
6 years ago
Qiao Longfei
7a58ad5c79
lazy mode have higher priority then multithread
...
test=develop
6 years ago
Xin Pan
f52b514dcd
call kernel
6 years ago
Xin Pan
7b6bf9ddf2
make fill_constant kernel-based
...
test=develop
6 years ago
Xin Pan
61491ce250
clean
...
test=develop
6 years ago
Xin Pan
ce7e503cbe
refactor to avoid scope.
...
test=develop
6 years ago
Qiyang Min
0238a3bb4f
Merge pull request #14972 from velconia/accelerate_lstm
...
Accelerate PADDLE_ENFORCE
6 years ago
Houjiang Chen
242d3c71a6
Merge pull request #15031 from hjchen2/develop
...
Fix conv_elementwise_add2_act pass
6 years ago
Qiao Longfei
d0572bf02e
add log for lazy mode test=develop
6 years ago
Qiao Longfei
1177b0bc84
update multi thread adam
6 years ago
Qiao Longfei
3b294e2e2e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
6 years ago
Zeng Jinle
988bc2b5a7
Merge pull request #15060 from dzhwinter/fix/nccl
...
fix ci error. test=develop
6 years ago
sneaxiy
c4ce2e7b21
merge develop, solve conflict
...
test=develop
6 years ago
shippingwang
9322d34032
Fix, test=develop
6 years ago
sneaxiy
b56aca82e9
merge develop
...
test=develop
6 years ago
jerrywgz
ef2d292bfc
Merge pull request #14956 from jerrywgz/fix_bug_in_ifelse
...
fix bug in if-else op
6 years ago
peizhilin
e49276e731
restore the huber_loss_op
...
test=develop
6 years ago
Yancey1989
86bb583881
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
peizhilin
01c00b07dd
fix test issues on windows
...
test=develop
6 years ago
tangwei12
dc8eca826e
code style fix, test=develop ( #15045 )
...
* code style fix, test=develop
6 years ago