phlrain
b65722d3cf
fix unit test; test=develop
7 years ago
tangwei12
618f7620e2
add enforce for auc ( #14687 )
...
* add enforce for AUC, test=develop
7 years ago
phlrain
2770ea1a73
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
chengduozh
3f4aca618f
code refine
...
test=develop
7 years ago
chengduozh
af8c2cec13
fix operator.cmake
...
test=develop
7 years ago
chengduozh
679d8fc6fe
rename op name
...
test=develop
7 years ago
jerrywgz
3df0538940
replace -100 with kIgnoreIndex
7 years ago
Wang Guibao
41e19eb431
AsyncExecutor ( #14627 )
...
* AsyncExecutor: C++ side
* Google naming conventions
* Rename MultiExecutor to AsyncExecutor
* pybind with async_executor
* Naming convention
* remove some flags and unused code
* add refactored file of async_executor and data_feed
* clear async executor interface and add data feed factory
* split async executor into executor_thread_worker and async_executor, refactor pybind, add datafeed and corresponding proto
* Fix async_executor interfaces: 1) Remove all protobufs; 2) Stop after each epoch
* refine async_executor_refactor.cc
* add some files about datafeed
* Revert "add some files about datafeed"
This reverts commit 8ee8133ab841196925a2812b76f18d2812a6701d.
* Interface rework
* add MultiSlotDataFeed
* Creating DataFeedDesc from .proto file, then manipulate it (add/del fields etc) from python side
* update data_feed for add MultiSlotDataFeed
* update datafeed and async_executor to run bow_net demo
* fix bug that finish_set_filelist failed in multithread
* delete finish_binding_memory_(flag), because it can not be marked under the current interface
* Fix bug
* update async_executor.py for support set_use_slots
* update async_executor.py for support set_use_slots and set set_dense_slots
* fix bug that when the number of files is less than the number of threads, it will fetch nan
* remove redundant code, and make executor exit when set a illegal queue size
* add batch_size check
* add MultiSlotDesc
* Revert "add MultiSlotDesc"
This reverts commit 2e72ebfad364ed6b5dcc75f38ffb2a1fdec83d8e.
* add some checkpoint in DataFeedDesc
* add CheckFile function in MultiSlotDataFeed
* update something error info
* fix deadlock bug
* Fix fetch variable
* Merge error
* fix code style in async_executor
* using one lock blocking queue replace two lock blocking queue because of some bugs
* update code style
* add utest for data_feed
* Fix fetch var
* update utest for data_feed for multithread
* update SetFileList info
* fix bug in utest of data_feed
* Add comments for python
* Add comments for python code
* Fix pybind.cc with new pybind11 version
* add note for DataFeedDesc's set_use_slots function
* Add save_model
* update data_feed_test for multi-type
* add comment for executor_thread_worker
* Remove unused code
* update data_feed_test for generate test data file
* removed unnecessary interfaces and add comments
* c++ style check
* update data_feed.cc
* AsyncExecutor: C++ side
Google naming conventions
Rename MultiExecutor to AsyncExecutor
pybind with async_executor
Naming convention
remove some flags and unused code
add refactored file of async_executor and data_feed
clear async executor interface and add data feed factory
split async executor into executor_thread_worker and async_executor, refactor pybind, add datafeed and corresponding proto
Fix async_executor interfaces: 1) Remove all protobufs; 2) Stop after each epoch
refine async_executor_refactor.cc
add some files about datafeed
Revert "add some files about datafeed"
This reverts commit 8ee8133ab841196925a2812b76f18d2812a6701d.
add MultiSlotDataFeed
Interface rework
Creating DataFeedDesc from .proto file, then manipulate it (add/del fields etc) from python side
update datafeed and async_executor to run bow_net demo
update async_executor.py for support set_use_slots
Fix bug
update async_executor.py for support set_use_slots and set set_dense_slots
fix bug that when the number of files is less than the number of threads, it will fetch nan
remove redundant code, and make executor exit when set a illegal queue size
add MultiSlotDesc
Revert "add MultiSlotDesc"
This reverts commit 2e72ebfad364ed6b5dcc75f38ffb2a1fdec83d8e.
add some checkpoint in DataFeedDesc
Fix fetch variable
fix code style in async_executor
Fix fetch var
add utest for data_feed
Add comments for python
update utest for data_feed for multithread
fix bug in utest of data_feed
Add comments for python code
Fix pybind.cc with new pybind11 version
add note for DataFeedDesc's set_use_slots function
update data_feed_test for multi-type
Add save_model
update data_feed_test for generate test data file
removed unnecessary interfaces and add comments
add comment for executor_thread_worker
Remove unused code
update data_feed.cc
c++ style check
* commit for code style
* commit for code style
* commit for code style
* commit for code style
* Comment away __init__ in async_executor.py
* clang-format fix test=develop
* use PADDLE_THROW instead of exit(-1); use unique_ptr to manage scope var in data_feed_test.cc
* commit for update code style
* commit for update code style
* Add async_executor demo; Remove some methods
test=develop
* commit for update code style
* commit for update code style
* commit for update code style
* update API.spec
* AsyncExecutor
test=develop
* AsyncExecutor
test=develop
* AsyncExecutor
test=develop
* AsyncExecutor
test=develop
* Fix API.spec
test=develop
* Fix API.spec
test=develop
* Fix windows build error
test=develop
* Fix windows build error
test=develop
* Fix windows build error
test=develop
* Fix windows build error
test=develop
* Fix Windows Build
test=develop
* Fix Windows Build
test=develop
* Fix Windows Build
test=develop
* Fix code style
test=develop
* Fix code style
test=develop
* update datafeed
* Fix code style
test=develop
* update data_feed_test for test Tensor test=develop
* Fix code style
test=develop
* Fix windows build failure
test=develop
* Fix code style and windows build failure
test=develop
* Fix PYTHON3.5 build failure
test=develop
* AsyncExecutor API
test=develop
7 years ago
whs
1b9753d109
Make pad2d support for variable paddings. ( #14667 )
...
* Make pad2d support for variable paddings.
test=develop
* Rename get_paddings and add inline modifier.
test=develop
* Fix comments.
7 years ago
luotao1
bcc90123f0
speedup box_coder_op for multi-threads
...
test=develop
7 years ago
phlrain
6ce4250172
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
Qiao Longfei
44debca844
Merge pull request #14589 from jacquesqiao/refactor-prefetch
...
Refactor prefetch
7 years ago
phlrain
bd94ab0ef3
rename op; test=develop
7 years ago
phlrain
92f5be1d82
remove inputvarname in operator; test=develop
7 years ago
phlrain
cf1fe61004
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
phlrain
d1a17cadd4
fix cudnn rnn; test=develop
7 years ago
Tao Luo
20120d9c97
Merge pull request #14608 from jczaja/prv-conv2d-transpose-mkldnn
...
[MKL-DNN]conv2d transpose
7 years ago
Qiao Longfei
3e45a5a5ec
lookup_table gpu kernel support prefetch
...
test=develop
7 years ago
qingqing01
731d45a39a
Enable BatchNorm to use global mean and variance during training ( #14630 )
...
* Enable BatchNorm to use global mean and variance during training
* Update doc and follow comments.
7 years ago
Tao Luo
ea47685f91
Merge pull request #14646 from jczaja/prv-softmax-mkl-sasum
...
Softmax for inference MKL further changes
7 years ago
Qiao Longfei
3a3cfc2d8d
prefetch support gpu
...
test=develop
7 years ago
Qiao Longfei
4b9082a4cd
follow comment
7 years ago
chengduo
6776e92846
refine tensor_array_write_read ( #14643 )
...
test=develop
7 years ago
Jacek Czaja
48e1b97e8e
- Coding style fixes
...
test=develop
7 years ago
Qiao Longfei
d32de7e6e1
fix code format test=develop
7 years ago
Qiao Longfei
5a660aee7d
update log level in parameter prefetch test=develop
7 years ago
Qiao Longfei
8ebde595c9
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
...
test=develop
7 years ago
Qiao Longfei
b9d3d75fc4
fix prefetch dependency test=develop
7 years ago
Qiao Longfei
145c535750
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
...
test=develop
7 years ago
minqiyang
9d7c3b18c0
Polish code
...
test=develop
7 years ago
minqiyang
2b430adaee
Polish code
...
test=develop
7 years ago
minqiyang
a02ce58f2c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
...
test=develop
7 years ago
Jiabin Yang
12e1719f96
Merge pull request #14352 from JiabinYang/enhance_hierachical_sigmod_op
...
Enhance hierarchical sigmoid op
7 years ago
Qiao Longfei
40f68b1349
unit test ready
7 years ago
Qiao Longfei
36e26a53b0
Optimize bilinear tensor product op ( #14485 )
...
* optimize bilinear_tensor_product
* add set zero to set grad to 0.
7 years ago
Tao Luo
4ec9de0122
Merge pull request #14628 from Sand3r-/mgallus/mkldnn-elementwise_mul
...
EltwiseMul: Changes from previous PR
7 years ago
Qiao Longfei
35b79ab865
Merge pull request #13983 from jacquesqiao/add-ctr-reader
...
Add ctr reader
7 years ago
Qiao Longfei
da387720d7
fix infer compile test=develop
7 years ago
Jacek Czaja
cf40daee58
- Building fix to softmax for inference
7 years ago
Clementine
6c71c1f8f9
Add activation gelu ( #14569 )
7 years ago
Michal Gallus
9455be0ba5
EltwiseMul: Extract StringToFormat to MKLDNN helper
...
test=develop
7 years ago
Jacek Czaja
1540df51cf
- Fix to test_conv2d_transpose_mkldnn for GPU
...
test=develop
7 years ago
JiabinYang
eda069068d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into enhance_hierachical_sigmod_op
7 years ago
JiabinYang
a08dc83eb0
remove arg 'non_leaf_num', test=develop
7 years ago
chengduo
6648f5ed6f
add ShareLoD for dropout_grad ( #14616 )
...
test=develop
7 years ago
JiabinYang
c469334cfb
polish python code and comment, test=develop
7 years ago
Qiao Longfei
92afbb923c
fix compile problem test=develop
7 years ago
Qiao Longfei
97cbec9b74
clean code
7 years ago
Qiao Longfei
1edd435da6
fix ci problem test=develop
7 years ago
JiabinYang
87648f8edf
merge develop, test=develop
7 years ago
wopeizl
db9284ecde
Merge pull request #14617 from wopeizl/windows/online
...
Windows/online
7 years ago
JiabinYang
c3c3c0b33c
polish code, test=develop
7 years ago
Jacek Czaja
8bfa1fa9bb
- ASUM MKL integration
7 years ago
phlrain
487ee36aec
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
tangwei12
56a4912b76
Make NCE_OP more efficient and support SelectedRows ( #14469 )
...
* Fix truncated normal.
* Fix.
* Make nce support more distribution.
* Fix API.spec.
* Fix python API.
* Fix.
test=develop
* Fix API.spec
test=develop
* Fix sampler.
* Fix order of arguments in python API.
test=develop
* NCE add selectedrows support
* NCE update weighted sampling
* fix bugs in nce_op, and assign_value_op optimized
* fix bugs in nce_op, revert assign_value_op
* nce_op optimize
* nce_op optimize
* nce_op optimize
* add selectedRows test later
test=develop
* add selectedRows supported
* add selectedRows supported
test=develop
* add selectedRows supported
* add nce selectedRows supported, test=develop
* add nce selectedRows supported
* add nce selectedRows supported, test=develop
* fix height in nce, test=develop
* add ut
* add ut, test=develop
* make AutoGrownIndex inline
test=develop
* fix tiny error, test=develop
7 years ago
liuhongyu
1ffe41d722
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
7 years ago
Qiao Longfei
9589babe12
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
...
test=develop
7 years ago
liuhongyu
05917c3c79
add cudnn lstm; test=develop
7 years ago
Qiao Longfei
f35f3fe77a
ctr reader can not be used in windows
...
test=develop
7 years ago
peizhilin
6a85dd3278
Merge remote-tracking branch 'upstream/develop' into windows/build
...
test=develop
7 years ago
peizhilin
38715e6fd0
minor fix
7 years ago
Qiao Longfei
6bef565dac
clean code test=develop
7 years ago
Qiao Longfei
e7d1f524f3
change log level
...
test=develop
7 years ago
JiabinYang
7e4bd695e6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into enhance_hierachical_sigmod_op
7 years ago
Qiao Longfei
fe54adf70c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-ctr-reader
7 years ago
JiabinYang
b10df8bcfa
refine code and add none bias ut, test=develop
7 years ago
Kaipeng Deng
251a1bb0f4
Merge pull request #14588 from heavengate/revert_interpolate
...
fix interpolate_op incompatible. test=develop
7 years ago
Qiao Longfei
668ae9083e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-ctr-reader
7 years ago
Qiyang Min
30e47bce8b
Merge branch 'develop' into revert_vlog
7 years ago
Qiao Longfei
87e4edd2ea
fix grad_varname in remote prefetch
7 years ago
Qiao Longfei
d98c59fd2c
support none sliced variable
7 years ago
dengkaipeng
bb489d4cc9
add interp_method default bilinear. test=develop
7 years ago
dengkaipeng
78f563917c
revert interpolate_op to bilinear_interp_op & nearest_interp_op. test=develop
7 years ago
Jacek Czaja
fb24690a58
- conv2d transpose MKL-DNN
...
test=develop
- Added new header for MKLDNN reuse functionality
- Extended conv2d_transpose GetExpectedKernelType for MKL-DNN support
- Buildable conv transpose mkldnn and conv mkldnn using conv template
- Conv2d transpose roughly implemented and buildable
- Added modifications conv2d transpose MKLDNN unit tests
- Fix to UT of conv2d transpose mkldnn op
- Wrong type of MKLDNN primitive was chosen for conv2d transpose
- Hacks for conv2d transpose
- UT enabled
- Replaced copying loop with memcpy
- Draft of passing lambda into AcquireMemory
- Made reorder (IOHW->OIHW) to be called only once
7 years ago
tensor-tang
7a91271436
Merge branch 'develop' into fea/jit/rnn
7 years ago
minqiyang
be04d99fe4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
...
test=develop
7 years ago
JiabinYang
81e145764d
refine code and comments, test=develop
7 years ago
Qiao Longfei
af2f5fc824
fix some bugs
7 years ago
JiabinYang
2f6b529aff
refine code and comments, test=develop
7 years ago
minqiyang
53433d7f2e
Revert the changes of VLOG
...
test=develop
7 years ago
tensor-tang
1f0291a51e
add comments and follow comments
...
test=develop
7 years ago
tensor-tang
557229bd39
Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
Qiao Longfei
ed9fa4b301
can run
7 years ago
peizhilin
30849d1f20
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
qingqing01
6224e61fd9
Transpose-Flatten-Concat fusion operator. ( #14568 )
...
* Transpose-Flatten-Concat fusion operator.
* Add unit testing and fix bug.
7 years ago
Qiao Longfei
686d15c8e0
update grpc_variable_response
7 years ago
tangwei12
3639d99f99
Fix save and load lookup table/optimizer vars ( #14301 )
...
* fix mkdir conflict
* fix load/save lookup tables
test=develop
* add lookup_table_utils
* fix load optimize vars on pserver
* delete lookup table utils
* fix save and load lookup tables
* fix load optimizer var
* fix load optimizer var, test=develop
* fix python 3 style, test=develop
* move lookup_table_utils to contrib utils
7 years ago
peizhilin
36cd18b549
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Qiao Longfei
d827881502
fix pserver and prefetch rpc
7 years ago
Yiqun Liu
bf222f197d
Use sub scope in tensor_array_to_tensor op. ( #14524 )
...
test=develop
7 years ago
JiabinYang
02d68051db
add sparsed bias grad, test=develop
7 years ago
Qiao Longfei
5856c2f332
change Var to FindVar
7 years ago
Qiao Longfei
312b7786d9
clean code
7 years ago
Qiao Longfei
2b6c0c09d6
add unit test
7 years ago
Qiao Longfei
47280ef8b4
lookup table op support prefetch
7 years ago
gongweibao
c1bf9664cd
Add options to disable SO_REUSEPORT of grpc. ( #14269 )
7 years ago
Qiao Longfei
4ad5fd8f54
add parameter prefetch
7 years ago
Qiao Longfei
9d276fe8a8
add parameter prefetch
7 years ago
luotao1
e21edb26f6
add Set/GetCPUNumThreads api
7 years ago
Qiao Longfei
9851a53478
add prefetch part in pserver
7 years ago
JiabinYang
42470f14b7
test=develop
7 years ago
peizhilin
445fff24dc
add the bigobj option to NVCC compile
...
fix code style
7 years ago
qingqing01
36f08eef3b
CUDA kernel for density_prior_box_op. ( #14513 )
...
* CUDA kernel for density_prior_box_op.
* Support flatten to 2D.
7 years ago
tensor-tang
6a7f83d45d
enable gru jitcode and refine act and lstm jitcode
...
test=develop
7 years ago
tensor-tang
686eaf20ba
Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
peizhilin
81bd7eeff4
rollback the format
7 years ago
Qiao Longfei
1f87f263a2
clean code
7 years ago
Qiao Longfei
361cb0e078
lookup remote table can compile
7 years ago
JiabinYang
0fca16847c
temp
7 years ago
JiabinYang
e9be3366a9
test=develop
7 years ago
chengduo
00b9e9a135
Refine cublas to support CUBLAS_TENSOR_OP_MATH ( #13929 )
...
* refine cublas
test=develop
* code refine
* refine cublas
* add GEMM_EX
* add enable_cublas_tensor_op_math doc and add cublasCall
test=develop
* fix CublasCall for cuda version
test=develop
* fix error
test=develop
* fix GEMM_EX to be compatible with gcc 4.8
test=develop
* add GEMM_EX
test=develop
* make compatible with gcc4.8
test=develop
7 years ago
peizhilin
dfbac60398
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin
7c8c9dc9bf
fix unit test cases
7 years ago
tensor-tang
0c5ed5f6fc
enable peephole jitcode
...
test=develop
7 years ago
JiabinYang
3c6102a367
test=develop
7 years ago
Qiao Longfei
7c3ce2952d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into refactor-prefetch
7 years ago
Qiao Longfei
60a4f69b3c
add lookup remote table op
7 years ago
Qiao Longfei
e0b48f7e29
init lookup remote table
7 years ago
tensor-tang
e3b61cf52b
init gru jitcode and fix lstm jitcode
...
test=develop
7 years ago
tensor-tang
0f25446574
Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
7 years ago
Dun
ae7d22862b
Group Norm ( #13843 )
...
Add group normalization operator.
7 years ago
wopeizl
d9a1f3e58e
Windows/online ( #14474 )
...
* add recordio support
* disable the openblas multi-thread on windows since no support
adjust the python script
* code style
* code style
test=develop
* add create_recordio_file_reader back
* fix code style
test=develop
* fix the gtest.cmake on windows
* fix cc_test on windows
* fix the win build
test=develop
* remove fused compile support on windows
test=develop
* add the jit support
test=develop
* add the jit support, test=develop
* add the jit support, test=develop
* add the jit back
fix compile error on windows
* rollback test=develop
* test case fix
* disable DSO by default on windows
* exclude warpctc_op on windows
* exclude the dynload_warpctc out on windows
test=develop
* fix the scripts error
test=develop
* disable avx on windows by default
test=develop
* re-organize the cmake file
* disable mkl on windows by default
* add warp_ctc back
* fix the dependency
* fix the dependency
* fix the build issue on windows
* remove unsupported flag on windows
* code style
* code style
test=develop
* fix issue
* add profiler, parallel_executor back
* clean up the pre-definitions on windows
* fix build issue
* test=develop
7 years ago
JiabinYang
57a18e32a1
test=develop
7 years ago
peizhilin
bef475c92b
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Tao Luo
5d4d117edc
Merge pull request #14502 from qingqing01/cudnn5_fix
...
Fix compiling with cuDNN v5
7 years ago
Jiabin Yang
f7b55de9e5
Merge branch 'develop' into enhance_hierachical_sigmod_op
7 years ago
Yu Yang
e68c1fcd5a
Merge pull request #14522 from reyoung/feature/fix_op_header_deps
...
fix(Compile): fix depends error when compile op using cub
7 years ago
tensor-tang
3562051302
add gru refer code and remove redundant avx code
...
test=develop
7 years ago
JiabinYang
af9a3301da
test=develop
7 years ago
Zhaolong Xing
ad349e770f
Merge pull request #14452 from NHZlX/fix_avg_pool_trt_bug
...
fix avg pool trt bug
7 years ago
tensor-tang
f913860873
jitkernel lstm refer support peephole
...
test=develop
7 years ago
tensor-tang
2f9b5f2383
Merge branch 'develop' into fea/jit/rnn
7 years ago
JiabinYang
014e50c284
test=develop
7 years ago
Yu Yang
3edd32d070
fix(Compile): fix depends error when compile op using cub
...
some operators depend on cub and xxhash by header. The dependency should be declared explicitly rather than declared to pybind.
test=develop
7 years ago
Dang Qingqing
cda60311f9
Fix compiling with cuDNN v5
...
test=develop
7 years ago
peizhilin
67562a6fcd
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang
10fb4ceefc
Merge pull request #14351 from tpatejko/tpatejko/mkldnn-elementwise_mul
...
[MKLDNN][JIT][AVX512] Elementwise Mul
7 years ago
jerrywgz
13e254faed
refine code, test=develop
7 years ago
tensor-tang
b4c826c548
Merge remote-tracking branch 'ups/develop' into fea/jit/rnn
...
test=develop
7 years ago
tensor-tang
ce31deb7e9
refine refer code and add lstm refer code
...
test=develop
7 years ago
jerrywgz
79cec53111
add ignore index for sigmoid cross entropy with logits op, test=develop
7 years ago
nhzlx
e62872df8b
fix conflicts
7 years ago
tensor-tang
c2cfb03a72
add lstm jitcode
7 years ago
peizhilin
25adf970b2
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Tao Luo
1d3e9bde1e
Merge pull request #14488 from yihuaxu/develop_7a64d48f5_stack_opt
...
Optimize the stack operator
7 years ago
tensor-tang
7aa3aff338
Merge pull request #14465 from tensor-tang/fea/jit/exp
...
jitcode act support all size
7 years ago
Tao Luo
1b894e495f
Merge pull request #14437 from jczaja/prv-softmax-mkl
...
Introducing MKL to softmax for inference
7 years ago
peizhilin
3a72a634cf
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yihua Xu
a906a361be
Add the macro for NVCC (test=develop)
7 years ago
Yihua Xu
d91740acb1
Revert "Remove the remnant code (test=develop)"
...
This reverts commit be50670348.
7 years ago
Yihua Xu
be50670348
Remove the remnant code (test=develop)
7 years ago
qingqing01
9eefd2c766
Modify some infer-shape about detection operators in compile-time. ( #14483 )
...
* Modify some infer-shape in compile-time.
7 years ago
Yihua Xu
f4c869d872
Optimize the layer_norm operator with AVX intrinsic function ( #14417 )
...
* Optimize layer_norm operator with AVX intrinsic functions
* Revert the wrong modifications
* Implement the jit kernel for layer_norm operator
* Add math headfile to fix the compile issue (test=develop)
* Add math headfile to fix the compile issue (test=develop)
* Fixed the intrinsic headfile issue (test=develop)
* Fix the conflicts (test=develop)
* Revert for CUDA compiler (test=develop)
* Fixed the cuda dependency (test=develop)
* Fix the macro issues (test=develop)
7 years ago
peizhilin
ee0fd78c81
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yu Yang
f1a392a5fe
Merge pull request #13804 from sneaxiy/rewrite_allocation
...
Rewrite allocation
7 years ago
Yihua Xu
f418f552df
Merge branch 'develop' into develop_7a64d48f5_stack_opt (test=develop)
7 years ago
peizhilin
8443961a4f
add warp_ctc back
7 years ago
qingqing01
fd7e643153
Convolution fusion operator. ( #14449 )
...
* Convolution fusion operator.
* Clean code
test=develop
7 years ago
Yu Yang
98bbfc17be
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
peizhilin
4a6769da84
re-organize the cmake file
7 years ago
dengkaipeng
8ef6280c03
Add operator double support. test=develop
7 years ago
peizhilin
1aff40a4c6
exclude warpctc_op on windows
7 years ago
peizhilin
7d51a0e887
disable DSO by default on windows
7 years ago
peizhilin
b967e01cbe
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Wu Yi
d7bd0361cb
fix dist deps ( #14471 )
...
* fix dist deps test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
7 years ago
Jacek Czaja
9b0eae3023
- Removing partial specialization of softmax for inference for GPU
...
test=develop
7 years ago
peizhilin
a3e952f41d
add the jit back
...
fix compile error on windows
7 years ago
tensor-tang
a19b3225a1
fix jitcode small size
...
test=develop
7 years ago
Jacek Czaja
be80bb4f28
- Fix to GPU
...
test=develop
7 years ago
tensor-tang
4dbdfa60ef
sigmoid and tanh support all size
...
test=develop
7 years ago
tensor-tang
ccb8963705
refine exp jitcode with all size
...
test=develop
7 years ago
peizhilin
1cc23ef67d
merge from paddle:develop
7 years ago
tensor-tang
d3eae8f61b
refine relu and fix addrelu test
7 years ago
tensor-tang
4e67fe6a12
refine act and vxx with all size
7 years ago
tensor-tang
ba3eaed7a7
exp support all size
7 years ago
tensor-tang
1ffce8c0ae
fix build error on noavx
...
test=develop
7 years ago
Michal Gallus
c69c41604e
MKLDNN elementwise_mul: Move Kernel to KernelPool to avoid segfaults
...
test=develop
7 years ago
Michal Gallus
785066eb8a
MKLDNN elementwise_mul: Check if AVX512 is available
...
test=develop
7 years ago
Michal Gallus
08f63c4d12
MKLDNN elementwise_mul: Lint changes to UT & integration
...
test=develop
7 years ago
Michal Gallus
49b09327f6
MKLDNN elementwise_mul: Reorder on non-nchw input, fallback on non-16 divisible fm
...
test=develop
7 years ago
Michal Gallus
d14858e4ba
MKLDNN elementwise_mul: Parallelize mul
7 years ago
Michal Gallus
ed31936ba1
MKLDNN elementwise_mul: Support NCHW, update UT
7 years ago
Tomasz Patejko
700bcbf74f
MKLDNN elementwise_mul: h and w loops implemented in xbyak
7 years ago
Tomasz Patejko
ad09facafe
MKLDNN elementwise_mul: CPU tests initially refactored. MKLDNN mul test for broadcast added
7 years ago
Tomasz Patejko
2d73ad180a
MKLDNN elementwise_mul: simple xbyak version for AVX512
7 years ago
Tomasz Patejko
213ec37d6a
MKLDNN elementwise_add: simple initial implementation of the operator for MKLDNN format
7 years ago
Wu Yi
a2d9b34417
Refine operator cmake ( #14413 )
...
* wip simplify operator framework
* wip
* wip
* done test=develop
* clean test=develop
* fix test=develop
* fix deps test=develop
* fix cpu build test=develop
* fix tensorrt build test=develop
* fix tests test=develop
* fix test=develop
* fix cpu build test=develop
7 years ago
peizhilin
764f97deac
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin
8580b7a130
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
tensor-tang
7f17e561d7
Merge pull request #14423 from tensor-tang/fea/jit/act
...
jitcode act relu, exp, sigmoid, tanh
7 years ago
Jiabin Yang
28bd5b7bad
fix space_to_depth_op unicode problem ( #14430 )
...
* fix space_to_depth_op unicode problem
* test=develop
7 years ago
Jacek Czaja
513bb6c151
Squashing MKL based softmax for inference
...
test=develop
- Added profiling to softmax functors
- MKL based softmax inference op
- Fix to softmax computation via MKL
- cleaning
- Cosmetic fixes to softmax MKL
- Fix to ON_INFER lack of propagation
7 years ago
nhzlx
9b64aac41f
add macro for pool2dDirectCUDAFunctor
...
test=develop
7 years ago
whs
1722678258
Make nce support more distribution. ( #13549 )
...
* Fix truncated normal.
* Fix.
* Make nce support more distribution.
* Fix API.spec.
* Fix python API.
* Fix.
test=develop
* Fix API.spec
test=develop
* Fix sampler.
* Fix order of arguments in python API.
test=develop
7 years ago
nhzlx
83f8c403a7
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into fix_avg_pool_trt_bug
...
test=develop
7 years ago
nhzlx
b969116988
fix avg pool trt bug and fix cpplint
7 years ago
tensor-tang
1f00723fa3
exp, sigmoid, tanh jitcode support more size
...
test=develop
7 years ago
Qiyang Min
d971d5b875
Merge pull request #14431 from velconia/fix_expand_op_dim_in_compile_time
...
Fix expand op incorrect infer shape
7 years ago
Wu Yi
b32c13dc20
Add cudnn ctc loss ( #12366 )
...
* add cudnn ctc loss
* wip add test test=develop
* wip
* wip
* done test=develop
* move include cudnn test=develop
* test test=develop
* fix build test=develop
* fix build test=develop
* fix build on cudnn5 test=develop
* fix cudnn5 build test=develop
* fix cudnn5 build test=develop
* merge develop softmax functor change test=develop
7 years ago
tensor-tang
8cda7b3d20
Merge remote-tracking branch 'ups/develop' into fea/jit/act
...
test=develop
7 years ago
tensor-tang
e2d6eddd32
remove ComputeDeprecated
...
test=develop
7 years ago
peizhilin
6d0d5a76eb
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
dengkaipeng
f115eb0d1e
enhance api. test=develop
7 years ago
tensor-tang
64f7516aee
fix lrn on mac ( #14426 )
...
* rename and fix blas vsqr
test=develop
* update
7 years ago
Yu Yang
c8f6e70ab4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
tensor-tang
f65ddff8d1
unify act jitcode of relu, exp, sigmoid and tanh
7 years ago
tensor-tang
6a159071b6
add vtanh jitcode of size 8
7 years ago
tensor-tang
046374bcd1
add vsigmoid jitcode of size 8
7 years ago
minqiyang
560b29ccb7
Polish code
...
test=develop
7 years ago
minqiyang
21d6e8e8c8
Polish code
...
test=develop
7 years ago
minqiyang
50b6e4c6bc
Fix expand grad op infer shape
...
test=develop
7 years ago
Sylwester Fraczek
8a1eeec579
add mkldnn prop_kind phase for inference-only case to pooling and activations ( #14278 )
...
* add is_test to pooling and activations
add prop_kind support for layers activation. conv and pooling
add a pass that sets is_test to true
add transpiler version of is_test pass
test=develop
* patch test and pass
test=develop
* add pass to analyzer.h
test=develop
* add is_test attr description & pass only on mkldnn
in:
activation_op.cc
batch_norm_op.cc
conv_op.cc
dropout_op.cc
lrn_op.cc
pool_op.cc
sequence_pool_op.cc
softmax_op.cc
* fix is_test handling for activation pool and conv
* change description of is_test for all layers again
* remove GetAttr(use_mkldnn) from pass
* rename correct_mkldnn_test_phase to is_test
and remove dependency on MKLDNN
test=develop
* review fix magic number
* two if(..)s into one
* Check is_test once and pass mkldnn forward prop kind
* dereference shared_ptr with * (without get())
test=develop
* add is_test_pass back
test=develop
7 years ago
peizhilin
d1429ac4a5
add recordio support
7 years ago
chengduo
82773477ae
Add selu ( #14415 )
...
* add selu
* use for range
test=develop
* add API
test=develop
* follow comment
test=develop
* update API.spec
test=develop
7 years ago
dengkaipeng
95d5060ddd
fix abs -> fabs error. test=develop
7 years ago
minqiyang
30147d7f58
Fix expand op incorrect infer shape
...
test=develop
7 years ago
JiabinYang
ba9ff508e8
temp fix
7 years ago
Yihua Xu
03ccb9a461
Optimize the stack operator
7 years ago
dengkaipeng
2faa2b4048
remove cu file. test=develop
7 years ago
tensor-tang
ee2a7f1b8c
refine exp and fix error on avx
...
test=develop
7 years ago
tensor-tang
1e06a32a0d
add vexp jitcode of size 8
...
test=develop
7 years ago
tensor-tang
2354409601
Merge pull request #14374 from tensor-tang/fea/jit/act
...
add vrelu jitcode
7 years ago
Tao Luo
5ef123c778
Merge branch 'develop' into dam_fc
7 years ago
dzhwinter
d3aed98d86
Merge pull request #14320 from wopeizl/windows/online
...
Windows/online
7 years ago
Tao Luo
d3e63e6e04
Merge pull request #14412 from jczaja/prv-dam-softmax
...
Softmax for Inference is enabled when ON_INFER is set
7 years ago
peizhilin
be332a13bc
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Jacek Czaja
b361579f09
- Softmax for Inference is enabled when ON_INFER is set
...
test=develop
7 years ago
Tao Luo
980a6753a8
fix typo to pass the ci
...
test=develop
7 years ago
Tao Luo
8f301f4618
Merge pull request #14381 from qingqing01/manylinux_v5_fix
...
Fix compiling with cuDNN v5.
7 years ago
peizhilin
1a9008c420
code style fix
...
test=develop
7 years ago
Tao Luo
e0d4e04bdd
fix some compiler warning
...
test=develop
7 years ago
Tao Luo
8ea13e336a
add in_num_col_dims for fc
7 years ago
JiabinYang
a507845a77
test=develop
7 years ago
Tao Luo
9eb0ab1db3
Merge pull request #14384 from tensor-tang/refine/lrn
...
Refine lrn cpu forward
7 years ago
peizhilin
30ddc07a7e
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Qiao Longfei
e65cbd3b06
Merge pull request #14387 from jacquesqiao/lookup_sparse_table_add_test_mode
...
Lookup sparse table add test mode
7 years ago
Qiao Longfei
6cf8f24b1b
Merge pull request #14389 from jacquesqiao/fix_sgd_op_optimize_sparse_table
...
sgd_op optimize selected rows do not enforce id < height
7 years ago
Xin Pan
10ab177f89
Merge pull request #14403 from PaddlePaddle/revert-14337-prv-dam-softmax
...
Revert "Softmax op optimization for inference "
7 years ago
Yan Chunwei
9f252e0032
Combine Inference Analysis with IR ( #13914 )
7 years ago
Tao Luo
5b9c62faee
Revert "Softmax op optimization for inference "
7 years ago
Tao Luo
6490bb2765
Merge pull request #14337 from jczaja/prv-dam-softmax
...
Softmax op optimization for inference
7 years ago
chengduo
9f68e9a7fe
fix auc op ( #14385 )
...
test=develop
7 years ago
dengkaipeng
a0284f6fbc
Add backward CPU kernel. test=develop
7 years ago
Dang Qingqing
d219818434
Fix compiling in cuDNN v5.
...
test=develop
7 years ago
Qiao Longfei
efb5c03f60
sgd_op optimize selected rows do not enforce id < height
...
test=develop
7 years ago
Qiao Longfei
7aa8b2ccf2
optimize code
7 years ago
Qiao Longfei
8d205c853c
add is_test for lookup_sparse_table
7 years ago
tensor-tang
b4dfba1779
refine lrn_op cpu forward and speedup
...
test=develop
7 years ago
tensor-tang
1be85d011d
add mkl vsqr and vpow
7 years ago
JiabinYang
f4be1d99d0
polish code and test
7 years ago
ruri
4a55fb5f5b
Add density_prior_box_op ( #14226 )
...
Density prior box operator for image detection model.
7 years ago
tensor-tang
0043c42b3e
add vrelu jitcode
...
test=develop
7 years ago
dengkaipeng
36c46152e1
Add unittest for yolov3_loss. test=develop
7 years ago
dengkaipeng
77c1328fa7
add CPU kernel forward
7 years ago
dengkaipeng
5d0b568ecb
Add YOLOv3 loss operator. test=develop
7 years ago
JiabinYang
b8ff0972b6
test=develop
7 years ago
JiabinYang
32e05b01f2
test=develop
7 years ago
peizhilin
61fa5218b9
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Yibing Liu
bd2943788b
Fix gather & stack op ( #14355 )
...
* Add int type support for stack_op
* Improve gather op to support index with shape N x 1
test=develop
* Fix stack_op kernel's registry
test=develop
7 years ago
Yu Yang
8f9bfad246
perf(compile): speed up reduce_op compile by splitting files ( #14294 )
...
test=develop
7 years ago
sneaxiy
d231e55065
merge develop
...
test=develop
7 years ago
JiabinYang
c8801e100f
grad diff problem to be fixed and need api spec change to be done
7 years ago
Jacek Czaja
03299ed46c
- Fix to linking for GPU builds of softmax inference
...
test=develop
7 years ago
Jacek Czaja
0756343767
- Fix GPU compilation
...
test=develop
7 years ago
Jacek Czaja
d332326847
- Added unit tests for softmax is_test=True op
...
test=develop
7 years ago
Jacek Czaja
c1fccc29c1
- Noise adding removed for Test phase of softmax
7 years ago
peizhilin
7638f0afb3
simplify the logic
7 years ago
peizhilin
d01a26280e
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Xin Pan
ff28b1ffc0
Merge pull request #14071 from barrierye/add_similarity_focus_op
...
Add similarity focus op
7 years ago
li099
688ed60116
Add lod tensor array to tensor op ( #13990 )
...
* add lod tensor array concat
* add lod tensor array concat
* test=develop
* add lod tensor array concat
test=develop
* Fix API.spec
test=develop
* add lod tensor array concat
test=develop
* revise some bug of lod tensor array concat
test=develop
* add unittest for tensor array concat
test=develop
* change to tensor array to tensor
test=develop
* revise bug
test=develop
* revise a bug
test=develop
* revise a bug
test=develop
* revise a bug of python3
test=develop
7 years ago
peizhilin
e23061e0dc
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
chengduo
6c6e638550
Add InferVarType for some op ( #14201 )
...
* add_infer_var_type
test=develop
* InferVarTypeHelper-> VarTypeInferenceHelper
test=develop
* PassInputTypeAndDTypeOnOutput
test=develop
* follow comment
test=develop
7 years ago
peizhilin
1eec5a428f
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Kaipeng Deng
0b38822624
Merge pull request #14345 from heavengate/fix_grid_sampler
...
fix #14344 : win compile error, EigenTenor * float unsupport. test=develop
7 years ago
peizhilin
ca60e1d34d
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
peizhilin
52f7644f53
Merge remote-tracking branch 'upstream/develop' into windows/build
7 years ago
Qiyang Min
698698f2fa
Merge branch 'develop' into fix_vlog
7 years ago
qingqing01
abe209234f
Exhaustive search for cuDNN conv. ( #14286 )
...
* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.
* Fix compiling test=develop
7 years ago
Yu Yang
b59a9bfb7c
Clean buffered_allocator
...
test=develop
7 years ago
Kaipeng Deng
f215534ecf
Merge pull request #14205 from heavengate/nearest_interp
...
Add interpolate operator replace bilinear_interp_op and add nearest neighbor interp mode
7 years ago
dengkaipeng
72108d8dbe
fix win compile error: EigenTenor * float unsupport. test=develop
7 years ago
Yu Yang
26fb34c365
Merge develop tiny fix
7 years ago
Yu Yang
fdc689142c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
...
test=develop
7 years ago
tensor-tang
22125ebaef
Merge pull request #14321 from tensor-tang/fea/jit/vscal
...
Fea jitcode vscal vaddbias
7 years ago
Tao Luo
34e9e59f4a
Merge pull request #14333 from kbinias/change-hardcoded-format-and-bump-mkldnn-version
...
Changed hardcoded format to any in convolution and bumped MKL-DNN version to 0.17rc
7 years ago
minqiyang
87450b9ad4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
...
test=develop
7 years ago
peizhilin
41b423d41b
remove duplicate
7 years ago
peizhilin
dcfab11193
merge from develop
7 years ago
peizhilin
4ffa92d4f0
Merge branch 'develop' into windows/build
7 years ago
chengduo
c5b6573a5a
Fix input<tensor> ( #14208 )
...
* fix input<tensor>
test=develop
* fix split_ids
test=develop
* ElementwiseMul should not support SelectedRows
* fix scale op
test=develop
* change GetTensorFromVar() method to GetTensorOrSelectedRowsFromVar()
* fix operator
* refine MultiOutput
* fix MultiOutput
test=develop
* disable test_dist_save_load
test=develop
* fix elementwise_op
test=develop
* add get_sparse_as_op
test=develop
* add info for check
test=develop
* rename get_sparse_as_op with extract_rows_as_op.
test=develop
* elementwise doesn't support selected_rows
* fix regularizer
* remove extract_rows_as
test=develop
* fix ci
test=develop
* add test for sum_op
* fix regularizer
test=develop
* test=develop
* fix pserver weight decay multi inputs test=develop
7 years ago
Krzysztof Binias
f1c1acf1ac
Changed hardcoded format to any in convolution and bumped MKL-DNN version to 0.17-rc
...
test=develop
7 years ago
Tao Luo
813e54efbd
Merge pull request #14328 from PaddlePaddle/revert-14046-windows/debug
...
Revert "cherry picked windows patches."
7 years ago
minqiyang
3db9fad764
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
...
test=develop
7 years ago
Xin Pan
b03a44e062
Merge pull request #14026 from JiabinYang/add_reorg_op
...
Add reorg op
7 years ago
Zhaolong Xing
ba8b5619a3
Revert "cherry picked windows patches."
7 years ago
minqiyang
fcc0452c8b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
...
test=develop
7 years ago
minqiyang
0c3227a523
Change the origin VLOG level to 10 times
...
Fix code to support cpplint syntax check
test=develop
7 years ago
tensor-tang
5e64244f25
add vaddbias jitcode
...
test=develop
7 years ago
tensor-tang
5f7956ae59
Merge remote-tracking branch 'ups/develop' into fea/jit/vscal
7 years ago
peizhilin
869487a2b7
Merge remote-tracking branch 'origin/develop' into windows/build
7 years ago
tensor-tang
3d950a812d
combine jitcode of vscal
7 years ago
tensor-tang
03e11f3fc9
add vscal jitcode
7 years ago
dzhwinter
234a1d9248
Merge remote-tracking branch 'origin/develop' into windows/debug
...
test=develop
7 years ago
chengduo
a270fdf2db
Fix SelectedRowsAdd bug ( #14309 )
...
* fix selected_rows bug
test=develop
* refine cos_sim
test=develop
7 years ago
tensor-tang
2f0a379af7
Merge pull request #14307 from tensor-tang/fix/mac
...
fix mac
7 years ago
Zeng Jinle
b2af213009
Merge pull request #14292 from sneaxiy/delete_buggy_selected_rows_functor
...
Delete buggy selected_rows functor
7 years ago
tensor-tang
161ba9c9d1
fix mac
...
test=develop
7 years ago
tensor-tang
e8642c3c1f
Merge pull request #14265 from tensor-tang/fea/jit/vadd
...
add vadd, vaddrelu jitcode
7 years ago
dengkaipeng
8b47d90f5d
add 'actual_shape' attribute. test=develop
7 years ago
tensor-tang
382307b943
refine code
...
test=develop
7 years ago
tensor-tang
3319072858
fix jit kernel test on mac
...
test=develop
7 years ago
Yu Yang
057a682ee9
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
7 years ago
Qiao Longfei
e0c8397426
Merge pull request #14257 from jacquesqiao/optimize-pserver-profiler-thread-pool
...
clean rpc server profiler
7 years ago
chengduo
ffc866159f
hot fix log ( #14293 )
...
test=develop
7 years ago
Zhaolong Xing
65b61db10a
Merge pull request #13927 from NHZlX/fix_googlenet_bug_with_rule
...
Fix googlenet bug with rule
7 years ago
tensor-tang
25e070ecc7
Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
barrierye
ef8218be22
update docs test=develop
7 years ago
sneaxiy
9518bc8d0a
delete buggy selected_rows functor
...
test=develop
7 years ago
chengduo
a9b5d42dd4
Add fp16 backward support ( #14202 )
...
* add fp16 backward support
test=develop
* add sum_op fp16 test
* disable test_dist_save_load
test=develop
* add check_grad for sum
* add unit test for softmax_grad fp16
test=develop
* add scale_op unit test
* add mul_grad_op unit test for fp16
* add cross_entropy_grad and eman_grad unit test for fp16
test=develop
* fix cross_entropy unit test
* add pool2d fp16 unit test
* refine conv2d fp16 unit test
test=develop
* refine activation unit test
test=develop
* fix ci
test=develop
* follow zhihong's comment, copy from https://github.com/PaddlePaddle/Paddle/pull/12796
test=develop
7 years ago
Qiao Longfei
3b8dd9ebbd
optimize code test=develop
7 years ago
Qiao Longfei
2921f8a79c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
7 years ago
dzhwinter
2835e04409
merge develop branch. test=develop
7 years ago
dzhwinter
deb4af70ef
add test
7 years ago
qingqing01
db8c52da5e
Revert " Exhaustive search for cuDNN conv. ( #14043 )"
...
This reverts commit ce7d9b0799.
7 years ago
qingqing01
ce7d9b0799
Exhaustive search for cuDNN conv. ( #14043 )
...
* exhaustive search for cuDNN conv.
* Refine code and add unit testing.
* Clean code
* Fix model load in fluid/inference and unit testing in conv2d
* Follow comments.
7 years ago
tensor-tang
cb4083b9fa
fix compile error
...
test=develop
7 years ago
tensor-tang
dd343a4971
Merge remote-tracking branch 'ups/develop' into fea/jit/vadd
7 years ago
Zeng Jinle
fcbe84cb50
Merge pull request #14270 from sneaxiy/fix_rmsprop_enforce_bug
...
Fix rmsprop_op enforce bug
7 years ago
nhzlx
5700fafd0f
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_googlenet_bug_with_rule
...
test=develop
7 years ago
nhzlx
86b99ac953
fix comments and fix bug
7 years ago
tensor-tang
e6cfdf6c74
Merge pull request #14274 from tensor-tang/fix/jit
...
fix jit on mac
7 years ago
Zeng Jinle
8ac2242b6e
Merge pull request #14075 from sneaxiy/remove_some_locks_in_pe
...
Remove some locks in ParallelExecutor
7 years ago
tensor-tang
b81e1b655e
fix jit on mac
...
test=develop
7 years ago
sneaxiy
11f032a82e
fix rmsprop_op enforce bug
...
test=develop
7 years ago
tensor-tang
b68ececb73
add vaddrelu jitcode
...
test=develop
7 years ago
peizhilin
1f12ba6192
gpu support, fix build issue:
...
1. Non utf-8 characters within comments of OPs may lead to protobuf fail to parse_from_string
2. comment out some ops which not supported on windows
3. cuda libs may not be correctly linked to target on windows
7 years ago
Wu Yi
8fc05e0373
fix cpu build test=develop ( #14260 )
7 years ago
tensor-tang
bb09e31020
add vadd jitcode
...
test=develop
7 years ago
Qiao Longfei
59fbfbfbf7
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
...
test=develop
7 years ago
whs
d6a6a13039
Fix build error of affine grid op in mac os. ( #14237 )
...
* Fix build error of affine grid op in mac os.
test=develop
* Make function return reference.
test=develop
7 years ago
tensor-tang
d55481cfeb
Merge pull request #14241 from tensor-tang/refine/jit/vmulcode
...
Refine/jit/vmulcode
7 years ago
Qiao Longfei
9e4e9e9b6e
clean rpc server profiler
7 years ago
Zeng Jinle
8d930195d9
Merge pull request #14238 from sneaxiy/fix_read_lod_level_bug
...
Fix lod_level share bug in read_op
7 years ago
Wu Yi
306236c2c0
feature/DC asgd ( #12722 )
...
* wip
* add ref_by_trainer_id op
* ready to test
* fix ref inputs
* refine rpc_op_handle
* fix merge bug
7 years ago
dengkaipeng
fef2faa709
limit CUDA kernel parallel threads max number to 4096. test=develop
7 years ago
tensor-tang
c3cbf0b8ef
Merge pull request #14185 from tpatejko/tpatejko/mkldnn-conv-residual-data-reorder
...
Residual data reorder in MKLDNN convolution
7 years ago
peizhilin
71d7980f69
fix build issue 1
7 years ago
dengkaipeng
34bfae243a
Add Interpolate operation. test=develop
7 years ago
sneaxiy
46d4829dd1
fix lod_level share bug in read_op
...
test=develop
7 years ago
tensor-tang
8465e7876f
auto grow the size and fix test
...
test=develop
7 years ago
tensor-tang
9255119fd9
refine jit vmul with all size
7 years ago
tensor-tang
a9c1824131
refine jit vmul code supporting multiple of 2
7 years ago
tensor-tang
61fdc38e51
Merge pull request #14206 from tensor-tang/fea/jit/gen
...
Fea/jit/gen
7 years ago
peizhilin
9d67c1fb69
cpu build support
7 years ago
barrierye
5e7bb6a9bd
update docs test=develop
7 years ago
dzhwinter
60f70b174d
test=develop
7 years ago
sneaxiy
7ff320f8cc
merge develop
7 years ago
dongzhihong
00cf66964f
Merge remote-tracking branch 'origin/develop' into fix/sign_op
...
test=develop
7 years ago
Kaipeng Deng
daed473d4a
Merge pull request #14089 from heavengate/pool_exclude
...
add inclusive/exclusive mode in avg pool
7 years ago
Kaipeng Deng
64f3e3ed8f
Merge pull request #14069 from heavengate/grid_sampler
...
Grid sampler operator for spatial transformer network.
7 years ago
sneaxiy
366ebb93f7
test=develop
7 years ago
dzhwinter
eb2f7ed21b
refine tests. test=develop
7 years ago
Jiabin Yang
9f65b616b2
Merge branch 'develop' into add_reorg_op
7 years ago
Kaipeng Deng
0b29078201
Merge branch 'develop' into grid_sampler
7 years ago
whs
0c319e0b35
Add affine grid generator op ( #12238 )
...
* Add affine grid generator.
* fix ffine grid.
* Add unitest.
* Add CPU kernel and fix unitest.
* Fix CPU kernel.
* Refine code.
test=develop
* Fix python api.
test=develop
* Update python api.
test=develop
* Fix comment.
test=develop
* Rename affine_grid_generator to affine_grid and enhence unitest.
test=develop
* Fix unitest.
test=develop
7 years ago
tangwei12
d325e668b8
[1.1] Load vars on PSERVER ( #14037 )
...
* fix dim0 in _load_slice_up_vars
* fix dim0 in _load_slice_up_vars, fix innershape in delete_var_op
* Revert "fix lookuptable in reduce strategy"
This reverts commit 0e722c5
* add unit test for dist
* add unit test for dist, test=develop
* cancel revert, test=develop
7 years ago
tensor-tang
85bcb286f5
refine vmul jitcode
...
test=develop
7 years ago
tensor-tang
a764e900a5
Merge remote-tracking branch 'ups/develop' into fea/jit/gen
...
test=develop
7 years ago
tensor-tang
a3377f7b0a
refine jitcode and add vmul jitcode implementation
7 years ago
dzhwinter
1ace55c8ee
merge develop branch
7 years ago
dengkaipeng
df4a3544aa
nearest neighbor interp add cuda kernel. test=develop
7 years ago
chengduo
2ccf77d1c1
Refine GetTensorFromVar ( #14160 )
...
* fix GetTensorFromVar
test=release/1.1
* refine GetTensorFromVar
test=develop
7 years ago
dengkaipeng
9755611938
add unittest for nearest_neighbor_interp_op
7 years ago
dengkaipeng
a24691a2a9
add nearest neighbor interpolation operator cpu kernel
7 years ago
JiabinYang
8d3c3e048b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_reorg_op
7 years ago
tensor-tang
f3badacd97
Merge remote-tracking branch 'ups/develop' into fea/jit/gen
7 years ago
tensor-tang
a53b1b0b1b
refine and init jitkernel vmul
7 years ago
tensor-tang
2139b9f677
add jit gencode
7 years ago
Tomasz Patejko
8899d42265
MKLDNN conv residual data: primitive reuse interface used. Reorder done when formats are different
...
test=develop
7 years ago
chengduo
b73708d20b
add int and int64 dtype for gather_op ( #14175 )
...
test=develop
7 years ago
Tomasz Patejko
f11934cbe6
MKLDNN conv residual data: residual data is reorder when formats are incorrect
7 years ago
Tao Luo
cdf2579d08
Merge pull request #14053 from jczaja/prv-seqpool-max
...
Max Sequence pool optimization
7 years ago
Kaipeng Deng
a3b26e8528
Merge branch 'develop' into grid_sampler
7 years ago
dengkaipeng
7333fe8e55
add math formula for exclusive/inclusive mode in avg pool. test=develop
7 years ago
dzhwinter
316765839d
add back jit simd instructions. stage.
7 years ago
Xin Pan
eb7ed1b720
Merge pull request #13897 from gmcather/develop
...
1.add position encoding 2.logloss in nn.py
7 years ago
barrierye
fc23cc9d30
update paddle/fluid/API.spec
...
test=develop
7 years ago
dzhwinter
bf2e4cb188
cleard. staged
7 years ago
chengduo
2f639113ee
Fix sum_op's GetExpectedKernelType ( #14112 )
...
* fix sum_op's GetExpectedKernelType
test=develop
* fix ci fail
test=develop
7 years ago
gmcather
ba22624d7e
position encoding && log loss
...
test=develop
7 years ago
dzhwinter
ebfe5a02b3
merge develop branch
7 years ago
qingqing01
cb27a9219d
Merge pull request #13971 from sefira/FasterOpDoc
...
generate proposal labels doc
7 years ago
sneaxiy
5e5d2223a1
test=develop
7 years ago
tensor-tang
3c957af139
Merge pull request #14080 from tensor-tang/refine/jit/crf2
...
Refine/jit/crf decoding
7 years ago
barrierye
5f3acac9b3
update paddle/fluid/API.spec
...
test=develop
7 years ago
Jacek Czaja
458b16f42a
Rebase of seqpool-max optimization
...
test=develop
- Added rough profiling
- Profiled maxpool itself
- First draft of max seqpool optimization (is_test added)
- Added unit tests to seqpool
- Cosmetic fixes
- Fix to UT of Seq pool
Disabled grad checking for sequence max pool when is_test is set to True
-Cosmetic fix to comment
test=develop
- Fix to GPU build
test=develop
- yet another GPU fix for sequence max pool
- Fix to comment
test=develop
- Change to API of sequence_pool
test=develop
- Yet another API spec change
test=develop
7 years ago
dengkaipeng
ff6329bd5f
fix some inappropriate expressions in api doc for grid_sampler. test=develop
7 years ago
dengkaipeng
8f1e398824
move param exclusive to the last in pool2d/pool3d for forward compatibility:. test=develop
7 years ago
dengkaipeng
593e1b18d7
fix some bugs and add some doc for GridSampleOp
7 years ago
dengkaipeng
0bb0e0c10f
add Grid Sampler Operator for STN.
7 years ago