chengduoZH
8e02870ce2
modify doc
7 years ago
whs
448fee3db4
Merge pull request #7414 from wanghaoshuang/warpctc
Adapt warpctc grad op for gradient checking
7 years ago
dzhwinter
b9b75377a2
Feature/hooks ( #7513 )
* add copyright hook
* add copyright hook
* refine copyright hook
* "test copyright hook"
* fix check style
* fix ci
7 years ago
wanghaoshuang
8f37c3c2a7
Fix sequence scale functor cuda kernel
1. Fix kernel
2. Add more test cases
7 years ago
dzhwinter
5ad1aef051
"cudnn operators change to cudnn kernel" ( #6660 )
* "unified operators"
* "add CUDNN register"
* "add use cudnn attribute"
* "add attribute"
* "test conv transpose op"
* "remove duplicated attr"
* "fix op test"
* "add attribute to set cudnn"
* "add more log"
* "need layout op register support"
* "add more log"
* "change GetExpectedKernelType "
* "fix Get attr in conv_op"
* "fix CI"
* "fix tests"
* "removed kernel priority fallback"
* "fix CI"
* "fix stack pointer bug"
* "refine buggy interface"
* "add const cast to save life"
* "fix get_output_with_grad"
* "fix op test with dataformat"
* "fix pooling"
* "fix pooling test"
* "fix CI"
* "fix with_gpu error"
* "add transform needed functional check"
* "fix unpack list error"
* "comment out parallel.do temporarily"
* "fix CI"
* "fix compile doc error"
* "make threshold larger"
7 years ago
wanghaoshuang
137f0dfc21
1. Fix warpctc grad tensor initial bug.
2. Remove num_seq arguments.
3. Refine CUDA kernel of ScaleLoDTensorFunctor.
4. Change max_relative_error of gradient unit test to 0.007
7 years ago
wanghaoshuang
89de5d5e66
Fix cuda kernel of sequence scale functor
7 years ago
wanghaoshuang
b1af5e435f
1. Fix warpctc grad op
2. Add check grad test
7 years ago
Yiqun Liu
b5fda2723f
Port WarpCTC Operator ( #5107 )
* Add Seq2BatchFunctor, which will be used in WarpCTCOp.
* Implement WrapCTCFunctor and WrapCTCKernel.
* Add unittest of warpctc_op.
* Modify the check_output interface in the Python unittest framework to allow checking a subset of outputs.
* Use absolute offset lod in warpctc_op and related functors.
* Refine the comments of warpctc_op.
* The new python unittest supports checking a subset of the outputs, so revoke the previous change.
* Rename the transform from LoDTensor to Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.
* Update to the newest code.
* Rename the PaddingSequenceFunctor to PaddingLoDTensorFunctor and move the computation of dimensions out of the functors.
7 years ago
Yu Yang
ce6dad3b35
Rename CopyFrom to Copy for tensors ( #7292 )
* Rename Tensor::CopyFrom to Tensor::Copy
* Fix CI
* Fix compile
7 years ago
dzhwinter
899a79cceb
Feature/transform ( #7111 )
* "fix data transform"
* "data transformer"
* "add device pool"
* "add test"
* "fix ci"
* "fix datalayout implementation "
* "fix based on comment"
7 years ago
sweetsky0901
59c14f0b6e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into detection_output
7 years ago
QI JUN
105ee86d14
fix compile ( #7125 )
7 years ago
Guo Sheng
4b7bd642c5
Merge pull request #7102 from guoshengCS/refine-act-GRU
Refine the activation type in the GRU operator and related code
7 years ago
chengduo
f58fe6d3ed
Merge pull request #6601 from chengduoZH/profiling/cosine_op
Refine cos-sim-op
7 years ago
武毅
0bd7f97b4b
Merge pull request #7045 from typhoonzero/adam_selectedrows
Adam selectedrows and scatter functors
7 years ago
chengduoZH
812c5f60eb
remove conflict
7 years ago
chengduoZH
24cf2fcd90
move cos_sim_functor to math
7 years ago
typhoonzero
1039c1e3b7
scatter optimizers
7 years ago
typhoonzero
641b4c0fe6
wip
7 years ago
guosheng
23b53c48df
Delete the old activation type for LSTM and GRU operator
7 years ago
guosheng
f74dff97ea
Refine the activation type in the GRU operator and related code
7 years ago
sweetsky0901
a8109cf0ae
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into detection_output
7 years ago
sweetsky0901
95aec835e6
modify fun name
7 years ago
Yancey
2cdef424d9
Implement selectedrows serialize and deserialize ( #7042 )
* implement selectedrows serialize and deserialize
* make serialize/deserialize as global function
* recover send_imp.cc
* delete unused brackets
* fix compile error
* serialize version in LoDTensor and SelectedRows
* fix ci
* fix ci
7 years ago
sweetsky0901
1a685144bb
for xxYY to xx_yy
7 years ago
sweetsky0901
dc7ddcb0b7
resolved conflict
7 years ago
typhoonzero
74b122889c
wip
7 years ago
typhoonzero
d48a0e4eae
WIP: adding generic scatter functors
7 years ago
qingqing01
95da78a6df
Merge pull request #7047 from qingqing01/rowwise_add
Optimize the rowwise add function.
7 years ago
qingqing01
19367389c0
Update the CUDA kernel.
7 years ago
qingqing01
41372ded20
Resume CPU implementation.
7 years ago
qingqing01
32d881beab
Optimize the rowwise add function.
7 years ago
Tao Luo
c77b696b8e
Merge pull request #7022 from luotao1/license
unify the indentation of license
7 years ago
Luo Tao
761b329793
unify the indentation of license
7 years ago
qingqing01
f839154542
Merge pull request #6996 from qingqing01/lstm_active_type
Refine how the activation type is obtained in the LSTM operator to speed it up.
7 years ago
dangqingqing
a8e18549c2
Fix the clang format.
7 years ago
qingqing01
d760b6a58d
Refine how the activation type is obtained in the LSTM operator to speed it up.
7 years ago
QI JUN
efd3726929
remove unused place ( #6972 )
* remove unused place
* fix ci
7 years ago
dzhwinter
0d2235aadf
GPUPlace to CUDAPlace ( #6960 )
7 years ago
qiaolongfei
682eee40cb
fix math_function warning
7 years ago
Yu Yang
7e214b4985
Speed up ColwiseSum in CPU ( #6834 )
* Remove unnecessary reshape in ColwiseSum
Speed up 12s -> 10s.
* Hand write ColwiseAdd in CPU
7 years ago
chengduo
e19032fb4e
Merge pull request #6743 from chengduoZH/profiling/02.recognize_digits
Refine elementwiseAdd and im2col
7 years ago
chengduoZH
cb3a74e436
revert im2col
7 years ago
chengduoZH
7b0744edcf
refine im2col
7 years ago
chengduoZH
f1ab13bd0e
refine
7 years ago
chengduoZH
293b292e0f
refine im2col
7 years ago
QI JUN
93a2d9c59d
add more place test and rename Cudnn to CUDNN ( #6621 )
* add more place_test and rename Cudnn to CUDNN
* fix ci
7 years ago
sweetsky0901
929be3a4a5
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into detection_output
7 years ago
tensor-tang
7728c53448
Merge remote-tracking branch 'upstream/develop' into fluid
Conflicts:
paddle/platform/place.h
7 years ago