Paddle/paddle/fluid/operators/math
Jack Zhou 9362d85e0e Add LSTM, Simple RNN and GRU CPU kernel (#28577) 5 years ago
detail Add LSTM, Simple RNN and GRU CPU kernel (#28577) 5 years ago
CMakeLists.txt exec ut no more than 15s 1 (#28439) 5 years ago
algorithm.h
beam_search.cc use iwyu clean include (#27267) 5 years ago
beam_search.cu fix the error message for the math dir 5 years ago
beam_search.h Return parent_idx in beam_search op (#15520) 6 years ago
beam_search_test.cc use iwyu clean include (#27267) 5 years ago
bert_encoder_functor.cu Fix registering trt plugin (#25744) 5 years ago
bert_encoder_functor.h [Paddle-TRT]: Ernie Dynamic shape support. (#23138) 5 years ago
blas.cc use iwyu clean include (#27267) 5 years ago
blas.h Paddle support compile on sw (#27858) 5 years ago
blas_impl.cu.h add fp16 for matmul (#27523) 5 years ago
blas_impl.h use iwyu clean include (#27267) 5 years ago
bloomfilter.h refine murmurhash3_x64_128 for bloom_filter (#20996) 6 years ago
compound_functors.h Optimize fused_elewise_activation_grad op. (#18041) 6 years ago
concat.hip.cu
concat_and_split.cc use iwyu clean include (#27267) 5 years ago
concat_and_split.cu Add macro BOOST_GET to enrich the error information of boost :: get (#24175) 5 years ago
concat_and_split.h Add bfloat16 data type (#25402) 5 years ago
concat_test.cc use iwyu clean include (#27267) 5 years ago
context_project.cc use iwyu clean include (#27267) 5 years ago
context_project.cu
context_project.h use iwyu clean include (#27267) 5 years ago
cos_sim_functor.cc use iwyu clean include (#27267) 5 years ago
cos_sim_functor.cu
cos_sim_functor.h use iwyu clean include (#27267) 5 years ago
cpu_vec.h use iwyu clean include (#27267) 5 years ago
cpu_vec_test.cc use iwyu clean include (#27267) 5 years ago
cross_entropy.cc use iwyu clean include (#27267) 5 years ago
cross_entropy.cu Enhance the error messages for files in operators/math 5 years ago
cross_entropy.h unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631) 6 years ago
depthwise_conv.cu Optimize the depthwise op test=develop (#22265) 5 years ago
depthwise_conv.h conv_transpose supports channel_last input, test=develop, test=document_preview (#20072) 6 years ago
fc.cc optimize fc jit (#21878) 5 years ago
fc.cu [Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494) 5 years ago
fc.h Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972) 6 years ago
functors.h Support fp16 in GPU impl of fused_elemwise_activation_op. (#20636) 6 years ago
gru_compute.cc Add LSTM, Simple RNN and GRU CPU kernel (#28577) 5 years ago
gru_compute.cu Fix ce ocr_recognition test fails (#20987) 6 years ago
gru_compute.h Add LSTM, Simple RNN and GRU CPU kernel (#28577) 5 years ago
im2col.cc use iwyu clean include (#27267) 5 years ago
im2col.cu Enhance the error messages for files in operators/math 5 years ago
im2col.h conv_transpose supports channel_last input, test=develop, test=document_preview (#20072) 6 years ago
im2col_cfo_cpu.h fix conv_transpose's bug: compatible with Anylayout setting, test=develop (#20589) 6 years ago
im2col_test.cc
lstm_compute.cc Add LSTM, Simple RNN and GRU CPU kernel (#28577) 5 years ago
lstm_compute.cu Add LSTM, Simple RNN and GRU CPU kernel (#28577) 5 years ago
lstm_compute.h Add LSTM, Simple RNN and GRU CPU kernel (#28577) 5 years ago
math_cuda_utils.h [Paddle-TRT] SkipLayernorm vectorized memory optimization (#25117) 5 years ago
math_function.cc xpu support for fill_constant Op (#27675) 5 years ago
math_function.cu enhance reduce op which can reduce tensor with arbitrary rank 5 years ago
math_function.h xpu support for fill_constant Op (#27675) 5 years ago
math_function_impl.h adjust kunlun header file (#28536) 5 years ago
math_function_test.cc Error description optimize for the math dir 5 years ago
math_function_test.cu Error description optimize for math dir 5 years ago
matrix_bit_code.cc use iwyu clean include (#27267) 5 years ago
matrix_bit_code.h Add NOMINMAX define due to windows.h max/min macro conflict (#25637) 5 years ago
matrix_inverse.cc Add the implementation of inverse (#23310) 5 years ago
matrix_inverse.cu.cc use iwyu clean include (#27267) 5 years ago
matrix_inverse.h Add the implementation of inverse (#23310) 5 years ago
maxouting.cc maxout supports channel_last input (#20846) 6 years ago
maxouting.cu maxout supports channel_last input (#20846) 6 years ago
maxouting.h maxout supports channel_last input (#20846) 6 years ago
padding.h Error description optimize for math dir 5 years ago
pooling.cc fix Wmaybe-uninitialized warning in pooling.cc, test=develop (#28126) 5 years ago
pooling.cu Pool2d cuda kernel supports fp16 (#28316) 5 years ago
pooling.h Pool2d cuda kernel supports fp16 (#28316) 5 years ago
prelu.cu fix bug of prelu when rank not equal 4, test=develop (#25067) 5 years ago
prelu.h fix bug of prelu when rank not equal 4, test=develop (#25067) 5 years ago
sample_prob.cc use iwyu clean include (#27267) 5 years ago
sample_prob.cu refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603) 6 years ago
sample_prob.h use iwyu clean include (#27267) 5 years ago
sampler.cc Refine paddle.manual_seed (#26496) 5 years ago
sampler.h Error description optimize for the math dir 5 years ago
segment_pooling.cc Add the cpu version of segment sum mean max min op 5 years ago
segment_pooling.cu refine gpu kernel config for Paddle (#28085) 5 years ago
segment_pooling.h Add the cpu version of segment sum mean max min op 5 years ago
selected_rows_functor.cc update error info for selected_rows_functor 5 years ago
selected_rows_functor.cu update error info for selected_rows_functor 5 years ago
selected_rows_functor.h fix the diff between async mode and async_half mode (#19535) 6 years ago
selected_rows_functor_test.cc fix the diff between async mode and async_half mode (#19535) 6 years ago
selected_rows_functor_test.cu.cc use iwyu clean include (#27267) 5 years ago
sequence2batch.cc use iwyu clean include (#27267) 5 years ago
sequence2batch.cu optimize the error message for math dir 5 years ago
sequence2batch.h optimize the error message for math dir 5 years ago
sequence_padding.cc use iwyu clean include (#27267) 5 years ago
sequence_padding.cu optimize the error message for math dir 5 years ago
sequence_padding.h optimize the error message for math dir 5 years ago
sequence_padding_test.cc use iwyu clean include (#27267) 5 years ago
sequence_pooling.cc optimize the error message for math dir 5 years ago
sequence_pooling.cu optimize the error message for math dir 5 years ago
sequence_pooling.h support 2-level lod of input in sequence_pool (#19839) 6 years ago
sequence_pooling_test.cc optimize the error message for math dir 5 years ago
sequence_scale.cc add support to float64 input of warpctc op. (#27399) 5 years ago
sequence_scale.cu add support to float64 input of warpctc op. (#27399) 5 years ago
sequence_scale.h use iwyu clean include (#27267) 5 years ago
softmax.cc
softmax.cu replace CUDNN_ENFORCE with PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#22109) 5 years ago
softmax.h [CPU] refine cpu softmax bwd (#17534) 6 years ago
softmax_impl.h remove eval in eigen function when dtype is fp16 (#23845) 5 years ago
tree2col.cc optimize the error message for math dir 5 years ago
tree2col.cu Add macro BOOST_GET to enrich the error information of boost :: get (#24175) 5 years ago
tree2col.h Tree conv op (#15217) 6 years ago
unpooling.cc optimize the error message for unpooling.cc 5 years ago
unpooling.cu unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631) 6 years ago
unpooling.h
vol2col.cc use iwyu clean include (#27267) 5 years ago
vol2col.cu Error description optimize for the math dir 5 years ago
vol2col.h conv_transpose supports channel_last input, test=develop, test=document_preview (#20072) 6 years ago
vol2col_test.cc use iwyu clean include (#27267) 5 years ago