Merge branch 'develop' into cmake_protobuf

refactor_docs
Liu Yiqun 8 years ago
commit c27e71e2fd

@ -1,11 +1,103 @@
# Release v0.10.0
We are glad to release version 0.10.0. In this version, we are happy to release the new
[Python API](http://research.baidu.com/paddlepaddles-new-api-simplifies-deep-learning-programs/).
- Our old Python API is kind of out of date. It's hard to learn and hard to
use. To write a PaddlePaddle program using the old API, we'd have to write
at least two Python files: one `data provider` and another one that defines
the network topology. Users start a PaddlePaddle job by running the
`paddle_trainer` C++ program, which calls Python interpreter to run the
network topology configuration script and then start the training loop,
which iteratively calls the data provider function to load minibatches.
This prevents us from writing a Python program in a modern way, e.g., in the
Jupyter Notebook.
- The new API, which we often refer to as the *v2 API*, allows us to write
much shorter Python programs to define the network and the data in a single
.py file. Also, this program can run in Jupyter Notebook, since the entry
point is in Python program and PaddlePaddle runs as a shared library loaded
and invoked by this Python program.
Basing on the new API, we delivered an online interative
book, [Deep Learning 101](http://book.paddlepaddle.org/index.en.html)
and [its Chinese version](http://book.paddlepaddle.org/).
We also worked on updating our online documentation to describe the new API.
But this is an ongoing work. We will release more documentation improvements
in the next version.
We also worked on bring the new API to distributed model training (via MPI and
Kubernetes). This work is ongoing. We will release more about it in the next
version.
## New Features
* We release [new Python API](http://research.baidu.com/paddlepaddles-new-api-simplifies-deep-learning-programs/).
* Deep Learning 101 book in [English](http://book.paddlepaddle.org/index.en.html) and [Chinese](http://book.paddlepaddle.org/).
* Support rectangle input for CNN.
* Support stride pooling for seqlastin and seqfirstin.
* Expose `seq_concat_layer/seq_reshape_layer` in `trainer_config_helpers`.
* Add dataset package: CIFAR, MNIST, IMDB, WMT14, CONLL05, movielens, imikolov.
* Add Priorbox layer for Single Shot Multibox Detection.
* Add smooth L1 cost.
* Add data reader creator and data reader decorator for v2 API.
* Add the CPU implementation of cmrnorm projection.
## Improvements
* Support Python virtualenv for `paddle_trainer`.
* Add pre-commit hooks, used for automatically format our code.
* Upgrade protobuf to version 3.x.
* Add an option to check data type in Python data provider.
* Speedup the backward of average layer on GPU.
* Documentation refinement.
* Check dead links in documents using Travis-CI.
* Add a example for explaining `sparse_vector`.
* Add ReLU in layer_math.py
* Simplify data processing flow for Quick Start.
* Support CUDNN Deconv.
* Add data feeder in v2 API.
* Support predicting the samples from sys.stdin for sentiment demo.
* Provide multi-proccess interface for image preprocessing.
* Add benchmark document for v1 API.
* Add ReLU in `layer_math.py`.
* Add packages for automatically downloading public datasets.
* Rename `Argument::sumCost` to `Argument::sum` since class `Argument` is nothing with cost.
* Expose Argument::sum to Python
* Add a new `TensorExpression` implementation for matrix-related expression evaluations.
* Add lazy assignment for optimizing the calculation of a batch of multiple expressions.
* Add abstract calss `Function` and its implementation:
* `PadFunc` and `PadGradFunc`.
* `ContextProjectionForwardFunc` and `ContextProjectionBackwardFunc`.
* `CosSimBackward` and `CosSimBackwardFunc`.
* `CrossMapNormalFunc` and `CrossMapNormalGradFunc`.
* `MulFunc`.
* Add class `AutoCompare` and `FunctionCompare`, which make it easier to write unit tests for comparing gpu and cpu version of a function.
* Generate `libpaddle_test_main.a` and remove the main function inside the test file.
* Support dense numpy vector in PyDataProvider2.
* Clean code base, remove some copy-n-pasted code snippets:
* Extract `RowBuffer` class for `SparseRowMatrix`.
* Clean the interface of `GradientMachine`.
* Use `override` keyword in layer.
* Simplify `Evaluator::create`, use `ClassRegister` to create `Evaluator`s.
* Check MD5 checksum when downloading demo's dataset.
* Add `paddle::Error` which intentially replace `LOG(FATAL)` in Paddle.
## Bug Fixes
* Check layer input types for `recurrent_group`.
* Don't run `clang-format` with .cu source files.
* Fix bugs with `LogActivation`.
* Fix the bug that runs `test_layerHelpers` multiple times.
* Fix the bug that the seq2seq demo exceeds protobuf message size limit.
* Fix the bug in dataprovider converter in GPU mode.
* Fix a bug in `GatedRecurrentLayer`.
* Fix bug for `BatchNorm` when testing more than one models.
* Fix broken unit test of paramRelu.
* Fix some compile-time warnings about `CpuSparseMatrix`.
* Fix `MultiGradientMachine` error when `trainer_count > batch_size`.
* Fix bugs that prevents from asynchronous data loading in `PyDataProvider2`.
# Release v0.9.0

@ -44,7 +44,6 @@ if(MKL_INC_DIR AND MKL_CORE_LIB AND MKL_SEQUENTIAL_LIB AND MKL_INTEL_LP64)
message(STATUS "Found MKL (include: ${CBLAS_INC_DIR}, library: ${CBLAS_LIBRARIES})")
set(CBLAS_FOUND ON)
if(${MKL_LAPACK_INC_DIR})
add_definitions(-DPADDLE_USE_LAPACK)
message(STATUS "Found lapack in MKL (include: ${MKL_LAPACK_INC_DIR})")
endif()
return() # return file.
@ -80,7 +79,6 @@ if(ATLAS_INC_DIR AND ATLAS_CBLAS_LIB AND ATLAS_LIB AND NOT CBLAS_FOUND)
message(STATUS "Found ATLAS (include: ${CBLAS_INC_DIR}, library: ${CBLAS_LIBRARIES})")
set(CBLAS_FOUND ON)
if(ATLAS_CLAPACK_INC_DIR)
add_definitions(-DPADDLE_USE_LAPACK)
set(CBLAS_INC_DIR ${CBLAS_INC_DIR} ${ATLAS_CLAPACK_INC_DIR})
message(STATUS "Found lapack in ATLAS (include: ${ATLAS_CLAPACK_INC_DIR})")
endif()
@ -115,7 +113,6 @@ if(OPENBLAS_INC_DIR AND OPENBLAS_LIB)
message(STATUS "Found OpenBLAS (include: ${CBLAS_INC_DIR}, library: ${CBLAS_LIBRARIES})")
set(CBLAS_FOUND ON)
if(OPENBLAS_LAPACKE_INC_DIR)
add_definitions(-DPADDLE_USE_LAPACK)
message(STATUS "Found lapack in OpenBLAS (include: ${OPENBLAS_LAPACKE_INC_DIR})")
endif()
return()

@ -24,45 +24,17 @@ IF(NOT ${CBLAS_FOUND})
SET(CBLAS_LIBRARIES "${CBLAS_INSTALL_DIR}/lib/${LIBRARY_PREFIX}openblas${STATIC_LIBRARY_SUFFIX}"
CACHE FILEPATH "openblas library." FORCE)
# check fortran compiler and library
SET(COMMON_ARGS CC=${CMAKE_C_COMPILER} NO_LAPACK=1 NO_SHARED=1)
IF(ANDROID)
SET(OPENBLAS_COMMIT "b5c96fcfcdc82945502a2303116a64d89985daf5")
SET(OPTIONAL_ARGS HOSTCC=${HOST_C_COMPILER} TARGET=ARMV7 ARM_SOFTFP_ABI=1 NOFORTRAN=1 USE_THREAD=0 libs)
SET(OPTIONAL_ARGS HOSTCC=${HOST_C_COMPILER} TARGET=ARMV7 ARM_SOFTFP_ABI=1 USE_THREAD=0 libs)
ELSEIF(RPI)
SET(OPENBLAS_COMMIT "v0.2.19")
SET(OPTIONAL_ARGS HOSTCC=${HOST_C_COMPILER} TARGET=ARMV7 NOFORTRAN=1 USE_THREAD=0 libs)
SET(OPTIONAL_ARGS HOSTCC=${HOST_C_COMPILER} TARGET=ARMV7 USE_THREAD=0 libs)
ELSE()
IF(CMAKE_COMPILER_IS_GNUCC)
ENABLE_LANGUAGE(Fortran)
if (NOT CMAKE_Fortran_COMPILER_VERSION)
# cmake < 3.4 cannot get CMAKE_Fortran_COMPILER_VERSION directly.
execute_process(COMMAND ${CMAKE_Fortran_COMPILER} -dumpversion
OUTPUT_VARIABLE CMAKE_Fortran_COMPILER_VERSION)
endif()
string(REGEX MATCHALL "[0-9]+" Fortran_VERSION ${CMAKE_Fortran_COMPILER_VERSION})
list(GET Fortran_VERSION 0 Fortran_MAJOR)
list(GET Fortran_VERSION 1 Fortran_MINOR)
find_library(GFORTRAN_LIBRARY NAMES gfortran PATHS
/lib
/usr/lib
/usr/lib/gcc/x86_64-linux-gnu/${Fortran_MAJOR}.${Fortran_MINOR}/
/usr/lib/gcc/x86_64-linux-gnu/${Fortran_MAJOR}/)
if (NOT GFORTRAN_LIBRARY)
message(FATAL_ERROR "Cannot found gfortran library which it is used by openblas")
endif()
find_package(Threads REQUIRED)
LIST(APPEND CBLAS_LIBRARIES ${GFORTRAN_LIBRARY} ${CMAKE_THREAD_LIBS_INIT})
ENDIF(CMAKE_COMPILER_IS_GNUCC)
IF(NOT CMAKE_Fortran_COMPILER)
MESSAGE(FATAL_ERROR "To build lapack in libopenblas, "
"you need to set gfortran compiler: cmake .. -DCMAKE_Fortran_COMPILER=...")
ENDIF(NOT CMAKE_Fortran_COMPILER)
ADD_DEFINITIONS(-DPADDLE_USE_LAPACK)
SET(OPENBLAS_COMMIT "v0.2.19")
SET(OPENBLAS_ARGS FC=${CMAKE_Fortran_COMPILER} DYNAMIC_ARCH=1 libs netlib)
SET(OPENBLAS_ARGS DYNAMIC_ARCH=1 libs)
ENDIF()
ExternalProject_Add(
@ -73,7 +45,7 @@ IF(NOT ${CBLAS_FOUND})
PREFIX ${CBLAS_SOURCES_DIR}
INSTALL_DIR ${CBLAS_INSTALL_DIR}
BUILD_IN_SOURCE 1
BUILD_COMMAND ${CMAKE_MAKE_PROGRAM} CC=${CMAKE_C_COMPILER} NO_SHARED=1 ${OPTIONAL_ARGS}
BUILD_COMMAND ${CMAKE_MAKE_PROGRAM} ${COMMON_ARGS} ${OPTIONAL_ARGS}
INSTALL_COMMAND ${CMAKE_MAKE_PROGRAM} install NO_SHARED=1 PREFIX=<INSTALL_DIR>
UPDATE_COMMAND ""
CONFIGURE_COMMAND ""

@ -1,5 +1,4 @@
set(CPACK_PACKAGE_NAME paddle)
set(CPACK_PACKAGE_DESCRIPTION_SUMMARY "")
set(CPACK_PACKAGE_VERSION_MAJOR ${PADDLE_MAJOR_VERSION})
set(CPACK_PACKAGE_VERSION_MINOR ${PADDLE_MINOR_VERSION})
set(CPACK_PACKAGE_VERSION_PATCH ${PADDLE_PATCH_VERSION})
@ -10,8 +9,9 @@ set(CPACK_DEBIAN_PACKAGE_ARCHITECTURE amd64)
set(CPACK_DEBIAN_PACKAGE_MAINTAINER PaddlePaddle Dev <paddle-dev@baidu.com>)
set(CPACK_PACKAGE_DESCRIPTION_SUMMARY "Paddle")
set(CPACK_PACKAGE_DESCRIPTION "")
set(CPACK_DEBIAN_PACKAGE_DEPENDS "libatlas3-base, libgflags2, libgoogle-glog0, libprotobuf8, libpython2.7, libstdc++6, python-numpy, python-pip, python-pip-whl, python-protobuf")
set(CPACK_DEBIAN_PACKAGE_DEPENDS "libpython2.7-dev, libstdc++6, python-pip, curl, libgfortran3, python-pip-whl")
set(CPACK_DEBIAN_PACKAGE_SECTION Devel)
set(CPACK_DEBIAN_PACKAGE_VERSION ${PADDLE_VERSION})
set(CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA "${PROJ_ROOT}/paddle/scripts/deb/postinst")
#set(CPACK_GENERATOR "DEB")
# Start cpack

@ -29,7 +29,7 @@ settings(
batch_size=128,
learning_rate=2e-3,
learning_method=AdamOptimizer(),
average_window=0.5,
model_average=ModelAverage(0.5),
regularization=L2Regularization(8e-4),
gradient_clipping_threshold=25)

@ -69,7 +69,8 @@ def gru_encoder_decoder(data_conf,
encoder_size=512,
decoder_size=512,
beam_size=3,
max_length=250):
max_length=250,
error_clipping=50):
"""
A wrapper for an attention version of GRU Encoder-Decoder network
is_generating: whether this config is used for generating
@ -90,9 +91,19 @@ def gru_encoder_decoder(data_conf,
input=src_word_id,
size=word_vector_dim,
param_attr=ParamAttr(name='_source_language_embedding'))
src_forward = simple_gru(input=src_embedding, size=encoder_size)
src_forward = simple_gru(
input=src_embedding,
size=encoder_size,
naive=True,
gru_layer_attr=ExtraLayerAttribute(
error_clipping_threshold=error_clipping))
src_backward = simple_gru(
input=src_embedding, size=encoder_size, reverse=True)
input=src_embedding,
size=encoder_size,
reverse=True,
naive=True,
gru_layer_attr=ExtraLayerAttribute(
error_clipping_threshold=error_clipping))
encoded_vector = concat_layer(input=[src_forward, src_backward])
with mixed_layer(size=decoder_size) as encoded_proj:
@ -117,11 +128,13 @@ def gru_encoder_decoder(data_conf,
decoder_inputs += full_matrix_projection(input=context)
decoder_inputs += full_matrix_projection(input=current_word)
gru_step = gru_step_layer(
gru_step = gru_step_naive_layer(
name='gru_decoder',
input=decoder_inputs,
output_mem=decoder_mem,
size=decoder_size)
size=decoder_size,
layer_attr=ExtraLayerAttribute(
error_clipping_threshold=error_clipping))
with mixed_layer(
size=target_dict_dim, bias_attr=True,

@ -2,7 +2,8 @@
============
.. toctree::
:maxdepth: 2
:maxdepth: 1
build_and_install/index_cn.rst
basic_usage/index_cn.rst
- `深度学习入门课程 <http://book.paddlepaddle.org/>`_

@ -2,7 +2,8 @@ GET STARTED
============
.. toctree::
:maxdepth: 2
:maxdepth: 1
build_and_install/index_en.rst
basic_usage/index_en.rst
- `Deep Learning 101 <http://book.paddlepaddle.org/index.en.html>`_

@ -19,18 +19,18 @@
在 PaddlePaddle中下面这些Layer能够接受双层序列作为输入完成相应的计算。
pooling_layer
==============
pooling
========
pooling_layer 的使用示例如下,详细见 :ref:`api_trainer_config_helpers_layers_pooling_layer` 配置API。
pooling 的使用示例如下,详细见 :ref:`api_v2.layer_pooling` 配置API。
.. code-block:: bash
seq_pool = pooling_layer(input=layer,
pooling_type=AvgPooling(),
agg_level=AggregateLevel.EACH_SEQUENCE)
seq_pool = pooling(input=layer,
pooling_type=pooling.Max(),
agg_level=AggregateLevel.EACH_SEQUENCE)
- `pooling_type` 目前支持两种,分别是:MaxPooling()和AvgPooling()。
- `pooling_type` 目前支持两种,分别是:pooling.Max()和pooling.Avg()。
- `agg_level=AggregateLevel.EACH_TIMESTEP` 时(默认值):
@ -47,7 +47,7 @@ pooling_layer 的使用示例如下,详细见 :ref:`api_trainer_config_helpers
last_seq 和 first_seq
=====================
last_seq 的使用示例如下( :ref:`api_trainer_config_helpers_layers_first_seq` 类似),详细见 :ref:`api_trainer_config_helpers_layers_last_seq` 配置API。
last_seq 的使用示例如下( :ref:`api_v2.layer_first_seq` 类似),详细见 :ref:`api_v2.layer_last_seq` 配置API。
.. code-block:: bash
@ -65,16 +65,16 @@ last_seq 的使用示例如下( :ref:`api_trainer_config_helpers_layers_first_
- 输入:必须是一个双层序列
- 输出一个单层序列其中每个元素是双层序列中每个subseq最后一个或第一个元素。
expand_layer
============
expand
======
expand_layer 的使用示例如下,详细见 :ref:`api_trainer_config_helpers_layers_expand_layer` 配置API。
expand 的使用示例如下,详细见 :ref:`api_v2.layer_expand` 配置API。
.. code-block:: bash
expand = expand_layer(input=layer1,
expand_as=layer2,
expand_level=ExpandLevel.FROM_TIMESTEP)
ex = expand(input=layer1,
expand_as=layer2,
expand_level=ExpandLevel.FROM_TIMESTEP)
- `expand_level=ExpandLevel.FROM_TIMESTEP` 时(默认值):

@ -4,7 +4,6 @@ RNN相关模型
.. toctree::
:maxdepth: 1
rnn_config_cn.rst
recurrent_group_cn.md
hierarchical_layer_cn.rst
hrnn_rnn_api_compare_cn.rst

@ -1,7 +1,2 @@
RNN Models
==========
.. toctree::
:maxdepth: 1
rnn_config_en.rst

@ -5,7 +5,6 @@ PaddlePaddle 文档
:maxdepth: 1
getstarted/index_cn.rst
tutorials/index_cn.md
howto/index_cn.rst
api/index_cn.rst
faq/index_cn.rst

@ -5,8 +5,6 @@ PaddlePaddle Documentation
:maxdepth: 1
getstarted/index_en.rst
tutorials/index_en.md
howto/index_en.rst
api/index_en.rst
about/index_en.rst

@ -114,10 +114,7 @@
</ul>
</div>
<ul class="site-page-links">
<li><a>Home</a></li>
<li><a>Get Started</a></li>
<li class="active"><a>Documentation</a></li>
<li><a>About Us</a></li>
<li><a href="/">Home</a></li>
</ul>
</div>
<div class="doc-module">
@ -137,7 +134,7 @@
{{ toctree }}
{% endblock %}
</nav>
{% if toc %}
{% if False %}
<nav class="local-toc">{{ toc }}</nav>
{% endif %}
<section class="doc-content-wrap">
@ -168,7 +165,8 @@
VERSION:'{{ release|e }}',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'{{ '' if no_search_suffix else file_suffix }}',
HAS_SOURCE: {{ has_source|lower }}
HAS_SOURCE: {{ has_source|lower }},
SOURCELINK_SUFFIX: ".txt",
};
</script>
{%- for scriptfile in script_files %}

@ -21,16 +21,13 @@ set(CUDA_CXX_WITH_GPU_SOURCES
if(WITH_GPU)
set(CUDA_CXX_SOURCES
src/hl_dso_loader.cc
src/hl_warpctc_wrap.cc
${CUDA_CXX_WITH_GPU_SOURCES})
set_source_files_properties(${CUDA_CXX_SOURCES}
PROPERTIES COMPILE_FLAGS "-D__NVCC__")
else()
set(CUDA_CXX_SOURCES
src/hl_dso_loader.cc
src/hl_warpctc_wrap.cc)
set(CUDA_CXX_SOURCES src/hl_warpctc_wrap.cc)
endif()
set(CUDA_CU_SOURCES
@ -47,7 +44,6 @@ set(CUDA_CU_SOURCES
set(CUDA_HEADERS
include/hl_time.h
include/hl_dso_loader.h
include/hl_warpctc_wrap.h
include/hl_sequence.h
include/hl_cuda_cublas.h

@ -40,18 +40,18 @@ public:
namespace gpu {
static __device__ Active<real>::forward forward[] = HPPL_ACTIVE_FUNCTION;
static __device__ Active<real>::backward backward[] = HPPL_ACTIVE_FUNCTION;
}
} // namespace gpu
#else
namespace cpu {
static Active<real>::forward forward[] = HPPL_ACTIVE_FUNCTION;
static Active<real>::backward backward[] = HPPL_ACTIVE_FUNCTION;
}
} // namespace cpu
#ifdef __AVX__
namespace avx {
static Active<__m256>::forward forward[] = HPPL_ACTIVE_FUNCTION;
static Active<__m256>::backward backward[] = HPPL_ACTIVE_FUNCTION;
}
} // namespace avx
#endif
#endif

@ -273,23 +273,23 @@ extern void hl_bilinear_forward(const real* inData,
const real ratioW);
/**
* @brief Bilinear interpolation backward.
*
* @param[out] inGrad input gradient.
* @param[in] inImgH input image height.
* @param[in] inImgW input image width.
* @param[in] inputH input batchSize.
* @param[in] inputW input image data dim.
* @param[in] outGrad output gradient.
* @param[in] outImgH output image height.
* @param[in] outImgW output image width.
* @param[in] outputH output batchSize.
* @param[in] outputW output image data dim.
* @param[in] numChannels number of channels.
* @param[in] ratioH inImgH / outImgH.
* @param[in] ratioW inImgW / outImgW.
*
*/
* @brief Bilinear interpolation backward.
*
* @param[out] inGrad input gradient.
* @param[in] inImgH input image height.
* @param[in] inImgW input image width.
* @param[in] inputH input batchSize.
* @param[in] inputW input image data dim.
* @param[in] outGrad output gradient.
* @param[in] outImgH output image height.
* @param[in] outImgW output image width.
* @param[in] outputH output batchSize.
* @param[in] outputW output image data dim.
* @param[in] numChannels number of channels.
* @param[in] ratioH inImgH / outImgH.
* @param[in] ratioW inImgW / outImgW.
*
*/
extern void hl_bilinear_backward(real* inGrad,
const size_t inImgH,
const size_t inImgW,

@ -14,10 +14,9 @@ limitations under the License. */
#include "hl_cuda_cublas.h"
#include <sys/time.h>
#include <mutex>
#include "hl_cuda.h"
#include "hl_dso_loader.h"
#include "hl_thread.ph"
#include "paddle/utils/DynamicLoader.h"
#include "paddle/utils/Logging.h"
namespace dynload {

@ -15,10 +15,9 @@ limitations under the License. */
#include "hl_cuda_cudnn.h"
#include <cudnn.h>
#include <gflags/gflags.h>
#include <mutex>
#include "hl_cuda_cudnn.ph"
#include "hl_dso_loader.h"
#include "hl_thread.ph"
#include "paddle/utils/DynamicLoader.h"
#include "paddle/utils/Logging.h"
DEFINE_int32(cudnn_conv_workspace_limit_in_mb,

@ -21,11 +21,10 @@ limitations under the License. */
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>
#include <mutex>
#include "hl_cuda.ph"
#include "hl_thread.ph"
#include "hl_dso_loader.h"
#include "paddle/utils/Logging.h"
#include "paddle/utils/DynamicLoader.h"
// clang-format on
namespace dynload {

@ -14,7 +14,7 @@ limitations under the License. */
#include "hl_warpctc_wrap.h"
#include <mutex>
#include "hl_dso_loader.h"
#include "paddle/utils/DynamicLoader.h"
#include "paddle/utils/Logging.h"
namespace dynload {

@ -12,7 +12,7 @@ endif()
add_library(paddle_function STATIC ${cpp_files} ${cu_objs})
add_dependencies(paddle_function ${external_project_dependencies})
add_dependencies(paddle_function gen_proto_cpp)
if(WITH_GPU)
if(WITH_TESTING)

@ -74,9 +74,9 @@ TEST(MulOp, DDDMatrixMul) {
}
/**
* C += A * B, B, C dense, A sparse
* dense = sparse * dense
*/
* C += A * B, B, C dense, A sparse
* dense = sparse * dense
*/
void testFuncDSparseDMatrix(
size_t dimM, size_t dimN, size_t dimK, size_t nnz, SparseFormat FORMAT) {
real scaleT = 1.0;
@ -119,9 +119,9 @@ TEST(MuLOp, DSparseDMul) {
}
/**
* C += A * B, A, C dense, B sparse
* dense = dense * sparse
*/
* C += A * B, A, C dense, B sparse
* dense = dense * sparse
*/
void testFuncDDSparseMatrix(
size_t dimM, size_t dimN, size_t dimK, size_t nnz, SparseFormat FORMAT) {
real scaleT = 1.0;
@ -165,9 +165,9 @@ TEST(MulOp, DDSparseMul) {
}
/**
* C += A * B, A sparse, B, C dense
* sparse = dense * dense
*/
* C += A * B, A sparse, B, C dense
* sparse = dense * dense
*/
void testFuncSparseDDMatrix(
size_t dimM, size_t dimN, size_t dimK, size_t nnz, SparseFormat FORMAT) {
real scaleT = 1.0;

@ -21,7 +21,6 @@ limitations under the License. */
#include "MultiGradientMachine.h"
#include "MultiNetwork.h"
#include "NeuralNetwork.h"
#include "NeuralNetwork.h"
#include "ParallelNeuralNetwork.h"
#include "hl_gpu.h"

@ -637,7 +637,7 @@ void RecurrentGradientMachine::removeBeamSearchStatisticsCallbacks() {
/* create scattered id infomation for all realLayer of inFrameLines one time.
* If hasSubseq, will also create scattered sequenceStartPositions infomation
* for all realLayer of inFrameLines one time.
*/
*/
void RecurrentGradientMachine::createInFrameInfo(int inlinkId,
const Argument& input,

Some files were not shown because too many files have changed in this diff Show More

Loading…
Cancel
Save