Paddle/python/paddle/fluid/lod_tensor.py

#   Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from __future__ import print_function

from . import core
from .data_feeder import DataToLoDTensorConverter
import numpy as np

__all__ = ['create_lod_tensor', 'create_random_int_lodtensor']


def create_lod_tensor(data, recursive_seq_lens, place):
    """
    Create a LoDTensor from a numpy array, list or existing LoDTensor.

    The implementation is as follows:

    1. Check whether the length-based LoD, i.e., :code:`recursive_seq_lens`
       is valid.

    2. Convert :code:`recursive_seq_lens` to an offset-based LoD.

    3. Based on :code:`place` , copy the :code:`data` from a numpy array, list
       or existing LoDTensor to CPU or GPU device.

    4. Set offset-based LoD to the output LoDTensor.

    Suppose we want to create a LoDTensor to hold data for word sequences,
    where each word is represented by an integer, and the LoDTensor should
    represent two sentences, one of 2 words and one of 3 words.

    Then :code:`data` would be a numpy array of integers with shape (5, 1).
    :code:`recursive_seq_lens` would be [[2, 3]], giving the number of words
    in each sentence. This length-based :code:`recursive_seq_lens` [[2, 3]]
    would be converted to the offset-based LoD [[0, 2, 5]] inside the
    function call.

    Please refer to :ref:`user_guide_lod_tensor` for more details regarding LoD.

    Args:
        data (numpy.ndarray|list|LoDTensor): a numpy array, a list or a LoDTensor
                holding the data to be copied.
        recursive_seq_lens (list[list[int]]): a list of lists indicating the
                length-based LoD info.
        place (CPUPlace|CUDAPlace): CPU or GPU place indicating where the data
                in the created LoDTensor will be stored.

    Returns:
        A LoDTensor with tensor data and recursive_seq_lens info.

    Examples:

        .. code-block:: python

            import paddle.fluid as fluid
            import numpy as np

            t = fluid.create_lod_tensor(np.zeros([5, 30]), [[2, 3]], fluid.CPUPlace())
    """
    if isinstance(data, core.LoDTensor):
        return create_lod_tensor(np.array(data), recursive_seq_lens, place)
    elif isinstance(data, list):
        # dtype and shape are not important here,
        # we only want to reuse code of DataToLoDTensorConverter
        converter = DataToLoDTensorConverter(
            place=place,
            lod_level=len(recursive_seq_lens),
            shape=[],
            dtype=core.VarDesc.VarType.FP32)

        new_recursive_seq_lens = []
        for seq in data:
            new_recursive_seq_lens.append(len(seq))
            converter.feed(seq)

        assert [
            new_recursive_seq_lens
        ] == recursive_seq_lens, "data and recursive_seq_lens do not match"

        arr = np.array(converter.data)

        # FIXME(zjl): the original logic of create_lod_tensor would append
        # 1 to the shape. Maybe it is not a right way? Currently, we only
        # follow the previous logic
        arr = arr.reshape(arr.shape + (1, ))
        tensor = core.LoDTensor()
        tensor.set(arr, place)
        tensor.set_recursive_sequence_lengths(recursive_seq_lens)
        return tensor
    elif isinstance(data, np.ndarray):
        tensor = core.LoDTensor()
        tensor.set(data, place)
        tensor.set_recursive_sequence_lengths(recursive_seq_lens)
        assert tensor.has_valid_recursive_sequence_lengths(
        ), "the provided lod info is invalid"
        return tensor
    else:
        raise TypeError(
            "data should be either a LoDTensor, a Numpy array or a list")


def create_random_int_lodtensor(recursive_seq_lens, base_shape, place, low,
                                high):
    """
    Create a LoDTensor containing random integers.

    The implementation is as follows:

    1. Obtain the shape of output LoDTensor based on :code:`recursive_seq_lens`
       and :code:`base_shape` . The first dimension of the shape is the total
       length of sequences, while the other dimensions are the same as
       :code:`base_shape` .

    2. Create a numpy array of random integers, and pass the created numpy
       array as parameter :code:`data` of :ref:`api_fluid_create_lod_tensor` to
       create the output LoDTensor.

    Suppose we want to create a LoDTensor to hold data for 2 sequences whose
    shapes are [2, 30] and [3, 30] respectively.
    The :code:`recursive_seq_lens` would be [[2, 3]], and :code:`base_shape`
    would be [30] (the dimensions other than the sequence length).
    Therefore, the shape of the output LoDTensor would be [5, 30], where
    the first dimension 5 is the total length of the sequences, and the
    other dimensions are :code:`base_shape`.

    Args:
        recursive_seq_lens (list[list[int]]): a list of lists indicating the
                length-based LoD info.
        base_shape (list[int]): the shape of the output LoDTensor excluding
                the first dimension.
        place (CPUPlace|CUDAPlace): CPU or GPU place indicating where
                the data in the created LoDTensor will be stored.
        low (int): the inclusive lower bound of the random integers.
        high (int): the inclusive upper bound of the random integers.

    Returns:
        A LoDTensor with tensor data and recursive_seq_lens info, whose
        values lie in the closed interval [low, high].

    Examples:
        .. code-block:: python

          import paddle.fluid as fluid

          t = fluid.create_random_int_lodtensor(recursive_seq_lens=[[2, 3]],
                    base_shape=[30], place=fluid.CPUPlace(), low=0, high=10)
          print(t.shape()) # [5, 30]
    """
    assert isinstance(base_shape, list), "base_shape should be a list"
    # prepend the total number of basic elements to base_shape
    overall_shape = [sum(recursive_seq_lens[-1])] + base_shape
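    # the range of integer data elements is [low, high]; np.random.randint
    # excludes its upper bound, hence high + 1 (this replaces the deprecated
    # np.random.random_integers)
    data = np.random.randint(low, high + 1, overall_shape).astype("int64")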
    return create_lod_tensor(data, recursive_seq_lens, place)
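
The two docstrings above describe a length-to-offset conversion and a shape computation. The following is a minimal, Paddle-free sketch of both, for illustration only. The helper names `lengths_to_offsets` and `overall_shape` are hypothetical and are not part of the Paddle API; in the module itself the conversion is done by `core.LoDTensor`, not by Python code like this.

```python
from itertools import accumulate


def lengths_to_offsets(recursive_seq_lens):
    """Convert each length-based LoD level to an offset-based level.

    Hypothetical helper: each level becomes [0] followed by the running
    sum of its lengths.
    """
    return [[0] + list(accumulate(level)) for level in recursive_seq_lens]


def overall_shape(recursive_seq_lens, base_shape):
    """Shape of the dense data: total innermost length, then base_shape.

    Hypothetical helper mirroring the expression used in
    create_random_int_lodtensor.
    """
    return [sum(recursive_seq_lens[-1])] + list(base_shape)


offsets = lengths_to_offsets([[2, 3]])
shape = overall_shape([[2, 3]], [30])
print(offsets)  # [[0, 2, 5]]
print(shape)    # [5, 30]
```

With the docstring's example input `[[2, 3]]` and `base_shape` `[30]`, this reproduces the offset-based LoD `[[0, 2, 5]]` and the output shape `[5, 30]` stated there.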