# Design Doc: The Keys of Operator Kernel Type

## Problem

An operator can have different kernel implementations, and each operator has a map to store its related kernels. Fluid uses `OpKernelType` as a key to identify a unique kernel. Before an operator runs, a certain kernel must be chosen via a key of `OpKernelType`. Currently, `OpKernelType` is defined as follows:

```cpp
struct OpKernelType {
  platform::Place place_;
  proto::DataType data_type_;
};
```
For more details, please refer to the [code](https://github.com/PaddlePaddle/Paddle/blob/2d5ec16bc8a09fb8e0f62c89b116b0cd1d333907/paddle/framework/operator.h#L348-L374) on GitHub.

It contains two keys, `Place` and `DataType`, and these two keys are hashed into a single unique key that identifies a certain kind of kernel. However, these two keys are not enough; we need a more complete representation of `OpKernelType`.
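
As a minimal illustration (with simplified stand-in types, not Fluid's actual definitions), the two fields can be folded into a single hash value so that kernels can be stored in an ordinary hash map:

```cpp
#include <cstddef>
#include <functional>

// Simplified stand-ins for the real Place and DataType.
enum class Place { kCPU, kCUDA };
enum class DataType { kFP32, kFP64, kINT32, kINT64 };

struct TwoKeyKernelType {
  Place place;
  DataType data_type;
};

// Mix the two fields into one hash value so the kernel type can index an
// ordinary hash map of kernels; the mixing scheme here is arbitrary.
struct TwoKeyKernelTypeHash {
  std::size_t operator()(const TwoKeyKernelType& key) const {
    std::size_t h = std::hash<int>()(static_cast<int>(key.place));
    return h * 31 + std::hash<int>()(static_cast<int>(key.data_type));
  }
};
```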
We often implement a kernel of an operator with some computing library on a certain device (place). Note that computing libraries and devices do not correspond one-to-one: a device can have many computing libraries, and a computing library can support several devices.

For example, the Eigen library supports Nvidia GPU, AMD GPU, and CPU, while the MKLDNN library supports Intel CPU and Intel FPGA. Both `Place` and `Library` should therefore be keys of `OpKernelType`.

It is obvious that different DataTypes, such as fp64/fp32/int8, will have different kernels. But the data layout of a Tensor also leads to different implementations; please refer to the batch norm operator [kernels](https://github.com/PaddlePaddle/Paddle/blob/a948fac4d0ad7e0412d373b8aabeb711c2899563/paddle/operators/batch_norm_op.cc#L180-L209). Data layout should also be taken into consideration.
## Solution

There are four keys that determine the kernel type of an operator: `Place`/`Library`/`DataType`/`Layout`.

```cpp
struct OpKernelType {
  platform::Place place_;
  platform::Library library_;
  proto::DataType data_type_;
  framework::Layout layout_;
};
```
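
To make the expanded key concrete, here is a rough sketch, again with simplified stand-in enums rather than Fluid's real types, of how the four fields could be compared and used to index an operator's kernel map:

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>

// Illustrative stand-ins for the four key fields; Fluid's real types differ.
enum class Place { kCPU, kCUDA };
enum class Library { kPlain, kMKLDNN, kCUDNN };
enum class DataType { kFP32, kFP64 };
enum class Layout { kNCHW, kNHWC };

struct OpKernelType {
  Place place;
  Library library;
  DataType data_type;
  Layout layout;

  // Two kernel types are the same only if all four keys match.
  bool operator==(const OpKernelType& o) const {
    return place == o.place && library == o.library &&
           data_type == o.data_type && layout == o.layout;
  }
};

struct OpKernelTypeHash {
  std::size_t operator()(const OpKernelType& t) const {
    std::size_t h = std::hash<int>()(static_cast<int>(t.place));
    h = h * 31 + std::hash<int>()(static_cast<int>(t.library));
    h = h * 31 + std::hash<int>()(static_cast<int>(t.data_type));
    h = h * 31 + std::hash<int>()(static_cast<int>(t.layout));
    return h;
  }
};

// Each operator keeps a map from kernel type to kernel function and looks up
// the right kernel with the key built before the operator runs.
using KernelFn = void (*)();
using KernelMap = std::unordered_map<OpKernelType, KernelFn, OpKernelTypeHash>;
```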
The details are as follows.

### Place

`Place` is defined as follows:

```cpp
typedef boost::variant<CUDAPlace, ROCmPlace, FPGAPlace, CPUPlace> Place;
```

`Place` represents the device memory where the data is located.
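
As a small illustration of how a `boost::variant` place can be inspected (the place types below are minimal stand-ins, not Fluid's real definitions):

```cpp
#include <boost/variant.hpp>
#include <iostream>

// Minimal stand-ins for the real place types.
struct CPUPlace {};
struct CUDAPlace { int device_id; };

typedef boost::variant<CUDAPlace, CPUPlace> Place;

// A visitor dispatches on the concrete place currently held by the variant.
struct PlacePrinter : public boost::static_visitor<void> {
  void operator()(const CPUPlace&) const { std::cout << "CPUPlace\n"; }
  void operator()(const CUDAPlace& p) const {
    std::cout << "CUDAPlace(" << p.device_id << ")\n";
  }
};

int main() {
  Place place = CUDAPlace{0};
  boost::apply_visitor(PlacePrinter(), place);  // prints CUDAPlace(0)
  return 0;
}
```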
### Library

One operator kernel is usually implemented based on one library. `Library` is defined as an enum variable:

```cpp
enum Library { Plain, MKLDNN, CUDNN };
```

We use the `Plain` enumerator to represent the default library. Since most operators in Fluid are implemented based on the `Eigen` library, we take `Eigen` as the `Plain` enumerator.

A library usually has a corresponding `DeviceContext`, which contains the handles needed for computation. Fluid currently has two default DeviceContexts, `CPUDeviceContext` for CPU and `CUDADeviceContext` for CUDA. `CPUDeviceContext` contains an Eigen library handle, while `CUDADeviceContext` contains an Eigen library handle and a cuBLAS handle.

If we want to support a new library, a new enumerator needs to be added to `Library`, and a corresponding new `LibraryDeviceContext` will be created.
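
The sketch below outlines this pattern with simplified, commented-out handle types; it is illustrative only, and the `MKLDNNDeviceContext` shown here is hypothetical, not an existing Fluid class:

```cpp
// Illustrative sketch of the DeviceContext pattern; the class shapes and
// handle types are simplified stand-ins, not Fluid's actual classes.
class DeviceContext {
 public:
  virtual ~DeviceContext() = default;
};

class CPUDeviceContext : public DeviceContext {
 public:
  // Holds an Eigen CPU device handle used by Plain (Eigen) kernels, e.g.:
  // Eigen::DefaultDevice* eigen_device();
};

class CUDADeviceContext : public DeviceContext {
 public:
  // Holds an Eigen GPU device handle plus a cuBLAS handle, e.g.:
  // Eigen::GpuDevice* eigen_device();
  // cublasHandle_t cublas_handle();
};

// Adding a new library (e.g. MKLDNN) would pair a new Library enumerator
// with a new context that owns that library's handles.
class MKLDNNDeviceContext : public DeviceContext {
 public:
  // Would hold MKLDNN engine/stream handles.
};
```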
### DataType

`DataType` is defined in [framework.proto](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto). Currently, int32/int64/fp32/fp64 are supported.
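
Purely as an illustration of why each data type needs its own kernel instantiation (this is not Fluid's registration mechanism), a templated functor can be instantiated once per supported type:

```cpp
#include <cstddef>

// Illustrative only: one templated kernel body, instantiated separately for
// each supported DataType; each instantiation would be registered under an
// OpKernelType with a different data_type_.
template <typename T>
struct ScaleKernel {
  void operator()(const T* in, T* out, std::size_t n, T scale) const {
    for (std::size_t i = 0; i < n; ++i) out[i] = in[i] * scale;
  }
};

// ScaleKernel<float>   -> registered with data_type_ = fp32
// ScaleKernel<double>  -> registered with data_type_ = fp64
```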
### Layout

Actually, a Tensor is a view of a block of memory. Besides a pointer to the memory, we also need some other descriptions of this block of memory, such as shape (ddim), stride, and layout.

Different layouts lead to different implementations of an operator kernel. There are mainly four principles we have to follow to support layout in our Fluid framework:

- We take layout as a data member of Tensor. Layout is actually an enum variable. If Fluid is built with MKLDNN, then the memory formats in MKLDNN will be added into this enum variable too (see the sketch after this list).

- Users have to set the layout for input data, and some operators, like fill_constant/random, also have to set the layout for the data they generate. Of course, we can have some default layout, like NCHW.

- The inference of layout is done at run-time, not at compile-time.

- Every operator has to implement different kernels for different layouts. Take MKLDNN as an example: if we want to implement an MKLDNN convolution operator, we have to implement all the kernels for the different layouts listed [here](http://01org.github.io/mkl-dnn/structmkldnn_1_1memory.html), and we will have a special macro to register kernels for MKLDNN operators.
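
Below is a minimal sketch of the first principle, with layout stored as a data member of a simplified Tensor (not Fluid's real Tensor):

```cpp
#include <cstdint>
#include <vector>

// Simplified sketch: layout kept as a data member of the Tensor, alongside
// the usual shape and data pointer.
enum class Layout { kNCHW, kNHWC /* , MKLDNN formats when enabled */ };

struct Tensor {
  void* data = nullptr;           // pointer to the underlying memory block
  std::vector<int64_t> dims;      // shape (ddim)
  Layout layout = Layout::kNCHW;  // default layout
};
```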
`Layout` is also defined as an enum variable:

```cpp
enum Layout {
  kNCHW,
  kNHWC,
#ifdef PADDLE_WITH_MKLDNN
  knChw8c
  ...
#endif
};
```
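
To see why a kernel must match the layout, consider the offset of element (n, c, h, w) in a contiguous 4-D tensor under the two basic layouts (plain index arithmetic, independent of Fluid's internals):

```cpp
#include <cstddef>

// A kernel written with NCHW strides would read the wrong elements if the
// data were actually laid out as NHWC, which is why each layout needs its
// own kernel implementation.
inline std::size_t OffsetNCHW(std::size_t n, std::size_t c, std::size_t h,
                              std::size_t w, std::size_t C, std::size_t H,
                              std::size_t W) {
  return ((n * C + c) * H + h) * W + w;
}

inline std::size_t OffsetNHWC(std::size_t n, std::size_t c, std::size_t h,
                              std::size_t w, std::size_t C, std::size_t H,
                              std::size_t W) {
  return ((n * H + h) * W + w) * C + c;
}
```

Blocked vendor formats such as MKLDNN's nChw8c change the offset computation yet again, which is why the enum grows when Fluid is built with MKLDNN.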