* init kernel hint
* fix typo
* rm unused code
* add include in op_kernel.h
* restore op_kernel since it will be moved to op_kernel_type
* change force_cpu to use_cpu
* fix compilation
The main changes are:
- use `DeviceContext` instead of `Place` as the template parameter of math functors and `OpKernel`
- remove `eigen_device` interface in base class `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
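
To illustrate the first change, here is a minimal sketch of a functor and kernel templated on a device context type rather than a place type. It is not the code from this PR: `CPUDeviceContext` here is a simplified stand-in, and `ScaleFunctor`, `ScaleKernel`, and `ScaleOnCpu` are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for a framework device context (assumption, not the real class).
struct CPUDeviceContext {};

// Primary template: the functor is selected by DeviceContext, not by Place.
template <typename DeviceContext, typename T>
struct ScaleFunctor;

// CPU specialization; a CUDA context would get its own specialization.
template <typename T>
struct ScaleFunctor<CPUDeviceContext, T> {
  void operator()(const CPUDeviceContext& ctx, T factor,
                  std::vector<T>* data) const {
    (void)ctx;  // a real context would supply streams, handles, etc.
    assert(data != nullptr);
    for (std::size_t i = 0; i < data->size(); ++i) {
      (*data)[i] *= factor;
    }
  }
};

// Hypothetical kernel that takes the same DeviceContext template parameter
// and forwards it to the functor.
template <typename DeviceContext, typename T>
class ScaleKernel {
 public:
  explicit ScaleKernel(T factor) : factor_(factor) {}

  void Compute(const DeviceContext& ctx, std::vector<T>* data) const {
    ScaleFunctor<DeviceContext, T>()(ctx, factor_, data);
  }

 private:
  T factor_;
};

// Usage helper: scales a copy of the input on the CPU context.
inline std::vector<float> ScaleOnCpu(std::vector<float> input, float factor) {
  CPUDeviceContext ctx;
  ScaleKernel<CPUDeviceContext, float> kernel(factor);
  kernel.Compute(ctx, &input);
  return input;
}
```

Because the kernel and the functor share the same `DeviceContext` parameter, the kernel can pass its context straight through without converting from a place type, which is presumably why `platform::EigenDeviceConverter` is no longer needed.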