* HIP cmake.
Enable whole-archive build for pybind library.
Disable two warnings.
Rollback to C++11.
Link RCCL to work around GPU kernel loading issue.
Update eigen to fix build failure.
Add more include directories.
Fix O3 build failure.
Update eigen.
Fix tensor_util_test segmentation fault.
Add more macro checks in hip.cmake.
In the future, we may consider refining hip.cmake to inherit all add_definitions() from the parent scope.
Fix rocRAND load.
Update eigen to fix gru_unit_op and reduce_op.
Add HIP support to testing.
Update eigen to support int16 and int8 in arg min and arg max.
* Add rocprim as the HIP counterpart of the cub library used by the NVIDIA implementation.
* Reduce build time in rocprim.
* Add rocprim introduction; remove unused CMake code.
* Remove unused flags and format CMake file.