* Add profiling information for the inference example recognize_digits.
* Refine the profiling method.
* Correct the use of RecordEvent and simplify recognize_digits.
* compile and install the static library of fluid inference
* fix dynload_cuda not in CPU mode
* update shared library and adjust the deployment of openblas
* adjust the deployment of openblas
* automatically add all fluid modules to the static library
* use libprotobuf.a instead of libprotobuf-lite.a for profiler
* use set_property to set the global variable instead of ENV
* add GPU dependencies of fluid modules, automatically add inference_lib_dist dependencies
* change the condition of openblas_lib, and fix a typo