* init batch norm op
* prepare input output
* compute mean_out var_out save_mean save_var on CPU
* active is test
* use eigen to do computation
* complete batch norm forward
* set default momentum to 0.9
* add batch norm grad op in CPU
* add tensor_format and NHWC support, add python test
* add test training
* add batch norm gradient test
* improve comment, fix forward Python UnitTest
* add gradient test
* fix eigen warning
* follow name style
* fix a bug
* change float to T
* add simple forward test
* test with different place
* add backward test
* refine python test
* remove old python test code
* code clean
* follow code style
* update comment
* "add model format design doc"
* "add restore function"
* "add parse protobuf"
* "move necessary information to saver.proto"
* "format code"
* "add gpu option"
* "add lod info"
* "add saveop python test wrapper"
* "checkpoint reuse save operator"
* "rewrite model format design doc"
* "async support needed"
* "fix run once"
* "fix doc based on comments"
* "refine based on comments"
* "fix based on comments"
* "remove persistable flag from framework.proto"
* "add IndicateDataType to restore op"
* "add save test"
* "modify save restore code"
* "modified the restore logic"
* rm checkpoint_op.cc
* rm test_checkpoint_op.py
* "get inputs outputs name from execution context"
* Save each variable to an independent file
* Fix bugs
* Rewrite save_restore_op_test with new Python framework
* Move `SaveOp` and `RestoreOp` from OpWithKernel to OpBase
* Refine unit test of SaveOp and RestoreOp
* fix compile error