I expect that we need it to:
1. Find the commit that made a test flaky.
2. Find the commit that introduced a regression.
Disabling a test is dangerous. We should first try to find the culprit commit
and then fix it or revert it.
I managed to identify commit 3d1ac72a (between 279aa62 and 6773129) as the
cause of the recent failure of test_rnn_encoder_decoder.
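
A minimal sketch of how such a culprit search could be automated, assuming the history between a known-good and a known-bad commit is linear and that every commit after the culprit is also bad. The names `find_first_bad` and `is_bad` are hypothetical and only for illustration; `git bisect` performs the same kind of search on a real repository.

```python
from typing import Callable, Sequence


def find_first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit by binary search.

    Assumptions (not checked against a real repository):
    - `commits` is ordered oldest to newest,
    - commits[0] is known good and commits[-1] is known bad,
    - once a commit is bad, all later commits are bad too.
    `is_bad` is a hypothetical callback that checks out a commit,
    runs the test, and returns True if the test fails.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")

    good, bad = 0, len(commits) - 1  # invariant: commits[good] good, commits[bad] bad
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # culprit is at mid or earlier
        else:
            good = mid  # culprit is after mid
    return commits[bad]


if __name__ == "__main__":
    # Placeholder history: "c5" is the (made-up) commit that breaks the test.
    history = [f"c{i}" for i in range(8)]
    culprit = find_first_bad(history, lambda c: int(c[1:]) >= 5)
    print(culprit)
```

For a flaky test, a single run per commit is not reliable. One option is for `is_bad` to run the test several times and report the commit as bad if any run fails, at the cost of a longer search.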