Commit Graph

4 Commits (44b738e8ac359c5ee252ae88ebf6eecc499a87ee)

Author     SHA1        Message                                                     Date
xulei2020  18b519ae0f  add sentence piece                                          6 years ago
qianlong   cae77c0c22  BasicTokenizer not case fold on preserverd words            6 years ago
qianlong   4f16f036be  Add WhitespaceTokenizer and UnicodeScriptTokenizer for nlp  6 years ago
qianlong   451c20a6f5  Add UnicodeCharTokenizer for nlp                            6 years ago