mindspore/tests/ut/data/dataset/testTokenizerData/bert_tokenizer.txt

床前明月光
疑是地上霜
举头望明月
低头思故乡
I am making small mistakes during working hours
😀嘿嘿😃哈哈😄大笑😁嘻嘻
繁體字
unused [CLS]
unused [SEP]
unused [UNK]
unused [PAD]
unused [MASK]
[unused1]
[unused10]
12+/-28=40/-16