TNT Description

The TNT (Transformer in Transformer) network is a pure transformer model for visual recognition. TNT treats an image as a sequence of patches and each patch as a sequence of pixels. A TNT block uses an outer transformer block to process the sequence of patches and an inner transformer block to process the sequence of pixels within each patch.
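The two-level view described above can be illustrated with a small, framework-free sketch. It only shows how an image becomes a sequence of patches, each of which is itself a sequence of pixels; it is not the code in src/tnt.py. The function name `image_to_patch_sequences`, the 4x4 toy image, and the patch size of 2 are assumptions chosen for illustration.

```python
def image_to_patch_sequences(image, patch_size):
    """Split a 2-D image (list of rows) into a sequence of patches.

    Each patch is returned as a flat, row-major sequence of its pixels,
    so the result is a "sequence of sequences": the outer level is what an
    outer transformer block would see, the inner level is what an inner
    transformer block would see. Illustrative only.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch into a sequence of pixels (row-major order).
            pixels = [
                image[top + dy][left + dx]
                for dy in range(patch_size)
                for dx in range(patch_size)
            ]
            patches.append(pixels)
    return patches


# Hypothetical 4x4 single-channel image whose pixel values are 0..15.
toy_image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = image_to_patch_sequences(toy_image, patch_size=2)

print("outer sequence length (patches):", len(patches))
print("inner sequence length (pixels per patch):", len(patches[0]))
print("first patch:", patches[0])
```

In a real model each pixel and each patch would be mapped to an embedding vector before entering the inner and outer transformer blocks; that step is omitted here.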

Paper: Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang. Transformer in Transformer. arXiv preprint, 2021.

Model architecture

The overall network architecture of TNT is shown below (figure: fig/tnt.png):

Dataset

Dataset used: Oxford-IIIT Pet

  • Dataset size: 7049 color images in 37 classes
    • Train: 3680 images
    • Test: 3369 images
  • Data format: RGB images.
    • Note: Data will be processed in src/pet_dataset.py

Environment Requirements

Script description

Script and sample code

TNT
├── eval.py # inference entry
├── fig
│   └── tnt.png # the illustration of TNT network
├── readme.md # Readme
└── src
    ├── config.py # config of model and data
    ├── pet_dataset.py # dataset loader
    └── tnt.py # TNT network

Training process

To Be Done

Eval process

Usage

After installing MindSpore via the official website, you can start evaluation as follows:

Launch

# infer example
  GPU: python eval.py --model tnt-b --dataset_path ~/Pets/test.mindrecord --platform GPU --checkpoint_path [CHECKPOINT_PATH]

The checkpoint can be downloaded from https://www.mindspore.cn/resources/hub.
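For readers who want to see how the flags in the launch command fit together, here is a minimal sketch of a matching command-line parser. It assumes the script uses Python's standard `argparse` module; the actual eval.py may define its options differently, and the defaults, help strings, and the `build_parser` name are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for the four flags shown in the launch command.

    Illustrative sketch only: defaults and help texts are assumptions,
    not copied from eval.py.
    """
    parser = argparse.ArgumentParser(description="TNT evaluation (illustrative sketch)")
    parser.add_argument("--model", default="tnt-b",
                        help="model variant, e.g. tnt-b")
    parser.add_argument("--dataset_path", required=True,
                        help="path to the evaluation data in MindRecord format")
    parser.add_argument("--platform", default="GPU",
                        help="target platform, e.g. GPU")
    parser.add_argument("--checkpoint_path", required=True,
                        help="path to the .ckpt file to evaluate")
    return parser


# Parse the same arguments as the launch example, passed as a list
# so the sketch can run without a real command line.
args = build_parser().parse_args([
    "--model", "tnt-b",
    "--dataset_path", "~/Pets/test.mindrecord",
    "--platform", "GPU",
    "--checkpoint_path", "./tnt-b-pets.ckpt",
])
print(vars(args))
```

Note that `argparse` does not expand `~`; a script that accepts such paths would typically call `os.path.expanduser` on them when the shell has not already done so.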

Result

result: {'acc': 0.95} ckpt= ./tnt-b-pets.ckpt
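The `acc` value reported above is a top-1 accuracy, that is, the fraction of test images whose highest-scoring class equals the ground-truth label. The sketch below shows that computation in plain Python under stated assumptions: the function name `top1_accuracy` and the toy scores and labels are made up for illustration and are not taken from eval.py or from the actual evaluation run.

```python
def top1_accuracy(scores, labels):
    """Return the fraction of samples whose arg-max class matches the label."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and the same length")

    correct = 0
    for sample_scores, label in zip(scores, labels):
        # Index of the highest score is the predicted class.
        predicted = max(range(len(sample_scores)), key=sample_scores.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Hypothetical scores for 4 samples over 3 classes (not real model output).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
toy_labels = [1, 0, 2, 1]

result = {"acc": top1_accuracy(toy_scores, toy_labels)}
print("result:", result)  # 3 of the 4 toy samples are correct -> {'acc': 0.75}
```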

Model Description

Performance

Evaluation Performance

TNT on ImageNet2012

| Parameters | | |
| --- | --- | --- |
| Model Version | TNT-B | TNT-S |
| Uploaded Date | 21/03/2021 (day/month/year) | 21/03/2021 (day/month/year) |
| MindSpore Version | 1.1 | 1.1 |
| Dataset | ImageNet2012 | ImageNet2012 |
| Input size | 224x224 | 224x224 |
| Parameters (M) | 86.4 | 23.8 |
| FLOPs (G) | 14.1 | 5.2 |
| Accuracy (Top1, %) | 82.8 | 81.3 |

TNT on Oxford-IIIT Pet

| Parameters | | |
| --- | --- | --- |
| Model Version | TNT-B | TNT-S |
| Uploaded Date | 21/03/2021 (day/month/year) | 21/03/2021 (day/month/year) |
| MindSpore Version | 1.1 | 1.1 |
| Dataset | Oxford-IIIT Pet | Oxford-IIIT Pet |
| Input size | 384x384 | 384x384 |
| Parameters (M) | 86.4 | 23.8 |
| Accuracy (Top1, %) | 95.0 | 94.7 |

Description of Random Situation

In src/pet_dataset.py, the seed is set inside the "create_dataset" function. A random seed is also used in train.py.
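Why a fixed seed matters can be shown with a small standard-library sketch: with the same seed, a shuffled sample order is reproduced exactly from run to run. This is not the repository's code and does not use MindSpore's own seeding utilities; the `sample_order` helper, the seed value, and the sample count are assumptions for illustration.

```python
import random


def sample_order(num_samples, seed):
    """Return a shuffled ordering of sample indices, determined by the seed."""
    rng = random.Random(seed)  # private generator, leaves global state untouched
    order = list(range(num_samples))
    rng.shuffle(order)
    return order


# Two "runs" with the same hypothetical seed yield the same data order.
first_run = sample_order(10, seed=1)
second_run = sample_order(10, seed=1)

print("same order with same seed:", first_run == second_run)
```

In a full pipeline the same idea applies to every source of randomness, such as dataset shuffling, data augmentation, and weight initialization, each of which needs to be seeded for results to be repeatable.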

ModelZoo Homepage

Please check the official homepage.