PSENet Description
Dataset
Features
- Mixed Precision
Environment Requirements
Quick Start
Script Description
Model Description
- Performance
  - Evaluation Performance
  - Inference Performance
- How to use

PSENet Description

With the development of convolutional neural networks, scene text detection has advanced rapidly. However, two problems still hinder the application of existing algorithms in industry. First, most existing algorithms rely on quadrilateral bounding boxes, which cannot accurately locate text of arbitrary shape. Second, two adjacent text instances can cause a false detection that covers both instances. Traditionally, a segmentation-based approach can solve the first problem, but usually not the second. To solve both problems, the Progressive Scale Expansion Network (PSENet) is proposed, which can accurately detect text instances of arbitrary shape. More specifically, PSENet generates kernels of different scales for each text instance and gradually expands the minimal-scale kernel to the complete shape of the text instance. Because there are large geometric margins between the minimal-scale kernels, the method can effectively separate text instances that lie close together, making it easier to detect arbitrary-shape text instances. The effectiveness of PSENet has been verified by numerous experiments on CTW1500, Total-Text, ICDAR 2015, and ICDAR 2017 MLT.
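The expansion step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code in `src/ETSNET/pse`: the function names, the use of nested lists for kernel masks, and the 4-neighbour connectivity are all assumptions made for the example. Kernels are passed from the smallest scale to the largest, connected components of the smallest kernel seed the labels, and a breadth-first search grows each label into the next kernel on a first-come, first-served basis.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumed)


def label_components(mask):
    """Label connected components of a binary mask with ids 1..n."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if mask[y][x] and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBORS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels


def progressive_scale_expansion(kernels):
    """Grow labels from the smallest kernel into each larger kernel in turn."""
    labels = label_components(kernels[0])
    height, width = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Start the breadth-first search from every pixel labelled so far.
        queue = deque((y, x) for y in range(height)
                      for x in range(width) if labels[y][x])
        while queue:
            y, x = queue.popleft()
            for dy, dx in NEIGHBORS:
                ny, nx = y + dy, x + dx
                # An unlabelled pixel inside the larger kernel takes the label
                # of whichever neighbour reaches it first.
                if (0 <= ny < height and 0 <= nx < width
                        and kernel[ny][nx] and labels[ny][nx] == 0):
                    labels[ny][nx] = labels[y][x]
                    queue.append((ny, nx))
    return labels


# Two text instances that touch in the full-scale mask but not in the
# smallest kernel.
smallest = [[1, 0, 0, 0, 1]]
full = [[1, 1, 1, 1, 1]]
result = progressive_scale_expansion([smallest, full])
print(result)  # [[1, 1, 1, 2, 2]]
```

Because the smallest kernels are well separated, the two instances keep distinct labels even though their full-scale masks form a single connected region.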

Paper: Wenhai Wang, Enze Xie, Xiang Li, Wenbo Hou, Tong Lu, Gang Yu, Shuai Shao. Shape Robust Text Detection with Progressive Scale Expansion Network. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 9336-9345

PSENet Example

Description

Progressive Scale Expansion Network (PSENet) is a text detector that detects arbitrary-shape text in natural scenes well.

Dataset

Dataset used: ICDAR2015
- A training set of 1000 images containing about 4500 readable words
- A testing set containing about 2000 readable words
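For orientation, ICDAR2015 ground-truth files list one word per line as eight comma-separated corner coordinates followed by the transcription, and unreadable words are marked with `###`. The parser below is a minimal sketch of reading such a line. The function name is hypothetical and the sample line is illustrative; `src/dataset.py` may handle this differently.

```python
def parse_icdar2015_line(line):
    """Split one ground-truth line into (polygon, text, is_unreadable)."""
    # Files may start with a UTF-8 byte order mark, so drop it if present.
    parts = line.strip().lstrip("\ufeff").split(",")
    coords = [int(value) for value in parts[:8]]
    # The transcription itself may contain commas, so rejoin the remainder.
    text = ",".join(parts[8:])
    polygon = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    return polygon, text, text == "###"


polygon, text, unreadable = parse_icdar2015_line(
    "377,117,463,117,465,130,378,130,Genaxis Theatre")
print(polygon, text, unreadable)
```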

Environment Requirements

Hardware (Ascend)
- Prepare a hardware environment with an Ascend processor. If you want to try Ascend, please send the application form to ascend@huawei.com. Once approved, you can get the resources.
Framework
- MindSpore
For more information, please check the resources below:
- MindSpore Tutorials
- MindSpore Python API
Dependencies
- Install MindSpore
- Install pybind11
- Install OpenCV 3.4

Quick Start

After installing MindSpore via the official website, you can start training and evaluation as follows:

# run distributed training example
sh scripts/run_distribute_train.sh pretrained_model.ckpt

# set up the OpenCV library
download pybind11 and opencv3.4, then set up opencv3.4

# build libadaptor.so using src/ETSNET/pse/Makefile
make -C src/ETSNET/pse libadaptor.so

# run test.py
python test.py --ckpt=pretrained_model.ckpt

# download the evaluation method from https://rrc.cvc.uab.es/?ch=4&com=tasks#TextLocalization
# click the "My Methods" button, then download the Evaluation Scripts
download script.py
# run evaluation example
sh scripts/run_eval_ascend.sh

Script Description

Script and Sample Code

└── PSENet
    ├── README.md                       // descriptions about PSENet
    ├── scripts
    │   ├── run_distribute_train.sh     // shell script for distributed training
    │   └── eval_ic15.sh                // shell script for evaluation
    ├── src
    │   ├── __init__.py
    │   ├── generate_hccn_file.py       // creating rank.json
    │   ├── ETSNET
    │   │   ├── __init__.py
    │   │   ├── base.py                 // convolution and BN operator
    │   │   ├── dice_loss.py            // calculate PSENet loss value
    │   │   ├── etsnet.py               // Subnet in PSENet
    │   │   ├── fpn.py                  // Subnet in PSENet
    │   │   ├── resnet50.py             // Subnet in PSENet
    │   │   └── pse                     // Subnet in PSENet
    │   │       ├── __init__.py
    │   │       ├── adaptor.cpp
    │   │       ├── adaptor.h
    │   │       └── Makefile
    │   ├── config.py                   // parameter configuration
    │   ├── dataset.py                  // creating dataset
    │   └── network_define.py           // PSENet architecture
    ├── test.py                         // test script
    └── train.py                        // training script

Script Parameters

Major parameters in train.py and config.py are:

--pre_trained: Whether to train from scratch or from a pre-trained
               model. Optional values are True, False.
--device_id: Device ID used to train or evaluate the dataset. Ignore it
             when you use train.sh for distributed training.
--device_num: Number of devices used when you use train.sh for
              distributed training.
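As a rough sketch of how these three parameters could be declared with `argparse`: the types, defaults, help strings, and the `str2bool` helper are assumptions made for illustration, and the actual `train.py` may define them differently.

```python
import argparse


def str2bool(value):
    """Interpret 'True'/'False' strings as booleans (hypothetical helper)."""
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_parser():
    parser = argparse.ArgumentParser(description="PSENet training (illustrative)")
    # Defaults below are assumed, not taken from the repository.
    parser.add_argument("--pre_trained", type=str2bool, default=False,
                        help="train from a pre-trained model instead of from scratch")
    parser.add_argument("--device_id", type=int, default=0,
                        help="device used for training or evaluation")
    parser.add_argument("--device_num", type=int, default=1,
                        help="number of devices for distributed training")
    return parser


args = build_parser().parse_args(["--pre_trained", "True", "--device_num", "4"])
print(args.pre_trained, args.device_id, args.device_num)
```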

Training Process

Distributed Training

sh scripts/run_distribute_train.sh pretrained_model.ckpt

The above shell script runs distributed training in the background. You can view the results in the file device_[X]/log. The loss values look like the following:

# grep "epoch: " device_*/log
device_0/log:epoch: 1, step: 20, loss is 0.80383
device_0/log:epoch: 2, step: 40, loss is 0.77951
...
device_1/log:epoch: 1, step: 20, loss is 0.78026
device_1/log:epoch: 2, step: 40, loss is 0.76629
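To track the loss programmatically rather than with `grep`, log lines of the form `epoch: N, step: N, loss is X` can be parsed with a regular expression. The sketch below is not part of the repository; the function name and the sample lines are illustrative, and lines that do not match the pattern are skipped.

```python
import re

# Matches the "epoch: N, step: N, loss is X" part of a log line.
LOG_LINE = re.compile(r"epoch:\s*(\d+),\s*step:\s*(\d+),\s*loss is\s*([0-9.]+)")


def parse_loss_lines(lines):
    """Return (epoch, step, loss) tuples for every matching log line."""
    records = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match:
            records.append((int(match.group(1)),
                            int(match.group(2)),
                            float(match.group(3))))
    return records


sample = [
    "device_0/log:epoch: 1, step: 20, loss is 0.80383",
    "device_0/log:epoch: 2, step: 40, loss is 0.77951",
    "...",
]
records = parse_loss_lines(sample)
print(records)
```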

Evaluation Process

Eval Script for ICDAR2015

Usage

step 1: download the evaluation method from https://rrc.cvc.uab.es/?ch=4&com=tasks#TextLocalization.
step 2: click the "My Methods" button, then download the Evaluation Scripts.
step 3: it is recommended to symlink the eval method root to $MINDSPORE/model_zoo/psenet/eval_ic15/. If your folder structure is different, you may need to change the corresponding paths in the eval script files.

sh scripts/run_eval_ascend.sh

Result

Calculated!{"precision": 0.814796668299853, "recall": 0.8006740491092923, "hmean": 0.8076736279747451, "AP": 0}
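The `hmean` field is the harmonic mean of precision and recall (the F-measure). A short check, with a function name chosen for this example, reproduces the reported value from the other two fields:

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


value = hmean(0.814796668299853, 0.8006740491092923)
print(round(value, 4))  # 0.8077
```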

Model Description

Performance

Evaluation Performance

Parameters	PSENet
Model Version	Inception V1
Resource	Ascend 910; CPU 2.60GHz, 56 cores; Memory 314G
Uploaded Date	09/15/2020 (month/day/year)
MindSpore Version	1.0-alpha
Dataset	ICDAR2015
Training Parameters	start_lr=0.1; lr_scale=0.1
Optimizer	SGD
Loss Function	DiceLoss
outputs	probability
Loss	0.35
Speed	1pc: 444 ms/step; 4pcs: 446 ms/step
Total time	1pc: 75.48 h; 4pcs: 18.87 h
Parameters (M)	27.36
Checkpoint for Fine tuning	109.44M (.ckpt file)
Scripts	https://gitee.com/mindspore/mindspore/tree/master/model_zoo/psenet
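Two of the figures in the table can be cross-checked with simple arithmetic. The sketch below assumes that each parameter is stored as a 4-byte float32 value and that "M" means the same multiplier in both rows; neither assumption is stated in the table, and the function names are illustrative.

```python
def checkpoint_size_m(params_m, bytes_per_param=4):
    """Checkpoint size in M, assuming float32 (4-byte) parameters."""
    return params_m * bytes_per_param


def steps_from_time(total_hours, ms_per_step):
    """Number of training steps implied by total time and time per step."""
    return total_hours * 3600 * 1000 / ms_per_step


print(round(checkpoint_size_m(27.36), 2))   # 109.44, matching the checkpoint row
print(round(steps_from_time(75.48, 444)))   # 612000 steps on 1pc
```

Under these assumptions, the 27.36 M parameters account for the whole 109.44M checkpoint, and the single-device run corresponds to 612000 steps.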

Inference Performance

Parameters	PSENet
Model Version	Inception V1
Resource	Ascend 910
Uploaded Date	09/15/2020 (month/day/year)
MindSpore Version	1.0-alpha
Dataset	ICDAR2015
outputs	probability
Accuracy	1pc: 81%; 8pcs: 81%

How to use

Inference

If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this Link. The following is a simple example:

# MindSpore imports. ETSNet, DiceLoss, WithLossCell, TrainOneStepCell,
# LossCallBack, lr_generator, dataset, config, cfg and args are expected to be
# provided by this repository's src package and the calling script.
from mindspore import nn
from mindspore.train.model import Model
from mindspore.train.callback import TimeMonitor, CheckpointConfig, ModelCheckpoint
from mindspore.train.serialization import load_checkpoint, load_param_into_net

# Load unseen dataset for inference
ds = dataset.create_dataset(cfg.data_path, 1, False)
step_size = ds.get_dataset_size()

# Define model
config.INFERENCE = False
net = ETSNet(config)
net.set_train()
param_dict = load_checkpoint(args.pre_trained)
load_param_into_net(net, param_dict)
print('Load Pretrained parameters done!')

criterion = DiceLoss(batch_size=config.TRAIN_BATCH_SIZE)

lrs = lr_generator(start_lr=1e-3, lr_scale=0.1, total_iters=config.TRAIN_TOTAL_ITER)
opt = nn.SGD(params=net.trainable_params(), learning_rate=lrs, momentum=0.99, weight_decay=5e-4)

# wrap the model with the loss function and the optimizer
net = WithLossCell(net, criterion)
net = TrainOneStepCell(net, opt)

time_cb = TimeMonitor(data_size=step_size)
loss_cb = LossCallBack(per_print_times=20)
# set and apply checkpoint parameters
ckpoint_cf = CheckpointConfig(save_checkpoint_steps=1875, keep_checkpoint_max=2)
ckpoint_cb = ModelCheckpoint(prefix="ETSNet", config=ckpoint_cf, directory=config.TRAIN_MODEL_SAVE_PATH)

model = Model(net)
model.train(config.TRAIN_REPEAT_NUM, ds, dataset_sink_mode=False, callbacks=[time_cb, loss_cb, ckpoint_cb])

# Load pre-trained model
param_dict = load_checkpoint(cfg.checkpoint_path)
load_param_into_net(net, param_dict)
net.set_train(False)

# Make predictions on the unseen dataset
acc = model.eval(ds)
print("accuracy: ", acc)
