commit e0fa21bd94
---

#!/bin/bash
# Prepare the debug assets for the C++ OCR demo under the target directory.
# $1: target directory that contains demo/cxx/ocr
mkdir -p "$1"/demo/cxx/ocr/debug/
cp ../../ppocr/utils/ppocr_keys_v1.txt "$1"/demo/cxx/ocr/debug/
cp -r ./* "$1"/demo/cxx/ocr/
cp ./config.txt "$1"/demo/cxx/ocr/debug/
cp ../../doc/imgs/11.jpg "$1"/demo/cxx/ocr/debug/

echo "Prepare Done"
---

# Paddle Serving Deployment

This tutorial walks through deploying an online PaddleOCR prediction service with [Paddle Serving](https://github.com/PaddlePaddle/Serving).

## Quick service start

### 1. Prepare the environment

### 2. Convert the model

### 3. Start the service

The service can be started in either the `standard` or the `fast` edition, depending on your needs; the two are compared below:

|Edition|Features|Suitable scenarios|
|-|-|-|
|Standard|||
|Fast|||

#### Option 1. Start the standard edition

#### Option 2. Start the fast edition

## Send a prediction request
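Once the service is running, a client typically sends an HTTP request carrying the base64-encoded image. As a sketch only (this skeleton does not yet define the endpoint, port, or JSON field names, so the `feed`/`fetch` layout, the URL, and port 9292 below are all assumptions):

```python
import base64

def build_ocr_request(image_path):
    """Encode an image file as base64 and wrap it in a JSON-serializable body.

    The {"feed": [...], "fetch": [...]} shape follows the usual Paddle Serving
    client convention; the exact field names are assumptions, not confirmed
    by this document.
    """
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    return {"feed": [{"image": image_b64}], "fetch": ["res"]}

# Sending the request (server address and endpoint are placeholders):
# import json, requests
# resp = requests.post("http://127.0.0.1:9292/ocr/prediction",
#                      data=json.dumps(build_ocr_request("11.jpg")),
#                      headers={"Content-Type": "application/json"})
```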

## Response format description

## Customize the service logic
---

# BENCHMARK

This document reports the prediction latency of the PaddleOCR ultra-lightweight Chinese model (8.6M) on several platforms.

## TEST DATA

* 500 images were randomly sampled from the public Chinese dataset [ICDAR2017-RCTW](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/datasets.md#ICDAR2017-RCTW-17). Most of the images were collected in the wild with mobile-phone cameras; some are screenshots. They cover a variety of scenes, including street views, posters, menus, indoor scenes, and screenshots of mobile applications.

## MEASUREMENT

The prediction latency on the four platforms is as follows:

| Long size (px) | T4 (s) | V100 (s) | Intel Xeon 6148 (s) | Snapdragon 855 (s) |
| :---------: | :-----: | :-------: | :------------------: | :-----------------: |
| 960 | 0.092 | 0.057 | 0.319 | 0.354 |
| 640 | 0.067 | 0.045 | 0.198 | 0.236 |
| 480 | 0.057 | 0.043 | 0.151 | 0.175 |

Notes:
* The measured stage runs from image input to result output, including image pre-processing and post-processing.
* `Intel Xeon 6148` is a server-side CPU. Intel MKL-DNN was used in the test to accelerate CPU prediction. To enable it:
    * Update to the latest version of PaddlePaddle: https://www.paddlepaddle.org.cn/documentation/docs/zh/install/Tables.html#whl-dev
      Select the mkl wheel package that matches the CUDA and Python versions of your environment; for example, for CUDA 10 and Python 3.7:

      ```
      # Obtain the installation package
      wget https://paddle-wheel.bj.bcebos.com/0.0.0-gpu-cuda10-cudnn7-mkl/paddlepaddle_gpu-0.0.0-cp37-cp37m-linux_x86_64.whl
      # Installation
      pip3.7 install paddlepaddle_gpu-0.0.0-cp37-cp37m-linux_x86_64.whl
      ```
    * Pass `--enable_mkldnn True` to turn on the acceleration switch when running prediction.
* `Snapdragon 855` is a mobile processing platform.
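`Long size` in the table above is the length that the image's longer side is scaled to during pre-processing, keeping the aspect ratio. A minimal sketch of that resize rule (the function name and rounding behavior are illustrative, not the exact PaddleOCR implementation):

```python
def resize_to_long_size(width, height, long_size=960):
    """Scale (width, height) so the longer side equals long_size,
    preserving the aspect ratio. Returns the new (width, height)."""
    scale = long_size / max(width, height)
    return round(width * scale), round(height * scale)
```

A smaller `long_size` means fewer pixels to process, which is why latency drops from the 960 row to the 480 row at some cost in small-text accuracy.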
---

# DATA ANNOTATION TOOLS

Commonly used data annotation tools are listed below; the list will be updated continuously. Contributions of tools are welcome~

### 1. labelImg
- Tool description: rectangular labels
- Tool address: https://github.com/tzutalin/labelImg
- Sketch diagram:
![labelimg](../datasets/labelimg.jpg)

### 2. roLabelImg
- Tool description: a labeling tool rewritten from labelImg that supports rotated rectangular labels
- Tool address: https://github.com/cgvict/roLabelImg
- Sketch diagram:
![roLabelImg](../datasets/roLabelImg.png)

### 3. labelme
- Tool description: supports four-point, polygon, circle, and other labels
- Tool address: https://github.com/wkentaro/labelme
- Sketch diagram:
![labelme](../datasets/labelme.jpg)
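labelme saves one JSON file per image with a `shapes` list of labeled polygons. A minimal sketch of reading those annotations into Python (assuming the standard labelme JSON layout; field names beyond `shapes`, `label`, and `points` are not used here):

```python
import json

def load_labelme_polygons(json_path):
    """Read a labelme annotation file and return a list of
    (label, points) tuples, where points is a list of [x, y] pairs."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    return [(shape["label"], shape["points"]) for shape in data.get("shapes", [])]
```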
---

# DATA SYNTHESIS TOOLS

In addition to open-source data, users can synthesize their own training data with dedicated tools. Commonly used data synthesis tools are listed below; the list will be updated continuously. Contributions of tools are welcome~

* [Text_renderer](https://github.com/Sanster/text_renderer)
* [SynthText](https://github.com/ankush-me/SynthText)
* [SynthText_Chinese_version](https://github.com/JarveeLee/SynthText_Chinese_version)
* [TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator)
* [SynthText3D](https://github.com/MhLiao/SynthText3D)
* [UnrealText](https://github.com/Jyouhou/UnrealText/)
---

# Handwritten OCR Datasets

Commonly used handwritten OCR datasets are collected here and will be updated continuously. Contributions of datasets are welcome~

- [Institute of Automation, Chinese Academy of Sciences: handwritten Chinese dataset](#casia-hwdb)
- [NIST handwritten single-character dataset (English)](#nist-sd19)

<a name="casia-hwdb"></a>
## Institute of Automation, Chinese Academy of Sciences: handwritten Chinese dataset
- **Data source**: http://www.nlpr.ia.ac.cn/databases/handwriting/Download.html
- **Data introduction**:
    * It includes both online and offline handwritten data. `HWDB1.0~1.2` contains 3,895,135 handwritten single-character samples in 7,356 categories (7,185 Chinese characters and 171 English letters, numbers, and symbols); `HWDB2.0~2.2` contains 5,091 page images, segmented into 52,230 text lines and 1,349,414 characters. All text and character samples are stored as grayscale images. Some samples are shown below.

![](../datasets/CASIA_0.jpg)

- **Download address**: http://www.nlpr.ia.ac.cn/databases/handwriting/Download.html
- **Usage suggestion**: the data consists of single characters on a white background, so large numbers of text lines can be composed for training. The white background can also be made transparent, which makes it easy to paste the characters onto various backgrounds. When semantics matter, it is recommended to sample single characters from a real corpus when composing the text lines.

<a name="nist-sd19"></a>
## NIST handwritten single-character dataset, English (NIST Handprinted Forms and Characters Database)

- **Data source**: [https://www.nist.gov/srd/nist-special-database-19](https://www.nist.gov/srd/nist-special-database-19)

- **Data introduction**: the NIST19 dataset is suitable for training handwritten document and character recognition models. It was extracted from the handwriting sample forms of 3,600 writers and contains 810,000 character images in total. Nine of them are shown below.

![](../datasets/nist_demo.png)

- **Download address**: [https://www.nist.gov/srd/nist-special-database-19](https://www.nist.gov/srd/nist-special-database-19)
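Both datasets provide single-character grayscale images, and the usage suggestion above proposes composing them into text lines. A minimal dependency-free sketch of that composition (the function and its padding scheme are illustrative, not part of any dataset toolkit; images are lists of pixel rows):

```python
def compose_text_line(char_images, pad=2, background=255):
    """Concatenate single-character grayscale images horizontally into one
    text-line image, centering shorter characters vertically and separating
    them with `pad` background columns."""
    height = max(len(img) for img in char_images)
    line = [[] for _ in range(height)]
    for img in char_images:
        top = (height - len(img)) // 2
        width = len(img[0])
        for row in range(height):
            if top <= row < top + len(img):
                pixels = img[row - top]
            else:
                pixels = [background] * width
            line[row].extend(pixels + [background] * pad)
    return line
```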
---

# Quick Start with the Chinese OCR Model

## 1. Prepare the environment

Please refer to [quick installation](./installation_en.md) to configure the PaddleOCR operating environment.

## 2. Inference models

| Name | Introduction | Detection model | Recognition model | Recognition model with space support |
|-|-|-|-|-|
|chinese_db_crnn_mobile| Ultra-lightweight Chinese OCR model |[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar)
|chinese_db_crnn_server| Universal Chinese OCR model |[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar)|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)

* If wget is not installed in your Windows environment, you can paste the link into a browser to download the model, then uncompress it and place it in the corresponding directory.

Copy the download addresses of the detection and recognition `inference model`s from the table above, then download and uncompress them:

```
mkdir inference && cd inference
# Download the detection model and unzip it
wget {url/of/detection/inference_model} && tar xf {name/of/detection/inference_model/package}
# Download the recognition model and unzip it
wget {url/of/recognition/inference_model} && tar xf {name/of/recognition/inference_model/package}
cd ..
```

Take the ultra-lightweight model as an example:

```
mkdir inference && cd inference
# Download the detection model of the ultra-lightweight Chinese OCR model and uncompress it
wget https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar && tar xf ch_det_mv3_db_infer.tar
# Download the recognition model of the ultra-lightweight Chinese OCR model and uncompress it
wget https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar && tar xf ch_rec_mv3_crnn_infer.tar
cd ..
```

After decompression, the file structure should be as follows:

```
|-inference
    |-ch_rec_mv3_crnn
        |- model
        |- params
    |-ch_det_mv3_db
        |- model
        |- params
    ...
```

## 3. Single-image or image-set prediction

* The following commands run the text detection and recognition pipeline. When running prediction, specify the path of a single image or an image directory with the parameter `image_dir`; the parameter `det_model_dir` specifies the path of the detection inference model, and the parameter `rec_model_dir` specifies the path of the recognition inference model. The visualized results are saved to the `./inference_results` folder by default.

```bash
# Predict a single image specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_mv3_db/" --rec_model_dir="./inference/ch_rec_mv3_crnn/"

# Predict an image set specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_det_mv3_db/" --rec_model_dir="./inference/ch_rec_mv3_crnn/"

# If you want to use the CPU for prediction, set the use_gpu parameter to False
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_mv3_db/" --rec_model_dir="./inference/ch_rec_mv3_crnn/" --use_gpu=False
```
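For reference, `image_dir` accepting either a single image or a directory can be sketched as follows (a simplified illustration, not the exact helper used by `predict_system.py`):

```python
import os

def get_image_file_list(image_dir, exts=(".jpg", ".jpeg", ".png", ".bmp")):
    """Resolve an --image_dir argument: a single image path is returned
    as a one-element list, a directory is scanned for image files."""
    if os.path.isfile(image_dir):
        return [image_dir]
    return sorted(
        os.path.join(image_dir, name)
        for name in os.listdir(image_dir)
        if name.lower().endswith(exts)
    )
```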

- Universal Chinese OCR model

Please follow the steps above to download the corresponding models and update the relevant parameters. An example is as follows:

```
# Predict a single image specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_r50_vd_db/" --rec_model_dir="./inference/ch_rec_r34_vd_crnn/"
```

- Universal Chinese OCR model with space support

Please follow the steps above to download the corresponding models and update the relevant parameters. An example is as follows:

* Note: please update the source code to the latest version and add the parameter `--use_space_char=True`

```
# Predict a single image specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs_en/img_12.jpg" --det_model_dir="./inference/ch_det_r50_vd_db/" --rec_model_dir="./inference/ch_rec_r34_vd_crnn_enhance/" --use_space_char=True
```

For more on running text detection and recognition in tandem at inference time, please refer to the tutorial: [Inference with the Python inference engine](./inference_en.md).

In addition, the tutorial also provides other deployment methods for the Chinese OCR model:
- [Server-side C++ inference](../../deploy/cpp_infer/readme_en.md)
- [Service deployment](./serving_en.md)
- [End-to-end deployment](../../deploy/lite/readme_en.md)
---

# REFERENCE

```
1. EAST:
@inproceedings{zhou2017east,
  title={EAST: an efficient and accurate scene text detector},
  author={Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={5551--5560},
  year={2017}
}

2. DB:
@article{liao2019real,
  title={Real-time Scene Text Detection with Differentiable Binarization},
  author={Liao, Minghui and Wan, Zhaoyi and Yao, Cong and Chen, Kai and Bai, Xiang},
  journal={arXiv preprint arXiv:1911.08947},
  year={2019}
}

3. DTRB:
@inproceedings{baek2019wrong,
  title={What is wrong with scene text recognition model comparisons? dataset and model analysis},
  author={Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={4715--4723},
  year={2019}
}

4. SAST:
@inproceedings{wang2019single,
  title={A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning},
  author={Wang, Pengfei and Zhang, Chengquan and Qi, Fei and Huang, Zuming and En, Mengyi and Han, Junyu and Liu, Jingtuo and Ding, Errui and Shi, Guangming},
  booktitle={Proceedings of the 27th ACM International Conference on Multimedia},
  pages={1277--1285},
  year={2019}
}

5. SRN:
@article{yu2020towards,
  title={Towards Accurate Scene Text Recognition with Semantic Reasoning Networks},
  author={Yu, Deli and Li, Xuan and Zhang, Chengquan and Han, Junyu and Liu, Jingtuo and Ding, Errui},
  journal={arXiv preprint arXiv:2003.12294},
  year={2020}
}

6. end2end-psl:
@inproceedings{sun2019chinese,
  title={Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning},
  author={Sun, Yipeng and Liu, Jiaming and Liu, Wei and Han, Junyu and Ding, Errui and Liu, Jingtuo},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={9086--9095},
  year={2019}
}
```
---

# Visualization

- [Chinese/English OCR Visualization (space support)](#Space_support)
- [Ultra-lightweight Chinese/English OCR Visualization](#Ultra-lightweight)
- [General Chinese/English OCR Visualization](#General)

<a name="Space_support"></a>
## Chinese/English OCR Visualization (space support)

### Ultra-lightweight Model
<div align="center">
    <img src="../imgs_results/img_11.jpg" width="800">
</div>

### General OCR Model
<div align="center">
    <img src="../imgs_results/chinese_db_crnn_server/en_paper.jpg" width="800">
</div>

<a name="Ultra-lightweight"></a>
## Ultra-lightweight Chinese/English OCR Visualization

<div align="center">
    <img src="../imgs_results/1.jpg" width="800">
</div>

<div align="center">
    <img src="../imgs_results/7.jpg" width="800">
</div>

<div align="center">
    <img src="../imgs_results/12.jpg" width="800">
</div>

<div align="center">
    <img src="../imgs_results/4.jpg" width="800">
</div>

<div align="center">
    <img src="../imgs_results/6.jpg" width="800">
</div>

<div align="center">
    <img src="../imgs_results/9.jpg" width="800">
</div>

<div align="center">
    <img src="../imgs_results/16.png" width="800">
</div>

<div align="center">
    <img src="../imgs_results/22.jpg" width="800">
</div>

<a name="General"></a>
## General Chinese/English OCR Visualization

<div align="center">
    <img src="../imgs_results/chinese_db_crnn_server/11.jpg" width="800">
</div>

<div align="center">
    <img src="../imgs_results/chinese_db_crnn_server/2.jpg" width="800">
</div>

<div align="center">
    <img src="../imgs_results/chinese_db_crnn_server/8.jpg" width="800">
</div>