parent
bd1c748de6
commit
4d22bf3af6
@ -0,0 +1,49 @@
# Optional parameters list

The following list can be viewed via `--help`:

| FLAG | Supported script | Use | Default | Note |
| :----------------------: | :------------: | :---------------: | :--------------: | :-----------------: |
| -c | ALL | Specify the configuration file to use | None | **Please refer to the parameter introduction for configuration file usage** |
| -o | ALL | Set configuration options | None | Options set with -o take priority over the configuration file selected with -c, e.g. `-o Global.use_gpu=false` |
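
The priority rule between `-c` and `-o` can be pictured as a dictionary merge. The following is only a minimal sketch, assuming the configuration file has already been loaded into a nested dict; `parse_value` and `apply_overrides` are hypothetical helper names and not part of the PaddleOCR tools.

```python
import copy

def parse_value(text):
    """Turn a command-line string into a bool, int, or float, else keep it as str."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def apply_overrides(config, overrides):
    """Return a copy of config with each 'Section.key=value' override applied."""
    merged = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = merged
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return merged

# Values as they might come from the file selected with -c (illustrative only).
file_config = {"Global": {"use_gpu": True, "epoch_num": 3000}}
merged = apply_overrides(file_config, ["Global.use_gpu=false"])
print(merged["Global"]["use_gpu"])
```

Because the overrides are applied after the file has been read, a value given with `-o` replaces the one from the file, while every other key keeps its file value.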

## Introduction to Global Parameters of Configuration File

Take `rec_chinese_lite_train.yml` as an example:

| Parameter | Use | Default | Note |
| :----------------------: | :---------------------: | :--------------: | :--------------------: |
| algorithm | Select the algorithm to use | Follows the configuration file | To select a model, please refer to the supported model [list](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/README_en.md) |
| use_gpu | Whether to use the GPU | true | \ |
| epoch_num | Maximum number of training epochs | 3000 | \ |
| log_smooth_window | Sliding window size | 20 | \ |
| print_batch_step | Set the log printing interval | 10 | \ |
| save_model_dir | Set the model save path | output/{model_name} | \ |
| save_epoch_step | Set the model save interval | 3 | \ |
| eval_batch_step | Set the model evaluation interval | 2000 | \ |
| train_batch_size_per_card | Set the batch size during training | 256 | \ |
| test_batch_size_per_card | Set the batch size during testing | 256 | \ |
| image_shape | Set the input image size | [3, 32, 100] | \ |
| max_text_length | Set the maximum text length | 25 | \ |
| character_type | Set the character type | ch | en/ch; the default dict is used for en, and the custom dict is used for ch |
| character_dict_path | Set the dictionary path | ./ppocr/utils/ic15_dict.txt | \ |
| loss_type | Set the loss type | ctc | Supports two types of loss: ctc / attention |
| reader_yml | Set the reader configuration file | ./configs/rec/rec_icdar15_reader.yml | \ |
| pretrain_weights | Path of the pre-trained model to load | ./pretrain_models/CRNN/best_accuracy | \ |
| checkpoints | Path of the saved model to load | None | Used to load saved parameters to continue training after an interruption |
| save_inference_dir | Path to save the inference model | None | Used to save the inference model |
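
As an illustration of what a sliding window such as `log_smooth_window` might do, the sketch below averages the most recent values of a logged quantity. It assumes the window is used to smooth training statistics before they are printed; the class name `SmoothedValue` is hypothetical and not taken from the PaddleOCR code.

```python
from collections import deque

class SmoothedValue:
    """Keep the most recent values and report their average."""

    def __init__(self, window_size=20):
        # A deque with maxlen drops the oldest value automatically.
        self.values = deque(maxlen=window_size)

    def update(self, value):
        self.values.append(value)

    @property
    def average(self):
        return sum(self.values) / len(self.values) if self.values else 0.0

# Small window so the effect is easy to see: only 2.0, 3.0, 4.0 remain.
loss = SmoothedValue(window_size=3)
for step_loss in (1.0, 2.0, 3.0, 4.0):
    loss.update(step_loss)
print(loss.average)
```

A larger window gives a steadier curve in the log, at the cost of reacting more slowly to real changes in the loss.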

## Introduction to Reader Parameters of Configuration File

Take `rec_chinese_reader.yml` as an example:

| Parameter | Use | Default | Note |
| :----------------------: | :---------------------: | :--------------: | :--------------------: |
| reader_function | Select the data reading method | ppocr.data.rec.dataset_traversal,SimpleReader | Supports two data reading methods: SimpleReader / LMDBReader |
| num_workers | Set the number of data reading threads | 8 | \ |
| img_set_dir | Image folder path | ./train_data | \ |
| label_file_path | Ground truth file path | ./train_data/rec_gt_train.txt | \ |
| infer_img | Result folder path | ./infer_img | \ |
@ -0,0 +1,30 @@
# How to make your own ultra-lightweight OCR models?

Making a customized ultra-lightweight OCR model can be divided into three steps: training a text detection model, training a text recognition model, and concatenating the predictions from the previous two steps.

## step1: Train text detection model

PaddleOCR provides two text detection algorithms: EAST and DB. Both support the MobileNetV3 and ResNet50_vd backbone networks. Select the corresponding configuration file as needed and start training. For example, to train a DB detection model with MobileNetV3 as the backbone network:
```
python3 tools/train.py -c configs/det/det_mv3_db.yml
```
For more details about data preparation and training tutorials, refer to the documentation [Text detection model training/evaluation/prediction](./detection.md).

## step2: Train text recognition model

PaddleOCR provides four text recognition algorithms: CRNN, Rosetta, STAR-Net, and RARE. They all support two backbone networks: MobileNetV3 and ResNet34_vd. Select the corresponding configuration file as needed and start training. For example, to train a CRNN recognition model that uses MobileNetV3 as the backbone network:
```
python3 tools/train.py -c configs/rec/rec_chinese_lite_train.yml
```
For more details about data preparation and training tutorials, refer to the documentation [Text recognition model training/evaluation/prediction](./recognition.md).

## step3: Concatenate predictions

PaddleOCR provides a concatenation tool for detection and recognition models, which can connect any trained detection model and any recognition model into a two-stage text recognition system. The input image goes through four main stages (text detection, text rectification, text recognition, and score filtering) to output the text positions and recognition results. You can also choose to visualize the results.
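
The four stages can be sketched as a simple loop over detected boxes. This is only an outline under stated assumptions: `run_two_stage`, the three stage callables, the result layout, and the `min_score` threshold of 0.5 are all hypothetical and do not describe the interface of `predict_system.py`.

```python
def run_two_stage(image, detect, rectify, recognize, min_score=0.5):
    """Chain detection, rectification, recognition, and score filtering."""
    results = []
    for box in detect(image):              # 1. text detection
        crop = rectify(image, box)         # 2. text rectification
        text, score = recognize(crop)      # 3. text recognition
        if score >= min_score:             # 4. score filtering
            results.append({"points": box, "text": text, "score": score})
    return results

# Stand-in stages (hypothetical) so the sketch runs without any model.
def fake_detect(image):
    return [[[0, 0], [4, 0], [4, 2], [0, 2]], [[5, 5], [9, 5], [9, 7], [5, 7]]]

def fake_rectify(image, box):
    return box  # a real stage would crop and straighten the region

def fake_recognize(crop):
    return ("MASA", 0.9) if crop[0] == [0, 0] else ("???", 0.2)

results = run_two_stage("image placeholder", fake_detect, fake_rectify, fake_recognize)
print(results)
```

Passing the stages in as callables mirrors the idea that any trained detection model can be paired with any recognition model.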

When performing prediction, specify the path of a single image or an image folder through the parameter `image_dir`. The parameter `det_model_dir` specifies the path of the detection model, and the parameter `rec_model_dir` specifies the path of the recognition model. The visualized results are saved to the `./inference_results` folder by default.

```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/det/" --rec_model_dir="./inference/rec/"
```
For more details about text detection and recognition concatenation, please refer to the document [Inference](./inference.md).
@ -0,0 +1,96 @@
# Text detection

This section uses the icdar15 dataset as an example to introduce the training, evaluation, and testing of the detection model in PaddleOCR.

## Data preparation
The icdar2015 dataset can be obtained from the [official website](https://rrc.cvc.uab.es/?ch=4&com=downloads). Registration is required for downloading.

Decompress the downloaded dataset to the working directory, assuming it is decompressed under PaddleOCR/train_data/. In addition, PaddleOCR has merged the many scattered annotation files into two annotation files, one for training and one for testing, which can be downloaded with wget:
```
# Under the PaddleOCR path
cd PaddleOCR/
wget -P ./train_data/ https://paddleocr.bj.bcebos.com/dataset/train_icdar2015_label.txt
wget -P ./train_data/ https://paddleocr.bj.bcebos.com/dataset/test_icdar2015_label.txt
```

After decompressing the dataset and downloading the annotation files, PaddleOCR/train_data/ contains two folders and two files:
```
/PaddleOCR/train_data/icdar2015/text_localization/
  └─ icdar_c4_train_imgs/         Training data of icdar dataset
  └─ ch4_test_images/             Testing data of icdar dataset
  └─ train_icdar2015_label.txt    Training annotation of icdar dataset
  └─ test_icdar2015_label.txt     Test annotation of icdar dataset
```

The format of the provided annotation files is as follows:
```
" Image file name             Image annotation information encoded by json.dumps"
ch4_test_images/img_61.jpg    [{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]], ...}]
```
Before json.dumps encoding, the image annotation information is a list of dictionaries. The `points` entry in each dictionary holds the (x, y) coordinates of the four corners of the text box, arranged clockwise starting from the upper left corner.

`transcription` holds the text of the current text box. This information is not needed in the text detection task.
If you want to train PaddleOCR on other datasets, you can build the annotation file according to the above format.
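
A line in this format can be read and written with the standard `json` module. The sketch below assumes the image name and the JSON text are separated by a tab character, which the listing above does not show explicitly; `parse_label_line` and `format_label_line` are hypothetical helper names.

```python
import json

def parse_label_line(line, separator="\t"):
    """Split one annotation line into the image name and its list of boxes."""
    image_name, annotation = line.rstrip("\n").split(separator, 1)
    return image_name, json.loads(annotation)

def format_label_line(image_name, boxes, separator="\t"):
    """Build one annotation line for a custom dataset."""
    return image_name + separator + json.dumps(boxes)

boxes_in = [{"transcription": "MASA",
             "points": [[310, 104], [416, 141], [418, 216], [312, 179]]}]
line = format_label_line("ch4_test_images/img_61.jpg", boxes_in)

name, boxes = parse_label_line(line)
# Detection only needs the polygons; the transcription can be ignored.
polygons = [box["points"] for box in boxes]
print(name, polygons)
```

If your own files use a different separator, pass it through the `separator` argument instead of editing the functions.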

## Quickstart training

First download the pre-trained model. The detection model of PaddleOCR currently supports two backbones, MobileNetV3 and ResNet50_vd. You can also replace the backbone with a model from [PaddleClas](https://github.com/PaddlePaddle/PaddleClas/tree/master/ppcls/modeling/architectures) according to your needs.
```
cd PaddleOCR/
# Download the pre-trained model of MobileNetV3
wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV3_large_x0_5_pretrained.tar
# Download the pre-trained model of ResNet50_vd
wget -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_pretrained.tar
```

**Start training**
```
python3 tools/train.py -c configs/det/det_mv3_db.yml
```

In the above command, `-c` selects the configs/det/det_mv3_db.yml configuration file for training.
For a detailed explanation of the configuration file, please refer to [link](./doc/config-en.md).

You can also use the `-o` parameter to change the training parameters without modifying the yml file. For example, to set the training learning rate to 0.0001:
```
python3 tools/train.py -c configs/det/det_mv3_db.yml -o Optimizer.base_lr=0.0001
```

## Evaluation Indicator

PaddleOCR calculates three indicators for evaluating performance of OCR detection task: Precision, Recall, and Hmean.
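
As a reminder of how the three indicators relate, Hmean is the harmonic mean of Precision and Recall. The sketch below assumes that the number of correctly matched boxes has already been counted; the function name and the counts are made up for illustration and are not produced by `tools/eval.py`.

```python
def detection_metrics(num_matched, num_predicted, num_ground_truth):
    """Compute precision, recall, and their harmonic mean from box counts."""
    precision = num_matched / num_predicted if num_predicted else 0.0
    recall = num_matched / num_ground_truth if num_ground_truth else 0.0
    total = precision + recall
    hmean = 2 * precision * recall / total if total else 0.0
    return precision, recall, hmean

# Illustrative counts: 100 predicted boxes, 160 labelled boxes, 80 matches.
precision, recall, hmean = detection_metrics(80, 100, 160)
print(f"precision={precision:.3f} recall={recall:.3f} hmean={hmean:.3f}")
```

Hmean stays low whenever either Precision or Recall is low, which is why it is commonly used as the single summary number.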

Run the following command to calculate the evaluation indicators. The result will be saved in the test result file specified by `save_res_path` in the configuration file `det_mv3_db.yml`.

When evaluating, set the post-processing parameters box_thresh=0.6 and unclip_ratio=1.5. If you train with a different dataset or a different model, adjust these two parameters for better results.

```
python3 tools/eval.py -c configs/det/det_mv3_db.yml  -o Global.checkpoints="{path/to/weights}/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
```
The model parameters during training are saved in the `Global.save_model_dir` directory by default. When evaluating indicators, you need to set Global.checkpoints to point to the saved parameter file.

For example:
```
python3 tools/eval.py -c configs/det/det_mv3_db.yml  -o Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
```

* Note: box_thresh and unclip_ratio are parameters required for DB post-processing, and do not need to be set when evaluating the EAST model.
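
One way to build intuition for the two DB parameters is the sketch below. It assumes the usual DB formulation, in which boxes scoring below `box_thresh` are discarded and each remaining shrunk region is expanded by an offset of area × `unclip_ratio` / perimeter. The helper names are hypothetical, and the actual expansion of the polygon (which needs a clipping library) is left out.

```python
import math

def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Sum of the edge lengths, closing the polygon back to its first point."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))

def unclip_offset(points, unclip_ratio=1.5):
    """Distance by which the detected region would be expanded (assumed formula)."""
    return polygon_area(points) * unclip_ratio / polygon_perimeter(points)

def keep_boxes(scored_boxes, box_thresh=0.6):
    """Drop boxes whose score is below box_thresh."""
    return [(pts, score) for pts, score in scored_boxes if score >= box_thresh]

square = [[0, 0], [10, 0], [10, 10], [0, 10]]
candidates = [(square, 0.85), (square, 0.40)]
kept = keep_boxes(candidates)
offset = unclip_offset(square)
print(len(kept), offset)
```

Under these assumptions, raising `box_thresh` removes more low-confidence boxes, and raising `unclip_ratio` makes each output box larger.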

## Test detection result

Test the detection result on a single image:
```
python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy"
```

When testing the DB model, adjust the post-processing threshold:
```
python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/img_10.jpg" Global.checkpoints="./output/det_db/best_accuracy" PostProcess.box_thresh=0.6 PostProcess.unclip_ratio=1.5
```

Test the detection result on all images in the folder:
```
python3 tools/infer_det.py -c configs/det/det_mv3_db.yml -o TestReader.infer_img="./doc/imgs_en/" Global.checkpoints="./output/det_db/best_accuracy"
```
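
Since `TestReader.infer_img` accepts either one image or a folder, a tool has to turn that value into a list of files. The following is a minimal sketch of that idea using only the standard library. The helper `list_images` and the set of accepted extensions are assumptions and are not taken from `tools/infer_det.py`.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; the real tool may accept a different set.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}

def list_images(infer_img):
    """Return image paths for a single file or for every image in a folder."""
    path = Path(infer_img)
    if path.is_file():
        return [path]
    return sorted(p for p in path.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)

# Demo on a throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    for file_name in ("img_10.jpg", "img_12.JPG", "notes.txt"):
        (Path(folder) / file_name).touch()
    folder_names = [p.name for p in list_images(folder)]
    single_names = [p.name for p in list_images(Path(folder) / "img_10.jpg")]

print(folder_names)
print(single_names)
```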
@ -0,0 +1,79 @@
## Quick installation

After testing, PaddleOCR can run on glibc 2.23. You can also test other glibc versions or install glibc 2.23 for the best compatibility.

PaddleOCR working environment:
- PaddlePaddle 1.7
- python3
- glibc 2.23

It is recommended to run PaddleOCR in the docker image we provide. For how to use docker, please refer to this [link](https://docs.docker.com/get-started/).

1. (Recommended) Prepare a docker environment. The first time you use this image, it will be downloaded automatically. Please be patient.
```
# Switch to the working directory
cd /home/Projects
# You only need to create the docker container on the first run; skip this command on later runs
# Create a docker container named ppocr and map the current directory to the /paddle directory of the container

# If you want to use docker in a CPU environment, use docker instead of nvidia-docker to create the container
sudo docker run --name ppocr -v $PWD:/paddle --network=host -it hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda9.0-cudnn7-dev /bin/bash
```
If you have cuda9 installed on your machine, please run the following command to create a container:
```
sudo nvidia-docker run --name ppocr -v $PWD:/paddle --network=host -it hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda9.0-cudnn7-dev /bin/bash
```
If you have cuda10 installed on your machine, please run the following command to create a container:
```
sudo nvidia-docker run --name ppocr -v $PWD:/paddle --network=host -it hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda10.0-cudnn7-dev /bin/bash
```
You can also visit [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/) to get the image that fits your machine.
```
# Press ctrl+P+Q to exit docker; to re-enter docker, use the following command:
sudo docker container exec -it ppocr /bin/bash
```

Note: If the docker pull is too slow, you can download and load the docker image manually with the following steps. The cuda9 docker image is used as an example. To use the cuda10 docker image, change cuda9 to cuda10:
```
# Download the CUDA9 docker compressed file
wget https://paddleocr.bj.bcebos.com/docker/docker_pdocr_cuda9.tar.gz
# To reduce download time, the uploaded docker image is compressed and needs to be decompressed
tar zxf docker_pdocr_cuda9.tar.gz
# Create image
docker load < docker_pdocr_cuda9.tar
# After completing the above steps, check whether the downloaded image is loaded through docker images
docker images
# If you see the following output after executing docker images, you can follow step 1 to create a docker environment.
hub.baidubce.com/paddlepaddle/paddle   latest-gpu-cuda9.0-cudnn7-dev    f56310dcc829
```

2. Install PaddlePaddle Fluid v1.7 (higher versions are not supported yet; adaptation work is in progress)
```
pip3 install --upgrade pip

# If you have cuda9 installed on your machine, please run the following command to install
python3 -m pip install paddlepaddle-gpu==1.7.2.post97 -i https://pypi.tuna.tsinghua.edu.cn/simple

# If you have cuda10 installed on your machine, please run the following command to install
python3 -m pip install paddlepaddle-gpu==1.7.2.post107 -i https://pypi.tuna.tsinghua.edu.cn/simple
```
For more software version requirements, please refer to the instructions in the [Installation Document](https://www.paddlepaddle.org.cn/install/quick).

3. Clone PaddleOCR repo code
```
# Recommended
git clone https://github.com/PaddlePaddle/PaddleOCR

# If you cannot pull successfully due to network problems, you can also use the copy of the code hosted on Gitee:

git clone https://gitee.com/paddlepaddle/PaddleOCR

# Note: The Gitee copy may not be synchronized with this GitHub project in real time. There might be a delay of 3-5 days. Please give priority to the recommended method.
```

4. Install third-party libraries
```
cd PaddleOCR
pip3 install -r requirments.txt
```