Compare commits

...

208 Commits

Author SHA1 Message Date
xiaoting a09604f897
Merge pull request #2357 from caopulan/fix_srn_post
5 years ago
caopu 39a34ac80f fix srn_postprocess
5 years ago
Double_V cfdefbe1ba
Merge pull request #2334 from tink2123/fix_doc
5 years ago
Double_V 0ee4358329
Merge branch 'release/2.0' into fix_doc
5 years ago
tink2123 5ea613dc4f use pretrained_model for eval
5 years ago
MissPenguin d3a4f78864
Merge pull request #2283 from caopulan/modify
5 years ago
zhoujun 0067516142
Merge pull request #2306 from Intsigstephon/release/2.0
5 years ago
Bin Lu b99f352df5
Update FAQ.md
5 years ago
zhoujun 4b1428d838
Merge branch 'release/2.0' into modify
5 years ago
Bin Lu b602397134
Update README_ch.md
5 years ago
Bin Lu 523c778557
Update FAQ.md
5 years ago
Bin Lu f745eef560
Update FAQ.md
5 years ago
Bin Lu c386f01254
Update FAQ.md
5 years ago
dyning 6c381485c4
Merge pull request #2264 from dyning/release/2.0
5 years ago
dyning 8608f3463c
Merge branch 'release/2.0' into release/2.0
5 years ago
Daniel Yang 8647a503b1
Merge pull request #2292 from Evezerest/release2.0
5 years ago
Evezerest d5f5a38881
Merge branch 'release/2.0' into release2.0
5 years ago
Leif 881758fab5 Update joinus.png
5 years ago
caopu f56abaff7c Update eval.py
5 years ago
dyning ece8449815 add faq 20210316
5 years ago
zhoujun a7937102db
Merge pull request #2229 from LDOUBLEV/fix_2.0
5 years ago
xiaoting 4431a09c28
Merge branch 'release/2.0' into fix_2.0
5 years ago
Double_V ef3f72f07c
Merge pull request #2250 from tink2123/rename_lang
5 years ago
tink2123 f3de116010 add multi-lang
5 years ago
Double_V 15d30741bd
Merge branch 'release/2.0' into fix_2.0
5 years ago
tink2123 28c1f2497f add multi-lang
5 years ago
tink2123 66a44f222a rename lang for doc
5 years ago
tink2123 5b0e46d341 rename multi-lang
5 years ago
tink2123 68dd6afaa4 rename language abbreviations
5 years ago
Daniel Yang c7be8856e9
Merge pull request #2245 from Evezerest/release2.0
5 years ago
Leif eb1fab55f2 Merge remote-tracking branch 'upstream/release/2.0' into release2.0
5 years ago
Leif f9b113649a Update joinus.png
5 years ago
LDOUBLEV 1cfe43c9a6 enable memory optim
5 years ago
Wei Shengyu fd5ea74b00
update FAQ (#2189)
5 years ago
zhoujun f0b1032813
Merge pull request #2200 from Intsigstephon/release/2.0
5 years ago
Bin Lu cece6d201f
Update README.md
5 years ago
Bin Lu 10b1bc78fd
Update image_list.txt
5 years ago
Daniel Yang 978807038a
Merge pull request #2181 from Evezerest/release2.0
5 years ago
Evezerest 84ef96f35d
Merge branch 'release/2.0' into release2.0
5 years ago
Leif bec2f3f490 Merge remote-tracking branch 'origin/release2.0' into release2.0
5 years ago
Leif e07e168cda Update joinus.png
5 years ago
Double_V ac31a18b61
Merge pull request #2171 from LDOUBLEV/cp20
5 years ago
LDOUBLEV 6bd6bee58f fix issue 2086
5 years ago
LDOUBLEV 45111a7b46 Merge branch 'release/2.0' of https://github.com/PaddlePaddle/PaddleOCR into cp20
5 years ago
LDOUBLEV cbd812bc99 fix issue 2086
5 years ago
xiaoting f97fc1d03f
Merge pull request #2164 from xmy0916/release/2.0
5 years ago
xmy0916 1f23c7ad2f fix type error
5 years ago
dyning 3a11283588
Merge pull request #2140 from LDOUBLEV/cp20
5 years ago
LDOUBLEV 222a821ad6 fix typo
5 years ago
LDOUBLEV 909d9c17ff fix serial number
5 years ago
LDOUBLEV 6f794f6ff9 fix serial number
5 years ago
LDOUBLEV 342c7aedb0 add faq, 2021.3.1
5 years ago
LDOUBLEV fd9d8c39d2 Merge branch 'release/2.0' of https://github.com/PaddlePaddle/PaddleOCR into cp20
5 years ago
Double_V 76752b6084
Merge pull request #2105 from LDOUBLEV/cp20
5 years ago
LDOUBLEV b0b8db0654 add faq, 2021.3.1
5 years ago
littletomatodonkey dd9b456bb6
Merge branch 'release/2.0' into cp20
5 years ago
MissPenguin 7d31b5e1c8
Merge pull request #2125 from Evezerest/release2.0
5 years ago
Leif 5ed055e7dd Merge remote-tracking branch 'upstream/release/2.0' into release2.0
5 years ago
Evezerest 0faa016137
Merge branch 'release/2.0' into release2.0
5 years ago
Leif 843053e2ec Update joinus.png
5 years ago
LDOUBLEV cf2bd52989 fix issue #2080
5 years ago
LDOUBLEV 2c7c60bec2 fix issue #2072
5 years ago
Leif 1886647f1d Change the path of joinus.png to absolute path
5 years ago
Daniel Yang fa32b9b184
Merge pull request #2076 from Evezerest/release2.0
5 years ago
Leif f1383e17a1 Update joinus.png
5 years ago
littletomatodonkey 152ab8f3da
add faq 20210222 (#2068)
5 years ago
xiaoting c3d700fa41
fix test_hubserving (#2071)
5 years ago
Double_V e083b9c228
Merge pull request #2014 from lamhoangtung/release/2.0
5 years ago
Double_V ff1b0a3621
Merge pull request #2038 from LDOUBLEV/cp_dilation
5 years ago
LDOUBLEV b61af980eb fix dilation
5 years ago
Double_V 714ca4a73e
Merge pull request #2037 from LDOUBLEV/cp_dilation
5 years ago
LDOUBLEV 21ad9026d3 fix typo
5 years ago
Double_V dcdf083203
Merge pull request #2036 from LDOUBLEV/cp_dilation
5 years ago
LDOUBLEV 611c70f68c add use_dilation params in hubserving
5 years ago
Double_V 4a7214963f
Merge pull request #2023 from LDOUBLEV/doc_cp
5 years ago
Double_V 9deee44a7a
Merge branch 'release/2.0' into doc_cp
5 years ago
Double_V 59b3921d9b
Merge pull request #2024 from LDOUBLEV/cp
5 years ago
Double_V f23aa77757 Merge pull request #1920 from LDOUBLEV/trt_cpp
5 years ago
LDOUBLEV 5e555a8047 cherry-pick fix doc and fix dilation
5 years ago
MissPenguin ecc408a6db
Update inference_en.md
5 years ago
MissPenguin 97d71f0bba
Update inference.md
5 years ago
lamhoangtung e1c44d1f2c
Fix #2013
5 years ago
Daniel Yang d91c3e7c6b
Merge pull request #2010 from Evezerest/release2.0
5 years ago
Leif 67e951d0be Update joinus.png
5 years ago
xiaoting d231cc3cd7
Merge pull request #1989 from tink2123/faq_2.0
5 years ago
xiaoting 7bffc58e89
Merge branch 'release/2.0' into faq_2.0
5 years ago
tink2123 5fc2de704b add faq for 2.8
5 years ago
tink2123 38303bfd02 add faq for 2.8
5 years ago
dyning 8045cee558
Update README_ch.md
5 years ago
Daniel Yang 8d3d0eb15e
Merge pull request #1988 from Evezerest/release2.0
5 years ago
Leif b5d56d1c83 Update joinus.png
5 years ago
MissPenguin e6a95c3a71
Update README.md
5 years ago
MissPenguin c6811aa833
Update README.md
5 years ago
zhoujun 364a777763
Merge pull request #1984 from WenmuZhou/update_reqire
5 years ago
WenmuZhou 4dfc583850 predict_rec support rare
5 years ago
WenmuZhou 9cc6363b0e fix rare export error
5 years ago
xiaoting 890546ca34
Merge pull request #1982 from tink2123/fix_srn_eval
5 years ago
tink2123 807cd824a5 fix srn for eval
5 years ago
zhoujun fe6e31705f
Merge pull request #1981 from WenmuZhou/update_reqire
5 years ago
WenmuZhou 9050c6f933 update srn dataset path
5 years ago
xiaoting 1b1f170c7a
Merge pull request #1979 from tink2123/fix_encode
5 years ago
tink2123 1599a4590f fix attn encode
5 years ago
tink2123 e7187dac83 fix encode for srn
5 years ago
xiaoting a3afc162fa
Merge pull request #1972 from tink2123/fix_eval_for_srn_2.0
5 years ago
tink2123 ad15a64569 polish code for srn eval
5 years ago
zhoujun 6b73d8ed2f
Merge pull request #1954 from WenmuZhou/update_reqire
5 years ago
WenmuZhou 647f85dfca update srn dataset path
5 years ago
WenmuZhou 76f50d89e3 update rare dataset path
5 years ago
zhoujun 4c33d20dae
Merge pull request #1953 from WenmuZhou/update_reqire
5 years ago
WenmuZhou 36b5c0dafa update readme and requirements.txt
5 years ago
zhoujun 3b9c7b82bc
Merge pull request #1949 from WenmuZhou/check_empty_data
5 years ago
WenmuZhou 8e9d851563 add dataset len check
5 years ago
zhoujun 1d41d8903a
Merge pull request #1936 from WenmuZhou/dygraph_rc
5 years ago
xiaoting 60effbd748
Merge pull request #1928 from tink2123/cherry-pick
5 years ago
tink2123 25008de315 fix typo
5 years ago
xiaoting a6146ffc43
Merge pull request #1919 from tink2123/fix_rare
5 years ago
tink2123 da344d539a fix typo for attention
5 years ago
Double_V cdf732289d
Merge pull request #1906 from littletomatodonkey/2.0/fix_typo
5 years ago
littletomatodonkey 0c75cbc55b fix doc
5 years ago
MissPenguin 36dae990b8
Merge pull request #1901 from MissPenguin/release/2.0
5 years ago
root f6daae41e5 fix conflict
5 years ago
root f2bc513a68 delete slim related content
5 years ago
MissPenguin fe775780e4
Update algorithm_overview.md
5 years ago
MissPenguin 8e2dc741b3
Update algorithm_overview_en.md
5 years ago
MissPenguin d4a23337cf
Merge pull request #1900 from LDOUBLEV/rare
5 years ago
LDOUBLEV 873363589c fix link
5 years ago
MissPenguin f368a4b2f6
Merge pull request #1891 from WenmuZhou/faq
5 years ago
MissPenguin dd6f6f5cf3
Merge pull request #1869 from LDOUBLEV/rare
5 years ago
WenmuZhou 259f3bb0e0 update
5 years ago
WenmuZhou dbef4a1d34 update
5 years ago
WenmuZhou 02b0bce42d update
5 years ago
xiaoting 1043613b1c
Merge pull request #1897 from iamyoyo/dygraph
5 years ago
LDOUBLEV a094d27755 opt rec_att_head
5 years ago
Double_V 080e250164
Merge pull request #1868 from WenmuZhou/dygraph_rc
5 years ago
iamyoyo be94977426
program.py: variable naming issue at line 257
5 years ago
iamyoyo a92bb6d310
Update program.py
5 years ago
LDOUBLEV 0d89f3f913 fix comment
5 years ago
WenmuZhou fae6f1eef7 update
5 years ago
LDOUBLEV 550022ea66 fix comment
5 years ago
LDOUBLEV e7d24ac8b8 fix comment
5 years ago
LDOUBLEV 0f4d92b63f fix conflict with SRN
5 years ago
LDOUBLEV 7a054c854b rare doc and opt post_process
5 years ago
WenmuZhou 22a9f2ad00 update faq
5 years ago
WenmuZhou b544a561d5 update faq
5 years ago
xiaoting 0c5c9f694d
Merge pull request #1890 from tink2123/srn_ch
5 years ago
tink2123 7f2304ab3f Adaptation of Chinese char
5 years ago
LDOUBLEV 56cbbdfb01 fix conflict
5 years ago
LDOUBLEV f896032255 pre-commit
5 years ago
xiaoting 2a0c3d4dac
fix eval mode without srn (#1889)
5 years ago
MissPenguin e40100c5b7
Merge pull request #1872 from Evezerest/dy1
5 years ago
xiaoting 570009c217
Merge pull request #1874 from tink2123/srn_doc
5 years ago
tink2123 b8a029686c Merge branch 'dygraph' of https://github.com/PaddlePaddle/PaddleOCR into dygraph
5 years ago
tink2123 6781d55df4 format doc
5 years ago
xiaoting 7f9b885f75
Merge pull request #1873 from tink2123/srn_doc
5 years ago
tink2123 42fe741ff1 add srn doc
5 years ago
Leif b3a451da26 Fix a spelling mistake
5 years ago
Leif f20f6d2d27 Merge remote-tracking branch 'upstream/dygraph' into dy1
5 years ago
Leif 647db30f6f Fix bugs during save recognition results
5 years ago
xiaoting acd479ea46
Merge pull request #1597 from tink2123/dygraph_for_srn
5 years ago
xiaoting 6ebbbfe46c
Merge branch 'dygraph' into dygraph_for_srn
5 years ago
LDOUBLEV f6e03a51f0 upload rare code
5 years ago
MissPenguin fad40158b0
Merge pull request #1623 from tirkarthi/fix-warnings
5 years ago
Double_V 95a60fa49f
Merge pull request #1861 from Justus-Jonas/dygraph
5 years ago
Double_V c9f745e250
fix doc of quant demo (#1865)
5 years ago
littletomatodonkey a4fa186010
fix mv3 to adapt to paddle2.0 (#1864)
5 years ago
zhoujun b0d1dca688
fix starnet export (#1850)
5 years ago
Justus-Jonas Erker c9f24ddaa3
Merge pull request #1 from Justus-Jonas/Justus-Jonas-patch-1
5 years ago
Justus-Jonas Erker 0dcab8e67f
fixed wrong naming of German model
5 years ago
Wei Shengyu 25bf92295f
Add documentation for the language option (#1810)
5 years ago
Double_V b22ee4dd5f
Merge pull request #1844 from LDOUBLEV/trt_cpp
5 years ago
LDOUBLEV 1d11af72ac fix shuffle not work
5 years ago
Daniel Yang d5a3fb5408
Update README_ch.md
5 years ago
Daniel Yang ad074432d5
Update README_ch.md
5 years ago
Daniel Yang fb9727266f
Update README.md
5 years ago
Daniel Yang b7526cc12d
Update README.md
5 years ago
zhoujun 4b715cf200
Merge pull request #1837 from Channingss/dygraph
5 years ago
Daniel Yang 4d8f7e6ce2
Merge pull request #1825 from Evezerest/dy1
5 years ago
Channingss cacf8f8a6c export_model support dynamic input shape
5 years ago
littletomatodonkey d1e150e276
Revert "fix mv3 to adapt to paddle2.0" (#1836)
5 years ago
xiaoting 1888ca2793
Merge pull request #1826 from tink2123/multi_lan_doc
5 years ago
tink2123 06a434bcf9 fix yml
5 years ago
tink2123 edeb12b1e0 rename en_sensitive EN_symbol
5 years ago
Double_V 4dcf1b2227
Merge pull request #1821 from WenmuZhou/dygraph_rc
5 years ago
Double_V b78eeb24ea
Merge pull request #1831 from LDOUBLEV/trt_cpp
5 years ago
tink2123 d9ae86f422 update en char_type
5 years ago
tink2123 8f52a73718 polish code
5 years ago
LDOUBLEV 09fd94e781 fix typo
5 years ago
zhoujun 9982550a5e
Merge pull request #1830 from PaddlePaddle/revert-1829-revert-1827-dyg/fix_mv3
5 years ago
littletomatodonkey 38031eb427
Revert "Revert "fix mv3 to adapt to paddle2.0""
5 years ago
tink2123 5e9fb50db5 Merge branch 'dygraph' of https://github.com/PaddlePaddle/PaddleOCR into multi_languages
5 years ago
zhoujun 5a5d627deb
Merge pull request #1829 from PaddlePaddle/revert-1827-dyg/fix_mv3
5 years ago
zhoujun 493027d39e
Revert "fix mv3 to adapt to paddle2.0"
5 years ago
zhoujun 95ad4d1032
Merge pull request #1827 from littletomatodonkey/dyg/fix_mv3
5 years ago
littletomatodonkey b4b51a0510 fix mv3 to adapt to paddle2.0
5 years ago
tink2123 45117f907d update multi-lang doc
5 years ago
Leif 6b27cdc105 Update joinus.png
5 years ago
xiaoting e7decf3019
Merge pull request #1820 from xmy0916/dygraph
5 years ago
xiaoting f214437538
Merge branch 'dygraph' into dygraph_for_srn
5 years ago
zhoujun a27a43ec0f
Merge pull request #1807 from littletomatodonkey/dyg/fix_seed
5 years ago
zhoujun ff2b1c0d50
Merge pull request #1809 from MissPenguin/dygraph
5 years ago
xmy0916 d65c6cb6f0 add multi language font
5 years ago
MissPenguin 943b9390e7 update FAQ 2021.1.25
5 years ago
littletomatodonkey 9f581156d9 fix data replication for multi-cards sampling
5 years ago
tink2123 ed2f0de95e mv model_average to incubate
5 years ago
tink2123 93670ab5a2 all ready
5 years ago
Karthikeyan Singaravelan 841adff934 Fix syntax warning over comparison of literals using is.
5 years ago
tink2123 297871d4be fix bugs
5 years ago
tink2123 c1fd46641e add srn for dygraph
5 years ago

@ -1031,7 +1031,7 @@ class MainWindow(QMainWindow, WindowMixin):
for box in self.result_dic:
trans_dic = {"label": box[1][0], "points": box[0], 'difficult': False}
if trans_dic["label"] is "" and mode == 'Auto':
if trans_dic["label"] == "" and mode == 'Auto':
continue
shapes.append(trans_dic)
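The pair above swaps an identity test for an equality test. `is` compares object identity, and whether two equal strings share one object is an interning detail of the interpreter (CPython 3.8+ also emits a SyntaxWarning for `is` against a literal, which is what these patches silence). A minimal illustration:

```python
a = "".join(["ab", "c"])  # "abc" built at runtime, not interned
print(a == "abc")         # True: compares values, always correct
print(a is "abc")         # False: compares identities (SyntaxWarning on 3.8+)
```

The empty-string case happens to work in CPython because `""` is cached as a singleton, but the behavior is implementation-defined, so `==` is the right fix.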
@ -1450,7 +1450,7 @@ class MainWindow(QMainWindow, WindowMixin):
item = QListWidgetItem(closeicon, filename)
self.fileListWidget.addItem(item)
print('dirPath in importDirImages is', dirpath)
print('DirPath in importDirImages is', dirpath)
self.iconlist.clear()
self.additems5(dirpath)
self.changeFileFolder = True
@ -1459,7 +1459,6 @@ class MainWindow(QMainWindow, WindowMixin):
self.reRecogButton.setEnabled(True)
self.actions.AutoRec.setEnabled(True)
self.actions.reRec.setEnabled(True)
self.actions.saveLabel.setEnabled(True)
def openPrevImg(self, _value=False):
@ -1764,7 +1763,7 @@ class MainWindow(QMainWindow, WindowMixin):
QMessageBox.information(self, "Information", msg)
return
result = self.ocr.ocr(img_crop, cls=True, det=False)
if result[0][0] is not '':
if result[0][0] != '':
result.insert(0, box)
print('result in reRec is ', result)
self.result_dic.append(result)
@ -1795,7 +1794,7 @@ class MainWindow(QMainWindow, WindowMixin):
QMessageBox.information(self, "Information", msg)
return
result = self.ocr.ocr(img_crop, cls=True, det=False)
if result[0][0] is not '':
if result[0][0] != '':
result.insert(0, box)
print('result in reRec is ', result)
if result[1][0] == shape.label:
@ -1862,6 +1861,8 @@ class MainWindow(QMainWindow, WindowMixin):
for each in states:
file, state = each.split('\t')
self.fileStatedict[file] = 1
self.actions.saveLabel.setEnabled(True)
self.actions.saveRec.setEnabled(True)
def saveFilestate(self):
@ -1919,22 +1920,29 @@ class MainWindow(QMainWindow, WindowMixin):
rec_gt_dir = os.path.dirname(self.PPlabelpath) + '/rec_gt.txt'
crop_img_dir = os.path.dirname(self.PPlabelpath) + '/crop_img/'
ques_img = []
if not os.path.exists(crop_img_dir):
os.mkdir(crop_img_dir)
with open(rec_gt_dir, 'w', encoding='utf-8') as f:
for key in self.fileStatedict:
idx = self.getImglabelidx(key)
for i, label in enumerate(self.PPlabel[idx]):
if label['difficult']: continue
try:
img = cv2.imread(key)
img_crop = get_rotate_crop_image(img, np.array(label['points'], np.float32))
img_name = os.path.splitext(os.path.basename(idx))[0] + '_crop_'+str(i)+'.jpg'
cv2.imwrite(crop_img_dir+img_name, img_crop)
f.write('crop_img/'+ img_name + '\t')
f.write(label['transcription'] + '\n')
QMessageBox.information(self, "Information", "Cropped images has been saved in "+str(crop_img_dir))
for i, label in enumerate(self.PPlabel[idx]):
if label['difficult']: continue
img_crop = get_rotate_crop_image(img, np.array(label['points'], np.float32))
img_name = os.path.splitext(os.path.basename(idx))[0] + '_crop_'+str(i)+'.jpg'
cv2.imwrite(crop_img_dir+img_name, img_crop)
f.write('crop_img/'+ img_name + '\t')
f.write(label['transcription'] + '\n')
except Exception as e:
ques_img.append(key)
print("Can not read image ",e)
if ques_img:
QMessageBox.information(self, "Information", "The following images can not be saved, "
"please check the image path and labels.\n" + "".join(str(i)+'\n' for i in ques_img))
QMessageBox.information(self, "Information", "Cropped images have been saved in "+str(crop_img_dir))
def speedChoose(self):
if self.labelDialogOption.isChecked():
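The hunk above moves `cv2.imread` and the crop loop inside a try/except, so unreadable images are collected into `ques_img` and reported once at the end instead of aborting the export. For context, here is roughly what a `get_rotate_crop_image`-style helper does — a sketch inferred from the call site, not the repository's exact code:

```python
import cv2
import numpy as np

def rotate_crop(img, points):
    """Warp a quadrilateral text region (tl, tr, br, bl) to an upright rectangle."""
    points = points.astype(np.float32)
    w = int(max(np.linalg.norm(points[0] - points[1]),
                np.linalg.norm(points[2] - points[3])))
    h = int(max(np.linalg.norm(points[0] - points[3]),
                np.linalg.norm(points[1] - points[2])))
    dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    M = cv2.getPerspectiveTransform(points, dst)
    return cv2.warpPerspective(img, M, (w, h), flags=cv2.INTER_CUBIC)
```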
@ -1991,7 +1999,7 @@ if __name__ == '__main__':
resource_file = './libs/resources.py'
if not os.path.exists(resource_file):
output = os.system('pyrcc5 -o libs/resources.py resources.qrc')
assert output is 0, "operate the cmd have some problems ,please check whether there is a in the lib " \
assert output == 0, "operate the cmd have some problems ,please check whether there is a in the lib " \
"directory resources.py "
import libs.resources
sys.exit(main())
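The last hunk fixes another literal-identity assert (`output is 0` on an `os.system` exit status). A common alternative — not what this patch does — is to let `subprocess` raise on a nonzero exit code instead of asserting:

```python
import subprocess

# check=True raises CalledProcessError if pyrcc5 exits nonzero
subprocess.run(["pyrcc5", "-o", "libs/resources.py", "resources.qrc"],
               check=True)
```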

@ -9,7 +9,7 @@ PPOCRLabel is a semi-automatic graphic annotation tool suitable for OCR field, w
### Recent Update
- 2021.1.11: Optimize the labeling experience (by [edencfc](https://github.com/edencfc)),
- Users can choose whether to pop up the label input dialog after drawing the detection box in "View - Pop-up Label Input Dialog".
- Users can choose whether to pop up the label input dialog after drawing the detection box in "View - Pop-up Label Input Dialog".
- The recognition result scrolls synchronously when users click the related detection box.
- Click to modify the recognition result. (If you can't change the result, please switch to the system default input method, or switch back to the original input method again)
- 2020.12.18: Support re-recognition of a single label box (by [ninetailskim](https://github.com/ninetailskim) ), perfect shortcut keys.
@ -49,7 +49,7 @@ python3 PPOCRLabel.py
```
pip3 install pyqt5
pip3 uninstall opencv-python # Uninstall opencv manually as it conflicts with pyqt
pip3 install opencv-contrib-python-headless # Install the headless version of opencv
pip3 install opencv-contrib-python-headless==4.2.0.32 # Install the headless version of opencv
cd ./PPOCRLabel # Change the directory to the PPOCRLabel folder
python3 PPOCRLabel.py
```
@ -127,7 +127,7 @@ Therefore, if the recognition result has been manually changed before, it may ch
- Default model: PPOCRLabel uses the Chinese and English ultra-lightweight OCR model in PaddleOCR by default, supports Chinese, English and number recognition, and multiple language detection.
- Model language switching: Changing the built-in model language is supportable by clicking "PaddleOCR"-"Choose OCR Model" in the menu bar. Currently supported languages include French, German, Korean, and Japanese.
- Model language switching: Changing the built-in model language is supportable by clicking "PaddleOCR"-"Choose OCR Model" in the menu bar. Currently supported languages include French, German, Korean, and Japanese.
For specific model download links, please refer to [PaddleOCR Model List](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/models_list_en.md#multilingual-recognition-modelupdating)
- Custom model: A user-trained model can be swapped in by modifying the [PaddleOCR class instantiation](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/PPOCRLabel/PPOCRLabel.py#L110) in PPOCRLabel.py, referring to the [Custom Model Code](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/whl_en.md#use-custom-model)
@ -160,11 +160,11 @@ For some data that are difficult to recognize, the recognition results will not
```
pyrcc5 -o libs/resources.py resources.qrc
```
- If you get the error ```module 'cv2' has no attribute 'INTER_NEAREST'```, delete all opencv-related packages first, then reinstall the headless version of opencv
- If you get the error ```module 'cv2' has no attribute 'INTER_NEAREST'```, delete all opencv-related packages first, then reinstall the 4.2.0.32 headless version of opencv
```
pip install opencv-contrib-python-headless
pip install opencv-contrib-python-headless==4.2.0.32
```
### Related
1.[Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)

@ -49,7 +49,7 @@ python3 PPOCRLabel.py --lang ch
```
pip3 install pyqt5
pip3 uninstall opencv-python # Uninstall opencv manually first, as the macOS build of opencv conflicts with pyqt
pip3 install opencv-contrib-python-headless # Install the headless version of opencv
pip3 install opencv-contrib-python-headless==4.2.0.32 # Install the headless version of opencv
cd ./PPOCRLabel # Change the directory to the PPOCRLabel folder
python3 PPOCRLabel.py --lang ch
```
@ -132,22 +132,22 @@ PPOCRLabel supports three ways of saving
### Error messages
- If paddleocr was also installed via the whl package, it takes precedence over the PaddleOCR class called through paddleocr.py; an outdated whl package can cause the program to fail.
- PPOCRLabel **does not support auto-labeling images whose file names contain Chinese characters**.
- For Linux users: if an error starting with **objc[XXXXX]** appears when opening the software, your opencv version is too high; installing version 4.2 is recommended:
```
pip install opencv-python==4.2.0.32
```
- If an error starting with ```Missing string id``` appears, the resources need to be recompiled:
```
pyrcc5 -o libs/resources.py resources.qrc
```
- If the error ```module 'cv2' has no attribute 'INTER_NEAREST'``` appears, first remove all opencv-related packages, then reinstall the headless version of opencv
- If the error ```module 'cv2' has no attribute 'INTER_NEAREST'``` appears, first remove all opencv-related packages, then reinstall the 4.2.0.32 headless version of opencv
```
pip install opencv-contrib-python-headless
pip install opencv-contrib-python-headless==4.2.0.32
```
### References

@ -5,10 +5,12 @@ PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools
## Notice
PaddleOCR supports both dynamic graph and static graph programming paradigms
- Dynamic graph: dygraph branch (default), **supported by paddle 2.0rc1+ ([installation](./doc/doc_en/installation_en.md))**
- Dynamic graph: dygraph branch (default), **supported by paddle 2.0.0 ([installation](./doc/doc_en/installation_en.md))**
- Static graph: develop branch
**Recent updates**
- 2021.2.8 Release PaddleOCRv2.0(branch release/2.0) and set as default branch. Check release note here: https://github.com/PaddlePaddle/PaddleOCR/releases/tag/v2.0.0
- 2021.1.21 update 25+ multilingual recognition models ([models list](./doc/doc_en/models_list_en.md)), including English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, Arabic and so on. Models for more languages will continue to be updated ([Develop Plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048)).
- 2020.12.15 update Data synthesis tool, i.e., [Style-Text](./StyleText/README.md), which makes it easy to synthesize large numbers of images similar to the target scene image.
- 2020.11.25 Update a new data annotation tool, i.e., [PPOCRLabel](./PPOCRLabel/README.md), which is helpful to improve the labeling efficiency. Moreover, the labeling results can be used in training of the PP-OCR system directly.
- 2020.9.22 Update the PP-OCR technical article, https://arxiv.org/abs/2009.09941

@ -4,11 +4,14 @@
PaddleOCR aims to create rich, leading, and practical OCR tools that help users train better models and apply them in practice.
## Notice
PaddleOCR supports both dynamic graph and static graph programming paradigms
- Dynamic graph: dygraph branch (default), requires upgrading paddle to 2.0rc1+ ([quick installation](./doc/doc_ch/installation.md))
- Dynamic graph: dygraph branch (default), requires upgrading paddle to 2.0.0 ([quick installation](./doc/doc_ch/installation.md))
- Static graph: develop branch
**Recent updates**
- 2021.1.18 [FAQ](./doc/doc_ch/FAQ.md) adds 5 frequently asked questions (152 in total); updated every Monday, stay tuned.
- 2021.3.22 [FAQ](./doc/doc_ch/FAQ.md) adds 5 frequently asked questions (193 in total); updated every Monday, stay tuned.
- 2021.2.8 Officially release PaddleOCRv2.0 (branch release/2.0) and set it as the default branch recommended to users. For release details, see: https://github.com/PaddlePaddle/PaddleOCR/releases/tag/v2.0.0
- 2021.1.26,28,29 The PaddleOCR R&D team hosts a three-day live course with in-depth technical walkthroughs, on the evenings of Jan 26, 28, and 29 at 19:30 ([live stream](https://live.bilibili.com/21689802))
- 2021.1.21 Update multilingual recognition models, now supporting 27+ languages ([multilingual model downloads](./doc/doc_ch/models_list.md)), including Simplified Chinese, Traditional Chinese, English, French, German, Korean, Japanese, Italian, Spanish, Portuguese, Russian, Arabic, etc.; for follow-up plans see the [multilingual development plan](https://github.com/PaddlePaddle/PaddleOCR/issues/1048)
- 2020.12.15 Update the data synthesis tool [Style-Text](./StyleText/README_ch.md), which can batch-synthesize large numbers of images similar to the target scene; validated in multiple scenarios with clear gains.
- 2020.11.25 Update the semi-automatic annotation tool [PPOCRLabel](./PPOCRLabel/README_ch.md) to help developers complete labeling efficiently; the output format feeds directly into PP-OCR training.
- 2020.9.22 Update the PP-OCR technical article, https://arxiv.org/abs/2009.09941
@ -101,8 +104,8 @@ PaddleOCR supports both dynamic graph and static graph programming paradigms
- [Visualization](#效果展示)
- FAQ
- [[Selected] 10 selected OCR questions](./doc/doc_ch/FAQ.md)
- [[Theory] 32 general OCR questions](./doc/doc_ch/FAQ.md)
- [[Practice] 110 PaddleOCR practice questions](./doc/doc_ch/FAQ.md)
- [[Theory] 37 general OCR questions](./doc/doc_ch/FAQ.md)
- [[Practice] 141 PaddleOCR practice questions](./doc/doc_ch/FAQ.md)
- [Technical discussion group](#欢迎加入PaddleOCR技术交流群)
- [References](./doc/doc_ch/reference.md)
- [License](#许可证书)

@ -72,7 +72,7 @@ fusion_generator:
python3 tools/synth_image.py -c configs/config.yml --style_image examples/style_images/2.jpg --text_corpus PaddleOCR --language en
```
* Note 1: The language option corresponds to the corpus. Currently, the tool only supports English, Simplified Chinese and Korean.
* Note 1: The language option corresponds to the corpus. Currently, the tool only supports English (en), Simplified Chinese (ch) and Korean (ko).
* Note 2: Synth-Text is mainly used to generate images for OCR recognition models.
So the height of style images should be around 32 pixels. Images in other sizes may behave poorly.
* Note 3: You can modify `use_gpu` in `configs/config.yml` to determine whether to use GPU for prediction.
@ -98,7 +98,7 @@ What's more, the medium result `fake_bg.jpg` will also be saved, which is the ba
</div>
`fake_text.jpg` * `fake_text.jpg` is the generated image with the same font style as `Style Input`.
`fake_text.jpg` is the generated image with the same font style as `Style Input`.
<div align="center">
@ -120,7 +120,7 @@ In actual application scenarios, it is often necessary to synthesize pictures in
* `with_label`: Whether the `label_file` is a label file list.
* `CorpusGenerator`:
* `method`: Method of CorpusGenerator; supports `FileCorpus` and `EnNumCorpus`. If `EnNumCorpus` is used, no other configuration is needed; otherwise you need to set `corpus_file` and `language`.
* `language`: Language of the corpus.
* `language`: Language of the corpus. Currently, the tool only supports English (en), Simplified Chinese (ch) and Korean (ko).
* `corpus_file`: Filepath of the corpus. The corpus file should be a text file that will be split by line endings ('\n'); the corpus generator samples one line each time.

@ -63,10 +63,10 @@ fusion_generator:
```python
python3 tools/synth_image.py -c configs/config.yml --style_image examples/style_images/2.jpg --text_corpus PaddleOCR --language en
```
* Note 1: The language option corresponds to the corpus; currently the tool only supports English, Simplified Chinese and Korean.
* Note 1: The language option corresponds to the corpus; currently English (en), Simplified Chinese (ch) and Korean (ko) are supported.
* Note 2: Data generated by Style-Text is mainly used for OCR recognition scenarios. Given the design of the current PaddleOCR recognition models, we mainly support style images with a height of around 32 pixels.
Results may be poor if the input image size differs too much.
* Note 3: You can set the `use_gpu` parameter (true or false) in the configuration file to decide whether to use GPU for prediction.
* Note 3: You can set the `use_gpu` parameter (true or false) in the configuration file `configs/config.yml` to decide whether to use GPU for prediction.
For example, input the following image and the corpus "PaddleOCR":
@ -105,7 +105,7 @@ python3 tools/synth_image.py -c configs/config.yml --style_image examples/style_
* `with_label`: Whether `label_file` is a label file.
* `CorpusGenerator`:
* `method`: Corpus generation method; currently `FileCorpus` and `EnNumCorpus` are available. If `EnNumCorpus` is used, no other configuration is needed; otherwise `corpus_file` and `language` must be set.
* `language`: Language of the corpus;
* `language`: Language of the corpus; currently English (en), Simplified Chinese (ch) and Korean (ko) are supported;
* `corpus_file`: Path of the corpus file. The corpus should be a text file; the generator splits it by line and samples one line at random each time.
Example corpus file format:

@ -1,2 +1,2 @@
style_images/1.jpg NEATNESS
style_images/2.jpg 锁店君和宾馆
style_images/2.jpg 卡丹鑫宇通

@ -19,21 +19,38 @@ import logging
logging.basicConfig(level=logging.INFO)
support_list = {
'it':'italian', 'xi':'spanish', 'pu':'portuguese', 'ru':'russian', 'ar':'arabic',
'ta':'tamil', 'ug':'uyghur', 'fa':'persian', 'ur':'urdu', 'rs':'serbian latin',
'oc':'occitan', 'rsc':'serbian cyrillic', 'bg':'bulgarian', 'uk':'ukranian', 'be':'belarusian',
'te':'telugu', 'ka':'kannada', 'chinese_cht':'chinese tradition','hi':'hindi','mr':'marathi',
'ne':'nepali',
'it': 'italian',
'es': 'spanish',
'pt': 'portuguese',
'ru': 'russian',
'ar': 'arabic',
'ta': 'tamil',
'ug': 'uyghur',
'fa': 'persian',
'ur': 'urdu',
'rs_latin': 'serbian latin',
'oc': 'occitan',
'rs_cyrillic': 'serbian cyrillic',
'bg': 'bulgarian',
'uk': 'ukranian',
'be': 'belarusian',
'te': 'telugu',
'kn': 'kannada',
'ch_tra': 'chinese tradition',
'hi': 'hindi',
'mr': 'marathi',
'ne': 'nepali',
}
assert(
os.path.isfile("./rec_multi_language_lite_train.yml")
),"Loss basic configuration file rec_multi_language_lite_train.yml.\
assert (os.path.isfile("./rec_multi_language_lite_train.yml")
), "Loss basic configuration file rec_multi_language_lite_train.yml.\
You can download it from \
https://github.com/PaddlePaddle/PaddleOCR/tree/dygraph/configs/rec/multi_language/"
global_config = yaml.load(open("./rec_multi_language_lite_train.yml", 'rb'), Loader=yaml.Loader)
global_config = yaml.load(
open("./rec_multi_language_lite_train.yml", 'rb'), Loader=yaml.Loader)
project_path = os.path.abspath(os.path.join(os.getcwd(), "../../../"))
class ArgsParser(ArgumentParser):
def __init__(self):
super(ArgsParser, self).__init__(
@ -41,15 +58,30 @@ class ArgsParser(ArgumentParser):
self.add_argument(
"-o", "--opt", nargs='+', help="set configuration options")
self.add_argument(
"-l", "--language", nargs='+', help="set language type, support {}".format(support_list))
"-l",
"--language",
nargs='+',
help="set language type, support {}".format(support_list))
self.add_argument(
"--train",type=str,help="you can use this command to change the train dataset default path")
"--train",
type=str,
help="you can use this command to change the train dataset default path"
)
self.add_argument(
"--val",type=str,help="you can use this command to change the eval dataset default path")
"--val",
type=str,
help="you can use this command to change the eval dataset default path"
)
self.add_argument(
"--dict",type=str,help="you can use this command to change the dictionary default path")
"--dict",
type=str,
help="you can use this command to change the dictionary default path"
)
self.add_argument(
"--data_dir",type=str,help="you can use this command to change the dataset default root path")
"--data_dir",
type=str,
help="you can use this command to change the dataset default root path"
)
def parse_args(self, argv=None):
args = super(ArgsParser, self).parse_args(argv)
@ -68,20 +100,28 @@ class ArgsParser(ArgumentParser):
return config
def _set_language(self, type):
assert(type),"please use -l or --language to choose language type"
assert (type), "please use -l or --language to choose language type"
assert(
type[0] in support_list.keys()
),"the sub_keys(-l or --language) can only be one of support list: \n{},\nbut get: {}, " \
"please check your running command".format(support_list, type)
global_config['Global']['character_dict_path'] = 'ppocr/utils/dict/{}_dict.txt'.format(type[0])
global_config['Global']['save_model_dir'] = './output/rec_{}_lite'.format(type[0])
global_config['Train']['dataset']['label_file_list'] = ["train_data/{}_train.txt".format(type[0])]
global_config['Eval']['dataset']['label_file_list'] = ["train_data/{}_val.txt".format(type[0])]
global_config['Global'][
'character_dict_path'] = 'ppocr/utils/dict/{}_dict.txt'.format(type[
0])
global_config['Global'][
'save_model_dir'] = './output/rec_{}_lite'.format(type[0])
global_config['Train']['dataset'][
'label_file_list'] = ["train_data/{}_train.txt".format(type[0])]
global_config['Eval']['dataset'][
'label_file_list'] = ["train_data/{}_val.txt".format(type[0])]
global_config['Global']['character_type'] = type[0]
assert(
os.path.isfile(os.path.join(project_path,global_config['Global']['character_dict_path']))
),"Loss default dictionary file {}_dict.txt.You can download it from \
https://github.com/PaddlePaddle/PaddleOCR/tree/dygraph/ppocr/utils/dict/".format(type[0])
assert (
os.path.isfile(
os.path.join(project_path, global_config['Global'][
'character_dict_path']))
), "Loss default dictionary file {}_dict.txt.You can download it from \
https://github.com/PaddlePaddle/PaddleOCR/tree/dygraph/ppocr/utils/dict/".format(
type[0])
return type[0]
@ -110,43 +150,51 @@ def merge_config(config):
cur[sub_key] = value
else:
cur = cur[sub_key]
def loss_file(path):
assert(
os.path.exists(path)
),"There is no such file:{},Please do not forget to put in the specified file".format(path)
assert (
os.path.exists(path)
), "There is no such file:{},Please do not forget to put in the specified file".format(
path)
if __name__ == '__main__':
FLAGS = ArgsParser().parse_args()
merge_config(FLAGS.opt)
save_file_path = 'rec_{}_lite_train.yml'.format(FLAGS.language)
if os.path.isfile(save_file_path):
os.remove(save_file_path)
if FLAGS.train:
global_config['Train']['dataset']['label_file_list'] = [FLAGS.train]
train_label_path = os.path.join(project_path,FLAGS.train)
train_label_path = os.path.join(project_path, FLAGS.train)
loss_file(train_label_path)
if FLAGS.val:
global_config['Eval']['dataset']['label_file_list'] = [FLAGS.val]
eval_label_path = os.path.join(project_path,FLAGS.val)
loss_file(Eval_label_path)
eval_label_path = os.path.join(project_path, FLAGS.val)
loss_file(eval_label_path)
if FLAGS.dict:
global_config['Global']['character_dict_path'] = FLAGS.dict
dict_path = os.path.join(project_path,FLAGS.dict)
dict_path = os.path.join(project_path, FLAGS.dict)
loss_file(dict_path)
if FLAGS.data_dir:
global_config['Eval']['dataset']['data_dir'] = FLAGS.data_dir
global_config['Train']['dataset']['data_dir'] = FLAGS.data_dir
data_dir = os.path.join(project_path,FLAGS.data_dir)
data_dir = os.path.join(project_path, FLAGS.data_dir)
loss_file(data_dir)
with open(save_file_path, 'w') as f:
yaml.dump(dict(global_config), f, default_flow_style=False, sort_keys=False)
yaml.dump(
dict(global_config), f, default_flow_style=False, sort_keys=False)
logging.info("Project path is :{}".format(project_path))
logging.info("Train list path set to :{}".format(global_config['Train']['dataset']['label_file_list'][0]))
logging.info("Eval list path set to :{}".format(global_config['Eval']['dataset']['label_file_list'][0]))
logging.info("Dataset root path set to :{}".format(global_config['Eval']['dataset']['data_dir']))
logging.info("Dict path set to :{}".format(global_config['Global']['character_dict_path']))
logging.info("Config file set to :configs/rec/multi_language/{}".format(save_file_path))
logging.info("Train list path set to :{}".format(global_config['Train'][
'dataset']['label_file_list'][0]))
logging.info("Eval list path set to :{}".format(global_config['Eval'][
'dataset']['label_file_list'][0]))
logging.info("Dataset root path set to :{}".format(global_config['Eval'][
'dataset']['data_dir']))
logging.info("Dict path set to :{}".format(global_config['Global'][
'character_dict_path']))
logging.info("Config file set to :configs/rec/multi_language/{}".
format(save_file_path))
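Taken together, the reworked script resolves a language code through `support_list`, rewrites the dictionary, output, and label-file paths, and dumps a per-language YAML. A toy replay of the substitutions `_set_language` performs for `-l it`, with the structure taken from the hunks above:

```python
# What _set_language writes into the loaded config for language "it"
lang = "it"
cfg = {"Global": {}, "Train": {"dataset": {}}, "Eval": {"dataset": {}}}
cfg["Global"]["character_dict_path"] = f"ppocr/utils/dict/{lang}_dict.txt"
cfg["Global"]["save_model_dir"] = f"./output/rec_{lang}_lite"
cfg["Global"]["character_type"] = lang
cfg["Train"]["dataset"]["label_file_list"] = [f"train_data/{lang}_train.txt"]
cfg["Eval"]["dataset"]["label_file_list"] = [f"train_data/{lang}_val.txt"]
print(cfg)
```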

@ -16,7 +16,7 @@ Global:
infer_img:
# for data or label process
character_dict_path: ppocr/utils/dict/en_dict.txt
character_type: ch
character_type: EN
max_text_length: 25
infer_mode: False
use_space_char: False

@ -1,5 +1,5 @@
Global:
use_gpu: true
use_gpu: True
epoch_num: 72
log_smooth_window: 20
print_batch_step: 10
@ -59,7 +59,7 @@ Metric:
Train:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/training/
transforms:
- DecodeImage: # load image
@ -78,7 +78,7 @@ Train:
Eval:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/validation/
transforms:
- DecodeImage: # load image

@ -58,7 +58,7 @@ Metric:
Train:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/training/
transforms:
- DecodeImage: # load image
@ -77,7 +77,7 @@ Train:
Eval:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/validation/
transforms:
- DecodeImage: # load image
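The `LMDBDateSet` → `LMDBDataSet` renames matter because dataset classes are typically resolved from the YAML `name` string at runtime, so a misspelled name fails to resolve at all. A minimal registry sketch of that failure mode (an illustration, not PaddleOCR's actual loader):

```python
DATASETS = {}

def register(cls):
    """Index a dataset class under its exact class name."""
    DATASETS[cls.__name__] = cls
    return cls

@register
class LMDBDataSet:
    def __init__(self, data_dir):
        self.data_dir = data_dir

def build_dataset(cfg):
    try:
        cls = DATASETS[cfg["name"]]  # "LMDBDateSet" would raise here
    except KeyError:
        raise ValueError(f"unknown dataset {cfg['name']!r}; "
                         f"expected one of {sorted(DATASETS)}")
    return cls(cfg["data_dir"])
```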

@ -0,0 +1,102 @@
Global:
use_gpu: True
epoch_num: 72
log_smooth_window: 20
print_batch_step: 10
save_model_dir: ./output/rec/rec_mv3_tps_bilstm_att/
save_epoch_step: 3
# evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: [0, 2000]
# if pretrained_model is saved in static mode, load_static_weights must set to True
cal_metric_during_train: True
pretrained_model:
checkpoints:
save_inference_dir:
use_visualdl: False
infer_img: doc/imgs_words/ch/word_1.jpg
# for data or label process
character_dict_path:
character_type: en
max_text_length: 25
infer_mode: False
use_space_char: False
Optimizer:
name: Adam
beta1: 0.9
beta2: 0.999
lr:
learning_rate: 0.0005
regularizer:
name: 'L2'
factor: 0.00001
Architecture:
model_type: rec
algorithm: RARE
Transform:
name: TPS
num_fiducial: 20
loc_lr: 0.1
model_name: small
Backbone:
name: MobileNetV3
scale: 0.5
model_name: large
Neck:
name: SequenceEncoder
encoder_type: rnn
hidden_size: 96
Head:
name: AttentionHead
hidden_size: 96
Loss:
name: AttentionLoss
PostProcess:
name: AttnLabelDecode
Metric:
name: RecMetric
main_indicator: acc
Train:
dataset:
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/training/
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- AttnLabelEncode: # Class handling label
- RecResizeImg:
image_shape: [3, 32, 100]
- KeepKeys:
keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
loader:
shuffle: True
batch_size_per_card: 256
drop_last: True
num_workers: 8
Eval:
dataset:
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/validation/
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- AttnLabelEncode: # Class handling label
- RecResizeImg:
image_shape: [3, 32, 100]
- KeepKeys:
keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
loader:
shuffle: False
drop_last: False
batch_size_per_card: 256
num_workers: 1

@ -1,5 +1,5 @@
Global:
use_gpu: true
use_gpu: True
epoch_num: 72
log_smooth_window: 20
print_batch_step: 10
@ -63,7 +63,7 @@ Metric:
Train:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/training/
transforms:
- DecodeImage: # load image
@ -82,7 +82,7 @@ Train:
Eval:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/validation/
transforms:
- DecodeImage: # load image

@ -58,7 +58,7 @@ Metric:
Train:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/training/
transforms:
- DecodeImage: # load image
@ -77,7 +77,7 @@ Train:
Eval:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/validation/
transforms:
- DecodeImage: # load image

@ -56,7 +56,7 @@ Metric:
Train:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/training/
transforms:
- DecodeImage: # load image
@ -75,7 +75,7 @@ Train:
Eval:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/validation/
transforms:
- DecodeImage: # load image

@ -0,0 +1,101 @@
Global:
use_gpu: True
epoch_num: 400
log_smooth_window: 20
print_batch_step: 10
save_model_dir: ./output/rec/b3_rare_r34_none_gru/
save_epoch_step: 3
# evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: [0, 2000]
# if pretrained_model is saved in static mode, load_static_weights must set to True
cal_metric_during_train: True
pretrained_model:
checkpoints:
save_inference_dir:
use_visualdl: False
infer_img: doc/imgs_words/ch/word_1.jpg
# for data or label process
character_dict_path:
character_type: en
max_text_length: 25
infer_mode: False
use_space_char: False
Optimizer:
name: Adam
beta1: 0.9
beta2: 0.999
lr:
learning_rate: 0.0005
regularizer:
name: 'L2'
factor: 0.00000
Architecture:
model_type: rec
algorithm: RARE
Transform:
name: TPS
num_fiducial: 20
loc_lr: 0.1
model_name: large
Backbone:
name: ResNet
layers: 34
Neck:
name: SequenceEncoder
encoder_type: rnn
hidden_size: 256 #96
Head:
name: AttentionHead # AttentionHead
hidden_size: 256 #
l2_decay: 0.00001
Loss:
name: AttentionLoss
PostProcess:
name: AttnLabelDecode
Metric:
name: RecMetric
main_indicator: acc
Train:
dataset:
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/training/
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- AttnLabelEncode: # Class handling label
- RecResizeImg:
image_shape: [3, 32, 100]
- KeepKeys:
keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
loader:
shuffle: True
batch_size_per_card: 256
drop_last: True
num_workers: 8
Eval:
dataset:
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/validation/
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- AttnLabelEncode: # Class handling label
- RecResizeImg:
image_shape: [3, 32, 100]
- KeepKeys:
keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
loader:
shuffle: False
drop_last: False
batch_size_per_card: 256
num_workers: 8

@ -62,7 +62,7 @@ Metric:
Train:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/training/
transforms:
- DecodeImage: # load image
@ -81,7 +81,7 @@ Train:
Eval:
dataset:
name: LMDBDateSet
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/validation/
transforms:
- DecodeImage: # load image

@ -0,0 +1,107 @@
Global:
use_gpu: True
epoch_num: 72
log_smooth_window: 20
print_batch_step: 5
save_model_dir: ./output/rec/srn_new
save_epoch_step: 3
# evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: [0, 5000]
# if pretrained_model is saved in static mode, load_static_weights must set to True
cal_metric_during_train: True
pretrained_model:
checkpoints:
save_inference_dir:
use_visualdl: False
infer_img: doc/imgs_words/ch/word_1.jpg
# for data or label process
character_dict_path:
character_type: en
max_text_length: 25
num_heads: 8
infer_mode: False
use_space_char: False
Optimizer:
name: Adam
beta1: 0.9
beta2: 0.999
clip_norm: 10.0
lr:
learning_rate: 0.0001
Architecture:
model_type: rec
algorithm: SRN
in_channels: 1
Transform:
Backbone:
name: ResNetFPN
Head:
name: SRNHead
max_text_length: 25
num_heads: 8
num_encoder_TUs: 2
num_decoder_TUs: 4
hidden_dims: 512
Loss:
name: SRNLoss
PostProcess:
name: SRNLabelDecode
Metric:
name: RecMetric
main_indicator: acc
Train:
dataset:
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/training/
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- SRNLabelEncode: # Class handling label
- SRNRecResizeImg:
image_shape: [1, 64, 256]
- KeepKeys:
keep_keys: ['image',
'label',
'length',
'encoder_word_pos',
'gsrm_word_pos',
'gsrm_slf_attn_bias1',
'gsrm_slf_attn_bias2'] # dataloader will return list in this order
loader:
shuffle: False
batch_size_per_card: 64
drop_last: False
num_workers: 4
Eval:
dataset:
name: LMDBDataSet
data_dir: ./train_data/data_lmdb_release/validation/
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- SRNLabelEncode: # Class handling label
- SRNRecResizeImg:
image_shape: [1, 64, 256]
- KeepKeys:
keep_keys: ['image',
'label',
'length',
'encoder_word_pos',
'gsrm_word_pos',
'gsrm_slf_attn_bias1',
'gsrm_slf_attn_bias2']
loader:
shuffle: False
drop_last: False
batch_size_per_card: 32
num_workers: 4

@ -1,6 +1,8 @@
# Server-side C++ inference
This tutorial introduces the detailed steps for deploying PaddleOCR's ultra-lightweight Chinese detection and recognition models on the server side.
This chapter introduces the C++ deployment method for PaddleOCR models; for the corresponding Python inference deployment, refer to the [documentation](../../doc/doc_ch/inference.md).
C++ outperforms Python in compute performance, so C++ deployment is used in most CPU and GPU deployment scenarios. This section introduces how to set up the C++ environment on Linux/Windows (CPU/GPU) and complete PaddleOCR model deployment.
## 1. Prepare the environment

@ -1,7 +1,9 @@
# Server-side C++ inference
In this tutorial, we will introduce the detailed steps of deploying PaddleOCR ultra-lightweight Chinese detection and recognition models on the server side.
This chapter introduces the C++ deployment method for the PaddleOCR model; for the corresponding Python inference deployment, refer to the [documentation](../../doc/doc_ch/inference.md).
C++ outperforms Python in compute performance, so C++ deployment is preferred in most CPU and GPU deployment scenarios.
This section introduces how to configure the C++ environment and complete PaddleOCR model deployment in Linux/Windows (CPU/GPU) environments.
## 1. Prepare the environment

@ -76,7 +76,7 @@ void CRNNRecognizer::Run(std::vector<std::vector<std::vector<int>>> boxes,
float(*std::max_element(&predict_batch[n * predict_shape[2]],
&predict_batch[(n + 1) * predict_shape[2]]));
if (argmax_idx > 0 && (!(i > 0 && argmax_idx == last_index))) {
if (argmax_idx > 0 && (!(n > 0 && argmax_idx == last_index))) {
score += max_value;
count += 1;
str_res.push_back(label_list_[argmax_idx]);
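The one-character fix above (`i` → `n`) repairs the duplicate-suppression test of the CTC greedy decoder: the comparison must use the inner loop's timestep index. The rule it restores, in a small Python sketch (assuming index 0 is the CTC blank):

```python
import numpy as np

def ctc_greedy_decode(probs, charset, blank=0):
    """probs: (T, C) per-timestep scores; collapse repeats, then drop blanks."""
    ids = probs.argmax(axis=1)
    chars, score, count, last = [], 0.0, 0, blank
    for n, idx in enumerate(ids):
        # Keep a symbol only if it is not blank and not a repeat of the
        # previous timestep -- hence n > 0, not some outer-loop index.
        if idx != blank and not (n > 0 and idx == last):
            chars.append(charset[idx])
            score += probs[n, idx]
            count += 1
        last = idx
    return "".join(chars), score / max(count, 1)
```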

@ -9,7 +9,7 @@ use_mkldnn 0
max_side_len 960
det_db_thresh 0.3
det_db_box_thresh 0.5
det_db_unclip_ratio 2.0
det_db_unclip_ratio 1.6
det_model_dir ./inference/ch_ppocr_mobile_v2.0_det_infer/
# cls config

@ -20,7 +20,8 @@ def read_params():
#DB parmas
cfg.det_db_thresh = 0.3
cfg.det_db_box_thresh = 0.5
cfg.det_db_unclip_ratio = 2.0
cfg.det_db_unclip_ratio = 1.6
cfg.use_dilation = False
# #EAST parmas
# cfg.det_east_score_thresh = 0.8

@ -20,7 +20,8 @@ def read_params():
#DB parmas
cfg.det_db_thresh = 0.3
cfg.det_db_box_thresh = 0.5
cfg.det_db_unclip_ratio = 2.0
cfg.det_db_unclip_ratio = 1.6
cfg.use_dilation = False
#EAST parmas
cfg.det_east_score_thresh = 0.8
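`det_db_unclip_ratio` controls how far each detected text polygon is expanded before boxes are emitted, so lowering it from 2.0 to 1.6 produces tighter boxes, and `use_dilation` (newly exposed, default False) optionally dilates the binarized segmentation map to merge fragmented regions. For orientation, a sketch of the usual DB unclip step, assuming `shapely` and `pyclipper` as in common DB post-processing rather than this repository's exact code:

```python
import numpy as np
import pyclipper
from shapely.geometry import Polygon

def unclip(box, unclip_ratio=1.6):
    """Expand a detected polygon outward; the offset grows with unclip_ratio."""
    poly = Polygon(box)
    distance = poly.area * unclip_ratio / poly.length  # DB's offset rule
    offset = pyclipper.PyclipperOffset()
    offset.AddPath([tuple(map(int, p)) for p in box],
                   pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
    return np.array(offset.Execute(distance))
```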

Some files were not shown because too many files have changed in this diff.
