tesseract-ocr/tesseract

Define the / sign as a word delimiter.

Open

#1221 opened on Nov 24, 2017

feature request, help wanted

Description

Environment

  • Tesseract Version: 3.04.01
  • Platform: Debian Linux Buster (testing) and Windows 7 64-bit

Current Behavior:

In hOCR output, words joined by a / sign are treated as a single word.

Expected Behavior:

The / sign and the words before and after it should each be treated as a separate word.
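
To make the expected tokenization concrete, here is a minimal Python sketch. It is not Tesseract code and not a proposed patch; `split_on_slash` is a hypothetical helper that only shows how the text of one recognized word would be divided. It does not touch the hOCR bounding boxes, which would also need to be split.

```python
import re
from typing import List


def split_on_slash(word: str) -> List[str]:
    """Split a recognized word at each '/' and keep '/' as its own token.

    Hypothetical helper for illustration only; not part of Tesseract.
    """
    # The capturing group makes re.split return the delimiters as well.
    # Empty strings (produced by a leading or trailing '/') are dropped.
    return [part for part in re.split(r"(/)", word) if part]


print(split_on_slash("and/or"))  # ['and', '/', 'or']
```

With this behavior, a string such as "and/or" yields three words instead of one, which is the result requested for the hOCR output.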

Suggested Fix:

Define the / sign as a word delimiter.
