Generated by
JDiff

com.itextpdf.pdfocr.tesseract4 Documentation Differences

This file contains all the changes in documentation in the package com.itextpdf.pdfocr.tesseract4 as colored differences. Deletions are shown like this, and additions are shown like this.
If no deletions or additions are shown in an entry, the HTML tags will be what has changed. The new HTML tags are shown in the differences. If no documentation existed, and then some was added in a later version, this change is noted in the appropriate class pages of differences, but the change is not shown on this page. Only changes in existing text are shown here. Similarly, documentation which was inherited from another class or interface is not shown here.
Note that an HTML error in the new documentation may cause the display of other documentation changes to be presented incorrectly. For instance, failure to close a tag will cause all subsequent paragraphs to be displayed differently.

Class Tesseract4OcrEngineProperties, int getMinimalConfidenceLevel()

Gets the minimal confidence level for an HOCR line to be considered properly recognized. If the real confidence level is lower, the line is ignored. The default value is 0, which means that everything is considered properly recognized. The value may vary in the range 0-100.
@return minimal confidence level
Class Tesseract4OcrEngineProperties, Tesseract4OcrEngineProperties setMinimalConfidenceLevel(int)

Sets the minimal confidence level for an HOCR line to be considered properly recognized. If the real confidence level is lower, the line is ignored. The default value is 0, which means that everything is considered properly recognized. The value may vary in the range 0-100.
@param minimalConfidenceLevel minimal confidence level value
@return this Tesseract4OcrEngineProperties instance
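
The threshold semantics described for getMinimalConfidenceLevel() and setMinimalConfidenceLevel(int) can be illustrated with a minimal, library-free sketch. The ConfidenceFilterSketch class, its HocrLine type, and the filter method are hypothetical names introduced only for this illustration. They are not part of com.itextpdf.pdfocr.tesseract4 and do not claim to reproduce how the library applies the threshold internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: none of these types belong to iText pdfOCR.
class ConfidenceFilterSketch {

    /** Stand-in for one recognized HOCR line with its confidence (0-100). */
    static final class HocrLine {
        final String text;
        final int confidence;

        HocrLine(String text, int confidence) {
            this.text = text;
            this.confidence = confidence;
        }
    }

    /**
     * Keeps only the lines whose confidence is not lower than the threshold.
     * With the documented default of 0, every line is kept.
     */
    static List<HocrLine> filter(List<HocrLine> lines, int minimalConfidenceLevel) {
        List<HocrLine> kept = new ArrayList<>();
        for (HocrLine line : lines) {
            // "If the real confidence level is lower, the line is ignored."
            if (line.confidence >= minimalConfidenceLevel) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<HocrLine> lines = Arrays.asList(
                new HocrLine("clear line", 95),
                new HocrLine("smudged line", 40),
                new HocrLine("noise", 0));

        // Default threshold 0: nothing is ignored.
        System.out.println(filter(lines, 0).size());
        // Threshold 50: only the line recognized with confidence 95 remains.
        System.out.println(filter(lines, 50).size());
    }
}
```

With the actual library, the threshold would presumably be configured through the documented setter, for example new Tesseract4OcrEngineProperties().setMinimalConfidenceLevel(50), which returns the same properties instance so that calls can be chained.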
Class Tesseract4OcrEngineProperties, Tesseract4OcrEngineProperties setUseTxtToImproveHocrParsing(boolean)

Sets useTxtToImproveHocrParsing. Used to make the HOCR recognition result more precise. This is needed for Thai or some Chinese dialects, where every character is interpreted as a single word. For more information see https://github.com/tesseract-ocr/tesseract/issues/2702
@param useTxtToImproveHocrParsing useTxtToImproveHocrParsing
@return this Tesseract4OcrEngineProperties instance.