Package com.itextpdf.pdfocr.util
Class PdfOcrTextBuilder
java.lang.Object
com.itextpdf.pdfocr.util.PdfOcrTextBuilder
Class to build text output from the provided image OCR result and write it to the TXT file.
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| static String | buildText | Constructs string output from the provided IOcrEngine.doImageOcr(java.io.File) result. |
| static void | collectWordsIntoLines(Map<Integer, List<TextInfo>> textInfos) | Merges the provided IOcrEngine.doImageOcr(java.io.File) result into lines and updates line bounding boxes to match the largest words. |
| | correctRotationAngle(Map<Integer, List<TextInfo>> result) | Processes all text infos to round the rotation angle to either 0, 90, 180 or 270 degrees. |
| static void | generifyWordBBoxesByLine(Map<Integer, List<TextInfo>> textInfos) | Sorts the provided IOcrEngine.doImageOcr(java.io.File) result by lines and updates line bboxes to match the largest words. |
| static void | sortTextInfosByLines(Map<Integer, List<TextInfo>> textInfos) | Sorts the provided IOcrEngine.doImageOcr(java.io.File) result by lines. |
Method Details
buildText
Constructs string output from the provided IOcrEngine.doImageOcr(java.io.File) result.
generifyWordBBoxesByLine
Sorts the provided IOcrEngine.doImageOcr(java.io.File) result by lines and updates line bboxes to match the largest words.
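One way to read "updates line bboxes to match the largest words" is that every word box in a line is stretched to the vertical extent of the tallest words in that line. The sketch below illustrates that reading only. `LineBoxSketch` and `Box` are hypothetical stand-ins, not types from this package, and the library's actual behavior may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class LineBoxSketch {
    // Hypothetical box holding only a vertical extent; not the library's TextInfo.
    static final class Box {
        float bottom;
        float top;

        Box(float bottom, float top) {
            this.bottom = bottom;
            this.top = top;
        }
    }

    // Stretches every box of one line to the line's lowest bottom and highest top.
    static void unifyVerticalExtent(List<Box> line) {
        if (line.isEmpty()) {
            return;
        }
        float bottom = Float.MAX_VALUE;
        float top = -Float.MAX_VALUE;
        for (Box box : line) {
            bottom = Math.min(bottom, box.bottom);
            top = Math.max(top, box.top);
        }
        for (Box box : line) {
            box.bottom = bottom;
            box.top = top;
        }
    }

    public static void main(String[] args) {
        List<Box> line = new ArrayList<>();
        line.add(new Box(10f, 20f));
        line.add(new Box(8f, 18f));
        line.add(new Box(11f, 25f));
        unifyVerticalExtent(line);
        for (Box box : line) {
            System.out.println(box.bottom + " .. " + box.top);
        }
    }
}
```

After the call, all three sample boxes share the extent of the line as a whole, which makes the words of one line easier to treat as a single row.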
collectWordsIntoLines
Merges the provided IOcrEngine.doImageOcr(java.io.File) result into lines and updates line bounding boxes to match the largest words.
sortTextInfosByLines
Sorts the provided IOcrEngine.doImageOcr(java.io.File) result by lines.
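Sorting OCR words "by lines" generally means grouping words that share a baseline and then ordering each group left to right. The following is a minimal sketch of that idea, not the library's implementation. The `LineSortSketch` class, the `Word` type, and the `tolerance` parameter are all hypothetical, and the sketch assumes a y axis that points up, as in PDF coordinates.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LineSortSketch {
    // Hypothetical stand-in for an OCR word; not the library's TextInfo.
    static final class Word {
        final String text;
        final float left;
        final float bottom;

        Word(String text, float left, float bottom) {
            this.text = text;
            this.left = left;
            this.bottom = bottom;
        }
    }

    // Groups words into lines (top to bottom), then orders each line left to right.
    static List<List<Word>> sortByLines(List<Word> words, float tolerance) {
        List<Word> byHeight = new ArrayList<>(words);
        // Larger bottom first: assumes the y axis points up.
        byHeight.sort(Comparator.comparingDouble((Word w) -> w.bottom).reversed());

        List<List<Word>> lines = new ArrayList<>();
        List<Word> current = null;
        float lineBottom = 0f;
        for (Word word : byHeight) {
            // Start a new line when the word sits too far below the line's first word.
            if (current == null || Math.abs(word.bottom - lineBottom) > tolerance) {
                current = new ArrayList<>();
                lines.add(current);
                lineBottom = word.bottom;
            }
            current.add(word);
        }
        // Within each line, read left to right.
        for (List<Word> line : lines) {
            line.sort(Comparator.comparingDouble((Word w) -> w.left));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Word> words = List.of(
                new Word("world", 60f, 100f),
                new Word("Hello", 10f, 101f),
                new Word("line", 40f, 80f),
                new Word("Second", 10f, 80.5f));
        for (List<Word> line : sortByLines(words, 2f)) {
            StringBuilder sb = new StringBuilder();
            for (Word w : line) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(w.text);
            }
            System.out.println(sb);
        }
    }
}
```

With the sample words in `main`, the first line reads "Hello world" and the second "Second line". The tolerance absorbs small baseline jitter between words of the same line; a real OCR result may need a tolerance derived from the word heights.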
correctRotationAngle
Processes all text infos to round the rotation angle to either 0, 90, 180 or 270 degrees. The text bounding rectangle is used to update the text bounding points.
Parameters:
result - OCR result to process
Returns:
same result, but corrected
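The rounding step can be illustrated with a short, self-contained sketch. It assumes angles are given in degrees and that "rounding" means snapping to the nearest multiple of 90; this page does not state the exact rule the library applies, and `RotationSketch` is a hypothetical name rather than part of the API.

```java
final class RotationSketch {
    // Snaps an angle in degrees to the nearest of 0, 90, 180 or 270.
    static int roundToRightAngle(double degrees) {
        // Normalize into [0, 360) so negative and oversized angles are handled.
        double normalized = ((degrees % 360.0) + 360.0) % 360.0;
        // Round to the nearest quarter turn; four quarter turns wrap back to 0.
        int quarterTurns = (int) Math.round(normalized / 90.0) % 4;
        return quarterTurns * 90;
    }

    public static void main(String[] args) {
        System.out.println(roundToRightAngle(88.5));
        System.out.println(roundToRightAngle(-3.0));
        System.out.println(roundToRightAngle(181.0));
    }
}
```

For the three sample inputs the sketch yields 90, 0 and 180. Normalizing first keeps slightly negative angles, which are common for nearly upright scans, from being mapped to the wrong quadrant.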