public class OcrWithPostProcessingEngine extends Object implements com.itextpdf.pdfocr.IOcrEngine
| Constructor | Description |
|---|---|
| OcrWithPostProcessingEngine(com.itextpdf.pdfocr.IOcrEngine baseOcrEngine, List<IOcrEnginePostProcessor> postProcessors, boolean isTaggingSupported) | Creates new OcrWithPostProcessingEngine instance. |
| Modifier and Type | Method | Description |
|---|---|---|
| void | createTxtFile(List<File> inputImages, File txtFile) | Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. |
| void | createTxtFile(List<File> inputImages, File txtFile, com.itextpdf.pdfocr.OcrProcessContext ocrProcessContext) | Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. |
| Map<Integer,List<TextInfo>> | doImageOcr(File input) | Performs OCR with post-processing on the input file. |
| Map<Integer,List<TextInfo>> | doImageOcr(File input, com.itextpdf.pdfocr.OcrProcessContext ocrProcessContext) | Performs OCR with post-processing on the input file. |
| boolean | isTaggingSupported() | Gets whether results will be tagged or not. |
public OcrWithPostProcessingEngine(com.itextpdf.pdfocr.IOcrEngine baseOcrEngine,
List<IOcrEnginePostProcessor> postProcessors,
boolean isTaggingSupported)
Creates a new OcrWithPostProcessingEngine instance.
baseOcrEngine - base OCR engine, which implements IOcrEngine
postProcessors - List of IOcrEnginePostProcessor
isTaggingSupported - if true, results will be tagged; otherwise the tag structure will be missing
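The constructor arguments suggest a decorator: a base engine produces raw OCR results and a list of post-processors refines them. The following is a minimal sketch of that idea, not the iText implementation. The types `Word`, `Engine`, and `PostProcessor`, the `process` method, and the choice to apply post-processors in list order are all assumptions made for illustration; the actual IOcrEnginePostProcessor signature is not shown on this page.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical stand-ins only; none of these are the iText pdfOCR types.
class PostProcessingSketch {
    /** Stand-in for a recognized text item (bounding box omitted for brevity). */
    static final class Word {
        final String text;
        Word(String text) { this.text = text; }
    }

    /** Stand-in for the engine contract: page number -> recognized items. */
    interface Engine {
        Map<Integer, List<Word>> doImageOcr(File input);
    }

    /** Stand-in for a post-processor; the method name and signature are assumed. */
    interface PostProcessor {
        Map<Integer, List<Word>> process(Map<Integer, List<Word>> ocrResult);
    }

    /** Decorator: runs the base engine, then each post-processor in list order (assumed). */
    static final class PostProcessingEngine implements Engine {
        private final Engine base;
        private final List<PostProcessor> postProcessors;
        private final boolean taggingSupported;

        PostProcessingEngine(Engine base, List<PostProcessor> postProcessors, boolean taggingSupported) {
            this.base = base;
            this.postProcessors = new ArrayList<>(postProcessors);
            this.taggingSupported = taggingSupported;
        }

        boolean isTaggingSupported() { return taggingSupported; }

        @Override
        public Map<Integer, List<Word>> doImageOcr(File input) {
            Map<Integer, List<Word>> result = base.doImageOcr(input);
            for (PostProcessor postProcessor : postProcessors) {
                result = postProcessor.process(result);
            }
            return result;
        }
    }

    /** Helper that rebuilds the result with a text transformation applied to every item. */
    static Map<Integer, List<Word>> mapText(Map<Integer, List<Word>> pages, UnaryOperator<String> f) {
        Map<Integer, List<Word>> out = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<Word>> page : pages.entrySet()) {
            List<Word> words = new ArrayList<>();
            for (Word w : page.getValue()) {
                words.add(new Word(f.apply(w.text)));
            }
            out.put(page.getKey(), words);
        }
        return out;
    }

    /** Wires a fake base engine to two post-processors: trim, then upper-case. */
    static Map<Integer, List<Word>> demo() {
        Engine fakeBase = input -> {
            Map<Integer, List<Word>> pages = new LinkedHashMap<>();
            pages.put(1, Arrays.asList(new Word(" hello "), new Word("world ")));
            return pages;
        };
        PostProcessor trim = result -> mapText(result, String::trim);
        PostProcessor upper = result -> mapText(result, String::toUpperCase);
        Engine engine = new PostProcessingEngine(fakeBase, Arrays.asList(trim, upper), true);
        return engine.doImageOcr(new File("scan.png"));
    }
}
```

Because the wrapper exposes the same method as the base engine in this sketch, callers that accept the engine interface need no changes when post-processing is added.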
public boolean isTaggingSupported()
true if results will be tagged, false otherwise
public Map<Integer,List<TextInfo>> doImageOcr(File input)
public Map<Integer,List<TextInfo>> doImageOcr(File input, com.itextpdf.pdfocr.OcrProcessContext ocrProcessContext)
Performs OCR with post-processing on the input file.
doImageOcr in interface com.itextpdf.pdfocr.IOcrEngine
input - input image File
ocrProcessContext - OCR processing context
Map where the key is an Integer representing the page number and the value is a List of TextInfo elements; each TextInfo element contains a word or a line and its 4 coordinates (bbox)
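To make the shape of the returned map concrete, here is a small sketch that walks a page-number-to-items map and prints one line per page. `Item` is a hypothetical stand-in for TextInfo, and the order of its four bounding-box values is an assumption; neither is taken from the iText API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class OcrResultSketch {
    /** Hypothetical stand-in for TextInfo: recognized text plus a 4-value bounding box. */
    static final class Item {
        final String text;
        final float[] bbox; // assumed order: left, bottom, right, top

        Item(String text, float left, float bottom, float right, float top) {
            this.text = text;
            this.bbox = new float[] {left, bottom, right, top};
        }
    }

    /** Renders one line per page, pages ascending, items in the order they were returned. */
    static String describe(Map<Integer, List<Item>> ocrResult) {
        StringBuilder out = new StringBuilder();
        // Copy into a TreeMap so pages are visited by number, whatever the input map's order.
        for (Map.Entry<Integer, List<Item>> page : new TreeMap<>(ocrResult).entrySet()) {
            List<String> words = new ArrayList<>();
            for (Item item : page.getValue()) {
                words.add(item.text);
            }
            out.append("Page ").append(page.getKey()).append(": ")
               .append(String.join(" ", words)).append('\n');
        }
        return out.toString();
    }
}
```

The sketch keeps items in the order the engine returned them; it does not use the bounding boxes to reconstruct layout.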
public void createTxtFile(List<File> inputImages, File txtFile)
Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. Note that human reading order is not guaranteed because of possible specifics of the input images (multicolumn layout, tables, etc.).
createTxtFile in interface com.itextpdf.pdfocr.IOcrEngine
inputImages - List of images to be OCRed
txtFile - file to be created
public void createTxtFile(List<File> inputImages, File txtFile, com.itextpdf.pdfocr.OcrProcessContext ocrProcessContext)
Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. Note that human reading order is not guaranteed because of possible specifics of the input images (multicolumn layout, tables, etc.).
createTxtFile in interface com.itextpdf.pdfocr.IOcrEngine
inputImages - List of images to be OCRed
txtFile - file to be created
ocrProcessContext - OCR processing context
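The createTxtFile contract described above (run OCR on each image, then write the text to one file) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the `recognize` function is a hypothetical stand-in for an OCR call, and the one-line-per-page, UTF-8 output format is chosen for the example only.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

class TxtFileSketch {
    /**
     * Writes the recognized text of every image to txtFile, one line per page.
     * "recognize" is a hypothetical stand-in mapping an image to page number -> words.
     */
    static void createTxtFile(List<File> inputImages, File txtFile,
                              Function<File, Map<Integer, List<String>>> recognize) throws IOException {
        List<String> lines = new ArrayList<>();
        for (File image : inputImages) {
            // Visit pages by ascending page number; words keep the order they were returned in.
            Map<Integer, List<String>> pages = new TreeMap<>(recognize.apply(image));
            for (List<String> words : pages.values()) {
                lines.add(String.join(" ", words));
            }
        }
        Files.write(txtFile.toPath(), lines, StandardCharsets.UTF_8);
    }

    /** Runs the sketch against a fake recognizer and returns the lines that were written. */
    static List<String> demo() {
        try {
            File out = File.createTempFile("ocr-sketch", ".txt");
            out.deleteOnExit();
            // Fake OCR: pretends each image holds one page with its file name and the word "text".
            Function<File, Map<Integer, List<String>>> fakeOcr = image -> {
                Map<Integer, List<String>> pages = new HashMap<>();
                pages.put(1, Arrays.asList(image.getName(), "text"));
                return pages;
            };
            createTxtFile(Arrays.asList(new File("a.png"), new File("b.png")), out, fakeOcr);
            return Files.readAllLines(out.toPath(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch emits words in whatever order the recognizer returns them and performs no layout analysis, which is one way the reading-order caveat above can show up in practice for multicolumn pages or tables.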
Copyright © 2024. All rights reserved.