Package com.itextpdf.pdfocr
Interface IOcrEngine
All Known Implementing Classes:
AbstractTesseract4OcrEngine, Tesseract4ExecutableOcrEngine, Tesseract4LibOcrEngine
public interface IOcrEngine
The IOcrEngine interface is used for instantiating new OcrReader objects. It makes it possible to perform OCR, to read data from input files, and to return the contained text in the required format.
Method Summary

void createTxtFile(List<File> inputImages, File txtFile)
    Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path.

void createTxtFile(List<File> inputImages, File txtFile, OcrProcessContext ocrProcessContext)
    Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path.

doImageOcr(File input)
    Reads data from the provided input image file and returns the retrieved data in the format described below.

doImageOcr(File input, OcrProcessContext ocrProcessContext)
    Reads data from the provided input image file and returns the retrieved data in the format described below.

boolean isTaggingSupported()
    Checks whether tagging is supported by the OCR engine.

Method Details

doImageOcr
doImageOcr(File input)
Reads data from the provided input image file and returns the retrieved data in the format described below.

doImageOcr
doImageOcr(File input, OcrProcessContext ocrProcessContext)
Reads data from the provided input image file and returns the retrieved data in the format described below.

createTxtFile
void createTxtFile(List<File> inputImages, File txtFile)
Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. Note that a human reading order is not guaranteed, because of possible specifics of the input images (multi-column layout, tables, etc.).
Parameters:
inputImages - List of images to be OCRed
txtFile - file to be created

createTxtFile
void createTxtFile(List<File> inputImages, File txtFile, OcrProcessContext ocrProcessContext)
Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. Note that a human reading order is not guaranteed, because of possible specifics of the input images (multi-column layout, tables, etc.).
Parameters:
inputImages - List of images to be OCRed
txtFile - file to be created
ocrProcessContext - OCR processing context

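As a rough illustration of the createTxtFile contract, the sketch below shows the call shape: a list of image files goes in, and one text file comes out. It does not use pdfOCR itself. TxtCapableEngine is a hypothetical stand-in that mirrors only the documented two-argument signature, and PlaceholderEngine is a fake that writes one placeholder line per image instead of running OCR. The file names page1.png and page2.png are also assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CreateTxtFileSketch {

    /** Hypothetical stand-in: mirrors only the documented createTxtFile(List<File>, File) signature. */
    interface TxtCapableEngine {
        void createTxtFile(List<File> inputImages, File txtFile);
    }

    /** Fake engine for illustration only: writes a placeholder line per image, no real OCR. */
    static class PlaceholderEngine implements TxtCapableEngine {
        @Override
        public void createTxtFile(List<File> inputImages, File txtFile) {
            List<String> lines = new ArrayList<>();
            for (File image : inputImages) {
                lines.add("[text recognized in " + image.getName() + "]");
            }
            try {
                Files.write(txtFile.toPath(), lines, StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Runs the engine over the images and returns the lines of the resulting text file. */
    static List<String> ocrToLines(TxtCapableEngine engine, List<File> images, File txtFile) {
        engine.createTxtFile(images, txtFile);
        try {
            return Files.readAllLines(txtFile.toPath(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        File txtFile = File.createTempFile("ocr-output", ".txt");
        txtFile.deleteOnExit();
        // The images are passed in page order; the engine decides the text order within each image.
        List<File> images = Arrays.asList(new File("page1.png"), new File("page2.png"));
        for (String line : ocrToLines(new PlaceholderEngine(), images, txtFile)) {
            System.out.println(line);
        }
    }
}
```

With a real engine the output file would hold recognized text, and, as noted above, the order of that text within a page may not match human reading order for multi-column or tabular images.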
isTaggingSupported
boolean isTaggingSupported()
Checks whether tagging is supported by the OCR engine.
Returns:
true if tagging is supported by the engine, false otherwise
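
A caller would typically consult isTaggingSupported() before attempting a step that depends on tagging. The sketch below shows one way such a guard might look. TaggingAwareEngine is a hypothetical stand-in that exposes only the documented method, and shouldTag is an illustrative helper, not part of pdfOCR.

```java
public class TaggingCheckSketch {

    /** Hypothetical stand-in: exposes only the documented isTaggingSupported() method. */
    interface TaggingAwareEngine {
        boolean isTaggingSupported();
    }

    /** Returns true only when tagging was requested and the engine reports support for it. */
    static boolean shouldTag(TaggingAwareEngine engine, boolean taggingRequested) {
        return taggingRequested && engine.isTaggingSupported();
    }

    public static void main(String[] args) {
        TaggingAwareEngine plainEngine = () -> false;   // engine without tagging support
        TaggingAwareEngine taggingEngine = () -> true;  // engine with tagging support

        System.out.println(shouldTag(plainEngine, true));    // prints false
        System.out.println(shouldTag(taggingEngine, true));  // prints true
        System.out.println(shouldTag(taggingEngine, false)); // prints false
    }
}
```

Checking the capability up front lets the caller fall back to untagged output instead of failing partway through processing.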