Package com.itextpdf.pdfocr.onnxtr
Class OnnxTrOcrEngine
java.lang.Object
com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
All Implemented Interfaces:
  IOcrEngine, IProductAware, AutoCloseable
IOcrEngine implementation based on the OnnxTR/DocTR machine learning OCR projects.
NOTE: An OnnxTrOcrEngine instance must be closed once it is no longer used, to avoid leaking native allocations.
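Because the class implements AutoCloseable, the closing requirement fits try-with-resources. The sketch below illustrates only that pattern: StandInEngine and its recognize method are hypothetical stand-ins, not part of the library, so the example runs without the OCR models. With the real engine, close() is documented as throwing Exception, so the calling code would have to catch or propagate it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for OnnxTrOcrEngine: only the AutoCloseable contract is modeled.
final class StandInEngine implements AutoCloseable {
    private final List<String> log;

    StandInEngine(List<String> log) {
        this.log = log;
    }

    // Stands in for an OCR call such as doImageOcr; fails on demand.
    void recognize(boolean fail) {
        log.add("recognize");
        if (fail) {
            throw new IllegalStateException("simulated OCR failure");
        }
    }

    @Override
    public void close() {
        // Assumption: the real engine releases its native allocations at this point.
        log.add("close");
    }
}

final class CloseSketch {
    // Runs one OCR call inside try-with-resources and returns the recorded call order.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try (StandInEngine engine = new StandInEngine(log)) {
            engine.recognize(fail);
        } catch (IllegalStateException e) {
            // The resource has already been closed when this handler runs.
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

The point of the pattern is that close() runs whether or not the OCR call succeeds, so a failure in the middle of processing does not leave the engine open.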

Constructor Summary
Constructors
Constructor and Description
OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor)
    Create a new OCR engine with the provided predictors.
OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor, OnnxTrEngineProperties properties)
    Create a new OCR engine with the provided predictors.
OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IRecognitionPredictor recognitionPredictor)
    Create a new OCR engine with the provided predictors, without text orientation prediction.

Method Summary
Modifier and Type, Method and Description
void close()
void createTxtFile(List<File> inputImages, File txtFile)
    Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path.
void createTxtFile(List<File> inputImages, File txtFile, OcrProcessContext ocrProcessContext)
    Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path.
Map<Integer, List<TextInfo>> doImageOcr(File input)
    Reads data from the provided input image file and returns the retrieved data in the format described below.
Map<Integer, List<TextInfo>> doImageOcr(File input, OcrProcessContext ocrProcessContext)
    Reads data from the provided input image file and returns the retrieved data in the format described below.
getMetaInfoContainer()
    Gets the container with meta info.
com.itextpdf.commons.actions.data.ProductData getProductData()
    Gets object containing information about the product.
boolean isTaggingSupported()
    Checks whether tagging is supported by the OCR engine.

Constructor Details

OnnxTrOcrEngine
public OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor)
Create a new OCR engine with the provided predictors.
Parameters:
  detectionPredictor - text detector. For an input image, it outputs a list of text boxes.
  orientationPredictor - text orientation predictor. For an input image that is a tight crop of text, it outputs the text orientation in 90-degree steps. Can be null, in which case all text is assumed to be upright.
  recognitionPredictor - text recognizer. For an input image that is a tight crop of text, it outputs the displayed string.

OnnxTrOcrEngine
public OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor, OnnxTrEngineProperties properties)
Create a new OCR engine with the provided predictors.
Parameters:
  detectionPredictor - text detector. For an input image, it outputs a list of text boxes.
  orientationPredictor - text orientation predictor. For an input image that is a tight crop of text, it outputs the text orientation in 90-degree steps. Can be null, in which case all text is assumed to be upright.
  recognitionPredictor - text recognizer. For an input image that is a tight crop of text, it outputs the displayed string.
  properties - set of properties

OnnxTrOcrEngine
public OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IRecognitionPredictor recognitionPredictor)
Create a new OCR engine with the provided predictors, without text orientation prediction.
Parameters:
  detectionPredictor - text detector. For an input image, it outputs a list of text boxes.
  recognitionPredictor - text recognizer. For an input image that is a tight crop of text, it outputs the displayed string.

Method Details

close
Specified by:
  close in interface AutoCloseable
Throws:
  Exception

doImageOcr
Reads data from the provided input image file and returns the retrieved data in the format described below.
Specified by:
  doImageOcr in interface IOcrEngine
Parameters:
  input - input image File
Returns:
  Map whose key is an Integer representing the page number and whose value is a List of TextInfo elements, where each TextInfo element contains a word or a line and its 4 coordinates (bbox)

doImageOcr
Reads data from the provided input image file and returns the retrieved data in the format described below.
Specified by:
  doImageOcr in interface IOcrEngine
Parameters:
  input - input image File
  ocrProcessContext - OCR processing context
Returns:
  Map whose key is an Integer representing the page number and whose value is a List of TextInfo elements, where each TextInfo element contains a word or a line and its 4 coordinates (bbox)
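The shape of the returned map can be easier to follow with a small walk-through. The sketch below is an illustration only: TextBox is a hypothetical stand-in for TextInfo (its fields and the bbox layout are assumptions, not the library's API), and the sample page numbers are made up. It joins the text of each page's elements in the order in which they appear in the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for TextInfo: a word or line plus its 4 coordinates.
final class TextBox {
    final String text;
    final float[] bbox; // 4 coordinates; the exact layout is an assumption

    TextBox(String text, float... bbox) {
        this.text = text;
        this.bbox = bbox;
    }
}

final class OcrResultSketch {
    // Joins the text of all elements on each page, keeping the list order.
    static Map<Integer, String> joinTextPerPage(Map<Integer, List<TextBox>> result) {
        Map<Integer, String> textPerPage = new TreeMap<>();
        for (Map.Entry<Integer, List<TextBox>> page : result.entrySet()) {
            List<String> parts = new ArrayList<>();
            for (TextBox box : page.getValue()) {
                parts.add(box.text);
            }
            textPerPage.put(page.getKey(), String.join(" ", parts));
        }
        return textPerPage;
    }

    public static void main(String[] args) {
        // Sample data shaped like the documented return value.
        Map<Integer, List<TextBox>> result = new TreeMap<>();
        result.put(1, Arrays.asList(
                new TextBox("Hello", 0f, 0f, 10f, 5f),
                new TextBox("world", 12f, 0f, 22f, 5f)));
        result.put(2, Arrays.asList(new TextBox("Second page", 0f, 0f, 30f, 5f)));
        System.out.println(joinTextPerPage(result));
    }
}
```

The list order is used as-is here; nothing in this sketch attempts to recover a human reading order from the coordinates.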

createTxtFile
Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. Note that human reading order is not guaranteed, because of possible specifics of the input images (multi-column layouts, tables, etc.).
Specified by:
  createTxtFile in interface IOcrEngine
Parameters:
  inputImages - List of images to be OCRed
  txtFile - file to be created

createTxtFile
public void createTxtFile(List<File> inputImages, File txtFile, OcrProcessContext ocrProcessContext)
Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. Note that human reading order is not guaranteed, because of possible specifics of the input images (multi-column layouts, tables, etc.).
Specified by:
  createTxtFile in interface IOcrEngine
Parameters:
  inputImages - List of images to be OCRed
  txtFile - file to be created
  ocrProcessContext - OCR processing context

isTaggingSupported
public boolean isTaggingSupported()
Checks whether tagging is supported by the OCR engine.
Specified by:
  isTaggingSupported in interface IOcrEngine
Returns:
  true if tagging is supported by the engine, false otherwise

getMetaInfoContainer
Gets the container with meta info.
Specified by:
  getMetaInfoContainer in interface IProductAware
Returns:
  the held meta info container

getProductData
public com.itextpdf.commons.actions.data.ProductData getProductData()
Gets object containing information about the product.
Specified by:
  getProductData in interface IProductAware
Returns:
  product data