Package com.itextpdf.pdfocr.onnxtr
Class OnnxTrOcrEngine
java.lang.Object
com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
- All Implemented Interfaces:
  IOcrEngine, IProductAware, AutoCloseable
IOcrEngine implementation, based on the OnnxTR/DocTR machine learning OCR projects.
NOTE: An OnnxTrOcrEngine instance shall be closed after all usages to avoid leaking native allocations.
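Because the class implements AutoCloseable, a try-with-resources block is one way to make sure the engine is closed once all images have been processed. The sketch below shows only that lifecycle pattern. StandInEngine and its recognize method are hypothetical stand-ins, not part of the library, and the stand-in close() declares no checked exception, whereas the real close() is documented to throw Exception, so a caller of the real engine would have to handle or declare it.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseEngineSketch {

    // Hypothetical stand-in for OnnxTrOcrEngine. It holds no native memory and
    // only records lifecycle events so the order of calls can be inspected.
    static final class StandInEngine implements AutoCloseable {
        private final List<String> log;

        StandInEngine(List<String> log) {
            this.log = log;
            log.add("open");
        }

        // Hypothetical method standing in for an OCR call.
        void recognize(String imageName) {
            log.add("ocr:" + imageName);
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    // Runs every image through one engine instance and closes it exactly once,
    // even if recognize() throws part-way through the loop.
    public static List<String> run(List<String> imageNames) {
        List<String> log = new ArrayList<>();
        try (StandInEngine engine = new StandInEngine(log)) {
            for (String imageName : imageNames) {
                engine.recognize(imageName);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("page1.png", "page2.png")));
    }
}
```

Reusing one engine for a batch and closing it at the end matches the "after all usages" wording of the note. Whether the real engine is expensive to create is not stated here.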
-
Constructor Summary
Constructors
- OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor)
  Create a new OCR engine with the provided predictors.
- OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor, OnnxTrEngineProperties properties)
  Create a new OCR engine with the provided predictors.
- OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IRecognitionPredictor recognitionPredictor)
  Create a new OCR engine with the provided predictors, without text orientation prediction.
-
Method Summary
- void close()
- void createTxtFile(List<File> inputImages, File txtFile)
  Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path.
- void createTxtFile(List<File> inputImages, File txtFile, OcrProcessContext ocrProcessContext)
  Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path.
- Map<Integer, List<TextInfo>> doImageOcr(File input)
  Reads data from the provided input image file and returns the retrieved data in the format described below.
- Map<Integer, List<TextInfo>> doImageOcr(File input, OcrProcessContext ocrProcessContext)
  Reads data from the provided input image file and returns the retrieved data in the format described below.
- getMetaInfoContainer()
  Gets the container with meta info.
- com.itextpdf.commons.actions.data.ProductData getProductData()
  Gets object containing information about the product.
- boolean isTaggingSupported()
  Checks whether tagging is supported by the OCR engine.
-
Constructor Details
-
OnnxTrOcrEngine
public OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor)
Create a new OCR engine with the provided predictors.
- Parameters:
  detectionPredictor - text detector. For an input image, it outputs a list of text boxes.
  orientationPredictor - text orientation predictor. For an input image that is a tight crop of text, it outputs the text orientation in 90-degree steps. Can be null, in which case all text is assumed to be upright.
  recognitionPredictor - text recognizer. For an input image that is a tight crop of text, it outputs the displayed string.
-
OnnxTrOcrEngine
public OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor, OnnxTrEngineProperties properties)
Create a new OCR engine with the provided predictors.
- Parameters:
  detectionPredictor - text detector. For an input image, it outputs a list of text boxes.
  orientationPredictor - text orientation predictor. For an input image that is a tight crop of text, it outputs the text orientation in 90-degree steps. Can be null, in which case all text is assumed to be upright.
  recognitionPredictor - text recognizer. For an input image that is a tight crop of text, it outputs the displayed string.
  properties - set of properties
-
OnnxTrOcrEngine
public OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IRecognitionPredictor recognitionPredictor)
Create a new OCR engine with the provided predictors, without text orientation prediction.
- Parameters:
  detectionPredictor - text detector. For an input image, it outputs a list of text boxes.
  recognitionPredictor - text recognizer. For an input image that is a tight crop of text, it outputs the displayed string.
-
-
Method Details
-
close
public void close() throws Exception
- Specified by:
  close in interface AutoCloseable
- Throws:
  Exception
-
doImageOcr
public Map<Integer, List<TextInfo>> doImageOcr(File input)
Reads data from the provided input image file and returns the retrieved data in the format described below.
- Specified by:
  doImageOcr in interface IOcrEngine
- Parameters:
  input - input image File
- Returns:
  Map where the key is an Integer representing the page number and the value is a List of TextInfo elements, where each TextInfo element contains a word or a line and its 4 coordinates (bbox)
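The returned structure is a map from page number to a list of text items, and a short sketch may make it easier to picture how a caller would walk it. The Item class below is a hypothetical stand-in for TextInfo, used so the example compiles without the library. Its field names and the order of the four bbox values are assumptions, not the library's definition.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class OcrResultSketch {

    // Hypothetical stand-in for TextInfo: recognized text plus a 4-value bbox.
    // The coordinate order is an assumption made for this sketch only.
    public static final class Item {
        final String text;
        final float[] bbox;

        public Item(String text, float[] bbox) {
            this.text = text;
            this.bbox = bbox;
        }
    }

    // Joins the text of each page onto one line, visiting pages in ascending order.
    public static String describe(Map<Integer, List<Item>> result) {
        // Copy into a TreeMap so pages are sorted by number; the input map may be unordered.
        Map<Integer, List<Item>> sorted = new TreeMap<>(result);
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Integer, List<Item>> page : sorted.entrySet()) {
            out.append("Page ").append(page.getKey()).append(':');
            for (Item item : page.getValue()) {
                out.append(' ').append(item.text);
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Integer, List<Item>> result = Map.of(
                1, List.of(
                        new Item("Hello", new float[] {10, 700, 60, 715}),
                        new Item("world", new float[] {65, 700, 110, 715})));
        System.out.print(describe(result));
    }
}
```

Items within a page are emitted in list order here. As the createTxtFile description notes, that order is not guaranteed to match human reading order for multi-column layouts or tables.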
-
doImageOcr
public Map<Integer, List<TextInfo>> doImageOcr(File input, OcrProcessContext ocrProcessContext)
Reads data from the provided input image file and returns the retrieved data in the format described below.
- Specified by:
  doImageOcr in interface IOcrEngine
- Parameters:
  input - input image File
  ocrProcessContext - OCR processing context
- Returns:
  Map where the key is an Integer representing the page number and the value is a List of TextInfo elements, where each TextInfo element contains a word or a line and its 4 coordinates (bbox)
-
createTxtFile
public void createTxtFile(List<File> inputImages, File txtFile)
Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. Note that human reading order is not guaranteed, due to possible specifics of the input images (multi-column layouts, tables, etc.).
- Specified by:
  createTxtFile in interface IOcrEngine
- Parameters:
  inputImages - List of images to be OCRed
  txtFile - file to be created
-
createTxtFile
public void createTxtFile(List<File> inputImages, File txtFile, OcrProcessContext ocrProcessContext) Performs OCR using providedIOcrEngine
for the given list of input images and saves output to a text file using provided path. Note that a human reading order is not guaranteed due to possible specifics of input images (multi column layout, tables etc)- Specified by:
-
createTxtFile
in interfaceIOcrEngine
- Parameters:
-
inputImages
-List
of images to be OCRed -
txtFile
- file to be created -
ocrProcessContext
- ocr processing context
-
isTaggingSupported
public boolean isTaggingSupported()
Checks whether tagging is supported by the OCR engine.
- Specified by:
  isTaggingSupported in interface IOcrEngine
- Returns:
  true if tagging is supported by the engine, false otherwise
-
getMetaInfoContainer
Gets the container with meta info.
- Specified by:
  getMetaInfoContainer in interface IProductAware
- Returns:
  the held meta info container
-
getProductData
public com.itextpdf.commons.actions.data.ProductData getProductData()
Gets object containing information about the product.
- Specified by:
  getProductData in interface IProductAware
- Returns:
  product data
-