java.lang.Object

com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine

All Implemented Interfaces: IOcrEngine, IProductAware, AutoCloseable

public class OnnxTrOcrEngine extends Object implements IOcrEngine, AutoCloseable, IProductAware

An IOcrEngine implementation based on the OnnxTR/DocTR machine learning OCR projects.

NOTE: An OnnxTrOcrEngine instance must be closed once it is no longer needed, to avoid leaking native allocations.
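Because the class implements AutoCloseable, a try-with-resources statement is one way to satisfy the note above. The following is a minimal sketch of that pattern only: StandInEngine and its recognize method are hypothetical stand-ins, not part of the library, so that the example runs with the JDK alone. With the real engine, close() is declared to throw Exception, so the caller would also need to catch or propagate it.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseEngineSketch {
    // Hypothetical stand-in for OnnxTrOcrEngine (assumption, not library code).
    // It only records lifecycle events so the closing order can be observed.
    public static class StandInEngine implements AutoCloseable {
        private final List<String> events;

        public StandInEngine(List<String> events) {
            this.events = events;
            events.add("open");
        }

        // Hypothetical method standing in for an OCR call such as doImageOcr.
        public void recognize(boolean fail) {
            events.add("ocr");
            if (fail) {
                throw new IllegalStateException("simulated OCR failure");
            }
        }

        @Override
        public void close() {
            events.add("close");
        }
    }

    // Runs one OCR call inside try-with-resources and returns the event order.
    public static List<String> run(boolean fail) {
        List<String> events = new ArrayList<>();
        try (StandInEngine engine = new StandInEngine(events)) {
            engine.recognize(fail);
        } catch (IllegalStateException e) {
            // The resource has already been closed when this block runs.
            events.add("caught");
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [open, ocr, close]
        System.out.println(run(true));  // prints [open, ocr, close, caught]
    }
}
```

The second call shows that close() runs even when the OCR call throws, which is the property that prevents the native allocations from leaking.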

Constructor Summary

Constructors

Constructor

Description

OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor)

Create a new OCR engine with the provided predictors.

OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor, OnnxTrEngineProperties properties)

Create a new OCR engine with the provided predictors.

OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IRecognitionPredictor recognitionPredictor)

Create a new OCR engine with the provided predictors, without text orientation prediction.

Method Summary

Modifier and Type

Method

Description

void

close()

void

createTxtFile(List<File> inputImages, File txtFile)

Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path.

void

createTxtFile(List<File> inputImages, File txtFile, OcrProcessContext ocrProcessContext)

Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path.

Map<Integer,List<TextInfo>>

doImageOcr(File input)

Reads data from the provided input image file and returns the retrieved data in the format described below.

Map<Integer,List<TextInfo>>

doImageOcr(File input, OcrProcessContext ocrProcessContext)

Reads data from the provided input image file and returns the retrieved data in the format described below.

PdfOcrMetaInfoContainer

getMetaInfoContainer()

Gets the container with meta info.

com.itextpdf.commons.actions.data.ProductData

getProductData()

Gets object containing information about the product.

boolean

isTaggingSupported()

Checks whether tagging is supported by the OCR engine.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- OnnxTrOcrEngine
  
  public OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor)
  
  Create a new OCR engine with the provided predictors.
  
  Parameters:
  
  detectionPredictor - text detector. For an input image it outputs a list of text boxes
  
  orientationPredictor - text orientation predictor. For an input image, which is a tight crop of text, it outputs its orientation in 90-degree steps. Can be null, in which case all text is assumed to be upright
  
  recognitionPredictor - text recognizer. For an input image, which is a tight crop of text, it outputs the displayed string
- OnnxTrOcrEngine
  
  public OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IOrientationPredictor orientationPredictor, IRecognitionPredictor recognitionPredictor, OnnxTrEngineProperties properties)
  
  Create a new OCR engine with the provided predictors.
  
  Parameters:
  
  detectionPredictor - text detector. For an input image it outputs a list of text boxes
  
  orientationPredictor - text orientation predictor. For an input image, which is a tight crop of text, it outputs its orientation in 90-degree steps. Can be null, in which case all text is assumed to be upright
  
  recognitionPredictor - text recognizer. For an input image, which is a tight crop of text, it outputs the displayed string
  
  properties - set of properties
- OnnxTrOcrEngine
  
  public OnnxTrOcrEngine(IDetectionPredictor detectionPredictor, IRecognitionPredictor recognitionPredictor)
  
  Create a new OCR engine with the provided predictors, without text orientation prediction.
  
  Parameters:
  
  detectionPredictor - text detector. For an input image it outputs a list of text boxes
  
  recognitionPredictor - text recognizer. For an input image, which is a tight crop of text, it outputs the displayed string
Method Details
- close
  
  public void close() throws Exception
  
  Specified by:
  
  close in interface AutoCloseable
  
  Throws:
  
  Exception
- doImageOcr
  
  public Map<Integer,List<TextInfo>> doImageOcr(File input)
  
  Reads data from the provided input image file and returns the retrieved data in the format described below.
  
  Specified by:
  
  doImageOcr in interface IOcrEngine
  
  Parameters:
  
  input - input image File
  
  Returns:
  
  Map where each key is an Integer representing the page number and each value is a List of TextInfo elements; each TextInfo element contains a word or a line and its 4 coordinates (bbox)
- doImageOcr
  
  public Map<Integer,List<TextInfo>> doImageOcr(File input, OcrProcessContext ocrProcessContext)
  
  Reads data from the provided input image file and returns the retrieved data in the format described below.
  
  Specified by:
  
  doImageOcr in interface IOcrEngine
  
  Parameters:
  
  input - input image File
  
  ocrProcessContext - OCR processing context
  
  Returns:
  
  Map where each key is an Integer representing the page number and each value is a List of TextInfo elements; each TextInfo element contains a word or a line and its 4 coordinates (bbox)
- createTxtFile
  
  public void createTxtFile(List<File> inputImages, File txtFile)
  
  Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. Note that a human reading order is not guaranteed, because input images may have layouts such as multiple columns or tables.
  
  Specified by:
  
  createTxtFile in interface IOcrEngine
  
  Parameters:
  
  inputImages - List of images to be OCRed
  
  txtFile - file to be created
- createTxtFile
  
  public void createTxtFile(List<File> inputImages, File txtFile, OcrProcessContext ocrProcessContext)
  
  Performs OCR using the provided IOcrEngine for the given list of input images and saves the output to a text file at the provided path. Note that a human reading order is not guaranteed, because input images may have layouts such as multiple columns or tables.
  
  Specified by:
  
  createTxtFile in interface IOcrEngine
  
  Parameters:
  
  inputImages - List of images to be OCRed
  
  txtFile - file to be created
  
  ocrProcessContext - OCR processing context
- isTaggingSupported
  
  public boolean isTaggingSupported()
  
  Checks whether tagging is supported by the OCR engine.
  
  Specified by:
  
  isTaggingSupported in interface IOcrEngine
  
  Returns:
  
  true if tagging is supported by the engine, false otherwise
- getMetaInfoContainer
  
  public PdfOcrMetaInfoContainer getMetaInfoContainer()
  
  Gets the container with meta info.
  
  Specified by:
  
  getMetaInfoContainer in interface IProductAware
  
  Returns:
  
  the held meta info container
- getProductData
  
  public com.itextpdf.commons.actions.data.ProductData getProductData()
  
  Gets object containing information about the product.
  
  Specified by:
  
  getProductData in interface IProductAware
  
  Returns:
  
  product data
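
The return value of doImageOcr described above is a map from page number to a list of recognized items. As a rough illustration of consuming that shape, the sketch below joins the recognized text for each page in page order. Item is a hypothetical stand-in for TextInfo that models only the recognized text, and textPerPage and sample are illustrative helper names, not library methods; the real TextInfo also carries the bounding box.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class OcrResultSketch {
    // Hypothetical stand-in for TextInfo (assumption): only the text is modeled.
    public static class Item {
        public final String text;

        public Item(String text) {
            this.text = text;
        }
    }

    // Joins the text of every item on a page with spaces.
    // A TreeMap is used so that pages come out in ascending page-number order.
    public static Map<Integer, String> textPerPage(Map<Integer, List<Item>> result) {
        Map<Integer, String> pages = new TreeMap<>();
        for (Map.Entry<Integer, List<Item>> entry : result.entrySet()) {
            String joined = entry.getValue().stream()
                    .map(item -> item.text)
                    .collect(Collectors.joining(" "));
            pages.put(entry.getKey(), joined);
        }
        return pages;
    }

    // Builds a small result map by hand, in the shape doImageOcr is documented to return.
    public static Map<Integer, List<Item>> sample() {
        Map<Integer, List<Item>> result = new HashMap<>();
        result.put(2, Arrays.asList(new Item("Second"), new Item("page")));
        result.put(1, Arrays.asList(new Item("Hello"), new Item("world")));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(textPerPage(sample())); // prints {1=Hello world, 2=Second page}
    }
}
```

Joining items in list order is a simplification: as the createTxtFile description notes, a human reading order is not guaranteed for multi-column layouts or tables.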
