Class OnnxDetectionPredictor

java.lang.Object
com.itextpdf.pdfocr.onnx.AbstractOnnxPredictor<BufferedImage, List>
com.itextpdf.pdfocr.onnx.detection.OnnxDetectionPredictor
All Implemented Interfaces:
IDetectionPredictor, IPredictor<BufferedImage,List>, AutoCloseable

public class OnnxDetectionPredictor extends AbstractOnnxPredictor<BufferedImage,List> implements IDetectionPredictor
A text detection predictor implementation that uses ONNX Runtime and its ML models to find where text is located in an image.
  • Constructor Details

    • OnnxDetectionPredictor

      public OnnxDetectionPredictor (OnnxDetectionPredictorProperties properties)
      Creates a text detection predictor with the specified properties.
      Parameters:
      properties - properties of the predictor
  • Method Details

    • dbNet

      public static OnnxDetectionPredictor dbNet (String modelPath)
      Creates a new text detection predictor using an existing pre-trained DBNet model, stored on disk.

      This can be used to load pre-trained DBNet models from OnnxTR. These models output boxes of words.

      Parameters:
      modelPath - path to the pre-trained model
      Returns:
      a new predictor with the DBNet model loaded
    • dbNet

      public static OnnxDetectionPredictor dbNet (String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
      Creates a new text detection predictor using an existing pre-trained DBNet model, stored on disk.

      This can be used to load pre-trained DBNet models from OnnxTR. These models output boxes of words.

      Parameters:
      modelPath - path to the pre-trained model
      ortSessionOptionsCreator - the ONNX runtime session options creator
      Returns:
      a new predictor with the DBNet model loaded
    • fast

      public static OnnxDetectionPredictor fast (String modelPath)
      Creates a new text detection predictor using an existing pre-trained FAST model, stored on disk. This is the default text detection model in OnnxTR.

      This can be used to load pre-trained FAST models from OnnxTR. These models output boxes of words.

      Parameters:
      modelPath - path to the pre-trained model
      Returns:
      a new predictor with the FAST model loaded
    • fast

      public static OnnxDetectionPredictor fast (String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
      Creates a new text detection predictor using an existing pre-trained FAST model, stored on disk. This is the default text detection model in OnnxTR.

      This can be used to load pre-trained FAST models from OnnxTR. These models output boxes of words.

      Parameters:
      modelPath - path to the pre-trained model
      ortSessionOptionsCreator - the ONNX runtime session options creator
      Returns:
      a new predictor with the FAST model loaded
    • linkNet

      public static OnnxDetectionPredictor linkNet (String modelPath)
      Creates a new text detection predictor using an existing pre-trained LinkNet model, stored on disk.

      This can be used to load pre-trained LinkNet models from OnnxTR. These models output boxes of words.

      Parameters:
      modelPath - path to the pre-trained model
      Returns:
      a new predictor with the LinkNet model loaded
    • linkNet

      public static OnnxDetectionPredictor linkNet (String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
      Creates a new text detection predictor using an existing pre-trained LinkNet model, stored on disk.

      This can be used to load pre-trained LinkNet models from OnnxTR. These models output boxes of words.

      Parameters:
      modelPath - path to the pre-trained model
      ortSessionOptionsCreator - the ONNX runtime session options creator
      Returns:
      a new predictor with the LinkNet model loaded
    • paddleOcr

      public static OnnxDetectionPredictor paddleOcr (String modelDirPath) throws IOException
      Creates a new text detection predictor using an existing pre-trained PaddleOCR model, stored on disk.

      Only models in the ONNX format are supported. Since, by default, PaddleOCR does not provide models in the ONNX format, you might need to do a model conversion yourself. Check out this page for information on how to do that.

      This method expects the directory to contain two files:

      • inference.onnx - the inference model in the ONNX format
      • inference.yml - the configuration file for the model in YAML

      This method can be used to load PaddleOCR text detection models. These models output boxes of text lines. Make sure you choose a recognition model that can handle spaces.

      Parameters:
      modelDirPath - path to the directory with the model and its configuration file
      Returns:
      a new predictor with the PaddleOCR model loaded
      Throws:
      IOException - if any I/O error occurs while loading the configuration file
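
      Because this overload expects a fixed directory layout, it can help to verify the layout before creating the predictor. The following is a minimal sketch using only the Java standard library; the class name PaddleModelDirCheck and the method missingFiles are hypothetical helpers, not part of the pdfOCR API. Only the two file names come from the description above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of pdfOCR: checks the directory layout
// that paddleOcr(String modelDirPath) is documented to expect.
final class PaddleModelDirCheck {
    // File names taken from the method description.
    private static final String[] REQUIRED = {"inference.onnx", "inference.yml"};

    // Returns the names of the required files that are absent from modelDir.
    static List<String> missingFiles(Path modelDir) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!Files.isRegularFile(modelDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Directory to check; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> missing = missingFiles(dir);
        System.out.println(missing.isEmpty()
                ? "Model directory looks complete"
                : "Missing files: " + missing);
    }
}
```

      If the returned list is empty, the directory path can then be passed to paddleOcr(String modelDirPath).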
    • paddleOcr

      public static OnnxDetectionPredictor paddleOcr (String modelDirPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) throws IOException
      Creates a new text detection predictor using an existing pre-trained PaddleOCR model, stored on disk.

      Only models in the ONNX format are supported. Since, by default, PaddleOCR does not provide models in the ONNX format, you might need to do a model conversion yourself. Check out this page for information on how to do that.

      This method expects the directory to contain two files:

      • inference.onnx - the inference model in the ONNX format
      • inference.yml - the configuration file for the model in YAML

      This method can be used to load PaddleOCR text detection models. These models output boxes of text lines. Make sure you choose a recognition model that can handle spaces.

      Parameters:
      modelDirPath - path to the directory with the model and its configuration file
      ortSessionOptionsCreator - the ONNX runtime session options creator
      Returns:
      a new predictor with the PaddleOCR model loaded
      Throws:
      IOException - if any I/O error occurs while loading the configuration file
    • paddleOcr

      public static OnnxDetectionPredictor paddleOcr (String modelPath, String configPath) throws IOException
      Creates a new text detection predictor using an existing pre-trained PaddleOCR model, stored on disk.

      Only models in the ONNX format are supported. Since, by default, PaddleOCR does not provide models in the ONNX format, you might need to do a model conversion yourself. Check out this page for information on how to do that.

      This method can be used to load PaddleOCR text detection models. These models output boxes of text lines. Make sure you choose a recognition model that can handle spaces.

      Parameters:
      modelPath - path to the pre-trained model in the ONNX format
      configPath - path to the configuration file for the model
      Returns:
      a new predictor with the PaddleOCR model loaded
      Throws:
      IOException - if any I/O error occurs while loading the configuration file
    • paddleOcr

      public static OnnxDetectionPredictor paddleOcr (String modelPath, String configPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) throws IOException
      Creates a new text detection predictor using an existing pre-trained PaddleOCR model, stored on disk.

      Only models in the ONNX format are supported. Since, by default, PaddleOCR does not provide models in the ONNX format, you might need to do a model conversion yourself. Check out this page for information on how to do that.

      This method can be used to load PaddleOCR text detection models. These models output boxes of text lines. Make sure you choose a recognition model that can handle spaces.

      Parameters:
      modelPath - path to the pre-trained model in the ONNX format
      configPath - path to the configuration file for the model
      ortSessionOptionsCreator - the ONNX runtime session options creator
      Returns:
      a new predictor with the PaddleOCR model loaded
      Throws:
      IOException - if any I/O error occurs while loading the configuration file
    • easyOcr

      public static OnnxDetectionPredictor easyOcr (String modelPath)
      Creates a new text detection predictor using an existing pre-trained EasyOCR CRAFT model, stored on disk.

      Only models in the ONNX format are supported. Since, by default, EasyOCR does not provide models in the ONNX format, you might need to do a model conversion yourself.

      This can be used to load pre-trained CRAFT models from EasyOCR. These models output boxes of text lines. Make sure you choose a recognition model that can handle spaces.

      Parameters:
      modelPath - path to the pre-trained model
      Returns:
      a new predictor with the EasyOCR CRAFT model loaded
    • easyOcr

      public static OnnxDetectionPredictor easyOcr (String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
      Creates a new text detection predictor using an existing pre-trained EasyOCR CRAFT model, stored on disk.

      Only models in the ONNX format are supported. Since, by default, EasyOCR does not provide models in the ONNX format, you might need to do a model conversion yourself.

      This can be used to load pre-trained CRAFT models from EasyOCR. These models output boxes of text lines. Make sure you choose a recognition model that can handle spaces.

      Parameters:
      modelPath - path to the pre-trained model
      ortSessionOptionsCreator - the ONNX runtime session options creator
      Returns:
      a new predictor with the EasyOCR CRAFT model loaded
    • getProperties

      public OnnxDetectionPredictorProperties getProperties()
      Returns the text detection predictor properties.
      Returns:
      the text detection predictor properties
    • toInputBuffer

      protected FloatBufferMdArray toInputBuffer (List<BufferedImage> batch)
      Converts predictor inputs to a batched multidimensional-array (MD-array) input buffer for the ONNX Runtime model.
      Specified by:
      toInputBuffer in class AbstractOnnxPredictor<BufferedImage,List>
      Parameters:
      batch - batch of raw predictor inputs
      Returns:
      batched model input MD-array buffer
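
      To illustrate what a batched input buffer can look like, the sketch below packs a batch of equally sized images into a flat float array in NCHW order (batch, channel, row, column). This is an assumption for illustration only: the class BatchPackingSketch and the method toNchw are hypothetical, and the actual resizing, normalization, and channel order applied by this predictor are not described here.

```java
import java.awt.image.BufferedImage;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration, not the pdfOCR implementation.
final class BatchPackingSketch {
    // Packs images of identical size into one flat array laid out as
    // [batch][channel (R, G, B)][row][column], with values scaled to 0..1.
    static float[] toNchw(List<BufferedImage> batch, int width, int height) {
        int plane = width * height;
        float[] data = new float[batch.size() * 3 * plane];
        for (int n = 0; n < batch.size(); n++) {
            BufferedImage img = batch.get(n);
            if (img.getWidth() != width || img.getHeight() != height) {
                throw new IllegalArgumentException("Image " + n + " has an unexpected size");
            }
            int base = n * 3 * plane;
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    int rgb = img.getRGB(x, y);
                    int offset = y * width + x;
                    data[base + offset] = ((rgb >> 16) & 0xFF) / 255f;         // red plane
                    data[base + plane + offset] = ((rgb >> 8) & 0xFF) / 255f;  // green plane
                    data[base + 2 * plane + offset] = (rgb & 0xFF) / 255f;     // blue plane
                }
            }
        }
        return data;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0xFF0000); // pure red pixel
        img.setRGB(1, 0, 0x0000FF); // pure blue pixel
        float[] packed = toNchw(Collections.singletonList(img), 2, 1);
        System.out.println("Packed values: " + packed.length);
    }
}
```

      Grouping each channel into its own plane is a common input layout for image models, which is why it is used in this sketch.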
    • fromOutputBuffer

      protected List<List> fromOutputBuffer (List<BufferedImage> inputBatch, FloatBufferMdArray outputBatch)
      Converts the batched multidimensional-array (MD-array) output buffer of the ONNX Runtime model to a list of predictor outputs.
      Specified by:
      fromOutputBuffer in class AbstractOnnxPredictor<BufferedImage,List>
      Parameters:
      inputBatch - list of raw predictor inputs, matching the output
      outputBatch - batched model output MD-array buffer
      Returns:
      a list of predictor outputs
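
      As a rough illustration of turning a model output into a box, the sketch below thresholds a per-pixel text probability map and returns the bounding rectangle of all pixels above the threshold. Everything here is an assumption: the class ProbabilityMapSketch, the method boundingBox, and the idea that the output is a single probability map are hypothetical, and the post-processing this predictor actually performs is not described here. A real detector would typically split the map into separate regions and emit one box per region.

```java
// Hypothetical illustration, not the pdfOCR implementation.
final class ProbabilityMapSketch {
    // Returns {minX, minY, maxX, maxY} covering every cell whose value is at
    // least the threshold, or null when no cell reaches the threshold.
    static int[] boundingBox(float[] map, int width, int height, float threshold) {
        int minX = width;
        int minY = height;
        int maxX = -1;
        int maxY = -1;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (map[y * width + x] >= threshold) {
                    minX = Math.min(minX, x);
                    minY = Math.min(minY, y);
                    maxX = Math.max(maxX, x);
                    maxY = Math.max(maxY, y);
                }
            }
        }
        return maxX < 0 ? null : new int[] {minX, minY, maxX, maxY};
    }

    public static void main(String[] args) {
        // 3x3 map, row-major, with two confident cells in the middle row.
        float[] map = {
            0.0f, 0.1f, 0.0f,
            0.2f, 0.9f, 0.8f,
            0.0f, 0.1f, 0.0f
        };
        int[] box = boundingBox(map, 3, 3, 0.5f);
        System.out.println(java.util.Arrays.toString(box));
    }
}
```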