java.lang.Object

com.itextpdf.pdfocr.onnxtr.AbstractOnnxPredictor<BufferedImage, String>

com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor

All Implemented Interfaces: IPredictor<BufferedImage,String>, IRecognitionPredictor, AutoCloseable

public class OnnxRecognitionPredictor extends AbstractOnnxPredictor<BufferedImage,String> implements IRecognitionPredictor

A text recognition predictor implementation that uses ONNX Runtime and its ML models to recognize text characters in an image.

Constructor Summary

Constructors

Constructor

Description

OnnxRecognitionPredictor(OnnxRecognitionPredictorProperties properties)

Creates a text recognition predictor with the specified properties.
Method Summary

Modifier and Type

Method

Description

static OnnxRecognitionPredictor

crnnMobileNetV3(String modelPath)

Creates a new text recognition predictor using an existing pre-trained CRNN model with a MobileNet V3 backbone, stored on disk.

static OnnxRecognitionPredictor

crnnVgg16(String modelPath)

Creates a new text recognition predictor using an existing pre-trained CRNN model with a VGG-16 backbone, stored on disk.

protected List<String>

fromOutputBuffer(List<BufferedImage> inputBatch, FloatBufferMdArray outputBatch)

Converts a batched ONNX Runtime model output MD-array buffer to a list of predictor outputs.

OnnxRecognitionPredictorProperties

getProperties()

Returns the text recognition predictor properties.

static OnnxRecognitionPredictor

master(String modelPath)

Creates a new text recognition predictor using an existing pre-trained MASTER model, stored on disk.

static OnnxRecognitionPredictor

parSeq(String modelPath)

Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk.

static OnnxRecognitionPredictor

parSeq(String modelPath, Vocabulary vocabulary, int additionalTokens)

Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk.

static OnnxRecognitionPredictor

sar(String modelPath)

Creates a new text recognition predictor using an existing pre-trained SAR model, stored on disk.

protected FloatBufferMdArray

toInputBuffer(List<BufferedImage> batch)

Converts predictor inputs to a batched ONNX Runtime model input MD-array buffer.

static OnnxRecognitionPredictor

viTstr(String modelPath)

Creates a new text recognition predictor using an existing pre-trained ViTSTR model, stored on disk.

Methods inherited from class com.itextpdf.pdfocr.onnxtr.AbstractOnnxPredictor
close, predict

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface java.lang.AutoCloseable
close

Methods inherited from interface com.itextpdf.pdfocr.onnxtr.IPredictor
predict, predict

Constructor Details
- OnnxRecognitionPredictor
  
  public OnnxRecognitionPredictor (OnnxRecognitionPredictorProperties properties)
  
  Creates a text recognition predictor with the specified properties.
  
  Parameters:
  
  properties - properties of the predictor
Method Details
- crnnVgg16
  
  public static OnnxRecognitionPredictor crnnVgg16 (String modelPath)
  Creates a new text recognition predictor using an existing pre-trained CRNN model with a VGG-16 backbone, stored on disk. This is the default text recognition model in OnnxTR.
  This can be used to load the following models from OnnxTR:
  
  crnn_vgg16_bn
  
  crnn_vgg16_bn (8-bit quantized)
  Parameters:
  
  modelPath - path to the pre-trained model
  
  Returns:
  
  a new predictor object with the CRNN model loaded with a VGG-16 backbone
- crnnMobileNetV3
  
  public static OnnxRecognitionPredictor crnnMobileNetV3 (String modelPath)
  Creates a new text recognition predictor using an existing pre-trained CRNN model with a MobileNet V3 backbone, stored on disk.
  This can be used to load the following models from OnnxTR:
  
  crnn_mobilenet_v3_large
  
  crnn_mobilenet_v3_large (8-bit quantized)
  
  crnn_mobilenet_v3_small
  
  crnn_mobilenet_v3_small (8-bit quantized)
  Parameters:
  
  modelPath - path to the pre-trained model
  
  Returns:
  
  a new predictor object with the CRNN model loaded with a MobileNet V3 backbone
- master
  
  public static OnnxRecognitionPredictor master (String modelPath)
  Creates a new text recognition predictor using an existing pre-trained MASTER model, stored on disk.
  This can be used to load the following models from OnnxTR:
  
  MASTER
  
  MASTER (8-bit quantized)
  Parameters:
  
  modelPath - path to the pre-trained model
  
  Returns:
  
  a new predictor object with the MASTER model loaded
- parSeq
  
  public static OnnxRecognitionPredictor parSeq (String modelPath)
  Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk.
  This can be used to load the following models from OnnxTR:
  
  parseq
  
  parseq (8-bit quantized)
  Parameters:
  
  modelPath - path to the pre-trained model
  
  Returns:
  
  a new predictor object with the PARSeq model loaded
- parSeq
  
  public static OnnxRecognitionPredictor parSeq (String modelPath, Vocabulary vocabulary, int additionalTokens)
  Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk.
  This can be used to load the following models from OnnxTR:
  
  parseq
  
  parseq (8-bit quantized)
  Parameters:
  
  modelPath - path to the pre-trained model
  
  vocabulary - vocabulary used for the model output (without special tokens)
  
  additionalTokens - number of additional tokens in the total vocabulary after the end-of-string token
  
  Returns:
  
  a new predictor object with the PARSeq model loaded
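
The relationship between vocabulary and additionalTokens can be illustrated with a small greedy decoder. This is a minimal sketch and not the library's implementation: it assumes the class layout implied by the parameter descriptions (vocabulary characters first, then one end-of-string token, then additionalTokens extra special tokens), it uses a plain String as a hypothetical stand-in for Vocabulary, and the class name EosDecodeSketch and the float[][] score layout are illustrative only.

```java
final class EosDecodeSketch {
    /**
     * Greedy decoding over per-step class scores.
     * Assumed class layout (not confirmed by the library):
     * [vocabulary characters..., end-of-string, additional special tokens...].
     */
    static String decode(float[][] scores, String vocabulary, int additionalTokens) {
        int eos = vocabulary.length();              // index right after the last character
        int classes = eos + 1 + additionalTokens;   // total vocabulary size
        StringBuilder text = new StringBuilder();
        for (float[] step : scores) {
            if (step.length != classes) {
                throw new IllegalArgumentException("expected " + classes + " classes per step");
            }
            int best = 0;
            for (int c = 1; c < step.length; c++) {
                if (step[c] > step[best]) {
                    best = c;
                }
            }
            if (best == eos) {
                break;                              // stop at end-of-string
            }
            if (best < eos) {
                text.append(vocabulary.charAt(best));
            }
            // indices above eos are additional special tokens and are skipped
        }
        return text.toString();
    }
}
```

With this layout, a two-character vocabulary and additionalTokens = 2 give five classes per step, and everything predicted after the end-of-string token is ignored.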
- sar
  
  public static OnnxRecognitionPredictor sar (String modelPath)
  Creates a new text recognition predictor using an existing pre-trained SAR model, stored on disk.
  This can be used to load the following models from OnnxTR:
  
  sar_resnet31
  
  sar_resnet31 (8-bit quantized)
  Parameters:
  
  modelPath - path to the pre-trained model
  
  Returns:
  
  a new predictor object with the SAR model loaded
- viTstr
  
  public static OnnxRecognitionPredictor viTstr (String modelPath)
  Creates a new text recognition predictor using an existing pre-trained ViTSTR model, stored on disk.
  This can be used to load the following models from OnnxTR:
  
  vitstr_base
  
  vitstr_base (8-bit quantized)
  
  vitstr_small
  
  vitstr_small (8-bit quantized)
  Parameters:
  
  modelPath - path to the pre-trained model
  
  Returns:
  
  a new predictor object with the ViTSTR model loaded
- getProperties
  
  public OnnxRecognitionPredictorProperties getProperties()
  
  Returns the text recognition predictor properties.
  
  Returns:
  
  the text recognition predictor properties
- toInputBuffer
  
  protected FloatBufferMdArray toInputBuffer (List<BufferedImage> batch)
  
  Converts predictor inputs to a batched ONNX Runtime model input MD-array buffer.
  
  Specified by:
  
  toInputBuffer in class AbstractOnnxPredictor<BufferedImage,String>
  
  Parameters:
  
  batch - batch of raw predictor inputs
  
  Returns:
  
  batched model input MD-array buffer
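
To give a feel for what batching images into a multidimensional float buffer involves, here is a minimal sketch using only the Java standard library. It is not the library's implementation: the [batch, channel, height, width] layout, the scaling to the 0..1 range, and the names InputBufferSketch and toNchw are assumptions made for illustration. The actual input shape, resizing, and normalization are determined by the predictor properties and are not described on this page.

```java
import java.awt.image.BufferedImage;
import java.util.List;

final class InputBufferSketch {
    /**
     * Packs equally sized RGB images into one flat float array,
     * ordered as [batch, channel, height, width] (assumed layout).
     */
    static float[] toNchw(List<BufferedImage> batch, int height, int width) {
        int plane = height * width;
        float[] data = new float[batch.size() * 3 * plane];
        for (int n = 0; n < batch.size(); n++) {
            BufferedImage image = batch.get(n);
            if (image.getWidth() != width || image.getHeight() != height) {
                throw new IllegalArgumentException(
                        "image " + n + " must be " + width + "x" + height);
            }
            int base = n * 3 * plane;
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    int rgb = image.getRGB(x, y);   // packed ARGB value
                    int offset = y * width + x;
                    data[base + offset] = ((rgb >> 16) & 0xFF) / 255f;            // red plane
                    data[base + plane + offset] = ((rgb >> 8) & 0xFF) / 255f;     // green plane
                    data[base + 2 * plane + offset] = (rgb & 0xFF) / 255f;        // blue plane
                }
            }
        }
        return data;
    }
}
```

The sketch rejects images of the wrong size instead of resizing them, which keeps the layout logic visible; a real predictor would typically resize or pad each input first.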
- fromOutputBuffer
  
  protected List<String> fromOutputBuffer (List<BufferedImage> inputBatch, FloatBufferMdArray outputBatch)
  
  Converts a batched ONNX Runtime model output MD-array buffer to a list of predictor outputs.
  
  Specified by:
  
  fromOutputBuffer in class AbstractOnnxPredictor<BufferedImage,String>
  
  Parameters:
  
  inputBatch - list of raw predictor inputs, matching the output
  
  outputBatch - batched model output MD-array buffer
  
  Returns:
  
  a list of predictor outputs
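
As an illustration of turning model output scores into strings, the following sketch shows greedy CTC decoding, a technique commonly paired with CRNN recognizers. It is an assumption that any particular model here is decoded this way, and the details are hypothetical: the blank class is assumed to be the last index, a plain String stands in for the vocabulary, and CtcGreedySketch is an illustrative name rather than part of the library.

```java
final class CtcGreedySketch {
    /**
     * Greedy CTC decoding for one sequence.
     * Each row of scores holds vocabulary.length() + 1 values;
     * the last index is assumed to be the CTC blank class.
     */
    static String decode(float[][] scores, String vocabulary) {
        int blank = vocabulary.length();
        StringBuilder text = new StringBuilder();
        int previous = blank;
        for (float[] step : scores) {
            if (step.length != blank + 1) {
                throw new IllegalArgumentException("expected " + (blank + 1) + " classes per step");
            }
            // pick the highest scoring class for this time step
            int best = 0;
            for (int c = 1; c < step.length; c++) {
                if (step[c] > step[best]) {
                    best = c;
                }
            }
            // emit a character only when it is not blank and not a repeat
            if (best != blank && best != previous) {
                text.append(vocabulary.charAt(best));
            }
            previous = best;
        }
        return text.toString();
    }
}
```

Repeated predictions collapse into one character unless a blank separates them, which is how a doubled letter such as "aa" can still be produced.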
