Class OnnxRecognitionPredictor
All Implemented Interfaces:
IPredictor<BufferedImage,String>, IRecognitionPredictor, AutoCloseable

Constructor Summary
| Constructor | Description |
| --- | --- |
| OnnxRecognitionPredictor | Creates a text recognition predictor with the specified properties. |

Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| static OnnxRecognitionPredictor | crnnMobileNetV3(String modelPath) | Creates a new text recognition predictor using an existing pre-trained CRNN model with a MobileNet V3 backbone, stored on disk. |
| static OnnxRecognitionPredictor | crnnMobileNetV3(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) | Creates a new text recognition predictor using an existing pre-trained CRNN model with a MobileNet V3 backbone, stored on disk. |
| static OnnxRecognitionPredictor | crnnVgg16(String modelPath) | Creates a new text recognition predictor using an existing pre-trained CRNN model with a VGG-16 backbone, stored on disk. |
| static OnnxRecognitionPredictor | crnnVgg16(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) | Creates a new text recognition predictor using an existing pre-trained CRNN model with a VGG-16 backbone, stored on disk. |
| static OnnxRecognitionPredictor | easyOcr(String modelPath, EasyOcrMapper labelMapper) | Creates a new text recognition predictor using an existing pre-trained EasyOCR model, stored on disk. |
| static OnnxRecognitionPredictor | easyOcr(String modelPath, EasyOcrMapper labelMapper, IOrtSessionOptionsCreator ortSessionOptionsCreator) | Creates a new text recognition predictor using an existing pre-trained EasyOCR model, stored on disk. |
| protected List<String> | fromOutputBuffer(List<BufferedImage> inputBatch, FloatBufferMdArray outputBatch) | Converts ONNX runtime model batched output MD-array buffer to a list of predictor outputs. |
| | getProperties() | Returns the text recognition predictor properties. |
| static OnnxRecognitionPredictor | master(String modelPath) | Creates a new text recognition predictor using an existing pre-trained MASTER model, stored on disk. |
| static OnnxRecognitionPredictor | master(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) | Creates a new text recognition predictor using an existing pre-trained MASTER model, stored on disk. |
| static OnnxRecognitionPredictor | paddleOcr(String modelDirPath) | Creates a new text recognition predictor using an existing pre-trained PaddleOCR model, stored on disk. |
| static OnnxRecognitionPredictor | paddleOcr(String modelDirPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) | Creates a new text recognition predictor using an existing pre-trained PaddleOCR model, stored on disk. |
| static OnnxRecognitionPredictor | paddleOcr(String modelPath, String configPath) | Creates a new text recognition predictor using an existing pre-trained PaddleOCR model, stored on disk. |
| static OnnxRecognitionPredictor | paddleOcr(String modelPath, String configPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) | Creates a new text recognition predictor using an existing pre-trained PaddleOCR model, stored on disk. |
| static OnnxRecognitionPredictor | parSeq(String modelPath) | Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk. |
| static OnnxRecognitionPredictor | parSeq(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) | Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk. |
| static OnnxRecognitionPredictor | parSeq(String modelPath, Vocabulary vocabulary, int additionalTokens) | Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk. |
| static OnnxRecognitionPredictor | parSeq(String modelPath, Vocabulary vocabulary, int additionalTokens, IOrtSessionOptionsCreator ortSessionOptionsCreator) | Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk. |
| Iterator<String> | predict(Iterator<BufferedImage> inputs) | Performs prediction on a sequence of input items. |
| static OnnxRecognitionPredictor | sar(String modelPath) | Creates a new text recognition predictor using an existing pre-trained SAR model, stored on disk. |
| static OnnxRecognitionPredictor | sar(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) | Creates a new text recognition predictor using an existing pre-trained SAR model, stored on disk. |
| protected FloatBufferMdArray | toInputBuffer(List<BufferedImage> batch) | Converts predictor inputs to an ONNX runtime model batched input MD-array buffer. |
| static OnnxRecognitionPredictor | viTstr(String modelPath) | Creates a new text recognition predictor using an existing pre-trained ViTSTR model, stored on disk. |
| static OnnxRecognitionPredictor | viTstr(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) | Creates a new text recognition predictor using an existing pre-trained ViTSTR model, stored on disk. |

Methods inherited from class com.itextpdf.pdfocr.onnx.AbstractOnnxPredictor: close
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.lang.AutoCloseable: close
Methods inherited from interface com.itextpdf.pdfocr.onnx.IPredictor: predict

Constructor Details
OnnxRecognitionPredictor
Creates a text recognition predictor with the specified properties.
Parameters:
- properties - properties of the predictor

Method Details
crnnVgg16
public static OnnxRecognitionPredictor crnnVgg16(String modelPath)
Creates a new text recognition predictor using an existing pre-trained CRNN model with a VGG-16 backbone, stored on disk. This is the default text recognition model in OnnxTR.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new predictor object with the CRNN model loaded with a VGG-16 backbone

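Because these models do not emit spaces, line text has to be rebuilt by the caller from word-level results. The following is a minimal sketch of that step, assuming the recognized words are already in reading order; `WordJoiner` and `joinWords` are hypothetical names and not part of the library.

```java
import java.util.List;
import java.util.stream.Collectors;

final class WordJoiner {
    /**
     * Joins word-level recognition results into one line of text.
     * Assumption: the list is already sorted in reading order.
     */
    static String joinWords(List<String> words) {
        return words.stream()
                .map(String::trim)             // drop stray whitespace around each word
                .filter(w -> !w.isEmpty())     // skip empty recognition results
                .collect(Collectors.joining(" "));
    }
}
```

For example, `WordJoiner.joinWords(List.of("Hello", "", " world "))` yields `Hello world`.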
crnnVgg16
public static OnnxRecognitionPredictor crnnVgg16(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition predictor using an existing pre-trained CRNN model with a VGG-16 backbone, stored on disk. This is the default text recognition model in OnnxTR.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new predictor object with the CRNN model loaded with a VGG-16 backbone

crnnMobileNetV3
public static OnnxRecognitionPredictor crnnMobileNetV3(String modelPath)
Creates a new text recognition predictor using an existing pre-trained CRNN model with a MobileNet V3 backbone, stored on disk.
This can be used to load the following models from OnnxTR:
- crnn_mobilenet_v3_large
- crnn_mobilenet_v3_large (8-bit quantized)
- crnn_mobilenet_v3_small
- crnn_mobilenet_v3_small (8-bit quantized)
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new predictor object with the CRNN model loaded with a MobileNet V3 backbone

crnnMobileNetV3
public static OnnxRecognitionPredictor crnnMobileNetV3(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition predictor using an existing pre-trained CRNN model with a MobileNet V3 backbone, stored on disk.
This can be used to load the following models from OnnxTR:
- crnn_mobilenet_v3_large
- crnn_mobilenet_v3_large (8-bit quantized)
- crnn_mobilenet_v3_small
- crnn_mobilenet_v3_small (8-bit quantized)
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new predictor object with the CRNN model loaded with a MobileNet V3 backbone

master
public static OnnxRecognitionPredictor master(String modelPath)
Creates a new text recognition predictor using an existing pre-trained MASTER model, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new predictor object with the MASTER model loaded

master
public static OnnxRecognitionPredictor master(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition predictor using an existing pre-trained MASTER model, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new predictor object with the MASTER model loaded

parSeq
public static OnnxRecognitionPredictor parSeq(String modelPath)
Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new predictor object with the PARSeq model loaded

parSeq
public static OnnxRecognitionPredictor parSeq(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new predictor object with the PARSeq model loaded

parSeq
public static OnnxRecognitionPredictor parSeq(String modelPath, Vocabulary vocabulary, int additionalTokens)
Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- vocabulary - vocabulary used for the model output (without special tokens)
- additionalTokens - number of additional tokens in the total vocabulary after the end-of-string token
Returns:
- a new predictor object with the PARSeq model loaded

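The relationship between `vocabulary` and `additionalTokens` can be illustrated with a small sketch. It assumes the class indices are laid out as the vocabulary characters first, then the end-of-string token, then the additional tokens, which is one reading of the parameter description above and is not confirmed by it. `EosLayoutSketch` is a hypothetical helper, and a plain `String` stands in for the library's `Vocabulary` type.

```java
final class EosLayoutSketch {
    /** Total number of classes under the assumed layout: characters, EOS, additional tokens. */
    static int totalClasses(String vocabulary, int additionalTokens) {
        return vocabulary.length() + 1 + additionalTokens;
    }

    /** Maps class indices to text, stopping at EOS and skipping the tokens placed after it. */
    static String decode(int[] classIds, String vocabulary) {
        int eos = vocabulary.length(); // assumption: EOS directly follows the characters
        StringBuilder text = new StringBuilder();
        for (int id : classIds) {
            if (id == eos) {
                break;                 // end of string reached
            }
            if (id > eos) {
                continue;              // additional token (for example padding), ignored here
            }
            text.append(vocabulary.charAt(id));
        }
        return text.toString();
    }
}
```

With the vocabulary `"abc"` and two additional tokens, this layout has six classes, and the index sequence `{2, 0, 4, 1, 3, 0}` decodes to `cab`.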
parSeq
public static OnnxRecognitionPredictor parSeq(String modelPath, Vocabulary vocabulary, int additionalTokens, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- vocabulary - vocabulary used for the model output (without special tokens)
- additionalTokens - number of additional tokens in the total vocabulary after the end-of-string token
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new predictor object with the PARSeq model loaded

sar
public static OnnxRecognitionPredictor sar(String modelPath)
Creates a new text recognition predictor using an existing pre-trained SAR model, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new predictor object with the SAR model loaded

sar
public static OnnxRecognitionPredictor sar(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition predictor using an existing pre-trained SAR model, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new predictor object with the SAR model loaded

viTstr
public static OnnxRecognitionPredictor viTstr(String modelPath)
Creates a new text recognition predictor using an existing pre-trained ViTSTR model, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new predictor object with the ViTSTR model loaded

viTstr
public static OnnxRecognitionPredictor viTstr(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition predictor using an existing pre-trained ViTSTR model, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new predictor object with the ViTSTR model loaded

paddleOcr
public static OnnxRecognitionPredictor paddleOcr(String modelDirPath) throws IOException
Creates a new text recognition predictor using an existing pre-trained PaddleOCR model, stored on disk.
Only models in the ONNX format are supported. Since, by default, PaddleOCR does not provide models in the ONNX format, you might need to do a model conversion yourself. Check out this page for information on how to do that.
This method expects the directory to contain two files:
- inference.onnx - the inference model in the ONNX format
- inference.yml - the configuration file for the model in YAML
This method can be used to load the following PaddleOCR models:
- PP-OCRv5_server_rec
- PP-OCRv5_mobile_rec
- PP-OCRv4_server_rec_doc
- PP-OCRv4_mobile_rec
- PP-OCRv4_server_rec
- PP-OCRv3_mobile_rec
- ch_SVTRv2_rec
- ch_RepSVTR_rec
- en_PP-OCRv5_mobile_rec
- en_PP-OCRv4_mobile_rec
- en_PP-OCRv3_mobile_rec
- korean_PP-OCRv5_mobile_rec
- latin_PP-OCRv5_mobile_rec
- eslav_PP-OCRv5_mobile_rec
- th_PP-OCRv5_mobile_rec
- el_PP-OCRv5_mobile_rec
- arabic_PP-OCRv5_mobile_rec
- cyrillic_PP-OCRv5_mobile_rec
- devanagari_PP-OCRv5_mobile_rec
- te_PP-OCRv5_mobile_rec
- ta_PP-OCRv5_mobile_rec
- korean_PP-OCRv3_mobile_rec
- japan_PP-OCRv3_mobile_rec
- chinese_cht_PP-OCRv3_mobile_rec
- te_PP-OCRv3_mobile_rec
- ka_PP-OCRv3_mobile_rec
- ta_PP-OCRv3_mobile_rec
- latin_PP-OCRv3_mobile_rec
- arabic_PP-OCRv3_mobile_rec
- cyrillic_PP-OCRv3_mobile_rec
- devanagari_PP-OCRv3_mobile_rec
These models can handle spaces.
Parameters:
- modelDirPath - path to the directory with the model and its configuration file
Returns:
- a new predictor object with the PaddleOCR model loaded
Throws:
- IOException - if any I/O error occurs while loading the configuration file

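The directory layout expected by the directory-based `paddleOcr` overloads (`inference.onnx` plus `inference.yml`) can be checked before the predictor is created, which gives a clearer error than a failure deep inside model loading. This is a minimal sketch using only `java.nio.file`; `PaddleModelDirCheck` and `requireModelFiles` are hypothetical names and not part of the library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

final class PaddleModelDirCheck {
    // File names taken from the method description above.
    private static final String[] REQUIRED = {"inference.onnx", "inference.yml"};

    /** Throws NoSuchFileException if one of the two expected files is missing. */
    static void requireModelFiles(Path modelDir) throws IOException {
        for (String name : REQUIRED) {
            Path file = modelDir.resolve(name);
            if (!Files.isRegularFile(file)) {
                throw new NoSuchFileException(file.toString());
            }
        }
    }
}
```

A caller would presumably run `PaddleModelDirCheck.requireModelFiles(Path.of(modelDirPath))` first and then pass the same `modelDirPath` to `paddleOcr`.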
paddleOcr
public static OnnxRecognitionPredictor paddleOcr(String modelDirPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) throws IOException
Creates a new text recognition predictor using an existing pre-trained PaddleOCR model, stored on disk.
Only models in the ONNX format are supported. Since, by default, PaddleOCR does not provide models in the ONNX format, you might need to do a model conversion yourself. Check out this page for information on how to do that.
This method expects the directory to contain two files:
- inference.onnx - the inference model in the ONNX format
- inference.yml - the configuration file for the model in YAML
This method can be used to load the following PaddleOCR models:
- PP-OCRv5_server_rec
- PP-OCRv5_mobile_rec
- PP-OCRv4_server_rec_doc
- PP-OCRv4_mobile_rec
- PP-OCRv4_server_rec
- PP-OCRv3_mobile_rec
- ch_SVTRv2_rec
- ch_RepSVTR_rec
- en_PP-OCRv5_mobile_rec
- en_PP-OCRv4_mobile_rec
- en_PP-OCRv3_mobile_rec
- korean_PP-OCRv5_mobile_rec
- latin_PP-OCRv5_mobile_rec
- eslav_PP-OCRv5_mobile_rec
- th_PP-OCRv5_mobile_rec
- el_PP-OCRv5_mobile_rec
- arabic_PP-OCRv5_mobile_rec
- cyrillic_PP-OCRv5_mobile_rec
- devanagari_PP-OCRv5_mobile_rec
- te_PP-OCRv5_mobile_rec
- ta_PP-OCRv5_mobile_rec
- korean_PP-OCRv3_mobile_rec
- japan_PP-OCRv3_mobile_rec
- chinese_cht_PP-OCRv3_mobile_rec
- te_PP-OCRv3_mobile_rec
- ka_PP-OCRv3_mobile_rec
- ta_PP-OCRv3_mobile_rec
- latin_PP-OCRv3_mobile_rec
- arabic_PP-OCRv3_mobile_rec
- cyrillic_PP-OCRv3_mobile_rec
- devanagari_PP-OCRv3_mobile_rec
These models can handle spaces.
Parameters:
- modelDirPath - path to the directory with the model and its configuration file
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new predictor object with the PaddleOCR model loaded
Throws:
- IOException - if any I/O error occurs while loading the configuration file

paddleOcr
public static OnnxRecognitionPredictor paddleOcr(String modelPath, String configPath) throws IOException
Creates a new text recognition predictor using an existing pre-trained PaddleOCR model, stored on disk.
Only models in the ONNX format are supported. Since, by default, PaddleOCR does not provide models in the ONNX format, you might need to do a model conversion yourself. Check out this page for information on how to do that.
This method can be used to load the following PaddleOCR models:
- PP-OCRv5_server_rec
- PP-OCRv5_mobile_rec
- PP-OCRv4_server_rec_doc
- PP-OCRv4_mobile_rec
- PP-OCRv4_server_rec
- PP-OCRv3_mobile_rec
- ch_SVTRv2_rec
- ch_RepSVTR_rec
- en_PP-OCRv5_mobile_rec
- en_PP-OCRv4_mobile_rec
- en_PP-OCRv3_mobile_rec
- korean_PP-OCRv5_mobile_rec
- latin_PP-OCRv5_mobile_rec
- eslav_PP-OCRv5_mobile_rec
- th_PP-OCRv5_mobile_rec
- el_PP-OCRv5_mobile_rec
- arabic_PP-OCRv5_mobile_rec
- cyrillic_PP-OCRv5_mobile_rec
- devanagari_PP-OCRv5_mobile_rec
- te_PP-OCRv5_mobile_rec
- ta_PP-OCRv5_mobile_rec
- korean_PP-OCRv3_mobile_rec
- japan_PP-OCRv3_mobile_rec
- chinese_cht_PP-OCRv3_mobile_rec
- te_PP-OCRv3_mobile_rec
- ka_PP-OCRv3_mobile_rec
- ta_PP-OCRv3_mobile_rec
- latin_PP-OCRv3_mobile_rec
- arabic_PP-OCRv3_mobile_rec
- cyrillic_PP-OCRv3_mobile_rec
- devanagari_PP-OCRv3_mobile_rec
These models can handle spaces.
Parameters:
- modelPath - path to the pre-trained model in the ONNX format
- configPath - path to the configuration file for the model
Returns:
- a new predictor object with the PaddleOCR model loaded
Throws:
- IOException - if any I/O error occurs while loading the configuration file

paddleOcr
public static OnnxRecognitionPredictor paddleOcr(String modelPath, String configPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) throws IOException
Creates a new text recognition predictor using an existing pre-trained PaddleOCR model, stored on disk.
Only models in the ONNX format are supported. Since, by default, PaddleOCR does not provide models in the ONNX format, you might need to do a model conversion yourself. Check out this page for information on how to do that.
This method can be used to load the following PaddleOCR models:
- PP-OCRv5_server_rec
- PP-OCRv5_mobile_rec
- PP-OCRv4_server_rec_doc
- PP-OCRv4_mobile_rec
- PP-OCRv4_server_rec
- PP-OCRv3_mobile_rec
- ch_SVTRv2_rec
- ch_RepSVTR_rec
- en_PP-OCRv5_mobile_rec
- en_PP-OCRv4_mobile_rec
- en_PP-OCRv3_mobile_rec
- korean_PP-OCRv5_mobile_rec
- latin_PP-OCRv5_mobile_rec
- eslav_PP-OCRv5_mobile_rec
- th_PP-OCRv5_mobile_rec
- el_PP-OCRv5_mobile_rec
- arabic_PP-OCRv5_mobile_rec
- cyrillic_PP-OCRv5_mobile_rec
- devanagari_PP-OCRv5_mobile_rec
- te_PP-OCRv5_mobile_rec
- ta_PP-OCRv5_mobile_rec
- korean_PP-OCRv3_mobile_rec
- japan_PP-OCRv3_mobile_rec
- chinese_cht_PP-OCRv3_mobile_rec
- te_PP-OCRv3_mobile_rec
- ka_PP-OCRv3_mobile_rec
- ta_PP-OCRv3_mobile_rec
- latin_PP-OCRv3_mobile_rec
- arabic_PP-OCRv3_mobile_rec
- cyrillic_PP-OCRv3_mobile_rec
- devanagari_PP-OCRv3_mobile_rec
These models can handle spaces.
Parameters:
- modelPath - path to the pre-trained model in the ONNX format
- configPath - path to the configuration file for the model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new predictor object with the PaddleOCR model loaded
Throws:
- IOException - if any I/O error occurs while loading the configuration file

easyOcr
public static OnnxRecognitionPredictor easyOcr(String modelPath, EasyOcrMapper labelMapper)
Creates a new text recognition predictor using an existing pre-trained EasyOCR model, stored on disk.
Only models in the ONNX format are supported. Since, by default, EasyOCR does not provide models in the ONNX format, you might need to do a model conversion yourself.
This method can be used to load the following EasyOCR models:
- english_g2
- latin_g2
- zh_sim_g2
- japanese_g2
- korean_g2
- telugu_g2
- kannada_g2
- latin_g1
- zh_sim_g1
- zh_tra_g1
- japanese_g1
- korean_g1
- thai_g1
- devanagari_g1
- cyrillic_g1
- arabic_g1
- bengali_g1
These models can handle spaces.
Parameters:
- modelPath - path to the pre-trained model in the ONNX format
- labelMapper - label mapper to use for the model
Returns:
- a new predictor object with the EasyOCR model loaded

easyOcr
public static OnnxRecognitionPredictor easyOcr(String modelPath, EasyOcrMapper labelMapper, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition predictor using an existing pre-trained EasyOCR model, stored on disk.
Only models in the ONNX format are supported. Since, by default, EasyOCR does not provide models in the ONNX format, you might need to do a model conversion yourself.
This method can be used to load the following EasyOCR models:
- english_g2
- latin_g2
- zh_sim_g2
- japanese_g2
- korean_g2
- telugu_g2
- kannada_g2
- latin_g1
- zh_sim_g1
- zh_tra_g1
- japanese_g1
- korean_g1
- thai_g1
- devanagari_g1
- cyrillic_g1
- arabic_g1
- bengali_g1
These models can handle spaces.
Parameters:
- modelPath - path to the pre-trained model in the ONNX format
- labelMapper - label mapper to use for the model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new predictor object with the EasyOCR model loaded

getProperties
Returns the text recognition predictor properties.
Returns:
- the text recognition predictor properties

predict
public Iterator<String> predict(Iterator<BufferedImage> inputs)
Performs prediction on a sequence of input items.
This method consumes the provided Iterator of inputs and produces an Iterator of outputs, typically yielding one result per input item.
Specified by:
- predict in interface IPredictor<BufferedImage,String>
Overrides:
- predict in class AbstractOnnxPredictor<BufferedImage,String>
Parameters:
- inputs - an Iterator over the input items to be processed
Returns:
- an Iterator over the predicted output items

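The iterator-in, iterator-out contract of `predict` means the caller decides when results are pulled. The sketch below drains such an output iterator into a list. To stay self-contained it takes the prediction step as a `java.util.function.Function` instead of referencing the library class; `PredictAll` and `collect` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

final class PredictAll {
    /**
     * Feeds the inputs to an iterator-based prediction function and
     * gathers every output it yields, in order.
     */
    static <T, R> List<R> collect(Iterator<T> inputs,
                                  Function<Iterator<T>, Iterator<R>> predict) {
        List<R> results = new ArrayList<>();
        Iterator<R> outputs = predict.apply(inputs);
        while (outputs.hasNext()) {
            results.add(outputs.next());
        }
        return results;
    }
}
```

With a real predictor instance, the call would presumably look like `PredictAll.collect(images.iterator(), predictor::predict)`, assuming `predict` returns an `Iterator<String>` as the type parameters of `IPredictor<BufferedImage,String>` suggest. Since the class implements `AutoCloseable`, creating the predictor in a try-with-resources statement is a natural fit.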
toInputBuffer
protected FloatBufferMdArray toInputBuffer(List<BufferedImage> batch)
Converts predictor inputs to an ONNX runtime model batched input MD-array buffer.
Specified by:
- toInputBuffer in class AbstractOnnxPredictor<BufferedImage,String>
Parameters:
- batch - batch of raw predictor inputs
Returns:
- batched model input MD-array buffer

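To give a feel for what a batched image-to-buffer conversion involves, here is a rough sketch that packs same-sized RGB images into a flat array in `[batch, channel, height, width]` order, scaled to the range 0 to 1. It does not describe this class's actual preprocessing: the layout, the scaling, and the absence of resizing or mean/std normalization are all assumptions, and `ImageBatchSketch` is a hypothetical name.

```java
import java.awt.image.BufferedImage;
import java.util.List;

final class ImageBatchSketch {
    /** Packs images of identical size into a flat [batch, 3, height, width] float array. */
    static float[] toNchw(List<BufferedImage> batch, int height, int width) {
        int plane = height * width;                 // elements per channel plane
        float[] data = new float[batch.size() * 3 * plane];
        for (int n = 0; n < batch.size(); n++) {
            BufferedImage img = batch.get(n);
            if (img.getWidth() != width || img.getHeight() != height) {
                throw new IllegalArgumentException("image " + n + " has an unexpected size");
            }
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    int rgb = img.getRGB(x, y);     // packed ARGB value
                    int base = n * 3 * plane + y * width + x;
                    data[base] = ((rgb >> 16) & 0xFF) / 255f;            // red plane
                    data[base + plane] = ((rgb >> 8) & 0xFF) / 255f;     // green plane
                    data[base + 2 * plane] = (rgb & 0xFF) / 255f;        // blue plane
                }
            }
        }
        return data;
    }
}
```

A real implementation would typically also resize each image to the model's input shape and apply model-specific normalization before filling the buffer.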
fromOutputBuffer
protected List<String> fromOutputBuffer(List<BufferedImage> inputBatch, FloatBufferMdArray outputBatch)
Converts ONNX runtime model batched output MD-array buffer to a list of predictor outputs.
Specified by:
- fromOutputBuffer in class AbstractOnnxPredictor<BufferedImage,String>
Parameters:
- inputBatch - list of raw predictor inputs, matching the output
- outputBatch - batched model output MD-array buffer
Returns:
- a list of predictor outputs

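One common way to turn per-step class scores into text is greedy CTC decoding, which is typical for CRNN-style recognizers. The sketch below shows that technique in isolation. Whether this class decodes its output this way is not stated in this reference, so treat it as an illustration only. It assumes a `[steps][classes]` score matrix with the blank class at the last index; `GreedyCtcSketch` is a hypothetical name and a plain `String` stands in for the vocabulary.

```java
final class GreedyCtcSketch {
    /** Greedy CTC decoding: argmax per step, collapse repeats, drop blanks. */
    static String decode(float[][] scores, String vocabulary) {
        int blank = vocabulary.length(); // assumption: blank is the last class index
        StringBuilder text = new StringBuilder();
        int previous = -1;
        for (float[] step : scores) {
            if (step.length != vocabulary.length() + 1) {
                throw new IllegalArgumentException("expected vocabulary size + 1 classes per step");
            }
            int best = 0;
            for (int c = 1; c < step.length; c++) {
                if (step[c] > step[best]) {
                    best = c;            // index of the highest score at this step
                }
            }
            if (best != previous && best != blank) {
                text.append(vocabulary.charAt(best));
            }
            previous = best;             // remembered so that repeated steps collapse
        }
        return text.toString();
    }
}
```

For the vocabulary `"ab"`, the step-wise winners a, a, blank, a, b decode to `aab`: the first repeated a collapses, and the blank separates it from the next a.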