Class OnnxRecognitionPredictorProperties
Properties of an ONNX text recognition predictor. It contains a path to the model, model input properties, and a model output post-processor.
-
Field Summary
Fields inherited from class com.itextpdf.pdfocr.onnx.AbstractOnnxPredictorProperties
DEFAULT_ORT_SESSION_CREATOR, inputProperties, modelPath, ortSessionOptionsCreator
Constructor Summary
Constructors
- OnnxRecognitionPredictorProperties(String modelPath, OnnxInputProperties inputProperties, IRecognitionPostProcessor postProcessor): Creates new text recognition predictor properties.
- OnnxRecognitionPredictorProperties(String modelPath, OnnxInputProperties inputProperties, IRecognitionPostProcessor postProcessor, boolean splitImages): Creates new text recognition predictor properties.
- OnnxRecognitionPredictorProperties(String modelPath, OnnxInputProperties inputProperties, IRecognitionPostProcessor postProcessor, boolean splitImages, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates new text recognition predictor properties.
- OnnxRecognitionPredictorProperties(String modelPath, OnnxInputProperties inputProperties, IRecognitionPostProcessor postProcessor, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates new text recognition predictor properties.
Method Summary
Methods
- crnnMobileNetV3(String modelPath): Creates a new text recognition properties object for existing pre-trained CRNN models with a MobileNet V3 backbone, stored on disk.
- crnnMobileNetV3(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates a new text recognition properties object for existing pre-trained CRNN models with a MobileNet V3 backbone, stored on disk.
- crnnVgg16(String modelPath): Creates a new text recognition properties object for existing pre-trained CRNN models with a VGG-16 backbone, stored on disk.
- crnnVgg16(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates a new text recognition properties object for existing pre-trained CRNN models with a VGG-16 backbone, stored on disk.
- easyOcr(String modelPath, EasyOcrMapper labelMapper): Creates a new text recognition properties object for existing pre-trained EasyOCR models, stored on disk.
- easyOcr(String modelPath, EasyOcrMapper labelMapper, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates a new text recognition properties object for existing pre-trained EasyOCR models, stored on disk.
- boolean equals(Object)
- getPostProcessor(): Returns the ONNX model output post-processor.
- int hashCode()
- master(String modelPath): Creates a new text recognition properties object for existing pre-trained MASTER models, stored on disk.
- master(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates a new text recognition properties object for existing pre-trained MASTER models, stored on disk.
- paddleOcr(String modelDirPath): Creates a new text recognition properties object for existing pre-trained PaddleOCR models, stored on disk.
- paddleOcr(String modelDirPath, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates a new text recognition properties object for existing pre-trained PaddleOCR models, stored on disk.
- paddleOcr(String modelPath, String configPath): Creates a new text recognition properties object for existing pre-trained PaddleOCR models, stored on disk.
- paddleOcr(String modelPath, String configPath, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates a new text recognition properties object for existing pre-trained PaddleOCR models, stored on disk.
- parSeq(String modelPath): Creates a new text recognition properties object for existing pre-trained PARSeq models, stored on disk.
- parSeq(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates a new text recognition properties object for existing pre-trained PARSeq models, stored on disk.
- parSeq(String modelPath, Vocabulary vocabulary, int additionalTokens): Creates a new text recognition properties object for existing pre-trained PARSeq models, stored on disk.
- parSeq(String modelPath, Vocabulary vocabulary, int additionalTokens, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates a new text recognition properties object for existing pre-trained PARSeq models, stored on disk.
- sar(String modelPath): Creates a new text recognition properties object for existing pre-trained SAR models, stored on disk.
- sar(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates a new text recognition properties object for existing pre-trained SAR models, stored on disk.
- boolean shouldSplitImages(): Returns whether input images should be split.
- toString()
- viTstr(String modelPath): Creates a new text recognition properties object for existing pre-trained ViTSTR models, stored on disk.
- viTstr(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator): Creates a new text recognition properties object for existing pre-trained ViTSTR models, stored on disk.
Methods inherited from class com.itextpdf.pdfocr.onnx.AbstractOnnxPredictorProperties
getInputProperties, getModelPath, getOrtSessionOptionsCreator
-
Constructor Details
-
OnnxRecognitionPredictorProperties
public OnnxRecognitionPredictorProperties(String modelPath, OnnxInputProperties inputProperties, IRecognitionPostProcessor postProcessor, boolean splitImages)
Creates new text recognition predictor properties.
Parameters:
- modelPath - path to the ONNX model to load
- inputProperties - ONNX model input properties
- postProcessor - ONNX model output post-processor
- splitImages - whether input images to the ML model should be split into smaller ones with better aspect ratios
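The splitImages flag concerns wide inputs, such as a whole text line, which can be distorted when resized to a recognition model's fixed input shape. As a rough illustration of what splitting for a better aspect ratio can mean, the sketch below cuts an image of a given size into equal-width column ranges so that no piece exceeds a chosen width-to-height ratio. This is not the library's actual algorithm; the class name, the method name, and the ratio of 4.0 are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: this is NOT how pdfOCR splits images internally.
class AspectRatioSplitSketch {
    /**
     * Splits the horizontal extent of an image into half-open [startX, endX)
     * column ranges so that each range is at most maxRatio * height pixels wide.
     */
    static List<int[]> splitColumns(int width, int height, double maxRatio) {
        if (width <= 0 || height <= 0 || maxRatio <= 0) {
            throw new IllegalArgumentException("width, height and maxRatio must be positive");
        }
        // Smallest number of equal slices that keeps every slice within the ratio.
        int slices = Math.max(1, (int) Math.ceil(width / (maxRatio * height)));
        List<int[]> ranges = new ArrayList<>();
        for (int i = 0; i < slices; i++) {
            int start = (int) ((long) width * i / slices);
            int end = (int) ((long) width * (i + 1) / slices);
            ranges.add(new int[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 320x32 text line with an assumed maximum ratio of 4.0.
        for (int[] range : splitColumns(320, 32, 4.0)) {
            System.out.println(range[0] + " to " + range[1]);
        }
    }
}
```

An image that already fits within the ratio comes back as a single range covering the full width, which corresponds to passing it to the model unsplit.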
-
OnnxRecognitionPredictorProperties
public OnnxRecognitionPredictorProperties(String modelPath, OnnxInputProperties inputProperties, IRecognitionPostProcessor postProcessor, boolean splitImages, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates new text recognition predictor properties.
Parameters:
- modelPath - path to the ONNX model to load
- inputProperties - ONNX model input properties
- postProcessor - ONNX model output post-processor
- splitImages - whether input images to the ML model should be split into smaller ones with better aspect ratios
- ortSessionOptionsCreator - ONNX runtime session options creator
-
OnnxRecognitionPredictorProperties
public OnnxRecognitionPredictorProperties(String modelPath, OnnxInputProperties inputProperties, IRecognitionPostProcessor postProcessor)
Creates new text recognition predictor properties. Images will be split before passing them to the ML model.
Parameters:
- modelPath - path to the ONNX model to load
- inputProperties - ONNX model input properties
- postProcessor - ONNX model output post-processor
-
OnnxRecognitionPredictorProperties
public OnnxRecognitionPredictorProperties(String modelPath, OnnxInputProperties inputProperties, IRecognitionPostProcessor postProcessor, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates new text recognition predictor properties.
Parameters:
- modelPath - path to the ONNX model to load
- inputProperties - ONNX model input properties
- postProcessor - ONNX model output post-processor
- ortSessionOptionsCreator - ONNX runtime session options creator
-
-
Method Details
-
crnnVgg16
public static OnnxRecognitionPredictorProperties crnnVgg16(String modelPath)
Creates a new text recognition properties object for existing pre-trained CRNN models with a VGG-16 backbone, stored on disk. This is the default text recognition model in OnnxTR.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new text recognition properties object for a CRNN model with a VGG-16 backbone
-
crnnVgg16
public static OnnxRecognitionPredictorProperties crnnVgg16(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition properties object for existing pre-trained CRNN models with a VGG-16 backbone, stored on disk. This is the default text recognition model in OnnxTR.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new text recognition properties object for a CRNN model with a VGG-16 backbone
-
crnnMobileNetV3
public static OnnxRecognitionPredictorProperties crnnMobileNetV3(String modelPath)
Creates a new text recognition properties object for existing pre-trained CRNN models with a MobileNet V3 backbone, stored on disk.
This can be used to load the following models from OnnxTR:
- crnn_mobilenet_v3_large
- crnn_mobilenet_v3_large (8-bit quantized)
- crnn_mobilenet_v3_small
- crnn_mobilenet_v3_small (8-bit quantized)
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new text recognition properties object for a CRNN model with a MobileNet V3 backbone
-
crnnMobileNetV3
public static OnnxRecognitionPredictorProperties crnnMobileNetV3(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition properties object for existing pre-trained CRNN models with a MobileNet V3 backbone, stored on disk.
This can be used to load the following models from OnnxTR:
- crnn_mobilenet_v3_large
- crnn_mobilenet_v3_large (8-bit quantized)
- crnn_mobilenet_v3_small
- crnn_mobilenet_v3_small (8-bit quantized)
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new text recognition properties object for a CRNN model with a MobileNet V3 backbone
-
master
public static OnnxRecognitionPredictorProperties master(String modelPath)
Creates a new text recognition properties object for existing pre-trained MASTER models, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new text recognition properties object for a MASTER model
-
master
public static OnnxRecognitionPredictorProperties master(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition properties object for existing pre-trained MASTER models, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new text recognition properties object for a MASTER model
-
parSeq
public static OnnxRecognitionPredictorProperties parSeq(String modelPath)
Creates a new text recognition properties object for existing pre-trained PARSeq models, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new text recognition properties object for a PARSeq model
-
parSeq
public static OnnxRecognitionPredictorProperties parSeq(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition properties object for existing pre-trained PARSeq models, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new text recognition properties object for a PARSeq model
-
parSeq
public static OnnxRecognitionPredictorProperties parSeq(String modelPath, Vocabulary vocabulary, int additionalTokens)
Creates a new text recognition properties object for existing pre-trained PARSeq models, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- vocabulary - vocabulary used for the model output (without special tokens)
- additionalTokens - number of additional tokens in the total vocabulary after the end-of-string token
Returns:
- a new text recognition properties object for a PARSeq model
-
parSeq
public static OnnxRecognitionPredictorProperties parSeq(String modelPath, Vocabulary vocabulary, int additionalTokens, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition properties object for existing pre-trained PARSeq models, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- vocabulary - vocabulary used for the model output (without special tokens)
- additionalTokens - number of additional tokens in the total vocabulary after the end-of-string token
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new text recognition properties object for a PARSeq model
-
sar
public static OnnxRecognitionPredictorProperties sar(String modelPath)
Creates a new text recognition properties object for existing pre-trained SAR models, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new text recognition properties object for a SAR model
-
sar
public static OnnxRecognitionPredictorProperties sar(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition properties object for existing pre-trained SAR models, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new text recognition properties object for a SAR model
-
viTstr
public static OnnxRecognitionPredictorProperties viTstr(String modelPath)
Creates a new text recognition properties object for existing pre-trained ViTSTR models, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
Returns:
- a new text recognition properties object for a ViTSTR model
-
viTstr
public static OnnxRecognitionPredictorProperties viTstr(String modelPath, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition properties object for existing pre-trained ViTSTR models, stored on disk.
This can be used to load the following models from OnnxTR:
These models cannot handle spaces. Make sure you choose a detection model that outputs words.
Parameters:
- modelPath - path to the pre-trained model
- ortSessionOptionsCreator - the ONNX runtime session options creator
Returns:
- a new text recognition properties object for a ViTSTR model
-
paddleOcr
public static OnnxRecognitionPredictorProperties paddleOcr(String modelDirPath) throws IOException
Creates a new text recognition properties object for existing pre-trained PaddleOCR models, stored on disk.
Only models in the ONNX format are supported. Because PaddleOCR does not provide models in the ONNX format by default, you might need to convert the model yourself. Check out this page for information on how to do that.
This method expects the directory to contain two files:
- inference.onnx - the inference model in the ONNX format
- inference.yml - the configuration file for the model in YAML
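Because a missing file in the model directory only surfaces as an error when the properties are created, it can help to check the layout up front. The following is a minimal sketch that uses only the JDK and relies on the two file names listed above; the class and method names are hypothetical and not part of the library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of pdfOCR: checks a PaddleOCR model directory layout.
class PaddleModelDirCheck {
    // File names as listed in the method description.
    static final String[] EXPECTED_FILES = {"inference.onnx", "inference.yml"};

    /** Returns the expected file names that are not present as regular files in modelDir. */
    static List<String> missingFiles(Path modelDir) {
        List<String> missing = new ArrayList<>();
        for (String name : EXPECTED_FILES) {
            if (!Files.isRegularFile(modelDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Demonstrate on a temporary directory that starts out empty.
        Path dir = Files.createTempDirectory("paddle-model");
        Files.createFile(dir.resolve("inference.onnx"));
        System.out.println("Missing: " + missingFiles(dir));
        Files.createFile(dir.resolve("inference.yml"));
        System.out.println("Missing: " + missingFiles(dir));
    }
}
```

When the returned list is empty, the directory path can be passed on as modelDirPath.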
This method can be used to load the following PaddleOCR models:
- PP-OCRv5_server_rec
- PP-OCRv5_mobile_rec
- PP-OCRv4_server_rec_doc
- PP-OCRv4_mobile_rec
- PP-OCRv4_server_rec
- PP-OCRv3_mobile_rec
- ch_SVTRv2_rec
- ch_RepSVTR_rec
- en_PP-OCRv5_mobile_rec
- en_PP-OCRv4_mobile_rec
- en_PP-OCRv3_mobile_rec
- korean_PP-OCRv5_mobile_rec
- latin_PP-OCRv5_mobile_rec
- eslav_PP-OCRv5_mobile_rec
- th_PP-OCRv5_mobile_rec
- el_PP-OCRv5_mobile_rec
- arabic_PP-OCRv5_mobile_rec
- cyrillic_PP-OCRv5_mobile_rec
- devanagari_PP-OCRv5_mobile_rec
- te_PP-OCRv5_mobile_rec
- ta_PP-OCRv5_mobile_rec
- korean_PP-OCRv3_mobile_rec
- japan_PP-OCRv3_mobile_rec
- chinese_cht_PP-OCRv3_mobile_rec
- te_PP-OCRv3_mobile_rec
- ka_PP-OCRv3_mobile_rec
- ta_PP-OCRv3_mobile_rec
- latin_PP-OCRv3_mobile_rec
- arabic_PP-OCRv3_mobile_rec
- cyrillic_PP-OCRv3_mobile_rec
- devanagari_PP-OCRv3_mobile_rec
These models can handle spaces.
Parameters:
- modelDirPath - path to the directory with the model and its configuration file
Returns:
- a new text recognition properties object for a PaddleOCR model
Throws:
- IOException - if any I/O error occurs while loading the configuration file
-
paddleOcr
public static OnnxRecognitionPredictorProperties paddleOcr(String modelDirPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) throws IOException
Creates a new text recognition properties object for existing pre-trained PaddleOCR models, stored on disk.
Only models in the ONNX format are supported. Because PaddleOCR does not provide models in the ONNX format by default, you might need to convert the model yourself. Check out this page for information on how to do that.
This method expects the directory to contain two files:
- inference.onnx - the inference model in the ONNX format
- inference.yml - the configuration file for the model in YAML
This method can be used to load the following PaddleOCR models:
- PP-OCRv5_server_rec
- PP-OCRv5_mobile_rec
- PP-OCRv4_server_rec_doc
- PP-OCRv4_mobile_rec
- PP-OCRv4_server_rec
- PP-OCRv3_mobile_rec
- ch_SVTRv2_rec
- ch_RepSVTR_rec
- en_PP-OCRv5_mobile_rec
- en_PP-OCRv4_mobile_rec
- en_PP-OCRv3_mobile_rec
- korean_PP-OCRv5_mobile_rec
- latin_PP-OCRv5_mobile_rec
- eslav_PP-OCRv5_mobile_rec
- th_PP-OCRv5_mobile_rec
- el_PP-OCRv5_mobile_rec
- arabic_PP-OCRv5_mobile_rec
- cyrillic_PP-OCRv5_mobile_rec
- devanagari_PP-OCRv5_mobile_rec
- te_PP-OCRv5_mobile_rec
- ta_PP-OCRv5_mobile_rec
- korean_PP-OCRv3_mobile_rec
- japan_PP-OCRv3_mobile_rec
- chinese_cht_PP-OCRv3_mobile_rec
- te_PP-OCRv3_mobile_rec
- ka_PP-OCRv3_mobile_rec
- ta_PP-OCRv3_mobile_rec
- latin_PP-OCRv3_mobile_rec
- arabic_PP-OCRv3_mobile_rec
- cyrillic_PP-OCRv3_mobile_rec
- devanagari_PP-OCRv3_mobile_rec
These models can handle spaces.
Parameters:
- modelDirPath - path to the directory with the model and its configuration file
- ortSessionOptionsCreator - ONNX runtime session options creator
Returns:
- a new text recognition properties object for a PaddleOCR model
Throws:
- IOException - if any I/O error occurs while loading the configuration file
-
paddleOcr
public static OnnxRecognitionPredictorProperties paddleOcr(String modelPath, String configPath) throws IOException
Creates a new text recognition properties object for existing pre-trained PaddleOCR models, stored on disk.
Only models in the ONNX format are supported. Because PaddleOCR does not provide models in the ONNX format by default, you might need to convert the model yourself. Check out this page for information on how to do that.
This method can be used to load the following PaddleOCR models:
- PP-OCRv5_server_rec
- PP-OCRv5_mobile_rec
- PP-OCRv4_server_rec_doc
- PP-OCRv4_mobile_rec
- PP-OCRv4_server_rec
- PP-OCRv3_mobile_rec
- ch_SVTRv2_rec
- ch_RepSVTR_rec
- en_PP-OCRv5_mobile_rec
- en_PP-OCRv4_mobile_rec
- en_PP-OCRv3_mobile_rec
- korean_PP-OCRv5_mobile_rec
- latin_PP-OCRv5_mobile_rec
- eslav_PP-OCRv5_mobile_rec
- th_PP-OCRv5_mobile_rec
- el_PP-OCRv5_mobile_rec
- arabic_PP-OCRv5_mobile_rec
- cyrillic_PP-OCRv5_mobile_rec
- devanagari_PP-OCRv5_mobile_rec
- te_PP-OCRv5_mobile_rec
- ta_PP-OCRv5_mobile_rec
- korean_PP-OCRv3_mobile_rec
- japan_PP-OCRv3_mobile_rec
- chinese_cht_PP-OCRv3_mobile_rec
- te_PP-OCRv3_mobile_rec
- ka_PP-OCRv3_mobile_rec
- ta_PP-OCRv3_mobile_rec
- latin_PP-OCRv3_mobile_rec
- arabic_PP-OCRv3_mobile_rec
- cyrillic_PP-OCRv3_mobile_rec
- devanagari_PP-OCRv3_mobile_rec
These models can handle spaces.
Parameters:
- modelPath - path to the pre-trained model in the ONNX format
- configPath - path to the configuration file for the model
Returns:
- a new text recognition properties object for a PaddleOCR model
Throws:
- IOException - if any I/O error occurs while loading the configuration file
-
paddleOcr
public static OnnxRecognitionPredictorProperties paddleOcr(String modelPath, String configPath, IOrtSessionOptionsCreator ortSessionOptionsCreator) throws IOException
Creates a new text recognition properties object for existing pre-trained PaddleOCR models, stored on disk.
Only models in the ONNX format are supported. Because PaddleOCR does not provide models in the ONNX format by default, you might need to convert the model yourself. Check out this page for information on how to do that.
This method can be used to load the following PaddleOCR models:
- PP-OCRv5_server_rec
- PP-OCRv5_mobile_rec
- PP-OCRv4_server_rec_doc
- PP-OCRv4_mobile_rec
- PP-OCRv4_server_rec
- PP-OCRv3_mobile_rec
- ch_SVTRv2_rec
- ch_RepSVTR_rec
- en_PP-OCRv5_mobile_rec
- en_PP-OCRv4_mobile_rec
- en_PP-OCRv3_mobile_rec
- korean_PP-OCRv5_mobile_rec
- latin_PP-OCRv5_mobile_rec
- eslav_PP-OCRv5_mobile_rec
- th_PP-OCRv5_mobile_rec
- el_PP-OCRv5_mobile_rec
- arabic_PP-OCRv5_mobile_rec
- cyrillic_PP-OCRv5_mobile_rec
- devanagari_PP-OCRv5_mobile_rec
- te_PP-OCRv5_mobile_rec
- ta_PP-OCRv5_mobile_rec
- korean_PP-OCRv3_mobile_rec
- japan_PP-OCRv3_mobile_rec
- chinese_cht_PP-OCRv3_mobile_rec
- te_PP-OCRv3_mobile_rec
- ka_PP-OCRv3_mobile_rec
- ta_PP-OCRv3_mobile_rec
- latin_PP-OCRv3_mobile_rec
- arabic_PP-OCRv3_mobile_rec
- cyrillic_PP-OCRv3_mobile_rec
- devanagari_PP-OCRv3_mobile_rec
These models can handle spaces.
Parameters:
- modelPath - path to the pre-trained model in the ONNX format
- configPath - path to the configuration file for the model
- ortSessionOptionsCreator - ONNX runtime session options creator
Returns:
- a new text recognition properties object for a PaddleOCR model
Throws:
- IOException - if any I/O error occurs while loading the configuration file
-
easyOcr
public static OnnxRecognitionPredictorProperties easyOcr(String modelPath, EasyOcrMapper labelMapper)
Creates a new text recognition properties object for existing pre-trained EasyOCR models, stored on disk.
Only models in the ONNX format are supported. Because EasyOCR does not provide models in the ONNX format by default, you might need to convert the model yourself.
This method can be used to load the following EasyOCR models:
- english_g2
- latin_g2
- zh_sim_g2
- japanese_g2
- korean_g2
- telugu_g2
- kannada_g2
- latin_g1
- zh_sim_g1
- zh_tra_g1
- japanese_g1
- korean_g1
- thai_g1
- devanagari_g1
- cyrillic_g1
- arabic_g1
- bengali_g1
These models can handle spaces.
Parameters:
- modelPath - path to the pre-trained model in the ONNX format
- labelMapper - label mapper to use for the model
Returns:
- a new text recognition properties object for an EasyOCR model
-
easyOcr
public static OnnxRecognitionPredictorProperties easyOcr(String modelPath, EasyOcrMapper labelMapper, IOrtSessionOptionsCreator ortSessionOptionsCreator)
Creates a new text recognition properties object for existing pre-trained EasyOCR models, stored on disk.
Only models in the ONNX format are supported. Because EasyOCR does not provide models in the ONNX format by default, you might need to convert the model yourself.
This method can be used to load the following EasyOCR models:
- english_g2
- latin_g2
- zh_sim_g2
- japanese_g2
- korean_g2
- telugu_g2
- kannada_g2
- latin_g1
- zh_sim_g1
- zh_tra_g1
- japanese_g1
- korean_g1
- thai_g1
- devanagari_g1
- cyrillic_g1
- arabic_g1
- bengali_g1
These models can handle spaces.
Parameters:
- modelPath - path to the pre-trained model in the ONNX format
- labelMapper - label mapper to use for the model
- ortSessionOptionsCreator - ONNX runtime session options creator
Returns:
- a new text recognition properties object for an EasyOCR model
-
getPostProcessor
Returns the ONNX model output post-processor.
Returns:
- the ONNX model output post-processor
-
shouldSplitImages
public boolean shouldSplitImages()
Returns whether input images should be split.
Returns:
- whether input images should be split
-
equals
-
hashCode
public int hashCode()
toString
-