Index


A

AbstractOnnxPredictor<T, R> - Class in com.itextpdf.pdfocr.onnxtr
Abstract predictor, based on models running over ONNX runtime.
AbstractOnnxPredictor(String, OnnxInputProperties, long[]) - Constructor for class com.itextpdf.pdfocr.onnxtr.AbstractOnnxPredictor
Creates a new abstract predictor.
AbstractPdfOcrEventHelper - Class in com.itextpdf.pdfocr
Helper class for working with events.
AbstractPdfOcrEventHelper() - Constructor for class com.itextpdf.pdfocr.AbstractPdfOcrEventHelper
 
AbstractTesseract4OcrEngine - Class in com.itextpdf.pdfocr.tesseract4
The implementation of IOcrEngine.
AbstractTesseract4OcrEngine(Tesseract4OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Creates a new engine instance configured with the provided Tesseract4OcrEngineProperties.
addCell(TableCellTreeItem) - Method in class com.itextpdf.pdfocr.structuretree.TableRowTreeItem
Add a new table cell structure tree item to the table row.
addChild(LogicalStructureTreeItem) - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
Add a child structure tree item.
addRow(TableRowTreeItem) - Method in class com.itextpdf.pdfocr.structuretree.TableTreeItem
Add a new row structure tree item to the table.
AFRIKAANS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ALBANIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ANCIENT_GREEK - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
applyConfigModuleSettings(ModulesConfigurator) - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
applyConfigModuleSettings(ModulesConfigurator) - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
applyConfigModuleSettings(ModulesConfigurator) - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
applyMappingConfiguration(MappingConfigurator) - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
applyMappingConfiguration(MappingConfigurator) - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
applyMappingConfiguration(MappingConfigurator) - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
applyRotation(ImageData) - Method in interface com.itextpdf.pdfocr.IImageRotationHandler
Apply rotation to image data.
applyRotation(ImageData) - Method in class com.itextpdf.pdfocr.tesseract4.LeptonicaImageRotationHandler
 
applySharpenOptions(OptionsConfigurator) - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
applySharpenOptions(OptionsConfigurator) - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
applySharpenOptions(OptionsConfigurator) - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
ARABIC - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ARABIC_DIACRITICS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ARABIC_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ARABIC_LETTERS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ARABIC_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
argmax(float[]) - Static method in class com.itextpdf.pdfocr.onnxtr.util.MathUtil
Returns the index of the maximum value in the given array.
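As an illustration of what an argmax over a float array computes, here is a minimal standalone sketch. It does not call MathUtil and is not the library's implementation; the class name ArgmaxSketch, the tie-breaking rule (the first maximum wins) and the exception on empty input are assumptions made for this example.

```java
/** Hypothetical standalone sketch; not the pdfOCR MathUtil implementation. */
final class ArgmaxSketch {
    private ArgmaxSketch() {
    }

    /**
     * Returns the index of the largest element.
     * Assumption: on ties, the first (lowest) index wins.
     */
    static int argmax(float[] values) {
        if (values == null || values.length == 0) {
            throw new IllegalArgumentException("values must be non-empty");
        }
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            // Strict comparison keeps the earliest index when values are equal.
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        float[] scores = {0.1f, 0.7f, 0.2f};
        System.out.println(argmax(scores)); // prints 1
    }
}
```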
ARMENIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ArtifactItem - Class in com.itextpdf.pdfocr.structuretree
This class represents an artifact structure tree item.
ASCII_LETTERS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ASCII_LOWERCASE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ASCII_UPPERCASE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
AZERBAIJANI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

B

BASQUE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BATCH_SIZE_SHOULD_BE_POSITIVE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
Batching - Class in com.itextpdf.pdfocr.onnxtr.util
Static utility class to help with batching.
BatchProcessingGenerator<T, R> - Class in com.itextpdf.pdfocr.onnxtr.util
Generator with batch processing.
BatchProcessingGenerator(Iterator, IBatchProcessor) - Constructor for class com.itextpdf.pdfocr.onnxtr.util.BatchProcessingGenerator
Creates a new generator with the provided batch iterator and processor.
BELARUSIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BENGALI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BENGALI_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BENGALI_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BENGALI_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BENGALI_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BENGALI_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BENGALI_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BENGALI_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BOSNIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BufferedImageUtil - Class in com.itextpdf.pdfocr.onnxtr.util
Additional algorithms for working with BufferedImage.
buildText(Map) - Static method in class com.itextpdf.pdfocr.util.PdfOcrTextBuilder
Constructs string output from the provided IOcrEngine.doImageOcr(java.io.File) result.
BULGARIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BURMESE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BURMESE_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BURMESE_DIACRITICS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BURMESE_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BURMESE_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BURMESE_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BURMESE_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
BY_LINES - Enum constant in enum com.itextpdf.pdfocr.onnxtr.TextPositioning
Text will be grouped by lines.
BY_LINES - Enum constant in enum com.itextpdf.pdfocr.tesseract4.TextPositioning
Text will be located by lines retrieved from the hOCR file.
BY_WORDS - Enum constant in enum com.itextpdf.pdfocr.onnxtr.TextPositioning
Text will be grouped by words.
BY_WORDS - Enum constant in enum com.itextpdf.pdfocr.tesseract4.TextPositioning
Text will be located by words retrieved from the hOCR file.
BY_WORDS_AND_LINES - Enum constant in enum com.itextpdf.pdfocr.tesseract4.TextPositioning
Similar to BY_WORDS mode, but the top and bottom of each word's bounding box are inherited from its line.

C

calculateLevenshteinDistance(String, String) - Static method in class com.itextpdf.pdfocr.onnxtr.util.MathUtil
Calculates the Levenshtein distance between two input strings.
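For readers unfamiliar with the metric, the Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other. The sketch below shows the classic two-row dynamic-programming formulation. It is a standalone illustration, not the MathUtil code; the class and method names are hypothetical.

```java
/** Hypothetical standalone sketch; not the pdfOCR MathUtil implementation. */
final class LevenshteinSketch {
    private LevenshteinSketch() {
    }

    /** Computes the edit distance between a and b using two rolling rows. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i chars of a into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                curr[j] = Math.min(substitution, Math.min(insertion, deletion));
            }
            // Swap rows so that prev always holds the most recently completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
    }
}
```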
CANNOT_ADD_DATA_TO_PDF_DOCUMENT - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
The constant CANNOT_ADD_DATA_TO_PDF_DOCUMENT.
CANNOT_BINARIZE_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_CONVERT_IMAGE_TO_GRAYSCALE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_CREATE_BUFFERED_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_CREATE_PDF_DOCUMENT - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
CANNOT_DELETE_FILE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_FIND_PATH_TO_TESSERACT_EXECUTABLE - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
CANNOT_GET_TEMPORARY_DIRECTORY - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_OCR_INPUT_FILE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_PARSE_NODE_BBOX - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_PROCESS_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_READ_DEFAULT_FONT - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
The constant CANNOT_READ_DEFAULT_FONT.
CANNOT_READ_FILE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_READ_IMAGE_METADATA - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_READ_INPUT_IMAGE - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
CANNOT_READ_INPUT_IMAGE - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
The constant CANNOT_READ_INPUT_IMAGE.
CANNOT_READ_INPUT_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_READ_INPUT_IMAGE_PARAMS - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
CANNOT_READ_PROVIDED_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
CANNOT_RESOLVE_PROVIDED_FONTS - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
CANNOT_RETRIEVE_PAGES_FROM_IMAGE - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
 
CANNOT_RETRIEVE_PAGES_FROM_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
Deprecated.
CANNOT_USE_USER_WORDS - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
CANNOT_WRITE_TO_FILE - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
CANNOT_WRITE_TO_FILE - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
Deprecated.
CATALAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
clamp(double, double, double) - Static method in class com.itextpdf.pdfocr.onnxtr.util.MathUtil
Clamps a value between a specified minimum and maximum range.
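Clamping can be sketched in a single expression. The index lists only the parameter types (double, double, double), so the parameter order used below (value, min, max) is an assumption of this example, as are the class name and the behaviour when min is greater than max. This is not the MathUtil implementation.

```java
/** Hypothetical standalone sketch; not the pdfOCR MathUtil implementation. */
final class ClampSketch {
    private ClampSketch() {
    }

    /**
     * Restricts value to the closed interval [min, max].
     * Assumption: the caller guarantees min <= max.
     */
    static double clamp(double value, double min, double max) {
        return Math.max(min, Math.min(max, value));
    }

    public static void main(String[] args) {
        System.out.println(clamp(1.5, 0.0, 1.0));  // prints 1.0
        System.out.println(clamp(-0.5, 0.0, 1.0)); // prints 0.0
    }
}
```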
close() - Method in class com.itextpdf.pdfocr.onnxtr.AbstractOnnxPredictor
close() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
com.itextpdf.pdfocr - package com.itextpdf.pdfocr
 
com.itextpdf.pdfocr.exceptions - package com.itextpdf.pdfocr.exceptions
 
com.itextpdf.pdfocr.logs - package com.itextpdf.pdfocr.logs
 
com.itextpdf.pdfocr.onnxtr - package com.itextpdf.pdfocr.onnxtr
 
com.itextpdf.pdfocr.onnxtr.actions.data - package com.itextpdf.pdfocr.onnxtr.actions.data
 
com.itextpdf.pdfocr.onnxtr.actions.events - package com.itextpdf.pdfocr.onnxtr.actions.events
 
com.itextpdf.pdfocr.onnxtr.detection - package com.itextpdf.pdfocr.onnxtr.detection
 
com.itextpdf.pdfocr.onnxtr.exceptions - package com.itextpdf.pdfocr.onnxtr.exceptions
 
com.itextpdf.pdfocr.onnxtr.orientation - package com.itextpdf.pdfocr.onnxtr.orientation
 
com.itextpdf.pdfocr.onnxtr.recognition - package com.itextpdf.pdfocr.onnxtr.recognition
 
com.itextpdf.pdfocr.onnxtr.util - package com.itextpdf.pdfocr.onnxtr.util
 
com.itextpdf.pdfocr.statistics - package com.itextpdf.pdfocr.statistics
 
com.itextpdf.pdfocr.structuretree - package com.itextpdf.pdfocr.structuretree
 
com.itextpdf.pdfocr.tesseract4 - package com.itextpdf.pdfocr.tesseract4
 
com.itextpdf.pdfocr.tesseract4.actions.data - package com.itextpdf.pdfocr.tesseract4.actions.data
 
com.itextpdf.pdfocr.tesseract4.actions.events - package com.itextpdf.pdfocr.tesseract4.actions.events
 
com.itextpdf.pdfocr.tesseract4.exceptions - package com.itextpdf.pdfocr.tesseract4.exceptions
 
com.itextpdf.pdfocr.tesseract4.logs - package com.itextpdf.pdfocr.tesseract4.logs
 
com.itextpdf.pdfocr.util - package com.itextpdf.pdfocr.util
 
COMMAND_FAILED - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
concat(Vocabulary...) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
Creates a new vocabulary by concatenating multiple ones.
COULD_NOT_FIND_CORRESPONDING_GLYPH_TO_UNICODE_CHARACTER - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
The constant COULD_NOT_FIND_CORRESPONDING_GLYPH_TO_UNICODE_CHARACTER.
CREATED_TEMPORARY_FILE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
createPdf(List, PdfWriter) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter.
createPdf(List, PdfWriter, DocumentProperties) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter.
createPdf(List, PdfWriter, DocumentProperties, IOcrProcessProperties) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter.
createPdfA(List, PdfWriter, DocumentProperties, PdfOutputIntent) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter, DocumentProperties and PdfOutputIntent.
createPdfA(List, PdfWriter, DocumentProperties, PdfOutputIntent, IOcrProcessProperties) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter, DocumentProperties and PdfOutputIntent.
createPdfA(List, PdfWriter, PdfOutputIntent) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter and PdfOutputIntent.
createPdfAFile(List, File, PdfOutputIntent) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided File and PdfOutputIntent.
createPdfFile(List, File) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided File.
createProcessImageEvent(SequenceId, IMetaInfo, EventConfirmationType) - Static method in class com.itextpdf.pdfocr.tesseract4.actions.events.PdfOcrTesseract4ProductEvent
Creates process-image event.
createProcessImageOnnxTrEvent(SequenceId, IMetaInfo, EventConfirmationType) - Static method in class com.itextpdf.pdfocr.onnxtr.actions.events.PdfOcrOnnxTrProductEvent
Creates process-image-onnxtr event.
createStatisticsAggregatorFromName(String) - Method in class com.itextpdf.pdfocr.statistics.PdfOcrOutputTypeStatisticsEvent
createTxtFile(List, File) - Method in interface com.itextpdf.pdfocr.IOcrEngine
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
createTxtFile(List, File) - Method in class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
createTxtFile(List, File) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
createTxtFile(List, File, OcrProcessContext) - Method in interface com.itextpdf.pdfocr.IOcrEngine
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
createTxtFile(List, File, OcrProcessContext) - Method in class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
createTxtFile(List, File, OcrProcessContext) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
crnnMobileNetV3(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Creates a new text recognition predictor using an existing pre-trained CRNN model with a MobileNet V3 backbone, stored on disk.
crnnMobileNetV3(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Creates a new text recognition properties object for existing pre-trained CRNN models with a MobileNet V3 backbone, stored on disk.
CrnnPostProcessor - Class in com.itextpdf.pdfocr.onnxtr.recognition
Implementation of a text recognition predictor post-processor, used for OnnxTR CRNN model outputs.
CrnnPostProcessor() - Constructor for class com.itextpdf.pdfocr.onnxtr.recognition.CrnnPostProcessor
Creates a new post-processor with the default vocabulary.
CrnnPostProcessor(Vocabulary) - Constructor for class com.itextpdf.pdfocr.onnxtr.recognition.CrnnPostProcessor
Creates a new post-processor.
crnnVgg16(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Creates a new text recognition predictor using an existing pre-trained CRNN model with a VGG-16 backbone, stored on disk.
crnnVgg16(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Creates a new text recognition properties object for existing pre-trained CRNN models with a VGG-16 backbone, stored on disk.
CROATIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
CURRENCY - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
CZECH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

D

DANISH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
DATA - Enum constant in enum com.itextpdf.pdfocr.statistics.PdfOcrOutputType
Processing of an image in the engine with data output.
dbNet(String) - Static method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictor
Creates a new text detection predictor using an existing pre-trained DBNet model, stored on disk.
dbNet(String) - Static method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictorProperties
Creates a new text detection properties object for existing pre-trained DBNet models, stored on disk.
DefaultOrientationMapper - Class in com.itextpdf.pdfocr.onnxtr.orientation
Default implementation for mapping output of a crop orientation model to TextOrientation values.
DefaultOrientationMapper() - Constructor for class com.itextpdf.pdfocr.onnxtr.orientation.DefaultOrientationMapper
Constructs a new DefaultOrientationMapper with default behavior.
DEVANAGARI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
DEVANAGARI_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
DEVANAGARI_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
DEVANAGARI_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
DEVANAGARI_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
DEVANAGARI_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
DEVANAGARI_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
DEVANAGARI_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
doImageOcr(File) - Method in interface com.itextpdf.pdfocr.IOcrEngine
Reads data from the provided input image file and returns retrieved data in the format described below.
doImageOcr(File) - Method in class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
Reads data from the provided input image file and returns retrieved data in the format described below.
doImageOcr(File) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Reads data from the provided input image file and returns retrieved data in the format described below.
doImageOcr(File, OcrProcessContext) - Method in interface com.itextpdf.pdfocr.IOcrEngine
Reads data from the provided input image file and returns retrieved data in the format described below.
doImageOcr(File, OcrProcessContext) - Method in class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
Reads data from the provided input image file and returns retrieved data in the format described below.
doImageOcr(File, OcrProcessContext) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Reads data from the provided input image file and returns retrieved data in the format described below.
doImageOcr(File, OutputFormat) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Reads data from the provided input image file and returns retrieved data as string.
doImageOcr(File, OutputFormat, OcrProcessContext) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Reads data from the provided input image file and returns retrieved data as string.
doTesseractOcr(File, File, OutputFormat) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Performs Tesseract OCR for the first (or only) page of the image.
doTesseractOcr(File, File, OutputFormat, OcrProcessContext) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Performs Tesseract OCR for the first (or only) page of the image.
DUTCH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

E

ELEM_COUNT_DOES_NOT_MATCH_SHAPE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
EndOfStringPostProcessor - Class in com.itextpdf.pdfocr.onnxtr.recognition
Implementation of a text recognition predictor post-processor, used for OnnxTR non-CRNN model outputs.
EndOfStringPostProcessor() - Constructor for class com.itextpdf.pdfocr.onnxtr.recognition.EndOfStringPostProcessor
Creates a new post-processor with the default vocabulary.
EndOfStringPostProcessor(Vocabulary) - Constructor for class com.itextpdf.pdfocr.onnxtr.recognition.EndOfStringPostProcessor
Creates a new post-processor without any additional tokens.
EndOfStringPostProcessor(Vocabulary, int) - Constructor for class com.itextpdf.pdfocr.onnxtr.recognition.EndOfStringPostProcessor
Creates a new post-processor.
ENGLISH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
equals(Object) - Method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictorProperties
equals(Object) - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
equals(Object) - Method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictorProperties
equals(Object) - Method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
equals(Object) - Method in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
ESPERANTO - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ESTONIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ETHIOPIC - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
euclideanModulo(float, float) - Static method in class com.itextpdf.pdfocr.onnxtr.util.MathUtil
Computes the Euclidean modulo (non-negative remainder) of x modulo y.
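Java's % operator keeps the sign of the dividend, so -1 % 4 evaluates to -1, whereas a Euclidean modulo always yields a non-negative remainder. The following standalone sketch shows one common way to get that behaviour. It is not the MathUtil implementation, and the class name is hypothetical; a divisor of zero is not handled and produces NaN.

```java
/** Hypothetical standalone sketch; not the pdfOCR MathUtil implementation. */
final class EuclideanModuloSketch {
    private EuclideanModuloSketch() {
    }

    /** Returns the remainder of x divided by y, shifted into [0, |y|). */
    static float euclideanModulo(float x, float y) {
        float remainder = x % y;
        // Java's % may return a negative remainder; shift it into the non-negative range.
        if (remainder < 0) {
            remainder += Math.abs(y);
        }
        return remainder;
    }

    public static void main(String[] args) {
        // Example: normalising an angle of -90 degrees into [0, 360).
        System.out.println(euclideanModulo(-90f, 360f)); // prints 270.0
    }
}
```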
EXPECTED_CHANNEL_COUNT - Static variable in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Expected channel count.
EXPECTED_SHAPE_SIZE - Static variable in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Expected shape size.
expit(float) - Static method in class com.itextpdf.pdfocr.onnxtr.util.MathUtil
Computes the sigmoid function, also known as the logistic function, for the given input.
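The logistic sigmoid is defined as 1 / (1 + e^(-x)) and maps any real input to the open interval (0, 1). A minimal standalone sketch of that formula follows. It is not the MathUtil implementation, and the class name is hypothetical.

```java
/** Hypothetical standalone sketch; not the pdfOCR MathUtil implementation. */
final class ExpitSketch {
    private ExpitSketch() {
    }

    /** Evaluates the logistic sigmoid 1 / (1 + e^(-x)) in double precision, then narrows to float. */
    static float expit(float x) {
        return (float) (1.0 / (1.0 + Math.exp(-x)));
    }

    public static void main(String[] args) {
        System.out.println(expit(0f)); // prints 0.5
    }
}
```

Large positive inputs approach 1 and large negative inputs approach 0, which is why the function is often used to turn raw model scores into probability-like values.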
extractBoxes(BufferedImage, Collection) - Static method in class com.itextpdf.pdfocr.onnxtr.util.BufferedImageUtil
Extracts sub-images from an image, based on provided rotated 4-point boxes.

F

FAILED_TO_CLOSE_ONNX_RUNTIME_SESSION - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
FAILED_TO_INIT_ONNX_RUNTIME_SESSION - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
FAILED_TO_INIT_SESSION_OPTIONS - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
FAILED_TO_LOAD_ONNXRUNTIME - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
FAILED_TO_READ_IMAGE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
fast(String) - Static method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictor
Creates a new text detection predictor using an existing pre-trained FAST model, stored on disk.
fast(String) - Static method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictorProperties
Creates a new text detection properties object for existing pre-trained FAST models, stored on disk.
FINNISH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
FloatBufferMdArray - Class in com.itextpdf.pdfocr.onnxtr
Multidimensional array with a FloatBuffer backing storage.
FloatBufferMdArray(FloatBuffer, long[]) - Constructor for class com.itextpdf.pdfocr.onnxtr.FloatBufferMdArray
Constructs a new FloatBufferMdArray with the specified data buffer and shape.
FRENCH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
FRISIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
fromOutputBuffer(List, FloatBufferMdArray) - Method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictor
Converts ONNX runtime model batched output MD-array buffer to a list of predictor outputs.
fromOutputBuffer(List, FloatBufferMdArray) - Method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictor
Converts ONNX runtime model batched output MD-array buffer to a list of predictor outputs.
fromOutputBuffer(List, FloatBufferMdArray) - Method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Converts ONNX runtime model batched output MD-array buffer to a list of predictor outputs.
fromOutputBuffer(List, FloatBufferMdArray) - Method in class com.itextpdf.pdfocr.onnxtr.AbstractOnnxPredictor
Converts ONNX runtime model batched output MD-array buffer to a list of predictor outputs.

G

GALICIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GENERIC_CYRILLIC_LETTERS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
generifyWordBBoxesByLine(Map) - Static method in class com.itextpdf.pdfocr.util.PdfOcrTextBuilder
Sorts the provided IOcrEngine.doImageOcr(java.io.File) result by lines and updates line bboxes to match the largest words.
GEORGIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GERMAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
getAccessibilityProperties() - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
Retrieve structure tree element's properties.
getAllImages(File) - Static method in class com.itextpdf.pdfocr.util.TiffImageUtil
Retrieves all images from a TIFF file.
getArrayOffset() - Method in class com.itextpdf.pdfocr.onnxtr.FloatBufferMdArray
Gets the internal offset of the provided float buffer array.
getArraySize() - Method in class com.itextpdf.pdfocr.onnxtr.FloatBufferMdArray
Gets the number of bytes available to read from the provided float buffer array.
getAvailableModuleSettings() - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
getAvailableModuleSettings() - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
getAvailableModuleSettings() - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
getBatchSize() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns input batch size.
getBboxRect() - Method in class com.itextpdf.pdfocr.TextInfo
Gets bbox coordinates.
getBlueMean() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns blue channel mean, used for normalization.
getBlueStd() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns blue channel standard deviation, used for normalization.
getChannelCount() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns input channel count.
getChildren() - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
Retrieve all child structure tree items.
getConfirmationType() - Method in class com.itextpdf.pdfocr.AbstractPdfOcrEventHelper
Returns the confirmation type of event.
getData() - Method in class com.itextpdf.pdfocr.onnxtr.FloatBufferMdArray
Returns a duplicate of the backing FloatBuffer.
getDefaultFontFamily() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Gets preferred font family to be used when selecting font from FontProvider.
getDefaultFontFamily() - Method in class com.itextpdf.pdfocr.PdfOcrFontProvider
Gets default font family.
getDefaultLanguage() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Gets the default language for OCR.
getDefaultUserWordsSuffix() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Gets default user words suffix.
getDependencies() - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
getDependencies() - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
getDependencies() - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
getDimension(int) - Method in class com.itextpdf.pdfocr.onnxtr.FloatBufferMdArray
Returns the size of the specified dimension.
getDimensionCount() - Method in class com.itextpdf.pdfocr.onnxtr.FloatBufferMdArray
Returns the number of dimensions of this multidimensional array.
getEventType() - Method in class com.itextpdf.pdfocr.onnxtr.actions.events.PdfOcrOnnxTrProductEvent
getEventType() - Method in class com.itextpdf.pdfocr.tesseract4.actions.events.PdfOcrTesseract4ProductEvent
 
getFontProvider() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Returns the FontProvider that was set previously, or a new instance of PdfOcrFontProvider if none was set.
getGreenMean() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns green channel mean, used for normalization.
getGreenStd() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns green channel standard deviation, used for normalization.
getHeight() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns input height.
getIgnoredResources() - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
getIgnoredResources() - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
getIgnoredResources() - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
getIgnoredSourceFiles() - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
getIgnoredSourceFiles() - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
getIgnoredSourceFiles() - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
getImageLayerName() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Gets name of image layer.
getImagePreprocessingOptions() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Gets Tesseract4OcrEngineProperties.imagePreprocessingOptions.
getImageRotationHandler() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Gets image rotation handler instance.
getImageType(File) - Static method in class com.itextpdf.pdfocr.util.TiffImageUtil
Gets the image type.
getInputProperties() - Method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictorProperties
Returns the ONNX model input properties.
getInputProperties() - Method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictorProperties
Returns the ONNX model input properties.
getInputProperties() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Returns the ONNX model input properties.
getInstance() - Static method in class com.itextpdf.pdfocr.onnxtr.actions.data.PdfOcrOnnxTrProductData
Getter for an instance of ProductData related to iText pdfOcr OnnxTr module.
getInstance() - Static method in class com.itextpdf.pdfocr.structuretree.ArtifactItem
Retrieve an instance of ArtifactItem.
getInstance() - Static method in class com.itextpdf.pdfocr.tesseract4.actions.data.PdfOcrTesseract4ProductData
Getter for an instance of ProductData related to iText pdfOcr Tesseract4 module.
getLanguages() - Method in class com.itextpdf.pdfocr.OcrEngineProperties
Gets list of languages required for provided images.
getLanguagesAsString() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Gets the list of languages concatenated with the "+" symbol into a string in the format required by Tesseract.
getLogicalStructureTreeItem() - Method in class com.itextpdf.pdfocr.TextInfo
Retrieves structure tree item for the text item.
getLookUpString() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
Returns the look-up string.
getMappingPriority() - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
getMappingPriority() - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
getMappingPriority() - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
getMean() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns per-channel mean, used for normalization.
getMean(int) - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns channel-specific mean, used for normalization.
getMessage() - Method in exception com.itextpdf.pdfocr.exceptions.PdfOcrException
getMessageParams() - Method in exception com.itextpdf.pdfocr.exceptions.PdfOcrException
Gets additional params for Exception message.
getMetaInfoContainer() - Method in interface com.itextpdf.pdfocr.IProductAware
Gets the container with meta info.
getMetaInfoContainer() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
Gets the container with meta info.
getMetaInfoContainer() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Gets the container with meta info.
getMinimalConfidenceLevel() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Gets minimal confidence level for HOCR line to be considered as properly recognized.
getModelPath() - Method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictorProperties
Returns the path to the ONNX model.
getModelPath() - Method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictorProperties
Returns the path to the ONNX model.
getModelPath() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Returns the path to the ONNX model.
getModuleName() - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
getModuleName() - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
getModuleName() - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
getOcrEngine() - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Gets the IOcrEngine reader object used to perform OCR.
getOcrEventHelper() - Method in class com.itextpdf.pdfocr.OcrProcessContext
Returns helper for working with events.
getOcrPdfCreatorProperties() - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Gets properties for OcrPdfCreator.
getOcrProcessProperties() - Method in class com.itextpdf.pdfocr.OcrProcessContext
Get extra OCR process properties.
getOrientation() - Method in class com.itextpdf.pdfocr.TextInfo
Gets the text orientation.
getOutputMapper() - Method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictorProperties
Returns the ONNX model output mapper.
getOverwrittenResources() - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
getOverwrittenResources() - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
getOverwrittenResources() - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
getPageSegMode() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Gets Page Segmentation Mode.
getPageSize() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Gets required size for output PDF document.
getParent() - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
Retrieve parent structure tree item.
getPathToExecutable() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4ExecutableOcrEngine
Gets path to tesseract executable.
getPathToTessData() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Gets path to directory with tess data.
getPdfLang() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Gets PDF language.
getPdfOcrStatisticsEventType() - Method in class com.itextpdf.pdfocr.statistics.PdfOcrOutputTypeStatisticsEvent
Gets the type of statistic event.
getPostProcessor() - Method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictorProperties
Returns the ONNX model output post-processor.
getPostProcessor() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Returns the ONNX model output post-processor.
getProductData() - Method in interface com.itextpdf.pdfocr.IProductAware
Gets object containing information about the product.
getProductData() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
Gets object containing information about the product.
getProductData() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
 
getProperties() - Method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictor
Returns the text detection predictor properties.
getProperties() - Method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictor
Returns the crop orientation predictor properties.
getProperties() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Returns the text recognition predictor properties.
getRedMean() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns red channel mean, used for normalization.
getRedStd() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns red channel standard deviation, used for normalization.
getScalar(int) - Method in class com.itextpdf.pdfocr.onnxtr.FloatBufferMdArray
Returns the scalar value at the specified index.
getScaleMode() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Gets scale mode for input images using available options from ScaleMode enumeration.
getSequenceId() - Method in class com.itextpdf.pdfocr.AbstractPdfOcrEventHelper
Returns the sequence id.
getShape() - Method in class com.itextpdf.pdfocr.onnxtr.FloatBufferMdArray
Returns a copy of the shape array that defines the dimensions of this multidimensional array.
getShape() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns target input shape.
getShape(int) - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns target input dimension value.
getStatisticsNames() - Method in class com.itextpdf.pdfocr.statistics.PdfOcrOutputTypeStatisticsEvent
getStd() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns per-channel standard deviation, used for normalization.
getStd(int) - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns channel-specific standard deviation, used for normalization.
getSubArray(int) - Method in class com.itextpdf.pdfocr.onnxtr.FloatBufferMdArray
Returns a sub-array representing the slice at the specified index of the first dimension.
getTempFilePath(String, String) - Static method in class com.itextpdf.pdfocr.util.PdfOcrFileUtil
Gets path to temp file in current system temporary directory.
getTesseract4OcrEngineProperties() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Gets properties for AbstractTesseract4OcrEngine.
getTesseractInstance() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4LibOcrEngine
Gets tesseract instance.
getText() - Method in class com.itextpdf.pdfocr.TextInfo
Gets text element.
getTextColor() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Gets text color in output PDF document.
getTextLayerName() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Gets name of text layer.
getTextPositioning() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxTrEngineProperties
Defines the way text is retrieved from OCR engine output using TextPositioning.
getTextPositioning() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Defines the way text is retrieved from tesseract output using TextPositioning.
getTileHeight() - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
Gets ImagePreprocessingOptions.tileHeight.
getTileWidth() - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
Gets ImagePreprocessingOptions.tileWidth.
getTitle() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Gets PDF document title.
getWidth() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns input width.
GREEK - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GREEK_EXTENDED - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GUJARATI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GUJARATI_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GUJARATI_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GUJARATI_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GUJARATI_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GUJARATI_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GUJARATI_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
GUJARATI_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

H

hashCode() - Method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictorProperties
hashCode() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
hashCode() - Method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictorProperties
hashCode() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
hashCode() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
hasNext() - Method in class com.itextpdf.pdfocr.onnxtr.util.BatchProcessingGenerator
HAUSA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
HEBREW - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
HEBREW_CANTILLATIONS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
HEBREW_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
HEBREW_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
HEBREW_SPECIALS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
HEBREW_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
HINDI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
HINDI_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
HOCR - Enum constant in enum com.itextpdf.pdfocr.tesseract4.OutputFormat
Reader will produce XHTML output compliant with the hOCR specification.
HORIZONTAL - Enum constant in enum com.itextpdf.pdfocr.TextOrientation
Horizontal text, non-rotated.
HORIZONTAL_ROTATED_180 - Enum constant in enum com.itextpdf.pdfocr.TextOrientation
Horizontal text, rotated 180 degrees counter-clockwise.
HORIZONTAL_ROTATED_270 - Enum constant in enum com.itextpdf.pdfocr.TextOrientation
Horizontal text, rotated 270 degrees counter-clockwise.
HORIZONTAL_ROTATED_90 - Enum constant in enum com.itextpdf.pdfocr.TextOrientation
Horizontal text, rotated 90 degrees counter-clockwise.
HUNGARIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

I

IBatchProcessor<T, R> - Interface in com.itextpdf.pdfocr.onnxtr.util
Batch processor mapper interface.
ICELANDIC - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
identifyOsType() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Identifies the type of the current OS and returns it (win, linux).
IDetectionPostProcessor - Interface in com.itextpdf.pdfocr.onnxtr.detection
Interface for post-processors, which convert the normalized but still raw output of an ML model and return rotated boxes with the detected objects.
IDetectionPredictor - Interface in com.itextpdf.pdfocr.onnxtr.detection
Interface for predictors, which take a full image and find text boxes on it.
IImageRotationHandler - Interface in com.itextpdf.pdfocr
Rotation information may be stored in image metadata.
IMAGE_LAYER_NAME_IS_NOT_APPLIED - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
 
ImagePreprocessingOptions - Class in com.itextpdf.pdfocr.tesseract4
Additional options applied on image preprocessing step.
ImagePreprocessingOptions() - Constructor for class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
Creates ImagePreprocessingOptions instance.
ImagePreprocessingOptions(ImagePreprocessingOptions) - Constructor for class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
Creates a new ImagePreprocessingOptions instance based on another ImagePreprocessingOptions instance (copy constructor).
INCORRECT_INPUT_IMAGE_FORMAT - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
INCORRECT_LANGUAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
INDEX_OUT_OF_BOUNDS - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
INDONESIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
initializeTesseract(OutputFormat) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4LibOcrEngine
Initializes an instance of Tesseract if it has not already been initialized or has been disposed, and sets all the required properties.
INVALID_NUMBER_OF_OUTPUTS - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
IO_EXCEPTION_OCCURRED - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
IOcrEngine - Interface in com.itextpdf.pdfocr
IOcrEngine interface is used for instantiating new OcrReader objects.
IOcrProcessProperties - Interface in com.itextpdf.pdfocr
OCR properties passed to the OCR engine as part of OcrProcessContext.
IOrientationPredictor - Interface in com.itextpdf.pdfocr.onnxtr.orientation
Interface for predictors, which take a cropped image of text and determine its orientation.
IOutputLabelMapper<T> - Interface in com.itextpdf.pdfocr.onnxtr
Interface for mapping an integer index (continuous from 0) to output values.
IPredictor<T, R> - Interface in com.itextpdf.pdfocr.onnxtr
Interface of a generic predictor.
IProductAware - Interface in com.itextpdf.pdfocr
The interface that holds information about product data and meta info.
IRecognitionPostProcessor - Interface in com.itextpdf.pdfocr.onnxtr.recognition
Interface for post-processors, which convert the raw output of an ML model and return the recognized characters as a string.
IRecognitionPredictor - Interface in com.itextpdf.pdfocr.onnxtr.recognition
Interface for predictors, which take a cropped image of text and recognize text characters on it.
IRISH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
isPreprocessingImages() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Checks whether image preprocessing is needed.
isSmoothTiling() - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
Gets ImagePreprocessingOptions.smoothTiling.
isTagged() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Retrieves information on whether the PDF document should be tagged or not.
isTaggingSupported() - Method in interface com.itextpdf.pdfocr.IOcrEngine
Checks whether tagging is supported by the OCR engine.
isTaggingSupported() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
Checks whether tagging is supported by the OCR engine.
isTaggingSupported() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
 
isTiffImage(File) - Static method in class com.itextpdf.pdfocr.util.TiffImageUtil
Checks whether image type is TIFF.
isUseTxtToImproveHocrParsing() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Gets Tesseract4OcrEngineProperties.useTxtToImproveHocrParsing.
isWindows() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Checks the current OS type.
ITALIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

J

JAPANESE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
JAVANESE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
JAVANESE_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
JAVANESE_DIACRITICS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
JAVANESE_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
JAVANESE_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
JAVANESE_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
JAVANESE_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

K

KANNADA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KANNADA_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KANNADA_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KANNADA_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KANNADA_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KANNADA_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KANNADA_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KANNADA_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KAZAKH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KHMER - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KHMER_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KHMER_DIACRITICS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KHMER_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KHMER_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KHMER_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KHMER_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KHMER_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KOREAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KURDISH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
KYRGYZ - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

L

labelDimension() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.CrnnPostProcessor
Returns the size of the output character label vector.
labelDimension() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.EndOfStringPostProcessor
Returns the size of the output character label vector.
labelDimension() - Method in interface com.itextpdf.pdfocr.onnxtr.recognition.IRecognitionPostProcessor
Returns the size of the output character label vector.
LANGUAGE_IS_NOT_IN_THE_LIST - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
LAO - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
LATIN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
LATIN_EXTENDED - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
LATVIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
LEGACY_FRENCH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
LeptonicaImageRotationHandler - Class in com.itextpdf.pdfocr.tesseract4
Leptonica based implementation of IImageRotationHandler.
LeptonicaImageRotationHandler() - Constructor for class com.itextpdf.pdfocr.tesseract4.LeptonicaImageRotationHandler
 
linkNet(String) - Static method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictor
Creates a new text detection predictor using an existing pre-trained LinkNet model, stored on disk.
linkNet(String) - Static method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictorProperties
Creates a new text detection properties object for existing pre-trained LinkNet models, stored on disk.
LITHUANIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
LogicalStructureTreeItem - Class in com.itextpdf.pdfocr.structuretree
This class represents the structure tree item of a text item put into the PDF document.
LogicalStructureTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
Instantiate a new LogicalStructureTreeItem instance.
LogicalStructureTreeItem(AccessibilityProperties) - Constructor for class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
Instantiate a new LogicalStructureTreeItem instance.
LOOK_UP_STRING_CONTAINS_2_CODE_UNITS_POINTS - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
LUXEMBOURGISH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

M

MACEDONIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
makePdfSearchable(PdfDocument) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR of all images in an input PDF document and adds recognized text on top of the images.
makePdfSearchable(PdfDocument, IOcrProcessProperties) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR of all images in an input PDF document and adds recognized text on top of the images.
makePdfSearchable(File, File) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR of all images in an input PDF file and generates searchable PDF.
makePdfSearchable(File, File, IOcrProcessProperties) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Performs OCR of all images in an input PDF file and generates searchable PDF.
MALAGASY - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MALAY - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MALAYALAM - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MALAYALAM_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MALAYALAM_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MALAYALAM_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MALAYALAM_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MALAYALAM_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MALAYALAM_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MALTESE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MAORI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
map(int) - Method in interface com.itextpdf.pdfocr.onnxtr.IOutputLabelMapper
Returns value, which is mapped to the specified index.
map(int) - Method in class com.itextpdf.pdfocr.onnxtr.orientation.DefaultOrientationMapper
Returns value, which is mapped to the specified index.
map(int) - Method in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
Returns character, which is mapped to the specified index in the lookup string.
MARATHI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
master(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Creates a new text recognition predictor using an existing pre-trained MASTER model, stored on disk.
master(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Creates a new text recognition properties object for existing pre-trained MASTER models, stored on disk.
MathUtil - Class in com.itextpdf.pdfocr.onnxtr.util
Additional math functions.
MAX_SHOULD_NOT_BE_LESS_THAN_MIN - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
mobileNetV3(String) - Static method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictor
Creates a new crop orientation predictor using an existing pre-trained MobileNetV3 model, stored on disk.
mobileNetV3(String) - Static method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictorProperties
Creates a new crop orientation properties object for existing pre-trained MobileNetV3 models, stored on disk.
MODEL_DID_NOT_PASS_VALIDATION - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
MODEL_ONLY_SUPPORTS_RGB - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
MONGOLIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MONTENEGRIN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MULTI_LANG - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
MULTI_LANG_FULL - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

N

NEGATIVE_VALUE_IN_SHAPE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
NEPALI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
next() - Method in class com.itextpdf.pdfocr.onnxtr.util.BatchProcessingGenerator
normalizeRotatedRect(RotatedRect) - Static method in class com.itextpdf.pdfocr.onnxtr.util.OpenCvUtil
Normalizes RotatedRect, so that its angle is in the [-45; 45) range.
NORWEGIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
NUMBER_OF_PAGES_IN_IMAGE - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
The constant NUMBER_OF_PAGES_IN_IMAGE.

O

OcrEngineProperties - Class in com.itextpdf.pdfocr
This class contains additional properties for the OCR engine.
OcrEngineProperties() - Constructor for class com.itextpdf.pdfocr.OcrEngineProperties
Creates a new OcrEngineProperties instance.
OcrEngineProperties(OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.OcrEngineProperties
Creates a new OcrEngineProperties instance based on another OcrEngineProperties instance (copy constructor).
OcrPdfCreator - Class in com.itextpdf.pdfocr
OcrPdfCreator is the class that creates PDF documents containing input images and text that was recognized using the provided IOcrEngine.
OcrPdfCreator(IOcrEngine) - Constructor for class com.itextpdf.pdfocr.OcrPdfCreator
Creates a new OcrPdfCreator instance.
OcrPdfCreator(IOcrEngine, OcrPdfCreatorProperties) - Constructor for class com.itextpdf.pdfocr.OcrPdfCreator
Creates a new OcrPdfCreator instance.
OcrPdfCreatorProperties - Class in com.itextpdf.pdfocr
Properties that will be used by the OcrPdfCreator.
OcrPdfCreatorProperties() - Constructor for class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Creates a new OcrPdfCreatorProperties instance.
OcrPdfCreatorProperties(OcrPdfCreatorProperties) - Constructor for class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Creates a new OcrPdfCreatorProperties instance based on another OcrPdfCreatorProperties instance (copy constructor).
OcrProcessContext - Class in com.itextpdf.pdfocr
Class for storing the OCR processing context.
OcrProcessContext(AbstractPdfOcrEventHelper) - Constructor for class com.itextpdf.pdfocr.OcrProcessContext
Creates an instance of the OCR process context.
ODIA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ODIA_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ODIA_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ODIA_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ODIA_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ODIA_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ODIA_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
ODIA_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
onEvent(AbstractProductITextEvent) - Method in class com.itextpdf.pdfocr.AbstractPdfOcrEventHelper
Handles the event.
ONLY_SUPPORT_RGB_IMAGES - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
ONNX_RUNTIME_OPERATION_FAILED - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
OnnxDetectionPostProcessor - Class in com.itextpdf.pdfocr.onnxtr.detection
Implementation of a text detection predictor post-processor, used for OnnxTR model outputs.
OnnxDetectionPostProcessor() - Constructor for class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPostProcessor
Creates a new post-processor with the default threshold values.
OnnxDetectionPostProcessor(float, float) - Constructor for class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPostProcessor
Creates a new post-processor.
OnnxDetectionPredictor - Class in com.itextpdf.pdfocr.onnxtr.detection
A text detection predictor implementation, which uses ONNX Runtime and its ML models to find where text is located on an image.
OnnxDetectionPredictor(OnnxDetectionPredictorProperties) - Constructor for class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictor
Creates a text detection predictor with the specified properties.
OnnxDetectionPredictorProperties - Class in com.itextpdf.pdfocr.onnxtr.detection
Properties for configuring text detection ONNX models.
OnnxDetectionPredictorProperties(String, OnnxInputProperties, IDetectionPostProcessor) - Constructor for class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictorProperties
Creates new text detection predictor properties.
OnnxInputProperties - Class in com.itextpdf.pdfocr.onnxtr
Properties of the input of an ONNX model, which expects an RGB image.
OnnxInputProperties(float[], float[], long[], boolean) - Constructor for class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Creates model input properties.
OnnxOrientationPredictor - Class in com.itextpdf.pdfocr.onnxtr.orientation
A crop orientation predictor implementation that uses ONNX Runtime and its ML models to determine how text is oriented in a cropped image of text.
OnnxOrientationPredictor(OnnxOrientationPredictorProperties) - Constructor for class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictor
Creates a crop orientation predictor with the specified properties.
OnnxOrientationPredictorProperties - Class in com.itextpdf.pdfocr.onnxtr.orientation
Properties for configuring crop orientation ONNX models.
OnnxOrientationPredictorProperties(String, OnnxInputProperties, IOutputLabelMapper) - Constructor for class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictorProperties
Creates new crop orientation predictor properties.
OnnxRecognitionPredictor - Class in com.itextpdf.pdfocr.onnxtr.recognition
A text recognition predictor implementation that uses ONNX Runtime and its ML models to recognize text characters in an image.
OnnxRecognitionPredictor(OnnxRecognitionPredictorProperties) - Constructor for class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Creates a text recognition predictor with the specified properties.
OnnxRecognitionPredictorProperties - Class in com.itextpdf.pdfocr.onnxtr.recognition
Properties for configuring text recognition ONNX models.
OnnxRecognitionPredictorProperties(String, OnnxInputProperties, IRecognitionPostProcessor) - Constructor for class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Creates new text recognition predictor properties.
OnnxTrEngineProperties - Class in com.itextpdf.pdfocr.onnxtr
Properties that are used by the OnnxTrOcrEngine.
OnnxTrEngineProperties() - Constructor for class com.itextpdf.pdfocr.onnxtr.OnnxTrEngineProperties
Creates a new OnnxTrEngineProperties instance.
OnnxTrOcrEngine - Class in com.itextpdf.pdfocr.onnxtr
IOcrEngine implementation, based on OnnxTR/DocTR machine learning OCR projects.
OnnxTrOcrEngine(IDetectionPredictor, IOrientationPredictor, IRecognitionPredictor) - Constructor for class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
Creates a new OCR engine with the provided predictors.
OnnxTrOcrEngine(IDetectionPredictor, IOrientationPredictor, IRecognitionPredictor, OnnxTrEngineProperties) - Constructor for class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
Creates a new OCR engine with the provided predictors.
OnnxTrOcrEngine(IDetectionPredictor, IRecognitionPredictor) - Constructor for class com.itextpdf.pdfocr.onnxtr.OnnxTrOcrEngine
Creates a new OCR engine with the provided predictors, without text orientation prediction.
OpenCvUtil - Class in com.itextpdf.pdfocr.onnxtr.util
Static class with OpenCV utility functions.
OutputFormat - Enum in com.itextpdf.pdfocr.tesseract4
Enumeration of the available output formats.

P

PAGE_NUMBER_IS_INCORRECT - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
PAGE_SIZE_IS_NOT_APPLIED - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
 
ParagraphTreeItem - Class in com.itextpdf.pdfocr.structuretree
A convenience class to associate certain text items with the paragraph structure item.
ParagraphTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.ParagraphTreeItem
Instantiate a new ParagraphTreeItem instance.
parSeq(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk.
parSeq(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Creates a new text recognition properties object for existing pre-trained PARSeq models, stored on disk.
parSeq(String, Vocabulary, int) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Creates a new text recognition predictor using an existing pre-trained PARSeq model, stored on disk.
parSeq(String, Vocabulary, int) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Creates a new text recognition properties object for existing pre-trained PARSeq models, stored on disk.
PASHTO - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PATH_TO_TESS_DATA_DIRECTORY_CONTAINS_NON_ASCII_CHARACTERS - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
PATH_TO_TESS_DATA_DIRECTORY_IS_INVALID - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
PATH_TO_TESS_DATA_IS_NOT_SET - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
PDF - Enum constant in enum com.itextpdf.pdfocr.statistics.PdfOcrOutputType
Creating a PDF file
PDF_DOCUMENT_MUST_BE_OPENED_IN_STAMPING_MODE - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
PDF_LANGUAGE_PROPERTY_IS_NOT_SET - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
The constant PDF_LANGUAGE_PROPERTY_IS_NOT_SET.
PDFA - Enum constant in enum com.itextpdf.pdfocr.statistics.PdfOcrOutputType
Creating a PDF/A file
PDFA_IS_NOT_SUPPORTED - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
PdfOcrException - Exception in com.itextpdf.pdfocr.exceptions
Exception class for custom exceptions.
PdfOcrException(String) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrException
Creates a new PdfOcrException.
PdfOcrException(String, Throwable) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrException
Creates a new PdfOcrException.
PdfOcrException(Throwable) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrException
Creates a new PdfOcrException.
PdfOcrExceptionMessageConstant - Class in com.itextpdf.pdfocr.exceptions
Class that bundles all the exception message templates as constants.
PdfOcrFileUtil - Class in com.itextpdf.pdfocr.util
Utility class for working with files.
PdfOcrFontProvider - Class in com.itextpdf.pdfocr
FontProvider extension for the OCR engine.
PdfOcrFontProvider() - Constructor for class com.itextpdf.pdfocr.PdfOcrFontProvider
Creates a new PdfOcrFontProvider instance with the default font and the default font family.
PdfOcrFontProvider(FontSet, String) - Constructor for class com.itextpdf.pdfocr.PdfOcrFontProvider
Creates a new PdfOcrFontProvider instance based on the provided FontSet instance and font family.
PdfOcrInputException - Exception in com.itextpdf.pdfocr.exceptions
Exception class for input related exceptions.
PdfOcrInputException(String) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrInputException
Creates a new PdfOcrInputException.
PdfOcrInputException(String, Throwable) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrInputException
Creates a new PdfOcrInputException.
PdfOcrInputException(Throwable) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrInputException
Creates a new PdfOcrInputException.
PdfOcrInputTesseract4Exception - Exception in com.itextpdf.pdfocr.tesseract4.exceptions
Exception class for Tesseract4 input related exceptions.
PdfOcrInputTesseract4Exception(String) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrInputTesseract4Exception
PdfOcrInputTesseract4Exception(String, Throwable) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrInputTesseract4Exception
PdfOcrInputTesseract4Exception(Throwable) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrInputTesseract4Exception
PdfOcrLogMessageConstant - Class in com.itextpdf.pdfocr.logs
Class that bundles all the log message templates as constants.
PdfOcrMetaInfoContainer - Class in com.itextpdf.pdfocr
Container to keep meta info.
PdfOcrMetaInfoContainer(IMetaInfo) - Constructor for class com.itextpdf.pdfocr.PdfOcrMetaInfoContainer
Creates a container instance that keeps the passed meta info.
PdfOcrOnnxTrExceptionMessageConstant - Class in com.itextpdf.pdfocr.onnxtr.exceptions
Class that bundles all the error message templates as constants.
PdfOcrOnnxTrProductData - Class in com.itextpdf.pdfocr.onnxtr.actions.data
Stores an instance of ProductData related to iText pdfOcr OnnxTr module.
PdfOcrOnnxTrProductEvent - Class in com.itextpdf.pdfocr.onnxtr.actions.events
Class representing events registered in the iText pdfOcr OnnxTr module.
PdfOcrOutputType - Enum in com.itextpdf.pdfocr.statistics
pdfOcr output types for statistics.
PdfOcrOutputTypeStatisticsEvent - Class in com.itextpdf.pdfocr.statistics
Class that represents an event specifying the type of OCR processing.
PdfOcrOutputTypeStatisticsEvent(PdfOcrOutputType, ProductData) - Constructor for class com.itextpdf.pdfocr.statistics.PdfOcrOutputTypeStatisticsEvent
Creates an instance of the pdfOcr statistics event.
PdfOcrTesseract4Exception - Exception in com.itextpdf.pdfocr.tesseract4.exceptions
Exception class for Tesseract4 exceptions.
PdfOcrTesseract4Exception(String) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4Exception
Creates a new PdfOcrTesseract4Exception.
PdfOcrTesseract4Exception(String, Throwable) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4Exception
Creates a new PdfOcrTesseract4Exception.
PdfOcrTesseract4Exception(Throwable) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4Exception
Creates a new PdfOcrTesseract4Exception.
PdfOcrTesseract4ExceptionMessageConstant - Class in com.itextpdf.pdfocr.tesseract4.exceptions
Class that bundles all the error message templates as constants.
PdfOcrTesseract4ProductData - Class in com.itextpdf.pdfocr.tesseract4.actions.data
Stores an instance of ProductData related to iText pdfOcr Tesseract4 module.
PdfOcrTesseract4ProductData() - Constructor for class com.itextpdf.pdfocr.tesseract4.actions.data.PdfOcrTesseract4ProductData
 
PdfOcrTesseract4ProductEvent - Class in com.itextpdf.pdfocr.tesseract4.actions.events
Class representing events registered in the iText pdfOcr Tesseract4 module.
PdfOcrTextBuilder - Class in com.itextpdf.pdfocr.util
Class to build text output from the provided image OCR result and write it to the TXT file.
PERSIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PERSIAN_LETTERS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
POLISH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PORTUGUESE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
predict(Iterable) - Method in interface com.itextpdf.pdfocr.onnxtr.IPredictor
Performs prediction on a sequence of input items provided as an Iterable.
predict(Iterator) - Method in class com.itextpdf.pdfocr.onnxtr.AbstractOnnxPredictor
Performs prediction on a sequence of input items.
predict(Iterator) - Method in interface com.itextpdf.pdfocr.onnxtr.IPredictor
Performs prediction on a sequence of input items.
process(FloatBufferMdArray) - Method in class com.itextpdf.pdfocr.onnxtr.recognition.CrnnPostProcessor
Process ML model output and return recognized characters as string.
process(FloatBufferMdArray) - Method in class com.itextpdf.pdfocr.onnxtr.recognition.EndOfStringPostProcessor
Process ML model output and return recognized characters as string.
process(FloatBufferMdArray) - Method in interface com.itextpdf.pdfocr.onnxtr.recognition.IRecognitionPostProcessor
Process ML model output and return recognized characters as string.
process(BufferedImage, FloatBufferMdArray) - Method in interface com.itextpdf.pdfocr.onnxtr.detection.IDetectionPostProcessor
Process ML model output for a specified image and return a list of detected objects.
process(BufferedImage, FloatBufferMdArray) - Method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPostProcessor
Process ML model output for a specified image and return a list of detected objects.
PROCESS_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.actions.events.PdfOcrTesseract4ProductEvent
Process image event type.
PROCESS_IMAGE_ONNXTR - Static variable in class com.itextpdf.pdfocr.onnxtr.actions.events.PdfOcrOnnxTrProductEvent
Process image event type.
processBatch(List) - Method in interface com.itextpdf.pdfocr.onnxtr.util.IBatchProcessor
Processes a batch of input items and produces a corresponding batch of output items.
PROVIDED_FONT_PROVIDER_IS_INVALID - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
The constant PROVIDED_FONT_PROVIDER_IS_INVALID.
PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PUNJABI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PUNJABI_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PUNJABI_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PUNJABI_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PUNJABI_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PUNJABI_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PUNJABI_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
PUNJABI_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

Q

QUECHUA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

R

removeChild(LogicalStructureTreeItem) - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
Remove child structure tree item.
ROMANIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
rotate(BufferedImage, TextOrientation) - Static method in class com.itextpdf.pdfocr.onnxtr.util.BufferedImageUtil
Rotates image based on text orientation.
RUSSIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
RUSSIAN_CYRILLIC_LETTERS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
RUSSIAN_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

S

SANSKRIT - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
sar(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Creates a new text recognition predictor using an existing pre-trained SAR model, stored on disk.
sar(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Creates a new text recognition properties object for existing pre-trained SAR models, stored on disk.
SCALE_HEIGHT - Enum constant in enum com.itextpdf.pdfocr.ScaleMode
Only the height of the image will be proportionally scaled to fit the required size that is set using the OcrPdfCreatorProperties.setPageSize(Rectangle) method.
SCALE_TO_FIT - Enum constant in enum com.itextpdf.pdfocr.ScaleMode
The image will be scaled to fit within the page width and height dimensions that are set using the OcrPdfCreatorProperties.setPageSize(Rectangle) method.
SCALE_WIDTH - Enum constant in enum com.itextpdf.pdfocr.ScaleMode
Only the width of the image will be proportionally scaled to fit the required size that is set using the OcrPdfCreatorProperties.setPageSize(Rectangle) method.
ScaleMode - Enum in com.itextpdf.pdfocr
Enumeration of the possible scale modes for input images.
SCOTTISH_GAELIC - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SERBIAN_CYRILLIC - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SERBIAN_LATIN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
setAccessibilityProperties(AccessibilityProperties) - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
Set structure tree element's properties.
setBboxRect(Rectangle) - Method in class com.itextpdf.pdfocr.TextInfo
Sets text bbox.
setConfigModuleSettings(ModulesConfigurator) - Method in class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
setConfigModuleSettings(ModulesConfigurator) - Method in class com.itextpdf.pdfocr.SharpenConfigMapping
 
setConfigModuleSettings(ModulesConfigurator) - Method in class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
setFontProvider(FontProvider) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Sets font provider.
setFontProvider(FontProvider, String) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Sets font provider and default font family.
setImageLayerName(String) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Sets name for the image layer, null by default.
setImagePreprocessingOptions(ImagePreprocessingOptions) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Sets Tesseract4OcrEngineProperties.imagePreprocessingOptions.
setImageRotationHandler(IImageRotationHandler) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Sets image rotation handler instance.
setLanguages(List) - Method in class com.itextpdf.pdfocr.OcrEngineProperties
Sets list of languages to be recognized in provided images.
setLogicalStructureTreeItem(LogicalStructureTreeItem) - Method in class com.itextpdf.pdfocr.TextInfo
Sets logical structure tree parent item for the text info.
setMessageParams(String...) - Method in exception com.itextpdf.pdfocr.exceptions.PdfOcrException
Sets additional params for Exception message.
setMetaInfo(IMetaInfo) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Set meta info for this OcrPdfCreatorProperties.
setMinimalConfidenceLevel(int) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Sets minimal confidence level for HOCR line to be considered as properly recognized.
setOcrEngine(IOcrEngine) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Sets IOcrEngine reader object to perform OCR.
setOcrEventHelper(AbstractPdfOcrEventHelper) - Method in class com.itextpdf.pdfocr.OcrProcessContext
Sets ocr event helper.
setOcrPdfCreatorProperties(OcrPdfCreatorProperties) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Sets properties for OcrPdfCreator.
setOrientation(TextOrientation) - Method in class com.itextpdf.pdfocr.TextInfo
Sets the text orientation.
setPageSegMode(Integer) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Sets Page Segmentation Mode.
setPageSize(Rectangle) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Sets required size for output PDF document.
setPathToExecutable(String) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4ExecutableOcrEngine
Sets path to tesseract executable.
setPathToTessData(File) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Sets path to directory with tess data.
setPdfLang(String) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Specify PDF natural language, and optionally locale.
setPreprocessingImages(boolean) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Set to true if image preprocessing is needed.
setScaleMode(ScaleMode) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Sets scale mode for input images using available options from ScaleMode enumeration.
setSmoothTiling(boolean) - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
Sets ImagePreprocessingOptions.smoothTiling.
setTagged(boolean) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Defines whether the PDF document should be tagged.
setTesseract4OcrEngineProperties(Tesseract4OcrEngineProperties) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Sets properties for AbstractTesseract4OcrEngine.
setText(String) - Method in class com.itextpdf.pdfocr.TextInfo
Sets text element.
setTextColor(Color) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Sets text color in output PDF document.
setTextLayerName(String) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Sets name for the text layer.
setTextPositioning(TextPositioning) - Method in class com.itextpdf.pdfocr.onnxtr.OnnxTrEngineProperties
Defines the way text is retrieved from ocr engine output using TextPositioning.
setTextPositioning(TextPositioning) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Defines the way text is retrieved from tesseract output using TextPositioning.
setTileHeight(int) - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
Sets ImagePreprocessingOptions.tileHeight.
setTileWidth(int) - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
Sets ImagePreprocessingOptions.tileWidth.
setTitle(String) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
Sets PDF document title.
setUseTxtToImproveHocrParsing(boolean) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Sets Tesseract4OcrEngineProperties.useTxtToImproveHocrParsing.
SHAPE_IS_NOT_VALID - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
SharpenConfigMapping - Class in com.itextpdf.pdfocr.onnxtr
Service implementation of MappingConfiguration containing the module's Sharpen configuration.
SharpenConfigMapping - Class in com.itextpdf.pdfocr
Service implementation of MappingConfiguration containing the module's Sharpen configuration.
SharpenConfigMapping - Class in com.itextpdf.pdfocr.tesseract4
Service implementation of MappingConfiguration containing the module's Sharpen configuration.
SharpenConfigMapping() - Constructor for class com.itextpdf.pdfocr.onnxtr.SharpenConfigMapping
 
SharpenConfigMapping() - Constructor for class com.itextpdf.pdfocr.SharpenConfigMapping
 
SharpenConfigMapping() - Constructor for class com.itextpdf.pdfocr.tesseract4.SharpenConfigMapping
 
SIMPLIFIED_CHINESE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SINDHI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SINHALA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SINHALA_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SINHALA_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SINHALA_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SINHALA_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SINHALA_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SINHALA_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SINHALA_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
size() - Method in interface com.itextpdf.pdfocr.onnxtr.IOutputLabelMapper
Returns the number of mappable values.
size() - Method in class com.itextpdf.pdfocr.onnxtr.orientation.DefaultOrientationMapper
Returns the number of mappable values.
size() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
Returns the size of the vocabulary.
SLOVAK - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SLOVENE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SOMALI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
sortTextInfosByLines(Map) - Static method in class com.itextpdf.pdfocr.util.PdfOcrTextBuilder
Sorts the provided IOcrEngine.doImageOcr(java.io.File) result by lines.
SPANISH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SpanTreeItem - Class in com.itextpdf.pdfocr.structuretree
A convenience class to associate certain text items with the span structure item.
SpanTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.SpanTreeItem
Instantiate a new SpanTreeItem instance.
START_OCR_FOR_IMAGES - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
The constant START_OCR_FOR_IMAGES.
START_OCR_FOR_IMAGES - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
Deprecated.
STATISTICS_EVENT_TYPE_CANT_BE_NULL - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
STATISTICS_EVENT_TYPE_IS_NOT_DETECTED - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
SUDANESE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SUDANESE_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SUDANESE_DIACRITICS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SUDANESE_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SUDANESE_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SWAHILI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
SWEDISH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
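The SCALE_WIDTH, SCALE_HEIGHT and SCALE_TO_FIT entries above describe proportional scaling against the page size. The arithmetic can be sketched as follows. This is only one reading of those descriptions, not the library's code: the class ScaleModeSketch, its nested Mode enum and the scaledSize method are hypothetical names introduced for illustration, and plain floats stand in for the page Rectangle.

```java
final class ScaleModeSketch {
    /** Hypothetical stand-in for the ScaleMode enum listed in this index. */
    enum Mode { SCALE_WIDTH, SCALE_HEIGHT, SCALE_TO_FIT }

    private ScaleModeSketch() {
    }

    /** Returns {width, height} of the image after proportional scaling. */
    static float[] scaledSize(float imageWidth, float imageHeight,
                              float pageWidth, float pageHeight, Mode mode) {
        float widthRatio = pageWidth / imageWidth;
        float heightRatio = pageHeight / imageHeight;
        float scale;
        switch (mode) {
            case SCALE_WIDTH:
                // Assumption: the width is matched to the page, the height follows.
                scale = widthRatio;
                break;
            case SCALE_HEIGHT:
                // Assumption: the height is matched to the page, the width follows.
                scale = heightRatio;
                break;
            default:
                // SCALE_TO_FIT: the smaller ratio keeps both sides inside the page.
                scale = Math.min(widthRatio, heightRatio);
                break;
        }
        return new float[] {imageWidth * scale, imageHeight * scale};
    }

    public static void main(String[] args) {
        // A 1000x500 image placed on a 500x1000 page.
        float[] size = scaledSize(1000f, 500f, 500f, 1000f, Mode.SCALE_TO_FIT);
        System.out.println(size[0] + " x " + size[1]); // prints "500.0 x 250.0"
    }
}
```

With the same inputs, the SCALE_HEIGHT branch of this sketch yields 2000 x 1000, which shows why a single-axis mode can exceed the page along the other axis.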

T

TableCellTreeItem - Class in com.itextpdf.pdfocr.structuretree
A convenience class to associate certain text items with the table cell structure item.
TableCellTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.TableCellTreeItem
Instantiate a new TableCellTreeItem instance.
TableRowTreeItem - Class in com.itextpdf.pdfocr.structuretree
A convenience class to associate certain text items with the table row structure item.
TableRowTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.TableRowTreeItem
Instantiate a new TableRowTreeItem instance.
TableTreeItem - Class in com.itextpdf.pdfocr.structuretree
A convenience class to associate certain text items with the table structure item.
TableTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.TableTreeItem
Instantiate a new TableTreeItem instance.
TAGALOG - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TAGGED_PDF_IS_NOT_SUPPORTED - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
TAGGING_IS_NOT_SUPPORTED - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
 
TAJIK - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TAMIL - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TAMIL_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TAMIL_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TAMIL_FRACTIONS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TAMIL_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TAMIL_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TAMIL_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TAMIL_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TAMIL_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TATAR - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TELUGU - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TELUGU_CONSONANTS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TELUGU_DIGITS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TELUGU_MATRAS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TELUGU_PUNCTUATION - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TELUGU_SIGNS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TELUGU_VIRAMA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TELUGU_VOWELS - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TESSERACT_FAILED - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
TESSERACT_FAILED - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
TESSERACT_LIB_NOT_INSTALLED - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
TESSERACT_LIB_NOT_INSTALLED_WIN - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
TESSERACT_NOT_FOUND - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
 
Tesseract4ExecutableOcrEngine - Class in com.itextpdf.pdfocr.tesseract4
The implementation of AbstractTesseract4OcrEngine for tesseract OCR.
Tesseract4ExecutableOcrEngine(Tesseract4OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.tesseract4.Tesseract4ExecutableOcrEngine
Creates a new Tesseract4ExecutableOcrEngine instance.
Tesseract4ExecutableOcrEngine(String, Tesseract4OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.tesseract4.Tesseract4ExecutableOcrEngine
Creates a new Tesseract4ExecutableOcrEngine instance.
Tesseract4LibOcrEngine - Class in com.itextpdf.pdfocr.tesseract4
The implementation of AbstractTesseract4OcrEngine for tesseract OCR.
Tesseract4LibOcrEngine(Tesseract4OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.tesseract4.Tesseract4LibOcrEngine
Creates a new Tesseract4LibOcrEngine instance.
Tesseract4LogMessageConstant - Class in com.itextpdf.pdfocr.tesseract4.logs
Class that bundles all the log message templates as constants.
Tesseract4OcrEngineProperties - Class in com.itextpdf.pdfocr.tesseract4
Properties that will be used by the IOcrEngine.
Tesseract4OcrEngineProperties() - Constructor for class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Creates a new Tesseract4OcrEngineProperties instance.
Tesseract4OcrEngineProperties(Tesseract4OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
Creates a new Tesseract4OcrEngineProperties instance based on another Tesseract4OcrEngineProperties instance (copy constructor).
TesseractHelper - Class in com.itextpdf.pdfocr.tesseract4
Helper class.
TextInfo - Class in com.itextpdf.pdfocr
This class describes how recognized text is positioned on the image, providing a bounding box (bbox) for each text item, which can be a line or a word.
TextInfo() - Constructor for class com.itextpdf.pdfocr.TextInfo
Creates a new TextInfo instance.
TextInfo(TextInfo) - Constructor for class com.itextpdf.pdfocr.TextInfo
Creates a new TextInfo instance from an existing one.
TextInfo(String, Rectangle) - Constructor for class com.itextpdf.pdfocr.TextInfo
Creates a new TextInfo instance.
TextInfo(String, Rectangle, TextOrientation) - Constructor for class com.itextpdf.pdfocr.TextInfo
Creates a new TextInfo instance.
TextOrientation - Enum in com.itextpdf.pdfocr
Enumeration of supported text orientations.
TextPositioning - Enum in com.itextpdf.pdfocr.onnxtr
Enumeration of the possible types of text positioning.
TextPositioning - Enum in com.itextpdf.pdfocr.tesseract4
Enumeration of the possible types of text positioning.
THAI - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TiffImageUtil - Class in com.itextpdf.pdfocr.util
Utility class to handle tiff images.
toBchwInput(Collection, OnnxInputProperties) - Static method in class com.itextpdf.pdfocr.onnxtr.util.BufferedImageUtil
Converts a collection of images to a batched ML model input in a BCHW format with 3 channels.
toInputBuffer(List) - Method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictor
Converts predictor inputs to an ONNX runtime model batched input MD-array buffer.
toInputBuffer(List) - Method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictor
Converts predictor inputs to an ONNX runtime model batched input MD-array buffer.
toInputBuffer(List) - Method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Converts predictor inputs to an ONNX runtime model batched input MD-array buffer.
toInputBuffer(List) - Method in class com.itextpdf.pdfocr.onnxtr.AbstractOnnxPredictor
Converts predictor inputs to an ONNX runtime model batched input MD-array buffer.
TOO_MANY_IMAGES - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
toString() - Method in class com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPredictorProperties
toString() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
toString() - Method in class com.itextpdf.pdfocr.onnxtr.orientation.OnnxOrientationPredictorProperties
toString() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
toString() - Method in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
TURKISH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
TXT - Enum constant in enum com.itextpdf.pdfocr.tesseract4.OutputFormat
Reader will produce a plain txt file.
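
The BCHW layout named in the toBchwInput entry above places every value at index ((b * C + c) * H + y) * W + x in a flat buffer. The following is a minimal sketch of that packing only. BchwSketch and toBchw are hypothetical names, not part of BufferedImageUtil. The sketch also assumes that all images already share one size and that scaling pixel values to [0, 1] is sufficient, which may differ from what the library does with OnnxInputProperties.

```java
import java.awt.image.BufferedImage;
import java.util.List;

// Hypothetical illustration of a BCHW layout; not the BufferedImageUtil implementation.
final class BchwSketch {
    private BchwSketch() {
    }

    /**
     * Packs same-sized RGB images into one flat float array laid out as
     * [batch][channel][height][width], with channel values scaled to [0, 1].
     */
    static float[] toBchw(List<BufferedImage> images, int height, int width) {
        final int channels = 3;
        float[] out = new float[images.size() * channels * height * width];
        for (int b = 0; b < images.size(); b++) {
            BufferedImage img = images.get(b);
            if (img.getWidth() != width || img.getHeight() != height) {
                throw new IllegalArgumentException("Image " + b + " has an unexpected size");
            }
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    int rgb = img.getRGB(x, y);
                    // Split the packed pixel into red, green and blue components.
                    int[] components = {(rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF};
                    for (int c = 0; c < channels; c++) {
                        int index = ((b * channels + c) * height + y) * width + x;
                        out[index] = components[c] / 255f;
                    }
                }
            }
        }
        return out;
    }
}
```

With this layout, all red values of an image come first, then all green values, then all blue values, and the next image in the batch follows.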

U

UKRAINIAN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
UNEXPECTED_DIMENSION_VALUE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNEXPECTED_INPUT_SHAPE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNEXPECTED_INPUT_SIZE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNEXPECTED_INPUT_TYPE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNEXPECTED_MAT_TYPE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNEXPECTED_MEAN_CHANNEL_COUNT - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNEXPECTED_OUTPUT_SHAPE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNEXPECTED_OUTPUT_SIZE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNEXPECTED_OUTPUT_TYPE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNEXPECTED_SHAPE_SIZE - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNEXPECTED_STD_CHANNEL_COUNT - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
UNSUPPORTED_EXIF_ORIENTATION_VALUE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
 
URDU - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
useSymmetricPad() - Method in class com.itextpdf.pdfocr.onnxtr.OnnxInputProperties
Returns whether padding should be symmetrical during input resizing.
UYGHUR - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
UZBEK_CYRILLIC - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
UZBEK_LATIN - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

V

validateInputPdfDocument(PdfDocument) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
Validates input PDF document.
validateLanguages(List) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
Validates the list of provided languages and checks that they all exist in the given tess data directory.
valueOf(String) - Static method in enum com.itextpdf.pdfocr.onnxtr.TextPositioning
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.itextpdf.pdfocr.ScaleMode
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.itextpdf.pdfocr.statistics.PdfOcrOutputType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.itextpdf.pdfocr.tesseract4.OutputFormat
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.itextpdf.pdfocr.tesseract4.TextPositioning
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.itextpdf.pdfocr.TextOrientation
Returns the enum constant of this type with the specified name.
values() - Static method in enum com.itextpdf.pdfocr.onnxtr.TextPositioning
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.itextpdf.pdfocr.ScaleMode
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.itextpdf.pdfocr.statistics.PdfOcrOutputType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.itextpdf.pdfocr.tesseract4.OutputFormat
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.itextpdf.pdfocr.tesseract4.TextPositioning
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.itextpdf.pdfocr.TextOrientation
Returns an array containing the constants of this enum type, in the order they are declared.
VALUES_SHOULD_BE_A_NON_EMPTY_ARRAY - Static variable in class com.itextpdf.pdfocr.onnxtr.exceptions.PdfOcrOnnxTrExceptionMessageConstant
 
VIETNAMESE - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
viTstr(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictor
Creates a new text recognition predictor using an existing pre-trained ViTSTR model, stored on disk.
viTstr(String) - Static method in class com.itextpdf.pdfocr.onnxtr.recognition.OnnxRecognitionPredictorProperties
Creates a new text recognition properties object for existing pre-trained ViTSTR models, stored on disk.
Vocabulary - Class in com.itextpdf.pdfocr.onnxtr.recognition
A string-based LUT for mapping text recognition model results to characters.
Vocabulary(String) - Constructor for class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
Creates a new vocabulary based on a look-up string.
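
The idea of a string-based LUT, as described in the Vocabulary entries above, can be sketched as follows. LutSketch and decode are hypothetical names used for illustration only. The actual methods of Vocabulary are not listed in this index, so this shows the general technique, not the library API.

```java
// Hypothetical illustration of a string-based look-up table; not the Vocabulary class.
final class LutSketch {
    // One entry per code point of the look-up string, so index i maps to character i.
    private final int[] codePoints;

    LutSketch(String lookUpString) {
        this.codePoints = lookUpString.codePoints().toArray();
    }

    /** Maps each model output index to the character at that position in the look-up string. */
    String decode(int[] indices) {
        StringBuilder sb = new StringBuilder();
        for (int index : indices) {
            if (index < 0 || index >= codePoints.length) {
                throw new IndexOutOfBoundsException("No character for index " + index);
            }
            sb.appendCodePoint(codePoints[index]);
        }
        return sb.toString();
    }
}
```

For example, with the look-up string "abc", the indices 2, 0, 1 decode to "cab".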

W

WELSH - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
wrap(Iterator, int) - Static method in class com.itextpdf.pdfocr.onnxtr.util.Batching
Wraps an existing iterator into a new one, which outputs List-based batches.
writeToTextFile(String, String) - Static method in class com.itextpdf.pdfocr.util.PdfOcrFileUtil
Writes the provided String to a text file at the provided path.
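
The batching behaviour described in the wrap(Iterator, int) entry above can be sketched with the standard library alone. BatchingSketch is a hypothetical class and is not the Batching implementation. It assumes that the last batch may be shorter than the requested size, which this index does not confirm for the library.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration of iterator batching; not the Batching class itself.
final class BatchingSketch {
    private BatchingSketch() {
    }

    /** Wraps an iterator so that it yields lists of up to batchSize elements. */
    static <T> Iterator<List<T>> wrap(Iterator<T> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return new Iterator<List<T>>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public List<T> next() {
                if (!source.hasNext()) {
                    throw new NoSuchElementException();
                }
                // Fill the batch until it is full or the source runs out.
                List<T> batch = new ArrayList<>(batchSize);
                while (batch.size() < batchSize && source.hasNext()) {
                    batch.add(source.next());
                }
                return batch;
            }
        };
    }
}
```

Wrapping an iterator over five elements with a batch size of 2 yields batches of 2, 2 and 1 elements.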

Y

YAKUT - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 
YORUBA - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 

Z

ZULU - Static variable in class com.itextpdf.pdfocr.onnxtr.recognition.Vocabulary
 