Index
A
- AbstractPdfOcrEventHelper - Class in com.itextpdf.pdfocr
-
Helper class for working with events.
- AbstractPdfOcrEventHelper() - Constructor for class com.itextpdf.pdfocr.AbstractPdfOcrEventHelper
- AbstractTesseract4OcrEngine - Class in com.itextpdf.pdfocr.tesseract4
-
The implementation of IOcrEngine.
- AbstractTesseract4OcrEngine(Tesseract4OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Creates a new Tesseract4OcrEngineProperties instance based on another Tesseract4OcrEngineProperties instance (copy constructor).
- addCell(TableCellTreeItem) - Method in class com.itextpdf.pdfocr.structuretree.TableRowTreeItem
-
Add a new table cell structure tree item to the table row.
- addChild(LogicalStructureTreeItem) - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
-
Add child structure tree item.
- addRow(TableRowTreeItem) - Method in class com.itextpdf.pdfocr.structuretree.TableTreeItem
-
Add a new row structure tree item to the table.
- applyRotation(ImageData) - Method in interface com.itextpdf.pdfocr.IImageRotationHandler
-
Apply rotation to image data.
- applyRotation(ImageData) - Method in class com.itextpdf.pdfocr.tesseract4.LeptonicaImageRotationHandler
- ArtifactItem - Class in com.itextpdf.pdfocr.structuretree
-
This class represents artifact structure tree item.
B
- BY_LINES - Enum constant in enum com.itextpdf.pdfocr.tesseract4.TextPositioning
-
Text will be located by lines retrieved from hocr file.
- BY_WORDS - Enum constant in enum com.itextpdf.pdfocr.tesseract4.TextPositioning
-
Text will be located by words retrieved from hocr file.
- BY_WORDS_AND_LINES - Enum constant in enum com.itextpdf.pdfocr.tesseract4.TextPositioning
-
Similar to BY_WORDS mode, but top and bottom of word BBox are inherited from line.
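The difference between BY_WORDS and BY_WORDS_AND_LINES can be illustrated with a small self-contained sketch. It does not call the pdfOCR API: `Box`, `WordsAndLinesSketch` and `inheritVerticalExtent` are hypothetical names, and the coordinate convention (top greater than bottom) is an assumption made only for this example.

```java
// Hypothetical illustration only; not part of com.itextpdf.pdfocr.tesseract4.
final class Box {
    final float left;
    final float bottom;
    final float right;
    final float top;

    Box(float left, float bottom, float right, float top) {
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.top = top;
    }
}

final class WordsAndLinesSketch {
    // Keeps the word's horizontal extent but takes top and bottom from the
    // enclosing line, as the BY_WORDS_AND_LINES description suggests.
    static Box inheritVerticalExtent(Box word, Box line) {
        return new Box(word.left, line.bottom, word.right, line.top);
    }

    public static void main(String[] args) {
        Box line = new Box(10f, 100f, 300f, 120f);
        Box word = new Box(40f, 103f, 90f, 116f);
        Box adjusted = inheritVerticalExtent(word, line);
        System.out.println(adjusted.left + " " + adjusted.bottom + " "
                + adjusted.right + " " + adjusted.top);
    }
}
```

With this adjustment every word on a line shares the same vertical extent, which tends to give a more even text layer than per-word boxes of varying height.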
C
- CANNOT_ADD_DATA_TO_PDF_DOCUMENT - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
-
The constant CANNOT_ADD_DATA_TO_PDF_DOCUMENT.
- CANNOT_BINARIZE_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_CONVERT_IMAGE_TO_GRAYSCALE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_CREATE_BUFFERED_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_CREATE_PDF_DOCUMENT - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
- CANNOT_DELETE_FILE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_FIND_PATH_TO_TESSERACT_EXECUTABLE - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- CANNOT_GET_TEMPORARY_DIRECTORY - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_OCR_INPUT_FILE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_PARSE_NODE_BBOX - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_PROCESS_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_READ_DEFAULT_FONT - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
-
The constant CANNOT_READ_DEFAULT_FONT.
- CANNOT_READ_FILE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_READ_IMAGE_METADATA - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_READ_INPUT_IMAGE - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
- CANNOT_READ_INPUT_IMAGE - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
-
The constant CANNOT_READ_INPUT_IMAGE.
- CANNOT_READ_INPUT_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_READ_PROVIDED_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- CANNOT_RESOLVE_PROVIDED_FONTS - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
- CANNOT_RETRIEVE_PAGES_FROM_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_USE_USER_WORDS - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- CANNOT_WRITE_TO_FILE - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- com.itextpdf.pdfocr - package com.itextpdf.pdfocr
- com.itextpdf.pdfocr.exceptions - package com.itextpdf.pdfocr.exceptions
- com.itextpdf.pdfocr.logs - package com.itextpdf.pdfocr.logs
- com.itextpdf.pdfocr.statistics - package com.itextpdf.pdfocr.statistics
- com.itextpdf.pdfocr.structuretree - package com.itextpdf.pdfocr.structuretree
- com.itextpdf.pdfocr.tesseract4 - package com.itextpdf.pdfocr.tesseract4
- com.itextpdf.pdfocr.tesseract4.actions.data - package com.itextpdf.pdfocr.tesseract4.actions.data
- com.itextpdf.pdfocr.tesseract4.actions.events - package com.itextpdf.pdfocr.tesseract4.actions.events
- com.itextpdf.pdfocr.tesseract4.exceptions - package com.itextpdf.pdfocr.tesseract4.exceptions
- com.itextpdf.pdfocr.tesseract4.logs - package com.itextpdf.pdfocr.tesseract4.logs
- COMMAND_FAILED - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- COULD_NOT_FIND_CORRESPONDING_GLYPH_TO_UNICODE_CHARACTER - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
-
The constant COULD_NOT_FIND_CORRESPONDING_GLYPH_TO_UNICODE_CHARACTER.
- CREATED_TEMPORARY_FILE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- createPdf(List<File>, PdfWriter) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter.
- createPdf(List<File>, PdfWriter, DocumentProperties) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter.
- createPdf(List<File>, PdfWriter, DocumentProperties, IOcrProcessProperties) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter.
- createPdfA(List<File>, PdfWriter, DocumentProperties, PdfOutputIntent) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter, DocumentProperties and PdfOutputIntent.
- createPdfA(List<File>, PdfWriter, DocumentProperties, PdfOutputIntent, IOcrProcessProperties) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter, DocumentProperties and PdfOutputIntent.
- createPdfA(List<File>, PdfWriter, PdfOutputIntent) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided PdfWriter and PdfOutputIntent.
- createPdfAFile(List<File>, File, PdfOutputIntent) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided File and PdfOutputIntent.
- createPdfFile(List<File>, File) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Performs OCR with set parameters using provided IOcrEngine and creates PDF using provided File.
- createProcessImageEvent(SequenceId, IMetaInfo, EventConfirmationType) - Static method in class com.itextpdf.pdfocr.tesseract4.actions.events.PdfOcrTesseract4ProductEvent
-
Creates process-image event.
- createStatisticsAggregatorFromName(String) - Method in class com.itextpdf.pdfocr.statistics.PdfOcrOutputTypeStatisticsEvent
- createTxtFile(List<File>, File) - Method in interface com.itextpdf.pdfocr.IOcrEngine
-
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
- createTxtFile(List<File>, File) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
- createTxtFile(List<File>, File, OcrProcessContext) - Method in interface com.itextpdf.pdfocr.IOcrEngine
-
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
- createTxtFile(List<File>, File, OcrProcessContext) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
D
- DATA - Enum constant in enum com.itextpdf.pdfocr.statistics.PdfOcrOutputType
-
Processing of an image in the engine with data output
- doImageOcr(File) - Method in interface com.itextpdf.pdfocr.IOcrEngine
-
Reads data from the provided input image file and returns retrieved data in the format described below.
- doImageOcr(File) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Reads data from the provided input image file and returns retrieved data in the format described below.
- doImageOcr(File, OcrProcessContext) - Method in interface com.itextpdf.pdfocr.IOcrEngine
-
Reads data from the provided input image file and returns retrieved data in the format described below.
- doImageOcr(File, OcrProcessContext) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Reads data from the provided input image file and returns retrieved data in the format described below.
- doImageOcr(File, OutputFormat) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Reads data from the provided input image file and returns retrieved data as string.
- doImageOcr(File, OutputFormat, OcrProcessContext) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Reads data from the provided input image file and returns retrieved data as string.
- doTesseractOcr(File, File, OutputFormat) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Performs tesseract OCR for the first (or for the only) image page.
- doTesseractOcr(File, File, OutputFormat, OcrProcessContext) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Performs tesseract OCR for the first (or for the only) image page.
G
- getAccessibilityProperties() - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
-
Retrieve structure tree element's properties.
- getBboxRect() - Method in class com.itextpdf.pdfocr.TextInfo
-
Gets bbox coordinates.
- getChildren() - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
-
Retrieve all child structure tree items.
- getConfirmationType() - Method in class com.itextpdf.pdfocr.AbstractPdfOcrEventHelper
-
Returns the confirmation type of event.
- getDefaultFontFamily() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Gets preferred font family to be used when selecting font from FontProvider.
- getDefaultFontFamily() - Method in class com.itextpdf.pdfocr.PdfOcrFontProvider
-
Gets default font family.
- getDefaultLanguage() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Gets default language for ocr.
- getDefaultUserWordsSuffix() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Gets default user words suffix.
- getEventType() - Method in class com.itextpdf.pdfocr.tesseract4.actions.events.PdfOcrTesseract4ProductEvent
- getFontProvider() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Returns the FontProvider that was set previously or, if it is null, a new instance of PdfOcrFontProvider.
- getImageLayerName() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Gets name of image layer.
- getImagePreprocessingOptions() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Gets Tesseract4OcrEngineProperties.imagePreprocessingOptions.
- getImageRotationHandler() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Gets image rotation handler instance.
- getInstance() - Static method in class com.itextpdf.pdfocr.structuretree.ArtifactItem
-
Retrieve an instance of ArtifactItem.
- getInstance() - Static method in class com.itextpdf.pdfocr.tesseract4.actions.data.PdfOcrTesseract4ProductData
-
Getter for an instance of ProductData related to iText pdfOcr Tesseract4 module.
- getLanguages() - Method in class com.itextpdf.pdfocr.OcrEngineProperties
-
Gets list of languages required for provided images.
- getLanguagesAsString() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Gets list of languages concatenated with "+" symbol to a string in format required by tesseract.
- getLogicalStructureTreeItem() - Method in class com.itextpdf.pdfocr.TextInfo
-
Retrieves structure tree item for the text item.
- getMessage() - Method in exception com.itextpdf.pdfocr.exceptions.PdfOcrException
- getMessageParams() - Method in exception com.itextpdf.pdfocr.exceptions.PdfOcrException
-
Gets additional params for Exception message.
- getMetaInfoContainer() - Method in interface com.itextpdf.pdfocr.IProductAware
-
Gets the container with meta info.
- getMetaInfoContainer() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Gets the container with meta info.
- getMinimalConfidenceLevel() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Gets minimal confidence level for HOCR line to be considered as properly recognized.
- getOcrEngine() - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Gets used IOcrEngine.
- getOcrEventHelper() - Method in class com.itextpdf.pdfocr.OcrProcessContext
-
Returns helper for working with events.
- getOcrPdfCreatorProperties() - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Gets properties for OcrPdfCreator.
- getOcrProcessProperties() - Method in class com.itextpdf.pdfocr.OcrProcessContext
-
Get extra OCR process properties.
- getPageSegMode() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Gets Page Segmentation Mode.
- getPageSize() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Gets required size for output PDF document.
- getParent() - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
-
Retrieve parent structure tree item.
- getPathToExecutable() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4ExecutableOcrEngine
-
Gets path to tesseract executable.
- getPathToTessData() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Gets path to directory with tess data.
- getPdfLang() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Gets PDF language.
- getPdfOcrStatisticsEventType() - Method in class com.itextpdf.pdfocr.statistics.PdfOcrOutputTypeStatisticsEvent
-
Gets the type of statistic event.
- getProductData() - Method in interface com.itextpdf.pdfocr.IProductAware
-
Gets object containing information about the product.
- getProductData() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
- getScaleMode() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Gets scale mode for input images using available options from ScaleMode enumeration.
- getSequenceId() - Method in class com.itextpdf.pdfocr.AbstractPdfOcrEventHelper
-
Returns the sequence id.
- getStatisticsNames() - Method in class com.itextpdf.pdfocr.statistics.PdfOcrOutputTypeStatisticsEvent
- getTesseract4OcrEngineProperties() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Gets properties for AbstractTesseract4OcrEngine.
- getTesseractInstance() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4LibOcrEngine
-
Gets tesseract instance.
- getText() - Method in class com.itextpdf.pdfocr.TextInfo
-
Gets text element.
- getTextColor() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Gets text color in output PDF document.
- getTextLayerName() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Gets name of text layer.
- getTextPositioning() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Defines the way text is retrieved from tesseract output using TextPositioning.
- getTileHeight() - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
-
Gets tileHeight.
- getTileWidth() - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
-
Gets tileWidth.
- getTitle() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Gets PDF document title.
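The getLanguagesAsString() entry above describes joining the configured languages with a "+" symbol. A minimal sketch of that concatenation follows; `LanguageStringSketch` and `joinLanguages` are hypothetical names chosen for illustration, not the library's implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the real method lives on AbstractTesseract4OcrEngine.
final class LanguageStringSketch {
    // Joins language codes with "+", the separator tesseract accepts when
    // several languages are requested at once.
    static String joinLanguages(List<String> languages) {
        return String.join("+", languages);
    }

    public static void main(String[] args) {
        // prints "eng+deu"
        System.out.println(joinLanguages(Arrays.asList("eng", "deu")));
    }
}
```

A single-element list yields the language code unchanged, so the same call covers the one-language case.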
H
- HOCR - Enum constant in enum com.itextpdf.pdfocr.tesseract4.OutputFormat
-
Reader will produce XHTML output compliant with the hOCR specification.
I
- identifyOsType() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Identifies type of current OS and returns it (win, linux).
- IImageRotationHandler - Interface in com.itextpdf.pdfocr
-
Rotation information may be stored in image metadata.
- ImagePreprocessingOptions - Class in com.itextpdf.pdfocr.tesseract4
-
Additional options applied on image preprocessing step.
- ImagePreprocessingOptions() - Constructor for class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
-
Creates ImagePreprocessingOptions instance.
- ImagePreprocessingOptions(ImagePreprocessingOptions) - Constructor for class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
-
Creates a new ImagePreprocessingOptions instance based on another ImagePreprocessingOptions instance (copy constructor).
- INCORRECT_INPUT_IMAGE_FORMAT - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- INCORRECT_LANGUAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- initializeTesseract(OutputFormat) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4LibOcrEngine
-
Initializes instance of tesseract if it hasn't already been initialized or has been disposed, and sets all the required properties.
- IOcrEngine - Interface in com.itextpdf.pdfocr
-
IOcrEngine interface is used for instantiating new OcrReader objects.
- IOcrProcessProperties - Interface in com.itextpdf.pdfocr
-
OCR properties passed to the OCR engine as part of OcrProcessContext.
- IProductAware - Interface in com.itextpdf.pdfocr
-
The interface that holds information about product data and meta info.
- isPreprocessingImages() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Checks whether image preprocessing is needed.
- isSmoothTiling() - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
-
Gets smoothTiling.
- isTagged() - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Retrieve information on whether pdf document should be tagged or not.
- isTaggingSupported() - Method in interface com.itextpdf.pdfocr.IOcrEngine
-
Checks whether tagging is supported by the OCR engine.
- isTaggingSupported() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
- isUseTxtToImproveHocrParsing() - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Gets Tesseract4OcrEngineProperties.useTxtToImproveHocrParsing.
- isWindows() - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Checks current os type.
L
- LANGUAGE_IS_NOT_IN_THE_LIST - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- LeptonicaImageRotationHandler - Class in com.itextpdf.pdfocr.tesseract4
-
Leptonica based implementation of IImageRotationHandler.
- LeptonicaImageRotationHandler() - Constructor for class com.itextpdf.pdfocr.tesseract4.LeptonicaImageRotationHandler
- LogicalStructureTreeItem - Class in com.itextpdf.pdfocr.structuretree
-
This class represents structure tree item of the text item put into the pdf document.
- LogicalStructureTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
-
Instantiate a new LogicalStructureTreeItem instance.
- LogicalStructureTreeItem(AccessibilityProperties) - Constructor for class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
-
Instantiate a new LogicalStructureTreeItem instance.
N
- NUMBER_OF_PAGES_IN_IMAGE - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
-
The constant NUMBER_OF_PAGES_IN_IMAGE.
O
- OcrEngineProperties - Class in com.itextpdf.pdfocr
-
This class contains additional properties for ocr engine.
- OcrEngineProperties() - Constructor for class com.itextpdf.pdfocr.OcrEngineProperties
-
Creates a new OcrEngineProperties instance.
- OcrEngineProperties(OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.OcrEngineProperties
-
Creates a new OcrEngineProperties instance based on another OcrEngineProperties instance (copy constructor).
- OcrPdfCreator - Class in com.itextpdf.pdfocr
-
OcrPdfCreator is the class that creates PDF documents containing input images and text that was recognized using provided IOcrEngine.
- OcrPdfCreator(IOcrEngine) - Constructor for class com.itextpdf.pdfocr.OcrPdfCreator
-
Creates a new OcrPdfCreator instance.
- OcrPdfCreator(IOcrEngine, OcrPdfCreatorProperties) - Constructor for class com.itextpdf.pdfocr.OcrPdfCreator
-
Creates a new OcrPdfCreator instance.
- OcrPdfCreatorProperties - Class in com.itextpdf.pdfocr
-
Properties that will be used by the OcrPdfCreator.
- OcrPdfCreatorProperties() - Constructor for class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Creates a new OcrPdfCreatorProperties instance.
- OcrPdfCreatorProperties(OcrPdfCreatorProperties) - Constructor for class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Creates a new OcrPdfCreatorProperties instance based on another OcrPdfCreatorProperties instance (copy constructor).
- OcrProcessContext - Class in com.itextpdf.pdfocr
-
Class for storing ocr processing context.
- OcrProcessContext(AbstractPdfOcrEventHelper) - Constructor for class com.itextpdf.pdfocr.OcrProcessContext
-
Creates an instance of ocr process context
- onEvent(AbstractProductITextEvent) - Method in class com.itextpdf.pdfocr.AbstractPdfOcrEventHelper
-
Handles the event.
- OutputFormat - Enum in com.itextpdf.pdfocr.tesseract4
-
Enumeration of the available output formats.
P
- PAGE_NUMBER_IS_INCORRECT - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- ParagraphTreeItem - Class in com.itextpdf.pdfocr.structuretree
-
A convenience class to associate certain text items with the paragraph structure item.
- ParagraphTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.ParagraphTreeItem
-
Instantiate a new ParagraphTreeItem instance.
- PATH_TO_TESS_DATA_DIRECTORY_CONTAINS_NON_ASCII_CHARACTERS - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- PATH_TO_TESS_DATA_DIRECTORY_IS_INVALID - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- PATH_TO_TESS_DATA_IS_NOT_SET - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- PDF - Enum constant in enum com.itextpdf.pdfocr.statistics.PdfOcrOutputType
-
Creating a PDF file
- PDF_LANGUAGE_PROPERTY_IS_NOT_SET - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
-
The constant PDF_LANGUAGE_PROPERTY_IS_NOT_SET.
- PDFA - Enum constant in enum com.itextpdf.pdfocr.statistics.PdfOcrOutputType
-
Creating a PDF-A file
- PdfOcrException - Exception in com.itextpdf.pdfocr.exceptions
-
Exception class for custom exceptions.
- PdfOcrException(String) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrException
-
Creates a new PdfOcrException.
- PdfOcrException(String, Throwable) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrException
-
Creates a new PdfOcrException.
- PdfOcrException(Throwable) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrException
-
Creates a new PdfOcrException.
- PdfOcrExceptionMessageConstant - Class in com.itextpdf.pdfocr.exceptions
-
Class that bundles all the exception message templates as constants.
- PdfOcrFontProvider - Class in com.itextpdf.pdfocr
-
FontProvider extension for ocr engine.
- PdfOcrFontProvider() - Constructor for class com.itextpdf.pdfocr.PdfOcrFontProvider
-
Creates a new PdfOcrFontProvider instance with the default font and the default font family.
- PdfOcrFontProvider(FontSet, String) - Constructor for class com.itextpdf.pdfocr.PdfOcrFontProvider
-
Creates a new PdfOcrFontProvider instance based on provided FontSet instance and font family.
- PdfOcrInputException - Exception in com.itextpdf.pdfocr.exceptions
-
Exception class for input related exceptions.
- PdfOcrInputException(String) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrInputException
-
Creates a new PdfOcrInputException.
- PdfOcrInputException(String, Throwable) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrInputException
-
Creates a new PdfOcrInputException.
- PdfOcrInputException(Throwable) - Constructor for exception com.itextpdf.pdfocr.exceptions.PdfOcrInputException
-
Creates a new PdfOcrInputException.
- PdfOcrInputTesseract4Exception - Exception in com.itextpdf.pdfocr.tesseract4.exceptions
-
Exception class for Tesseract4 input related exceptions.
- PdfOcrInputTesseract4Exception(String) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrInputTesseract4Exception
-
Creates a new PdfOcrInputTesseract4Exception.
- PdfOcrInputTesseract4Exception(String, Throwable) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrInputTesseract4Exception
-
Creates a new PdfOcrInputTesseract4Exception.
- PdfOcrInputTesseract4Exception(Throwable) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrInputTesseract4Exception
-
Creates a new PdfOcrInputTesseract4Exception.
- PdfOcrLogMessageConstant - Class in com.itextpdf.pdfocr.logs
-
Class that bundles all the log message templates as constants.
- PdfOcrMetaInfoContainer - Class in com.itextpdf.pdfocr
-
Container to keep meta info.
- PdfOcrMetaInfoContainer(IMetaInfo) - Constructor for class com.itextpdf.pdfocr.PdfOcrMetaInfoContainer
-
Creates instance of container to keep passed meta info.
- PdfOcrOutputType - Enum in com.itextpdf.pdfocr.statistics
-
pdfOcr output types for statistics.
- PdfOcrOutputTypeStatisticsEvent - Class in com.itextpdf.pdfocr.statistics
-
Class which represents an event for specifying type of an ocr processing.
- PdfOcrOutputTypeStatisticsEvent(PdfOcrOutputType, ProductData) - Constructor for class com.itextpdf.pdfocr.statistics.PdfOcrOutputTypeStatisticsEvent
-
Creates instance of pdfOcr statistics event.
- PdfOcrTesseract4Exception - Exception in com.itextpdf.pdfocr.tesseract4.exceptions
-
Exception class for Tesseract4 exceptions.
- PdfOcrTesseract4Exception(String) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4Exception
-
Creates a new PdfOcrTesseract4Exception.
- PdfOcrTesseract4Exception(String, Throwable) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4Exception
-
Creates a new PdfOcrTesseract4Exception.
- PdfOcrTesseract4Exception(Throwable) - Constructor for exception com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4Exception
-
Creates a new PdfOcrTesseract4Exception.
- PdfOcrTesseract4ExceptionMessageConstant - Class in com.itextpdf.pdfocr.tesseract4.exceptions
-
Class that bundles all the error message templates as constants.
- PdfOcrTesseract4ProductData - Class in com.itextpdf.pdfocr.tesseract4.actions.data
-
Stores an instance of ProductData related to iText pdfOcr Tesseract4 module.
- PdfOcrTesseract4ProductData() - Constructor for class com.itextpdf.pdfocr.tesseract4.actions.data.PdfOcrTesseract4ProductData
- PdfOcrTesseract4ProductEvent - Class in com.itextpdf.pdfocr.tesseract4.actions.events
-
Class represents events registered in iText pdfOcr Tesseract4 module.
- PROCESS_IMAGE - Static variable in class com.itextpdf.pdfocr.tesseract4.actions.events.PdfOcrTesseract4ProductEvent
-
Process image event type.
- PROVIDED_FONT_PROVIDER_IS_INVALID - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
-
The constant PROVIDED_FONT_PROVIDER_IS_INVALID.
R
- removeChild(LogicalStructureTreeItem) - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
-
Remove child structure tree item.
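The addChild, removeChild, getChildren and getParent entries for LogicalStructureTreeItem describe an ordinary parent/child tree. The sketch below shows one way such bookkeeping can work. `TreeItemSketch` is a hypothetical stand-in, and its return types and re-parenting behaviour are assumptions, since the index does not state them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a structure tree item; not the iText class.
final class TreeItemSketch {
    private final List<TreeItemSketch> children = new ArrayList<>();
    private TreeItemSketch parent;

    // Attaches the child, detaching it from any previous parent first
    // (assumed behaviour), and records this item as its parent.
    TreeItemSketch addChild(TreeItemSketch child) {
        if (child.parent != null) {
            child.parent.children.remove(child);
        }
        children.add(child);
        child.parent = this;
        return this;
    }

    // Detaches the child; returns false if it was not a child of this item.
    boolean removeChild(TreeItemSketch child) {
        boolean removed = children.remove(child);
        if (removed) {
            child.parent = null;
        }
        return removed;
    }

    TreeItemSketch getParent() {
        return parent;
    }

    // Read-only view so callers cannot bypass the parent bookkeeping.
    List<TreeItemSketch> getChildren() {
        return Collections.unmodifiableList(children);
    }
}
```

Keeping the parent reference and the child list in sync inside addChild and removeChild means a caller never has to update both sides by hand.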
S
- SCALE_HEIGHT - Enum constant in enum com.itextpdf.pdfocr.ScaleMode
-
Only height of the image will be proportionally scaled to fit required size that is set using OcrPdfCreatorProperties.setPageSize(Rectangle) method.
- SCALE_TO_FIT - Enum constant in enum com.itextpdf.pdfocr.ScaleMode
-
The image will be scaled to fit within the page width and height dimensions that are set using OcrPdfCreatorProperties.setPageSize(Rectangle) method.
- SCALE_WIDTH - Enum constant in enum com.itextpdf.pdfocr.ScaleMode
-
Only width of the image will be proportionally scaled to fit required size that is set using OcrPdfCreatorProperties.setPageSize(Rectangle) method.
- ScaleMode - Enum in com.itextpdf.pdfocr
-
Enumeration of the possible scale modes for input images.
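The three ScaleMode constants can be read as three ways of choosing a single proportional scale factor. The following sketch shows that reading as plain arithmetic. `ScaleSketch`, its nested `Mode` enum and `scaledSize` are hypothetical, and how the library positions the image or handles overflow on the other axis is not covered by the descriptions above.

```java
// Hypothetical sketch; the names mirror the index but this is not the iText code.
final class ScaleSketch {
    enum Mode { SCALE_TO_FIT, SCALE_WIDTH, SCALE_HEIGHT }

    // Returns {width, height} of the image after proportional scaling.
    static float[] scaledSize(float imgW, float imgH,
                              float pageW, float pageH, Mode mode) {
        float byWidth = pageW / imgW;   // factor that matches the page width
        float byHeight = pageH / imgH;  // factor that matches the page height
        float factor;
        switch (mode) {
            case SCALE_WIDTH:
                factor = byWidth;
                break;
            case SCALE_HEIGHT:
                factor = byHeight;
                break;
            default:
                // SCALE_TO_FIT: the smaller factor keeps both sides on the page.
                factor = Math.min(byWidth, byHeight);
                break;
        }
        return new float[] {imgW * factor, imgH * factor};
    }
}
```

For a 200 x 100 image and a 400 x 400 page, this gives 400 x 200 for SCALE_WIDTH and SCALE_TO_FIT, and 800 x 400 for SCALE_HEIGHT, so under this reading only SCALE_TO_FIT is guaranteed to stay within both page dimensions.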
- setAccessibilityProperties(AccessibilityProperties) - Method in class com.itextpdf.pdfocr.structuretree.LogicalStructureTreeItem
-
Set structure tree element's properties.
- setBboxRect(Rectangle) - Method in class com.itextpdf.pdfocr.TextInfo
-
Sets text bbox.
- setFontProvider(FontProvider) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Sets font provider.
- setFontProvider(FontProvider, String) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Sets font provider and default font family.
- setImageLayerName(String) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Sets name for the image layer.
- setImagePreprocessingOptions(ImagePreprocessingOptions) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Sets Tesseract4OcrEngineProperties.imagePreprocessingOptions.
- setImageRotationHandler(IImageRotationHandler) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Sets image rotation handler instance.
- setLanguages(List<String>) - Method in class com.itextpdf.pdfocr.OcrEngineProperties
-
Sets list of languages to be recognized in provided images.
- setLogicalStructureTreeItem(LogicalStructureTreeItem) - Method in class com.itextpdf.pdfocr.TextInfo
-
Sets logical structure tree parent item for the text info.
- setMessageParams(String...) - Method in exception com.itextpdf.pdfocr.exceptions.PdfOcrException
-
Sets additional params for Exception message.
- setMetaInfo(IMetaInfo) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Set meta info for this
OcrPdfCreatorProperties
. - setMinimalConfidenceLevel(int) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Sets minimal confidence level for HOCR line to be considered as properly recognized.
- setOcrEngine(IOcrEngine) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Sets
IOcrEngine
reader object to perform OCR. - setOcrEventHelper(AbstractPdfOcrEventHelper) - Method in class com.itextpdf.pdfocr.OcrProcessContext
-
Sets ocr event helper.
- setOcrPdfCreatorProperties(OcrPdfCreatorProperties) - Method in class com.itextpdf.pdfocr.OcrPdfCreator
-
Sets properties for OcrPdfCreator.
- setPageSegMode(Integer) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Sets Page Segmentation Mode.
- setPageSize(Rectangle) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Sets required size for output PDF document.
- setPathToExecutable(String) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4ExecutableOcrEngine
-
Sets path to tesseract executable.
- setPathToTessData(File) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Sets path to directory with tess data.
- setPdfLang(String) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Specify PDF natural language, and optionally locale.
- setPreprocessingImages(boolean) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Sets true if image preprocessing is needed.
- setScaleMode(ScaleMode) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Sets scale mode for input images using available options from ScaleMode enumeration.
- setSmoothTiling(boolean) - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
-
Sets smoothTiling.
- setTagged(boolean) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Defines whether pdf document should be tagged or not.
- setTesseract4OcrEngineProperties(Tesseract4OcrEngineProperties) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Sets properties for AbstractTesseract4OcrEngine.
- setText(String) - Method in class com.itextpdf.pdfocr.TextInfo
-
Sets text element.
- setTextColor(Color) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Sets text color in output PDF document.
- setTextLayerName(String) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Sets name for the text layer.
- setTextPositioning(TextPositioning) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Defines the way text is retrieved from tesseract output using TextPositioning.
- setTileHeight(int) - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
-
Sets tileHeight.
- setTileWidth(int) - Method in class com.itextpdf.pdfocr.tesseract4.ImagePreprocessingOptions
-
Sets tileWidth.
- setTitle(String) - Method in class com.itextpdf.pdfocr.OcrPdfCreatorProperties
-
Sets PDF document title.
- setUseTxtToImproveHocrParsing(boolean) - Method in class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Sets Tesseract4OcrEngineProperties.useTxtToImproveHocrParsing.
- SpanTreeItem - Class in com.itextpdf.pdfocr.structuretree
-
A convenience class to associate certain text items with the span structure item.
- SpanTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.SpanTreeItem
-
Instantiate a new SpanTreeItem instance.
- START_OCR_FOR_IMAGES - Static variable in class com.itextpdf.pdfocr.logs.PdfOcrLogMessageConstant
-
The constant START_OCR_FOR_IMAGES.
- START_OCR_FOR_IMAGES - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- STATISTICS_EVENT_TYPE_CANT_BE_NULL - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
- STATISTICS_EVENT_TYPE_IS_NOT_DETECTED - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
T
- TableCellTreeItem - Class in com.itextpdf.pdfocr.structuretree
-
A convenience class to associate certain text items with the table cell structure item.
- TableCellTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.TableCellTreeItem
-
Instantiate a new TableCellTreeItem instance.
- TableRowTreeItem - Class in com.itextpdf.pdfocr.structuretree
-
A convenience class to associate certain text items with the table row structure item.
- TableRowTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.TableRowTreeItem
-
Instantiate a new TableRowTreeItem instance.
- TableTreeItem - Class in com.itextpdf.pdfocr.structuretree
-
A convenience class to associate certain text items with the table structure item.
- TableTreeItem() - Constructor for class com.itextpdf.pdfocr.structuretree.TableTreeItem
-
Instantiate a new TableTreeItem instance.
- TAGGING_IS_NOT_SUPPORTED - Static variable in class com.itextpdf.pdfocr.exceptions.PdfOcrExceptionMessageConstant
- TESSERACT_FAILED - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- TESSERACT_FAILED - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
- TESSERACT_LIB_NOT_INSTALLED - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- TESSERACT_LIB_NOT_INSTALLED_WIN - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- TESSERACT_NOT_FOUND - Static variable in class com.itextpdf.pdfocr.tesseract4.exceptions.PdfOcrTesseract4ExceptionMessageConstant
- Tesseract4ExecutableOcrEngine - Class in com.itextpdf.pdfocr.tesseract4
-
The implementation of AbstractTesseract4OcrEngine for tesseract OCR.
- Tesseract4ExecutableOcrEngine(Tesseract4OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.tesseract4.Tesseract4ExecutableOcrEngine
-
Creates a new Tesseract4ExecutableOcrEngine instance.
- Tesseract4ExecutableOcrEngine(String, Tesseract4OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.tesseract4.Tesseract4ExecutableOcrEngine
-
Creates a new Tesseract4ExecutableOcrEngine instance.
- Tesseract4LibOcrEngine - Class in com.itextpdf.pdfocr.tesseract4
-
The implementation of AbstractTesseract4OcrEngine for tesseract OCR.
- Tesseract4LibOcrEngine(Tesseract4OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.tesseract4.Tesseract4LibOcrEngine
-
Creates a new Tesseract4LibOcrEngine instance.
- Tesseract4LogMessageConstant - Class in com.itextpdf.pdfocr.tesseract4.logs
-
Class that bundles all the log message templates as constants.
- Tesseract4OcrEngineProperties - Class in com.itextpdf.pdfocr.tesseract4
-
Properties that will be used by the IOcrEngine.
- Tesseract4OcrEngineProperties() - Constructor for class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Creates a new Tesseract4OcrEngineProperties instance.
- Tesseract4OcrEngineProperties(Tesseract4OcrEngineProperties) - Constructor for class com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties
-
Creates a new Tesseract4OcrEngineProperties instance based on another Tesseract4OcrEngineProperties instance (copy constructor).
- TesseractHelper - Class in com.itextpdf.pdfocr.tesseract4
-
Helper class.
- TextInfo - Class in com.itextpdf.pdfocr
-
This class describes how recognized text is positioned on the image providing bbox for each text item (could be a line or a word).
- TextInfo() - Constructor for class com.itextpdf.pdfocr.TextInfo
-
Creates a new TextInfo instance.
- TextInfo(TextInfo) - Constructor for class com.itextpdf.pdfocr.TextInfo
-
Creates a new TextInfo instance from existing one.
- TextInfo(String, Rectangle) - Constructor for class com.itextpdf.pdfocr.TextInfo
-
Creates a new TextInfo instance.
- TextPositioning - Enum in com.itextpdf.pdfocr.tesseract4
-
Enumeration of the possible types of text positioning.
- TXT - Enum constant in enum com.itextpdf.pdfocr.tesseract4.OutputFormat
-
Reader will produce plain txt file.
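Several entries above (Tesseract4OcrEngineProperties(Tesseract4OcrEngineProperties), TextInfo(TextInfo)) are copy constructors. The sketch below illustrates the general pattern only. TextItem is a hypothetical stand-in, not the library's TextInfo, and the defensive copying of the bounding box is an assumption about what a copy constructor would typically do, not a statement about the iText implementation.

```java
import java.util.Arrays;

public class CopyConstructorSketch {
    // Hypothetical stand-in class; NOT com.itextpdf.pdfocr.TextInfo.
    static final class TextItem {
        private String text;
        private float[] bbox; // assumed layout: {x, y, width, height}

        TextItem(String text, float[] bbox) {
            this.text = text;
            this.bbox = bbox.clone(); // keep a private copy of the array
        }

        // Copy constructor: builds an independent instance from an existing one.
        TextItem(TextItem other) {
            this(other.text, other.bbox);
        }

        void setText(String text) { this.text = text; }
        String getText() { return text; }
        void setBbox(float[] bbox) { this.bbox = bbox.clone(); }
        float[] getBbox() { return bbox.clone(); }
    }

    public static void main(String[] args) {
        TextItem original = new TextItem("invoice", new float[] {10f, 20f, 80f, 12f});
        TextItem copy = new TextItem(original);

        // Changing the copy must leave the original untouched.
        copy.setText("total");
        copy.setBbox(new float[] {10f, 40f, 60f, 12f});

        System.out.println(original.getText() + " " + Arrays.toString(original.getBbox()));
        System.out.println(copy.getText() + " " + Arrays.toString(copy.getBbox()));
    }
}
```

A copy built this way can be adjusted per image or per page without affecting the instance it was derived from.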
U
- UNSUPPORTED_EXIF_ORIENTATION_VALUE - Static variable in class com.itextpdf.pdfocr.tesseract4.logs.Tesseract4LogMessageConstant
V
- validateLanguages(List<String>) - Method in class com.itextpdf.pdfocr.tesseract4.AbstractTesseract4OcrEngine
-
Validates list of provided languages and checks if they all exist in given tess data directory.
- valueOf(String) - Static method in enum com.itextpdf.pdfocr.ScaleMode
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum com.itextpdf.pdfocr.statistics.PdfOcrOutputType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum com.itextpdf.pdfocr.tesseract4.OutputFormat
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum com.itextpdf.pdfocr.tesseract4.TextPositioning
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum com.itextpdf.pdfocr.ScaleMode
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum com.itextpdf.pdfocr.statistics.PdfOcrOutputType
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum com.itextpdf.pdfocr.tesseract4.OutputFormat
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum com.itextpdf.pdfocr.tesseract4.TextPositioning
-
Returns an array containing the constants of this enum type, in the order they are declared.
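The valueOf(String) and values() entries above are the standard static methods that every Java enum provides. As a minimal sketch, the example below uses a local stand-in enum that mirrors the TextPositioning constant names listed in this index. The declaration order shown and the parseOrDefault helper are assumptions made for illustration and are not part of the library.

```java
import java.util.Arrays;

public class TextPositioningDemo {
    // Local stand-in mirroring the constant names from the index; the real
    // enum is com.itextpdf.pdfocr.tesseract4.TextPositioning. The order here
    // is assumed.
    enum TextPositioning { BY_LINES, BY_WORDS, BY_WORDS_AND_LINES }

    // Hypothetical helper: valueOf is case-sensitive and throws
    // IllegalArgumentException for unknown names (NullPointerException for
    // null), so fall back to a default in those cases.
    static TextPositioning parseOrDefault(String name, TextPositioning fallback) {
        try {
            return TextPositioning.valueOf(name);
        } catch (IllegalArgumentException | NullPointerException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // values() returns the constants in declaration order.
        System.out.println(Arrays.toString(TextPositioning.values()));

        // Exact name match succeeds; a lower-case name falls back.
        System.out.println(parseOrDefault("BY_WORDS", TextPositioning.BY_LINES));
        System.out.println(parseOrDefault("by_words", TextPositioning.BY_LINES));
    }
}
```

The same pattern applies to the other enums listed here (ScaleMode, PdfOcrOutputType, OutputFormat), for example when reading a mode name from a configuration file.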