Index
All Classes and Interfaces|All Packages|Serialized Form
A
- AbstractValueResult - Class in com.itextpdf.pdf2data.result.value
-
Common abstract parent for all possible results.
- AbstractValueResult(String) - Constructor for class com.itextpdf.pdf2data.result.value.AbstractValueResult
-
Constructor for abstract result.
B
- BOLD - Enum constant in enum class com.itextpdf.pdf2data.result.meta.FontStyle
-
Bold style.
- BOLD_ITALIC - Enum constant in enum class com.itextpdf.pdf2data.result.meta.FontStyle
-
Bold and Italic style.
- build() - Method in class com.itextpdf.pdf2data.ocr.engine.Tesseract4BasedEngine.Builder
-
Creates new OcrWithPostProcessingEngine engine.
C
- check(File) - Method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Recognizes the PDF file and returns the number of recognition results.
- check(File, RecognitionProperties) - Method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Recognizes the document and returns the number of recognition results.
- check(InputStream) - Method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Recognizes the PDF file and returns the number of recognition results.
- check(InputStream, RecognitionProperties) - Method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Recognizes the document and returns the number of recognition results.
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.DataFieldResult
-
Clones this instance without nested metadata entries in results.
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.RecognitionResult
-
Clones this instance without metadata entries in results.
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.value.AbstractValueResult
-
Clones this instance without metadata entries.
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.value.group.GroupEntryResult
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.value.group.GroupResult
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.value.ImageResult
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.value.table.TableCellResult
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.value.table.TableResult
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.value.table.TableRowResult
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.value.TextResult
- cloneWithoutMeta() - Method in class com.itextpdf.pdf2data.result.value.UnknownResult
- com.itextpdf.pdf2data - package com.itextpdf.pdf2data
- com.itextpdf.pdf2data.exceptions - package com.itextpdf.pdf2data.exceptions
- com.itextpdf.pdf2data.ocr.engine - package com.itextpdf.pdf2data.ocr.engine
- com.itextpdf.pdf2data.result - package com.itextpdf.pdf2data.result
- com.itextpdf.pdf2data.result.meta - package com.itextpdf.pdf2data.result.meta
- com.itextpdf.pdf2data.result.value - package com.itextpdf.pdf2data.result.value
- com.itextpdf.pdf2data.result.value.group - package com.itextpdf.pdf2data.result.value.group
- com.itextpdf.pdf2data.result.value.table - package com.itextpdf.pdf2data.result.value.table
- convertP2dtaToP2d(File, File) - Static method in class com.itextpdf.pdf2data.Pdf2DataTemplateConverter
-
Converts the passed p2dta file to its processed representation.
- convertPdfV3ToP2dta(InputStream, File) - Static method in class com.itextpdf.pdf2data.Pdf2DataTemplateConverter
-
Converts pdf templateV3 to p2dta.
- convertXmlV3ToP2d(InputStream, File) - Static method in class com.itextpdf.pdf2data.Pdf2DataTemplateConverter
-
Converts xml templateV3 to p2dta.
- create(File) - Static method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Creates instance of Pdf2DataExtractor from pdf2data template file.
- create(File, OcrWithPostProcessingEngine) - Static method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Creates instance of Pdf2DataExtractor from pdf2data template file with provided OCR engine.
- createBuilder(List, File) - Static method in class com.itextpdf.pdf2data.ocr.engine.Tesseract4BasedEngine
-
Creates new Tesseract4BasedEngine.Builder.
- createForImageFile() - Static method in class com.itextpdf.pdf2data.RecognitionProperties
-
Creates an instance of recognition properties for image file processing with OCR.
- createForPdfFile() - Static method in class com.itextpdf.pdf2data.RecognitionProperties
-
Creates an instance of recognition properties for default PDF file processing.
- createForPdfFileWithOcr() - Static method in class com.itextpdf.pdf2data.RecognitionProperties
-
Creates an instance of recognition properties for PDF file with OCR processing.
- createFromTemplateContentJson(InputStream) - Static method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Creates instance of Pdf2DataExtractor from stream which contains pdf2data template content in JSON format.
- createFromTemplateContentJson(InputStream, OcrWithPostProcessingEngine) - Static method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Creates instance of Pdf2DataExtractor from stream which contains pdf2data template content in JSON format.
- createTxtFile(List, File) - Method in class com.itextpdf.pdf2data.OcrWithPostProcessingEngine
-
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
- createTxtFile(List, File, OcrProcessContext) - Method in class com.itextpdf.pdf2data.OcrWithPostProcessingEngine
-
Performs OCR using provided IOcrEngine for the given list of input images and saves output to a text file using provided path.
D
- DataFieldResult - Class in com.itextpdf.pdf2data.result
-
Class which represents data field result.
- DataFieldResult(String, List) - Constructor for class com.itextpdf.pdf2data.result.DataFieldResult
-
Creates an instance of data field result.
- DocumentSourceType - Enum Class in com.itextpdf.pdf2data
-
Enum which specifies the file type.
- doImageOcr(File) - Method in class com.itextpdf.pdf2data.OcrWithPostProcessingEngine
-
Performs OCR with post-processing on the input file.
- doImageOcr(File, OcrProcessContext) - Method in class com.itextpdf.pdf2data.OcrWithPostProcessingEngine
-
Performs OCR with post-processing on the input file.
- doImageOcrPostprocess(File, Map, OcrProcessContext) - Method in interface com.itextpdf.pdf2data.IOcrEnginePostProcessor
-
Post-processes OCR results.
E
- enableTATRPostProcessing() - Method in class com.itextpdf.pdf2data.ocr.engine.Tesseract4BasedEngine.Builder
-
Enables TATR post-processing if called.
- equals(Object) - Method in class com.itextpdf.pdf2data.result.DataFieldResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.meta.FontMetaResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.meta.PageLocationMetaResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.RecognitionResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.value.AbstractValueResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.value.group.GroupEntryResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.value.group.GroupResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.value.ImageResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.value.table.TableCellResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.value.table.TableResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.value.table.TableRowResult
- equals(Object) - Method in class com.itextpdf.pdf2data.result.value.TextResult
- extract(File) - Method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Recognize the pdf file.
- extract(File, RecognitionProperties) - Method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Recognize the file.
- extract(InputStream) - Method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Recognize the pdf file.
- extract(InputStream, RecognitionProperties) - Method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Recognize the file.
F
- FontMetaResult - Class in com.itextpdf.pdf2data.result.meta
-
Class representing font metadata.
- FontMetaResult(String, FontStyle, String) - Constructor for class com.itextpdf.pdf2data.result.meta.FontMetaResult
-
Creates an instance of font metadata representation.
- FontStyle - Enum Class in com.itextpdf.pdf2data.result.meta
-
Enum representing possible font styles.
G
- getBase64() - Method in class com.itextpdf.pdf2data.result.value.ImageResult
-
Get image base64 string.
- getBytes() - Method in class com.itextpdf.pdf2data.result.value.ImageResult
-
Get image bytes.
- getCells() - Method in class com.itextpdf.pdf2data.result.value.table.TableRowResult
-
Get row cells.
- getColspan() - Method in class com.itextpdf.pdf2data.result.value.table.TableCellResult
-
Get the number of columns that the current cell spans.
- getContent() - Method in class com.itextpdf.pdf2data.result.value.table.TableCellResult
-
Get cell content.
- getContent() - Method in class com.itextpdf.pdf2data.result.value.TextResult
-
Get text content.
- getDataFieldResults() - Method in class com.itextpdf.pdf2data.result.RecognitionResult
-
Get data field results map.
- getDataType() - Method in class com.itextpdf.pdf2data.result.DataFieldResult
-
Get data field's data type.
- getDataType() - Method in class com.itextpdf.pdf2data.result.value.group.GroupEntryResult
-
Get group entry data type.
- getDocumentSourceType() - Method in class com.itextpdf.pdf2data.RecognitionProperties
-
Get document source type which specifies how to treat the provided document.
- getEntries() - Method in class com.itextpdf.pdf2data.result.value.group.GroupResult
-
Get group result entries.
- getFontColor() - Method in class com.itextpdf.pdf2data.result.meta.FontMetaResult
-
Get font color.
- getFontMeta() - Method in class com.itextpdf.pdf2data.result.value.table.TableCellResult
-
Get cell's font metadata.
- getFontMeta() - Method in class com.itextpdf.pdf2data.result.value.TextResult
-
Get text font metadata.
- getFontName() - Method in class com.itextpdf.pdf2data.result.meta.FontMetaResult
-
Get font name.
- getFontStyle() - Method in class com.itextpdf.pdf2data.result.meta.FontMetaResult
-
Get font style.
- getHeight() - Method in class com.itextpdf.pdf2data.result.meta.PageLocationMetaResult
-
Get height of the location.
- getImage() - Method in class com.itextpdf.pdf2data.result.value.ImageResult
-
Get image object.
- getOcrEngine() - Method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Gets current OCR engine instance.
- getPage() - Method in class com.itextpdf.pdf2data.result.meta.PageLocationMetaResult
-
Get page number of the location.
- getPageLocationMeta() - Method in class com.itextpdf.pdf2data.result.value.ImageResult
-
Get image page location.
- getPageLocationMeta() - Method in class com.itextpdf.pdf2data.result.value.table.TableCellResult
-
Get cell's page location.
- getPageLocationMeta() - Method in class com.itextpdf.pdf2data.result.value.table.TableRowResult
-
Get row's page location.
- getPageLocationMeta() - Method in class com.itextpdf.pdf2data.result.value.TextResult
-
Get text page location.
- getPageLocationMetas() - Method in class com.itextpdf.pdf2data.result.value.table.TableResult
-
Get table page locations.
- getPreprocessingType() - Method in class com.itextpdf.pdf2data.RecognitionProperties
-
Get preprocessing type which specifies how to preprocess the provided document.
- getResult() - Method in class com.itextpdf.pdf2data.RecognitionResultHolder
-
Retrieve result object.
- getResults() - Method in class com.itextpdf.pdf2data.result.DataFieldResult
-
Get list of data field's results.
- getResults() - Method in class com.itextpdf.pdf2data.result.value.group.GroupEntryResult
-
Get group entry results.
- getResultSchemaVersion() - Method in class com.itextpdf.pdf2data.result.RecognitionResult
-
Get result schema version.
- getResultType() - Method in class com.itextpdf.pdf2data.result.value.AbstractValueResult
-
Get specific result type.
- getRows() - Method in class com.itextpdf.pdf2data.result.value.table.TableResult
-
Get table rows.
- getRowspan() - Method in class com.itextpdf.pdf2data.result.value.table.TableCellResult
-
Get the number of rows that the current cell spans.
- getTableDetectionPredictor() - Static method in class com.itextpdf.pdf2data.ocr.engine.Pdf2DataTATRPostProcessorStaticInitializer
-
Creates new Predictor for table detection model.
- getTableStructurePredictor() - Static method in class com.itextpdf.pdf2data.ocr.engine.Pdf2DataTATRPostProcessorStaticInitializer
-
Creates new Predictor for table structure model.
- getTemplate() - Method in class com.itextpdf.pdf2data.Pdf2DataExtractor
-
Gets current template instance.
- getWidth() - Method in class com.itextpdf.pdf2data.result.meta.PageLocationMetaResult
-
Get width of the location.
- getX() - Method in class com.itextpdf.pdf2data.result.meta.PageLocationMetaResult
-
Get X coordinate on the page.
- getY() - Method in class com.itextpdf.pdf2data.result.meta.PageLocationMetaResult
-
Get Y coordinate on the page.
- GroupEntryResult - Class in com.itextpdf.pdf2data.result.value.group
-
Class representing single group result's entry.
- GroupEntryResult(String, List) - Constructor for class com.itextpdf.pdf2data.result.value.group.GroupEntryResult
-
Creates an instance of group result's entry representation.
- GroupResult - Class in com.itextpdf.pdf2data.result.value.group
-
Class which represents group results.
- GroupResult(Map) - Constructor for class com.itextpdf.pdf2data.result.value.group.GroupResult
-
Creates an instance of group result.
H
- hashCode() - Method in class com.itextpdf.pdf2data.result.DataFieldResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.meta.FontMetaResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.meta.PageLocationMetaResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.RecognitionResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.value.AbstractValueResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.value.group.GroupEntryResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.value.group.GroupResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.value.ImageResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.value.table.TableCellResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.value.table.TableResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.value.table.TableRowResult
- hashCode() - Method in class com.itextpdf.pdf2data.result.value.TextResult
I
- IMAGE - Enum constant in enum class com.itextpdf.pdf2data.DocumentSourceType
-
Image file.
- ImageResult - Class in com.itextpdf.pdf2data.result.value
-
Class which represents image result.
- ImageResult(PageLocationMetaResult, String) - Constructor for class com.itextpdf.pdf2data.result.value.ImageResult
-
Creates an instance of image result.
- initializeStaticModels(File, File) - Static method in class com.itextpdf.pdf2data.ocr.engine.Pdf2DataTATRPostProcessorStaticInitializer
-
Initialize models for table post-processing.
- InvalidResultException - Exception in com.itextpdf.pdf2data.exceptions
-
Exception which is used by pdf2data on recognition result parsing for invalid result structure.
- InvalidResultException(String) - Constructor for exception com.itextpdf.pdf2data.exceptions.InvalidResultException
-
Creates the new instance.
- InvalidSegmentException - Exception in com.itextpdf.pdf2data.exceptions
-
Exception class for invalid segment.
- InvalidSegmentException(String) - Constructor for exception com.itextpdf.pdf2data.exceptions.InvalidSegmentException
-
Creates a new InvalidSegmentException.
- IOcrEnginePostProcessor - Interface in com.itextpdf.pdf2data
-
Base interface for OCR post-processors.
- isIncludeMetaData() - Method in class com.itextpdf.pdf2data.SerializationProperties
-
Returns a state of extracting metadata.
- isInitialized() - Static method in class com.itextpdf.pdf2data.ocr.engine.Pdf2DataTATRPostProcessorStaticInitializer
-
Checks whether TATR post-processing was enabled or not.
- isTaggingSupported() - Method in class com.itextpdf.pdf2data.OcrWithPostProcessingEngine
-
Gets whether results will be tagged or not.
- ITALIC - Enum constant in enum class com.itextpdf.pdf2data.result.meta.FontStyle
-
Italic style.
N
- NONE - Enum constant in enum class com.itextpdf.pdf2data.PreprocessingType
-
No preprocessing.
- NORMAL - Enum constant in enum class com.itextpdf.pdf2data.result.meta.FontStyle
-
Normal style.
O
- OCR - Enum constant in enum class com.itextpdf.pdf2data.PreprocessingType
-
OCR preprocessing.
- OcrWithPostProcessingEngine - Class in com.itextpdf.pdf2data
-
Engine which will apply post processors (if present) to results of base ocr engine.
- OcrWithPostProcessingEngine(IOcrEngine, List, boolean) - Constructor for class com.itextpdf.pdf2data.OcrWithPostProcessingEngine
-
Creates new OcrWithPostProcessingEngine instance.
P
- PageLocationMetaResult - Class in com.itextpdf.pdf2data.result.meta
-
Class representing page location metadata.
- PageLocationMetaResult(Double, Double, Double, Double, Integer) - Constructor for class com.itextpdf.pdf2data.result.meta.PageLocationMetaResult
-
Creates an instance of page location metadata representation.
- PDF - Enum constant in enum class com.itextpdf.pdf2data.DocumentSourceType
-
PDF file.
- Pdf2DataExtractor - Class in com.itextpdf.pdf2data
-
Pdf2DataExtractor is a class for extracting data from files.
- Pdf2DataTATRPostProcessorStaticInitializer - Class in com.itextpdf.pdf2data.ocr.engine
-
Class for Microsoft TATR models initializing.
- Pdf2DataTemplateConverter - Class in com.itextpdf.pdf2data
-
Contains methods for creating p2dta.
- PreprocessingType - Enum Class in com.itextpdf.pdf2data
-
Enum which specifies the preprocessing type.
R
- readFromJson(InputStream) - Static method in class com.itextpdf.pdf2data.RecognitionResultHolder
-
Reads result from input stream containing the recognition result in JSON format.
- RecognitionProperties - Class in com.itextpdf.pdf2data
-
The RecognitionProperties class represents properties of recognition.
- RecognitionResult - Class in com.itextpdf.pdf2data.result
-
Class which represents recognition result.
- RecognitionResult(String, SortedMap) - Constructor for class com.itextpdf.pdf2data.result.RecognitionResult
-
Creates an instance of recognition result.
- RecognitionResultHolder - Class in com.itextpdf.pdf2data
-
Recognition result holder with methods to operate with the results.
- RecognitionResultHolder(RecognitionResult) - Constructor for class com.itextpdf.pdf2data.RecognitionResultHolder
-
Creates an instance with specified result.
S
- SerializationProperties - Class in com.itextpdf.pdf2data
-
The SerializationProperties class represents properties of serialization.
- SerializationProperties() - Constructor for class com.itextpdf.pdf2data.SerializationProperties
-
Creates an instance of properties with default state of extracting metadata as false.
- setIncludeMetaData(boolean) - Method in class com.itextpdf.pdf2data.SerializationProperties
-
Sets if metadata will be extracted during serialization.
- setMetaInfo(IMetaInfo) - Method in class com.itextpdf.pdf2data.RecognitionProperties
-
Sets IMetaInfo for this recognition properties instance.
T
- TableCellResult - Class in com.itextpdf.pdf2data.result.value.table
-
Class which represents single table cell result.
- TableCellResult(PageLocationMetaResult, FontMetaResult, Integer, Integer, String) - Constructor for class com.itextpdf.pdf2data.result.value.table.TableCellResult
-
Creates an instance of table cell result.
- TableResult - Class in com.itextpdf.pdf2data.result.value.table
-
Class which represents table result.
- TableResult(List, List) - Constructor for class com.itextpdf.pdf2data.result.value.table.TableResult
-
Creates an instance of table result.
- TableRowResult - Class in com.itextpdf.pdf2data.result.value.table
-
Class which represents table row result.
- TableRowResult(PageLocationMetaResult, List) - Constructor for class com.itextpdf.pdf2data.result.value.table.TableRowResult
-
Creates an instance of table row result.
- Tesseract4BasedEngine - Class in com.itextpdf.pdf2data.ocr.engine
-
Engine which uses Tesseract4LibOcrEngine as the base OCR engine.
- Tesseract4BasedEngine.Builder - Class in com.itextpdf.pdf2data.ocr.engine
-
Builder for Tesseract4BasedEngine.
- textPositioning(TextPositioning) - Method in class com.itextpdf.pdf2data.ocr.engine.Tesseract4BasedEngine.Builder
-
Defines the way text is retrieved from tesseract output using TextPositioning.
- TextResult - Class in com.itextpdf.pdf2data.result.value
-
Class representing text result.
- TextResult(PageLocationMetaResult, FontMetaResult, String) - Constructor for class com.itextpdf.pdf2data.result.value.TextResult
-
Creates an instance of text result.
U
- UnknownResult - Class in com.itextpdf.pdf2data.result.value
-
Class representing an unknown result type.
- UnknownResult(String) - Constructor for class com.itextpdf.pdf2data.result.value.UnknownResult
-
Creates an instance of unknown result type.
- unloadModels() - Static method in class com.itextpdf.pdf2data.ocr.engine.Pdf2DataTATRPostProcessorStaticInitializer
-
Close all table models related resources.
V
- valueOf(String) - Static method in enum class com.itextpdf.pdf2data.DocumentSourceType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class com.itextpdf.pdf2data.PreprocessingType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class com.itextpdf.pdf2data.result.meta.FontStyle
-
Returns the enum constant of this class with the specified name.
- values() - Static method in enum class com.itextpdf.pdf2data.DocumentSourceType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class com.itextpdf.pdf2data.PreprocessingType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class com.itextpdf.pdf2data.result.meta.FontStyle
-
Returns an array containing the constants of this enum class, in the order they are declared.
W
- writeToJson(File) - Method in class com.itextpdf.pdf2data.RecognitionResultHolder
-
Writes the held result into the specified file as JSON.
- writeToJson(File, SerializationProperties) - Method in class com.itextpdf.pdf2data.RecognitionResultHolder
-
Writes the held result into the specified file as JSON.
- writeToJson(OutputStream) - Method in class com.itextpdf.pdf2data.RecognitionResultHolder
-
Writes the held result into the specified output stream as JSON.
- writeToJson(OutputStream, SerializationProperties) - Method in class com.itextpdf.pdf2data.RecognitionResultHolder
-
Writes the held result into the specified output stream as JSON.
- writeToXml(File) - Method in class com.itextpdf.pdf2data.RecognitionResultHolder
-
Saves recognition results to provided xml file.
- writeToXml(File, SerializationProperties) - Method in class com.itextpdf.pdf2data.RecognitionResultHolder
-
Saves recognition results to provided xml file.
- writeToXml(OutputStream) - Method in class com.itextpdf.pdf2data.RecognitionResultHolder
-
Saves recognition results to output stream in xml form.
- writeToXml(OutputStream, SerializationProperties) - Method in class com.itextpdf.pdf2data.RecognitionResultHolder
-
Saves recognition results to output stream in xml form.