pdfOCR 3.0.2 API

Classes

Class | Description |
---|---|
AbstractTesseract4OcrEngine | The implementation of iText.Pdfocr.IOcrEngine. |
ImagePreprocessingOptions | Additional options applied during the image preprocessing step. |
ImagePreprocessingUtil | Utility class for working with images. |
LeptonicaImageRotationHandler | Leptonica-based implementation of iText.Pdfocr.IImageRotationHandler. |
Tesseract4EventHelper | Helper class for working with events. |
Tesseract4ExecutableOcrEngine | The implementation of AbstractTesseract4OcrEngine for tesseract OCR. |
Tesseract4FileResultEventHelper | Helper class for working with events. |
Tesseract4LibOcrEngine | The implementation of AbstractTesseract4OcrEngine for tesseract OCR. |
Tesseract4MetaInfo | |
Tesseract4OcrEngineProperties | Properties that will be used by the iText.Pdfocr.IOcrEngine. |
TesseractHelper | Helper class. |
TesseractOcrUtil | Utility class for working with the tesseract command-line tool and for image preprocessing using Net.Sourceforge.Lept4j.ILeptonica. |

Enumerations

Enumeration | Values | Description |
---|---|---|
OutputFormat | OutputFormat.HOCR, OutputFormat.TXT | Enumeration of the available output formats. |
TextPositioning | TextPositioning.BY_LINES, TextPositioning.BY_WORDS, TextPositioning.BY_WORDS_AND_LINES | Enumeration of the possible types of text positioning. |


enum OutputFormat

Enumeration of the available output formats. It is used when the selected Reader is able to process the input file and return the result in the required output format.

Enumerator | Description |
---|---|
HOCR | Reader will produce XHTML output compliant with the hOCR specification. Output will be parsed and represented as IList |
TXT | Reader will produce a plain text (TXT) file. |

enum TextPositioning

Enumeration of the possible types of text positioning. It is used when the selected Reader is able to process the text by lines or by words and return coordinates for the selected type of item. For tesseract, this value is meaningful only if the selected OutputFormat is OutputFormat.HOCR.
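
The relationship between the two enumerations can be illustrated with a small, self-contained sketch. It does not call the pdfOCR library: the two enums are local stand-ins that only mirror the documented enumerator names, and `PositioningRule`, `PositioningApplies`, and `Describe` are hypothetical helpers introduced here for illustration. The sketch assumes nothing beyond the rule stated above, namely that text positioning is relevant only for hOCR output.

```csharp
using System;

// Local stand-ins that mirror the documented enumerator names. In a real
// project these types come from the pdfOCR Tesseract 4 package instead.
public enum OutputFormat { HOCR, TXT }
public enum TextPositioning { BY_LINES, BY_WORDS, BY_WORDS_AND_LINES }

// Hypothetical helper class, not part of the pdfOCR API.
public static class PositioningRule
{
    // Reports whether the chosen TextPositioning can influence the result
    // for the given output format. hOCR output carries coordinates, while
    // TXT output is plain text, so positioning matters only for HOCR.
    public static bool PositioningApplies(OutputFormat format)
    {
        return format == OutputFormat.HOCR;
    }

    // Builds a short human-readable summary of the combination.
    public static string Describe(OutputFormat format, TextPositioning positioning)
    {
        return PositioningApplies(format)
            ? $"{format}: coordinates are returned, positioning {positioning}"
            : $"{format}: plain text only, positioning {positioning} has no effect";
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(PositioningRule.Describe(OutputFormat.HOCR, TextPositioning.BY_WORDS));
        Console.WriteLine(PositioningRule.Describe(OutputFormat.TXT, TextPositioning.BY_WORDS));
    }
}
```

In practice this suggests choosing the output format first and setting a text positioning value only when coordinates are actually needed.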