pdfOCR 3.0.2 API
iText.Pdfocr.Tesseract4.Tesseract4ExecutableOcrEngine Class Reference

The implementation of AbstractTesseract4OcrEngine for tesseract OCR.

Inheritance diagram for iText.Pdfocr.Tesseract4.Tesseract4ExecutableOcrEngine (image omitted); types shown:
iText.Pdfocr.Tesseract4.AbstractTesseract4OcrEngine, iText.Pdfocr.IOcrEngine, iText.Pdfocr.IProductAware

Public Member Functions

  Tesseract4ExecutableOcrEngine (Tesseract4OcrEngineProperties tesseract4OcrEngineProperties)
  Creates a new Tesseract4ExecutableOcrEngine instance.

  Tesseract4ExecutableOcrEngine (String executablePath, Tesseract4OcrEngineProperties tesseract4OcrEngineProperties)
  Creates a new Tesseract4ExecutableOcrEngine instance.

String  GetPathToExecutable ()
  Gets path to tesseract executable.

void  SetPathToExecutable (String path)
  Sets path to tesseract executable.
 
Public Member Functions inherited from iText.Pdfocr.Tesseract4.AbstractTesseract4OcrEngine
  AbstractTesseract4OcrEngine (Tesseract4OcrEngineProperties tesseract4OcrEngineProperties)
  Creates a new AbstractTesseract4OcrEngine instance using the provided Tesseract4OcrEngineProperties.

virtual void  DoTesseractOcr (FileInfo inputImage, FileInfo outputFile, OutputFormat outputFormat)
  Performs tesseract OCR for the first (or the only) image page.

virtual void  DoTesseractOcr (FileInfo inputImage, FileInfo outputFile, OutputFormat outputFormat, OcrProcessContext ocrProcessContext)
  Performs tesseract OCR for the first (or the only) image page.

virtual void  CreateTxtFile (IList< FileInfo > inputImages, FileInfo txtFile)
  Performs OCR using the provided iText.Pdfocr.IOcrEngine for the given list of input images and saves the output to a text file at the provided path.

virtual void  CreateTxtFile (IList< FileInfo > inputImages, FileInfo txtFile, OcrProcessContext ocrProcessContext)
  Performs OCR using the provided iText.Pdfocr.IOcrEngine for the given list of input images and saves the output to a text file at the provided path.

Tesseract4OcrEngineProperties  GetTesseract4OcrEngineProperties ()
  Gets properties for AbstractTesseract4OcrEngine.

void  SetTesseract4OcrEngineProperties (Tesseract4OcrEngineProperties tesseract4OcrEngineProperties)
  Sets properties for AbstractTesseract4OcrEngine.

String  GetLanguagesAsString ()
  Gets the list of languages concatenated with the "+" symbol into a string in the format required by tesseract.

IDictionary< int, IList< TextInfo > >  DoImageOcr (FileInfo input)
  Reads data from the provided input image file and returns the retrieved data in the format described below.

IDictionary< int, IList< TextInfo > >  DoImageOcr (FileInfo input, OcrProcessContext ocrProcessContext)
  Reads data from the provided input image file and returns the retrieved data in the format described below.

String  DoImageOcr (FileInfo input, OutputFormat outputFormat, OcrProcessContext ocrProcessContext)
  Reads data from the provided input image file and returns the retrieved data as a string.

String  DoImageOcr (FileInfo input, OutputFormat outputFormat)
  Reads data from the provided input image file and returns the retrieved data as a string.

virtual bool  IsWindows ()
  Checks the current OS type.

virtual String  IdentifyOsType ()
  Identifies the type of the current OS and returns it (win, linux).

virtual void  ValidateLanguages (IList< String > languagesList)
  Validates the list of provided languages and checks that they all exist in the given tessdata directory.

virtual PdfOcrMetaInfoContainer  GetMetaInfoContainer ()
  Gets the container with meta info.

virtual ProductData  GetProductData ()
  Gets object containing information about the product.
 

Detailed Description

The implementation of AbstractTesseract4OcrEngine for tesseract OCR. This class exposes the features of the "tesseract" command-line tool (an optical character recognition engine available for various operating systems). It assumes that "tesseract" is already installed locally.
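A minimal usage sketch may help tie the description to the members listed above. It uses the single-argument constructor and the DoImageOcr (FileInfo, OutputFormat) overload from this page. The parameterless Tesseract4OcrEngineProperties constructor, its SetPathToTessData method, the OutputFormat.TXT value, and both file paths are assumptions that this page does not document, so check them against the reference for those types.

```csharp
using System;
using System.IO;
using iText.Pdfocr.Tesseract4;

public static class ExecutableEngineSketch
{
    public static void Main(string[] args)
    {
        // Assumption: Tesseract4OcrEngineProperties has a parameterless
        // constructor and a SetPathToTessData(FileInfo) method.
        var properties = new Tesseract4OcrEngineProperties();
        properties.SetPathToTessData(new FileInfo("/path/to/tessdata")); // hypothetical path

        // No executable path is given, so "tesseract" must be resolvable
        // on the machine running this code.
        var engine = new Tesseract4ExecutableOcrEngine(properties);

        // Assumption: OutputFormat.TXT is the plain-text output format.
        string text = engine.DoImageOcr(new FileInfo("scan.png"), OutputFormat.TXT); // hypothetical image
        Console.WriteLine(text);
    }
}
```

The sketch only runs where the pdfOCR Tesseract4 package is referenced and a tesseract installation with matching language data is present.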

Constructor & Destructor Documentation

◆ Tesseract4ExecutableOcrEngine() [1/2]

iText.Pdfocr.Tesseract4.Tesseract4ExecutableOcrEngine.Tesseract4ExecutableOcrEngine (Tesseract4OcrEngineProperties tesseract4OcrEngineProperties)

Creates a new Tesseract4ExecutableOcrEngine instance.

Parameters
tesseract4OcrEngineProperties: set of properties

◆ Tesseract4ExecutableOcrEngine() [2/2]

iText.Pdfocr.Tesseract4.Tesseract4ExecutableOcrEngine.Tesseract4ExecutableOcrEngine (String executablePath, Tesseract4OcrEngineProperties tesseract4OcrEngineProperties)

Creates a new Tesseract4ExecutableOcrEngine instance.

Parameters
executablePath: path to tesseract executable
tesseract4OcrEngineProperties: set of properties
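When the executable is not discoverable by name, the two-argument constructor can be given its location directly. The sketch below illustrates that call and reads the value back with GetPathToExecutable. The install location is hypothetical, and the parameterless Tesseract4OcrEngineProperties constructor is an assumption not documented on this page.

```csharp
using System;
using iText.Pdfocr.Tesseract4;

public static class ExplicitPathSketch
{
    public static void Main(string[] args)
    {
        // Assumption: Tesseract4OcrEngineProperties has a parameterless constructor.
        var properties = new Tesseract4OcrEngineProperties();

        // Hypothetical install location; replace with the real one.
        var engine = new Tesseract4ExecutableOcrEngine(
            "/usr/local/bin/tesseract", properties);

        // Read back the path the engine will use to launch tesseract.
        Console.WriteLine(engine.GetPathToExecutable());
    }
}
```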

Member Function Documentation

◆ GetPathToExecutable()

String iText.Pdfocr.Tesseract4.Tesseract4ExecutableOcrEngine.GetPathToExecutable ()

Gets path to tesseract executable.

Returns
path to tesseract executable

◆ SetPathToExecutable()

void iText.Pdfocr.Tesseract4.Tesseract4ExecutableOcrEngine.SetPathToExecutable (String path)

Sets the path to the tesseract executable. By default, "tesseract" is assumed to be available on the "PATH".

Parameters
path: path to tesseract executable
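One possible way to use this setter is to override the default only when a deployment supplies an explicit location. The following sketch assumes a hypothetical environment variable named TESSERACT_EXECUTABLE. Neither that name nor the helper class is part of pdfOCR.

```csharp
using System;
using System.IO;
using iText.Pdfocr.Tesseract4;

public static class PathOverrideSketch
{
    // Hypothetical environment variable name, chosen for this example only.
    private const string EnvVar = "TESSERACT_EXECUTABLE";

    public static void Configure(Tesseract4ExecutableOcrEngine engine)
    {
        string candidate = Environment.GetEnvironmentVariable(EnvVar);

        // Override only when the variable points at an existing file;
        // otherwise keep whatever path the engine already holds.
        if (!string.IsNullOrEmpty(candidate) && File.Exists(candidate))
        {
            engine.SetPathToExecutable(candidate);
        }
    }
}
```

Leaving the engine untouched when the variable is unset preserves the documented default of resolving "tesseract" through the "PATH".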