Class OcrPdfCreator

java.lang.Object
com.itextpdf.pdfocr.OcrPdfCreator

public class OcrPdfCreator extends Object
OcrPdfCreator creates PDF documents that contain the input images together with the text recognized in them by the provided IOcrEngine. It lets you set the list of input images to be used for OCR, the scaling mode for images, the color of the text in the output PDF document and a fixed size for the PDF document's pages, and then perform OCR on the given images and return the result as a PdfDocument. OCR is performed by the provided IOcrEngine (e.g. a Tesseract reader). The engine is mandatory and must be provided either in the constructor or through the setter.
  • Constructor Details

  • Method Details

    • getOcrPdfCreatorProperties

      public final OcrPdfCreatorProperties getOcrPdfCreatorProperties()
      Gets properties for OcrPdfCreator.
      Returns:
      the OcrPdfCreatorProperties that are currently set
    • setOcrPdfCreatorProperties

      public final void setOcrPdfCreatorProperties (OcrPdfCreatorProperties ocrPdfCreatorProperties)
      Sets properties for OcrPdfCreator.
      Parameters:
      ocrPdfCreatorProperties - the OcrPdfCreatorProperties to use for this OcrPdfCreator
    • createPdfA

      public final com.itextpdf.kernel.pdf.PdfDocument createPdfA (List<File> inputImages, com.itextpdf.kernel.pdf.PdfWriter pdfWriter, com.itextpdf.kernel.pdf.DocumentProperties documentProperties, com.itextpdf.kernel.pdf.PdfOutputIntent pdfOutputIntent, IOcrProcessProperties ocrProcessProperties) throws PdfOcrException
      Performs OCR with the set parameters using the provided IOcrEngine and creates a PDF using the provided PdfWriter, DocumentProperties and PdfOutputIntent. A PDF/A-3u document is created if the provided PdfOutputIntent is not null.

      NOTE: after this method runs, a product event is dispatched from both itextcore and pdfOcr. Use this method only if you need to work with the generated PdfDocument. Otherwise, use the createPdfAFile(java.util.List, java.io.File, com.itextpdf.kernel.pdf.PdfOutputIntent) method, which dispatches only the pdfOcr event.

      Parameters:
      inputImages - List of images to be OCRed
      pdfWriter - the PdfWriter object to write final PDF document to
      documentProperties - document properties
      pdfOutputIntent - PdfOutputIntent for PDF/A-3u document
      ocrProcessProperties - extra OCR process properties passed to OcrProcessContext
      Returns:
      result PDF/A-3u PdfDocument object
      Throws:
      PdfOcrException - if it was not possible to read provided or default font
    • createPdfA

      public final com.itextpdf.kernel.pdf.PdfDocument createPdfA (List<File> inputImages, com.itextpdf.kernel.pdf.PdfWriter pdfWriter, com.itextpdf.kernel.pdf.PdfOutputIntent pdfOutputIntent) throws PdfOcrException
      Performs OCR with the set parameters using the provided IOcrEngine and creates a PDF using the provided PdfWriter and PdfOutputIntent. A PDF/A-3u document is created if the provided PdfOutputIntent is not null.

      NOTE: after this method runs, a product event is dispatched from both itextcore and pdfOcr. Use this method only if you need to work with the generated PdfDocument. Otherwise, use the createPdfAFile(java.util.List, java.io.File, com.itextpdf.kernel.pdf.PdfOutputIntent) method, which dispatches only the pdfOcr event.

      Parameters:
      inputImages - List of images to be OCRed
      pdfWriter - the PdfWriter object to write final PDF document to
      pdfOutputIntent - PdfOutputIntent for PDF/A-3u document
      Returns:
      result PDF/A-3u PdfDocument object
      Throws:
      PdfOcrException - if it was not possible to read provided or default font
    • createPdfA

      public final com.itextpdf.kernel.pdf.PdfDocument createPdfA (List<File> inputImages, com.itextpdf.kernel.pdf.PdfWriter pdfWriter, com.itextpdf.kernel.pdf.DocumentProperties documentProperties, com.itextpdf.kernel.pdf.PdfOutputIntent pdfOutputIntent) throws PdfOcrException
      Performs OCR with the set parameters using the provided IOcrEngine and creates a PDF using the provided PdfWriter, DocumentProperties and PdfOutputIntent. A PDF/A-3u document is created if the provided PdfOutputIntent is not null.

      NOTE: after this method runs, a product event is dispatched from both itextcore and pdfOcr. Use this method only if you need to work with the generated PdfDocument. Otherwise, use the createPdfAFile(java.util.List, java.io.File, com.itextpdf.kernel.pdf.PdfOutputIntent) method, which dispatches only the pdfOcr event.

      Parameters:
      inputImages - List of images to be OCRed
      pdfWriter - the PdfWriter object to write final PDF document to
      documentProperties - document properties
      pdfOutputIntent - PdfOutputIntent for PDF/A-3u document
      Returns:
      result PDF/A-3u PdfDocument object
      Throws:
      PdfOcrException - if it was not possible to read provided or default font
    • createPdf

      public final com.itextpdf.kernel.pdf.PdfDocument createPdf (List<File> inputImages, com.itextpdf.kernel.pdf.PdfWriter pdfWriter, com.itextpdf.kernel.pdf.DocumentProperties documentProperties, IOcrProcessProperties ocrProcessProperties) throws PdfOcrException
      Performs OCR with the set parameters using the provided IOcrEngine and creates a PDF using the provided PdfWriter.

      NOTE: after this method runs, a product event is dispatched from both itextcore and pdfOcr. Use this method only if you need to work with the generated PdfDocument. Otherwise, use the createPdfFile(java.util.List, java.io.File) method, which dispatches only the pdfOcr event.

      Parameters:
      inputImages - List of images to be OCRed
      pdfWriter - the PdfWriter object to write final PDF document to
      documentProperties - document properties
      ocrProcessProperties - extra OCR process properties passed to OcrProcessContext
      Returns:
      result PdfDocument object
      Throws:
      PdfOcrException - if provided font is incorrect
    • createPdf

      public final com.itextpdf.kernel.pdf.PdfDocument createPdf (List<File> inputImages, com.itextpdf.kernel.pdf.PdfWriter pdfWriter, com.itextpdf.kernel.pdf.DocumentProperties documentProperties) throws PdfOcrException
      Performs OCR with the set parameters using the provided IOcrEngine and creates a PDF using the provided PdfWriter.

      NOTE: after this method runs, a product event is dispatched from both itextcore and pdfOcr. Use this method only if you need to work with the generated PdfDocument. Otherwise, use the createPdfFile(java.util.List, java.io.File) method, which dispatches only the pdfOcr event.

      Parameters:
      inputImages - List of images to be OCRed
      pdfWriter - the PdfWriter object to write final PDF document to
      documentProperties - document properties
      Returns:
      result PdfDocument object
      Throws:
      PdfOcrException - if provided font is incorrect
    • createPdf

      public final com.itextpdf.kernel.pdf.PdfDocument createPdf (List<File> inputImages, com.itextpdf.kernel.pdf.PdfWriter pdfWriter) throws PdfOcrException
      Performs OCR with the set parameters using the provided IOcrEngine and creates a PDF using the provided PdfWriter.

      NOTE: after this method runs, a product event is dispatched from both itextcore and pdfOcr. Use this method only if you need to work with the generated PdfDocument. Otherwise, use the createPdfFile(java.util.List, java.io.File) method, which dispatches only the pdfOcr event.

      Parameters:
      inputImages - List of images to be OCRed
      pdfWriter - the PdfWriter object to write final PDF document to
      Returns:
      result PdfDocument object
      Throws:
      PdfOcrException - if provided font is incorrect
    • createPdfFile

      public void createPdfFile (List<File> inputImages, File outPdfFile) throws PdfOcrException, IOException
      Performs OCR with the set parameters using the provided IOcrEngine and writes the resulting PDF to the provided File.
      Parameters:
      inputImages - List of images to be OCRed
      outPdfFile - the File object to write final PDF document to
      Throws:
      IOException - signals that an I/O exception of some sort has occurred
      PdfOcrException - if it was not possible to read provided or default font
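      The inputImages argument is a plain List<File>. The sketch below shows one way to build that list from a directory using only the JDK. InputImages, collect and the extension filter are hypothetical names and choices, not part of pdfOcr. Sorting by file name is an assumption made to keep the list deterministic, since Files.list does not guarantee any order. That the pages of the output follow the order of the list is also an assumption that is not stated above.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

/** Hypothetical helper, not part of pdfOcr: builds the inputImages list. */
final class InputImages {
    // Assumed filter; adjust it to the formats your IOcrEngine accepts.
    private static final List<String> EXTENSIONS =
            Arrays.asList(".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp");

    private InputImages() {
    }

    /** Returns the image files located directly in dir, sorted by path. */
    static List<File> collect(Path dir) throws IOException {
        List<File> images = new ArrayList<>();
        // Files.list returns entries in no particular order, so sort explicitly.
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(Files::isRegularFile)
                    .filter(InputImages::hasImageExtension)
                    .sorted()
                    .forEach(p -> images.add(p.toFile()));
        }
        return images;
    }

    private static boolean hasImageExtension(Path p) {
        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
        return EXTENSIONS.stream().anyMatch(name::endsWith);
    }
}

// Possible use with the method documented above (requires pdfOcr on the classpath):
//   List<File> images = InputImages.collect(Paths.get("scans"));
//   ocrPdfCreator.createPdfFile(images, new File("scans.pdf"));
```

      The helper only lists files. Whether a given file can actually be read is decided by the IOcrEngine in use.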
    • createPdfAFile

      public void createPdfAFile (List<File> inputImages, File outPdfFile, com.itextpdf.kernel.pdf.PdfOutputIntent pdfOutputIntent) throws PdfOcrException, IOException
      Performs OCR with the set parameters using the provided IOcrEngine and writes the resulting PDF to the provided File using the provided PdfOutputIntent. A PDF/A-3u document is created if the provided PdfOutputIntent is not null.
      Parameters:
      inputImages - List of images to be OCRed
      outPdfFile - the File object to write final PDF document to
      pdfOutputIntent - PdfOutputIntent for PDF/A-3u document
      Throws:
      IOException - signals that an I/O exception of some sort has occurred
      PdfOcrException - if it was not possible to read provided or default font
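      Both file-based methods write to outPdfFile and declare IOException. The sketch below prepares the target before the call, on the assumption (not stated above) that the methods do not create missing parent directories themselves. OutputTarget and prepare are hypothetical names, not part of pdfOcr.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper, not part of pdfOcr: prepares the outPdfFile target. */
final class OutputTarget {
    private OutputTarget() {
    }

    /** Creates missing parent directories and returns the target as a File. */
    static File prepare(Path target) throws IOException {
        Path absolute = target.toAbsolutePath();
        Path parent = absolute.getParent();
        if (parent != null) {
            // No-op when the directories already exist.
            Files.createDirectories(parent);
        }
        if (Files.isDirectory(absolute)) {
            throw new IOException("Output path is a directory: " + absolute);
        }
        // The file itself is not created here; the PDF writer does that.
        return absolute.toFile();
    }
}

// Possible use with the methods documented above (requires pdfOcr on the classpath):
//   File out = OutputTarget.prepare(Paths.get("build", "ocr", "result.pdf"));
//   ocrPdfCreator.createPdfAFile(images, out, pdfOutputIntent);
```

      Failing early on an unusable path avoids running OCR on every image only to hit an I/O error when the document is written.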
    • getOcrEngine

      public final IOcrEngine getOcrEngine()
      Gets the IOcrEngine reader object used to perform OCR.
      Returns:
      selected IOcrEngine instance
    • setOcrEngine

      public final void setOcrEngine (IOcrEngine reader)
      Sets the IOcrEngine reader object used to perform OCR.
      Parameters:
      reader - selected IOcrEngine instance