pdfOCR 4.1.0 API
OcrPdfCreator is the class that creates PDF documents containing the input images and the text recognized in them by the provided IOcrEngine.
Public Member Functions

OcrPdfCreator (IOcrEngine ocrEngine)
Creates a new OcrPdfCreator instance.

OcrPdfCreator (IOcrEngine ocrEngine, OcrPdfCreatorProperties ocrPdfCreatorProperties)
Creates a new OcrPdfCreator instance.

OcrPdfCreatorProperties | GetOcrPdfCreatorProperties ()
Gets properties for OcrPdfCreator.

void | SetOcrPdfCreatorProperties (OcrPdfCreatorProperties ocrPdfCreatorProperties)
Sets properties for OcrPdfCreator.

PdfDocument | CreatePdfA (IList< FileInfo > inputImages, PdfWriter pdfWriter, DocumentProperties documentProperties, PdfOutputIntent pdfOutputIntent, IOcrProcessProperties ocrProcessProperties)
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter, iText.Kernel.Pdf.DocumentProperties and iText.Kernel.Pdf.PdfOutputIntent.

PdfDocument | CreatePdfA (IList< FileInfo > inputImages, PdfWriter pdfWriter, PdfOutputIntent pdfOutputIntent)
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter and iText.Kernel.Pdf.PdfOutputIntent.

PdfDocument | CreatePdfA (IList< FileInfo > inputImages, PdfWriter pdfWriter, DocumentProperties documentProperties, PdfOutputIntent pdfOutputIntent)
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter, iText.Kernel.Pdf.DocumentProperties and iText.Kernel.Pdf.PdfOutputIntent.

PdfDocument | CreatePdf (IList< FileInfo > inputImages, PdfWriter pdfWriter, DocumentProperties documentProperties, IOcrProcessProperties ocrProcessProperties)
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter.

PdfDocument | CreatePdf (IList< FileInfo > inputImages, PdfWriter pdfWriter, DocumentProperties documentProperties)
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter.

PdfDocument | CreatePdf (IList< FileInfo > inputImages, PdfWriter pdfWriter)
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter.

virtual void | CreatePdfFile (IList< FileInfo > inputImages, FileInfo outPdfFile)
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided System.IO.FileInfo.

virtual void | CreatePdfAFile (IList< FileInfo > inputImages, FileInfo outPdfFile, PdfOutputIntent pdfOutputIntent)
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided System.IO.FileInfo and iText.Kernel.Pdf.PdfOutputIntent.

IOcrEngine | GetOcrEngine ()
Gets the IOcrEngine reader object used to perform OCR.

void | SetOcrEngine (IOcrEngine reader)
Sets the IOcrEngine reader object used to perform OCR.

virtual void | MakePdfSearchable (FileInfo inputPdf, FileInfo outputPdf)
Performs OCR of all images in an input PDF file and generates a searchable PDF.

virtual void | MakePdfSearchable (FileInfo inputPdf, FileInfo outputPdf, IOcrProcessProperties ocrProcessProperties)
Performs OCR of all images in an input PDF file and generates a searchable PDF.

virtual void | MakePdfSearchable (PdfDocument pdfDoc)
Performs OCR of all images in an input PDF document and adds the recognized text on top of the images.

virtual void | MakePdfSearchable (PdfDocument pdfDoc, IOcrProcessProperties ocrProcessProperties)
Performs OCR of all images in an input PDF document and adds the recognized text on top of the images.

Package Functions

virtual void | ValidateInputPdfDocument (PdfDocument pdfDoc)
Validates the input PDF document.

OcrPdfCreator lets you set the list of input images to be used for OCR, the scaling mode for images, the color of the text in the output PDF document, and a fixed size for the PDF document's pages. It then performs OCR on the given images and returns an iText.Kernel.Pdf.PdfDocument as the result. OCR is performed by the provided IOcrEngine (e.g. a Tesseract reader). The engine is mandatory and must be provided either in the constructor or through the setter.
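As a rough illustration of the workflow described above, the following sketch builds an OcrPdfCreator from an engine supplied by the caller and writes the OCRed images straight to a file. It uses only the constructor and CreatePdfFile signatures listed on this page. The class name `OcrPdfCreatorSketch`, the method name `ImagesToPdf` and the `iText.Pdfocr` namespace import are assumptions made for the example, and obtaining a concrete IOcrEngine (for instance a Tesseract-based one) is left to the caller.

```csharp
using System.Collections.Generic;
using System.IO;
using iText.Pdfocr; // assumed namespace of OcrPdfCreator and IOcrEngine

// Hypothetical helper class, not part of pdfOCR.
public static class OcrPdfCreatorSketch
{
    // Runs OCR on the given images and writes the result to outPdfFile.
    public static void ImagesToPdf(IOcrEngine ocrEngine,
                                   IList<FileInfo> inputImages,
                                   FileInfo outPdfFile)
    {
        // The engine is mandatory; here it is passed through the constructor.
        OcrPdfCreator creator = new OcrPdfCreator(ocrEngine);

        // It could equally be replaced later through the setter:
        // creator.SetOcrEngine(anotherEngine);

        creator.CreatePdfFile(inputImages, outPdfFile);
    }
}
```

A caller would pass something like `new List<FileInfo> { new FileInfo("scan.png") }` as `inputImages`; the file name here is purely illustrative.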
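
OcrPdfCreator (IOcrEngine ocrEngine) | inline
Creates a new OcrPdfCreator instance.
ocrEngine | selected OCR Reader IOcrEngine

OcrPdfCreator (IOcrEngine ocrEngine, OcrPdfCreatorProperties ocrPdfCreatorProperties) | inline
Creates a new OcrPdfCreator instance.
ocrEngine | selected OCR Reader IOcrEngine
ocrPdfCreatorProperties | set of properties for OcrPdfCreator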

PdfDocument CreatePdf (IList< FileInfo > inputImages, PdfWriter pdfWriter) | inline
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter.
Note that executing this method produces a product event from both itextcore and pdfOcr. Therefore, use this method only if you need to work with the generated iText.Kernel.Pdf.PdfDocument. If you do not, use CreatePdfFile(IList< FileInfo >, FileInfo) instead.
inputImages | System.Collections.Generic.IList< FileInfo > of input images
pdfWriter | the iText.Kernel.Pdf.PdfWriter object to write the final PDF document to

PdfDocument CreatePdf (IList< FileInfo > inputImages, PdfWriter pdfWriter, DocumentProperties documentProperties) | inline
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter.
Note that executing this method produces a product event from both itextcore and pdfOcr. Therefore, use this method only if you need to work with the generated iText.Kernel.Pdf.PdfDocument. If you do not, use CreatePdfFile(IList< FileInfo >, FileInfo) instead.
inputImages | System.Collections.Generic.IList< FileInfo > of input images
pdfWriter | the iText.Kernel.Pdf.PdfWriter object to write the final PDF document to
documentProperties | document properties

PdfDocument CreatePdf (IList< FileInfo > inputImages, PdfWriter pdfWriter, DocumentProperties documentProperties, IOcrProcessProperties ocrProcessProperties) | inline
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter.
Note that executing this method produces a product event from both itextcore and pdfOcr. Therefore, use this method only if you need to work with the generated iText.Kernel.Pdf.PdfDocument. If you do not, use CreatePdfFile(IList< FileInfo >, FileInfo) instead.
inputImages | System.Collections.Generic.IList< FileInfo > of input images
pdfWriter | the iText.Kernel.Pdf.PdfWriter object to write the final PDF document to
documentProperties | document properties
ocrProcessProperties | extra OCR process properties passed to OcrProcessContext
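The CreatePdf overloads are meant for callers that want to keep working with the resulting PdfDocument. The sketch below, offered as one possible usage rather than a prescribed pattern, calls the two-argument overload, reads the page count and then closes the document. The class and method names are hypothetical, the `iText.Pdfocr` namespace is an assumption, and the example assumes the returned document is still open and must be closed by the caller.

```csharp
using System.Collections.Generic;
using System.IO;
using iText.Kernel.Pdf;
using iText.Pdfocr; // assumed namespace of OcrPdfCreator and IOcrEngine

// Hypothetical helper class, not part of pdfOCR.
public static class CreatePdfSketch
{
    // Runs OCR, writes the PDF to outputPath and returns its page count.
    public static int OcrAndCountPages(IOcrEngine ocrEngine,
                                       IList<FileInfo> inputImages,
                                       string outputPath)
    {
        OcrPdfCreator creator = new OcrPdfCreator(ocrEngine);

        // The writer decides where the final PDF document is written.
        PdfDocument pdfDocument =
            creator.CreatePdf(inputImages, new PdfWriter(outputPath));
        try
        {
            // Work with the generated document before it is closed.
            return pdfDocument.GetNumberOfPages();
        }
        finally
        {
            // Assumption: the caller is responsible for closing the document.
            pdfDocument.Close();
        }
    }
}
```

If the PdfDocument itself is not needed, CreatePdfFile avoids both the extra product event and the explicit close.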

PdfDocument CreatePdfA (IList< FileInfo > inputImages, PdfWriter pdfWriter, DocumentProperties documentProperties, PdfOutputIntent pdfOutputIntent) | inline
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter, iText.Kernel.Pdf.DocumentProperties and iText.Kernel.Pdf.PdfOutputIntent. A PDF/A-3u document is created if the provided iText.Kernel.Pdf.PdfOutputIntent is not null.
Note that executing this method produces a product event from both itextcore and pdfOcr. Therefore, use this method only if you need to work with the generated iText.Kernel.Pdf.PdfDocument. If you do not, use CreatePdfAFile(IList< FileInfo >, FileInfo, PdfOutputIntent) instead.
inputImages | System.Collections.Generic.IList< FileInfo > of input images
pdfWriter | the iText.Kernel.Pdf.PdfWriter object to write the final PDF document to
documentProperties | document properties
pdfOutputIntent | iText.Kernel.Pdf.PdfOutputIntent for the PDF/A-3u document

PdfDocument CreatePdfA (IList< FileInfo > inputImages, PdfWriter pdfWriter, DocumentProperties documentProperties, PdfOutputIntent pdfOutputIntent, IOcrProcessProperties ocrProcessProperties) | inline
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter, iText.Kernel.Pdf.DocumentProperties and iText.Kernel.Pdf.PdfOutputIntent. A PDF/A-3u document is created if the provided iText.Kernel.Pdf.PdfOutputIntent is not null.
Note that executing this method produces a product event from both itextcore and pdfOcr. Therefore, use this method only if you need to work with the generated iText.Kernel.Pdf.PdfDocument. If you do not, use CreatePdfAFile(IList< FileInfo >, FileInfo, PdfOutputIntent) instead.
inputImages | System.Collections.Generic.IList< FileInfo > of input images
pdfWriter | the iText.Kernel.Pdf.PdfWriter object to write the final PDF document to
documentProperties | document properties
pdfOutputIntent | iText.Kernel.Pdf.PdfOutputIntent for the PDF/A-3u document
ocrProcessProperties | extra OCR process properties passed to OcrProcessContext

PdfDocument CreatePdfA (IList< FileInfo > inputImages, PdfWriter pdfWriter, PdfOutputIntent pdfOutputIntent) | inline
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided iText.Kernel.Pdf.PdfWriter and iText.Kernel.Pdf.PdfOutputIntent. A PDF/A-3u document is created if the provided iText.Kernel.Pdf.PdfOutputIntent is not null.
Note that executing this method produces a product event from both itextcore and pdfOcr. Therefore, use this method only if you need to work with the generated iText.Kernel.Pdf.PdfDocument. If you do not, use CreatePdfAFile(IList< FileInfo >, FileInfo, PdfOutputIntent) instead.
inputImages | System.Collections.Generic.IList< FileInfo > of input images
pdfWriter | the iText.Kernel.Pdf.PdfWriter object to write the final PDF document to
pdfOutputIntent | iText.Kernel.Pdf.PdfOutputIntent for the PDF/A-3u document

virtual void CreatePdfAFile (IList< FileInfo > inputImages, FileInfo outPdfFile, PdfOutputIntent pdfOutputIntent) | inline, virtual
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided System.IO.FileInfo and iText.Kernel.Pdf.PdfOutputIntent. A PDF/A-3u document is created if the provided iText.Kernel.Pdf.PdfOutputIntent is not null.
inputImages | System.Collections.Generic.IList< FileInfo > of input images
outPdfFile | the System.IO.FileInfo object to write the final PDF document to
pdfOutputIntent | iText.Kernel.Pdf.PdfOutputIntent for the PDF/A-3u document
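
virtual void CreatePdfFile (IList< FileInfo > inputImages, FileInfo outPdfFile) | inline, virtual
Performs OCR with the configured parameters using the provided IOcrEngine and creates a PDF using the provided System.IO.FileInfo.
inputImages | System.Collections.Generic.IList< FileInfo > of input images
outPdfFile | the System.IO.FileInfo object to write the final PDF document to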

IOcrEngine GetOcrEngine () | inline
Gets the IOcrEngine reader object used to perform OCR.

OcrPdfCreatorProperties GetOcrPdfCreatorProperties () | inline
Gets properties for OcrPdfCreator.

virtual void MakePdfSearchable (FileInfo inputPdf, FileInfo outputPdf) | inline, virtual
Performs OCR of all images in an input PDF file and generates a searchable PDF.
By default, this method rejects PDF/A documents and tagged documents, because the resulting document might not comply with the PDF/A specification and the added content might not be tagged, depending on the IOcrEngine implementation. To change this behavior, override ValidateInputPdfDocument(iText.Kernel.Pdf.PdfDocument) with an empty implementation.
Note that OcrPdfCreatorProperties.SetPageSize(iText.Kernel.Geom.Rectangle), OcrPdfCreatorProperties.SetScaleMode(ScaleMode) and OcrPdfCreatorProperties.SetImageLayerName(System.String) have no effect on this method.
inputPdf | PDF file to OCR
outputPdf | searchable PDF with the recognized text on top of the images

virtual void MakePdfSearchable (FileInfo inputPdf, FileInfo outputPdf, IOcrProcessProperties ocrProcessProperties) | inline, virtual
Performs OCR of all images in an input PDF file and generates a searchable PDF.
By default, this method rejects PDF/A documents and tagged documents, because the resulting document might not comply with the PDF/A specification and the added content might not be tagged, depending on the IOcrEngine implementation. To change this behavior, override ValidateInputPdfDocument(iText.Kernel.Pdf.PdfDocument) with an empty implementation.
Note that OcrPdfCreatorProperties.SetPageSize(iText.Kernel.Geom.Rectangle), OcrPdfCreatorProperties.SetScaleMode(ScaleMode) and OcrPdfCreatorProperties.SetImageLayerName(System.String) have no effect on this method.
inputPdf | PDF file to OCR
outputPdf | searchable PDF with the recognized text on top of the images
ocrProcessProperties | extra OCR process properties passed to OcrProcessContext
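For the file-based overloads, a minimal sketch might look like the following. It relies only on the MakePdfSearchable(FileInfo, FileInfo) signature documented here. The class and method names and the `iText.Pdfocr` namespace import are assumptions, and the caller supplies the IOcrEngine. With the default validation, a tagged or PDF/A input would be rejected.

```csharp
using System.IO;
using iText.Pdfocr; // assumed namespace of OcrPdfCreator and IOcrEngine

// Hypothetical helper class, not part of pdfOCR.
public static class MakeSearchableSketch
{
    // OCRs every image found in the input PDF and writes a searchable copy.
    public static void MakeSearchable(IOcrEngine ocrEngine,
                                      string inputPath,
                                      string outputPath)
    {
        OcrPdfCreator creator = new OcrPdfCreator(ocrEngine);

        // Page size, scale mode and image layer name settings are ignored here.
        creator.MakePdfSearchable(new FileInfo(inputPath),
                                  new FileInfo(outputPath));
    }
}
```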

virtual void MakePdfSearchable (PdfDocument pdfDoc) | inline, virtual
Performs OCR of all images in an input PDF document and adds the recognized text on top of the images.
By default, this method rejects PDF/A documents and tagged documents, because the resulting document might not comply with the PDF/A specification and the added content might not be tagged, depending on the IOcrEngine implementation. To change this behavior, override ValidateInputPdfDocument(iText.Kernel.Pdf.PdfDocument) with an empty implementation.
Note that OcrPdfCreatorProperties.SetPageSize(iText.Kernel.Geom.Rectangle), OcrPdfCreatorProperties.SetScaleMode(ScaleMode) and OcrPdfCreatorProperties.SetImageLayerName(System.String) have no effect on this method.
pdfDoc | PDF document with images to OCR

virtual void MakePdfSearchable (PdfDocument pdfDoc, IOcrProcessProperties ocrProcessProperties) | inline, virtual
Performs OCR of all images in an input PDF document and adds the recognized text on top of the images.
By default, this method rejects PDF/A documents and tagged documents, because the resulting document might not comply with the PDF/A specification and the added content might not be tagged, depending on the IOcrEngine implementation. To change this behavior, override ValidateInputPdfDocument(iText.Kernel.Pdf.PdfDocument) with an empty implementation.
Note that OcrPdfCreatorProperties.SetPageSize(iText.Kernel.Geom.Rectangle), OcrPdfCreatorProperties.SetScaleMode(ScaleMode) and OcrPdfCreatorProperties.SetImageLayerName(System.String) have no effect on this method.
pdfDoc | PDF document with images to OCR
ocrProcessProperties | extra OCR process properties passed to OcrProcessContext

void SetOcrEngine (IOcrEngine reader) | inline
Sets the IOcrEngine reader object used to perform OCR.
reader | selected IOcrEngine instance

void SetOcrPdfCreatorProperties (OcrPdfCreatorProperties ocrPdfCreatorProperties) | inline
Sets properties for OcrPdfCreator.
ocrPdfCreatorProperties | set of properties (OcrPdfCreatorProperties) for OcrPdfCreator

virtual void ValidateInputPdfDocument (PdfDocument pdfDoc) | inline, package, virtual
Validates the input PDF document.
It checks that the input document is neither tagged nor PDF/A. If you need to OCR tagged and/or PDF/A documents, override this method with an empty implementation. In that case, prefer the MakePdfSearchable(iText.Kernel.Pdf.PdfDocument, IOcrProcessProperties) overload, because it lets you pass an iText.Pdfa.PdfADocument or PdfUADocument instance, which validates the output document.
pdfDoc | a PDF document to check
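
The override described above could be sketched as a small subclass. This is an illustration under assumptions, not a confirmed recipe: the class name `PermissiveOcrPdfCreator` is hypothetical, the `iText.Pdfocr` namespace is assumed, and the `protected override` modifier assumes that the method is declared `protected internal virtual` in the C# library, so that a subclass in another assembly can override it. If the actual accessibility differs, the modifier would need to be adjusted.

```csharp
using iText.Kernel.Pdf;
using iText.Pdfocr; // assumed namespace of OcrPdfCreator and IOcrEngine

// Hypothetical subclass that skips the tagged / PDF/A input check.
public class PermissiveOcrPdfCreator : OcrPdfCreator
{
    public PermissiveOcrPdfCreator(IOcrEngine ocrEngine)
        : base(ocrEngine)
    {
    }

    // Assumption: the base method is 'protected internal virtual'.
    protected override void ValidateInputPdfDocument(PdfDocument pdfDoc)
    {
        // Intentionally empty: tagged and PDF/A inputs are accepted.
        // Conformance of the output is then the caller's responsibility.
    }
}
```

Because the check is disabled, the result is only as conformant as the document instance passed in, which is why the PdfDocument-based overload is the suggested entry point for such inputs.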