Pdf2Data 5.0.1 API
Pdf2DataExtractor is a class for extracting data from files.
Public Member Functions
virtual iText.Pdf2Data.Template.Template | GetTemplate () | Gets the current template instance.
virtual IOcrEngine | GetOcrEngine () | Gets the current OCR engine instance.
virtual RecognitionResultHolder | Extract (FileInfo targetPDF) | Recognizes the PDF file.
virtual RecognitionResultHolder | Extract (FileInfo targetFile, RecognitionProperties properties) | Recognizes the file.
virtual RecognitionResultHolder | Extract (Stream targetInputStream) | Recognizes the PDF file.
virtual RecognitionResultHolder | Extract (Stream targetInputStream, RecognitionProperties properties) | Recognizes the file.
virtual IDictionary< String, int?> | Check (FileInfo targetPDF) | Recognizes the PDF file and returns the number of recognition results.
virtual IDictionary< String, int?> | Check (FileInfo targetFile, RecognitionProperties properties) | Recognizes the document and returns the number of recognition results.
virtual IDictionary< String, int?> | Check (Stream targetInputStream) | Recognizes the PDF file and returns the number of recognition results.
virtual IDictionary< String, int?> | Check (Stream targetInputStream, RecognitionProperties properties) | Recognizes the document and returns the number of recognition results.
Static Public Member Functions
static iText.Pdf2Data.Pdf2DataExtractor | Create (FileInfo p2dFile) | Creates an instance of Pdf2DataExtractor from a pdf2data template file.
static iText.Pdf2Data.Pdf2DataExtractor | Create (FileInfo p2dFile, OcrWithPostProcessingEngine ocrEngine) | Creates an instance of Pdf2DataExtractor from a pdf2data template file with the provided OCR engine.
static iText.Pdf2Data.Pdf2DataExtractor | CreateFromTemplateContentJson (Stream templateContentJsonStream) | Creates an instance of Pdf2DataExtractor from a stream that contains pdf2data template content in JSON format.
static iText.Pdf2Data.Pdf2DataExtractor | CreateFromTemplateContentJson (Stream templateContentJsonStream, OcrWithPostProcessingEngine ocrEngine) | Creates an instance of Pdf2DataExtractor from a stream that contains pdf2data template content in JSON format.
To create an instance of Pdf2DataExtractor for extracting data from a PDF file, use Create(System.IO.FileInfo).
To create an instance of Pdf2DataExtractor for extracting data from an image, use Create(System.IO.FileInfo, OcrWithPostProcessingEngine).
To extract data from a PDF file, use the Extract(System.IO.FileInfo) method.
To extract data from an image, use the Extract(System.IO.FileInfo, RecognitionProperties) method, with the file type specified via the RecognitionProperties instance.
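As a rough illustration of the PDF path described above, the following sketch creates an extractor from a template and runs it on one file. It assumes the Pdf2Data assembly is referenced and that any licensing your installation requires is already configured. The file names `template.p2d` and `invoice.pdf` are placeholders, not paths taken from this reference. The image path is not shown because the members of RecognitionProperties are not documented on this page.

```csharp
using System;
using System.IO;
using iText.Pdf2Data;

public static class ExtractSketch
{
    public static void Main()
    {
        // Placeholder paths: substitute your own processed template and input PDF.
        FileInfo template = new FileInfo("template.p2d");
        FileInfo targetPdf = new FileInfo("invoice.pdf");

        // Create(FileInfo) is the factory this page lists for PDF extraction.
        Pdf2DataExtractor extractor = Pdf2DataExtractor.Create(template);

        // Extract(FileInfo) returns a RecognitionResultHolder; 'var' is used so the
        // sketch does not have to assume which namespace that type lives in.
        var result = extractor.Extract(targetPdf);

        Console.WriteLine(result != null ? "Extraction finished." : "No result returned.");
    }
}
```

How the returned RecognitionResultHolder is inspected depends on its own API, which this page does not cover.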
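virtual IDictionary< String, int?> Check (FileInfo targetFile, RecognitionProperties properties) | inline virtual
Recognizes the document and returns the number of recognition results.
Parameters:
targetFile | file for recognition
properties | a RecognitionProperties instance

virtual IDictionary< String, int?> Check (FileInfo targetPDF) | inline virtual
Recognizes the PDF file and returns the number of recognition results.
Parameters:
targetPDF | PDF file for recognition

virtual IDictionary< String, int?> Check (Stream targetInputStream) | inline virtual
Recognizes the PDF file and returns the number of recognition results.
Parameters:
targetInputStream | input stream of the PDF file for recognition

virtual IDictionary< String, int?> Check (Stream targetInputStream, RecognitionProperties properties) | inline virtual
Recognizes the document and returns the number of recognition results.
Parameters:
targetInputStream | input stream of the file for recognition
properties | a RecognitionProperties instance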
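Because Check returns an IDictionary< String, int?>, a caller typically has to handle both missing (null) and zero counts. The sketch below is one possible way to do that, assuming the keys identify items defined in the template and the values are result counts, which this reference does not spell out. The helper name `KeysWithoutResults` and the sample keys are hypothetical, and the dictionary is built by hand so the example runs without the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CheckResultSummary
{
    // Returns the keys whose count is null or zero, sorted for stable output.
    public static List<string> KeysWithoutResults(IDictionary<string, int?> counts)
    {
        return counts
            .Where(entry => entry.Value.GetValueOrDefault() == 0) // null is treated as 0
            .Select(entry => entry.Key)
            .OrderBy(key => key, StringComparer.Ordinal)
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample standing in for the dictionary returned by Check(...).
        var counts = new Dictionary<string, int?>
        {
            { "invoiceNumber", 1 },
            { "totalAmount", 0 },
            { "dueDate", null }
        };

        foreach (string key in KeysWithoutResults(counts))
        {
            Console.WriteLine(key);
        }
    }
}
```

In real use, the hand-built dictionary would be replaced by the value returned from one of the Check overloads.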
static iText.Pdf2Data.Pdf2DataExtractor Create (FileInfo p2dFile) | inline static
Creates an instance of Pdf2DataExtractor from a pdf2data template file. Note that the template should be processed.
Parameters:
p2dFile | pdf2data template archive

static iText.Pdf2Data.Pdf2DataExtractor Create (FileInfo p2dFile, OcrWithPostProcessingEngine ocrEngine) | inline static
Creates an instance of Pdf2DataExtractor from a pdf2data template file with the provided OCR engine. Note that the template should be processed.
Parameters:
p2dFile | pdf2data template archive
ocrEngine | OCR engine to use for recognitions that involve OCR. May be null if no recognition involves OCR.

static iText.Pdf2Data.Pdf2DataExtractor CreateFromTemplateContentJson (Stream templateContentJsonStream) | inline static
Creates an instance of Pdf2DataExtractor from a stream that contains pdf2data template content in JSON format. Note that the template should be processed.
Parameters:
templateContentJsonStream | processed template content stream

static iText.Pdf2Data.Pdf2DataExtractor CreateFromTemplateContentJson (Stream templateContentJsonStream, OcrWithPostProcessingEngine ocrEngine) | inline static
Creates an instance of Pdf2DataExtractor from a stream that contains pdf2data template content in JSON format. Note that the template should be processed.
Parameters:
templateContentJsonStream | processed template content stream
ocrEngine | OCR engine to use for recognitions that involve OCR. May be null if no recognition involves OCR.
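virtual RecognitionResultHolder Extract (FileInfo targetFile, RecognitionProperties properties) | inline virtual
Recognizes the file.
Parameters:
targetFile | file for recognition
properties | a RecognitionProperties instance
Returns: a RecognitionResultHolder instance

virtual RecognitionResultHolder Extract (FileInfo targetPDF) | inline virtual
Recognizes the PDF file.
Parameters:
targetPDF | PDF file for recognition
Returns: a RecognitionResultHolder instance

virtual RecognitionResultHolder Extract (Stream targetInputStream) | inline virtual
Recognizes the PDF file.
Parameters:
targetInputStream | input stream of the PDF file for recognition
Returns: a RecognitionResultHolder instance

virtual RecognitionResultHolder Extract (Stream targetInputStream, RecognitionProperties properties) | inline virtual
Recognizes the file.
Parameters:
targetInputStream | input stream of the file for recognition
properties | a RecognitionProperties instance
Returns: a RecognitionResultHolder instance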
virtual IOcrEngine GetOcrEngine () | inline virtual
Gets the current OCR engine instance.

virtual iText.Pdf2Data.Template.Template GetTemplate () | inline virtual
Gets the current template instance.