public final class PdfTextExtractor extends Object
| Modifier and Type | Method and Description |
|---|---|
| static String | `getTextFromPage(PdfReader reader, int pageNumber)` Extract text from a specified page using the default strategy. |
| static String | `getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy)` Extract text from a specified page using an extraction strategy. |
| static String | `getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy, Map<String,ContentOperator> additionalContentOperators)` Extract text from a specified page using an extraction strategy. |
public static String getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy, Map<String,ContentOperator> additionalContentOperators) throws IOException
Extract text from a specified page using an extraction strategy.
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
strategy - the strategy to use for extracting text
additionalContentOperators - an optional map of custom ContentOperators for rendering instructions
Throws:
IOException - if any operation fails while reading from the provided PdfReader
public static String getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy) throws IOException
Extract text from a specified page using an extraction strategy.
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
strategy - the strategy to use for extracting text
Throws:
IOException - if any operation fails while reading from the provided PdfReader
public static String getTextFromPage(PdfReader reader, int pageNumber) throws IOException
Extract text from a specified page using the default strategy.
Note: the default strategy is subject to change. If using a specific strategy is important, use getTextFromPage(PdfReader, int, TextExtractionStrategy).
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
Throws:
IOException - if any operation fails while reading from the provided PdfReader
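
Every method above works on a single page, so extracting a whole document means calling one of them once per page. The following is a minimal sketch of that loop, not part of this API. To keep it compilable without the library on the classpath, the per-page call is passed in through `PageTextSource`, a hypothetical interface invented for this example, and `AllPagesText` and `extractAll` are likewise illustrative names. It assumes 1-based page numbers, as PdfReader uses.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class AllPagesText {

    /** Hypothetical stand-in for a per-page extraction call; not a library type. */
    @FunctionalInterface
    public interface PageTextSource {
        String textOf(int pageNumber) throws IOException;
    }

    /**
     * Collects the text of pages 1..pageCount, in page order.
     * Assumes 1-based page numbers. Returns an empty list when pageCount <= 0.
     */
    public static List<String> extractAll(int pageCount, PageTextSource source) throws IOException {
        List<String> pages = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            // An IOException from any page propagates to the caller unchanged.
            pages.add(source.textOf(page));
        }
        return pages;
    }

    public static void main(String[] args) throws IOException {
        // Fake source so the sketch runs without a PDF file.
        List<String> pages = extractAll(3, page -> "text of page " + page);
        System.out.println(String.join("\n", pages));
    }
}
```

With a real reader, the source would presumably be a lambda such as `page -> PdfTextExtractor.getTextFromPage(reader, page, strategy)`, with the page count taken from `reader.getNumberOfPages()`. Passing the strategy explicitly follows the note above about the default strategy being subject to change.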
Copyright © 1998–2021. All rights reserved.