public final class PdfTextExtractor extends Object
Modifier and Type | Method and Description |
---|---|
static String | getTextFromPage(PdfReader reader, int pageNumber): Extract text from a specified page using the default strategy. |
static String | getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy): Extract text from a specified page using an extraction strategy. |
static String | getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy, Map<String,ContentOperator> additionalContentOperators): Extract text from a specified page using an extraction strategy. |
public static String getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy, Map<String,ContentOperator> additionalContentOperators) throws IOException
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
strategy - the strategy to use for extracting text
additionalContentOperators - an optional map of custom ContentOperators for rendering instructions
Throws:
IOException - if any operation fails while reading from the provided PdfReader
public static String getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy) throws IOException
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
strategy - the strategy to use for extracting text
Throws:
IOException - if any operation fails while reading from the provided PdfReader
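As a rough illustration of calling the strategy-taking overload once per page, the sketch below creates a new strategy object for each call. It assumes, which this reference does not state, that a strategy object may carry state from one call to the next. To stay runnable without the PDF library on the classpath, the call to getTextFromPage(PdfReader, int, TextExtractionStrategy) sits behind a hypothetical StrategyExtractor interface; PerPageStrategyExtraction, StrategyExtractor, and extractAll are illustrative names, not part of the API, and 1-based page numbering is likewise an assumption.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Sketch: one fresh strategy object per page (all names here are hypothetical). */
final class PerPageStrategyExtraction {

    /**
     * Hypothetical seam standing in for
     * PdfTextExtractor.getTextFromPage(reader, pageNumber, strategy).
     */
    @FunctionalInterface
    interface StrategyExtractor<S> {
        String extract(int pageNumber, S strategy) throws IOException;
    }

    /**
     * Extracts pages 1..pageCount in order, asking newStrategy for a new
     * strategy instance before every call (page numbers assumed 1-based).
     */
    static <S> List<String> extractAll(int pageCount,
                                       Supplier<S> newStrategy,
                                       StrategyExtractor<S> extractor) throws IOException {
        List<String> pages = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            S strategy = newStrategy.get(); // never reused across pages
            pages.add(extractor.extract(page, strategy));
        }
        return pages;
    }
}
```

With the library available, the seam would presumably be filled with a lambda such as (page, s) -> PdfTextExtractor.getTextFromPage(reader, page, s), and the supplier with the constructor of whichever TextExtractionStrategy implementation is wanted.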
public static String getTextFromPage(PdfReader reader, int pageNumber) throws IOException
Note: the default strategy is subject to change. If using a specific strategy is important, use getTextFromPage(PdfReader, int, TextExtractionStrategy).
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
Throws:
IOException - if any operation fails while reading from the provided PdfReader
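Because getTextFromPage(PdfReader, int) handles a single page, reading a whole document means calling it once per page. The following is a minimal sketch of that loop, not the library's own code. It assumes page numbers start at 1, and it hides the extractor call behind a hypothetical PageTextSource interface so the loop can be run and checked without a PDF file; PageTextCollector, PageTextSource, and collect are illustrative names.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Sketch: gather the text of every page, one extractor call per page. */
final class PageTextCollector {

    /**
     * Hypothetical seam standing in for
     * PdfTextExtractor.getTextFromPage(reader, pageNumber).
     */
    @FunctionalInterface
    interface PageTextSource {
        String textOf(int pageNumber) throws IOException;
    }

    /** Returns the text of pages 1..pageCount in order (1-based numbering assumed). */
    static List<String> collect(PageTextSource source, int pageCount) throws IOException {
        if (pageCount < 0) {
            throw new IllegalArgumentException("pageCount must not be negative: " + pageCount);
        }
        List<String> pages = new ArrayList<>(pageCount);
        for (int page = 1; page <= pageCount; page++) {
            // An IOException from any page propagates to the caller unchanged.
            pages.add(source.textOf(page));
        }
        return pages;
    }
}
```

In real use the source would presumably be p -> PdfTextExtractor.getTextFromPage(reader, p), with pageCount taken from the reader. Since the default strategy is subject to change, output gathered this way may differ between library versions.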
Copyright © 1998–2022. All rights reserved.