Class LocationTextExtractionStrategy
java.lang.Object
com.itextpdf.kernel.pdf.canvas.parser.listener.LocationTextExtractionStrategy
All Implemented Interfaces:
IEventListener, ITextExtractionStrategy
Nested Class Summary
static interface LocationTextExtractionStrategy.ITextChunkLocationStrategy
Constructor Summary
LocationTextExtractionStrategy()
    Creates a new text extraction renderer.
LocationTextExtractionStrategy(LocationTextExtractionStrategy.ITextChunkLocationStrategy strat)
    Creates a new text extraction renderer, with a custom strategy for creating new TextChunkLocation objects based on the input of the TextRenderInfo.
Method Summary
void eventOccurred(IEventData data, EventType type)
    Called when some event occurs during parsing a content stream.
String getResultantText()
    Returns the text that has been processed so far.
Set<EventType> getSupportedEvents()
    Provides the set of event types this listener supports.
protected boolean isChunkAtWordBoundary(TextChunk chunk, TextChunk previousChunk)
    Determines if a space character should be inserted between a previous chunk and the current chunk.
boolean isUseActualText()
    Gets the value of the property which determines if /ActualText will be used when extracting the text.
LocationTextExtractionStrategy setRightToLeftRunDirection(boolean rightToLeftRunDirection)
    Sets whether text flows from left to right or from right to left.
LocationTextExtractionStrategy setUseActualText(boolean useActualText)
    Changes the behavior of text extraction so that, if the parameter is set to true, the /ActualText marked content property will be used instead of raw decoded bytes.
Constructor Details
LocationTextExtractionStrategy
public LocationTextExtractionStrategy()
Creates a new text extraction renderer.
LocationTextExtractionStrategy
public LocationTextExtractionStrategy(LocationTextExtractionStrategy.ITextChunkLocationStrategy strat)
Creates a new text extraction renderer, with a custom strategy for creating new TextChunkLocation objects based on the input of the TextRenderInfo.
Parameters:
strat - the custom strategy
Method Details
setUseActualText
public LocationTextExtractionStrategy setUseActualText(boolean useActualText)
Changes the behavior of text extraction so that, if the parameter is set to true, the /ActualText marked content property will be used instead of raw decoded bytes. Beware: the logic is not stable yet.
Parameters:
useActualText - true to use /ActualText, false otherwise
Returns:
this object
setRightToLeftRunDirection
public LocationTextExtractionStrategy setRightToLeftRunDirection(boolean rightToLeftRunDirection)
Sets whether text flows from left to right or from right to left. Call this method with a true argument when extracting Arabic, Hebrew or other text with a right-to-left writing direction.
Parameters:
rightToLeftRunDirection - value specifying whether the direction should be right to left
Returns:
this object
isUseActualText
public boolean isUseActualText()
Gets the value of the property which determines if /ActualText will be used when extracting the text.
Returns:
true if /ActualText value is used, false otherwise
eventOccurred
public void eventOccurred(IEventData data, EventType type)
Description copied from interface: IEventListener
Called when some event occurs during parsing a content stream.
Specified by:
eventOccurred in interface IEventListener
Parameters:
data - Combines the data required for processing corresponding event type.
type - Event type.
getSupportedEvents
public Set<EventType> getSupportedEvents()
Description copied from interface: IEventListener
Provides the set of event types this listener supports. Returns null if all possible event types are supported.
Specified by:
getSupportedEvents in interface IEventListener
Returns:
Set of event types supported by this listener or null if all possible event types are supported.
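The null convention described for getSupportedEvents can be illustrated with a small, library-independent sketch. The EventKind enum and the accepts helper below are hypothetical stand-ins introduced only for this example; they are not part of the iText API and do not show how the library itself dispatches events.

```java
import java.util.EnumSet;
import java.util.Set;

class SupportedEventsSketch {
    // Hypothetical stand-in for the library's EventType enum.
    enum EventKind { RENDER_TEXT, RENDER_IMAGE, RENDER_PATH }

    // Interprets the documented contract: a null set means
    // "every event type is supported"; otherwise only the members are.
    static boolean accepts(Set<EventKind> supported, EventKind type) {
        return supported == null || supported.contains(type);
    }

    public static void main(String[] args) {
        Set<EventKind> textOnly = EnumSet.of(EventKind.RENDER_TEXT);
        System.out.println(accepts(null, EventKind.RENDER_IMAGE));     // null set: accepted
        System.out.println(accepts(textOnly, EventKind.RENDER_TEXT));  // listed: accepted
        System.out.println(accepts(textOnly, EventKind.RENDER_IMAGE)); // not listed: rejected
    }
}
```

A caller following this convention should check for null before querying the returned set.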
getResultantText
public String getResultantText()
Description copied from interface: ITextExtractionStrategy
Returns the text that has been processed so far.
Specified by:
getResultantText in interface ITextExtractionStrategy
Returns:
String instance with the current resultant text
isChunkAtWordBoundary
protected boolean isChunkAtWordBoundary(TextChunk chunk, TextChunk previousChunk)
Determines if a space character should be inserted between a previous chunk and the current chunk. This method is exposed as a callback so subclasses can fine-tune the algorithm for determining whether a space should be inserted or not. By default, this method will insert a space if there is a gap of more than half the font space character width between the end of the previous chunk and the beginning of the current chunk. It will also indicate that a space is needed if the starting point of the new chunk appears *before* the end of the previous chunk (i.e. overlapping text).
Parameters:
chunk - the new chunk being evaluated
previousChunk - the chunk that appeared immediately before the current chunk
Returns:
true if the two chunks represent different words (i.e. should have a space between them), false otherwise.
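The default rule described for isChunkAtWordBoundary can be sketched in a simplified, one-dimensional form. This is an illustration of the documented behavior only, not the library's implementation: ChunkSketch is a hypothetical stand-in for TextChunk, and the assumption that the space width is taken from the current chunk is made here for the sake of the example.

```java
// Hypothetical 1-D stand-in for a text chunk; NOT the iText TextChunk class.
final class ChunkSketch {
    final float start;      // x-coordinate where the chunk begins
    final float end;        // x-coordinate where the chunk ends
    final float spaceWidth; // width of a space character in the chunk's font

    ChunkSketch(float start, float end, float spaceWidth) {
        this.start = start;
        this.end = end;
        this.spaceWidth = spaceWidth;
    }
}

class WordBoundarySketch {
    // Mirrors the documented default: a space is needed when the gap exceeds
    // half the space character width, or when the new chunk starts before
    // the previous chunk ends (overlapping text).
    static boolean isChunkAtWordBoundary(ChunkSketch chunk, ChunkSketch previousChunk) {
        float gap = chunk.start - previousChunk.end;
        if (gap < 0) {
            return true; // overlapping text
        }
        return gap > chunk.spaceWidth / 2.0f;
    }

    public static void main(String[] args) {
        ChunkSketch previous = new ChunkSketch(0f, 30f, 4f);
        ChunkSketch farAway = new ChunkSketch(33f, 60f, 4f);  // gap 3 exceeds half width 2
        ChunkSketch adjacent = new ChunkSketch(31f, 40f, 4f); // gap 1 does not exceed 2
        ChunkSketch overlap = new ChunkSketch(25f, 40f, 4f);  // starts before previous ends

        System.out.println(isChunkAtWordBoundary(farAway, previous));
        System.out.println(isChunkAtWordBoundary(adjacent, previous));
        System.out.println(isChunkAtWordBoundary(overlap, previous));
    }
}
```

A subclass that needs tighter or looser spacing would override the real protected method and apply its own threshold in place of the half-width comparison.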