iText 9.6.0 API
iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy Class Reference
Inherits iText.Kernel.Pdf.Canvas.Parser.Listener.ITextExtractionStrategy and iText.Kernel.Pdf.Canvas.Parser.Listener.IEventListener.

Classes

interface   ITextChunkLocationStrategy
 

Public Member Functions

  LocationTextExtractionStrategy ()
  Creates a new text extraction renderer.
 
  LocationTextExtractionStrategy (LocationTextExtractionStrategy.ITextChunkLocationStrategy strat)
  Creates a new text extraction renderer, with a custom strategy for creating new TextChunkLocation objects based on the input of the TextRenderInfo.
 
virtual iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy  SetUseActualText (bool useActualText)
  Changes the behavior of text extraction so that if the parameter is set to true, the /ActualText marked content property will be used instead of raw decoded bytes.
 
virtual iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy  SetRightToLeftRunDirection (bool rightToLeftRunDirection)
  Sets whether text flows from left to right or from right to left.
 
virtual iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy  SetOutputChunkSeparator (String outputChunkSeparator)
  Sets the string value used to separate chunks when formatting output.
 
virtual iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy  SetOutputNewline (String outputNewline)
  Sets the string value used to separate lines when formatting output.
 
virtual bool  IsUseActualText ()
  Gets the value of the property that determines whether /ActualText will be used when extracting the text.
 
virtual void  EventOccurred (IEventData data, EventType type)
  Called when some event occurs during parsing a content stream.
 
virtual ICollection< EventType >  GetSupportedEvents ()
  Provides the set of event types this listener supports.
 
virtual String  GetResultantText ()
  Returns the text that has been processed so far.
 

Package Functions

virtual bool  IsChunkAtWordBoundary (TextChunk chunk, TextChunk previousChunk)
  Determines if a space character should be inserted between a previous chunk and the current chunk.
 

Constructor & Destructor Documentation

◆ LocationTextExtractionStrategy() [1/2]

iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.LocationTextExtractionStrategy ( )
inline

Creates a new text extraction renderer.
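
The following is a minimal usage sketch, not taken from the iText documentation. It assumes the iText NuGet package is referenced and that PdfTextExtractor.GetTextFromPage from iText.Kernel.Pdf.Canvas.Parser is used to drive the strategy; the file name "input.pdf" and the class name ExtractTextExample are placeholders.

```csharp
using System;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

public static class ExtractTextExample
{
    public static void Main(string[] args)
    {
        // Placeholder path; pass a real PDF file as the first argument.
        string path = args.Length > 0 ? args[0] : "input.pdf";

        using (PdfDocument pdfDoc = new PdfDocument(new PdfReader(path)))
        {
            for (int pageNum = 1; pageNum <= pdfDoc.GetNumberOfPages(); pageNum++)
            {
                // A fresh strategy per page keeps text from earlier pages
                // out of the current page's result.
                ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();
                string text = PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(pageNum), strategy);
                Console.WriteLine(text);
            }
        }
    }
}
```

Creating one strategy per page is a design choice in this sketch: GetResultantText() returns everything processed so far, so a shared instance would accumulate text across pages.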

◆ LocationTextExtractionStrategy() [2/2]

iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.LocationTextExtractionStrategy ( LocationTextExtractionStrategy.ITextChunkLocationStrategy  strat )
inline

Creates a new text extraction renderer, with a custom strategy for creating new TextChunkLocation objects based on the input of the TextRenderInfo.

Parameters
strat the custom strategy

Member Function Documentation

◆ EventOccurred()

virtual void iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.EventOccurred ( IEventData  data,
EventType  type 
)
inline virtual

Called when some event occurs during parsing a content stream.

Parameters
data Combines the data required for processing corresponding event type.
type Event type.

Implements iText.Kernel.Pdf.Canvas.Parser.Listener.IEventListener.

◆ GetResultantText()

virtual String iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.GetResultantText ( )
inline virtual

Returns the text that has been processed so far.

Returns
System.String instance with the current resultant text

Implements iText.Kernel.Pdf.Canvas.Parser.Listener.ITextExtractionStrategy.
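
As a rough illustration of how EventOccurred() and GetResultantText() fit together, the sketch below attaches the strategy to a PdfCanvasProcessor, which is assumed to call EventOccurred() for each supported event while it parses the page content. The class name ProcessorExample and the path "input.pdf" are placeholders, and only the first page is processed.

```csharp
using System;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

public static class ProcessorExample
{
    public static void Main(string[] args)
    {
        // Placeholder path; pass a real PDF file as the first argument.
        string path = args.Length > 0 ? args[0] : "input.pdf";

        using (PdfDocument pdfDoc = new PdfDocument(new PdfReader(path)))
        {
            LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy();

            // The processor parses the content stream and forwards events
            // to the listener passed to its constructor.
            PdfCanvasProcessor processor = new PdfCanvasProcessor(strategy);
            processor.ProcessPageContent(pdfDoc.GetPage(1));

            // After parsing, the strategy holds the text collected so far.
            Console.WriteLine(strategy.GetResultantText());
        }
    }
}
```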

◆ GetSupportedEvents()

virtual ICollection<EventType> iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.GetSupportedEvents ( )
inline virtual

Provides the set of event types this listener supports. Returns null if all possible event types are supported.

Returns
Set of event types supported by this listener or null if all possible event types are supported.

Implements iText.Kernel.Pdf.Canvas.Parser.Listener.IEventListener.

◆ IsChunkAtWordBoundary()

virtual bool iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.IsChunkAtWordBoundary ( TextChunk  chunk,
TextChunk  previousChunk 
)
inline package virtual

Determines if a space character should be inserted between a previous chunk and the current chunk. This method is exposed as a callback so subclasses can fine-tune the algorithm for determining whether a space should be inserted or not. By default, this method will insert a space if there is a gap of more than half the font space character width between the end of the previous chunk and the beginning of the current chunk. It will also indicate that a space is needed if the starting point of the new chunk appears before the end of the previous chunk (i.e. overlapping text).

Parameters
chunk the new chunk being evaluated
previousChunk the chunk that appeared immediately before the current chunk
Returns
true if the two chunks represent different words (i.e. should have a space between them), false otherwise.
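
The default rule described above can be sketched in isolation. The code below is not iText's implementation: it reduces each chunk to a position along a shared baseline, and the names WordBoundarySketch, NeedsSpace, previousEnd, currentStart and spaceWidth are hypothetical.

```csharp
using System;

public static class WordBoundarySketch
{
    // previousEnd:  where the previous chunk ends along the baseline
    // currentStart: where the current chunk starts along the baseline
    // spaceWidth:   width of the font's space character, in the same units
    public static bool NeedsSpace(float previousEnd, float currentStart, float spaceWidth)
    {
        float gap = currentStart - previousEnd;

        // Overlapping text: the new chunk starts before the previous one ends.
        if (gap < 0f)
        {
            return true;
        }

        // Otherwise a space is needed only when the gap exceeds half a space width.
        return gap > spaceWidth / 2f;
    }

    public static void Main()
    {
        Console.WriteLine(NeedsSpace(100f, 103f, 4f)); // gap of 3 exceeds half of 4
        Console.WriteLine(NeedsSpace(100f, 101f, 4f)); // gap of 1 does not
        Console.WriteLine(NeedsSpace(100f, 98f, 4f));  // chunks overlap
    }
}
```

A subclass that wants different spacing behavior would override IsChunkAtWordBoundary() and apply its own threshold in place of the half-space-width rule.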

◆ IsUseActualText()

virtual bool iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.IsUseActualText ( )
inline virtual

Gets the value of the property that determines whether /ActualText will be used when extracting the text.

Returns
true if /ActualText value is used, false otherwise

◆ SetOutputChunkSeparator()

virtual iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.SetOutputChunkSeparator ( String  outputChunkSeparator )
inline virtual

Sets the string value used to separate chunks when formatting output.

Parameters
outputChunkSeparator the string that will be used as a separator between chunks. Must not be null
Returns
this object

◆ SetOutputNewline()

virtual iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.SetOutputNewline ( String  outputNewline )
inline virtual

Sets the string value used to separate lines when formatting output.

Parameters
outputNewline the string that will be used to represent a new line. Must not be null
Returns
this object
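
Because SetOutputChunkSeparator() and SetOutputNewline() both return this object, they can be chained when the strategy is created. The sketch below assumes the iText NuGet package is referenced; the separator values, the class name SeparatorExample and the path "input.pdf" are illustrative choices, not defaults documented here.

```csharp
using System;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

public static class SeparatorExample
{
    public static void Main(string[] args)
    {
        // Placeholder path; pass a real PDF file as the first argument.
        string path = args.Length > 0 ? args[0] : "input.pdf";

        // Each setter returns the strategy itself, so the calls can be chained.
        // Neither argument may be null.
        LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy()
            .SetOutputChunkSeparator("\t")  // illustrative chunk separator
            .SetOutputNewline("\r\n");      // illustrative line separator

        using (PdfDocument pdfDoc = new PdfDocument(new PdfReader(path)))
        {
            string text = PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(1), strategy);
            Console.WriteLine(text);
        }
    }
}
```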

◆ SetRightToLeftRunDirection()

virtual iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.SetRightToLeftRunDirection ( bool  rightToLeftRunDirection )
inline virtual

Sets whether text flows from left to right or from right to left. Call this method with a true argument when extracting Arabic, Hebrew or other text with a right-to-left writing direction.

Parameters
rightToLeftRunDirection value specifying whether the direction should be right to left
Returns
this object

◆ SetUseActualText()

virtual iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy.SetUseActualText ( bool  useActualText )
inline virtual

Changes the behavior of text extraction so that if the parameter is set to true, the /ActualText marked content property will be used instead of raw decoded bytes. Beware: the logic is not stable yet.

Parameters
useActualText true to use /ActualText, false otherwise
Returns
this object
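
A possible configuration for a right-to-left document that carries /ActualText entries is sketched below. It assumes the iText NuGet package is referenced; the class name RightToLeftExample and the path "input.pdf" are placeholders. Since the /ActualText logic is described above as not yet stable, results may vary between documents.

```csharp
using System;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

public static class RightToLeftExample
{
    public static void Main(string[] args)
    {
        // Placeholder path; pass a real PDF file as the first argument.
        string path = args.Length > 0 ? args[0] : "input.pdf";

        LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy()
            .SetRightToLeftRunDirection(true)  // e.g. Arabic or Hebrew text
            .SetUseActualText(true);           // prefer /ActualText over raw decoded bytes

        // Reads back the flag set by SetUseActualText().
        Console.WriteLine("Uses /ActualText: " + strategy.IsUseActualText());

        using (PdfDocument pdfDoc = new PdfDocument(new PdfReader(path)))
        {
            string text = PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(1), strategy);
            Console.WriteLine(text);
        }
    }
}
```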