pdfOCR 5.0.0 API
iText.Pdfocr.TextInfo Class Reference

This class describes how recognized text is positioned on the image by providing a bounding box (bbox) for each text item, which may be a line or a word.

Public Member Functions

  TextInfo ()
  Creates a new TextInfo instance.
 
  TextInfo (iText.Pdfocr.TextInfo textInfo)
  Creates a new TextInfo instance from an existing one.
 
  TextInfo (String text, Point[] bbox)
  Creates a new TextInfo instance from a text string and a four-point bbox.
 
  TextInfo (String text, Rectangle bbox)
  Creates a new TextInfo instance from a text string and a rectangular bbox.
 
virtual String  GetText ()
  Gets the text element.
 
virtual iText.Pdfocr.TextInfo  SetText (String newText)
  Sets the text element.
 
virtual Point[]  GetTextPoints ()
  Gets an array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in points.
 
virtual iText.Pdfocr.TextInfo  SetTextPoints (Point[] textPoints)
  Sets an array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in points.
 
virtual Point[]  GetPixelTextPoints (int imageHeight)
  Gets an array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in pixels.
 
virtual iText.Pdfocr.TextInfo  SetPixelTextPoints (Point[] textPoints, int imageHeight)
  Sets an array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in pixels.
 
virtual Rectangle  GetBBoxRect ()
  Converts a text polygon to a bounding box.
 
virtual float  GetRotationAngle ()
  Returns the text rotation angle in radians for this TextInfo, in the range of -pi to pi.
 
virtual LogicalStructureTreeItem  GetLogicalStructureTreeItem ()
  Retrieves the structure tree item for the text item.
 
virtual void  SetLogicalStructureTreeItem (LogicalStructureTreeItem logicalStructureTreeItem)
  Sets the logical structure tree parent item for the text info.
 

Detailed Description

This class describes how recognized text is positioned on the image by providing a bounding box (bbox) for each text item, which may be a line or a word.

Constructor & Destructor Documentation

◆ TextInfo() [1/4]

iText.Pdfocr.TextInfo.TextInfo ( )
inline

Creates a new TextInfo instance.

◆ TextInfo() [2/4]

iText.Pdfocr.TextInfo.TextInfo ( iText.Pdfocr.TextInfo  textInfo )
inline

Creates a new TextInfo instance from an existing one.

Parameters
textInfo the existing TextInfo to create the new instance from

◆ TextInfo() [3/4]

iText.Pdfocr.TextInfo.TextInfo ( String  text,
Point[]  bbox 
)
inline

Creates a new TextInfo instance.

Parameters
text text string
bbox array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in points (0 - lower-left, 1 - upper-left, 2 - upper-right, 3 - lower-right point)

◆ TextInfo() [4/4]

iText.Pdfocr.TextInfo.TextInfo ( String  text,
Rectangle  bbox 
)
inline

Creates a new TextInfo instance. Can be used for text chunks that are not rotated.

Parameters
text text string
bbox iText.Kernel.Geom.Rectangle describing the text bounding box, expressed in PDF points
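
For unrotated text, a rectangle corresponds to the four-point polygon accepted by the Point[] constructor. The Python sketch below only illustrates the corner order (0 - lower-left, 1 - upper-left, 2 - upper-right, 3 - lower-right); it does not call the pdfOCR API, and the function name rect_to_points is hypothetical.

```python
def rect_to_points(x, y, width, height):
    """Return the corners of an axis-aligned rectangle in bbox order.

    (x, y) is the lower-left corner in PDF points, with y growing upward.
    Order: 0 lower-left, 1 upper-left, 2 upper-right, 3 lower-right.
    """
    return [
        (x, y),
        (x, y + height),
        (x + width, y + height),
        (x + width, y),
    ]


points = rect_to_points(10.0, 20.0, 100.0, 12.0)
```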

Member Function Documentation

◆ GetBBoxRect()

virtual Rectangle iText.Pdfocr.TextInfo.GetBBoxRect ( )
inline virtual

Converts a text polygon to a bounding box.

Returns
iText.Kernel.Geom.Rectangle representing the text bounding box
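
One plausible reading of this conversion is the smallest axis-aligned rectangle that encloses the four polygon points. The Python sketch below illustrates that idea only; whether the library computes exactly this is an assumption, and polygon_to_bbox is a hypothetical name.

```python
def polygon_to_bbox(points):
    """Return (x, y, width, height) of the axis-aligned box enclosing points.

    (x, y) is the lower-left corner, in the same units as the input.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    left, bottom = min(xs), min(ys)
    return (left, bottom, max(xs) - left, max(ys) - bottom)


# A rotated text polygon: its bounding box is larger than the text itself.
rotated = [(0.0, 0.0), (-10.0, 10.0), (20.0, 40.0), (30.0, 30.0)]
bbox = polygon_to_bbox(rotated)
```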

◆ GetLogicalStructureTreeItem()

virtual LogicalStructureTreeItem iText.Pdfocr.TextInfo.GetLogicalStructureTreeItem ( )
inline virtual

Retrieves structure tree item for the text item.

Returns
structure tree item.

◆ GetPixelTextPoints()

virtual Point [] iText.Pdfocr.TextInfo.GetPixelTextPoints ( int  imageHeight )
inline virtual

Gets an array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in pixels.

The point array stores the text polygon in the following order relative to the text: 0 - lower-left, 1 - upper-left, 2 - upper-right, 3 - lower-right point.

The following coordinate system is used for the pixel coordinates: the origin is located in the top-left corner of the page (image), vertical (y) coordinates increase from the top of the page to the bottom, horizontal (x) coordinates increase from the left side of the page to the right, and the axis unit is the pixel (1 pixel = 1/96 inch = 0.75 PDF point).

Parameters
imageHeight height of the image, used to convert the text PDF points to image pixel coordinates (it changes the y origin)
Returns
array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in pixels
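
The change of unit and origin described above can be sketched in a few lines. This Python snippet is an illustration of the arithmetic, not the library's implementation; it assumes the image height is given in pixels, and points_to_pixels is a hypothetical name.

```python
POINTS_PER_PIXEL = 0.75  # 1 pixel = 0.75 PDF point, as stated above


def points_to_pixels(points, image_height_px):
    """Convert (x, y) pairs from PDF points (origin bottom-left, y up)
    to pixels (origin top-left, y down).

    Assumption: image_height_px is the image height in pixels.
    """
    return [
        (x / POINTS_PER_PIXEL, image_height_px - y / POINTS_PER_PIXEL)
        for x, y in points
    ]


pdf_points = [(75.0, 150.0), (75.0, 165.0), (150.0, 165.0), (150.0, 150.0)]
pixel_points = points_to_pixels(pdf_points, 400)
```

Note that the order of the points is unchanged: index 0 is still the lower-left corner relative to the text, even though its y value is now the largest.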

◆ GetRotationAngle()

virtual float iText.Pdfocr.TextInfo.GetRotationAngle ( )
inline virtual

Returns the text rotation angle in radians for this TextInfo, in the range of -pi to pi.

Returns
the text rotation angle in radians for the current TextInfo
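
An angle in the range of -pi to pi is what a two-argument arctangent yields. The Python sketch below derives the angle from the bottom edge of the text polygon (point 0 to point 3); that choice of edge is an assumption made for illustration, not a statement about the library's implementation, and rotation_angle is a hypothetical name.

```python
import math


def rotation_angle(points):
    """Angle of the text baseline in radians, in the range -pi to pi.

    Assumption: the baseline runs from point 0 (lower-left relative to
    text) to point 3 (lower-right relative to text), in PDF coordinates.
    """
    (x0, y0), (x3, y3) = points[0], points[3]
    return math.atan2(y3 - y0, x3 - x0)


upright = [(0.0, 0.0), (0.0, 10.0), (50.0, 10.0), (50.0, 0.0)]
# The same text turned a quarter turn counter-clockwise.
quarter_turn = [(0.0, 0.0), (-10.0, 0.0), (-10.0, 50.0), (0.0, 50.0)]
```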

◆ GetText()

virtual String iText.Pdfocr.TextInfo.GetText ( )
inline virtual

Gets text element.

Returns
text string

◆ GetTextPoints()

virtual Point [] iText.Pdfocr.TextInfo.GetTextPoints ( )
inline virtual

Gets an array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in points.

The point array stores the text polygon in the following order relative to the text: 0 - lower-left, 1 - upper-left, 2 - upper-right, 3 - lower-right point.

The following coordinate system is used for the point coordinates: the origin is located in the bottom-left corner of the page, vertical (y) coordinates increase from the bottom of the page to the top, horizontal (x) coordinates increase from the left side of the page to the right, and the axis unit is the user space unit, which we call a PDF point (1 PDF point = 1/72 inch = 4/3 pixel).

Returns
array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in points

◆ SetLogicalStructureTreeItem()

virtual void iText.Pdfocr.TextInfo.SetLogicalStructureTreeItem ( LogicalStructureTreeItem  logicalStructureTreeItem )
inline virtual

Sets the logical structure tree parent item for the text info. This allows text chunks to be organized into a logical hierarchy, e.g. to specify document paragraphs, tables, etc.

If a LogicalStructureTreeItem is set, then the list of TextInfo objects in the return value of IOcrEngine.DoImageOcr(System.IO.FileInfo) is expected to be in logical order.

Parameters
logicalStructureTreeItem structure tree item

◆ SetPixelTextPoints()

virtual iText.Pdfocr.TextInfo iText.Pdfocr.TextInfo.SetPixelTextPoints ( Point[]  textPoints,
int  imageHeight 
)
inline virtual

Sets an array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in pixels.

The point array should store the text polygon in the following order relative to the text: 0 - lower-left, 1 - upper-left, 2 - upper-right, 3 - lower-right point.

The following coordinate system is used for the pixel coordinates: the origin is located in the top-left corner of the page, vertical (y) coordinates increase from the top of the page to the bottom, horizontal (x) coordinates increase from the left side of the page to the right, and the axis unit is the pixel (1 pixel = 1/96 inch = 0.75 PDF point).

Parameters
textPoints array of 4 iText.Kernel.Geom.Point objects describing the text bbox (0 - lower-left, 1 - upper-left, 2 - upper-right, 3 - lower-right relative to text), expressed in pixels
imageHeight height of the image, used to convert the image pixel coordinates to PDF points (it changes the y origin)
Returns
this instance
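
Going from pixel coordinates to PDF points means flipping the y axis against the image height and scaling by 0.75. The Python sketch below shows that arithmetic only; it is not the library's implementation, it assumes the image height is given in pixels, and pixels_to_points is a hypothetical name.

```python
POINTS_PER_PIXEL = 0.75  # 1 pixel = 0.75 PDF point, as stated above


def pixels_to_points(pixel_points, image_height_px):
    """Convert (x, y) pairs from pixels (origin top-left, y down)
    to PDF points (origin bottom-left, y up).

    Assumption: image_height_px is the image height in pixels.
    """
    return [
        (x * POINTS_PER_PIXEL, (image_height_px - y) * POINTS_PER_PIXEL)
        for x, y in pixel_points
    ]


# A word detected 200 px below the top edge of a 400 px tall image.
detected = [(100.0, 200.0), (100.0, 180.0), (200.0, 180.0), (200.0, 200.0)]
in_points = pixels_to_points(detected, 400)
```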

◆ SetText()

virtual iText.Pdfocr.TextInfo iText.Pdfocr.TextInfo.SetText ( String  newText )
inline virtual

Sets text element.

Parameters
newText retrieved text
Returns
this instance

◆ SetTextPoints()

virtual iText.Pdfocr.TextInfo iText.Pdfocr.TextInfo.SetTextPoints ( Point[]  textPoints )
inline virtual

Sets an array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in points.

The point array should store the text polygon in the following order relative to the text: 0 - lower-left, 1 - upper-left, 2 - upper-right, 3 - lower-right point.

The following coordinate system is used for the point coordinates: the origin is located in the bottom-left corner of the page, vertical (y) coordinates increase from the bottom of the page to the top, horizontal (x) coordinates increase from the left side of the page to the right, and the axis unit is the user space unit, which we call a PDF point (1 PDF point = 1/72 inch = 4/3 pixel).

Parameters
textPoints array of 4 iText.Kernel.Geom.Point objects describing the text bbox (lower-left based relative to text), expressed in points
Returns
this instance