iText 7 7.1.8 API
iText.StyledXmlParser.Jsoup.Parser.Parser Class Reference

Parses HTML into an iText.StyledXmlParser.Jsoup.Nodes.Document. It is generally best to use one of the more convenient parse methods in iText.StyledXmlParser.Jsoup.Jsoup.

Public Member Functions

  Parser (TreeBuilder treeBuilder)
  Create a new Parser, using the specified TreeBuilder.
 
virtual Document  ParseInput (String html, String baseUri)
 
virtual TreeBuilder  GetTreeBuilder ()
  Get the TreeBuilder currently in use.
 
virtual iText.StyledXmlParser.Jsoup.Parser.Parser  SetTreeBuilder (TreeBuilder treeBuilder)
  Update the TreeBuilder used when parsing content.
 
virtual bool  IsTrackErrors ()
  Check if parse error tracking is enabled.
 
virtual iText.StyledXmlParser.Jsoup.Parser.Parser  SetTrackErrors (int maxErrors)
  Enable or disable parse error tracking for the next parse.
 
virtual IList< ParseError >  GetErrors ()
  Retrieve the parse errors, if any, from the last parse.
 

Static Public Member Functions

static Document  Parse (String html, String baseUri)
  Parse HTML into a Document.
 
static Document  ParseXml (String xml, String baseUri)
  Parse XML into a Document.
 
static IList< iText.StyledXmlParser.Jsoup.Nodes.Node >  ParseFragment (String fragmentHtml, iText.StyledXmlParser.Jsoup.Nodes.Element context, String baseUri)
  Parse a fragment of HTML into a list of nodes.
 
static IList< iText.StyledXmlParser.Jsoup.Nodes.Node >  ParseXmlFragment (String fragmentXml, String baseUri)
  Parse a fragment of XML into a list of nodes.
 
static Document  ParseBodyFragment (String bodyHtml, String baseUri)
  Parse a fragment of HTML into the body of a Document.
 
static String  UnescapeEntities (String @string, bool inAttribute)
  Utility method to unescape HTML entities from a string.
 
static Document  ParseBodyFragmentRelaxed (String bodyHtml, String baseUri)
 
static iText.StyledXmlParser.Jsoup.Parser.Parser  HtmlParser ()
  Create a new HTML parser.
 
static iText.StyledXmlParser.Jsoup.Parser.Parser  XmlParser ()
  Create a new XML parser.
 

Detailed Description

Parses HTML into an iText.StyledXmlParser.Jsoup.Nodes.Document. It is generally best to use one of the more convenient parse methods in iText.StyledXmlParser.Jsoup.Jsoup.

Constructor & Destructor Documentation

◆ Parser()

iText.StyledXmlParser.Jsoup.Parser.Parser.Parser ( TreeBuilder  treeBuilder )
inline

Create a new Parser, using the specified TreeBuilder.

Parameters
treeBuilder TreeBuilder to use to parse input into Documents.

Member Function Documentation

◆ GetErrors()

virtual IList<ParseError> iText.StyledXmlParser.Jsoup.Parser.Parser.GetErrors ( )
inline virtual

Retrieve the parse errors, if any, from the last parse.

Returns
list of parse errors, up to the size of the maximum errors tracked.

◆ GetTreeBuilder()

virtual TreeBuilder iText.StyledXmlParser.Jsoup.Parser.Parser.GetTreeBuilder ( )
inline virtual

Get the TreeBuilder currently in use.

Returns
current TreeBuilder.

◆ HtmlParser()

static iText.StyledXmlParser.Jsoup.Parser.Parser iText.StyledXmlParser.Jsoup.Parser.Parser.HtmlParser ( )
inline static

Create a new HTML parser. This parser treats input as HTML5 and enforces the creation of a normalised document, based on knowledge of the semantics of the incoming tags.

Returns
a new HTML parser.

◆ IsTrackErrors()

virtual bool iText.StyledXmlParser.Jsoup.Parser.Parser.IsTrackErrors ( )
inline virtual

Check if parse error tracking is enabled.

Returns
current track error state.

◆ Parse()

static Document iText.StyledXmlParser.Jsoup.Parser.Parser.Parse ( String  html,
String  baseUri 
)
inline static

Parse HTML into a Document.

Parameters
html HTML to parse
baseUri base URI of document (i.e. original fetch location), for resolving relative URLs.
Returns
parsed Document
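
As an illustration of the static Parse call, the following is a minimal sketch rather than an official sample. The HTML string and the base URI are made-up values, and Body() and Text() are assumed to be available on the Document and Element classes in iText.StyledXmlParser.Jsoup.Nodes (they are not documented on this page).

```csharp
using System;
using iText.StyledXmlParser.Jsoup.Nodes;
// Alias avoids any confusion between the Parser namespace and the Parser class.
using JsoupParser = iText.StyledXmlParser.Jsoup.Parser.Parser;

public static class ParseExample
{
    public static void Main()
    {
        // Hypothetical input; the base URI is only used to resolve relative URLs.
        const string html = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>";
        Document doc = JsoupParser.Parse(html, "https://example.com/");

        // Assumption: Body() and Text() exist on Document/Element as in jsoup.
        Console.WriteLine(doc.Body().Text());
    }
}
```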

◆ ParseBodyFragment()

static Document iText.StyledXmlParser.Jsoup.Parser.Parser.ParseBodyFragment ( String  bodyHtml,
String  baseUri 
)
inline static

Parse a fragment of HTML into the body of a Document.

Parameters
bodyHtml fragment of HTML
baseUri base URI of document (i.e. original fetch location), for resolving relative URLs.
Returns
Document, with empty head, and HTML parsed into body

◆ ParseBodyFragmentRelaxed()

static Document iText.StyledXmlParser.Jsoup.Parser.Parser.ParseBodyFragmentRelaxed ( String  bodyHtml,
String  baseUri 
)
inline static

Parameters
bodyHtml HTML to parse
baseUri base URI of document (i.e. original fetch location), for resolving relative URLs.
Returns
parsed Document

◆ ParseFragment()

static IList<iText.StyledXmlParser.Jsoup.Nodes.Node> iText.StyledXmlParser.Jsoup.Parser.Parser.ParseFragment ( String  fragmentHtml,
iText.StyledXmlParser.Jsoup.Nodes.Element  context,
String  baseUri 
)
inline static

Parse a fragment of HTML into a list of nodes. The context element, if supplied, provides the parsing context.

Parameters
fragmentHtml the fragment of HTML to parse
context (optional) the element that this HTML fragment is being parsed for (i.e. for inner HTML). This provides stack context (for implicit element creation).
baseUri base URI of document (i.e. original fetch location), for resolving relative URLs.
Returns
list of nodes parsed from the input HTML. Note that the context element, if supplied, is not modified.
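
A possible way to call ParseFragment with a context element is sketched below. The fragment text is invented for illustration, and the context is obtained through Body(), which is assumed to exist on Document (it is not documented on this page). Each node is printed through its default string conversion.

```csharp
using System;
using System.Collections.Generic;
using iText.StyledXmlParser.Jsoup.Nodes;
using JsoupParser = iText.StyledXmlParser.Jsoup.Parser.Parser;

public static class FragmentExample
{
    public static void Main()
    {
        // Build an empty shell document just to obtain a <body> element as context.
        Document shell = JsoupParser.ParseBodyFragment("", "");
        Element context = shell.Body(); // assumption: Body() as in jsoup

        // Parse the fragment as if it were the inner HTML of the context element.
        IList<Node> nodes = JsoupParser.ParseFragment("<li>One</li><li>Two</li>", context, "");

        foreach (Node node in nodes)
        {
            Console.WriteLine(node);
        }
    }
}
```

The context element itself is left unchanged; the returned nodes still have to be attached to a document by the caller if that is the goal.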

◆ ParseXml()

static Document iText.StyledXmlParser.Jsoup.Parser.Parser.ParseXml ( String  xml,
String  baseUri 
)
inline static

Parse XML into a Document.

Parameters
xml XML to parse
baseUri base URI of document (i.e. original fetch location), for resolving relative URLs.
Returns
parsed Document

◆ ParseXmlFragment()

static IList<iText.StyledXmlParser.Jsoup.Nodes.Node> iText.StyledXmlParser.Jsoup.Parser.Parser.ParseXmlFragment ( String  fragmentXml,
String  baseUri 
)
inline static

Parse a fragment of XML into a list of nodes.

Parameters
fragmentXml the fragment of XML to parse
baseUri base URI of document (i.e. original fetch location), for resolving relative URLs.
Returns
list of nodes parsed from the input XML.

◆ SetTrackErrors()

virtual iText.StyledXmlParser.Jsoup.Parser.Parser iText.StyledXmlParser.Jsoup.Parser.Parser.SetTrackErrors ( int  maxErrors )
inline virtual

Enable or disable parse error tracking for the next parse.

Parameters
maxErrors the maximum number of errors to track. Set to 0 to disable.
Returns
this, for chaining
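
Error tracking combines several members described on this page: HtmlParser(), SetTrackErrors(), ParseInput() and GetErrors(). The sketch below shows one way they might be chained; the malformed input and the limit of 10 are arbitrary example values, and the errors are printed through their default string conversion.

```csharp
using System;
using System.Collections.Generic;
using JsoupParser = iText.StyledXmlParser.Jsoup.Parser.Parser;
using ParseError = iText.StyledXmlParser.Jsoup.Parser.ParseError;

public static class TrackErrorsExample
{
    public static void Main()
    {
        // Track at most 10 errors for the next parse (0 would disable tracking).
        JsoupParser parser = JsoupParser.HtmlParser().SetTrackErrors(10);

        // Deliberately sloppy markup so that the parser has something to report.
        parser.ParseInput("<p>Unclosed <b>markup</p><div", "");

        IList<ParseError> errors = parser.GetErrors();
        Console.WriteLine("Tracked errors: " + errors.Count);
        foreach (ParseError error in errors)
        {
            Console.WriteLine(error);
        }
    }
}
```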

◆ SetTreeBuilder()

virtual iText.StyledXmlParser.Jsoup.Parser.Parser iText.StyledXmlParser.Jsoup.Parser.Parser.SetTreeBuilder ( TreeBuilder  treeBuilder )
inline virtual

Update the TreeBuilder used when parsing content.

Parameters
treeBuilder the TreeBuilder to use when parsing content
Returns
this, for chaining

◆ UnescapeEntities()

static String iText.StyledXmlParser.Jsoup.Parser.Parser.UnescapeEntities ( String  @string,
bool  inAttribute 
)
inline static

Utility method to unescape HTML entities from a string.

Parameters
@string HTML-escaped string
inAttribute if the string is to be unescaped in strict mode (as attributes are)
Returns
an unescaped string
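
A short sketch of UnescapeEntities follows; the input string is an invented example, and inAttribute is set to false on the assumption that the text comes from element content rather than an attribute value.

```csharp
using System;
using JsoupParser = iText.StyledXmlParser.Jsoup.Parser.Parser;

public static class UnescapeExample
{
    public static void Main()
    {
        // Named entities such as &amp; and &lt; are converted back to characters.
        string text = JsoupParser.UnescapeEntities("Fish &amp; chips &lt;3", false);
        Console.WriteLine(text);
    }
}
```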

◆ XmlParser()

static iText.StyledXmlParser.Jsoup.Parser.Parser iText.StyledXmlParser.Jsoup.Parser.Parser.XmlParser ( )
inline static

Create a new XML parser. This parser assumes no knowledge of the incoming tags and does not treat the input as HTML; instead, it creates a simple tree directly from the input.

Returns
a new simple XML parser.
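
To illustrate the XML parser, the sketch below feeds a small made-up XML string through XmlParser() and ParseInput(). The element names are hypothetical, and the document is printed through its default string conversion.

```csharp
using System;
using iText.StyledXmlParser.Jsoup.Nodes;
using JsoupParser = iText.StyledXmlParser.Jsoup.Parser.Parser;

public static class XmlParserExample
{
    public static void Main()
    {
        // Unknown tags are kept as written; no html/head/body scaffolding is expected.
        const string xml = "<order><item id=\"1\">Tea</item><item id=\"2\">Milk</item></order>";
        Document doc = JsoupParser.XmlParser().ParseInput(xml, "");
        Console.WriteLine(doc);
    }
}
```

Parsing the same string with HtmlParser() instead would normalise it into an HTML document, which is usually not wanted for data-oriented XML.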