Parser (pdfHTML 1.0.3 API)

java.lang.Object
- org.jsoup.parser.Parser

```
public class Parser
extends Object
```
Parses HTML into a Document. It is generally best to use one of the more convenient parse methods in Jsoup.

Constructor Summary

Constructors
Constructor and Description
`Parser(TreeBuilder treeBuilder)` Create a new Parser, using the specified TreeBuilder.

Method Summary

Modifier and Type	Method and Description
`List<ParseError>`	`getErrors()` Retrieve the parse errors, if any, from the last parse.
`TreeBuilder`	`getTreeBuilder()` Get the TreeBuilder currently in use.
`static Parser`	`htmlParser()` Create a new HTML parser.
`boolean`	`isTrackErrors()` Check if parse error tracking is enabled.
`static Document`	`parse(String html, String baseUri)` Parse HTML into a Document.
`static Document`	`parseBodyFragment(String bodyHtml, String baseUri)` Parse a fragment of HTML into the `body` of a Document.
`static Document`	`parseBodyFragmentRelaxed(String bodyHtml, String baseUri)` Deprecated. Use `parseBodyFragment(java.lang.String, java.lang.String)` or `parseFragment(java.lang.String, org.jsoup.nodes.Element, java.lang.String)` instead.
`static List<Node>`	`parseFragment(String fragmentHtml, Element context, String baseUri)` Parse a fragment of HTML into a list of nodes.
`Document`	`parseInput(String html, String baseUri)` Parse HTML into a Document, using the TreeBuilder currently set on this parser.
`static List<Node>`	`parseXmlFragment(String fragmentXml, String baseUri)` Parse a fragment of XML into a list of nodes.
`Parser`	`setTrackErrors(int maxErrors)` Enable or disable parse error tracking for the next parse.
`Parser`	`setTreeBuilder(TreeBuilder treeBuilder)` Update the TreeBuilder used when parsing content.
`static String`	`unescapeEntities(String string, boolean inAttribute)` Utility method to unescape HTML entities from a string.
`static Parser`	`xmlParser()` Create a new XML parser.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - Parser
```
public Parser(TreeBuilder treeBuilder)
```
    Create a new Parser, using the specified TreeBuilder.
    
    Parameters:
    
    treeBuilder - TreeBuilder to use to parse input into Documents.
- Method Detail
  - parseInput
```
public Document parseInput(String html,
                           String baseUri)
```
    Parse HTML into a Document, using the TreeBuilder currently set on this parser.
  - getTreeBuilder
```
public TreeBuilder getTreeBuilder()
```
    Get the TreeBuilder currently in use.
    
    Returns:
    
    current TreeBuilder.
  - setTreeBuilder
```
public Parser setTreeBuilder(TreeBuilder treeBuilder)
```
    Update the TreeBuilder used when parsing content.
    
    Parameters:
    
    treeBuilder - the new TreeBuilder to use
    
    Returns:
    
    this, for chaining
  - isTrackErrors
```
public boolean isTrackErrors()
```
    Check if parse error tracking is enabled.
    
    Returns:
    
    current track error state.
  - setTrackErrors
```
public Parser setTrackErrors(int maxErrors)
```
    Enable or disable parse error tracking for the next parse.
    
    Parameters:
    
    maxErrors - the maximum number of errors to track. Set to 0 to disable.
    
    Returns:
    
    this, for chaining
  - getErrors
```
public List<ParseError> getErrors()
```
    Retrieve the parse errors, if any, from the last parse.
    
    Returns:
    
    list of parse errors, up to the size of the maximum errors tracked.
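    The error-tracking methods above are typically used together on a single parser instance. The following is a minimal sketch, assuming jsoup is on the classpath; the class name `ErrorTrackingExample` and the helper `parseAndCollectErrors` are hypothetical names introduced for illustration.

```java
import java.util.List;
import org.jsoup.parser.ParseError;
import org.jsoup.parser.Parser;

class ErrorTrackingExample {
    // Hypothetical helper: parses html with error tracking capped at maxErrors
    // and returns whatever errors the parser recorded during that parse.
    static List<ParseError> parseAndCollectErrors(String html, int maxErrors) {
        // setTrackErrors returns the parser itself, so the call can be chained.
        Parser parser = Parser.htmlParser().setTrackErrors(maxErrors);
        parser.parseInput(html, "");
        return parser.getErrors();
    }

    public static void main(String[] args) {
        // Deliberately malformed input: a stray closing tag and an unclosed paragraph.
        List<ParseError> errors = parseAndCollectErrors("<div><p>Unclosed</div></span>", 5);
        for (ParseError error : errors) {
            System.out.println(error);
        }
    }
}
```

    Passing 0 as `maxErrors` disables tracking, so the returned list is then expected to be empty.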
  - parse
```
public static Document parse(String html,
                             String baseUri)
```
    Parse HTML into a Document.
    
    Parameters:
    
    html - HTML to parse
    
    baseUri - base URI of document (i.e. original fetch location), for resolving relative URLs.
    
    Returns:
    
    parsed Document
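    To illustrate how `baseUri` affects relative URLs, here is a small sketch, assuming jsoup is on the classpath. The class `ParseExample`, the helper `firstLinkAbsUrl`, and the `example.com` address are illustrative assumptions, not part of this API.

```java
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

class ParseExample {
    // Hypothetical helper: parses the HTML and returns the absolute URL of the
    // first link, resolved against baseUri (empty string if there is no link).
    static String firstLinkAbsUrl(String html, String baseUri) {
        Document doc = Parser.parse(html, baseUri);
        Element link = doc.select("a[href]").first();
        return link == null ? "" : link.absUrl("href");
    }

    public static void main(String[] args) {
        // The relative href "/docs" is resolved against the supplied base URI.
        System.out.println(firstLinkAbsUrl("<a href=\"/docs\">Docs</a>", "http://example.com/"));
    }
}
```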
  - parseFragment
```
public static List<Node> parseFragment(String fragmentHtml,
                                       Element context,
                                       String baseUri)
```
    Parse a fragment of HTML into a list of nodes. If a context element is supplied, it provides the parsing context for the fragment.
    
    Parameters:
    
    fragmentHtml - the fragment of HTML to parse
    
    context - (optional) the element that this HTML fragment is being parsed for (i.e. for inner HTML). This provides stack context (for implicit element creation).
    
    baseUri - base URI of document (i.e. original fetch location), for resolving relative URLs.
    
    Returns:
    
    list of nodes parsed from the input HTML. Note that the context element, if supplied, is not modified.
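    A short sketch of fragment parsing with a context element, assuming jsoup is on the classpath. The class `FragmentExample` and the helper `parseInBody` are hypothetical; the context here is the `body` of an empty shell document created with `Document.createShell`.

```java
import java.util.List;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.parser.Parser;

class FragmentExample {
    // Hypothetical helper: parses the fragment as if it were the inner HTML
    // of a <body> element, and returns the resulting nodes.
    static List<Node> parseInBody(String fragmentHtml) {
        Element context = Document.createShell("").body();
        return Parser.parseFragment(fragmentHtml, context, "");
    }

    public static void main(String[] args) {
        // The fragment holds one element followed by a run of text.
        for (Node node : parseInBody("<b>one</b> two")) {
            System.out.println(node.nodeName() + ": " + node.outerHtml());
        }
    }
}
```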
  - parseXmlFragment
```
public static List<Node> parseXmlFragment(String fragmentXml,
                                          String baseUri)
```
    Parse a fragment of XML into a list of nodes.
    
    Parameters:
    
    fragmentXml - the fragment of XML to parse
    
    baseUri - base URI of document (i.e. original fetch location), for resolving relative URLs.
    
    Returns:
    
    list of nodes parsed from the input XML.
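    As a rough illustration, assuming jsoup is on the classpath, an XML fragment with two sibling elements can be parsed as follows. The class `XmlFragmentExample`, the helper `parseItems`, and the sample element names are made up for this sketch.

```java
import java.util.List;
import org.jsoup.nodes.Node;
import org.jsoup.parser.Parser;

class XmlFragmentExample {
    // Hypothetical helper: parses an XML fragment with no base URI.
    static List<Node> parseItems(String fragmentXml) {
        return Parser.parseXmlFragment(fragmentXml, "");
    }

    public static void main(String[] args) {
        List<Node> nodes = parseItems("<item id=\"1\">a</item><item id=\"2\">b</item>");
        for (Node node : nodes) {
            // attr() returns the attribute value, or an empty string if absent.
            System.out.println(node.nodeName() + " id=" + node.attr("id"));
        }
    }
}
```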
  - parseBodyFragment
```
public static Document parseBodyFragment(String bodyHtml,
                                         String baseUri)
```
    Parse a fragment of HTML into the body of a Document.
    
    Parameters:
    
    bodyHtml - fragment of HTML
    
    baseUri - base URI of document (i.e. original fetch location), for resolving relative URLs.
    
    Returns:
    
    Document, with empty head, and HTML parsed into body
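    The sketch below shows one way to wrap a fragment in a full document, assuming jsoup is on the classpath. The class `BodyFragmentExample`, the helper `wrap`, and the `example.com` base URI are illustrative assumptions.

```java
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

class BodyFragmentExample {
    // Hypothetical helper: wraps a body fragment in a complete Document.
    static Document wrap(String bodyHtml) {
        return Parser.parseBodyFragment(bodyHtml, "http://example.com/");
    }

    public static void main(String[] args) {
        Document doc = wrap("<p>Hello</p>");
        // The fragment ends up inside <body>; <head> has no child elements.
        System.out.println(doc.body().html());
        System.out.println(doc.head().children().isEmpty());
    }
}
```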
  - unescapeEntities
```
public static String unescapeEntities(String string,
                                      boolean inAttribute)
```
    Utility method to unescape HTML entities from a string.
    
    Parameters:
    
    string - HTML escaped string
    
    inAttribute - if the string is to be unescaped in strict mode (as attributes are)
    
    Returns:
    
    an unescaped string
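    A minimal usage sketch, assuming jsoup is on the classpath. The class `UnescapeExample` and the helper `unescapeText` are hypothetical names; the helper passes `false` for `inAttribute`, which treats the input as text content rather than an attribute value.

```java
import org.jsoup.parser.Parser;

class UnescapeExample {
    // Hypothetical helper: unescapes entities in text content (not an attribute).
    static String unescapeText(String escaped) {
        return Parser.unescapeEntities(escaped, false);
    }

    public static void main(String[] args) {
        // Prints: <b> & co
        System.out.println(unescapeText("&lt;b&gt; &amp; co"));
    }
}
```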
  - parseBodyFragmentRelaxed
```
public static Document parseBodyFragmentRelaxed(String bodyHtml,
                                                String baseUri)
```
    Deprecated. Use parseBodyFragment(java.lang.String, java.lang.String) or parseFragment(java.lang.String, org.jsoup.nodes.Element, java.lang.String) instead.
    
    Parameters:
    
    bodyHtml - HTML to parse
    
    baseUri - base URI of document (i.e. original fetch location), for resolving relative URLs.
    
    Returns:
    
    parsed Document
  - htmlParser
```
public static Parser htmlParser()
```
    Create a new HTML parser. This parser treats input as HTML5, and enforces the creation of a normalised document, based on a knowledge of the semantics of the incoming tags.
    
    Returns:
    
    a new HTML parser.
  - xmlParser
```
public static Parser xmlParser()
```
    Create a new XML parser. This parser assumes no knowledge of the incoming tags and does not treat the input as HTML; instead, it creates a simple tree directly from the input.
    
    Returns:
    
    a new simple XML parser.
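    To make the difference between the two factory methods concrete, the sketch below parses the same input with each parser, assuming jsoup is on the classpath. The class `ParserKindsExample` and the helpers `asHtml` and `asXml` are hypothetical names introduced for illustration.

```java
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

class ParserKindsExample {
    // Hypothetical helper: parses the input with the HTML parser.
    static Document asHtml(String input) {
        return Parser.htmlParser().parseInput(input, "");
    }

    // Hypothetical helper: parses the same input with the XML parser.
    static Document asXml(String input) {
        return Parser.xmlParser().parseInput(input, "");
    }

    public static void main(String[] args) {
        String input = "<item>x</item>";
        // HTML parser: the element is placed inside a generated <body>.
        System.out.println(asHtml(input).select("body > item").size());
        // XML parser: no <html>, <head> or <body> is generated.
        System.out.println(asXml(input).select("body").isEmpty());
    }
}
```

    The HTML parser normalises the input into a full document structure, while the XML parser keeps the tree as written.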
