public class TaggedPdfReaderTool extends Object
Modifier and Type | Field and Description |
---|---|
protected PdfDocument | document |
protected OutputStreamWriter | out |
protected Map<PdfDictionary,Map<Integer,String>> | parsedTags |
protected String | rootTag |
Constructor and Description |
---|
TaggedPdfReaderTool(PdfDocument document) Constructs a TaggedPdfReaderTool via a given PdfDocument. |
Modifier and Type | Method and Description |
---|---|
void | convertToXml(OutputStream os) Converts the current tag structure into an XML file with the default encoding (UTF-8). |
void | convertToXml(OutputStream os, String charset) Converts the current tag structure into an XML file with the provided encoding. |
protected static String | escapeXML(String s, boolean onlyASCII) Escapes a string with the appropriate XML codes. NOTE: copied from the iText 5 XMLUtils class. |
protected static String | fixTagName(String tag) |
protected void | inspectAttributes(PdfStructElem kid) |
protected void | inspectKid(IStructureNode kid) |
protected void | inspectKids(List<IStructureNode> kids) |
static boolean | isValidCharacterValue(int c) Checks if a character value should be escaped/unescaped. |
protected void | parseTag(PdfMcr kid) |
TaggedPdfReaderTool | setRootTag(String rootTagName) Sets the name of the root tag of the resultant XML file. |
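The escapeXML entry above describes replacing characters with XML codes. The following is a minimal standalone sketch of that idea, not the library's implementation. The class name XmlEscapeSketch and the method escapeXml are hypothetical, and the meaning of the onlyAscii flag (emit numeric character references for non-ASCII code points) is an assumption inferred from the parameter name. The sketch does not filter characters that are invalid in XML.

```java
/** Illustrative sketch only; this class is hypothetical and not part of iText. */
class XmlEscapeSketch {

    /**
     * Replaces the five predefined XML entities. If onlyAscii is true
     * (assumed meaning), code points above 127 become numeric references.
     */
    static String escapeXml(String s, boolean onlyAscii) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);       // read a full code point, not half a surrogate pair
            i += Character.charCount(cp);
            switch (cp) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:
                    if (onlyAscii && cp > 127) {
                        sb.append("&#").append(cp).append(';');
                    } else {
                        sb.appendCodePoint(cp);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeXml("a < b & \"c\"", false));
        System.out.println(escapeXml("caf\u00e9", true));
    }
}
```

Iterating by code point rather than by char keeps characters outside the Basic Multilingual Plane from being split into two invalid references.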
protected PdfDocument document
protected OutputStreamWriter out
protected String rootTag
protected Map<PdfDictionary,Map<Integer,String>> parsedTags
public TaggedPdfReaderTool(PdfDocument document)
Constructs a TaggedPdfReaderTool via a given PdfDocument.
document - the document to read tag structure from
public static boolean isValidCharacterValue(int c)
c - a character value
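As an illustration of what a character-value check of this kind can look like, the sketch below tests a code point against the Char production of the XML 1.0 specification. Whether isValidCharacterValue uses exactly these ranges is an assumption; the class XmlCharCheck and the method isValidXmlChar are hypothetical names, not part of the library.

```java
/** Illustrative sketch only; this class is hypothetical and not part of iText. */
class XmlCharCheck {

    /** Returns true if the code point matches the XML 1.0 Char production. */
    static boolean isValidXmlChar(int c) {
        return c == 0x9 || c == 0xA || c == 0xD          // tab, line feed, carriage return
                || (c >= 0x20 && c <= 0xD7FF)            // BMP below the surrogate block
                || (c >= 0xE000 && c <= 0xFFFD)          // BMP above the surrogate block
                || (c >= 0x10000 && c <= 0x10FFFF);      // supplementary planes
    }

    public static void main(String[] args) {
        System.out.println(isValidXmlChar('A'));     // an ordinary letter
        System.out.println(isValidXmlChar(0x0));     // the NUL control character
        System.out.println(isValidXmlChar(0xFFFE));  // a Unicode non-character
    }
}
```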
public void convertToXml(OutputStream os) throws IOException
os - the output stream to save the XML file to
Throws: IOException
public void convertToXml(OutputStream os, String charset) throws IOException
os - the output stream to save the XML file to
charset - the charset of the resultant XML file
Throws: IOException
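The charset argument determines how characters are encoded into bytes on the output stream. Since the out field is an OutputStreamWriter, a plausible reading is that the stream is wrapped in a writer created with the given charset name; that wiring is an assumption. The sketch below uses only the JDK to show the effect of the charset choice. The class CharsetWriterSketch and the method encodedLength are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Illustrative sketch only; this class is hypothetical and not part of iText. */
class CharsetWriterSketch {

    /** Writes text through an OutputStreamWriter and returns the number of bytes produced. */
    static int encodedLength(String text, String charset) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        // Closing the writer flushes any buffered characters into the stream.
        try (Writer out = new OutputStreamWriter(os, charset)) {
            out.write(text);
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for unknown charset names.
            throw new UncheckedIOException(e);
        }
        return os.size();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";
        System.out.println(encodedLength(text, "UTF-8"));
        System.out.println(encodedLength(text, "ISO-8859-1"));
    }
}
```

The same string yields a different byte count per encoding, so the charset passed to convertToXml should match what the consumer of the XML file expects.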
public TaggedPdfReaderTool setRootTag(String rootTagName)
rootTagName - the name of the root tag
protected void inspectKids(List<IStructureNode> kids)
protected void inspectKid(IStructureNode kid)
protected void inspectAttributes(PdfStructElem kid)
protected void parseTag(PdfMcr kid)
Copyright © 1998–2018 iText Group NV. All rights reserved.