Package com.itextpdf.kernel.utils
Class TaggedPdfReaderTool
java.lang.Object
com.itextpdf.kernel.utils.TaggedPdfReaderTool
Converts a tagged PDF document into an XML file.
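A typical call sequence might look like the following minimal sketch. It assumes the iText kernel module is on the classpath and that the input PDF is actually tagged. The file names tagged.pdf and structure.xml and the root tag name "root" are hypothetical, chosen only for illustration.

```java
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.utils.TaggedPdfReaderTool;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class TaggedPdfToXmlExample {
    public static void main(String[] args) throws IOException {
        // Hypothetical paths, for illustration only.
        String src = "tagged.pdf";
        String dest = "structure.xml";

        try (PdfDocument pdfDoc = new PdfDocument(new PdfReader(src));
             OutputStream os = new FileOutputStream(dest)) {
            new TaggedPdfReaderTool(pdfDoc)
                    .setRootTag("root")   // optional: name of the XML root element
                    .convertToXml(os);    // single-argument overload: UTF-8
        }
    }
}
```

To write the XML in another encoding, the two-argument overload convertToXml(os, charset) can be used in place of the last call.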
-
Field Summary
Modifier and Type    Field
protected PdfDocument    document
protected OutputStreamWriter    out
protected Map<PdfDictionary, Map<Integer, String>>    parsedTags
protected String    rootTag
-
Constructor Summary
Constructor    Description
TaggedPdfReaderTool(PdfDocument document)
    Constructs a TaggedPdfReaderTool via a given PdfDocument.
-
Method Summary
Modifier and Type    Method    Description
void convertToXml(OutputStream os)
    Converts the current tag structure into an XML file with default encoding (UTF-8).
void convertToXml(OutputStream os, String charset)
    Converts the current tag structure into an XML file with the provided encoding.
protected static String escapeXML(String s, boolean onlyASCII)
    Escapes a string with the appropriate XML codes. NOTE: copied from the iText 5 XMLUtils class.
protected static String fixTagName(String tag)
protected void inspectAttributes
protected void inspectKid(IStructureNode kid)
protected void inspectKids(List<IStructureNode> kids)
static boolean isValidCharacterValue(int c)
    Checks if a character value should be escaped/unescaped.
protected void parseTag
TaggedPdfReaderTool setRootTag(String rootTagName)
    Sets the name of the root tag of the resultant XML file.
-
Field Details
-
document
-
out
-
rootTag
-
parsedTags
-
-
Constructor Details
-
TaggedPdfReaderTool
public TaggedPdfReaderTool(PdfDocument document)
Constructs a TaggedPdfReaderTool via a given PdfDocument.
Parameters:
document - the document to read the tag structure from
-
-
Method Details
-
isValidCharacterValue
public static boolean isValidCharacterValue(int c)
Checks if a character value should be escaped/unescaped.
Parameters:
c - a character value
Returns:
true if it's OK to escape or unescape this value
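As a rough illustration of what such a check can look like, the sketch below tests a code point against the Char production of XML 1.0. Treating that production as the rule is an assumption, not a statement about the library's implementation. The class name XmlCharSketch and the method name isXmlChar are hypothetical.

```java
final class XmlCharSketch {
    // Assumption: "valid" means allowed by the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXmlChar(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    public static void main(String[] args) {
        System.out.println(isXmlChar('A'));  // prints "true"
        System.out.println(isXmlChar(0x0));  // prints "false" (NUL is not an XML character)
    }
}
```

Under this rule, control characters other than tab, line feed, and carriage return are rejected, as are lone surrogates and the noncharacters U+FFFE and U+FFFF.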
-
convertToXml
public void convertToXml(OutputStream os) throws IOException
Converts the current tag structure into an XML file with default encoding (UTF-8).
Parameters:
os - the output stream to save the XML file to
Throws:
IOException - in case of any I/O error
-
convertToXml
public void convertToXml(OutputStream os, String charset) throws IOException
Converts the current tag structure into an XML file with the provided encoding.
Parameters:
os - the output stream to save the XML file to
charset - the charset of the resultant XML file
Throws:
IOException - in case of any I/O error
-
setRootTag
public TaggedPdfReaderTool setRootTag(String rootTagName)
Sets the name of the root tag of the resultant XML file.
Parameters:
rootTagName - the name of the root tag
Returns:
this object
-
inspectKids
-
inspectKid
-
inspectAttributes
-
parseTag
-
fixTagName
-
escapeXML
protected static String escapeXML(String s, boolean onlyASCII)
Escapes a string with the appropriate XML codes. NOTE: copied from the iText 5 XMLUtils class.
Parameters:
s - the string to be escaped
onlyASCII - if true, codes above 127 will always be escaped with &#nn;
Returns:
the escaped string
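The escaping described above can be sketched with the standard library alone. This is a simplified illustration, not the library's code. It assumes the five predefined XML entities are used for markup characters and that, when onlyAscii is true, code points above 127 become decimal references of the form &#nn;. The class name XmlEscapeSketch and the method name escape are hypothetical.

```java
final class XmlEscapeSketch {
    // Escapes markup characters and, if onlyAscii is true, every code point above 127.
    static String escape(String s, boolean onlyAscii) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            // Walk by code point so supplementary characters yield one reference, not two.
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:
                    if (onlyAscii && cp > 127) {
                        sb.append("&#").append(cp).append(';');
                    } else {
                        sb.appendCodePoint(cp);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints "caf&#233; &lt;b&gt;"
        System.out.println(escape("caf\u00e9 <b>", true));
    }
}
```

Unlike a production implementation, this sketch does not filter out characters that are invalid in XML; a check such as isValidCharacterValue would be the natural place for that.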
-