Package com.itextpdf.kernel.utils
Class TaggedPdfReaderTool
java.lang.Object
com.itextpdf.kernel.utils.TaggedPdfReaderTool
Converts a tagged PDF document into an XML file.
Field Summary
Fields (modifier and type, then field name):
    protected PdfDocument document
    protected OutputStreamWriter out
    protected Map<PdfDictionary, Map<Integer, String>> parsedTags
    protected String rootTag
Constructor Summary
Constructors:
    TaggedPdfReaderTool(PdfDocument document)
        Constructs a TaggedPdfReaderTool via a given PdfDocument.
Method Summary
Methods (modifier and type, method, description):
    void convertToXml(OutputStream os)
        Converts the current tag structure into an XML file with default encoding (UTF-8).
    void convertToXml(OutputStream os, String charset)
        Converts the current tag structure into an XML file with provided encoding.
    protected static String escapeXML(String s, boolean onlyASCII)
        NOTE: copied from the iText 5 XMLUtils class. Escapes a string with the appropriate XML codes.
    protected static String fixTagName(String tag)
    protected void inspectAttributes
    protected void inspectKid(IStructureNode kid)
    protected void inspectKids(List<IStructureNode> kids)
    static boolean isValidCharacterValue(int c)
        Checks if a character value should be escaped/unescaped.
    protected void parseTag
    TaggedPdfReaderTool setRootTag(String rootTagName)
        Sets the name of the root tag of the resultant XML file.
Field Details
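document
protected PdfDocument document

out
protected OutputStreamWriter out

rootTag
protected String rootTag

parsedTags
protected Map<PdfDictionary, Map<Integer, String>> parsedTags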
Constructor Details
TaggedPdfReaderTool
public TaggedPdfReaderTool(PdfDocument document)
Constructs a TaggedPdfReaderTool via a given PdfDocument.
Parameters:
    document - the document to read tag structure from
Method Details
isValidCharacterValue
public static boolean isValidCharacterValue(int c)
Checks if a character value should be escaped/unescaped.
Parameters:
    c - a character value
Returns:
    true if it's OK to escape or unescape this value
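This page does not state which character values pass the check. As a minimal sketch, assuming the test follows the Char production of XML 1.0 (an assumption, not something this page confirms), the logic could look like the following. The class name XmlCharSketch and the method isXmlChar are hypothetical and are not part of iText.

```java
/** Hypothetical stand-alone sketch; not the iText implementation. */
final class XmlCharSketch {

    /**
     * Returns true if the code point is allowed by the XML 1.0 Char production:
     * #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF].
     */
    static boolean isXmlChar(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    public static void main(String[] args) {
        System.out.println(isXmlChar('A'));     // a letter is allowed
        System.out.println(isXmlChar(0x0));     // NUL is not allowed in XML 1.0
        System.out.println(isXmlChar(0xFFFE));  // noncharacter, not allowed
    }
}
```

Taking an int rather than a char lets the same check cover supplementary code points above U+FFFF.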
convertToXml
public void convertToXml(OutputStream os) throws IOException
Converts the current tag structure into an XML file with default encoding (UTF-8).
Parameters:
    os - the output stream to save the XML file to
Throws:
    IOException - in case of any I/O error
convertToXml
public void convertToXml(OutputStream os, String charset) throws IOException
Converts the current tag structure into an XML file with provided encoding.
Parameters:
    os - the output stream to save the XML file to
    charset - the charset of the resultant XML file
Throws:
    IOException - in case of any I/O error
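Because the class keeps an OutputStreamWriter in its out field, the charset argument presumably selects the encoder wrapped around os. The sketch below illustrates that idea with the JDK alone; it is not the iText implementation. XmlWriteSketch, writeXml, and toBytes are hypothetical names, and the element written is a placeholder rather than real tag-structure output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

/** Hypothetical stand-alone sketch; not the iText implementation. */
final class XmlWriteSketch {

    /** Writes a tiny placeholder element to os, encoded with the named charset. */
    static void writeXml(OutputStream os, String charset, String text) throws IOException {
        // Charset.forName throws an unchecked exception for illegal or unsupported names.
        OutputStreamWriter out = new OutputStreamWriter(os, Charset.forName(charset));
        out.write("<P>" + text + "</P>\n");
        out.flush(); // push buffered bytes to os; closing os is left to the caller
    }

    /** Convenience wrapper that captures the encoded bytes in memory. */
    static byte[] toBytes(String charset, String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeXml(buffer, charset, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // The same nine characters occupy a different number of bytes per charset.
        System.out.println(toBytes("UTF-8", "\u00e9").length);    // prints 10
        System.out.println(toBytes("UTF-16BE", "\u00e9").length); // prints 18
    }
}
```

The byte counts differ because UTF-8 encodes the eight ASCII characters in one byte each and the accented letter in two, while UTF-16BE uses two bytes for every character here.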
setRootTag
public TaggedPdfReaderTool setRootTag(String rootTagName)
Sets the name of the root tag of the resultant XML file.
Parameters:
    rootTagName - the name of the root tag
Returns:
    this object
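inspectKids
protected void inspectKids(List<IStructureNode> kids)

inspectKid
protected void inspectKid(IStructureNode kid)

inspectAttributes

parseTag

fixTagName
protected static String fixTagName(String tag)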
escapeXML
protected static String escapeXML(String s, boolean onlyASCII)
NOTE: copied from the iText 5 XMLUtils class.
Escapes a string with the appropriate XML codes.
Parameters:
    s - the string to be escaped
    onlyASCII - if true, codes above 127 will always be escaped with &#nn;
Returns:
    the escaped string
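The escaping described for escapeXML can be sketched as follows. This is an illustration only, not the copied iText 5 code: the class XmlEscapeSketch and the method escape are hypothetical, the choice of the five predefined XML entities is an assumption, and the sketch does not filter characters that are invalid in XML.

```java
/** Hypothetical stand-alone sketch; not the iText implementation. */
final class XmlEscapeSketch {

    /**
     * Replaces the five XML special characters with their predefined entities.
     * When onlyAscii is true, every code point above 127 is written as a
     * decimal numeric character reference instead of the raw character.
     */
    static String escape(String s, boolean onlyAscii) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // advance by one or two chars
            switch (cp) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:
                    if (onlyAscii && cp > 127) {
                        sb.append("&#").append(cp).append(';');
                    } else {
                        sb.appendCodePoint(cp);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a<b & \"c\"", false)); // prints a&lt;b &amp; &quot;c&quot;
        System.out.println(escape("\u00e9", true));       // prints &#233;
    }
}
```

Escaping non-ASCII characters as numeric references keeps the output readable even if it is later decoded with the wrong charset, at the cost of larger files.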