Class CompareTool
For visual comparison it uses two external tools, Ghostscript and ImageMagick, which must be installed on your machine. To allow CompareTool to use them, pass either Java properties or environment variables named "ITEXT_GS_EXEC" and "ITEXT_MAGICK_COMPARE_EXEC", containing the commands that execute Ghostscript and ImageMagick respectively.
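As a rough illustration of the configuration described above, the following sketch checks whether a command is available for each tool before any visual comparison runs. The class name ExternalToolConfig and the lookup order (Java property first, then environment variable) are assumptions made for this example; they are not taken from the CompareTool implementation.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of iText: resolves the command for an external tool.
final class ExternalToolConfig {
    static final String GS_KEY = "ITEXT_GS_EXEC";
    static final String MAGICK_KEY = "ITEXT_MAGICK_COMPARE_EXEC";

    // Returns the command stored under key, or null when neither source defines it.
    // Assumption: a Java property takes precedence over an environment variable.
    static String resolve(String key, Properties props, Map<String, String> env) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            value = env.get(key);
        }
        return (value == null || value.trim().isEmpty()) ? null : value.trim();
    }

    public static void main(String[] args) {
        for (String key : new String[] {GS_KEY, MAGICK_KEY}) {
            String command = resolve(key, System.getProperties(), System.getenv());
            System.out.println(key + " -> " + (command == null ? "not configured" : command));
        }
    }
}
```

A Java property can be supplied on the JVM command line with the standard -D option (for example -DITEXT_GS_EXEC=... with the command for your installation); an environment variable is set in the shell that launches the JVM.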
The CompareTool class was designed mainly for testing iText itself, to ensure that the same code produces the same PDF document. For this reason you will often encounter parameter names such as "outDoc" and "cmpDoc", which stand for output document and document-for-comparison. The first is viewed as the current result, and the second is referred to as the normal or ideal result. OutDoc is compared to the ideal cmpDoc. Therefore all comparison reports take the form "Expected ..., but was ...". This should be interpreted as follows: the "expected" part stands for the content of the cmpDoc, and the "but was" part stands for the content of the outDoc.
-
Nested Class Summary
Modifier and Type | Class | Description
static class | CompareTool.CompareResult | Class containing results of the comparison of two documents.
static class | | Exceptions thrown when errors occur during generation and comparison of images obtained on the basis of pdf files.
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
static void | cleanup | Clean up memory occupied for the tests.
boolean | compareArrays(PdfArray outArray, PdfArray cmpArray) | Simple method that compares two given PdfArrays by content.
boolean | compareBooleans(PdfBoolean outBoolean, PdfBoolean cmpBoolean) | Simple method that compares two given PdfBooleans.
CompareTool.CompareResult | compareByCatalog(PdfDocument outDocument, PdfDocument cmpDocument) | Compares two PDF documents by content starting from Catalog dictionary and then recursively comparing corresponding objects which are referenced from it.
String | compareByContent(String outPdf, String cmpPdf, String outPath) | Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them.
String | compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix) | Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them.
String | compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, byte[] outPass, byte[] cmpPass) | This method overload is used to compare two encrypted PDF documents.
String | compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer, List<Rectangle>> ignoredAreas) | Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them.
String | compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer, List<Rectangle>> ignoredAreas, byte[] outPass, byte[] cmpPass) | This method overload is used to compare two encrypted PDF documents.
boolean | compareDictionaries(PdfDictionary outDict, PdfDictionary cmpDict) | Simple method that compares two given PdfDictionaries by content.
CompareTool.CompareResult | compareDictionariesStructure(PdfDictionary outDict, PdfDictionary cmpDict) | Recursively compares structures of two corresponding dictionaries from out and cmp PDF documents.
CompareTool.CompareResult | compareDictionariesStructure(PdfDictionary outDict, PdfDictionary cmpDict, Set<PdfName> excludedKeys) | Recursively compares structures of two corresponding dictionaries from out and cmp PDF documents.
 | compareDocumentInfo(String outPdf, String cmpPdf) | Compares document info dictionaries of two pdf documents.
 | compareDocumentInfo(String outPdf, String cmpPdf, byte[] outPass, byte[] cmpPass) | Compares document info dictionaries of two pdf documents.
 | compareLinkAnnotations(String outPdf, String cmpPdf) | Checks if two documents have identical link annotations on corresponding pages.
boolean | compareNames(PdfName outName, PdfName cmpName) | Simple method that compares two given PdfNames.
boolean | compareNumbers(PdfNumber outNumber, PdfNumber cmpNumber) | Simple method that compares two given PdfNumbers.
protected boolean | compareObjects(PdfObject outObj, PdfObject cmpObj, ObjectPath currentPath, CompareTool.CompareResult compareResult) | Compare PDF objects.
boolean | compareStreams(PdfStream outStream, PdfStream cmpStream) | Simple method that compares two given PdfStreams by content.
CompareTool.CompareResult | compareStreamsStructure(PdfStream outStream, PdfStream cmpStream) | Compares structures of two corresponding streams from out and cmp PDF documents.
boolean | compareStrings(PdfString outString, PdfString cmpString) | Simple method that compares two given PdfStrings.
 | compareTagStructures(String outPdf, String cmpPdf) | Compares tag structures of the two PDF documents.
String | compareVisually(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix) | Compares two documents visually.
String | compareVisually(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer, List<Rectangle>> ignoredAreas) | Compares two documents visually.
boolean | compareXmls(byte[] xml1, byte[] xml2) | Utility method that provides simple comparison of the two xml files stored in byte arrays.
boolean | compareXmls(String outXmlFile, String cmpXmlFile) | Utility method that provides simple comparison of the two xml files.
 | compareXmp(String outPdf, String cmpPdf) | Compares xmp metadata of the two given PDF documents.
 | compareXmp(String outPdf, String cmpPdf, boolean ignoreDateAndProducerProperties) | Compares xmp metadata of the two given PDF documents.
protected String[] | | Converts document info into a string array.
static PdfReader | createOutputReader(String filename) | Create PdfReader out of the data created recently or read from disk.
static PdfReader | createOutputReader(String filename, ReaderProperties properties) | Create PdfReader out of the data created recently or read from disk.
static PdfWriter | createTestPdfWriter(String filename) | Create PdfWriter optimized for tests.
static PdfWriter | createTestPdfWriter(String filename, WriterProperties properties) | Create PdfWriter optimized for tests.
CompareTool | disableCachedPagesComparison() | Disables the default logic of pages comparison.
CompareTool | enableEncryptionCompare() | Enables the comparison of the encryption properties of the documents.
CompareTool | enableEncryptionCompare(boolean kdfSaltCompareEnabled) | Enables the comparison of the encryption properties of the documents.
ReaderProperties | getCmpReaderProperties() | Gets ReaderProperties to be passed later to the PdfReader of the cmp document.
ReaderProperties | getOutReaderProperties() | Gets ReaderProperties to be passed later to the PdfReader of the output document.
CompareTool | setCompareByContentErrorsLimit(int compareByContentMaxErrorCount) | Sets the maximum errors count which will be returned as the result of the comparison.
void | setEventCountingMetaInfo(IMetaInfo metaInfo) | Sets IMetaInfo info that will be used for both read and written documents creation.
CompareTool | setGenerateCompareByContentXmlReport(boolean generateCompareByContentXmlReport) | Enables or disables the generation of the comparison report in the form of an xml document.
-
Constructor Details
-
CompareTool
public CompareTool()
Create new CompareTool instance.
-
-
Method Details
-
createTestPdfWriter
Create PdfWriter optimized for tests.
Parameters:
filename - File to write to when necessary.
Returns:
PdfWriter to be used in tests.
Throws:
FileNotFoundException - if the file exists but is a directory rather than a regular file, does not exist but cannot be created, or cannot be opened for any other reason.
IOException
-
createTestPdfWriter
public static PdfWriter createTestPdfWriter(String filename, WriterProperties properties) throws IOException
Create PdfWriter optimized for tests.
Parameters:
filename - File to write to when necessary.
properties - WriterProperties to use.
Returns:
PdfWriter to be used in tests.
Throws:
FileNotFoundException - if the file exists but is a directory rather than a regular file, does not exist but cannot be created, or cannot be opened for any other reason.
IOException
-
createOutputReader
Create PdfReader out of the data created recently or read from disk.
Parameters:
filename - File to read the data from when necessary.
Returns:
PdfReader to be used in tests.
Throws:
IOException - on error
-
createOutputReader
public static PdfReader createOutputReader(String filename, ReaderProperties properties) throws IOException
Create PdfReader out of the data created recently or read from disk.
Parameters:
filename - File to read the data from when necessary.
properties - ReaderProperties to use.
Returns:
PdfReader to be used in tests.
Throws:
IOException - on error
-
cleanup
Clean up memory occupied for the tests.
Parameters:
path - Path to clean up memory for.
-
compareByCatalog
public CompareTool.CompareResult compareByCatalog(PdfDocument outDocument, PdfDocument cmpDocument)
Compares two PDF documents by content, starting from the Catalog dictionary and then recursively comparing corresponding objects which are referenced from it. You can roughly imagine it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
The main difference between this method and the compareByContent(String, String, String, String) methods is the return value. This method returns a CompareTool.CompareResult class instance, which can be used in code, whilst the compareByContent methods simply return a String value when there are differences, which can only be printed. Also, keep in mind that this method doesn't perform visual comparison of the documents.
For more explanations about what outDoc and cmpDoc are, see the last paragraph of the CompareTool class description.
Parameters:
outDocument - a PdfDocument corresponding to the output file, which is to be compared with the cmp-file.
cmpDocument - a PdfDocument corresponding to the cmp-file, which is to be compared with the output file.
Returns:
the report on comparison of two files in the form of the custom class CompareTool.CompareResult instance.
-
disableCachedPagesComparison
Disables the default logic of pages comparison. This option makes sense only for the compareByCatalog(PdfDocument, PdfDocument) method.
By default, pages are treated as special objects: if they are met in the process of comparison, they are not checked as objects, but are simply checked to have the same page numbers in both documents. This behaviour is intended for the compareByContent(java.lang.String, java.lang.String, java.lang.String) set of methods, because in them documents are compared on a page-by-page basis. Thus, we don't need to check whether pages have the same content when they are met in the comparison process; we are sure that we will compare their content or have already compared it.
However, if you use compareByCatalog(com.itextpdf.kernel.pdf.PdfDocument, com.itextpdf.kernel.pdf.PdfDocument) with the default behaviour of pages comparison, pages won't be checked at all: every time a reference to a page dictionary is met, only page numbers will be compared for both documents. You can say that in this case, comparison will be performed for all of the document's catalog entries except /Pages (in fact, the documents' page tree structures will be compared, but the pages themselves won't).
Returns:
this CompareTool instance.
-
setCompareByContentErrorsLimit
Sets the maximum errors count which will be returned as the result of the comparison.
Parameters:
compareByContentMaxErrorCount - the errors count.
Returns:
this CompareTool instance.
-
setGenerateCompareByContentXmlReport
Enables or disables the generation of the comparison report in the form of an xml document.
IMPORTANT NOTE: this flag affects only the comparison performed by compareByContent methods!
Parameters:
generateCompareByContentXmlReport - true to enable xml report generation, false to disable.
Returns:
this CompareTool instance.
-
setEventCountingMetaInfo
Sets IMetaInfo info that will be used for both read and written documents creation.
Parameters:
metaInfo - meta info to set
-
enableEncryptionCompare
Enables the comparison of the encryption properties of the documents. Encryption properties comparison results are returned along with all other comparison results.
IMPORTANT NOTE: this flag affects only the comparison performed by compareByContent methods! compareByCatalog(PdfDocument, PdfDocument) doesn't compare encryption properties because encryption properties aren't part of the document's Catalog.
Returns:
this CompareTool instance.
-
enableEncryptionCompare
Enables the comparison of the encryption properties of the documents. Encryption properties comparison results are returned along with all other comparison results.
IMPORTANT NOTE: this flag affects only the comparison performed by compareByContent methods! compareByCatalog(PdfDocument, PdfDocument) doesn't compare encryption properties because encryption properties aren't part of the document's Catalog.
Parameters:
kdfSaltCompareEnabled - set to true if the PdfName.KDFSalt entry must be compared, false otherwise
Returns:
this CompareTool instance.
-
getOutReaderProperties
Gets ReaderProperties to be passed later to the PdfReader of the output document.
Documents for comparison are opened in reader mode. This method is intended to alter the ReaderProperties which are used to open the output document. This is particularly useful for comparison of encrypted documents.
For more explanations about what outDoc and cmpDoc are, see the last paragraph of the CompareTool class description.
Returns:
ReaderProperties instance to be passed later to the PdfReader of the output document.
-
getCmpReaderProperties
Gets ReaderProperties to be passed later to the PdfReader of the cmp document.
Documents for comparison are opened in reader mode. This method is intended to alter the ReaderProperties which are used to open the cmp document. This is particularly useful for comparison of encrypted documents.
For more explanations about what outDoc and cmpDoc are, see the last paragraph of the CompareTool class description.
Returns:
ReaderProperties instance to be passed later to the PdfReader of the cmp document.
-
compareVisually
public String compareVisually(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix) throws InterruptedException, IOException
Compares two documents visually. For the comparison two external tools are used: Ghostscript and ImageMagick. For more info about the configuration needed for the visual comparison process, see the CompareTool class description.
Note that this method uses the ImageMagickHelper and GhostscriptHelper classes and therefore may create temporary files and directories.
During comparison, an image file is created for every page of the two documents in the folder specified by the outPath parameter. Those page images are then compared, and if there are any differences for some pages, another image file is created with the differences marked on it.
Parameters:
outPdf - the absolute path to the output file, which is to be compared to the cmp-file.
cmpPdf - the absolute path to the cmp-file, which is to be compared to the output file.
outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix - file name prefix for image files with marked differences, if there are any.
Returns:
string containing the list of the pages that are visually different, or null if there are no visual differences.
Throws:
InterruptedException - if the current thread is interrupted by another thread while it is waiting for Ghostscript or ImageMagick processes, then the wait is ended and an InterruptedException is thrown.
IOException - if any of the input files are missing, or if any of the auxiliary files created during the comparison process could not be created.
-
compareVisually
public String compareVisually(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer, List<Rectangle>> ignoredAreas) throws InterruptedException, IOException
Compares two documents visually. For the comparison two external tools are used: Ghostscript and ImageMagick. For more info about the configuration needed for the visual comparison process, see the CompareTool class description.
Note that this method uses the ImageMagickHelper and GhostscriptHelper classes and therefore may create temporary files and directories.
During comparison, an image file is created for every page of the two documents in the folder specified by the outPath parameter. Those page images are then compared, and if there are any differences for some pages, another image file is created with the differences marked on it.
It is possible to ignore certain areas of the document pages during visual comparison. This is useful, for example, when documents should be the same except for a certain page area with a date on it. In this case, new pdf documents are created in the folder specified by outPath with black rectangles at the specified ignored areas, and visual comparison is performed on these new documents.
Parameters:
outPdf - the absolute path to the output file, which is to be compared to the cmp-file.
cmpPdf - the absolute path to the cmp-file, which is to be compared to the output file.
outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix - file name prefix for image files with marked differences, if there are any.
ignoredAreas - a map with one-based page numbers as keys and lists of ignored rectangles as values.
Returns:
string containing the list of the pages that are visually different, or null if there are no visual differences.
Throws:
InterruptedException - if the current thread is interrupted by another thread while it is waiting for Ghostscript or ImageMagick processes, then the wait is ended and an InterruptedException is thrown.
IOException - if any of the input files are missing, or if any of the auxiliary files created during the comparison process could not be created.
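The ignoredAreas argument is a plain map keyed by one-based page number. The sketch below builds such a map. Rect is a hypothetical stand-in for the Rectangle type in the method signature, used so that the example compiles without the library, and the coordinates are made-up values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IgnoredAreasExample {
    // Hypothetical stand-in for the Rectangle type used in the method signature.
    static final class Rect {
        final float x;
        final float y;
        final float width;
        final float height;

        Rect(float x, float y, float width, float height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    // Registers an area to ignore on the given one-based page number.
    static void ignore(Map<Integer, List<Rect>> areas, int pageNumber, Rect area) {
        if (pageNumber < 1) {
            throw new IllegalArgumentException("page numbers are one-based");
        }
        areas.computeIfAbsent(pageNumber, k -> new ArrayList<>()).add(area);
    }

    public static void main(String[] args) {
        Map<Integer, List<Rect>> ignoredAreas = new HashMap<>();
        ignore(ignoredAreas, 1, new Rect(400, 780, 150, 20)); // e.g. a date stamp on page 1
        ignore(ignoredAreas, 1, new Rect(36, 20, 100, 12));   // a second area on the same page
        ignore(ignoredAreas, 3, new Rect(400, 780, 150, 20));
        System.out.println("pages with ignored areas: " + ignoredAreas.keySet());
    }
}
```

Pages that have no entry in the map are compared in full.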
-
compareByContent
public String compareByContent(String outPdf, String cmpPdf, String outPath) throws InterruptedException, IOException
Compares two PDF documents by content, starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them. You can roughly imagine it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
When comparison by content is finished, if any differences were found, visual comparison is automatically started. For this overload, the differenceImagePrefix value is generated using the diff_%outPdfFileName%_ format.
For more explanations about what outPdf and cmpPdf are, see the last paragraph of the CompareTool class description.
Parameters:
outPdf - the absolute path to the output file, which is to be compared to the cmp-file.
cmpPdf - the absolute path to the cmp-file, which is to be compared to the output file.
outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
Returns:
string containing a text report on the encountered content differences and also the list of the pages that are visually different, or null if there are no content and therefore no visual differences.
Throws:
InterruptedException - if the current thread is interrupted by another thread while it is waiting for Ghostscript or ImageMagick processes, then the wait is ended and an InterruptedException is thrown.
IOException - if any of the input files are missing, or if any of the auxiliary files created during the comparison process could not be created.
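Because the returned report is null when the documents match, a caller can treat any non-null value as a failure. A minimal sketch of that convention follows; ReportCheck and failOnDifferences are hypothetical names, and the string passed in stands for whatever compareByContent returned.

```java
// Hypothetical helper, not part of iText.
final class ReportCheck {
    // Throws when the comparison produced a report; a null report means no differences.
    static void failOnDifferences(String report) {
        if (report != null) {
            throw new AssertionError("Documents differ:\n" + report);
        }
    }

    public static void main(String[] args) {
        failOnDifferences(null); // no differences, nothing happens
        try {
            failOnDifferences("Expected ..., but was ...");
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```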
-
compareByContent
public String compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix) throws InterruptedException, IOException
Compares two PDF documents by content, starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them. You can roughly imagine it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
When comparison by content is finished, if any differences were found, visual comparison is automatically started.
For more explanations about what outPdf and cmpPdf are, see the last paragraph of the CompareTool class description.
Parameters:
outPdf - the absolute path to the output file, which is to be compared to the cmp-file.
cmpPdf - the absolute path to the cmp-file, which is to be compared to the output file.
outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix - file name prefix for image files with marked visual differences, if there are any; if it's set to null, the prefix defaults to the diff_%outPdfFileName%_ format.
Returns:
string containing a text report on the encountered content differences and also the list of the pages that are visually different, or null if there are no content and therefore no visual differences.
Throws:
InterruptedException - if the current thread is interrupted by another thread while it is waiting for Ghostscript or ImageMagick processes, then the wait is ended and an InterruptedException is thrown.
IOException - if any of the input files are missing, or if any of the auxiliary files created during the comparison process could not be created.
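The null default for differenceImagePrefix can be pictured with the sketch below. It assumes that %outPdfFileName% means the last element of the outPdf path with its extension kept; the documentation above does not spell this out, so treat that detail, and the DiffPrefix helper itself, as hypothetical.

```java
import java.nio.file.Paths;

// Hypothetical helper, not part of iText.
final class DiffPrefix {
    // Returns the given prefix, or a default built from the output file name when it is null.
    // Assumption: %outPdfFileName% is the last path element, extension included.
    static String effectivePrefix(String outPdf, String differenceImagePrefix) {
        if (differenceImagePrefix != null) {
            return differenceImagePrefix;
        }
        String fileName = Paths.get(outPdf).getFileName().toString();
        return "diff_" + fileName + "_";
    }

    public static void main(String[] args) {
        System.out.println(effectivePrefix("out/report.pdf", null));
        System.out.println(effectivePrefix("out/report.pdf", "delta_"));
    }
}
```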
-
compareByContent
public String compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, byte[] outPass, byte[] cmpPass) throws InterruptedException, IOException
This method overload is used to compare two encrypted PDF documents. Document passwords are passed with the outPass and cmpPass parameters.
Compares two PDF documents by content, starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them. You can roughly imagine it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
When comparison by content is finished, if any differences were found, visual comparison is automatically started. For more info see compareVisually(String, String, String, String).
For more explanations about what outPdf and cmpPdf are, see the last paragraph of the CompareTool class description.
Parameters:
outPdf - the absolute path to the output file, which is to be compared to the cmp-file.
cmpPdf - the absolute path to the cmp-file, which is to be compared to the output file.
outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix - file name prefix for image files with marked visual differences, if there are any; if it's set to null, the prefix defaults to the diff_%outPdfFileName%_ format.
outPass - password for the encrypted document specified by the outPdf absolute path.
cmpPass - password for the encrypted document specified by the cmpPdf absolute path.
Returns:
string containing a text report on the encountered content differences and also the list of the pages that are visually different, or null if there are no content and therefore no visual differences.
Throws:
InterruptedException - if the current thread is interrupted by another thread while it is waiting for Ghostscript or ImageMagick processes, then the wait is ended and an InterruptedException is thrown.
IOException - if any of the input files are missing, or if any of the auxiliary files created during the comparison process could not be created.
-
compareByContent
public String compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer, List<Rectangle>> ignoredAreas) throws InterruptedException, IOException
Compares two PDF documents by content, starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them. You can roughly imagine it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
When comparison by content is finished, if any differences were found, visual comparison is automatically started.
For more explanations about what outPdf and cmpPdf are, see the last paragraph of the CompareTool class description.
Parameters:
outPdf - the absolute path to the output file, which is to be compared to the cmp-file.
cmpPdf - the absolute path to the cmp-file, which is to be compared to the output file.
outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix - file name prefix for image files with marked visual differences, if there are any; if it's set to null, the prefix defaults to the diff_%outPdfFileName%_ format.
ignoredAreas - a map with one-based page numbers as keys and lists of ignored rectangles as values.
Returns:
string containing a text report on the encountered content differences and also the list of the pages that are visually different, or null if there are no content and therefore no visual differences.
Throws:
InterruptedException - if the current thread is interrupted by another thread while it is waiting for Ghostscript or ImageMagick processes, then the wait is ended and an InterruptedException is thrown.
IOException - if any of the input files are missing, or if any of the auxiliary files created during the comparison process could not be created.
-
compareByContent
public String compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer, List<Rectangle>> ignoredAreas, byte[] outPass, byte[] cmpPass) throws InterruptedException, IOException
This method overload is used to compare two encrypted PDF documents. Document passwords are passed with the outPass and cmpPass parameters.
Compares two PDF documents by content, starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them. You can roughly imagine it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
When comparison by content is finished, if any differences were found, visual comparison is automatically started.
For more explanations about what outPdf and cmpPdf are, see the last paragraph of the CompareTool class description.
Parameters:
outPdf - the absolute path to the output file, which is to be compared to the cmp-file.
cmpPdf - the absolute path to the cmp-file, which is to be compared to the output file.
outPath - the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix - file name prefix for image files with marked visual differences, if there are any; if it's set to null, the prefix defaults to the diff_%outPdfFileName%_ format.
ignoredAreas - a map with one-based page numbers as keys and lists of ignored rectangles as values.
outPass - password for the encrypted document specified by the outPdf absolute path.
cmpPass - password for the encrypted document specified by the cmpPdf absolute path.
Returns:
string containing a text report on the encountered content differences and also the list of the pages that are visually different, or null if there are no content and therefore no visual differences.
Throws:
InterruptedException - if the current thread is interrupted by another thread while it is waiting for Ghostscript or ImageMagick processes, then the wait is ended and an InterruptedException is thrown.
IOException - if any of the input files are missing, or if any of the auxiliary files created during the comparison process could not be created.
-
compareDictionaries
Simple method that compares two given PdfDictionaries by content. This is a "deep" comparison, which means that all nested objects are also compared by content.
Parameters:
outDict - dictionary to compare.
cmpDict - dictionary to compare.
Returns:
true if dictionaries are equal by content, otherwise false.
-
compareDictionariesStructure
public CompareTool.CompareResult compareDictionariesStructure(PdfDictionary outDict, PdfDictionary cmpDict)
Recursively compares structures of two corresponding dictionaries from out and cmp PDF documents. You can roughly imagine it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
Both out and cmp PdfDictionary shall have indirect references.
By default, page dictionaries are excluded from the comparison when met and are instead compared in a special manner, simply comparing their page numbers. This behavior can be disabled by calling disableCachedPagesComparison().
For more explanations about what outPdf and cmpPdf are, see the last paragraph of the CompareTool class description.
Parameters:
outDict - an indirect PdfDictionary from the output file, which is to be compared to the cmp-file dictionary.
cmpDict - an indirect PdfDictionary from the cmp-file, which is to be compared to the output file dictionary.
Returns:
CompareTool.CompareResult instance containing differences between the two dictionaries, or null if dictionaries are equal.
-
compareDictionariesStructure
public CompareTool.CompareResult compareDictionariesStructure(PdfDictionary outDict, PdfDictionary cmpDict, Set<PdfName> excludedKeys)
Recursively compares structures of two corresponding dictionaries from out and cmp PDF documents. You can roughly imagine it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
Both out and cmp PdfDictionary shall have indirect references.
By default, page dictionaries are excluded from the comparison when met and are instead compared in a special manner, simply comparing their page numbers. This behavior can be disabled by calling disableCachedPagesComparison().
For more explanations about what outPdf and cmpPdf are, see the last paragraph of the CompareTool class description.
Parameters:
outDict - an indirect PdfDictionary from the output file, which is to be compared to the cmp-file dictionary.
cmpDict - an indirect PdfDictionary from the cmp-file, which is to be compared to the output file dictionary.
excludedKeys - a Set of names that designate entries from the outDict and cmpDict dictionaries which are to be skipped during comparison.
Returns:
CompareTool.CompareResult instance containing differences between the two dictionaries, or null if dictionaries are equal.
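To make the depth-first idea and the role of excludedKeys more concrete, here is a toy model that walks two nested structures built from plain Java maps and lists. It is not the library's algorithm: it works on ordinary collections rather than PDF objects, it knows nothing about indirect references or the special handling of pages, and the StructureDiff class and its path notation are invented for this sketch. Map keys are expected to be strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Toy model, not part of iText.
final class StructureDiff {
    // Depth-first comparison of two nested structures (maps, lists, scalars).
    // Each difference is reported as "<path>: Expected <cmp>, but was <out>".
    static List<String> compare(Object out, Object cmp, Set<String> excludedKeys) {
        List<String> differences = new ArrayList<>();
        walk(out, cmp, "", excludedKeys, differences);
        return differences;
    }

    private static void walk(Object out, Object cmp, String path,
                             Set<String> excludedKeys, List<String> differences) {
        if (out instanceof Map && cmp instanceof Map) {
            Map<?, ?> outMap = (Map<?, ?>) out;
            Map<?, ?> cmpMap = (Map<?, ?>) cmp;
            // Visit the union of keys in sorted order so reports are deterministic.
            Set<String> keys = new TreeSet<>();
            for (Object k : outMap.keySet()) keys.add(String.valueOf(k));
            for (Object k : cmpMap.keySet()) keys.add(String.valueOf(k));
            for (String key : keys) {
                if (excludedKeys.contains(key)) {
                    continue; // skipped entries are never descended into
                }
                walk(outMap.get(key), cmpMap.get(key), path + "/" + key, excludedKeys, differences);
            }
        } else if (out instanceof List && cmp instanceof List) {
            List<?> outList = (List<?>) out;
            List<?> cmpList = (List<?>) cmp;
            if (outList.size() != cmpList.size()) {
                differences.add(path + ": Expected " + cmpList.size()
                        + " elements, but was " + outList.size());
                return;
            }
            for (int i = 0; i < outList.size(); i++) {
                walk(outList.get(i), cmpList.get(i), path + "[" + i + "]", excludedKeys, differences);
            }
        } else if (!Objects.equals(out, cmp)) {
            differences.add(path + ": Expected " + cmp + ", but was " + out);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> out = Map.of("Type", "Page", "Rotate", 90, "ModDate", "d1");
        Map<String, Object> cmp = Map.of("Type", "Page", "Rotate", 0, "ModDate", "d2");
        for (String line : compare(out, cmp, Set.of("ModDate"))) {
            System.out.println(line);
        }
    }
}
```

The report lines follow the "Expected ..., but was ..." convention from the class description: the expected value comes from the cmp side and the actual value from the out side.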
-
compareStreamsStructure
Compares structures of two corresponding streams from out and cmp PDF documents. You can roughly imagine it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
For more explanations about what outPdf and cmpPdf are, see the last paragraph of the CompareTool class description.
Parameters:
outStream - a PdfStream from the output file, which is to be compared to the cmp-file stream.
cmpStream - a PdfStream from the cmp-file, which is to be compared to the output file stream.
Returns:
CompareTool.CompareResult instance containing differences between the two streams, or null if streams are equal.
-
compareStreams
Simple method that compares two given PdfStreams by content. This is a "deep" comparison, which means that all nested objects are also compared by content.
Parameters:
outStream - stream to compare.
cmpStream - stream to compare.
Returns:
true if streams are equal by content, otherwise false.
-
compareArrays
Simple method that compares two given PdfArrays by content. This is a "deep" comparison, which means that all nested objects are also compared by content.
Parameters:
outArray - array to compare.
cmpArray - array to compare.
Returns:
true if arrays are equal by content, otherwise false.
-
compareNames
Simple method that compares two given PdfNames.
Parameters:
outName - name to compare.
cmpName - name to compare.
Returns:
true if names are equal, otherwise false.
-
compareNumbers
Simple method that compares two given PdfNumbers.
Parameters:
outNumber - number to compare.
cmpNumber - number to compare.
Returns:
true if numbers are equal, otherwise false.
-
compareStrings
Simple method that compares two given PdfStrings.
- Parameters:
- outString - string to compare.
- cmpString - string to compare.
- Returns:
- true if strings are equal, otherwise false.
-
compareBooleans
Simple method that compares two given PdfBooleans.
- Parameters:
- outBoolean - boolean to compare.
- cmpBoolean - boolean to compare.
- Returns:
- true if booleans are equal, otherwise false.
-
compareXmp
Compares the XMP metadata of the two given PDF documents.
- Parameters:
- outPdf - the absolute path to the output file, whose XMP metadata is to be compared to that of the cmp-file.
- cmpPdf - the absolute path to the cmp-file, whose XMP metadata is to be compared to that of the output file.
- Returns:
- text report on the XMP differences, or null if there are no differences.
-
compareXmp
Compares the XMP metadata of the two given PDF documents.
- Parameters:
- outPdf - the absolute path to the output file, whose XMP metadata is to be compared to that of the cmp-file.
- cmpPdf - the absolute path to the cmp-file, whose XMP metadata is to be compared to that of the output file.
- ignoreDateAndProducerProperties - true to ignore differences in the date or producer XMP metadata properties.
- Returns:
- text report on the XMP differences, or null if there are no differences.
-
compareXmls
public boolean compareXmls(byte[] xml1, byte[] xml2) throws ParserConfigurationException, SAXException, IOException
Utility method that provides a simple comparison of two XML files stored in byte arrays.
- Parameters:
- xml1 - first XML file data to compare.
- xml2 - second XML file data to compare.
- Returns:
- true if the XML structures are identical, false otherwise.
- Throws:
- ParserConfigurationException - if an XML DocumentBuilder that satisfies the requested configuration cannot be created.
- SAXException - if any XML parse errors occur.
- IOException - if any I/O errors occur while reading the XML files.
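For readers who want a feel for what a structural XML comparison over byte arrays can look like, the following sketch uses only the JDK's DOM parser. It is an illustration under stated assumptions, not a description of how CompareTool is implemented: the class XmlCompareSketch, the method sameXml and the chosen parser settings are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of iText.
final class XmlCompareSketch {

    // Parses both byte arrays into DOM trees and compares the trees node by node.
    static boolean sameXml(byte[] xml1, byte[] xml2)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setCoalescing(true);       // merge CDATA sections into text nodes
        factory.setIgnoringComments(true); // comments do not affect the result
        DocumentBuilder builder = factory.newDocumentBuilder();

        Document doc1 = builder.parse(new ByteArrayInputStream(xml1));
        doc1.normalizeDocument();
        Document doc2 = builder.parse(new ByteArrayInputStream(xml2));
        doc2.normalizeDocument();

        // isEqualNode compares names, values, attributes and children recursively.
        return doc1.isEqualNode(doc2);
    }

    public static void main(String[] args) throws Exception {
        byte[] first = "<a x=\"1\" y=\"2\"><b>t</b></a>".getBytes(StandardCharsets.UTF_8);
        byte[] second = "<a y=\"2\" x=\"1\"><b>t</b></a>".getBytes(StandardCharsets.UTF_8);
        System.out.println(sameXml(first, second));
    }
}
```

Comparing parsed trees rather than raw bytes means that purely lexical differences, such as attribute order, do not count as differences.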
-
compareXmls
public boolean compareXmls(String outXmlFile, String cmpXmlFile) throws ParserConfigurationException, SAXException, IOException
Utility method that provides a simple comparison of two XML files.
- Parameters:
- outXmlFile - absolute path to the out XML file to compare.
- cmpXmlFile - absolute path to the cmp XML file to compare.
- Returns:
- true if the XML structures are identical, false otherwise.
- Throws:
- ParserConfigurationException - if an XML DocumentBuilder that satisfies the requested configuration cannot be created.
- SAXException - if any XML parse errors occur.
- IOException - if any I/O errors occur while reading the XML files.
-
compareDocumentInfo
public String compareDocumentInfo(String outPdf, String cmpPdf, byte[] outPass, byte[] cmpPass) throws IOException
Compares the document info dictionaries of two PDF documents.
This method overload is used to compare two encrypted PDF documents. Document passwords are passed with the outPass and cmpPass parameters.
- Parameters:
- outPdf - the absolute path to the output file, whose info is to be compared to the cmp-file info.
- cmpPdf - the absolute path to the cmp-file, whose info is to be compared to the output file info.
- outPass - password for the encrypted document specified by the outPdf absolute path.
- cmpPass - password for the encrypted document specified by the cmpPdf absolute path.
- Returns:
- text report on the differences in the documents' info.
- Throws:
- IOException - if the PDF reader cannot be created due to I/O issues.
-
compareDocumentInfo
Compares the document info dictionaries of two PDF documents.
- Parameters:
- outPdf - the absolute path to the output file, whose info is to be compared to the cmp-file info.
- cmpPdf - the absolute path to the cmp-file, whose info is to be compared to the output file info.
- Returns:
- text report on the differences in the documents' info.
- Throws:
- IOException - if the PDF reader cannot be created due to I/O issues.
-
compareLinkAnnotations
Checks whether two documents have identical link annotations on corresponding pages.
- Parameters:
- outPdf - the absolute path to the output file, whose links are to be compared to the cmp-file links.
- cmpPdf - the absolute path to the cmp-file, whose links are to be compared to the output file links.
- Returns:
- text report on the differences in the documents' links.
- Throws:
- IOException - if the PDF reader cannot be created due to I/O issues.
-
compareTagStructures
public String compareTagStructures(String outPdf, String cmpPdf) throws IOException, ParserConfigurationException, SAXException
Compares the tag structures of the two PDF documents.
This method creates XML files in the same folder as the outPdf file. These XML files contain the documents' tag structures converted into XML, and they are then compared for equality.
- Parameters:
- outPdf - the absolute path to the output file, whose tags are to be compared to the cmp-file tags.
- cmpPdf - the absolute path to the cmp-file, whose tags are to be compared to the output file tags.
- Returns:
- text report on the differences in the documents' tags.
- Throws:
- IOException - if any of the input files are missing or any of the auxiliary files needed during the comparison could not be created.
- ParserConfigurationException - if an XML DocumentBuilder that satisfies the requested configuration cannot be created.
- SAXException - if any XML parse errors occur.
-
convertDocInfoToStrings
Converts document info into a string array. It can be used to compare PdfDocumentInfo later on. The default implementation retrieves the title, author, subject, keywords and producer.
- Parameters:
- info - an instance of PdfDocumentInfo to be converted.
- Returns:
- String array with all the document info the tester is interested in.
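The idea of flattening document info into a string array and then comparing the arrays can be sketched as follows. Everything here is an assumption made for illustration: a plain Map stands in for PdfDocumentInfo, the class DocInfoSketch and its methods are hypothetical, and returning null when nothing differs is a choice of this sketch rather than documented behaviour. Only the five fields and the "Expected ..., but was ..." wording come from this documentation.

```java
import java.util.Map;

// Hypothetical stand-in: a Map plays the role of PdfDocumentInfo.
final class DocInfoSketch {

    private static final String[] FIELDS = {"Title", "Author", "Subject", "Keywords", "Producer"};

    // Collects the fields of interest in a fixed order; missing fields become "".
    static String[] toStrings(Map<String, String> info) {
        String[] result = new String[FIELDS.length];
        for (int i = 0; i < FIELDS.length; i++) {
            String value = info.get(FIELDS[i]);
            result[i] = value == null ? "" : value.trim();
        }
        return result;
    }

    // Returns a message for the first differing field, or null if all fields match.
    static String firstDifference(String[] outInfo, String[] cmpInfo) {
        for (int i = 0; i < FIELDS.length; i++) {
            if (!cmpInfo[i].equals(outInfo[i])) {
                return FIELDS[i] + ": Expected \"" + cmpInfo[i] + "\", but was \"" + outInfo[i] + "\"";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] outInfo = toStrings(Map.of("Title", "Report", "Author", "A"));
        String[] cmpInfo = toStrings(Map.of("Title", "Report", "Author", "B"));
        System.out.println(firstDifference(outInfo, cmpInfo));
    }
}
```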
-
compareObjects
protected boolean compareObjects(PdfObject outObj, PdfObject cmpObj, ObjectPath currentPath, CompareTool.CompareResult compareResult)
Compares PDF objects.
- Parameters:
- outObj - out object corresponding to the output file, which is to be compared with the cmp object.
- cmpObj - cmp object corresponding to the cmp-file, which is to be compared with the out object.
- currentPath - the ObjectPath of the current objects.
- compareResult - CompareTool.CompareResult for the results of the comparison of the two documents.
- Returns:
- true if the objects are equal, false otherwise.
-