public class CompareTool extends Object
For visual comparison, CompareTool uses two external tools, Ghostscript and ImageMagick, which must be installed on your machine. To let CompareTool use them, set either Java system properties or environment variables named "ITEXT_GS_EXEC" and "ITEXT_MAGICK_COMPARE_EXEC" to the commands that execute the Ghostscript and ImageMagick tools.
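As a minimal sketch of the lookup described above, the following helper reads a tool command from a Java system property and falls back to an environment variable of the same name. The class name ToolCommandLookup and the lookup order (property first) are assumptions made for illustration; they are not taken from CompareTool itself.

```java
import java.util.Optional;

final class ToolCommandLookup {
    // Hypothetical helper: returns the command configured under the given name.
    // Assumption: a Java system property takes precedence over an environment variable.
    static Optional<String> resolve(String name) {
        String value = System.getProperty(name);
        if (value == null || value.isEmpty()) {
            value = System.getenv(name);
        }
        return (value == null || value.isEmpty()) ? Optional.empty() : Optional.of(value);
    }

    public static void main(String[] args) {
        // Report which of the two documented names are configured in this JVM.
        for (String name : new String[] {"ITEXT_GS_EXEC", "ITEXT_MAGICK_COMPARE_EXEC"}) {
            System.out.println(name + " -> " + resolve(name).orElse("<not set>"));
        }
    }
}
```

Running a check like this before a test suite starts can make a missing Ghostscript or ImageMagick configuration visible early, instead of surfacing as a failure in the middle of a visual comparison.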
The CompareTool class was designed mainly for testing iText, to ensure that the same code produces the same PDF document. For this reason you will often encounter parameter names such as "outDoc" and "cmpDoc", which stand for the output document and the document for comparison. The first is treated as the current result, and the second is referred to as the normal or ideal result. outDoc is compared to the ideal cmpDoc. Therefore all comparison reports take the form "Expected ..., but was ...". Read this as follows: the "expected" part describes the content of cmpDoc, and the "but was" part describes the content of outDoc.
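To make the reading direction concrete, here is a small sketch of a message in that form. MismatchMessage and its describe method are hypothetical and only illustrate which document each half of the message refers to; they are not part of CompareTool.

```java
final class MismatchMessage {
    // Hypothetical formatter. cmpValue comes from the reference (cmp) document,
    // outValue comes from the freshly produced (out) document.
    static String describe(String cmpValue, String outValue) {
        return "Expected " + cmpValue + ", but was " + outValue;
    }

    public static void main(String[] args) {
        // The cmp document uses /Helvetica, the out document uses /Courier.
        System.out.println(describe("/Helvetica", "/Courier"));
        // prints: Expected /Helvetica, but was /Courier
    }
}
```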
Modifier and Type | Class | Description |
---|---|---|
class | CompareTool.CompareResult | Class containing results of the comparison of two documents. |
class | CompareTool.CompareToolExecutionException | Exceptions thrown when errors occur during generation and comparison of images obtained on the basis of pdf files. |
class | CompareTool.ObjectPath | Class that helps to find two corresponding objects in the compared documents and also keeps track of the parent indirect objects already encountered during the comparison. |
Constructor and Description |
---|
CompareTool() |
Modifier and Type | Method | Description |
---|---|---|
boolean | compareArrays(PdfArray outArray, PdfArray cmpArray) | Simple method that compares two given PdfArrays by content. |
boolean | compareBooleans(PdfBoolean outBoolean, PdfBoolean cmpBoolean) | Simple method that compares two given PdfBooleans. |
CompareTool.CompareResult | compareByCatalog(PdfDocument outDocument, PdfDocument cmpDocument) | Compares two PDF documents by content starting from Catalog dictionary and then recursively comparing corresponding objects which are referenced from it. |
String | compareByContent(String outPdf, String cmpPdf, String outPath) | Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them. |
String | compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix) | Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them. |
String | compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, byte[] outPass, byte[] cmpPass) | This method overload is used to compare two encrypted PDF documents. |
String | compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer,List<Rectangle>> ignoredAreas) | Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them. |
String | compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer,List<Rectangle>> ignoredAreas, byte[] outPass, byte[] cmpPass) | This method overload is used to compare two encrypted PDF documents. |
boolean | compareDictionaries(PdfDictionary outDict, PdfDictionary cmpDict) | Simple method that compares two given PdfDictionaries by content. |
CompareTool.CompareResult | compareDictionariesStructure(PdfDictionary outDict, PdfDictionary cmpDict) | Recursively compares structures of two corresponding dictionaries from out and cmp PDF documents. |
CompareTool.CompareResult | compareDictionariesStructure(PdfDictionary outDict, PdfDictionary cmpDict, Set<PdfName> excludedKeys) | Recursively compares structures of two corresponding dictionaries from out and cmp PDF documents. |
String | compareDocumentInfo(String outPdf, String cmpPdf) | Compares document info dictionaries of two pdf documents. |
String | compareDocumentInfo(String outPdf, String cmpPdf, byte[] outPass, byte[] cmpPass) | Compares document info dictionaries of two pdf documents. |
String | compareLinkAnnotations(String outPdf, String cmpPdf) | Checks if two documents have identical link annotations on corresponding pages. |
boolean | compareNames(PdfName outName, PdfName cmpName) | Simple method that compares two given PdfNames. |
boolean | compareNumbers(PdfNumber outNumber, PdfNumber cmpNumber) | Simple method that compares two given PdfNumbers. |
protected boolean | compareObjects(PdfObject outObj, PdfObject cmpObj, CompareTool.ObjectPath currentPath, CompareTool.CompareResult compareResult) | |
boolean | compareStreams(PdfStream outStream, PdfStream cmpStream) | Simple method that compares two given PdfStreams by content. |
CompareTool.CompareResult | compareStreamsStructure(PdfStream outStream, PdfStream cmpStream) | Compares structures of two corresponding streams from out and cmp PDF documents. |
boolean | compareStrings(PdfString outString, PdfString cmpString) | Simple method that compares two given PdfStrings. |
String | compareTagStructures(String outPdf, String cmpPdf) | Compares tag structures of the two PDF documents. |
String | compareVisually(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix) | Compares two documents visually. |
String | compareVisually(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer,List<Rectangle>> ignoredAreas) | Compares two documents visually. |
boolean | compareXmls(byte[] xml1, byte[] xml2) | Utility method that provides simple comparison of the two xml files stored in byte arrays. |
boolean | compareXmls(String outXmlFile, String cmpXmlFile) | Utility method that provides simple comparison of the two xml files. |
String | compareXmp(String outPdf, String cmpPdf) | Compares xmp metadata of the two given PDF documents. |
String | compareXmp(String outPdf, String cmpPdf, boolean ignoreDateAndProducerProperties) | Compares xmp metadata of the two given PDF documents. |
CompareTool | disableCachedPagesComparison() | Disables the default logic of pages comparison. |
CompareTool | enableEncryptionCompare() | Enables the comparison of the encryption properties of the documents. |
ReaderProperties | getCmpReaderProperties() | Gets ReaderProperties to be passed later to the PdfReader of the cmp document. |
ReaderProperties | getOutReaderProperties() | Gets ReaderProperties to be passed later to the PdfReader of the output document. |
CompareTool | setCompareByContentErrorsLimit(int compareByContentMaxErrorCount) | Sets the maximum errors count which will be returned as the result of the comparison. |
void | setEventCountingMetaInfo(IMetaInfo metaInfo) | Sets IMetaInfo info that will be used for both read and written documents creation. |
CompareTool | setGenerateCompareByContentXmlReport(boolean generateCompareByContentXmlReport) | Enables or disables the generation of the comparison report in the form of an xml document. |
public CompareTool.CompareResult compareByCatalog(PdfDocument outDocument, PdfDocument cmpDocument) throws IOException
Compares two PDF documents by content starting from Catalog dictionary and then recursively comparing corresponding objects which are referenced from it.
The main difference between this method and the compareByContent(String, String, String, String) methods is the return value. This method returns a CompareTool.CompareResult instance, which can be inspected in code, whereas the compareByContent methods, when differences are found, simply return a String value, which can only be printed. Also keep in mind that this method does not perform a visual comparison of the documents.
For an explanation of outDoc and cmpDoc, see the last paragraph of the CompareTool class description.
outDocument
- a PdfDocument corresponding to the output file, which is to be compared with the cmp-file.
cmpDocument
- a PdfDocument corresponding to the cmp-file, which is to be compared with the output file.
CompareTool.CompareResult instance.
IOException
- obsolete; to be removed in 7.2.
CompareTool.CompareResult
public CompareTool disableCachedPagesComparison()
Disables the default logic of pages comparison. This option is relevant only to the compareByCatalog(PdfDocument, PdfDocument) method.
By default, pages are treated as special objects: when they are encountered during comparison, they are not checked as objects; instead, the tool only checks that they have the same page numbers in both documents. This behaviour is intended for the compareByContent(String, String, String) set of methods, because those methods compare documents page by page. There is no need to check whether pages have the same content when they are encountered, because their content either will be compared or has already been compared.
However, if you use compareByCatalog(PdfDocument, PdfDocument) with the default page comparison behaviour, pages will not be checked at all: every time a reference to a page dictionary is encountered, only the page numbers are compared for both documents. In effect, the comparison covers all of the document's catalog entries except /Pages (strictly speaking, the documents' page tree structures are compared, but the pages themselves are not).
CompareTool instance.
public CompareTool setCompareByContentErrorsLimit(int compareByContentMaxErrorCount)
compareByContentMaxErrorCount
- the errors count.
public CompareTool setGenerateCompareByContentXmlReport(boolean generateCompareByContentXmlReport)
IMPORTANT NOTE: this flag affects only the comparison performed by compareByContent methods!
generateCompareByContentXmlReport
- true to enable XML report generation, false to disable it.
public void setEventCountingMetaInfo(IMetaInfo metaInfo)
Sets IMetaInfo info that will be used when creating both the read and the written documents.
metaInfo
- meta info to set
public CompareTool enableEncryptionCompare()
Enables the comparison of the encryption properties of the documents.
IMPORTANT NOTE: this flag affects only the comparison performed by the compareByContent methods! compareByCatalog(PdfDocument, PdfDocument) doesn't compare encryption properties because they aren't part of the document's Catalog.
public ReaderProperties getOutReaderProperties()
Gets ReaderProperties to be passed later to the PdfReader of the output document.
Documents for comparison are opened in reader mode. This method is intended to alter the ReaderProperties that are used to open the output document. This is particularly useful for comparing encrypted documents.
For an explanation of outDoc and cmpDoc, see the last paragraph of the CompareTool class description.
ReaderProperties instance to be passed later to the PdfReader of the output document.
public ReaderProperties getCmpReaderProperties()
Gets ReaderProperties to be passed later to the PdfReader of the cmp document.
Documents for comparison are opened in reader mode. This method is intended to alter the ReaderProperties that are used to open the cmp document. This is particularly useful for comparing encrypted documents.
For an explanation of outDoc and cmpDoc, see the last paragraph of the CompareTool class description.
ReaderProperties instance to be passed later to the PdfReader of the cmp document.
public String compareVisually(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix) throws InterruptedException, IOException
Compares two documents visually. For the Ghostscript and ImageMagick configuration that visual comparison requires, see the CompareTool class description.
During the comparison, an image file is created for every page of the two documents in the folder specified by the outPath parameter. The page images are then compared, and for each page that differs another image file is created with the differences marked on it.
outPdf
- the absolute path to the output file, which is to be compared to cmp-file.
cmpPdf
- the absolute path to the cmp-file, which is to be compared to output file.
outPath
- the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix
- file name prefix for image files with marked differences, if there are any.
InterruptedException
- if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
IOException
- if any of the input files are missing, or if any of the auxiliary files created during the comparison could not be created.
public String compareVisually(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer,List<Rectangle>> ignoredAreas) throws InterruptedException, IOException
Compares two documents visually. For the Ghostscript and ImageMagick configuration that visual comparison requires, see the CompareTool class description.
During the comparison, an image file is created for every page of the two documents in the folder specified by the outPath parameter. The page images are then compared, and for each page that differs another image file is created with the differences marked on it.
It is possible to ignore certain areas of the document pages during visual comparison. This is useful, for example, when the documents should be identical except for a page area that contains a date. In this case, new PDF documents with black rectangles drawn over the specified ignored areas are created in the folder specified by outPath, and the visual comparison is performed on these new documents.
outPdf
- the absolute path to the output file, which is to be compared to cmp-file.
cmpPdf
- the absolute path to the cmp-file, which is to be compared to output file.
outPath
- the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix
- file name prefix for image files with marked differences, if there are any.
ignoredAreas
- a map with one-based page numbers as keys and lists of ignored rectangles as values.
InterruptedException
- if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
IOException
- if any of the input files are missing, or if any of the auxiliary files created during the comparison could not be created.
public String compareByContent(String outPdf, String cmpPdf, String outPath) throws InterruptedException, IOException
Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them.
When the comparison by content is finished, a visual comparison is started automatically if any differences were found. For this overload, the differenceImagePrefix value is generated using the diff_%outPdfFileName%_ format.
For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.
outPdf
- the absolute path to the output file, which is to be compared to cmp-file.
cmpPdf
- the absolute path to the cmp-file, which is to be compared to output file.
outPath
- the absolute path to the folder, which will be used to store image files for visual comparison.
InterruptedException
- if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
IOException
- if any of the input files are missing, or if any of the auxiliary files created during the comparison could not be created.
compareVisually(String, String, String, String)
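The default prefix mentioned for this overload can be sketched as follows. The helper DiffPrefix is hypothetical, and it assumes that %outPdfFileName% stands for the file name of outPdf including its extension, which this documentation does not state explicitly.

```java
import java.nio.file.Paths;

final class DiffPrefix {
    // Hypothetical helper: builds a prefix in the diff_%outPdfFileName%_ format.
    // Assumption: the placeholder is the last path segment of outPdf, extension included.
    static String defaultPrefix(String outPdf) {
        String fileName = Paths.get(outPdf).getFileName().toString();
        return "diff_" + fileName + "_";
    }

    public static void main(String[] args) {
        System.out.println(defaultPrefix("/tmp/out/report.pdf"));
        // prints: diff_report.pdf_
    }
}
```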
public String compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix) throws InterruptedException, IOException
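Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them.
When the comparison by content is finished, a visual comparison is started automatically if any differences were found.
For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.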
outPdf
- the absolute path to the output file, which is to be compared to cmp-file.
cmpPdf
- the absolute path to the cmp-file, which is to be compared to output file.
outPath
- the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix
- file name prefix for image files with marked visual differences, if there are any; if set to null, the prefix defaults to the diff_%outPdfFileName%_ format.
InterruptedException
- if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
IOException
- if any of the input files are missing, or if any of the auxiliary files created during the comparison could not be created.
compareVisually(String, String, String, String)
public String compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, byte[] outPass, byte[] cmpPass) throws InterruptedException, IOException
Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them. You can roughly picture it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
When the comparison by content is finished, a visual comparison is started automatically if any differences were found. For more information, see compareVisually(String, String, String, String).
For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.
outPdf
- the absolute path to the output file, which is to be compared to cmp-file.
cmpPdf
- the absolute path to the cmp-file, which is to be compared to output file.
outPath
- the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix
- file name prefix for image files with marked visual differences, if there are any; if set to null, the prefix defaults to the diff_%outPdfFileName%_ format.
outPass
- password for the encrypted document specified by the outPdf absolute path.
cmpPass
- password for the encrypted document specified by the cmpPdf absolute path.
InterruptedException
- if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
IOException
- if any of the input files are missing, or if any of the auxiliary files created during the comparison could not be created.
compareVisually(String, String, String, String)
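The depth-first traversal of two trees mentioned in the description can be pictured with the sketch below. It is not iText's implementation: plain Map and List values stand in for PDF dictionaries and arrays, map keys are assumed to be strings, and the class PairedTreeDiff is hypothetical. Unlike a real PDF object graph, the sketch assumes the trees contain no cycles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;

final class PairedTreeDiff {
    // Returns one "Expected ..., but was ..." message per difference found.
    static List<String> diff(Object out, Object cmp) {
        List<String> differences = new ArrayList<>();
        walk(out, cmp, "", differences);
        return differences;
    }

    // Visits both trees depth-first in lockstep. Cycles are not handled.
    private static void walk(Object out, Object cmp, String path, List<String> differences) {
        if (out instanceof Map && cmp instanceof Map) {
            Map<?, ?> outMap = (Map<?, ?>) out;
            Map<?, ?> cmpMap = (Map<?, ?>) cmp;
            // Union of both key sets, sorted so the report order is stable.
            TreeSet<String> keys = new TreeSet<>();
            for (Object key : outMap.keySet()) keys.add((String) key);
            for (Object key : cmpMap.keySet()) keys.add((String) key);
            for (String key : keys) {
                walk(outMap.get(key), cmpMap.get(key), path + "/" + key, differences);
            }
        } else if (out instanceof List && cmp instanceof List) {
            List<?> outList = (List<?>) out;
            List<?> cmpList = (List<?>) cmp;
            if (outList.size() != cmpList.size()) {
                differences.add(path + ": Expected " + cmpList.size()
                        + " elements, but was " + outList.size());
                return;
            }
            for (int i = 0; i < cmpList.size(); i++) {
                walk(outList.get(i), cmpList.get(i), path + "[" + i + "]", differences);
            }
        } else if (!Objects.equals(out, cmp)) {
            // Leaf values: cmp is the expected side, out is the actual side.
            differences.add(path + ": Expected " + cmp + ", but was " + out);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> cmp = new HashMap<>();
        cmp.put("Type", "Page");
        cmp.put("Kids", Arrays.asList(1, 2));
        Map<String, Object> out = new HashMap<>();
        out.put("Type", "Page");
        out.put("Kids", Arrays.asList(1, 3));
        diff(out, cmp).forEach(System.out::println);
        // prints: /Kids[1]: Expected 2, but was 3
    }
}
```

The path string plays a role loosely similar to the one the class summary gives CompareTool.ObjectPath, which locates corresponding objects in the two documents; the real class additionally tracks parent indirect objects.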
public String compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer,List<Rectangle>> ignoredAreas) throws InterruptedException, IOException
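Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them.
When the comparison by content is finished, a visual comparison is started automatically if any differences were found.
For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.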
outPdf
- the absolute path to the output file, which is to be compared to cmp-file.
cmpPdf
- the absolute path to the cmp-file, which is to be compared to output file.
outPath
- the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix
- file name prefix for image files with marked visual differences, if there are any; if set to null, the prefix defaults to the diff_%outPdfFileName%_ format.
ignoredAreas
- a map with one-based page numbers as keys and lists of ignored rectangles as values.
InterruptedException
- if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
IOException
- if any of the input files are missing, or if any of the auxiliary files created during the comparison could not be created.
compareVisually(String, String, String, String)
public String compareByContent(String outPdf, String cmpPdf, String outPath, String differenceImagePrefix, Map<Integer,List<Rectangle>> ignoredAreas, byte[] outPass, byte[] cmpPass) throws InterruptedException, IOException
Compares two PDF documents by content starting from page dictionaries and then recursively comparing corresponding objects which are referenced from them. You can roughly picture it as a depth-first traversal of the two trees that represent the PDF object structure of the documents.
When the comparison by content is finished, a visual comparison is started automatically if any differences were found.
For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.
outPdf
- the absolute path to the output file, which is to be compared to cmp-file.
cmpPdf
- the absolute path to the cmp-file, which is to be compared to output file.
outPath
- the absolute path to the folder, which will be used to store image files for visual comparison.
differenceImagePrefix
- file name prefix for image files with marked visual differences, if there are any; if set to null, the prefix defaults to the diff_%outPdfFileName%_ format.
ignoredAreas
- a map with one-based page numbers as keys and lists of ignored rectangles as values.
outPass
- password for the encrypted document specified by the outPdf absolute path.
cmpPass
- password for the encrypted document specified by the cmpPdf absolute path.
InterruptedException
- if the current thread is interrupted by another thread while it is waiting for the Ghostscript or ImageMagick processes; the wait then ends and an InterruptedException is thrown.
IOException
- if any of the input files are missing, or if any of the auxiliary files created during the comparison could not be created.
compareVisually(String, String, String, String)
public boolean compareDictionaries(PdfDictionary outDict, PdfDictionary cmpDict) throws IOException
outDict
- dictionary to compare.
cmpDict
- dictionary to compare.
IOException
- obsolete; to be removed in 7.2.
public CompareTool.CompareResult compareDictionariesStructure(PdfDictionary outDict, PdfDictionary cmpDict)
Recursively compares structures of two corresponding dictionaries from out and cmp PDF documents. Both the out and the cmp PdfDictionary shall have indirect references.
By default, page dictionaries are excluded from the comparison when encountered and are instead compared in a special manner, simply by comparing their page numbers. This behavior can be disabled by calling disableCachedPagesComparison().
For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.
outDict
- an indirect PdfDictionary from the output file, which is to be compared to the cmp-file dictionary.
cmpDict
- an indirect PdfDictionary from the cmp-file, which is to be compared to the output file dictionary.
CompareTool.CompareResult instance containing differences between the two dictionaries, or null if the dictionaries are equal.
public CompareTool.CompareResult compareDictionariesStructure(PdfDictionary outDict, PdfDictionary cmpDict, Set<PdfName> excludedKeys)
Recursively compares structures of two corresponding dictionaries from out and cmp PDF documents. Both the out and the cmp PdfDictionary shall have indirect references.
By default, page dictionaries are excluded from the comparison when encountered and are instead compared in a special manner, simply by comparing their page numbers. This behavior can be disabled by calling disableCachedPagesComparison().
For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.
outDict
- an indirect PdfDictionary from the output file, which is to be compared to the cmp-file dictionary.
cmpDict
- an indirect PdfDictionary from the cmp-file, which is to be compared to the output file dictionary.
excludedKeys
- a Set of names that designate entries from the outDict and cmpDict dictionaries which are to be skipped during comparison.
CompareTool.CompareResult instance containing differences between the two dictionaries, or null if the dictionaries are equal.
public CompareTool.CompareResult compareStreamsStructure(PdfStream outStream, PdfStream cmpStream)
Compares structures of two corresponding streams from out and cmp PDF documents.
For an explanation of outPdf and cmpPdf, see the last paragraph of the CompareTool class description.
outStream
- a PdfStream from the output file, which is to be compared to the cmp-file stream.
cmpStream
- a PdfStream from the cmp-file, which is to be compared to the output file stream.
CompareTool.CompareResult instance containing differences between the two streams, or null if the streams are equal.
public boolean compareStreams(PdfStream outStream, PdfStream cmpStream) throws IOException
outStream
- stream to compare.
cmpStream
- stream to compare.
IOException
- obsolete; to be removed in 7.2.
public boolean compareArrays(PdfArray outArray, PdfArray cmpArray) throws IOException
outArray
- array to compare.
cmpArray
- array to compare.
IOException
- obsolete; to be removed in 7.2.
public boolean compareNames(PdfName outName, PdfName cmpName)
outName
- name to compare.
cmpName
- name to compare.
public boolean compareNumbers(PdfNumber outNumber, PdfNumber cmpNumber)
outNumber
- number to compare.
cmpNumber
- number to compare.
public boolean compareStrings(PdfString outString, PdfString cmpString)
outString
- string to compare.
cmpString
- string to compare.
public boolean compareBooleans(PdfBoolean outBoolean, PdfBoolean cmpBoolean)
outBoolean
- boolean to compare.
cmpBoolean
- boolean to compare.
public String compareXmp(String outPdf, String cmpPdf)
outPdf
- the absolute path to the output file, whose XMP metadata is to be compared to that of the cmp-file.
cmpPdf
- the absolute path to the cmp-file, whose XMP metadata is to be compared to that of the output file.
public String compareXmp(String outPdf, String cmpPdf, boolean ignoreDateAndProducerProperties)
outPdf
- the absolute path to the output file, whose XMP metadata is to be compared to that of the cmp-file.
cmpPdf
- the absolute path to the cmp-file, whose XMP metadata is to be compared to that of the output file.
ignoreDateAndProducerProperties
- true to ignore differences in the date or producer XMP metadata properties.
public boolean compareXmls(byte[] xml1, byte[] xml2) throws ParserConfigurationException, SAXException, IOException
xml1
- first xml file data to compare.
xml2
- second xml file data to compare.
ParserConfigurationException
- if an XML DocumentBuilder that satisfies the requested configuration cannot be created.
SAXException
- if any XML parse errors occur.
IOException
- if any I/O errors occur while reading the XML files.
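For readers who want to see what a simple comparison of two XML byte arrays can look like, here is a minimal sketch built on the standard JAXP DOM API. It is an assumption about one reasonable approach, not a description of how compareXmls is implemented; the class XmlBytesComparison and the chosen parser settings are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

final class XmlBytesComparison {
    // Parses both byte arrays into DOM trees and compares the trees node by node.
    static boolean sameXml(byte[] xml1, byte[] xml2)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setCoalescing(true);       // merge CDATA sections into adjacent text
        factory.setIgnoringComments(true); // comments do not take part in the comparison
        DocumentBuilder builder = factory.newDocumentBuilder();

        Document doc1 = builder.parse(new ByteArrayInputStream(xml1));
        doc1.normalizeDocument();
        Document doc2 = builder.parse(new ByteArrayInputStream(xml2));
        doc2.normalizeDocument();

        // isEqualNode compares structure and values, not object identity.
        return doc1.isEqualNode(doc2);
    }

    public static void main(String[] args) throws Exception {
        byte[] a = "<a x=\"1\" y=\"2\"><b>1</b></a>".getBytes(StandardCharsets.UTF_8);
        byte[] b = "<a y=\"2\" x=\"1\"><b>1</b></a>".getBytes(StandardCharsets.UTF_8);
        System.out.println(sameXml(a, b));
    }
}
```

A DOM-level comparison like this treats attribute order as insignificant, but whitespace-only text nodes between elements still count, so differently indented files may compare as unequal.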
public boolean compareXmls(String outXmlFile, String cmpXmlFile) throws ParserConfigurationException, SAXException, IOException
outXmlFile
- absolute path to the out xml file to compare.
cmpXmlFile
- absolute path to the cmp xml file to compare.
ParserConfigurationException
- if an XML DocumentBuilder that satisfies the requested configuration cannot be created.
SAXException
- if any XML parse errors occur.
IOException
- if any I/O errors occur while reading the XML files.
public String compareDocumentInfo(String outPdf, String cmpPdf, byte[] outPass, byte[] cmpPass) throws IOException
This method overload is used to compare two encrypted PDF documents. Document passwords are passed with outPass and cmpPass parameters.
outPdf
- the absolute path to the output file, whose info is to be compared to the cmp-file info.
cmpPdf
- the absolute path to the cmp-file, whose info is to be compared to the output file info.
outPass
- password for the encrypted document specified by the outPdf absolute path.
cmpPass
- password for the encrypted document specified by the cmpPdf absolute path.
IOException
- if the PDF reader cannot be created due to I/O issues.
public String compareDocumentInfo(String outPdf, String cmpPdf) throws IOException
outPdf
- the absolute path to the output file, whose info is to be compared to the cmp-file info.
cmpPdf
- the absolute path to the cmp-file, whose info is to be compared to the output file info.
IOException
- if the PDF reader cannot be created due to I/O issues.
public String compareLinkAnnotations(String outPdf, String cmpPdf) throws IOException
outPdf
- the absolute path to the output file, whose links are to be compared to the cmp-file links.
cmpPdf
- the absolute path to the cmp-file, whose links are to be compared to the output file links.
IOException
- if the PDF reader cannot be created due to I/O issues.
public String compareTagStructures(String outPdf, String cmpPdf) throws IOException, ParserConfigurationException, SAXException
Compares tag structures of the two PDF documents.
This method creates XML files in the same folder as the outPdf file. These XML files contain the documents' tag structures converted into XML, and they are then compared for equality.
outPdf
- the absolute path to the output file, whose tags are to be compared to the cmp-file tags.
cmpPdf
- the absolute path to the cmp-file, whose tags are to be compared to the output file tags.
IOException
- if any of the input files are missing, or if any of the auxiliary files created during the comparison could not be created.
ParserConfigurationException
- if an XML DocumentBuilder that satisfies the requested configuration cannot be created.
SAXException
- if any XML parse errors occur.
protected boolean compareObjects(PdfObject outObj, PdfObject cmpObj, CompareTool.ObjectPath currentPath, CompareTool.CompareResult compareResult)
Copyright © 1998–2023 iText Group NV. All rights reserved.