public class PdfReader extends Object implements Closeable
Modifier and Type | Class and Description |
---|---|
protected static class | PdfReader.ReusableRandomAccessSource |
static class | PdfReader.StrictnessLevel - Enumeration representing the strictness level for reading. |
Modifier and Type | Field and Description |
---|---|
protected static boolean | correctStreamLength |
protected PdfEncryption | decrypt |
static PdfReader.StrictnessLevel | DEFAULT_STRICTNESS_LEVEL - The default PdfReader.StrictnessLevel to be used. |
protected boolean | encrypted |
protected long | eofPos |
protected boolean | fixedXref |
protected PdfVersion | headerPdfVersion |
protected boolean | hybridXref |
protected long | lastXref |
protected PdfAConformanceLevel | pdfAConformanceLevel |
protected PdfDocument | pdfDocument |
protected ReaderProperties | properties |
protected boolean | rebuiltXref |
protected PdfTokenizer | tokens |
protected PdfDictionary | trailer |
protected boolean | xrefStm |
Constructor and Description |
---|
PdfReader(File file) - Reads and parses a PDF document. |
PdfReader(InputStream is) - Reads and parses a PDF document. |
PdfReader(InputStream is, ReaderProperties properties) - Reads and parses a PDF document. |
PdfReader(IRandomAccessSource byteSource, ReaderProperties properties) - Constructs a new PdfReader. |
PdfReader(String filename) - Reads and parses a PDF document. |
PdfReader(String filename, ReaderProperties properties) - Reads and parses a PDF document. |
Modifier and Type | Method and Description |
---|---|
void | close() - Closes the PdfTokenizer. |
byte[] | computeUserPassword() - Computes the user password if the standard encryption handler is used with the Standard40, Standard128 or AES128 encryption algorithm. |
static byte[] | decodeBytes(byte[] b, PdfDictionary streamDictionary) - Decodes bytes by applying the filters specified in the provided dictionary, using the default filter handlers. |
static byte[] | decodeBytes(byte[] b, PdfDictionary streamDictionary, Map<PdfName,IFilterHandler> filterHandlers) - Decodes a byte[] by applying the filters specified in the provided dictionary, using the provided filter handlers. |
protected void | fixXref() |
int | getCryptoMode() - Gets the encryption algorithm and access permissions. |
long | getFileLength() - Provides the size of the opened file. |
long | getLastXref() - Gets the position of the last cross-reference table. |
byte[] | getModifiedFileId() - Gets the modified file ID, the second element of the PdfName.ID entry in the trailer. |
byte[] | getOriginalFileId() - Gets the original file ID, the first element of the PdfName.ID entry in the trailer. |
PdfAConformanceLevel | getPdfAConformanceLevel() - Gets the declared PDF/A conformance level of the source document that is being read. |
long | getPermissions() - Gets the encryption permissions. |
RandomAccessFileOrArray | getSafeFile() - Gets a new file instance of the original PDF document. |
PdfReader.StrictnessLevel | getStrictnessLevel() - Gets the current PdfReader.StrictnessLevel of the reader. |
protected PdfNumber | getXrefPrev(PdfObject prevObjectToCheck) |
boolean | hasFixedXref() - If any exception is generated while reading a PdfObject, PdfReader will try to fix the offsets of all objects. |
boolean | hasHybridXref() - Some documents contain a hybrid XRef; for more information, see "7.5.8.4 Compatibility with Applications That Do Not Support Compressed Reference Streams" in the PDF 32000-1:2008 specification. |
boolean | hasRebuiltXref() - If any exception is generated while reading the XRef section, PdfReader will try to rebuild it. |
boolean | hasXrefStm() - Indicates whether the document has cross-reference streams. |
boolean | isCloseStream() - Gets whether the close() method shall close the input stream. |
boolean | isEncrypted() - Checks if the PdfDocument read with this PdfReader is encrypted. |
boolean | isOpenedWithFullPermission() - Checks if the document was opened with the owner password so that the end application can decide what level of access restrictions to apply. |
protected PdfArray | readArray(boolean objStm) |
protected PdfDictionary | readDictionary(boolean objStm) |
protected PdfObject | readObject(boolean readAsDirect) |
protected PdfObject | readObject(boolean readAsDirect, boolean objStm) |
protected PdfObject | readObject(PdfIndirectReference reference) |
protected void | readObjectStream(PdfStream objectStream) |
protected void | readPdf() - Parses the entire PDF. |
protected PdfName | readPdfName(boolean readAsDirect) |
protected PdfObject | readReference(boolean readAsDirect) |
InputStream | readStream(PdfStream stream, boolean decode) - Reads, decrypts and optionally decodes stream bytes into a ByteArrayInputStream. |
byte[] | readStreamBytes(PdfStream stream, boolean decode) - Reads, decrypts and optionally decodes stream bytes. |
byte[] | readStreamBytesRaw(PdfStream stream) - Reads and decrypts stream bytes. |
protected void | readXref() |
protected PdfDictionary | readXrefSection() |
protected boolean | readXrefStream(long ptr) |
protected void | rebuildXref() |
void | setCloseStream(boolean closeStream) - Sets whether the close() method shall close the input stream. |
PdfReader | setMemorySavingMode(boolean memorySavingMode) - Defines if memory saving mode is enabled. |
PdfReader | setStrictnessLevel(PdfReader.StrictnessLevel strictnessLevel) - Sets the PdfReader.StrictnessLevel for the reader. |
PdfReader | setUnethicalReading(boolean unethicalReading) - iText is not responsible if you decide to change the value of this parameter. |
public static final PdfReader.StrictnessLevel DEFAULT_STRICTNESS_LEVEL
The default PdfReader.StrictnessLevel to be used.
protected static boolean correctStreamLength
protected PdfTokenizer tokens
protected PdfEncryption decrypt
protected PdfVersion headerPdfVersion
protected long lastXref
protected long eofPos
protected PdfDictionary trailer
protected PdfDocument pdfDocument
protected PdfAConformanceLevel pdfAConformanceLevel
protected ReaderProperties properties
protected boolean encrypted
protected boolean rebuiltXref
protected boolean hybridXref
protected boolean fixedXref
protected boolean xrefStm
public PdfReader(IRandomAccessSource byteSource, ReaderProperties properties) throws IOException
byteSource
- source of bytes for the reader
properties
- properties of the created reader
IOException
- if an I/O error occurs
public PdfReader(InputStream is, ReaderProperties properties) throws IOException
is
- the InputStream containing the document. If the stream is an instance of RASInputStream, its IRandomAccessSource is extracted. Otherwise the stream is read to the end but is not closed.
properties
- properties of the created reader
IOException
- on error
public PdfReader(File file) throws FileNotFoundException, IOException
file
- the File containing the document.
IOException
- on error
FileNotFoundException
- when the specified File is not found
public PdfReader(InputStream is) throws IOException
is
- the InputStream containing the document. If the stream is an instance of RASInputStream, its IRandomAccessSource is extracted. Otherwise the stream is read to the end but is not closed.
IOException
- on error
public PdfReader(String filename, ReaderProperties properties) throws IOException
filename
- the file name of the document
properties
- properties of the created reader
IOException
- on error
public PdfReader(String filename) throws IOException
filename
- the file name of the document
IOException
- on error
public void close() throws IOException
Closes the PdfTokenizer.
Specified by close in interface Closeable
Specified by close in interface AutoCloseable
IOException
- on error.
public PdfReader setUnethicalReading(boolean unethicalReading)
iText is not responsible if you decide to change the value of this parameter.
unethicalReading
- true to enable unethical reading, false to disable it. Unethical reading is disabled by default.
Returns this PdfReader instance.
public PdfReader setMemorySavingMode(boolean memorySavingMode)
By default memory saving mode is disabled for the sake of time–memory trade-off.
If memory saving mode is enabled, document processing might slow down, but reading will be less memory demanding.
memorySavingMode
- true to enable memory saving mode, false to disable it.
Returns this PdfReader instance.
public PdfReader.StrictnessLevel getStrictnessLevel()
Gets the current PdfReader.StrictnessLevel of the reader.
Returns the current PdfReader.StrictnessLevel.
public PdfReader setStrictnessLevel(PdfReader.StrictnessLevel strictnessLevel)
Sets the PdfReader.StrictnessLevel for the reader. If the argument is null, then the DEFAULT_STRICTNESS_LEVEL will be used.
strictnessLevel
- the PdfReader.StrictnessLevel to set
Returns this PdfReader instance.
public boolean isCloseStream()
Gets whether the close() method shall close the input stream.
Returns true if the close() method will close the input stream, otherwise false.
public void setCloseStream(boolean closeStream)
Sets whether the close() method shall close the input stream.
closeStream
- true if the close() method shall close the input stream, otherwise false.
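The close-stream flag is an ownership contract: it decides whether closing the reader also closes the stream it was given. The following is a minimal sketch of that contract using only the JDK. `Wrapper` and `TrackingStream` are hypothetical stand-ins invented for illustration; they are not PdfReader and do not reflect its internals or its default flag value.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class CloseStreamSketch {
    /** Stream that records whether close() was called on it. */
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Hypothetical reader-like wrapper that models only the closeStream flag. */
    static final class Wrapper implements Closeable {
        private final InputStream in;
        private boolean closeStream;

        Wrapper(InputStream in, boolean closeStream) {
            this.in = in;
            this.closeStream = closeStream;
        }

        boolean isCloseStream() {
            return closeStream;
        }

        void setCloseStream(boolean closeStream) {
            this.closeStream = closeStream;
        }

        @Override
        public void close() throws IOException {
            // Close the underlying stream only when this wrapper owns it.
            if (closeStream) {
                in.close();
            }
        }
    }

    /** Closes a wrapper and reports whether the underlying stream was closed with it. */
    static boolean streamClosedWithWrapper(boolean closeStream) {
        TrackingStream stream = new TrackingStream(new byte[] {1, 2, 3});
        try (Wrapper wrapper = new Wrapper(stream, closeStream)) {
            // The wrapper would be used here.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return stream.closed;
    }

    public static void main(String[] args) {
        System.out.println("closeStream=true  closes stream: " + streamClosedWithWrapper(true));
        System.out.println("closeStream=false closes stream: " + streamClosedWithWrapper(false));
    }
}
```

When the flag is false, the caller keeps ownership and must close the stream itself, for example with its own try-with-resources block.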
public boolean hasRebuiltXref()
PdfException
- if the method has been invoked before the PDF document was read.
public boolean hasHybridXref()
PdfException
- if the method has been invoked before the PDF document was read.
public boolean hasXrefStm()
PdfException
- if the method has been invoked before the PDF document was read.
public boolean hasFixedXref()
The value returned by this method may change over time, because the reading of PdfObjects can be postponed, even until the document is closed.
PdfException
- if the method has been invoked before the PDF document was read.
public long getLastXref()
PdfException
- if the method has been invoked before the PDF document was read.
public byte[] readStreamBytes(PdfStream stream, boolean decode) throws IOException
stream
- the PdfStream instance to be read and optionally decoded.
decode
- true to get the decoded stream bytes, false to leave them as originally encoded.
IOException
- on error.
public byte[] readStreamBytesRaw(PdfStream stream) throws IOException
stream
- the PdfStream instance to be read
IOException
- on error.
public InputStream readStream(PdfStream stream, boolean decode) throws IOException
Reads, decrypts and optionally decodes stream bytes into a ByteArrayInputStream. The user is responsible for closing the returned stream.
stream
- the PdfStream instance to be read
decode
- true to get the decoded stream, false to leave it as originally encoded.
Returns an InputStream, or null if reading failed.
IOException
- on error.
public static byte[] decodeBytes(byte[] b, PdfDictionary streamDictionary)
b
- the bytes to decode
streamDictionary
- the dictionary that contains filter information
PdfException
- if there are any problems decoding the bytes
public static byte[] decodeBytes(byte[] b, PdfDictionary streamDictionary, Map<PdfName,IFilterHandler> filterHandlers)
b
- the bytes to decode
streamDictionary
- the dictionary that contains filter information
filterHandlers
- the map used to look up a handler for each type of filter
PdfException
- if there are any problems decoding the bytes
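The handler map lets a caller decide which decoder runs for each filter name found in the stream dictionary. The following is a minimal sketch of that lookup pattern using only the JDK. The `Handler` interface, the string filter names, and the `decodeBytes` helper here are hypothetical stand-ins; they are not iText's IFilterHandler, PdfName, or PdfReader.decodeBytes. The Flate handler relies on `java.util.zip.Inflater`, which reads the same zlib format that the PDF FlateDecode filter uses.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class FilterLookupSketch {
    /** Hypothetical handler type: one decoder per filter name. */
    interface Handler {
        byte[] decode(byte[] input);
    }

    /** Applies each named filter in order, looking up its handler in the map. */
    static byte[] decodeBytes(byte[] b, List<String> filterNames, Map<String, Handler> handlers) {
        byte[] current = b;
        for (String name : filterNames) {
            Handler handler = handlers.get(name);
            if (handler == null) {
                throw new IllegalArgumentException("No handler registered for filter " + name);
            }
            current = handler.decode(current);
        }
        return current;
    }

    /** Builds a map with a single zlib-based handler registered under "FlateDecode". */
    static Map<String, Handler> defaultHandlers() {
        Map<String, Handler> handlers = new HashMap<>();
        handlers.put("FlateDecode", FilterLookupSketch::inflate);
        return handlers;
    }

    /** Decompresses zlib-wrapped deflate data. */
    static byte[] inflate(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("Truncated or unsupported deflate data");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Invalid deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Compresses data so the example has something to decode. */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] encoded = deflate("BT /F1 12 Tf ET".getBytes(StandardCharsets.US_ASCII));
        byte[] decoded = decodeBytes(encoded, Collections.singletonList("FlateDecode"), defaultHandlers());
        System.out.println(new String(decoded, StandardCharsets.US_ASCII));
    }
}
```

Passing a custom map in place of `defaultHandlers()` is how a caller would swap in a different decoder for a given filter, which is the role the filterHandlers parameter plays in the two-map overload described above.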
public RandomAccessFileOrArray getSafeFile()
public long getFileLength()
public boolean isOpenedWithFullPermission()
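Checks if the document was opened with the owner password so that the end application can decide what level of access restrictions to apply. If the document is not encrypted, this method returns true.
Returns true if the document was opened with the owner password or if it is not encrypted, false if the document was opened with the user password.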
PdfException
- if the method has been invoked before the PDF document was read.
public long getPermissions()
Gets the encryption permissions. The returned value can be used directly in WriterProperties.setStandardEncryption(byte[], byte[], int, int). See ISO 32000-1, Table 22 for more details.
PdfException
- if the method has been invoked before the PDF document was read.
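An application that wants to honor access restrictions can combine the full-permission check with the permission bits. The following is a minimal sketch using only the JDK, assuming the permissions value follows the bit layout of ISO 32000-1, Table 22. The class, the constant names, and the helper methods are hypothetical and defined locally for illustration; they are not taken from iText.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PermissionSketch {
    // Masks per ISO 32000-1, Table 22. Bit positions there are 1-based,
    // so "bit 3" corresponds to 1L << 2.
    public static final long PRINT = 1L << 2;                      // bit 3
    public static final long MODIFY_CONTENTS = 1L << 3;            // bit 4
    public static final long COPY = 1L << 4;                       // bit 5
    public static final long MODIFY_ANNOTATIONS = 1L << 5;         // bit 6
    public static final long FILL_FORMS = 1L << 8;                 // bit 9
    public static final long EXTRACT_FOR_ACCESSIBILITY = 1L << 9;  // bit 10
    public static final long ASSEMBLE = 1L << 10;                  // bit 11
    public static final long PRINT_HIGH_QUALITY = 1L << 11;        // bit 12

    private static final Map<String, Long> NAMED_MASKS = new LinkedHashMap<>();

    static {
        NAMED_MASKS.put("print", PRINT);
        NAMED_MASKS.put("modify contents", MODIFY_CONTENTS);
        NAMED_MASKS.put("copy", COPY);
        NAMED_MASKS.put("modify annotations", MODIFY_ANNOTATIONS);
        NAMED_MASKS.put("fill forms", FILL_FORMS);
        NAMED_MASKS.put("extract for accessibility", EXTRACT_FOR_ACCESSIBILITY);
        NAMED_MASKS.put("assemble", ASSEMBLE);
        NAMED_MASKS.put("print high quality", PRINT_HIGH_QUALITY);
    }

    /**
     * Decides whether an operation is allowed. Full permission (owner password
     * or unencrypted document) bypasses the permission bits.
     */
    public static boolean isAllowed(boolean openedWithFullPermission, long permissions, long mask) {
        return openedWithFullPermission || (permissions & mask) == mask;
    }

    /** Lists the names of all operations whose bit is set, in table order. */
    public static List<String> describe(long permissions) {
        List<String> granted = new ArrayList<>();
        for (Map.Entry<String, Long> entry : NAMED_MASKS.entrySet()) {
            if ((permissions & entry.getValue()) != 0) {
                granted.add(entry.getKey());
            }
        }
        return granted;
    }

    public static void main(String[] args) {
        // 0xFFFFF0C0 as a signed value: no user operations granted.
        long nothingGranted = -3904L;
        long printAndCopy = nothingGranted | PRINT | COPY;
        System.out.println(describe(printAndCopy));
        System.out.println(isAllowed(false, nothingGranted, PRINT));
    }
}
```

In a real application, the two inputs would come from isOpenedWithFullPermission() and getPermissions() after the document has been read.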
public int getCryptoMode()
Gets the encryption algorithm and access permissions.
Returns an int value corresponding to a certain type of encryption.
PdfException
- if the method has been invoked before the PDF document was read.
See also EncryptionConstants
public PdfAConformanceLevel getPdfAConformanceLevel()
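Gets the declared PDF/A conformance level of the source document that is being read. Note that pdfAConformanceLevel is lazily initialized: it is initialized during the first call of this method.
Returns the conformance level of the source document, or null if no PDF/A conformance level information is specified.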
public byte[] computeUserPassword()
PdfException
- if the method has been invoked before the PDF document was read.
public byte[] getOriginalFileId()
Gets the original file ID, the first element of the PdfName.ID entry in the trailer. If the size of the ID array does not equal 2, an empty array is returned.
The returned value reflects the value that was written in the opened document. If the document is modified, the ultimate document ID can be retrieved from PdfDocument.getOriginalDocumentId().
PdfException
- if the method has been invoked before the PDF document was read.
See also PdfDocument.getOriginalDocumentId()
public byte[] getModifiedFileId()
Gets the modified file ID, the second element of the PdfName.ID entry in the trailer. If the size of the ID array does not equal 2, an empty array is returned.
The returned value reflects the value that was written in the opened document. If the document is modified, the ultimate document ID can be retrieved from PdfDocument.getModifiedDocumentId().
PdfException
- if the method has been invoked before the PDF document was read.
See also PdfDocument.getModifiedDocumentId()
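Both file ID methods return raw bytes, and either may be an empty array when the trailer ID array does not have exactly two elements. The following is a minimal sketch, using only the JDK, of two things a caller might do with those arrays: render them as hex for logging, and compare them. The class and method names are hypothetical. The interpretation of matching IDs rests on ISO 32000-1, which says both identifiers are set to the same value when a file is first written; producers that ignore this will make the comparison unreliable.

```java
import java.util.Arrays;

final class FileIdSketch {
    /** Renders ID bytes as lowercase hex; an empty array yields an empty string. */
    public static String toHex(byte[] id) {
        StringBuilder sb = new StringBuilder(id.length * 2);
        for (byte b : id) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /**
     * Returns true when both IDs are present and identical, which suggests
     * (but does not prove) that the file has not been updated since it was
     * first written.
     */
    public static boolean idsMatch(byte[] originalId, byte[] modifiedId) {
        return originalId.length > 0
                && modifiedId.length > 0
                && Arrays.equals(originalId, modifiedId);
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for values read from a document.
        byte[] original = {0x0a, (byte) 0xff, 0x10};
        byte[] modified = {0x0a, (byte) 0xff, 0x11};
        System.out.println("original: " + toHex(original));
        System.out.println("modified: " + toHex(modified));
        System.out.println("match: " + idsMatch(original, modified));
    }
}
```

Treating empty arrays as a non-match keeps documents without a usable ID entry from being reported as unmodified.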
public boolean isEncrypted()
Checks if the PdfDocument read with this PdfReader is encrypted.
Returns true if the document is encrypted, otherwise false.
PdfException
- if the method has been invoked before the PDF document was read.
protected void readPdf() throws IOException
IOException
- if an I/O error occurs.
protected void readObjectStream(PdfStream objectStream) throws IOException
IOException
protected PdfObject readObject(PdfIndirectReference reference)
protected PdfObject readObject(boolean readAsDirect) throws IOException
IOException
protected PdfObject readReference(boolean readAsDirect)
protected PdfObject readObject(boolean readAsDirect, boolean objStm) throws IOException
IOException
protected PdfName readPdfName(boolean readAsDirect)
protected PdfDictionary readDictionary(boolean objStm) throws IOException
IOException
protected PdfArray readArray(boolean objStm) throws IOException
IOException
protected void readXref() throws IOException
IOException
protected PdfDictionary readXrefSection() throws IOException
IOException
protected boolean readXrefStream(long ptr) throws IOException
IOException
protected void fixXref() throws IOException
IOException
protected void rebuildXref() throws IOException
IOException
Copyright © 1998–2022 iText Group NV. All rights reserved.