java.lang.Object

com.itextpdf.io.source.PdfTokenizer

All Implemented Interfaces: Closeable, AutoCloseable

public class PdfTokenizer extends Object implements Closeable

Nested Class Summary

Nested Classes

| Modifier and Type | Class | Description |
| --- | --- | --- |
| static enum | PdfTokenizer.TokenType | |

Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| static final byte[] | F | |
| static final byte[] | False | |
| protected int | generation | |
| protected boolean | hexString | |
| static final byte[] | N | |
| static final byte[] | Null | |
| static final byte[] | Obj | |
| protected ByteBuffer | outBuf | |
| static final byte[] | R | |
| protected int | reference | |
| static final byte[] | Startxref | |
| static final byte[] | Stream | |
| static final byte[] | Trailer | |
| static final byte[] | True | |
| protected PdfTokenizer.TokenType | type | |
| static final byte[] | Xref | |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| PdfTokenizer(RandomAccessFileOrArray file) | Creates a PdfTokenizer for the specified RandomAccessFileOrArray. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | backOnePosition(int ch) | |
| void | checkFdfHeader() | |
| static int[] | checkObjectStart(PdfTokenizer lineTokenizer) | Checks whether the line starts with an object declaration. |
| String | checkPdfHeader() | |
| static boolean | checkTrailer(ByteBuffer line) | Checks whether the line equals 'trailer'. |
| void | close() | |
| static byte[] | decodeStringContent(byte[] content, boolean hexWriting) | Resolves escape symbols or hexadecimal symbols. |
| protected static byte[] | decodeStringContent(byte[] content, int from, int to, boolean hexWriting) | Resolves escape symbols or hexadecimal symbols. |
| byte[] | getByteContent() | |
| byte[] | getDecodedStringContent() | |
| int | getGenNr() | |
| int | getHeaderOffset() | |
| int | getIntValue() | |
| long | getLongValue() | |
| long | getNextEof() | Gets the next %%EOF marker in the current PDF file. |
| int | getObjNr() | |
| long | getPosition() | |
| RandomAccessFileOrArray | getSafeFile() | |
| long | getStartxref() | |
| String | getStringValue() | |
| PdfTokenizer.TokenType | getTokenType() | |
| boolean | isCloseStream() | |
| protected static boolean | isDelimiter(int ch) | |
| protected static boolean | isDelimiterWhitespace(int ch) | |
| boolean | isHexString() | |
| static boolean | isWhitespace(int ch) | Checks whether a character is whitespace. Currently checks for the character codes 0, 9, 10, 12, 13, and 32. |
| protected static boolean | isWhitespace(int ch, boolean isWhitespace) | Checks whether a character is whitespace. |
| long | length() | |
| boolean | nextToken() | |
| void | nextValidToken() | |
| int | peek() | Gets the next byte of the PDF source without moving the source position. |
| int | peek(byte[] buffer) | Gets the next buffer.length bytes of the PDF source without moving the source position. |
| int | read() | |
| void | readFully(byte[] bytes) | |
| boolean | readLineSegment(ByteBuffer buffer) | Reads data into the provided ByteBuffer. |
| boolean | readLineSegment(ByteBuffer buffer, boolean isNullWhitespace) | Reads data into the provided ByteBuffer. |
| String | readString(int size) | |
| void | seek(long pos) | |
| void | setCloseStream(boolean closeStream) | |
| void | throwError(String error, Object... messageParams) | Helper method to handle content errors. |
| boolean | tokenValueEqualsTo(byte[] cmp) | |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- Obj
  
  public static final byte[] Obj
- R
  
  public static final byte[] R
- Xref
  
  public static final byte[] Xref
- Startxref
  
  public static final byte[] Startxref
- Stream
  
  public static final byte[] Stream
- Trailer
  
  public static final byte[] Trailer
- N
  
  public static final byte[] N
- F
  
  public static final byte[] F
- Null
  
  public static final byte[] Null
- True
  
  public static final byte[] True
- False
  
  public static final byte[] False
- type
  
  protected PdfTokenizer.TokenType type
- reference
  
  protected int reference
- generation
  
  protected int generation
- hexString
  
  protected boolean hexString
- outBuf
  
  protected ByteBuffer outBuf
Constructor Details
- PdfTokenizer
  
  public PdfTokenizer (RandomAccessFileOrArray file)
  
  Creates a PdfTokenizer for the specified RandomAccessFileOrArray. The beginning of the file is read to determine the location of the header, and the data source is adjusted as necessary to account for any junk that occurs in the byte source before the header.
  
  Parameters:
  
  file - the source
Method Details
- seek
  
  public void seek (long pos)
- readFully
  
  public void readFully (byte[] bytes) throws IOException
  
  Throws:
  
  IOException
- getPosition
  
  public long getPosition()
- close
  
  public void close() throws IOException
  
  Specified by:
  
  close in interface AutoCloseable
  
  Specified by:
  
  close in interface Closeable
  
  Throws:
  
  IOException
- length
  
  public long length()
- read
  
  public int read() throws IOException
  
  Throws:
  
  IOException
- peek
  
  public int peek() throws IOException
  
  Gets the next byte of the PDF source without moving the source position.
  
  Returns:
  
  the byte, or -1 if EOF is reached
  
  Throws:
  
  IOException - in case of any reading error.
- peek
  
  public int peek (byte[] buffer) throws IOException
  
  Gets the next buffer.length bytes of the PDF source without moving the source position.
  
  Parameters:
  
  buffer - buffer to store the bytes read
  
  Returns:
  
  the number of bytes read. A value less than buffer.length means that EOF has been reached.
  
  Throws:
  
  IOException - in case of any reading error.
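The two peek overloads above share one contract: bytes are returned, but the source position stays where it was. The following is a minimal sketch of that contract over an in-memory byte array. The class `PeekableSource` and its members are hypothetical names for illustration and are not part of iText; the real class reads from a RandomAccessFileOrArray.

```java
/** Illustrative only: a byte source whose peek operations leave the position unchanged. */
final class PeekableSource {
    private final byte[] data;
    private int position;

    PeekableSource(byte[] data) {
        this.data = data;
    }

    /** Reads one byte and advances the position, or returns -1 at end of input. */
    int read() {
        return position < data.length ? (data[position++] & 0xFF) : -1;
    }

    /** Returns the next byte without advancing, or -1 at end of input. */
    int peek() {
        return position < data.length ? (data[position] & 0xFF) : -1;
    }

    /** Copies up to buffer.length bytes without advancing; returns the number copied. */
    int peek(byte[] buffer) {
        int count = Math.min(buffer.length, data.length - position);
        System.arraycopy(data, position, buffer, 0, count);
        return count;
    }

    int getPosition() {
        return position;
    }
}
```

A return value from `peek(buffer)` that is smaller than `buffer.length` signals that the end of the input was reached, which matches the Returns note above.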
- readString
  
  public String readString (int size) throws IOException
  
  Throws:
  
  IOException
- getTokenType
  
  public PdfTokenizer.TokenType getTokenType()
- getByteContent
  
  public byte[] getByteContent()
- getStringValue
  
  public String getStringValue()
- getDecodedStringContent
  
  public byte[] getDecodedStringContent()
- tokenValueEqualsTo
  
  public boolean tokenValueEqualsTo (byte[] cmp)
- getObjNr
  
  public int getObjNr()
- getGenNr
  
  public int getGenNr()
- backOnePosition
  
  public void backOnePosition (int ch)
- getHeaderOffset
  
  public int getHeaderOffset() throws IOException
  
  Throws:
  
  IOException
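The constructor description says the beginning of the file is read to locate the header and to account for junk bytes before it. A header offset of that kind could be computed roughly as follows. This is a sketch under assumptions: the `%PDF-` marker is the standard PDF header prefix, but the class name `HeaderOffsetSketch`, the method `findHeaderOffset`, and the caller-supplied search limit are hypothetical and do not describe how iText implements getHeaderOffset().

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: locates a PDF header marker inside a byte array. */
final class HeaderOffsetSketch {
    private static final byte[] PDF_MARKER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns the index of the first "%PDF-" marker that lies entirely within
     * the first searchLimit bytes, or -1 if no marker is found there.
     */
    static int findHeaderOffset(byte[] data, int searchLimit) {
        int lastStart = Math.min(data.length, searchLimit) - PDF_MARKER.length;
        for (int i = 0; i <= lastStart; i++) {
            if (matchesAt(data, i, PDF_MARKER)) {
                return i;
            }
        }
        return -1;
    }

    /** Compares marker against data starting at offset; the caller guarantees bounds. */
    private static boolean matchesAt(byte[] data, int offset, byte[] marker) {
        for (int j = 0; j < marker.length; j++) {
            if (data[offset + j] != marker[j]) {
                return false;
            }
        }
        return true;
    }
}
```

With an offset in hand, later positions can be interpreted relative to the header rather than to the start of the raw byte source.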
- checkPdfHeader
  
  public String checkPdfHeader() throws IOException
  
  Throws:
  
  IOException
- checkFdfHeader
  
  public void checkFdfHeader() throws IOException
  
  Throws:
  
  IOException
- getStartxref
  
  public long getStartxref() throws IOException
  
  Throws:
  
  IOException
- getNextEof
  
  public long getNextEof() throws IOException
  
  Gets the next %%EOF marker in the current PDF file.
  
  Returns:
  
  the position of the next %%EOF marker
  
  Throws:
  
  IOException - in case of input-output related exceptions during PDF document reading
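Finding the next %%EOF marker amounts to a forward byte search. The sketch below shows one way to do it over a byte array. Note the assumptions: `EofScanSketch` and `findNextEof` are hypothetical names, and this version returns the index of the marker's first byte; the text above does not say whether getNextEof() reports the start or the end of the marker.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: scans forward for the next "%%EOF" marker. */
final class EofScanSketch {
    private static final byte[] EOF_MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns the index of the first byte of the next "%%EOF" marker at or
     * after the given start index, or -1 if there is none.
     */
    static long findNextEof(byte[] data, int from) {
        for (int i = Math.max(from, 0); i + EOF_MARKER.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < EOF_MARKER.length; j++) {
                if (data[i + j] != EOF_MARKER[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }
}
```

Calling the search repeatedly, each time starting just past the previous hit, walks through every marker in a file that contains several of them.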
- nextValidToken
  
  public void nextValidToken() throws IOException
  
  Throws:
  
  IOException
- nextToken
  
  public boolean nextToken() throws IOException
  
  Throws:
  
  IOException
- getLongValue
  
  public long getLongValue()
- getIntValue
  
  public int getIntValue()
- isHexString
  
  public boolean isHexString()
- isCloseStream
  
  public boolean isCloseStream()
- setCloseStream
  
  public void setCloseStream (boolean closeStream)
- getSafeFile
  
  public RandomAccessFileOrArray getSafeFile()
- decodeStringContent
  
  protected static byte[] decodeStringContent (byte[] content, int from, int to, boolean hexWriting)
  
  Resolves escape symbols or hexadecimal symbols.
  NOTE: According to PDF Reference 1.7, section 3.2.3, a string value contains ASCII characters, so it can be converted directly to a byte array.
  
  Parameters:
  
  content - string bytes to be decoded
  
  from - given start index
  
  to - given end index
  
  hexWriting - true if given string is hex-encoded, e.g. '<69546578…>'. False otherwise, e.g. '((iText( some version)…)'
  
  Returns:
  
  byte[] for decrypting or for creating String.
- decodeStringContent
  
  public static byte[] decodeStringContent (byte[] content, boolean hexWriting)
  
  Resolves escape symbols or hexadecimal symbols.
  NOTE: According to PDF Reference 1.7, section 3.2.3, a string value contains ASCII characters, so it can be converted directly to a byte array.
  
  Parameters:
  
  content - string bytes to be decoded
  
  hexWriting - true if given string is hex-encoded, e.g. '<69546578…>'. False otherwise, e.g. '((iText( some version)…)'
  
  Returns:
  
  byte[] for decrypting or for creating String.
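To make the two modes of decodeStringContent concrete, here is a simplified sketch of both decodings. It is not the iText implementation: `PdfStringDecodeSketch` is a hypothetical class, the input is assumed to be the bytes between the string delimiters, and octal escapes and backslash line continuations are left out of the literal branch.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative only: simplified decoding of PDF hex and literal string content. */
final class PdfStringDecodeSketch {

    /** Decodes the bytes found between the delimiters of a PDF string. */
    static byte[] decode(byte[] content, boolean hexWriting) {
        return hexWriting ? decodeHex(content) : decodeLiteral(content);
    }

    private static byte[] decodeHex(byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int high = -1; // pending high nibble, or -1 if none
        for (byte b : content) {
            int digit = Character.digit(b & 0xFF, 16);
            if (digit < 0) {
                continue; // skip whitespace and any other non-hex byte
            }
            if (high < 0) {
                high = digit;
            } else {
                out.write((high << 4) | digit);
                high = -1;
            }
        }
        if (high >= 0) {
            out.write(high << 4); // odd digit count: the missing digit counts as 0
        }
        return out.toByteArray();
    }

    private static byte[] decodeLiteral(byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < content.length; i++) {
            int ch = content[i] & 0xFF;
            if (ch != '\\' || i + 1 == content.length) {
                out.write(ch);
                continue;
            }
            int next = content[++i] & 0xFF;
            switch (next) {
                case 'n': out.write('\n'); break;
                case 'r': out.write('\r'); break;
                case 't': out.write('\t'); break;
                case 'b': out.write('\b'); break;
                case 'f': out.write('\f'); break;
                default: out.write(next); // \( \) \\ and unknown escapes: drop the backslash
            }
        }
        return out.toByteArray();
    }
}
```

Under these assumptions, the hex content `69546578` decodes to the four ASCII bytes of `iTex`, and the literal content `a\(b\)` decodes to `a(b)`.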
- isWhitespace
  
  public static boolean isWhitespace (int ch)
  
  Checks whether a character is whitespace. Currently checks for the following character codes: 0, 9, 10, 12, 13, 32.
  The same as calling isWhitespace(ch, true).
  
  Parameters:
  
  ch - the character code to check
  
  Returns:
  
  true if the character is whitespace, otherwise false
- isWhitespace
  
  protected static boolean isWhitespace (int ch, boolean isWhitespace)
  
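  Checks whether a character is whitespace. Currently checks for the following character codes: 0, 9, 10, 12, 13, 32.
  
  Parameters:
  
  ch - the character code to check
  
  isWhitespace - boolean indicating whether character code 0 is treated as whitespace
  
  Returns:
  
  true if the character is whitespace, otherwise false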
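The whitespace rule described above is small enough to restate as code. The sketch below uses the listed character codes (0, 9, 10, 12, 13, 32) and assumes that the boolean flag decides whether code 0 counts, as the readLineSegment description of isNullWhitespace suggests. `PdfWhitespaceSketch` is a hypothetical class, not the iText source.

```java
/** Illustrative only: the whitespace rule expressed over character codes. */
final class PdfWhitespaceSketch {

    /** Treats NUL (code 0) as whitespace, like the single-argument overload. */
    static boolean isWhitespace(int ch) {
        return isWhitespace(ch, true);
    }

    /**
     * Returns true for tab (9), line feed (10), form feed (12),
     * carriage return (13), and space (32). Code 0 counts only when
     * isNullWhitespace is true.
     */
    static boolean isWhitespace(int ch, boolean isNullWhitespace) {
        return (isNullWhitespace && ch == 0)
                || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
    }
}
```

Passing `false` for the flag is useful when a NUL byte may be meaningful data rather than padding.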
- isDelimiter
  
  protected static boolean isDelimiter (int ch)
- isDelimiterWhitespace
  
  protected static boolean isDelimiterWhitespace (int ch)
- throwError
  
  public void throwError (String error, Object... messageParams)
  
  Helper method to handle content errors. Adds the file position to the PdfRuntimeException.
  
  Parameters:
  
  error - the error message.
  
  messageParams - the error message parameters.
  
  Throws:
  
  IOException - wraps the error message in a PdfRuntimeException and adds the position in the file.
- checkTrailer
  
  public static boolean checkTrailer (ByteBuffer line)
  
  Checks whether the line equals 'trailer'.
  
  Parameters:
  
  line - the line to check
  
  Returns:
  
  true if the line equals 'trailer', otherwise false
- readLineSegment
  
  public boolean readLineSegment (ByteBuffer buffer) throws IOException
  
  Reads data into the provided ByteBuffer. Checks for leading whitespace. See isWhitespace(int) or isWhitespace(int, boolean) for a list of whitespace characters.
  The same as calling readLineSegment(buffer, true).
  
  Parameters:
  
  buffer - a ByteBuffer to which the result of reading will be saved
  
  Returns:
  
  true if something was read or if the end of the input stream has not been reached
  
  Throws:
  
  IOException - in case of any reading error
- readLineSegment
  
  public boolean readLineSegment (ByteBuffer buffer, boolean isNullWhitespace) throws IOException
  
  Reads data into the provided ByteBuffer. Checks for leading whitespace. See isWhitespace(int) or isWhitespace(int, boolean) for a list of whitespace characters.
  
  Parameters:
  
  buffer - a ByteBuffer to which the result of reading will be saved
  
  isNullWhitespace - boolean indicating whether character code 0 is whitespace. If in doubt, use true or the overloaded method readLineSegment(buffer)
  
  Returns:
  
  true if something was read or if the end of the input stream has not been reached
  
  Throws:
  
  IOException - in case of any reading error
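The readLineSegment contract above (skip leading whitespace, collect bytes up to the end of the line, report whether anything was read or input remains) can be sketched as follows. This is an approximation under assumptions: `LineSegmentSketch` is a hypothetical class, a `java.io.ByteArrayOutputStream` stands in for the iText ByteBuffer, and the handling of line endings is a guess at reasonable behavior rather than a description of iText.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative only: reads one line segment at a time from a byte array. */
final class LineSegmentSketch {
    private final byte[] data;
    private int pos;

    LineSegmentSketch(byte[] data) {
        this.data = data;
    }

    /**
     * Skips leading whitespace, then appends bytes to out until CR, LF, or
     * end of input. Returns true if something was read or a line ending
     * (rather than end of input) stopped the read.
     */
    boolean readLineSegment(ByteArrayOutputStream out, boolean isNullWhitespace) {
        while (pos < data.length && isWhitespace(data[pos] & 0xFF, isNullWhitespace)) {
            pos++;
        }
        boolean readSomething = false;
        while (pos < data.length && data[pos] != '\n' && data[pos] != '\r') {
            out.write(data[pos++] & 0xFF);
            readSomething = true;
        }
        boolean hitEol = pos < data.length;
        if (hitEol) {
            // Consume the line ending, treating CR LF as a single terminator.
            if (data[pos] == '\r' && pos + 1 < data.length && data[pos + 1] == '\n') {
                pos++;
            }
            pos++;
        }
        return readSomething || hitEol;
    }

    private static boolean isWhitespace(int ch, boolean isNullWhitespace) {
        return (isNullWhitespace && ch == 0)
                || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
    }
}
```

A caller would typically reset the output buffer between calls and stop looping once the method returns false.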
- checkObjectStart
  
  public static int[] checkObjectStart (PdfTokenizer lineTokenizer)
  
  Checks whether the line starts with an object declaration.
  
  Parameters:
  
  lineTokenizer - a tokenizer built from a single line.
  
  Returns:
  
  the object number and generation if the check is successful, otherwise null.
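An object declaration in a PDF body has the form `<object number> <generation> obj`. The check described above can be approximated on a plain string as shown below. This is a sketch only: `ObjectStartSketch` is a hypothetical class that takes a String instead of a PdfTokenizer, and the regular expression is an assumption about what counts as a match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: detects "<number> <generation> obj" at the start of a line. */
final class ObjectStartSketch {
    private static final Pattern OBJECT_START =
            Pattern.compile("^\\s*(\\d+)\\s+(\\d+)\\s+obj\\b");

    /** Returns {objectNumber, generation} on a match, otherwise null. */
    static int[] checkObjectStart(String line) {
        Matcher m = OBJECT_START.matcher(line);
        if (!m.find()) {
            return null;
        }
        try {
            return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))};
        } catch (NumberFormatException e) {
            return null; // the digits do not fit in an int
        }
    }
}
```

Returning null rather than throwing keeps the check cheap to call on every line while scanning a damaged file for object starts.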
