Class PdfPage


public class PdfPage extends PdfObjectWrapper<PdfDictionary>
  • Constructor Details

  • Method Details

    • getPageSize

      public Rectangle getPageSize()
      Gets the page size as defined by the media box. This method does not take page rotation into account.
      Returns:
      Rectangle that specifies the page size.
    • getPageSizeWithRotation

      public Rectangle getPageSizeWithRotation()
      Gets page size, considering page rotation.
      Returns:
      Rectangle that specifies the size of the rotated page.
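      The difference between getPageSize() and getPageSizeWithRotation() can be illustrated with a small stand-alone model. This is only a sketch, not iText code: the class RotatedSize and its method are hypothetical, and it assumes that each quarter turn simply swaps width and height.

```java
// Hypothetical stand-alone model; not part of the iText API.
final class RotatedSize {

    /** Returns {width, height} of a page as it appears after applying the rotation. */
    static float[] withRotation(float width, float height, int rotationDegrees) {
        if (rotationDegrees % 90 != 0) {
            throw new IllegalArgumentException("rotation must be a multiple of 90");
        }
        // Normalize to one of 0, 90, 180, 270 (also handles negative input).
        int normalized = ((rotationDegrees % 360) + 360) % 360;
        // Assumption: a quarter turn swaps width and height; a half turn does not.
        boolean quarterTurn = normalized == 90 || normalized == 270;
        return quarterTurn
                ? new float[] {height, width}
                : new float[] {width, height};
    }
}
```

      With an A4-sized page of 595 x 842 units, a rotation of 90 or 270 gives 842 x 595 in this model, while 0 or 180 leaves the size unchanged.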
    • getRotation

      public int getRotation()
      Gets the number of degrees by which the page shall be rotated clockwise when displayed or printed. Shall be a multiple of 90.
      Returns:
      int number of degrees. Default value: 0
    • setRotation

      public PdfPage setRotation (int degAngle)
      Sets the page rotation.
      Parameters:
      degAngle - the int number of degrees by which the page shall be rotated clockwise when displayed or printed. Shall be a multiple of 90.
      Returns:
      this PdfPage instance.
    • getContentStream

      public PdfStream getContentStream (int index)
      Gets the content stream at the specified 0-based index in the Contents object PdfArray. A Contents object that is a single PdfStream is treated as a one-element array.
      Parameters:
      index - the int index of returned PdfStream.
      Returns:
      PdfStream object at specified index; will return null in case page dictionary doesn't adhere to the specification, meaning that the document is an invalid PDF.
      Throws:
      IndexOutOfBoundsException - if the index is out of range
    • getContentStreamCount

      public int getContentStreamCount()
      Gets the size of the Contents object PdfArray. A Contents object that is a single PdfStream is treated as a one-element array.
      Returns:
      the int size of Contents object, or 1 if Contents object is a PdfStream.
    • getFirstContentStream

      public PdfStream getFirstContentStream()
      Returns the Contents object if it is PdfStream, or first stream in the array if it is PdfArray.
      Returns:
      first PdfStream in Contents object, or null if Contents is empty.
    • getLastContentStream

      public PdfStream getLastContentStream()
      Returns the Contents object if it is PdfStream, or last stream in the array if it is PdfArray.
      Returns:
      last PdfStream in Contents object, or null if Contents is empty.
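      The "single stream is treated as a one-element array" rule shared by the content stream accessors can be sketched as follows. This is a simplified model rather than iText code: ContentsModel is a hypothetical class, streams are represented by plain strings, and returning 0 for a missing Contents entry is an assumption.

```java
import java.util.List;

// Hypothetical stand-alone model; not part of the iText API.
final class ContentsModel {

    /** contents is either a single stream (a String here) or a List of streams. */
    static int count(Object contents) {
        if (contents instanceof String) {
            return 1; // a single stream counts as a one-element array
        }
        if (contents instanceof List) {
            return ((List<?>) contents).size();
        }
        return 0; // assumption: a missing Contents entry has no streams
    }

    static String get(Object contents, int index) {
        int size = count(contents);
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        if (contents instanceof String) {
            return (String) contents;
        }
        return (String) ((List<?>) contents).get(index);
    }

    static String first(Object contents) {
        return count(contents) == 0 ? null : get(contents, 0);
    }

    static String last(Object contents) {
        int size = count(contents);
        return size == 0 ? null : get(contents, size - 1);
    }
}
```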
    • newContentStreamBefore

      public PdfStream newContentStreamBefore()
      Creates new PdfStream object and puts it at the beginning of Contents array (if Contents object is PdfStream it will be replaced with one-element array).
      Returns:
      Created PdfStream object.
    • newContentStreamAfter

      public PdfStream newContentStreamAfter()
      Creates new PdfStream object and puts it at the end of Contents array (if Contents object is PdfStream it will be replaced with one-element array).
      Returns:
      Created PdfStream object.
    • getResources

      public PdfResources getResources()
      Gets the PdfResources wrapper object for this page's resources. If the page doesn't have a resources object, it is inherited from the page's parents. If neither the parents nor the page has a resources object, a new one is created and added to the page dictionary.

      NOTE: If you try to modify the inherited resources, a new resources object is created, so the parent's resources are not changed. This new object under the wrapper is added to the page dictionary on flush(), or you can add it manually with the following line, if needed:
      getPdfObject().put(PdfName.Resources, getResources().getPdfObject());
      Returns:
      PdfResources wrapper of the page.
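      The lookup order described for getResources() (own resources first, then the parents, otherwise a newly created object) can be illustrated with a small model. It is a sketch under stated assumptions: PageTreeNode is a hypothetical class, resources are modelled as a plain map, and the copy-on-modify behaviour from the NOTE is deliberately not modelled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone model; not part of the iText API.
final class PageTreeNode {
    final PageTreeNode parent;          // null for the root of the page tree
    Map<String, String> resources;      // null when this node has no own resources

    PageTreeNode(PageTreeNode parent, Map<String, String> resources) {
        this.parent = parent;
        this.resources = resources;
    }

    /** Own resources, else the nearest ancestor's, else a fresh map stored on this node. */
    Map<String, String> resolveResources() {
        for (PageTreeNode node = this; node != null; node = node.parent) {
            if (node.resources != null) {
                return node.resources;
            }
        }
        resources = new HashMap<>();    // nothing found anywhere: create and attach
        return resources;
    }
}
```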
    • setResources

      public PdfPage setResources (PdfResources pdfResources)
      Sets PdfResources object.
      Parameters:
      pdfResources - PdfResources to set.
      Returns:
      this PdfPage instance.
    • setXmpMetadata

      public PdfPage setXmpMetadata (byte[] xmpMetadata) throws IOException
      Sets the XMP Metadata.
      Parameters:
      xmpMetadata - the byte[] of XMP Metadata to set.
      Returns:
      this PdfPage instance.
      Throws:
      IOException - in case of writing error.
    • setXmpMetadata

      public PdfPage setXmpMetadata (XMPMeta xmpMeta, SerializeOptions serializeOptions) throws XMPException, IOException
      Serializes XMP Metadata to byte array and sets it.
      Parameters:
      xmpMeta - the XMPMeta object to set.
      serializeOptions - the SerializeOptions used while serialization.
      Returns:
      this PdfPage instance.
      Throws:
      XMPException - in case of XMP Metadata serialization error.
      IOException - in case of writing error.
    • setXmpMetadata

      public PdfPage setXmpMetadata (XMPMeta xmpMeta) throws XMPException, IOException
      Serializes XMP Metadata to byte array and sets it. Uses a padding of 2000.
      Parameters:
      xmpMeta - the XMPMeta object to set.
      Returns:
      this PdfPage instance.
      Throws:
      XMPException - in case of XMP Metadata serialization error.
      IOException - in case of writing error.
    • getXmpMetadata

      public PdfStream getXmpMetadata()
      Gets the XMP Metadata object.
      Returns:
      PdfStream object that represents the XMP Metadata.
    • copyTo

      public PdfPage copyTo (PdfDocument toDocument)
      Copies page to the specified document.

      NOTE: Works only for pages from the document opened in reading mode, otherwise an exception is thrown.
      Parameters:
      toDocument - a document to copy page to.
      Returns:
      copied PdfPage.
    • copyTo

      public PdfPage copyTo (PdfDocument toDocument, IPdfPageExtraCopier copier)
      Copies page to the specified document.

      NOTE: Works only for pages from the document opened in reading mode, otherwise an exception is thrown.
      Parameters:
      toDocument - a document to copy page to.
      copier - a copier which bears a special copy logic. May be null. It is recommended to use the same instance of IPdfPageExtraCopier for the same output document.
      Returns:
      copied PdfPage.
    • copyTo

      public PdfPage copyTo (PdfDocument toDocument, IPdfPageExtraCopier copier, boolean addPageToDocument, int pageInsertIndex)
      Copies the page to the specified document. If addPageToDocument is true, the copy is also added to that document, either at the end or at the given index.

      NOTE: Works only for pages from the document opened in reading mode, otherwise an exception is thrown.

      NOTE: If both documents (from which and to which the copy is made) are tagged, you must additionally call the IPdfPageFormCopier.recreateAcroformToProcessCopiedFields(PdfDocument) method after copying the tag structure to process copied fields, like add them to the document and merge fields with the same names.
      Parameters:
      toDocument - a document to copy page to.
      copier - a copier which bears a special copy logic. May be null. It is recommended to use the same instance of IPdfPageExtraCopier for the same output document.
      addPageToDocument - true if page should be added to document.
      pageInsertIndex - position to add the page to, if -1 page will be added to the end of the document, will be ignored if addPageToDocument is false.
      Returns:
      copied PdfPage.
    • getPdfLayers

      public Set<PdfLayer> getPdfLayers()
      Gets all PDF layers stored under this page's annotations/xobjects/resources. Note that this includes all layers, even those already stored under the /OCProperties entry in the catalog. To get only the unique layers, exclude the OCGs that are already present in the catalog.
      Returns:
      set of pdf layers, associated with this page.
    • copyAsFormXObject

      public PdfFormXObject copyAsFormXObject (PdfDocument toDocument) throws IOException
      Copies page as FormXObject to the specified document.
      Parameters:
      toDocument - a document to copy to.
      Returns:
      copied PdfFormXObject object.
      Throws:
      IOException - if an I/O error occurs.
    • getDocument

      public PdfDocument getDocument()
      Gets the PdfDocument that owns this page, or null if no such document exists.
      Returns:
      PdfDocument that owns this page, or null if no such document exists.
    • flush

      public void flush()
      Flushes page dictionary, its content streams, annotations and thumb image.

      If the page belongs to a tagged document, flushing the page also triggers flushing of the tags that are considered to belong to the page. A tag (structure element) is considered to belong to the page if all the marked content references (dictionary or number references) that are descendants of that structure element belong to the current page. If a tag has descendants on several pages, it is flushed once all of those pages other than the current one have been flushed.

      Overrides:
      flush in class PdfObjectWrapper<PdfDictionary>
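      The rule that decides whether a tag is flushed together with a page can be expressed as a small predicate. The following is a hedged sketch, not the library's implementation: TagFlushRule is a hypothetical class, pages are identified by plain integers, and returning false for a tag that does not reference the current page at all is an assumption.

```java
import java.util.Set;

// Hypothetical stand-alone model; not part of the iText API.
final class TagFlushRule {

    /**
     * pagesReferencedByTag: pages that hold the marked content references
     * found among the descendants of the tag.
     */
    static boolean flushedWithPage(Set<Integer> pagesReferencedByTag,
                                   int currentPage,
                                   Set<Integer> alreadyFlushedPages) {
        if (!pagesReferencedByTag.contains(currentPage)) {
            return false; // assumption: the tag is unrelated to this page
        }
        for (int page : pagesReferencedByTag) {
            // Every other page that the tag touches must already be flushed.
            if (page != currentPage && !alreadyFlushedPages.contains(page)) {
                return false;
            }
        }
        return true;
    }
}
```

      A tag whose references all live on the current page passes trivially, which matches the single-page case in the description above.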
    • flush

      public void flush (boolean flushResourcesContentStreams)
      Flushes page dictionary, its content streams, annotations and thumb image. If flushResourcesContentStreams is true, all content streams that are rendered on this page (like FormXObjects, annotation appearance streams, patterns) and also all images associated with this page will also be flushed.

      For notes about tag structure flushing see PdfPage#flush() method.

      If PdfADocument is used, flushing will be applied only if flushResourcesContentStreams is true.

      Be careful when handling a document in which some of the pages are flushed. Keep in mind that flushed objects are finalized and completely written to the output stream. This frees their memory but makes it impossible to modify or read data from them. Any attempt to modify or fetch the inner contents of a flushed object throws an exception. Flushing is only possible for objects in the writing and stamping modes; it is also possible to flush modified objects in append mode.

      Parameters:
      flushResourcesContentStreams - if true all content streams that are rendered on this page (like form xObjects, annotation appearance streams, patterns) and also all images associated with this page will be flushed.
    • getMediaBox

      public Rectangle getMediaBox()
      Gets the Rectangle object specified by the page's MediaBox, which defines the boundaries of the physical medium on which the page shall be displayed or printed.
      Returns:
      Rectangle object specified by page Media Box, expressed in default user space units.
      Throws:
      PdfException - in case of any error while reading MediaBox object.
    • setMediaBox

      public PdfPage setMediaBox (Rectangle rectangle)
      Sets the Media Box object, that defines the boundaries of the physical medium on which the page shall be displayed or printed.
      Parameters:
      rectangle - the Rectangle object to set, expressed in default user space units.
      Returns:
      this PdfPage instance.
    • getCropBox

      public Rectangle getCropBox()
      Gets the Rectangle specified by page's CropBox, that defines the visible region of default user space. When the page is displayed or printed, its contents shall be clipped (cropped) to this rectangle and then shall be imposed on the output medium in some implementation-defined manner.
      Returns:
      the Rectangle object specified by the page's CropBox, expressed in default user space units. MediaBox by default.
    • setCropBox

      public PdfPage setCropBox (Rectangle rectangle)
      Sets the CropBox object, that defines the visible region of default user space. When the page is displayed or printed, its contents shall be clipped (cropped) to this rectangle and then shall be imposed on the output medium in some implementation-defined manner.
      Parameters:
      rectangle - the Rectangle object to set, expressed in default user space units.
      Returns:
      this PdfPage instance.
    • setBleedBox

      public PdfPage setBleedBox (Rectangle rectangle)
      Sets the BleedBox object, that defines the region to which the contents of the page shall be clipped when output in a production environment.
      Parameters:
      rectangle - the Rectangle object to set, expressed in default user space units.
      Returns:
      this PdfPage instance.
    • getBleedBox

      public Rectangle getBleedBox()
      Gets the Rectangle object specified by the page's BleedBox, which defines the region to which the contents of the page shall be clipped when output in a production environment.
      Returns:
      the Rectangle object specified by page's BleedBox, expressed in default user space units. CropBox by default.
    • setArtBox

      public PdfPage setArtBox (Rectangle rectangle)
      Sets the ArtBox object, which defines the extent of the page’s meaningful content (including potential white space) as intended by the page’s creator.
      Parameters:
      rectangle - the Rectangle object to set, expressed in default user space units.
      Returns:
      this PdfPage instance.
    • getArtBox

      public Rectangle getArtBox()
      Gets the Rectangle object specified by the page's ArtBox, which defines the extent of the page’s meaningful content (including potential white space) as intended by the page’s creator.
      Returns:
      the Rectangle object specified by page's ArtBox, expressed in default user space units. CropBox by default.
    • setTrimBox

      public PdfPage setTrimBox (Rectangle rectangle)
      Sets the TrimBox object, which defines the intended dimensions of the finished page after trimming.
      Parameters:
      rectangle - the Rectangle object to set, expressed in default user space units.
      Returns:
      this PdfPage instance.
    • getTrimBox

      public Rectangle getTrimBox()
      Gets the Rectangle object specified by the page's TrimBox, which defines the intended dimensions of the finished page after trimming.
      Returns:
      the Rectangle object specified by page's TrimBox, expressed in default user space units. CropBox by default.
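      The default values stated for the box getters form a fallback chain: CropBox falls back to MediaBox, while BleedBox, ArtBox and TrimBox fall back to CropBox. A minimal model of that chain is shown below. PageBoxes is a hypothetical class and rectangles are plain float arrays; it sketches the documented defaults only and is not iText code.

```java
// Hypothetical stand-alone model; not part of the iText API.
final class PageBoxes {
    final float[] mediaBox;   // always present in this model
    float[] cropBox;          // the remaining boxes are null when not set
    float[] bleedBox;
    float[] artBox;
    float[] trimBox;

    PageBoxes(float[] mediaBox) {
        this.mediaBox = mediaBox;
    }

    float[] effectiveCropBox() {
        return cropBox != null ? cropBox : mediaBox;             // MediaBox by default
    }

    float[] effectiveBleedBox() {
        return bleedBox != null ? bleedBox : effectiveCropBox(); // CropBox by default
    }

    float[] effectiveArtBox() {
        return artBox != null ? artBox : effectiveCropBox();     // CropBox by default
    }

    float[] effectiveTrimBox() {
        return trimBox != null ? trimBox : effectiveCropBox();   // CropBox by default
    }
}
```

      Because the three production boxes default to the CropBox, setting only a CropBox changes the effective value of all four in this model.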
    • getContentBytes

      public byte[] getContentBytes()
      Gets the decoded bytes for the whole page content.
      Returns:
      byte array.
      Throws:
      PdfException - in case of any IOException.
    • getStreamBytes

      public byte[] getStreamBytes (int index)
      Gets decoded bytes of a certain stream of a page content.
      Parameters:
      index - index of stream inside Content.
      Returns:
      byte array.
      Throws:
      PdfException - in case of any IOException.
    • getNextMcid

      public int getNextMcid()
      Calculates and returns the next available MCID reference for this page's content stream.
      Returns:
      calculated MCID reference.
      Throws:
      PdfException - if the document is not tagged.
    • getStructParentIndex

      public int getStructParentIndex()
      Gets the key of the page’s entry in the structural parent tree.
      Returns:
      the key of the page’s entry in the structural parent tree. If page has no entry in the structural parent tree, returned value is -1.
    • setAdditionalAction

      public PdfPage setAdditionalAction (PdfName key, PdfAction action)
      Helper method to add an additional action to this page. May be used in chain.
      Parameters:
      key - a PdfName specifying the name of an additional action
      action - the PdfAction to add as an additional action
      Returns:
      this PdfPage instance.
    • getAnnotations

      public List<PdfAnnotation> getAnnotations()
      Gets array of annotation dictionaries that shall contain indirect references to all annotations associated with the page.
      Returns:
      the List<PdfAnnotation> containing all page's annotations.
    • containsAnnotation

      public boolean containsAnnotation (PdfAnnotation annotation)
      Checks if page contains the specified annotation.
      Parameters:
      annotation - the PdfAnnotation to check.
      Returns:
      true if page contains specified annotation and false otherwise.
    • addAnnotation

      public PdfPage addAnnotation (PdfAnnotation annotation)
      Adds the specified annotation to the end of the annotations array and tags it. May be used in chain.
      Parameters:
      annotation - the PdfAnnotation to add.
      Returns:
      this PdfPage instance.
    • addAnnotation

      public PdfPage addAnnotation (int index, PdfAnnotation annotation, boolean tagAnnotation)
      Adds specified PdfAnnotation to specified index in annotations array with or without autotagging. May be used in chain.
      Parameters:
      index - the index at which specified annotation will be added. If -1 then annotation will be added to the end of array.
      annotation - the PdfAnnotation to add.
      tagAnnotation - if true the added annotation will be autotagged (see TagStructureContext.getAutoTaggingPointer()).
      Returns:
      this PdfPage instance.
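      The index handling of addAnnotation, where -1 means "append", can be sketched with an ordinary list. AnnotationArrayModel is a hypothetical class, annotations are represented by strings, and treating other index values as 0-based positions is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model; not part of the iText API.
final class AnnotationArrayModel {
    private final List<String> annots = new ArrayList<>();

    /** Adds at the given position, or at the end when index is -1. Returns this for chaining. */
    AnnotationArrayModel add(int index, String annotation) {
        if (index == -1) {
            annots.add(annotation);         // -1: append to the end of the array
        } else {
            annots.add(index, annotation);  // assumption: 0-based position
        }
        return this;
    }

    List<String> annotations() {
        return annots;
    }
}
```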
    • removeAnnotation

      public PdfPage removeAnnotation (PdfAnnotation annotation)
      Removes an annotation from the page.

      When document is tagged a corresponding logical structure content item for this annotation will be removed; its immediate structure element parent will be removed as well if the following conditions are met: annotation content item was its single child and structure element role is either Annot or Form.

      Parameters:
      annotation - an annotation to be removed
      Returns:
      this PdfPage instance.
    • removeAnnotation

      public PdfPage removeAnnotation (PdfAnnotation annotation, boolean rememberTagPointer)
      Removes an annotation from the page.

      When document is tagged a corresponding logical structure content item for this annotation will be removed; its immediate structure element parent will be removed as well if the following conditions are met: annotation content item was its single child and structure element role is either Annot or Form.

      Parameters:
      annotation - an annotation to be removed
      rememberTagPointer - if set to true, the TagStructureContext.getAutoTaggingPointer() instance of TagTreePointer will be moved to the parent of the removed annotation tag. Can be used to add a new annotation to the same place in the tag structure. (E.g. when merged Acroform field is split into a field and a pure widget, the page annotation needs to be replaced by the new one)
      Returns:
      this PdfPage instance.
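      The condition under which removeAnnotation also removes the immediate structure element parent can be written as a single predicate. This is only an illustration of the two stated conditions; AnnotationTagRule is a hypothetical class and roles are compared as plain strings.

```java
// Hypothetical stand-alone model; not part of the iText API.
final class AnnotationTagRule {

    /**
     * parentChildCount: number of children of the parent structure element
     * before the annotation content item is removed.
     */
    static boolean parentIsRemoved(int parentChildCount, String parentRole) {
        boolean annotationWasSingleChild = parentChildCount == 1;
        boolean roleIsAnnotOrForm = "Annot".equals(parentRole) || "Form".equals(parentRole);
        return annotationWasSingleChild && roleIsAnnotOrForm;
    }
}
```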
    • getAnnotsSize

      public int getAnnotsSize()
      Gets the number of PdfAnnotation associated with this page.
      Returns:
      the int number of PdfAnnotation associated with this page.
    • getOutlines

      public List<PdfOutline> getOutlines (boolean updateOutlines)
      Gets the outlines of the current page.
      Parameters:
      updateOutlines - if the flag is true, the method reads the whole document and creates outline tree. If the flag is false, the method gets cached outline tree (if it was cached via calling getOutlines method before).
      Returns:
      all outlines of the current page.
    • isIgnorePageRotationForContent

      public boolean isIgnorePageRotationForContent()
      Returns:
      true if new content is automatically rotated in the opposite direction when the page has a rotation, so that on the rotated page the new content appears to ignore the page rotation.
    • setIgnorePageRotationForContent

      public PdfPage setIgnorePageRotationForContent (boolean ignorePageRotationForContent)
      If true, new content added to a rotated page is automatically rotated in the opposite direction, so that on the rotated page it looks as if the new content ignores the page rotation. Default value: false.
      Parameters:
      ignorePageRotationForContent - true to ignore rotation of the new content on the rotated page.
      Returns:
      this PdfPage instance.
    • setPageLabel

      public PdfPage setPageLabel (PageLabelNumberingStyle numberingStyle, String labelPrefix)
      This method adds or replaces a page label.
      Parameters:
      numberingStyle - The numbering style that shall be used for the numeric portion of each page label. May be null.
      labelPrefix - The label prefix for page labels in this range. May be null.
      Returns:
      this PdfPage instance.
    • setPageLabel

      public PdfPage setPageLabel (PageLabelNumberingStyle numberingStyle, String labelPrefix, int firstPage)
      This method adds or replaces a page label.
      Parameters:
      numberingStyle - The numbering style that shall be used for the numeric portion of each page label. May be null.
      labelPrefix - The label prefix for page labels in this range. May be null.
      firstPage - The value of the numeric portion for the first page label in the range. Must be greater than or equal to 1.
      Returns:
      this PdfPage instance.
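      How the three setPageLabel parameters combine into a visible label can be illustrated with a small formatter. It is a sketch under stated assumptions: PageLabelModel and its Style enum are hypothetical stand-ins (only two numbering styles are modelled), and a null style is taken to mean that the label consists of the prefix alone.

```java
// Hypothetical stand-alone model; not part of the iText API.
final class PageLabelModel {

    /** Simplified stand-in for a numbering style; only two styles are modelled. */
    enum Style { DECIMAL, LOWER_ROMAN }

    /** Label of the page that lies offsetInRange pages after the start of the range. */
    static String label(Style style, String prefix, int firstPage, int offsetInRange) {
        if (firstPage < 1) {
            throw new IllegalArgumentException("firstPage must be >= 1");
        }
        String text = prefix == null ? "" : prefix;
        if (style == null) {
            return text; // assumption: no numeric portion without a style
        }
        int number = firstPage + offsetInRange;
        return text + (style == Style.DECIMAL ? Integer.toString(number) : toLowerRoman(number));
    }

    private static String toLowerRoman(int n) {
        int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
        String[] symbols = {"m", "cm", "d", "cd", "c", "xc", "l", "xl", "x", "ix", "v", "iv", "i"};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            while (n >= values[i]) {
                sb.append(symbols[i]);
                n -= values[i];
            }
        }
        return sb.toString();
    }
}
```

      For example, a range that starts at firstPage 3 with prefix "A-" and lower-case roman numerals labels its second page "A-iv" in this model.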
    • setTabOrder

      public PdfPage setTabOrder (PdfName tabOrder)
      Sets a name specifying the tab order that shall be used for annotations on the page. The possible values are PdfName.R (row order), PdfName.C (column order), and PdfName.S (structure order). Beginning with PDF 2.0, the possible values also include PdfName.A (annotations array order) and PdfName.W (widget order). See ISO 32000 12.5, "Annotations" for details.
      Parameters:
      tabOrder - a PdfName specifying the annotations tab order. See method description for the allowed values.
      Returns:
      this PdfPage instance.
    • getTabOrder

      public PdfName getTabOrder()
      Gets a name specifying the tab order that shall be used for annotations on the page. The possible values are PdfName.R (row order), PdfName.C (column order), and PdfName.S (structure order). Beginning with PDF 2.0, the possible values also include PdfName.A (annotations array order) and PdfName.W (widget order). See ISO 32000 12.5, "Annotations" for details.
      Returns:
      a PdfName specifying the annotations tab order or null if tab order is not defined.
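      The allowed tab order values and their dependence on the PDF version can be summarised in a small check. TabOrderModel is a hypothetical helper that works on plain name strings; it mirrors the values listed above and is not a validation routine of the library.

```java
// Hypothetical stand-alone model; not part of the iText API.
final class TabOrderModel {

    /** Returns true if the given tab order name is allowed for the PDF version. */
    static boolean isAllowed(String name, boolean pdf20OrLater) {
        switch (name) {
            case "R": // row order
            case "C": // column order
            case "S": // structure order
                return true;
            case "A": // annotations array order
            case "W": // widget order
                return pdf20OrLater;
            default:
                return false;
        }
    }
}
```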
    • setThumbnailImage

      public PdfPage setThumbnailImage (PdfImageXObject thumb)
      Sets a stream object that shall define the page’s thumbnail image. A thumbnail image represents the contents of the page in miniature form.
      Parameters:
      thumb - the thumbnail image
      Returns:
      this PdfPage object
    • getThumbnailImage

      public PdfImageXObject getThumbnailImage()
      Gets the stream object that defines the page’s thumbnail image. A thumbnail image represents the contents of the page in miniature form.
      Returns:
      the thumbnail image, or null if it is not present
    • addOutputIntent

      public PdfPage addOutputIntent (PdfOutputIntent outputIntent)
      Adds PdfOutputIntent that shall specify the colour characteristics of output devices on which the page might be rendered.
      Parameters:
      outputIntent - PdfOutputIntent to add.
      Returns:
      this PdfPage object
    • put

      public PdfPage put (PdfName key, PdfObject value)
      Helper method that associates specified value with the specified key in the underlying PdfDictionary. Can be used in method chaining.
      Parameters:
      key - the PdfName key with which the specified value is to be associated
      value - the PdfObject value to be associated with the specified key.
      Returns:
      this PdfPage object.
    • remove

      public PdfPage remove (PdfName key)
      Helper method that removes the value associated with the specified key from the underlying PdfDictionary. Can be used in method chaining.
      Parameters:
      key - the PdfName key for which associated value is to be removed
      Returns:
      this PdfPage object
    • addAssociatedFile

      public void addAssociatedFile (String description, PdfFileSpec fs)
      Adds file associated with PDF page and identifies the relationship between them.

      Associated files may be used in PDF/A-3 and PDF 2.0 documents. The method adds the file to the array value of the AF key in the page dictionary. If a description is provided, it also adds the file description to the catalog Names tree.

      For associated files, their associated file specification dictionaries shall include the AFRelationship key.

      Parameters:
      description - the file description
      fs - file specification dictionary of associated file
    • addAssociatedFile

      public void addAssociatedFile (PdfFileSpec fs)

      Adds file associated with PDF page and identifies the relationship between them.

      Associated files may be used in PDF/A-3 and PDF 2.0 documents. The method adds the file to the array value of the AF key in the page dictionary.

      For associated files, their associated file specification dictionaries shall include the AFRelationship key.

      Parameters:
      fs - file specification dictionary of associated file
    • getAssociatedFiles

      public PdfArray getAssociatedFiles (boolean create)
      Returns files associated with PDF page.
      Parameters:
      create - defines whether the AF array will be created if it doesn't exist
      Returns:
      associated files array
    • isWrappedObjectMustBeIndirect

      protected boolean isWrappedObjectMustBeIndirect()
      Description copied from class: PdfObjectWrapper
      Defines if the object behind this wrapper must be an indirect object in the resultant document.

      If this method returns true it doesn't necessarily mean that object must be in the indirect state at any moment, but rather defines that when the object will be written to the document it will be transformed into indirect object if it's not indirect yet.

      The return value of this method shouldn't depend on any logic; it should always return true or always return false.
      Specified by:
      isWrappedObjectMustBeIndirect in class PdfObjectWrapper<PdfDictionary>
      Returns:
      true if in the resultant document the object behind the wrapper must be indirect, otherwise false.