java.lang.Object

com.logicaldoc.core.util.PDFImageExtractor

All Implemented Interfaces: AutoCloseable

public class PDFImageExtractor extends Object implements AutoCloseable

Utility class for extracting raster images from a PDF document.

Since: 1.0.0
Author: Marco Meschieri - LogicalDOC
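
The per-page methods documented on this page (getNumberOfPages, getImageKeys, extractImage) suggest a simple nested loop over pages and image keys. The following is only a sketch of that loop, not LogicalDOC code. PageImageSource is a hypothetical stand-in interface that mirrors the three documented signatures, with a generic key type K in place of org.apache.pdfbox.cos.COSName, so that the example compiles without LogicalDOC or PDFBox on the classpath. PerPageExtraction and imagesByPage are likewise illustrative names.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical stand-in for the three documented PDFImageExtractor methods
 * used below. K takes the place of org.apache.pdfbox.cos.COSName.
 */
interface PageImageSource<K> {
    int getNumberOfPages();

    Set<K> getImageKeys(int pageIndex) throws IOException;

    BufferedImage extractImage(int pageIndex, K imageKey) throws IOException;
}

/** Illustrative helper, not part of LogicalDOC. */
final class PerPageExtraction {

    private PerPageExtraction() {
    }

    /**
     * Collects the images of each page into its own list.
     * Page indexes are zero-based, so element 0 holds the images of the first page.
     */
    static <K> List<List<BufferedImage>> imagesByPage(PageImageSource<K> source) throws IOException {
        List<List<BufferedImage>> byPage = new ArrayList<>();
        for (int page = 0; page < source.getNumberOfPages(); page++) {
            List<BufferedImage> images = new ArrayList<>();
            // Ask the page for its image identifiers, then extract each one
            for (K key : source.getImageKeys(page)) {
                images.add(source.extractImage(page, key));
            }
            byPage.add(images);
        }
        return byPage;
    }
}
```

With the real class, the same loop would presumably sit inside a try-with-resources statement, since PDFImageExtractor implements AutoCloseable and close() releases the resources held for the PDF.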

Constructor Summary

| Constructor | Description |
| --- | --- |
| PDFImageExtractor(File pdfFile) | Creates a new PDF reader for the given PDF file |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | close() | Closes the PDF and releases resources used |
| BufferedImage | extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey) | Extracts the imageKey image from the given page |
| List<BufferedImage> | extractImages() | Extracts all images of the entire document |
| Set<org.apache.pdfbox.cos.COSName> | getImageKeys(int pageIndex) | Gets the set of image identifiers inside the given page |
| int | getNumberOfPages() | Returns the total number of pages in the PDF |
| BufferedImage | getPageAsImage(int pageIndex) | Renders the specified page as a buffered image |
| BufferedImage | rotate90SX(BufferedImage bi) | |

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- PDFImageExtractor
  
  public PDFImageExtractor(File pdfFile)
  
  Creates a new PDF reader for the given PDF file
  
  Parameters:
  
  pdfFile - the PDF file
Method Details
- close
  
  public void close() throws IOException
  
  Closes the PDF and releases resources used
  
  Specified by:
  
  close in interface AutoCloseable
  
  Throws:
  
  IOException - if the PDF file cannot be read
- getNumberOfPages
  
  public int getNumberOfPages()
  
  Returns the total number of pages in the PDF
  
  Returns:
  
  the total number of pages
- getPageAsImage
  
  public BufferedImage getPageAsImage(int pageIndex) throws IOException
  
  Renders the specified page as a buffered image
  
  Parameters:
  
  pageIndex - zero-based page index, i.e., the first page is page 0
  
  Returns:
  
  the page rendered as a BufferedImage
  
  Throws:
  
  IOException - if the PDF file cannot be read
- extractImage
  
  public BufferedImage extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey) throws IOException
  
  Extracts the imageKey image from the given page
  
  Parameters:
  
  pageIndex - zero-based page index, i.e., the first page is page 0
  
  imageKey - identifier of the image
  
  Returns:
  
  the extracted image as a BufferedImage
  
  Throws:
  
  IOException - if the PDF file cannot be read
- rotate90SX
  
  public BufferedImage rotate90SX(BufferedImage bi)
- getImageKeys
  
  public Set<org.apache.pdfbox.cos.COSName> getImageKeys(int pageIndex) throws IOException
  
  Gets the set of image identifiers inside the given page
  
  Parameters:
  
  pageIndex - zero-based page index, i.e., the first page is page 0
  
  Returns:
  
  set of image identifiers
  
  Throws:
  
  IOException - if the PDF file cannot be read
- extractImages
  
  public List<BufferedImage> extractImages() throws IOException
  
  Extracts all images of the entire document
  
  Returns:
  
  The list of images
  
  Throws:
  
  IOException - if the PDF file cannot be read
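
Because extractImages() returns a List<BufferedImage>, its result can be handed to javax.imageio for saving. The following is a minimal sketch of that step using only JDK classes. ExtractedImageWriter, writeAsPng and the image-<n>.png naming scheme are hypothetical choices made for illustration and are not part of LogicalDOC. In real use the list would come from a PDFImageExtractor instance rather than being built by hand.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

/** Illustrative helper, not part of LogicalDOC. */
final class ExtractedImageWriter {

    private ExtractedImageWriter() {
    }

    /**
     * Writes each image as image-<n>.png inside targetDir (created if missing)
     * and returns the written paths in list order.
     */
    static List<Path> writeAsPng(List<BufferedImage> images, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        List<Path> written = new ArrayList<>();
        for (int i = 0; i < images.size(); i++) {
            Path out = targetDir.resolve("image-" + i + ".png");
            // ImageIO.write returns false when no suitable writer is found
            if (!ImageIO.write(images.get(i), "png", out.toFile())) {
                throw new IOException("No PNG writer available for image " + i);
            }
            written.add(out);
        }
        return written;
    }
}
```

PNG is used here because it is lossless and handled by the JDK's built-in ImageIO writers. Another format could be substituted if the extracted images are photographic and file size matters.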
