Package com.logicaldoc.core.util
Class PDFImageExtractor
- java.lang.Object
  - com.logicaldoc.core.util.PDFImageExtractor
public class PDFImageExtractor extends Object
This utility class allows the extraction of raster images from a PDF document.
- Since:
- 1.0.0
- Author:
- Marco Meschieri - LogicalDOC
-
-
Constructor Summary
Constructors
- PDFImageExtractor(File pdfFile): Creates a new PDF reader for the given PDF file
-
Method Summary
Methods
- void close(): Closes the PDF and releases the resources used
- BufferedImage extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey): Extracts the image identified by imageKey from the given page
- List<BufferedImage> extractImages(): Extracts all images of the entire document
- Set<org.apache.pdfbox.cos.COSName> getImageKeys(int pageIndex): Gets the set of image identifiers inside the given page
- int getNumberOfPages(): Returns the total number of pages in the PDF
- BufferedImage getPageAsImage(int pageIndex): Renders the specified page as a buffered image
- BufferedImage rotate90SX(BufferedImage bi)
-
-
-
Constructor Detail
-
PDFImageExtractor
public PDFImageExtractor(File pdfFile)
Creates a new PDF reader for the given PDF file
- Parameters:
- pdfFile - the PDF file
-
-
Method Detail
-
close
public void close() throws IOException
Closes the PDF and releases the resources used
- Throws:
- IOException - if the PDF file cannot be read
-
getNumberOfPages
public int getNumberOfPages()
Returns the total number of pages in the PDF
- Returns:
- the total number of pages
-
getPageAsImage
public BufferedImage getPageAsImage(int pageIndex) throws IOException
Renders the specified page as a buffered image
- Parameters:
- pageIndex - zero-based page index, i.e., the first page is page 0
- Returns:
- object representation of the image
- Throws:
- IOException - if the PDF file cannot be read
-
extractImage
public BufferedImage extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey) throws IOException
Extracts the image identified by imageKey from the given page
- Parameters:
- pageIndex - zero-based page index, i.e., the first page is page 0
- imageKey - identifier of the image
- Returns:
- object representation of the image
- Throws:
- IOException - if the PDF file cannot be read
-
rotate90SX
public BufferedImage rotate90SX(BufferedImage bi)
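The listing gives no description for rotate90SX. The sketch below assumes, from the name alone, that it rotates the image 90 degrees to the left (counterclockwise); "SX" may abbreviate the Italian "sinistra", but that is a guess and not documented behavior. The class RotateLeftSketch and the method rotate90Left are hypothetical names, and only the JDK's java.awt.image.BufferedImage is used.

```java
import java.awt.image.BufferedImage;

// Hypothetical illustration; not the LogicalDOC implementation.
class RotateLeftSketch {

    // Returns a copy of src rotated 90 degrees counterclockwise.
    // The result has the source height as its width and vice versa.
    static BufferedImage rotate90Left(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        // ARGB is used for the copy so any source type can be represented.
        BufferedImage dst = new BufferedImage(h, w, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Source pixel (x, y) lands at (y, w - 1 - x):
                // the top-right corner becomes the top-left corner.
                dst.setRGB(y, w - 1 - x, src.getRGB(x, y));
            }
        }
        return dst;
    }
}
```

A pixel-by-pixel copy keeps the sketch easy to verify; for large images an AffineTransform-based approach would probably be faster.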
-
getImageKeys
public Set<org.apache.pdfbox.cos.COSName> getImageKeys(int pageIndex) throws IOException
Gets the set of image identifiers inside the given page
- Parameters:
- pageIndex - zero-based page index, i.e., the first page is page 0
- Returns:
- set of image identifiers
- Throws:
- IOException - if the PDF file cannot be read
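getNumberOfPages, getImageKeys and extractImage can be combined to visit the images page by page. The sketch below shows that loop under stated assumptions: so that it compiles without LogicalDOC or PDFBox on the classpath, a hypothetical interface PageImageSource<K> stands in for the three documented methods, with K playing the role of org.apache.pdfbox.cos.COSName. PageImageSource and PerPageExtraction are illustrative names that do not exist in the library.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in that mirrors three methods documented above.
interface PageImageSource<K> {
    int getNumberOfPages();
    Set<K> getImageKeys(int pageIndex) throws IOException;
    BufferedImage extractImage(int pageIndex, K imageKey) throws IOException;
}

class PerPageExtraction {

    // Walks every page (zero-based index) and extracts each image by its key.
    static <K> List<BufferedImage> collect(PageImageSource<K> source) throws IOException {
        List<BufferedImage> images = new ArrayList<>();
        for (int page = 0; page < source.getNumberOfPages(); page++) {
            for (K key : source.getImageKeys(page)) {
                images.add(source.extractImage(page, key));
            }
        }
        return images;
    }
}
```

With the real class, the same two nested loops would call the PDFImageExtractor instance directly, followed by close() once the images are no longer needed.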
-
extractImages
public List<BufferedImage> extractImages() throws IOException
Extracts all images of the entire document
- Returns:
- the list of images
- Throws:
- IOException - if the PDF file cannot be read
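A common next step after extractImages() is to write the returned list to disk. The sketch below assumes only that the caller already holds a List<BufferedImage>; it uses the JDK's javax.imageio.ImageIO and nothing from LogicalDOC. The class ImageListWriter, the method writeAsPng and the file name pattern image-NNN.png are hypothetical choices made for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

// Hypothetical helper; not part of the documented class.
class ImageListWriter {

    // Writes each image as a PNG file named image-000.png, image-001.png, ...
    // and returns the files in the same order as the input list.
    static List<File> writeAsPng(List<BufferedImage> images, File outputDir) throws IOException {
        if (!outputDir.isDirectory() && !outputDir.mkdirs()) {
            throw new IOException("Cannot create directory " + outputDir);
        }
        List<File> written = new ArrayList<>();
        for (int i = 0; i < images.size(); i++) {
            File target = new File(outputDir, String.format("image-%03d.png", i));
            // ImageIO.write returns false when no suitable writer is found.
            if (!ImageIO.write(images.get(i), "png", target)) {
                throw new IOException("No PNG writer available for image " + i);
            }
            written.add(target);
        }
        return written;
    }
}
```

PNG is chosen here because it is lossless and supports transparency; the original encoding of each embedded image is not preserved by this approach.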
-
-