Package com.logicaldoc.core.util
Class PDFImageExtractor
- java.lang.Object
  - com.logicaldoc.core.util.PDFImageExtractor
-
public class PDFImageExtractor extends Object
This utility class extracts raster images from a PDF document
- Since:
- 1.0.0
- Author:
- Marco Meschieri - LogicalDOC
-
Constructor Summary
Constructor | Description
PDFImageExtractor(File pdfFile) | Creates a new PDF reader for the given PDF file
-
Method Summary
Modifier and Type | Method | Description
void | close() | Closes the PDF and releases the resources used
BufferedImage | extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey) | Extracts the imageKey image from the given page
List<BufferedImage> | extractImages() | Extracts all images of the entire document
Set<org.apache.pdfbox.cos.COSName> | getImageKeys(int pageIndex) | Gets the set of image identifiers inside the given page
int | getNumberOfPages() | Returns the total number of pages in the PDF
BufferedImage | getPageAsImage(int pageIndex) | Renders the specified page as a buffered image
BufferedImage | rotate90SX(BufferedImage bi) |
-
Constructor Detail
-
PDFImageExtractor
public PDFImageExtractor(File pdfFile)
Creates a new PDF reader for the given PDF file
- Parameters:
- pdfFile - the PDF file
-
Method Detail
-
close
public void close() throws IOException
Closes the PDF and releases the resources used
- Throws:
- IOException - if the PDF file cannot be read
-
getNumberOfPages
public int getNumberOfPages()
Returns the total number of pages in the PDF
- Returns:
- the total number of pages
-
getPageAsImage
public BufferedImage getPageAsImage(int pageIndex) throws IOException
Renders the specified page as a buffered image
- Parameters:
- pageIndex - zero-based page index, i.e., the first page is page 0
- Returns:
- object representation of the rendered page
- Throws:
- IOException - if the PDF file cannot be read
-
extractImage
public BufferedImage extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey) throws IOException
Extracts the imageKey image from the given page
- Parameters:
- pageIndex - zero-based page index, i.e., the first page is page 0
- imageKey - identifier of the image
- Returns:
- object representation of the image
- Throws:
- IOException - if the PDF file cannot be read
-
rotate90SX
public BufferedImage rotate90SX(BufferedImage bi)
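This method has no description in the source. Assuming "SX" stands for the Italian "sinistra" (left), it presumably rotates the image 90 degrees counter-clockwise; that reading is a guess, not something the documentation confirms. A minimal sketch of such a rotation, using only the JDK (the class and method names below are hypothetical and are not part of LogicalDOC):

```java
import java.awt.image.BufferedImage;

final class RotateLeftSketch {
    /**
     * Rotates an image 90 degrees counter-clockwise.
     * Assumption: this is what rotate90SX does; the source does not say.
     */
    static BufferedImage rotate90Left(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        // Width and height swap after a quarter turn.
        BufferedImage dst = new BufferedImage(h, w, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Source pixel (x, y) lands at (y, w - 1 - x) in the result.
                dst.setRGB(y, w - 1 - x, src.getRGB(x, y));
            }
        }
        return dst;
    }
}
```

With this mapping the top-right pixel of the source becomes the top-left pixel of the result, which is the usual behaviour of a left quarter turn.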
-
getImageKeys
public Set<org.apache.pdfbox.cos.COSName> getImageKeys(int pageIndex) throws IOException
Gets the set of image identifiers inside the given page
- Parameters:
- pageIndex - zero-based page index, i.e., the first page is page 0
- Returns:
- set of image identifiers
- Throws:
- IOException - if the PDF file cannot be read
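The signatures above suggest a page-by-page pattern: loop over the zero-based page indexes, ask each page for its image keys, and extract each key in turn. The sketch below illustrates that pattern under stated assumptions. `PageImageSource` is a hypothetical stand-in that mirrors the documented signatures so the example compiles without LogicalDOC or PDFBox; its type parameter `K` plays the role of `org.apache.pdfbox.cos.COSName`. Since the class summary lists no `Closeable` interface, the sketch calls `close()` in a `finally` block rather than relying on try-with-resources.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in mirroring the documented method signatures. */
interface PageImageSource<K> {
    int getNumberOfPages();
    Set<K> getImageKeys(int pageIndex) throws IOException;
    BufferedImage extractImage(int pageIndex, K imageKey) throws IOException;
    void close() throws IOException;
}

final class PerPageExtraction {
    /** Collects the images of every page, visiting pages by zero-based index. */
    static <K> List<BufferedImage> collect(PageImageSource<K> source) throws IOException {
        List<BufferedImage> images = new ArrayList<>();
        try {
            for (int page = 0; page < source.getNumberOfPages(); page++) {
                for (K key : source.getImageKeys(page)) {
                    images.add(source.extractImage(page, key));
                }
            }
        } finally {
            // Release the PDF even if extraction fails part-way through.
            source.close();
        }
        return images;
    }
}
```

Against the real class, the same loop would call `getNumberOfPages()`, `getImageKeys(int)` and `extractImage(int, COSName)` directly on a `PDFImageExtractor` instance.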
-
extractImages
public List<BufferedImage> extractImages() throws IOException
Extracts all images of the entire document
- Returns:
- the list of images
- Throws:
- IOException - if the PDF file cannot be read
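A common next step is to write the returned images to disk. The following is a minimal sketch using only the JDK's `ImageIO`; the `ImageSaver` class, the `saveAsPng` method and the `image-N.png` naming scheme are hypothetical and are not part of LogicalDOC. In real use, the input list would be the result of `extractImages()`.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

final class ImageSaver {
    /** Writes each image as image-0.png, image-1.png, ... into targetDir. */
    static List<File> saveAsPng(List<BufferedImage> images, File targetDir) throws IOException {
        if (!targetDir.isDirectory() && !targetDir.mkdirs()) {
            throw new IOException("Cannot create directory " + targetDir);
        }
        List<File> written = new ArrayList<>();
        for (int i = 0; i < images.size(); i++) {
            File out = new File(targetDir, "image-" + i + ".png");
            // ImageIO.write returns false when no suitable writer is found.
            if (!ImageIO.write(images.get(i), "png", out)) {
                throw new IOException("No PNG writer available for " + out);
            }
            written.add(out);
        }
        return written;
    }
}
```

PNG is used here because it is lossless and supported by every standard JDK; another format name accepted by `ImageIO` could be substituted.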