Package com.logicaldoc.core.util
Class PDFImageExtractor
java.lang.Object
com.logicaldoc.core.util.PDFImageExtractor
- All Implemented Interfaces:
AutoCloseable
This utility class extracts raster images from a PDF document.
- Since:
- 1.0.0
- Author:
- Marco Meschieri - LogicalDOC
-
Constructor Summary
Constructor, Description
- PDFImageExtractor(File pdfFile): Creates a new PDF reader for the given PDF file
-
Method Summary
Modifier and Type, Method, Description
- void close(): Closes the PDF and releases the resources used
- BufferedImage extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey): Extracts the imageKey image from the given page
- extractImages(): Extracts all images of the entire document
- Set<org.apache.pdfbox.cos.COSName> getImageKeys(int pageIndex): Gets the set of image identifiers inside the given page
- int getNumberOfPages(): Returns the total number of pages in the PDF
- getPageAsImage(int pageIndex): Renders the specified page as a buffered image
-
Constructor Details
-
PDFImageExtractor
Creates a new PDF reader for the given PDF file
- Parameters:
pdfFile - the PDF file
-
-
Method Details
-
close
Closes the PDF and releases the resources used
- Specified by:
close in interface AutoCloseable
- Throws:
IOException - if the PDF file cannot be read
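Because the class implements AutoCloseable, an instance can be managed with try-with-resources so that close() runs even when an exception is thrown. The sketch below is illustrative only: `FakeExtractor` is a hypothetical stand-in (not part of LogicalDOC) that lets the pattern run without a PDF; with the real class the resource would presumably be created as `new PDFImageExtractor(pdfFile)`.

```java
import java.io.IOException;

public class TryWithResourcesSketch {

    // Hypothetical stand-in for PDFImageExtractor: same close() contract, no PDF parsing.
    static class FakeExtractor implements AutoCloseable {
        private boolean closed = false;

        int getNumberOfPages() {
            return 3; // fixed value, for illustration only
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() throws IOException {
            closed = true; // the real class would release the underlying PDF here
        }
    }

    // Reads the page count and lets try-with-resources call close() on exit.
    static int countPages(FakeExtractor extractor) throws IOException {
        try (FakeExtractor open = extractor) {
            return open.getNumberOfPages();
        }
    }

    public static void main(String[] args) throws IOException {
        FakeExtractor extractor = new FakeExtractor();
        int pages = countPages(extractor);
        System.out.println(pages + " pages, closed=" + extractor.isClosed());
    }
}
```

The caller still has to handle the IOException declared by close(), either by catching it or by declaring it, as `countPages` does here.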
-
getNumberOfPages
public int getNumberOfPages()
Returns the total number of pages in the PDF
- Returns:
the total number of pages
-
getPageAsImage
Renders the specified page as a buffered image
- Parameters:
pageIndex - zero-based page index, i.e., the first page is page 0
- Returns:
object representation of the image
- Throws:
IOException - if the PDF file cannot be read
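A rendered page is typically written to disk or passed to an image pipeline. The following sketch assumes the returned object is a `java.awt.image.BufferedImage`, as the description suggests, and saves it as PNG with the standard `javax.imageio.ImageIO` API. The blank image in `main` is only a placeholder for the value that `getPageAsImage(0)` would return, and `saveAsPng` is a hypothetical helper name.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class SavePageImageSketch {

    // Writes the image as PNG; ImageIO.write returns false if no suitable writer is found.
    static boolean saveAsPng(BufferedImage image, File target) throws IOException {
        return ImageIO.write(image, "png", target);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder for the image returned by getPageAsImage(0).
        BufferedImage page = new BufferedImage(200, 300, BufferedImage.TYPE_INT_RGB);

        File target = File.createTempFile("page-0-", ".png");
        target.deleteOnExit();

        boolean written = saveAsPng(page, target);
        System.out.println("written=" + written + " to " + target);
    }
}
```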
-
extractImage
public BufferedImage extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey) throws IOException
Extracts the imageKey image from the given page
- Parameters:
pageIndex - zero-based page index, i.e., the first page is page 0
imageKey - identifier of the image
- Returns:
object representation of the image
- Throws:
IOException - if the PDF file cannot be read
-
rotate90SX
-
getImageKeys
Gets the set of image identifiers inside the given page
- Parameters:
pageIndex - zero-based page index, i.e., the first page is page 0
- Returns:
set of image identifiers
- Throws:
IOException - if the PDF file cannot be read
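The methods getNumberOfPages, getImageKeys and extractImage can be combined to visit every image page by page. The sketch below shows one way to do that; it is not necessarily how extractImages is implemented. To stay runnable without LogicalDOC or PDFBox it uses a hypothetical interface, `PageImageSource<K>`, that mirrors the three documented signatures, with `K` standing in for `org.apache.pdfbox.cos.COSName`, plus an in-memory stand-in for the PDF.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PageImageLoopSketch {

    // Hypothetical interface mirroring the documented methods; K stands in for COSName.
    interface PageImageSource<K> {
        int getNumberOfPages();
        Set<K> getImageKeys(int pageIndex) throws IOException;
        BufferedImage extractImage(int pageIndex, K imageKey) throws IOException;
    }

    // Visits every page (zero-based index) and extracts each image listed for it.
    static <K> List<BufferedImage> collectImages(PageImageSource<K> source) throws IOException {
        List<BufferedImage> images = new ArrayList<>();
        for (int page = 0; page < source.getNumberOfPages(); page++) {
            for (K key : source.getImageKeys(page)) {
                images.add(source.extractImage(page, key));
            }
        }
        return images;
    }

    // In-memory stand-in for a PDF: one key-to-image map per page.
    static PageImageSource<String> inMemory(List<Map<String, BufferedImage>> pages) {
        return new PageImageSource<String>() {
            public int getNumberOfPages() { return pages.size(); }
            public Set<String> getImageKeys(int pageIndex) { return pages.get(pageIndex).keySet(); }
            public BufferedImage extractImage(int pageIndex, String imageKey) {
                return pages.get(pageIndex).get(imageKey);
            }
        };
    }

    public static void main(String[] args) throws IOException {
        Map<String, BufferedImage> first = new LinkedHashMap<>();
        first.put("Im0", new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB));
        first.put("Im1", new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB));
        Map<String, BufferedImage> second = new LinkedHashMap<>();
        second.put("Im0", new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB));

        List<BufferedImage> all = collectImages(inMemory(Arrays.asList(first, second)));
        System.out.println(all.size() + " images collected");
    }
}
```

Image keys are scoped to a page in this sketch, so the same key on two pages yields two separate images; the page index and the key are always passed together.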
-
extractImages
Extracts all images of the entire document
- Returns:
the list of images
- Throws:
IOException - if the PDF file cannot be read
-